Data Enrichment – Static and Dynamic Sources Part 1
As mentioned in a prior post, CAD models are like mini databases, and these CAD models represent just part of a bigger picture. There’s related data in other apps like product lifecycle management (PLM), enterprise content management (ECM), enterprise resource planning (ERP) and maintenance, repair and operations (MRO). For example, a part referenced in a CAD model might have pricing and vendor data in ERP, maintenance data in MRO and user manuals in an ECM environment. There are multiple connections, but this related data is rarely found, let alone virtually connected. That’s where data enrichment can add significant value to the wealth of information CADnection extracts from CAD models.
The enrichment process is the third step in the CADnection workflow sequence. This step provides an opportunity to augment the data extracted from CAD models with data from external sources. It might be as simple as retrieving the name of the original CAD modeling app from a lookup table based on the CAD model file extension. Or in a more involved use case, it might be the need to pull a list of open purchase orders from an ERP environment for a part.
Determining whether the retrieved data will change in the future is critically important. Ultimately, users who search for or retrieve data should find it current and accurate. Data falls into two primary states: static and dynamic.
Static data doesn’t change, or changes so rarely that it can be treated as fixed. As cited in the earlier example, the relationships between an authoring app and its associated file extensions will not change. The list of apps can grow, but fundamentally, there is very little change over time.
Another example of static data is the treatment of legacy environments. These apps are no longer supported or used but may retain valuable historical insights.
Once static data is injected into the tag/value stream, there is no need to watch for updates, as the data is unlikely to change in the future.
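The file-extension lookup described earlier can be sketched as a static enrichment step. This is a minimal illustration rather than CADnection's actual implementation: the table contents, the tag names (file_name, authoring_app) and the function name are assumptions made for the example.

```python
from pathlib import Path

# Hypothetical lookup table mapping file extensions to authoring apps.
# The entries are illustrative; a real table would be maintained separately.
AUTHORING_APP_BY_EXTENSION = {
    ".sldprt": "SOLIDWORKS",
    ".sldasm": "SOLIDWORKS",
    ".ipt": "Autodesk Inventor",
    ".catpart": "CATIA V5",
    ".dwg": "AutoCAD",
}


def enrich_with_authoring_app(tags: dict) -> dict:
    """Return a copy of the extracted tags with an 'authoring_app' tag added.

    Assumes the extracted tags contain a 'file_name' entry (hypothetical name).
    """
    extension = Path(tags["file_name"]).suffix.lower()
    enriched = dict(tags)  # leave the extracted tags untouched
    enriched["authoring_app"] = AUTHORING_APP_BY_EXTENSION.get(extension, "Unknown")
    return enriched


extracted = {"file_name": "bracket.SLDPRT", "part_number": "A-1001"}
print(enrich_with_authoring_app(extracted))
```

Because the table is static, the lookup runs once during enrichment and the result never needs to be refreshed. Adding a new app only means adding a row to the table.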
On the flip side, deciding to use dynamic data requires two important considerations.
The first is to ensure that a change detection mechanism exists at the data source and is available to produce a change alert.
The second is to define an acceptable latency period, that is, the time between an update at the source and the moment it is reflected in the target search index and/or properties database.
This latency period can vary widely. For example, depending on the nature of the data, a 10-second latency period may be required, while in other cases it may be acceptable to have updates reflected within 24 hours. This tolerance will have a significant impact on the design and architecture of the enrichment process.
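To make the latency definition concrete, the check below tests whether an update at the source has gone unreflected in the index for longer than the acceptable latency period. It is a rough sketch under stated assumptions: the function name and its parameters are hypothetical, and it presumes the source exposes a last-updated timestamp as its change detection mechanism.

```python
from datetime import datetime, timedelta, timezone


def latency_exceeded(source_updated_at, indexed_at, acceptable_latency, now):
    """Return True if a source update has gone unreflected for too long.

    source_updated_at: when the record last changed at the source (assumed available)
    indexed_at: when the record was last written to the index, or None if never
    acceptable_latency: the agreed latency period as a timedelta
    now: the current time, passed in so the check is easy to test
    """
    if indexed_at is not None and indexed_at >= source_updated_at:
        return False  # the index already reflects the latest update
    return now - source_updated_at > acceptable_latency


updated = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
indexed = datetime(2024, 1, 1, 11, 0, 0, tzinfo=timezone.utc)
now = updated + timedelta(seconds=30)

# The same 30-second-old update breaks a 10-second tolerance
# but is well within a 24-hour one.
print(latency_exceeded(updated, indexed, timedelta(seconds=10), now))  # True
print(latency_exceeded(updated, indexed, timedelta(hours=24), now))    # False
```

The same unreflected update is a problem under one tolerance and acceptable under the other, which is why the latency period needs to be agreed on before the enrichment process is designed.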
The next post in this series will address how the data enrichment process operates. This includes treating static data such as units of measure as well as designing and architecting connectivity with an enterprise enrichment source.