In the previous article of this series we gave a first look at the problem of transforming a LIMS data model into another and we discussed an approach that consisted in performing a characterization of entities first, followed by a high level mapping of objects between the source and the target LIMS. As you know, this is the problem we are facing while architecting the data migration from SQL*LIMS to LABVANTAGE 6.
In this article we’ll address the next level of abstraction and talk about data maps.
A data map is the description of a relationship between elements belonging to distinct data models. For the purpose of our discussion, “elements” here mean the database tables and columns of the source and target LIMS’ data models, while the data map is the design at the lowest level of detail of how data will be moved from one database to the other.
How to build a data map
Our starting point is the result of the high level mapping completed at the end of the previous phase. Each of the entities we have identified in that phase is analyzed to determine the group of tables that store information relevant to a full characterization of the entity itself.
For instance, when you decompose SQL*LIMS Specification into its database constituents, you will find that a full definition of a set of specification limits requires a group of seven tables. These tables must be mapped to those obtained by performing the same decomposition on the equivalent LABVANTAGE Specification object. Interestingly enough, LABVANTAGE too uses seven database tables to store specification limits.
Some of the characterizations adopted in the high level mapping will also be helpful here:
- semantic characterization analyzes the type of content of each table and finds where in the target the same type of content is stored
- relational characterization explores relationship between tables in the source data model (master-detail, foreign keys etc.) and looks for similarities in the target
When applied to Specifications this analysis allows to map, for example, SQL*LIMS tables NAIOT_SPECIFICATIONS (master) and NAIOT_SPEC_COMPONENTS (detail) to their LABVANTAGE counterparts SPEC and SPECPARAMITEMS.
As you may expect, the architects of SQL*LIMS and LABVANTAGE took different decisions when designing the two data models and 1:1 relationships between tables like the one described above are certainly the exception and not the rule.
Planning a safe journey for your data
The next and final step consists in transforming the table map into a column map: source and target tables are broken down into their column components and relationships between those are established mainly using semantic characterization.
This step completes the definition of the journey that data will make when travelling from their origin in the source LIMS to their final destination in the target. Most of the times the journey will consist of more than one leg as some manipulation may occur along the way: values may undergo type conversion (e.g. from number to text), could be used in lookups, duplicated and copied into multiple destination tables etc.
SQL*LIMS system owners can rest assured that the consistency of data will not be affected by these transformations. The migration from SQL*LIMS to LABVANTAGE is being designed ensuring the strictest integrity of data, avoiding changes in naming conventions (e.g. no abbreviations, concatenations etc.) and preserving the validation status of the migrated electronic records.
The final products of the data mapping process are interesting pieces of modern art like the one represented in the picture below, created through one of the most suitable tools among the few capable of building data map diagrams. As you can see in the image, the journey of data, although safe, is going to be a bit bumpy!
Maps and gaps
Not surprisingly, gaps are found while performing data mapping between two distinct data models. During table mapping, for instance, you may discover that a match cannot be found: very often this reveals a piece of functionality that is missing in the target application. During column mapping instead, you may hit into potential value truncation, whenever the destination field is not large enough to contain the value coming from the source.
One of the nicest best practices that we have adopted and I’d like to recommend is that the data migration is designed and built in parallel with the agile development of LABVANTAGE 6: gaps emerging from the analysis of the data migration are discussed during the product design sessions, prioritized and often closed already in the next subsequent iteration. This allows the data migration team to make progress with the design, build and testing of data migration procedures without waiting for the very final release of the product.
Following this process, outstanding gaps such as SQL*LIMS Attributes have been quickly addressed, resolved and added to the list of exciting features that are going to be in LABVANTAGE 6.
What’s left out of this short introduction to cross-LIMS data migrations is how the data migration will actually happen. This will be the topic of the third and final article of the series where I’ll bring on stage a character that plays a prominent role in LABVANTAGE’s data integration strategy. Regular readers of Vantage Point should be already familiar with it: our favorite ETL tool expressor.