What's The Big Deal about Data Migration?

Data migration refers to the movement of data from an old or legacy system to a new system. It is closely tied to data quality (“garbage in, garbage out”). It takes place during the transition from one solution to another, is typically multi-phased, and is a critical part of any IT deployment and legacy retirement (decommissioning of old systems, legacy data archiving). Data migration differs from data conversion (changing from one format to another) and data remastering (re-creating data in a new master format). The three can, however, be combined as part of deployment and transition strategies.

It is important to consider data migration and transition from the beginning of any IT-enabled project to avoid technical or functional gaps between the ‘as is’ and ‘to be’ processes. For instance, if the ‘to be’ solution does not allow for all legacy attributes or objects to be migrated, the business impact must be evaluated at the process and integration levels, and a risk assessment must be performed with downstream applications (e.g. interfaces between PLM and ERP systems will require alignment of data to support data release and change request processes).

Data migration typically follows an ETL pattern:

EXTRACT: data extraction from the source (old) system.
TRANSFORM: data mapping (via a common message model) and transformation rules between the source and the target (new) systems.
LOAD: data load into the target system.
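
As an illustration of the three steps above, here is a minimal Python sketch. The record layout, the field names (PART_NO, WT_LB, part_number, etc.) and the in-memory source and target are all assumptions made for the example. A real migration would read from and write to the actual legacy and target systems.

```python
# Hypothetical legacy records; field names are illustrative only.
LEGACY_PARTS = [
    {"PART_NO": " p-001 ", "DESC": "Bracket", "WT_LB": "2.0"},
    {"PART_NO": "P-002", "DESC": "Housing", "WT_LB": "11.5"},
]

# Mapping: target attribute -> (source attribute, transformation rule).
MAPPING = {
    "part_number": ("PART_NO", lambda v: v.strip().upper()),
    "description": ("DESC", lambda v: v.strip()),
    "weight_kg": ("WT_LB", lambda v: round(float(v) * 0.45359237, 3)),
}


def extract(source):
    """EXTRACT: copy records out of the source (old) system."""
    return [dict(row) for row in source]


def transform(rows):
    """TRANSFORM: apply the mapping and transformation rules to each record."""
    return [
        {target: rule(row[src]) for target, (src, rule) in MAPPING.items()}
        for row in rows
    ]


def load(rows, target):
    """LOAD: write records into the target (new) system, keyed by part number."""
    for row in rows:
        target[row["part_number"]] = row
    return len(rows)


target_system = {}
loaded = load(transform(extract(LEGACY_PARTS)), target_system)
```

Keeping the mapping as data rather than burying it in code makes it easier for business data owners to review the transformation rules before each dry-run.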

Preparing a migration plan includes the following considerations:

Data profiling: to ensure that any historical data due to be migrated is identified, suitable for the changes taking place in the organization, of a sufficient level of quality, etc. It is a critical initial step that provides comprehensive checks of the existing data model, data volumes, data formats and data conformance.
Data quality: to ensure that quality is defined and that elements, attributes and relationships in the source system conform to it. These rules are used during profiling to identify whether or not the data is of the correct quality and format, and ready to be extracted while maintaining the integrity and consistency of its structure.
Data cleansing: to ensure that data is available, accessible, relevant, accurate, complete and in the correct format as defined in the data quality standards (data verification process). The first stage is to define which cleansing rules will be carried out manually and which programmatically (automated), and to decide on the timing of data cleansing: before extraction from the source system (fixed by the business or the back office), during the ETL, or after load in the target system (fixed by the back office).
Data security: to ensure that there is no leakage of data which might compromise the competitive advantage of the business.
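
The profiling, quality and cleansing considerations above can be sketched in a few lines of Python. The attributes, quality rules and automated fixes below are hypothetical. The point is only the split between what rules can detect, what can be fixed programmatically, and what must go to a manual queue. Data security is not covered by this sketch.

```python
import re

# Hypothetical quality rules: attribute -> predicate returning True when valid.
QUALITY_RULES = {
    "part_number": lambda v: bool(re.fullmatch(r"P-\d{3}", v or "")),
    "description": lambda v: bool(v and v.strip()),
    "status": lambda v: v in {"released", "obsolete", "in_work"},
}

# Hypothetical automated cleansing rules; everything else needs manual review.
AUTO_FIXES = {
    "part_number": lambda v: (v or "").strip().upper(),
    "status": lambda v: (v or "").strip().lower().replace(" ", "_"),
}


def profile(records):
    """Count quality-rule violations per attribute across all records."""
    violations = {attr: 0 for attr in QUALITY_RULES}
    for rec in records:
        for attr, is_valid in QUALITY_RULES.items():
            if not is_valid(rec.get(attr)):
                violations[attr] += 1
    return violations


def cleanse(records):
    """Apply automated fixes, then split records into clean and manual-review lists."""
    clean, manual = [], []
    for rec in records:
        fixed = dict(rec)
        for attr, fix in AUTO_FIXES.items():
            fixed[attr] = fix(fixed.get(attr))
        failed = [a for a, ok in QUALITY_RULES.items() if not ok(fixed.get(a))]
        (manual if failed else clean).append((fixed, failed))
    return clean, manual


records = [
    {"part_number": "P-001", "description": "Bracket", "status": "released"},
    {"part_number": " p-002 ", "description": "Housing", "status": "In Work"},
    {"part_number": "P-003", "description": "", "status": "released"},
]

before = profile(records)         # violations found in the source data
clean, manual = cleanse(records)  # what automation fixed vs. what it could not
```

In this example the second record is repaired automatically, while the third (an empty description) lands in the manual list because no rule can invent the missing value.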

Data migration plans will be based on data impact analysis, migration strategies and data profiling. Data migration strategies will in turn be selected based on data quality, the amount of manual or automated cleansing needed, data complexity and business impact. The maturity of products, organizations and teams will also be considered in the decision-making process. Business change readiness will affect the level of agility or risk mitigation required. Manual or automated dual maintenance solutions might be required to allow for fall-back scenarios.

Transition strategies include two different approaches that have different impacts and implications from both the IT and the business (end user) perspectives.

Big bang data migration: small defined processing window while minimizing system downtime. This involves high pressure on data verification and sign-off. Robust dry-runs are required and fall-back scenarios can be complex (high risk). Few mature organizations ever do this.
Multi-stream, multi-step data migration: incremental approach, running the old and new systems in parallel and migrating data in phases. This involves a more complex data migration solution design, some dual maintenance (manual or otherwise), longer planning periods, and stronger business engagement and involvement (key business SMEs, data owners and experts, etc.). It allows de-risking and progressive transition of processes to the new solution. Testing and trial-and-error approaches become possible while validating live, production-like use cases. Changes made in both old and new systems will add complexity for synchronization, branching, etc. Data master and slave concepts will be required to ensure that the transition can be controlled and to keep the dual running period as short as possible. Data shared between products that are migrated separately or on demand will require special attention.
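
To make the master and slave idea in the incremental approach more concrete, here is a deliberately simplified Python sketch. The class name, the ‘old’ and ‘new’ labels and the synchronous one-way replication are all assumptions for illustration. Real dual-running setups have to deal with conflicts, latency and partial failures, which this sketch ignores.

```python
class DualRunRegistry:
    """Tracks which system is master for each product during the parallel run."""

    def __init__(self, products):
        # Every product starts out mastered in the old system.
        self.master = {p: "old" for p in products}
        self.data = {"old": {}, "new": {}}

    def cut_over(self, product):
        """Migrate one product: copy its data across and flip mastership."""
        self.data["new"][product] = dict(self.data["old"].get(product, {}))
        self.master[product] = "new"

    def write(self, system, product, attr, value):
        """Accept changes only in the master system, then replicate to the other."""
        if self.master[product] != system:
            raise PermissionError(
                f"{product} is mastered in the '{self.master[product]}' system"
            )
        for store in self.data.values():
            store.setdefault(product, {})[attr] = value

    def still_dual_running(self):
        """Products not yet cut over; an empty list means the old system can retire."""
        return [p for p, m in self.master.items() if m == "old"]


reg = DualRunRegistry(["engine", "gearbox"])
reg.write("old", "engine", "revision", "A")  # old system is master for now
reg.cut_over("engine")                       # phase 1: migrate the engine data
reg.write("new", "engine", "revision", "B")  # new system is now master
```

After the cut-over, a write to the engine data in the old system raises an error, while the old copy is still kept in step by replication. The list returned by still_dual_running() gives a simple measure of how much of the dual running period remains.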

Common mistakes and challenges in data migration projects include (non-exhaustive list):

Late inclusion of data migration requirements.
New IT solution designs that do not allow for business transition and production continuity.
Limited upfront data profiling.
Poor data quality, lack of data verification and lack of business ownership / engagement to resolve issues (many organizations are in denial of their data quality issues, sometimes for cultural or social reasons, which can be a hindrance to growth).
Underestimation of business rules and data complexity (volume, architecture, structure, compliance, security, diversity, decay, legal requirement, etc.).
Focus on getting the first data across, without enough consideration of use cases for subsequent data ‘synchronization’ requirements (the data migration project becomes an IT interface or integration project).
Expectation that data migration should be fully automated; experience shows that some manual data cleansing is inevitable.
Expectation that someone else in IT will solve the data migration issues with a ‘magic wand’.

Successful data migration projects require strong planning; due diligence on readiness and data quality; alignment of scope, ‘to be’ design and data migration design; budget contingency for fall-back scenarios; cost / benefit and risk impact / mitigation analysis; and robust data governance (on both the business and IT sides). This is even more relevant for Product Lifecycle Management (PLM) data migration, as data models are complex and legacy systems are often disconnected, out of date or discontinued. PLM data migration strategies depend on both IT governance and business deployment.

Well-executed PLM data migration strategies can yield significant business value and benefits, some of which take the form of cost avoidance, typically:

50% cost avoidance by using PLM offshore services for data profiling, data cleansing, automation tool development and operations.

50% reduction in migration issues.

In the manufacturing industry, data regulation and traceability requirements can be quite stringent: manufacturers must maintain strict security and traceability standards and keep historical records for long-term audit and warranty purposes. PLM data migrations are often ‘brownfield’ projects, and business process optimization and rationalization require lean data models combined with robust archiving strategies.

What are your thoughts?

This post was originally published on LinkedIn on 23 May 2015.

About the Author

Lionel Grealou

Lionel Grealou, a.k.a. Lio, helps original equipment manufacturers transform, develop, and implement their digital transformation strategies—driving organizational change, data continuity and process improvement, managing the lifecycle of things across enterprise platforms, from PDM to PLM, ERP, MES, PIM, CRM, or BIM. Beyond consulting roles, Lio held leadership positions across industries, with both established OEMs and start-ups, covering the extended innovation lifecycle scope, from research and development, to engineering, discrete and process manufacturing, procurement, finance, supply chain, operations, program management, quality, compliance, marketing, etc.