Legacy data can torpedo your PLM implementation

Selecting a PLM system can be a complicated project. But let’s imagine the moment you’ve made a choice and decided on your PLM strategy, vendor and products. So, you might think, the tough part of the job is over. Here is the thing… you still have one significant problem to overcome. Legacy data.

The problem is not specific to PLM. There are things you should not expect your computer system to do. Don’t expect a computer or a PLM system to clean up the mess in your company. Even the best data management and search technologies cannot do it. The mess must be organized before you can attempt to computerize it and bring in a PLM system to manage data and processes. If you don’t do it, you will wind up with a “computerized mess”.

The Razorleaf article The hidden cost of a free PLM migration gives excellent examples of the problems you can face when migrating data between two PLM systems – wrong file locations, duplicated data, broken files, wrong indication of relationships. In the example from the article, four types of mistakes in legacy data cost $230,000 in service fees to handle during the legacy data import.

The following passage is my favorite:

The fact of the matter is, the reason data loads are expensive and time consuming is not because there are no data load experts or that there are no good tools to help with data loads.  The reason they are expensive and time consuming is the data is complex, sometimes inaccurate, varied and often even undiscovered.  You see, if the data was 100% known, 100% consistent and 100% standard, we’d be happy to load it for free. It would probably take us an hour to setup and kick-off, and the good will that we would receive would make it worth our effort.  But it never happens like this. Never.

It made me think about the fundamental disconnect between PLM vendors and customers in the way technology is built and implemented. Customers are unaware of how messy their legacy data is. On the other side, vendors are not paying much attention to developing tools and technologies to cope with the legacy data problem. The problem is hard to solve, but I wanted to share some ideas about how to improve the situation.

Legacy data quality assessment tools

Imagine a data assessment tool you can run to get a quantified assessment of your existing data. Developing such a tool is not easy, but it is doable. You would get a rank of data quality. As soon as you have that number, you can apply it to the cost of the PLM project. Better data quality can discount the PLM project cost.
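
To make the idea more concrete, here is a minimal sketch (in Python) of what such an assessment could look like. It assumes a hypothetical legacy export in CSV form with columns named part_number, description and parent_ref – the column names and checks are illustrative, not taken from any specific PLM system – and it counts duplicates, missing descriptions and broken references to produce a simple quality score:

```python
import csv
from collections import Counter

def assess(path):
    """Very naive quality assessment over a hypothetical legacy CSV export."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    total = len(rows) or 1  # avoid division by zero on an empty export

    # Duplicate part numbers (blank part numbers are ignored here)
    part_numbers = [(r.get("part_number") or "").strip() for r in rows]
    counts = Counter(pn for pn in part_numbers if pn)
    duplicates = sum(c - 1 for c in counts.values() if c > 1)

    # Records with no description at all
    missing_desc = sum(1 for r in rows if not (r.get("description") or "").strip())

    # Parent references that point to a part number that does not exist
    broken_refs = 0
    for r in rows:
        ref = (r.get("parent_ref") or "").strip()
        if ref and ref not in counts:
            broken_refs += 1

    issues = duplicates + missing_desc + broken_refs
    score = max(0.0, 1.0 - issues / total)  # 1.0 = clean, 0.0 = total mess
    return {
        "records": total,
        "duplicates": duplicates,
        "missing_description": missing_desc,
        "broken_references": broken_refs,
        "quality_score": round(score, 2),
    }

if __name__ == "__main__":
    print(assess("legacy_items.csv"))
```

A number like this could then be plugged into the project estimate – the better the score, the smaller the data cleansing line item.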

PLM data modeling

Most PLM products demand that data cleansing is done before importing the data. Data storage is cheap these days. Importing the data first and cleaning and organizing it later can be an interesting alternative. It requires re-thinking of PLM system and architecture, which can be a challenge for existing systems.
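
As a sketch of what “import first, clean later” could mean for the data model (my own assumption, not how any particular PLM product is built), imagine every imported record keeping its raw legacy payload next to a quality status, so cleansing becomes an activity inside the new system rather than a gate in front of the import:

```python
from dataclasses import dataclass, field
from enum import Enum

class Quality(Enum):
    UNREVIEWED = "unreviewed"  # imported as-is, nobody has looked at it yet
    FLAGGED = "flagged"        # known issue: duplicate, broken link, missing data...
    CLEAN = "clean"            # reviewed and fixed inside the new system

@dataclass
class StagedItem:
    raw: dict                              # original legacy record, kept untouched
    part_number: str = ""                  # normalized fields, filled in during cleansing
    quality: Quality = Quality.UNREVIEWED
    issues: list = field(default_factory=list)

def import_record(raw: dict) -> StagedItem:
    """Accept any legacy record; flag problems instead of rejecting the import."""
    item = StagedItem(raw=raw, part_number=(raw.get("part_number") or "").strip())
    if not item.part_number:
        item.quality = Quality.FLAGGED
        item.issues.append("missing part number")
    return item

# Example: a broken legacy record still makes it into the system, but flagged for cleanup
print(import_record({"description": "bracket, no part number in the legacy export"}))
```

The catch, as noted above, is that the PLM architecture has to tolerate incomplete and conflicting records until they are cleaned, which is exactly the re-thinking most existing systems struggle with.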

What is my conclusion? Legacy data is a tough and complex problem. It can easily torpedo your PLM implementation by adding cost to the implementation. Planning for it upfront is a good idea. Developing a data quality assessment tool can be an interesting opportunity for service companies such as Razorleaf. In the future, PLM architectures with flexible data organization, capable of resolving data conflicts and other problems, can win over older dinosaurs. Just my thoughts…

Best, Oleg

Want to learn more about PLM? Check out my new PLM Book website.

Disclaimer: I’m co-founder and CEO of openBoM, developing a cloud-based bill of materials and inventory management tool for manufacturing companies, hardware startups and supply chain.

  • fguillaumin

    Hello Oleg! Fully agree. No doubt. But. Imagine the solution you propose is sufficient. What is the cost to develop and maintain it, and how many people do you need to help maintain it? Compared to $230,000?
    Companies in fact know. But they also know the effort it takes to maintain that level of quality over time. And most probably you need to think about the quality tools as soon as you put your new system in production, while you are still very busy “tuning” the new system…
    In fact, the cost of the mess is much, much higher. Even without any migration cost, the mess impacts the everyday work of users: not finding the right data at the right time, duplicated data, missing data (don’t forget missing data). The mess can torpedo your PLM implementation much earlier than the moment you have to migrate your data to the next PLM system.

  • beyondplm

    Francois,

    Thanks for your comments and thoughts! My idea is probably bad and it will cost a lot to develop such a solution. An alternative is what we have today – companies like Razorleaf explaining to customers why they need to pay for data cleansing. It might be good enough.

    Btw, it is not clear to many CxO people in manufacturing organizations that they need to clean up the mess. Especially in small companies, and in cases where an organization has been functioning that way for decades. They will tell you – “so what? we have a mess… but we can operate”. Sometimes it even feels good – we are heroes and we are going through this mess to release products on time.

    Just my opinion, of course 🙂
    Best, Oleg

  • Michael

    $230k is very small compared to what my company is currently spending just to get various CAD files from 9 of 11 site network drives into Autodesk Vault (a replicated Vault). Then there are nearly 4 million flat files to scan, name and add properties to prior to adding those files to the Vault.

    We currently have 2.6 million files in the Vault and are adding 4-5k/week via autoloading in those 9 sites. The last 2 sites not yet loading have a combined 1.2 million files to autoload and an estimated 500k files to scan.

    In just the last 8 months it’s cost $800k and we anticipate another $1.5 million in the next 2 years to complete it all.

    Lessons learned:
    1. Never let IT update the SQL without warning first.
    2. Never let IT update the OS on the Vault or SQL server.
    3. Resyncing Vault is slow above 2 million files.
    4. It takes 1 full time person to be a Vault cop.
    5. 6 gig internet systems between plants and the master vault work really well.
    6. Anything less than 6 gig, not so well due to shared services on that line.
    7. Forget about enforcing unique file names. Email fights ensue.
    8. Vault cop is a 20 hr a day job.
    9. Vault Admin is a 30 hr a day job for 4 days. No one has problems on Friday (it seems).

  • beyondplm

    Michael, thanks for sharing your example! It is indeed an example of a very complicated “legacy data” import combined with some infrastructure issues, as you indicated. In my experience, most “import tools” are not good when you need to bring in a lot of data, and custom techniques are problematic and expensive, as you explained.