Graph Databases, Unlocking Data, and Building Linked Data Layers in PLM Software

July 5, 2021
Oleg

Data is at the center of digital transformation projects. Questions about how to organize data efficiently and how to provide access to it are top of mind in many companies developing engineering and manufacturing systems. So I was not surprised when my article, Will Graph Database become a future PLM silver bullet?, about graph databases, intelligence, and improved data management, raised questions and debate. I also noticed a possible confusion: the term “graph database” is sometimes used interchangeably with network-based (or graph-based) solutions. My main point was that a graph database is just a database, and by itself it doesn’t solve the problems of data silos, data locking, and connected lifecycle.

Here is a passage from Thomas Kamps, owner of CONWEAVER GmbH, the outfit developing Linksphere, the low code big graph platform, which is described as follows.

A configurable Full Stack Graph platform to deploy and maintain applications for graph-typical use cases related to data context analysis. In an agile process, Linksphere Graph applications can be created fast and flexibly with the Linksphere Workbench, a configuration environment with over 300 standardized modules. They are made available in a high-performance, scalable, consistent and up-to-date manner.

According to Thomas Kamps, the main issue is that data is scattered across multiple systems, which is why the big PLM vendors’ attempts to hold all the data in their own system world have been unsuccessful.

…the main issue is that data are scattered in different data systems and this is in my opinion responsible for the unsuccessful attempts of the big PLM vendors you have mentioned because they would like to have all data in their world. A successful strategy at least if I look at our customers is to create a linked data layer on top of the authoring systems that connects the lifecycle or parts of it in the beginning. Since the mathematical structure of graphs is the native way for the representation of linked data it makes a lot of sense to make use of graphs for their representation. The more difficult part is, however, how the graphs get automatically computed from the authoring systems, this is why a graph database is important but only half the way because it can only represent linked data if they are already there. To compute the linked data you need very effective analytics. Whether PLM vendors were successful or not? What the heck! Who cares?

According to Thomas Kamps’s strategy, decoupling the data from the authoring applications and placing it in a separate graph database unlocks the data from vendors and provides a layer for developing analytic applications, as opposed to relying on PDM authoring systems, which are too narrow to support linking strategies.

If you decouple the linking strategy you can start wherever you like in the lifecycle and do crosslinking without being bound to a specific system. But more important is the framework with which graph applications can be handled, because we want to go beyond PLM 😊 and provide graph based applications for other lifecycles (asset lifecycle, customer lifecycle, data lifecycle etc). To do this you need abstraction from PDM systems and if the future is data linking on a corporate level then PDM authoring systems as the basis for linking strategies is too narrow.
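To make the idea of a linked data layer more concrete, here is a minimal sketch in Python. It is my own illustration and not how Linksphere works. The three record sets, their field names, and the build_link_layer function are all hypothetical, and the linking rule (an exact match on a shared part number) is a deliberate simplification of the "very effective analytics" mentioned in the quote above.

```python
from collections import defaultdict

# Hypothetical exports from three authoring systems. All ids, field names,
# and values are invented for illustration.
cad_records = [
    {"id": "cad-1", "part_number": "PN-100", "file": "bracket.sldprt"},
    {"id": "cad-2", "part_number": "PN-200", "file": "housing.sldprt"},
]
erp_records = [{"id": "erp-7", "part_number": "PN-100", "cost": 4.20}]
service_records = [{"id": "svc-3", "part_number": "PN-100", "issue": "corrosion"}]


def build_link_layer(sources, key="part_number"):
    """Link records from different systems that share the same key value.

    Returns an adjacency map: node -> set of linked nodes, where a node is
    a (system_name, record_id) tuple. The source records are only read,
    never copied or modified, so the layer sits on top of the systems.
    """
    # Group nodes by the value of the shared key.
    by_key = defaultdict(list)
    for system, records in sources.items():
        for record in records:
            if key in record:
                by_key[record[key]].append((system, record["id"]))

    # Connect every pair of nodes that ended up in the same group.
    links = defaultdict(set)
    for nodes in by_key.values():
        for a in nodes:
            for b in nodes:
                if a != b:
                    links[a].add(b)
    return dict(links)


links = build_link_layer(
    {"cad": cad_records, "erp": erp_records, "service": service_records}
)
print(sorted(links[("cad", "cad-1")]))
```

The links live outside the source systems and refer to records only by system name and id, which is what lets you start anywhere in the lifecycle. The hard part in practice is that a clean shared key like this rarely exists.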

The idea of developing applications that handle data independently from PDM/PLM systems is great, but it is not new. Extracting data and turning it into intelligence is powerful and fascinating. The pattern has been used by many vendors with great results. I think this idea has a big future, but the main challenge in building such a system is bringing in the right data from multiple sources. In reality, there is a huge amount of data, and it changes with high velocity. To make it happen, technologies for fast data acquisition must be developed, and the applications holding the data must become open enough to expose it as a data source. Both have many challenges, technological and organizational.

Does Graph Database Solve The Problem of Data Silos?

The question is whether a graph database solves the problem of data availability across multiple silos. From a certain standpoint, if the data is indeed moved to the graph database, it becomes more accessible. However, this raises additional questions. Let’s start with graph database availability and openness. Remember when SQL databases came to replace proprietary databases in the 1980s and early 1990s? The arguments for data transparency and SQL data standards sounded exactly like today’s arguments for graph databases. Technically, everything depends on how the database is organized. A graph database can have a proprietary structure and be just as hard to access as the big vendors’ PLM solutions. So what matters is the data management layer, the API, and openness, not the database behind the solution.
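A small sketch can illustrate why the API matters more than the storage engine. Everything below is hypothetical: the ItemStore interface, the two in-memory stand-ins for a relational and a graph backend, and the export_item function are invented for this example and do not represent any real PLM product or database driver.

```python
from typing import Dict, List, Protocol


class ItemStore(Protocol):
    """The open contract that consumers program against (hypothetical)."""

    def get_item(self, item_id: str) -> Dict: ...

    def related(self, item_id: str) -> List[str]: ...


class RelationalStore:
    """In-memory stand-in for a relational backend: rows plus a link table."""

    def __init__(self):
        self.items = {"A": {"name": "Assembly"}, "B": {"name": "Bolt"}}
        self.item_links = [("A", "B")]  # (parent, child) rows

    def get_item(self, item_id):
        return self.items[item_id]

    def related(self, item_id):
        return [child for parent, child in self.item_links if parent == item_id]


class GraphStore:
    """In-memory stand-in for a graph backend: nodes plus adjacency lists."""

    def __init__(self):
        self.nodes = {"A": {"name": "Assembly"}, "B": {"name": "Bolt"}}
        self.edges = {"A": ["B"], "B": []}

    def get_item(self, item_id):
        return self.nodes[item_id]

    def related(self, item_id):
        return list(self.edges[item_id])


def export_item(store: ItemStore, item_id: str) -> Dict:
    """What a consumer sees depends on the API, not on the storage engine."""
    return {"item": store.get_item(item_id), "related": store.related(item_id)}


print(export_item(RelationalStore(), "A") == export_item(GraphStore(), "A"))
```

From the outside, both backends look identical. If export_item (or something like it) is not offered to customers, then swapping the relational store for a graph store changes nothing about how locked the data is.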

From PLM databases to Online Data Services

The power of the data must be unlocked. There is no doubt about this need. However, organizing data and making it open really starts with the data sources, not with another layer. Making PLM systems open, changing the business model of holding data hostage, and making data available as a service are the things that can help us make a change. Open PLM data services are the first step.

Modern PLM systems leverage the polyglot persistence model and use multiple database technologies (including graph databases) to organize data and manage relationships. It will be interesting to see how including graph databases in the PLM stack can make these systems more intelligent.

What is the role of Graph Database in PLM Stack?

Graph databases provide a very robust way to manage relationships, and PLM vendors are starting to discover it. There are already examples of graph database usage today, and I expect more sooner rather than later. It will improve the ability of PLM systems to deliver data intelligence and analytics. But a graph database won’t immediately impact data openness: if the PLM system is closed, then it doesn’t matter which database is used (graph or relational).
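As an example of the kind of relationship question that fits a graph model naturally, here is a sketch of a "where used" query: given a component, find everything that contains it directly or indirectly. This is plain in-memory Python, not a graph database query language, and the product structure and the where_used function are made up for illustration.

```python
from collections import deque

# Hypothetical "uses" relationships: parent item -> list of child items.
uses = {
    "bike": ["frame", "wheel"],
    "wheel": ["rim", "spoke", "bolt"],
    "frame": ["tube", "bolt"],
}


def where_used(component, uses):
    """Return every item that directly or indirectly contains `component`."""
    # Invert the edges so we can walk upward: child -> set of parents.
    parents = {}
    for parent, children in uses.items():
        for child in children:
            parents.setdefault(child, set()).add(parent)

    # Breadth-first walk up the structure, visiting each item once.
    found = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return found


print(sorted(where_used("bolt", uses)))  # ['bike', 'frame', 'wheel']
```

A graph database makes this sort of traversal a first-class query instead of a chain of self-joins. That is a real functional gain, but note that it only helps whoever is allowed to run the query.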

What is my conclusion?

Data is a big challenge and a big opportunity. A graph database provides a much better data model for linking data, but it doesn’t solve the problem of data openness and connectivity between systems. A more fundamental, structural change needs to happen to move engineering and manufacturing software vendors toward open data. A graph database is yet another database. A combination of business model and technological shifts will make data more open and enable future data availability and the development of business applications. Just my thoughts…

Best, Oleg

Disclaimer: I’m co-founder and CEO of OpenBOM developing a digital network-based platform that manages product data and connects manufacturers, construction companies, and their supply chain networks. My opinion can be unintentionally biased.