The world is slowly but surely getting hooked on data. Finding the right way to manage data is no less important than figuring out the user experience or the business model. Yet you rarely hear PLM vendors talk about different database technologies. You may ask me why. Well… my hunch is that the vast majority of PLM systems (including cloud and SaaS) are built on top of traditional SQL database technologies. While PLM vendors usually don’t speak about problems with SQL databases, the complexity of many PLM data models that manage structured data is well known. So, can PLM technologies discover a magic silver bullet to unlock the future of PLM data management?
According to the Ganister PLM article Comparing SQL vs Cypher, the magic bullet is the graph database. Read the article, which compares two sample demo queries showing how to run structure extraction in SQL and in Cypher. The point Ganister makes (it presents itself as a PLM built on top of a graph database) is simple: graph databases, and specifically Neo4j, might be the next big thing that changes the way PLM systems manage and query data.
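To make the comparison more concrete, here is a minimal sketch of a structure extraction (a BOM explosion) written as a recursive SQL query and run against an in-memory SQLite database. These are not the queries from the Ganister article. The `bom` table, its `parent` and `child` columns, the `Part` label, and the `HAS_CHILD` relationship are all hypothetical names I chose for illustration, and the Cypher text is included only as a string for side-by-side reading; it is not executed here.

```python
import sqlite3

# Hypothetical Cypher counterpart, shown for comparison only (not executed).
CYPHER_SKETCH = """
MATCH path = (root:Part {id: $root})-[:HAS_CHILD*]->(c:Part)
RETURN c.id AS part, length(path) AS depth
ORDER BY depth, part
"""

# Recursive common table expression that walks the parent/child table.
SQL_SKETCH = """
WITH RECURSIVE tree(part, depth) AS (
    SELECT child, 1 FROM bom WHERE parent = ?
    UNION ALL
    SELECT b.child, t.depth + 1
    FROM bom AS b JOIN tree AS t ON b.parent = t.part
)
SELECT part, depth FROM tree ORDER BY depth, part
"""


def explode_bom(conn, root):
    """Return (part, depth) rows for every part below `root`."""
    return conn.execute(SQL_SKETCH, (root,)).fetchall()


# Tiny made-up product structure: a bike with a frame and a wheel.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bom (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO bom VALUES (?, ?)",
    [("bike", "frame"), ("bike", "wheel"), ("wheel", "rim"), ("wheel", "spoke")],
)

rows = explode_bom(conn, "bike")
print(rows)  # [('frame', 1), ('wheel', 1), ('rim', 2), ('spoke', 2)]
```

Both versions return the same multi-level structure. The difference is in how much of the traversal logic you have to spell out yourself: the SQL version needs an explicit recursive join, while the Cypher pattern expresses the variable-depth path directly.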
Databases in PLM: A bit of history…
A couple of decades ago, data was something that everyone simply needed to get in order. PDM and PLM systems were initially invented with the mission to manage file records and keep them under control. But as the amount of data grew, companies found that data is the biggest source of intelligence a business can leverage. Global internet companies proved that gigantic businesses can be built by figuring out how to monetize the value of data, and network-based systems demonstrated the value of data connections.
Back in the 1990s, the SQL database was automatically assumed to be the only reasonable way to manage data, and IT used it as a validation checkmark for any enterprise software that required database technology. In other words, using the right database technology (e.g. DB2, Oracle, MS SQL) was the foundation of IT approval.
From Polyglot Programming to Polyglot Persistence
Back in the 1990s, developers had to decide what programming language to use, and debates about C, C++, Delphi, or Java were everywhere. But the introduction of dynamic libraries, and later of service-oriented architecture, ended those debates. Nobody is looking for the single best programming language anymore. Software is developed using multiple programming languages and tools, each one being the right tool for a particular task (server, front-end, drivers, etc.).
Databases are going through the same journey that programming languages went through in the past. The internet, open source, and the last two decades of web development brought a large number of new database technologies. So-called NoSQL databases of different flavors came out, and cloud/web architecture allowed companies to hide which database technologies they use. A new term, “polyglot persistence”, came out to express the idea that a database is just another tool that can be used alongside different programming languages (polyglot programming) to develop efficient data management solutions.
Databases, Data Management and Microservice Architecture
Back in 2013, I published the article PLM and Data Management in the 21st century, where I compared different types of databases and how they can be used. Check it out. My main conclusion was that different databases offer specific value in building data management solutions and that no single one can claim to solve all problems.
Combine multiple database approaches with microservice architecture and you get the modern tech stack used in many cloud-based and SaaS applications today. There is no need to pick one database for everything. In the example of Ganister PLM, the Bill of Materials (structure) service can use Neo4j, while other microservices can provide storage or metadata management.
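As a rough illustration of the idea, the sketch below splits product data between two services, each free to use its own storage technology. Everything in it is an assumption made for the example: the class names, the method names, and the in-memory dictionaries that stand in for a graph database and a metadata store. It is not how Ganister or any other vendor implements it.

```python
from dataclasses import dataclass, field


@dataclass
class StructureService:
    """Stand-in for a graph-backed BOM service (a dict replaces the graph DB)."""
    children: dict = field(default_factory=dict)

    def add_child(self, parent, child):
        self.children.setdefault(parent, []).append(child)

    def descendants(self, root):
        # Depth-first walk; assumes the structure has no cycles.
        found = []
        stack = list(reversed(self.children.get(root, [])))
        while stack:
            part = stack.pop()
            found.append(part)
            stack.extend(reversed(self.children.get(part, [])))
        return found


@dataclass
class MetadataService:
    """Stand-in for a metadata service backed by a different kind of store."""
    records: dict = field(default_factory=dict)

    def put(self, part, **attributes):
        self.records[part] = attributes

    def get(self, part):
        return self.records.get(part, {})


def bom_report(structure, metadata, root):
    """Combine both services: walk the structure, then look up each part."""
    return [
        (part, metadata.get(part).get("description", ""))
        for part in structure.descendants(root)
    ]


structure = StructureService()
structure.add_child("bike", "frame")
structure.add_child("bike", "wheel")
structure.add_child("wheel", "rim")

metadata = MetadataService()
metadata.put("frame", description="Aluminium frame")
metadata.put("wheel", description="Front wheel")

report = bom_report(structure, metadata, "bike")
print(report)
# [('frame', 'Aluminium frame'), ('wheel', 'Front wheel'), ('rim', '')]
```

The caller of `bom_report` never learns which database sits behind each service, which is what makes it possible to swap or combine storage technologies per task.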
What databases are used in different PLM systems?
PLM vendors don’t expose this information very often, so I cannot provide sources and rely mostly on my own knowledge. If you have different information, please share it in the comments. The majority of older PLM systems run on SQL databases and originally leaned toward supporting multiple databases. Most of them run on Oracle (Teamcenter, Windchill), and some run on the Microsoft database stack or on other modern SQL alternatives (e.g. PostgreSQL). Aras runs exclusively on MS SQL Server. Some SaaS PLM systems (e.g. Arena) also run on SQL databases such as Oracle. For systems hosted on IaaS such as AWS, there is an opportunity to use a managed database service such as AWS RDS. Cloud platforms and SaaS tools such as Autodesk Forge, PTC Atlas, and OpenBOM use multiple databases (among them MongoDB, Neo4j, and others).
What is my conclusion?
Database technology is a tool these days. I don’t see a single database as a magic solution to all PLM problems. Back in the old days, PLM vendors tried to use object-oriented databases because they provided a good abstraction model for PLM data. It didn’t work. I think history is repeating itself with graph databases. But SaaS and cloud architecture give graph databases real hope, because they won’t be used alone. I love the expressiveness of Neo4j and Cypher, but I’d be more interested in discussing the entire data management architecture needed to provide a scalable and global PLM platform. Neo4j is a very promising player in building modern data architectures, but it won’t play alone. The entire data architecture must be built to create a global data management platform. Just my thoughts…
Best, Oleg
Disclaimer: I’m co-founder and CEO of OpenBOM developing a digital network-based platform that manages product data and connects manufacturers, construction companies, and their supply chain networks. My opinion can be unintentionally biased.