Graphs and networks are fascinating. The last two decades of technological development have shown how powerful connections can be. From the internet to social networks, and from single servers to data centers, we can see how networks bring power, performance, rich semantics, intelligence, and many other characteristics. Network-based business is expanding across a variety of industries, and for a good reason: a network is much more powerful than any single element in it.
PLM vendors are following the network trend too. Check PLM websites and you will find digital thread visualizations showing connected pieces of information. It is powerful and makes tons of sense from a technological, a business, and also a marketing standpoint. Just 10-15 years ago, PDM/PLM programs demonstrated naked tables of data and 3D models. It was very boring, especially compared to the nice visualizations of CAD developers. Not anymore… Now PLM developers can present graphs and networks, which are really cool and powerful not only for business but also for marketing.
Earlier this week, I shared my thoughts about Graph Databases – Will Graph Database Become A Future PLM Silver Bullet? Thanks to all my readers for the great comments and discussion online and offline. It made me think that a broader conversation would be beneficial, because graphs, network-based solutions, and Graph Databases are not exactly the same thing, and this is where a lot of the confusion comes from.
Graphs and Networks
There is no sharp line between graphs and networks. If you check the literature about data, networks, and graphs, you will find that although the terminology of graphs and networks is different, the models behind both are very similar.
Network terminology is generally used in situations where you want to think of transporting/sending things along with the links between nodes, whether those things are physical objects (road networks and rail networks) or information (computer networks and social networks). Graph terminology is more often used in situations where you want the edges/links to represent other types of relationships between the vertices/nodes. An example that has gotten some attention recently is the “interest graph” in which the vertices are people and topics, and each edge links a person to a topic that they are interested in. You might say that a social network should really be called a graph since we often think of it in terms of the relationships between people rather than the status updates and tweets that get sent between them. In practice, there’s no precise rule for deciding which terms to use, but luckily it isn’t too hard to keep up with both types of terminology.
According to this terminology, graphs are more about relationships and networks are more about transportation. In practice, I find that both terms are used these days with practically the same meaning and for similar conceptual and technological purposes.
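To make the "interest graph" idea concrete, here is a minimal sketch in Python. The people and topics are invented for illustration, and the adjacency map is just one simple way to hold such a graph, not a recommendation for a real system.

```python
from collections import defaultdict

# Hypothetical sample data: (person, topic) pairs, invented for illustration.
interests = [
    ("alice", "cad"),
    ("alice", "plm"),
    ("bob", "plm"),
    ("carol", "graphs"),
]

# Undirected graph stored as an adjacency map: vertex -> set of neighbors.
graph = defaultdict(set)
for person, topic in interests:
    graph[person].add(topic)
    graph[topic].add(person)


def shared_interest_peers(graph, person):
    """Return the people who share at least one topic with `person` (two hops away)."""
    peers = set()
    for topic in graph[person]:
        peers |= graph[topic]
    peers.discard(person)
    return peers


print(sorted(shared_interest_peers(graph, "alice")))  # ['bob']
```

The same structure could be called a network if we cared about what flows along the links, which is exactly why the two terms overlap so much.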
Graph Databases (aka GraphDB)
A graph database is a data storage and data management technology designed to treat relationships as being as important as the data itself. A typical SQL database focuses on the data (tables). You can model relationships using data, but the data will always come first. Graph Databases are different: they treat relationships and data as equally important. Read more about GraphDBs here.
In computing, a graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data.[1] A key concept of the system is the graph (or edge or relationship). The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority. Querying relationships is fast because they are perpetually stored in the database. Relationships can be intuitively visualized using graph databases, making them useful for heavily interconnected data.[2]
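A minimal sketch of the nodes, edges, and properties model described above, written in plain Python rather than against any real graph database. The item numbers, the USES relationship type, and the function names are all hypothetical. A real graph database stores the adjacency with the nodes, so it does not need to scan an edge list the way this toy version does.

```python
# Nodes: id -> properties. All identifiers here are invented for illustration.
nodes = {
    "A-100": {"kind": "assembly", "name": "Bike"},
    "P-200": {"kind": "part", "name": "Wheel"},
    "P-300": {"kind": "part", "name": "Spoke"},
}

# Edges: (source, relationship type, target, edge properties).
edges = [
    ("A-100", "USES", "P-200", {"qty": 2}),
    ("P-200", "USES", "P-300", {"qty": 32}),
]


def neighbors(node_id, rel_type):
    """Follow outgoing edges of one relationship type from a node."""
    return [
        (dst, props)
        for src, rel, dst, props in edges
        if src == node_id and rel == rel_type
    ]


def descendants(node_id, rel_type="USES"):
    """Collect every node reachable from `node_id` via `rel_type` edges."""
    found = []
    stack = [node_id]
    while stack:
        current = stack.pop()
        for dst, _props in neighbors(current, rel_type):
            if dst not in found:
                found.append(dst)
                stack.append(dst)
    return found


print(descendants("A-100"))  # ['P-200', 'P-300']
```

Notice that the relationship carries its own properties (the quantity), which is what "relationships are as important as data" means in practice.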
Graph databases are not new, but they are trending, and the main reason is the huge demand for building connections, semantics, intelligence, and other algorithms founded on graphs and networks. Graph Databases have a lot of advantages, but I’d be careful about calling them a universal solution. A few years ago, I published an article about polyglot persistence and different types of databases – PLM and Data Management in the 21st century. My main conclusion is that databases are becoming more of a tool and can be chosen based on what needs to be done for a specific solution. Also, a microservice architecture makes it practical to use multiple databases together in a single solution.
Neo4j does a great job explaining the value of Graph Databases. Here is a slide from Neo4j comparing the value of different databases.
Network-Based Solutions and PLM Architecture
The entire notion of PLM is very much related to relationships and connections. Although PLM is about data management, lifecycle, and many other topics, the idea of connecting pieces of information and silos is strongly emphasized in all PLM implementations and products. Does it mean a Graph DB is an ideal technology for PLM? As appealing as it sounds, not so fast. While relationships are extremely important, PLM systems have to manage a wide range of information.
Let’s talk about the entire data management architecture in PLM solutions. A typical PLM system has to manage data, not only connections. Over the last few decades of PLM development, each PLM platform developed its own data abstraction mechanism, which includes a way to manage objects, attributes, files, relationships, processes, and many other elements. As important as relationships are, the other elements of the data are equally important. Can systems manage relationships using normal SQL databases, key-value stores, document stores, and other NoSQL databases? Technically, it is possible. Can a Graph Database be used to manage an entire data set as a property graph? Yes, it is possible too. Which one will be more efficient depends on the specific implementation and is outside the scope of such a short article.
While some people assume PLM is a single system, product, and technology, in real life it is different. Each company usually figures out its own technological stack, and for many companies, especially large OEMs, the solution is combined from multiple tools and technologies. This suggests extracting data from PLM and other systems to develop specific vertical solutions. The idea of these systems is not very new, and such solutions can be very valuable for a specific business function. I found that Neo4j, one of the leading Graph Database vendors, is making substantial efforts to popularise the idea of developing graph-based solutions for specific business purposes.
One of the challenges is actually getting the data, which is hidden behind multiple PLM databases and legacy solutions, and developing a robust data model to manage and sync it. The complexity is in the data integration and synchronization needed to extract the data and keep it updated between PLM and graph databases. It is complex, but not impossible. With the growing demand for data openness and data availability, I expect to see the industry moving towards easier ways to extract and transform the data.
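To illustrate the point that a normal SQL database can manage relationships too, here is a small sketch using SQLite from the Python standard library. The `item` and `uses` tables and their contents are hypothetical and do not reflect any PLM vendor's schema. A recursive query walks a parent-child structure of any depth.

```python
import sqlite3

# In-memory database with a hypothetical schema: items plus a "uses" link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE uses (parent TEXT, child TEXT, qty INTEGER);
    INSERT INTO item VALUES ('A-100', 'Bike'), ('P-200', 'Wheel'), ('P-300', 'Spoke');
    INSERT INTO uses VALUES ('A-100', 'P-200', 2), ('P-200', 'P-300', 32);
""")

# Recursive common table expression: start from the direct children of one
# item, then keep joining the link table to reach deeper levels.
rows = conn.execute("""
    WITH RECURSIVE tree(id, depth) AS (
        SELECT child, 1 FROM uses WHERE parent = ?
        UNION ALL
        SELECT u.child, tree.depth + 1
        FROM uses AS u JOIN tree ON u.parent = tree.id
    )
    SELECT tree.id, item.name, tree.depth
    FROM tree JOIN item ON item.id = tree.id
    ORDER BY tree.depth
""", ("A-100",)).fetchall()

print(rows)  # [('P-200', 'Wheel', 1), ('P-300', 'Spoke', 2)]
```

It works, but the relationship lives in a separate table and every hop is a join. Whether that is a problem depends on the data volume and the queries, which is why I would not generalize from a toy example.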
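The synchronization step can be sketched in a few lines, assuming the PLM system can export flat parent/child rows. The row format and the function names below are my assumptions for illustration, not the API of any specific product.

```python
def rows_to_edges(rows):
    """Turn flat parent/child rows into a set of directed edges."""
    return {(row["parent"], row["child"]) for row in rows}


def plan_sync(graph_edges, extracted_rows):
    """Return (edges_to_add, edges_to_remove) so the graph matches the extract."""
    extracted = rows_to_edges(extracted_rows)
    return extracted - graph_edges, graph_edges - extracted


# Edges currently held in the graph store (hypothetical data).
graph_edges = {("A-100", "P-200"), ("P-200", "P-300")}

# Latest rows extracted from the PLM system (hypothetical data).
extracted_rows = [
    {"parent": "A-100", "child": "P-200"},
    {"parent": "P-200", "child": "P-400"},
]

to_add, to_remove = plan_sync(graph_edges, extracted_rows)
print(sorted(to_add))     # [('P-200', 'P-400')]
print(sorted(to_remove))  # [('P-200', 'P-300')]
```

The real difficulty is everything around this diff: access to the source data, identity matching between systems, revisions, and timing.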
What is my conclusion?
Connections and relationships will play an important role in the future of PLM architecture. Companies are actively looking at how to explore the value of data, and Graph Databases will play an important role in these future solutions. Modern microservice architectures, often used in SaaS solutions, are a great foundation for using Graph Databases specifically for intelligence and analytics. However, a GraphDB by itself won’t make future PLM solutions different, because as important as the network and graph functions are, a GraphDB is still just another database. What is important is the entire solution stack, the value of the functions, and the business model that allows the solution to grow in the market. Just my thoughts…
Best, Oleg
Disclaimer: I’m co-founder and CEO of OpenBOM developing a digital network-based platform that manages product data and connects manufacturers, construction companies, and their supply chain networks. My opinion can be unintentionally biased.