One of the most complicated parts of any PLM implementation is data modeling. Depending on the PLM vendor, product, and technology, the process may go by different names, but fundamentally you will find it in every PLM implementation. It is the process of creating an information model of the products and processes of a specific company. Getting it done is not simple and requires a lot of preparation work, which is usually part of implementation services. Moreover, once created, the data model needs to be extended with new data elements and features.
Is there a better way? How do other industries and products solve similar problems of data modeling and data curation? It made me think about the web and the internet as a huge social and information system. How are data models managed on the web? How do large web companies solve these problems?
One example of creating a model for data on the web was Freebase. Google acquired Freebase and used it as one of the data sources for the Google Knowledge Graph. You can catch up on my post about why PLM vendors should learn about Google Knowledge Graph. Another attempt to create a model for web data is Schema.org, which is very promising in my view. Here is my earlier post about Schema.org – The future of Part Numbers and Unique Identification. Both are examples of curating data models for web data. The interesting part of Schema.org is that several web search vendors have agreed on some elements of the data model, as well as on how to curate and manage Schema.org definitions.
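To make the Schema.org idea a bit more tangible, here is a minimal sketch of how a part could be described with the shared vocabulary. It builds a JSON-LD document using the Schema.org `Product` type and its `mpn` (manufacturer part number) property. The function name `part_to_jsonld` and the sample part data are my own assumptions for illustration; they do not come from any PLM system.

```python
import json


def part_to_jsonld(part_number, name, manufacturer):
    """Describe a part as a Schema.org Product in JSON-LD (illustrative helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # "mpn" is the Schema.org property for a manufacturer part number.
        "mpn": part_number,
        "manufacturer": {"@type": "Organization", "name": manufacturer},
    }


if __name__ == "__main__":
    # Hypothetical part data, invented for this example.
    doc = part_to_jsonld("BRK-1001", "Mounting bracket", "Example Manufacturing Co.")
    print(json.dumps(doc, indent=2))
```

The point is not the code itself but the agreement behind it: every system that understands the Schema.org vocabulary reads `mpn` the same way, without a separate mapping exercise.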
However, it looks like manual curation of the Google Knowledge Graph and Schema.org is not an approach that makes web companies happy, or one that will let them leapfrog in the future. Manual work is expensive and time-consuming. At least some people are thinking about that. The Dataversity article “Opinion: Nova Spivack on a New Era in Semantic Web History” speaks about some interesting opportunities that could open a new page in the way data is captured and modeled. Spivack discusses possible future trajectories for deep learning, data models, and relationship detection. These could extend Schema.org, especially in the part related to automatically generated data models and classifications. Here is my favorite passage:
At some point in the future, when Deep Learning not only matures but the cost of computing is far cheaper than it is today, it might make sense to apply Deep Learning to build classifiers that recognize all of the core concepts that make up human consensus reality. But discovering and classifying how these concepts relate will still be difficult, unless systems that can learn about relationships with the subtly of humans become possible.
Is it possible to apply Deep Learning to relationship detection and classification? Probably yes, but this will likely be a second phase after Deep Learning is first broadly applied to entity classification. But ultimately I don’t see any technical reason why a combination of the Knowledge Graph, Knowledge Vault, and new Deep Learning capabilities, couldn’t be applied to automatically generating and curating the world’s knowledge graph to a level of richness that will resemble the original vision of the Semantic Web. But this will probably take two or three decades.
This article made me think that manual data curation for Freebase and Schema.org is a very similar process to what many PLM implementers do when they apply specific data and process models using PLM tools. Yes, PLM data modeling usually happens for a specific manufacturing company. At the same time, PLM service providers reuse elements of these models. Companies are also interconnected and work together. Communication between companies is painful and still requires some level of agreement between manufacturing companies and suppliers.
What is my conclusion? Data modeling is an interesting problem. For years, PLM vendors have put significant focus on making flexible tools that help implementers create data and process models. Flexible and dynamic data models are in high demand among customers, and this is one of the most important technological elements of every PLM platform today. New forms of computing and new technologies may come along and automate this process. They could help generate data models automatically by capturing data about what a company does and about its processes. Sounds like a dream? Maybe… But manual curation is not an efficient way to do data modeling. The last 30 years of PDM/PLM experience are good confirmation of that. Finding a better way to apply automatic data capture and configuration to PLM could be an interesting opportunity. Just my thoughts…
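For readers who want something concrete, here is a very small sketch of what "generating a data model from captured data" could mean at its simplest. It is not deep learning and not how any PLM product works; it only counts which attributes appear in captured records and which value types they carry. The function `infer_model` and the sample records are hypothetical and invented for this illustration.

```python
from collections import defaultdict


def infer_model(records):
    """Infer attribute names, observed value types, and whether each
    attribute appears in every record (a naive stand-in for 'required')."""
    types = defaultdict(set)
    counts = defaultdict(int)
    for record in records:
        for key, value in record.items():
            types[key].add(type(value).__name__)
            counts[key] += 1
    total = len(records)
    return {
        key: {"types": sorted(types[key]), "required": counts[key] == total}
        for key in sorted(types)
    }


if __name__ == "__main__":
    # Hypothetical captured part records, invented for this example.
    captured = [
        {"part_number": "BRK-1001", "description": "Mounting bracket", "mass_kg": 0.42},
        {"part_number": "BRK-1002", "description": "Long bracket", "mass_kg": 0.55,
         "material": "Steel"},
        {"part_number": "SCR-0007", "description": "Screw M4"},
    ]
    for attribute, info in infer_model(captured).items():
        print(attribute, info)
```

Even this naive version shows where the hard part is: finding attributes is easy, while discovering how items relate to each other is the piece that still needs human curation or much smarter learning.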