What is the right data model for PLM?

by Oleg on August 17, 2012 · 11 comments

I think the agreement among PDM / PLM implementers about the importance of the data model is almost absolute. Data drives everything a PDM / PLM system does. Therefore, defining the data model is the first step in many implementations. It sounds simple. However, there is implied complexity: in most cases, you will be limited by the data model capabilities of the PLM system you have. So at this point I want to take you back in history.

Spreadsheet Data Model

Historically, this became the most commonly used data model, and not only because Excel is available to everybody. In my view, it also happened because tables (aka spreadsheets) are a simple way to think about your data. You can think about a table of drawings, parts, or ECOs. Since almost everything in engineering starts from the Bill of Materials, thinking about a BOM table is also very simple. The key reasons why the spreadsheet model became so widely accepted are simplicity and absolute flexibility. Engineers love flexibility, and this data model became widely popular.
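To make the "BOM table" idea concrete, here is a minimal sketch in Python. The part numbers, column names and helper functions are made up for illustration; the point is only that a spreadsheet-style BOM is a list of free-form rows that nothing validates.

```python
import csv
import io

# Hypothetical BOM sheet: every cell is free-form text, as in a spreadsheet.
BOM_SHEET = """parent,child,qty,description
BIKE-001,FRAME-010,1,Frame
BIKE-001,WHEEL-020,2,Wheel
WHEEL-020,SPOKE-021,32,Spoke
"""

def read_bom(text):
    """Parse the sheet into a list of dict rows, one per BOM line."""
    return list(csv.DictReader(io.StringIO(text)))

def children_of(rows, parent):
    """Return (child, quantity) pairs listed directly under a parent."""
    return [(r["child"], int(r["qty"])) for r in rows if r["parent"] == parent]

rows = read_bom(BOM_SHEET)
print(children_of(rows, "BIKE-001"))
```

Adding a column is just a matter of typing a new header, which is exactly the flexibility engineers like. The flip side is that a misspelled part number or a non-numeric quantity is accepted silently until some code trips over it.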

Relational Data Model

This data model was developed by Edgar Codd more than 40 years ago. Database software runs on top of this model, which gave us what is known today as the RDBMS. Until the second half of the last decade, it was the solution all PDM / PLM developers relied on. The first PDM systems were developed on top of an RDBMS. However, they had low flexibility: because of the rigorous rules of this model, making changes was not a simple task. One of the innovations of the late 1990s was to develop a flexible data model as an abstraction on top of the RDBMS. Almost all PDM/PLM systems in production today use object abstractions developed on top of the relational data model.
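A rough sketch of what a "flexible abstraction on top of an RDBMS" can look like, using Python's built-in sqlite3 module. This is an assumption for illustration only: the table names, the generic attribute table and the helper functions are hypothetical and do not describe how any particular PDM/PLM product is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
    CREATE TABLE item (
        id   TEXT PRIMARY KEY,
        type TEXT NOT NULL              -- e.g. 'Part', 'Drawing', 'ECO'
    );
    -- Generic attribute table: new properties need no schema change.
    CREATE TABLE item_attr (
        item_id TEXT NOT NULL REFERENCES item(id),
        name    TEXT NOT NULL,
        value   TEXT,
        PRIMARY KEY (item_id, name)
    );
    -- Rigid part of the model: the database itself guards BOM lines.
    CREATE TABLE bom_line (
        parent_id TEXT NOT NULL REFERENCES item(id),
        child_id  TEXT NOT NULL REFERENCES item(id),
        qty       INTEGER NOT NULL CHECK (qty > 0),
        PRIMARY KEY (parent_id, child_id)
    );
""")

def create_item(item_id, item_type, **attrs):
    """Insert an item plus any number of free-form attributes."""
    conn.execute("INSERT INTO item VALUES (?, ?)", (item_id, item_type))
    conn.executemany(
        "INSERT INTO item_attr VALUES (?, ?, ?)",
        [(item_id, name, str(value)) for name, value in attrs.items()],
    )

def get_attrs(item_id):
    """Return the attributes of an item as a plain dict of strings."""
    cur = conn.execute(
        "SELECT name, value FROM item_attr WHERE item_id = ?", (item_id,)
    )
    return dict(cur.fetchall())

create_item("BIKE-001", "Part", description="Bicycle", weight_kg=12.5)
create_item("WHEEL-020", "Part", description="Wheel")
conn.execute("INSERT INTO bom_line VALUES (?, ?, ?)", ("BIKE-001", "WHEEL-020", 2))
print(get_attrs("BIKE-001"))
```

The trade-off shows up immediately: the fixed tables give you rules the database enforces, while the generic attribute table buys flexibility at the price of storing everything as text and pushing type checks into application code.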

The challenges of Spreadsheets and Relational Databases

Although these technologies are proven and used by many mainstream applications, they are far from perfect. One of the demands of product development is flexibility. The spreadsheet model can deliver that, but it gets very costly over time. The relational data model can combine flexibility with manageability of data. However, making a change in these models is costly. Identification, openness and expandability are also problematic in relational data models compared to some web-based solutions.

Future data models – NoSQL, RDF, etc.

Thinking about what comes in the future, I want to spell out two buzzwords: NoSQL and Semantic Web. I can see a growing number of solutions trying to adopt a variety of new data platforms. NoSQL arrives as an alternative to the relational database. If this is the first time you’re hearing this buzzword, navigate to the following Wikipedia link. NoSQL solutions are not all the same. The term covers a whole group of solutions such as key-value stores, object databases, graph databases and triple stores. The Semantic Web is a collaborative movement led by the W3C. The collection of Semantic Web technologies (RDF, OWL, SKOS, SPARQL, etc.) provides an environment where applications can query that data, draw inferences using vocabularies, etc. Part of these standards is something called Linked Data: a collection of data sets in open formats (RDF) that are shared on the web.
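For readers who have never seen a triple store, here is a toy sketch of the idea in plain Python. It is not RDF or SPARQL, only the underlying shape: every fact is a (subject, predicate, object) triple. The identifiers and predicate names are invented for the example.

```python
# Each fact is one (subject, predicate, object) triple.
triples = {
    ("BIKE-001", "type", "Part"),
    ("WHEEL-020", "type", "Part"),
    ("BIKE-001", "hasComponent", "WHEEL-020"),
    ("WHEEL-020", "hasComponent", "SPOKE-021"),
    ("ECO-7", "affects", "WHEEL-020"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def components(item):
    """Follow hasComponent links transitively below the given item."""
    found, seen, stack = [], set(), [item]
    while stack:
        current = stack.pop()
        for _, _, child in match(s=current, p="hasComponent"):
            if child not in seen:
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found

print(components("BIKE-001"))
print(match(o="WHEEL-020"))  # everything that points at the wheel
```

A new kind of fact, say a supplier, is just one more triple with a new predicate, with no table to alter. The same structure answers "what is below this item" and "what refers to this item", which is the kind of openness the relational model makes expensive.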

What is my conclusion? Many of the technologies used by PLM companies these days are outdated and date back 20-25 years. There is nothing wrong with these technologies. They are proven and successfully used in many applications. However, in order to achieve the next level of efficiency and embrace the future of PLM, new horizons need to be explored. Data flexibility, openness and interoperability are absolutely important for the future of PLM. Options to use newer data models coming out of the past 10 years of web experience need to be explored. Important. Just my thoughts…

Best, Oleg

Image: FreeDigitalPhotos.net

  • Hakan Karden

    Hi Oleg,
    the enterprise parts of STEP (=PLCS and AP233) should be good candidates for what you are looking for. Agnostic to processes and technology, rich in context. Not tied to a particular DB paradigm. Long-lasting data models. Pragmatic, but still seen by most people as futuristic. However, implementations have been operational for several years and interest is growing with proven success. Currently with databases like MS SQL Server.
    I think the issue right now is more about avoiding SW vendor lock-in than moving beyond current database technologies, incl. RDB. The issue is to map PLM into the existing IT infrastructure, which most of the time is not the absolute latest. PLM users play it conservative because of the value of data and the implications of failures. IT infrastructure, security etc. need to be under control.
    Today's new technologies will be used in the future, but the issue for PLM users is more how to connect to all the legacy than how to use brilliant but still unproven (in PLM) technology.
    My thoughts,
    Håkan

  • beyondplm

    Hakan, thanks for your comment and insight! To me, pure STEP is not a data model. It lacks an implementation (what you call mapping to IT infrastructure). You can map it to XML, spreadsheets or anything else. When you created the MS SQL implementation, you used the relational data model. The problem I can see is the cost of mapping any PLM model (STEP is a good example) to possible data models. Object abstractions created by PLM vendors (and not only them) on top of RDBMS are the best PLM vendors have done so far. Even so, they don't scale beyond the level where they are now. The task you call “connect to all legacy” is a complicated and costly project. Nobody can do it efficiently today. That's why companies like Daimler will continue to struggle between their PDM/PLM implementations. Just my opinion… Best, Oleg

  • http://twitter.com/jack_b_brown Jack Brown

    My long-time sarcastic data model for PLM is a single object type called “STUFF” and a single relationship called “MORE STUFF” that only points STUFF to STUFF, and the only file type accepted in this model is the Excel spreadsheet.

  • beyondplm

    Jack, it looks like a “Zen guide for PLM”. However, I'm not kidding. Achieving simplicity is important. Extremely important. Thanks for your comment! Best, Oleg

  • Hakan Karden

    Oleg,
    It seems you are talking about implementation data models: how the content is stored and manipulated in a computer. The standards mentioned are conceptual/information models: what is it that is being stored and what does it mean. For PLM data the decision in STEP was that there are lots of choices for “how to implement”. The stuff is the same however you store/manipulate it.

    I see the first as a choice and the second as a necessity if you are going to communicate (and that communication is inherent in any process involving more than one person). Or one person over time!

    Assuming you address “what is the right choice of implementation paradigm for holding/manipulating PLM data?” I would suggest the answer is (1) whatever your chosen vendor has used and (2) whatever suits your overall business approach, processes and policies. In pure computation terms you can use any implementation approach; they are all equivalent from the standards viewpoint. In practice however some are better supported, more maintainable, easier to archive, easier to interact with… than others! Ask a typical CIO if he supports the use of Excel for long-term business critical data.
    Let's continue to make things easier,
    Cheers,
    Håkan

  • Sylvere Krima

    Oleg,

    I agree with some of what has been said by Hakan: STEP is a good candidate. It has been around for decades now and its models are good and strong. Moreover, they are not static information models: something like PLCS (ISO 10303-239) supports dynamic customization through the use of Reference Data Libraries, which makes it a good solution for now and for later, when new requirements appear.

    But I also agree with what you said: STEP lacks implementations!! From my personal experience, one big factor is the technology it uses: EXPRESS (ISO 10303-11 and 10303-21). First, it seems barely used outside of the STEP community, meaning that not only is the software support limited, but there isn't a strong community of the kind that more recent languages such as UML and RDF have. Secondly, today's challenges seem to require more technological support than what EXPRESS offers. I mentioned above the nice mechanism of dynamic customization offered by PLCS (and some other STEP APs). Although it works, EXPRESS is, from my point of view, definitely not the best solution to implement it, and a technology such as RDF/OWL is, by far, a better candidate.

    My point here is that we need to find a good compromise between the information model and the technology, which I believe we haven't achieved, yet :-)

  • beyondplm

    Hakan, in order to make things easier and more efficient, the implementation paradigm needs to be fixed/improved. This is the moment when implementation “does matter”. Questions such as TCO, flexibility and cost of change have different answers depending on how things are implemented. On the other side, the conceptual/information model can differ from company to company. The fact that many PLM vendors try to lock in a specific information model leads to inefficient implementations. Just my thoughts… Best, Oleg

  • beyondplm

    Sylvere, thanks for your comments! I like the way you balanced the advantages and disadvantages of STEP. I think PLM vendors need to catch up on technologies, and then many things PLM implementations struggle with today will disappear. Just my thoughts… Oleg

  • Pingback: PLM Think Tank Top 5 – August. Thoughts about Pink Lady Apples.

  • Paul2002

    Hi Oleg – good post – I meant to reply a while back. So do you believe that graph databases (e.g. Neo4j) and RDF triple stores are likely the most fruitful direction for research & development for next-gen CAx-PLM? I certainly see that with this approach NG PLM will actually be simplified, since it will not be straining at the boundaries of interoperability etc. – IF a generalised model can be developed.

    Best

    Paul Reeves

  • beyondplm

    Paul, I’m not saying Neo4j is the most fruitful. I just think the era of the “single database” is over. You might want to read my post from yesterday – What PLM vendors need to know about noSQL database? – http://beyondplm.com/2012/12/14/what-plm-vendors-need-to-know-about-nosql-databases/ . Thanks, Oleg
