A blog by Oleg Shilovitsky
Information & Comments about Engineering and Manufacturing Software

What do PLM vendors need to know about NoSQL databases?

Oleg
14 December, 2012 | 4 min for reading

Relational databases are a very mature set of technologies. We use RDBMS (relational database management systems) practically everywhere these days, and it is hard to imagine enterprise software and PDM/PLM systems without them. At the same time, a new class of database management solutions is coming. It is called NoSQL (Not Only SQL). I have posted about NoSQL a few times, and you can refresh your memory by going back to those earlier posts. The term was first used back in 1998 for a lightweight relational database without a SQL interface; “noREL” has been suggested as a more accurate name for the non-relational kind. Later, in 2009, the term NoSQL was proposed “to label the emergence of a growing number of non-relational, distributed data stores that often did not attempt to provide atomicity, consistency, isolation and durability guarantees that are key attributes of classic relational database systems”. NoSQL database solutions are widely used today in web and mobile applications. I can also see growing usage of NoSQL databases in business intelligence and master data management applications.

NoSQL is not a single database. It is a name for a broad set of data management and database technologies that sit outside the RDBMS world. The technologies and the terminology behind the term are new, and PDM/PLM vendors ignored NoSQL database management solutions until very recently. That made me think about providing a quick summary of what stands behind this broad term and which PDM/PLM use cases it can support.

Key-value (KV) databases

A KV store is the simplest database model in the NoSQL world. It stores “keys” and their associated “values”; basically, your database is a collection of key-value pairs. Some databases support more complex structures for values (such as lists and hashes), but it is not required. One interesting PDM/PLM use case is to store a list of files in a key-value database. In such a case, the file name (including the full path) is the key and the value is the actual content of the file. Examples of KV stores are Riak and Redis.
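
As a rough illustration of the file use case, here is a minimal sketch in Python. A plain dictionary stands in for a real KV store such as Redis or Riak; the class name FileVault, its methods and the file paths are hypothetical and made up for this example.

```python
class FileVault:
    """Toy key-value store: key = full file path, value = file content (bytes)."""

    def __init__(self):
        # In-memory stand-in for a real KV database (assumption for this sketch).
        self._store = {}

    def put(self, path: str, content: bytes) -> None:
        # Writing to an existing key simply overwrites the old value.
        self._store[path] = content

    def get(self, path: str):
        # Returns None when the key is unknown.
        return self._store.get(path)

    def delete(self, path: str) -> bool:
        # True if the key existed and was removed, False otherwise.
        return self._store.pop(path, None) is not None


vault = FileVault()
vault.put("/projects/bike/frame.step", b"frame geometry")
vault.put("/projects/bike/wheel.step", b"wheel geometry")

frame = vault.get("/projects/bike/frame.step")
missing = vault.get("/projects/bike/saddle.step")
```

Note that the store knows nothing about what is inside a value. You can only look things up by key, which is exactly what keeps the model so simple.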

Column-oriented databases

This type of database is very close to an RDBMS. The main difference is that the columnar data model is designed to keep the data from each column of a table together. It is the opposite of a typical RDBMS, which keeps the data of a specific row together. This makes it very “inexpensive” to add a column to a table, and each row may have a different set of columns. This type of database is good for reporting and business intelligence solutions. The columnar data model has influenced the development of a few core PDM/PLM data modelers available on the market today by providing a higher level of flexibility in data modeling. An example of a column-oriented database is HBase.
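
The difference between the two layouts can be sketched with plain Python lists and dictionaries. This is only an in-memory illustration of the idea, not how HBase or any other product stores data; the part numbers, attributes and the supplier name are made up.

```python
# Row-oriented layout: all values of one row are kept together.
rows = [
    {"part": "P-100", "material": "steel", "mass_kg": 2.5},
    {"part": "P-200", "material": "aluminum", "mass_kg": 1.0},
    {"part": "P-300", "material": "steel", "mass_kg": 4.0},
]


def to_columns(rows):
    """Regroup the same data so that all values of one column are kept together."""
    columns = {}
    for i, row in enumerate(rows):
        for name, value in row.items():
            # Rows that lack a column keep None in their slot.
            columns.setdefault(name, [None] * len(rows))[i] = value
    return columns


columns = to_columns(rows)

# A report over one attribute reads a single list and never touches the rest.
total_mass = sum(columns["mass_kg"])

# Adding a column is one new list; the existing columns are not rewritten.
columns["supplier"] = ["Acme", None, None]
```

The last line shows why adding a column is cheap in this model, and why a row is free to have no value for it.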

Document-oriented databases

Document databases manage data in the form of documents. Documents can differ from one another and have different structures, which makes document-oriented databases very flexible. Some implementations, such as MongoDB, give you the ability to run queries against the document structure as well as to do MapReduce computations. Depending on your needs, you can consider different document-oriented databases; examples are MongoDB and CouchDB. You can consider a document database in PDM/PLM in two cases: the need for a high-performance, scalable document store and the need for free-form data modeling.
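
To make “different structure” more concrete, here is a small sketch in Python. A list of dictionaries stands in for a document collection. The find helper is hypothetical and written for this example; it is not the MongoDB or CouchDB API, and the documents are invented.

```python
# Three documents in one collection, each with its own structure.
documents = [
    {"type": "part", "number": "P-100", "material": "steel",
     "revisions": [{"rev": "A", "state": "released"}]},
    {"type": "change_request", "number": "CR-7", "affected": ["P-100"],
     "reason": "wall thickness too small"},
    {"type": "part", "number": "P-200", "material": "aluminum"},
]


def find(docs, **criteria):
    """Return the documents whose top-level fields match all criteria.

    A document that does not have a requested field simply does not match.
    """
    return [d for d in docs
            if all(d.get(key) == value for key, value in criteria.items())]


steel_parts = find(documents, type="part", material="steel")
all_parts = find(documents, type="part")
```

No schema had to be declared up front: the change request carries fields that parts do not have, and the query still works across the whole collection.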

Graph databases and triple stores

The graph data model deals with highly interconnected data. It contains nodes and relationships between nodes. Both nodes and relationships can have properties (key-value pairs). This data model becomes really important when you are traversing nodes along specific relationships. There are many situations in PDM/PLM applications when we need to traverse data efficiently. Graph databases (and their predecessors, object databases) have great potential to bring value here. An example of a graph database is Neo4j. A specific case of graph databases is the so-called triple store, which manages information as triples (subject-predicate-object). Examples of triple stores are OWLIM and AllegroGraph. Triple stores are also supported by Oracle and IBM DB2.
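
A typical traversal in PDM/PLM is walking a product structure. The sketch below, in Python, keeps a tiny bill of materials as subject-predicate-object triples and walks it breadth-first along one relationship. The data, the predicate names and the traverse function are all made up for illustration; real graph databases and triple stores have their own query languages.

```python
from collections import deque

# Triples (subject, predicate, object), the shape a triple store manages.
triples = [
    ("bike", "has_part", "frame"),
    ("bike", "has_part", "wheel"),
    ("wheel", "has_part", "rim"),
    ("wheel", "has_part", "spoke"),
    ("frame", "made_of", "aluminum"),
]


def traverse(triples, start, predicate):
    """Breadth-first walk from start, following only edges with the given predicate."""
    found = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, pred, obj in triples:
            if subject == node and pred == predicate and obj not in seen:
                seen.add(obj)
                found.append(obj)
                queue.append(obj)
    return found


all_parts = traverse(triples, "bike", "has_part")
frame_material = traverse(triples, "frame", "made_of")
```

This toy version scans the whole list at every step. A graph database is built to follow relationships without such scans, which is where the traversal efficiency comes from.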

CAP theorem and why PLM systems need to use more than one database

In computer science, the CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: consistency (all nodes see the same data at the same time), availability (every request receives a response about whether it succeeded or failed) and partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system). It is a question of priorities and a tradeoff between the requirements you need to satisfy in your system. PLM systems face significant challenges in the variety of data types, retrieval patterns and data scale. Using different database management strategies can improve existing solutions.
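
The tradeoff can be shown with a toy simulation in Python. It models two replicas and a network partition, and compares a system that prefers consistency (“CP”) with one that prefers availability (“AP”). The Cluster class, the mode names and the key are assumptions made for this sketch; real distributed databases are far more subtle than this.

```python
class Replica:
    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas; mode is "CP" (prefer consistency) or "AP" (prefer availability)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True = the replicas cannot reach each other

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: both replicas get the same value.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Refuse the write rather than let the replicas disagree.
            return False
        # AP: accept the write locally; the replicas may now diverge.
        replica.data[key] = value
        return True

    def read(self, replica, key):
        return replica.data.get(key)


def demo(mode):
    cluster = Cluster(mode)
    cluster.write(cluster.a, "P-100.state", "in work")
    cluster.partitioned = True
    accepted = cluster.write(cluster.a, "P-100.state", "released")
    return (accepted,
            cluster.read(cluster.a, "P-100.state"),
            cluster.read(cluster.b, "P-100.state"))


cp_result = demo("CP")  # write refused, both replicas still agree
ap_result = demo("AP")  # write accepted, replicas disagree
```

In the “CP” run the system gives up availability during the partition; in the “AP” run it keeps answering but one replica serves stale data. Which behavior is acceptable depends on the kind of data, which is one reason a single database strategy rarely fits every PLM need.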

What is my conclusion? PLM is a multidisciplinary approach. It handles a variety of data and is connected to many places in the organization: design, engineering, manufacturing, supply chain, support and services. The specialty of a PLM environment is to connect to all data suppliers and to interplay with different sources of data. From that standpoint, data behaves like oil: it is located in multiple places and needs to be extracted, and you need different tools to get it out. Think about different databases as a toolset to process and access data in the most efficient way. Just my thoughts…

Best, Oleg
