PLM Cloud, Data Replication and Pig Latin.

by Oleg on November 11, 2011

Have you had a chance to read about cloud in Pig Latin? No? Actually, I did. Recently, I came across an interesting blog post and video on the Aras PLM website. Navigate to the following link and have a read. The blog post features a solution from the Ilesfay Technology group, a provider of some advanced replication technologies. A short dig into the website:

The Ilesfay team brings unique IP and tremendous domain expertise to its cloud services.  The team has years of experience tackling infrastructure limitations of PLM implementations.  The traditional approach to replication moves calculations to the data then shuttles results, but this solution quickly fails with highly associative engineering data central to PLM.  Ilesfay has invented preemptive binary differencing, a breakthrough approach for determining who needs data and when and how to schedule it, such that vastly more data can flow through existing IT infrastructure.

The Ilesfay video made me think about some aspects of replication and technology evolution. In my mind, the advantage of the cloud is having one place where the information is located, so it can be conveniently accessed from multiple places and devices. Accessibility is one of the fundamental advantages of the cloud. Google Apps is probably one of the best examples: you can optimize your work by no longer sending emails and document attachments, and by no longer replicating stuff between different computers.

Replicated vs. Shared on cloud

So, here is the topic I want to discuss. We can replicate data using different technologies. I didn’t try the one that comes from Ilesfay Technology, but assuming everything works smoothly, it can help us replicate data without disrupting people’s work in their current environment. Replication is a known technology used by multiple PDM/PLM (and not only) companies. I think that sparing people from changing how they work, while still supporting their distributed work, is a big advantage, and companies can go with this option. However, thinking a bit “long term”, I can see some disadvantages too. We need to take care of storage in all locations (the data is replicated), and replicating all data is not always appropriate because of security and IP protection concerns. So, administration will be required to define what should be replicated and what should not.

What is my conclusion? I can see advantages and disadvantages in both solutions. Replicating data probably brings fewer concerns and less pain (in terms of change). At the same time, future cloud efficiency, system utilization and cost can drive people to a type of solution where the data that needs to be shared is located on the cloud. No synchronization needed. I’m interested to know what your opinion is. Speak your mind.

Best, Oleg

  • James DeLaPorte

    Oleg - Replication is required in either a cloud or a file based system for the sole purpose of data redundancy. I have seen in my experience many "fail safe" IT infrastructure systems fail - often for conditions the inventors did not anticipate.
    In aviation every critical onboard system has to be duplicated for safety. There is no "stop on the next cloud (no pun intended) so I can fix the landing gear" in mid flight.
    Critical data systems that support continued operations need to have a similar real time fail over in order to maintain business continuity. When critical systems and access to data go down real dollars for which there is no recovery are “flying” out the window.
    In an economic environment where profit margins are continually challenged real time access to critical data can mean the difference between profit and loss. Compare this scenario to retail businesses where 11 months of annual revenue does nothing more than reach a financial breakeven point on the income statement for the company and all profit is made during the Christmas season. Imagine losing real time access to data due to no real time fail over capability of an IT system during the critical Christmas season.
    Just sharing my experience.

  • beyondplm

    James, thanks for your comment and for sharing your insight! I agree with you with regard to the replication of critical environments. In the context of cloud, these are elements of what is called "high availability" of service. And in many cases, redundancy is the right way to support it. My specific example or question in this particular post was about the need to replicate data between different client environments. This is something that, in my view, will disappear, and cloud will play a central role here. Think about "replicating your pst email folders" vs. the high availability of gmail. Does it sound right to you?

    Best, Oleg

  • Hello Oleg,
    thanks for your article (and your live reporting from DSCC).
    If I understand you correctly, you are talking about metadata (=database) and files (=stored in file system) and suggesting that the files could be managed in the cloud.
    I think that this is certainly an option for companies that are willing to give their files to outside parties. The cloud provider would have to employ similar replication mechanisms internally as today's PLM systems are doing, in order to provide decent performance in all regions of the world (-> SLA). Splitting metadata from files becomes interesting when you want to use the metadata to replicate the files more intelligently, e.g. by "replicating all files belonging to project X to site Y".
    Best regards,
    Jens
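
Jens's idea of using metadata to drive file replication ("replicating all files belonging to project X to site Y") can be sketched in a few lines of Python. This is only an illustration: the `FileRecord` type, the `rules` mapping and the file names are hypothetical and do not come from any PLM system's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata row describing one file in the vault."""
    path: str
    project: str
    size_mb: float


def files_to_replicate(records, rules, site):
    """Select the records whose project is assigned to `site`.

    `rules` maps a project name to the set of sites that should
    receive a copy of that project's files (an assumed format).
    """
    projects = {project for project, sites in rules.items() if site in sites}
    return [record for record in records if record.project in projects]


# Made-up sample data: two files in project X, one in project Z.
records = [
    FileRecord("bracket.prt", "X", 120.0),
    FileRecord("housing.prt", "X", 80.0),
    FileRecord("gearbox.asm", "Z", 300.0),
]
rules = {"X": {"Y"}}  # "replicate all files belonging to project X to site Y"

selected = files_to_replicate(records, rules, "Y")
total_mb = sum(record.size_mb for record in selected)
```

The point of the sketch is that only the metadata is queried to decide what moves; the heavy files themselves are touched only for the selected subset.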

  • beyondplm

    Jens, in my view, cloud doesn't require massive replication of data. You are right, some elements of replication moved to support multiple geo locations. However, overall things are more optimized. You also don't need to replicate heavy 3D files to local computers in many scenarios. Just my opinion. Thanks for commenting! Oleg

  • Oleg,

    There will always be a problem with replication or data sync until PLM vendors figure out that they can describe their CAD with only metadata.
    To my knowledge there is no one on the market that exposes the content of a CAD file. That leads to several issues, especially around the size of the file. Even in CATIA v6, where there is virtually no file, there is still design information that is not exposed.
    Ilesfay has not invented something revolutionary; the concept has existed for years. They have a good approach, but they focus on the binaries, which means that their system does not fundamentally understand the CAD. It is mainly comparing bytes. Changing the positioning is something rather simple, but imagine you change a surface from a plane to a Bezier surface (and keep it planar): there might be a huge number of bytes that have been modified when basically nothing has changed for the design.

    Now let's say you expose it. In a standard way (XML for instance - but please don't get me started about 3DXML), the change of the design does not have to be replicated, but could be understood by each endpoint for reconstruction. This approach has the advantage of reducing the amount of data that is exchanged (even more with binary differencing techniques on top of it), and of being easier to maintain, CAD intelligent and more scalable (you would not care if you are on premise or in the cloud - the system works the same).

    It would open a new door for any company that wants to invest in data exchange and replication (which is often a hard case for PLM vendors): it would no longer be an issue they have to worry about (or not too much).
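
The plane vs. Bezier example in the comment above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the JSON encoding stands in for a CAD file format, the surface dictionaries are invented, and the "logical" comparison simply samples points on both surfaces. It is not how Ilesfay or any CAD kernel works.

```python
import json


def plane_point(surface, u, v):
    """Point on a parametric plane: origin + u * u_dir + v * v_dir."""
    return tuple(
        o + u * a + v * b
        for o, a, b in zip(surface["origin"], surface["u_dir"], surface["v_dir"])
    )


def bezier_point(surface, u, v):
    """Point on a bilinear Bezier patch with corners p00, p10, p01, p11."""
    p00, p10, p01, p11 = surface["control_points"]
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )


EVALUATORS = {"plane": plane_point, "bezier": bezier_point}


def differing_bytes(a: bytes, b: bytes) -> int:
    """Naive byte comparison: mismatched positions plus the length gap."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def same_geometry(s1, s2, samples=5, tol=1e-9):
    """'Logical' comparison: evaluate both surfaces on a (u, v) grid."""
    grid = [i / (samples - 1) for i in range(samples)]
    for u in grid:
        for v in grid:
            p = EVALUATORS[s1["type"]](s1, u, v)
            q = EVALUATORS[s2["type"]](s2, u, v)
            if any(abs(x - y) > tol for x, y in zip(p, q)):
                return False
    return True


# The same unit square in the z = 0 plane, described two different ways.
plane = {"type": "plane", "origin": [0, 0, 0], "u_dir": [1, 0, 0], "v_dir": [0, 1, 0]}
bezier = {
    "type": "bezier",
    "control_points": [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]],
}

plane_bytes = json.dumps(plane).encode()
bezier_bytes = json.dumps(bezier).encode()

byte_changes = differing_bytes(plane_bytes, bezier_bytes)  # many bytes differ
design_unchanged = same_geometry(plane, bezier)  # the shape is identical
```

A byte-level comparison reports many changes, while the sampled geometry is identical, which is the gap between "binary" and "logical" differencing that the comment describes.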

  • beyondplm

    Yannis, I understand what you are saying about "binaries" vs. "logical". However, my question is simpler. Why replicate? Isn't it easier to keep data on the cloud? Just an opinion... Best, Oleg

  • Oleg,

    Of course it is easier to keep them on the cloud, but the amount of data is too big. If you talk about a pure metadata BOM management system, that's fine. But when you start to touch CAD, it becomes tricky. It is not unusual to have a 100 MB file to work on (and I am talking about a single part here). Should I wait 4 minutes to download it and 1 minute to open it before I can work on it?
    1- the bandwidth is not free - are companies willing to pay for it?
    2- The cloud means access to data quicker, easier... The value in that case is not great

    From there, there are several approaches:
    1- You design directly in the cloud (google docs-like) - may be quite heavy development for PLM vendors
    2- You design in a VM that runs in the cloud that would be closer to the data
    3- You design locally but only 'logical' differences are transmitted when main file is in cache

    I believe that the easiest one is the second, the most efficient would be a mix between 2 and 3.

    As a matter of fact, I have no example of a company that runs a big-data type of application fully in the cloud.
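
The download-time argument in the comment above is simple arithmetic, sketched here in Python. It reads the file size in the comment as 100 megabytes; the 2 MB delta is a purely hypothetical figure for approach 3 (only differences are transmitted while the main file is cached), not a measured number.

```python
def transfer_seconds(size_mb: float, bandwidth_mbit_s: float) -> float:
    """Time to move `size_mb` megabytes over a `bandwidth_mbit_s` link."""
    return size_mb * 8 / bandwidth_mbit_s  # 1 byte = 8 bits


def implied_bandwidth_mbit_s(size_mb: float, seconds: float) -> float:
    """Link speed implied by moving `size_mb` megabytes in `seconds`."""
    return size_mb * 8 / seconds


# Figures from the comment: a 100 MB part takes about 4 minutes to download.
link = implied_bandwidth_mbit_s(100, 4 * 60)  # roughly 3.3 Mbit/s

full_download = transfer_seconds(100, link)  # back to about 240 seconds

# Hypothetical: if only a 2 MB difference had to travel over the same link.
delta_download = transfer_seconds(2, link)  # a few seconds instead of minutes
```

On the same link, the transfer time scales linearly with what is sent, so sending a small difference instead of the whole part is where approaches 2 and 3 gain their advantage.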

  • beyondplm

    Yannis, thanks for the comments! I agree with your opinion. The balance between cloud and local shared storage can be an interesting option. However, thinking longer term, I can see CAD software that won't require you to download a 100 MB part to your workstation. You just work on it in the cloud. Does it make sense? Thanks, Oleg
