The future of your business isn't in the boardroom or the quarterly reports; it's in the data. Chief AI Officers, CDOs, and CIOs are grappling with one of the most critical challenges of our time: crafting a resilient, adaptable IT architecture that will help their business thrive in the AI-first era.
We all know that AI is only as good as the data it was trained on. Therefore, establishing a proper data stack is fundamental to enabling AI at scale.
The stakes?
Nothing short of market relevance and competitive edge. In this high-stakes game, choosing a data platform can make or break your digital strategy.
We’ll cut through the noise, dissect the market landscape, and review the major players you need to know about to future-proof your tech stack.
Dimensions
As an industrial company, your key differentiator is the breadth of the data flowing through - OT, IT, and ET.
OT (Operational Technology) is data coming from your machinery - control systems, SCADA, historians, PLCs, and all IoT sensors you deployed in the first wave of digital transformation.
IT (Information Technology) covers traditional business apps like ERP (your SAP or Infor), CRM, HR portal, SharePoint, etc.
ET (Engineering Technology) is what sets you apart from most other companies out there (and makes you special from the IT point of view). ET spans CAD models, point cloud/laser scans, P&IDs, diagrams, physical simulators, and reports from NDT inspections.
Therefore, as a CAIO, CDO, or CIO of an industrial company, it's essential to consider the following dimensions when selecting a data platform:
Pure Data Play vs. Industry-Specific
Pure Data Platforms, like all of the hyperscalers, provide fundamental offerings such as data storage, processing, and analysis, with robust, generalized capabilities for various use cases. It must be noted, though, that Fabric, Microsoft's product, fits better into the next category. In any case, the key is to remember that hyperscalers offer you a set of (LEGO) building blocks that you need to select, assemble, and maintain.
Then you have solutions like Snowflake or Databricks offering data lakehouse, data management, and analytics solutions. Through their SaaS offerings, they abstract away some of the complexity of hyperscalers and can power use cases across supply chain, finance, and core business functions. In other words, they rely on tabular data found in CRM and ERP solutions. They see digital as a greenfield opportunity requiring a fresh approach.
Industrial Data Ops platforms are in the sweet spot.
They were the first to realize that the OT, IT, and ET worlds need stronger connections, and that while banking, commerce, marketing, and other consumer-facing verticals spearheaded the digital transformation, heavy asset industries are lagging. The players in this category have worked hard to change that, very often providing professional services alongside their pure SaaS products to help with change management.
Engineering vendors are decades-old companies providing core ET tools (PLM, MES, APM), which see digitalization as a logical extension of their core capabilities.
Very often they come with a Digital Twin marketing angle (due to their, generally, 3D-based products). I bet every industrial CAIO or CDO/CIO has been in touch with them, as these vendors have long-standing customer relationships and a deep understanding of industry needs.
Industrial OEMs traditionally own the OT domain. Their product philosophy is to apply analytics as a bridge between the OT and IT worlds. Companies like GE (with Predix) and Siemens (with MindSphere) were developing software offerings more than a decade ago. Well, neither of the two is on the market, and this is one of the reasons a blog like ours exists.
Data Management vs. Application Development
Another dimension worth analyzing from the enterprise architecture perspective is the product approach.
Data Management Platforms emphasize comprehensive data governance, integration, and storage solutions.
Application Development Platforms provide a more extensive suite of tools for building and deploying applications.
In the diagrams below, we overlaid this dimension with the Industrial (OT, ET) and Enterprise (IT) split.
Palantir, proclaiming itself as “iOS for Enterprise”, excels in the Enterprise Domain, being strong in both data management and application development.
C3 AI, on the other hand, is first and foremost an “Enterprise AI” platform meant to “deliver a comprehensive Enterprise AI application development platform and a large and growing family of turnkey enterprise AI applications”.
Both platforms “play” largely on the right side of the diagram, with Palantir being stronger on the data side and industrial use cases.
Snowflake and Databricks are other players deeply rooted in the Enterprise Domain. Yet, they mainly play in the “data management” box, competing in the Lakehouse space. We believe Databricks, thanks to its roots in analytics and a few industrial use cases, looks more promising at this stage.
Cognite seems to be a positive outlier with its deep industrial roots and DataOps positioning. It doesn’t put much focus on the enterprise domain, and although it doesn't feature a very strong app development offering, it’s meant to leverage its open API architecture and partner ecosystem.
Decision criteria
As mentioned in our intro, the industrial world has its own rules. Therefore, we decided to look into the following decision criteria:
Time to Value
Contextualization
Data management
Analytics & MLOps
Industry Expertise
Ecosystem (openness)
From the initial diagram, we selected a subset of players playing mainly in the analytics and the industrial data ops space since the choice here is fundamental to your industrial data & AI strategy.
Here is the final ranking:
Time to Value
In 2024, with AI at the top of the inflated expectations curve, there is no place for a digital transformation initiative without a strong business case.
Therefore, it’s fundamental to assess how quickly your data platform can deliver actionable insights and business value. With custom code, everything is possible, but it takes time, so look for products with pre-built integrations, user-friendly interfaces, and robust support.
We give 4 points to Palantir and Cognite for their proven track record in industrial settings. C3 AI gets one point less due to maintainability (it is tougher to manage custom apps than data products). Snowflake and Databricks have a rich user base, community, and training material, so building solutions is quick. Yet, the product limits the type of use cases you can solve in the industrial setting.
Contextualization
Contextualization is the ability to bring together OT, IT, and ET data and create meaningful relationships between them.
You don’t want to buy/build another data platform 3 years from now.
Here, we focus on heavy asset industries, therefore modules for managing drones, robotics, visual data management, and data from various NDT inspections are key.
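To make the idea of contextualization concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that records from the three domains can be matched on an equipment tag once spelling differences are normalized. The record fields, tag formats, and names such as `Asset`, `normalize_tag`, and `contextualize` are hypothetical and not taken from any vendor's product; real platforms typically rely on far richer matching (fuzzy rules, ML-based entity matching, P&ID parsing).

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One physical asset with the data linked to it from each domain."""
    tag: str
    time_series: list = field(default_factory=list)  # OT: sensor/historian series
    work_orders: list = field(default_factory=list)  # IT: ERP maintenance orders
    documents: list = field(default_factory=list)    # ET: P&IDs, CAD, inspection reports


def normalize_tag(raw: str) -> str:
    """Reduce a tag to upper-case letters and digits so spelling variants match."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def contextualize(ot_series, it_work_orders, et_documents):
    """Group records from the three domains under a shared, normalized asset tag."""
    assets = {}

    def asset_for(raw_tag):
        key = normalize_tag(raw_tag)
        return assets.setdefault(key, Asset(tag=key))

    for series in ot_series:
        asset_for(series["tag"]).time_series.append(series["name"])
    for order in it_work_orders:
        asset_for(order["tag"]).work_orders.append(order["order_id"])
    for doc in et_documents:
        asset_for(doc["tag"]).documents.append(doc["file"])
    return assets


# Hypothetical sample records; each source spells the same tag differently.
assets = contextualize(
    ot_series=[{"tag": "21-PT-1019", "name": "21PT1019.PV"}],
    it_work_orders=[{"tag": "21PT1019", "order_id": "WO-1001"}],
    et_documents=[{"tag": "21_pt_1019", "file": "PID-021.pdf"}],
)
for asset in assets.values():
    print(asset)
```

In this toy example, the sensor series, the work order, and the P&ID all end up under one asset, which is the kind of relationship a contextualization layer is supposed to maintain at scale.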
Cognite is the only major player offering breadth and depth in IT, OT, and ET sources. Palantir and C3 AI focus on IT and OT, although Palantir gets points for its recent work on P&ID parsing.
Snowflake shines in IT. So does Databricks, which also tends to play more and more in the OT space.
AVEVA used to be a pure ET play (and a very strong one!), and with the acquisition of OSIsoft PI, its OT offering got a significant boost. Yet, the synergies from the integration have still to materialize.
Cognite and AVEVA get the same score for OT but there is a nuance to it.
Cognite has a highly performant time series database in the cloud, yet it doesn’t offer historian capabilities (i.e., on-prem data storage). PI is an industry leader in the historian space, yet its cloud offering is far from perfect.
We will observe how the AVEVA/PI/Schneider trio is evolving and consider that for the next edition.
Data management
There is a set of classical data management / Data Ops features we analyzed here:
Data lineage
Versioning
Pipeline orchestration
Observability
Support for various development environments
Access control and data sharing
and a few more suited to industry:
Industrial data models
Data type support
Incorporating physics
Data discovery suited for SMEs
Time series data quality monitoring
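As an illustration of the last item, time series data quality monitoring, here is a minimal Python sketch of two common checks: gaps in the sampling and flatlined (stuck) sensor values. The function names, the `tolerance` factor, and the `min_run` threshold are assumptions chosen for the example, not features of any of the platforms reviewed.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (previous, next) timestamp pairs that are further apart
    than tolerance * expected_interval."""
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]


def find_flatlines(values, min_run=5):
    """Return (start_index, end_index) for every run of at least
    min_run identical consecutive values."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the data or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = i
    return runs


# Hypothetical one-minute sensor data with a missing stretch and a stuck value.
base = datetime(2024, 1, 1, 0, 0)
timestamps = [base + timedelta(minutes=m) for m in (0, 1, 2, 7, 8)]
values = [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 3.0]

gaps = find_gaps(timestamps, expected_interval=timedelta(minutes=1))
flatlines = find_flatlines(values, min_run=5)
print(gaps)
print(flatlines)
```

A production setup would add more rules (out-of-range values, drift, timestamp ordering) and run them continuously, but the principle of rule-based checks per series stays the same.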
Palantir, Snowflake, and Databricks are by far the strongest players in this space, with Palantir additionally having a rich offering for industrial data ops.
C3 AI and Cognite, while having a solid offering, scored lower simply because of their product maturity today.
Analytics & MLOps
The minimum is to have a Python SDK and a way to run calculations.
That’s why we excluded players like AVEVA from the analysis, and believe Databricks is the category winner. The rest of the players have a product there, but none impressed us as much as Databricks.
Moreover, nowadays a clear and focused GenAI positioning is key. Even though none of the aforementioned players can yet claim supremacy in this space (sorry, Palantir and C3 AI), this is an area where late movers have no room.
Industry Expertise
Naturally, engineering vendors and Industrial OEMs would shine in this category. However, due to poor scoring in other areas, we haven’t included them in the review.
But this category is also not just about leveraging intrinsic OEM knowledge of the machinery. It’s about incorporating various physics-based simulators, having the ability to analyze lab/NDT inspection results, and experience with more forward-looking digital products like drones and robots.
This is why we put Cognite ahead of Palantir. The Norwegian company also has stronger industrial talent on the team and is 100% focused on industrial use cases (as opposed to the ~20 business categories that Palantir is playing in).
C3 AI was a leader in this category 3-4 years ago. Since then, however, we have been waiting for a turnaround and more customer success stories.
Ecosystem
It’s not just about having open APIs (although not everyone has them) and signing partnership agreements with SIs.
It’s about:
Overall nature of the vendor - are they positioning themselves as an E2E shop or promoting an open ecosystem?
Number of connectors (both to other data sources and BI/app development tools) as well as SDKs to build solutions in your language of choice (Python and JavaScript are must-haves)
Partners - if a data platform vendor has partnerships with various SIs and the Accentures of the world, that’s a big plus. We, however, look for the depth of such partnerships and not breadth.
User community - A vibrant community means it will be easy for you to use the platform, find talent, and run efficient implementation projects
Documentation - same as above
Snowflake and Databricks are clear winners in this category and set the example for the others. Palantir is a bit lower today than Cognite since it was historically very closed. Yet, this is changing. We are curious to see how this ranking looks in 12-24 months.
And yes, C3 AI, we expect more from you.
Conclusion
Remember, this overview reflects the situation at a specific moment in time.
SaaS is a highly competitive space and all of the aforementioned players invest a lot in R&D. Therefore, besides a grandiose strategy, execution will play a key role in where the players are in the next 2-3 years.
On purpose, we haven’t included any scoring or weighting criteria in the ranking. These should be adapted to your individual company situation, your other tooling, and the talent that you have.
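If you want to adapt the ranking to your own priorities, one simple option is a weighted average across the six decision criteria. The Python sketch below is only an illustration of that approach: the vendor names, scores, and weights are hypothetical placeholders and do not reflect our assessment of any real product.

```python
def weighted_ranking(scores, weights):
    """Rank vendors by the weighted average of their per-criterion scores.

    scores:  {vendor: {criterion: score}}
    weights: {criterion: weight}, higher means more important to you
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    totals = {
        vendor: sum(weights[c] * vendor_scores[c] for c in weights) / total_weight
        for vendor, vendor_scores in scores.items()
    }
    # Highest weighted score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights: this company cares most about time to value
# and contextualization.
weights = {
    "time_to_value": 3,
    "contextualization": 3,
    "data_management": 2,
    "analytics_mlops": 2,
    "industry_expertise": 1,
    "ecosystem": 1,
}

# Hypothetical scores for two placeholder vendors.
scores = {
    "Vendor A": {"time_to_value": 4, "contextualization": 4, "data_management": 3,
                 "analytics_mlops": 3, "industry_expertise": 4, "ecosystem": 3},
    "Vendor B": {"time_to_value": 3, "contextualization": 2, "data_management": 4,
                 "analytics_mlops": 4, "industry_expertise": 2, "ecosystem": 4},
}

ranking = weighted_ranking(scores, weights)
for vendor, score in ranking:
    print(f"{vendor}: {score:.2f}")
```

Changing the weights (for example, putting analytics first for a data-science-heavy team) can flip the order, which is exactly why the weighting should come from you.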
We hope this overview will help you in your purchase/product strategy decisions!
PS. If you feel something is missing or, quite the opposite, want to congratulate us on terrific work, please leave a comment.