
ScaleOut open sources real-time digital twin development tech

George Lawton, January 31, 2024
ScaleOut Software has open sourced a new real-time digital twin API and workbench. This promises to usher in a new generation of interconnected simulations to improve logistics, smart cities, and operational planning.


ScaleOut Software has released new open source APIs and development tools that will make it easier to monitor large-scale collections of things and run what-if scenarios. For example, airlines can test out different responses to inclement weather, cities can optimize traffic lights, and factories can plan around equipment outages – all in real time. 

How real-time? As fast as 1-3 milliseconds when running on the company’s in-memory computing platform. That’s according to Dr. William Bain, founder and CEO of ScaleOut Software.

For example, an airline operations manager faced with flight cancellations due to inclement weather will be able to model the impact of holding specific outbound flights on the system’s key performance indicators and help guide critical decisions. Simulations will model the ripple effects of schedule changes on the myriad interacting entities within an airline system, including passengers, bags, pilot availability, gate assignments, and much more. The results of these simulations will provide managers with new insights that help them manage live operations.

The new platform bridges a gap between streaming analytics infrastructure that excels in aggregating data from many sources, digital twins in Product Lifecycle Management (PLM) for improving product development, and existing cloud digital twin solutions optimized for asset management rather than real-time simulation. 

The new platform is already being used to prevent train derailments, unlock new capabilities for Azure digital twins, quickly analyze and resolve operational issues in vehicle fleets, and streamline security analysis and response in complex cyber/physical networks. 

ScaleOut Software has been developing infrastructure for weaving large numbers of PCs into data grids since 2003 to create large in-memory databases. In the early days, this was a promising alternative to the scale-up approach of traditional databases, which involved adding more memory, compute, and storage to a single server. Its support for a real-time digital twin tier is a natural outgrowth of that early work, extending in-memory computing to deliver the computing power needed to simultaneously run thousands of digital twins that can analyze telemetry in microseconds.

Bain explains:

Instead of just storing key information in databases, where it can be slow to retrieve and analyze, in-memory computing keeps all needed data in memory distributed across a cluster of servers or cloud instances, which form a scalable computing cluster. It also routes incoming messages to their corresponding digital twins for processing in a manner that minimizes data motion and delays. Lastly, it incorporates data-parallel computing that continuously combines data from multiple digital twins and visualizes trends within seconds.

In-memory computing enables thousands (or even millions) of digital twins to continuously process incoming telemetry with peak performance for real-time monitoring or to simulate a large, complex system, such as an airline’s operations. This is just what enterprise applications need to track large populations of data sources, provide immediate alerting, visualize emerging trends, and make informed decisions in the moment.
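
To make the routing and aggregation idea concrete, here is a minimal Python sketch. It is not ScaleOut's API: the `Twin` and `Cluster` classes, the hash-based placement, and the truck telemetry values are all assumptions made for illustration. Each twin's state stays in memory on one "node", messages are routed to the node that owns the twin, and an aggregate is computed per node before the partial results are combined.

```python
from dataclasses import dataclass
from zlib import crc32


@dataclass
class Twin:
    """Hypothetical in-memory state for one data source."""
    twin_id: str
    count: int = 0
    last_value: float = 0.0

    def process(self, value: float) -> None:
        # Update state in place; no database round trip.
        self.count += 1
        self.last_value = value


class Cluster:
    """Toy stand-in for a cluster: each 'node' is a dict of twins held in memory."""

    def __init__(self, node_count: int) -> None:
        self.nodes = [dict() for _ in range(node_count)]

    def node_for(self, twin_id: str) -> int:
        # Stable hash, so a given twin always lives on the same node.
        return crc32(twin_id.encode()) % len(self.nodes)

    def route(self, twin_id: str, value: float) -> None:
        # Deliver the message to the node that already holds the twin's state.
        node = self.nodes[self.node_for(twin_id)]
        twin = node.setdefault(twin_id, Twin(twin_id))
        twin.process(value)

    def aggregate_max(self) -> float:
        # Data-parallel style: one partial result per node, then combine.
        partials = [
            max((t.last_value for t in node.values()), default=float("-inf"))
            for node in self.nodes
        ]
        return max(partials)


cluster = Cluster(node_count=3)
for twin_id, value in [("truck-1", 61.0), ("truck-2", 74.5), ("truck-1", 63.0)]:
    cluster.route(twin_id, value)

highest = cluster.aggregate_max()
```

In a real deployment the nodes would be separate servers or cloud instances, and the per-node step would run in parallel; the sketch only shows the shape of the computation.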

A new architecture

The big innovation is that ScaleOut’s approach allows developers to think about coding digital twins as objects that can be organized into hierarchies or interacting agents. For example, a hierarchy might allow an airline to represent the individual components of a plane, the plane as a whole, and then multiple planes as part of a larger-scale scheduling simulation. The interacting agents might model the behavior of a collection of cars in heavy traffic and how different traffic light timings, accidents, or construction works affect aggregate throughput or speed. 
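
A rough sketch of the hierarchy idea, in Python, might look like the following. The class names (`ComponentTwin`, `PlaneTwin`, `FleetTwin`), the tail numbers, and the airworthiness rule are hypothetical and are not taken from ScaleOut's APIs; the point is only that each level of the hierarchy is an ordinary object that answers questions by consulting the objects below it.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentTwin:
    """Hypothetical twin for a single aircraft component."""
    name: str
    healthy: bool = True


@dataclass
class PlaneTwin:
    """Hypothetical twin for a plane, composed of component twins."""
    tail_number: str
    components: list[ComponentTwin] = field(default_factory=list)

    def airworthy(self) -> bool:
        # Assumed rule: a plane is airworthy only if every component is healthy.
        return all(c.healthy for c in self.components)


@dataclass
class FleetTwin:
    """Hypothetical twin for a fleet, used by a scheduling simulation."""
    planes: list[PlaneTwin] = field(default_factory=list)

    def available(self) -> list[str]:
        return [p.tail_number for p in self.planes if p.airworthy()]


engine = ComponentTwin("engine-1")
plane_a = PlaneTwin("N100", [engine, ComponentTwin("apu")])
plane_b = PlaneTwin("N200", [ComponentTwin("engine-1")])
fleet = FleetTwin([plane_a, plane_b])

# A component-level fault propagates up to the fleet-level view.
engine.healthy = False
available_planes = fleet.available()
```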

This object-oriented approach differs from existing approaches for building digital twins using model definition languages. For example, Azure Digital Twins uses the Digital Twins Definition Language (DTDL). However, this approach can be harder to understand and program, since it differs from how we think about the world as objects. Bain argues that the traditional approach has slowed the widespread adoption of digital twins, particularly when it comes to simulations. He explains:

Widely accepted open-source APIs for constructing digital twins have not yet emerged. Likely because digital twin applications have largely focused on asset management, most APIs take the form of model definition languages, such as those in Azure Digital Twins and UA Cloud Twin, rather than object-oriented APIs. While this approach enables the creation of complex, rich asset hierarchies (such as the components of an office building), it does not take full advantage of the power of digital twins to track live entities or build simulations.

That said, ScaleOut’s digital twin model can integrate with other models, such as Azure Digital Twins, to add real-time analytics and simulations using fast, in-memory computing, and it can reduce the need for a complex web of serverless functions between multiple digital twins. This new real-time digital twin tier can mirror the digital twins created by the cloud platforms, perform real-time processing using in-memory information, and push changes back to the cloud digital twin service.

The new architecture also differs from the current digital twins used in PLM and the NVIDIA Omniverse, which focus on optimizing the design of complex products, training autonomous driving AIs, and improving robot control models. One big leap is the extension of the notion of digital threads to real-time scenarios. In the traditional digital twin model, a digital thread helps bring together data from across different PLM, CRM, ERP, and manufacturing execution systems to streamline collaboration across different views and roles in large-scale product development efforts. 

Referring to this current crop of PLM-focused digital twins, Bain says:

They employ the digital twin concept as a generalized, guiding principle for constructing software models of these products. In contrast, ScaleOut’s digital twin model defines a precise software architecture for creating large numbers of digital twins that track and model real-world entities. ScaleOut aims to help managers of large systems, such as trucking fleets and security systems, maximize situational awareness and make the best possible decisions in the moment.

The new architecture also promises to extend the utility of traditional streaming analytics architectures focused on quickly aggregating data. These architectures excel at bringing together incoming telemetry data. But Bain argues they struggle to tease apart the context of the information they are processing across different levels of granularity in real time:

Conventional streaming analytics platforms do not separate messages from each data source and analyze them in context as they flow through message pipelines. Instead, they ingest and store telemetry from all data sources, attempt a preliminary search for interesting patterns in the aggregated data stream, and defer detailed analysis to offline batch processing, which may take minutes or hours to complete. As a result, they are unable to introspect on the dynamic, evolving state of each data source and alert on issues that require immediate attention.
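
The difference Bain describes can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the `SourceTracker` class, the three-reading window, the "strictly rising" alert rule, and the pump readings. The sketch keeps a short history per data source and evaluates each message against that source's own history as it arrives, rather than searching one combined stream.

```python
from collections import defaultdict, deque

WINDOW = 3  # hypothetical: number of recent readings kept per source


class SourceTracker:
    """Keeps a small in-memory history per data source."""

    def __init__(self) -> None:
        self.recent = defaultdict(lambda: deque(maxlen=WINDOW))

    def ingest(self, source_id: str, value: float) -> bool:
        """Return True if this source rose strictly across its last WINDOW readings."""
        history = self.recent[source_id]
        history.append(value)
        values = list(history)
        return len(values) == WINDOW and all(
            earlier < later for earlier, later in zip(values, values[1:])
        )


# Interleaved telemetry from two sources, as it might arrive on one pipeline.
messages = [
    ("pump-1", 50), ("pump-2", 80),
    ("pump-1", 55), ("pump-2", 70),
    ("pump-1", 60), ("pump-2", 60),
]

tracker = SourceTracker()
alerts = [source_id for source_id, value in messages if tracker.ingest(source_id, value)]
```

Here only `pump-1` raises an alert, at the moment its third rising reading arrives, even though its readings are interleaved with falling readings from `pump-2`.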

The open source bit

To clarify, ScaleOut is open sourcing tools that improve developer productivity when crafting digital twins; the twins themselves will run on ScaleOut’s proprietary architecture and services. This promises a big leap for enterprise developers building solutions for logistics, security, fleet management, and smart cities. In particular, it will make it easier to experiment with object-oriented design principles for building large-scale interconnected digital twins.

ScaleOut’s development workbench allows users to build digital twin applications using either Java or C# APIs. Using the workbench APIs, they can also write test programs that run individual digital twin models or simulations in their development environment and examine state information as necessary. Once they have completed testing, developers can deploy digital twin applications to a production platform. Bain says this will make it easier to test and debug digital twin applications:

As all developers know, the process of testing and debugging a new application can be arduous and require repeated testing. The open-source workbench shortens the development cycle by enabling developers to test digital twin applications in a familiar development environment. It also avoids the need to deploy applications to a live production platform, like ScaleOut Digital Twins, during the development process.

However, for the moment, these new digital twins will only run on ScaleOut’s platform. Other vendors may develop supporting infrastructure if the object-oriented architecture and APIs catch on. Competing in-memory data grids include the open source Apache Ignite and commercial offerings from GridGain, Hazelcast, and Oracle. Also, Rescale is developing digital twins for cloud-based high-performance computing.

ScaleOut is also working with the Digital Twin Consortium to support industry definitions, collaboration and standardization. On this front, Bain says:

By unlocking our APIs to open-source development and testing, I hope that DTC members and other organizations will be able to explore the use of real-time digital twins for new, innovative use cases as we work to create open source standards.

My take

ScaleOut’s work on extending digital twin infrastructure to support real-time analysis and simulation represents an important milestone in the history of digital twins. Thus far, most digital twin efforts have focused on improving collaboration across different roles, designing better autonomous car and robot controls, or aggregating fleet data for later offline analysis.

If the idea and architecture catch on, it may also make it easier to scale and train new AI and machine learning models to improve ongoing prediction and control. It could also take advantage of new innovations in MQTT messaging and unified namespace interoperability to improve industrial automation.

Another area of innovation is new AI surrogate models that can accelerate simulations thousands or millions of times compared to traditional approaches. For example, Google DeepMind made a big splash late last year with a new weather forecasting model built on graph neural networks. It could produce accurate 10-day forecasts on a single Google TPU v4 machine, a task that previously took hours on hundreds of machines using conventional physics-based models.

Bain thinks that similar innovations in other types of models could eventually be applied to other use cases in security and operations monitoring, such as improving the signal-to-noise ratio for accurately determining when to signal alerts without unnecessarily disrupting operations: 

By avoiding the need for complex coding and leveraging their powerful pattern recognition capabilities, machine learning algorithms can further increase the power of real-time digital twin implementations.
