One of the hot subjects of the last year has been the move towards multi-cloud services, creating the opportunity for businesses to build out increasingly virtual, dispersed data centers. In these environments the growing gravity of data makes it more practical, economic and just plain sensible to carry out processing where the data is created, out on the many edges of a company's network.
The good side of this is that, in most cases, only the results of such processing – with localised analytics likely to be a favourite task – make up the core traffic. The primary load will be metadata about the analytical results travelling up to the central back office, plus instructions on next actions heading back out to the edge. And even then, the edge resources will be well able to keep services running to plan, so only change information will be necessary.
One major downside of this approach, however, is that the results of such processing, by the very nature of being conducted at the edges, have a high potential to arrive at the center as an asynchronously entangled mess. Technically it may be a thoroughly elegant solution, but from a business management point of view it will have – at best – little or no value, and at worst could be seriously counter-productive.
So making the multi-cloud-based dispersed data center work at all will require a serious expansion in the capabilities and data resources of the metadata being used. It has to grow to become the connective tissue of the dispersed data center. It must not only describe what the data is, but also its context – its place and priority in the whole information schema of the business management process. These are the essential rules by which it plays its part in making sense of a much greater business 'whole'.
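To make that idea concrete, a context-carrying metadata record might look something like the sketch below. This is purely illustrative – the `ContextRecord` structure and its field names are invented for this article, not part of any EBX schema:

```python
from dataclasses import dataclass, field

@dataclass
class ContextRecord:
    """Hypothetical metadata record: describes not just what a data set
    is, but where it sits in the wider business information schema."""
    dataset_id: str        # what the data is
    source_site: str       # which edge location produced it
    business_domain: str   # e.g. "customer", "product", "employee"
    priority: int          # its place and priority in the information schema
    lineage: list = field(default_factory=list)  # upstream data sets it derives from

# A result arriving from one edge location, carrying its own context
record = ContextRecord(
    dataset_id="sales-summary-week-24",
    source_site="edge-paris-03",
    business_domain="customer",
    priority=1,
    lineage=["pos-transactions", "loyalty-events"],
)
print(record.business_domain)  # the context the center needs to make sense of the result
```

The point of the sketch is simply that every result travelling to the center carries enough context (domain, origin, lineage, priority) to be slotted into the business-wide schema on arrival, rather than arriving as an anonymous blob.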
ICM – gotta call it something
If this new role were to have a three-letter acronym (and that is almost essential these days), a likely candidate might be ICM – Information Context Management. It was a suggestion that Orchestra Networks' co-founder and CEO Christophe Barriolade approved, for that is the role EBX has been designed to fill:
The only way to make it work is to provide some context to the AI, or even to train it. Otherwise, you get crazy results. And so EBX can help because it will provide all the right definition of data sets and the hierarchies that are correct, that are true, and that will help control the result of what the machine learning system will do.
The company describes EBX as a single solution to manage, govern, and share all master data, reference data, and metadata assets. The objective was to identify the key shared data that big organisations rely on – the data that basically defines their business. That can be data for products, customers, or employees; it can be reference data, such as classifications and codifications; and it can be metadata. That, in Barriolade's view, is the data that has to be correctly interconnected in real life.
Modern dispersed data centers will have to work simultaneously with a vast array of mixed data: unstructured data from staff, such as email traffic; input from customers and consumers in the form of social media (including photographs and video); collated IoT sensor data; and the metadata of local edge analytics. All of it will be arriving, randomly, from every edge location the business has.
All of that then forms a set of shared data that needs to be used by all operational and business management systems, which have to rely on using the same data to maintain business coherency and integrity. A key additional component here is compliance, because to comply with regulations, a business has to be sure that all operations rely on the same shared data. Barriolade explains:
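As a toy illustration of why every system drawing on the same shared reference data matters for coherency, consider the check below. The reference set, record shapes and `validate_record` function are invented for the example:

```python
# Single authoritative reference data set, shared by every operational system
REFERENCE_COUNTRY_CODES = {"FR": "France", "DE": "Germany", "GB": "United Kingdom"}

def validate_record(record: dict) -> bool:
    """A compliance-style check: every operational record must use
    the shared reference codes, not a locally invented variant."""
    return record.get("country") in REFERENCE_COUNTRY_CODES

crm_record = {"customer": "ACME", "country": "FR"}
billing_record = {"customer": "ACME", "country": "FRA"}  # local variant has drifted

print(validate_record(crm_record))      # aligned with the shared reference data
print(validate_record(billing_record))  # incoherent: fails the shared-data check
```

Two systems describing the same customer with different codes is exactly the kind of drift that breaks both business coherency and regulatory reporting; a single governed reference set makes the drift detectable.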
The cloud is basically the new 'mess' if you don't control the shared data. For when you put your system in the cloud, you need to make sure the shared data really aligns with what defines your business.
Metadata will soon outrank data
It was this 'ICM' capability that attracted the interest of Matt Quinn, Tibco's Chief Operating Officer:
Sometimes the metadata, the data about data, is more important than the data itself, and that is becoming more true, not less true. And the other thing is the organisation of that metadata, because it's becoming more valuable, is also becoming more important. Building modern applications and systems, users rely more heavily on metadata, because data is going to change over time. The rise of metadata has always been there, but what EBX provides is a platform to manage this in a very natural way as it evolves. And that to me was absolutely critical to where Tibco needs to go.
EBX therefore provides Tibco with the important data management piece of the multi-cloud dispersed data center. The company has most of the other tools needed: to integrate, to connect, to create areas of collaboration with APIs, to manage events across that whole environment, and to provide the all-important piece – local, end-point analytics at the edges. But in the end, what comes back through all that could still be a mess, because the one piece that was missing was the ability to organise that metadata and make sense of it all when it ends up at the center.
Quinn sees this as a significant development on from MDM, which has tended to be limited to very large-scale, quite complex, very customised, multi-master problems. The classic example is where a business has different applications with different understandings of who the customer is:
Arguably, that era was more miss than hit, just because of the size, scope and complexity of what companies did. Now, at the same time, the analytical job back in that era was relatively straightforward. It was largely based around batch-driven systems that would do all the cleaning, the metadata and master data munching; and at the same time was fiercely complex but ultimately, relatively straightforward. This was because no one expected a report to be real time. Back then I just expected a report to be on my desk on a Monday morning.
When it comes to the technology, Tibco sets out to cover most of the '-ilities', such as flexibility, scalability and agility – which is what you want to get from the cloud. But the flip side is that, if you are not very careful, you also end up with an almost uncontrollable mess. That would certainly be counter-productive for the growth of the edge and the increasing use of analytics out where the data is generated. And that use will grow there because the 'gravity' of the vast volumes of data created at the edge means the cost and latency of moving large amounts of data to the center make it more sensible to put analytics processing out at the edge, where the data is.
All of this is of course irrelevant, regardless of its technical elegance, if there is no management of the data flow and no method for contextually connecting data into meaningful – and valuable – relationships.
On this basis the acquisition looks like a shrewd move by the company, giving it a shot at being one of the early contenders in the 'one-stop-shop' market of vendors able to provide a soup-to-nuts dispersed data center infrastructure environment.