The Babel Fish becomes a paradox, as Druva demonstrates

By Martin Banks, March 29, 2016
Summary:
Converged data protection specialist Druva shows that secondary storage is not just a nice-to-have backup option, but is fast becoming an essential point of centralisation that businesses ignore at their peril.


It is becoming clear that there is now a real issue at the heart of just about all IT-related architectures, and certainly any that contain an element of cloud, which can be best defined as the 'Babel Fish Paradox'.

This Babel Fish motif popped up again during a recent briefing with Jaspreet Singh, the CEO of Druva, a US-based specialist in converged data protection.

The common theme now in most IT architectures, from mobile apps, through large enterprise management services and on to IoT systems and their essential collaboration with business management tools, is the universal deployment of ever-increasing levels of distributed services. The growth in API understanding and use has led directly to the ability to glue together limitless numbers of different applications and services in limitless combinations.

What is more, those applications and services do not have to be in the same building, or even the same country. They can be anywhere in the world an internet connection can be made.

So far so amazingly wonderful, and the capability is leading to the development of services that would not have been thought possible only a few years ago. But it does have a down-side, and it is one that could stall and negate all that development potential before it has much more chance to grow.

That down-side is the fact that each of those distributed applications and services ends up growing its own database or repository, and every business then ends up with N different sources of similar and closely related data that could be anywhere around the globe. For the business, there is no single version of the truth about the business; instead there are N partial versions of bits of the truth.

And that is the nature of the paradox. The flexibility of distributed systems - being able to build complex business management services that incorporate the powers and capabilities of any number of different applications - will by its very nature build isolated data silos. These can be difficult to keep clean, and they make it hard to generate the all-important 'One Version Of The Truth' (OVOTT) that all businesses need.

This is especially important in a business world where the requirements for compliance with legislation and best practice are becoming ever more stringent.

So OVOTT creates an irresistible need for the exact opposite of distribution. At some point there is an absolute requirement for centralisation, and this is starting to appear in many areas, not just OVOTT. The recent features about Pentaho and Equinix are good examples of how the many flavours of IoT are creating the need for intelligent communications tools to sit between both the many flavours and the business-related tools and services that make sense of their activities in business terms.

It is in the OVOTT arena that Jaspreet Singh is pitching the Druva tent, rather than just being seen as a 'cloud storage services' vendor. The company's basic marketplace is the provision of secondary storage services, but to see only that is to seriously misunderstand its real target.

According to Singh, the goal is to provide a single repository of all the data a business creates in its daily processes, regardless of the source, and make it a value-generating entity:

With the lines blurring between using multiple cloud services, IoT, and mobile services there is no centralisation and therefore no overall data management. Most business users are in a weak position because they create multiple copies of data that are totally de-centralised and duplicated. And just adding secondary storage for backup purposes then doubles up the problem. What is needed is a move to the next level of abstraction.

Secondary data

The need he sees now is for businesses to use just one source of secondary data for all data sources that is a version-controlled, de-duplicated, single copy of all the data associated with the business. As the number of data sources a company uses grows – and for most it will – the ability of business or IT managers to keep track of not just the data but more importantly its quality will become a major stumbling block.
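To make "version-controlled, de-duplicated, single copy" concrete, here is a minimal sketch of the general idea. It is not Druva's implementation: the `SingleCopyStore` class, its method names and the sample sources are all hypothetical, and it assumes de-duplication by SHA-256 content hash, which is one common approach among several.

```python
import hashlib
from collections import defaultdict


class SingleCopyStore:
    """Toy store (hypothetical): one copy per unique payload, many versions per item."""

    def __init__(self):
        self.blobs = {}                    # content digest -> bytes, held once
        self.versions = defaultdict(list)  # (source, name) -> digests, oldest first

    def put(self, source, name, payload):
        digest = hashlib.sha256(payload).hexdigest()
        # De-duplication: identical content from any source is stored only once.
        self.blobs.setdefault(digest, payload)
        history = self.versions[(source, name)]
        # Version control: record a new version only when the content changed.
        if not history or history[-1] != digest:
            history.append(digest)
        return digest

    def get(self, source, name, version=-1):
        # version=-1 returns the latest version; 0 returns the first.
        return self.blobs[self.versions[(source, name)][version]]


store = SingleCopyStore()
store.put("laptop", "contract.txt", b"draft 1")
store.put("cloud-drive", "contract.txt", b"draft 1")  # same bytes, different source
store.put("laptop", "contract.txt", b"draft 2")       # a later edit

print(len(store.blobs))                                  # unique payloads held
print(len(store.versions[("laptop", "contract.txt")]))   # versions kept for the laptop copy
```

Three writes from two sources leave only two stored payloads, while the laptop copy keeps both of its versions. A real system would also have to handle chunk-level de-duplication, retention and access control, none of which is attempted here.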

Poor quality data will certainly lead to the loss of business, either by reducing the ability to provide customers with an engagement experience they wish to repeat, or through simply not knowing who is a customer. Perhaps more important, poor quality data management will lead a growing number of companies into the realms of compliance breaches, which these days can be rapidly followed by fines and court cases.

It is Singh’s view that secondary storage now has to be able to meet the needs of providing OVOTT services to users:

This is no longer about secondary storage, it is about providing data management in an area where users will need to take more data sources on board, and be able to do more with that data. They will all need one version of the truth if they are to fulfil requirements such as compliance management.

Many of the new services coming along, such as Office 365, Box and Google Apps, support more open data sources and are, according to Singh, therefore much easier to work with. To meet that change, Druva has just introduced inSync. This allows data collection, recoverability and governance for these increasingly popular business tools to be managed from a single, integrated platform.

And as the complexity of data sources increases, the company is already finding that some customers are building multiple, layered implementations of the storage system:

For example, companies are using Druva to provide specific storage and information management for national operations, so that ‘national’ data is stored and managed within that country. They then have an over-arching implementation of the system to create a single, global corporate picture of that company’s operational truth.

Because of the way data in secondary storage systems is starting to be used, it is now important that the service switches to become event-driven rather than time-driven. That means much of the data storage, together with processes such as de-duping, now has to be done ever-closer to real time:

Backing up one large event such as a time-driven daily backup is now far less valuable to a business than storing an event when it happens. This is increasingly valuable in a number of areas but as an example, managing compliance now means that just logging the storage of a data event is no longer good enough. Compliance increasingly requires the event, and information about the data stored, to be not just reported at the right time, but reported to the right person. That is what secondary storage now has to provide.
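The difference between a time-driven and an event-driven service can be sketched in a few lines. This is an illustration only, under stated assumptions: the `DataEvent` and `EventDrivenBackup` names, the data categories and the owner roles are all hypothetical and do not describe any Druva API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataEvent:
    """One change to business data, captured when it happens (hypothetical shape)."""
    source: str
    category: str
    payload: bytes
    occurred_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class EventDrivenBackup:
    """Stores each event on arrival and routes a report to the responsible person."""

    def __init__(self, owners, default_owner):
        self.owners = owners              # data category -> responsible role
        self.default_owner = default_owner
        self.stored = []                  # events kept in arrival order
        self.notifications = []           # (recipient, source, timestamp) tuples

    def on_event(self, event):
        # Store immediately, rather than waiting for a nightly batch window.
        self.stored.append(event)
        # Report to the right person, not just to a generic log.
        recipient = self.owners.get(event.category, self.default_owner)
        self.notifications.append((recipient, event.source, event.occurred_at))
        return recipient


backup = EventDrivenBackup(
    owners={"payroll": "finance-compliance", "patient-record": "privacy-officer"},
    default_owner="it-operations",
)
backup.on_event(DataEvent("hr-app", "payroll", b"salary change"))
backup.on_event(DataEvent("wiki", "notes", b"meeting minutes"))

for recipient, source, when in backup.notifications:
    print(recipient, source, when.isoformat())
```

Each event carries its own timestamp and is routed by category, so the payroll change goes to the compliance role while uncategorised data falls back to a default owner. A time-driven design would instead collect both changes into the next scheduled backup and report them together.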

My take

OK, using the Babel Fish as a motif here may be fanciful, but the paradox – that the greater the distribution of resources the greater the need for centralisation at some point(s) in the process – is becoming clear and its impact may prove significantly negative to businesses that charge gung-ho into the cloud without too much forethought.