Stratus goes high availability in the cloud

By Martin Banks, November 13, 2014
Summary:
Putting high availability and high reliability services in the cloud gives Stratus the chance to make ‘mission critical’ the everyday standard.

OpenStack is now starting to get some real traction, what with the likes of Rackspace, Red Hat, HP and IBM all offering cloud services built on the infrastructure, and that in turn is starting to attract applications and service providers to create a richer set of tools for users to exploit.

Some of these, such as the recently announced Software-Defined Availability (SDA) services from Stratus, have the potential to add serious mission critical strength to what can be achieved with OpenStack.

Stratus, of course, has built its reputation on providing high reliability services. But this has been on the back of inevitably expensive, dedicated hardware and software systems, making it one of those truly ‘mission critical’ solutions – only used where the need is great and the investment justifiable.

But, as Jason Andersen, Senior Director, Products and Marketing at Stratus, has noted, the advent of the cloud is changing all that. The need now is to get high reliability, high availability services out into the cloud, ready for use on the commodity servers found in every datacenter and at every cloud services provider:

The people who can do this best have been the old school Unix and mainframe communities, but when it comes to the cloud everyone wants cheap hardware. So SDA sets about solving the same problems and applying the solutions to cloud-based hardware environments.

Some of this can be done at the hypervisor level, some at the workload management level, and some has to happen around OpenStack itself, because this was initially never designed to support mission critical applications. It is still really geared towards the stateless web-type stuff rather than transaction-based systems such as ERP or other more robust applications.

Our assertion is that if we could build a set of products, capabilities, frameworks and structure to enable customers to save money by running those types of application in the cloud it would be a good thing. They will have lower development cost because of the way we have designed our hypervisor and, because it is cloud, availability becomes a service of that cloud.

The company has had to develop its own hypervisor, which smacks of a proprietary trap looming for users, but that will be of less importance with a cloud service. The company has come up with a modified version of the KVM hypervisor that can now support various fault tolerant situations.

According to Andersen, certain types of availability are suited to certain types of hypervisor, so the trick is to make sure workloads are directed to the appropriate environment. This is achieved with a management layer that can analyse tasks for their operational requirements and route them to systems running the appropriate hypervisor and environment.
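The routing idea Andersen describes can be pictured with a minimal sketch. It is not Stratus code: the availability tiers, pool names and the `route` function are all hypothetical, invented here purely to illustrate matching a workload's stated requirement to an environment able to meet it.

```python
from dataclasses import dataclass
from enum import IntEnum


class Availability(IntEnum):
    """Hypothetical availability tiers, ordered from weakest to strongest."""
    STANDARD = 0
    HIGH = 1
    FAULT_TOLERANT = 2


@dataclass(frozen=True)
class HostPool:
    """A group of commodity servers running one kind of hypervisor (illustrative)."""
    name: str
    hypervisor: str
    max_tier: Availability


@dataclass(frozen=True)
class Workload:
    """A task plus the availability level it says it needs (illustrative)."""
    name: str
    required: Availability


def route(workload, pools):
    """Pick the least capable pool that still meets the workload's requirement."""
    candidates = [p for p in pools if p.max_tier >= workload.required]
    if not candidates:
        raise LookupError(f"no pool can host {workload.name}")
    # Prefer the weakest adequate tier so premium capacity is not used needlessly.
    return min(candidates, key=lambda p: p.max_tier)


# Assumed example environment: one ordinary pool, one fault tolerant pool.
POOLS = [
    HostPool("general", "stock KVM", Availability.STANDARD),
    HostPool("protected", "modified KVM", Availability.FAULT_TOLERANT),
]

web = route(Workload("web-frontend", Availability.STANDARD), POOLS)
erp = route(Workload("erp-quarter-close", Availability.FAULT_TOLERANT), POOLS)
print(web.name, erp.name)
```

Choosing the weakest pool that still satisfies the requirement is one possible policy; it keeps the more expensive fault tolerant capacity free for the workloads that actually ask for it.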

Put into the cloud, he says, the argument gets more compelling, because clouds are supposed to have good chargeback systems that allow users to be charged only for what they use:

This approach allows it to happen, and until now the only way of achieving this has been to install dedicated, hardwired and expensive systems. Yet everyone knows the scenario where a business has an application like CRM or accounts. Most of the time this can work in a normal environment, but in the last two weeks before closing a quarter’s business, that application better not fail.

New services

The users will have to expect to pay more for such a service, but will not need to carry the cost of specific high availability systems that are not really needed for 10 weeks out of the 12.

Stratus calls this capability Workload Services, and it is one of the two ‘secret sauces’ that go to make up its cloud-based SDA offering. The other is Availability Services, which is its KVM-derived hypervisor. This then provides the environment that can deliver high availability on commodity servers as a cloud service.

This raises the issue of what this then does to the existing Stratus hardware business:

If you look at our hardware business it has two pieces. The first, what I call our current business, is FT server, which is deployed in remote environments where there are no datacenters, such as gas pipelines. So I don’t see it affecting that at all.

But we also have a very robust hardware business across the world for datacenter products. It is possible that some time down the road some of those applications might move to the cloud. But what we are finding is that most of the OpenStack-based applications are new ones; there is no mass migration of existing applications emerging yet.

It is, indeed, quite likely that SDA will drive new developments and applications of HA services rather than migrate existing ones. It is not too difficult to see the possibilities attracting the interest of more specialist members of the small and medium sized business community where loss of availability could be a negative business issue.

Andersen acknowledges that there is some scope in the long term for looking at stretching the capabilities of Workload Services to work with the webscale approach being adopted by companies like Nutanix. This would add the ability to cover elasticity as well as availability. It is, however, most certainly not part of the company’s road map at this point in time.

It can already add some elasticity, of course, by using the classic approach of adding new instances to cope with the workload, but Andersen can see that there is scope for a better engineered solution, especially where the application demand, such as the quarter-end scenario, pushes elasticity and availability to the point of being highly complementary capabilities.

The underlying management operations are based on Service Templates, which define all the elements being managed. He can see that there is a possibility to tweak these to accommodate elasticity services, though they are currently more geared towards the needs of applications that have traditionally required high availability services.

When Stratus started, the only way to do high availability was with hardware, so it was inevitably expensive and only used for the most demanding of applications. The move to the cloud, however, does prompt Andersen to envision a case where ‘availability’ becomes far more widely used and accepted as a business tool, to the point where it is effectively part of the ‘norm’:

What we are really trying to do now is be a solution enabler, so if you buy a cloud from, say, IBM, you can say ‘this portion of my application needs high availability’ and IBM can say ‘OK, we can dial that up for you’. So we’re not trying to be the next Red Hat, but we do want to be the next Intel of the cloud, the enabler of the cloud where people take this stuff for granted.

My take:

Putting high reliability into the cloud as a service may take some time to catch on – it is probably still far too ingrained as ‘too expensive’ for many users to tumble that they too can start using it. But it is an excellent example of the true value of the cloud in opening up new horizons for all and sundry.