Why SAP should put HANA on Open Compute

By Phil Wainewright, October 2, 2013
Summary:
High-performance cloud applications and big data stacks often skip virtualization and write to the metal. Could OCP be the way to hone HANA for cloud?

Those who imagine that virtualization is an essential ingredient of cloud computing must have been mystified to discover from my post on Monday that Facebook prefers not to use the technology. Instead of abstracting away the hardware layer as conventional wisdom suggests one should do in a cloud architecture, Facebook very much factors it in, with its insistence on designing its own custom-built database machine. What's going on?

In fact, it's replication, not virtualization, that counts in cloud computing: the ability to scale out across an elastic multitude of identical instances.

How replication is achieved depends on the context. In an IaaS environment like Amazon Web Services (AWS), which seeks to host a large and diverse population of computing loads, virtualization is a crucial agent of homogenization across the computing stack that hosts those loads. For Facebook, where the load is already homogenized because the vendor is providing the same application to hundreds of millions of users, virtualization is an unnecessary overhead.

Bare metal cloud

This is why the future isn't going to converge on a tiny handful of global cloud computing giants all competing to provide identical IaaS services. Although there will be giants in that space, there will also be huge fragmentation of the market across specialist providers of application-specific cloud computing.

So Salesforce.com on Exadata is one template for apps that use very large Oracle 12c database instances. Facebook provides a very different template based on using its Open Compute designs. As I noted on Monday, OCP server designs are starting to find their way into enterprise datacenters mainly because of "the lack of a pre-existing industry standard for cloud datacenter servers."

Many other templates will emerge. Expect to see some interesting developments once the first generation of 64-bit ARM servers comes out. The ARM architecture is best exploited by parallelizing similar workloads such as large numbers of web servers rather than virtualizing many different workloads.

Indeed it's already possible to rent 'bare metal cloud' in which you pay by the hour for your preferred hardware configuration. Bigstep, a London-based venture launched by an established global hosting provider, quotes benchmarks that show virtualization decreases performance by anything between 20 and 80 percent, especially for I/O intensive workloads such as big data. It is therefore targeting its offering at operators of large-scale big data platforms, who have most need to squeeze every ounce of performance out of the underlying computing.
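To put that 20 to 80 percent range in perspective, here is a rough back-of-the-envelope sketch of how many virtualized nodes it would take to match a bare-metal cluster. It assumes throughput scales linearly with node count and that the overhead applies uniformly to every node; neither assumption comes from the Bigstep benchmarks, and the function name and the 100-node cluster size are purely illustrative.

```python
def virtualized_nodes_needed(bare_metal_nodes: int, overhead_pct: int) -> int:
    """Virtualized nodes needed to match the throughput of a bare-metal cluster.

    Assumes (hypothetically) linear scaling and a uniform performance loss of
    overhead_pct percent on every virtualized node.
    """
    if not 0 <= overhead_pct < 100:
        raise ValueError("overhead_pct must be between 0 and 99")
    remaining_pct = 100 - overhead_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-bare_metal_nodes * 100 // remaining_pct)


if __name__ == "__main__":
    for pct in (20, 80):
        nodes = virtualized_nodes_needed(100, pct)
        print(f"{pct}% overhead: {nodes} virtualized nodes to match 100 bare-metal nodes")
```

Under these assumptions, the low end of the range means a quarter more machines, while the high end means five times as many, which is why operators of I/O-intensive platforms care so much about the difference.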

Honed for HANA

Where else will this lead? Well, if big data is a beachhead use case for application-specific cloud computing stacks, then perhaps SAP should be looking closely at teaming up with the Open Compute Project (OCP) to develop an open-source hardware specification that's honed for HANA.

This would be a fitting follow-on from its BW on AWS offering and also benefit many of its customers who currently face huge problems when they come to implement HANA in-house.

As my diginomica colleague Dennis Howlett put it when he broached the idea in an email conversation earlier this week:

"Provisioning HANA is a dog's breakfast because there are seven approved hardware manufacturer configurations, all of which have a different approach. OCP might represent a way out of this spaghetti soup in the context of a soon (so they say) OLTP/analysis 'version' of the HANA architecture."

This, it must be said, is pure speculation, but a single, open-source hardware reference design could be a considerable help to enterprises aiming to implement HANA. Instead of every customer paying a tidy sum to systems integrators or spending in-house resources on separately working out the best way to get optimum performance, the open-source model allows those resources to be pooled, achieving a better result at a lower cost for each participant.

Scale-out challenge

The big potential stumbling block in this rosy picture is HANA's ability to scale out across multiple commodity hardware boxes (noting that the definition of 'commodity' in this context extends to include Facebook's design for a server incorporating a 3.2-terabyte solid-state flash memory card). Facebook has an OCP design for database machines because it runs hundreds or thousands of them at a time. If one fails, the provider just brings up another node without the end user ever noticing a glitch.

Can HANA achieve the same effortless fault tolerance? It has been validated to scale out across 12 nodes in production systems and is capable of much more according to SAP executive board member Vishal Sikka, quoted in that announcement in May last year:

"Recently, in our research labs, we have assembled the largest SAP HANA system that scales to 100 nodes, 4000 computing cores and 100 TB in DRAM."
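For a sense of what each box in that lab system looks like, the quoted totals can be divided through by the node count. This is a simple sketch that assumes the 100 nodes are identically configured, which the quote does not state; the variable names are illustrative only.

```python
# Totals as quoted by Vishal Sikka for the research-lab HANA system.
NODES = 100
TOTAL_CORES = 4000
TOTAL_DRAM_TB = 100

# Assumption: every node has the same configuration.
cores_per_node = TOTAL_CORES // NODES
dram_tb_per_node = TOTAL_DRAM_TB // NODES

if __name__ == "__main__":
    print(f"{cores_per_node} cores and {dram_tb_per_node} TB of DRAM per node")
```

On that assumption, each node carries 40 cores and 1 TB of DRAM, so 'commodity' here still means a very well-specified server.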

Verdict

All of this is an interesting thought experiment. Perhaps the gulf between cloud and traditional enterprise computing is not as unbridgeable as I have previously portrayed, at least when operating at a sufficiently large scale. Cloud application players amply demonstrate that once scale becomes large enough, there is much they can do to optimize under the covers of their own infrastructure. Maybe SAP should use OCP to help its HANA customers do likewise.

Disclosure: SAP, Oracle and Salesforce.com are diginomica premium partners.

Image credit: Baking bread © Minerva Studio - Fotolia.com.