How Workday does cloud

By Phil Wainewright, September 5, 2013
Summary:
Insights into how Workday handles security, multi-tenancy, release cycles, development and performance in its cloud application stack

Stan Swete, Workday

Next week's Workday Rising customer event will provide a welcome opportunity to catch up once more with the vendor's evolution of its infrastructure. Cloud architectures are a big talking point in IT these days, so it is always interesting to see how leading vendors in the industry are architecting their cloud applications.

As background preparation, this week I dug out some notes I made during a deep-dive conversation with Workday's CTO Stan Swete early last year. Quick disclosure: this research was part of some consulting work I was doing for the vendor, but the conversation was on the record.

Swete and I covered some interesting points about security, multi-tenancy, release cycles, metadata-based coding and in-memory processing. This provides a useful baseline for comparison with any updates coming out of next week's event.

Security in cloud applications

You often hear people talk about 'co-mingling data' in multi-tenancy infrastructures as if this were a somewhat distasteful side-effect that no right-thinking person would countenance. I've always felt this line of argument ignores the fundamentals of how digital security works.

Cloud vendors use powerful logical separation to keep data from 'co-mingling'. As I've said in the past, the fact that it may be stored on the same disk or go through the same processor chip is as irrelevant as worrying about sending your physical mail through the same postal system as your competitors.

In Workday's case, there's a consistent security layer built into the application that governs all access to the data, whether directly by a user or via an API. "Every single bit of access comes through this single security layer in the application server," Swete told me.

Each customer has a unique tenant ID that is coded into every piece of data and metadata. Any data requests or updates are mediated through a Java object that uses the tenant ID to access whatever data and metadata it needs. It's a logical impossibility for that software object to bring in someone else's data.
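Workday hasn't published its implementation, but the principle Swete describes can be sketched in a few lines of Java. In this hypothetical example, every record is keyed by tenant ID, and the only read path goes through a session object that carries a single tenant ID, so a request can never reach another tenant's rows (all class and method names here are illustrative, not Workday's):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: all reads are mediated by a session object that
// carries exactly one tenant ID, so a query can only ever see that tenant's data.
class TenantStore {
    // Raw store: every record is keyed by (tenantId, recordKey).
    private static final Map<String, String> records = new HashMap<>();

    static void put(String tenantId, String key, String value) {
        records.put(tenantId + "/" + key, value);
    }

    // The only read path: the tenant ID comes from the session, never from the caller.
    static String get(Session session, String key) {
        return records.get(session.tenantId() + "/" + key);
    }

    record Session(String tenantId) {}
}
```

Because the tenant ID is baked into the session rather than passed as a query parameter, there is no request a caller can construct that addresses another tenant's records.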

There is no way of accessing the underlying database directly without passing through the security layer. But suppose for a moment that an intruder did manage to bypass it; they'd encounter a further obstacle in the way the data is stored, due to the in-memory architecture.

What's stored in the database isn't a set of conventional rows of data in meaningful tables such as 'customer', 'sales' and 'employees'. All of that metadata is put together with the raw data when the application fires up in memory. What you would get directly off the disk would be of very little use.

So the concept of 'co-mingling' data is a red herring. The only way to get at the data is by logging into the application as an authorized user. The key to security here is in closely managing the access rights (and in taking advantage of additional safeguards such as multi-factor authentication).

In Workday, access rights can be defined individually, according to an employee's role within the organization, or based on contextual factors such as direct or indirect lines of reporting and cost centers. As an HR vendor, Workday typically already maintains the information that feeds into these role-based and contextual policies, and it can be configured to automatically update access rights if an employee changes roles — including leaving the company of course. Using technology from identity management vendor Okta, Workday can also manage access rights for other applications.
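The combination of role-based and contextual rules can be illustrated with a small, hypothetical policy check. This sketch grants access either via a role (an HR partner) or via context (being anywhere in the target employee's management chain); the names, roles and rule are my own invention, not Workday's actual policy model:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: an access policy combining a role check with a
// contextual check (is the requester in the target's management chain?).
class AccessPolicy {
    // Reporting lines: employee -> direct manager.
    static final Map<String, String> managerOf = Map.of(
            "carol", "bob",
            "bob", "alice");
    // Role assignments, the kind of data an HR system already maintains.
    static final Map<String, Set<String>> roles = Map.of(
            "alice", Set.of("hr_partner"),
            "bob", Set.of("manager"));

    static boolean canViewCompensation(String requester, String target) {
        // Role-based rule: HR partners may view anyone's compensation.
        if (roles.getOrDefault(requester, Set.of()).contains("hr_partner")) return true;
        // Contextual rule: any direct or indirect manager of the target may view it.
        String m = managerOf.get(target);
        while (m != null) {
            if (m.equals(requester)) return true;
            m = managerOf.get(m);
        }
        return false;
    }
}
```

Because the check walks the reporting line at request time, a change of manager in the HR data automatically changes who has access, with no separate permissions update.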

Practical multi-tenancy

Like many SaaS vendors, Workday runs its application on commodity servers and whether customers share a server or not is determined by factors such as load balancing, volume of transactions and usage patterns. Among the many definitions of multi-tenancy, this is an approach that some purists would argue isn't the real thing.

My take is that it counts as multi-tenant if the vendor is making the decisions about the most efficient way to use the underlying resources. That includes, for example, running busy customer instances on dedicated machines while less active instances run on shared machines. You could imagine a scenario in which a vendor is load-balancing those instances between shared and dedicated within a 24-hour cycle. You would have to be a real stickler to insist that was not a truly multi-tenant architecture.

At last year's Rising, word emerged of Workday using Basho Riak as an underlying data store, which certainly takes a further step towards satisfying the purists. I'll be probing next week to find out what more has changed in the intervening year.

Phased releases

Workday's server farm also allows it to give customers a certain amount of flexibility over when they move to a new release. Whereas some multi-tenant infrastructures implement the new release for everyone simultaneously, Workday phases it in so that customers can choose their preferred timing within a three-week period. This means they can avoid having to accept an update when in the midst of a financial year close or an annual review deadline.

Workday has three major updates a year, in March, July, and October/November. Each is handled as a six-week process.

The first step is to make a copy of all production data and make it available for customers to test in their own sandbox implementation of the new release. That gives them a couple of weeks in which to do any necessary conversion work on custom reports, business processes and other configurations that may be affected. The update to production instances then happens in three waves spread over three weeks, and customers are able to choose which wave to be in. At the end of the third wave, everyone is on the new release.
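The arithmetic of that six-week cycle is simple enough to sketch. In this illustrative example (the exact dates and mechanics are my assumption, not Workday's published schedule), a customer's cutover date follows from the start of the sandbox window and the wave they choose:

```java
import java.time.LocalDate;

// Illustrative sketch of the phased rollout described above: two weeks of
// sandbox testing, then one production cutover wave per week for three weeks.
class ReleaseSchedule {
    static LocalDate cutoverDate(LocalDate sandboxStart, int wave) {
        if (wave < 1 || wave > 3) throw new IllegalArgumentException("wave must be 1-3");
        // Two weeks of sandbox testing, then wave 1, 2 or 3 at weekly intervals.
        return sandboxStart.plusWeeks(2).plusWeeks(wave - 1);
    }
}
```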

To ease the transition to new functionality, customers have a lot of control over when to switch it on. For example, the Workday landing page that users first see when they log in is constantly being redesigned. Both the new and the old design get supported in a new update, with the old version as the default. When the next release comes around, what had been the new version now becomes the default, and so on.

Metadata-based coding

Workday made a big break with the past in its application architecture. Traditional client-server architectures are based around complex relational database models that interact with even more complex programming codelines. The drawback with that approach is that introducing any changes or enhancements to the application means changing both of those complex layers and then making sure they still work together.

Workday moved away from this model by adopting a relational database with a relatively unchanging schema, while the structure of the application is defined as a set of software objects — sets of classes (data definitions) that relate to other classes. The relationships between objects are defined as metadata, so that when any feature of the application needs to be changed, it is done in the metadata. There's no need to touch the database schema. This has been a huge boost to developer efficiency and is how Workday is able to go through three major releases every year.
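The payoff of this approach is easiest to see in a toy sketch. Here, application "classes" and their attributes live as ordinary rows in a generic metadata store, so enhancing the application means inserting metadata rather than running an ALTER TABLE against a schema (the structure below is my own simplification, not Workday's actual metadata model):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: application classes and their attributes are defined as
// metadata in a generic store, so adding a field never touches the database schema.
class Metadata {
    // className -> (attributeName -> typeName)
    static final Map<String, Map<String, String>> classes = new HashMap<>();

    static void defineClass(String name) {
        classes.put(name, new HashMap<>());
    }

    // Enhancing the application = inserting metadata, not altering tables.
    static void addAttribute(String className, String attr, String type) {
        classes.get(className).put(attr, type);
    }
}
```

Because the underlying storage schema never changes, a new release is a metadata change plus code that interprets it, which is far cheaper to test and ship than a coordinated schema-and-code migration.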

Of course, metadata-driven development is not limited to cloud applications — most software vendors are moving towards this model. But SaaS vendors have been in the vanguard, mainly because they are able to keep all of their users on the same code base as they move from one release to another. Therefore they're better able to keep enhancing the software at the rapid pace the model allows.

In-memory processing

The other big break with the past was to build an application that operates entirely in memory. But as Workday has scaled up the size of its customer base and the breadth of its feature set, it has had to fine-tune how it uses memory.

"We continue to be smarter and smarter about how we use memory and what we keep in memory," Swete told me.

That means constantly evaluating the optimum balance between what is loaded into memory and what is left on disk. The objective is to keep the 'hot' data and instances in memory while saving the rest to disk — for example, audit logs don't need to be retained in memory once they've been written.
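That hot/cold split is a standard caching pattern, and a minimal sketch of the idea (not Workday's implementation) fits in one class: a bounded in-memory map that spills the least-recently-used entries to a simulated disk store and promotes them back on access:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the hot/cold split: a bounded in-memory cache that
// evicts least-recently-used entries to a (simulated) disk store.
class HotColdStore {
    static final int HOT_CAPACITY = 2; // tiny capacity, for illustration only
    static final Map<String, String> disk = new HashMap<>();
    static final Map<String, String> hot =
            new LinkedHashMap<>(16, 0.75f, true) { // accessOrder=true gives LRU ordering
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    if (size() > HOT_CAPACITY) {
                        disk.put(eldest.getKey(), eldest.getValue()); // spill cold data to disk
                        return true;
                    }
                    return false;
                }
            };

    static void put(String key, String value) { hot.put(key, value); }

    static String get(String key) {
        String v = hot.get(key);
        if (v == null) {                 // cold read: promote back into memory
            v = disk.remove(key);
            if (v != null) hot.put(key, v);
        }
        return v;
    }
}
```

The real engineering work Swete alludes to is in deciding what counts as hot: in this sketch it's pure recency, but a vendor can weight by tenant activity, data type (audit logs go straight to disk) and usage patterns.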

With Workday rival SAP also talking up the benefits of in-memory processing, it will be interesting to hear what developments there have been in this realm over the past year.

Of course all of this technology is merely a means to an end. The other huge area of interest at Rising is to hear customer stories about the business outcomes they've achieved. More on that next week.

Disclosure: SAP and Workday are diginomica premium partners. Workday is a recent consulting client of the author. The vendor is contributing to the author's travel and accommodation to attend Workday Rising.

Photo credit: Workday Way sign @philww; Stan Swete headshot courtesy of Workday.