A growing number of examples seem to suggest that moving computing to Amazon Web Services (AWS) or other hyperscale public cloud platforms isn’t as cost-effective as it’s generally thought to be. Dropbox, Netflix and Basecamp are the best known of many businesses that once ran on AWS but have since returned to datacenters where they manage their own servers and infrastructure. Is the heyday of the hyperscalers over? Or is it simply that they're charging all of us too much for public cloud?
The argument in favor of public cloud computing, advanced by many over the years, myself included, rests on the economies of scale that can be achieved by centralizing compute in a single, shared resource that is made available for consumption as a service. Back in 2010, I argued that the whole notion of private cloud was discredited, citing a Microsoft white paper that concluded, among other findings, that:
- A 100,000-server datacenter had an 80% lower total unit cost of ownership (TCO) compared to a 1,000-server datacenter.
- "For organizations with a very small installed base of servers (<100), private clouds are prohibitively expensive compared to public cloud," yielding a 40-fold cost reduction for SMBs.
- For large agencies with an installed base of approximately 1,000 servers, private clouds were feasible but came with a significant cost premium of about 10 times the cost of a public cloud for the same unit of service.
It was a compelling argument back then, and in a series of blog posts at the time I dug further into the advantages of the multi-tenant SaaS model versus the then-mainstream practice of installing and running private implementations of client-server applications, aka 'private cloud'. I still stand by the arguments in favor of the SaaS model, in particular the ongoing improvement and innovation that comes from continuous engagement with customers. But does this argument still stand up at the hyperscale Infrastructure-as-a-Service (IaaS) layer, in particular for SaaS vendors themselves?
'The scale economics aren't real'
Zoho, a SaaS company with more than $1 billion in annual recurring revenue, thousands of customers across the globe and over 80 million users, has always spurned the hyperscalers, preferring to run its own datacenters for all but a small subset of local points of presence. Sridhar Vembu, its CEO, says that analyzing the financial reports of other SaaS vendors that use AWS, Microsoft Azure or Google Cloud Platform (GCP) reveals that they typically spend as much as 20-25% of their total revenue on the hyperscalers. Zoho's own infrastructure costs it roughly half as much as a share of revenue. He says:
I don't believe that a lot of the scale economics they sold us is actually real ... We actually save a ton of money not going to AWS today. We know our numbers, we can benchmark with any of the public companies ... We go over [their financials] to know what they're spending on AWS, all of that. We benchmarked our numbers, at least 10 percentage points [of revenue] is what we’re saving. And I think I can squeeze another five percent.
Zoho's datacenter spend currently stands at 12-15% of revenue, according to Shailesh Davey, VP of Engineering and a co-founder of the company alongside Sridhar Vembu and two others. Shaving off a further five percentage points, as Vembu suggests is possible, would bring the figure down to 7-10% of revenue, which is at the low end of what SaaS vendors have historically achieved.
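For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above and assumes, as the quotes imply, that Vembu's "five percent" means five percentage points of revenue rather than a 5% relative cut. The function names are illustrative only and do not come from Zoho.

```python
def reduce_by_points(share_range, points):
    """Subtract percentage points from a (low, high) share-of-revenue range."""
    low, high = share_range
    return (low - points, high - points)


def relative_cut(share, points):
    """Express a cut of `points` percentage points as a % of the current share."""
    return 100 * points / share


# Figures quoted in the article, all as a percentage of revenue.
zoho_spend = (12, 15)         # Zoho's current datacenter spend, per Davey
hyperscaler_spend = (20, 25)  # typical SaaS vendor spend on hyperscalers, per Vembu

# Five percentage points off 12-15% gives the 7-10% range cited above.
target = reduce_by_points(zoho_spend, 5)
print(target)  # (7, 10)

# The same cut, seen as a share of Zoho's current infrastructure bill.
print(round(relative_cut(15, 5), 1))  # 33.3
print(round(relative_cut(12, 5), 1))  # 41.7

# Gap between hyperscaler customers and Zoho today, in percentage points.
gap = (hyperscaler_spend[0] - zoho_spend[0], hyperscaler_spend[1] - zoho_spend[1])
print(gap)  # (8, 10)
```

On these numbers, a five-point reduction would amount to cutting Zoho's current infrastructure bill by roughly a third or more, which is why the distinction between percent and percentage points matters here.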
Is this simply a reflection of the same economies of scale cited in the Microsoft report, where once a SaaS vendor reaches a certain volume of activity for a specific set of applications, the cost difference with public cloud simply narrows to a rounding error? Or have the economics shifted significantly in the 17 years since Amazon first launched its S3 storage and EC2 compute services? Davey believes it's the latter, and that AWS is reaping a profit bonanza from a market that doesn't think to challenge the conventional wisdom that public cloud costs less than the alternatives. He says he's surprised at the pricing he sees from AWS, Azure and GCP:
I feel somehow there is a lot of pressure on the hyperscalers to maintain that price. Because when we work out the math and look at it — because we do deal with the hyperscalers, there are two or three instances where we use them in a very tactical way — some of it's surprising to me why they are priced so much.
Factors that change the calculation
What accounts for this discrepancy? In conversation with Zoho executives last week, several factors came to light that together explain how a SaaS vendor operating at Zoho's scale can run its infrastructure for less than half, or even as little as a third, of what it would pay a hyperscaler for the same capabilities. Davey believes that AWS and its peers are continuing to rake in profits based on innovations that they pioneered in the early days, but which are now commonplace. He cites two key factors:
Initially, when AWS and the hyperscalers started off, I think they did two things. One is they had a software stack, which made it easy for people to come to the cloud and work. Nowadays, a lot of the software stack is anyway available, either as open source, or people have built that expertise. This is number one.
Number two is, if you go and look at the server companies — the companies who sell servers to us — if you go and look at their reports, I think they are in the range of 2-5% profit margin. So I'm guessing there is not much to cut there, that is one of the things I can take.
Based on these two [factors], when you look at it now, when you go beyond something like 200-250 servers on a hyperscaler, a lot of [companies] feel that if they have their own cloud instead as a private cloud, they can turn into a lower price.
Building on the growing prevalence of open-source infrastructure software, another huge factor is the emergence of containerization. This has brought economies of scale to individual servers, making it possible for any datacenter to rapidly provision and deprovision instances without the higher cost and complexity of managing earlier server virtualization technologies. This is the most significant change since that Microsoft white paper back in 2010, with a dramatic impact on the calculations around economies of scale.
A final factor that Vembu raises is the role of innovation and R&D. For example, Zoho has adopted the open-source Postgres database, and has figured out how to run it on graphics processing units (GPUs) rather than conventional CPUs. This yields a further price-performance benefit, but it's down to the work of one very talented engineer. Vembu points out that this kind of creativity isn't a matter of economies of scale, but more down to how a company nurtures its talent. He says:
GPUs are far more efficient than CPUs, but it's also very difficult to run software. We have a very small team, basically one super-brainy engineer, who cracked the problem over the last five to six years ...
It doesn't cost $100 million to solve this problem. It costs the one super brain you can find who is committed to solving the problem. A lot of R&D is like that. It is not about how much money you employ. It is about how good a talent you have or how passionate the talent is for solving the task. It took five years to crack the problem. We’ve now cracked it, it's actually running, it is deployed, it will be worldwide deployed across our data centers. So the scale economics don't typically operate on this as much.
Taking a longer-term view
As a private company with no outside investors to answer to, Zoho has the additional benefit of being able to take a longer-term view than others who have to meet quarterly reporting benchmarks or aggressive growth targets. Radha Vembu, who manages the Zoho Mail and Workplace products, has been with the company since inception. She adds:
What we save in terms of money I think we spend in terms of time. What took AWS five years to do probably took 15 years for us to ... solve all the problems related to scalability and performance. [When] companies want to launch something fast, they are going to go with this readily available solution.
Certainly, the calculations change for a venture that wants to start small and grow fast but doesn't yet know for sure what its overall workloads will be, or where a company has highly variable workloads that don't justify the cost of owning servers outright to cater for intermittent spikes in demand. But where the workload is relatively predictable and at an established scale, the case for examining a private cloud option becomes stronger.
Raju Vegesna, Chief Evangelist, adds that Zoho reaps other benefits, such as being able to add extremely computationally intensive services into its platform, including search and AI, without massively adding to its costs. He says:
If you're priced by CPU cycles, you won't be able to offer it for free ... Something like Zoho search, we're able to offer it for free as part of the service, because we own that infrastructure.
There's one other factor that Zoho's executives didn't mention but is also crucial here — the company is committed to keeping its prices affordable for customers, which provides a powerful incentive for it to minimize its costs. It doesn't do this as a loss leader — in another conversation during my week-long visit to the company's headquarters in Chennai, India, last week, Vijay Sundaram, Chief Strategy Officer, told me that every product in the portfolio is required to show a profit. But its strategy is to grow revenue profitably, rather than to maximize profits. The philosophy is captured in a quote attributed to Vembu that's printed on a T-shirt Zoho gave me at an earlier event:
You are never going to achieve happiness while competing with your neighbor.
This is a strategy that the typical equity-funded tech company cannot choose. Imagine if AWS or another of the hyperscalers decided to drop its prices, not because of competitive pressures, but because it wanted to make its services more affordable for its customers. The activist investors would soon be circling, seeking to put new leadership in place. Zoho's argument is that quarterly reporting cycles inevitably put the short-term interests of shareholders ahead of those of customers and employees. The company believes this leads to decisions that are not in the long-term interests of any stakeholders, and as a private business it is able to choose a different path.
None of this invalidates the arguments in favor of public cloud — Zoho is itself a public cloud company in how it provides services to its customers. The hyperscalers still offer remarkably good value for ad-hoc cloud infrastructure needs where the workload is highly variable or high-growth. But Zoho's example, along with others who have chosen to repatriate their workloads to their own infrastructure, suggests that the threshold at which running a modern, cloud-first application stack on your own servers becomes a more cost-effective option than a hyperscale public cloud has fallen significantly from those early days — and that hyperscaler pricing has not yet caught up to that reality.
[Updated March 1st. When first published, this article incorrectly stated Zoho's user base as 16 million instead of 80 million]