AWS fully embraces hybrid cloud, ARM and custom silicon at re:Invent 2018

Kurt Marko, December 3, 2018
What happened and why at Amazon's re:Invent conference in Las Vegas.

For cloud users and IT leaders, AWS re:Invent is an annual firehose of product releases, feature updates and industry gossip that, like AWS itself, gets bigger every year. Indeed, as the company grows, it seems driven to one-up itself year after year. Re:Invent 2018 was no exception, with dozens of announcements spanning multiple disciplines and across the entire spectrum of infrastructure, developer platforms and applications. A conference with a portfolio so broad necessarily encompasses several themes, but looking at the number of announcements and how CEO Andy Jassy allocated his time during an epic 2-hour 45-minute keynote, several key areas stand out:
  • Hybrid on premise-cloud infrastructure and services
  • Enterprise IT management and governance services; making the cloud safe for the enterprise
  • Machine learning services and performance optimization
  • Bridging edge and IoT services with cloud services

Complementing these vertical, albeit often overlapping, strategies and associated products is AWS's increasing use of its massive technical and financial resources to develop and deploy custom hardware that provides both tangible customer benefits and significant competitive differentiation.

I'll tackle the ML and IoT topics in future columns, but my goal here is to examine AWS's now-open embrace of the hybrid IT operating model and its use of custom components. These are simultaneously complementary and independent; however, I highlight them together because re:Invent 2018 marked a milestone at which AWS revealed significant investments in and commitments to each area. Indeed, the company openly flexed its muscle as the largest and most aggressive cloud provider, challenging competitors on several fronts with an astonishing breadth of technology and products.

AWS - we do chips too

The mega-cloud operators have long eschewed standard enterprise hardware for cheaper OEM servers and switches tailored for their massive scale. Indeed, the gap between the equipment going into a hyperscale data center and that sold to enterprise IT grows wider each year. The increasing influence of cloud operators like AWS on the systems and components business shows up in market share estimates indicating that ODMs and other white-box manufacturers now account for close to 40 percent of the server market.

Until recently, these machines were based almost entirely on Intel Xeon processors; however, the AI renaissance, in the form of highly parallelizable deep learning algorithms, has fueled the use of GPUs as auxiliary accelerators. AWS and the other cloud services have responded by offering a variety of compute instances with NVIDIA GPUs. While GPUs are effective at both aspects of deep learning, model training and data inference, they are expensive to buy and operate, adding significantly to a system's power and cooling load. In response, many companies see an opportunity in developing processors custom-designed for deep learning that are both more efficient and cheaper (see my overview in two columns here and here).


Google was the first to introduce custom AI silicon with its TPU chip, now in its third generation, while Microsoft uses FPGAs programmed for various deep learning algorithms. AWS has now joined them by announcing its Inferentia custom machine learning processor at re:Invent. Unlike TPUs, which can be used for both training and inference, Inferentia is designed solely for the latter since, as Jassy pointed out in his keynote, machine learning applications spend the vast majority of their resources running the model, not building it.

Technical details about Inferentia are scant; however, AWS VP and distinguished engineer James Hamilton summed up the basics of the chip on his blog:

Inferentia offers scalable performance from 32 TOPs to 512 TOPS at INT8 and our focus is on high scale deployments common to machine learning inference deployments where costs really matter. We support near-linear scale-out, and build on high volume technologies like DRAM rather than HBM. We offer INT8 for best performance, but also support mixed-precision FP16 and bfloat16 for compatibility, so that customers are not forced to go through the effort of quantizing their neural networks that have been trained in FP16, FP32 or bfloat16, and don’t need to compromise on which data type to use for each specific workload. Working with the wider Amazon and AWS Machine learning service teams like Amazon Go, Alexa, Rekognition, and SageMaker helps our Inferentia hardware and software meet wide spectrum of inference use cases and state of art [sic] neural networks.

Since Inferentia won't be available until "late 2019", we can't yet estimate its cost or performance benefits; however, the performance numbers cited by Hamilton exceed those of Google's gen-2 TPU and should be in line with its gen-3 TPU module.
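Hamilton's point about quantization is worth unpacking: squeezing an FP32-trained network into INT8 means rescaling every weight into an 8-bit integer range and accepting some rounding error. A minimal NumPy sketch, using synthetic weights rather than a real model (and not any actual AWS tooling), shows the basic symmetric-quantization step that Inferentia's mixed-precision support lets customers skip:

```python
import numpy as np

# Hypothetical FP32 weights standing in for a trained layer.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.2, size=1000).astype(np.float32)

# Symmetric linear quantization: map [-max|w|, +max|w|] onto int8's [-127, 127].
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize to measure the rounding error the conversion introduces.
w_restored = w_int8.astype(np.float32) * scale
max_err = float(np.abs(w - w_restored).max())

print(f"scale={scale:.6f}, max quantization error={max_err:.6f}")
```

The per-weight error is bounded by half the scale factor, which is why well-centered weight distributions quantize cleanly while outliers force a coarser scale; avoiding that tuning effort is precisely the convenience Hamilton describes.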

Inferentia will undoubtedly be supported by the new AWS Elastic Inference service, which for now allows attaching one of three sizes of GPU accelerator to any EC2 instance. The service provides the flexibility to mix and match various configurations of conventional CPU, memory and local storage with machine learning acceleration hardware.

Not just AI accelerators

AWS also unveiled custom hardware designed for conventional workloads in the form of its Graviton ARM-based processors. Available in a new A1 series of EC2 instances, Graviton systems target distributed applications such as container-based microservices, Web front-ends, caching servers, distributed databases and development systems that can scale across multiple smaller compute instances.

Together with its previously announced introduction of EC2 instances using AMD's EPYC CPUs, AWS presents a double-barreled threat to Intel's data center hegemony. Indeed, by endorsing the ARM architecture in the strongest possible way, i.e. by devoting R&D resources to building custom silicon, AWS opens the door to ARM becoming a legitimate alternative for enterprise developers. As its announcement blog points out, there are already several Linux distributions with pre-packaged AMIs ported to ARM, meaning that applications built on scripting languages like Node.js, Ruby and Python can be migrated as-is, without recompilation. Similarly, although AWS didn't mention this scenario, Graviton would be an ideal platform for Lambda serverless functions.
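The "no recompilation" point follows from how interpreted languages work: a pure-Python (or Node.js, or Ruby) source file contains no machine code, so only the interpreter needs an ARM port. A trivial sketch illustrates that the script itself is architecture-neutral:

```python
import platform
import sys

# This file contains no machine code, so the identical script runs
# wherever an interpreter exists; only the reported ISA changes.
arch = platform.machine()  # e.g. 'x86_64' on an M5 instance, 'aarch64' on an A1
print(f"Python {sys.version_info.major}.{sys.version_info.minor} on {arch}")
```

Compiled languages like C or Go, by contrast, need at least a rebuild targeting aarch64, which is why AWS's launch messaging emphasized scripting-language workloads first.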

Accurate comparisons won't be possible until Graviton benchmarks come out; however, on-demand instances in the A1 series are about half the price per vCPU of M5 instances using Xeon processors.
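The per-vCPU claim is simple arithmetic. Assuming launch-time us-east-1 on-demand list rates of roughly $0.102/hour for a1.xlarge and $0.192/hour for m5.xlarge (figures worth double-checking against AWS's pricing page, as they change over time):

```python
# Illustrative us-east-1 on-demand rates at announcement (USD/hour);
# verify against current AWS pricing before relying on these figures.
a1_xlarge = {"vcpus": 4, "price": 0.102}   # Graviton (ARM)
m5_xlarge = {"vcpus": 4, "price": 0.192}   # Xeon (x86)

a1_per_vcpu = a1_xlarge["price"] / a1_xlarge["vcpus"]
m5_per_vcpu = m5_xlarge["price"] / m5_xlarge["vcpus"]
ratio = a1_per_vcpu / m5_per_vcpu

print(f"A1: ${a1_per_vcpu:.4f}/vCPU-hr, M5: ${m5_per_vcpu:.4f}/vCPU-hr")
print(f"A1 costs {ratio:.0%} of M5 per vCPU")
```

At these rates the A1 instance works out to roughly 53 percent of the M5's per-vCPU cost, consistent with the "about half" characterization, though whether the savings survive contact with real workloads depends on per-core performance.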

AWS Hybrid Architecture - acknowledging the inevitable

Jassy and other AWS executives once put the case for public cloud services in Manichean terms that left no middle ground between traditional enterprise infrastructure and cloud services, the implication being that hybrid cloud was a bastardized form that diluted cloud benefits with few, if any, countervailing advantages. The company started coming around to hybridization in 2017 by partnering with VMware to enable the latter's vCloud services to run on customized AWS infrastructure. AWS also released a host of other hybrid products that operate outside its massive facilities, including Snowball Edge, Greengrass and Storage Gateway, demonstrating that, like it or not, the company hasn't ignored the demands of enterprise customers.

At re:Invent, AWS finally gave the hybrid model a wholehearted embrace with Outposts, a service of AWS-managed hardware providing a variety of compute, storage and database services on customer sites. Underscoring the importance of Outposts to AWS winning more enterprise business, Jassy invited VMware CEO Pat Gelsinger on stage for the introduction, reprising the pair's duets from the past two VMworld keynotes.

As with the other products AWS announced but hasn't yet released, few technical details about Outposts were provided, with the company saying only that the units "are fully managed and configurable compute and storage racks built with AWS-designed hardware." The service will have two options: one that runs VMware vCloud as the infrastructure management layer and control plane, and one that runs native AWS services. The latter will initially support local EC2 instances and EBS volumes, but AWS plans to add other services such as RDS, ECS, EKS, SageMaker and EMR "at launch or in the months after."


My take

Amazon has never been reluctant to spend profusely on R&D and capital expansion to secure long-term competitive advantages at the expense of short-term profits. Until recently, such investments focused on facilities like massive distribution centers and data centers and related equipment, including its Prime Air cargo planes and future drone fleet. Lately, Amazon seems to be pouring as much money into R&D as into capital, or more, as evidenced by its growing line of voice assistant (Echo) products. Thus, AWS pursuing expensive projects to develop custom components like Inferentia, Graviton and the Nitro network and security chips shouldn't come as a surprise, given that it is a unit of the same company that has more than 10,000 people working on Alexa technology alone.

With the announcements highlighted here, AWS is taking on a multi-front battle with the biggest names in the IT equipment industry. Given its technical and financial resources, aggressiveness and cloud market dominance, AWS represents an acute threat to each of them.

Graviton puts the AWS imprimatur on ARM servers, while the release of AMD EPYC instances does the same for Intel's rejuvenated competitor in the x86 market. Combined with Intel's ongoing search for a CEO and its festering semiconductor process miscues, AWS could make 2019 a very rough year for the chip giant.

Inferentia, along with Google TPU and forthcoming neural network accelerators from a host of startups, poses a similar long-term threat to NVIDIA in the market for AI hardware.

Outposts and the growing VMware partnership pose the most direct challenge to Microsoft Azure and Azure Stack in the market for tightly integrated hybrid infrastructure and services. While IBM-Red Hat could prove to be another worthy competitor, it has sufficient integration hurdles to overcome that the risk to Microsoft and VMware-AWS is more potential than real.

As I've written before, the breadth of announcements at re:Invent is evidence that AWS inherited the competitive, customer-obsessed traits of its parent and aggressively spends on new technology and services once it sees a significant customer need. While it sometimes seems like the company throws products against the wall to see what sticks, it is too successful, with too many savvy executives, for its actions to be taken as anything but cold, calculated strategy. It's a thought that should delight customers, but keep competitors awake at night knowing the leviathan they are up against.
