Amazon Web Services (AWS) saw cloud revenue grow 13% year-on-year in Q4 to hit $24.2 billion, with operating income of $7.2 billion, but it was generative AI that was the key theme of 2023.
CEO Andrew Jassy called it “a very significant year of delivery and customer trial for generative AI”, highlighting the three-layer vision of gen AI that the company promotes:
At the bottom layer, where customers who are building their own models run training and inference on compute (the chip being the key component in that compute), we offer the most expansive collection of compute instances with NVIDIA chips. We also have customers who would like us to push the price performance envelope on AI chips, just as we have with Graviton for generalized CPU chips, which [offer] 40% more price performance than other x86 alternatives. As a result, we built custom AI training chips, named Trainium, and inference chips, named Inferentia.
The next layer up sees companies seeking to leverage an existing large language model, customizing it with their own data, and leveraging AWS’s security features as a managed service. The firm’s Bedrock offering here has picked up “many thousands” of customers after only a few months, said Jassy:
The team continues to rapidly iterate on Bedrock, recently delivering capabilities including guardrails to safeguard what questions applications will answer, knowledge bases to expand models’ knowledge with Retrieval Augmented Generation, or RAG, and real-time queries, agents to complete multi-step tasks, and fine-tuning to keep teaching and refining models. All [of] which will help customers’ applications be higher quality and have better customer experiences.
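The RAG pattern Jassy refers to boils down to two steps: retrieve the document snippets most relevant to a question, then prepend them to the prompt so the model answers from that context rather than from its training data alone. A minimal sketch of that flow is below; the document store, the naive word-overlap scoring, and the prompt format are all simplified stand-ins for illustration, not Bedrock's actual knowledge-base API.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question; return the top_k.

    Real systems use embedding similarity instead of word overlap,
    but the retrieval step plays the same role.
    """
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Assemble the augmented prompt that would be sent to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# A toy document store standing in for a customer's knowledge base.
docs = [
    "Trainium is a custom AWS chip for model training.",
    "Inferentia is a custom AWS chip for inference.",
    "Graviton is an AWS CPU chip for general workloads.",
]

print(build_prompt("Which chip is used for training?", docs))
```

The retrieved snippets change with every question, which is what lets a fixed model answer over private, frequently updated data without retraining.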
Customers are still picking their way through which layers of the stack they want to operate in, said Jassy:
We predict that most companies will operate in at least two of them. But I also think, even though it may not be the case early on, many of the technically-capable companies will operate at all three. They will build their own models. They will leverage existing models from us, and then they're going to build the apps.
Customers want choice, he argued:
They don't want just one model to rule the world. They want different models for different applications, and they want to experiment with all different sized models because they yield different cost structures and different latency characteristics. So Bedrock is really resonating with customers. They know they want to change all these variables and try and experiment, and [to] have something that manages all those different transitions and changes so they can figure out what works best for them, especially in the first couple of years where they're learning how to build successful generative AI applications.
Customers are also learning that there is meaningful iteration required in building a production gen AI application with the requisite enterprise quality at the cost and latency needed, he added:
Customers don't want only one model, they want different models for different types of applications, and different size models for different applications. Customers want a service that makes this experimenting and iterating simple and this is what Bedrock does, which is why so many customers are excited about it.
The top layer of the stack is the application layer, with Jassy citing Amazon Q, pitched as an expert on AWS that writes code, debugs code, tests code, and does translations, such as moving from an old version of Java to a new one, as well as being able to query customers’ various data repositories. Jassy said:
It was designed with security and privacy in mind from the start, making it easier for organizations to use generative AI safely. Q is the most capable work assistant and another service that customers are very excited about.
And Amazon is ‘eating its own dog food’ when it comes to generative AI, said Jassy, including Rufus, an “expert shopping assistant trained on our product and customer data that represents a significant customer experience improvement for discovery”. He explained:
Rufus lets customers ask shopping journey questions like what is the best golf ball to use for better spin control, or which are the best cold weather rain jackets, and get thoughtful explanations for what matters and recommendations on products. You can carry on a conversation with Rufus on other related or unrelated questions and [it] retains context coherently. You can sift through our rich product pages by asking Rufus questions on any product features and [it] will return answers quickly.
Every one of Amazon’s consumer businesses has “a significant number of generative AI applications that they either have built and delivered or they're in the process of building”, said Jassy:
They're all in different stages, many of which have launched and others of which are in development. If you just look at some of our consumer businesses on the retail side, we built a generative AI application that allowed customers to look at [a] summary of customer reviews, so that they didn't have to read hundreds and sometimes thousands of reviews to get a sense for what people like or dislike about a product.
We launched a generative AI application that allows customers to quickly predict what kind of fit they'd have for different apparel items. We built a generative AI application in our fulfilment centers that forecasts how much inventory we need in each particular fulfilment center.
A strong set of numbers from Amazon overall and a welcome rebound from last year. AWS as a cloud platform provider continues to perform well. As for the generative AI story being pitched, it’s clearly going to be a crucial aspect of the strategic roadmap over the next few years, but it will be a while before it’s making a significant commercial impact. Jassy admits:
If you look at the gen AI revenue we have, in absolute numbers it's a pretty big number, but in the scheme of a $100 billion annual revenue run rate business, it's still relatively small, much smaller than what it will be in the future, where we really believe we're going to drive tens of billions of dollars of revenue over the next several years.