For AI developers and golf fans, Christmas comes in early spring, when NVIDIA runs its GTC event around the time Augusta National hosts "a tradition unlike any other.™" GTC 2021 is online for the second year in a row, but the bonanza of new and updated products announced this week demonstrates that the business and societal disruptions of the past year didn't cause NVIDIA's research and product teams to miss a beat. GTC has evolved into an enterprise-centric event focused on expanding the AI ecosystem and meeting the needs of more demanding workloads. However, NVIDIA didn't forget 3D graphics developers and autonomous vehicle manufacturers, whose symbiotic relationship has created new methods of training AV models and optimizing assembly lines via 3D simulations.
NVIDIA has reaped vast rewards for igniting a renaissance in AI - a previously shopworn concept that had seen more false starts than the Olympics - based on deep learning algorithms made possible by the parallelized architecture of GPUs. With data center products accounting for most of the company's growth and margin expansion over the past four years, the scope of NVIDIA's ambitions has expanded to encompass everything from silicon to supercomputers, with software that doesn't just analyze and automate the physical world but creates lifelike virtual ones, aka the metaverse. As CEO Jensen Huang describes his brainchild:
NVIDIA is a computing platform company, helping to advance the work for the da Vincis of our time - in language understanding, drug discovery, or quantum computing. NVIDIA is the instrument for your life's work.
A heart of silicon
From its start 28 years ago, NVIDIA has been a chip company, and while it has since built an incredible portfolio of software products and development platforms, silicon remains the foundation of everything it does. As at past events, chips and systems are the stars of GTC 2021, with NVIDIA announcing both evolutionary - but no less impressive - enhancements to existing products and a significant new category that rides the wave of Arm SoCs invading cloud data centers. Indeed, within minutes of the Grace CPU announcement, NVIDIA's stock price spiked 3.5 percent.
The most strategically significant GTC announcement was oddly unsurprising, since NVIDIA has long used Arm cores in its SoCs (for example, Xavier, announced in 2016) and is in the midst of acquiring Arm, a deal whose outcome remains in doubt. Thus, a logical next step was designing a data center CPU optimized for AI and HPC workloads. Nonetheless, the fact that NVIDIA will soon offer an alternative to x86 processors for cloud operators and hardware vendors, albeit targeting a subset of AI-targeted systems, is significant as it comes against the backdrop of a growing presence of Arm in the data center. Even though Grace doesn't present an imminent threat to Ice Lake (see my last column), since (a) it won't be available for a couple of years and (b) it is optimized for a subset of AI workloads, not everyday enterprise applications, it's another sign that Arm's data center ecosystem is rapidly growing and innovating.
Grace is motivated by a problem that has long plagued GPU-accelerated deep and machine learning algorithms: the bottleneck between data-hungry GPUs and the CPUs and system memory that feed them. PCIe Gen4, the standard interface on x86 systems, allows a CPU to transfer data from RAM to each GPU at 16 GB/sec over an 8-lane (x8) PCIe slot. Thus, a CPU can feed four x8 GPU cards at an aggregate of 64 GB/sec, even though the GPUs themselves can digest data at more than a hundred times that rate. NVIDIA addressed this problem five years ago with its NVLink protocol, which in the current 3.0 version is about three times faster per lane than PCIe. Thus, the 12 NVLink 3.0 lanes in the A100 GPU provide 600 GB/sec of bandwidth, almost 10 times that of PCIe Gen4.
The problem for NVIDIA is that GPUs rely on the CPU to access DRAM, and until now only IBM, in its POWER CPUs, has seen fit to build NVLink into its processors. As the following slides from Huang's keynote illustrate, Grace solves the CPU-to-GPU data bottleneck by including enough NVLink lanes to feed four GPUs at an aggregate of 2,000 GB/sec, 30 times faster than x86 systems. Grace gets an added boost from LPDDR5x memory, which in NVIDIA's implementation provides twice the bandwidth of the DDR4 used in conventional servers at lower power.
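The arithmetic behind these comparisons is simple enough to check. The sketch below reproduces the figures quoted above; the 50 GB/sec-per-lane NVLink 3.0 rate is an assumption derived from dividing the A100's 600 GB/sec total across its 12 lanes, and real-world throughput will vary with protocol overhead.

```python
# Back-of-the-envelope check of the bandwidth figures above.
# All rates in GB/s; numbers come from the article, and actual
# throughput depends on protocol overhead and implementation.

NUM_GPUS = 4

# PCIe Gen4: ~2 GB/s per lane, so an x8 slot feeds one GPU at ~16 GB/s.
pcie4_per_gpu = 2 * 8                    # 16 GB/s per x8 card
pcie4_total = pcie4_per_gpu * NUM_GPUS   # 64 GB/s aggregate for four GPUs

# NVLink 3.0: the A100's 12 lanes deliver 600 GB/s to a single GPU
# (assumed ~50 GB/s per lane, inferred from the 600 GB/s total).
nvlink3_a100 = 12 * 50

# Grace: enough NVLink lanes to feed four GPUs at 2,000 GB/s aggregate.
grace_total = 2000

print(f"PCIe Gen4, four x8 GPUs: {pcie4_total} GB/s")
print(f"A100 NVLink 3.0 vs PCIe aggregate: {nvlink3_a100 / pcie4_total:.1f}x")
print(f"Grace vs x86 CPU-to-GPU: {grace_total / pcie4_total:.1f}x")
```

The ratios land where the keynote claims: roughly 10x for a single A100's NVLink versus the four-GPU PCIe aggregate, and about 30x for Grace's CPU-to-GPU path versus an x86 host.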
Grace won't be available until early 2023, so design, fabrication and packaging details are non-existent, but it's a significant milestone in Arm's penetration of cloud and edge servers. However, Grace was but one example of NVIDIA fortifying the Arm ecosystem. It also announced:
- A partnership with AWS to support GPUs on Graviton2 instances to accelerate online gaming and AI inference.
- An agreement with MediaTek to extend its expertise in phone SoCs to a new class of Arm-based notebooks that will include RTX GPUs for graphics and media processing. The pair will develop a reference platform and SDKs supporting Chromium (Chrome OS) and Linux.
- Work with Marvell to integrate its Arm-based OCTEON processors used for networking and edge applications with NVIDIA GPUs to accelerate network optimization, security and other virtual network services and AI workloads.
Spanning supercomputers, data centers and edge computing
CPUs are the third tier of NVIDIA's three-part chip strategy, complementing GPUs and DPUs (data processing units, aka network and storage accelerators). Emulating Intel's tick-tock strategy, Huang says that each category will be on a two-year release cycle, annually alternating between architectural redesigns and incremental 'kicker' upgrades. He explains it this way:
One year will focus on x86 platforms, the next on Arm platforms. Every year will see new exciting products from us. Three chips, yearly leaps, one architecture.
The three categories will increasingly be combined in systems spanning from 2U servers for edge deployments to integrated DGX SuperPODs that package a supercomputer into an 18-rack system. For enterprises and mobile carriers, the most significant system announcement out of GTC is the EGX reference platform, which bundles the new A10 or A30 Ampere GPUs in an x86 server certified to run VMware vSphere and its Tanzu container stack. EGX systems will be available from NVIDIA hardware partners like Dell, Inspur, Lenovo and Supermicro.
As carriers deploy 5G infrastructure enabling a new class of mobile, real-time applications, it's becoming untenable to backhaul data to cloud services or central data centers for processing. Furthermore, network functions virtualization (NFV) and virtualized 5G radio access networks (vRAN) allow carriers to consolidate basestation functions onto increasingly capable servers.
NVIDIA expects 5G to be the catalyst for both vRAN wireless infrastructure and a new class of AI and data analytics applications that will run in basestations and other edge locations. To exploit this trend, it announced an AI-on-5G variant of the EGX platform with a new BlueField-2 A100 converged DPU-GPU card that supports the Aerial SDK for building software-defined 5G applications.
GTC 2021 also featured the BlueField-3 DPU with support for 400 gigabit Ethernet and 16 Arm A78 cores to offload packet processing and network functions from the host CPU. NVIDIA expects developers to use the horsepower to build advanced security features using its Morpheus AI cybersecurity framework. Although BlueField-3 will not be available for another year, its predecessor is a capable platform for piloting security software that sits on the network interface to provide granular protection, monitoring and zero-trust security enforcement.
NVIDIA has evolved into a company that provides the core technology reshaping industry and society. Its strategic success can be traced to a visionary founder and CEO who, like another engineer-CEO, Elon Musk, focuses on analyzing and solving problems from first principles. During a post-keynote press conference, Huang cited five areas that are driving the technology industry and define NVIDIA's product strategy and roadmap:
- Accelerated computing, e.g. HPC, AI
- Powering AI through both hardware and software
- Software-defined data centers, which he terms "a new computing unit" (and that in some cases can be packaged as a product, like the DGX SuperPOD).
- 5G, particularly private 5G networks, which will fuel enterprise AI applications at the edge.
- Robotics, in which simulation and training can be accelerated and optimized using virtual worlds with digital twins.
Any organization that expects to be relevant in a data-driven business environment must have plans in most, if not all, of these areas. Since NVIDIA must develop components and software long before these technologies become mainstream, following its lead is an excellent way to anticipate trends and stay ahead of the competition.