California's latest gold rush - Google and Intel dig in to tap the AI business market seam

By Kurt Marko, April 16, 2019
Summary:
The latest California gold rush is underway with AI market dominance as the seam that everyone's trying to dig into.


It’s become a business axiom that in a gold rush, the real money isn’t made by digging for ore, but by selling picks, shovels and supplies to the miners. How appropriate, then, that the truism has its origins in the original California Gold Rush, which often left miners destitute and bitter even as it created fortunes for the likes of Sam Brannan, Levi Strauss, Leland Stanford, Henry Wells and William Fargo, whose legacies live on in business and academia.

The Golden State is also the focal point for the biggest gold rush in technology today: AI. As with the 49ers, many of AI's pioneers might eventually be lost to history. However, it’s already evident that the field is proving to be a bonanza for many tech industry titans that supply the picks and shovels to AI researchers and developers.

NVIDIA catalyzed a renaissance in AI development by repurposing its GPU technology to accelerate highly parallelizable machine and deep learning algorithms (for background, see my coverage of NVIDIA and its AI strategy here). Although NVIDIA has a first-mover advantage in components and developer tools, it hasn't dissuaded Intel, AMD and others from attacking such a rapidly growing market. As with other categories of IT infrastructure, these companies find that their biggest customers are the mega-cloud companies like AWS, Microsoft Azure and Google that use AI tooling to power growing portfolios of AI development platforms and packaged services. Indeed, as I've discussed, many of these expertise-laden cloud vendors are themselves developing AI acceleration hardware to improve the price-performance value of their services and create a competitive advantage.

The nexus of component suppliers, software development platforms and cloud services amounts to the picks and shovels of the AI gold rush, and recent events held by Intel and Google demonstrate how vital this flourishing market is to the future of these Silicon Valley bluebloods.

AI illustrates Intel's strategic evolution

Intel recently provided a status update on its comprehensive data center strategy that, as I detailed in an earlier column, seeks to expand its total addressable market (TAM) into all facets of the data center and beyond, including burgeoning needs for AI acceleration and telecom equipment at the edge.

Although the headline was Intel’s introduction of next-generation Xeon processors, buried amidst the new products and features were several items targeting AI developers and service providers, with a notable focus on the inferencing phase of deep learning applications.

(As a reminder, most AI applications use a form of neural network in which the structural details and parameters are calculated through a series of computationally-intensive training runs on sample data. The trained network is later deployed and run to generate predictions and classifications using operational data in a process called inferencing.)
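To make the two phases concrete, here is a minimal sketch in plain Python. It assumes a single linear neuron as the simplest possible stand-in for a neural network; the function names, learning rate and sample data are all hypothetical choices for illustration, not anything taken from Intel or Google tooling.

```python
import random

def train(samples, epochs=500, lr=0.05, seed=0):
    """Training phase: repeatedly adjust the parameters (w, b) of y = w*x + b
    on sample data using stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # arbitrary starting point
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y   # prediction error on this sample
            w -= lr * err * x       # gradient step for the weight
            b -= lr * err           # gradient step for the bias
    return w, b

def infer(params, x):
    """Inferencing phase: apply the already-trained parameters to new data.
    No parameters change here, so each call is cheap."""
    w, b = params
    return w * x + b

# Hypothetical sample data generated from y = 2x + 1.
training_data = [(x, 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]

params = train(training_data)     # expensive, done once
prediction = infer(params, 3.0)   # cheap, repeated for every new input
print(f"learned w={params[0]:.3f}, b={params[1]:.3f}, prediction={prediction:.3f}")
```

The asymmetry is visible even at this scale: `train` loops over the data thousands of times, while `infer` is a single multiply and add that would run for every operational input.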

For most organizations, improving the performance and reducing the cost of model inferencing matters more than speeding up training, because model training is done once or infrequently while inferencing might run continuously on real-time data streams. Intel’s strategy involves turning its omnipresent Xeon x86 processors into inferencing engines that are competitive with dedicated AI accelerators using GPUs or custom chips like Google’s TPU.

Neural network calculations involve a lot of matrix math, but unlike scientific simulations, they don’t require the high precision calculations that conventional floating point hardware is designed for. Thus, the easiest way to accelerate inferencing calculations is by trading off mathematical precision for performance by using a data format with fewer significant digits, a technique known as reduced-precision operations. Intel’s approach to inference acceleration is twofold:

Intel markets these features collectively as DL Boost, which it incorporates in the latest version of its Xeon Scalable processors. At its recent Data Centric event, Intel performed a demonstration in which DL Boost almost doubled the performance of inference calculations for a recommendation engine, a popular application of deep learning. It also showed a Cascade Lake Xeon exceeding the inference performance of NVIDIA’s Volta V100 superchip by a factor of 4.6. Intel benchmarks show DL Boost improving overall inference performance by two to four times over previous Xeon floating point units across a range of deep learning workloads.
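The reduced-precision trade-off described above can be illustrated with a short sketch. This is a generic symmetric 8-bit quantization scheme written in plain Python, assumed here purely for illustration; it is not Intel's DL Boost implementation, and the function names and sample values are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one symmetric scale factor."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale

def int_dot(qa, scale_a, qb, scale_b):
    """Dot product computed with integer multiply-accumulate,
    converted back to a float once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # integers only in the inner loop
    return acc * scale_a * scale_b

# Hypothetical weights and inputs for one neuron.
weights = [0.42, -1.37, 0.08, 0.91]
inputs = [1.5, 0.25, -2.0, 0.75]

exact = sum(w * x for w, x in zip(weights, inputs))  # full floating point
qw, sw = quantize(weights)
qx, sx = quantize(inputs)
approx = int_dot(qw, sw, qx, sx)                     # reduced precision

rel_error = abs(approx - exact) / abs(exact)
print(f"exact={exact:.4f} approx={approx:.4f} relative error={rel_error:.2%}")
```

The inner loop touches only small integers, which is the property hardware can exploit to pack more operations into each instruction; the price is a small approximation error that many trained networks tolerate.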


Source: Intel slide presentation from Tech Field Day talk


Processor features are only useful if developers can easily exploit them. To that end, Intel has developed the OpenVINO (Open Visual Inference and Neural network Optimization) toolkit, an SDK, model optimizer and set of other utilities and sample applications that are available for both Xeon and other Intel hardware including FPGAs and the Movidius neural network compute stick.

Memory capacity, latency and throughput are also significant contributors to deep learning performance, given the amount of memory that increasingly large models require. Accordingly, Intel noted that its new Optane Persistent Memory and Optane-based NVMe drives can significantly improve AI performance by dramatically expanding memory capacity and reducing latency by three orders of magnitude compared to conventional SSDs.


Source: Intel slide presentation from Tech Field Day talk

Intel also continues to pursue the use of FPGAs as special-purpose AI accelerators, releasing its most powerful chips to date, the Agilex family, which promises up to a 40 percent improvement in performance and power efficiency compared to its prior-generation Stratix 10 devices. Although FPGAs are used for a variety of applications, Intel (and Microsoft, among others) sees AI as a prime opportunity. Intel recently acquired Omnitek, a small developer of video and image processing FPGA products, and expects to use Omnitek technology and expertise to develop efficient vision applications, with a particular focus on cloud service providers.

Google Cloud unleashes a barrage of AI services

Further up the pick-and-shovel AI product stack, Google Cloud recently used its Next event to introduce no fewer than 29 AI and data science services and updates. (Note: see my previous Diginomica column for a summary and analysis of Google Cloud's enterprise announcements.) Granted, most of these are incremental improvements or pre-production beta services, but some of the most significant share the theme of democratizing access to AI technology that Google has espoused at previous Next events.


Source: Google Cloud website

The most significant announcements for enterprises centered on Google’s AI Platform, an umbrella term to describe a range of AI and data science services. Furthering its commitment to simplify AI model development, Google made several enhancements to the AutoML feature it introduced last year. As the name implies, AutoML automates the creation and tuning of machine learning applications by starting with a set of proven, Google-developed models and using various techniques to build models for different scenarios. At Next 2019, Google introduced the following improvements, all still in beta release:

  • AutoML Tables to build and train models without writing code by ingesting data from object storage, data warehouses (BigQuery) and other sources.
  • Object detection for AutoML Vision
  • The ability to run image classification on edge devices via AutoML Vision Edge
  • Metadata classification and labeling for video using AutoML Video
  • The ability to isolate custom text fields and perform sentiment analysis on unstructured data via AutoML Natural Language.


Source: Google Cloud YouTube video

Google also introduced some new packaged AI services that target common business scenarios including:

  • Digital and scanned document processing
  • Product recommendations
  • Visual product search (match user-provided photos with a relevant product in one’s catalog).

Data scientists weren’t left out, with several improvements to Google’s BigQuery ML portfolio, support for Jupyter notebooks in AI Platform and a new AI Hub service providing a collaboration environment for data scientists to share notebooks, models, APIs and results.

Each of these announcements furthers Google’s goal, which the product manager for Google Cloud’s AI portfolio reiterates in a blog post, to make “AI accessible to every business” and continue providing packaged AI products tailored to different businesses and industries.

My take

As I stressed after NVIDIA’s recent technology conference, the significant developments in AI are shifting away from pure research toward practical applications to various business problems, that is, from AI prospectors to pick-and-shovel suppliers. Companies like NVIDIA and Intel target the most fundamental AI needs with computational engines and software development frameworks, which cloud service providers like Google, AWS and Microsoft, along with a host of other software and SaaS vendors, build upon with development platforms, automation tools and packaged applications.

The evolution of AI technology into products and services that are usable by non-specialists has profound implications for business, similar to those of office productivity software like word processors, spreadsheets and presentation packages that forever changed business processes and became baseline capabilities for even the smallest organizations. Today, sophisticated AI techniques such as facial recognition, recommendation engines and sentiment analysis are becoming available to anyone. Then as now, the competitive advantage goes to the individuals and organizations that most creatively use and apply the technology, not merely to those that possess it. With AI, the winners will be the companies that can exploit it to turn unstructured content into contextualized summaries and data into discoveries.