The Artificial Intelligence Computing Stack

This blog post is cross-posted with the O’Reilly Radar blog.

A gigantic shift in computing is about to dawn upon us, one as significant as only two other moments in computing history. First came the “desktop era” of computing, powered by central processing units (CPUs), followed by the “mobile era” of computing, powered by more power-efficient mobile processors. Now there is a new computing stack that is moving all of software with it, fueled by Artificial Intelligence (AI) and chips specifically designed to accommodate its grueling computations.

In the past decade, the computational demands of AI have strained CPUs, which cannot shake off physical limits in clock speed and heat dissipation. Luckily, the computations that AI requires only need Linear Algebra operations, the same Linear Algebra you learned about in high school mathematics. It turns out the best hardware for AI speaks Linear Algebra natively, and Graphics Processing Units (GPUs) are pretty good at that, so we used GPUs to make great strides in AI.
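To make the Linear Algebra point concrete, here is a minimal sketch of one neural network layer written as a matrix-vector product followed by a simple nonlinearity. The function names (`matvec`, `relu`, `dense_layer`) and the numbers are illustrative assumptions, not taken from any particular library; real systems run the same arithmetic on far larger matrices, which is the work GPUs and tensor chips accelerate.

```python
def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(values):
    """Elementwise nonlinearity: negative entries become zero."""
    return [max(0.0, v) for v in values]


def dense_layer(weights, bias, x):
    """One layer: a matrix-vector product, plus a bias, then ReLU."""
    z = matvec(weights, x)
    return relu([zi + bi for zi, bi in zip(z, bias)])


# Hypothetical layer with 2 inputs and 3 outputs (made-up numbers).
W = [[1.0, -2.0],
     [0.5, 0.5],
     [-1.0, 1.0]]
b = [0.0, 1.0, -0.5]
x = [2.0, 1.0]

y = dense_layer(W, b, x)
print(y)  # [0.0, 2.5, 0.0]
```

Almost all of the work is in `matvec`: multiplying and adding numbers in a regular pattern, which is why hardware built for that pattern pays off.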

While GPUs are good at Linear Algebra, their lead is being challenged by dozens of Chinese and American companies creating chips designed from the ground up for Linear Algebra computations. Some call their chips “Tensor Processing Units” (TPUs), others call them “Tensor Cores”. It is no surprise these products even compete on the word “Tensor”: it is a core concept from Linear Algebra used heavily in AI. All of these companies support running the “TensorFlow” software library, released by Google in November 2015. Indeed, I run a Computer Vision company, Matroid, that uses these chips heavily and whose name is a mathematical generalization of a Tensor.

The chips specialize toward different modes of computation: those that operate in a datacenter vs. low-power embedded devices, and those that operate primarily for training vs. inference. Each chip has its strengths and weaknesses. Nvidia GPUs paved the way and hold the lead in datacenter training. The competition for the other modes of computation has not yet settled.

In the aftermath of the competition between these hardware companies, a new type of chip will stay standing, one that is computationally superior for use in almost all software, as AI rapidly eats all of software, while software eats the world. The chain of thought can be described succinctly:

Graphics and tensor processors are eating linear algebra.
Linear algebra is eating deep learning.
Deep learning is eating machine learning.
Machine learning is eating artificial intelligence.
Artificial intelligence is eating software.
Software is eating the world.

Even consumer-oriented companies are exploring the space with no stated intention of selling the chips directly, but rather with the goal of improving their end products. Tesla, for example, has been building AI chips with the goal of self-reliance for Autopilot. Apple has rolled out custom silicon for its face recognition capability in the iPhone X. Microsoft Azure is using FPGAs for its machine learning workloads, and Google has been using TPUs for AlphaGo, Street View, and many other applications. These companies have not publicly declared intentions to sell their chips, but they are already using them to improve applications.

With this dramatic change in the computing landscape, China is rightfully pouring hundreds of millions of dollars into the field, understanding this tectonic shift at the highest levels of its administration. In an attempt to win over the semiconductor business, its government has identified AI chips as one of eight “Key General Technologies” crucial to its national AI strategy (translation available here). Chinese companies building AI chips include Bitmain, Cambricon, DeePhi, Horizon Robotics, and SenseTime, many of which are valued at more than a billion dollars.

Both the US and China are investing heavily in this new computing stack, as they should. In August, China’s State Development & Investment Corp. (a fund owned by the Chinese Government) led a $100 million funding round in Beijing-based Cambricon. Cambricon and Bitmain both announced new chips in the past two months, both of which directly compete with Nvidia’s offerings. The push has not let up: a more recent October call for research proposals from the National Development and Reform Commission included another request for high-powered AI chips.

The American computing industry currently leads in the space, with incumbents Nvidia, Intel, and Qualcomm. Internet giant Google has announced that its TPU will be offered for rent as part of the Google Cloud Platform. Meanwhile, a gaggle of American startups has emerged, all with the goal of being the next computer hardware giant. They include AIMotive, BrainChip, Cerebras, Deep Vision, Graphcore (British, with American funding), Groq, Mythic, Remicro, ThinCI, Unisound, and Wave Computing. In addition, there is growing interest in the development of AI chatbots, another field in which many companies are investing.

Some of these companies focus on power efficiency; others focus on raw computing power, measured in arithmetic operations performed per second; still others focus on building a rich software ecosystem of libraries. It is not clear which of these American and Chinese companies will win the new computing stack and keep Moore’s law alive, but what is sure is that the way we build software and hardware is dramatically changing, with AI blazing the path forward.
