See what Ezra Gottheil, Principal Analyst for Technology Business Research, Inc. has to say about Mipsology’s Zebra AI inference accelerator…
MIPSOLOGY’S ZEBRA LOOKS LIKE A WINNER
Mipsology is a 5-year-old company, based in France and California, with a differentiated product that solves a real problem for some customers. The company’s product, Zebra, is a deep learning compute engine for neural network inference. While these engines are not uncommon, Zebra unlocks a potentially important platform for inference using the field programmable gate array (FPGA). There are two parts to this story, which is one of the challenges Mipsology faces.
INFERENCE — THE PHASE WHERE DEEP LEARNING GOES TO WORK
Deep learning has two phases: training and inference. In training, the engine learns to do the task for which it is designed. In inference, the operational half of deep learning, the engine performs the task, such as identifying a picture or detecting a computer threat or fraudulent transaction. The training phase can be expensive, but once the engine is trained it performs the inference operation many times, so optimizing inference is critical for containing costs in using deep learning. Inference can be performed in the cloud, in data centers or at the edge. The edge, however, is where there is the greatest growth because the edge is where data is gathered, and the sooner that data can be analyzed and acted upon, the lower the cost in data transmission and storage.
SPECIALIZED AI CHIPS ARE HOT, BUT THE MATURE FPGA IS A PLAYER TOO
For both training and inference, specialized processors are emerging that reduce the cost of using deep learning. The most popular deep learning processor is the graphics processing unit (GPU), principally Nvidia’s GPUs. GPUs rose to prominence because Nvidia, seeing the computational potential of its video cards, created a software platform, CUDA, that made it easy for developers and data scientists to use the company’s GPUs in deep learning applications. The GPU is better suited to training than inference, but Nvidia has been enhancing its GPUs’ inference capabilities. Other specialized processors for deep learning inference include Google’s Tensor Processing Unit (TPU) and FPGAs.
FPGAs have been around since the 1980s. They are chips that can be programmed so the desired tasks are implemented in electronic logic, allowing very efficient repetitive execution, which is ideal for some deep learning inference tasks. Mipsology lists several advantages of FPGAs over GPUs for inference, including a lower cost of implementation, a lower cost of ownership and greater durability. While FPGAs have been used in some implementations, including on Microsoft’s Azure platform, these chips have not received the attention that GPUs have.
ZEBRA IS WHERE INFERENCE MEETS FPGAS
Mipsology’s Zebra compute engine makes it easy for deep learning developers to use FPGAs for inference. Zebra is a software package that provides the interface between the deep learning application and the FPGA, so that specialized FPGA developers do not to have to be brought in to exploit the benefits of the processors. Zebra is analogous to nVidia’s CUDA software; it removes a barrier to implementation.
BRINGING TOGETHER THE PUZZLE PIECES
FPGAs are mature and powerful potential solutions that lower the cost of inference, a key to expanding the role of deep learning. However, the programming of FPGAs is often a barrier to their adoption. Zebra is an enabling technology that lowers that barrier. In the world of specialized solutions based on broadly applicable technologies such as deep learning, there are opportunities for products and services to make it easier to assemble the pieces and lower the cost of development. Zebra is exploiting one of these opportunities.
See the article on TBR’s website.