Tired of GPUs Randomly Failing in the Field? So were we…

What good are DNN (deep neural network) models if they cannot be effectively deployed in the real world? It may seem like common sense. Yet all too often data scientists tinker away, painstakingly honing and training their models in climate-controlled data centers, only to craft solutions that cannot be used in the field – at least, not without a significant amount of time and effort.

The problem lies in an inherent disconnect between the lab-trained NN and the team that will actually deploy the NN and face real-world constraints. Data scientists generally use GPUs as the compute engine that trains the network. However, many issues arise afterward, during inference and deployment in the field, when size, power and environmental factors often impose constraints that limit the ability to use the trained NN. This can greatly impact the time to market, cost and performance of leading-edge AI applications in areas like IoT, automotive, smart cities and robotics.

What can be done about the bottleneck between the promise of the lab and the realities of practical implementation?

The Data Center Can Be Optimized for AI; Real Life Cannot

Before an AI product is released to the public, it is trained and tested in data centers without restriction on server horsepower. But this is a far cry from the realities of a field deployment.

In the real world, weather and motion may cause devices running AI to fail. For example, a device that operates outdoors in Florida’s hot, humid weather will likely deteriorate more quickly than one that is subject to San Francisco’s more moderate climate. Likewise, if a device is exposed to excessive dirt and dust, its fan may get obstructed and malfunction, causing it to overheat. These types of issues are no stranger to GPUs, the chip of choice for many due to their early popularity in the AI research field.

GPUs use excessive power and require significant cooling, both of which increase the total cost of ownership (TCO) and the risk of failure in real-world applications that must run uninterrupted, 24/7. A GPU-based system may be fine tucked under a desk with good air conditioning, but not in a device at an outdoor retail outlet or crosswalk, where it is subjected to high temperatures. GPUs may run for hours under a gamer's workload, but if one triangle on the screen renders incorrectly as the GPU heats up, who is going to notice? The same cannot be said for an AI-based security system that needs to operate in all conditions to detect and respond to threats, or for a medical system that needs to detect cancer.

AI Use Cases Need Resilient, Flexible Solutions

Because they use so much power, GPUs generate a significant amount of heat that degrades the silicon quickly. In theory, this shouldn’t be an issue: GPUs were designed for systems such as game consoles or personal computers that were meant to run a few hours a day and be replaced every couple of years anyway. However, real-world AI applications are often complex and expensive, and cannot afford a chip that must be replaced every two years. GPUs don’t typically have the lifespan required by most real-world applications, including security systems (~5 years), robots (~10 years) and cars (~15 years). Imagine having to replace your car’s GPU every 2-3 years at the price of a high-end GPU.
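To illustrate the scale of this mismatch, here is a back-of-the-envelope sketch of hardware spend over an application's lifetime. The application lifespans come from the text; the chip lifespans and prices are purely hypothetical placeholders, not vendor figures.

```python
import math

# All chip lifespans and prices below are illustrative assumptions,
# not measured data or vendor pricing.

def replacement_cost(app_lifespan_years, chip_lifespan_years, chip_cost):
    """Total hardware spend: the initial chip plus every replacement
    needed to cover the application's full service life."""
    units_needed = math.ceil(app_lifespan_years / chip_lifespan_years)
    return units_needed * chip_cost

# Application lifespans from the text: security ~5 yr, robot ~10 yr, car ~15 yr.
for app, years in [("security system", 5), ("robot", 10), ("car", 15)]:
    gpu_spend = replacement_cost(years, chip_lifespan_years=2.5, chip_cost=1500)
    fpga_spend = replacement_cost(years, chip_lifespan_years=10, chip_cost=2000)
    print(f"{app}: GPU ~${gpu_spend}, FPGA ~${fpga_spend}")
```

Even with a higher assumed unit price for the longer-lived chip, the replacement cycle dominates: a 15-year application on a 2.5-year chip pays for the hardware six times over.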

Mipsology’s Zebra + FPGA Levels the Playing Field

“AI everywhere” requires solutions that work as well in the field as they do in the data center. One proven approach is to combine FPGAs with Mipsology’s Zebra inference acceleration software platform.

FPGAs have many advantages over GPUs. They are reliable and can be used anywhere, regardless of weather, dust or motion – some have even flown to Mars, one of the harshest environments a chip can face. They also last a long time, with many seeing a life span of ten years or more.

It is not always in the manufacturer’s best interest to have a long-lasting component, but for critical applications that are meant to last years, an FPGA ticks off many boxes.

The most common argument against using FPGAs for AI inference is that programming the chips requires specific high-end expertise. This is where Mipsology’s Zebra comes in: it completely removes the complexity. With a single command and zero effort, an engineer can replace CPUs or GPUs with FPGAs for NN inference without having any FPGA expertise whatsoever.

Mipsology’s Zebra software automatically adapts to varying AI parameters and needs, delivering industry-leading performance with low TCO, whether in the data center or at the edge. It fits on an FPGA of any size and can be used anywhere, and the computation engine remains the same. This means there is no cost to move from the data center to the edge, and performance is optimized in every environment.

Zebra-enhanced FPGAs use less power than GPUs, are more durable, and can adapt to the needs of any AI application or environment. They also boost computational throughput for critical applications.

To leverage the full power of AI and ease the bottleneck between the realities of the lab and practical implementation, we must reduce our reliance on power-guzzling, live-fast-die-young GPUs for deployments and embrace a new paradigm. A Zebra-enhanced FPGA is an easy-to-use solution to this dilemma that enables longer AI application lifespans, higher performance, greater flexibility and reduced TCO.