Machine learning software innovator Mipsology today announced that its Zebra AI inference accelerator achieved the highest efficiency based on the latest MLPerf inference benchmarking. Zebra on a Xilinx Alveo U250 accelerator card achieved more than 2x higher peak performance efficiency compared to all other commercial accelerators.
With a peak TOPS, the standard for measuring computation performance potential, of 38.3 announced by Xilinx, the Zebra-powered Alveo U250 accelerator card significantly outperformed competitors in terms of throughput per TOPS and ranks among the best accelerators available today. It delivers performance similar to an NVIDIA T4, based on the MLPerf v0.7 inference results, while it has 3.5x less TOPS. In other words, Zebra on the same number of TOPS as a GPU would deliver 3.5x more throughput or 6.5x higher than a TPU v3. This performance does not come at the cost of changing the neural network. Zebra was accepted in the demanding closed category of MLPerf, requiring no neural network changes, high accuracy, and no pruning or other methods requiring user intervention. Zebra achieves this efficiency all while maintaining TensorFlow and Pytorch framework programmability.