More Throughput on Tough Models, Less $, Less Watts

See how reconfigurability enables high-efficiency inference acceleration

Read Cheng Wang’s AI Hardware Summit and Linley Processor Conference talks HERE

InferX Software makes AI Inference easy!

See Jeremy Roberson’s Linley presentation HERE. Watch the video of the presentation HERE

X1M allows you to put high-performance AI inference anywhere

See Cheng Wang’s Linley presentation HERE. Watch the video of the presentation HERE

InferX X1 Delivers More Throughput/$ Than Tesla T4, Xavier NX and Jetson TX2

The InferX X1 Edge Inference Co-Processor is optimized for large models and megapixel images at batch=1. Its price/performance is much better than that of existing edge inference solutions. InferX X1 is programmed using TensorFlow Lite, and our software is easy to use.
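
As a rough sketch of what TensorFlow Lite programming looks like, the following runs a single batch=1 inference using the standard TensorFlow Lite Python interpreter; the model file name is a placeholder, and any X1-specific compiler or delegate step is omitted.

import numpy as np
import tensorflow as tf

# Load a TensorFlow Lite model (file name is a placeholder).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Batch=1: a single image shaped to the model's input tensor.
image = np.random.rand(*input_details[0]["shape"]).astype(np.float32)

interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
print(result.shape)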

TOPS is a misleading marketing metric. It is simply the number of MAC units times the clock frequency (times two operations per MAC): a peak number. Having a lot of MACs increases cost, but they only deliver throughput if the rest of the architecture keeps them fed.
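
To make the arithmetic concrete, here is a small illustration of why peak TOPS and delivered throughput diverge; the MAC count, clock frequency, and utilization numbers below are invented for illustration, not measurements of any particular chip.

# Peak TOPS is just MAC units x clock frequency x 2 ops per MAC
# (one multiply plus one accumulate). It says nothing about utilization.
macs = 4096          # number of MAC units (illustrative assumption)
freq_hz = 1.0e9      # 1 GHz clock (illustrative assumption)
peak_tops = macs * freq_hz * 2 / 1e12
print(f"Peak: {peak_tops:.1f} TOPS")

# Delivered throughput depends on how well the architecture keeps MACs busy.
utilization = 0.25   # 25% average MAC utilization (illustrative assumption)
print(f"Delivered: {peak_tops * utilization:.1f} effective TOPS")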

The right metric to focus on is throughput: for your model, your image size, your batch size. Even ResNet-50 is a better indicator of throughput than TOPS (though ResNet-50 is not the best benchmark because of its small image size: real applications process megapixel images). Inference efficiency means getting the most throughput for the least cost (and power).
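
If you want the numbers for your own model, measuring throughput directly is straightforward; this sketch times repeated batch=1 invocations using the interpreter from the example above (the warm-up and iteration counts are arbitrary choices).

import time

# Warm up so one-time allocation costs don't skew the measurement.
for _ in range(10):
    interpreter.invoke()

# Time repeated batch=1 inferences and report inferences per second.
iterations = 100
start = time.perf_counter()
for _ in range(iterations):
    interpreter.invoke()
elapsed = time.perf_counter() - start
print(f"Throughput: {iterations / elapsed:.1f} inferences/sec at batch=1")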

In the absence of cost information, we can get a sense of throughput/$ by plotting throughput/TOPS, throughput/number of DRAMs, and throughput/MB of SRAM: the most efficient architecture will need to get good throughput from each of these major cost factors. See our Inference Efficiency slides for more information.
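
To illustrate how these ratios work, the sketch below computes throughput per TOPS, per DRAM, and per MB of SRAM for two made-up accelerators; every number is invented purely to show the calculation, not real benchmark data.

# Hypothetical spec sheets: throughput in frames/sec for one model at batch=1,
# plus the three major cost factors. All values are invented for illustration.
accelerators = {
    "Chip A": {"fps": 120.0, "tops": 8.0,  "drams": 1, "sram_mb": 8.0},
    "Chip B": {"fps": 150.0, "tops": 64.0, "drams": 4, "sram_mb": 32.0},
}

for name, spec in accelerators.items():
    print(name,
          f"fps/TOPS={spec['fps'] / spec['tops']:.1f}",
          f"fps/DRAM={spec['fps'] / spec['drams']:.1f}",
          f"fps/MB-SRAM={spec['fps'] / spec['sram_mb']:.1f}")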

Resources