Understanding AI Inferencing
Modern artificial intelligence (AI) systems use a paradigm called machine learning. Machine learning (ML) is typically composed of both training and inferencing components. Training is the highly computationally intensive process in which the machine (computer) learns how to perform a task. ML training is usually performed in very large-scale cloud computing systems and can take weeks or months, even when running on very high-performance hardware. The output of the training process, a trained ML model, can be deployed across many systems in the form of inference processing. Inference processing, or inferencing, refers to the process of providing a response to a stimulus based on training from example data sets. Example inferencing tasks include object or face detection in images or video, understanding human speech, and identifying cancerous cells in X-ray images.
AI training can be very computationally intensive and can take weeks or months to complete, even when running on large-scale data center servers.
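The split between training and inferencing can be illustrated with a deliberately tiny sketch. It assumes a one-feature perceptron and made-up example data; the names `train` and `infer` are hypothetical and are not part of any Flex Logix or InferX API. Training loops over the example data many times to adjust the model, while inferencing is a single cheap evaluation of the trained model.

```python
def train(samples, labels, max_epochs=1000):
    """Training: repeatedly sweep the example data, nudging the model on each mistake."""
    w, b = 0.0, 0.0  # the "model" is just one weight and one bias
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            predicted = 1 if w * x + b > 0 else 0
            error = y - predicted  # -1, 0, or +1
            if error != 0:
                w += error * x
                b += error
                mistakes += 1
        if mistakes == 0:  # every example is classified correctly
            break
    return w, b


def infer(model, x):
    """Inferencing: one multiply-accumulate and a threshold against the trained model."""
    w, b = model
    return 1 if w * x + b > 0 else 0


# Hypothetical example data: small readings are class 0, large readings are class 1.
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
labels = [0, 0, 0, 1, 1, 1]

model = train(samples, labels)
print(infer(model, 0.5), infer(model, 4.5))
```

Real models have millions of weights rather than two, but the asymmetry is the same: training is an iterative search over the whole data set, and inferencing is a fixed, much smaller amount of arithmetic per input.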
AI inferencing at the Edge has historically been accomplished with GPU-based accelerator solutions. AI inferencing doesn't require the same high-performance, data-center-class systems needed for AI training, but it does require much higher performance than is available from standard CPUs. GPU-based solutions are difficult to program, expensive, and power hungry. For AI inferencing to flourish, a new solution is required.
InferX Provides the Best Inference Solution
Inference processing, when properly accelerated, requires much less processing than training and can typically be performed in a fraction of a second when using InferX AI acceleration technology.
The Flex Logix InferX AI acceleration technology is designed to provide acceleration of AI applications at the Edge of the Internet. Edge devices typically have stringent power dissipation, size, and cost requirements. The InferX technology is able to compress the trillions of operations required for AI inferencing into a very compact and efficient AI accelerator, bringing AI capabilities like real-time vision, which would have required a supercomputer just a few years ago, within the reach of any company's budget.
Flex Logix Solutions for Edge Inferencing
InferX Family of Edge Inferencing Solutions
InferX X1 Delivers More Throughput/$ Than Tesla T4, Xavier NX and Jetson TX2
Inference-optimized solutions like the InferX X1 are designed to be very silicon efficient. When compared to GPU approaches, the silicon savings are significant.
It’s an exciting time to be a part of the rapidly growing AI industry, particularly in the field of inference. Once relegated to high-end and outrageously expensive computing systems, AI inference has been marching towards the edge at super-fast speeds. Today, customers in a wide range of industries, including medical, industrial, robotics, security, retail, and imaging, are either evaluating or actually designing AI inference capabilities into their products and applications.
Why it’s so important to match the AI task to the right type of chip
Machine learning (ML)-based approaches to system development employ a fundamentally different style of programming than has historically been used in computer science. This approach uses example data to train a model so that the machine learns how to perform a task. ML training is highly iterative, with each new piece of training data generating trillions of operations.
An edge inference accelerator developed by Flex Logix has a 4K-MAC dynamic tensor processor array and is optimized for megapixel image processing models in medical, surveillance, and IoT applications.
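To give a feel for why megapixel models call for a large MAC array, the following back-of-the-envelope sketch counts the multiply-accumulate (MAC) operations in a single convolution layer. Everything specific here is an assumption for illustration only: the layer shape (1024 x 1024 pixels, 32 input and 32 output channels, 3 x 3 kernel) is hypothetical, "4K" is read as 4096 MAC units, and the cycle count assumes every MAC unit is busy on every cycle, which real hardware will not achieve. It is not a description of how the InferX hardware schedules work.

```python
def conv_macs(height, width, in_channels, out_channels, kernel=3):
    """MAC count for one stride-1, same-padded 2D convolution layer."""
    return height * width * in_channels * out_channels * kernel * kernel


def ideal_cycles(macs, mac_units=4096):
    """Lower bound on cycles if all MAC units do useful work every cycle (assumption)."""
    return -(-macs // mac_units)  # ceiling division


# Hypothetical layer on a roughly one-megapixel feature map.
macs = conv_macs(height=1024, width=1024, in_channels=32, out_channels=32)
cycles = ideal_cycles(macs)

print(f"MACs in one layer: {macs:,}")
print(f"Ideal cycles on 4096 MAC units: {cycles:,}")
```

Under these assumptions a single layer already needs close to ten billion MACs, so a network with dozens of such layers lands in the range of hundreds of billions to trillions of operations per image.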