Deep thinking on compute-in-memory in AI inference
by Don Dingee on 03-09-2023 at 6:00 am

Compute-in-memory for AI inference uses an analog matrix to instantaneously multiply an incoming data word

Neural network models are advancing rapidly and becoming more complex. Application developers using these new models need faster AI inference but typically can’t afford more power, space, or cooling. Researchers have put forth various strategies to wring more performance out of AI inference architectures, …
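To make the idea above concrete, here is a minimal digital sketch of the operation a compute-in-memory array carries out in the analog domain: a weight-stationary matrix-vector multiply on a quantized data word. The layer sizes, variable names, and 8-bit quantization step are illustrative assumptions, not the specific implementation discussed in the article.

```python
# Digital reference for the multiply-accumulate a compute-in-memory (CIM)
# array performs in one analog step. All shapes and names are assumptions
# chosen for illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Trained weights for one layer. In a CIM device these are programmed once
# into the analog cell array and never move ("weight stationary").
weights = rng.standard_normal((256, 128)).astype(np.float32)

# Incoming activation word for this layer.
activations = rng.standard_normal(128).astype(np.float32)

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float values onto the limited resolution of analog cells/DACs."""
    scale = np.max(np.abs(x)) / 127.0
    return np.round(x / scale).astype(np.int8), scale

w_q, w_scale = quantize_int8(weights)
a_q, a_scale = quantize_int8(activations)

# The entire matrix-vector product happens across the array at once in the
# analog domain; here it is a single digital matmul plus rescaling.
out = (w_q.astype(np.int32) @ a_q.astype(np.int32)) * (w_scale * a_scale)

print(out.shape)  # (256,) -- one accumulated result per weight row
```

The point of the analog version is that the weights stay put: only activations and results move, which is where much of the power and latency savings in CIM-based inference comes from.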


Area-optimized AI inference for cost-sensitive applications
by Don Dingee on 02-15-2023 at 6:00 am

Expedera uses packet-centric scalability to scale AI inference performance up and down while maintaining efficiency

AI inference often brings to mind complex applications hungry for more processing power. At the other end of the spectrum, applications like home appliances and doorbell cameras can offer limited AI-enabled features but must be narrowly scoped to keep costs to a minimum. New area-optimized AI inference technology from …