WEBINAR: FPGA-Accelerated AI Speech Recognition

by Don Dingee on 12-14-2023 at 6:00 am

Cloud ASR demo on Speedster 7t FPGA

The three-step conversational AI (CAI) process – automatic speech recognition (ASR), natural language processing, and synthesized text-to-speech response – is now deeply embedded in the user experience for smartphones, smart speakers, and other devices. More powerful large language models (LLMs) can answer more queries… Read More
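The three-step CAI process described above can be sketched as a simple pipeline. This is a minimal illustration with hypothetical placeholder functions, not a real ASR, NLP, or TTS API; in practice each stage would call a dedicated engine or accelerator.

```python
# Sketch of the three-step conversational AI (CAI) pipeline:
# audio in -> ASR -> NLP -> text-to-speech -> audio out.
# All three stage functions are placeholders for illustration only.

def automatic_speech_recognition(audio: bytes) -> str:
    """Stage 1: transcribe spoken audio to text (placeholder)."""
    return "what is the weather today"

def natural_language_processing(query: str) -> str:
    """Stage 2: interpret the query and form a text response (placeholder)."""
    return f"Here is the answer to: '{query}'"

def text_to_speech(response: str) -> bytes:
    """Stage 3: synthesize the response text back into audio (placeholder)."""
    return response.encode("utf-8")

def conversational_ai(audio: bytes) -> bytes:
    """Chain the three CAI stages end to end."""
    text = automatic_speech_recognition(audio)
    response = natural_language_processing(text)
    return text_to_speech(response)

print(conversational_ai(b"\x00\x01"))
```

Each stage is independent, which is why accelerators (FPGA or otherwise) can target one stage – typically ASR or the LLM inference step – without changing the rest of the pipeline.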


Fast Path to Baby Llama BringUp at the Edge

by Bernard Murphy on 09-26-2023 at 10:00 am


Tis the season for transformer-centric articles apparently – this is my third within a month. Clearly this is a domain with both great opportunities and challenges: extending large language model (LLM) potential to new edge products and revenue opportunities, with unbounded applications and volumes yet challenges in meeting… Read More


Inference Efficiency in Performance, Power, Area, Scalability

by Bernard Murphy on 09-19-2023 at 6:00 am


Support for AI at the edge has prompted a good deal of innovation in accelerators, initially in CNNs, evolving to DNNs and RNNs (convolutional neural nets, deep neural nets, and recurrent neural nets). Most recently, the transformer technology behind the craze in large language models is proving to have important relevance at… Read More


Scaling LLMs with FPGA acceleration for generative AI

by Don Dingee on 09-13-2023 at 6:00 am

Crucial to FPGA acceleration of generative AI is the 2D NoC in the Achronix Speedster 7t

Large language model (LLM) processing dominates many AI discussions today. The broad, rapid adoption of any application often brings an urgent need for scalability. GPU devotees are discovering that where one GPU may execute an LLM well, interconnecting many GPUs often doesn’t scale as hoped since latency starts piling up with… Read More


Fitting GPT into Edge Devices, Why and How

by Bernard Murphy on 09-05-2023 at 6:00 am

It is tempting to think that everything GPT-related is just chasing the publicity bandwagon and that articles on the topic, especially with evidently impossible claims (as in this case), are simply clickbait. In fact, there are practical reasons for hosting at least a subset of these large language models (LLMs) on edge devices… Read More


A Negative Problem for Large Language Models

by Bernard Murphy on 05-23-2023 at 6:00 am


I recently read a thought-provoking article in Quanta titled “Chatbots Don’t Know What Stuff Isn’t.” The point of the article is that while large language models (LLMs) such as GPT, Bard, and their brethren are impressively capable, they stumble on negation. An example offered in the article suggests that while a prompt, “Is it true… Read More