Suresh is a technology executive with deep technical expertise in semiconductors, artificial intelligence, cybersecurity, the Internet of Things, hardware, and software. He spent 20 years in the industry, most recently serving as Executive Director for open-source zero-trust chip development at the Technology Innovation Institute, Abu Dhabi, and earlier holding various leadership roles at Fortune 500 semiconductor companies such as Intel, Qualcomm, and MediaTek, where he researched and developed high-performance, energy-efficient, post-quantum-secure, safe microchips, systems-on-chip (SoCs), and accelerators for the datacenter, client, smartphone, networking, IoT, and AI/ML markets. He holds 15+ US patents and has published or presented at more than 20 conferences.
Suresh also actively serves in a leadership position at RISC-V International, where he chairs the Trusted Computing Group, which develops RISC-V confidential computing capability, and the AI/ML Group, which develops RISC-V hardware acceleration for AI/ML workloads such as the Transformer large language models used in ChatGPT-like applications. He also advises startups and venture capital firms on investment decision support, product strategy, technology due diligence, etc.
He earned an MBA from INSEAD, an MS from Birla Institute of Technology & Science Pilani, a Systems Engineering certificate from MIT, an AI certificate from Stanford, and an automotive functional safety certificate from TÜV SÜD.
Tell us about your company
“Mastiṣka AI” (Mastiṣka means Brain in Sanskrit) is an AI company focused on building brain-like computers to run foundation models more efficiently for Generative AI use cases of tomorrow.
What problems are you solving?
Given the benefits of AI and GenAI, demand for it is bound to go up, and so will its side effects on our planet. How can we reduce or neutralize those side effects? Carbon capture and nuclear power are steps in the right direction. But we need to fundamentally rethink the way we do AI: is doing tonnes of matrix multiplications the wrong approach?
Our brain can learn and do many tasks in parallel in under 10 W, so why do these AI systems consume tens of megawatts to train models?
Perhaps the future holds energy-efficient architectures such as neuromorphic architectures and spiking neural network-based transformers that are closest to the human brain, which might consume 100-1000x lower energy, hence reducing the cost of using AI, thereby democratizing it and saving our planet.
The current challenges we face with AI are a) availability, b) accessibility, c) affordability, and d) environmental safety. They are outlined below, along with some recommendations to tackle them.
Looking ahead, some useful AGI concepts are demonstrated in the movie “Her”, where the character Samantha is a conversational agent who is natural, understands emotions, shows empathy, is an amazing copilot at work, and runs on handheld devices the entire day. If that is where we want to get to, we have to address the challenges below right now.
Issue 1: Training an LLM can cost anywhere from 150K to 10+ million dollars, which allows only those with deep pockets to develop AI. On top of that, inferencing costs are huge too (10x more than a web search).
—> We need to improve the energy efficiency of models/ hardware to democratize AI for the benefit of humanity.
Issue 2: Running ginormous AI models for conversational agents or recommendation systems takes a toll on the environment in terms of electricity consumption and cooling.
—> We need to improve the energy efficiency of models/ hardware to save our planet for our kids.
Issue 3: The human brain is highly capable and can multitask, yet it consumes only about 10 watts rather than megawatts.
—> Perhaps we should build machines that work like our brains instead of building the usual matrix multipliers faster.
Humanity can only thrive with sustainable innovations, and not by cutting down all forests and boiling the oceans in the name of innovation. We must protect our planet for the welfare of our children and future generations to come…
What application areas are your strongest?
Training and inferencing of Transformer-based (and future neural-architecture-based) foundation models, with 50-100x better energy efficiency than today’s GPU-based solutions.
What keeps your customers up at night?
Issues for customers who currently use other products:
Electricity consumption for training humongous language models is through the roof. For example, training a 13B-parameter LLM on 390B text tokens on 200 GPUs for 7 days costs $151,744 (Source: HuggingFace new training cluster service page – https://lnkd.in/g6Vc5cz3). Even larger models with 100+B parameters cost $10+M just to train. On top of that, customers pay for inferencing every time a new prompt request arrives.
Water consumption for cooling: researchers at the University of California, Riverside estimated the environmental impact of a ChatGPT-like service and say it gulps up 500 milliliters of water (close to what’s in a 16-ounce water bottle) every time you ask it a series of 5 to 50 prompts or questions. The range varies depending on where its servers are located and the season. The estimate includes indirect water usage that the companies don’t measure, such as water used to cool the power plants that supply the data centers with electricity. (Source: https://lnkd.in/gybcxX8C)
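As a rough back-of-the-envelope check on the two figures above, the short Python sketch below derives the implied cost per GPU-hour and the water used per prompt. It assumes, purely for illustration, that all 200 GPUs are billed around the clock for the full 7 days and that the 500 ml is spread evenly over the 5 to 50 prompts. The function names are hypothetical, and the inputs are the quoted figures, not independent measurements.

```python
# Figures as quoted in the interview; they are not re-verified here.
TRAINING_COST_USD = 151_744   # quoted cost of the 13B-parameter training run
NUM_GPUS = 200                # quoted cluster size
TRAINING_DAYS = 7             # quoted training duration
WATER_ML_PER_SESSION = 500    # quoted water use per series of prompts
PROMPTS_MIN, PROMPTS_MAX = 5, 50


def cost_per_gpu_hour(total_cost_usd: float, gpus: int, days: float) -> float:
    """Implied price per GPU-hour, assuming every GPU runs 24 h/day (assumption)."""
    gpu_hours = gpus * days * 24
    return total_cost_usd / gpu_hours


def water_ml_per_prompt(ml_per_session: float, prompts: int) -> float:
    """Water per prompt, assuming it is spread evenly over the session (assumption)."""
    return ml_per_session / prompts


if __name__ == "__main__":
    usd = cost_per_gpu_hour(TRAINING_COST_USD, NUM_GPUS, TRAINING_DAYS)
    low = water_ml_per_prompt(WATER_ML_PER_SESSION, PROMPTS_MAX)
    high = water_ml_per_prompt(WATER_ML_PER_SESSION, PROMPTS_MIN)
    print(f"Implied cost: ${usd:.2f} per GPU-hour")
    print(f"Water: {low:.0f} to {high:.0f} ml per prompt")
```

Under these assumptions the quoted training run works out to roughly $4.5 per GPU-hour, and the water estimate to somewhere between 10 and 100 ml per prompt.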
Issues for non-customers of current products:
Can’t afford CAPEX to buy hardware
Can’t afford to use cloud services
Can’t innovate or leverage AI; stuck with a services model that eliminates any competitive advantage
What does the competitive landscape look like and how do you differentiate?
- GPUs dominate the training space, even though specialized ASICs also compete in this segment
- Cloud & Edge inference has too many options available
Digital, analog, photonic: you name it, people are trying to tackle the same problem.
Can you share your thoughts on the current state of chip architecture for AI/ML, meaning, what do you see as the most significant trends and opportunities right now?
Trend 1: 10 years ago, hardware enabled deep learning to flourish, and now the same hardware is inhibiting progress. Because of the huge cost of the hardware and of the electricity needed to run models, access to hardware has become a challenge. Only companies with deep pockets can afford it, and they are becoming monopolies.
Trend 2: Now that these models exist, we need to use them for practical purposes, so the inferencing load will increase, allowing CPUs with AI accelerators to come into the limelight again.
Trend 3: Startups are trying to come up with alternatives to the traditional IEEE floating-point format, such as logarithmic and posit-based number representations. These are good but not enough. The PPA$ (power, performance, area, cost) design space explodes: when we try to optimize one dimension, another goes for a toss.
Trend 4: The industry is moving away from the service-based model of AI to hosting its own private models on its own premises, but access to hardware is a challenge due to supply shortages, sanctions, etc.
Current state of affairs:
Availability of hardware and data fueled the growth of AI 10 years ago; now the same hardware is, in a way, inhibiting it. Let me explain.
Ever since CPUs turned out to perform miserably and GPUs were repurposed to do AI, many things have happened.
Companies have been addressing 4 segments of AI/ML namely – 1) cloud training, 2) cloud inferencing, 3) edge inferencing, and 4) edge training (federated learning for privacy-sensitive applications).
Digital & Analog
Training side: a plethora of companies are building GPUs, custom accelerators based on RISC-V, wafer-scale chips (850K cores), and so on, where traditional CPUs fall short because they are general purpose. Inference side: NN accelerators are available from every manufacturer, in smartphones, laptops, and other edge devices.
Analog memristor-based architectures also showed up some time ago.
We believe CPUs can be very good at inferencing if we enhance them with acceleration such as matrix extensions.
RISC-V side of things:
We are developing accelerators for matrix operations and for non-linear operations to eliminate possible bottlenecks for transformer workloads. Von Neumann bottlenecks are also being addressed by architecting memories closer to compute, eventually making CPUs with AI acceleration the right choice for inferencing.
Unique opportunities exist in the market for foundation models. For example, OpenAI has mentioned that it was not able to secure enough AI compute (GPUs) to continue to push its ChatGPT services, and news reports cite electricity costs of 10x those of a regular internet search and 500 ml of water to cool the systems for every series of 5 to 50 queries. There is a market to fill here. It’s not a niche but the entire market, and filling it will democratize AI by tackling all the challenges mentioned above: a) availability, b) accessibility, c) affordability, and d) environmental safety.
What new features/technology are you working on?
We are building a brain-like computer that leverages neuromorphic techniques, and we are tailoring models to take advantage of the energy-efficient hardware, reusing many of the open frameworks available.
How do you envision the AI/ML sector growing or changing in the next 12-18 months?
As the demand for GPUs has soared (with GPUs costing around $30K) and some parts of the world face sanctions on buying them, those regions feel frozen out of AI research and development. Alternative hardware platforms are going to capture the market.
Models will perhaps start shrinking, either through custom models or, more fundamentally, through growing information density.
Same question but how about the growth and change in the next 3-5 years?
a) CPUs with AI extensions would capture the AI inference market
b) Models would become nimble, and parameter counts will drop as information density improves from 16% to 90%
c) Energy efficiency improves and the CO2 footprint shrinks
d) New architectures come up
e) Hardware costs and energy costs go down, lowering the barrier to entry for smaller companies to create and train models
f) People talk about a pre-AGI moment, but my benchmark would be the character Samantha (a conversational AI) in the movie “Her”. That may be unlikely given the high cost of scaling up
What are some of the challenges that could impact or limit the growth in AI/ML sector?
a) Access to hardware
b) Energy costs and cooling costs and environmental harm