CEO Interview with Dr. Mohammad Rastegari of Elastix.AI
by Daniel Nenni on 03-15-2026 at 2:00 pm

Mohammad Rastegari is a prominent AI researcher and entrepreneur currently serving as the CEO and Co-Founder of Elastix.AI. Based in the Greater Seattle Area, he also holds the position of Affiliate Assistant Professor at the University of Washington’s Electrical & Computer Engineering Department. His professional background includes high-level leadership roles as a Distinguished AI Scientist at Meta and a Principal AI/ML Manager at Apple. Previously, he was a Research Scientist at the Allen Institute for AI and the Co-founder and CTO of Xnor.ai, which was acquired by Apple in 2020.

Tell us about your company?

ElastixAI is building a new class of AI inference infrastructure designed to dramatically improve the efficiency, scalability, and adaptability of large-scale model deployment. We combine advanced model optimization with reconfigurable hardware—primarily FPGAs—to deliver high-performance inference without the cost, rigidity, and power constraints of traditional GPU-based systems.

Our mission is to make AI infrastructure fundamentally more efficient and future-proof, enabling organizations to deploy and evolve models at scale without being locked into a single hardware generation.

What problems are you solving?

AI inference is becoming the dominant cost driver in large-scale AI deployments, and current infrastructure is not built for it.

Today’s GPU-centric approach is:
• Expensive (both capex and opex)
• Power-hungry
• Rigid (requires new silicon for every major model shift)

We address these challenges by:
• Reducing cost per inference by up to 10x
• Improving power efficiency by up to 5x
• Enabling hardware reconfiguration as models evolve

This fundamentally changes the economics of deploying LLMs and other generative AI systems at scale.

What application areas are your strongest?

Our strongest focus is large-scale AI inference, particularly:
• Large Language Models (LLMs)
• Generative AI (text, code, multimodal)
• Enterprise AI copilots
• Real-time and latency-sensitive inference workloads

We are especially strong in environments where cost, power, and scalability constraints make GPU-only solutions unsustainable.

What keeps your customers up at night?

Our customers are facing a structural problem:
• Exploding inference costs as usage scales
• Power and data center constraints limiting growth
• Hardware lock-in to GPU vendors
• Rapid model evolution that outpaces hardware refresh cycles

They are asking a fundamental question:

How do we scale AI economically without rebuilding infrastructure every 12–18 months?

What does the competitive landscape look like and how do you differentiate?

The landscape is dominated by GPU vendors and a growing set of ASIC-based accelerators.
• GPUs offer flexibility but are expensive and power-inefficient for inference at scale
• ASICs improve efficiency but are rigid and take years to develop

ElastixAI sits in a unique position:
• We leverage existing, deployable FPGA infrastructure
• We provide software-hardware-ML co-optimization rather than just silicon
• We enable post-deployment adaptability, not just point-in-time optimization

Our key differentiation is reconfigurability at scale—we can adapt to new models, architectures, and optimizations without requiring new hardware.

What new features/technology are you working on?

We are advancing several areas:
• Next-generation ML optimization techniques tailored for reconfigurable hardware
• Dynamic model-to-hardware mapping, enabling real-time adaptation to workload changes
• Inference orchestration across heterogeneous infrastructure
• Turnkey deployment platforms that abstract hardware complexity from customers

Our goal is to make high-efficiency AI infrastructure as easy to consume as cloud GPUs—but significantly more efficient.

How do customers normally engage with your company?

Customers can potentially engage with us in three ways:
1. Token-as-a-Service
They can sign up directly to buy tokens from the inference services we maintain.
2. Software Subscription
They can buy the hardware and sign up for a monthly or yearly subscription to our inference software.
3. Leasing or Purchase
They can purchase or lease a prebuilt rack of our inference infrastructure, fully enabled and ready to generate tokens.

We also partner closely with:
• Cloud providers
• Data center operators
• AI model companies

This allows us to integrate into existing ecosystems while delivering immediate value.

CONTACT ELASTIX.AI

Also Read:

CEO Interview with Jerome Paye of TAU Systems

CEO Interview with Juniyali Nauriyal of Photonect

CEO Interview with Aftkhar Aslam of yieldWerx

 
