CEO Interview with Steve Kim of Chips&Media
by Daniel Nenni on 04-10-2026 at 6:00 am


I’m Steve Kim, the CEO of Chips&Media. I’ve been immersed in the multimedia imaging industry for approximately two decades. Prior to joining Chips&Media, I spent over five years at handset manufacturers. Following more than ten years here at Chips&Media in roles spanning Marketing, Sales, and Procurement, I was appointed CEO. I earned a Bachelor of Science in Electrical Engineering in the US in 1995 and completed my MBA, also in the US, in 1997.

Tell us about your company.

‘Powering 3 Billion Devices Across 150+ Global Top-Tiers’

Chips&Media, headquartered in Seoul, Korea, is a Premium Multimedia IP company whose portfolio includes video codec, image-processing NPU, and frame buffer compression IP, among others. With over two decades of industry expertise and strong international collaborations, we are recognized for delivering dedicated, high-performance architectures optimized for memory bandwidth, power consumption, gate size, and reliability across various consumer applications.

Building on our proven success in diverse multimedia applications, we are expanding our footprint into new markets, including Automotive (IVI, ADAS, and Autonomous Driving), Cloud Data Centers, Robotics, and AI accelerators. What sets us apart is our highly configurable and modular IP architecture, which empowers SoC designers to seamlessly select the optimal configuration for performance, power, and area (PPA) to best fit their target applications.

What problems are you solving?

‘High Performance, Low Power, Small Footprint’

We solve the core challenges of processing high-resolution video on resource-constrained edge devices. Our solutions compress high-quality video as efficiently as possible with minimal quality loss and ensure smooth playback at very low power.

Ultimately, we satisfy our partners by delivering the “Triple Crown” of hardware design: High Performance, Low Power, and Optimized Area (PPA).

What application areas are your strongest?

‘Beyond Video Codecs: Leading the Future of Multimedia with AI-Powered NPU and FBC Innovation’

Building on over 20 years of proven expertise in the global video codec market, Chips&Media is solidifying its position in the multimedia IP industry.
Aligned with the latest trends in image processing, we offer real-time AI-driven video enhancement through our high-performance NPU IP. Furthermore, our FBC (Frame Buffer Compression) technology provides peak system optimization by effectively resolving DRAM bandwidth bottlenecks and eliminating the need for line buffers (SRAM) during data conversion. These innovations are the core drivers enabling us to deliver differentiated value as a comprehensive global multimedia IP provider.
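
As a rough illustration of the bandwidth problem FBC targets (the frame format and the 2:1 compression ratio below are assumptions for the example, not Chips&Media figures), consider the raw DRAM traffic of a single uncompressed frame-buffer stream:

```python
# Illustrative only: raw frame-buffer DRAM traffic for a single read stream,
# and the effect of an assumed 2:1 frame buffer compression ratio. The format
# (YUV 4:2:0, 8-bit) and the ratio are example assumptions, not vendor figures.
def stream_gb_per_s(width, height, fps, bytes_per_pixel=1.5):
    """DRAM traffic in GB/s for reading one full frame per display period."""
    return width * height * bytes_per_pixel * fps / 1e9

for name, w, h in (("4K60", 3840, 2160), ("8K60", 7680, 4320)):
    raw = stream_gb_per_s(w, h, 60)
    compressed = raw * 0.5  # assumed 2:1 FBC ratio
    print(f"{name}: {raw:.2f} GB/s uncompressed -> {compressed:.2f} GB/s with FBC")

# A codec touches each frame several times (reference reads, reconstruction
# writes, display reads), so total traffic is a multiple of these figures.
```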

What does the competitive landscape look like and how do you differentiate?

‘Powered by Technology, Proven by Performance, Trusted by 3 Billion Devices’

Currently, the multimedia IP market is structured around competition between the general-purpose solutions of global EDA giants and a field of regional providers. However, as demand for AI and high-resolution video surges, the market is shifting.

Rather than a broad portfolio that simply lists features, there is now a stronger-than-ever demand for Deep-Tech specialized IP that delivers superior efficiency in specific domains. In response, Chips&Media is establishing a differentiated market position through three core strategic pillars.

1) Superior Tech Efficiency

– We provide industry-leading 8K60fps multi-standard video codecs that deliver superior image quality with minimal power consumption, offloading the host CPU to maximize overall system efficiency.
– Chips&Media has released the latest video standard IPs faster than anyone else in the market. We were the first codec IP provider for H.264/AVC, H.265/HEVC, AV1, APV and others; now, we are about to release our AV2 IP.
– Our FBC technologies minimize chip area by eliminating unnecessary line buffers (SRAM) and ensure real-time processing with ultra-low latency through a high-speed on-the-fly architecture.
– Our specialized image processing NPU maximizes MAC utilization through its efficient architecture. Its proprietary line-by-line processing technology drastically reduces DRAM bandwidth, enabling high-performance real-time video processing.

2) Committed to Functional Safety: On track to achieve ISO 26262 (ASIL-B) certification by 2026
3) Market-Proven Reliability:

We have secured over 150 global top-tier customers with a track record of more than 3 billion units delivered, while offering a wide range of customization options to meet specific customer requirements.

What new features/technology are you working on?

‘Driving the Future: AI Evolution and Ultra-High-Resolution Optimization’

We are focusing on three core innovations to lead the shift toward ultra-high-resolution data and AI evolution:

1. Next-Gen FBC (Memory Optimization):

Our latest FBC algorithms maximize compression ratios with minimal quality loss, drastically reducing system costs and power consumption by solving memory bottlenecks.

2. Image-Specific NPU (AI Efficiency):

Unlike general-purpose NPUs, our NPU IP (WAVE-N) architecture is optimized specifically for image processing, delivering exceptional performance with near-zero bandwidth consumption.

3. Versatile Portfolio & Customization (Multimedia IP Provider):

We offer high-performance video codec IPs adaptable to various industries, including Mobile, Automotive, and HPC. By providing flexible configuration to each client’s unique requirements, we continue to diversify our global portfolio.

How do customers normally engage with your company?

‘Business Model: License Fee + Royalty’

Our business model is built on a synergistic structure of license fees and royalties. Customers reduce development time through upfront licensing and subsequently share revenue with us through royalties as their products enter the market. This close collaboration has created a virtuous cycle, where over 90% of our clients consistently choose Chips&Media for their subsequent projects.

Also Read:

Chips&Media and Visionary.ai Unveil the World’s First AI-Based Full Image Signal Processor, Redefining the Future of Image Quality

yieldHUB Expands Its Impact with New Technology and a New Website

NXP Expands Arteris NoC Deployment to Scale Edge AI Architectures



yieldHUB Expands Its Impact with New Technology and a New Website
by Mike Gianfagna on 04-09-2026 at 10:00 am


yieldHUB is a unique company that focuses on yield optimization for the semiconductor industry. The company aims to bring engineering teams together with a platform that allows sharing of data analytics and knowledge about products to improve yield. This goal is certainly fueled by unifying data from multiple steps in the manufacturing process. But the problem is bigger than that. Changes that occur in real time during the manufacturing process can have a significant impact on performance and yield. Consolidated data measurements will miss this kind of problem.

yieldHUB has responded to this imperative with new, real-time intelligence and a new website that presents broader information in a unified, easy-to-use manner. Let’s examine what you will find at the new site and take a closer look at the new real-time capabilities as we explore how yieldHUB expands its impact with new technology and a new website.

New Look, Expanded Information

The new website reflects how the yieldHUB platform has evolved over the years. You will find a much broader structure covering semiconductor manufacturing across the product lifecycle, industries and device architectures. A portion of the new homepage is shown in the graphic above. The core of the platform section contains two key components:

yieldHUB – the traditional platform for semiconductor yield management, unifying wafer probe and final test data to support yield analysis, engineering insight and production monitoring. (Explore the platform in the graphic above.)

yieldHUB Live – the new real-time manufacturing intelligence capability designed to provide visibility and control on the test floor. (See real-time monitoring in the graphic above.) This new capability allows engineering and operations teams to monitor yield, bins and parameters as production is happening, helping detect anomalies earlier and respond faster. More on this in a moment.

The website has also expanded significantly. The breadth and depth of the site are substantially increased with a well-designed and easy-to-use architecture and visual presentation. The new site includes:

  • Solutions organized across the semiconductor product lifecycle (e.g., NPI, yield ramp, high-volume production and smart manufacturing)
  • Content structured by engineering objectives such as yield improvement, engineering efficiency, test cost reduction and quality/reliability
  • Expanded coverage of industries and device architectures, including AI/HPC, power semiconductors (e.g., silicon carbide and gallium nitride), photonics, RF, MEMS and of course advanced packaging fueled by multi-die design
  • Sections tailored to different roles across semiconductor organizations including engineering, operations, quality, finance and IT
  • Coverage of the broader semiconductor ecosystem including IDMs, fabless companies, OSATs, foundries and equipment manufacturers

You will also find significantly expanded technical content, with more blogs and engineering resources focused on topics such as yield ramp, NPI analytics and semiconductor data science. No matter what your industry or job is, if efficient semiconductors are important to you there is a place to learn more at the new yieldHUB site.

yieldHUB Live

yieldHUB Live provides semiconductor test floor monitoring and overall equipment effectiveness (OEE) optimization for high-volume manufacturing environments. It delivers real-time production visibility and control across testers and sites without hardware changes or test program modification, reducing retest, detecting yield and parametric drift, increasing utilization, and protecting cost per die.

yieldHUB Live supports complex global test operations across OSATs, IDMs, and fabless semiconductor companies. Some of its capabilities include:

  • Monitor semiconductor test floors with continuous production visibility
  • Improve tester utilization and OEE
  • Reduce unnecessary retest
  • Detect yield and parametric drift early
  • Provide cross-site visibility for OSATs and IDMs
  • As mentioned, no hardware changes or test program disruption – compatible with most testers

Other capabilities include extending manufacturing execution systems (MES) with real-time production control. Traditional systems track transactions, but they don’t provide operational intelligence in real time. yieldHUB Live adds a real-time semiconductor manufacturing analytics layer that transforms static reporting into continuous production control.
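
To make the idea of a real-time analytics layer concrete, here is a minimal, illustrative sketch of the kind of rolling-window drift check such a layer might run per tester or lot. The window size, sigma threshold, and sample values are assumptions for the example, not yieldHUB parameters or data.

```python
# Illustrative only: a minimal rolling-window yield drift check of the kind a
# real-time analytics layer might run per tester or lot. Window size and sigma
# threshold are assumptions for this example, not yieldHUB parameters.
from collections import deque
from statistics import mean, stdev

class YieldDriftMonitor:
    def __init__(self, baseline_window=50, sigma_limit=3.0):
        self.history = deque(maxlen=baseline_window)
        self.sigma_limit = sigma_limit

    def update(self, lot_yield_pct):
        """Add the latest lot yield (%) and return an alert string if it drifts."""
        alert = None
        if len(self.history) >= 10:  # need a minimal baseline before alerting
            mu, sd = mean(self.history), stdev(self.history)
            if sd > 0 and abs(lot_yield_pct - mu) > self.sigma_limit * sd:
                alert = (f"Yield {lot_yield_pct:.1f}% deviates from baseline "
                         f"{mu:.1f}% by more than {self.sigma_limit} sigma")
        self.history.append(lot_yield_pct)
        return alert

monitor = YieldDriftMonitor()
for y in [96.2, 95.8, 96.5, 96.1, 95.9, 96.3, 96.0, 95.7, 96.4, 96.2, 91.0]:
    if (msg := monitor.update(y)):
        print(msg)
```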

The system also provides accurate insight into real test floor activity, enabling operators, technicians, and engineers to act immediately, with dashboards that support factory-wide visualization. Examples include:

  • Continuous tester status monitoring
  • Live yield and bin tracking across lots, testers, and sites
  • Parametric monitoring at wafer, bin, and die level
  • Alerts and emerging trends
  • Site-to-site health and performance monitoring in high-parallel environments

In high-volume semiconductor environments, even a single-digit percentage improvement in utilization can unlock millions of dollars in annual value. yieldHUB Live exposes production losses in real time and converts them into sustainable performance gains.
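
A rough, purely illustrative calculation shows the scale involved; the fleet size and cost figures below are assumed round numbers, not yieldHUB or customer data.

```python
# Illustrative only: annual value of a one-point tester-utilization gain.
# Fleet size, hourly cost, and hours are assumed round numbers, not customer data.
testers = 150
cost_per_tester_hour = 120.0   # fully loaded operating cost in USD (assumed)
hours_per_year = 8760

annual_fleet_cost = testers * hours_per_year * cost_per_tester_hour
value_of_one_point = annual_fleet_cost * 0.01  # 1% more useful output for the same spend

print(f"Annual fleet cost:        ${annual_fleet_cost:,.0f}")
print(f"Value of +1% utilization: ${value_of_one_point:,.0f}")
```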

There is a lot to this new capability. I have just scratched the surface in this discussion. The good news is that you can easily learn more on the new site with enhanced visuals and summaries. You can even set up a discovery call to learn more.

Learning about yieldHUB Live

To Learn More

If items such as protecting margins, improving profitability, meeting SLAs, coordinating across your supply chain with data-driven conversations, and maximizing performance at scale are important to you, yieldHUB can help. And the company’s new website and new technology make it easier to achieve your goals.

You can get a good picture of the breadth of capabilities offered by yieldHUB on SemiWiki here. And you should definitely check out the new website here. If you want to learn more about yieldHUB Live you can do that here. Short on time? Access this informative short video to get started.

And that’s how yieldHUB expands its impact with new technology and a new website.

Also Read:

Podcast EP301: Celebrating 20 Years of Innovation with yieldHUB’s John O’Donnell

SemiWiki Outlook 2025 with yieldHUB Founder & CEO John O’Donnell

Podcast EP254: How Genealogy Correlation Can Uncover New Design Insights and improvements with yieldHUB’s Kevin Robinson



NXP Expands Arteris NoC Deployment to Scale Edge AI Architectures
by Daniel Nenni on 04-09-2026 at 8:00 am


As edge AI systems become more centralized and compute-dense, on-chip data movement is increasingly the architectural bottleneck. NXP’s expanded deployment of Arteris network-on-chip (NoC) and cache-coherent interconnect IP highlights a broader industry trend: interconnect architecture is now a first-order design challenge, not just plumbing.

Arteris recently announced that NXP is broadening its use of FlexNoC®, Ncore®, CodaCache®, and Magillem® integration automation tools across AI-enabled silicon platforms. While the announcement may read like a routine IP expansion, it reflects something more strategic—NXP is standardizing around scalable interconnect infrastructure to support increasingly heterogeneous and safety-critical edge AI designs.

The Real Challenge: Heterogeneous Scaling at the Edge

Automotive and industrial SoCs have shifted dramatically in the past decade. What were once distributed MCU-based systems are evolving into centralized compute platforms integrating:

  • High-performance application CPUs
  • Real-time safety cores
  • NPUs and AI accelerators
  • GPUs and vision processors
  • Security enclaves
  • High-bandwidth memory subsystems

This heterogeneity creates enormous stress on the on-chip fabric. The traditional bus-based interconnect architectures used in earlier generations cannot efficiently scale to support high core counts, accelerator-heavy workloads, and mixed-criticality traffic.

Edge AI workloads—such as sensor fusion, ADAS perception stacks, industrial machine vision, and predictive maintenance—require deterministic latency, sustained bandwidth, and strict isolation between safety and non-safety domains. At the same time, power efficiency remains a hard constraint.

This is precisely where configurable NoC architectures have become essential.

FlexNoC as the Data Movement Backbone

NXP’s expanded use of Arteris FlexNoC suggests a continued architectural commitment to packetized, scalable interconnect fabrics.

FlexNoC enables customized topologies—mesh, hierarchical, crossbar, or hybrid—tailored to workload characteristics. That flexibility is increasingly important as SoCs integrate compute clusters with very different traffic patterns. AI accelerators generate bursty, high-bandwidth transactions. Real-time cores demand low-latency determinism. Safety subsystems require strict partitioning.

Fine-grained quality-of-service (QoS), bandwidth allocation, and traffic shaping capabilities allow architects to enforce policy at the fabric level. This becomes critical in automotive designs targeting ISO 26262 compliance, where isolation and predictable behavior must be guaranteed.
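
To make the notion of policy enforcement at the fabric level concrete, here is a small, illustrative sketch of weighted arbitration splitting link bandwidth among traffic classes. The class names and weights are assumptions for the example and do not represent FlexNoC's actual configuration interface.

```python
# Illustrative only: a toy weighted arbiter showing how per-class QoS weights
# translate into a bandwidth split at a shared fabric link. Class names and
# weights are assumptions, not a FlexNoC configuration.
from collections import Counter

qos_weights = {
    "safety_core": 4,      # latency-critical, guaranteed share
    "ai_accelerator": 8,   # bursty, high-bandwidth
    "vision_dma": 6,
    "housekeeping": 1,     # best-effort
}

def arbitrate(weights, total_beats):
    """Distribute link beats proportionally to QoS weights (ideal fluid model)."""
    total = sum(weights.values())
    share = {cls: w / total for cls, w in weights.items()}
    credit = {cls: 0.0 for cls in weights}
    grants = Counter()
    for _ in range(total_beats):
        for cls in credit:
            credit[cls] += share[cls]          # accrue credit every beat
        winner = max(credit, key=credit.get)   # grant the most-starved class
        credit[winner] -= 1.0
        grants[winner] += 1
    return grants

print(arbitrate(qos_weights, 1000))
# Roughly 4/19, 8/19, 6/19 and 1/19 of the 1000 beats, matching the weights.
```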

In centralized domain-controller architectures, the NoC is no longer just a connectivity layer—it becomes the performance governor of the entire SoC.

Scaling Coherency Without Power Explosion

NXP’s use of Arteris Ncore® cache-coherent NoC IP also reflects the growing complexity of multi-core and heterogeneous coherency domains.

As edge devices adopt higher core counts and accelerator integration, maintaining efficient hardware coherency becomes increasingly challenging. Broadcast-based snooping quickly becomes unsustainable at scale due to power and bandwidth overhead.

Directory-based coherency with distributed snoop filtering, such as that implemented in Ncore, reduces unnecessary traffic while enabling scalable coherency domains. For heterogeneous compute clusters where CPUs and accelerators must share memory space, this is critical.
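
As a rough back-of-the-envelope comparison (not a model of Ncore itself), the sketch below contrasts the snoop probes generated under broadcast snooping versus a directory that tracks actual sharers; the agent count, transaction count, and average sharer count are assumed values.

```python
# Illustrative only: snoop traffic under broadcast vs. directory-based coherency.
# The agent count, transaction count, and average sharer count are assumed values.
def broadcast_probes(num_agents, transactions):
    # Every coherent transaction probes all other agents.
    return transactions * (num_agents - 1)

def directory_probes(avg_sharers, transactions):
    # A directory with snoop filtering probes only agents actually sharing the line.
    return transactions * avg_sharers

agents = 16           # CPUs plus coherent accelerators (assumed)
txns = 1_000_000      # coherent transactions (assumed)
avg_sharers = 1.5     # most lines are held by very few agents (assumed)

print(f"broadcast probes: {broadcast_probes(agents, txns):,}")
print(f"directory probes: {directory_probes(avg_sharers, txns):,.0f}")
```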

The alternative, software-managed coherency or non-coherent partitions, often increases latency and complexity. Hardware-managed coherency remains the most efficient path for many high-performance AI workloads at the edge.

Memory Pressure and the Role of CodaCache

Edge AI workloads are often memory-bound. Sensor fusion pipelines and neural inference engines generate significant DRAM traffic. External memory bandwidth is expensive in power, latency, and cost.

CodaCache® last-level cache IP helps mitigate off-chip bandwidth pressure by improving effective memory utilization. Configurable associativity, partitioning, and QoS-aware policies enable performance isolation across safety domains while reducing DRAM transactions.

In thermally constrained environments such as automotive ECUs and industrial controllers, reducing off-chip memory traffic directly translates into improved power efficiency and system reliability.
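
A simple back-of-the-envelope sketch illustrates the point; the traffic, hit-rate, and per-byte energy numbers below are assumed round figures, not CodaCache or DRAM datasheet values.

```python
# Illustrative only: how a last-level-cache hit rate reduces DRAM traffic and
# the power spent moving data off-chip. Traffic, hit rate, and per-byte energy
# are assumed round numbers, not CodaCache or DRAM datasheet values.
def offchip_savings(dram_traffic_gb_s, llc_hit_rate,
                    dram_pj_per_byte=60.0, sram_pj_per_byte=5.0):
    served_on_chip = dram_traffic_gb_s * llc_hit_rate
    residual_dram = dram_traffic_gb_s - served_on_chip
    # Power in watts: (GB/s * 1e9 bytes/s) * (pJ/byte * 1e-12 J/pJ)
    baseline_w = dram_traffic_gb_s * 1e9 * dram_pj_per_byte * 1e-12
    new_w = (residual_dram * 1e9 * dram_pj_per_byte +
             served_on_chip * 1e9 * sram_pj_per_byte) * 1e-12
    return residual_dram, baseline_w, new_w

residual, before_w, after_w = offchip_savings(dram_traffic_gb_s=40.0, llc_hit_rate=0.35)
print(f"DRAM traffic:        40.0 -> {residual:.1f} GB/s")
print(f"Data-movement power: {before_w:.2f} W -> {after_w:.2f} W")
```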

Preparing for Chiplets and Long-Term Scalability

Another strategic aspect often overlooked in such announcements is future packaging direction.

Modern NoC architectures are increasingly being designed with multi-die scalability in mind. Clean partition boundaries, protocol abstraction, and modular network interface units (NIUs) allow interconnect fabrics to extend across die-to-die interfaces as chiplet adoption increases.

For companies like NXP with long automotive product lifecycles, selecting an interconnect IP provider that supports both current monolithic SoCs and future heterogeneous packaging strategies reduces long-term architectural risk.

Integration Complexity Is Now a Bottleneck

It’s also notable that NXP continues to deploy Arteris Magillem® for IP integration automation.

As SoCs integrate hundreds of IP blocks, managing configuration, interface validation, and register maps becomes a non-trivial engineering burden. Metadata-driven automation through IP-XACT-based flows improves reuse and reduces integration errors, which is especially important in safety-certified programs where traceability and documentation matter.

The complexity of integration now rivals the complexity of microarchitecture. Automation tools are no longer optional productivity enhancements—they are risk mitigation instruments.

The Bigger Industry Trend

The expanded Arteris deployment at NXP illustrates a broader shift across the semiconductor industry:

  • Interconnect is a strategic architectural layer.
  • Coherency scaling is a power problem as much as a performance problem.
  • Memory efficiency is central to AI performance.
  • Integration automation is becoming mission-critical.

As AI workloads move from cloud to edge, and as automotive architectures centralize compute, scalable and configurable NoC infrastructure becomes foundational.

Bottom line: For semiconductor architects, this is a reminder that future SoC competitiveness will depend not just on compute IP selection, but on how effectively data moves between those blocks. In the AI era, the fabric is the architecture.

CONTACT ARTERIS IP

Also Read:

Arteris Smart NoC Automation: Accelerating AI-Ready SoC Design in the Era of Chiplets

WEBINAR: Why Network-on-Chip (NoC) Has Become the Cornerstone of AI-Optimized SoCs

The IO Hub: An Emerging Pattern for System Connectivity in Chiplet-Based Designs



Architecting Intelligence: The Rise of RISC-V CPUs in Agentic AI Infrastructure
by Daniel Nenni on 04-09-2026 at 6:00 am


SiFive’s newly announced $400 million Series G financing represents a significant technical inflection point for high-performance RISC-V CPU development targeted at agentic AI data center workloads. The funding, which values the company at $3.65 billion, is specifically intended to accelerate next-generation CPU IP, software ecosystem maturation, and hyperscale deployment enablement. These initiatives collectively address emerging compute bottlenecks where traditional architectures struggle to balance orchestration efficiency, scalability, and power constraints in increasingly heterogeneous AI infrastructure.

A central technical driver behind the investment is the growing role of CPUs in agentic AI systems. While GPUs and specialized accelerators deliver high throughput for tensor operations, they are not optimized for complex control flow, scheduling, and system-level coordination. Agentic models composed of multiple interacting inference loops, tool integrations, and dynamic decision trees require low-latency orchestration and efficient context switching. CPUs, particularly those designed with extensible instruction sets and scalable vector capabilities, are well positioned to handle these workloads. RISC-V’s modular architecture enables vendors to tailor scalar, vector, and matrix extensions to specific orchestration patterns, improving efficiency compared with monolithic legacy instruction set architectures.

From a microarchitectural perspective, the roadmap focuses on tightly integrating scalar pipelines with vector and matrix compute units. This co-design reduces memory bandwidth overhead by minimizing data movement between heterogeneous compute blocks. By embedding domain-specific accelerators directly within the CPU fabric, RISC-V implementations can support hybrid workloads that interleave control-heavy logic with localized numeric computation. This is particularly valuable for AI agents performing reasoning, planning, and iterative refinement tasks, where frequent transitions between symbolic and numeric operations occur. Such integration also simplifies cache coherence and reduces latency penalties associated with discrete accelerator offloading.

Power efficiency is another major technical motivation. As AI clusters scale, total facility power and thermal density become limiting factors. Traditional architectures often rely on high clock frequencies and deep out-of-order execution pipelines to boost performance, increasing energy consumption. RISC-V designs can instead leverage workload-specific instruction extensions and right-sized pipelines to achieve better performance-per-watt. This approach enables data center operators to expand compute capacity within existing power envelopes, a critical requirement as AI training and inference demand grows exponentially.

The software ecosystem component of the investment is equally important. Expanding support for widely used operating systems and acceleration frameworks ensures that new hardware can be deployed without extensive porting overhead. Native compatibility with Linux distributions and GPU interconnect technologies enables heterogeneous clusters where RISC-V CPUs orchestrate GPU-accelerated compute. This tight coupling improves scheduling efficiency and reduces host-side bottlenecks. Additionally, standardized toolchains and compiler optimizations for vector and matrix extensions are necessary to fully exploit hardware capabilities. Investments in software infrastructure will accelerate adoption by hyperscalers and enterprise users.

Customer enablement efforts also highlight a broader architectural trend toward co-design. Hyperscale operators increasingly require customized CPU IP to differentiate their infrastructure. Unlike fixed architectures, RISC-V allows integration of proprietary accelerators, specialized memory hierarchies, and tailored interconnect logic. This flexibility shortens design cycles and allows rapid iteration aligned with evolving AI workloads. As agentic AI systems grow more complex, the ability to customize CPU features such as hardware task schedulers, low-latency messaging primitives, or domain-specific vector units becomes strategically valuable.

Another technical advantage lies in ecosystem openness. Open standards encourage collaboration across semiconductor vendors, cloud providers, and software developers. This collaborative model accelerates innovation by allowing independent contributions to instruction set extensions, verification frameworks, and performance optimization tools. Over time, this can produce a robust ecosystem comparable to established architectures while maintaining flexibility for specialization.

Bottom line: The financing supports three interconnected technical objectives: advancing high-performance RISC-V CPU IP, expanding software compatibility, and enabling large-scale deployment in AI data centers. Together, these efforts address the orchestration, efficiency, and scalability challenges introduced by agentic AI workloads. As compute infrastructure evolves toward heterogeneous and power-constrained environments, customizable CPU architectures with integrated vector and matrix capabilities are poised to play a central role in next-generation AI systems.

Also Read:

SiFive’s AI’s Next Chapter: RISC-V and Custom Silicon

SiFive to Power Next-Gen RISC-V AI Data Centers with NVIDIA NVLink Fusion

Tiling Support in SiFive’s AI/ML Software Stack for RISC-V Vector-Matrix Extension



Intel, Musk, and the Tweet That Launched a 1000 Ships on a Becalmed Sea
by Jonah McLeod on 04-08-2026 at 12:00 pm


Why do professional executives running major corporations frame a defining moment in their company’s history with a tweet? Jerry Sanders spent his career yelling “real men have fabs!” Now Intel has fabs, and apparently tweets about them.

“Intel is proud to join the Terafab project with @SpaceX, @xAI, and @Tesla to help refactor silicon fab technology. Our ability to design, fabricate, and package ultra-high-performance chips at scale will help accelerate Terafab’s aim to produce 1 TW/year of compute to power future advances in AI and robotics.” Source: (Intel, X, April 7, 2026)

But that’s what Intel, beneficiary of $11.1 billion in federal support–the largest government industrial rescue since Carter handed Chrysler $1.5 billion in federal loan guarantees back in 1979–posted: a single sentence announcing it was joining the Terafab project alongside SpaceX, xAI, and Tesla. The difference is Chrysler paid its loans back. Intel’s arrangement, restructured by the Trump administration, converted the remaining unspent grants into a 9.9% U.S. government equity stake. Washington isn’t a creditor here. It’s a shareholder. No press release, no technical briefing, no joint statement. Just a short message about “refactoring silicon fab technology” and enabling 1 terawatt per year of compute–for something that, if real, could reshape the semiconductor supply chain.

What made it more interesting was who didn’t speak. Elon Musk said nothing. Tesla said nothing. SpaceX said nothing. xAI said nothing. The entire Musk ecosystem–usually not shy about declaring intent–went quiet. That’s not a communications failure. That’s a system being assembled before the full structure is locked. Most of the early coverage has treated Terafab as Musk entering the semiconductor manufacturing business. Convenient framing, wrong conclusion. Musk doesn’t need another business. He needs control.

Across Tesla, SpaceX, and xAI, he already owns the inputs that matter most: real-world workloads, massive deployment platforms, continuous streams of operational data. What he doesn’t control is the one layer that gates everything–the ability to turn those workloads into silicon, at scale, on his own timeline.

That dependency is the constraint. The traditional model–design a chip, hand it to a foundry, wait months, iterate slowly–is too disconnected for what he’s trying to build. Terafab isn’t a fab. It’s an attempt to close the loop. Workloads define architecture, architecture drives silicon, silicon feeds deployed systems, deployed systems generate data, repeat.

Seen that way, Intel’s role becomes clearer. It’s the missing piece. And also, uncomfortably, the weakest one. Intel brings things Musk can’t build quickly: advanced process technology, manufacturing infrastructure, packaging capabilities that matter enormously for AI systems. Without those, Terafab is a concept. With them, it becomes plausible, if not proven.

For all of Intel’s recent progress–and there has been genuine progress–one milestone remains conspicuously absent. No public evidence that Intel Foundry has taken a true external customer all the way from RTL through GDSII to tapeout on its most advanced nodes. That’s the moment a foundry stops being a promise and becomes a platform. Everything else–PDK availability, design engagements, ecosystem partnerships–is pre-validation.

Intel has spent years operating in a relatively controlled environment. Its biggest “customers” have been internal product groups, government programs, and early strategic partners. Cooperative relationships, all of them. Expectations negotiable. Messaging manageable. That’s not how the foundry business works at scale. TSMC was forged by customers who had no patience for delay, no tolerance for yield problems, and no incentive to soften feedback.

Musk is not going to soften feedback.

Which raises the obvious question: why would he choose a partner with Intel’s execution history? Because he’s not optimizing for what everyone else is optimizing for. TSMC offers near-perfect execution. It does not offer flexibility, co-design influence, or capacity on demand. It certainly doesn’t offer control. Intel, by contrast, needs anchor customers, needs volume, needs to prove something–and that need creates an unusual dynamic. Intel is willing to adapt in ways TSMC simply isn’t. Musk isn’t betting on Intel’s past. He’s betting that Intel’s desperation to change makes it the only partner willing to align on his terms.

That cuts both ways. If Intel’s recent progress has occurred in a protected environment, this partnership removes the protection entirely. Musk is not a captive customer. He’s not patient and he’s not dependent. If timelines slip or yields disappoint, he’ll route around the problem–fast. This is less a partnership than a forcing function. It moves Intel out of controlled validation and into real exposure, compresses timelines, raises expectations, and eliminates any ability to manage the narrative independent of actual execution.

That may be exactly why Intel agreed to it. For Intel, this is the anchor customer it’s been looking for–high volume, high visibility, capable of proving in public that it can compete again at the leading edge. It’s also a genuine risk, because the window to demonstrate that capability is no longer measured in decades. It’s measured in product cycles. One, maybe two.

Read the tweet in that context. It’s not a finished deal announcement. It’s a marker that two specific problems–Musk’s need for control and Intel’s need for validation–have found each other.

Whether they actually solve each other, we’re about to find out.

Also Read:

Agentic AI Demands More Than GPUs

Silicon Insurance: Why eFPGA is Cheaper Than a Respin — and Why It Matters in the Intel 18A Era

Captain America: Can Elon Musk Save America’s Chip Manufacturing Industry?



From SoC to System-in-Package: Transforming Automotive Compute with Multi-Die Integration
by Daniel Nenni on 04-08-2026 at 10:00 am

Types of Multi-Die Design Packaging (Synopsys)

Modern automotive electronics are undergoing a rapid transformation driven by increasing compute demands, functional safety requirements, and the shift toward scalable semiconductor architectures. One of the most significant technological developments enabling this transformation is the adoption of multi-die system integration. Multi-die design refers to integrating multiple semiconductor dies—either homogeneous or heterogeneous—into a single package to deliver improved scalability, performance, and reliability. This architectural evolution is particularly relevant for advanced driver assistance systems (ADAS), autonomous driving, and digital cockpit applications, where traditional monolithic system-on-chip (SoC) designs struggle to meet growing requirements.

Automotive environments impose some of the harshest operating conditions for electronics. Devices must withstand vibration, temperature extremes, humidity, and electromagnetic noise, all while maintaining functional safety. Additionally, vehicles are expected to operate reliably for 10–15 years with minimal maintenance. As automotive autonomy levels increase, the computational demand grows exponentially. Higher levels of automation require complex processing pipelines involving CPUs, GPUs, AI accelerators, digital signal processors, and high-bandwidth memory subsystems. These requirements often exceed the practical limits of monolithic chip fabrication, motivating the transition to modular multi-die architectures.

Multi-die design provides several technical advantages. First, it improves scalability by allowing designers to reuse proven dies and combine them in different configurations. This reduces development time and risk compared to designing a new monolithic chip for each product variant. Second, partitioning functionality across smaller dies can improve manufacturing yield. Large monolithic dies are more susceptible to defects, whereas smaller dies increase the probability of obtaining functional silicon. Third, multi-die packaging allows heterogeneous integration. Designers can combine components fabricated in different process nodes, such as advanced digital logic in a cutting-edge node and analog or I/O circuitry in mature technologies, optimizing power, performance, and cost.
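
To see why smaller dies help yield, here is a quick sketch using the widely used Poisson defect-density yield model; the die areas and defect density are illustrative round numbers, not figures from the article.

```python
# Illustrative only: Poisson yield model comparing one large monolithic die with
# the same logic split across four smaller dies. The defect density and die
# areas are assumed round numbers, not figures from the article.
import math

def die_yield(area_cm2, defect_density_per_cm2):
    """Poisson model: Y = exp(-A * D0)."""
    return math.exp(-area_cm2 * defect_density_per_cm2)

d0 = 0.1                          # defects per cm^2 (assumed)
monolithic = die_yield(6.0, d0)   # one 600 mm^2 die
small_die = die_yield(1.5, d0)    # each of four 150 mm^2 dies

print(f"600 mm^2 monolithic die yield: {monolithic:.1%}")
print(f"150 mm^2 die yield:            {small_die:.1%}")
# With known-good-die testing before assembly, each defective small die is
# discarded individually, so a single defect no longer scraps an entire
# 600 mm^2 of silicon; the cost of a defect shrinks with the die it lands on.
```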

Another key benefit is improved interconnect performance. Die-to-die communication within a package provides significantly higher bandwidth and lower latency than traditional chip-to-chip communication over printed circuit boards. This is particularly important for AI inference workloads, sensor fusion, and high-resolution camera processing in autonomous vehicles. Advanced packaging technologies such as 2.5D interposers, 3D stacking, and microbump interconnects enable extremely high I/O density. These technologies allow designers to stack memory on top of compute dies or distribute functional blocks across multiple dies while maintaining high throughput.

Safety and reliability remain central considerations in automotive multi-die systems. Standards such as ISO 26262 require fault detection, redundancy, and fail-safe mechanisms. Multi-die architectures introduce additional challenges, including monitoring die-to-die interconnects, managing thermal hotspots, and ensuring package-level reliability. To address these challenges, designers incorporate silicon lifecycle management (SLM) techniques, including process, voltage, and temperature sensors, error-correcting codes, and health monitoring circuits. These mechanisms enable predictive maintenance and in-field diagnostics, ensuring that faults are detected early and mitigated before they compromise vehicle safety.

The adoption of multi-die architectures is also driven by emerging vehicle design trends such as zonal architectures and software-defined vehicles. Instead of distributing many small electronic control units across the vehicle, modern designs centralize compute resources in high-performance processors. Multi-die platforms provide the flexibility needed to scale compute resources across vehicle tiers, from entry-level driver assistance to fully autonomous systems. Manufacturers can create families of chips by combining base dies with optional GPU or AI accelerator dies, enabling efficient product differentiation.

Despite its advantages, multi-die design introduces engineering complexity. Designers must carefully partition functionality, optimize interconnect topology, and validate system-level behavior across multiple dies. Thermal management becomes more challenging due to higher power density. Verification flows must consider both die-level and package-level interactions. However, advances in electronic design automation tools and standardized interconnect protocols are making these challenges manageable.

Bottom line: multi-die semiconductor integration is becoming a foundational technology for next-generation automotive electronics. By enabling scalable compute architectures, improved yield, heterogeneous integration, and enhanced reliability, multi-die design addresses the limitations of monolithic SoCs. As vehicles continue to evolve toward autonomy and software-defined functionality, multi-die systems will play a critical role in delivering the performance, safety, and flexibility required for future automotive platforms.

Multi-Die Design for Automotive Applications

Also Read:

Podcast EP337: The Importance of Network Communications to Enable AI Workloads with Abhinav Kothiala

Synopsys Advances Hardware Assisted Verification for the AI Era

Scaling Multi-Die Connectivity: Automated Routing for High-Speed Interfaces



Agentic AI Demands More Than GPUs
by Daniel Nenni on 04-08-2026 at 8:00 am


Agentic AI workloads are reshaping the compute requirements of modern data center infrastructure by shifting performance bottlenecks from GPU-centric inference to CPU-heavy orchestration and workflow management. Traditional AI inference pipelines relied primarily on GPUs performing a single forward pass, where input tokenization, model execution, and output generation occurred sequentially. However, emerging agentic AI systems transform inference into a distributed, multi-step process involving planning, tool invocation, validation, and iterative reasoning. This architectural change introduces substantial CPU demand, making CPU capacity a critical factor in maintaining system throughput and overall cost efficiency.

In agentic workflows, CPUs perform orchestration tasks such as control flow management, branching logic, retries, and coordination between multiple agents and external services. Each agent invocation may require interaction with databases, APIs, search engines, or vector stores, all of which generate additional CPU, memory, and I/O overhead. Moreover, reasoning-heavy workloads often require sandboxed execution environments for validation and testing. These iterative loops create multi-turn workflows in which CPUs determine end-to-end throughput. When CPU resources are insufficient, GPUs remain idle while waiting for preprocessing, tool execution, or verification steps to complete, resulting in inefficient use of expensive accelerator hardware.

Experimental benchmarks reinforce the significance of CPU workloads in agentic pipelines. In a financial anomaly detection workflow modeled after regulatory filing analysis, CPUs handled tasks such as data loading, baseline calculation, anomaly detection, document retrieval, and enrichment through web searches. The results demonstrated that CPU operations dominated the total runtime, with enrichment alone consuming significantly more time than the GPU-based model inference step. This highlights that inference acceleration alone cannot optimize performance; instead, system balance between CPU orchestration and GPU computation is required.

A second benchmark focusing on AI-assisted code generation further illustrated CPU bottlenecks. In this workflow, the GPU generated candidate solutions, while CPUs executed and verified code within sandboxed environments. Across more than two thousand tasks, CPU-based sandbox execution consumed slightly more time than GPU code generation, despite utilizing a high-core-count system. The CPU phase involved subprocess management, test execution, and result analysis, demonstrating that validation loops can rival or exceed inference time in agentic systems. These findings indicate that increasing GPU performance alone does not improve overall throughput without proportional CPU scaling.

Infrastructure sizing recommendations emerging from these experiments emphasize maintaining balanced CPU-to-GPU ratios. Current guidance suggests a ratio between 1:1 and 1.4:1 CPUs to GPUs, equivalent to approximately 86 to 120 CPU cores per GPU, depending on workload characteristics. Smaller models generating tokens more quickly require additional CPU capacity to keep GPUs saturated, while more powerful CPUs can reduce the required ratio. Future high-performance GPUs may further increase CPU demand, potentially pushing ratios higher when orchestration complexity grows.
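
As a quick sizing sketch based on the guidance above: the 86-core socket figure simply back-calculates from the quoted 1:1 ratio mapping to roughly 86 cores per GPU, and the 64-GPU pod size is an assumption for the example.

```python
# Illustrative only: CPU provisioning for a GPU pod using the 1:1 to 1.4:1
# CPU-to-GPU guidance above. The 86-core socket and 64-GPU pod are assumptions.
def cpu_sizing(num_gpus, cpus_per_gpu, cores_per_cpu=86):
    cpus = num_gpus * cpus_per_gpu
    return cpus, cpus * cores_per_cpu

for ratio in (1.0, 1.4):
    cpus, cores = cpu_sizing(num_gpus=64, cpus_per_gpu=ratio)
    print(f"{ratio:.1f} CPUs per GPU -> {cpus:.0f} CPU sockets, "
          f"~{cores:.0f} cores for a 64-GPU pod")
```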

The implications extend beyond performance optimization. Under-provisioned CPU resources can introduce latency in orchestration, delay tool execution, and slow verification loops, all of which reduce GPU utilization and increase operational costs. Conversely, scaling CPUs ensures continuous data preparation, coordination, and validation, allowing GPUs to operate at maximum efficiency. This system-level balance mirrors microservices architectures, where overall performance depends on the slowest component rather than the fastest.

Bottom line: As agentic AI continues to evolve, CPUs will play an increasingly central role in inference infrastructure. The transition from single-pass inference to multi-step workflows shifts value toward orchestration, coordination, and runtime management. Organizations deploying agentic systems must therefore reconsider traditional GPU-centric scaling strategies and instead design balanced architectures that provision sufficient CPU capacity. By aligning CPU and GPU resources, data centers can sustain throughput, minimize idle accelerators, and optimize total cost of ownership for next-generation AI deployments.

Agentic AI Requires More CPUs

Also Read:

Silicon Insurance: Why eFPGA is Cheaper Than a Respin — and Why It Matters in the Intel 18A Era

Captain America: Can Elon Musk Save America’s Chip Manufacturing Industry?

Intel to Compete with Broadcom and Marvell in the Lucrative ASIC Business



Accellera Updates at DVCon 2026
by Bernard Murphy on 04-08-2026 at 6:00 am


Lu Dai (chair of Accellera) and I had our regular chat at DVCon U.S. 2026. Accellera also hosted a reception in the exhibits hall, with free snacks and drinks, very well attended. We talked about what’s new in Accellera, with a particular emphasis on the recently released standard for CDC and RDC tool interoperability, also Lu’s thoughts on trust.

Accellera updates

The members list is largely unchanged: semiconductor and EDA of course but also multiple systems companies including Cariad (VW), Microsoft, Google and Apple. One new member is ChipAgents.

Conferences are expanding, geographically and in size. India, Taiwan and Japan conferences are leading in growth. The China conference is not growing as fast, though Lu sees opportunity to accelerate growth in China in a couple of ways. First, the current event is hosted only in Shanghai and primarily attracts multinationals, which may limit attendance since their involvement in Accellera is already represented in other geographies. Growing into other areas outside Shanghai should attract more local interest.

Second, Lu suggested that an academic track would have significant appeal in the Chinese technical community. They want, as much as we do, an opportunity to be recognized for their research in a credible international forum. Adding this feature would extend the success DVCon Europe has already seen in this respect.

The SystemC group has scheduled a first SystemC code sprint in April, as a shift from traditional bi-weekly meetings to an approach more common in the open-source community. A bunch of developers will write code (libraries and reference examples) together and in-person. They are also continuing an online event in April called a Fika (a Swedish concept). These are short meetings to chat and catch up. Shout out to DVCon Europe for leading the way in multiple areas!

Accellera hosted a Summer of Code project directed at interns in 2025. This may be rebranded to target other standards such as CDC and UVM. Ambitious interns should note this year’s program. Lu is optimistic that with committed mentors this project could be a very effective method to grow the visibility and adoption of Accellera standards.

CDC and RDC standard – users need to step up!

The 1.0 version of this standard was released in March 2025. Lu says this is pretty complete for CDC, though he expects some iteration for RDC given respective levels of interest in those domains. Just as a reminder, IP and subsystem suppliers use their approved vendor checkers, with that vendor’s constraint format, to validate CDC and RDC and produce an output in that vendor’s format. Today if you as an integrator have a different approved vendor for CDC/RDC, you must translate those constraints and outputs to your vendor’s format, which is doable but requires maintenance and, as always, is a potential source of errors. The only other option is to run flat, an expensive approach that is increasingly infeasible given design complexity.

Now the community needs to push the compatibility burden back on the tool vendors. The standard is ready, vendors are aware of the standard, but they are not as motivated to prioritize support in their tools if customers aren’t demanding support. Time to express your expectations more vigorously!

Risk

I’m always interested in Lu’s viewpoints independent of Accellera; he also chairs RISC-V International and is VP of Technical Standards at Qualcomm. He has a broader view of risk than I have been thinking about recently (for AI in verification). I’ll just highlight a couple of points here.

One point he raised in the context of standards is possible exposure to IPR (intellectual property rights) conflicts in AI-based contributions to a standard. Did the contributor use a bot to create or format their contribution? If so, does that create a legal hazard for the standards body or more widely? Lu said they have examined the legal question for Accellera and are comfortable that they are OK. However, he suggests that contributors (and innovators in general) should not use AI to document their final contributions because they don’t know how the generated material might innocently tread on others’ IPR.

Lu also mentioned a trust-related risk, illustrated by the backdoor introduced into XZ Utils, a compression library used across Linux distributions, in early 2024 and only caught by chance in unrelated beta testing. The hacker was a maintainer who had done good work for two years and therefore was trusted by the open-source community. Once trust was established, this sleeper agent introduced the hack. Here the problem was human-generated but could equally have been AI-generated.

Good chat. You can learn more about the CDC/RDC release HERE.

Also Read:

Accellera Strengthens Industry Collaboration and Standards Leadership at DVCon U.S. 2026

Podcast EP330: An Overview of DVCon U.S. 2026 with Xiaolin Chen

Boosting SoC Design Productivity with IP-XACT



When a Platform Provider Becomes a Competitor: Why Arm’s Silicon Strategy Changes the Incentives
by Admin on 04-07-2026 at 10:00 am


Marc Evans, Director of Business Development & Marketing, Andes Technology USA

I work at a RISC-V IP company, and I genuinely root for Arm — probably more than most people in my position would admit. Not because I’m confused about who competes with whom, but because Arm’s best move for their shareholders is also RISC-V’s biggest tailwind yet.

This isn’t really an Arm vs. RISC-V story. It’s a platform economics story: what happens when a neutral platform provider begins competing with the customers it enables.

The Value Chain Climb — and Why It Makes Sense

Throughout its history, Arm has steadily moved up the value chain — from CPU IP to system and GPU IP to full Compute Subsystems — capturing more silicon value at each step. More license fees, more royalties, more of the margin their customers were earning. Smart business.

The economics are straightforward: IP licensing captures a small but highly profitable slice of system value. Moving into silicon means competing for a much larger share of that system value — potentially orders of magnitude more revenue, at lower margin but far greater profit. For a public company under growth pressure, that math is compelling.

Now they’ve announced their AGI CPU with Meta as a lead customer, targeting a $100B TAM in datacenter CPUs. This is a smart move — and largely aligned with where the market was already going.

The hyperscalers were already moving off x86: AWS Graviton, Google Axion, Microsoft Azure Cobalt, Oracle on Ampere. Arm is formalizing that shift and taking direct aim at Intel, AMD, and the internal silicon teams of hyperscalers.

This doesn’t meaningfully threaten Arm’s broader customer base. Hyperscalers at tier-1 scale have the volume and leverage to manage that relationship. Tier-2 players generally can’t justify the custom silicon investment regardless of who supplies the IP. Good for Arm shareholders. Good for the ecosystem. Cheer for it.

The Statement Worth Paying Attention To

But then came this, from Rene Haas at Arm Everywhere:

“There will be some tomorrows,” he said at Arm Everywhere, “And we think this opportunity to take the work we’ve done across all of the markets — as you’ve heard in the videos from edge to cloud, from milliwatts to gigawatts — we think we have an opportunity to address greater than a $1T TAM by the end of the decade.”

That’s the statement worth paying attention to.

Because that $1T TAM isn’t new market creation. It’s not x86 territory. It maps directly to Arm’s existing customer base: smartphones, automotive, industrial, AI acceleration, storage and networking, communications infrastructure.

The Structural Tension

Arm’s IP business was built on a model where they capture value because their customers succeed. Licensing revenue scales with their customers’ volume. The incentive alignment was clean — Arm wins when its customers win.

Moving into silicon in their customers’ end markets changes that alignment.

At sufficient scale, Arm’s silicon revenue competes directly with the revenue of the same companies paying their licensing fees and royalties. When silicon becomes the larger profit pool, which business gets the roadmap investment? Which gets the favorable terms?

That shift inevitably influences where engineering investment and long-term roadmap priority go.

That’s not a criticism. It’s just what the business model evolution implies.

When Switzerland Picks a Side

Arm’s entire IP business was built on being Switzerland. You could build on Arm and trust that your foundational CPU supplier wasn’t going to show up as a competitor in your end market. That neutrality had real value. Customers paid for it, designed around it, built long-term product roadmaps on top of it.

That Switzerland just picked a side. And that changes the relationship.

If you’re an automotive OEM trying to differentiate on processing, or an industrial company with a specific long-cycle compute roadmap, you now have to factor something into your planning you never had before: your strategic IP vendor is also a competitor with de facto first-mover advantage in your market, while simultaneously setting your license fees and royalty rates.

Those are design cycles measured in years and product investments measured in hundreds of millions. The risk doesn’t have to be immediate to be real. By the end of the decade — which is exactly the timeframe Haas cited — this becomes existential for some.

The New Switzerland

RISC-V is the new Switzerland — and it’s not entirely ironic that RISC-V International is incorporated in Switzerland.

Open standard, no single entity controlling the architecture, no sole vendor who can pivot to compete with you. You can license a proven commercial implementation and differentiate however you need to — with full confidence that your IP supplier’s business model depends on your success, not on displacing you.

The risk profile is structurally different. That matters in a design decision with a five-to-ten year horizon.

What About the Software Ecosystem?

It’s a fair challenge, and I won’t oversell it. But the framing matters.

What made Arm ubiquitous was underwriting the full transition of the software world from x86 to a RISC architecture — operating systems, compilers, middleware, application stacks. That was a decades-long, industry-wide investment.

Moving between RISC architectures is a fundamentally different problem. The architectural model is established, tooling and abstraction layers exist, and the transition cost is a fraction of the original lift.

The RISC-V software base reflects that — Linux, Android, real-time OSs, and expanding AI and HPC frameworks are already in place for a broad set of real products shipping today.

More importantly: ecosystem maturity follows economic incentive. It always has.

If you’re starting a design today with a two-year horizon to production, the question isn’t where the RISC-V software ecosystem is right now. It’s where it will be when your product ships.

Arm’s move just poured accelerant on that timeline.

The Bottom Line

This isn’t about ideology. It’s about structural incentives — and those incentives are shifting in a way that is both predictable and significant.

So yes — go Arm. Build the chips. Capture that silicon TAM. It’s the right move for your shareholders and it’s genuinely good for the industry: competition at the silicon level raises the bar for everyone, and customers ultimately win with real choice.

But for the companies now rethinking their silicon roadmap — the ones doing the math on long-cycle design risk, differentiation strategy, and supplier alignment — this shift is no longer abstract. It’s something to plan around now, before the next design cycle commits you to a path.

Where This Discussion Is Happening

These questions are already being worked through in real silicon programs. RISC-V Now! by Andes is focused on exactly this transition — real deployments, real tradeoffs, and lessons from teams who have already made the move to production. Not standards. Not theory. What’s actually shipping, what worked, and what didn’t. If you’re doing serious roadmap evaluation, this is the room to be in.

Learn more and register at www.riscv-now.com.

Also Read:

RISC-V Now! — Where Specification Meets Scale!

The Evolution of RISC-V and the Role of Andes Technology in Building a Global Ecosystem

The Launch of RISC-V Now! A New Chapter in Open Computing



An Upper Bound on Effective Quantum Computation?
by Bernard Murphy on 04-07-2026 at 6:00 am


You may think that quantum theory is fully understood but that view is not quite right. There remain open questions around the uncertainty principle, wave-particle duality, measurement collapse, and harmonizing quantum mechanics and gravitation. These concerns may seem very abstract and irrelevant to everyday applications but together promote a lingering sense that we still don’t fully understand quantum mechanics. Efforts to correct our understanding have been with us for over 100 years, starting with Einstein, Born and many others.

One such proposal in a recent paper could have rather dramatic consequences for quantum computing. The method used to address gaps in our understanding suggests that there might be a theoretical upper bound to the number of qubits that can be usefully superposed and/or entangled at any one time. If true, industrial cryptography may never be cracked by a quantum computer. The paper is available from the Proceedings of the National Academy of Sciences, though this is an easier read. Below is my attempt to abstract the key ideas. Apologies up-front – this is a geeky blog.

Clarification

The paper doesn’t suggest a limit on the number of qubits we can stuff into a quantum computer. That number might be practically bounded but so far is not theoretically bounded. The limit proposed in this paper is on how large a set of interdependent qubits is possible in a quantum algorithm.

Superposition/entanglement is table stakes for any useful quantum algorithm. You may be able to compute with a larger number of qubits, but not any faster than a classical computer if you don’t use superposition or entanglement. Interdependence between qubits is fundamental to quantum advantage.

Rethinking space

There is a widely held (not universal) view in physics that to unify quantum theory and general relativity we must switch from continuous space representations to a view that space is not arbitrarily divisible. There are physical limits (the Planck length) on continuity, and uncertainty and wave-particle duality start to look more reasonable in discrete space. Further, in quantum computing an information-theoretic view of qubit states is appealing following Shannon, and this too works most comfortably in discrete space.

An N-qubit vector should in principle be able to address any arbitrary state in the full possible state space of that vector. Remember qubit states are slightly more involved than regular bit states. A bit can be 0 or 1. A qubit can be α|0> + β|1> where |0> and |1> are “pure” quantum states like spin-up and spin-down, and α, β are complex amplitudes such that |α|² + |β|² = 1. In continuous space quantum theory this formulation can represent any possible state in the full state space of a qubit; similarly, an N-qubit vector can represent any state in the N-qubit space. However, if space is discretized in some manner, this guarantee can no longer be provided according to the paper, and there is an upper limit to how many states can be addressed by an N-qubit vector.

If space is discretized, each qubit’s α and β will have a finite range of possible values. Extending to an N-qubit vector, there will similarly be a finite bound on how many states can be encoded within the state space. Less obviously, there will be in-principle “reachable” states in the state space which fail to meet the discretization rules and are therefore undefined/unreachable. As you add qubits, the state space expands exponentially, as do the unreachable states, and quantum advantage begins to tail off. The author suggests that the absolute upper limit N for which a qubit vector could effectively address only legal states is 1000 qubits, and that practical limits could be even lower.
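
To get a feel for the counting argument, here is a toy sketch under one possible discretization assumption (real rational amplitudes with a bounded denominator, in the spirit of the rational-number illustration the author mentions below); it is not the paper’s construction, and it ignores complex phase entirely.

```python
# Illustrative only: count real rational amplitude pairs (a, b) with a = p/q,
# b = r/q and q <= q_max that satisfy the normalization a^2 + b^2 = 1 exactly.
# This toy discretization (real amplitudes, one qubit) just shows how sparse
# the "legal" states are on the grid; it is not the paper's construction.
from fractions import Fraction

def legal_states(q_max):
    states = set()
    for q in range(1, q_max + 1):
        for p in range(-q, q + 1):
            r2 = q * q - p * p                  # need p^2 + r^2 = q^2
            r = int(r2 ** 0.5)
            for cand in (r - 1, r, r + 1):      # guard against float rounding
                if cand >= 0 and cand * cand == r2:
                    states.add((Fraction(p, q), Fraction(cand, q)))
                    states.add((Fraction(p, q), Fraction(-cand, q)))
    return states

for q_max in (10, 100, 1000):
    grid_points = (2 * q_max + 1) ** 2          # naive (p, r) grid, for scale
    print(f"q <= {q_max}: {len(legal_states(q_max))} normalized states "
          f"out of ~{grid_points:,} candidate grid points")
```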

This 1,000-qubit limit is well below any qubit count I have seen suggested (>10^4) for Shor or beyond-Shor algorithms applied to RSA 2048. If the theory is correct, this is a significant limitation. Many useful applications may still be possible, especially in quantum chemistry and materials science, but problem sizes will be much more constrained than we have been led to believe.

It’s just a theory

True, although the theory proposes intriguing resolutions to several of the open quantum questions I mentioned earlier. The paper suggests a practical test for a limit, which should be possible within the next 5-10 years. Simply run Shor’s algorithm attempting to factor a large integer on an N-qubit machine (logical qubits). If the proposed limit is real, performance should saturate to classical levels beyond some threshold value of N. The paper suggests saturation may start as low as 500 qubits. If quantum advantage disappears or starts to disappear around this point, we will have hit a fundamental barrier in quantum computing. If not, then the theorists must go back to the drawing board.

Incidentally, the author illustrates his reasoning using rational number discretization, though stresses that his primary conclusion should hold (if correct) independent of that choice.

I will be interested to hear what quantum computer builders and theorists have to say about this.

Also Read:

Another Quantum Topic: Quantum Communication

PQShield on Preparing for Q-Day

Where is Quantum Error Correction Headed Next?