Chiplets: Powering the Next Generation of AI Systems
by Kalar Rajendiran on 10-23-2025 at 10:00 am

Arm Synopsys at Chiplet Summit

AI’s rapid expansion is reshaping semiconductor design. The compute and I/O needs of modern AI workloads have outgrown what traditional SoC scaling can deliver. As monolithic dies approach reticle limits, yields drop and costs rise, while analog and I/O circuits gain little from moving to advanced process nodes. To sustain performance growth, the industry is turning to chiplets—modular, scalable building blocks for multi-die designs that redefine how high-performance systems are built.

Why Chiplets Matter

Multi-die designs, which IDTechEx forecasts will become a $411 billion market by 2035, divide large SoC functions into smaller, reusable dies (also called chiplets) that can be integrated into a single system-in-package (SiP). These chiplets may be heterogeneous, or homogeneous copies that replicate cores for scaling. SiPs can rely on standard organic substrates or advanced interposers that enable dense interconnects and greater functionality within a compact footprint.

The vision is an open marketplace where designers can mix and match chiplets from multiple suppliers. Beyond JEDEC’s HBM memory modules, however, widespread adoption of off-the-shelf chiplets has been limited by fragmented standards and use cases. Progress continues with UCIe (Universal Chiplet Interconnect Express), Arm’s Chiplet System Architecture (CSA), and new industry collaborations aimed at breaking these barriers.

System Partitioning and Process Node Choices

The first step in chiplet design is deciding how to partition system functions. Compute, I/O, and memory blocks can each be implemented on the process node that offers the best balance of power, performance, and cost. For example, an AI compute die benefits from the latest node, while SRAM or analog functions may be built on less advanced—and less expensive—nodes.

Latency and bandwidth demands guide how these blocks connect. A 2.5D interposer may provide sufficient performance, but latency-sensitive systems sometimes require 3D stacking, as seen in AMD’s Ryzen 7000X3D processors, where compute and cache are vertically integrated for faster data access.

Designing Die-to-Die Connectivity

Interconnect performance defines chiplet success. UCIe has become the industry’s preferred die-to-die standard, offering configurations for both cost-efficient organic substrates and high-density silicon interposers. Designers must weigh data rates, lane counts, and bump pitch to achieve the right mix of bandwidth, area, and power.

AI I/O chiplets, for instance, may require UCIe links supporting 16G–64G data rates to maintain low-latency communication with compute dies. Physical layout choices for interface IPs—single-row or double-stacked PHYs—affect the beachfront area available for die-to-die interfaces and influence both area efficiency and design complexity.

Bridging UCIe’s streaming interface with on-chip protocols such as AXI, Arm CXS, or PXS is also key to maximizing throughput and minimizing wasted bandwidth.
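
To make those trade-offs concrete, here is a minimal back-of-the-envelope sketch (in Python) of how per-lane data rate, lane count, and module count combine into raw die-to-die bandwidth. The module widths used below (16 lanes for standard package, 64 lanes for advanced package) are commonly cited UCIe configurations, and the numbers ignore protocol and sideband overhead, so treat this as an illustration rather than a datasheet figure.

```python
def ucie_raw_bandwidth_gbytes(data_rate_gtps: float, lanes: int, modules: int = 1) -> float:
    """Raw per-direction bandwidth in GB/s for a die-to-die link.

    data_rate_gtps: per-lane signaling rate in GT/s (e.g. 16, 32, or 64)
    lanes: lanes per module (e.g. 16 standard package, 64 advanced package)
    modules: number of modules placed along the available beachfront
    """
    gbits_per_second = data_rate_gtps * lanes * modules
    return gbits_per_second / 8.0  # convert bits to bytes

# Illustrative (assumed) configurations:
print(ucie_raw_bandwidth_gbytes(32, lanes=16))  # standard package module:  64.0 GB/s
print(ucie_raw_bandwidth_gbytes(32, lanes=64))  # advanced package module: 256.0 GB/s
```

Doubling the data rate or the module count scales the raw number linearly, but the usable figure also depends on the bump pitch and beachfront choices discussed above.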

Advanced Packaging and Integration

Packaging now sits at the heart of semiconductor innovation. Designers must choose between lower-cost organic substrates and denser 2.5D or 3D approaches. Silicon interposers deliver unmatched interconnect density but come with size and cost constraints. Emerging RDL (Redistribution Layer) interposers provide a balanced alternative—supporting larger system integration at reduced cost. Typical bump pitches range from 110–150 microns for substrates to 25–55 microns for interposers, shrinking further for 3D stacks.

Thermal, mechanical, and power-integrity challenges grow as multiple chiplets share one package. Early co-design across silicon and packaging domains is essential. Testability must also be planned in advance, using the IEEE 1838 standard and multi-chiplet test strategies to ensure known-good-die (KGD) quality before assembly.

Securing and Verifying Multi-Die Designs

With multiple chiplets, the attack surface widens. Each chiplet must be authenticated and protected through attestation and secure boot mechanisms. Depending on the application, designers may integrate a root of trust to manage encryption keys or isolate sensitive workloads.

Data in transit must be secured using standards such as PCIe and CXL Integrity and Data Encryption (IDE), DDR inline memory encryption (IME), or Ethernet MACsec. Verification is equally critical: full-system simulation, emulation, and prototyping are required to validate die interactions before fabrication. Virtual development environments enable parallel software bring-up, shortening time-to-market.

Synopsys and Arm: Simplifying AI Chip Design

AI accelerators bring these challenges into sharp focus. They demand enormous compute density, massive bandwidth, and efficient integration across heterogeneous dies. To address this complexity, Synopsys and Arm—long-time collaborators—are combining their expertise to streamline AI and multi-die development.

At Chiplet Summit 2025, Synopsys VP of Engineering Abhijeet Chakraborty and Arm VP of Marketing Eddie Ramirez discussed how their companies are reducing design risk and speeding delivery. Under Arm Total Design, Arm’s Neoverse Compute Subsystems (CSS) are now pre-validated with Synopsys IP, while Synopsys has expanded its Virtualizer prototyping solution and Fusion Compiler quick-start flows for the Arm ecosystem. These integrations let customers implement Arm compute cores more efficiently, validate designs earlier, and begin software development long before silicon arrives.

“We feel we are barely scratching the surface,” Chakraborty noted. “There’s a lot more work we can do in this space.” Both leaders emphasized that interoperability, reliability, and security will remain top priorities as chiplet ecosystems evolve.

The Road Ahead

The semiconductor industry is shifting from monolithic to modular design. Continued progress will depend on collaboration, standardization, and shared innovation across companies and ecosystems. With Synopsys advancing chiplet standards, design flows, and verified IP subsystems, the path from concept to production is becoming faster and more predictable. Customers can focus on their core competencies while offloading other aspects of the design to domain experts, enabling fast and reliable time-to-market.

The next generation of AI systems won’t rely on bigger chips—they’ll be built from smarter, interconnected chiplets, delivering scalable performance, efficiency, and flexibility for the most demanding compute workloads of the future.

 


Better Automatic Generation of Documentation from RTL Code
by Tom Anderson on 10-23-2025 at 6:00 am

Specador Doc

One technical topic I always find intriguing is the availability of links between documentation and chip design. It used to be simple: there weren’t any. Architects wrote a specification (spec) in text, in Word if they had PCs, or using “troff” or a similar format if they were limited to Unix platforms. Then the hardware designers started drawing schematics or writing RTL code, and the programmers did their thing. Verification and validation were all about making sure everything worked together.

Whenever the specification changed, manual updates to the hardware and software were required. When implementation issues caused the design to differ from the original intent, the impact was rarely reflected back in the spec. Thus, when it came time to produce documentation for the end user, it was a lot of manual work to combine bits of the spec, update to reflect the actual design, and add explanatory material for the target audience.

These days, we have links galore. There are many ways to generate hardware and software code from various specification and documentation formats, plus methods to generate documentation from source code. The former is not a new idea. I remember working on a new processor design around 1988-1989 in which the details of the instruction set changed numerous times. I wrote an “awk” script to automatically generate the Verilog RTL design for the instruction decoder based on a tabular representation of the opcodes and their meanings.

These days, there are a lot of pieces in a typical chip that can be generated from various specification formats, such as registers from IP-XACT or SystemRDL and state machines from transition tables. We’re starting to see generative AI spit out even larger chunks of the design, essentially based on natural language specifications in the form of chats. Being able to regenerate RTL code whenever a specification changes saves a great deal of time and effort over the course of a chip project.
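
As a concrete illustration of that generation flow (a hypothetical Python sketch in the spirit of the author’s awk script, not the original), a small opcode table can be expanded into a Verilog decoder so that the RTL is regenerated whenever the specification table changes. The opcodes and signal names below are invented for the example.

```python
# Hypothetical spec table: (mnemonic, opcode pattern, decoded action)
OPCODES = [
    ("ADD",  "6'b000001", "alu_op = ALU_ADD;"),
    ("SUB",  "6'b000010", "alu_op = ALU_SUB;"),
    ("LOAD", "6'b000100", "mem_rd = 1'b1;"),
]

def emit_decoder(table):
    """Expand the table into a Verilog case statement for the instruction decoder."""
    lines = ["always @(*) begin", "  case (opcode)"]
    for name, code, action in table:
        lines.append(f"    {code}: begin {action} end  // {name}")
    lines += ["    default: ;", "  endcase", "end"]
    return "\n".join(lines)

print(emit_decoder(OPCODES))  # redirect into a file that the RTL includes
```

Rerunning a generator like this after every spec change is what keeps the decoder and the opcode table from drifting apart.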

Generating documentation from code is also not a new idea. Solutions in this space started on the software side, with tools such as Doxygen. The idea is that certain aspects of the documentation can be generated automatically from the code, with pragmas or some other in-line mechanism available for programmers to control the generation and add content. Numerous options for document generation are now available, with AI-based techniques quickly gaining acceptance. Being able to regenerate documentation every time the source code changes also saves a lot of time and eliminates a lot of manual effort.
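
To show the opposite direction of the flow (code in, documentation out), here is a deliberately crude, hypothetical Python sketch that scrapes Verilog module headers with a regular expression and prints a documentation stub. Real tools such as Doxygen or the products discussed below use full language parsers rather than pattern matching, so this is only meant to illustrate the idea.

```python
import re

VERILOG_SRC = """
// Two-flop synchronizer for clock domain crossings (illustrative example)
module sync2 (input wire clk, input wire rst_n, input wire d, output reg q);
endmodule
"""

def document_modules(src: str) -> str:
    """Emit a simple documentation stub for each commented module in the source."""
    doc = []
    for comment, name, ports in re.findall(
            r"//\s*(.*?)\n\s*module\s+(\w+)\s*\((.*?)\);", src, re.S):
        doc.append(f"Module: {name}")
        doc.append(f"  Description: {comment.strip()}")
        doc.extend(f"  Port: {p.strip()}" for p in ports.split(","))
    return "\n".join(doc)

print(document_modules(VERILOG_SRC))
```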

From what I can gather, documentation generation from code is very common in the software world. However, I’ve been surprised by how few hardware projects embrace this approach in a big way. I often hear designers and verification engineers complain that languages such as SystemVerilog are not as well supported by shareware documentation tools. They also say that they don’t have the level of control needed to get the quality of results their end users demand.

AMIQ EDA has a commercial product, Specador Documentation Generator, focused on hardware design and verification code. I figured that there must be some good reasons why their users chose this solution over free utilities, so I chatted with CEO Cristian Amitroaie. The first thing he said was that Specador was created with hardware engineers in mind. It supports source code written in SystemVerilog, Verilog, VHDL, the e language, and more. It covers the RTL design plus the verification testbench, components, models, and tests. It generates both PDF and HTML output.

To me, the most impressive aspect of Specador is that it leverages all the language knowledge available in the AMIQ EDA Design and Verification Tools (DVT) suite. Their front end compiles all the design and verification code and builds a flexible internal model. Users of the integrated development environment DVT IDE can easily browse, edit, and understand the code, and even query the model with AI Assistant.

Understanding the design means, for example, that users can generate design hierarchies, schematics, and state machine diagrams. Since the DVT tools also understand the Universal Verification Methodology (UVM), users can generate class or component diagrams including TLM connections, cross-linked class inheritance trees, and other useful forms of documentation for the testbench. My choice of the word “documentation” here is deliberate, because many of the design and verification diagrams that users might generate within the IDE are also useful as part of user manuals and other chip documentation.

Cristian stressed that Specador (like all their products) uses accurate language parsers to compile the code so that it understands the project structure. Users can employ it to document design or verification environments, even when comments are not present to provide additional context. Of course, Specador also supports the ability to use comments to format documentation and to add content that can’t be inferred from the source code.

I asked Cristian what’s new in Specador, and he mentioned that AMIQ EDA keeps implementing new features and enhancements based on customer feedback. For example, they recently added the ability to quickly preview the documentation directly in the IDE, the ability to apply custom filters when generating schematic or FSM diagrams, the ability to work with Markdown and reStructuredText markup languages, and last but not least the ability to generate documentation using their AI Assistant.

Specador makes it possible for design and verification engineers to easily create and maintain proper and well-organized documentation. Users can control what documentation they create by filtering or selecting elements in the design and testbench. They can quickly embed or link to external documentation. Specador integrates easily into existing development flows, allowing design and verification groups to automate the documentation process.

Above all, Specador keeps the generated documentation in sync with the source code, saving a great deal of maintenance time and effort as the code evolves. I thank Cristian for his time, and recommend looking at the product information, exploring the documentation generated for the Ibex embedded 32-bit RISC-V CPU core, and reading a post on real-world user experience to learn more.

Also Read:

2025 Outlook with Cristian Amitroaie, Founder and CEO of AMIQ EDA

Adding an AI Assistant to a Hardware Language IDE

Writing Better Code More Quickly with an IDE and Linting

 

 


FD-SOI: A Cyber-Resilient Substrate for Secure Automotive Electronics
by Daniel Nenni on 10-22-2025 at 10:00 am


This Soitec white paper highlights how Fully Depleted Silicon-On-Insulator (FD-SOI) technology provides a robust defense against Laser Fault Injection (LFI), a precise, laboratory-grade attack method that can compromise cryptographic and safety-critical hardware. As vehicles become increasingly digital and connected, with dozens of microcontrollers and over-the-air updates, hardware-level security has become central to automotive cybersecurity standards such as ISO/SAE 21434.

The Rising Threat of Physical Fault Attacks

Physical fault injection attacks (FIA) can bypass secure boot, unlock protected debug ports, and disrupt program flow. Among these, LFI stands out for its precision, using tightly focused near-infrared laser pulses to flip bits or alter circuit timing. While voltage and electromagnetic glitches can occur in the field, LFI remains the gold standard for systematically probing silicon vulnerabilities in controlled laboratory conditions.

As front-side access becomes harder due to thicker metal layers and shielding, back-side laser access through the substrate is increasingly used. This shift makes substrate engineering—the physical foundation of a chip—a critical security factor.

Why FD-SOI Disrupts Laser Attack Mechanisms

FD-SOI differs from bulk CMOS in that its transistors are built on an ultra-thin silicon layer electrically isolated from the main wafer by a buried oxide (BOX). This structural difference eliminates the main LFI fault mechanisms found in bulk silicon.

Four dominant bulk mechanisms are neutralized by FD-SOI:

  1. Drain/body charge collection – FD-SOI’s thin silicon layer and BOX barrier dramatically reduce the photocurrent that lasers generate at PN junctions.

  2. Laser-induced IR-drop – In bulk CMOS, current loops between wells and substrate can cause transient voltage drops. FD-SOI, using isolated body-bias networks, removes this conduction path.

  3. Substrate diffusion and funneling – Charge carriers cannot spread vertically through the BOX, preventing multi-cell upsets and latch-up.

  4. Parasitic bipolar amplification – Only a weak, lateral bipolar effect remains in FD-SOI, which can be further mitigated using reverse body-bias (RBB) to raise the laser energy threshold.

By blocking substrate conduction and confining active regions, FD-SOI significantly reduces the area and energy range vulnerable to laser faults.

Experimental Validation

Experiments comparing 22FDX FD-SOI and 28 nm bulk CMOS devices, including D flip-flops, SRAMs, and AES/ECC crypto cores, confirmed the theoretical advantages. In tests, FD-SOI required up to 150× more laser shots to produce the same fault observed in bulk devices. The time-to-first-fault rose from roughly ten minutes to ten hours, while the minimum fault energy threshold increased from 0.3 W to over 0.5 W.

Spatial and depth-sensitivity mapping showed that bulk silicon has wide fault-prone zones, while FD-SOI faults are confined to sub-micron “hotspots” with a narrow focal depth of only about ±1 µm. Attackers must therefore perform ultra-fine spatial scans, drastically increasing effort and cost.

Furthermore, excessive laser power in FD-SOI caused permanent damage or stuck bits, effectively creating a natural deterrent: aggressive attempts could destroy the target device.

Implications for Automotive Security Compliance

In the ISO/SAE 21434 framework, reducing attack likelihood directly lowers cybersecurity risk. FD-SOI’s physical resilience therefore simplifies compliance and can help products achieve Common Criteria or SESIP assurance levels (EAL4+ or higher) without extensive additional countermeasures. Because attack duration, equipment complexity, and expertise all increase, FD-SOI provides a quantifiable uplift in assurance for automotive OEMs and tier-one suppliers.

Toward a Next-Generation Secure Substrate

The authors envision extending FD-SOI’s benefits through substrate-level innovation, transforming it from a passive platform into an active cyber-resilient layer. Two emerging techniques are highlighted:

  1. Buried optical barriers—highly doped layers under the BOX that absorb or scatter infrared light, reducing LFI energy transmission while enabling anti-counterfeit watermarking.

  2. Integrated sensors and PUFs (Physically Unclonable Functions): substrate-embedded monitors that detect tampering or derive unique cryptographic identities from manufacturing variations.

Together, these innovations could allow the substrate to detect attacks, react in real time, and cryptographically bind the silicon identity to the vehicle platform.

Bottom line: FD-SOI represents a material-level breakthrough in hardware security. By eliminating substrate pathways exploited in bulk CMOS, it narrows the laser fault window, increases attack complexity, and provides tunable resilience through body-bias control. These benefits align directly with evolving automotive cybersecurity regulations, offering faster certification and lower system costs.

As substrate engineering continues toward integrated optical barriers and anti-tamper features, FD-SOI is poised to become the reference platform for secure automotive electronics, anchoring trust at the silicon level.

Read the full white paper here.

Also Read:

Soitec’s “Engineering the Future” Event at Semicon West 2025

How FD-SOI Powers the Future of AI in Automobiles

Powering the Future: How Engineered Substrates and Material Innovation Drive the Semiconductor Revolution


Podcast EP312: Approaches to Advance the Use of Non-Volatile Embedded Memory with Dave Eggleston
by Daniel Nenni on 10-22-2025 at 8:00 am

Daniel is joined by Dave Eggleston, senior business development manager at Microchip with a focus on licensing SST SuperFlash technology. Dave’s extensive background in Flash, MRAM, RRAM, and storage is built on 30+ years of industry experience. This includes serving as VP of Embedded Memory at GLOBALFOUNDRIES, CEO of RRAM pioneer start-up Unity Semiconductor (acquired by Rambus), Director of Flash Systems Engineering at Micron, NVM Product Engineering manager at SanDisk, and NVM Engineer at AMD. Dave is frequently invited to speak at international conferences as an expert on emerging NVM technologies and their applications and has been granted 25+ NVM-related patents.

Dan explores the requirements of embedded non-volatile memory (NVM) for application in 32-bit microcontrollers with Dave, who provides a broad overview of the many markets served by these technologies. He describes the challenges of integrating NVM as process nodes advance. Dave explains the benefits of Microchip’s SST SuperFlash technology and also discusses the cost and time-to-market benefits of using a chiplet approach to add NVM.

Dave also touches on the recently announced strategic collaboration between Microchip/SST and Deca Technologies to innovate a comprehensive NVM chiplet package to facilitate customer adoption of modular, multi-die systems.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Learning from In-House Datasets
by Bernard Murphy on 10-22-2025 at 6:00 am

Training in a Constrained Environment

At a DAC Accellera panel this year there was some discussion on cross-company collaboration in training. The theory is that more collaboration would mean a larger training set and therefore higher accuracy in GenAI (for example in RTL generation). But semiconductor companies are very protective of their data and reports of copyrighted text being hacked out of chatbots do nothing to allay their concerns. Also, does evidence support that more mass training leads to more effective GenAI? GPT5 is estimated to have been trained on 70 trillion tokens versus GPT4 at 13 trillion tokens, yet GPT5 is generally viewed as unimpressive, certainly not a major advance on the previous generation. Maybe we need a different approach.

More training or better focused training?

A view gathering considerable momentum is that while LLMs do an excellent job in understanding natural language, domain-specific expertise is better learned from in-house data. While this data is obviously relevant, clearly there’s a lot less of it than in the datasets used to train big GenAI models. A more thoughtful approach is necessary to learn effectively from this constrained dataset.

Most/all approaches start with a pre-trained model (the “P” in GPT) since that already provides natural language understanding and a base of general knowledge. New methods add to this base through fine-tuning. Here I’ll touch on labelling and federated learning methods.

Learning through labels

Labeling harks back to the early days of neural nets, where you provided training pictures of dogs labelled “dog” or perhaps the breed of dog. The same intent applies here except you are training on design data examples which you want a GenAI model to recognize/classify. Since manually labeling large design datasets would not be practical, recent innovation is around semi-automated labeling assisted by LLMs.

Some large enterprises outsource this task to value-added service providers like Scale.com, which deploy large teams of experts using their internal tools to develop labeling, using supervised fine-tuning (SFT) augmented by reinforcement learning from human feedback (RLHF). Something important to understand here is that labeling is GenAI-centric. You shouldn’t think of labels as tags on design data features, but rather as fine-tuning additions to the GenAI model (attention, etc.) generated from training question/answer (Q/A) pairs expressed in natural language, where answers include supporting explanations, perhaps augmented by content for RAG.

In EDA this is a very new field as far as I can tell. The topic comes up in some of the papers from the first International Conference on LLM-Aided Design (LAD) held this year at Stanford. One such paper works around the challenge of getting enough expert-generated Q/A pairs by generating synthetic pairs through LLM analysis of unlabeled but topic-appropriate documents (for example on clock domain crossings). This they augment with few-shot learning based on whatever human expert Q/A pairs they can gather.
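
To make the shape of such training data concrete, a single synthetic SFT record might look like the following minimal sketch. The field names, topic, and wording are invented for illustration and are not taken from the LAD paper; in practice a human expert would review records like this before they are used for fine-tuning.

```python
import json

# Hypothetical synthetic Q/A record in a common instruction-tuning JSONL style.
record = {
    "topic": "clock domain crossing",
    "question": "Why does a single-bit signal crossing into a faster clock domain "
                "need a two-flop synchronizer?",
    "answer": "The first flop can go metastable when the asynchronous input violates "
              "setup/hold timing; the second flop gives that value a full cycle to "
              "resolve before downstream logic uses it.",
    "source": "synthetic: generated by an LLM from an unlabeled CDC application note",
}

with open("cdc_sft.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```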

You could imagine using similar methods for labeling around other topics in design expertise: low-power design, secure design methods, optimizing synthesis, floorplanning methods and so on. While attention in the papers I have read tends to focus on using this added training to improve RTL generation, I can see more immediate value in verification, especially in static verification and automated design reviews.

Federated Learning

Maybe beyond some threshold more training data isn’t necessarily better. But the design data available within any single design enterprise probably doesn’t yet suffer from that problem, and more data could still help, if we could figure out how to combine learning from multiple enterprises without jeopardizing the security of each proprietary dataset. This is a common need across many domains where webcrawling for training data is not permitted; medical and defense data are two obvious examples.

Instead of bringing data to the model for training, Federated Learning sends an initial model from a central site (the aggregator) to individual clients and develops fine-tuning training in the conventional manner within each client’s secure environment. When training is complete, only the trained parameters are sent back to the aggregator, which harmonizes the inputs from all clients and then sends the refined model back to the clients. This process iterates, terminating when the central model converges.
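
The loop described above is essentially federated averaging. Here is a minimal numpy sketch of it, with the model reduced to a bare parameter vector and the clients’ local training stubbed out; it illustrates the control flow, not any particular Federated Learning framework.

```python
import numpy as np

def local_update(global_params: np.ndarray, seed: int) -> np.ndarray:
    """Stand-in for fine-tuning on one client's private data (the data never leaves the client)."""
    rng = np.random.default_rng(seed)
    return global_params - 0.1 * rng.normal(size=global_params.shape)  # pretend training step

def federated_round(global_params: np.ndarray, client_seeds) -> np.ndarray:
    # 1. Aggregator sends the current model to each client.
    # 2. Each client trains locally and returns only its updated parameters.
    client_params = [local_update(global_params, s) for s in client_seeds]
    # 3. Aggregator harmonizes the inputs (a plain average here; weighted by
    #    client dataset size in practice) and redistributes the refined model.
    return np.mean(client_params, axis=0)

params = np.zeros(8)
for _ in range(5):  # iterate until the central model converges
    params = federated_round(params, client_seeds=[1, 2, 3])
```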

There are some commercial platforms for Federated Learning, as well as open-source options from some big names: TensorFlow Federated from Google and NVIDIA FLARE are two examples. Google Cloud and IBM Cloud offer Federated Learning support, while Microsoft supports open-source Federated Learning options within Azure.

This method could be quite effective in the semiconductor space if a central AI platform or consortium could be organized to manage the process. And if a critical mass of semiconductor vendors is prepared to buy in 😀.

Perhaps the way forward for learning in industries like ours will be through a combination of these methods – federated learning as a base layer to handle undifferentiated expertise and labeled learning for continued differentiation in more challenging aspects of design expertise. Definitely an area to watch!

Also Read:

PDF Solutions Calls for a Revolution in Semiconductor Collaboration at SEMICON West

The AI PC: A New Category Poised to Reignite the PC Market

Webinar – The Path to Smaller, Denser, and Faster with CPX, Samtec’s Co-Packaged Copper and Optics


ASU Silvaco Device TCAD Workshop: From Fundamentals to Applications
by Daniel Nenni on 10-21-2025 at 10:00 am

Silvaco ASU Workshop

The ASU-Silvaco Device Technology Computer-Aided Design Workshop is a pivotal educational and professional development event designed to bridge the gap between theoretical semiconductor physics and practical device engineering. Hosted by Arizona State University in collaboration with Silvaco, a leading provider of TCAD software, this workshop offers participants a comprehensive exploration of semiconductor device simulation, from foundational concepts to advanced applications. Spanning topics such as device physics, process simulation, and real-world design challenges, the workshop equips engineers, researchers, and students with the tools to innovate in the rapidly evolving field of microelectronics.

The workshop typically begins with an introduction to TCAD fundamentals, emphasizing the role of simulation in modern semiconductor design. Participants learn how TCAD tools model the electrical, thermal, and optical behavior of devices at the nanoscale. Silvaco’s suite of software, including Atlas, Victory Process, and DeckBuild, is introduced as a powerful platform for simulating semiconductor fabrication and performance. These tools allow users to predict device behavior under various conditions, optimize designs, and reduce the need for costly physical prototyping. The foundational sessions cover key concepts like carrier transport, quantum effects, and material properties, ensuring attendees grasp the physics underpinning TCAD simulations.

As the workshop progresses, it delves into practical applications, demonstrating how TCAD is used in industries such as integrated circuits, power electronics, and photovoltaics. Participants engage in hands-on sessions, guided by ASU faculty and Silvaco engineers, to simulate processes like doping, oxidation, and lithography. These exercises highlight how TCAD can optimize fabrication steps, improve yield, and enhance device reliability. For instance, attendees might simulate a MOSFET’s performance to analyze parameters like threshold voltage or leakage current, gaining insights into design trade-offs. The workshop also covers advanced topics, such as modeling FinFETs, tunnel FETs, or emerging 2D materials like graphene, reflecting the cutting-edge needs of the semiconductor industry.

A key strength of the ASU-Silvaco workshop is its emphasis on bridging academia and industry. ASU’s expertise in semiconductor research, combined with Silvaco’s industry-standard tools, creates a unique learning environment. Participants, ranging from graduate students to seasoned engineers, benefit from real-world case studies, such as optimizing power devices for electric vehicles or designing low-power chips for IoT applications. The collaborative setting fosters networking, enabling attendees to connect with peers and experts, potentially sparking future research or career opportunities.

By the workshop’s conclusion, participants gain a robust understanding of TCAD’s role in accelerating innovation. They leave equipped with practical skills to simulate and analyze semiconductor devices, as well as an appreciation for how these tools address challenges like scaling, power efficiency, and thermal management. The ASU-Silvaco Device TCAD Workshop stands out as a vital platform for advancing semiconductor expertise, empowering attendees to contribute to the next generation of electronic devices in a world increasingly driven by technology.

Register Here

About Silvaco
Silvaco is a provider of TCAD, EDA software, and SIP solutions that enable semiconductor design and digital twin modeling through AI software and innovation. Silvaco’s solutions are used for semiconductor and photonics processes, devices, and systems development across display, power devices, automotive, memory, high performance compute, foundries, photonics, internet of things, and 5G/6G mobile markets for complex SoC design. Silvaco is headquartered in Santa Clara, California, and has a global presence with offices located in North America, Europe, Egypt, Brazil, China, Japan, Korea, Singapore, Vietnam, and Taiwan. Learn more at silvaco.com.

Also Read:

GaN Device Design and Optimization with TCAD

Simulating Gate-All-Around (GAA) Devices at the Atomic Level

Silvaco: Navigating Growth and Transitions in Semiconductor Design


PDF Solutions Calls for a Revolution in Semiconductor Collaboration at SEMICON West
by Mike Gianfagna on 10-21-2025 at 6:00 am


SEMICON West was held in Phoenix, Arizona on October 7-9. This premier event brings the incredibly diverse global electronics supply chain together to address the semiconductor ecosystem’s greatest opportunities and challenges. The event’s tagline this year, “Stronger Together — Shaping a Sustainable Future in Talent, Technology, and Trade,” underscores the semiconductor industry’s commitment to collaboration in addressing key challenges and opportunities.

One of the participants at this event presented an approach and technology that form the foundation for global collaboration across the semiconductor industry. Let’s look at how PDF Solutions used the SEMICON West stage to call for a revolution in semiconductor collaboration.

PDF Solutions at the Show

PDF highlighted its game-changing technology during SEMICON West with product demos, daily booth presentations and speaking engagements in the SEMICON West program. Attendees at the PDF booth interacted with a virtual reality model of semiconductor manufacturing chamber equipment and the associated connection to PDF’s EquipmentTwin solution. Other demos illustrated the latest products that are shaping the future of the semiconductor industry.

A Keynote Aimed at Catalyzing Collaboration

John Kibarian

John Kibarian, CEO of PDF Solutions, presented an important keynote during SEMICON West. His talk was titled Revolutionizing Semiconductor Collaboration: The Emergence of AI-Driven Industry Platforms.  

John pointed out that the semiconductor industry thrives on innovation, and this work is fueled by collaboration. But the nature of this collaboration is changing. The growing technology trend toward 3D and hybrid packaging is creating a larger and more complex global supply chain. He explained that the industry needs the ability to leverage AI at all levels to drive operational efficiency.

John used a couple of visuals to make his point. First, he illustrated the nature of collaboration under the current model, which is largely reactive and relies on simple linear handoffs that share relatively small amounts of data, as shown below.

Traditional collaboration model

He went on to point out the substantial changes that technologies such as 3D are bringing to the collaboration model. Some of those drivers are summarized in the graphic below.

He explained that what is needed is a new, AI-driven collaboration model that is depicted in the graphic at the top of this post. He detailed the collaboration that will be required both within the enterprise and across the global supply chain. The many aspects of the required collaboration across the supply chain are shown in the graphic below.

John then distilled the key building blocks or foundational capabilities required to build the platform needed to enable AI-driven collaboration across the industry:

  • A secure data infrastructure
  • Automated orchestration
  • AI agents

He explained that PDF Solutions has already deployed a secure data infrastructure to enable global collaboration. It is called secureWISE™, and it connects over 300 manufacturing locations and manages exabytes of data between fabs and OEMs. Over 100 OEMs are connected, and the system is ISO 27001 compliant. He reported there have been zero security breaches in over 20 years. An impressive statistic.

John then went on to describe how secureWISE provides secure remote connectivity for the semiconductor industry. Technologies used include end-to-end secure access across private networks, virus scanning for secure equipment software updates, and granular user permissions for remote control and optimization. With this secure remote-connection data infrastructure, equipment OEMs are able to remotely execute software upgrades and use collected data to deliver AI-based, value-added equipment optimization services. For fab operators, connecting their equipment with secureWISE increases fab uptime and improves operational efficiency.

This secure global data infrastructure is the foundation on which global collaboration can be executed, relying on automated orchestrations that align and abstract data across the supply chain to accelerate decisions. This is achieved by:

  • Using actionable detailed manufacturing data to drive business decisions
  • Automating business processes within and across enterprises
  • Deploying and orchestrating AI agents to automate and accelerate decisions

With this approach business decisions can be made much faster, using more timely and accurate data coming from across multiple applications and organizations.

Examples include:

  • Product Costing: Accurate and up-to-date costing information based on actual resource consumption
  • Order Status: Real-time updates on order status and yield
  • Quality: Rapid identification and isolation of at-risk materials
  • Test Flow: Automated test flow management
  • WIP: Real-time WIP tracking and management across the supply chain

In most manufacturing organizations, only a very small amount of the collected manufacturing data is actually used for analysis. PDF Solutions estimates that at most 5% of manufacturing data is used for meaningful analytics. To fully leverage and scale the use of AI, semiconductor companies need to be able to automate much of the analytics process while retaining human governance over this AI execution. Aspects of this work include:

  • Enforce Data Quality: Humans define standards; AI implements the data quality checks
  • Humans Set Collaboration Rules: To define collaboration principles, data sharing, and security boundaries
  • AI Executes at Scale: Operating autonomously within those boundaries, handling complex, high-volume tasks
  • Cross-Organization Governance: Fabs define vendor access and data protocols, while AI manages daily execution

An example John provided is guided analytics, a process that mines 100% of the data, 100% of the time, and automates up to 90% of the analysis. He went on to describe how seamless integration of data types is achieved. For example: hard bin, soft bin, parametric, PCM, test tools, and units per hour (UPH). This process delivers an issue-based flow that uses AI/ML-powered diagnosis, resulting in the ability to render data visualizations in seconds.

John went on to describe many valuable and high-impact capabilities that are enabled by AI agents including:

  • Predictive test to predict future test results so test dollars can be spent on units that need the most attention
  • Predictive burn-in to predict which burn-in will pass to eliminate costly and unnecessary burn-in runs
  • Predictive binning to predict which tests will fail – finding failures early saves time and money

John concluded the story with an overall collaboration vision and a call to action for the industry to step up its work in this area. The vision John presented, which provides a pathway to greater worldwide semiconductor innovation, is summarized below.

PDF Solutions collaboration vision

To Learn More

The work presented at SEMICON West illustrates the significant progress PDF Solutions is making to revolutionize semiconductor collaboration. John’s keynote underscored the importance of worldwide adoption of these new strategies. You can access John Kibarian’s keynote on the PDF Solutions website. You can learn more about what PDF is doing on SemiWiki here.  You can also explore the suite of integrated products offered by the company here.  And that’s how PDF Solutions calls for a revolution in semiconductor collaboration at SEMICON West.

Also Read:

PDF Solutions Adds Security and Scalability to Manufacturing and Test

PDF Solutions and the Value of Fearless Creativity

Podcast EP259: A View of the History and Future of Semiconductor Manufacturing From PDF Solution’s John Kibarian


The AI PC: A New Category Poised to Reignite the PC Market
by Jonah McLeod on 10-20-2025 at 10:00 am


The PC industry is entering its most significant transformation since the debut of the IBM PC in 1981. That original beige box ushered in a new era of productivity, reshaping how corporations and individuals worked, communicated, and created. More than four decades later, the AI PC is emerging as a new category — one that promises to reignite growth in a market that has otherwise plateaued. Where the IBM PC democratized computing power for spreadsheets, word processing, and databases, the AI PC integrates machine intelligence directly into the device, enabling capabilities once reserved for cloud data centers.

“If the IBM PC made computers personal, the AI PC makes them perceptive”

What Defines an AI PC

An AI PC isn’t just another laptop or desktop with more cores or a faster GPU. At its heart lies a dedicated Neural Processing Unit (NPU) or an equivalent accelerator, designed to handle machine learning and inference tasks efficiently. Apple was among the first to bundle AI capability into all its Macs via the Neural Engine in its M-series silicon. In 2025, nearly all Macs shipped qualify as AI PCs by default. On the Windows/x86 side, Intel and AMD are racing to deliver NPUs in their latest laptop platforms, though only about 30% of PCs shipping this year meet the ‘AI PC’ definition. Meanwhile, RISC-V vendors are entering the scene with experimental AI PCs, such as DeepComputing’s DC-ROMA II, proving that even open architectures are chasing this category.

This hardware shift is paired with software integration. AI PCs promise not just raw horsepower but contextual, on-device intelligence. They run large language models (LLMs), generative tools, transcription, translation, and real-time personalization — all locally, without depending exclusively on the cloud.

The IBM PC Parallel

The IBM PC, released in 1981, was revolutionary not because of its raw specs — an Intel 8088 processor, 16KB of RAM, and two floppy drives hardly impress today — but because of its timing and positioning. IBM gave business managers and knowledge workers their first taste of personal productivity at scale. VisiCalc and Lotus 1-2-3 spreadsheets became corporate staples, while WordPerfect transformed document workflows. The IBM PC became the catalyst for office automation and, eventually, the rise of the information economy.

The AI PC in 2025 carries a similar inflection point. Just as the IBM PC allowed managers to manipulate numbers without waiting for mainframe operators, the AI PC gives today’s professionals the power to run generative models, analyze vast data sets, and automate creative tasks directly from their desks. Where the IBM PC shifted power from IT departments to individual workers, the AI PC shifts power from centralized cloud servers back to the personal device.

How Apple Got the Jump on the x86 Crowd

Apple’s head start in AI PCs is the product of long-term bets that Intel and AMD were late to match. Apple began shipping a Neural Engine in iPhones as early as 2017. By the time the M1 arrived in Macs in 2020, Apple already had multiple generations of AI silicon in production. Intel’s first true NPU platform, Meteor Lake, didn’t appear until 2023, and AMD’s Ryzen AI chips landed around the same time.

Vertical integration gave Apple another edge. Because it controls silicon, operating system, and frameworks like Core ML and Metal, Apple can route workloads seamlessly across CPU, GPU, and NPU. The x86 ecosystem, fragmented between Microsoft, Intel, AMD, and dozens of OEMs, could not move nearly as fast.

Apple’s unified memory architecture offered a further advantage, eliminating costly data transfers that plague PCs where CPU and GPU have separate memory pools. And Apple made AI consumer-friendly: Apple Intelligence features in Mail, Notes, and Photos gave users visible, everyday value. Windows AI PCs, by contrast, still lean heavily on Microsoft’s Copilot features, many of which depend on cloud services.

By 2025, Apple had made AI hardware and features standard across its lineup. All Macs shipped are AI PCs, while only about a third of x86 PCs qualify.

The Day Apple Changed the Rules

The pivotal moment came at Apple’s Worldwide Developers Conference (WWDC) in June 2020. On stage, Tim Cook described it as a ‘historic day for the Mac’ and announced the company’s transition to Apple-designed silicon. He framed it as the next evolution in Apple’s decades-long control of its own hardware and software stack.

Johny Srouji, Apple’s Senior Vice President of Hardware Technologies, then took the stage to explain the architectural foundations of what would later be branded the M1. He described how Apple’s silicon journey — from iPhone to iPad to Apple Watch — had matured into a scalable architecture capable of powering Macs. Srouji highlighted the unification of CPU, GPU, and Neural Engine within a single system-on-chip, the adoption of a unified memory architecture, and a focus on performance per watt as the key to efficiency. He emphasized that Apple’s decision to design its own SoCs specifically for the Mac would give the platform a leap in power efficiency, security, and AI capability.

This keynote framed the M1 not just as a chip, but as the product of a strategic pivot — one that caught the x86 world by surprise and set Apple years ahead in the race to define the AI PC.

Inside Apple’s Secret AI Weapon

The Neural Processing Unit (NPU) in every modern Mac isn’t an ARM building block — it’s an Apple original design. Apple licenses the ARM instruction set for its CPUs but develops its own CPU, GPU, and NPU cores.

The Neural Engine debuted with the A11 Bionic in 2017, years before ARM’s Ethos NPU line, and has scaled up steadily to handle tens of trillions of operations per second. It was conceived and built in-house by Apple’s silicon team under Johny Srouji, tailored specifically for machine-learning inference. In the M1, the Neural Engine delivered 11 TOPS, while in the latest M4 generation it exceeds 35 TOPS.

The GPU inside the M1 was also a clean Apple design. Until the A10 generation, Apple licensed PowerVR graphics cores from Imagination Technologies, but starting with the A11 Bionic in 2017, Apple switched to its own custom GPU microarchitecture. The M1’s 8-core GPU, capable of 2.6 TFLOPS, is part of this lineage.

Taken together, Apple’s CPU, GPU, and Neural Engine represent a vertically integrated architecture — all custom, all Apple — with only the instruction set itself licensed from ARM. This tight ownership is what allows Apple to optimize across hardware and software, and it explains why the M-series has been able to leapfrog x86 designs in efficiency and AI capability.

Birth of The AI PC

Apple didn’t set out to build the M1 in 2010, but its path toward full integration began to take shape years earlier. When the company introduced its first in-house Neural Engine and custom GPU alongside its homegrown CPU in the A11 Bionic (2017), it effectively unified all three pillars of modern computation under one roof. From that point on, Apple’s silicon roadmap evolved with a clear long-term goal: to bring these independently perfected engines — CPU, GPU, and Neural Engine — onto a single fabric with shared memory and software control. The M1, unveiled in 2020, was the culmination of that decade-long convergence, transforming what had started as separate mobile components into a cohesive architecture optimized for both performance and efficiency across the entire Mac lineup.

While NVIDIA’s “aha moment” came in 2012, when its GPUs unexpectedly became the workhorses of the deep learning revolution, Apple was on a parallel but opposite trajectory. NVIDIA was scaling up for the cloud, harnessing GPU clusters for AI training, while Apple was scaling down — embedding intelligence directly into personal devices. Both arrived at AI through performance innovation rather than foresight: NVIDIA by discovering that its gaming chips excelled at matrix math, and Apple by realizing that machine learning could make mobile experiences smarter, faster, and more private. The convergence of these two paths — one born in the data center, the other in the palm of the hand — defined the modern era of AI computing.

Competing Compute Models—Apple vs. RISC-V

Apple’s M-series SoCs represent a tightly integrated approach. At the CPU level, Apple uses a combination of high-performance and efficiency cores, delivering strong single-thread performance while managing orchestration tasks at low power. Its GPU, designed in-house, is a tile-based architecture that handles both graphics and general-purpose compute. For AI-specific workloads, Apple includes its Neural Engine, a dedicated block capable of roughly 35 TOPS in its latest Macs, deeply integrated with Core ML and Apple Intelligence. Together, these components form a unified architecture with shared memory, which eliminates costly data transfers and optimizes performance for consumer-facing AI applications.

RISC-V vendors, by contrast, take a modular approach based on scalar, vector, and matrix engines. Scalar cores serve as the foundation for control and orchestration, while the RISC-V Vector Extension (RVV 1.0) provides scalable registers ranging from 128 to 1024 bits, ideal for SIMD tasks like convolution, dot-products, and signal processing. For high-intensity AI workloads, matrix engines (MMA) accelerate tensor math in formats such as INT8, FP8, and FP16, targeting operations like GEMM and transformer attention. Rather than a closed design, this modular architecture allows vendors to tailor solutions to specific needs and even adopt chiplet-based scaling across regions and partners.
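
As a rough illustration of the arithmetic these matrix engines accelerate (a minimal numpy sketch, not tied to any particular RVV or MMA implementation), a low-precision GEMM multiplies INT8 operands and accumulates into a wider integer type so the products do not overflow:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-128, 128, size=(4, 16), dtype=np.int8)   # e.g. activations
B = rng.integers(-128, 128, size=(16, 8), dtype=np.int8)   # e.g. weights

# Widen to int32 before multiplying; hardware matrix engines perform the same
# widening accumulation in a single tiled instruction.
C = A.astype(np.int32) @ B.astype(np.int32)
print(C.shape, C.dtype)  # (4, 8) int32
```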

The contrast between Apple and RISC-V is sharp. Apple delivers a closed but seamless integration, while RISC-V offers openness, extensibility, and flexibility. Apple’s ecosystem is polished and tightly controlled, with Core ML and Apple Intelligence driving user-facing features. RISC-V, on the other hand, still relies on maturing toolchains like LLVM, ONNX, and TVM, but its open model makes it attractive for sovereign compute initiatives and experimental AI PCs. Apple scales its approach within its own lineup, while RISC-V enables innovation across global vendors, offering a pathway that Apple’s ecosystem simply does not allow.

Why the x86 Crowd Got Caught Flat-Footed

The x86 ecosystem misjudged both the timing and the scope of the AI PC transition. For years, Intel and AMD assumed that AI workloads would remain concentrated in data centers and GPUs rather than extending into consumer PCs. Their product roadmaps and marketing focused on server accelerators and high-performance gaming graphics, while NPUs for laptops were treated as an afterthought.

This miscalculation was compounded by the industry’s preoccupation with the data center boom. As hyperscalers poured billions into AI infrastructure, chipmakers directed their attention toward GPUs and server CPUs, chasing growth where the margins were highest. In doing so, they overlooked the parallel opportunity Apple had identified — embedding AI into personal devices where privacy, latency, and convenience are paramount.

By the time Intel released Meteor Lake in 2023 and AMD introduced Ryzen AI, Apple already had a three-year head start in consumer AI integration. Microsoft’s Copilot+ PC initiative in 2024 only underscored how reactive the x86 response had become. Moreover, Intel’s manufacturing struggles and AMD’s limited focus on NPUs slowed their ability to pivot, while power efficiency remained a glaring weakness. Apple could deliver hours of local LLM performance on battery, something x86 laptops could not match without resorting to power-hungry discrete GPUs.

Ultimately, the fixation on data centers blinded x86 vendors to the rise of the AI PC. Apple exploited this gap decisively, while RISC-V vendors now see an opportunity to carve out their own space with modular, integrated solutions that offer an open alternative to both Apple and x86.

NVIDIA’s Head is in The Cloud

NVIDIA dominates AI in the cloud and remains the gold standard for GPU-accelerated training and inference. Its CUDA ecosystem, TensorRT optimizations, and developer lock-in make it indispensable in enterprise and data center environments. But the AI PC revolution is shifting focus toward efficient, always-on AI computing at the edge, and here NVIDIA plays a more complicated role.

On the one hand, NVIDIA powers some of the most capable ‘AI PCs’ today. Discrete RTX GPUs deliver blazing inference speeds for LLMs and generative models, and Microsoft has partnered with NVIDIA to brand ‘RTX AI PCs.’ For power users and creators, an x86 laptop with an RTX 4090 GPU can churn out tokens per second far beyond what Apple’s Neural Engine can achieve.

On the other hand, NVIDIA’s model depends on discrete GPUs that consume far more power than Apple’s integrated NPUs or the NPUs Intel and AMD are embedding into CPUs. AI PCs are not just about raw throughput — they are about balancing performance with efficiency, portability, and battery life. Apple has made AI capability universal across its product line, while NVIDIA’s approach is tied to high-end configurations.

This leaves NVIDIA both central and peripheral. Central, because any developer serious about AI still needs NVIDIA for training and high-performance inference. Peripheral, because the AI PC category is being defined around integrated NPUs, not discrete GPUs. If ‘AI PC’ comes to mean a lightweight laptop with always-on AI features, NVIDIA risks being left out of the mainstream narrative, even as it continues to dominate the high end.

 Apple’s Big Mac: The Comeback Story

After two decades in the iPhone’s shadow, the Mac is regaining its relevance — not as a nostalgia act, but as the world’s first mass-market AI PC. The AI PC wave also reshapes Apple’s internal dynamics. For nearly two decades, the iPhone was Apple’s growth engine, overshadowing the Mac. iPhone revenue reached more than $200 billion annually, while the Mac hovered around $25–30 billion. The company’s focus, culture, and ecosystem tilted toward mobile.

But smartphones are now a mature market. Global shipments have flattened, replacement cycles lengthened, and Apple increasingly leans on services for growth. The iPhone remains indispensable, but its role as the company’s primary driver is fading.

The Mac, reborn as the AI PC, offers Apple a chance to regain strategic balance. AI workloads — text generation, media editing, data analysis — naturally fit the desktop and laptop form factor, not the smartphone. Apple Intelligence on the Mac positions it as an AI hub for professionals, creators, and students, in ways the iPhone cannot match due to thermal and battery constraints.

This doesn’t mean the Mac will replace the iPhone. Instead, Apple could emerge as a two-pillar company: the iPhone for mobility, and the Mac for intelligence. For the first time in 20 years, the Mac may outpace the iPhone in growth, reclaiming relevance and giving Apple a new narrative.

Why the Market Needs Reignition

AI PC Shipments, Worldwide, 2023-2025 (Thousands of Units)

                      2023 Shipments    2024 Shipments    2025 Shipments
AI Laptops                    20,136            40,520           102,421
AI Desktops                    1,396             2,507            11,804
AI PC Units Total             21,532            43,027           114,225

Source: Gartner (September 2024)

The global PC market has stagnated for years, with shipments hovering between 250 and 300 million units annually. Upgrades slowed as performance improvements became incremental, and consumers extended replacement cycles. But AI is creating a new reason to buy. Gartner projects AI PC shipments will grow to about 114 million units in 2025 — a 165 percent increase over 2024 — representing more than 40 percent of the entire PC market. That figure is expected to rise sharply as AI features become standard in both macOS and Windows, echoing how spreadsheets once drove mass PC adoption.

Apple has tightly coupled its Apple Intelligence software features with the Neural Engine in its Macs, positioning every new Mac as an AI PC. Microsoft is building its Copilot assistant into Windows, with hardware requirements that virtually guarantee demand for NPUs in next-generation x86 machines. Even RISC-V, still a nascent player in consumer computing, is positioning AI PCs as a proving ground for its open ISA.

Corporate and Individual Drivers

The IBM PC spread through corporations first. Executives and managers demanded their own productivity machines, which soon became indispensable for day-to-day decision-making. A similar pattern is emerging with AI PCs. Corporations now view them as essential tools for efficiency, where sales teams can generate proposals on demand, analysts can automate reporting, and creative departments can accelerate design and media production. Buying an AI PC for every employee is becoming the new baseline for productivity, much like issuing a PC to every manager was mandatory in the 1980s.

At the same time, individuals are also driving adoption. Students, freelancers, and creators are eager to run local language models for research, content generation, and coding without being tethered to cloud subscriptions. Emerging players such as DeepComputing, in partnership with Framework, are helping expand access to RISC-V–based AI PCs designed for developers and open-source enthusiasts who want full control over their hardware and software stack. Just as early home PCs became invaluable to small business owners and families, today’s AI PCs are rapidly evolving into indispensable personal assistants.

Challenges and Opportunities

As with the IBM PC era, the rollout of the AI PC comes with challenges. Software ecosystems must catch up, ensuring that frameworks like PyTorch, TensorFlow, and ONNX can fully exploit NPUs across different architectures. Pricing remains a consideration as well. Macs with integrated NPUs begin at around $1,099, while x86-based AI PCs often carry higher costs, and RISC-V systems remain experimental and relatively expensive.

Despite these hurdles, the opportunities are far greater. The AI PC offers a compelling reason to refresh hardware and injects new vitality into a stagnant market. It has the potential to redefine productivity just as spreadsheets once did in the 1980s. The modern equivalent of the spreadsheet ‘killer app’ may well be the personal AI assistant — a ubiquitous capability that transforms how individuals and corporations alike work, learn, and create.

“Copilot isn’t the killer app the AI PC needs. That breakthrough will come when on-device AI stops looking to the cloud—and starts thinking for itself.”

Conclusion

The AI PC in 2025 echoes the IBM PC in 1981: a new category that redefines what personal computing means and who benefits from it. The IBM PC turned the desktop into a productivity hub. The AI PC transforms it into a creativity and intelligence hub.

Apple is the clear frontrunner in this takeoff. Years of NPU integration, vertical stack control, unified memory, and seamless software have given it a commanding lead. The x86 vendors were caught flat-footed not only by ecosystem fragmentation and roadmap delays, but also by their tunnel vision on data centers. NVIDIA, meanwhile, remains the giant in cloud AI and the supplier of the fastest PC accelerators, but it risks being sidelined in the volume AI PC market if integrated NPUs become the standard definition.

For Apple, this represents more than just an industry lead. It signals the Mac’s return as a growth engine at a time when the iPhone’s dominance is beginning to plateau. If history is any guide, this shift will not just reinvigorate PC shipments but will reshape the role of the computer in society — making the AI PC the defining tool of the next era in computing.

Also Read:

GlobalFoundries, MIPS, and the Chiplet Race for AI Datacenters

Yuning Liang’s Painstaking Push to Make the RISC-V PC a Reality

Rapidus, IBM, and the Billion-Dollar Silicon Sovereignty Bet


The Rise, Fall, and Rebirth of In-Circuit Emulation: Real-World Case Studies (Part 2 of 2)

The Rise, Fall, and Rebirth of In-Circuit Emulation: Real-World Case Studies (Part 2 of 2)
by Lauro Rizzatti on 10-20-2025 at 6:00 am

The Rise, Fall, and Rebirth of In Circuit Emulation real world case studies figure 1

Recently, I had the opportunity to speak with Synopsys' distinguished experts in speed adapters and in-circuit emulation (ICE). Many who know my professional background see me as an advocate for virtual, transactor-based emulation, so I was genuinely surprised to discover the impressive results achieved by today's speed adapters, which are critical to validating systems in their actual use environment.

In this article, I share what I learned. While confidentiality prevents me from naming customers, all the examples come from leading semiconductor companies and major hyperscalers across the industry using ZeBu® emulation or HAPS® prototyping together with Synopsys speed adapters. As you read through the article, you can refer to the following diagram of the Synopsys Speed Adaptor Solution:

Figure 1: Deployment diagram of Synopsys’ Speed Adaptor Solution and System Validation Server (SVS)

Case Study #1: The Value of Combining Fidelity and Flexibility in System Validation

The Challenge

A major fabless semiconductor company adopted virtual platforms as well as Hardware-Assisted Verification (HAV) platforms to accelerate early software development and design verification.

The company’s operations were organized around three distinct business units, each responsible for designing its own silicon independently. Each unit selected a different major EDA vendor for its virtual host solution platform. At first glance, such a multi-vendor setup might seem fragmented, but because virtual platforms are generally built on similar architectural blueprints, the approach still resulted in a verification environment that was consistent and standardized across all three BUs.

Alongside these virtual setups, the engineering teams also adopted In-Circuit Emulation (ICE). Here again, they diversified their tools, sourcing speed adapters and emulation from two of the three major EDA vendors. This allowed them to carry out system-level testing, interfacing the emulated environments with real hardware components to validate behavior under realistic conditions.

During a critical design milestone, a senior VP overseeing design verification mandated a cross-platform validation initiative: swap designs and tools across BUs, validate that each BU's silicon worked on all vendors' platforms, and uncover hidden inconsistencies before tape-out.

The mandate required running each BU’s design on all three virtual host platforms and on both ICE setups to ensure environment independence.

That’s when the surprise hit! One design passed flawlessly on all three virtual host platforms. It passed on one of the ICE platforms, but it failed on the other ICE platform, halting system boot entirely. The immediate suspicion fell on the speed adapter. The design team escalated the issue to the EDA vendor’s ICE experts for root-cause analysis.

The Solution

The EDA vendor’s ICE team dug deep into the logs and waveform traces and found the real culprit. It wasn’t the adapter. It was a bug in the DUT’s RTL.

This RTL flaw had escaped all three virtual platforms because they do not model low-level system behavior. It also escaped one of the ICE setups because of that setup's lower-fidelity implementation, and it surfaced only on the higher-fidelity ICE platform, which accurately mirrored real server behavior.

In real-world server systems, three critical hardware/software layers interact simultaneously. From the bottom up, the layers are:

  1. Motherboard chipsets, including PCIe switches, bridge chips, and other supporting silicon
  2. BIOS, handling low-level system initialization and configuration
  3. Operating System (OS), such as Linux or Windows, running on top

Virtual host platforms typically simulate only the OS layer using a virtual machine approach (often QEMU based). The BIOS is minimally represented, and chipset behavior is abstracted away entirely.

On the high-fidelity ICE platform, however, a real Intel server board was connected through the speed adapter. During boot, the Intel chipset correctly issued a Vendor Defined Message (VDM) over PCIe, a standard behavior in many production Intel servers but not modeled at all in virtual platforms. Upon receiving this VDM, the DUT incorrectly dropped the packet instead of updating the PCIe flow control, causing a deadlock during system boot. There was no software workaround; the only solution was to fix the RTL before tape-out.
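The failure mode is easy to picture with a toy model. The sketch below is not the customer's RTL or any real PCIe implementation; it is a minimal, hypothetical Python model of a receiver that must return flow-control credits for every posted TLP it consumes, including message TLPs it chooses to ignore. Dropping a VDM without crediting it back eventually starves the link, which is the deadlock pattern described above.

```python
# Minimal, hypothetical model of PCIe posted-credit handling (not real RTL).
class ToyPcieReceiver:
    def __init__(self, posted_credits=4):
        self.posted_credits = posted_credits  # credits advertised to the link partner

    def receive_tlp(self, tlp_type, buggy=False):
        """Consume one posted TLP; a correct design returns the credit even for
        message types it does not act on (e.g., Vendor Defined Messages)."""
        self.posted_credits -= 1
        if tlp_type == "VDM" and buggy:
            return  # bug: packet silently dropped, credit never returned
        self.handle(tlp_type)
        self.posted_credits += 1  # credit returned (via an UpdateFC DLLP in real PCIe)

    def handle(self, tlp_type):
        pass  # normal processing stand-in

rx = ToyPcieReceiver()
for _ in range(4):
    rx.receive_tlp("VDM", buggy=True)
print(rx.posted_credits)  # 0 -> the link partner can no longer send; boot hangs
```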

Results

If undetected, the bug would have caused the chip to fail in every server deployment, resulting in a dead-on-arrival product. Detecting it pre-silicon saved the company a multi-million-dollar re-spin and months of schedule delay. The incident demonstrated why virtual environments are critical to finding bugs early, while high-fidelity in-circuit setups are necessary for final confidence in the design.

Case Study #2: ICE Delivers Superior Throughput

The Challenge

When designing peripheral interface products, engineering teams often rely on virtual solutions for early verification. While virtual environments can model a protocol controller, they cannot accurately represent the physical (PHY) layer.

In these virtual models, the PHY is removed and replaced by a simplified "fake" model that lets software perform basic register programming but does not support link training, equalization, or true electrical signaling. As a result, link training may appear to succeed because the model simply assumes compliance. Subtle issues like timing mismatches, equalization failures, and signal integrity problems remain hidden until late post-silicon testing, and testing real-world interoperability, especially with diverse third-party hardware and drivers, is not possible.
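The gap can be illustrated with a deliberately simplified sketch. Neither function below corresponds to any Synopsys or customer code; they are hypothetical stand-ins contrasting a "fake" PHY model that reports success unconditionally with a training loop that must actually converge on equalization settings.

```python
import random

def fake_phy_link_train() -> bool:
    # Virtual-model stand-in: register writes are accepted and training
    # "succeeds" by construction, so equalization bugs stay hidden.
    return True

def measure_eye_margin(preset: int) -> float:
    # Placeholder for a measurement that only exists with a real PHY or silicon.
    return random.uniform(0.0, 1.0)

def real_phy_link_train(max_presets: int = 16, target_margin: float = 0.7) -> bool:
    # Hypothetical training loop: sweep equalization presets and require an
    # eye margin above threshold, as a real PHY bring-up would.
    for preset in range(max_presets):
        if measure_eye_margin(preset) >= target_margin:
            return True
    return False  # timing/equalization problems surface here, not post-silicon
```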

A leading hyperscaler ran into significant challenges because of this drawback. Early in their design cycles, they faced months-long delays just to program and train PHYs, pushing crucial bug discovery into expensive post-silicon stages.

The Solution

To overcome these challenges, they adopted Synopsys Speed Adapters to bring PHYs into the emulation environment.

With this approach, PHYs are physically connected to the emulator through the speed adapter boards, which support full programming, training, and link initialization just as they would on silicon.

This integration effectively turns the emulation environment into a true In-Circuit Emulation (ICE) platform, combining the speed and visibility of pre-silicon emulation with the physical accuracy and interoperability of real-world hardware.

Examples of Impact

PCIe Gen5 Interface

  • In a virtual setup, a Gen5 device’s link training sequences seemed successful.
  • When tested via a speed adapter and a PHY, the customer uncovered critical timing mismatches and equalization failures that would have escaped virtual verification.
  • Catching these issues pre-silicon avoided a potentially costly silicon re-spin and ensured full compliance with Gen5 specs.

UFS Storage Interface

  • A UFS host controller passed functional tests in a virtual model.
  • When connected to a real UFS PHY through a speed adapter, engineers discovered clock misalignments, burst mode instabilities, and data corruption under stress conditions—problems rooted in real signaling, invisible in virtual models.
  • Early detection improved system reliability and ensured compliance with JEDEC standards.

Driver Interoperability Testing

  • In root complex mode, different GPUs (NVIDIA, AMD, Intel) each use different drivers and optimizations.
  • Virtual environments cannot test these real drivers because they require a physical interface.
  • Speed adapters allowed full driver stacks to run against real devices, exposing errata and interoperability bugs that virtual models could never catch.

Results

What previously took four months to program the PHY plus up to six months to train it post-silicon was completed pre-silicon in a couple of weeks. This was possible because the speed adapters ran workloads, enabling rapid design iterations and faster bring-up cycles. Another benefit was improved debug and reuse, since the same PHY configuration trained in pre-silicon could be directly reused in post-silicon, accelerating bring-up.

Case Study #3: Ethernet Product Validation

The Challenge

When developing advanced Ethernet products—such as ultra-Ethernet, smart NICs, or intelligent switches—engineers face a recurring challenge: how to bring real software traffic into the Ethernet validation environment.

Virtual environments offer partial solutions. Virtual tester generators (VTG) provide low-level packet traffic (Layer 2, Layer 3) but do not exercise the application software stack. Virtual hosts (VHS) allow software interaction but lack flow-control capabilities. Without flow control, packets are dropped, an unacceptable limitation for validation environments where fidelity and determinism are critical.

As a result, traditional virtual environments are either incomplete (VTG) or not fully reliable from a traffic-control perspective (VHS). This gap left design teams without a way to fully validate Ethernet products across all protocol layers, especially the higher layers (L4–L7) that depend on real drivers, operating systems, and diagnostic software.

The Solution

Ethernet speed adapters provide the missing link by bridging virtual test environments with real software execution over Ethernet.

Unlike VHS, speed adapters guarantee zero packet drops, delivering deterministic performance even under high traffic. Virtual testers (e.g., from Ixia or Spirent) remain useful for low-level (Layer 2/3) functional validation. Speed adapters enable execution of real drivers and Linux-based diagnostic tools that testers cannot emulate. Together, virtual testers and speed adapters form a complete solution spanning all Ethernet layers.

For startups or budget-constrained customers, speed adapters complement more expensive virtual tester licenses by providing equivalent packet generation and analysis at lower cost. Free and open-source test generators can also be layered on top of a speed adapter to replicate tester functions.
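As one hedged illustration of that last point, an open-source generator such as Scapy could drive Layer 2/3 traffic into whatever host-side network interface the speed adapter exposes. The interface name and addresses below are assumptions for the sketch, not a documented Synopsys flow.

```python
# Sketch: open-source L2/L3 traffic generation with Scapy (pip install scapy).
from scapy.all import Ether, IP, UDP, Raw, sendp

ADAPTER_IFACE = "eth1"  # assumption: host-side port facing the speed adapter/DUT path

def send_test_burst(count: int = 100, payload_size: int = 512) -> None:
    """Send a burst of UDP frames toward the device under test."""
    frame = (
        Ether(dst="ff:ff:ff:ff:ff:ff")
        / IP(dst="192.168.100.2")          # hypothetical DUT-side address
        / UDP(sport=4000, dport=5000)
        / Raw(load=bytes(payload_size))    # zero-filled payload
    )
    sendp([frame] * count, iface=ADAPTER_IFACE, verbose=False)

if __name__ == "__main__":
    send_test_burst()
```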

Results

In practice, this hybrid approach has enabled customers to validate real software stacks against hardware under development without packet loss, catch design bugs that only appear in higher protocol layers (issues that purely virtual test environments cannot expose), and scale affordably by combining limited VTG licenses with speed adapters to achieve full test coverage.

Case Study #4: Real-World Sensor Interoperability with MIPI Speed Adapters

The Challenge

The Display and Test Framework (DTF) team of a major fabless enterprise faced a recurring and costly problem. They needed to validate their chip design against real MIPI-based image sensors and cameras. However, in a virtual emulation environment, this was impossible because virtual models can mimic protocol behavior but cannot replicate real sensor electrical signaling or timing. Vendor-specific cameras and sensors each have unique initialization sequences, timing quirks, and signal integrity characteristics that no generic virtual model can capture. When first silicon returned from the fab, it frequently failed to interface with the intended cameras and sensors, leading to long bring-up efforts or even full silicon re-spins.

This limitation created a significant time-to-market bottleneck. By the time hardware compatibility issues could be found, the design had already gone through costly fabrication, delaying product launches.

The Solution

To eliminate this bottleneck, the company adopted MIPI speed adapters to enable ICE with real sensor hardware. Using this approach, the chip design running inside the emulator could be directly connected to real, vendor-specific MIPI cameras and image sensors. Engineers could exercise full initialization, configuration, and data streaming paths just as they would on physical silicon. The setup supported easy swapping of different sensors and camera models, enabling rapid interoperability testing across vendors.
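The kind of vendor-specific quirk that only real hardware exposes can be pictured with a hypothetical init table. The register addresses, values, and delays below are invented purely for illustration and are not taken from any sensor datasheet or from the customer's setup.

```python
import time

# Hypothetical vendor-specific init sequence: (register, value, post-write delay in ms).
# Real sensors differ in ordering, values, and required settle times, which is
# exactly what a generic virtual model cannot capture.
SENSOR_A_INIT = [
    (0x0100, 0x00, 1),   # enter standby
    (0x0301, 0x05, 0),   # PLL divider
    (0x0820, 0x0A, 10),  # link rate setup, needs settle time
    (0x0100, 0x01, 5),   # start streaming
]

def init_sensor(write_reg, sequence):
    """Apply an init sequence through a caller-supplied register-write callback."""
    for reg, val, delay_ms in sequence:
        write_reg(reg, val)
        if delay_ms:
            time.sleep(delay_ms / 1000.0)
```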

This capability gave the DTF team the real-world coverage they needed in pre-silicon, without waiting for chips to return from the fab.

Results

The design was successfully tested with the exact vendor-specific camera and sensor models planned for production. By catching integration issues pre-silicon, the enterprise avoided costly design re-spins caused by post-silicon bring-up failures. Removing the post-silicon camera/sensor debug cycle accelerated overall product schedules. Finally, the team could sign off knowing the design was already proven with real-world peripherals.

Case Study #5: How Synopsys’ System Validation Server (SVS) Caught Critical Bugs Missed by Other Solutions

The Challenge

Pre-silicon validation using ICE has historically faced a critical obstacle: standard off-the-shelf host servers are not designed to tolerate the slow or intermittent response times of an emulator. When the emulator clock stalls or slows, the host often times out, aborting the test run.

This customer’s silicon validation team encountered this limitation firsthand. While they used a commercial emulation host server for ICE, that system did not enforce strict real-world timing and protocol checks, risking that flawed designs would pass pre-silicon signoff only to fail later in production.

The Solution

To overcome these limitations, the customer’s validation team adopted Synopsys’ System Validation Server (SVS) as their host system for ICE validation. SVS is a specialized, pre-validated host machine designed specifically to work with speed adapters and emulators. It offers two major advantages over generic hosts or legacy commercial host server setups. First, the SVS ships with a custom BIOS engineered to tolerate the slow response times of emulators, eliminating timeouts that can otherwise terminate validation runs prematurely. Second, SVS faithfully mimics the system the DUT will eventually plug into, including enforcing strict specification compliance, especially for complex subsystems like PCIe. See Figure 1 at the top of the article.

The validation team tested their design on both 3rd party emulation hardware and Synopsys’ SVS. Using the 3rd party emulation, the system booted successfully, but on SVS, the boot failed completely. Initially, the engineers suspected a hardware fault in SVS. As they put it: “Your SVS is broken while the other guys work fine.”

However, after a detailed debug session, it emerged that their DUT contained configuration errors in PCIe space registers. The 3rd party emulation solution and host server masked these errors because the host used an outdated BIOS that failed to enforce PCIe register constraints. By contrast, SVS strictly enforced PCIe specifications and correctly rejected the illegal register values. The bug could not be fixed in firmware; no software patch could correct it.
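The class of error SVS flagged can be sketched as a strict configuration-space check. The rules below are a small, generic subset chosen for illustration (the actual registers involved were not disclosed), but they show the difference between a host that validates config space and one that silently accepts whatever the DUT advertises.

```python
# Hedged sketch of strict PCIe config-space checks (illustrative subset only).
def check_pcie_config(cfg: dict) -> list[str]:
    """Return a list of violations for a few basic PCIe config-space rules."""
    errors = []
    # Vendor ID 0xFFFF means "no device"; a real function must not report it.
    if cfg.get("vendor_id") == 0xFFFF:
        errors.append("illegal Vendor ID 0xFFFF")
    # Capability pointers must be DWORD-aligned.
    if cfg.get("cap_ptr", 0) % 4 != 0:
        errors.append("capabilities pointer not DWORD-aligned")
    # Memory BAR sizes must be powers of two.
    for i, size in enumerate(cfg.get("bar_sizes", [])):
        if size and (size & (size - 1)) != 0:
            errors.append(f"BAR{i} size {size} is not a power of two")
    return errors

# A lenient host ignores these; a strict host (like SVS in this story) refuses to boot.
print(check_pcie_config({"vendor_id": 0xFFFF, "cap_ptr": 0x41, "bar_sizes": [0x3000]}))
```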

Results

SVS exposed an RTL-level configuration bug that virtual flows and another emulation solution missed. It also eliminated timeout instability, since the SVS’s modified BIOS allowed stable, long-duration tests.

SVS ensured that only spec-compliant designs advanced to tape-out, eliminating false positives from legacy flows.

Had the design been taped out based on 3rd party emulation “pass,” the silicon would have been DOA, requiring a full, costly respin.

Conclusion

Back in 2015, I wrote “The Melting of the ICE Age” for Electronic Design, where I predicted the demise of in-circuit emulation (ICE). Its numerous drawbacks (see Part 1 of this series) seemed to doom it to history, replaced by transaction-based emulation and, later, hybrid approaches that drove the shift-left verification methodology. In hindsight, I must admit I underestimated the ingenuity and resourcefulness of the engineering community.

Today, the third generation of speed adapters has propelled ICE again into the limelight of system-level validation. Bugs once detectable only in post-silicon labs can now be identified pre-silicon. This capability not only reduces the risk of re-spins but also accelerates time-to-tapeout and saves enormous expense. Far from melting away, ICE has re-emerged as a cornerstone of system-level verification.

The Rise, Fall, and Rebirth of In-Circuit Emulation (Part 1 of 2)

Also Read:

Statically Verifying RTL Connectivity with Synopsys

Why Choose PCIe 5.0 for Power, Performance and Bandwidth at the Edge?

Synopsys and TSMC Unite to Power the Future of AI and Multi-Die Innovation


CMOS 2.0 is Advancing Semiconductor Scaling

CMOS 2.0 is Advancing Semiconductor Scaling
by Daniel Nenni on 10-19-2025 at 10:00 am

CMOS 2.0

In the rapidly evolving landscape of semiconductor technology, imec’s recent breakthroughs in wafer-to-wafer hybrid bonding and backside connectivity are paving the way for CMOS 2.0, a paradigm shift in chip design. Introduced in 2024, CMOS 2.0 addresses the limitations of traditional CMOS scaling by partitioning a system-on-chip (SoC) into specialized functional tiers. Each tier is optimized for specific needs—such as high-performance logic, dense memory, or power efficiency—through system-technology co-optimization (STCO). This approach moves beyond general-purpose platforms, enabling heterogeneous stacking within the SoC itself, similar to but more integrated than current 3D stacking of SRAM on processors.

Central to CMOS 2.0 is the use of advanced 3D interconnects and backside power delivery networks (BSPDNs). These technologies allow for dense connections on both sides of the wafer, suspending active device layers between independent interconnect stacks. At the 2025 VLSI Symposium, imec demonstrated key milestones: wafer-to-wafer hybrid bonding at 250nm pitch and through-dielectric vias (TDVs) at 120nm pitch on the backside. These innovations provide the granularity needed for logic-on-logic or memory-on-logic stacking, overcoming bottlenecks in compute scaling for diverse applications like AI and mobile devices.

Wafer-to-wafer hybrid bonding stands out for its ability to achieve sub-micrometer pitches, offering high bandwidth and low-energy signal transmission. The process involves aligning and bonding two processed wafers at room temperature, followed by annealing for permanent Cu-to-Cu and dielectric bonds. Imec has refined this flow, achieving reliable 400nm pitch connections by 2023 using SiCN dielectrics for better strength and scalability. Pushing further, simulations revealed non-uniform bonding waves causing wafer deformation, impacting overlay accuracy. By applying pre-bond litho corrections, imec reached 300nm pitch with <25nm overlay error for 95% of dies. At VLSI 2025, they showcased 250nm pitch feasibility on a hexagonal pad grid, with high electrical yield in daisy chains, though full-wafer yield requires next-gen bonding tools.

Complementing frontside bonding, backside connectivity enables front-to-back links via nano-through-silicon vias (nTSVs) or direct contacting. For CMOS 2.0’s multi-tier stacks, this allows seamless integration of metals on both sides, with BSPDNs handling power from the backside to reduce IR drops and decongest frontside BEOL for signals. Imec’s VLSI 2025 demo featured barrier-less Mo-filled TDVs with 20nm bottom diameter at 120nm pitch, fabricated via a via-first approach in shallow-trench isolation. Extreme wafer thinning maintains low aspect ratios, while higher-order lithography corrections ensure 15nm overlay margins between TDVs and 55nm backside metals. This balances fine-pitch connections on both wafer sides, crucial for stacking multiple heterogeneous layers like logic, memory, and ESD protection.

BSPDNs further enhance CMOS 2.0 by relocating power distribution to the backside, allowing wider, less resistant interconnects. Imec’s 2019 pioneering work has evolved, with major foundries adopting it for advanced nodes. DTCO studies show PPAC gains in always-on designs, but VLSI 2025 extended this to switched-domain architectures—relevant for power-managed mobile SoCs. In a 2nm mobile processor design, BSPDN reduced IR drop by 122mV compared to frontside PDNs, enabling fewer power switches in a checkerboard pattern. This yielded 22% area savings, boosting performance and efficiency.
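For intuition on why moving power delivery to the backside helps, a back-of-the-envelope IR-drop estimate is sketched below. The resistances and currents are invented for illustration and are not imec's data; the point is simply that wider, less resistive backside metal lowers the delivery-path resistance, so the drop V = I·R falls.

```python
# Back-of-the-envelope IR-drop comparison (illustrative numbers, not imec data).
def ir_drop_mv(current_a: float, path_resistance_mohm: float) -> float:
    """IR drop in millivolts: V = I * R (amps * milliohms = millivolts)."""
    return current_a * path_resistance_mohm

supply_current = 10.0   # assumed amps drawn by the power domain
frontside_r = 30.0      # assumed mOhm through a thin frontside BEOL delivery path
backside_r = 12.0       # assumed mOhm through wide backside metals plus TSVs

print(f"frontside PDN drop: {ir_drop_mv(supply_current, frontside_r):.0f} mV")  # 300 mV
print(f"backside PDN drop:  {ir_drop_mv(supply_current, backside_r):.0f} mV")   # 120 mV
# The ~180 mV delta here is the same kind of saving behind the reported 122 mV figure.
```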

These advancements, supported by the NanoIC pilot line and EU funding, bring CMOS 2.0 from concept to viability. By enabling heterogeneity within SoCs, they offer scalable solutions for the semiconductor ecosystem, from fabless designers to system integrators. As pitches scale below 200nm, collaboration with tool suppliers will be key to overcoming overlay challenges. Ultimately, high-density front and backside connectivity heralds a new era of compute innovation, meeting demands for performance, power, and density in an increasingly diverse application space.

Read the source article here.

Also Read:

Exploring TSMC’s OIP Ecosystem Benefits

Synopsys Collaborates with TSMC to Enable Advanced 2D and 3D Design Solutions

Advancing Semiconductor Design: Intel’s Foveros 2.5D Packaging Technology