
Learning from In-House Datasets

by Bernard Murphy on 10-22-2025 at 6:00 am

Training in a Constrained Environment

At a DAC Accellera panel this year there was some discussion of cross-company collaboration in training. The theory is that more collaboration would mean a larger training set and therefore higher accuracy in GenAI (for example in RTL generation). But semiconductor companies are very protective of their data, and reports of copyrighted text being hacked out of chatbots do nothing to allay their concerns. Besides, does the evidence support the claim that ever-larger training sets lead to more effective GenAI? GPT-5 is estimated to have been trained on 70 trillion tokens versus 13 trillion for GPT-4, yet GPT-5 is generally viewed as unimpressive, certainly not a major advance on the previous generation. Maybe we need a different approach.

More training or better focused training?

A view gathering considerable momentum is that while LLMs do an excellent job in understanding natural language, domain-specific expertise is better learned from in-house data. While this data is obviously relevant, clearly there’s a lot less of it than in the datasets used to train big GenAI models. A more thoughtful approach is necessary to learn effectively from this constrained dataset.

Most, if not all, approaches start with a pre-trained model (the “P” in GPT), since that already provides natural language understanding and a base of general knowledge. New methods add to this base through fine-tuning. Here I’ll touch on labeling and federated learning methods.

Learning through labels

Labeling harks back to the early days of neural nets, where you provided training pictures of dogs labeled “dog” or perhaps with the breed of dog. The same intent applies here, except you are training on design data examples which you want a GenAI model to recognize/classify. Since manually labeling large design datasets would not be practical, recent innovation centers on semi-automated labeling assisted by LLMs.

Some large enterprises outsource this task to value-added service providers like Scale.com, who deploy large teams of experts using their internal tools to develop labeled training data, using supervised fine-tuning (SFT) augmented by reinforcement learning with human feedback (RLHF). Something important to understand here is that labeling is GenAI-centric. You shouldn’t think of labels as tags on design data features but rather as fine-tuning additions to GenAI data (attention weights, etc.) generated from training question/answer (Q/A) pairs expressed in natural language, where answers include supporting explanations, perhaps augmented by content for RAG.

In EDA this is a very new field as far as I can tell. The topic comes up in some of the papers from the first International Conference on LLM-Aided Design (LAD) held this year at Stanford. One such paper works around the challenge of getting enough expert-generated Q/A pairs by generating synthetic pairs through LLM analysis of unlabeled but topic-appropriate documents (for example on clock domain crossings). This they augment with few-shot learning based on whatever human expert Q/A pairs they can gather.
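To make the shape of such training data concrete, here is a minimal sketch (in Python) of turning synthetic Q/A pairs into chat-style SFT records. The record schema and helper name are my own illustration, a generic JSONL fine-tuning format rather than the actual tooling used in the LAD paper:

```python
import json

def make_sft_records(doc_excerpt, qa_pairs):
    """Format synthetic Q/A pairs (e.g. generated by an LLM reading an
    unlabeled CDC app note) into chat-style SFT training records."""
    records = []
    for question, answer in qa_pairs:
        records.append({
            "messages": [
                {"role": "system",
                 "content": "You are an expert in clock domain crossing design."},
                {"role": "user", "content": question},
                # Answers carry a supporting explanation; the source context
                # can double as retrieval content for RAG.
                {"role": "assistant",
                 "content": f"{answer}\n\nSource context: {doc_excerpt[:200]}"},
            ]
        })
    return records

pairs = [("Why do CDC paths need synchronizers?",
          "Asynchronous crossings can capture metastable values; a two-flop "
          "synchronizer gives the first flop time to resolve.")]
jsonl = "\n".join(json.dumps(r) for r in make_sft_records("CDC app note text...", pairs))
```

A handful of genuine expert-written pairs can then be mixed in for few-shot grounding, along the lines the paper describes.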

You could imagine using similar methods for labeling around other topics in design expertise: low-power design, secure design methods, optimizing synthesis, floorplanning methods and so on. While attention in the papers I have read tends to focus on using this added training to improve RTL generation, I can see more immediate value in verification, especially in static verification and automated design reviews.

Federated Learning

Maybe beyond some threshold more training data isn’t necessarily better, but perhaps the design data that can be found in any given design enterprise doesn’t yet suffer from that problem, and more data could still help if we could figure out how to combine learning from multiple enterprises without jeopardizing the security of each proprietary dataset. This is a common need across many domains where web crawling for training data is not permitted (medical and defense data are two obvious examples).

Instead of bringing data to the model for training, Federated Learning sends an initial model from a central site (aggregator) to individual clients and develops fine-tuning training in the conventional manner within that secure environment. When training is complete, trained parameters only are sent back to the aggregator which harmonizes inputs from all clients, then sends the refined model back to the clients. This process iterates, terminating when the central model converges.
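The aggregation step at the heart of this loop is easy to sketch. Below is a toy FedAvg round in Python/NumPy, with local “fine-tuning” stood in for by a single gradient step on a least-squares objective; real deployments (NVIDIA FLARE, TensorFlow Federated) wrap the same pattern in secure transport and far richer training:

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Aggregate client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(n / total * p for p, n in zip(client_params, client_sizes))

def run_round(global_params, clients):
    """One federated round: each client trains locally on its private data
    (here a single SGD step), and only parameters leave the client."""
    updated, sizes = [], []
    for X, y in clients:
        w = global_params.copy()
        # one local gradient step on a least-squares loss (stand-in for fine-tuning)
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= 0.1 * grad
        updated.append(w)
        sizes.append(len(y))
    return fedavg(updated, sizes)
```

Iterating `run_round` until the aggregated model stops moving mirrors the converge-and-terminate loop described above, without any raw data ever leaving a client site.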

There are some commercial platforms for Federated Learning, as well as open-source options from some big names: TensorFlow Federated from Google and NVIDIA FLARE are two examples. Google Cloud and IBM Cloud offer Federated Learning support, while Microsoft supports open-source Federated Learning options within Azure.

This method could be quite effective in the semiconductor space if a central AI platform or consortium could be organized to manage the process. And if a critical mass of semiconductor vendors is prepared to buy in 😀.

Perhaps the way forward for learning in industries like ours will be through a combination of these methods – federated learning as a base layer to handle undifferentiated expertise and labeled learning for continued differentiation in more challenging aspects of design expertise. Definitely an area to watch!

Also Read:

PDF Solutions Calls for a Revolution in Semiconductor Collaboration at SEMICON West

The AI PC: A New Category Poised to Reignite the PC Market

Webinar – The Path to Smaller, Denser, and Faster with CPX, Samtec’s Co-Packaged Copper and Optics


Liberty IP Excellence: Building a Robust Verification Framework for Automotive IPs

by Daniel Nenni on 10-21-2025 at 2:00 pm

As 2025 draws to a close, the semiconductor industry continues to push boundaries, particularly in automotive applications where reliability is non-negotiable. At the TSMC Open Innovation Platform forum this year, a collaborative presentation by NXP Semiconductors and Siemens EDA stood out: “Liberty IP Excellence: Building a Robust Verification Framework for Automotive IPs.” Presented by Santhosh K, Khushboo R, Pramod G from NXP, alongside Ajay Kumar and Ray Valencia from Siemens EDA, this talk highlighted the critical role of Liberty files in SoC design and proposed innovative quality assurance (QA) methodologies to ensure flawless IP delivery.

The motivation stems from the foundational IPs (standard cells, memories, IOs) that form the backbone of System-on-Chip designs. Liberty (.lib) files serve as the industry standard for encapsulating timing, power, noise, and more, including advanced features like statistical variation and waveform data for sub-10nm nodes. NXP, a pioneer in automotive semiconductors, emphasizes uncompromising quality to meet the sector’s stringent demands. Early QA in their flow minimizes costs, time, and resources, but Liberty’s complexity, encompassing aspects like Liberty Variation Format (LVF) for statistical data and Composite Current Source (CCS) for timing, noise, and power, poses significant challenges. Interpreting these files manually is error-prone, and inaccuracies can cascade into design failures, especially in safety-critical automotive systems.

The proposed methodology embeds advanced verification tools into NXP’s QA flow, leveraging Siemens’ Solido Analytics for AI-driven error detection, analysis, and comparison. This solution enables full automation, easing adoption for Liberty users while saving engineering and compute resources. It targets advanced nodes, focusing on LVF and CCS to ensure design reliability.

Diving into NXP’s QA flow analysis, the presentation detailed several key components. First, outlier detection using AI identifies anomalies in Liberty data that deviate from neighboring values, such as slew/load points within tables or across Process-Voltage-Temperature (PVT) conditions. Siemens Solido Analytics employs machine learning models to sweep dimensions like transitions, constraints, temperature, voltage, and custom sweeps (e.g., cell drive strength). Users set tolerance thresholds, triggering alerts for outliers, which could indicate characterization issues.

Version-to-version comparison addresses common scenarios like PDK revisions, design spec changes, or legacy setup recreations. The flow uses plotting for rapid visualization and interpolates tables for fair, apples-to-apples assessments, accelerating correlation and reducing manual effort.
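A sketch of the interpolation step: two 1-D tables characterized on different index grids are resampled onto a common grid before computing deltas, so the comparison is apples-to-apples. The function name and grid density here are illustrative:

```python
import numpy as np

def compare_tables(idx_a, val_a, idx_b, val_b, npts=50):
    """Interpolate two 1-D Liberty tables (e.g. delay vs. load) that were
    characterized on different index grids onto a common grid, then report
    the maximum percentage delta over the overlapping index range."""
    lo = max(idx_a[0], idx_b[0])
    hi = min(idx_a[-1], idx_b[-1])
    grid = np.linspace(lo, hi, npts)       # shared comparison grid
    a = np.interp(grid, idx_a, val_a)
    b = np.interp(grid, idx_b, val_b)
    return float(np.max(np.abs(a - b) / np.abs(a)) * 100)
```

Running this across PVT corners gives a quick numeric summary of a PDK revision's impact before anyone plots anything.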

Another pillar is CCS versus Non-Linear Delay Model (NLDM) verification. CCS captures timing in the current domain, while NLDM uses voltage; mismatches signal incorrect settings. The methodology converts CCS data to the voltage domain for direct comparison, ensuring consistency.
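The underlying idea can be shown with a toy model: treat the CCS data as a current waveform charging the load capacitance, integrate to get the output voltage, and extract the 50% crossing time for comparison against the NLDM delay. Real CCS driver/receiver models are considerably more detailed than this sketch:

```python
import numpy as np

def ccs_delay(t, i_out, c_load, vdd, threshold=0.5):
    """Integrate an output-current waveform into the load cap (trapezoid rule)
    and return the time the output voltage crosses threshold*vdd (rising edge)."""
    dv = np.cumsum(0.5 * (i_out[1:] + i_out[:-1]) * np.diff(t)) / c_load
    v = np.concatenate(([0.0], dv))
    return float(np.interp(threshold * vdd, v, t))

# A 1 mA constant drive into 1 pF reaches 0.5 V (half of a 1 V rail) at 0.5 ns.
t = np.linspace(0.0, 1e-9, 101)
delay = ccs_delay(t, np.full_like(t, 1e-3), c_load=1e-12, vdd=1.0)
```

A delay derived this way that disagrees with the corresponding NLDM table entry beyond tolerance is the kind of mismatch that signals incorrect characterization settings.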

LVF and LVF Moments checks tackle the complexity of statistical characterization. Nominal Liberty characterization takes hours, but LVF—requiring up to five additional statistical measures—can extend to weeks or months. LVF formats include 1-sigma early/late (typically 3-sigma/3, representing Gaussian distributions) and Moments (mean shift, skewness, standard deviation) for complex distributions. On-chip variation is crucial for nodes ≤20nm, where inaccurate LVF can cause 50-100% timing deviations. Automated checks detect outliers in LVF groups and recreate sigma tables from Moments for matching; discrepancies highlight inconsistencies.
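A minimal version of the Moments cross-check, under a near-Gaussian assumption (skewness ignored), recreates the expected 1-sigma early/late offsets as mean shift minus/plus standard deviation and flags arcs whose reported sigma tables disagree; a production check would presumably also fold in the skewness term:

```python
def check_lvf_consistency(moments, sigma_early, sigma_late, tol=1e-3):
    """Recreate 1-sigma early/late values from LVF Moments
    (mean shift, std dev, skewness) assuming a roughly Gaussian
    distribution, and flag arcs whose reported sigma tables disagree
    by more than `tol` (absolute, in the table's time units)."""
    flags = []
    for arc, (mean_shift, std_dev, _skew) in moments.items():
        if (abs(sigma_early[arc] - (mean_shift - std_dev)) > tol
                or abs(sigma_late[arc] - (mean_shift + std_dev)) > tol):
            flags.append(arc)
    return flags
```

Discrepancies surfaced this way highlight exactly the internal inconsistencies between the sigma and Moments groups that the presentation describes.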

Finally, Power-Performance-Area (PPA) analysis enables early library comparisons across technologies. With thousands of cells and varying formats, traditional methods delay insights until later design phases like synthesis or P&R. The proposed heuristics align data by cell/pin names, functionalities, and table indices, revealing trends like performance versus drive strength (e.g., Dataset A buffers outperforming B at drives >3) or power consumption (e.g., Dataset A flip-flops superior across drives).
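The alignment heuristic can be sketched as a simple keyed join: cells from two datasets are matched by (function, drive strength) and compared metric by metric, with unmatched cells skipped. The key choice and the single delay metric here are illustrative:

```python
def align_libraries(lib_a, lib_b):
    """Align cells from two libraries keyed by (function, drive strength)
    and return the metric ratio B/A for each match (e.g. delay in ps);
    cells missing from either library are silently skipped."""
    return {key: lib_b[key] / lib_a[key]
            for key in lib_a if key in lib_b and lib_a[key]}
```

Plotting these ratios against drive strength is what reveals trends like one dataset's buffers pulling ahead at higher drives.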

Bottom line: this embedded solution delivers comprehensive QA for automotive libraries, including impact analysis across PDK revisions, automated execution, LVF validation, outlier detection, CCS-NLDM alignment, and full coverage. NXP achieved 2x efficiency gains: for trends and validation on 1000 cells across 25 PVTs, runtime dropped from 10 to 5 units; impact analysis (6 PVTs) from 5 to 3 units; PPA comparison (4 PVTs) similarly improved. This collaboration exemplifies 2025’s tech ethos—AI-augmented precision amid escalating complexity—paving the way for safer, more efficient automotive semiconductors.

Also Read:

Accelerating SRAM Design Cycles: MediaTek’s Adoption of Siemens EDA’s Additive AI Technology at TSMC OIP 2025

AI-Driven DRC Productivity Optimization: Insights from Siemens EDA’s 2025 TSMC OIP Presentation

Why chip design needs industrial-grade EDA AI


ASU Silvaco Device TCAD Workshop: From Fundamentals to Applications

by Daniel Nenni on 10-21-2025 at 10:00 am


The ASU-Silvaco Device Technology Computer-Aided Design Workshop is a pivotal educational and professional development event designed to bridge the gap between theoretical semiconductor physics and practical device engineering. Hosted by Arizona State University in collaboration with Silvaco, a leading provider of TCAD software, this workshop offers participants a comprehensive exploration of semiconductor device simulation, from foundational concepts to advanced applications. Spanning topics such as device physics, process simulation, and real-world design challenges, the workshop equips engineers, researchers, and students with the tools to innovate in the rapidly evolving field of microelectronics.

The workshop typically begins with an introduction to TCAD fundamentals, emphasizing the role of simulation in modern semiconductor design. Participants learn how TCAD tools model the electrical, thermal, and optical behavior of devices at the nanoscale. Silvaco’s suite of software, including Atlas, Victory Process, and DeckBuild, is introduced as a powerful platform for simulating semiconductor fabrication and performance. These tools allow users to predict device behavior under various conditions, optimize designs, and reduce the need for costly physical prototyping. The foundational sessions cover key concepts like carrier transport, quantum effects, and material properties, ensuring attendees grasp the physics underpinning TCAD simulations.

As the workshop progresses, it delves into practical applications, demonstrating how TCAD is used in industries such as integrated circuits, power electronics, and photovoltaics. Participants engage in hands-on sessions, guided by ASU faculty and Silvaco engineers, to simulate processes like doping, oxidation, and lithography. These exercises highlight how TCAD can optimize fabrication steps, improve yield, and enhance device reliability. For instance, attendees might simulate a MOSFET’s performance to analyze parameters like threshold voltage or leakage current, gaining insights into design trade-offs. The workshop also covers advanced topics, such as modeling FinFETs, tunnel FETs, or emerging 2D materials like graphene, reflecting the cutting-edge needs of the semiconductor industry.
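As a flavor of the post-processing done on simulator output, here is the classic linear-extrapolation threshold-voltage extraction applied to an Id-Vg sweep of the sort a TCAD device simulation would produce (the synthetic data in the test stands in for real simulation output):

```python
import numpy as np

def vth_linear_extrapolation(vg, id_):
    """Extract threshold voltage from a linear-region Id-Vg sweep by
    extrapolating the tangent at the point of maximum transconductance
    down to Id = 0 -- the classic linear-extrapolation method."""
    gm = np.gradient(id_, vg)            # transconductance dId/dVg
    k = int(np.argmax(gm))               # steepest point of the curve
    return float(vg[k] - id_[k] / gm[k])  # x-intercept of the tangent line
```

The same few lines work whether the sweep came from a planar MOSFET or a FinFET deck; only the physics inside the simulator changes.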

A key strength of the ASU-Silvaco workshop is its emphasis on bridging academia and industry. ASU’s expertise in semiconductor research, combined with Silvaco’s industry-standard tools, creates a unique learning environment. Participants, ranging from graduate students to seasoned engineers, benefit from real-world case studies, such as optimizing power devices for electric vehicles or designing low-power chips for IoT applications. The collaborative setting fosters networking, enabling attendees to connect with peers and experts, potentially sparking future research or career opportunities.

By the workshop’s conclusion, participants gain a robust understanding of TCAD’s role in accelerating innovation. They leave equipped with practical skills to simulate and analyze semiconductor devices, as well as an appreciation for how these tools address challenges like scaling, power efficiency, and thermal management. The ASU-Silvaco Device TCAD Workshop stands out as a vital platform for advancing semiconductor expertise, empowering attendees to contribute to the next generation of electronic devices in a world increasingly driven by technology.

Register Here

About Silvaco
Silvaco is a provider of TCAD, EDA software, and SIP solutions that enable semiconductor design and digital twin modeling through AI software and innovation. Silvaco’s solutions are used for semiconductor and photonics processes, devices, and systems development across display, power devices, automotive, memory, high performance compute, foundries, photonics, internet of things, and 5G/6G mobile markets for complex SoC design. Silvaco is headquartered in Santa Clara, California, and has a global presence with offices located in North America, Europe, Egypt, Brazil, China, Japan, Korea, Singapore, Vietnam, and Taiwan. Learn more at silvaco.com.

Also Read:

GaN Device Design and Optimization with TCAD

Simulating Gate-All-Around (GAA) Devices at the Atomic Level

Silvaco: Navigating Growth and Transitions in Semiconductor Design


PDF Solutions Calls for a Revolution in Semiconductor Collaboration at SEMICON West

by Mike Gianfagna on 10-21-2025 at 6:00 am


SEMICON West was held in Phoenix, Arizona on October 7-9. This premier event brings the incredibly diverse global electronics supply chain together to address the semiconductor ecosystem’s greatest opportunities and challenges. The event’s tagline this year, “Stronger Together — Shaping a Sustainable Future in Talent, Technology, and Trade,” underscores the semiconductor industry’s commitment to collaboration in addressing key challenges and opportunities.

One of the participants at this event presented an approach and technology that form the foundation for global collaboration across the semiconductor industry. Let’s look at how PDF Solutions used the SEMICON West stage to call for a revolution in semiconductor collaboration.

PDF Solutions at the Show

PDF highlighted its game-changing technology during SEMICON West with product demos, daily booth presentations and speaking engagements in the SEMICON West program. Attendees at the PDF booth interacted with a virtual reality model of semiconductor manufacturing chamber equipment and the associated connection to PDF’s EquipmentTwin solution. Other demos illustrated the latest products that are shaping the future of the semiconductor industry.

A Keynote Aimed at Catalyzing Collaboration

John Kibarian

John Kibarian, CEO of PDF Solutions, presented an important keynote during SEMICON West. His talk was titled Revolutionizing Semiconductor Collaboration: The Emergence of AI-Driven Industry Platforms.  

John pointed out that the semiconductor industry thrives on innovation, and this work is fueled by collaboration. But the nature of this collaboration is changing. The growing technology trend towards 3D and hybrid packaging is creating a larger and more complex global supply chain. He explained that the industry needs the ability to leverage AI at all levels to drive operational efficiency.

John used a couple of visuals to make his point. First, he illustrated the nature of collaboration driven by the current model, which is largely reactionary and uses simple linear handoffs by sharing relatively small amounts of data as shown below.

Traditional collaboration model

He went on to point out the substantial changes that technologies such as 3D are bringing to the collaboration model. Some of those drivers are summarized in the graphic below.

He explained that what is needed is a new, AI-driven collaboration model that is depicted in the graphic at the top of this post. He detailed the collaboration that will be required both within the enterprise and across the global supply chain. The many aspects of the required collaboration across the supply chain are shown in the graphic below.

John then distilled the key building blocks or foundational capabilities required to build the platform needed to enable AI-driven collaboration across the industry:

  • A secure data infrastructure
  • Automated orchestration
  • AI agents

He explained that PDF Solutions has already deployed a secure data infrastructure to enable global collaboration. It is called secureWISE™ and it connects over 300 manufacturing locations and manages exabytes of data between fabs and OEMs. Over 100 OEMs are connected, and the system is ISO 27001 compliant. He reported there have been zero security breaches in over 20 years. An impressive statistic.

John then went on to describe how secureWISE provides secure remote connectivity for the semiconductor industry. Technologies used include end-to-end secure access across private networks, virus scanning for secure equipment software updates, and granular user permissions for remote control and optimization. With this secure remote-connection data infrastructure, equipment OEMs are able to remotely execute software upgrades and use collected data to deliver AI-based, value-added equipment optimization services. For fab operators, connecting their equipment with secureWISE increases fab uptime and improves operational efficiency.

This secure global data infrastructure is the foundation on which global collaboration can be executed, relying on automated orchestrations that align and abstract data across the supply chain to accelerate decisions. This is achieved by:

  • Using actionable detailed manufacturing data to drive business decisions
  • Automating business processes within and across enterprises
  • Deploying and orchestrating AI agents to automate and accelerate decisions

With this approach business decisions can be made much faster, using more timely and accurate data coming from across multiple applications and organizations.

Examples include:

  • Product Costing: Accurate and up-to-date costing information based on actual resource consumption
  • Order Status: Real-time updates on order status and yield
  • Quality: Rapid identification and isolation of at-risk materials
  • Test Flow: Automated test flow management
  • WIP: Real-time WIP tracking and management across the supply chain

In most manufacturing organizations a very small amount of the collected manufacturing data is actually used for analysis. PDF Solutions estimates that at most 5% of manufacturing data is used for meaningful analytics. To fully leverage and scale the use of AI, semiconductor companies need to be able to automate much of the analytics process while enabling human governance of that AI execution. Aspects of this work include:

  • Enforce Data Quality: Humans define standards; AI implements data quality
  • Humans Set Collaboration Rules: To define collaboration principles, data sharing, and security boundaries
  • AI Executes at Scale: Operating autonomously within those boundaries, handling complex, high-volume tasks
  • Cross-Organization Governance: Allows fabs to define vendor access and data protocols and AI manages daily execution

One example John provided is guided analytics, a process that mines 100% of the data, 100% of the time, and automates up to 90% of the analysis. He went on to describe how seamless integration of data types is achieved, for example: hard bin, soft bin, parametric, PCM, test tools, and units per hour (UPH). This process delivers an issue-based flow that uses AI/ML-powered diagnosis, resulting in the ability to render data visualization in seconds.

John went on to describe many valuable and high-impact capabilities that are enabled by AI agents including:

  • Predictive test to predict future test results so test dollars can be spent on units that need the most attention
  • Predictive burn-in to predict which burn-in will pass to eliminate costly and unnecessary burn-in runs
  • Predictive binning to predict which tests will fail – finding failures early saves time and money

John concluded the story with an overall collaboration vision and a call to action for the industry to step up its work in this area. The vision John presented, which provides a pathway to greater worldwide semiconductor innovation, is summarized below.

PDF Solutions collaboration vision

To Learn More

The work presented at SEMICON West illustrates the significant progress PDF Solutions is making to revolutionize semiconductor collaboration. John’s keynote underscored the importance of worldwide adoption of these new strategies. You can access John Kibarian’s keynote on the PDF Solutions website. You can learn more about what PDF is doing on SemiWiki here.  You can also explore the suite of integrated products offered by the company here.  And that’s how PDF Solutions calls for a revolution in semiconductor collaboration at SEMICON West.

Also Read:

PDF Solutions Adds Security and Scalability to Manufacturing and Test

PDF Solutions and the Value of Fearless Creativity

Podcast EP259: A View of the History and Future of Semiconductor Manufacturing From PDF Solution’s John Kibarian


The AI PC: A New Category Poised to Reignite the PC Market

by Jonah McLeod on 10-20-2025 at 10:00 am


The PC industry is entering its most significant transformation since the debut of the IBM PC in 1981. That original beige box ushered in a new era of productivity, reshaping how corporations and individuals worked, communicated, and created. More than four decades later, the AI PC is emerging as a new category — one that promises to reignite growth in a market that has otherwise plateaued. Where the IBM PC democratized computing power for spreadsheets, word processing, and databases, the AI PC integrates machine intelligence directly into the device, enabling capabilities once reserved for cloud data centers.

“If the IBM PC made computers personal, the AI PC makes them perceptive”

What Defines an AI PC

An AI PC isn’t just another laptop or desktop with more cores or a faster GPU. At its heart lies a dedicated Neural Processing Unit (NPU) or an equivalent accelerator, designed to handle machine learning and inference tasks efficiently. Apple was among the first to bundle AI capability into all its Macs via the Neural Engine in its M-series silicon. In 2025, nearly all Macs shipped qualify as AI PCs by default. On the Windows/x86 side, Intel and AMD are racing to deliver NPUs in their latest laptop platforms, though only about 30% of PCs shipping this year meet the ‘AI PC’ definition. Meanwhile, RISC-V vendors are entering the scene with experimental AI PCs, such as DeepComputing’s DC-ROMA II, proving that even open architectures are chasing this category.

This hardware shift is paired with software integration. AI PCs promise not just raw horsepower but contextual, on-device intelligence. They run large language models (LLMs), generative tools, transcription, translation, and real-time personalization — all locally, without depending exclusively on the cloud.

The IBM PC Parallel

The IBM PC, released in 1981, was revolutionary not because of its raw specs — an Intel 8088 processor, 16KB of RAM, and two floppy drives hardly impress today — but because of its timing and positioning. IBM gave business managers and knowledge workers their first taste of personal productivity at scale. VisiCalc and Lotus 1-2-3 spreadsheets became corporate staples, while WordPerfect transformed document workflows. The IBM PC became the catalyst for office automation and, eventually, the rise of the information economy.

The AI PC in 2025 carries a similar inflection point. Just as the IBM PC allowed managers to manipulate numbers without waiting for mainframe operators, the AI PC gives today’s professionals the power to run generative models, analyze vast data sets, and automate creative tasks directly from their desks. Where the IBM PC shifted power from IT departments to individual workers, the AI PC shifts power from centralized cloud servers back to the personal device.

How Apple Got the Jump on the x86 Crowd

Apple’s head start in AI PCs is the product of long-term bets that Intel and AMD were late to match. Apple began shipping a Neural Engine in iPhones as early as 2017. By the time the M1 arrived in Macs in 2020, Apple already had multiple generations of AI silicon in production. Intel’s first true NPU platform, Meteor Lake, didn’t appear until 2023, and AMD’s Ryzen AI chips landed around the same time.

Vertical integration gave Apple another edge. Because it controls silicon, operating system, and frameworks like Core ML and Metal, Apple can route workloads seamlessly across CPU, GPU, and NPU. The x86 ecosystem, fragmented between Microsoft, Intel, AMD, and dozens of OEMs, could not move nearly as fast.

Apple’s unified memory architecture offered a further advantage, eliminating costly data transfers that plague PCs where CPU and GPU have separate memory pools. And Apple made AI consumer-friendly: Apple Intelligence features in Mail, Notes, and Photos gave users visible, everyday value. Windows AI PCs, by contrast, still lean heavily on Microsoft’s Copilot features, many of which depend on cloud services.

By 2025, Apple had made AI hardware and features standard across its lineup. All Macs shipped are AI PCs, while only about a third of x86 PCs qualify.

The Day Apple Changed the Rules

The pivotal moment came at Apple’s Worldwide Developers Conference (WWDC) in June 2020. On stage, Tim Cook described it as a ‘historic day for the Mac’ and announced the company’s transition to Apple-designed silicon. He framed it as the next evolution in Apple’s decades-long control of its own hardware and software stack.

Johny Srouji, Apple’s Senior Vice President of Hardware Technologies, then took the stage to explain the architectural foundations of what would later be branded the M1. He described how Apple’s silicon journey — from iPhone to iPad to Apple Watch — had matured into a scalable architecture capable of powering Macs. Srouji highlighted the unification of CPU, GPU, and Neural Engine within a single system-on-chip, the adoption of a unified memory architecture, and a focus on performance per watt as the key to efficiency. He emphasized that Apple’s decision to design its own SoCs specifically for the Mac would give the platform a leap in power efficiency, security, and AI capability.

This keynote framed the M1 not just as a chip, but as the product of a strategic pivot — one that caught the x86 world by surprise and set Apple years ahead in the race to define the AI PC.

Inside Apple’s Secret AI Weapon

The Neural Processing Unit (NPU) in every modern Mac isn’t an ARM building block — it’s an Apple original design. Apple licenses the ARM instruction set for its CPUs but develops its own CPU, GPU, and NPU cores.

The Neural Engine debuted with the A11 Bionic in 2017, years before ARM’s Ethos NPU line, and has scaled up steadily to handle tens of trillions of operations per second. It was conceived and built in-house by Apple’s silicon team under Johny Srouji, tailored specifically for machine-learning inference. In the M1, the Neural Engine delivered 11 TOPS, while in the M3 it exceeds 35 TOPS.

The GPU inside the M1 was also a clean Apple design. Until the A10 generation, Apple licensed PowerVR graphics cores from Imagination Technologies, but starting with the A11 Bionic in 2017, Apple switched to its own custom GPU microarchitecture. The M1’s 8-core GPU, capable of 2.6 TFLOPS, is part of this lineage.

Taken together, Apple’s CPU, GPU, and Neural Engine represent a vertically integrated architecture — all custom, all Apple — with only the instruction set itself licensed from ARM. This tight ownership is what allows Apple to optimize across hardware and software, and it explains why the M-series has been able to leapfrog x86 designs in efficiency and AI capability.

Birth of The AI PC

Apple didn’t set out to build the M1 in 2010, but its path toward full integration began to take shape years earlier. When the company introduced its first in-house Neural Engine and custom GPU alongside its homegrown CPU in the A11 Bionic (2017), it effectively unified all three pillars of modern computation under one roof. From that point on, Apple’s silicon roadmap evolved with a clear long-term goal: to bring these independently perfected engines — CPU, GPU, and Neural Engine — onto a single fabric with shared memory and software control. The M1, unveiled in 2020, was the culmination of that decade-long convergence, transforming what had started as separate mobile components into a cohesive architecture optimized for both performance and efficiency across the entire Mac lineup.

While NVIDIA’s “aha moment” came in 2012, when its GPUs unexpectedly became the workhorses of the deep learning revolution, Apple was on a parallel but opposite trajectory. NVIDIA was scaling up for the cloud, harnessing GPU clusters for AI training, while Apple was scaling down — embedding intelligence directly into personal devices. Both arrived at AI through performance innovation rather than foresight: NVIDIA by discovering that its gaming chips excelled at matrix math, and Apple by realizing that machine learning could make mobile experiences smarter, faster, and more private. The convergence of these two paths — one born in the data center, the other in the palm of the hand — defined the modern era of AI computing.

Competing Compute Models—Apple vs. RISC-V

Apple’s M-series SoCs represent a tightly integrated approach. At the CPU level, Apple uses a combination of high-performance and efficiency cores, delivering strong single-thread performance while managing orchestration tasks at low power. Its GPU, designed in-house, is a tile-based architecture that handles both graphics and general-purpose compute. For AI-specific workloads, Apple includes its Neural Engine, a dedicated block capable of roughly 35 TOPS in its latest Macs, deeply integrated with Core ML and Apple Intelligence. Together, these components form a unified architecture with shared memory, which eliminates costly data transfers and optimizes performance for consumer-facing AI applications.

RISC-V vendors, by contrast, take a modular approach based on scalar, vector, and matrix engines. Scalar cores serve as the foundation for control and orchestration, while the RISC-V Vector Extension (RVV 1.0) provides scalable registers ranging from 128 to 1024 bits, ideal for SIMD tasks like convolution, dot-products, and signal processing. For high-intensity AI workloads, matrix engines (MMA) accelerate tensor math in formats such as INT8, FP8, and FP16, targeting operations like GEMM and transformer attention. Rather than a closed design, this modular architecture allows vendors to tailor solutions to specific needs and even adopt chiplet-based scaling across regions and partners.
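The vector-length-agnostic style that RVV 1.0 enables can be illustrated with a toy strip-mining loop. This plain-Python sketch is purely illustrative (the function name and the register widths chosen are my own assumptions, not any vendor's API): the same loop body adapts to whatever vector register width the hardware provides, which is why one binary can span 128-bit and 1024-bit implementations.

```python
def rvv_style_dot(a, b, vlen_bits=256, elem_bits=32):
    """Dot product written in the RVV strip-mining idiom (toy model)."""
    vlmax = vlen_bits // elem_bits   # elements that fit in one vector register
    acc, i, n = 0, 0, len(a)
    while i < n:
        vl = min(vlmax, n - i)       # models vsetvli: hardware picks the active length
        acc += sum(x * y for x, y in zip(a[i:i+vl], b[i:i+vl]))
        i += vl                      # advance by however many elements were processed
    return acc
```

The result is identical whether the modeled register is 128 or 1024 bits wide; only the number of loop iterations changes, which is the portability property the modular RISC-V approach relies on.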

The contrast between Apple and RISC-V is sharp. Apple delivers a closed but seamless integration, while RISC-V offers openness, extensibility, and flexibility. Apple’s ecosystem is polished and tightly controlled, with Core ML and Apple Intelligence driving user-facing features. RISC-V, on the other hand, still relies on maturing toolchains like LLVM, ONNX, and TVM, but its open model makes it attractive for sovereign compute initiatives and experimental AI PCs. Apple scales its approach within its own lineup, while RISC-V enables innovation across global vendors, offering a pathway that Apple’s ecosystem simply does not allow.

Why the x86 Crowd Got Caught Flat-Footed

The x86 ecosystem misjudged both the timing and the scope of the AI PC transition. For years, Intel and AMD assumed that AI workloads would remain concentrated in data centers and GPUs rather than extending into consumer PCs. Their product roadmaps and marketing focused on server accelerators and high-performance gaming graphics, while NPUs for laptops were treated as an afterthought.

This miscalculation was compounded by the industry’s preoccupation with the data center boom. As hyperscalers poured billions into AI infrastructure, chipmakers directed their attention toward GPUs and server CPUs, chasing growth where the margins were highest. In doing so, they overlooked the parallel opportunity Apple had identified — embedding AI into personal devices where privacy, latency, and convenience are paramount.

By the time Intel released Meteor Lake in 2023 and AMD introduced Ryzen AI, Apple already had a three-year head start in consumer AI integration. Microsoft’s Copilot+ PC initiative in 2024 only underscored how reactive the x86 response had become. Moreover, Intel’s manufacturing struggles and AMD’s limited focus on NPUs slowed their ability to pivot, while power efficiency remained a glaring weakness. Apple could deliver hours of local LLM performance on battery, something x86 laptops could not match without resorting to power-hungry discrete GPUs.

Ultimately, the fixation on data centers blinded x86 vendors to the rise of the AI PC. Apple exploited this gap decisively, while RISC-V vendors now see an opportunity to carve out their own space with modular, integrated solutions that offer an open alternative to both Apple and x86.

NVIDIA’s Head is in The Cloud

NVIDIA dominates AI in the cloud and remains the gold standard for GPU-accelerated training and inference. Its CUDA ecosystem, TensorRT optimizations, and developer lock-in make it indispensable in enterprise and data center environments. But the AI PC revolution is shifting focus toward efficient, always-on AI computing at the edge, and here NVIDIA plays a more complicated role.

On the one hand, NVIDIA powers some of the most capable ‘AI PCs’ today. Discrete RTX GPUs deliver blazing inference speeds for LLMs and generative models, and Microsoft has partnered with NVIDIA to brand ‘RTX AI PCs.’ For power users and creators, an x86 laptop with an RTX 4090 GPU can churn out tokens per second far beyond what Apple’s Neural Engine can achieve.

On the other hand, NVIDIA’s model depends on discrete GPUs that consume far more power than Apple’s integrated NPUs or the NPUs Intel and AMD are embedding into CPUs. AI PCs are not just about raw throughput — they are about balancing performance with efficiency, portability, and battery life. Apple has made AI capability universal across its product line, while NVIDIA’s approach is tied to high-end configurations.

This leaves NVIDIA both central and peripheral. Central, because any developer serious about AI still needs NVIDIA for training and high-performance inference. Peripheral, because the AI PC category is being defined around integrated NPUs, not discrete GPUs. If ‘AI PC’ comes to mean a lightweight laptop with always-on AI features, NVIDIA risks being left out of the mainstream narrative, even as it continues to dominate the high end.

Apple’s Big Mac: The Comeback Story

After two decades in the iPhone’s shadow, the Mac is regaining its relevance, not as a nostalgia act but as the world’s first mass-market AI PC. The AI PC wave also reshapes Apple’s internal dynamics. The iPhone has been Apple’s growth engine for nearly two decades, with revenue of more than $200 billion annually while the Mac hovered around $25–30 billion, and the company’s focus, culture, and ecosystem tilted toward mobile.

But smartphones are now a mature market. Global shipments have flattened, replacement cycles have lengthened, and Apple increasingly leans on services for growth. The iPhone remains indispensable, but its role as the company’s primary growth driver is fading.

The Mac, reborn as the AI PC, offers Apple a chance to regain strategic balance. AI workloads — text generation, media editing, data analysis — naturally fit the desktop and laptop form factor, not the smartphone. Apple Intelligence on the Mac positions it as an AI hub for professionals, creators, and students, in ways the iPhone cannot match due to thermal and battery constraints.

This doesn’t mean the Mac will replace the iPhone. Instead, Apple could emerge as a two-pillar company: the iPhone for mobility, and the Mac for intelligence. For the first time in 20 years, the Mac may outpace the iPhone in growth, reclaiming relevance and giving Apple a new narrative.

Why the Market Needs Reignition
AI PC Shipments, Worldwide, 2023-2025 (Thousands of Units)

                      2023 Shipments   2024 Shipments   2025 Shipments
AI Laptops                    20,136           40,520          102,421
AI Desktops                    1,396            2,507           11,804
AI PC Units Total             21,532           43,027          114,225

Source: Gartner (September 2024)

The global PC market has stagnated for years, with shipments hovering between 250 and 300 million units annually. Upgrades slowed as performance improvements became incremental, and consumers extended replacement cycles. But AI is creating a new reason to buy. Gartner projects AI PC shipments will grow to about 114 million units in 2025 — a 165 percent increase over 2024 — representing more than 40 percent of the entire PC market. That figure is expected to rise sharply as AI features become standard in both macOS and Windows, echoing how spreadsheets once drove mass PC adoption.
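The 165 percent figure follows directly from the totals in the table above; a quick arithmetic check:

```python
# Gartner AI PC totals (thousands of units), taken from the table above.
total_2024 = 43_027
total_2025 = 114_225

growth_pct = (total_2025 - total_2024) / total_2024 * 100
print(f"{growth_pct:.0f}% increase over 2024")   # → 165% increase over 2024
```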

Apple has tightly coupled its Apple Intelligence software features with the Neural Engine in its Macs, positioning every new Mac as an AI PC. Microsoft is building its Copilot assistant into Windows, with hardware requirements that virtually guarantee demand for NPUs in next-generation x86 machines. Even RISC-V, still a nascent player in consumer computing, is positioning AI PCs as a proving ground for its open ISA.

Corporate and Individual Drivers

The IBM PC spread through corporations first. Executives and managers demanded their own productivity machines, which soon became indispensable for day-to-day decision-making. A similar pattern is emerging with AI PCs. Corporations now view them as essential tools for efficiency, where sales teams can generate proposals on demand, analysts can automate reporting, and creative departments can accelerate design and media production. Buying an AI PC for every employee is becoming the new baseline for productivity, much like issuing a PC to every manager was mandatory in the 1980s.

At the same time, individuals are also driving adoption. Students, freelancers, and creators are eager to run local language models for research, content generation, and coding without being tethered to cloud subscriptions. Emerging players such as DeepComputing, in partnership with Framework, are helping expand access to RISC-V–based AI PCs designed for developers and open-source enthusiasts who want full control over their hardware and software stack. Just as early home PCs became invaluable to small business owners and families, today’s AI PCs are rapidly evolving into indispensable personal assistants.

Challenges and Opportunities

As with the IBM PC era, the rollout of the AI PC comes with challenges. Software ecosystems must catch up, ensuring that frameworks like PyTorch, TensorFlow, and ONNX can fully exploit NPUs across different architectures. Pricing remains a consideration as well. Macs with integrated NPUs begin at around $1,099, while x86-based AI PCs often carry higher costs, and RISC-V systems remain experimental and relatively expensive.

Despite these hurdles, the opportunities are far greater. The AI PC offers a compelling reason to refresh hardware and injects new vitality into a stagnant market. It has the potential to redefine productivity just as spreadsheets once did in the 1980s. The modern equivalent of the spreadsheet ‘killer app’ may well be the personal AI assistant — a ubiquitous capability that transforms how individuals and corporations alike work, learn, and create.

“Copilot isn’t the killer app the AI PC needs. That breakthrough will come when on-device AI stops looking to the cloud—and starts thinking for itself.”

Conclusion

The AI PC in 2025 echoes the IBM PC in 1981: a new category that redefines what personal computing means and who benefits from it. The IBM PC turned the desktop into a productivity hub. The AI PC transforms it into a creativity and intelligence hub.

Apple is the clear frontrunner in this takeoff. Years of NPU integration, vertical stack control, unified memory, and seamless software have given it a commanding lead. The x86 vendors were caught flat-footed not only by ecosystem fragmentation and roadmap delays, but also by their tunnel vision on data centers. NVIDIA, meanwhile, remains the giant in cloud AI and the supplier of the fastest PC accelerators, but it risks being sidelined in the volume AI PC market if integrated NPUs become the standard definition.

For Apple, this represents more than just an industry lead. It signals the Mac’s return as a growth engine at a time when the iPhone’s dominance is beginning to plateau. If history is any guide, this shift will not just reinvigorate PC shipments but will reshape the role of the computer in society — making the AI PC the defining tool of the next era in computing.

Also Read:

GlobalFoundries, MIPS, and the Chiplet Race for AI Datacenters

Yuning Liang’s Painstaking Push to Make the RISC-V PC a Reality

Rapidus, IBM, and the Billion-Dollar Silicon Sovereignty Bet


The Rise, Fall, and Rebirth of In-Circuit Emulation: Real-World Case Studies (Part 2 of 2)

The Rise, Fall, and Rebirth of In-Circuit Emulation: Real-World Case Studies (Part 2 of 2)
by Lauro Rizzatti on 10-20-2025 at 6:00 am


Recently, I had the opportunity to speak with Synopsys’ distinguished experts in speed adapters and in-circuit emulation (ICE). Many who know my professional background see me as an advocate for virtual, transactor-based emulation, so I was genuinely surprised to discover the impressive results achieved by today’s speed adapters, which are critical to validating systems in their actual use environments.

In this article, I share what I learned. While confidentiality prevents me from naming customers, all the examples come from leading semiconductor companies and major hyperscalers across the industry using ZeBu® emulation or HAPS® prototyping together with Synopsys speed adapters. As you read through the article, you can refer to the following diagram of the Synopsys Speed Adapter Solution:

Figure 1: Deployment diagram of Synopsys’ Speed Adapter Solution and System Validation Server (SVS)

Case Study #1: The Value of Combining Fidelity and Flexibility in System Validation

The Challenge

A major fabless semiconductor company adopted virtual platforms as well as Hardware-Assisted Verification (HAV) platforms to accelerate early software development and design verification.

The company’s operations were organized around three distinct business units, each responsible for designing its own silicon independently. Each unit selected a different major EDA vendor for its virtual host solution platform. At first glance, such a multi-vendor setup might seem fragmented, but because virtual platforms are generally built on similar architectural blueprints, the approach still resulted in a verification environment that was consistent and standardized across all three BUs.

Alongside these virtual setups, the engineering teams also adopted In-Circuit Emulation (ICE). Here again, they diversified their tools, sourcing speed adapters and emulation from two of the three major EDA vendors. This allowed them to carry out system-level testing, interfacing the emulated environments with real hardware components to validate behavior under realistic conditions.

During a critical design milestone, a senior VP overseeing design verification mandated a cross-platform validation initiative: swap designs and tools across BUs, validate that silicon from each BU worked on all vendors’ platforms, and uncover hidden inconsistencies before tape-out.

The mandate required running each BU’s design on all three virtual host platforms and on both ICE setups to ensure environment independence.

That’s when the surprise hit! One design passed flawlessly on all three virtual host platforms. It passed on one of the ICE platforms, but it failed on the other ICE platform, halting system boot entirely. The immediate suspicion fell on the speed adapter. The design team escalated the issue to the EDA vendor’s ICE experts for root-cause analysis.

The Solution

The EDA vendor’s ICE team dug deep into the logs and waveform traces and found the real culprit. It wasn’t the adapter. It was a bug in the DUT’s RTL.

This RTL flaw had escaped all three virtual platforms because they lack low-level system behavior modeling. It also escaped one of the ICE setups, whose lower-fidelity implementation masked it, and surfaced only on the higher-fidelity ICE platform, which accurately mirrored real server behavior.

In real-world server systems, three critical hardware/software layers interact simultaneously. From the bottom up, the layers are:

  1. Motherboard chipsets, including PCIe switches, bridge chips, and other supporting silicon
  2. BIOS, handling low-level system initialization and configuration
  3. Operating System (OS), such as Linux or Windows, running on top

Virtual host platforms typically simulate only the OS layer using a virtual machine approach (often QEMU-based). The BIOS is minimally represented, and chipset behavior is completely abstracted away.

On the high-fidelity ICE platform, however, a real Intel server board was connected through the speed adapter. During boot, this Intel chipset correctly issued a Vendor Defined Message (VDM) over PCIe, a standard behavior in many production Intel servers, but not modeled at all in virtual platforms. Upon receiving this VDM, the DUT incorrectly dropped the packet instead of updating the PCIe flow control. This caused a deadlock during system boot. There was no software workaround, the only solution was to fix the RTL before tape-out.

Results

If undetected, the chip would have failed in every server deployment, resulting in a dead-on-arrival product. Detecting the bug pre-silicon saved the company a multi-million-dollar re-spin and months of schedule delay. The incident demonstrated why virtual environments are critical for finding bugs early, while high-fidelity in-circuit setups are necessary for final confidence in the design.

Case Study #2: ICE Delivers Superior Throughput

The Challenge

When designing peripheral interface products, engineering teams often rely on virtual solutions for early verification. While virtual environments can model a protocol controller, they cannot accurately represent the physical (PHY) layer.

In these virtual models, the PHY is removed and replaced by a simplified “fake” model that allows software to perform basic register programming but does not support link training, equalization, or true electrical signaling. As a result, link training may appear to succeed because the model “assumes” compliance. Subtle issues like timing mismatches, equalization failures, and signal integrity problems remain hidden until late post-silicon testing. Testing real-world interoperability, especially with diverse third-party hardware and drivers, is not possible.

A leading hyperscaler faced significant challenges because of this drawback. Early in their design cycles, they faced months-long delays just to program and train PHYs, pushing crucial bug discovery into expensive post-silicon stages.

The Solution

To overcome these challenges, they adopted Synopsys Speed Adapters to bring PHYs into the emulation environment.

With this approach, PHYs are physically connected to the emulator through speed adapter boards. These boards support full programming, training, and link initialization just as they would on silicon.

This integration effectively turns the emulation environment into a true In-Circuit Emulation (ICE) platform, combining the speed and visibility of pre-silicon emulation with the physical accuracy and interoperability of real-world hardware.

Examples of Impact

PCIe Gen5 Interface

  • In a virtual setup, a Gen5 device’s link training sequences seemed successful.
  • When tested via a speed adapter and a PHY, the customer uncovered critical timing mismatches and equalization failures that would have escaped virtual verification.
  • Catching these issues pre-silicon avoided a potential costly silicon re-spin and ensured full compliance with Gen5 specs.

UFS Storage Interface

  • A UFS host controller passed functional tests in a virtual model.
  • When connected to a real UFS PHY through a speed adapter, engineers discovered clock misalignments, burst mode instabilities, and data corruption under stress conditions—problems rooted in real signaling, invisible in virtual models.
  • Early detection improved system reliability and ensured compliance with JEDEC standards.

Driver Interoperability Testing

  • In root complex mode, different GPUs (NVIDIA, AMD, Intel) each use different drivers and optimizations.
  • Virtual environments cannot test these real drivers because they require a physical interface.
  • Speed adapters allowed full driver stacks against real devices, exposing errata and interoperability bugs that virtual models could never catch.

Results

What previously took four months to program the PHY, plus up to six months to train it post-silicon, was accomplished pre-silicon in a couple of weeks. This was possible because speed adapters ran real workloads, enabling rapid design iterations and faster bring-up cycles. Another benefit was improved debug and reuse: the same PHY configuration trained pre-silicon could be directly reused post-silicon, accelerating bring-up.

Case Study #3: Ethernet Product Validation

Challenge

When developing advanced Ethernet products—such as ultra-Ethernet, smart NICs, or intelligent switches—engineers face a recurring challenge: how to bring real software traffic into the Ethernet validation environment.

Virtual environments offer partial solutions. Virtual tester generators (VTG) offer low-level packet traffic (Layer 2, Layer 3) but do not exercise the application software stack. Virtual hosts (VHS) allow software interaction but lack flow-control capabilities. Without flow control, packets are dropped, an unacceptable limitation for validation environments where fidelity and determinism are critical.

As a result, traditional virtual environments are either incomplete (VTG) or not fully reliable from a traffic-control perspective (VHS). This gap left design teams without a way to fully validate Ethernet products across all protocol layers—especially the higher layers (L4–L7) that depend on real drivers, operating systems, and diagnostic software.

Solution

Ethernet speed adapters provide the missing link by bridging virtual test environments with real software execution over Ethernet.

Unlike VHS, speed adapters guarantee zero packet drops, delivering deterministic performance even under high traffic. Virtual testers (e.g., from Ixia or Spirent) remain useful for low-level (layer 2/3) functional validation. Speed adapters enable execution of real drivers and Linux-based diagnostic tools that testers cannot emulate. Together, virtual testers and speed adapters form a complete solution spanning all Ethernet layers.
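The zero-drop guarantee rests on backpressure: with flow control, a sender stalls rather than overrun the receiver’s buffer. A toy Python model (purely illustrative; `run_link` and its parameters are my own names, not a Synopsys API) contrasts the two behaviors:

```python
from collections import deque

def run_link(n_packets, buffer_size, drain_every, flow_control):
    """Sender offers one packet per tick; the receiver drains one packet
    every `drain_every` ticks from a buffer of `buffer_size` slots."""
    buf, dropped, tick, to_send = deque(), 0, 0, n_packets
    while to_send or buf:
        tick += 1
        if to_send:
            if len(buf) < buffer_size:
                buf.append(tick)          # slot free: packet accepted
                to_send -= 1
            elif flow_control:
                pass                      # backpressure: sender stalls, no loss
            else:
                dropped += 1              # no backpressure: packet is lost
                to_send -= 1
        if tick % drain_every == 0 and buf:
            buf.popleft()                 # receiver consumes a packet
    return dropped
```

With `flow_control=True` the slow receiver never loses data because the sender simply waits, while `flow_control=False` reproduces the drop behavior the article attributes to VHS setups.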

For startups or budget-constrained customers, speed adapters can stand in for more expensive virtual tester licenses, providing equivalent packet generation and analysis at lower cost. Free and open-source test generators can also be layered on top of a speed adapter to replicate tester functions.

Results

In practice, this hybrid approach has enabled customers to validate real software stacks against hardware under development without packet loss, to catch design bugs that only appear in higher protocol layers (issues that purely virtual test environments cannot expose), and to scale affordably by combining limited VTG licenses with speed adapters to achieve full test coverage.

Case Study #4: Real-World Sensor Interoperability with MIPI Speed Adapters

The Challenge

The Display and Test Framework (DTF) team of a major fabless enterprise faced a recurring and costly problem. They needed to validate their chip design against real MIPI-based image sensors and cameras. However, in a virtual emulation environment, this was impossible because virtual models can mimic protocol behavior but cannot replicate real sensor electrical signaling or timing. Vendor-specific cameras and sensors each have unique initialization sequences, timing quirks, and signal integrity characteristics that no generic virtual model can capture. When first silicon returned from the fab, it frequently failed to interface with the intended cameras and sensors, leading to long bring-up efforts or even full silicon re-spins.

This limitation created a significant time-to-market bottleneck. By the time hardware compatibility issues could be found, the design had already gone through costly fabrication, delaying product launches.

The Solution

To eliminate this bottleneck, the company adopted MIPI speed adapters to enable ICE with real sensor hardware. Using this approach, the chip design running inside the emulator could be directly connected to real, vendor-specific MIPI cameras and image sensors. Engineers could exercise full initialization, configuration, and data streaming paths just as they would on physical silicon. The setup supported easy swapping of different sensors and camera models, enabling rapid interoperability testing across vendors.

This capability gave the DTF team the real-world coverage they needed in pre-silicon, without waiting for chips to return from the fab.

Results

The design was successfully tested with the exact vendor-specific camera and sensor models planned for production. By catching integration issues pre-silicon, the enterprise avoided costly design re-spins caused by post-silicon bring-up failures. Removing the post-silicon camera/sensor debug cycle accelerated overall product schedules. Finally, the team could sign off knowing the design was already proven with real-world peripherals.

Case Study #5: How Synopsys’ System Validation Server (SVS) Caught Critical Bugs Missed by other solutions

The Challenge

Pre-silicon validation using ICE has historically faced a critical obstacle. Standard off-the-shelf host servers are not designed to tolerate the slow or intermittent response times of an emulator. When the emulator clock stalls or slows, the host often times out, aborting the test run.

This customer’s silicon validation team encountered this limitation firsthand. The commercial emulation host server they used for ICE did not enforce strict real-world timing and protocol checks, risking that flawed designs would pass pre-silicon signoff only to fail later in production.

The Solution

To overcome these limitations, the customer’s validation team adopted Synopsys’ System Validation Server (SVS) as their host system for ICE validation. SVS is a specialized, pre-validated host machine designed specifically to work with speed adapters and emulators. It offers two major advantages over generic hosts or legacy commercial host server setups. First, the SVS ships with a custom BIOS engineered to tolerate the slow response times of emulators, eliminating timeouts that can otherwise terminate validation runs prematurely. Second, SVS faithfully mimics the system the DUT will eventually plug into, including enforcing strict specification compliance, especially for complex subsystems like PCIe. See Figure 1 at the top of the article.

The validation team tested their design on both 3rd party emulation hardware and Synopsys’ SVS. Using the 3rd party emulation, the system booted successfully, but on SVS, the boot failed completely. Initially, the engineers suspected a hardware fault in SVS. As they put it: “Your SVS is broken while the other guys work fine.”

However, after a detailed debug session, it emerged that their DUT contained configuration errors in PCIe configuration-space registers. The 3rd party emulation solution and host server had masked these errors because they used an outdated BIOS that failed to enforce PCIe register constraints. By contrast, SVS strictly enforced the PCIe specification and correctly rejected the illegal register values. The bug could not be fixed in firmware; no software patch could correct it.

Results

SVS exposed an RTL-level configuration bug that virtual flows and another emulation solution missed. It also eliminated timeout instability, by virtue of its modified BIOS, allowing stable, long-duration tests.

SVS ensured that only spec-compliant designs advanced to tape-out, eliminating false positives from legacy flows.

Had the design been taped out based on 3rd party emulation “pass,” the silicon would have been DOA, requiring a full, costly respin.

Conclusion

Back in 2015, I wrote “The Melting of the ICE Age” for Electronic Design, where I predicted the demise of in-circuit emulation (ICE). Its numerous drawbacks (see Part 1 of this series) seemed to doom it to history, replaced by transaction-based emulation and, later, hybrid approaches that drove the shift-left verification methodology. In hindsight, I must admit I underestimated the ingenuity and resourcefulness of the engineering community.

Today, the third generation of speed adapters has propelled ICE again into the limelight of system-level validation. Bugs once detectable only in post-silicon labs can now be identified pre-silicon. This capability not only reduces the risk of re-spins but also accelerates time-to-tapeout and saves enormous expense. Far from melting away, ICE has re-emerged as a cornerstone of system-level verification.

The Rise, Fall, and Rebirth of In-Circuit Emulation (Part 1 of 2)

Also Read:

Statically Verifying RTL Connectivity with Synopsys

Why Choose PCIe 5.0 for Power, Performance and Bandwidth at the Edge?

Synopsys and TSMC Unite to Power the Future of AI and Multi-Die Innovation


CMOS 2.0 is Advancing Semiconductor Scaling

CMOS 2.0 is Advancing Semiconductor Scaling
by Daniel Nenni on 10-19-2025 at 10:00 am

CMOS 2.0

In the rapidly evolving landscape of semiconductor technology, imec’s recent breakthroughs in wafer-to-wafer hybrid bonding and backside connectivity are paving the way for CMOS 2.0, a paradigm shift in chip design. Introduced in 2024, CMOS 2.0 addresses the limitations of traditional CMOS scaling by partitioning a system-on-chip (SoC) into specialized functional tiers. Each tier is optimized for specific needs—such as high-performance logic, dense memory, or power efficiency—through system-technology co-optimization (STCO). This approach moves beyond general-purpose platforms, enabling heterogeneous stacking within the SoC itself, similar to but more integrated than current 3D stacking of SRAM on processors.

Central to CMOS 2.0 is the use of advanced 3D interconnects and backside power delivery networks (BSPDNs). These technologies allow for dense connections on both sides of the wafer, suspending active device layers between independent interconnect stacks. At the 2025 VLSI Symposium, imec demonstrated key milestones: wafer-to-wafer hybrid bonding at 250nm pitch and through-dielectric vias (TDVs) at 120nm pitch on the backside. These innovations provide the granularity needed for logic-on-logic or memory-on-logic stacking, overcoming bottlenecks in compute scaling for diverse applications like AI and mobile devices.

Wafer-to-wafer hybrid bonding stands out for its ability to achieve sub-micrometer pitches, offering high bandwidth and low-energy signal transmission. The process involves aligning and bonding two processed wafers at room temperature, followed by annealing for permanent Cu-to-Cu and dielectric bonds. Imec has refined this flow, achieving reliable 400nm pitch connections by 2023 using SiCN dielectrics for better strength and scalability. Pushing further, simulations revealed non-uniform bonding waves causing wafer deformation, impacting overlay accuracy. By applying pre-bond litho corrections, imec reached 300nm pitch with <25nm overlay error for 95% of dies. At VLSI 2025, they showcased 250nm pitch feasibility on a hexagonal pad grid, with high electrical yield in daisy chains, though full-wafer yield requires next-gen bonding tools.

Complementing frontside bonding, backside connectivity enables front-to-back links via nano-through-silicon vias (nTSVs) or direct contacting. For CMOS 2.0’s multi-tier stacks, this allows seamless integration of metals on both sides, with BSPDNs handling power from the backside to reduce IR drops and decongest frontside BEOL for signals. Imec’s VLSI 2025 demo featured barrier-less Mo-filled TDVs with 20nm bottom diameter at 120nm pitch, fabricated via a via-first approach in shallow-trench isolation. Extreme wafer thinning maintains low aspect ratios, while higher-order lithography corrections ensure 15nm overlay margins between TDVs and 55nm backside metals. This balances fine-pitch connections on both wafer sides, crucial for stacking multiple heterogeneous layers like logic, memory, and ESD protection.

BSPDNs further enhance CMOS 2.0 by relocating power distribution to the backside, allowing wider, less resistant interconnects. Imec’s 2019 pioneering work has evolved, with major foundries adopting it for advanced nodes. DTCO studies show PPAC gains in always-on designs, but VLSI 2025 extended this to switched-domain architectures—relevant for power-managed mobile SoCs. In a 2nm mobile processor design, BSPDN reduced IR drop by 122mV compared to frontside PDNs, enabling fewer power switches in a checkerboard pattern. This yielded 22% area savings, boosting performance and efficiency.

These advancements, supported by the NanoIC pilot line and EU funding, bring CMOS 2.0 from concept to viability. By enabling heterogeneity within SoCs, they offer scalable solutions for the semiconductor ecosystem, from fabless designers to system integrators. As pitches scale below 200nm, collaboration with tool suppliers will be key to overcoming overlay challenges. Ultimately, high-density front and backside connectivity heralds a new era of compute innovation, meeting demands for performance, power, and density in an increasingly diverse application space.

Read the source article here.

Also Read:

Exploring TSMC’s OIP Ecosystem Benefits

Synopsys Collaborates with TSMC to Enable Advanced 2D and 3D Design Solutions

Advancing Semiconductor Design: Intel’s Foveros 2.5D Packaging Technology


Podcast EP311: An Overview of how Keysom Optimizes Embedded Applications with Dr. Luca TESTA

by Daniel Nenni on 10-17-2025 at 10:00 am

Daniel is joined by Luca TESTA, the COO and co-founder of Keysom. After studying microelectronics in Italy, Luca obtained his PhD in France while working with STMicroelectronics on analog/RF circuit design.

Dan explores the charter and focus of Keysom with Luca. Luca describes how Keysom is providing an automated and reliable way to create optimized, efficient 32-bit processors for embedded applications such as IoT and edge AI. He explains the challenges of using standard processors for applications that demand small area and low power. In these cases, on average 40% of the instructions in a standard processor are not used. He describes Keysom’s CoreXplorer tool that provides an easy and efficient way to develop a customized processor that fits the specific needs of an application.
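The unused-instruction observation above can be made concrete with a toy analysis. The sketch below is purely illustrative (the workload trace and ISA subset are hypothetical, and this is not Keysom's CoreXplorer methodology): it counts which instructions of a base RISC-V-style instruction set actually appear in an execution trace and reports the unused fraction, the kind of signal that motivates pruning a custom core.

```python
from collections import Counter

# Hypothetical RV32I-style base instruction set (mnemonics only, for illustration)
RV32I = {
    "add", "sub", "and", "or", "xor", "sll", "srl", "sra",
    "addi", "andi", "ori", "xori", "slli", "srli", "srai",
    "lw", "sw", "lb", "lh", "sb", "sh", "lui", "auipc",
    "beq", "bne", "blt", "bge", "bltu", "bgeu", "jal", "jalr",
}

def unused_fraction(trace, isa=RV32I):
    """Fraction of the ISA never observed in an execution trace."""
    used = set(Counter(m for m in trace if m in isa))
    return 1.0 - len(used) / len(isa)

# Toy trace from a hypothetical small embedded workload
trace = ["addi", "lw", "add", "sw", "beq", "jal", "addi", "lw"]
print(f"{unused_fraction(trace):.0%} of the base ISA is unused by this trace")
```

In a real flow the trace would come from profiling the target application; a high unused fraction is what makes the area and power reductions quoted below plausible.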

He describes real examples where 30% – 70% area reduction is achieved along with a 25% – 40% reduction in power. The approach uses a RISC-V architecture and ensures compatibility with the RISC-V ecosystem to create an optimized workflow. Luca goes on to describe additional benefits of Keysom’s approach and the company’s plans to expand sales and support in the US.

Contact Keysom

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview with Dr. Bernie Malouin Founder of JetCool and VP of Flex Liquid Cooling

by Daniel Nenni on 10-17-2025 at 8:00 am


Bernie Malouin is a technical professional with demonstrated experience from concept studies through system deployment. He has a strong track record working in dynamic environments, from highly complex, multi-million dollar development programs to deeply technical research projects. He founded JetCool Technologies after 8 years at MIT Lincoln Laboratory. There, he served as the Chief Engineer leading the technical development of a $100M+ airborne payload program for the US Government.

Tell us about your company?

JetCool, now part of Flex, designs and manufactures advanced liquid cooling for the world’s most demanding AI and HPC workloads. We spun out of MIT in 2019 with a mission to reinvent cooling at the chip level. In 2024, we joined Flex, a global leader in design, manufacturing, and supply chain, which has enabled us to scale faster and reach customers worldwide.

What sets us apart is performance and practicality. Our SmartPlate™ direct-to-chip technology outperforms leading liquid cooling solutions by more than 20%. It targets hotspots on the silicon, reducing thermal resistance and unlocking higher compute densities. Combined with Flex’s vertically integrated approach, where compute, power, and cooling are designed and validated together, we deliver rack-ready systems that lower power consumption, cut water use by up to 90%, and deploy seamlessly at global scale.

What problems are you solving?

AI has pushed compute demand, and heat, to unprecedented levels. Traditional air cooling simply can’t keep pace with today’s CPUs and GPUs, leaving data centers constrained by power, thermal limits, and sustainability pressures. JetCool, as a Flex company, solves this by delivering vertically integrated rack-level solutions that combine power distribution, liquid cooling, and system design from a single vendor. This integration reduces complexity, shortens deployment timelines, and ensures every component is validated to work together seamlessly.

Beyond technology, Flex provides a global warranty for its products, along with service and support. That means customers can scale AI infrastructure anywhere in the world with confidence, knowing their racks are supported end-to-end from design and manufacturing to deployment and ongoing operations.

What application areas are your strongest?

We excel in helping customers maximize compute in power-constrained environments. Our single-phase, direct-to-chip technology cools high-power AI and HPC processors more effectively than conventional cold plates, delivering up to 20% better performance while lowering total IT power use. We understand that every data center is at a different stage of adoption. That’s why our portfolio spans from more efficient self-contained air-cooled solutions to fully liquid-cooled racks. This gives customers a clear migration path—start with air efficiency gains, then scale to hybrid or full liquid cooling as density and power demands grow.

Our strong partnerships with leading colocation providers, including Equinix, Telehouse, and Sabey Data Centers, showcase this flexibility in action. Together, we’ve deployed solutions that allow customers to adopt liquid cooling on their terms, whether through incremental pilots or complete rack-level rollouts. With JetCool and Flex, they gain a partner who can meet them where they are today and help them scale for tomorrow.

What keeps your customers up at night?

Our customers don’t want to invest in expensive infrastructure this year only to see it become obsolete the next. With AI chips advancing at breakneck speed, they worry about stranded capacity—data centers built for yesterday’s processors that can’t support tomorrow’s. At the same time, power and sustainability pressures are mounting. Many regions are already at their grid limits, and operators are being asked to do more compute with less energy and water. This is where JetCool and Flex step in. Our insight allows us to design vertically integrated power and cooling solutions that help customers future-proof. We’re acting as a resource and an ally, ensuring our customers are prepared for the next wave of AI hardware and can scale confidently without constant reinvestment.

What does the competitive landscape look like and how do you differentiate?

Liquid cooling is no longer experimental; it’s becoming the standard for AI infrastructure. That said, not all solutions are equal. JetCool differentiates through precision. Our microconvective technology cools chips at their hottest points, reducing thermal resistance and improving performance per watt. As part of Flex, we combine this innovation with large-scale manufacturing, systems integration, and global service. With Flex, customers get a fully validated, warrantied solution from a single partner.

What new features/technology are you working on?

We’re pushing toward the 1MW rack. That means not just higher-capacity cold plates, but rack-level solutions that integrate cooling distribution, power management, and monitoring. We’re also advancing smart sensing and telemetry, enabling operators to see and control cooling performance in real time. And at the silicon level, we’re collaborating with chipmakers to co-design next-generation cooling interfaces that reduce thermal bottlenecks from the start.

How do customers normally engage with your company?

We meet customers wherever they are in their liquid cooling journey. Some start with pilot deployments in a single row; others adopt our rack-ready systems through OEM partners like Dell. Because Flex can integrate, validate, and ship fully configured solutions, we simplify what has traditionally been a complex, multi-vendor process. Customers gain confidence knowing they’re supported end-to-end—from design through deployment and ongoing service.

Also Read:

CEO Interview with Gary Spittle of Sonical

CEO Interview with David Zhi LuoZhang of Bronco AI

CEO Interview with Jiadi Zhu of CDimension 


Webinar – The Path to Smaller, Denser, and Faster with CPX, Samtec’s Co-Packaged Copper and Optics

by Mike Gianfagna on 10-17-2025 at 6:00 am


For markets such as data centers, high-performance computing, networking, and AI accelerators, the battle cry is often “copper is dead”. The tremendous demands for performance and power efficiency often lead to this conclusion. As is the case with many technology topics, things are not always the way they seem. It turns out a lot of the “copper is dead” sentiment has to do with the view that it’s a choice of either copper or optics. In such a situation, optical interconnect will win.

But what if copper and optics could be integrated and managed together on one platform? It turns out there are many short-reach applications where copper is superior. The ability to achieve this co-technology integration at advanced 224G speeds was the topic of a recent webinar from Samtec. If you struggle with the negative ramifications of “copper is dead”, you’ll want to see the webinar replay. More details are coming, but first let’s examine the path to smaller, denser, and faster with CPX, Samtec’s co-packaged copper and optics solution.

You can view the webinar here.

Who’s Speaking

Matt Burns

The quality of a webinar, especially a live one, is heavily influenced by the quality of the speaker. In this case, everyone is in good hands. Matt Burns will be presenting. I’ve known Matt for quite a while. Samtec was an excellent partner of eSilicon back in the day, and I’ve attended many discussions and events with Matt. He has an easy-going presentation style, but under it all is a substantial understanding of what it takes to build high-performance communication channels and why it matters for any successful system design.

A quick summary of Matt’s background is in order. He develops go-to-market strategies for Samtec’s Silicon-to-Silicon solutions. Over the course of 25 years, he has been a leader in design, applications engineering, technical sales and marketing in the telecommunications, medical and electronic components industries. He currently serves as Secretary at PICMG. If it’s a close-to-impossible system design issue, Matt has likely seen it and helped to flatten it.

Some Topics to be Covered

Using all the tools and technologies available for any complex design project is usually the best approach. Matt discusses this in some detail, describing the situations where passive copper interconnect delivers the best result. Short reach is certainly one aspect that influences this decision, but there are other considerations as well.

For longer reach channels, active optical channels can be an excellent choice. The reasons to drive one way vs. another are not as simple as you may think and Matt helps with examples for various strategies.

The key point in all this: what if you could deploy both copper and optical interconnect in a unified way? A mix-and-match scenario, if you will. It turns out Samtec has been managing this kind of platform-level integration for about 15 years.

Getting into some specifics, the transition to 224 Gbps PAM4 signaling can strain copper interconnects due to reduced signal-to-noise ratios and tighter insertion loss budgets. This usually limits reach to under 1 meter. Using co-packaged copper (CPC), this limit can be extended to 1.5 meters, enabling dense intra-rack GPU clusters while lowering system cost. But copper’s limitations over longer distances hinder inter-rack scaling.
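The reach arithmetic here follows from the loss budget. As a rough first-order sketch (the numbers below are illustrative assumptions, not Samtec specifications): whatever insertion-loss budget remains after fixed losses from connectors and package escape is what the cable itself can consume, so moving the connector onto the package, as CPC does, frees budget and extends reach.

```python
def max_reach_m(loss_budget_db, cable_loss_db_per_m, fixed_loss_db=0.0):
    """Maximum passive-copper reach from a channel insertion-loss budget.

    First-order model: the budget left after fixed losses (connectors,
    package-escape traces) divided by per-meter cable loss.
    """
    return max(loss_budget_db - fixed_loss_db, 0.0) / cable_loss_db_per_m

# Illustrative numbers only: a conventional pluggable path burns more of
# the budget in fixed losses than a co-packaged copper (CPC) path does.
conventional = max_reach_m(loss_budget_db=30.0, cable_loss_db_per_m=20.0, fixed_loss_db=12.0)
cpc = max_reach_m(loss_budget_db=30.0, cable_loss_db_per_m=20.0, fixed_loss_db=2.0)
print(f"conventional ≈ {conventional:.2f} m, co-packaged copper ≈ {cpc:.2f} m")
```

With these assumed figures the conventional path lands under 1 meter while the CPC path reaches past it, consistent with the reach extension described above.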

Co-packaged optics (CPO) helps by integrating the optical engine within the switch silicon, enabling high-bandwidth, scalable links across racks. CPO overcomes copper’s physical constraints, reducing power and cooling costs, and unlocking scalable, efficient AI supercomputing fabrics that interconnect thousands of GPUs across data centers.

But what if you could have it both ways? Matt describes Samtec’s new strategy for advanced channel speeds that combines CPC and CPO to create a new category called CPX. This capability is delivered by Samtec’s Si-Fly® HD. A photo of the platform is shown on the top of this post.

Matt describes how this technology delivers the highest density 224 Gbps PAM4 solution in today’s market. He provides details about how the electrically pluggable co-packaged copper and optics solutions (CPX) are achievable on a 95 mm x 95 mm or smaller substrate using Samtec’s SFCM connector. The SFCM mounts directly to the package substrate and is pluggable with Samtec’s SFCC cable assembly or an optical cable assembly of your choosing.

Synopsys Example

Samtec has already worked with several high-profile system OEMs and IP providers to deploy this technology. Matt also talks about some of those achievements.

To Learn More

If you are faced with tough decisions regarding channel interconnect choices, you may have more options than you think. Matt Burns will take you through a new set of options enabled by Samtec’s new Si-Fly HD. The webinar’s full title is CPX: Leveraging CPC/CPO for the Latest Scale-Up and Scale-Out AI System Topologies. You can view a replay of this important webinar here.

Also Read:

Samtec Practical Cable Management for High-Data-Rate Systems

How Channel Operating Margin (COM) Came to be and Why It Endures

Visualizing System Design with Samtec’s Picture Search