
Two Open RISC-V Projects Chart Divergent Paths to High Performance

by Jonah McLeod on 02-16-2026 at 2:00 pm


The RISC-V community has been maturing open-source processor implementations to the point where they can appeal to system designers seeking alternatives to proprietary Arm and x86 cores. Toward this end, two projects have emerged as particularly significant examples of where RISC-V is heading. One is Ara, a vector processor developed at ETH Zürich as part of the PULP platform. A second is XiangShan, a high-performance scalar core developed in China. Both are serious engineering efforts. Both are open source. Yet they represent fundamentally different answers to the same question: how should RISC-V scale performance?

Ara Takes the Explicit Vector Path

Ara implements the RISC-V Vector Extension by making parallelism explicit to software. The design exposes vector width, register grouping, and data locality directly to the programmer. Software controls how many elements execute in parallel through VL, how wide those elements are via SEW, and how registers are grouped using LMUL. Memory behavior remains visible and largely software managed.
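For readers unfamiliar with the RVV programming model, the interplay of VL, SEW, and LMUL can be sketched in a few lines. The following is a simplified Python model of strip-mining, not Ara's actual hardware or ISA; the VLEN value and helper names are illustrative assumptions.

```python
# Hypothetical model of RVV strip-mining: software requests a vector
# length (as vsetvli does) and the hardware grants VL = min(requested,
# VLMAX), where VLMAX = (VLEN / SEW) * LMUL. Names are illustrative.

VLEN = 256  # assumed vector register width in bits

def vsetvl(avl: int, sew: int, lmul: int) -> int:
    """Return the granted vector length for this strip."""
    vlmax = (VLEN // sew) * lmul
    return min(avl, vlmax)

def vec_add(a, b):
    """Strip-mined element-wise add, structured as an RVV loop would be."""
    result = []
    i, n = 0, len(a)
    while i < n:
        vl = vsetvl(n - i, sew=32, lmul=2)  # SEW = 32 bits, LMUL = 2
        # one vector instruction processes vl elements at once
        result.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return result
```

Note how control flow appears only once per strip of up to VLMAX elements; this is the amortization of control overhead the vector model provides.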

The key architectural decision in Ara is the elimination of speculation. Rather than attempting to discover parallelism dynamically in hardware, Ara requires software to declare it explicitly. Because the work is explicitly structured, there is no branch speculation inside vector loops, no instruction reordering speculation, no guessing about memory dependencies, and no need for rollback mechanisms. Ara executes exactly the work it is instructed to execute.

This distinction matters for performance analysis. A stall is simply waiting for data to arrive. A speculative penalty is wasted execution followed by recovery. Ara still pays for memory latency, but it never flushes pipelines, squashes instructions, replays large instruction windows, or discards completed work.

“Ara was designed to prioritize efficiency for highly parallel workloads rather than speculative general-purpose execution. By eliminating speculation in the vector engine, we avoid the energy cost of mispredictions and pipeline recovery, allowing the hardware to focus almost entirely on productive computation. For large, structured workloads such as matrix multiplication, this approach consistently delivers very high functional-unit utilization and strong performance-per-watt,” says Matteo Perotti, ETH Zürich.

Ara’s vector model makes this non-speculative execution practical. Vector instructions amortize control overhead across many elements. Loop bounds and memory access patterns are regular. Control flow is largely outside the hot loop, and data dependencies are explicit. That structure eliminates the need for speculation to keep pipelines busy.

A Shallow Memory Hierarchy

Ara’s memory system reinforces this philosophy. Unlike conventional CPUs or GPUs, Ara does not include private L1 caches for vector execution. Its load-store unit connects directly to a shared SRAM-based last-level memory. Depending on the integration context, this memory may act as a cache, a software-managed scratchpad, or a preloaded working set.

In simulation, Ara is exercised in a bare-metal environment where data is preloaded into memory to accelerate simulation runtime. Full operating-system bring-up and debugging are performed directly on FPGA platforms where execution speed makes system-level validation practical. In ASIC prototypes such as Yun, the last-level memory appears as a small on-chip SRAM. In FPGA integrations such as Cheshire, Ara is integrated into a full SoC with operating-system support.

What remains consistent across these systems is the architectural intent: locality is a software responsibility, not something to be speculatively optimized away by deep cache hierarchies. This approach aligns closely with RVV’s execution model. Vector performance depends less on hiding latency than on sustaining bandwidth and reuse.

Physical implementation of Ara on the Yun chip. The CVA6 scalar processor with instruction and data caches sits at top. Four identical vector lanes surround the central vector load-store units (VLSU) and mask unit. Courtesy PULP Platform (ETH Zürich and University of Bologna)

Where Vectors Begin to Strain

Ara is also instructive because it reveals where vector architectures begin to strain. Matrix-dominated workloads, now central to AI and machine learning, can be expressed on vector engines through careful tiling and accumulation. Ara demonstrates that this can be done effectively, but not without increasing register pressure, instruction overhead, and software complexity.
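The tiling-and-accumulation pattern described here can be illustrated with a small sketch. This is a plain Python model of the software structure, not Ara code; the tile size and names are illustrative, and the per-row accumulators stand in for vector register groups, whose count drives the register pressure mentioned above.

```python
# Sketch of tiled matrix multiply on a vector engine: a tile of C rows
# is kept live as vector accumulators while rows of B are streamed.
# Larger tiles improve reuse but consume more vector registers.

def matmul_tiled(A, B, tile_rows=4):
    """C = A @ B, accumulating tile_rows rows of C at a time."""
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile_rows):
        rows = range(i0, min(i0 + tile_rows, n))
        # accumulators: one "vector register group" per row of the tile
        acc = {i: [0] * m for i in rows}
        for p in range(k):
            b_row = B[p]                      # one vector load of B
            for i in rows:
                a = A[i][p]                   # scalar broadcast from A
                acc[i] = [c + a * x for c, x in zip(acc[i], b_row)]
        for i in rows:
            C[i] = acc[i]                     # write the finished tile
    return C
```

Each accumulator row must stay resident for the entire inner loop, which is exactly why widening the tile raises register pressure on a real vector register file.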

Rather than masking these challenges with additional hardware, Ara exposes them cleanly. In doing so, it helps explain why the RISC-V ecosystem is now exploring matrix extensions as a distinct architectural layer above vectors. Ara effectively defines the upper bound of what pure vector execution can deliver, making it a valuable reference point rather than an endpoint, as illustrated by AraXL’s large-scale vector implementations.

XiangShan Takes the Traditional Path

By way of comparison, XiangShan follows the traditional high-performance CPU path. The project refines speculative scalar execution to extract instruction-level parallelism from largely unstructured code. Its design relies on deep out-of-order pipelines, aggressive branch prediction, speculative memory access, and multi-level caching to infer parallelism dynamically and hide latency behind hardware complexity.

Performance emerges when predictions are correct, and the cost of being wrong is absorbed through rollback, replay, and wasted energy. XiangShan must speculate because scalar code is dominated by frequent branches, irregular memory access, fine-grained dependencies, and unpredictable control flow. Speculation is the only way to extract performance from that environment.

This approach is familiar, effective for general-purpose workloads, and deliberately conservative. XiangShan aims to demonstrate that an open RISC-V core can compete by mastering the same techniques long used by x86 and Arm processors. The trade-off is therefore not one of right versus wrong, but of where complexity lives: XiangShan concentrates complexity in hardware to preserve the illusion of fast sequential execution, while Ara moves structure into software and removes speculative machinery entirely.

The Commercialization Question

Unlike Ara, which is best understood as a reference and research platform, XiangShan occupies a more ambiguous space between research and industry. XiangShan is not owned or sold by a commercial IP vendor in the traditional sense. There is no company marketing XiangShan cores under paid licensing terms. Instead, the project’s RTL is released under the Mulan PSL v2 open-source license, allowing companies to adopt, modify, and integrate the design without royalties.

However, XiangShan has progressed well beyond academia. The project has produced multiple physical tape-outs across successive generations, including chips fabricated in both mature and more advanced process nodes. Systems are capable of booting Linux and running standard benchmarks. Project materials describe collaboration with industry partners and evaluation within SoC development workflows.

This places XiangShan in a Linux-like model of commercialization. The core itself is not monetized as proprietary IP. Instead, its value emerges through adoption, integration, and downstream products built by third parties. In other words, XiangShan has been commercialized in practice, but not in the conventional IP-licensing sense. Its success depends on whether companies choose to build products around it, rather than on direct sales of the core itself.

XiangShan succeeds in demonstrating that open-source hardware can scale to complex, production-class microarchitectures. Its investment in tooling, simulation, and verification shows that openness need not imply fragility. In that respect, it validates RISC-V as a viable foundation for serious scalar CPUs.

At the same time, XiangShan’s conservatism defines its limits. By adhering closely to the speculative scalar tradition refined by x86 and Arm, it avoids questioning the underlying assumptions of that model. It does not attempt to make parallelism explicit, to rethink locality management, or to reduce reliance on speculation as the primary driver of performance. XiangShan improves the state of the art within a familiar framework but does not attempt to redraw that framework.

Two Paths, Not One Winner

Comparing Ara and XiangShan is illuminating precisely because they are not competing for the same point in design space. Ara explores explicit, structured parallelism and predictable performance, scaling by adding lanes, bandwidth, and disciplined data reuse. XiangShan refines speculative scalar execution, scaling by increasing pipeline sophistication, prediction accuracy, and cache depth. One exposes trade-offs to software. The other works hard to hide them. One favors determinism. The other embraces speculation. Neither approach is inherently superior, but each excels in different domains.

What Open Source Means in Practice

Earlier analysis of XiangShan made the case that open source alone does not guarantee architectural boldness. Ara reinforces the complementary point: architectural boldness does not require commercial polish to be meaningful. Ara’s value lies in clarity. It shows what RVV actually implies when implemented honestly, including both its strengths and its limits. XiangShan’s value lies in execution discipline and scale. It shows how far open source can go by perfecting known techniques and coupling them with institutional support.

Together, these projects illustrate the breadth of architectural exploration now possible within the RISC-V ecosystem. One path is evolutionary and production-oriented. The other is exploratory and architectural. Understanding both is essential for anyone trying to anticipate where RISC-V, and high-performance computing more broadly, is headed next.

Also Read:

The Launch of RISC-V Now! A New Chapter in Open Computing

The Foundry Model Is Morphing — Again

SiFive to Power Next-Gen RISC-V AI Data Centers with NVIDIA NVLink Fusion


On the high-speed digital design frontier with Keysight’s Hee-Soo Lee

by Don Dingee on 02-16-2026 at 10:00 am

Chiplet 3D Interconnect Designer reduces interconnect analysis in high-speed digital design from weeks to minutes

High-speed digital (HSD) design is one of the more exciting areas in EDA right now, with design practices, tools, and workflows evolving to keep pace with increasing design complexity. With the annual Chiplet Summit and DesignCon festivities right around the corner, we sat down with Keysight’s Hee-Soo Lee, HSD Segment Lead, to get his insights into what’s happening and where engineers should be looking for workflow improvements. What you’ll read next focuses on trends, customer needs, and what Keysight EDA is doing about them. Keysight will present its solutions, including Chiplet 3D Interconnect Designer, at both events.

SW: When we talked last year, we discussed new approaches to automated crosstalk analysis and its role in the first-pass success of more complex 3DHI designs. You’ve learned more about how engineers approach high-speed chiplet interconnects – what’s happening right now?

HSL: Everybody is talking about AI/ML, right? Many design teams are moving in that direction. One area attracting strong interest is advanced package design for high-traffic AI data centers. There are new technologies in play – multi-die, stacked die, heterogeneous integration, hybrid bonding, and more, all in advanced packaging. Packages are becoming faster and more complex, as designers aim to pack more functionality into a single package and save space. It’s turning out that success or failure in these designs largely depends on getting the high-complexity interconnects right.

SW: Sure, multiple 128-bit buses running at 64GT/s rates, what could possibly go wrong?

HSL: Exactly. There’s a lot that can go wrong when integrating different functional blocks into a single advanced package. Effects now happen on a 3D landscape. Signal integrity used to mean looking across a bus from source to destination. While bits on a bus may couple to other bits on the same bus, they can also affect other signals in the package, especially given vertical proximity, or part of the bus might be affected by something else. Designers start with “proven” chiplets and integrate them using silicon bridges or interposers, then route signals from those chiplets through physical packaging. Simplistic models don’t work well because every interconnect is a transmission line at these speeds. Signal integrity is a major concern in hatched or waffled ground-plane structures, which are almost mandatory given manufacturing constraints. There are holes in the ground return path, which may increase signal reflection and crosstalk, and this is really hard to model with conventional techniques due to its complexity.

SW: This is way beyond guessing where to place bypass capacitors on a printed circuit board, I suspect.

HSL: Without accurately modeling and simulating parasitic effects of all 3D structures in an advanced package design, there’s no hope to mitigate behaviors with external band-aids on a board. Everything going on inside the package has to be modeled and analyzed – looking at just part of the design and assuming the rest will perform the same way misses problems. However, customers tell us that finding the simplest ground-plane issues can take weeks of analysis, and even then, the analysis still doesn’t provide enough information to fix the more complex problems they’re encountering. It’s bringing traditional workflows to a grinding halt.

SW: So, they’re asking you for help? Modeling and simulation capabilities in Keysight Advanced Design System (ADS), along with new features such as Chiplet 3D Interconnect Designer, can provide a clearer picture?

HSL: Customers told us we need to do something, fast, and we have. We’ve been able to show engineers how to turn a comprehensive analysis of interconnects, including hatched ground planes, from weeks into minutes using Chiplet 3D Interconnect Designer. Imagine visually highlighting problems in a design in a few minutes, making corrections in one or more structures, re-running the analysis in a few more minutes, and seeing better results. Now teams can focus on optimizing the design and validating that it meets specifications.

SW: Or trying to find HBM parts, many of which are now in what vendors call allocation.

HSL: HBM supply is certainly a problem right now for all but a select few customers. You don’t want to be in the position of finding that a design needs to be redone for a different part because the one you specified had its lead time blow up while weeks of analysis was running.

SW: What about keeping up to date with the latest specifications?

HSL: That can be hard, too, and making that easier is a big focus in ADS and companion tools. In the last year, we’ve seen several revisions, including UCIe 3.0, PCIe Gen 7, and HBM4. And to complicate matters, customers sometimes customize their interconnects, borrowing concepts from UCIe or BoW but not adhering to all compliance rules. Faster analysis of the latest revision of pertinent specs or a user-defined set of rules enables our customers to stay competitive at the leading edge.

SW: You’ve coined a fascinating term – “design for hope.”

HSL: RF designers are dealing with a few crucial signal paths. High-speed digital designers are dealing with 128 or more signals between endpoints. Crosstalk analysis has been so tedious that engineers choose to look at only a few signals, get the eye margin where they’d like it, replicate the bus structures, and hope the rest of the lines are OK. As soon as real-world bridges, interposers, and packages appear, there’s a huge risk that the margin on the unobserved parts of the bus disappears. That’s an expensive omission just to save simulation time, yet engineers do it every day, and they pay the price if they miss. We’re at the point where data rates, ground planes, materials, structures, and packaging are unforgiving versus tight margins in specifications.

SW: Speaking of packaging, this workflow also helps there, correct?

HSL: Yes, the days when folks could design a chip and toss it over the wall to the packaging person and expect decent results are over. The classic workflow would perform a post-layout EM extraction, which at least provided some information, but not enough. Advanced packages mandate a co-design strategy with the 3DHI chip and either a bridge or an interposer. When comprehensive analysis runs in minutes, design space exploration, including packaging, yields better results.

SW: What big stuff is looming on the high-speed digital design frontier?

HSL: Power consumption is still a big concern at scale. Large AI chips have broken the 1000A barrier on their DC supply rail, and data center racks are pushing over 35,000A. Any ADCs or DACs in a design are power hogs. A new initiative, co-packaged optics (CPO), replaces electrons with photons, making it much more power-efficient. CPO may eventually flow back through the packages and interposers to the die-to-die interconnect, creating another transition for high-speed digital EDA tools and workflows.

There are, of course, other developments at Keysight EDA in the high-speed digital, power, and signal integrity circles. Teams of engineers, including Hee-Soo, are descending on Chiplet Summit and DesignCon shortly to share additional insights with audiences. Keysight has a resource hub where you can see everything happening at this year’s DesignCon in Santa Clara.

Keysight DesignCon Resource Hub

You’ll find this short introductory video for Chiplet 3D Interconnect Designer among the information there.

[videopack id="366695"]https://semiwiki.com/wp-content/uploads/2026/02/Introduction-to-Chiplet-3D-Interconnect-Designer.mp4[/videopack]


Samtec Ushers in a New Era of High-Speed Connectivity at DesignCon 2026

by Mike Gianfagna on 02-16-2026 at 10:00 am


As I’ve discussed before, Samtec has a way of dominating every trade show the company participates in. The upcoming DesignCon event is no exception. At the show, Samtec will be discussing data rates up to 448 Gbps and signals up to 130 GHz. Beyond a rich set of demonstrations in the company’s booth, Samtec engineers will be participating as authors and/or speakers in papers, panels, and sponsored sessions throughout the conference. Attendees will even be treated to a bar crawl. DesignCon is a major event not to be missed, and Samtec’s presence makes it even more worthwhile. Let’s see how Samtec ushers in a new era of high-speed connectivity at DesignCon 2026.

At the Samtec Booth (939)


A main focus will be demonstrations of Samtec’s CPX offerings, including co-packaged copper (CPC) and co-packaged optics (CPO) solutions. Demonstrations will include Si-Fly® HD CPC connectors operating at 224 Gbps. Other booth demonstrations will include Samtec’s new Si-Fly® backplane system that supports 224 Gbps signals. This includes a 112 Gbps active optics demonstration incorporating Samtec’s FireFly and new Halo products.

You can also see Samtec’s distinctive orange Nitrowave RF cable as part of the new Bulls Eye BE71A test assembly for 224 Gbps PAM4 SERDES and the Bulls Eye BE130 test assembly running 448 Gbps differential signals for test channels. Samtec will also showcase advancements in material science with its SureCoat ultra-rugged coatings for high temperatures and harsh environments.

Samtec in the Conference Agenda (and More)

February 25, 8:00 AM – 8:45 AM, Ballroom G

Improving Spectral Efficiency by Optimizing Sub-Nyquist Equalization for 448 Gbps

While 224 Gbps PAM4 systems are currently under development, standards bodies such as IEEE and OIF are already investigating 448 Gbps solutions. Doubling the data rate to 448 Gbps also doubles the Nyquist frequency—from 56 GHz to 112 GHz—posing significant new challenges. The objective of this paper is a comprehensive analysis identifying the best equalizers to recover a 448 Gbps PAM4 signal on a channel whose spectrum shows significant roll-off or resonances lying below the Nyquist frequency.

February 25, 12:10 PM – 12:50 PM, Great America 1

Successful PCIe 6.0 and 7.0 System Guidelines

As PCI-SIG moved to PAM4 modulation, the interconnect budget tightened, requiring improved loss and noise performance. This talk discusses recipes for a robust PCIe 6.0 and 7.0 system, with case studies validated by the PCI-SIG pre-layout channel compliance methodology.

February 25, 12:15 PM – 1:00 PM, Ballroom G

Distributed Capacitor Characterization for Advanced Packaging

Distributed capacitors play a crucial role in ensuring power integrity in modern GPUs, which demand high current delivery and minimal voltage ripple at ever-increasing switching speeds. A systematic modeling approach has been developed to capture the electrical characteristics of these distributed capacitive structures. The proposed models were extensively validated through correlation with VNA measurements conducted across wide voltage and temperature ranges.

February 25, 3:00 PM – 3:45 PM, Ballroom E

An Improved Broadband Material Characterization Method

A novel method utilizing airline measurements is introduced to accurately and efficiently determine the dielectric constant and loss tangent of a material for broadband applications. This method is independent of metal loss and physical lengths, thus eliminating errors from measurement uncertainties. The proposed method has been validated through simulations and comparisons with commercial resonator methods.

February 26, 2:00 PM – 2:45 PM, Ballroom H

Lessons Learned at 224 Gbps

Design-in for 224 Gbps architectures has begun, due largely to the intensive data needs of AI HW architectures. This expert discussion will examine how we can design, build, and test 224 Gbps systems and achieve acceptable SI in light of technological and physical limitations.

February 24, 4:45 PM – 6:00 PM, Ballroom A

Panel – Designing & Validating the Future: SERDES & Channel Innovations for PCIe at 128 GT/s

Continuing the tradition of previous years, this panel will focus on the latest updates and changes to PCIe signaling and physical topologies, with a focus on PAM4 signaling and the PCIe 7.0 specification. Building upon this panel’s past contributions, this year’s participants bring a diverse knowledge base to discuss the latest advancements in simulation, design, and the innovative test and measurement methodologies required for these current and future PAM4 inflection points. Additional topics include correlation between simulation and validation, design practices for PCIe over optical cables and through electrical pathways, and signal integrity complications.

And of course, join Samtec at the company’s booth on Wednesday from 5:00-6:00pm during the DesignCon booth bar crawl.

Samtec Conference Experts in Attendance

Two of Samtec’s staff members have earned the prestigious title of Engineer of the Year at DesignCon. They are noted on the conference homepage.

Last year was the 30th anniversary of DesignCon. In honor of that, Istvan Novak provided an informative perspective on the history of DesignCon. You can read that interview here.

To Learn More

You can see more detail about Samtec’s participation in DesignCon 2026 here. You can register for the show here. And don’t forget to register with the Samtec VIP code INVITE373903 to receive a free Expo Pass or a 15% discount on conference passes. And that’s how Samtec ushers in a new era of high-speed connectivity at DesignCon 2026.

Also Read:

2026 Outlook with Mathew Burns of Samtec

Webinar – The Path to Smaller, Denser, and Faster with CPX, Samtec’s Co-Packaged Copper and Optics

Samtec Practical Cable Management for High-Data-Rate Systems


Bronco Debug Stress Tested Measures Up

by Bernard Murphy on 02-16-2026 at 6:00 am


I wrote last year about a company called Bronco, offering an agentic approach to one of the hottest areas in verification – root-cause debug. I find Bronco especially interesting because their approach to agentic AI is different from most. It is still based on LLMs of course, but emphasizes playbooks of DV wisdom for knowledge capture rather than other learning methods. Last year, setup required playbooks developed collaboratively between DV experts and the Bronco debug agent. Unsurprisingly for a young company in a fast-moving field, learning methods have evolved considerably, to the point that the debug agent can now build its own reusable playbook covering a substantial range of bugs. Proprietary expert DV refinement can then be added on top. David Zhi LuoZhang (CEO, Bronco) suggests that this expert step is a 10–15-minute task, hardly challenging. He adds that generated playbooks and memories from debug runs are now combined with specs and user-provided documentation. Together this institutional knowledge coalesces in a customer-specific library he calls the Bronco Library (to be covered in a later blog perhaps).

How well does it work?

An industrial strength test

David can’t share a name for a recent successful eval, but he can say it is a major public company that evaluated on a big SoC. Lots of sensors, suggesting probably several modalities of AI processing plus the usual multi-core CPU, memory management, and so on. It also sounds like it may be safety critical. The eval task was to find a significant number of known bugs discovered during active development on a design. Also important, these were SoC-level bugs, the hardest to root-cause. Examples they cite include performance problems in AI accelerators, deadlocks in power mode transitions, nasty UVM race conditions, and assertion-firing issues.

Bronco was able to isolate an exact root-cause location for about 50% of those bugs without help, some in UVM testbenches, some in the design, using only the RTL, UVM testbench, design spec, and playbooks. For another 25% the agent was able to localize the root cause to a single file (out of 10k files). After initial setup, analysis was hands-free and completed in 15 minutes. A human debugger would surely have taken hours if not days to reach similar closure. That’s a pretty significant advance in reducing the largest overhead (debug) in DV.

Integration with regression flows

AI-based methods can have challenges integrating into regression flows. Apparently, Bronco have found a way to coexist very neatly with these flows. Their debugger can be instrumented into an overnight regression, triggering whenever a failure is found and launching an investigation of each failure in turn. For each, the debug agent creates a ticket showing not only the root problem but also the steps that led to this diagnosis. When an engineer reviews a ticket the following morning, they can submit it to Jira directly if they choose.
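Conceptually, the regression hook described above looks something like the following sketch. This is a hypothetical illustration of the workflow, not Bronco's API; every name here (run_test, investigate, Ticket) is invented for the example.

```python
# Hypothetical sketch: wire a debug agent into a nightly regression so
# that each failing test triggers an investigation, and each
# investigation produces a ticket ready for morning DV review.

from dataclasses import dataclass, field

@dataclass
class Ticket:
    test: str
    root_cause: str
    steps: list = field(default_factory=list)  # reasoning trail for review

def investigate(test_name: str) -> Ticket:
    # stand-in for the agent's analysis of one failure
    return Ticket(test=test_name,
                  root_cause=f"suspected cause for {test_name}",
                  steps=["reproduce", "trace signals", "localize"])

def nightly_regression(tests, run_test):
    """Run all tests; file one ticket per failure."""
    tickets = []
    for t in tests:
        if not run_test(t):                 # failure detected
            tickets.append(investigate(t))  # launch investigation
    return tickets                          # queue for DV review
```

The key design point is that investigation happens inline with the regression, so by morning the tickets, not the raw failures, are the unit of work.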

On that note a sample ticket is worth a look, just to understand the detailed analysis this debug agent is able to generate:

This example is based on bug analysis on a block rather than one of the SoC bugs, to protect proprietary details. Nevertheless, the quality of analysis suggested by this ticket is pretty impressive. If this example is representative of even 50% of the bugs exposed in a regression, I imagine debug technologies like this are going to take off fast.

What if the debug agent delivers an incorrect analysis or doesn’t get close enough in isolating a root cause? Working through the list of tickets, in such a case the DV engineer can provide extra guidance and/or suggest the debugger go deeper. That re-analysis continues in the background, so updated results are ready well before the engineer has finished checking subsequent tickets.

Big step forward in agentic debug. You can learn a bit more about Bronco HERE.

Also Read:

Verification Futures with Bronco AI Agents for DV Debug

Superhuman AI for Design Verification, Delivered at Scale

AI RTL Generation versus AI RTL Verification


TSMC and Cadence Strengthen Partnership to Enable Next-Generation AI and HPC Silicon

by Daniel Nenni on 02-15-2026 at 6:00 pm


TSMC continues to reinforce its leadership in advanced semiconductor manufacturing through its deepening collaboration with Cadence Design Systems. The expanded partnership focuses on enabling next-generation artificial intelligence and high-performance computing innovations by aligning advanced electronic design automation, 3D-IC technologies, and silicon-proven intellectual property with TSMC’s most advanced process nodes and packaging platforms.

At the heart of this collaboration is support for TSMC’s cutting-edge process technologies, including N3, N2, and A16™, which are critical for meeting the escalating performance, power efficiency, and scalability demands of AI workloads. Cadence’s AI-driven design flows have been optimized and validated for these nodes, allowing customers to achieve superior power, performance, and area outcomes while accelerating time to market. These flows leverage machine learning–based optimization to address the growing complexity of advanced-node designs, particularly for large-scale AI accelerators and HPC processors.

TSMC’s roadmap toward even more advanced technologies is further strengthened by joint development efforts with Cadence on future nodes, including the upcoming A14 process. Early EDA flow collaboration and PDK readiness ensure that customers can begin design work sooner, reducing risk and enabling faster adoption of next-generation silicon technologies. This early alignment between foundry and EDA provider is increasingly vital as design margins shrink and integration challenges intensify at advanced nodes.

Beyond transistor scaling, the partnership plays a critical role in advancing TSMC’s 3DFabric® platform, which enables heterogeneous integration through advanced packaging and die stacking. Cadence’s comprehensive 3D-IC solutions support a wide range of TSMC 3DFabric configurations, providing automation for bump connections, multi-chiplet physical implementation, and system-level analysis. AI-driven tools such as Clarity™ 3D Solver and Sigrity™ X enable accurate signal integrity, power integrity, and thermal analysis, helping designers manage the complexities of large, multi-die systems.

Photonics integration is another area of collaboration, particularly through support for TSMC’s Compact Universal Photonic Engine (COUPE™). By combining Cadence’s Virtuoso® Studio and Celsius™ Thermal Solver with TSMC-developed productivity enhancements, customers can more effectively model thermal and electrical interactions in photonic and electronic systems. This capability is increasingly important for AI and data center applications, where power density and thermal management directly impact system reliability and performance.

A key pillar of the Cadence and TSMC partnership is the availability of design-in-ready, silicon-proven IP on advanced nodes such as TSMC N3P. Leading-edge memory and interface IP, including HBM4, LPDDR6/5X, DDR5 MRDIMM Gen2, PCIe® 7.0, and UCIe™, addresses the growing memory bandwidth and interconnect challenges faced by AI systems. These IP offerings enable customers to scale compute performance efficiently while overcoming bottlenecks associated with data movement and power consumption.

Bottom line: Together with the broader Open Innovation Platform® ecosystem, TSMC and Cadence are streamlining the path from design to silicon. By integrating AI-driven EDA, advanced packaging solutions, and next-generation IP with TSMC’s manufacturing leadership, the partnership empowers customers to deliver faster, more energy-efficient AI and HPC solutions. As AI adoption accelerates globally, this close collaboration positions TSMC at the center of the AI semiconductor super-cycle, enabling innovation across the entire value chain, from process technology to system-level integration.

Also Read:

TSMC & GCU Semiconductor Training Program: Preparing Tomorrow’s Workforce

NanoIC Extends Its PDK Portfolio with First A14 Logic and eDRAM Memory PDK

TSMC’s 2026 AZ Exclusive Experience Day: Bridging Careers and Semiconductor Innovation


CEO Interview with Elad Raz of NextSilicon

by Daniel Nenni on 02-15-2026 at 4:00 pm


Elad Raz is the founder and CEO of NextSilicon, which he established in 2018 to fundamentally rethink how HPC architectures are built. After more than two decades designing and scaling advanced software and compute systems, Elad saw firsthand the limits of fixed, inflexible processor designs. He founded NextSilicon to address those constraints with a software-defined approach to HPC, enabling architectures that can adapt to evolving workloads across HPC, AI, and data-intensive computing.

Tell us about your company?

NextSilicon is redefining the future of HPC and AI with Maverick-2, the industry’s first production dataflow accelerator powered by our Intelligent Compute Architecture (ICA). Maverick-2 combines intelligent software with adaptable hardware to deliver unprecedented energy efficiency, scalability, and precision across applications ranging from scientific discovery and generative AI to climate modeling and healthcare.

NextSilicon emerged from stealth in October 2024 to launch Maverick-2. Since then, Maverick-2 has been redefining the interaction between hardware and software to establish a compute paradigm rooted in adaptability and efficiency. With over $300 million in funding and more than 400 employees globally, we’ve already achieved significant milestones, including deployment at Sandia National Laboratories’ Vanguard-II supercomputer.

What problems are you solving?

There is a fundamental mismatch in how we are building compute today. Modern high-performance computing (HPC) and AI workloads are irregular, memory-intensive, and require massive amounts of data. Yet we are still forcing them onto architectures optimized decades ago for completely different problems. CPUs and GPUs simply weren’t built for what we’re asking them to do now.

Maverick-2 addresses the growing inefficiencies of CPUs and GPUs, which struggle to meet the architectural, energy, and scalability demands of AI and HPC workloads in fields such as science, weather, energy, and defense. These workloads involve complex data dependencies, memory access, and computational patterns that today’s processors cannot handle, resulting in suboptimal performance and bottlenecks that stifle innovation.

Beyond hardware challenges, there’s also a developer pain point that leads teams to spend months porting and rewriting code for new architectures. Maverick-2 eliminates this burden by supporting existing code in C/C++, FORTRAN, OpenMP, and Kokkos without requiring specialized programming or domain-specific languages. We achieve up to 10x performance improvements while using 60% less power – a 4x performance-per-watt advantage – all without forcing developers to rewrite their applications.

What application areas are you strongest in?

We’re seeing strong traction in workloads that struggle on traditional architectures: HPC simulations, scientific computing scenarios, and data-intensive enterprise and research applications. Specifically, our dataflow architecture excels in:

○ Energy: Fluid dynamics, exploration and seismic analysis, and particle simulation
○ Life Sciences: Drug discovery, protein docking, and multiomics analysis
○ Financial Services: Risk simulation, back testing, and options modeling
○ Manufacturing: Finite element analysis, crash simulation, and digital twins
○ AI/ML: Thinking models, reasoning workloads, and trillion-parameter LLMs with massive context windows approaching 1M tokens
○ Graph Analytics: Social network analysis, fraud and anomaly detection, and recommendation systems

What’s important is that Maverick-2’s programmability allows a single platform to span multiple domains. As AI and HPC continue to converge, that flexibility becomes critical.

What keeps your customers up at night?

Our customers face enormous pressure to achieve greater performance within strict limits on power, cooling, and budget. They worry about locking themselves into architectures optimized for today’s models that can’t adapt to the next challenge. Efficiency, scalability, and longevity are critical.

Developer productivity is another major concern. Many customers have CPU workloads that were never GPU-accelerated. NextSilicon’s Intelligent Compute Architecture can accelerate these codes without forcing them into a GPU-centric paradigm. For teams already using GPUs, the goal is to explore new architectures without extensive code rewrites or learning proprietary languages and frameworks. They want to focus on scientific discovery and innovation, not porting cycles.

Customers also value future-proofing. As algorithms and models evolve at an unprecedented pace, choosing the right architecture has become genuinely difficult. Our software-defined hardware adapts dynamically during execution and evolves with workloads over time. This protects long-term investments while delivering immediate performance and efficiency gains. Customers can use existing codebases in familiar languages while achieving significant improvements, addressing both technical and resource constraints.

What does the competitive landscape look like and how do you differentiate?

The landscape is crowded, but it’s constrained by legacy thinking. CPUs continue to add more cores, more instructions, and more cost. GPUs aim to be broad-based across all use cases, while many accelerators optimize for very narrow ones.

At NextSilicon, we’re not trying to incrementally improve an old paradigm. Dataflow is a fundamentally different execution model. Our strengths lie in our true runtime reconfigurability, general-purpose programmability, and system-level efficiency.

The key technical differentiator is our dataflow architecture, which allows hardware to adapt to software in real time rather than forcing software to conform to rigid hardware designs. This, combined with our strong financial backing and scale, positions us uniquely in the market.

Maverick-2 isn’t about replacing everything overnight. It’s about offering a compelling complement to traditional architectures that are showing diminishing returns.

What new features / technology are you working on?

We’re continuously enhancing Maverick-2’s hardware and software stack. Right now, we are focused on expanding our software ecosystem to support additional programming models and frameworks, making integration even easier. We’re also optimizing performance and efficiency for emerging workload patterns as HPC and AI workflows converge. The roadmap is driven by feedback and by working closely with our early customers and partners. We want to ensure our roadmap aligns with real-world requirements and problems, not theoretical ones.

How do customers normally engage with your company?

We work with customers both directly and through our partner ecosystem, which includes Dell Technologies, Penguin Solutions, HPE, Databank, Vibrint, and E4. These partnerships enable us to provide integrated solutions and support to organizations seeking to accelerate their HPC and AI workloads. For specific engagement options and to discuss how we can address computational challenges, we encourage interested organizations to reach out to our team.

Also Read:

CEO Interview with Naama BAK of Understand Tech

CEO Interview with Dr. Heinz Kaiser of Schott

CEO Interview with Moshe Tanach of NeuReality


Podcast EP331: Soitec’s Broad Impact on Quantum Computing and More with Dr. Christophe Maleville

by Daniel Nenni on 02-13-2026 at 10:00 am

Daniel is joined by Dr. Christophe Maleville, Chief Technology Officer and Senior Executive Vice-President of Innovation at Soitec. He joined Soitec in 1993 and was a driving force behind the company’s joint research activities with CEA-Leti. For several years, he led new SOI process development, oversaw SOI technology transfer from R&D to production, and managed customer certifications. He also served as vice president, SOI Products Platform at Soitec, working closely with key customers worldwide. He has authored or co-authored more than 30 papers and also holds 30 patents.

In this broad discussion of technology development at Soitec and its impact, Dan first explores with Christophe how Soitec’s work with STMicroelectronics on 28Si FD-SOI substrates is enabling quantum computing development. Christophe also provides details about how Soitec’s work with the semiconductor ecosystem is enabling advances in a broad range of applications, including AI, sensing, automotive, and edge AI.

CONTACT SOITEC

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


TSMC vs Intel Foundry vs Samsung Foundry 2026

by Daniel Nenni on 02-13-2026 at 6:00 am


The global semiconductor industry sits at the foundation of modern technology, powering everything from smartphones and cloud data centers to artificial intelligence, automobiles, and national defense systems. At the center of advanced chip manufacturing are three major players: TSMC, Samsung Foundry, and Intel Foundry. Each represents a distinct manufacturing model and strategic philosophy, and together they form a competitive landscape that is essential for innovation, resilience, and long-term industry health.

TSMC is the undisputed leader in pure-play foundry manufacturing. By focusing exclusively on manufacturing and avoiding competition with its customers in chip design, TSMC has built deep trust with fabless companies such as Nvidia, AMD, Apple, and Qualcomm. This focus has allowed TSMC to lead in process technology, consistently delivering the most advanced nodes such as N5, N3, and the upcoming N2 with strong yields and predictable execution. Its dominance has been especially visible in the AI era, where advanced nodes and packaging technologies like CoWoS have become critical bottlenecks.

Samsung Foundry represents a vertically integrated alternative. As part of Samsung Electronics, it both manufactures chips and designs its own products, including memory, logic, and consumer devices. Samsung has pushed aggressively into leading-edge nodes such as 2nm using gate-all-around (GAA) transistors and continues to invest heavily in advanced packaging and U.S. manufacturing. While Samsung has faced significant challenges in yield consistency compared to TSMC, it routinely undercuts TSMC’s wafer pricing; it is hard to figure out the math on that point. Even so, Samsung’s presence provides customers with an important second source at advanced nodes.

Intel Foundry is the most strategically significant entrant into the modern foundry race. Historically a vertically integrated company that designed and manufactured its own chips, Intel is opening its leading-edge fabs to external customers while rebuilding its process leadership. Intel’s roadmap includes advanced nodes such as Intel 18A, as well as differentiated capabilities in advanced packaging (EMIB, Foveros) with U.S.-based manufacturing. While Intel Foundry is still in the initial stages of winning major external customers, its success would meaningfully rebalance the industry by adding large-scale leading-edge capacity inside the United States.

Competition among these three players is not merely a commercial or political issue; it is structurally critical for the semiconductor ecosystem.

First, competition drives technological progress. Advanced chip manufacturing requires enormous capital investment, deep engineering talent, and long development cycles. Without competitive pressure, there would be less incentive to take risks on new transistor architectures, materials, or manufacturing techniques. The rapid evolution from FinFET to GAA transistors is a direct result of competitive urgency.

Second, competition improves supply-chain resilience. Semiconductors are now a matter of national and economic security. Over-reliance on a single foundry or region increases vulnerability to geopolitical tensions, natural disasters, and capacity shocks. A competitive landscape with strong players in different regions reduces single-point-of-failure risk for governments and industries alike.

Third, customers benefit from choice and leverage. Fabless chip designers depend on foundries not just for wafers, but for co-optimization across design, packaging, and manufacturing. When customers have alternatives, they gain negotiating power on pricing, capacity allocation, and long-term roadmap alignment. This keeps foundries responsive to customer needs rather than dictating terms.

Finally, competition fuels ecosystem growth. Foundries anchor vast networks of equipment suppliers, materials companies, EDA vendors, and OSAT partners. When multiple foundries invest aggressively, the entire ecosystem advances faster, benefiting innovation well beyond any single company.

Bottom line: TSMC, Samsung Foundry, and Intel Foundry are not redundant competitors; they are essential counterweights. The semiconductor industry needs all three to succeed, because competition ensures innovation, resilience, and sustainable growth in one of the most strategically important industries in the world.

Also Read:

TSMC & GCU Semiconductor Training Program: Preparing Tomorrow’s Workforce

NanoIC Extends Its PDK Portfolio with First A14 Logic and eDRAM Memory PDK

TSMC’s 2026 AZ Exclusive Experience Day: Bridging Careers and Semiconductor Innovation


Silicon Catalyst at the Chiplet Summit: Advancing the Chiplet Economy

by Daniel Nenni on 02-12-2026 at 10:00 am


The rapid evolution of semiconductor design has elevated chiplets from a niche concept to a foundational strategy for next-generation computing. At the upcoming Chiplet Summit (February 17–19, 2026, at the Santa Clara Convention Center), Silicon Catalyst will play a central role in shaping this conversation, highlighting how startups, investors, and supply-chain partners can collaborate to unlock value in the emerging chiplet economy.

Through exhibition presence, keynote insights, and a dedicated panel session, Silicon Catalyst will demonstrate its unique position as an enabler of innovation at the intersection of technology, entrepreneurship, and capital.

Silicon Catalyst’s presence on the exhibition floor underscores its mission as the world’s only accelerator focused exclusively on semiconductor startups. By engaging directly with attendees, Silicon Catalyst will showcase how its comprehensive ecosystem, spanning IP providers, foundries, packaging experts, and venture partners, reduces barriers for early-stage companies seeking to commercialize chiplet-based solutions. In a market where design complexity and manufacturing costs can overwhelm young companies, Silicon Catalyst’s support model offers a critical on-ramp. Leaders from several chiplet companies in Silicon Catalyst’s portfolio will be available for discussions at the show.

A major highlight of the summit will be the presentation by Nick Kepler, Silicon Catalyst COO, on Wednesday, February 18. Nick will emphasize the strategic importance of chiplets in scaling performance, managing cost, and accelerating time to market. By framing chiplets not merely as a technical innovation but as a business enabler, the presentation will reinforce a core theme of the summit: success in the chiplet era depends as much on ecosystem coordination as on engineering excellence.

This theme will come into sharper focus in the Thursday, February 19 panel session at 3pm, “Chiplets for Entrepreneurs – Making Money in the Chiplet Game,” hosted by Silicon Catalyst.

The panel session will bring together perspectives from venture capital, the supply chain, and the startup community, with presentations and a panel discussion featuring industry and venture capital executives in the chiplet space.

The supply-chain perspective will be presented by Qnity (previously DuPont Electronics), highlighting the often underappreciated role of advanced materials, packaging, and manufacturing readiness. As chiplet architectures rely heavily on heterogeneous integration, tight collaboration across the supply chain becomes essential to achieving performance and reliability targets on an industrial scale.

A quick series of startup presentations forms the core of the session, offering concrete examples of how chiplets are being applied across diverse markets:

  • Athos Silicon will cover safety-critical AI compute for the physical world, emphasizing deterministic performance and reduced certification friction for autonomy.
  • CrossFire Technologies will address one of AI’s most pressing challenges, the compute-to-memory bottleneck, through its patented Bridgelet™ products.
  • HEPT Lab will illustrate the power of heterogeneous integration with its multi-chiplet 3D image sensor combining silicon photonics, CMOS, and InP technologies into a single package.
  • Quadric AI will showcase its fully programmable, chiplet-ready AI inference IP, targeting generative AI and autonomous driving.

The concluding panel discussion, featuring leaders from startups, the investment community, and the supply chain, is targeted at reinforcing the idea that chiplets are not a silver bullet but a powerful tool when paired with the right ecosystem support. The key is that technical innovation, manufacturability, and business strategy must evolve together.

Bottom line: Silicon Catalyst’s leadership at the Chiplet Summit highlights its pivotal role in transforming chiplets from promising technology into viable businesses. By convening investors, suppliers, and entrepreneurs under a shared vision, Silicon Catalyst is helping to define how value is created, and captured, in the chiplet era, making it a cornerstone of the semiconductor industry’s next growth chapter.

Note: Silicon Catalyst has arranged a special Chiplet Summit registration discount code that you can use at checkout:  CS26SICA

REGISTER NOW

Also Read:

Silicon Catalyst: Searching for the Next Great Start-up

Revitalizing Semiconductor StartUps

Silicon Catalyst on the Road to $1 Trillion Industry


Giving AI Agents Access to a Compiled Design and Verification Database

by Tom Anderson on 02-12-2026 at 8:00 am


A few weeks ago, I had the chance to work with AMIQ EDA as they introduced a new product: DVT MCP Server. I was quite intrigued by the role it will play in AI-assisted chip design and verification, so I wanted to learn more. I spoke with Gabriel Busuioc, the AI Assistant team leader at AMIQ EDA, to understand more about the product and how their users will benefit from it.

Busu, it is great to talk with you. What is your role?

I have worked at AMIQ EDA for almost five years now. I started as an intern and joined full-time after I earned an MS in Advanced Software Services from the Politehnica University of Bucharest. I’ve worked on several very interesting projects, adding new features to our existing products as well as developing this new product.

What was the motivation for DVT MCP Server?

As you know, we provide integrated development environments (IDEs) and related tools for hardware design and verification. We support a wide range of languages, all of which we compile and elaborate. We connect the code from hundreds or thousands of files into a single internal database of the complete hierarchical design and verification environment. That database gives us the ability to enable very smart editing, automated analysis and linting, debugging, documentation generation, and more.

This compiled database is internal to your Design and Verification Tools (DVT) products, correct?

Yes, and that’s where DVT MCP Server comes in. We started to wonder whether other tools, specifically AI agents, could benefit if they had access to project information within our internal database. It turned out that there is an open industry standard called Model Context Protocol (MCP) that serves exactly this purpose. It’s designed to connect AI agents to external data and applications. The goal is to make AI results more accurate by providing access to specialized or application-specific knowledge that was not learned through general training.
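
For readers unfamiliar with the mechanics, MCP is a thin protocol layered on JSON-RPC 2.0: an agent discovers the tools a server exposes, then invokes them with structured arguments via a `tools/call` request. The sketch below builds such a request; the tool name `compile_check` and its arguments are hypothetical illustrations, not AMIQ EDA’s actual API.

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request, the message shape
    the Model Context Protocol uses when an AI agent invokes a tool
    exposed by an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical example: an agent asks a server to compile-check
# a generated SystemVerilog file before presenting it to the user.
msg = mcp_tool_call(1, "compile_check", {"file": "alu_tb.sv"})
```

The server replies with a result (or error) keyed to the same `id`, which is how compiler-backed feedback can flow straight back into the agent’s reasoning loop.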

That sounds like the sort of knowledge in your database.

Exactly. We have detailed knowledge of design and verification languages plus of course the project-specific knowledge of the design and verification environment. DVT MCP Server allows all sorts of other applications to benefit from our knowledge and invoke our analysis engines.

Can you please give an example of how this might work?

Many engineers are experimenting with using AI to generate code, including for design and verification environments. They’re finding that AI agents are more effective with general-purpose languages than domain-specific languages. Limited training data and lack of context mean that AI may hallucinate or generate incorrect code. DVT MCP Server provides quick, compiler-backed feedback to ground AI reasoning in accurate language semantics and the project context while detecting any errors in generated code.

So the benefit to users is better design and verification code?

Yes, that is what our users are reporting. If an AI agent makes a subtle language error, or incorrectly references part of the design or testbench, DVT MCP Server catches that and reports it back to the agent. Users don’t need to pay attention to errors that happen internally; all they see is correctly generated code. AI agents can understand, generate, modify, debug, and correct code for real-world design and verification projects efficiently and accurately. This is simply not possible using only generic training data.

What else should we know?

DVT MCP Server supports Verilog, SystemVerilog, VHDL, and the e language, so it covers the most widely used languages for chip development. It can run within DVT IDE to provide live project context to interactive AI assistants, or operate in batch mode to support fleets of AI agents in automated workflows.

Is this related to AI Assistant that you introduced in late 2024?

You can think of them as complementary. AI Assistant runs within our products, enabling users to generate, modify, and understand code more easily. It relies on our internal design and verification database. DVT MCP Server provides external access to information in this same database for other tools to yield better results.

Is there anything new with AI Assistant?

Yes. We originally introduced this feature for DVT IDE, but have now integrated it with all our products. For example, it enables Verissimo SystemVerilog Linter to better explain and auto-correct linting failures in design and verification code. AI Assistant also helps Specador Documentation Generator produce description comments, for example to fully document a module or entity, including all ports and signals. Moreover, we’ve enhanced AI Assistant within DVT IDE with a new Agentic profile that enables the connected LLM to autonomously get project information and do file edits end-to-end. All the users have to do is specify their requests in plain, natural language.

How can our readers learn more?

You can start with our product page, and then request a demo or an evaluation license. You can also meet with members of our team in person at DVCon U.S. in Santa Clara, March 2-5, where we will be exhibiting at Booth 204. We hope to talk with you there or online.

Busu, thank you very much for your time. AI and AMIQ EDA are two great topics to discuss.

I agree, and I think they’re even more exciting when combined. Thank you, Tom.

Also Read:

2026 Outlook with Cristian Amitroaie, Founder and CEO of AMIQ EDA

Runtime Elaboration of UVM Verification Code

Better Automatic Generation of Documentation from RTL Code