Improving Retrieval Accuracy in AI
by Bernard Murphy on 02-18-2026 at 6:00 am

While there are big ambitions for virtual engineers and other self-guiding agentic applications, today estimates show 83-90% of AI inferences are for internet searches. On a related note, chatbots are now said to account for nearly 60% of internet traffic. Search and support are the biggest market drivers for automation and unquestionably have improved through AI automation. Search gets closer to what you want in one pass. Chatbots also depend on retrieving domain-specific information following a question. In either case, RAG – retrieval augmented generation – plays an important role in finding the most relevant sources for a search or chat response.

Or so you hope. My experience is that the RAG results I get from a basic search or question are most useful for simple (one-part) questions in areas where I have no expertise. The more expertise I have, or the more complex my question, the less useful I find the response. I can do somewhat better by adding context (You are an expert in … My question is …). Asking for citations also helps. But even these tricks don’t always work. The problem is that RAG as originally conceived (2020) has limitations. I thought it would be interesting to look at advances in this field, for variety from both a business perspective and a healthcare perspective. In AI, it seems that our needs and priorities are not so different.

A business perspective from Elastic and Cohere

Target applications here cover a wide range of business domains: finance, public sector, energy, media, and more. I found this webinar, which presents a combination of these two technologies with particular emphasis on RAG: the basics, challenges, and advances.

First, a quick note on RAG. LLMs are trained on publicly accessible corpora. RAG instead derives its information from separate, typically internal and proprietary sources: PDFs, spreadsheets, images, etc. This information is chunked in some manner (e.g. paragraphs in PDF text) and encoded as vectors based on similarity (scalar products of vectors, so related objects are close and unrelated objects are not). Chunks in the training data must be expert (human) labeled.

Retrieval then uses a mix of keyword matching and similarity-based search to develop a top-ranked set of responses to your question. RAG is more accurate in retrieval than a general-purpose LLM because it can exploit semantic understanding based on similarity matching between a query and labeled training data.
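To make the retrieval step concrete, here is a minimal sketch (my illustration, not code from the webinar and not tied to Elastic or Cohere): chunks are stored with their embedding vectors, and a query is scored by blending keyword overlap with cosine similarity, using a hypothetical blend weight.

# Minimal hybrid-retrieval sketch: keyword overlap blended with vector
# (cosine) similarity. Chunk embeddings and the blend weight are illustrative;
# this is not Elastic or Cohere code.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query, text):
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def retrieve(query, query_vec, chunks, k=3, alpha=0.5):
    # chunks: list of (text, embedding) pairs built offline by chunking + encoding
    scored = [(alpha * keyword_score(query, text)
               + (1 - alpha) * cosine(query_vec, vec), text)
              for text, vec in chunks]
    return sorted(scored, reverse=True)[:k]   # top-ranked chunks passed to the LLM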

So far this is naive RAG, with known limitations. It struggles when the needed answer requires a wider understanding of a source, or of multiple sources, or when the question has multiple clauses and requires sequential reasoning.

You know what’s coming next: agentic RAG, also called advanced RAG. To address these limitations a system must develop a plan of attack, perform multiple hops of reasoning, and self-reflect/verify after each step, potentially triggering rework. This is what an agentic approach provides. As soon as a question or request becomes even moderately complex, resolution must turn agentic, even in RAG. Tools used to support such agentic flows in business applications might be Microsoft Office, CRM systems, or SQL databases.
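A rough sketch of that loop, purely for illustration (llm() and retrieve() are placeholders, not any vendor’s API):

# Conceptual agentic-RAG loop: plan, retrieve per sub-question, then
# self-reflect/verify and rework. llm() and retrieve() are placeholder
# callables supplied by the caller; the planning call is assumed to return
# a list of sub-question strings.
def agentic_rag(question, llm, retrieve, max_rounds=3):
    sub_questions = llm(f"Break this into ordered sub-questions: {question}")
    notes = []
    for sub_q in sub_questions:                 # multiple hops of reasoning
        evidence = retrieve(sub_q)              # tool call: RAG index, SQL, CRM, ...
        notes.append((sub_q, llm(f"Answer '{sub_q}' using only: {evidence}")))
    draft = llm(f"Combine into one answer: {notes}")
    for _ in range(max_rounds):                 # self-reflect / verify step
        gaps = llm(f"Check '{draft}' against {notes}; list any gaps")
        if not gaps:
            break                               # verification passed
        draft = llm(f"Revise the answer to fix: {gaps}")   # triggered rework
    return draft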

For completeness, a further advance you may find is modular RAG. These systems take a building-block approach, blending retrieval and refinement stages into structured pipelines.

A healthcare perspective from Kent State and Rutgers

Here I draw on a long but very interesting paper. The authors suggest the following as key applications in healthcare: diagnostic assistance by retrieving information on similar cases; summarizing health records and discharge notes; answering complex medical questions; educating patients and tailoring responses to user profiles; matching candidates to clinical trials; and retrieving and summarizing biomedical literature, especially recent literature, in response to a clinical or research query.

The authors note a range of challenges in retrieving information. Obviously such a system must handle a wide range of data types (modalities), from doctor notes to X-rays, EKG traces, lab results, etc. They must also contend with a wide range of potentially incompatible health record sources, some with technically precise notes (myocardial infarction), some less precise (heart attack). Users face challenges in understanding the credibility of sources (media health articles, versus Reddit, versus respected journals in a field) and how these contribute to ranking conclusions. Familiar challenges even in our field.

There is a longer list, from which I’ll call out one widely relevant item: the need to continuously update as new research, drugs, and treatments emerge, and to deprecate outdated sources. In a medical context the authors suggest that manual updates would be too slow and error-prone, and that any useful RAG system for their purposes must build continuous update into the system.

They look at tradeoffs between the three RAG architectures mentioned earlier (naive, advanced, and modular). They find naive RAG easy to set up and use, though for their purposes too noisy and risky for high-stakes scenarios. Advanced RAG is more promising in diagnostic support and EHR summarization, striking a balance between factual grounding and speed, but it requires significant compute resources (presumably an on-prem datacenter). This method looks most ready today for clinical use, at least in hospitals and large clinics. They see modular RAG as interesting for ongoing research, though training and resource costs make it impractical to consider for near-term deployment.

Relevance to design automation

Accuracy is critical for technical support in our domain, whether internal or external. Our users are very knowledgeable and intolerant of beginner-level suggestions. Experiences above suggest that advanced/agentic RAG may be the most appropriate method to deploy support here.

Any deployment should aim to avoid the mistakes made in some ambitious all-AI rollouts (Klarna customer support, for example). Safeguards should certainly include an emphasis on “don’t know” responses for suggestions with low support, explainability for the top candidate responses offered, and methods to escalate to a human expert when the bot is uncertain. I am starting to see some of this in general customer support.

Meanwhile, agentic RAG can make a big difference in productivity and user satisfaction for in-house and external users. Most of us would prefer to explore on our own, supported by effective agentic RAG, only turning to a human expert when we’re not making progress. That’s technology worth supporting.

Also Read:

Bronco Debug Stress Tested Measures Up

TSMC and Cadence Strengthen Partnership to Enable Next-Generation AI and HPC Silicon

How Memory Technology Is Powering the Next Era of Compute


Ceva IP: Powering the Era of Physical AI
by Daniel Nenni on 02-17-2026 at 2:00 pm

Artificial intelligence is rapidly moving beyond the digital domain and into the physical world. From autonomous robots and smart factories to intelligent vehicles and connected consumer devices, AI systems are increasingly expected to perceive their surroundings, make real-time decisions, and act on them instantly. This shift marks the rise of Physical AI, a new generation of intelligence that combines sensing, connectivity, and on-device inference. At the heart of this transformation is Ceva IP, a leading provider of semiconductor and software intellectual property that enables Physical AI at the edge.

Physical AI differs from traditional cloud-centric AI in a fundamental way. Rather than relying on remote servers to process data, Physical AI systems operate directly on the device. They gather data from sensors such as cameras, microphones, radar, and motion detectors, analyze it locally using AI inference engines, and respond in real time. This approach is essential for applications where latency, power efficiency, reliability, and privacy are critical. Ceva’s technology portfolio is purpose-built to address these exact requirements.

Ceva’s role in Physical AI begins with connectivity, a foundational pillar for intelligent devices. The company offers a comprehensive suite of wireless IP solutions, including Bluetooth, Wi-Fi, ultra-wideband, and cellular technologies. These connectivity solutions allow devices to communicate seamlessly with other machines, infrastructure, and ecosystems while maintaining low latency and high reliability. In Physical AI systems, fast and robust wireless communication ensures that sensor data, control signals, and AI-driven decisions flow without interruption, even in complex or dynamic environments.

Equally important is sensing and sensor fusion, another core strength of Ceva IP. Physical AI systems rely on multiple sensors to understand the real world accurately. Ceva provides DSP architectures and software frameworks optimized for handling multimodal sensor data. By efficiently combining inputs from vision, audio, motion, and radar sensors, Ceva-powered systems gain a richer, more contextual understanding of their surroundings. This capability is crucial for applications such as robotics, advanced driver-assistance systems (ADAS), and industrial automation, where precise perception directly impacts safety and performance.

At the center of Physical AI lies on-device inference, and this is where Ceva’s neural processing technologies play a decisive role. Ceva’s AI and neural network IP solutions are designed to deliver high performance at ultra-low power, enabling complex AI workloads to run directly on edge devices. These inference engines support a wide range of machine learning models, from classical AI algorithms to modern deep learning networks. By performing inference locally, devices can respond instantly to real-world events without depending on cloud connectivity, while also protecting sensitive data.

One of Ceva’s key differentiators is its system-level approach. Rather than offering isolated components, Ceva provides an integrated IP ecosystem that allows semiconductor designers to build complete Physical AI platforms. Connectivity, sensing, and inference IP blocks are designed to work together efficiently, reducing integration complexity and accelerating time to market. This modular and scalable approach enables customers to tailor solutions for diverse applications, from tiny battery-powered IoT devices to high-performance automotive and industrial systems.

The impact of Ceva IP is already visible at scale. Ceva technologies have been deployed in tens of billions of devices worldwide, underscoring the company’s influence across consumer, automotive, industrial, and mobile markets. As the demand for smarter, more autonomous systems continues to grow, Physical AI is becoming a defining trend in the semiconductor industry.

Bottom line: Ceva IP is powering the era of Physical AI by providing the essential building blocks that allow machines to sense, connect, think, and act in the real world. By enabling real-time intelligence at the edge, Ceva is helping bridge the gap between digital computation and physical interaction—unlocking a new generation of responsive, efficient, and intelligent devices.

CONTACT CEVA-IP

Also Read:

Ceva-XC21 Crowned “Best IP/Processor of the Year”

United Micro Technology and Ceva Collaborate for 5G RedCap SoC and Why it Matters

Ceva Unleashes Wi-Fi 7 Pulse: Awakening Instant AI Brains in IoT and Physical Robots


Accelerating Static ESD Simulation for Full-Chip and Multi-Die Designs with Synopsys PathFinder-SC
by Kalar Rajendiran on 02-17-2026 at 10:00 am

As analog and mixed-signal designs become increasingly complex, parasitic effects dominate both design time and cost, consuming 30–50% of engineers’ effort in debugging and reanalyzing circuits. Addressing these multiphysics effects requires early verification strategies and reliable simulation solutions. Modern verification must extend beyond traditional RC parasitics to encompass inductance, RF interactions, voltage drop, RDS(on) effects, thermal behavior, signal integrity, photonics, and electrostatic discharge (ESD).

Synopsys recently hosted a webinar on ESD verification for full-chip and multi-die designs using its PathFinder-SC platform. The session was presented by Peter Tsai, Product Manager; Marc Swinnen, Product Marketing Manager; and John Alwyn, Product Specialist. It provided a detailed look at PathFinder-SC’s capabilities in addressing modern ESD verification challenges, highlighting workflows for early-stage validation, full-chip and multi-die simulation, and layout-driven debugging. The session emphasized effective protection circuits, bump-to-bump, bump-to-clamp and clamp-to-clamp discharge paths, and adherence to foundry-certified thresholds for voltage, current, and electromigration limits.

Synopsys PathFinder-SC Overview

PathFinder-SC enables early ESD verification through cell-based modeling of discharge circuits derived from GDS, OASIS, DEF, and LEF data. This approach allows potential reliability issues to be identified well before full layout completion. Its scalable architecture, powered by the Seascape distributed computing platform, supports simulations of full-chip and multi-die designs with billions of nodes.

By leveraging RedHawk-SC’s certified extraction and electromigration engines, PathFinder-SC ensures that effective resistance and current density checks comply with foundry guidelines. This guarantees that discharge paths, metal routing, and protection circuits can safely handle ESD currents. Layout-driven debugging allows engineers to trace shortest-path resistances, visualize current density flows, and pinpoint potential failure points.

Multi-scenario simulations that include variations in bump placement, clamp types, and extraction corners help optimize design robustness. PathFinder-SC also extends verification from the die to the package and board level, incorporating package netlists and compact impedance models to ensure end-to-end reliability.

Technical Capabilities and Workflows

PathFinder-SC supports static ESD simulations, including effective resistance and current density checks along intended discharge paths. It verifies protection circuits such as primary and secondary diodes, clamps, and cross-domain devices, ensuring that ESD currents safely exit through the nearest bump without damaging functional circuits. When discharge paths are poorly routed or have insufficient metallization, unintended paths may form, potentially causing latent device damage. PathFinder-SC detects these weaknesses early, enabling designers to optimize layouts before fabrication.
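As a purely illustrative aside (not PathFinder-SC code, and with made-up numbers), the core of a static resistance check is simple: sum the extracted segment resistances along a discharge path and compare against a foundry limit.

# Toy static ESD check: effective resistance of a discharge path modeled as
# series segments, compared against a placeholder foundry limit. All names
# and values are hypothetical; real flows use extracted parasitics.
path_name = "bump_17 -> primary_diode -> clamp_3"   # hypothetical path
segments_ohms = [0.12, 0.35, 0.08, 0.21]            # made-up extracted values
limit_ohms = 1.0                                    # placeholder foundry limit

r_eff = sum(segments_ohms)
status = "PASS" if r_eff <= limit_ohms else "FAIL"
print(f"{path_name}: Reff = {r_eff:.2f} ohm ({status}, limit {limit_ohms} ohm)")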

Simulation workflows include design netlist analysis, physical verification, static resistance and current density simulations, and dynamic simulations. Using schematic and layout checks, PathFinder-SC identifies ESD devices, verifies connections, and ensures compliance with foundry guidelines. Static simulations measure effective resistance and current density along partial or full discharge paths, while dynamic simulations assess peak stress voltages and currents in protection circuits. These analyses enable robust verification of bump-to-bump, bump-to-clamp, and clamp-to-clamp paths, including multi-die interposer connections.

Distributed processing via Seascape allows large designs to be simulated efficiently. Tasks such as geometry processing, resistance extraction, zap simulations, and result aggregation are parallelized across hundreds of CPU cores or cloud workers. PathFinder-SC can handle designs with hundreds of thousands of bumps, tens of thousands of protection devices, and multiple dies, completing workflows in hours rather than days. Sensitivity analysis and shortest-path resistance tracing allow engineers to pinpoint high-resistance segments and electromigration hotspots for targeted optimization.

The tool’s view-based architecture supports simulation of multiple variations in bump placement, clamp types, and extraction corners. Each view executes workflows such as RC extraction, clamp identification, and zap simulations, with results stored in a distributed database for analysis. Users can query the database, visualize current density and electromigration maps, and browse resistance check results for full-chip or multi-die designs. Compact impedance modeling and detailed standard parasitic format (DSPF) stitching allow transient simulations of ESD events, linking die-level and package-level effects.

Performance Insights

Performance metrics shared during the webinar demonstrated PathFinder-SC’s ability to handle very large designs efficiently. Full-chip SoCs and multi-die stacks with millions of nodes, tens of thousands of clamps, and hundreds of thousands of bumps were simulated using distributed workers.

For example, a multi-die 3DIC design with 168,000 protection diodes and 144 clamps completed clamp-to-clamp and bump-to-clamp resistance checks in under 14 hours using 80 CPUs. Current density simulations for a 5nm SoC with around 163,000 PG bumps finished in about 37 hours using 50 CPUs. A very large interposer layout covering 24 cm × 18 cm, with over 500,000 nets and 163 million geometry shapes, completed extraction and design modeling in 47 hours using hundreds of cores. These examples demonstrate that PathFinder-SC scales to extreme design sizes without compromising accuracy or speed.

Summary

The webinar highlighted that ESD events can occur throughout a chip’s lifecycle and ESD verification is especially critical in advanced packaging technologies such as chiplets and 3D ICs. For reliability engineers, physical design engineers, SoC architects, packaging and 3DIC engineers, and others responsible for ESD protection, the session offered in-depth insight into PathFinder-SC’s capabilities. The presenters shared practical workflows, performance statistics, and real-world examples, with actionable guidance for integrating ESD verification into modern design flows. From pre-LVS verification to final signoff, PathFinder-SC helps teams accelerate design completion, mitigate risk, and ensure robust ESD protection in complex, advanced-node chips.

You can watch the entire webinar here.

Also Read:

Podcast EP330: An Overview of DVCon U.S. 2026 with Xiaolin Chen

2026 Outlook with Abhijeet Chakraborty VP, R&D Engineering at Synopsys

Advances in ATPG from Synopsys


A Century of Miracles: From the FET’s Inception to the Horizons Ahead
by Daniel Nenni on 02-17-2026 at 6:00 am

The Field-Effect Transistor (FET), a cornerstone of modern electronics, marks its centennial in 2025, tracing back to Julius Edgar Lilienfeld’s groundbreaking invention in 1925. Born in 1882 in what is now Lviv, Ukraine, Lilienfeld was a prolific physicist who earned his PhD from Berlin University in 1905. His early work at Leipzig University focused on vacuum conduction, contributing to X-ray tubes and even collaborating with Count Ferdinand von Zeppelin on dirigibles. A friend of Albert Einstein, Lilienfeld relocated to the United States in 1927 to defend his X-ray patents, eventually directing research at Ergon Laboratories in Malden, Massachusetts. There, he explored electrolyte interfaces and semiconductors, leading to his FET patents. Retiring to St. Thomas in 1935 due to allergies, he continued research until his death in 1963 at age 81.

Lilienfeld’s FET aimed to replace bulky vacuum tubes, invented in 1906 by Lee De Forest for wireless communication. Vacuum tubes controlled electron beams easily via grid potential in a vacuum. However, semiconductors posed challenges: abundant charges made current control by a gate electrode nearly impossible. Lilienfeld proposed structures like a MESFET in 1925 using p-type semiconductors with Schottky contacts and a MOSFET in 1928 with an aluminum oxide insulator on copper sulfide.

From 1925 to 1960, FET development stalled with a “hopeless period.” Materials like Cu2S and Cu2O were poor choices, semiconductor purity was abysmal, and foundational physics and technologies were lacking. Progress accelerated around 1940 when silicon was refined for military radar detectors, achieving 99.8% purity via thermal processes. This led to Russell Ohl’s accidental discovery of the pn junction in 1940, separating p-type (boron) and n-type (phosphorus) impurities, enabling photovoltaic and rectification effects.

Key theoretical advances followed. In 1938, Walter Schottky described inversion and depletion layers at metal-semiconductor junctions. Heinrich Welker proposed an inversion layer channel in 1945, noting its thinness for easy gate control and low scattering in depletion regions. By 1953, Walter Brown and William Shockley demonstrated FET operation using a back-gate structure on germanium, incorporating pn junctions for source/drain isolation.

The 1947 invention of the point-contact transistor by John Bardeen and Walter Brattain, followed by Shockley’s bipolar junction transistor (BJT) in 1948, shifted focus temporarily. BJTs benefited from 1950s technologies like silicon crystal pulling, impurity doping, photolithography, ion implantation, and epitaxial growth—processes equally essential for FETs; indeed, MOSFETs could not have been fabricated without them.

Interface instabilities plagued early MOSFETs, with charges like fixed oxide, trapped, mobile ionic, and interface states causing drift until resolved around 1969. This breakthrough enabled mass production of LSIs. Intel’s 1101 256-bit SRAM (1969) and 1103 1k-bit DRAM (1970) featured over 1,000 MOSFETs, using 8-10 µm rules and multiple voltages. The 1971 Intel 4004, the first microprocessor, integrated 2,300 pMOSFETs at 10 µm, designed by Federico Faggin, Masatoshi Shima, and others.

Robert Dennard’s 1974 scaling scheme revolutionized downsizing: reducing dimensions and voltages by factor K (typically 0.7 every 2-3 years) maintained power density while boosting speed and density. Over 25 generations from 1964’s 20 µm to 2025’s 2 nm, this yielded nanoelectronics, transitioning from PMOS/NMOS to CMOS, bulk to FinFET/GAA structures.
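A quick back-of-the-envelope check of that scaling arithmetic (my own illustration, with idealized numbers):

# Ideal Dennard scaling: each generation shrinks linear dimensions and voltage
# by K, so area and power both scale by ~K^2 and power density stays roughly
# constant. Starting point and generation count are illustrative.
K = 0.7
length_nm = 20_000.0            # 1964 starting point: 20 um expressed in nm
generations = 25
print(f"20 um after {generations} generations: {length_nm * K**generations:.1f} nm")
# -> roughly 2-3 nm, consistent with the 2025 node named above.
# In practice supply voltage stopped scaling ideally in the mid-2000s, which is
# one reason power, rather than area, became the limiting factor.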

By 2025, line widths approach their limits: a practical limit around 10 nm (where demerits outweigh benefits), a direct-tunneling limit (~3 nm), and an atomic limit (~0.3 nm). Memory cells now rival influenza viruses in size; a single chip holds 100 billion transistors and a wafer holds trillions, rivaling the number of stars in a galaxy.

Yet, challenges loom, especially for AI. Semiconductor-based AI systems consume vastly more power than human brains: a massive AI with 5,000 GPUs uses 50 MW for 500 trillion “synapses,” versus a human’s 100 trillion synapses at 100 W. MOSFETs leak subthreshold current even “off,” unlike efficient biological synapses operating at <100 mV and <1 kHz without constant voltage.
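The gap is easy to put in per-synapse terms using the figures quoted above (a rough calculation, nothing more):

# Rough per-"synapse" power comparison using the numbers in the text.
ai_power_w, ai_synapses = 50e6, 500e12            # 5,000-GPU system: 50 MW
brain_power_w, brain_synapses = 100.0, 100e12     # human figures as quoted

ai_w_per_synapse = ai_power_w / ai_synapses           # 1e-7 W  = 100 nW
brain_w_per_synapse = brain_power_w / brain_synapses  # 1e-12 W = 1 pW
print(f"AI ~{ai_w_per_synapse*1e9:.0f} nW/synapse vs brain ~"
      f"{brain_w_per_synapse*1e12:.0f} pW/synapse "
      f"(~{ai_w_per_synapse/brain_w_per_synapse:,.0f}x gap)")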

Bottom line: This inefficiency explains why nature favors biology over semiconductors for cognition. Future horizons may involve bio-inspired designs or new materials to curb power hunger, ensuring FET miracles continue.

Also Read:

Samtec Ushers in a New Era of High-Speed Connectivity at DesignCon 2026

Silicon Catalyst at the Chiplet Summit: Advancing the Chiplet Economy

Accellera Strengthens Industry Collaboration and Standards Leadership at DVCon U.S. 2026


Two Open RISC-V Projects Chart Divergent Paths to High Performance
by Jonah McLeod on 02-16-2026 at 2:00 pm

Up to now the RISC-V community has been developing open-source processor implementations to a stage where they can appeal to system designers looking for alternatives to proprietary Arm and x86 cores. Toward this end, two projects have emerged as particularly significant examples of where RISC-V is heading. One is Ara, a vector processor developed at ETH Zürich as part of the PULP platform. A second is XiangShan, a high-performance scalar core developed in China. Both are serious engineering efforts. Both are open source. Yet they represent fundamentally different answers to the same question: how should RISC-V scale performance?

Ara Takes the Explicit Vector Path

Ara implements the RISC-V Vector Extension by making parallelism explicit to software. The design exposes vector width, register grouping, and data locality directly to the programmer. Software controls how many elements execute in parallel through VL, how wide those elements are via SEW, and how registers are grouped using LMUL. Memory behavior remains visible and largely software managed.
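To see what “explicit to software” means, here is a conceptual strip-mining loop written in Python purely as pseudocode for the RVV pattern (the VLEN, SEW, and LMUL values are illustrative; real code would use vector instructions or intrinsics):

# Conceptual RVV strip-mining: software derives the maximum vector length from
# VLEN (bits per vector register), SEW (element width), and LMUL (register
# grouping), then walks the data in VL-sized chunks. Python pseudocode only.
def vector_add(a, b, vlen_bits=256, sew_bits=32, lmul=4):
    vlmax = (vlen_bits // sew_bits) * lmul        # elements per vector operation
    out, i = [], 0
    while i < len(a):
        vl = min(vlmax, len(a) - i)               # what vsetvli would return
        # one vector instruction's worth of work: vl element-wise additions
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out

print(len(vector_add(list(range(100)), list(range(100)))))   # 100 elements, 4 chunks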

The key architectural decision in Ara is the elimination of speculation. Rather than attempting to discover parallelism dynamically in hardware, Ara requires software to declare it explicitly. Because the work is explicitly structured, there is no branch speculation inside vector loops, no instruction reordering speculation, no guessing about memory dependencies, and no need for rollback mechanisms. Ara executes exactly the work it is instructed to execute.

This distinction matters for performance analysis. A stall is simply waiting for data to arrive. A speculative penalty is wasted execution followed by recovery. Ara still pays for memory latency, but it never flushes pipelines, squashes instructions, replays large instruction windows, or discards completed work.

“Ara was designed to prioritize efficiency for highly parallel workloads rather than speculative general-purpose execution. By eliminating speculation in the vector engine, we avoid the energy cost of mispredictions and pipeline recovery, allowing the hardware to focus almost entirely on productive computation. For large, structured workloads such as matrix multiplication, this approach consistently delivers very high functional-unit utilization and strong performance-per-watt,” says Matteo Perotti, ETH Zürich.

Ara’s vector model makes this non-speculative execution practical. Vector instructions amortize control overhead across many elements. Loop bounds and memory access patterns are regular. Control flow is largely outside the hot loop, and data dependencies are explicit. That structure eliminates the need for speculation to keep pipelines busy.

A Shallow Memory Hierarchy

Ara’s memory system reinforces this philosophy. Unlike conventional CPUs or GPUs, Ara does not include private L1 caches for vector execution. Its load-store unit connects directly to a shared SRAM-based last-level memory. Depending on the integration context, this memory may act as a cache, a software-managed scratchpad, or a preloaded working set.

In simulation, Ara is exercised in a bare-metal environment where data is preloaded into memory to accelerate simulation runtime. Full operating-system bring-up and debugging are performed directly on FPGA platforms where execution speed makes system-level validation practical. In ASIC prototypes such as Yun, the last-level memory appears as a small on-chip SRAM. In FPGA integrations such as Cheshire, Ara is integrated into a full SoC with operating-system support.

What remains consistent across these systems is the architectural intent: locality is a software responsibility, not something to be speculatively optimized away by deep cache hierarchies. This approach aligns closely with RVV’s execution model. Vector performance depends less on hiding latency than on sustaining bandwidth and reuse.

Physical implementation of Ara on the Yun chip. The CVA6 scalar processor with instruction and data caches sits at top. Four identical vector lanes surround the central vector load-store units (VLSU) and mask unit. Courtesy PULP Platform (ETH Zurich and University of Bologna)

Where Vectors Begin to Strain

Ara is also instructive because it reveals where vector architectures begin to strain. Matrix-dominated workloads, now central to AI and machine learning, can be expressed on vector engines through careful tiling and accumulation. Ara demonstrates that this can be done effectively, but not without increasing register pressure, instruction overhead, and software complexity.

Rather than masking these challenges with additional hardware, Ara exposes them cleanly. In doing so, it helps explain why the RISC-V ecosystem is now exploring matrix extensions as a distinct architectural layer above vectors. Ara effectively defines the upper bound of what pure vector execution can deliver, making it a valuable reference point rather than an endpoint, as illustrated by AraXL’s large-scale vector implementations.

XiangShan Takes the Traditional Path

By way of comparison, XiangShan follows the traditional high-performance CPU path. The project refines speculative scalar execution to extract instruction-level parallelism from largely unstructured code. Its design relies on deep out-of-order pipelines, aggressive branch prediction, speculative memory access, and multi-level caching to infer parallelism dynamically and hide latency behind hardware complexity.

Performance emerges when predictions are correct, and the cost of being wrong is absorbed through rollback, replay, and wasted energy. XiangShan must speculate because scalar code is dominated by frequent branches, irregular memory access, fine-grained dependencies, and unpredictable control flow. Speculation is the only way to extract performance from that environment.

This approach is familiar, effective for general-purpose workloads, and deliberately conservative. XiangShan aims to demonstrate that an open RISC-V core can compete by mastering the same techniques long used by x86 and Arm processors. The trade-off is therefore not one of right versus wrong, but of where complexity lives: XiangShan concentrates complexity in hardware to preserve the illusion of fast sequential execution, while Ara moves structure into software and removes speculative machinery entirely.

The Commercialization Question

Unlike Ara, which is best understood as a reference and research platform, XiangShan occupies a more ambiguous space between research and industry. XiangShan is not owned or sold by a commercial IP vendor in the traditional sense. There is no company marketing XiangShan cores under paid licensing terms. Instead, the project’s RTL is released under the Mulan PSL v2 open-source license, allowing companies to adopt, modify, and integrate the design without royalties.

However, XiangShan has progressed well beyond academia. The project has produced multiple physical tape-outs across successive generations, including chips fabricated in both mature and more advanced process nodes. Systems are capable of booting Linux and running standard benchmarks. Project materials describe collaboration with industry partners and evaluation within SoC development workflows.

This places XiangShan in a Linux-like model of commercialization. The core itself is not monetized as proprietary IP. Instead, its value emerges through adoption, integration, and downstream products built by third parties. In other words, XiangShan has been commercialized in practice, but not in the conventional IP-licensing sense. Its success depends on whether companies choose to build products around it, rather than on direct sales of the core itself.

XiangShan succeeds in demonstrating that open-source hardware can scale to complex, production-class microarchitectures. Its investment in tooling, simulation, and verification shows that openness need not imply fragility. In that respect, it validates RISC-V as a viable foundation for serious scalar CPUs.

At the same time, XiangShan’s conservatism defines its limits. By adhering closely to the speculative scalar tradition refined by x86 and Arm, it avoids questioning the underlying assumptions of that model. It does not attempt to make parallelism explicit, to rethink locality management, or to reduce reliance on speculation as the primary driver of performance. XiangShan improves the state of the art within a familiar framework but does not attempt to redraw that framework.

Two Paths, Not One Winner

Comparing Ara and XiangShan is illuminating precisely because they are not competing for the same point in design space. Ara explores explicit, structured parallelism and predictable performance, scaling by adding lanes, bandwidth, and disciplined data reuse. XiangShan refines speculative scalar execution, scaling by increasing pipeline sophistication, prediction accuracy, and cache depth. One exposes trade-offs to software. The other works hard to hide them. One favors determinism. The other embraces speculation. Neither approach is inherently superior, but each excels in different domains.

What Open Source Means in Practice

Earlier analysis of XiangShan made the case that open source alone does not guarantee architectural boldness. Ara reinforces the complementary point: architectural boldness does not require commercial polish to be meaningful. Ara’s value lies in clarity. It shows what RVV actually implies when implemented honestly, including both its strengths and its limits. XiangShan’s value lies in execution discipline and scale. It shows how far open source can go by perfecting known techniques and coupling them with institutional support.

Together, these projects illustrate the breadth of architectural exploration now possible within the RISC-V ecosystem. One path is evolutionary and production-oriented. The other is exploratory and architectural. Understanding both is essential for anyone trying to anticipate where RISC-V, and high-performance computing more broadly, is headed next.

Also Read:

The Launch of RISC-V Now! A New Chapter in Open Computing

The Foundry Model Is Morphing — Again

SiFive to Power Next-Gen RISC-V AI Data Centers with NVIDIA NVLink Fusion


On the high-speed digital design frontier with Keysight’s Hee-Soo Lee
by Don Dingee on 02-16-2026 at 10:00 am

Chiplet 3D Interconnect Designer reduces interconnect analysis in high-speed digital design from weeks to minutes

High-speed digital (HSD) design is one of the more exciting areas in EDA right now, with design practices, tools, and workflows evolving to keep pace with increasing design complexity. With the annual Chiplet Summit and DesignCon festivities right around the corner, we sat down with Keysight’s Hee-Soo Lee, HSD Segment Lead, to get his insights into what’s happening and where engineers should be looking for workflow improvements. What you’ll read next focuses on trends, customer needs, and what Keysight EDA is doing about them. Keysight will present its solutions, including Chiplet 3D Interconnect Designer, at both events.

SW: When we talked last year, we discussed new approaches to automated crosstalk analysis and its role in the first-pass success of more complex 3DHI designs. You’ve learned more about how engineers approach high-speed chiplet interconnects – what’s happening right now?

HSL: Everybody is talking about AI/ML, right? Many design teams are moving in that direction. One area attracting strong interest is advanced package design for high-traffic AI data centers. There are new technologies in play – multi-die, stacked die, heterogeneous integration, hybrid bonding, and more, all in advanced packaging. Packages are becoming faster and more complex, as designers aim to pack more functionality into a single package and save space. It’s turning out that success or failure in these designs largely depends on getting the high-complexity interconnects right.

SW: Sure, multiple 128-bit buses running at 64GT/s rates, what could possibly go wrong?

HSL: Exactly. There’s a lot that can go wrong when integrating different functional blocks into a single advanced package. Effects now happen on a 3D landscape. Signal integrity used to mean looking across a bus from source to destination. While bits on a bus may couple to other bits on the same bus, they can also affect other signals in the package, especially given vertical proximity, or part of the bus might be affected by something else. Designers start with “proven” chiplets and integrate them using silicon bridges or interposers, then route signals from those chiplets through physical packaging. Simplistic models don’t work well because every interconnect is a transmission line at these speeds. Signal integrity is a major concern in hatched or waffled ground-plane structures, which are almost mandatory given manufacturing constraints. There are holes in the ground return path, which may increase signal reflection and crosstalk, and this is really hard to model with conventional techniques due to its complexity.

SW: This is way beyond guessing where to place bypass capacitors on a printed circuit board, I suspect.

HSL: Without accurately modeling and simulating parasitic effects of all 3D structures in an advanced package design, there’s no hope to mitigate behaviors with external band-aids on a board. Everything going on inside the package has to be modeled and analyzed – looking at just part of the design and assuming the rest will perform the same way misses problems. However, customers tell us that finding the simplest ground-plane issues can take weeks of analysis, and even then, the analysis still doesn’t provide enough information to fix the more complex problems they’re encountering. It’s bringing traditional workflows to a grinding halt.

SW: So, they’re asking you for help? Modeling and simulation capabilities in Keysight Advanced Design System (ADS), along with new features such as Chiplet 3D Interconnect Designer, can provide a clearer picture?

HSL: Customers told us we need to do something, fast, and we have. We’ve been able to show engineers how to turn a comprehensive analysis of interconnects, including hatched ground planes, from weeks into minutes using Chiplet 3D Interconnect Designer. Imagine visually highlighting problems in a design in a few minutes, making corrections in one or more structures, re-running the analysis in a few more minutes, and seeing better results. Now teams can focus on optimizing the design and validating that it meets specifications.

SW: Or trying to find HBM parts, many of which are now in what vendors call allocation.

HSL: HBM supply is certainly a problem right now for all but a select few customers. You don’t want to be in the position of finding that a design needs to be redone for a different part because the one you specified had its lead time blow up while weeks of analysis was running.

SW: What about keeping up to date with the latest specifications?

HSL: That can be hard, too, and making that easier is a big focus in ADS and companion tools. In the last year, we’ve seen several revisions, including UCIe 3.0, PCIe Gen 7, and HBM4. And to complicate matters, customers sometimes customize their interconnects, borrowing concepts from UCIe or BoW but not adhering to all compliance rules. Faster analysis of the latest revision of pertinent specs or a user-defined set of rules enables our customers to stay competitive at the leading edge.

SW: You’ve coined a fascinating term – “design for hope.”

HSL: RF designers are dealing with a few crucial signal paths. High-speed digital designers are dealing with 128 or more signals between endpoints. Crosstalk analysis has been so tedious that engineers choose to look at only a few signals, get the eye margin where they’d like it, replicate the bus structures, and hope the rest of the lines are OK. As soon as real-world bridges, interposers, and packages appear, there’s a huge risk that the margin on the unobserved parts of the bus disappears. That’s an expensive omission just to save simulation time, yet engineers do it every day, and they pay the price if they miss. We’re at the point where data rates, ground planes, materials, structures, and packaging are unforgiving versus tight margins in specifications.

SW: Speaking of packaging, this workflow also helps there, correct?

HSL: Yes, the days when folks could design a chip and toss it over the wall to the packaging person and expect decent results are over. The classic workflow would perform a post-layout EM extraction, which at least provided some information, but not enough. Advanced packages mandate a co-design strategy with the 3DHI chip and either a bridge or an interposer. When comprehensive analysis runs in minutes, design space exploration, including packaging, yields better results.

SW: What big stuff is looming on the high-speed digital design frontier?

HSL: Power consumption is still a big concern at scale. Large AI chips have broken the 1000A barrier on their DC supply rail, and data center racks are pushing over 35,000A. Any ADCs or DACs in a design are power hogs. A new initiative, co-packaged optics (CPO), replaces electrons with photons, making it much more power-efficient. CPO may eventually flow back through the packages and interposers to the die-to-die interconnect, creating another transition for high-speed digital EDA tools and workflows.

There are, of course, other developments at Keysight EDA in the high-speed digital, power, and signal integrity circles. Teams of engineers, including Hee-Soo, are descending on Chiplet Summit and DesignCon shortly to share additional insights with audiences. Keysight has a resource hub where you can see everything happening at this year’s DesignCon in Santa Clara.

Keysight DesignCon Resource Hub

You’ll find this short introductory video for Chiplet 3D Interconnect Designer among the information there.

[videopack id=”366695″]https://semiwiki.com/wp-content/uploads/2026/02/Introduction-to-Chiplet-3D-Interconnect-Designer.mp4[/videopack]


Samtec Ushers in a New Era of High-Speed Connectivity at DesignCon 2026
by Mike Gianfagna on 02-16-2026 at 10:00 am

As I’ve discussed before, Samtec has a way of dominating every trade show the company participates in. The upcoming DesignCon event is no exception. At the show, Samtec will be discussing data rates up to 448 Gbps and signals up to 130 GHz. Beyond a rich set of demonstrations in the company’s booth, Samtec engineers will be participating as authors and/or speakers in papers, panels, and sponsored sessions throughout the conference. Attendees will even be treated to a bar crawl. DesignCon is a major event not to be missed, and Samtec’s presence makes it even more worthwhile. Let’s see how Samtec ushers in a new era of high-speed connectivity at DesignCon 2026.

At the Samtec Booth (939)

A main focus will be demonstrations of Samtec’s CPX offerings, including co-packaged copper (CPC) and co-packaged optics (CPO) solutions. Demonstrations will include Si-Fly® HD CPC connectors operating at 224 Gbps. Other booth demonstrations will include Samtec’s new Si-Fly® backplane system that supports 224 Gbps signals. This includes a 112 Gbps active optics demonstration incorporating Samtec’s FireFly and new Halo products.

You can also see Samtec’s distinctive orange Nitrowave RF cable as part of the new Bulls Eye BE71A test assembly for 224Gbps PAM4 SERDES and the Bulls Eye BE130 test assembly running 448 Gbps differential signals for test channels. Samtec will also showcase advancements in material science with its SureCoat ultra-rugged coatings for high temperatures and harsh environments.

Samtec in the Conference Agenda (and More)

February 25, 8:00 AM – 8:45 AM, Ballroom G

Improving Spectral Efficiency by Optimizing Sub-Nyquist Equalization for 448 Gbps

While 224 Gbps PAM4 systems are currently under development, standards bodies such as IEEE and OIF are already investigating 448 Gbps solutions. Doubling the data rate to 448 Gbps also doubles the Nyquist frequency—from 56 GHz to 112 GHz—posing significant new challenges. The objective of this paper is to present a comprehensive analysis identifying the best equalizers to recover a 448 Gbps PAM4 signal on a channel whose spectrum shows significant roll-off or resonances lying below the Nyquist frequency.

February 25, 12:10 PM – 12:50 PM, Great America 1

Successful PCIe 6.0 and 7.0 System Guidelines

As PCI-SIG moved to PAM4 modulation, the interconnect budget became more constrained, requiring improved loss and noise performance. This talk discusses the recipe for a robust PCIe 6.0 and 7.0 system, with case studies validated against the PCI-SIG pre-layout channel compliance methodology.

February 25, 12:15 PM – 1:00 PM, Ballroom G

Distributed Capacitor Characterization for Advanced Packaging

Distributed capacitors play a crucial role in ensuring power integrity in modern GPUs, which demand high current delivery and minimal voltage ripple at ever-increasing switching speeds. A systematic modeling approach has been developed to capture the electrical characteristics of these distributed capacitive structures. The proposed models were extensively validated through correlation with VNA measurements conducted across wide voltage and temperature ranges.

February 25, 3:00 PM – 3:45 PM, Ballroom E

An Improved Broadband Material Characterization Method

A novel method utilizing airline measurements is introduced to accurately and efficiently determine the dielectric constant and loss tangent of a material for broadband applications. This method is independent of metal loss and physical lengths, thus eliminating errors from measurement uncertainties. The proposed method has been validated through simulations and comparisons with commercial resonator methods.

February 26, 2:00 PM – 2:45 PM, Ballroom H

Lessons Learned at 224 Gbps

Design-in for 224 Gbps architectures has begun, due largely to the intensive data needs of AI HW architectures. This expert discussion will examine how we can design, build, and test 224 Gbps systems and achieve acceptable SI in light of technological and physical limitations.

February 24, 4:45 PM – 6:00 PM, Ballroom A

Panel – Designing & Validating the Future: SERDES & Channel Innovations for PCIe at 128 GT/s

Continuing the tradition of previous years, this panel will focus on the latest updates and changes to PCIe signaling and physical topologies, with a focus on PAM4 signaling and the PCIe 7.0 specification. Building upon this panel’s past contributions, this year’s participants bring a diverse knowledge base to discuss the latest advancements in simulation, design, and the innovative test and measurement methodologies required for these current and future PAM4 inflection points. Additional topics include correlation between simulation and validation, design practices for PCIe over optical cables and through electrical pathways, and signal integrity complications.

And of course, join Samtec at the company’s booth on Wednesday from 5:00-6:00pm during the DesignCon booth bar crawl.

Samtec Conference Experts in Attendance

Two of Samtec’s staff members have earned the prestigious title of Engineer of the Year at DesignCon; they are noted on the conference homepage.

Last year was the 30th anniversary of DesignCon. In honor of that, Istvan Novak provided an informative perspective on the history of DesignCon. You can read that interview here.

To Learn More

You can see more detail about Samtec’s participation in DesignCon 2026 here. You can register for the show here. And don’t forget to register with the Samtec VIP code INVITE373903 to receive a free Expo Pass or a 15% discount on conference passes. And that’s how Samtec ushers in a new era of high-speed connectivity at DesignCon 2026.

Also Read:

2026 Outlook with Mathew Burns of Samtec

Webinar – The Path to Smaller, Denser, and Faster with CPX, Samtec’s Co-Packaged Copper and Optics

Samtec Practical Cable Management for High-Data-Rate Systems


Bronco Debug Stress Tested Measures Up
by Bernard Murphy on 02-16-2026 at 6:00 am

I wrote last year about a company called Bronco, offering an agentic approach to one of the hottest areas in verification – root-cause debug. I find Bronco especially interesting because their approach to agentic AI is different from most. It is still based on LLMs of course, but emphasizes playbooks of DV wisdom for knowledge capture rather than other learning methods. Last year, setup required playbooks developed collaboratively between DV experts and the Bronco debug agent. Unsurprisingly for a young company in a fast-moving field, learning methods have evolved considerably, to the point that the debug agent can now build its own reusable playbook covering a substantial range of bugs. Proprietary expert DV refinement can then be added on top. David Zhi LuoZhang (CEO, Bronco) suggests that this expert step is a 10–15-minute task, hardly challenging. He adds that generated playbooks and memories from debug runs are now combined with specs and user-provided documentation. Together this institutional knowledge coalesces in a customer-specific library he calls the Bronco Library (to be covered in a later blog perhaps).

How well does it work?

An industrial strength test

David can’t share a name for a recent successful eval, but he can say it is a major public company that evaluated on a big SoC. Lots of sensors, suggesting probably several modalities of AI processing plus the usual multi-core CPU, memory management, etc. It also sounds like it may be safety critical. The eval task was to find a significant number of known bugs discovered during active development on a design. Also important, these were SoC-level bugs, the hardest to root-cause. Examples they cite include performance problems in AI accelerators, deadlocks in power mode transitions, nasty UVM race conditions, and assertion-firing issues.

Bronco was able to isolate an exact root-cause location for about 50% of those bugs without help – some in UVM testbenches, some in the design – just by looking at the RTL, the UVM testbench, the design spec, and playbooks. For another 25% the agent was able to localize the root cause to a file (out of 10k files). After initial setup, analysis was hands-free and completed in 15 minutes. A human debugger would surely have taken hours if not days to get to similar closure. That’s a pretty significant advance on reducing the largest overhead (debug) in DV.

Integration with regression flows

AI-based methods can have challenges integrating into regression flows. Apparently, Bronco have found a way to coexist very neatly with these flows. Their debugger can be instrumented into an overnight regression, triggering whenever a failure is found and launching an investigation of each failure in turn. For each, the debug agent creates a ticket, showing not only the root problem but also the steps that lead to this diagnosis. The ticket is then ready for DV to review in the morning. When an engineer analyzes a ticket the following morning, they can submit it to Jira directly if they choose.

On that note, a sample ticket is worth a look, just to understand the detailed analysis this debug agent is able to generate.

This example is based on bug analysis on a block rather than one of the SoC bugs to protect proprietary details. Nevertheless, the quality of analysis suggested by this ticket is pretty impressive. If this example is representative for even 50% of the bugs exposed in a regression, I imagine debug technologies like this are going to take off fast.

What if the debug agent delivers an incorrect analysis or doesn’t get close enough in isolating a root cause? In such a case, working through the list of tickets, the DV engineer can provide extra guidance and/or suggest the debugger go deeper. That re-analysis continues in the background, so that well before the engineer has finished checking subsequent tickets, updated results are ready for review.

Big step forward in agentic debug. You can learn a bit more about Bronco HERE.

Also Read:

Verification Futures with Bronco AI Agents for DV Debug

Superhuman AI for Design Verification, Delivered at Scale

AI RTL Generation versus AI RTL Verification


TSMC and Cadence Strengthen Partnership to Enable Next-Generation AI and HPC Silicon
by Daniel Nenni on 02-15-2026 at 6:00 pm

TSMC continues to reinforce its leadership in advanced semiconductor manufacturing through its deepening collaboration with Cadence Design Systems. The expanded partnership focuses on enabling next-generation artificial intelligence and high-performance computing innovations by aligning advanced electronic design automation, 3D-IC technologies, and silicon-proven intellectual property with TSMC’s most advanced process nodes and packaging platforms.

At the heart of this collaboration is support for TSMC’s cutting-edge process technologies, including N3, N2, and A16™, which are critical for meeting the escalating performance, power efficiency, and scalability demands of AI workloads. Cadence’s AI-driven design flows have been optimized and validated for these nodes, allowing customers to achieve superior power, performance, and area outcomes while accelerating time to market. These flows leverage machine learning–based optimization to address the growing complexity of advanced-node designs, particularly for large-scale AI accelerators and HPC processors.

TSMC’s roadmap toward even more advanced technologies is further strengthened by joint development efforts with Cadence on future nodes, including the upcoming A14 process. Early EDA flow collaboration and PDK readiness ensure that customers can begin design work sooner, reducing risk and enabling faster adoption of next-generation silicon technologies. This early alignment between foundry and EDA provider is increasingly vital as design margins shrink and integration challenges intensify at advanced nodes.

Beyond transistor scaling, the partnership plays a critical role in advancing TSMC’s 3DFabric® platform, which enables heterogeneous integration through advanced packaging and die stacking. Cadence’s comprehensive 3D-IC solutions support a wide range of TSMC 3DFabric configurations, providing automation for bump connections, multi-chiplet physical implementation, and system-level analysis. AI-driven tools such as Clarity™ 3D Solver and Sigrity™ X enable accurate signal integrity, power integrity,  and thermal analysis, helping designers manage the complexities of large, multi-die systems.

Photonics integration is another area of collaboration, particularly through support for TSMC’s Compact Universal Photonic Engine (COUPE™). By combining Cadence’s Virtuoso® Studio and Celsius™ Thermal Solver with TSMC-developed productivity enhancements, customers can more effectively model thermal and electrical interactions in photonic and electronic systems. This capability is increasingly important for AI and data center applications, where power density and thermal management directly impact system reliability and performance.

A key pillar of the Cadence and TSMC partnership is the availability of design-in-ready, silicon-proven IP on advanced nodes such as TSMC N3P. Leading-edge memory and interface IP, including HBM4, LPDDR6/5X, DDR5 MRDIMM Gen2, PCIe® 7.0, and UCIe™, addresses the growing memory bandwidth and interconnect challenges faced by AI systems. These IP offerings enable customers to scale compute performance efficiently while overcoming bottlenecks associated with data movement and power consumption.

Bottom line: Together with the broader Open Innovation Platform® ecosystem, TSMC and Cadence are streamlining the path from design to silicon. By integrating AI-driven EDA, advanced packaging solutions, and next-generation IP with TSMC’s manufacturing leadership, the partnership empowers customers to deliver faster, more energy-efficient AI and HPC solutions. As AI adoption accelerates globally, this close collaboration positions TSMC at the center of the AI semiconductor super-cycle, enabling innovation across the entire value chain, from process technology to system-level integration.

Also Read:

TSMC & GCU Semiconductor Training Program: Preparing Tomorrow’s Workforce

NanoIC Extends Its PDK Portfolio with First A14 Logic and eDRAM Memory PDK

TSMC’s 2026 AZ Exclusive Experience Day: Bridging Careers and Semiconductor Innovation


CEO Interview with Elad Raz of NextSilicon
by Daniel Nenni on 02-15-2026 at 4:00 pm

Elad Raz, NextSilicon

Elad Raz is the founder and CEO of NextSilicon, which he established in 2018 to fundamentally rethink how HPC architectures are built. After more than two decades designing and scaling advanced software and compute systems, Elad saw firsthand the limits of fixed, inflexible processor designs. He founded NextSilicon to address those constraints with a software-defined approach to HPC, enabling architectures that can adapt to evolving workloads across HPC, AI, and data-intensive computing.

Tell us about your company?

NextSilicon is redefining the future of HPC and AI with Maverick-2, the industry’s first production dataflow accelerator powered by our Intelligent Compute Architecture (ICA). Maverick-2 combines intelligent software with adaptable hardware to deliver unprecedented energy efficiency, scalability, and precision across applications ranging from scientific discovery and generative AI to climate modeling and healthcare.

NextSilicon emerged from stealth in October 2024 to launch Maverick-2. Since then, Maverick-2 has been redefining the interaction between hardware and software to establish a compute paradigm rooted in adaptability and efficiency. With over $300 million in funding and more than 400 employees globally, we’ve already achieved significant milestones, including deployment at Sandia National Laboratories’ Vanguard-II supercomputer.

What problems are you solving?

There is a fundamental mismatch in how we build compute today. Modern high-performance computing (HPC) and AI workloads are irregular and memory-intensive, and they move massive amounts of data. Yet we are still forcing them onto architectures optimized decades ago for completely different problems. CPUs and GPUs simply weren't built for what we're asking them to do now.

Maverick-2 addresses the growing inefficiencies of CPUs and GPUs, which struggle to meet the architectural, energy, and scalability demands of AI and HPC workloads in fields such as science, weather, energy, and defense. These workloads involve complex data dependencies, memory-access patterns, and computational structures that today's processors handle poorly, resulting in suboptimal performance and bottlenecks that stifle innovation.

Beyond hardware challenges, there’s also a developer pain point that leads teams to spend months porting and rewriting code for new architectures. Maverick-2 eliminates this burden by supporting existing code in C/C++, FORTRAN, OpenMP, and Kokkos without requiring specialized programming or domain-specific languages. We achieve up to 10x performance improvements while using 60% less power – a 4x performance-per-watt advantage – all without forcing developers to rewrite their applications.
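
To make the no-rewrite claim concrete, here is a hypothetical sketch of the kind of code in question: a plain C/OpenMP saxpy kernel with no vendor-specific pragmas, intrinsics, or domain-specific language. Nothing in it is NextSilicon-specific; it is simply an illustration of the sort of existing code that, per the claim above, would not need to be rewritten, and it compiles with any OpenMP-capable compiler (e.g. cc -fopenmp saxpy.c).

```c
/* Hypothetical example of "existing code": a standard C/OpenMP saxpy kernel.
 * Nothing here is NextSilicon-specific; it only illustrates the kind of
 * unmodified source the interview says Maverick-2 can accelerate. */
#include <stdio.h>
#include <stdlib.h>

/* saxpy: y = a*x + y, a common HPC building block */
static void saxpy(size_t n, float a, const float *x, float *y) {
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void) {
    size_t n = 1u << 20;
    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    if (!x || !y) return 1;
    for (size_t i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy(n, 3.0f, x, y);
    printf("y[0] = %.1f\n", y[0]);  /* expect 5.0 = 3*1 + 2 */
    free(x);
    free(y);
    return 0;
}
```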

What application areas are you strongest in?

We're seeing strong traction in workloads that struggle on traditional architectures: HPC simulations, scientific computing, and data-intensive enterprise and research applications. Specifically, our dataflow architecture excels in:

○ Energy: Fluid dynamics, exploration and seismic analysis, and particle simulation
○ Life Sciences: Drug discovery, protein docking, and multiomics analysis
○ Financial Services: Risk simulation, backtesting, and options modeling
○ Manufacturing: Finite element analysis, crash simulation, and digital twins
○ AI/ML: Thinking models, reasoning workloads, and trillion-parameter LLMs with massive context windows approaching 1M tokens
○ Graph Analytics: Social network analysis, fraud and anomaly detection, and recommendation systems

What’s important is that Maverick-2’s programmability allows a single platform to span multiple domains. As AI and HPC continue to converge, that flexibility becomes critical.

What keeps your customers up at night?

Our customers face enormous pressure to achieve greater performance within strict limits on power, cooling, and budget. They worry about locking themselves into architectures optimized for today’s models that can’t adapt to the next challenge. Efficiency, scalability, and longevity are critical.

Developer productivity is another major concern. Many customers have CPU workloads that were never GPU-accelerated. NextSilicon’s Intelligent Compute Architecture can accelerate these codes without forcing them into a GPU-centric paradigm. For teams already using GPUs, the goal is to explore new architectures without extensive code rewrites or learning proprietary languages and frameworks. They want to focus on scientific discovery and innovation, not porting cycles.

Customers also value future-proofing. As algorithms and models evolve at an unprecedented pace, choosing the right architecture has become genuinely difficult. Our software-defined hardware adapts dynamically during execution and evolves with workloads over time. This protects long-term investments while delivering immediate performance and efficiency gains. Customers can keep using existing codebases in familiar languages and still achieve significant improvements, addressing both technical and resource constraints.

What does the competitive landscape look like and how do you differentiate?

The landscape is crowded, but it's constrained by legacy thinking. CPUs keep adding more cores, more instructions, and more cost. GPUs are stretched across every use case, while many accelerators optimize for very narrow ones.

At NextSilicon, we’re not trying to incrementally improve an old paradigm. Dataflow is a fundamentally different execution model. Our strengths lie in our true runtime reconfigurability, general-purpose programmability, and system-level efficiency.
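
For readers unfamiliar with the term, the following is a minimal, generic sketch of the dataflow idea: each operation fires as soon as its operands arrive, rather than waiting its turn in a fixed instruction stream. It is a conceptual illustration only, with hypothetical node names, and does not describe NextSilicon's Intelligent Compute Architecture or any of its interfaces.

```c
/* Generic dataflow sketch: a node executes ("fires") the moment all of its
 * operands have arrived. Conceptual only; not NextSilicon's implementation. */
#include <stdio.h>

typedef struct {
    const char *name;
    int inputs_needed;            /* operands this node waits for */
    double operands[2];
    int received;                 /* operands delivered so far */
    double (*op)(double, double);
    int consumer;                 /* downstream node index, -1 if none */
    int consumer_slot;            /* which operand slot it feeds */
} Node;

static double add(double a, double b) { return a + b; }
static double mul(double a, double b) { return a * b; }

static void deliver(Node *nodes, int idx, int slot, double value);

static void fire(Node *nodes, int idx) {
    Node *n = &nodes[idx];
    double result = n->op(n->operands[0], n->operands[1]);
    printf("%s fired -> %.1f\n", n->name, result);
    if (n->consumer >= 0)
        deliver(nodes, n->consumer, n->consumer_slot, result);
}

static void deliver(Node *nodes, int idx, int slot, double value) {
    Node *n = &nodes[idx];
    n->operands[slot] = value;
    if (++n->received == n->inputs_needed)
        fire(nodes, idx);         /* data availability, not program order, triggers execution */
}

int main(void) {
    /* Graph for (a + b) * (c + d): two adds feed one multiply. */
    Node nodes[] = {
        {"add1", 2, {0, 0}, 0, add, 2, 0},
        {"add2", 2, {0, 0}, 0, add, 2, 1},
        {"mul",  2, {0, 0}, 0, mul, -1, 0},
    };
    deliver(nodes, 0, 0, 1.0);    /* a */
    deliver(nodes, 0, 1, 2.0);    /* b */
    deliver(nodes, 1, 0, 3.0);    /* c */
    deliver(nodes, 1, 1, 4.0);    /* d */
    return 0;
}
```

In a real dataflow machine this firing rule is realized in hardware across many functional units; the sketch is only meant to convey the execution model, not any particular implementation.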

The key technical differentiator is our dataflow architecture, which allows hardware to adapt to software in real time rather than forcing software to conform to rigid hardware designs. This, combined with our strong financial backing and scale, positions us uniquely in the market.

Maverick-2 isn’t about replacing everything overnight. It’s about offering a compelling complement to traditional architectures that are showing diminishing returns.

What new features / technology are you working on?

We're continuously enhancing Maverick-2's hardware and software stack. Right now, we are focused on expanding our software ecosystem to support additional programming models and frameworks, making integration even easier. We're also optimizing performance and efficiency for emerging workload patterns as HPC and AI workflows converge. The roadmap is driven by feedback and by working closely with our early customers and partners; we want to ensure it aligns with real-world requirements and problems, not theoretical ones.

How do customers normally engage with your company?

We work with customers both directly and through our partner ecosystem, which includes Dell Technologies, Penguin Solutions, HPE, Databank, Vibrint, and E4. These partnerships enable us to provide integrated solutions and support to organizations seeking to accelerate their HPC and AI workloads. For specific engagement options and to discuss how we can address computational challenges, we encourage interested organizations to reach out to our team.

Also Read:

CEO Interview with Naama BAK of Understand Tech

CEO Interview with Dr. Heinz Kaiser of Schott

CEO Interview with Moshe Tanach of NeuReality