
Securing RISC-V Third-Party IP: Enabling Comprehensive CWE-Based Assurance Across the Design Supply Chain

by Admin on 03-02-2026 at 10:00 am


by Jagadish Nayak

RISC-V adoption continues to accelerate across commercial and government microelectronics programs. Whether open-source or commercially licensed, most RISC-V processor cores are integrated as third-party IP (3PIP), potentially introducing supply chain security challenges that demand structured, design-level assurance.

As systems become more heterogeneous and interconnected, design supply chain security is no longer a documentation exercise, but an engineering challenge. A single weakness in processor IP can cascade into systemic risk. That reality makes scalable, repeatable 3PIP assurance essential, especially for RISC-V cores deployed in mission-critical environments.

From Third-Party IP Risk to Repeatable Assurance

Traditional IP integration workflows often rely on vendor claims, checklist-based reviews, and limited test evidence. While helpful, these approaches rarely provide design-level assurance across all relevant weakness classes. A Common Weakness Enumeration (CWE)-based methodology addresses this gap with structured, measurable, and portable security validation.

In place of ad hoc reviews, relevant weaknesses are scoped from the MITRE CWE database, translated into security requirements, verified through executable properties and tests, and captured as traceable assurance artifacts.

The outcome is not simply test coverage, but documented security assurance tied directly to recognized weakness definitions.
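As a rough illustration of what a traceable assurance artifact might look like, the sketch below models one record per CWE, linking the scoped weakness to its derived requirement, the checks run against it, and the evidence collected. The schema, field names, and verdict labels are assumptions made for this example and are not taken from the methodology's actual tooling.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    OUT_OF_SCOPE = "out_of_scope"
    NOT_ANALYZED = "not_analyzed"


@dataclass
class AssuranceRecord:
    """One CWE traced from scoping to evidence (hypothetical schema)."""
    cwe_id: str        # placeholder identifier, not a real CWE number
    requirement: str   # security requirement derived from the CWE text
    checks: list[str] = field(default_factory=list)    # properties, rules, C tests
    evidence: list[str] = field(default_factory=list)  # reports, logs, waveforms
    verdict: Verdict = Verdict.NOT_ANALYZED

    def is_traceable(self) -> bool:
        # A verdict only counts as assurance when it is backed by a
        # requirement, at least one check, and at least one artifact.
        if self.verdict is Verdict.NOT_ANALYZED:
            return False
        return bool(self.requirement) and bool(self.checks) and bool(self.evidence)


record = AssuranceRecord(
    cwe_id="CWE-EXAMPLE",
    requirement="Sensitive state must not be readable from a lower privilege mode.",
    checks=["info_flow_rule_a", "c_test_b"],
    evidence=["rule_report.txt"],
    verdict=Verdict.PASS,
)
print(record.cwe_id, record.verdict.value, record.is_traceable())
```

Keeping the requirement, checks, and evidence in one record is what lets a reviewer walk from a verdict back to the weakness definition that motivated it.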

Scaling Assurance for the RISC-V Ecosystem

RISC-V’s consistent ISA foundation enables reusable security requirement templates, parameterized verification properties, and portable C-based test workloads. Once developed, these artifacts can be applied across multiple RISC-V cores with minimal modification, significantly reducing non-recurring engineering (NRE) effort.

The benefits of templatizing CWE-derived security requirements for RISC-V processors include:

  • Teams can avoid starting from scratch for each integration.
  • Scope inclusion decisions can be repurposed.
  • Verification properties can be parameterized for specific RTL implementations.
  • C-based tests can be compiled using standard RISC-V toolchains and reused across multiple cores with minimal modification.

This portability is particularly powerful for programs integrating multiple RISC-V implementations across product lines or lifecycle revisions.

Demonstrated Use Case: SiFive X280 3PIP Assurance

In a collaboration between Cycuity (an Arteris brand), SiFive, and BAE Systems, the methodology was applied to a commercial RISC-V core integrated into a larger SoC. Of 60 CWEs identified as potentially in scope, 16 have been analyzed using templated security requirements and reusable verification infrastructure spanning information-flow rules, static analysis, portable C-tests, and assertion-based verification.

What the Results Reveal

Of the 16 CWEs analyzed:

  • 12 were confirmed passing under the defined requirements.
  • 3 were flagged as failing by rule definition.
  • 1 was determined to be out of scope after deeper analysis.

Importantly, failing a CWE does not inherently indicate a vulnerability—it highlights divergence from formal CWE definitions and prompts system-level evaluation of mitigation strategy.
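The reported tallies can be checked with a few lines of arithmetic. The counts below come from the text; the verdict labels are chosen for this sketch only.

```python
from collections import Counter

# Counts as reported above; the label strings are illustrative.
verdicts = Counter({"pass": 12, "fail_by_rule": 3, "out_of_scope": 1})
candidate_cwes = 60  # CWEs identified as potentially in scope

analyzed = sum(verdicts.values())
coverage = analyzed / candidate_cwes  # share of candidate CWEs analyzed so far

print(f"analyzed {analyzed} of {candidate_cwes} candidate CWEs ({coverage:.1%})")
for label, count in verdicts.items():
    print(f"{label:>13}: {count:2d} ({count / analyzed:.1%} of analyzed)")

# Failing entries are the ones that call for system-level mitigation review.
needs_review = verdicts["fail_by_rule"]
```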

For example, evaluation of debug-mode transitions revealed that sensitive registers are not automatically cleared when entering debug mode. While architecturally intentional, this behavior required software mitigation planning at the system level. Similarly, analysis of register reset conditions identified registers not explicitly initialized at reset. Although deemed non-critical in context, the structured analysis ensured no assumptions were left unvalidated.

These findings highlight an essential point: assurance is not simply about finding flaws; it is about eliminating uncertainty. Engineers and managers alike gain clearer visibility into implementation behavior, design intent, and mitigation boundaries.

Reducing NRE Through Reusable Assurance Templates

One of the most valuable outcomes of the RISC-V initiative is measurable reduction in assurance effort. Once security requirement templates, property macros, and portable test harnesses are defined, subsequent RISC-V cores can be evaluated with significantly reduced engineering investment. The methodology enables acceleration without sacrificing rigor.

Verification teams can focus effort where differentiation truly exists, such as implementation-specific signals, privilege modes, memory maps, and integration boundaries, rather than recreating foundational security requirements. For government and defense programs under Trusted and Assured Microelectronics (T&AM) objectives, this repeatability directly supports both technical assurance and program schedule constraints.

Strengthening the Design Supply Chain for RISC-V

As microelectronics ecosystems diversify, design supply chains now span open-source repositories, commercial IP vendors, integrators, tool providers, and system-level developers. Supply chain security cannot be enforced solely at procurement but must be embedded within the design verification lifecycle.

CWE-based assurance provides a shared technical language across stakeholders, for instance:

  • IP providers can align their documentation and artifacts to standardized weakness definitions.
  • Integrators can demand traceable evidence.
  • System architects can quantify residual risk and implement deliberate mitigations.

This transparency strengthens collaboration without exposing proprietary RTL or design details unnecessarily.

Looking Ahead: Expanding Beyond RISC-V

While this work focuses on RISC-V, the methodology generalizes to any third-party IP, from processors to accelerators and peripherals. Assurance will never be zero cost, but structured, reusable frameworks transform it from reactive compliance into a scalable engineering discipline. For organizations building on RISC-V and beyond, this shift is foundational to safeguarding modern design supply chains.

As RISC-V deployment continues to grow in high-assurance and mission-critical systems, design teams must move beyond trust-by-assumption. Comprehensive, CWE-based 3PIP verification enables measurable confidence, reduces integration uncertainty, and strengthens the entire microelectronics ecosystem from IP provider to end system.


Jagadish Nayak is a Distinguished Engineer in Security at Arteris (formerly Cycuity). He provides technical expertise and guidance on hardware security verification and the Radix family of security verification tools. He has an extensive background in hardware design, verification, and security analysis, with over 30 years of semiconductor industry experience.

Also Read:

Arteris Smart NoC Automation: Accelerating AI-Ready SoC Design in the Era of Chiplets

The IO Hub: An Emerging Pattern for System Connectivity in Chiplet-Based Designs

Arteris Simplifies Design Reuse with Magillem Packaging


Apple’s iPhone 17 Series 5G mmWave Antenna Module Revealed to be Powered by Soitec FD-SOI Substrates

by Daniel Nenni on 03-02-2026 at 8:00 am

Qualcomm’s QTM565 mmWave Antenna Module

Recent independent teardown and technical analyses have confirmed that the 5G mmWave antenna module powering Apple’s latest iPhone 17 lineup relies on advanced Soitec-based Fully Depleted Silicon-On-Insulator (FD-SOI) substrate technology. The discovery highlights a significant architectural shift in high-frequency RF integration for flagship smartphones.

Industry intelligence firms Yole Group and TechInsights recently conducted detailed teardowns of the Qualcomm QTM565, the mmWave integrated antenna-in-package (AiP) module used in the iPhone 17, iPhone 17 Pro, and iPhone 17 Pro Max. According to TechInsights’ study, “Qualcomm HG11-34443-2 (QTM565) FR2 Tx/Rx Front End Die RFIC Process Analysis” by Sharath Poikayil Satheesh, and corroborated by Yole Group’s component analysis, the Qualcomm QTM565 module utilizes GlobalFoundries’ 22FDX RF process. This process is fundamentally built upon advanced FD-SOI substrates supplied by Soitec.

The same RF die has been embedded in the AiP mmWave solution within the iPhone 17 series, highlighting the growing use of FD-SOI substrates in mmWave RFIC design for premium smartphones. This marks one of the most visible commercial validations of FD-SOI for high-volume consumer 5G mmWave applications.

Daniel Nenni, founder of SemiWiki, commented: “The findings highlight a major industry shift toward highly integrated 5G mmWave Systems-on-Chip (SoCs), where phased array element spacing and area constraints are highly critical.” As antenna arrays move to higher frequencies in the FR2 mmWave spectrum, the physical spacing between elements becomes increasingly constrained, demanding exceptional integration density and signal integrity at the silicon level.

TechInsights’ executive summary notes that FD-SOI devices are uniquely engineered to operate effectively into the mmWave band, making monolithic and highly integrated phased-array transmit and receive SoCs feasible for top-tier consumer electronics. Unlike traditional bulk CMOS approaches, FD-SOI leverages a thin buried oxide layer that reduces parasitic capacitance, enhances electrostatic control, and improves RF isolation—critical attributes for mmWave performance.

The FD-SOI Advantage in the iPhone 17

By serving as the foundational substrate for the Qualcomm QTM565 module, FD-SOI technology enables fully integrated 5G mmWave SoCs. For manufacturers like Apple and Qualcomm, this platform delivers several strategic advantages:

Miniaturization and Footprint Optimization

FD-SOI provides significant logic scaling benefits while maintaining strong RF characteristics. Designers can integrate baseband functions, beamforming control logic, power management, and RF front-end components onto a single die. This high level of monolithic integration reduces the Bill of Materials (BOM), minimizes PCB footprint, and lowers interconnect losses between discrete components. In space-constrained smartphones, these savings directly translate into slimmer form factors or additional room for battery capacity and thermal management.

Best-in-Class Power Efficiency

Operating at low voltages, FD-SOI enables dynamic body biasing and precise threshold control, optimizing performance per watt. The inherent low-noise analog devices and excellent device matching support stable beamforming and signal integrity without excessive power draw. For end users, this means sustained 5G mmWave throughput without disproportionately draining battery life—a critical factor as carriers continue expanding high-band spectrum deployments.

Unmatched RF Performance

As demonstrated by silicon mmWave prototypes cited in the TechInsights analysis, FD-SOI provides the device-level precision necessary for high-frequency operation across both sub-6 GHz and FR2 mmWave bands. Improved isolation and reduced variability enhance linearity and gain control in phased-array architectures, directly impacting range, data rate stability, and thermal performance.

Strategic Implications for the Semiconductor Ecosystem

The integration of FD-SOI technology into Apple’s flagship iPhone 17 series underscores the substrate’s expanding role in next-generation RF system design. It also reflects a broader industry trend: convergence of digital logic and high-frequency RF on a single optimized platform.

As 5G evolves and early 6G research accelerates, the demand for compact, power-efficient, and highly integrated mmWave solutions will intensify. FD-SOI’s combination of RF excellence, power efficiency, and scalability positions it as a compelling enabler for future mobile connectivity platforms.

Bottom line: With independent validation from Yole Group and TechInsights, and commercial deployment in one of the world’s highest-volume premium smartphones, Soitec’s FD-SOI substrate technology has secured a visible and strategic foothold in the mmWave era, driving miniaturization, extending battery life, and redefining what is possible in integrated RF design.


Also Read:

Podcast EP331: Soitec’s Broad Impact on Quantum Computing and More with Dr. Christophe Maleville

Podcast EP321: An Overview of Soitec’s Worldwide Leadership in Engineered Substrates with Steve Babureck

FD-SOI: A Cyber-Resilient Substrate for Secure Automotive Electronics

 


Another Quantum Topic: Quantum Communication

by Bernard Murphy on 03-02-2026 at 6:00 am

Quantum teleportation

In my recent series on quantum computing (QC), I intentionally overlooked a couple of adjacent topics: quantum communication and quantum sensing. These face some of the same challenges as QC. However, I noticed a recent report on a test quantum network implemented by Cisco and Qunnect, which led me to find more from Cisco on their work in quantum networking.

Early post since I will be at DVCon next week moderating a panel, among other activities.

What is the point of quantum communication?

Quantum communication is based on entanglement: two physically separated qubits whose states are nevertheless coupled so that if one somehow changes state, the other “instantaneously” also changes state. This holds even if the qubits are separated by thousands of kilometers, and it led Einstein to call this behavior “spooky action at a distance”.

This idea prompted some to think entanglement implied faster-than-light communication. Sadly no – physical laws are not violated by this technique. Still, the concept has led quantum experts (and Star Trek enthusiasts) to label methods using this technique as “quantum teleportation”, which I’ll call QT.

The point of QT is security in the communication channel. If a third party attempts to monitor a qubit state at either end, both qubit states immediately collapse and the information is lost. This immediately signals an attempted hack while also destroying information before it can be revealed, which is valuable for cryptographic key distribution.
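To make the "monitoring reveals itself" idea concrete, here is a minimal classical simulation. Note the substitution: it models the simpler prepare-and-measure BB84 protocol with an intercept-and-resend eavesdropper, not the entanglement-based scheme discussed here, and it assumes an idealized noise-free channel. The function and parameter names are invented for this sketch.

```python
import random


def bb84_error_rate(n_rounds: int, eavesdrop: bool, seed: int = 0) -> float:
    """Error rate on the sifted key for an idealized, noise-free BB84 run."""
    rng = random.Random(seed)
    sifted = 0
    errors = 0
    for _ in range(n_rounds):
        bit = rng.randint(0, 1)
        alice_basis = rng.randint(0, 1)  # 0 = rectilinear, 1 = diagonal
        # The photon in flight is modelled as (basis it was prepared in, bit).
        state_basis, state_bit = alice_basis, bit

        if eavesdrop:
            eve_basis = rng.randint(0, 1)
            # Measuring in the wrong basis yields a random outcome.
            eve_bit = state_bit if eve_basis == state_basis else rng.randint(0, 1)
            # Eve resends what she saw, prepared in her own basis.
            state_basis, state_bit = eve_basis, eve_bit

        bob_basis = rng.randint(0, 1)
        bob_bit = state_bit if bob_basis == state_basis else rng.randint(0, 1)

        # Sifting: keep only rounds where Alice and Bob chose the same basis.
        if bob_basis == alice_basis:
            sifted += 1
            errors += int(bob_bit != bit)
    return errors / sifted if sifted else 0.0


print("no eavesdropper:  ", bb84_error_rate(20_000, eavesdrop=False))
print("with eavesdropper:", bb84_error_rate(20_000, eavesdrop=True))
```

In this model an undisturbed channel gives no sifted-key errors, while intercept-and-resend pushes the error rate toward roughly one in four, which is the signal the two ends would use to discard the key.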

These methods are considered “quantum-safe” unlike “quantum-resistant” methods for protecting encrypted data, which are known to defend (classically) against Shor’s algorithm but not against as-yet unknown advances on Shor. External hacks against entanglement-based QT must (as far as I can tell) hack physics, a very tall order.

Cisco work in QT

I got my information from this Cisco article and this blog. Cisco have already developed a research prototype quantum network entanglement chip able to generate a million entangled photons per second, running at room temperature and working with existing photonic infrastructure.

Cisco’s prototype run with Qunnect suggests that this entanglement chip can be used to connect a cryogenic-based system (a superconducting QC, for example) on one end to fiber, and on the other end of the fiber back to another QC. Details on how this works are sparse, I’m afraid, but they do claim they were able to connect reliably over more than 17km, a very practical distance for the banks and other financial institutions that cluster around the New York area, where this trial was run.

Cisco have a higher aim: quantum networking that could scale up QC capacity without waiting for qubit counts to scale up in individual QCs. A quantum-network-connected topology of QCs could in theory provide almost as much capability (?) as a single large QC. If this works out in practice, it could be huge for QC.

Sanity checks

Quantum key distribution (QKD) seems to be the most real part of this story today. China claims a 2000km QKD backbone between Beijing and Shanghai supporting banks. This has been in operation for quite a while.

The idea of connecting multiple QC nodes through a quantum internet still looks experimental. The University of Chicago is active in this area; also see the earlier Cisco reference on their quantum labs.

Interesting – a new possible path towards a large scale quantum computer and truly secure networking.

Also Read:

PQShield on Preparing for Q-Day

Where is Quantum Error Correction Headed Next?

Quantum Computers: Are We There Yet?


Advancing Automotive Memory: Development of an 8nm 128Mb Embedded STT-MRAM with Sub-ppm Reliability

by Daniel Nenni on 03-01-2026 at 6:00 pm

World First 8nm 128Mb Embedded STT MRAM for Automotive
IEDM 2025 Papers MRAM RRAM

The rapid evolution of automotive technology has intensified the demand for highly reliable, high-performance semiconductor memory solutions. Modern vehicles increasingly rely on ADAS driving features and complex infotainment platforms, all of which require memory that can operate flawlessly under extreme environmental conditions. Among emerging memory technologies, embedded magnetic random access memory (eMRAM) stands out as a compelling candidate due to its non-volatility, high endurance, and fast read/write capabilities. The development of an 8nm 128Mb embedded STT-MRAM specifically tailored for automotive applications represents a significant technological milestone in this field.

One of the primary challenges in automotive memory design is ensuring reliable operation across a wide temperature range, typically from –40°C to 150°C. Unlike consumer electronics, automotive systems must maintain data integrity and functional stability even under prolonged exposure to high temperatures. This stringent requirement places considerable pressure on memory architectures, particularly when scaling down to advanced process nodes such as 8nm. Shrinking the technology node increases memory density and performance but also introduces heightened risks of failure mechanisms, including short defects, read margin degradation, and retention loss.

A major breakthrough in the 8nm 128Mb eMRAM development lies in the aggressive scaling of the memory cell to 0.017 μm². While this scaling enables higher density and improved integration with advanced logic nodes, it also intensifies process complexity. Higher bitcell density increases the probability of short failures due to redeposition and patterning challenges during fabrication. To address this, improvements in integration processes significantly reduced in-line defect counts, resulting in a substantial decrease in median short fail bit counts. Achieving sub-parts-per-million (sub-ppm) levels of short failure demonstrates that high-density scaling can coexist with automotive-grade reliability when supported by meticulous process optimization.

Another critical concern in scaled MRAM technology is maintaining sufficient read margin. As the back-end-of-line (BEOL) thermal budget increases in advanced nodes, thermal migration can degrade the magnetic tunnel junction (MTJ) properties, particularly the tunneling magnetoresistance (TMR). A lower TMR reduces the resistance gap between parallel (P) and anti-parallel (AP) states, narrowing the sensing window and increasing the risk of read errors. By optimizing the MTJ stack, especially through fine-tuning the free layer composition, the design achieved improved thermal tolerance. In fact, enhanced crystallization of the MgO barrier after thermal treatment led to an increase in TMR, thereby widening the read margin. Combined with patterning improvements that drastically suppressed inter-cell leakage, these advancements enabled ppm-level read failure rates even at elevated temperatures.
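The link between TMR and the sensing window can be written out using the commonly used definition of the TMR ratio. The symbols $R_{P}$ and $R_{AP}$ (resistance in the parallel and anti-parallel states) and $\Delta R$ are notation introduced here, not taken from the paper.

```latex
\mathrm{TMR} = \frac{R_{AP} - R_{P}}{R_{P}}
\qquad\Longrightarrow\qquad
\Delta R = R_{AP} - R_{P} = \mathrm{TMR}\cdot R_{P}
```

For a given $R_{P}$, the resistance gap $\Delta R$ that the sense amplifier must resolve scales directly with TMR, so a TMR loss from thermal exposure narrows the read window and a TMR gain widens it.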

Write performance and data retention present another delicate trade-off. Automotive specifications demand both low write error rates (WER) and robust long-term retention, often exceeding 20 years at high temperatures. However, optimizing for easier write switching can compromise thermal stability, and vice versa. To balance this trade-off, pinned layer optimization was employed to tailor asymmetry between P and AP switching characteristics. By carefully adjusting the magnetic stack, engineers identified an optimal asymmetry point that minimized overall bit error rates while preserving retention strength. Furthermore, reducing the temperature dependence of switching current improved write reliability at low temperatures, where higher currents are typically required.

In addition to pinned layer refinement, enhancements in spin-transfer torque (STT) efficiency further reduced switching current requirements without sacrificing thermal stability. Improved MTJ engineering broadened the switching current window, lowering the voltage necessary to meet WER specifications while significantly improving distribution tail behavior. These refinements resulted in sub-ppm levels of both write error rate and retention bit error rate, effectively eliminating yield loss related to these failure mechanisms.

Finally, comprehensive chip-level validation confirmed full functionality of write and read operations across the entire automotive temperature range. Shmoo plot analyses demonstrated robust voltage and timing margins, with read speeds as fast as 8ns under worst-case conditions. This performance underscores not only reliability but also competitiveness in high-speed embedded applications.

Bottom line: The successful realization of an 8nm 128Mb embedded STT-MRAM for automotive use demonstrates that aggressive scaling and stringent reliability requirements can be achieved simultaneously. Through innovations in integration processing, MTJ stack engineering, and magnetic layer optimization, this technology meets sub-ppm failure targets while delivering high performance across extreme temperatures. Such advancements position eMRAM as a leading memory solution for next-generation automotive electronics, paving the way for safer, smarter, and more connected vehicles.

Also Read:

Memory Matters: Signals from the 2025 NVM Survey

Akeana Partners with Axiomise for Formal Verification of Its Super-Scalar RISC-V Cores

SiFive’s AI’s Next Chapter: RISC-V and Custom Silicon

Ceva IP: Powering the Era of Physical AI


Podcast EP333: A Look at the Broad, Worldwide Impact SEMI Has on the Semiconductor Industry with Ajit Manocha

by Daniel Nenni on 02-27-2026 at 10:00 am

Daniel is joined by Ajit Manocha, president and CEO of SEMI, the global industry association serving the semiconductor and electronics manufacturing and design supply chain. Throughout his career, Manocha has been a champion of industry collaboration as a critical means of advancing technology for societal and economic prosperity. He began his career at AT&T Bell Laboratories as a research scientist and was granted more than a dozen patents related to semiconductor manufacturing processes that served as the foundation for modern microelectronics manufacturing.

Ajit discusses his AT&T Bell Laboratories roots and the focus on “connect, collaborate, innovate” that the organization instilled. He explains that he found this same focus at SEMI, which drew him closer to the organization to become its president and CEO. Dan explores the substantial impact SEMI has on the semiconductor industry. Ajit describes broad coalitions between SEMI members, governments and academia to address key issues such as talent pool growth, energy reduction and reduction of harmful compounds such as PFAS. The collaboration with the United Nations and the EU is also described.

Dan explores future efforts of SEMI with Ajit that include AI data protection and cybersecurity.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Memory Matters: Signals from the 2025 NVM Survey

by Daniel Nenni on 02-27-2026 at 6:00 am


Non-volatile memory choices are becoming more complex as SoC designs push into advanced nodes and face new requirements driven by AI, new sensor technologies, and stringent quality standards.

The second annual NVM Survey, completed in December 2025, captures a market that still relies on established technologies but is increasingly testing alternatives in response to these new design and production constraints.

More than 80% of respondents say they use or evaluate embedded non-volatile memory technologies. 20% are looking for an NVM now, and just under 30% expect to choose NVM IP within the coming year. Taken together, this points to a market that is both experienced and active. A meaningful proportion of near-term decisions are yet to be made, leaving a lot to play for among the competing technologies.

Fig. When do respondents plan to select an NVM

Embedded flash continues to dominate in terms of technology recognition, with awareness exceeding 80% of respondents, reflecting its long-standing role as the default choice. That said, the survey shows a broadening of familiarity beyond flash. FRAM, MRAM, and ReRAM are each recognized by more than a quarter of respondents, indicating that alternative NVM technologies are now part of mainstream awareness.

Vendor recognition follows a similar pattern. A small group of suppliers stand out in terms of familiarity, led by SST (embedded flash), Infineon (SONOS), and Weebit Nano (ReRAM), in that order.

When respondents were asked to weigh the importance of embedded NVM selection criteria, the results emphasize practicality. Reliability, endurance, and data retention all score at the top of the range, each with weighted averages well above 3.0 out of 4.0, confirming that they remain foundational requirements for embedded NVM selection. Process scalability follows closely, also scoring above the 3.0 mark, reflecting the growing difficulty of extending traditional embedded NVM into advanced geometries that embedded flash cannot scale to. Power efficiency scored over 3.0 too. Integration risk and long-term predictability sit only marginally behind, indicating that manufacturing readiness and lifecycle stability are now considered nearly as important as raw technical performance. This shows that the market is maturing; people understand that the raw technical capability of new NVMs is there, but the risk and cost of integration are becoming real concerns, especially for advanced nodes where flash integration is not an option.
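For readers unfamiliar with how such scores are derived, the sketch below computes a weighted average on a four-point scale. The 1-to-4 scale labels and the response counts are hypothetical; the survey's raw data is not given here.

```python
def weighted_average(counts: dict[int, int]) -> float:
    """Mean score where counts maps a rating (1..4) to its number of responses."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses")
    return sum(score * n for score, n in counts.items()) / total


# Hypothetical distribution for one criterion
# (1 = not important ... 4 = critical); not actual survey data.
reliability_counts = {1: 2, 2: 8, 3: 30, 4: 60}
score = weighted_average(reliability_counts)
print(f"weighted average: {score:.2f} out of 4.0")
```

A score above 3.0 on this kind of scale means that responses cluster in the top two ratings, which is why several criteria can all land "well above 3.0" at once.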

The risk and pain-point data reinforce this view. Scalability limitations and power-performance trade-offs rank highest, both scoring with weighted averages above 3.0 out of 4.0, indicating that they are seen as critical constraints in current NVM deployments, especially in advanced process nodes. Reliability concerns and cost uncertainty follow closely behind, also clustering in the upper end of the scale, suggesting that long-term predictability and economic risk remain unresolved issues for many designs. Taken together, these pressures help explain why awareness of alternative NVM technologies is increasing, even where adoption remains cautious.

Fig. Pain points by importance

Design pressure seems to be increasing faster than legacy memory can adapt. What has changed since last year’s survey is not a collapse in confidence in embedded flash, but a clear acceleration in the pressures acting upon it. More teams are now evaluating alternatives not out of curiosity, but because scaling, power, and long-term predictability are becoming binding constraints on future designs.

Overall, the 2025 survey does not point to an abrupt abandonment of embedded flash, but it does suggest that the transition away from traditional memory technologies is entering a more decisive phase, one that is likely to accelerate as design starts move to nodes where new NVMs are required for technical reasons. Awareness of alternative NVMs is rising, evaluation is broadening, and a significant share of teams expect to make concrete IP choices within the next year.

Fig. Planned design starts by node

External forecasts point the same way: Yole Group’s outlook suggests embedded emerging NVMs could reach $3.3B by 2030, driven by adoption of technologies such as MRAM, PCM and ReRAM in next-generation MCUs and SoCs.

Compared with last year’s results, the direction of travel is clearer: the question is no longer whether embedded flash can be extended further, but how long it can continue to meet the combined demands of scaling, power efficiency, reliability, and cost predictability.

Bottom line: For many SoC teams, NVM selection is shifting from a background assumption to an urgent architectural decision that will shape product viability in the next generation of designs.

Also Read:

Weebit Nano Reports on 2025 Targets

Relaxation-Aware Programming in ReRAM: Evaluating and Optimizing Write Termination

SiFive’s AI’s Next Chapter: RISC-V and Custom Silicon

Ceva IP: Powering the Era of Physical AI


AI Drives Strong Semiconductor Market in 2025-2026

by Bill Jewell on 02-26-2026 at 1:00 pm

2026 Market Forecast

The global semiconductor market in 2025 was $792 billion, according to WSTS. 2025 was up 25.6% from 2024, the strongest growth since 26.2% in the COVID recovery year 2021. The increase was driven by AI, with Nvidia revenues up 65%. The major memory companies (Samsung, SK Hynix, Micron Technology, Kioxia and Sandisk) all cited AI as the primary revenue driver in their collective 29% revenue growth.
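The two headline figures imply a 2024 market size, which is easy to back out. This is a quick derivation from the numbers quoted above, not a figure reported by WSTS here.

```python
market_2025 = 792.0  # USD billions, as quoted above
growth_2025 = 0.256  # 25.6% growth over 2024

# 2025 = 2024 * (1 + growth), so 2024 = 2025 / (1 + growth)
market_2024 = market_2025 / (1 + growth_2025)
increase = market_2025 - market_2024

print(f"implied 2024 market: ${market_2024:.1f}B")
print(f"absolute increase:   ${increase:.1f}B")
```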

Fourth quarter 2025 results were mixed. The memory companies reported revenue growth in the range of 21% to 34% versus 3Q 2025. Nvidia was up 20%. Ten companies had 4Q 2025 revenue up in the range of 0.2% to 11%. Four companies (Texas Instruments, Infineon, Sony Imaging and Onsemi) reported revenue declines.

Guidance for 1Q 2026 revenue change from 4Q 2025 is mixed. The three memory companies providing guidance are expecting substantial revenue increases in 1Q 2026 with Micron at 37%, Sandisk at 52%, and Kioxia at 64%. Nvidia projects AI will drive 14% revenue growth. Four companies project revenue gains ranging from 2% to 11% based on a recovering industrial market and continuing AI strength. AMD, NXP Semiconductors, STMicroelectronics and Onsemi project revenue declines primarily due to seasonality.

The huge memory demand in AI is causing shortages of memory for other applications. Intel expects an 11% decline in revenue in 1Q 2026 versus 4Q 2025 due to shortages of memory for PCs. Qualcomm and MediaTek both cite memory shortages for smartphones as the reason for projected revenue declines.

In December, IDC cited the memory shortage as potentially leading to declining shipments of smartphones and PCs in 2026.

Thus, if strong AI growth continues in 2026, semiconductor companies dependent on the smartphone and PC markets could see revenues decline in 2026.

A year ago, no one predicted demand for AI would drive 25.6% growth in the semiconductor market in 2025. We at Semiconductor Intelligence give a virtual award for the most accurate semiconductor market forecast for the year. Eligible forecasts are those publicly released between October of the previous year and the release of the WSTS January data in early March. The winner for 2025 is IDC, which predicted 15% growth. Several other prognosticators were in the 12% to 14% range.

Looking ahead to 2026, recent forecasts are in two groups. In the lower group, the Cowan LRA model (based on historical revenue trends) has 9.5%. Future Horizons projected 12%. The higher group includes RCD Advisors at 23%, WSTS at 26.3%, and Semiconductor Intelligence at 30%.

We at Semiconductor Intelligence believe the robust expansion of AI will continue through at least the first half of 2026. The high quarter-to-quarter semiconductor market growth of 16% in 3Q 2025 and 14% in 4Q 2025 followed by an expected strong 1Q 2026 practically guarantees 2026 growth over 20%. Even if memory shortages impact the smartphone and PC markets, the booming AI market and the relative stability of the industrial and automotive markets will continue to drive semiconductor growth in 2026.
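The carryover arithmetic behind this claim can be sketched in a few lines. The 16% and 14% quarterly growth rates come from the text above. The size of 1Q 2025 relative to 2Q 2025 is not given here, so the sketch assumes the two quarters were equal; that value and the 5% figure for 1Q 2026 are placeholders, not reported data.

```python
# Carryover sketch: how much 2026 annual growth is already implied by the
# strong second half of 2025. Quarterly levels are normalized to 2Q 2025 = 1.0.

Q3_GROWTH = 0.16   # 3Q 2025 vs 2Q 2025 (from the article)
Q4_GROWTH = 0.14   # 4Q 2025 vs 3Q 2025 (from the article)
Q1_VS_Q2 = 1.0     # ASSUMPTION: 1Q 2025 equal to 2Q 2025 (not from the article)


def annual_growth_2026(quarterly_growth_2026):
    """Return 2026-over-2025 growth for four quarter-over-quarter rates."""
    q2 = 1.0
    q1 = Q1_VS_Q2 * q2
    q3 = q2 * (1 + Q3_GROWTH)
    q4 = q3 * (1 + Q4_GROWTH)
    total_2025 = q1 + q2 + q3 + q4

    level = q4
    total_2026 = 0.0
    for g in quarterly_growth_2026:
        level *= 1 + g
        total_2026 += level
    return total_2026 / total_2025 - 1


# Every 2026 quarter flat at the 4Q 2025 level
flat = annual_growth_2026([0.0, 0.0, 0.0, 0.0])
# HYPOTHETICAL: 5% growth in 1Q 2026, then flat for the rest of the year
one_strong_quarter = annual_growth_2026([0.05, 0.0, 0.0, 0.0])

print(f"flat 2026: {flat:.1%}")
print(f"5% in 1Q, then flat: {one_strong_quarter:.1%}")
```

Under these assumptions, a 2026 that stays flat at the 4Q 2025 level already comes out roughly 18% above 2025, so even modest first-quarter growth carries the year past 20%.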

Semiconductor Intelligence is a consulting firm providing market analysis, market insights and company analysis for anyone involved in the semiconductor industry – manufacturers, designers, foundries, suppliers, users or investors. Please contact me if you would like further information.

Bill Jewell
Semiconductor Intelligence, LLC
billjewell@sc-iq.com

Also Read:

AI Bubble?

Semiconductors Up Over 20% in 2025

U.S. Electronics Production Growing


How Customized Foundation IP Is Redefining Power Efficiency and Semiconductor ROI

by Kalar Rajendiran on 02-26-2026 at 10:00 am


As computing expands from data centers to edge devices, semiconductor designers face increasing pressure to optimize both performance and energy efficiency. Advanced process nodes continue to provide transistor-level improvements, but scaling alone cannot meet the demands of hyperscale AI infrastructure or ultra-low-power edge systems.

Synopsys Foundation IP enables SoC designers to customize their designs for specific application requirements, combining IP optimization with advanced EDA flows. Real customer engagements demonstrate how this approach improves power efficiency, reduces energy consumption, and unlocks system-level performance gains beyond standard scaling benefits.

Hyperscale Compute: Power-Efficient 2nm SoCs

One customer develops SoCs for hyperscale AI and cloud infrastructure, where compute density and power efficiency directly impact operational costs. Even at 2nm nodes, transistor scaling alone could not deliver the needed performance-per-watt improvements.

Synopsys collaborated with the customer to customize Foundation IP and integrate it with advanced EDA optimization flows. Standard cells were refined for transistor sizing and threshold voltages, while layouts were adjusted to reduce routing parasitics. These changes improved both energy efficiency and performance in dense compute blocks.

The result was meaningful reductions in power consumption and higher silicon utilization, demonstrating how hyperscale customers can extend the value of advanced-node technology while lowering system-level operational costs.

  • 34% reduced power consumption over baseline (using baseline EDA flow)
  • 51% reduced power consumption over baseline (using an optimized EDA flow)
  • 5% silicon area advantage over baseline
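The two power figures can be related with simple arithmetic. This sketch assumes both percentages are measured against the same baseline power, as the wording of the bullets suggests; the split between IP and flow contributions is derived here and is not a figure reported by the source.

```python
# Both reductions are quoted against the same baseline power (normalized to 1.0).
baseline = 1.0
custom_ip_baseline_flow = baseline * (1 - 0.34)   # 34% reduction (from the article)
custom_ip_optimized_flow = baseline * (1 - 0.51)  # 51% reduction (from the article)

# Additional saving from the optimized EDA flow, relative to customized IP alone
incremental = 1 - custom_ip_optimized_flow / custom_ip_baseline_flow
print(f"extra reduction from the optimized EDA flow: {incremental:.1%}")
```

Read this way, the optimized flow removes roughly a further quarter of the power that remains after IP customization alone.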

Edge AI Devices: Ultra-Low-Voltage Operation

Another customer develops Edge AI devices that require always-on functionality and strict energy efficiency. Battery life, standby power, and thermal constraints are critical, and standard IP could not reliably operate at ultra-low voltages.

Synopsys helped redesign memory bit cells and peripheral circuits to maintain read/write stability under low supply voltage. Assist circuitry improved access reliability, while memory compiler updates reduced standby power without sacrificing performance.

Logic libraries were optimized using low-leakage transistor configurations, and multi-rail voltage strategies allowed memory and logic to operate at independently optimized voltages. Variation-aware modeling and silicon correlation analysis reduced conservative guard-bands, enabling further voltage scaling and energy reduction.
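Why trimming guard-bands matters can be illustrated with the standard dynamic power relation, P = alpha * C * V^2 * f. The sketch below is a generic illustration: the supply voltages are hypothetical, and leakage power, which also depends on voltage, is not modeled.

```python
def dynamic_power_ratio(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic switching power under P = alpha * C * V^2 * f,
    assuming the activity factor alpha and capacitance C are unchanged."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# HYPOTHETICAL voltages: removing a 50 mV guard-band from a 0.60 V supply
ratio = dynamic_power_ratio(v_new=0.55, v_old=0.60)
print(f"dynamic power after scaling: {ratio:.1%} of the original")
```

Because voltage enters as a square, a small reduction in supply yields a disproportionately large saving in switching power at the same frequency.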

These optimizations enabled every part of the chip to consume less power and occupy less space, delivering longer battery life, improved thermal performance, and reliable always-on operation. The approach provides a repeatable framework for other Edge AI device manufacturers pursuing aggressive power efficiency goals.

Cross-Domain Engineering: A New Implementation Model

Both customer engagements highlight the value of cross-domain engineering, where IP design, EDA flows, and system-level architecture are optimized together. Coordinated optimization allows teams to evaluate performance and power across multiple layers of design and operating conditions.

This methodology helps uncover efficiency opportunities that traditional sequential design approaches often miss. It also reduces design risk, improves first-silicon success, and accelerates time-to-market. Customized IP can be reused across future designs, amplifying long-term return on engineering investment.

Delivering System-Level ROI

These customer engagements illustrate a broader industry trend: semiconductor return on investment increasingly depends on extracting value across the entire design stack, rather than relying solely on transistor scaling. For hyperscale infrastructure, customized IP helps reduce energy consumption, increase compute density, and lower operational costs, while for Edge AI devices, it enables ultra-low-voltage operation, extends battery life, and improves overall device functionality. In addition, reducing guard-bands and optimizing design margins further enhance manufacturing efficiency and strengthen product competitiveness. The techniques demonstrated through these engagements are transferable across similar market segments, providing a practical framework that allows both hyperscale and Edge AI customers to accelerate innovation and maximize performance-per-watt.

Summary

As AI workloads grow and edge intelligence proliferates, customized Foundation IP coupled with advanced EDA optimization will continue to be a key enabler of power-efficient, high-performance semiconductor design.

By combining cross-domain engineering with application-specific IP customization, Synopsys helps customers extend the benefits of scaling into system-level performance, energy efficiency, and economic gains across hyperscale infrastructure, Edge AI devices, and emerging intelligent computing platforms.

Visit the Synopsys Foundation IP page.

Also Read:

Designing the Future: AI-Driven Multi-Die Innovation in the Era of Agentic Engineering

Hardware is the Center of the Universe (Again)

Smarter IC Layout Parasitic Analysis


Akeana Partners with Axiomise for Formal Verification of Its Super-Scalar RISC-V Cores

by Daniel Nenni on 02-26-2026 at 8:00 am


Akeana Inc. announced a key milestone in the development of its advanced RISC-V technology: a successful partnership with Axiomise Limited to formally verify its super-scalar test chip, Alpine. The collaboration highlights the growing importance of formal verification in ensuring correctness, performance, and efficiency in next-generation semiconductor designs.

Alpine is a 4nm silicon and software development board that integrates high-performance, out-of-order RISC-V cores. As semiconductor process nodes continue to shrink and architectural complexity increases, ensuring functional correctness before tape-out has become both more challenging and more critical. Super-scalar, out-of-order cores, designed to execute multiple instructions per clock cycle, introduce intricate control logic, speculative execution paths, and numerous corner cases that are difficult to fully validate using traditional simulation techniques alone.

To address these challenges, Akeana turned to Axiomise for its deep expertise in formal verification. Unlike simulation-based approaches, which rely on test vectors and probabilistic coverage, formal verification applies mathematical proof techniques to exhaustively analyze all reachable states of a design. This guarantees that specific properties hold true under every possible condition, eliminating entire classes of latent bugs that could otherwise escape detection.
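The difference between sampling behaviors and exhausting them can be seen in a toy example. The sketch below is an explicit-state reachability check written for illustration only; it is not how Axiomise's tools or commercial formal engines work internally, and the FIFO counter, its depth, and the planted bug are all hypothetical.

```python
from collections import deque
from itertools import product

DEPTH = 4  # HYPOTHETICAL FIFO depth


def step(count, push, pop):
    """Next-state function of a toy FIFO occupancy counter with a planted bug."""
    if push and pop:
        # Planted bug: a simultaneous push and pop when full should leave the
        # count unchanged, but this version increments it.
        return count + 1 if count == DEPTH else count
    if push and count < DEPTH:
        return count + 1
    if pop and count > 0:
        return count - 1
    return count


def check_invariant(init, invariant):
    """Breadth-first search over every reachable state and input combination.
    Returns None if the invariant always holds, else a shortest input trace
    that violates it."""
    seen = {init}
    queue = deque([(init, [])])
    while queue:
        state, trace = queue.popleft()
        for push, pop in product([False, True], repeat=2):
            nxt = step(state, push, pop)
            new_trace = trace + [(push, pop)]
            if not invariant(nxt):
                return new_trace
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, new_trace))
    return None


cex = check_invariant(0, lambda c: 0 <= c <= DEPTH)
print("counterexample (push, pop) per cycle:", cex)
```

The search reports the shortest failing sequence: fill the FIFO, then push and pop in the same cycle. A random test would need to hit that exact sequence, whereas the exhaustive search cannot miss it. Real designs have far too many states to enumerate one by one, which is why production tools rely on symbolic proof techniques instead.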

According to Nitin Rajmohan, Co-founder of Akeana, the results of the engagement exceeded expectations. Within just a few months, Axiomise’s team not only identified functional issues but also uncovered potential redundant logic in the design—findings that were not anticipated at the outset. These insights provided Akeana with opportunities to further optimize its RTL before tape-out, reducing risk and improving overall design quality. The experience reinforced the long-term value of formal verification within Akeana’s broader development methodology.

Axiomise employs a structured methodology that combines expert consulting with proprietary applications such as formalISA®, footprint®, and floatrix®, powered by its CoreProve® framework. These tools are designed to achieve full proof convergence using commercial EDA platforms, enabling end-to-end formal verification sign-off. By integrating advanced automation with domain-specific expertise, Axiomise delivers mathematically rigorous results across functional correctness, performance constraints, and area optimization.

For a complex 4nm design like Alpine, this approach provides significant advantages. At advanced process nodes, the cost of a silicon re-spin can be enormous, not only financially but also in terms of market timing and competitive positioning. Formal verification reduces this risk by ensuring that corner cases, particularly those involving concurrency, memory ordering, branch prediction, and pipeline hazards, are thoroughly analyzed before fabrication.

Beyond functional correctness, the partnership also addressed PPA (Power, Performance, Area) considerations. While verification is traditionally associated with functional validation, formal methods can also reveal inefficiencies in control logic or data paths that affect power consumption and silicon footprint. By identifying redundant or suboptimal structures early, Akeana was able to make informed design refinements that support both performance targets and area constraints.

Dr. Ashish Darbari, CEO of Axiomise, emphasized the broader industry significance of the project. As RISC-V adoption accelerates across markets—including mobile, automotive, data centers, and cloud computing—the demand for high-performance, reliable cores continues to grow. Formal verification provides the exhaustive guarantees required for these mission-critical applications, where undetected design flaws can have far-reaching consequences.

The tape-out of Alpine represents a meaningful milestone not only for Akeana but also for the expanding RISC-V ecosystem. It demonstrates that open-standard architectures can meet the stringent quality and performance expectations traditionally associated with proprietary designs. By incorporating formal verification at a sign-off level, Akeana underscores its commitment to delivering robust, production-ready IP to its customers.

Headquartered in Santa Clara, California, Akeana is backed by prominent investors including Kleiner Perkins, Mayfield Fund, and Fidelity Ventures. The company focuses on configurable RISC-V-based compute, interconnect, and AI accelerator IP solutions. Axiomise, based in the UK, has built a reputation over the past eight years for advancing formal verification adoption through consulting, training, and custom software solutions.

Bottom line: The two companies have demonstrated how collaboration between IP innovators and formal verification specialists can accelerate development while maintaining uncompromising quality standards. As semiconductor designs grow increasingly complex, partnerships like this are likely to become essential in ensuring that next-generation silicon achieves both performance leadership and mathematical correctness from day one.

CONTACT AKEANA

CONTACT AXIOMISE

Also Read:

An AI-Native Architecture That Eliminates GPU Inefficiencies

by Lauro Rizzatti on 02-26-2026 at 6:00 am


A recent analysis highlighted by MIT Technology Review puts the energy cost of generative AI into stark perspective. Generating a simple text response from Llama 3.1-405B—a model with 405 billion parameters, the adjustable “knobs” that enable prediction—requires on average 3,353 joules, nearly 1 watt-hour (Wh). Once cooling and supporting infrastructure are factored in, that figure effectively doubles to about 6,706 joules (~2 Wh) per response.

The picture becomes even more striking with video. The same study found that producing just five seconds of low-resolution video at 16 frames per second with the open-source CogVideoX model consumed approximately 3.4 million joules, nearly 1 kilowatt-hour (kWh), as measured via CodeCarbon. To put that into perspective, this is roughly the electricity a 1 kW appliance, such as a typical microwave oven, draws in an hour.

To put this in context at scale, public estimates suggest that in mid-2025, platforms such as ChatGPT were handling over 2.5 billion queries per day. Even conservatively extrapolated, generative AI systems were consuming energy on the order of gigawatt-hours daily, a level of consumption that rivals industrial operations.
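The unit conversions behind these figures are easy to reproduce. The sketch below uses only the numbers quoted above. The daily total rests on a strong simplifying assumption, that every query costs as much as one Llama 3.1-405B text response including infrastructure overhead, so it indicates an order of magnitude rather than a measurement.

```python
J_PER_WH = 3600.0          # 1 watt-hour = 3,600 joules

text_j = 3353.0            # Llama 3.1-405B text response (from the article)
text_total_j = 6706.0      # with cooling and infrastructure (from the article)
video_j = 3.4e6            # five seconds of CogVideoX video (from the article)
queries_per_day = 2.5e9    # mid-2025 public estimate cited in the article

text_wh = text_j / J_PER_WH
text_total_wh = text_total_j / J_PER_WH
video_kwh = video_j / J_PER_WH / 1e3

# ASSUMPTION: every query costs as much as one Llama 3.1-405B response
daily_gwh = queries_per_day * text_total_j / J_PER_WH / 1e9

print(f"text response:       {text_wh:.2f} Wh")
print(f"with infrastructure: {text_total_wh:.2f} Wh")
print(f"5 s of video:        {video_kwh:.2f} kWh")
print(f"daily total:         {daily_gwh:.1f} GWh")
```

Under that assumption the daily total lands in the low single-digit gigawatt-hours, consistent with the order of magnitude stated above.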

This raises two urgent questions:
  • Why does AI inference consume so much energy?
  • More importantly, can processor architecture be redesigned to dramatically reduce this cost?

The answer lies not only in model size, but in the silicon beneath it. AI processor architecture is no longer just a performance concern; it is a defining factor in the energy efficiency, scalability, and sustainability of artificial intelligence itself.

GPGPU: The Right Architecture for the Wrong Workload

GPGPUs are built around a micro-level execution model known as Single Instruction, Multiple Threads (SIMT). In this model, performance is achieved by launching thousands of tiny threads, each operating on small pieces of data. Developers are expected to carefully coordinate these threads so that, together, they complete a larger computation.

This approach emerged from computer graphics, where workloads are highly irregular and branching behavior is common. SIMT excels in that environment because it allows thousands of threads to hide latency and tolerate divergence. However, when applied to artificial intelligence workloads, the mismatch becomes apparent. AI computations are highly structured, repetitive, and mathematically regular, yet SIMT forces them to be expressed through an abstraction designed for far more chaotic workloads.

As a result, a significant fraction of execution time in SIMT-based systems is not spent performing useful mathematical work. Instead, it is consumed by what can be thought of as a management overhead. The hardware and software stack must constantly schedule threads, synchronize execution, handle divergence within warps, and coordinate memory accesses across deep hierarchies. As models grow larger and latency constraints tighten—particularly in real-time or interactive inference scenarios—this overhead begins to dominate overall performance.

VSORA: Redefining the Rules of the Game in AI Processors

This is the context in which VSORA enters the picture. With more than a decade of experience designing advanced digital signal processing architectures, VSORA approaches AI computation from a different starting point. Its background lies in deeply pipelined processors with rich instruction sets capable of executing complex operations in a single clock cycle. Rather than adapting an existing GPU model, VSORA leveraged this expertise to design a processor architecture specifically tailored for large language model inference, both in the cloud and at the edge. The goal was not incremental improvement, but a clean break from the inefficiencies inherent in GPGPU-based designs.

VSORA MPU: A Structural Shift in How AI Gets Computed

At the heart of this new approach sits the VSORA Matrix Processing Unit, or MPU. To appreciate how it differs, consider what happens when the basic unit of computation changes. In SIMT systems, threads are the atomic unit, and everything—from memory layout to scheduling—is organized around them. VSORA discards this assumption entirely. Instead of threads, the MPU treats tensors—multidimensional arrays representing matrices and vectors—as the fundamental unit of work.

In practical terms, this means that instructions operate on entire tensors at once. The programmer describes a mathematical operation, such as a matrix multiplication or transformation, without specifying how the work should be divided among thousands of execution contexts. The hardware itself is responsible for decomposing the operation, distributing it across compute resources, and executing it efficiently. This shift moves complexity out of software and into silicon, where it can be handled deterministically and at far lower cost.

For developers, this tensor-centric abstraction simplifies both programming and reasoning about performance. There is no need to manually manage threads, worry about warp divergence, or tune kernel launch parameters. Because execution management is internalized by the hardware, performance becomes more predictable, and developers can focus on correctness and algorithmic structure rather than orchestration.
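The contrast between the two programming models can be illustrated in plain Python. This is a conceptual sketch only: `matmul_kernel`, `launch`, and `tensor_matmul` are hypothetical names invented for the illustration, not VSORA or GPU vendor APIs, and the sequential loop merely stands in for parallel hardware.

```python
def matmul_kernel(thread_id, a, b, c, n_cols):
    """Thread-centric style: each 'thread' computes exactly one output element."""
    row, col = divmod(thread_id, n_cols)
    c[row][col] = sum(a[row][k] * b[k][col] for k in range(len(b)))


def launch(kernel, n_threads, *args):
    """Stand-in for a kernel launch: the caller chooses the thread count."""
    for tid in range(n_threads):   # real hardware would run these in parallel
        kernel(tid, *args)


def tensor_matmul(a, b):
    """Tensor-centric style: one operation on whole operands.
    How the work is decomposed is hidden from the caller."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

c = [[0, 0], [0, 0]]
launch(matmul_kernel, 4, a, b, c, 2)   # programmer sizes and indexes the grid
d = tensor_matmul(a, b)                # programmer states only the math

print(c)
print(d)
```

Both paths produce the same product. The difference lies in what the programmer must specify: thread counts and index arithmetic in the first case, only the operation in the second.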

Massive Register File and Tightly Coupled Memory

One of the most visible consequences of this architectural philosophy appears in the MPU’s memory design. Traditional processors rely heavily on multi-level cache hierarchies that attempt to predict which data will be needed next. While caches work well in many general-purpose scenarios, they are fundamentally probabilistic. When predictions fail, cache misses introduce long and unpredictable delays, which are especially problematic for real-time inference.

Figure 1: VSORA architecture replaces the traditional multi-level memory hierarchy with a unified, massive flat register file to minimize data movement latency.

VSORA replaces this uncertainty with a large, explicitly managed memory structure. The MPU includes a massive, software-visible register file implemented as several megabytes of tightly coupled memory, or TCM. This memory sits physically close to the compute engines and behaves like a flat, deterministic scratchpad rather than a cache. Its capacity is sufficient to hold entire weight matrices and intermediate activations on-chip, allowing the system to operate without relying on speculative caching behavior. See figure 1.

By designing around tensor-level locality and provisioning enough on-chip storage to support it, the MPU ensures consistent access latency. As long as the working set fits within the TCM, memory access times remain uniform and predictable. This eliminates the performance cliffs that often occur when cache hierarchies are overwhelmed, a common issue in large neural networks.
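The performance cliff described here can be reproduced with a small model. The sketch below compares a least-recently-used (LRU) cache against a flat scratchpad under a cyclic access pattern. All latencies and capacities are hypothetical, and LRU is used as a generic stand-in for a cache replacement policy rather than a description of any specific product.

```python
from collections import OrderedDict

HIT_CYCLES, MISS_CYCLES = 4, 200   # HYPOTHETICAL cache latencies
SCRATCHPAD_CYCLES = 4              # HYPOTHETICAL uniform scratchpad latency


def lru_cache_cycles(addresses, capacity):
    """Total access cycles through an LRU cache of the given capacity."""
    cache = OrderedDict()
    cycles = 0
    for addr in addresses:
        if addr in cache:
            cache.move_to_end(addr)
            cycles += HIT_CYCLES
        else:
            cycles += MISS_CYCLES
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used
    return cycles


def scratchpad_cycles(addresses):
    """Total access cycles when the whole working set is resident."""
    return SCRATCHPAD_CYCLES * len(addresses)


def sweep(working_set, passes=10):
    """Cyclic sweep over `working_set` distinct addresses."""
    return [a for _ in range(passes) for a in range(working_set)]


CAPACITY = 64
fits = lru_cache_cycles(sweep(64), CAPACITY)     # working set fits the cache
thrash = lru_cache_cycles(sweep(65), CAPACITY)   # one address too many
flat = scratchpad_cycles(sweep(65))              # assumes data fits on-chip

print(fits, thrash, flat)
```

Growing the working set by a single address turns every cache access into a miss, while the scratchpad cost grows only linearly with the number of accesses. The scratchpad figure holds only while the working set fits in on-chip memory, which is the same condition the text states for the TCM.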

Continuous Pipelining and Deterministic Throughput

Once data is resident in the TCM, the MPU leverages highly efficient prefetching techniques to minimize latency. Instead of treating AI workloads as a series of discrete kernel launches, VSORA views them as sustained computational flows. An intuitive way to think about this is as an assembly line: once the pipeline is filled, new results emerge at a steady rate every cycle.

This pipelining operates at multiple levels. At the micro-architectural level, data streams continuously into compute units without stalls. At the instruction level, preparation and execution overlap so that hardware resources remain fully utilized. The architecture also allows multiple MPUs to be chained together, enabling data to flow directly from one unit to the next without detouring through external memory. After an initial warm-up period, throughput stabilizes and becomes largely independent of the complexity of individual operations.
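The assembly-line behavior can be checked with a cycle-by-cycle model. This is a generic sketch of an idealized, stall-free pipeline, not a model of the MPU itself; the stage and item counts are arbitrary.

```python
def run_pipeline(n_items, n_stages):
    """Cycle-by-cycle model of a stall-free pipeline.
    Returns the cycle at which each item leaves the last stage."""
    stages = [None] * n_stages
    pending = list(range(n_items))
    done_at = {}
    cycle = 0
    while len(done_at) < n_items:
        cycle += 1
        # Shift every item one stage forward and feed a new item into stage 0.
        stages = [pending.pop(0) if pending else None] + stages[:-1]
        if stages[-1] is not None:
            done_at[stages[-1]] = cycle
    return done_at


done = run_pipeline(n_items=100, n_stages=8)
print("first result at cycle", done[0])
print("last result at cycle", done[99])
```

The first result appears after a warm-up equal to the pipeline depth. After that, one result emerges per cycle, so the total time is depth plus item count minus one, and the warm-up cost is amortized as the stream gets longer.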

Automated Data Layout and Reduced Software Burden

Another area where the MPU reduces developer burden is data layout. On many accelerators, achieving high performance requires manually rearranging data in memory to match hardware-specific access patterns. This process is error-prone, time-consuming, and often ties software to a specific architecture.

VSORA intentionally removes this responsibility from the programmer by introducing a memory access abstraction. The MPU hardware automatically handles alignment, padding, swizzling, and internal data reordering needed to sustain peak bandwidth. Developers work with tensors as abstract mathematical objects, while the hardware transparently performs the low-level transformations required for efficient execution. This approach not only improves productivity but also reduces performance fragility caused by subtle layout mismatches.

These architectural choices make the VSORA MPU particularly well suited for inference workloads, where latency and predictability matter more than raw peak throughput. Unlike GPUs, which often require large batch sizes to amortize overheads and reach high utilization, the MPU remains efficient even with batch size one. This is critical for real-time applications such as robotics, autonomous systems, and interactive AI, where waiting to accumulate large batches is not an option.

Dataflow Execution Model

In conventional multi-core and multi-accelerator systems, scaling often introduces diminishing returns due to synchronization overhead and shared memory contention. Additional compute resources increase coordination costs, reducing effective throughput.

Instead of treating each processing unit as an independent island, multiple MPUs are connected into a single, deeply pipelined dataflow graph. The output of one MPU becomes the direct input of the next, enabling true zero-copy execution at the hardware level.

Each MPU maintains its own TCM, allowing large models to be partitioned cleanly across units. Data moves directly between register files rather than through external memory interfaces, which is especially advantageous for the hot data paths common in modern neural networks. As models scale, throughput remains flat and predictable as long as active tensors fit within the available TCM.
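The chaining idea can be sketched with Python generators, where each stage consumes the previous stage's output directly instead of writing it to a shared buffer. The stage names and the operations they perform are hypothetical placeholders, and generators only mimic the structure of a dataflow graph, not its hardware behavior.

```python
def source(tokens):
    """Feed input items into the first stage."""
    for t in tokens:
        yield t


def stage(name, fn, upstream):
    """One pipeline stage; `name` is only a label for this sketch."""
    for item in upstream:
        yield fn(item)   # handed straight to the next stage, no shared buffer


# HYPOTHETICAL three-stage partition of a model across units
flow = source([1.0, 2.0, 3.0])
flow = stage("unit0", lambda x: x * 2.0, flow)
flow = stage("unit1", lambda x: x + 1.0, flow)
flow = stage("unit2", lambda x: max(x, 0.0), flow)

results = list(flow)
print(results)
```

The program describes only the flow and its dependencies; each item moves through all three stages without an intermediate collection ever being materialized between them.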

Simplified Scaling and System-Level Efficiency

From a system-level perspective, this results in an architecture that scales without imposing additional complexity on developers. Instead of implementing intricate tiling strategies, synchronization mechanisms, and scheduling logic, programmers define tensor flows and dependencies. The hardware autonomously manages execution, handshaking, and scheduling, ensuring consistent performance even under tight latency constraints.

This makes the VSORA architecture especially well suited to demanding environments such as cloud inference platforms, edge deployments, and autonomous systems, where strict latency budgets leave no room for scheduling inefficiencies or unpredictable stalls.

Conclusion

By eliminating kernel launch overhead and dismantling the traditional memory wall between layers, the VSORA Matrix Processing Unit redefines AI efficiency at its core. It delivers near-peak hardware utilization even at batch size one—something conventional accelerators simply cannot achieve. Performance is no longer dependent on artificial batching to mask architectural inefficiencies.

This makes the architecture uniquely suited for interactive and real-time AI, where milliseconds determine safety, usability, and user experience. From real-time autonomy to fluid conversational systems, VSORA prioritizes determinism, latency consistency, efficiency, architectural simplicity, and cost effectiveness over brute-force parallelism.

Equally transformative is the ease of adoption. There is no new programming model, no proprietary language, no disruptive toolchain shift. Developers continue using familiar frameworks such as TensorFlow, PyTorch, or ONNX—without rewriting models or retraining teams. Transitioning to VSORA requires no paradigm change, only performance gains.

In short, the VSORA MPU does not just accelerate AI workloads—it removes the structural bottlenecks that have defined them.

CONTACT VSORA

Also Read:

VSORA Board Chair Sandra Rivera on Solutions for AI Inference and LLM Processing

Silicon Valley, à la Française

Inference Acceleration from the Ground Up