
Beyond the Memory Wall: Unleashing Bandwidth and Crushing Latency

Beyond the Memory Wall: Unleashing Bandwidth and Crushing Latency
by Lauro Rizzatti on 05-07-2025 at 2:00 pm


VSORA AI Processor Raises $46 Million to Fast-Track Silicon Development

We stand on the cusp of an era defined by ubiquitous intelligence—a stone’s throw from a tidal wave of AI-powered products underpinned by next-generation silicon. Realizing that future demands nothing less than a fundamental rethink of how we design semiconductors and architect computers.

At the core of this transformation is a simple—but profound—shift: AI silicon must be shaped by AI workloads from day one. Gone are the days when hardware and software evolved in parallel, converging only at validation, by which point the architecture was set in stone. Today’s paradigm demands re-engineering engineering itself: software-defined hardware design that tightly integrates AI code and silicon from the ground up.

Brute Force, No Grace: GPUs Hit the Memory Wall Processing LLMs

Today, the dominant computing architecture for AI processors is the Graphics Processing Unit (GPU). Originally conceived in 1999, when Nvidia released the GeForce 256 and marketed it as the “world’s first GPU,” the architecture addressed the growing demand for parallel processing in rendering computer graphics. The GPU has since been repurposed to handle the massive, highly parallel workloads required by today’s AI algorithms—particularly those based on large language models (LLMs).

Despite significant advancements in GPU theoretical throughput, GPUs still face fundamental limitations, namely poor computational efficiency, high power consumption, and suboptimal latency. For example, a GPU with a theoretical peak performance of one PetaFLOPS and a realistic efficiency of 10% when processing a state-of-the-art LLM such as GPT-4 or Llama 3 405B (efficiency varies depending on the specific algorithm) would in practice deliver only 100 TeraFLOPS. To achieve a sustained PetaFLOPS of performance, 10 such GPUs would be required, resulting in substantially more power consumption than that of a single device. Less apparent, this configuration also introduces significantly longer latency, compounding the inefficiencies.
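The arithmetic is simple; here is a minimal back-of-the-envelope sketch using the illustrative 1 PetaFLOPS peak and 10% efficiency figures above, not measurements of any particular GPU:

```python
# Back-of-the-envelope sketch of the efficiency gap described above.
# The peak and efficiency numbers are the article's illustrative figures,
# not measurements of any specific device.

peak_tflops = 1000.0   # 1 PetaFLOPS theoretical peak
efficiency = 0.10      # ~10% sustained efficiency on a large LLM

effective_tflops = peak_tflops * efficiency          # 100 TeraFLOPS delivered
gpus_for_1_pflops = peak_tflops / effective_tflops   # 10 devices

print(f"Effective throughput per GPU: {effective_tflops:.0f} TFLOPS")
print(f"GPUs needed for one sustained PetaFLOPS: {gpus_for_1_pflops:.0f}")
```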

Peeling back the layers of a GPU would uncover the culprit behind its poor efficiency: the memory wall. This long-standing bottleneck arises from an ever-widening gap between the insatiable demand of compute cores for data and the finite bandwidth of off-chip memory. As a result, cores frequently stall waiting on data transfers, preventing sustained utilization even when computational resources are plentiful.
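To see why the cores stall, consider a simple roofline-style estimate. The numbers below are assumptions chosen only for illustration (roughly 1 PetaFLOPS of peak compute against a few TB/s of off-chip bandwidth); the point is that the token-generation phase of LLM inference performs only a couple of floating-point operations per byte of weights read, far below what is needed to keep the cores busy:

```python
# Roofline-style sketch of the memory wall. All numbers are illustrative
# assumptions, not specifications of any particular GPU.

peak_flops = 1e15   # 1 PFLOPS peak compute
mem_bw = 3e12       # 3 TB/s off-chip memory bandwidth

# Minimum arithmetic intensity (FLOPs per byte moved) to be compute-bound.
ridge_point = peak_flops / mem_bw   # ~333 FLOPs/byte

# LLM token generation is dominated by matrix-vector products: roughly
# 2 FLOPs per weight byte read (order-of-magnitude assumption).
decode_intensity = 2.0

attainable = min(peak_flops, decode_intensity * mem_bw)
print(f"Ridge point: {ridge_point:.0f} FLOPs/byte")
print(f"Attainable during decode: {attainable / 1e12:.0f} TFLOPS "
      f"({attainable / peak_flops:.1%} of peak)")
```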

Enhancements to memory bandwidth via layered access in the form of multi-level caches helped mitigate the impact—until the advent of AI workloads exposed the limitation. The GPU’s brute-force approach, necessary to handle large language models (LLMs), comes at a price: poor efficiency, resulting in high energy consumption and long latency.

While GPU limitations during LLM training primarily manifest as increased computational cost, they pose a more critical obstacle during inference. This is especially pronounced in edge deployments, where stringent power budgets and real-time latency requirements, crucial for applications like autonomous driving, severely restrict GPU viability.

The VSORA Solution: Knocking Down the Memory Wall

While the semiconductor industry is intensely focused on mitigating the memory bandwidth bottleneck that plagues LLM inference processing, French startup VSORA has quietly pioneered a disruptive solution. The solution represents a paradigm shift in memory management.

VSORA Architecture: Functional Principles

The VSORA architecture redefines how data is stored, moved, and processed at scale. At its heart lies an innovative, scalable compute core designed around a very fast tightly coupled memory (TCM).

The TCM functions like an expansive register file—offering the lowest-latency, single-cycle read/write access of any on-chip memory. Placed directly alongside the compute fabric, it bypasses the multi-cycle penalties of conventional cache hierarchies. As a result, VSORA maintains exceptionally high utilization even on irregular workloads, since hot data is always available in the very next cycle.

Together, the compute logic and the TCM form a unified, scalable compute core that minimizes data-movement overhead and bypasses traditional cache hierarchies. The result is an order-of-magnitude reduction in access latency and blazing-fast end-to-end inference performance across edge and data-center deployments. See figure 1.

Figure 1: Traditional hierarchical-cache memory structure vs VSORA register-like memory approach [Source: VSORA]

VSORA Architecture: Physical Implementation

The VSORA architecture is realized using a chiplet-based design within a 2.5D silicon‐interposer package, coupling compute chiplets to high-capacity memory chiplets. Each compute chiplet carries two VSORA basic compute cores, and each memory chiplet houses a high-bandwidth memory stack. Compute and memory chiplets communicate over an ultra-low-latency, high-throughput Network-on-Chip (NoC) fabric.

In the flagship Jotunn8 device, eight compute chiplets and eight HBM3e chiplets are tiled around the central interposer, delivering massive aggregate bandwidth and parallelism in a single package.

Beyond Bandwidth/Latency: VSORA’s On-the-Fly Re-configurable Compute Cores Unlock Algorithm-Agnostic Deployment

In most AI accelerators today, the fundamental compute element is a fixed-precision multiply-accumulate (MAC) unit. Thousands—or even hundreds of thousands—of these MACs are woven together in a massive array, with both the compiler and the user defining how data flows spatially across the array and in what temporal order each operation executes. While this approach excels at raw throughput for uniform, fixed-precision workloads, it begins to fracture under the demands of modern large language models and cutting-edge AI applications, which require:

  • Mixed-precision support: LLMs often need to employ different quantization on different layers, for example, a mix of FP8 Tensorcore, FP16 Tensorcore and FP16 DSP layers within the same network to balance performance, accuracy and numerical fidelity. This requires the system to repeatedly quantize and dequantize data, introducing both overhead and rounding error (a minimal numeric sketch follows this list).
  • Dynamic range management: Activations and weights span widely varying magnitudes. Architectures built around a single fixed precision can struggle to represent very large or very small values without resorting to costly software-driven scaling.
  • Irregular and sparse tensors: Advanced workloads increasingly exploit sparsity to prune redundant connections. A rigid MAC mesh, optimized for dense operations, underutilizes its resources when data is sparse or when operations deviate from simple dot products.
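To make the quantize/dequantize overhead concrete, here is a minimal sketch of a symmetric INT8 round trip. INT8 stands in for the FP8/FP16 mixes discussed above, and the per-tensor scaling scheme is a generic textbook choice, not VSORA’s or any vendor’s actual method:

```python
import numpy as np

# Symmetric INT8 quantize/dequantize round trip, illustrating the rounding
# error introduced every time data crosses a precision boundary.

rng = np.random.default_rng(0)
activations = rng.normal(0.0, 3.0, size=10_000).astype(np.float32)

scale = np.abs(activations).max() / 127.0            # per-tensor scale factor
quantized = np.clip(np.round(activations / scale), -127, 127).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

err = np.abs(activations - dequantized)
print(f"max abs error: {err.max():.4f}, mean abs error: {err.mean():.4f}")
```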

These limitations introduce bottlenecks and reduce accuracy: throughput drops when precision conversions don’t map neatly onto the MAC fabric, and critical data must shuffle through auxiliary units for scaling or activation functions.

VSORA’s architecture flips the script on traditional accelerator fabrics by adopting reconfigurable compute tiles that adapt on the fly—zero downtime, zero manual reprogramming. Instead of dedicating large swaths of silicon to fixed-function MAC arrays or rigid tensor cores, each VSORA tile can instantly assume either DSP-style or Tensorcore-style operation, at any precision (FP8, FP16, INT8, etc.), on a per-layer basis.

In practice, this means that:

  • Layer-optimal precision: One layer might run at FP16 with high-dynamic-range DSP operations for numerically sensitive tasks, then the very next layer switches to FP8 Tensorcore math for maximum throughput—without any pipeline stalls.
  • Resource consolidation: Because every tile can serve multiple roles, there’s no idle silicon stranded when workloads shift in precision or compute type. VSORA sustains peak utilization across the diverse math patterns of modern LLMs.
  • Simplified compiler flow: The compiler’s task reduces to choosing the ideal mode per layer—Tensorcore or DSP—instead of wrestling with mapping data to dozens of discrete hardware blocks.

The result is an accelerator that tunes itself continuously to each model’s needs, delivering higher accuracy, lower latency, and superior energy efficiency compared to static, single-purpose designs.

VSORA’s architecture is not just about raw bandwidth; it’s about intelligent data processing, tailored to the specific demands of each application. This meticulous attention to detail at the core level is what distinguishes VSORA, enabling it to deliver AI inference solutions that are both powerful and efficient.

VSORA’s Secret Weapon: The Intelligent Compiler

Hardware ingenuity is only half the equation. VSORA’s algorithm-agnostic compiler consists of two stages. A hardware-independent, front-end graph compiler ingests standard model formats (TensorFlow, PyTorch, ONNX, etc.) and optimizes the model via layer fusion, layer re-ordering, weight compilation and scheduling, slicing, tensor layout optimization, execution scheduling, and sparsity enablement (data and weights). A back-end, LLVM-based compiler fully automates the mapping of leading-edge LLMs—such as Llama—onto the VSORA J8.
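For context on what ingesting a standard model format looks like in practice, the sketch below exports a toy PyTorch block to ONNX, the kind of hardware-independent graph a front-end compiler can then fuse, re-order, slice, and schedule. The export call is standard PyTorch; nothing here is specific to VSORA’s toolchain:

```python
import torch
import torch.nn as nn

# Export a toy model to ONNX, a hardware-independent graph format that a
# front-end graph compiler can ingest and optimize.
model = nn.Sequential(
    nn.Linear(512, 2048),
    nn.GELU(),
    nn.Linear(2048, 512),
).eval()

example_input = torch.randn(1, 128, 512)
torch.onnx.export(
    model,
    example_input,
    "toy_block.onnx",
    input_names=["hidden_states"],
    output_names=["output"],
    dynamic_axes={"hidden_states": {0: "batch", 1: "sequence"}},
)
```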

VSORA’s architecture radically simplifies the deployment of large language models by replacing the tedious, error-prone mapping workflows common in GPU environments with an automated, software-defined memory management layer. Unlike traditional GPU toolchains—where developers must hand-tune data layouts, manage low-level memory transfers, and master platform-specific APIs such as NVIDIA CUDA—VSORA’s compiler handles all of this transparently. As a result, teams can bring LLMs online far more quickly and reliably, even in power-constrained or latency-sensitive applications, without sacrificing performance or requiring deep hardware-level expertise.

The result is a seamless compilation software stack that maximizes chip utilization, simplifies deployment, and unleashes the full performance potential of VSORA’s breakthrough inference platform.

Conclusion

Unlike general-purpose accelerators optimized for training, VSORA conceived an architecture optimized for inference. This specialization reduces latency, boosts real-world responsiveness, and drives down operational costs in scenarios where every millisecond counts—from on-device AI in smart cameras to safety-critical systems in self-driving cars.

Market research forecasts AI inference revenue to more than double from about $100 billion in 2025 to an estimated $250 billion by 2030—a roughly 20 percent compound annual growth rate. As enterprises race to deploy real-time AI at scale, VSORA’s efficiency-first approach could redefine cost structures and performance benchmarks across the industry.

On April 27, 2025, VSORA announced a $46 million investment led by Otium Capital and a prominent French family office, with participation from Omnes Capital, Adélie Capital, and co-financing by the European Innovation Council Fund. In the words of Khaled Maalej, VSORA founder and CEO, “this funding empowers VSORA to tape-out the chip and ramp up production.”

Also Read:

SNUG 2025: A Watershed Moment for EDA – Part 1

SNUG 2025: A Watershed Moment for EDA – Part 2

DVCon 2025: AI and the Future of Verification Take Center Stage


Intel’s Foundry Transformation: Technology, Culture, and Collaboration

Intel’s Foundry Transformation: Technology, Culture, and Collaboration
by Kalar Rajendiran on 05-07-2025 at 10:00 am

Intel and UMC 2025

Intel’s historical dominance in semiconductor process technology began to erode around 2018, as competitors started delivering higher performance at smaller nodes. In response, Intel is now doubling down on innovation across two fronts: advanced process nodes such as Intel 18A and 14A, and cutting-edge packaging technologies.

Interestingly, this emphasis on packaging innovation isn’t a deviation from Moore’s Law—it’s an expansion of it. In the original paper that gave birth to Moore’s Law, Gordon Moore wrote that it may prove economical to build large systems out of smaller functions, which are separately packaged and interconnected. That concept is materializing today through multi-die architectures and chiplet-based integration, which are key to Intel’s packaging roadmap.

These dual pillars of process and packaging took center stage at the recent Intel Foundry Direct Connect event, where Intel outlined how these technologies will power next-generation products in a world increasingly defined by AI-driven workloads and heterogeneous computing.

A separate article covers what was shared regarding advanced process and packaging technology. During Day 2 of the Direct Connect event, Walter Ng, VP of Worldwide Business Development at Intel Foundry Services, and TJ Lin, President of UMC-USA, gave a joint talk. This article focuses on that session.

The Cultural Challenge: From Products to Services

Technology alone is not enough to reinvent Intel’s role in the industry. A transformation from a product-centric company to a customer-focused foundry demands an equally profound cultural shift. For decades, Intel has engineered and delivered its own products; now, it must serve as a platform for others’ innovations. This shift was a major theme at the event, especially during the joint presentation by Intel and its strategic foundry partner, United Microelectronics Corporation (UMC).

UMC’s own evolution from an IDM (Integrated Device Manufacturer) to a dedicated foundry equips it with a culture deeply rooted in customer collaboration, operational efficiency, and service orientation. These are exactly the qualities Intel must adopt to succeed in its foundry ambitions—and UMC is well-positioned to help guide that transformation.

A Strategic Opportunity

While Intel is forging ahead on advanced process and packaging fronts, the 12nm process node was selected for the Intel-UMC partnership for several strategic reasons. Although future collaborations may include additional nodes, the immediate focus is on delivering a competitive 12nm platform that targets a broad range of applications: high-performance computing, mobile, RF, consumer, industrial, automotive, aerospace, and medical sectors.

This market is expected to grow to $20 billion by 2028, with early momentum driven by logic and RF designs. From 2027 onward, growth in specialty technologies is expected. Application areas include WiFi combo chips, RF transceivers, image signal processors, set-top box SoCs, and more—addressing the full spectrum of modern semiconductor demands.

Distributed Development and Accelerated Execution

Development is proceeding in parallel at UMC’s Tainan facility in Taiwan and Intel’s Ocotillo Technology Fabrication (OTF) site in Arizona, reinforcing a geo-diversified manufacturing strategy. With fabs across the US, Taiwan, Korea, China, EMEA, and Japan, the collaboration supports customers in building resilient, multi-sourced supply chains.

Initial performance benchmarks are promising: compared to UMC’s 22uLP node, the new 12nm offering delivers 28% better performance, 47% lower power consumption, and over 50% area savings. In response to anchor customers, Intel has accelerated its Process Design Kit (PDK) delivery schedule, enabling earlier design-in and tape-out.

The partners are also closely coordinating foundry operations and support services to ensure a seamless transition from design to high-volume manufacturing.

UMC’s Role and Expertise

UMC brings decades of experience in foundry operations, with a comprehensive ecosystem of IP and design enablement tools, support for specialty devices, and a diverse global customer base. Its track record in delivering complex, customized solutions makes it a strong partner in applications where tailored performance is essential.

Intel’s Added Value

Intel contributes significant R&D depth in FinFET technology, established advanced-node capacity, and leadership in packaging innovation. Initiatives like the Chiplet Alliance are enabling a robust ecosystem for modular system design. Furthermore, Intel’s domestic manufacturing footprint in the U.S. strengthens its appeal for customers with localization or national security requirements.

Together, Intel and UMC are offering a competitive FinFET solution that supports multi-sourcing strategies and provides a clear technology migration path for future products.

Service Culture Learning as a Catalyst for Change

Beyond technological and operational synergies, this collaboration serves a more profound purpose in Intel’s evolution: accelerating its cultural transformation. UMC’s journey from IDM to foundry is now becoming part of Intel’s learning curve. As Intel adopts a more customer-first mindset, this partnership offers valuable guidance and real-world insight.

The collaboration is not merely an exchange of capabilities; it is also a transfer of values, principles, and best practices that may shape the long-term success of Intel Foundry Services.

Summary

In a semiconductor industry defined by diversification, specialization, and global complexity, the Intel-UMC 12nm partnership exemplifies smart, strategic collaboration. By combining UMC’s mature process expertise with Intel’s FinFET and packaging leadership—alongside a deepening cultural alignment—the partnership is well-positioned to unlock new market opportunities.

As Intel seeks to reclaim its role as a technology leader and establish itself as a next-generation foundry platform, this collaboration with UMC isn’t just strategic—it’s foundational.

Also Read:

Intel’s Path to Technological Leadership: Transforming Foundry Services and Embracing AI

Intel Foundry Delivers!

Intel Presents the Final Frontier of Transistor Architecture at IEDM


Speculative Execution: Rethinking the Approach to CPU Scheduling

Speculative Execution: Rethinking the Approach to CPU Scheduling
by Admin on 05-07-2025 at 6:00 am

IBM AI
First used in the IBM 360 and now central to modern CPUs, speculative execution boosts performance by predicting instruction outcomes. Dr. Thang Tran’s predictive execution model charts a simpler, more efficient path forward.

By Dr. Thang Minh Tran, CEO/CTO Simplex Micro

In the world of modern computing, speculative execution has played a pivotal role in boosting performance by allowing processors to guess the outcomes of instructions ahead of time, keeping pipelines full and reducing idle cycles. Initially introduced during the development of the IBM 360 series in the 1960s, speculative execution helped break through the barriers of earlier architectures, enabling better CPU performance.

However, as computing demands have grown, so too have the problems caused by speculative execution. While it was a necessary innovation in the past, speculative execution has evolved into a complex, resource-hungry solution that now contributes to inefficiencies in modern processors. The need for continued patching to address its shortcomings has led to a sprawling web of fixes that add to power consumption, security risks, and memory inefficiencies.

The Legacy of Speculative Execution

From the early days of the IBM 360 to modern processors, speculative execution has been a cornerstone of processor architecture. Its ability to predict instructions before they are needed allowed for increased speed and reduced idle time in early systems. However, the cost of continuing to rely on this strategy is becoming increasingly apparent.

As processors have evolved, the complexity of speculative execution has grown in lockstep. Branch predictors, reorder buffers, load-store queues, and speculative memory systems have all been layered on top of each other, building a complicated and often inefficient architecture designed to “hide” the mispredictions and errors that result from speculative execution. As a result, modern CPUs still carry the weight of speculative execution’s legacy, creating complexity without addressing the fundamental inefficiencies that have surfaced in recent years.

The Hidden Costs of Speculation

While speculative execution offers a theoretical performance boost, the reality is more complex. There are significant costs in terms of silicon area, power consumption, and security vulnerabilities:

Silicon Overhead: Around 25–35% of a modern CPU’s silicon area is dedicated to structures that support speculative execution. These areas are consumed by components such as branch predictors, reorder buffers, and load-store queues (TechPowerUp Skylake Die Analysis).

Power Consumption: Studies from UC Berkeley and MIT suggest that up to 20% of a CPU’s energy is consumed by speculative execution activities that ultimately get discarded, adding a substantial energy overhead (CPU Power Consumption Study).

Security Penalties: The discovery of vulnerabilities like Spectre and Meltdown has shown that speculative execution can introduce serious security risks. Mitigations for these vulnerabilities have resulted in performance penalties ranging from 5–30%, particularly in high-performance computing (HPC) and cloud computing environments (Microsoft Spectre and Meltdown Performance Impact).

These overheads are not just theoretical. In practice, speculative execution leads to slower, more energy-intensive processors that also pose serious security risks—issues that have only become more pressing with the advent of cloud computing and AI applications that require efficiency at scale.

Looking Beyond Speculation: A Path Forward

The time has come for a new approach to CPU architecture, one that moves away from the heavy reliance on speculation. It’s clear that predictive scheduling offers a promising alternative—one that can achieve the same performance improvements without the waste associated with speculative execution.

Recent patented inventions in predictive execution models offer a glimpse of the future. By scheduling tasks based on accurate predictions of when work can begin, rather than relying on speculative guesses, it becomes possible to eliminate the need for rollback systems, avoid speculative memory accesses, and create a more efficient, secure architecture.

Conclusion: A Call to Action

In conclusion, the history of speculative execution shows us both the innovation it sparked and the limitations it has imposed. While speculative execution was a crucial step in the evolution of computing, the time has come to move beyond it. Recent patents filed on predictive execution provide a promising path forward, one that offers greater efficiency, security, and power savings for future architectures.

Let’s not remain bound to the compromises of past decades; instead, let’s embrace a brighter future where CPU architectures can be both smarter and more efficient. The world is ready for a new era in computing—one that moves beyond speculation and into the realm of precision, predictability, and performance.

Also Read:

Intel’s Path to Technological Leadership: Transforming Foundry Services and Embracing AI

Turnkey Multi-Protocol Wireless for Everyone

Intel Foundry Delivers!

 


Intel’s Path to Technological Leadership: Transforming Foundry Services and Embracing AI

Intel’s Path to Technological Leadership: Transforming Foundry Services and Embracing AI
by Kalar Rajendiran on 05-06-2025 at 10:00 am


Intel, long a leader in semiconductor manufacturing, is on a determined journey to reclaim its technological leadership in the industry. After facing significant challenges in recent years, the company is making a concerted effort to adapt and innovate, with a clear focus on AI-driven technologies, advanced packaging solutions, and building a robust ecosystem.

During last week’s Intel Foundry Direct Connect event, Intel outlined its strategy and the investments it is making to enable this transformation. The company certainly does not intend to be encumbered by its recent history.

Transforming Intel Foundry: A Customer-Centric Approach

Intel is undertaking a significant transformation of its foundry business, with a renewed focus on becoming a customer-first, service-oriented organization. At the core of this strategy is a commitment to close collaboration with customers to ensure Intel not only meets, but anticipates their evolving needs. This transformation is guided by three strategic priorities:

Matching Technology to Customer Needs: Intel is aligning its technology offerings with the specific demands of industries like artificial intelligence (AI) and high-performance computing (HPC). Flexibility, predictability, and scalability are key pillars of this approach.

Improving Execution and Responsiveness: The company is refining its internal processes to better deliver on time and meet customer expectations with greater reliability.

Expanding the Ecosystem: Intel is investing significantly in ecosystem growth, including design enablement, IP support, and advanced packaging. These investments are designed to support its foundry business at scale.

Through these efforts, Intel aims to reshape its foundry operations into a comprehensive, end-to-end solutions provider equipped to meet the complex requirements of the modern semiconductor industry.

AI and Advanced Packaging: The Next Frontier

AI is transforming semiconductor design and manufacturing, and Intel is positioning itself as a foundational technology provider for this revolution. Recognizing that future computing performance relies not just on smaller transistors but also on smarter integration, Intel is making bold moves in advanced packaging.

Intel’s packaging technologies, including 2.5D and 3D solutions, are designed to offer increased design flexibility, faster time-to-market, and efficient performance scaling. Technologies such as Through-Silicon Vias (TSVs), embedded silicon bridges (E-bridges), and interconnect standards like UCIe are being implemented to address the demands of AI workloads—high bandwidth, low latency, and energy efficiency.

Intel’s EMIB (Embedded Multi-die Interconnect Bridge) enables high-density die-to-die connections in a 2.5D package without the need for a silicon interposer, offering cost and performance benefits for large chiplet-based designs. Foveros, Intel’s 3D stacking technology, allows different logic tiles to be stacked vertically, enabling heterogeneous integration across process nodes. Foveros Direct, an evolution of this platform, introduces direct copper-to-copper bonding for ultra-high-density, low-resistance interconnects, pushing the boundaries of integration and energy efficiency.

The development of advanced solutions such as HBM4 support for high-performance memory, together with support for reticle-size packages up to 12X the standard, enables Intel to support large, AI-centric designs at scale.

A Collaborative Ecosystem: Strengthening Partnerships

Intel recognizes that its comeback won’t happen in isolation. Key to its strategy is building a robust, collaborative ecosystem. Lip-Bu Tan, CEO of Intel, underscored this during his talk, emphasizing the importance of ecosystem enablement to support AI’s massive growth.

Intel’s leadership in forming the Chiplet Alliance, comprising over a dozen companies, highlights its effort to create a secure, standardized, and interoperable chiplet ecosystem. This initiative underlines Intel’s commitment to building a connected value chain across design, packaging, and manufacturing.

Intel is also partnering with United Microelectronics Corporation (UMC) to bring its 12nm technology to market, aimed at serving specialty applications. This collaboration combines Intel’s manufacturing expertise with UMC’s strengths in design enablement. You can read about this in a separate post on SemiWiki: Intel’s Foundry Transformation: Technology, Culture, and Collaboration

The U.S. Government’s Role in Intel’s Vision

Intel views the U.S. government as a foundational partner in its mission to bring semiconductor leadership back to American soil. Programs like RAMP (Rapid Assured Microelectronics Prototyping) and the Secure Enclave Program are key enablers of trusted domestic manufacturing. These initiatives support the development and ramp-up of Intel’s cutting-edge 18A and 16 nodes, all of which will be manufactured in the U.S. Intel’s alignment with national security and economic priorities strengthens its position as both a commercial and strategic partner.

The government’s involvement is not just about funding—it’s about ensuring a resilient and secure semiconductor supply chain for both public and private sectors.

Summary

Intel’s strategy to regain technological leadership is built on four pillars: advanced process technology, next-generation packaging, ecosystem collaboration, and strong public-private partnerships. Looking ahead, Intel’s roadmap is anchored in technologies that provide scalable performance with predictable delivery.

With a customer-first mindset, a reinvigorated focus on execution, and a bold investment in innovation, Intel is poised to lead the next era of semiconductor technology.

Also Read:

Intel’s Path to Technological Leadership: Transforming Foundry Services and Embracing AI

Intel Foundry Delivers!

Intel Presents the Final Frontier of Transistor Architecture at IEDM

An Invited Talk at IEDM: Intel’s Mr. Transistor Presents The Incredible Shrinking Transistor – Shattering Perceived Barriers and Forging Ahead


Intel Foundry Delivers!

Intel Foundry Delivers!
by Daniel Nenni on 05-05-2025 at 10:00 am


Now that the dust has settled, I will give you my take on the Intel Foundry event. Some might call me a semiconductor event critic as I have attended hundreds of them over the last 40 years starting with the Design Automation Conference in 1984. Foundry events are my favorite because they really are the pulse of the semiconductor industry, it is all about the ecosystem of partners and customers. The message Intel Foundry sent this year is that they are going to earn your foundry business.

Last year, Intel Foundry was all about technology, which is fine, but what really matters is customers, and that message was loud and clear to me this year. At some events I am herded around with media people, but last week I ran free and was able to talk candidly with my fellow semiconductor professionals. More importantly, the badges were readable and mine even had my name on it instead of just PRESS or MEDIA, because I am so much more. I was amazed at the overall support by partners: pretty much everyone I saw at the TSMC event the previous week was there, and quite a few badges from the top semiconductor companies were in attendance as well.

Surprisingly, the keynotes were live streamed which was a foundry event first for me. Speaking in front of more than a thousand of your peers AND live streamed? Horrifying if you think about it, and I have, but it was well done.

First up was Intel CEO Lip-Bu Tan. I have told people before that you should never bet against Lip-Bu. When he tells you something you can bank on it and that is my experience based on his time with Cadence. He first joined the Cadence board in 2004, was CEO 2009-2021, and left the board in 2023.

I started in the industry before there was a Cadence or Synopsys and spent the majority of my career in EDA and foundation IP. Cadence was having a very rough time when Lip-Bu joined and he turned Cadence into what they are today, a leading innovator in the semiconductor ecosystem.

Lip-Bu showed his EDA savvy by inviting EDA CEOs to the stage in a show of support on both sides. I have never seen this done before. It was amazing! Remember, without EDA companies there would be no foundries. Mike Ello did a nice job for Siemens EDA. I worked with Mike at Berkeley DA up until the acquisition by Mentor. For those people who are concerned about Cadence having an inside advantage with Intel Foundry, don’t be. Lip-Bu does not play that game. The foundry ecosystem has to be a level playing field and no one knows that better than Lip-Bu. On the EDA Enablement slide the key players were properly listed in alphabetical order: Ansys, Cadence, Keysight, Siemens EDA, and Synopsys.

Synopsys was nice enough to share some slides and a quote with me that really encapsulates what is going on with Intel 18A. I have heard this from multiple ecosystem people and customers, so it is not just Synopsys, and I agree 100% with Sassine Ghazi, CEO of Synopsys.

“You cannot only win with technology, you need to have the whole process of enablement ready in order for the customer to see it as viable,” said Sassine, holding an early Synopsys test chip produced on Intel 18A about one year ago. He also said the teams are now in early-stage DTCO for Intel 14A-E, leveraging Synopsys Technology Computer-Aided Design (TCAD) to use computer simulations for process node development.

Speaking of 14A, Intel mentioned a while back that High-NA EUV will not be required for 14A, and from what I understand the first foundry version of 14A will be EUV. TSMC has said the same, and this is a very good thing if you remember when EUV was delayed for YEARS! As I always say, plan for the worst and hope for the best.

Second up was Naga Chandrasekaran, Intel Foundry chief technology and operations officer followed by Kevin O’Buckley, general manager of Foundry Services. I do not know Naga personally but I do know people who worked for him at Micron and they speak very highly of him. Naga is a strong leader. Kevin I do know and I can vouch for him, he is a true professional and will work well with Lip-Bu. Here is a 30 minute clip from the keynotes that is worth watching:

Bottom line: Even with the short amount of time he has been CEO, Lip-Bu has already made a difference. Just wait until next year. I would bet that the Intel Foundry 18A customer tape-out list will be the Who’s Who of the semiconductor industry, absolutely.

Also Read:

Intel Presents the Final Frontier of Transistor Architecture at IEDM

An Invited Talk at IEDM: Intel’s Mr. Transistor Presents The Incredible Shrinking Transistor – Shattering Perceived Barriers and Forging Ahead


Silicon Creations Presents Architectures and IP for SoC Clocking

Silicon Creations Presents Architectures and IP for SoC Clocking
by Mike Gianfagna on 05-05-2025 at 6:00 am


Design & Reuse recently held its IP-SoC Days event at the Hyatt Regency in Santa Clara. Advanced IP is now the fuel for a great deal of innovation in semiconductor design. This popular event allows IP providers to highlight the latest products and services and share a vision of the future. IP consumers can easily get updates on the latest IP technology trends and innovations all at one event. There was a presentation that dove into a lot of very relevant details. The information was complete and compelling. In case you missed the event, here is a summary of how Silicon Creations presents architectures and IP for SoC clocking.

About the Authors

Jeff Galloway

The presentation was given by Jeff Galloway, co-founder of Silicon Creations. Jeff has a long career in analog/mixed-signal design dating back to Hewlett Packard in the late 1990s. He also developed advanced AMS designs at Agilent and MOSAID/Virtual Silicon before founding Silicon Creations in 2006. His command of the challenges of AMS design and the impact these designs have on innovation is substantial. Jeff brought a lot of perspective and insight to the presentation.

The paper was co-authored by Krzysztof Kasiński, Director of IP Validation Laboratory at Silicon Creations and University Professor at AGH University of Kraków.

About Silicon Creations

Silicon Creations is a self-funded, leading silicon IP provider with offices in the US and Poland as shown in the diagram above. The company has sales representation worldwide and delivers precision and general-purpose timing (PLLs), oscillators, low-power, high-performance multi-protocol and targeted SerDes and high-speed differential I/Os. This IP is used in diverse applications including smart phones, wearables, consumer devices, processors, network devices, automotive, IoT, and medical devices. An overview of where you will find Silicon Creations IP is also shown in the diagram above.

The company works with many foundries, delivering IP in planar, FinFET, FD-SOI, and Gate-All-Around technologies. The company reports that the majority of the world’s top 50 IC companies are working with Silicon Creations. The figure below shows the substantial ecosystem footprint the company delivers.

Problems to be Solved

Clock jitter provides a lot of headaches for advanced design teams. This is the situation where clock edges move from the ideal locations. Clocking systems should be designed with jitter in mind to avoid system level problems.

The diagram above, provided by Silicon Creations, provides some context for the jitter types by application. It turns out various applications require different clock quality metrics. Here are some examples:

  • Generic, digital synchronous logic: Total period jitter (peak-negative) to define clock uncertainty for STA
  • Multi-cycle paths and e.g. DDR controllers: N-period jitter (depending on number of cycles / latency)
  • ADC/DAC: Long-term jitter / Phase Noise to achieve ENOB; LTJ reduces ENOB (see the sketch after this list)
  • RF: Long-term jitter / Phase Noise to achieve low Error Vector Magnitude; LTJ rotates the constellation, increasing EVM
  • SerDes: Clock specification (including integration bands) tightly related to the protocol and PHY vendor. Possible filtering (e.g. PCIe spec). Embedded clock applications require low LTJ
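To illustrate the ADC/DAC bullet above, sampling-clock jitter places a hard ceiling on a converter’s SNR, and therefore its ENOB, via the standard aperture-jitter bound SNR = -20·log10(2π·fin·tj). The input frequency and jitter values below are assumed examples, not figures from the presentation:

```python
import math

# Aperture-jitter bound on ADC performance: SNR_dB = -20*log10(2*pi*f_in*t_j).
# Frequencies and jitter values are illustrative assumptions.

f_in = 200e6   # 200 MHz input tone
for t_jitter in (100e-15, 500e-15, 2e-12):   # 100 fs, 500 fs, 2 ps RMS
    snr_db = -20.0 * math.log10(2.0 * math.pi * f_in * t_jitter)
    enob = (snr_db - 1.76) / 6.02
    print(f"jitter = {t_jitter * 1e15:6.0f} fs -> SNR = {snr_db:5.1f} dB, "
          f"ENOB = {enob:4.1f} bits")
```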

Jitter can come from many sources. Here are some examples provided by Silicon Creations with the type of remediation required:

  • FREF signal quality including impact from its signal path → design choices
  • PLL design and operating point (FPFD, FVCO, FOUT configuration, PVT condition) → IP vendor + programming
  • PLL conditions (decoupling, supply noise) → IP vendor support + design choices
  • Output clock path (supply noise, length / delay, additive jitter) → design choices

The Silicon Creations Approach

Jeff provided a very informative chart (below) that shows where the company’s broad IP portfolio fits in advanced designs.

Silicon Creations IP in Today’s SoCs

It turns out a large library of precision IP is needed to address clocking challenges in advanced designs. Silicon Creations has developed a comprehensive approach to these challenges. The chart below provides an overview of the various types of designs requiring help and how the Silicon Creations IP portfolio fits.

Optimized Clocking IP Portfolio

In his presentation at IP-SoC Days, Jeff provided examples of how Silicon Creations IP provides design teams options for dealing with clocking issues. He also provided an informative chart of a PLL performance matrix for its IP across frequency, area, and power. It turns out Silicon Creations PLL IP covers a vast range of performance metrics and can be a fit to many applications. Here are some highlights:

  • Power: below 50 µW through 5mW up to 50mW
  • Area: 0.07mm2 through 0.1mm2 up to 0.14mm2
  • Output Frequency Range: From MHz range up to tens of GHz; frequency jumping, glitch-free operation, de-skewing
  • Reference Frequency Range: From 32kHz, through all crystal oscillator types, external clock chips, on-chip de-skewing

Jeff also described the comprehensive engagement process the company offers its customers. He explained that the Silicon Creations engineering team is involved from the beginning to help design SoC clock systems, from IP through distribution to the power delivery network. An overview of the engagement process is provided below.

To Learn More

If you are doing advanced designs that are performance sensitive, you will likely face the kind of clocking problems that Silicon Creations can address. The company’s broad, high-quality IP portfolio and domain expertise can be a real asset.

You can access the Silicon Creations presentation from IP-SoC days here.  And you can preview Silicon Creations’ IP portfolio on Design & Reuse here

Of course, you can also learn more about Silicon Creations at the company’s website here. There is also some excellent in-depth coverage on SemiWiki here. And you can always reach out to the company to start a dialogue here. And that’s how Silicon Creations presents architectures and IP for SoC clocking at IP-SoC Days.


Impact of Varying Electron Blur and Yield on Stochastic Fluctuations in EUV Resist

Impact of Varying Electron Blur and Yield on Stochastic Fluctuations in EUV Resist
by Fred Chen on 05-03-2025 at 4:00 pm

P30 stochastics vs attenuation length

A comprehensive update to the EUV stochastic image model

In extreme ultraviolet (EUV) lithography, photoelectron/secondary electron blur and secondary electron yield are known to drive stochastic fluctuations in the resist [1-3], leading to the formation of random defects and the degradation of pattern fidelity at advanced nodes. For simplicity, blur and electron yield per photon are often taken to be fixed parameters for a given EUV resist film. However, there is no reason to expect this to be true, since the resist is inhomogeneous on the nanoscale [4-6].

I have updated the model I have been using to analyze EUV stochastics with the following:

  • Image fading from EUV pole-specific shift is not included (expected to be minor)
  • Polarization consideration: assume 50% TE / 50% TM for the 2-beam image commonly used for EUV lines and spaces [7]
  • Poisson statistics are applied to the absorbed photons/nm2
  • Electron blur is fixed at zero at zero distance and matches an exponential attenuation length at larger distances => exp(-x/attenuation length)
  • Electron blur is considered locally varying rather than a fixed number
  • Electron (or acid) number per absorbed EUV photon has a minimum and maximum value
  • Acid blur still uses a Gaussian form (σ = 5 nm)

A key feature is that electron yield per photon is part of a distribution. This distribution is often modeled as Poissonian but in actuality can deviate from it significantly [8]. The maximum number of electrons is roughly the EUV photon energy (~92 eV) divided by the ionization potential (~10 eV), giving 9. The minimum number of electrons can be estimated from the Auger emission scenario in Figure 1, which releases 6 electrons; with one electron assumed lost to the underlayer, this gives 5.

Figure 1. EUV Auger emission scenario releasing minimal number of electrons.

We also recall that electron blur is not a fixed value, but can take on different values at different locations [4-6]. As constructed previously [3], the electron blur function shape arises from the difference of two exponential functions. To maintain zero probability at zero distance, an exponential function with 0.4 nm attenuation length is subtracted from the normalized exponential function with target attenuation length.
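A minimal numerical sketch of this construction, as I read it, is below; the unit-area normalization and distance grid are assumptions for illustration:

```python
import numpy as np

# Electron blur kernel built as the difference of two exponentials, per the
# construction described above: zero at x = 0, decaying with the target
# attenuation length at larger distances. Normalization is an assumption.

def blur_kernel(x_nm, attenuation_nm, short_nm=0.4):
    shape = np.exp(-x_nm / attenuation_nm) - np.exp(-x_nm / short_nm)
    dx = x_nm[1] - x_nm[0]
    return shape / (shape.sum() * dx)   # normalize to unit area

x = np.linspace(0.0, 20.0, 2001)
typical = blur_kernel(x, attenuation_nm=2.0)   # "typical" blur (~2 nm)
rare = blur_kernel(x, attenuation_nm=4.0)      # "rare" higher blur (~4 nm)

print(f"typical kernel peaks at {x[np.argmax(typical)]:.2f} nm")
print(f"rare kernel peaks at {x[np.argmax(rare)]:.2f} nm")
```

With these parameters the typical kernel peaks just under 1 nm and the rare kernel near 1 nm, consistent with the distributions described in Figure 2.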

While a typical blur can correspond to an attenuation length of ~2 nm, the inhomogeneity within the resist can lead to a rare occurrence of higher blur, e.g., corresponding to an attenuation length of ~4 nm (Figure 2).

Figure 2. A “typical” electron blur distribution could peak at just under 1 nm distance and decay exponentially with an attenuation length of ~2 nm. On the other hand, a “rare” electron blur distribution may peak at ~1 nm distance and decay exponentially with an attenuation length of ~ 4 nm.

The impact of varying blur is to increase the % of defective pixels in the image (Figure 3). We expect stochastic defectivity to be higher. This is to be expected as increasing blur decreases the max-min difference, making it more likely for particle number fluctuations to cross a given threshold.

Figure 3. A higher local blur (i.e., attenuation length) reduces image contrast more, increasing the likelihood of electron number fluctuations to cross the printing threshold.

Etching can affect the stochastics in the final image. Etch bias can make obstructions in the trench more likely, or openings between trenches more likely. This is the origin of the “valley” between the stochastic defectivity cliffs (Figure 4).

Figure 4. Etch bias can also affect the stochastic defectivity. Increasing etch bias (right) means undersizing the feature, which is more likely for trenches to be blocked.

As pitch increases, the contrast loss from the “typical” electron blur shown in Figure 2 will reach a minimum value of 20% [3]. However, the stochastic behavior for larger pitches improves as thicker resists may be used, increasing photon absorption density. Going from 30 nm to 40 nm pitch (Figure 5), the absorbed photon density increases, and the contrast reduction from electron blur is also improved. However, there is still significant noise in the electron density at the edge and defectivity with etch bias.
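The benefit of a higher absorbed photon density follows directly from the Poisson statistics in the model: the relative fluctuation of the photon count per pixel falls as 1/sqrt(N). The densities in the sketch below are assumed round numbers for illustration, not values taken from this article’s figures:

```python
import math

# Poisson shot-noise sketch: relative fluctuation falls as 1/sqrt(N).
# Photon densities are assumed round numbers, not values from the model.

pixel_area_nm2 = 1.0   # 1 nm x 1 nm pixel
for density in (20.0, 40.0):   # absorbed photons per nm^2 (assumed)
    n = density * pixel_area_nm2
    rel_sigma = 1.0 / math.sqrt(n)
    print(f"{density:4.0f} photons/nm^2 -> "
          f"{rel_sigma:.1%} relative fluctuation per pixel")
```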

Figure 5. Same as Figure 4 but for 40 nm pitch. The absorbed photon density and contrast are increased, but the defectivity is only slightly improved.

When chemically amplified resists are used, acid blur must be added. Acid blur is usually modeled as a Gaussian function with sigma on the order of 5 nm [9]. The extra blur from acid aggravates the stochastic behavior even more (Figure 6).

Figure 6. Acid blur is a strong blur component in chemically amplified EUV resists.

Consequently, double or multiple patterning is used with EUV comprising exposures at > 40 nm pitch, with larger features and resist thicknesses. DUV multipatterning will still have the following advantages over EUV multipatterning:

  • Polarization restriction (not unpolarized)
  • No electron blur
  • Higher photon density
  • 2-beam imaging (instead of 3- or 4-beam)
  • Well-developed phase-shift mask technology

Most importantly, DUV double patterning has lower cost than EUV double patterning.


References

[1] Z. Belete et al., J. Micro/Nanopattern. Mater. Metrol. 20, 014801 (2021).

[2] F. Chen, A Perfect Storm for EUV Lithography.

[3] F. Chen, A Realistic Electron Blur Function Shape for EUV Resist Modeling.

[4] F. Chen, Measuring Local EUV Resist Blur with Machine Learning.

[5] F. Chen, Stochastic Effects Blur the Resolution Limit of EUV Lithography.

[6] G. Denbeaux et al., “Understanding EUV resist stochastic effects through surface roughness measurements,” IEUVI Resist TWG meeting, February 23, 2020.

[7] H. J. Levinson, Jpn. J. Appl. Phys. 61, SD0803 (2022).

[8] E. F. da Silveira et al., Surf. Sci. 408, 28 (1998); Z. Bay and G. Papp, IEEE Trans. Nucl. Sci. 11, 160 (1964); L. Frank, J. Elec. Microsc. 54, 361 (2005).

[9] H. Fukuda, J. Micro/Nanolith. MEMS MOEMS 19, 024601 (2020).

This article is based on the presentation in the following video: Comprehensive Update to EUV Stochastic Image Model.



Executive Interview with Koji Motomori, Senior Director of Marketing and Business Development at Numem

Executive Interview with Koji Motomori, Senior Director of Marketing and Business Development at Numem
by Daniel Nenni on 05-03-2025 at 2:00 pm

Koji Motomori

Koji Motomori is a seasoned business leader and technologist with 30+ years of experience in semiconductors, AI, embedded systems, data centers, mobile, and memory solutions, backed by an engineering background. Over 26 years at Intel, he drove strategic growth initiatives, securing $2B+ in contracts with OEMs and partners. His expertise spans product marketing, GTM strategy, business development, deal-making, and ecosystem enablement, accelerating the adoption of CPU, memory, SSD, and interconnect technologies.

Tell us about your company.

At Numem, we’re all about taking memory technology to the next level, especially for AI, Edge Devices, and Data Centers. Our NuRAM SmartMem™ is a high-performance, ultra-low-power memory solution built on MRAM technology. So, what makes it special? It brings together the best of different memory types—SRAM-like read speeds, DRAM-like write performance, non-volatility, and ultra-low power.

With AI and advanced computing evolving fast, the demand for efficient, high-density memory is skyrocketing. That’s where we come in. Our solutions help cut energy consumption while delivering the speed and reliability needed for AI training, inference, and mission-critical applications. Simply put, we’re making memory smarter, faster, and more power-efficient to power the future of computing.

What problems are you solving?

That’s a great question. The memory industry is really struggling to keep up with the growing demands of AI and high-performance computing. Right now, we need memory that’s not just fast, but also power-efficient and high-capacity. The problem is, existing technologies all have major limitations.

Let’s take SRAM, for example: it’s fast, but it has high leakage power and doesn’t scale well at advanced nodes. HBM DRAM is another option, but it’s costly, power-hungry, and still not fast enough to fully meet AI’s needs. And then there’s DDR DRAM, which has low bandwidth, making it a bottleneck for high-performance AI workloads.

That’s exactly why we developed NuRAM SmartMem to solve these challenges. It combines the best of different memory types:

  • It gives you SRAM-like read speeds and DRAM-like write speeds, so AI workloads run smoothly.
  • It has 200x lower standby power than SRAM, which is huge for energy efficiency.
  • It’s 2.5x denser than SRAM, helping reduce cost and die size.
  • It delivers over 3x the bandwidth of HBM, eliminating AI bottlenecks.
  • And it’s non-volatile, meaning it retains data even when the power is off.

So, with NuRAM SmartMem™, we’re not just making memory faster, we’re making it more efficient and scalable for AI, Edge, and Data Center applications. It’s really a game-changer for the industry.

What application areas are your strongest?

That’s another great question. Our memory technology is designed to bring big improvements across a wide range of applications, but we’re especially strong in a few key areas.

For data centers, we help make AI model training and inference more efficient while cutting power consumption. Since our technology reduces the need for SRAM and DRAM, companies see significant Total Cost of Ownership (TCO) benefits. Plus, the non-volatility of our memory enables instant-on capabilities, meaning servers can reboot much faster.

In automotive, especially for EVs, real-time decision-making is critical. Our low-power memory helps extend battery life, and by consolidating multiple memory types like NOR Flash and LPDDR, we save space, power, cost, and weight—while also improving reliability.

For Edge AI devices and IoT applications, power efficiency is a huge concern. Our ultra-low-power memory helps reduce energy consumption, making these devices more sustainable and efficient.

Aerospace is another area where we stand out. Mission-critical applications demand reliability, energy efficiency, and radiation immunity—all of which our memory provides.

Then there are security cameras—with ultra-low power consumption and high bandwidth, our memory helps extend battery life while supporting high-resolution data transmission. And since we can replace memory types like NOR Flash and LPDDR, we also optimize space, power, and cost.

For wearable devices, battery life is everything. Our technology reduces power consumption, enabling lighter, more compact designs that last longer—something consumers really appreciate.

And finally, in PCs and smartphones, AI-driven features need better memory performance. Our non-volatile memory allows for instant-on capabilities, extends battery life, and replaces traditional memory types like boot NOR Flash and DDR, leading to power and space savings, plus faster boot times and overall better performance.

So overall, our memory technology delivers real advantages across multiple industries.

What keeps your customers up at night?

A lot of things. AI workloads are becoming more demanding, and our customers are constantly looking for ways to stay ahead.

One big concern is power efficiency and thermal management. AI systems push power budgets to the limit, and with rising energy costs, the total cost of ownership (TCO) becomes a huge factor. Keeping power consumption low is critical, not just for efficiency, but for performance and profitability.

Then there’s the issue of memory bandwidth bottlenecks. Traditional memory architectures simply can’t keep up with the growing performance demands of AI, which creates bottlenecks and limits system scalability.

Scalability and cost are also major worries. AI applications need more memory, but scaling up can increase the spending fast. Our customers want solutions that provide higher capacity without blowing the budget.

And finally, reliability and data retention are key, especially for AI and data-heavy applications. These workloads require memory that’s not just fast, but also non-volatile, secure, and long-lasting while still keeping power consumption low.

That’s exactly where NuRAM SmartMem comes in. Our technology delivers ultra-low power, high-density, and high-bandwidth memory solutions that help customers overcome these challenges and future-proof their AI-driven applications.

What does the competitive landscape look like, and how do you differentiate?

The high-performance memory market is dominated by SRAM, LPDDR DRAM, and HBM. Each of these technologies has strengths, but they also come with some major challenges.

SRAM, for example, is fast, but it has high standby power and scalability limitations at advanced nodes. LPDDR DRAM is designed to be lower power than standard DRAM, but it still consumes a lot of energy. And HBM DRAM delivers high bandwidth, but it comes with high cost, power constraints, and integration complexity.

That’s where NuRAM SmartMem™ stands out. We’ve built a memory solution that outperforms these technologies in key areas:

  • 200x lower standby power than SRAM, making it perfect for always-on AI applications that need ultra-low power.
  • 2.5x higher density than SRAM, reducing die size and overall memory costs.
  • Non-volatility, which unlike SRAM and DRAM, NuRAM retains data even without power. This adds both energy efficiency and reliability.
  • Over 3x faster bandwidth than HBM3E, solving AI’s growing memory bandwidth challenges.
  • Over 260x lower standby power than HBM3E, thanks to non-volatility and our flexible power management feature per block.
  • Scalability & Customization—NuRAM SmartMem™ is available as both IP cores and chiplets, making integration seamless for AI, IoT, and Data Center applications.

So, what really differentiates us? We’re offering a next-generation memory solution that maximizes performance while dramatically reducing power and cost. It’s a game-changer compared to traditional memory options.

What new features/technology are you working on?

We’re constantly pushing the boundaries of AI memory innovation, focusing on performance, power efficiency, and scalability. A few exciting things we’re working on right now include:

  • Smart Memory Subsystems – We’re making memory smarter. Our self-optimizing memory technology is designed to adapt and accelerate AI workloads more efficiently.
  • 2nd-Gen NuRAM SmartMem™ Chiplets – We’re taking things to the next level with even higher bandwidth, faster read/write speeds, lower power consumption, and greater scalability than our first generation.
  • AI Optimized Solutions – We’re fine-tuning our memory for LLM inference, AI Edge devices, and ultra-low-power AI chips, ensuring they get the best performance possible.
  • High-Capacity & Scalable Operation – As AI models keep growing, memory needs to scale with them. We’re expanding die capacity and improving stacking while working closely with foundries to boost manufacturability and yield for high-volume production.
  • Memory Security & Reliability Enhancements – AI applications rely on secure, stable memory. We’re enhancing data integrity, security, and protection against corruption and cyber threats to ensure reliable AI operations.

For the future, we’re on track to deliver our first-generation chiplet samples in Q4 2025 and second-generation chiplets samples in Q2 2026. With these advancements, we’re setting a new benchmark for efficiency, performance, and power optimization in AI memory.

How do customers normally engage with your company?

We work closely with a wide range of customers, including AI chip makers, MCU/ASIC designers, SoC vendors, Data Centers, and Edge computing companies. Our goal is to integrate our advanced memory solutions into their systems in the most effective way possible.

There are several ways customers typically engage with us:

  • NuRAM + SmartMem™ IP Licensing – Some customers embed our NuRAM SmartMem™ technology directly into their ASICs, MCUs, MPUs, and SoCs, boosting performance and efficiency for next-gen AI and computing applications.
  • SmartMem™ IP Licensing – Others use our SmartMem™ technology on top of their existing memory architectures, whether Flash, RRAM, PCRAM, traditional MRAM, or DRAM, to improve memory performance and power efficiency.
  • Chiplet Partnerships – For customers looking for a plug-and-play solution, we offer SmartMem™ chiplets that deliver high bandwidth and ultra-low power, specifically designed for server and Edge AI accelerators while seamlessly aligning with industry-standard memory interfaces.
  • Custom Memory Solutions – We also work with customers to customize memory architectures to their specific AI and Edge workloads, ensuring optimal performance and power efficiency.
  • Collaborations & Joint Development – We actively partner with industry leaders to co-develop next-generation memory solutions, maximizing AI processing efficiency and scalability.

At the end of the day, working with Numem gives customers access to ultra-low-power, high-performance, and scalable memory solutions that help them meet AI’s growing demands while significantly reducing energy consumption and cost.



Video EP3: A Discussion of Challenges and Strategies for Heterogeneous 3D Integration with Anna Fontanelli
by Daniel Nenni on 05-02-2025 at 10:00 am

In this episode of the Semiconductor Insiders video series, Dan is joined by Anna Fontanelli, founder and CEO of MZ Technologies. Anna explains some of the substantial challenges associated with heterogeneous 3D integration. Dan then begins to explore some of the capabilities of GenioEVO, the first integrated chiplet/package EDA tool to address, in the pre-layout stage, the two major issues of 3D-IC design: thermal and mechanical stress.

Contact MZ Technologies

The views, thoughts, and opinions expressed in these videos belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview with Richard Hegberg of Caspia Technologies
by Daniel Nenni on 05-02-2025 at 6:00 am

Richard Hegberg

Rick has a long and diverse career in the semiconductor industry. He began as VP of sales at Lucent Microelectronics. He has held executive roles at several high-profile companies and participated in several acquisitions along the way. These include NetApp, SanDisk/WD, Atheros/Qualcomm, Numonyx/Micron, ATI/AMD, and VLSI Technology.

Rick was CEO of three semiconductor start-ups. He has deep knowledge of, and a passion for, semiconductor systems, with a special focus on analog and AI. Rick joined Caspia Technologies as CEO in September 2024.

Tell us about your company

Caspia Technologies was born out of pioneering work in cybersecurity at the University of Florida in Gainesville.  The founding team includes Dr. Mark Tehranipoor, Department Chair of Electrical and Computer Engineering. Mark is also the founding director of the Florida Institute for Cybersecurity Research.

He and his research team have driven an expanded understanding of the complex world of cybersecurity. The team began developing unique GenAI-assisted tools to ensure new chip designs are resistant to current and future cyberattacks. The commercial application of the team’s work is what created Caspia Technologies and got my attention to lead the effort.

What problems are you solving?

Caspia is delivering technology and know-how to ensure chip designs are resistant to cyberattacks. If you think about current production chip design flows, there are well-developed processes to verify the functionality, timing, and power of new chip designs. What is missing is a process to verify how secure and resistant to attack these chip designs are. A security verification flow is needed, and Caspia is delivering that.

There is more to this story. We all know that AI is driving many new and highly complex chip designs. Design teams are now using AI technology to make it easier and faster to design the required chips. Using AI to design AI chips, if you will. While this approach has shown substantial results, two critical risks emerge.

First, the vast data sets that are being used to train AI models are typically not managed and secured appropriately. This opens the door to faulty design practices. And second, these same AI algorithms can be used to mount very sophisticated cyberattacks. AI makes it easier to design chips but also easier to attack those same chips.

Using GenAI technology and a deep understanding of cyberattack methods, Caspia is addressing these problems.

What application areas are your strongest?

Assurance of security at the hardware level is our focus and our core strength. We are presently deploying a GenAI-assisted security platform that examines pre-silicon designs using three approaches.

First, we perform static checking of RTL code to identify design practices that can lead to security weaknesses. The tool also helps designers fix the problems it finds. We use a large set of checks that is constantly updated with our security-trained large language models (LLMs).

Second, we use GenAI to create security-based assertions that can be fed to standard formal verification tools. This opens the door to deep analysis of new designs from a security standpoint.

And third, we use GenAI to set up scenarios using standard co-simulation and emulation technology to ensure the design is indeed resistant to real-world cyberattacks.
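
As a purely editorial illustration of the first approach, here is a minimal sketch of what a lint-style static check for RTL security weaknesses might look like in principle. This is not Caspia’s tool or rule set; the rule names, patterns, and sample RTL below are invented for illustration only, and a real platform would parse the RTL properly and draw on a far larger, continuously updated rule library.

```python
import re

# Hypothetical lint-style rules for common RTL security weaknesses.
# The rule names and patterns are invented for illustration only.
SECURITY_RULES = [
    ("hard-coded key or secret", re.compile(r"\b(key|password|secret)\w*\s*=\s*\d+'h[0-9a-fA-F_]+", re.IGNORECASE)),
    ("debug backdoor tied high", re.compile(r"\bdebug_(en|mode|unlock)\b.*=\s*1'b1", re.IGNORECASE)),
    ("scan enable tied high",    re.compile(r"\bscan_en\b\s*=\s*1'b1", re.IGNORECASE)),
]

def check_rtl(rtl_text: str):
    """Return (line number, rule name, offending line) tuples for each match."""
    findings = []
    for lineno, line in enumerate(rtl_text.splitlines(), start=1):
        for name, pattern in SECURITY_RULES:
            if pattern.search(line):
                findings.append((lineno, name, line.strip()))
    return findings

if __name__ == "__main__":
    sample_rtl = """
    module crypto_core(input clk, input rst_n);
      localparam KEY = 128'hDEADBEEF_DEADBEEF_DEADBEEF_DEADBEEF;
      wire debug_unlock = 1'b1;
    endmodule
    """
    for lineno, rule, text in check_rtl(sample_rtl):
        print(f"line {lineno}: possible {rule}: {text}")
```

In a production flow, findings like these would feed back to the designer alongside the assertion-based formal analysis and the co-simulation and emulation scenarios described above.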

What keeps your customers up at night?

That depends on the customer. Those that are new to the use of AI technology for chip design may not realize the risks they are facing. Here, we provide training to create a deeper understanding of the risks and how to address them.

Design teams that are already using AI in the design flow better understand the risks they face. What concerns these teams usually falls into two categories:

  1. How can I protect the data I am using to train my LLMs, and how can I use those LLMs more effectively?
  2. How can I be sure my design is resistant to current and future cyberattacks?

Caspia’s security platform addresses both of these issues.

What does the competitive landscape look like and how do you differentiate?

There are many commercially available IP blocks that claim various degrees of security hardness. There are also processes, procedures, and tools that can help improve the design process.

But there is no fully integrated platform that uses GenAI to verify that new chip designs are as secure as they can be. This unique technology is only available from Caspia.

What new features/technology are you working on?

Our current security verification platform is a great start. This technology has received an enthusiastic response from both small and very large companies. The ability to add security verification to your existing design flow is quite compelling. No new design flow, just new security insights.

We are constantly updating our GenAI models to ensure we are tracking and responding to the latest threats. Beyond that, we will be adding additional ways to check a design. Side-channel assessment, fault injection, IP protection and silicon backside protection are all on our roadmap.

How do customers normally engage with your company?

We are starting to be visible at more industry events. For example, we are a Corporate Sponsor at the IEEE International Symposium on Hardware Oriented Security and Trust (HOST) in San Jose on May 5-8 this year, and Dr. Tehranipoor will be giving a SkyTalk at DAC in June. You can stop by and see us at these events. You can also reach out to us via our website here. We’ll take it from there.
