Interface IP in 2022: 22% YoY growth still data-centric driven
by Eric Esteve on 09-04-2023 at 10:00 am

We showed in the 2022 “Design IP Report” that the wired interface IP category is a growing share of the total IP market, a trend confirmed year after year: the category has moved from an 18% share in 2017 to 25% in 2022.

During the 2010s, the smartphone was the strongest driver for the IP industry, pushing the CPU and GPU categories and interface protocols like LPDDR, USB and MIPI. Since 2018, and again in 2022, the new drivers are data-centric applications, including servers, datacenters, wired and wireless networking and emerging AI. All these applications share the need for ever higher bandwidth, in terms of both speed and volume. This translates into high-speed memory controllers (DDR5, HBM or GDDR6) and faster releases of interface protocols (PCIe 5, 400G and 800G Ethernet, 112G SerDes). We think this trend will be confirmed during the 2020s. It can be illustrated by comparing TSMC revenues by platform in 2022 and 2020: HPC has grown from 33% to 41%, while smartphone has declined from 47% to 33%!

As usual, IPnest has made a five-year forecast (2023-2027) by protocol and computed the CAGR for each (picture below). As you can see, most of the growth is expected to come from three categories, PCIe, memory controller (DDR) and Ethernet & D2D, exhibiting five-year CAGRs of 19.2%, 18.8% and 22.3%, respectively. This should not be surprising, as all these protocols are linked with data-centric applications! The Top 5 protocols weighed $1,440 million in 2022 and are forecast to reach $3,500 million in 2027, a CAGR of about 19%.
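As a quick sanity check on those endpoints, the CAGR can be recomputed directly. Here is a minimal sketch in Python; the dollar figures are the ones quoted above, everything else is plain arithmetic:

    def cagr(start_value: float, end_value: float, years: int) -> float:
        """Compound annual growth rate between two values `years` apart."""
        return (end_value / start_value) ** (1.0 / years) - 1.0

    # Top 5 interface IP protocols: $1,440M in 2022 -> $3,500M forecast for 2027
    print(f"{cagr(1440, 3500, 5):.1%}")  # prints 19.4%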

Conclusion

Synopsys has built a strong position on every protocol and in every application, enjoying more than 55% market share, by making strategic acquisitions since the early 2000s and by offering integrated solutions, PHY and controller. We still don’t see any competitor in a position to challenge the leader.

In 2020 we saw the emergence of Alphawave Semi, which built a strong position in the high-end interface IP segment (thanks to PAM4 DSP SerDes), creating a “Stop-for-Top” strategy in opposition to Synopsys’ “One-Stop-Shop”. If we consider that this high-end segment, strongly driven by HPC (including datacenter, AI, storage, etc.), is expected to grow considerably during the 2020s, Alphawave Semi could enjoy a 25% market share of this $3 billion sub-segment by 2027, making a revenue of $600 to $800 million realistic. At that point, Synopsys revenues could be close to the $2 billion range for interface IP alone.

In 2023, we think a major strategy change will unfold during the decade. IP vendors focused on high-end IP architecture will try to develop a multi-product strategy and market ASICs, ASSPs and chiplets derived from their leading IP (PCIe, CXL, memory controller, SerDes…). Some have already started, like Credo, Rambus or Alphawave. Credo and Rambus already see significant revenues from ASSPs, but we will have to wait until 2025, at best, to see measurable results from chiplets. The interesting question is whether Synopsys or Cadence will adopt this new strategy or wait until its success has been proven before making a decision (by acquisition, if they want a share of this multi-product strategy).

This is the 15th version of the survey, started in 2009 when the interface IP market was $250 million (it reached $1,650 million in 2022), and we can affirm that the five-year forecasts have stayed within a +/- 5% error margin!

In 2023, IPnest predicts that the interface IP category will be in the $3,750 million range (+/- $200 million) in 2027, and we consider this forecast realistic.

If you’re interested in this “Interface IP Survey” released in July 2023, just contact me:

eric.esteve@ip-nest.com .

Eric Esteve from IPnest

Also Read:

Design IP Sales Grew 20.2% in 2022 after 19.4% in 2021 and 16.7% in 2020!

Interface IP in 2021: $1.3B, 22% growth and $3B in 2026

Stop-For-Top IP Model to Replace One-Stop-Shop by 2025


Former TSMC President Don Brooks
by Daniel Nenni on 09-04-2023 at 6:00 am

Don Brooks is well known to many long-time semiconductor insiders, like myself, but most SemiWiki readers have probably never heard of him. Don is a semiconductor legend, and here is his story. It will be in two parts, since he had a big impact on the semiconductor industry and TSMC. From 1991 to 1997 Don served as President of TSMC and helped grow the nascent company into what it is today: the world’s largest semiconductor foundry, with a market capitalization of $500B.

Don Brooks passed away in 2013, and here is the story from his memorial. If you read between the lines you can get a real sense of who Don really was: a very intelligent, driven semiconductor professional of the highest caliber.

Don graduated from Sunset High School in 1957 and was a key player on their basketball team, which won the City Championship his senior year. Don attended Tarleton State College on a basketball scholarship his freshman year. He married his high school sweetheart in 1958 and enrolled in SMU under a co-op program with Texas Instruments.

He happened to be assigned to TI’s Research Lab during the time when Jack Kilby invented and developed the integrated circuit. Consequently, his entire 25-year career at TI focused on the commercialization and production of semiconductors. He rose rapidly through the ranks of TI’s management and became the youngest man ever promoted to Senior Vice President at Texas Instruments. Under his leadership, TI developed a reputation as the world’s leading supplier of MOS memories.

In 1983 he became President & CEO of Fairchild Semiconductor in Mountain View, CA. He founded KLM Capital in 1988 and served as its Chairman for years. Don joined TSMC as President in 1991. During his tenure, TSMC returned to profitability and grew to become the world’s largest independent semiconductor fabrication company.

Morris Chang, founder and Chairman of TSMC, had these words to say about Don’s tenure as President of the company: “Since his arrival in 1991, Don Brooks has provided dramatic leadership that built TSMC into the world’s most successful dedicated foundry.” TSMC grew at an average annual rate of 54% during Don’s time as President and achieved record profits.

Following TSMC, he was a board member of United Microelectronics Corporation of Taiwan (NYSE: UMC, TSE: 2303), having served as its President and co-Chief Executive Officer from 1997 to 1999. From what I understand, Morris Chang and Don had a disagreement (a broken promise), and moving across the street to UMC was Don’s way of resolving it.

In addition to his success as a senior executive, Don also had significant success as a private investor, including, but not limited to:

Don was the first outside investor in Silicon Labs (NASDAQ: SLAB), one of the premier success stories of the Austin high-tech boom, and the first outside investor in Broadcom (NASDAQ: BRCM), one of the most successful semiconductor startups of all time.

One thing that impresses me about Don and other semiconductor legends is their dedication to family. In my opinion, fifty-plus-year marriages show true character, compassion, and the ability to compromise. My father’s parents were married for 72 years; I saw it firsthand, and it is something I aspire to.

There is a lot more to Don’s TSMC story of course and that is what I will cover in Part II.

Also Read:

How Taiwan Saved the Semiconductor Industry

Morris Chang’s Journey to Taiwan and TSMC

How Philips Saved TSMC

The First TSMC CEO James E. Dykes


Podcast EP179: An Expert Panel Discussion on the Move to Chiplets
by Daniel Nenni on 09-01-2023 at 10:00 am

Dan is joined by a panel of experts to discuss chiplets and 2.5/3D design. The panelists are Saif Alam, Vice President of Engineering at Movellus Inc.; Tony Mastroianni, Advanced Packaging Solutions Director at Siemens EDA; and Craig Bishop, CTO at Deca Technologies.

In this spirited and informative discussion the panel explores the move to chiplets. Why it’s happening now and who can benefit from the trend are discussed in detail, along with considerations for ecosystem management, design methodology, the role of standards, and addressing the risks associated with this new design style.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


The Incredible Journey of Analog Bits Through the Eyes of Mahesh Tirupattur
by Mike Gianfagna on 09-01-2023 at 6:00 am

If you’ve designed a chip with analog content (and who hasn’t?), you know Analog Bits. Along the way, you likely met Mahesh. If you are a lover of fine wines, you probably know Mahesh quite well. More on that later. I got the opportunity to speak with him recently about what he’s been up to, both now and over the past few years. It’s a story about a love for technology and a love for wine. If you believe that wine is an art form, then the statement “life imitates art” is very relevant to what follows. Read on to learn about the incredible journey of Analog Bits through the eyes of Mahesh Tirupattur.

Wine, and Life Imitating Art

Just like in the high-tech world of Silicon Valley, there are many M&A transactions occurring in Napa Valley and beyond. Private equity firms are acquiring and consolidating many of the large wineries we’ve come to know and love over the years. While the details of these transactions are often not public, I do know a few facts about some of the larger ones, thanks to my love for wine and the connections I’ve made along the way.

The model for several of these deals is quite unusual. A private equity firm will acquire a controlling interest in a winery and then essentially do nothing, allowing the original creativity to flow unhindered. The message to the owners is simple: we love your wine and the brand you’ve built. Please continue to do what you do regarding making your product. We’ll worry about the operational details. And when you’re ready to step down, just call us and we’ll be ready to take over. Until then, stay focused on your passion.

This laissez-faire acquisition strategy from the wine industry has found its way to other transactions as well. A case in point is the acquisition of Analog Bits by SEMIFIVE about a year and a half ago. As covered on SemiWiki here, a variety of business models and foundry relationships comprise the combination of these two companies. SEMIFIVE is the pioneer of platform-based SoC design, working with customers to implement innovative ideas into custom silicon in the most efficient way. The company has a close relationship with Samsung Foundry. Analog Bits is the leader in developing and delivering low-power integrated clocking, sensor and interconnect IP that is pervasive in virtually all of today’s semiconductors. The company has not only developed IP on the Samsung process, but also has a close and growing relationship with TSMC.

One approach (and a common one) would be to combine the operations of both companies into one model with one set of relationships. That would cause a significant ripple effect in one or both of these company’s businesses, and not a good ripple effect. Rather than do that, SEMIFIVE took a page out of the winery acquisition playbook being used in Napa Valley and elsewhere.

Analog Bits continues to operate as an independent entity, but now as part of a larger enterprise. The company continues to do the things it loves to do, providing critical enabling IP that its customers need. Dan Nenni summarized it well in the SemiWiki post:

To me this acquisition is another 1+1=3. SEMIFIVE gets a strong IP base in North America plus foundry and customer relationships that have been silicon proven for 20+ years. Analog Bits gets the ability to scale rapidly and increase the depth and breadth of their IP offering.

I mentioned a connection between Mahesh and wine earlier. It turns out he is quite an accomplished sommelier as well as a technologist, having completed three of the four levels that pave the way to Master Sommelier. While there is still more road ahead for Mahesh to achieve this ultimate title, his progress while also building a very successful IP business is noteworthy. There are 269 Master Sommeliers in the world today, so this is truly a rare achievement. Mahesh has also become an expert in the making of sake, which he claims is far more complex and nuanced than wine.

Perhaps this is the topic of a future blog post or podcast.

The Road Ahead

During my discussions with Mahesh, it was quite clear that he was happy with the outcome of the acquisition. The ability to continue to operate independently, doing what he loves with the backing of a larger enterprise, feels good. I can imagine the winemakers that were part of the Napa Valley acquisitions saying the same thing.

He talked about the great position Analog Bits enjoys in the development of purpose-built IP blocks for various high-growth markets. The track record and customer-focused nature of the company make this a great match. Mahesh talked about many new market opportunities. One interesting one is power management and spike detection. With so many cores and power domains in advanced designs, often fueled by AI, power spikes have become a very real liability. Analog Bits is developing on-chip IP to sense and manage these events.

Overall, Analog Bits is becoming more “sticky” for advanced designs thanks to their broad catalog and excellent track record. According to Mahesh, the future is bright and a larger operation at Analog Bits seems likely. And that’s just part of the incredible journey of Analog Bits through the eyes of Mahesh Tirupattur.



ISO 21434 for Cybersecurity-Aware SoC Development
by Kalar Rajendiran on 08-31-2023 at 10:00 am

The automotive industry is undergoing a remarkable transformation, with vehicles becoming more connected, automated, and reliant on software. While these advancements promise convenience, comfort and efficiency to the consumers, the nature and complexity of the technologies also raise concerns for functional safety and security. The ISO 26262 standard was established for ensuring a systematic approach to functional safety in the automotive industry. This standard provides a comprehensive framework for managing functional safety throughout the entire product development lifecycle, including concept, design, implementation, production, operation, maintenance, and decommissioning. It offers guidance on hazard analysis, risk assessment, safety goals, safety mechanisms, and verification and validation processes to ensure that electronic systems function as intended and maintain safety even in the presence of faults or errors.

The ISO 26262 standard addresses impact to safety due to faults and failures. What about addressing factors such as cybersecurity? The soaring adoption of electronics in the automotive sector has led to a corresponding expansion in the cybersecurity threat landscape. As vehicles become more connected and reliant on software-driven functionality, the attack surface expands significantly. This convergence of technological advancement and risk underscores the critical importance of cybersecurity-aware development practices. Road vehicles rely heavily on communication between components and external systems, making them susceptible to various cyber risks. Over-the-Air (OTA) software updates dramatically increase cybersecurity risks. Hackers could potentially manipulate sensor data, compromise vehicle control systems, or gain unauthorized access to sensitive personal information. The ISO/SAE 21434 Road Vehicles—Cybersecurity Engineering standard was established to address the security challenges posed by cyberthreats to road vehicles.

Synopsys has recently published a whitepaper that delves into ISO 21434-driven best practices for cybersecurity-aware SoC development. Anyone involved in the development and post-production support of automotive-related products and systems will find this whitepaper very informative. Following are some excerpts.

Key Aspects of ISO 21434

The ISO 21434 standard provides a structured approach to identifying, assessing, and mitigating cybersecurity risks throughout the development of automotive products, including components like SoCs. This comprehensive framework builds upon similar principles of ISO 26262 to address the cybersecurity dimension. The alignment between these two standards not only streamlines the integration of cybersecurity practices but also establishes a common vocabulary, ensuring seamless adaptation for organizations already compliant with ISO 26262.

Organizational Responsibilities

ISO 21434 follows in the footsteps of ISO 26262 by delineating roles and responsibilities across various stages of product development. This includes the commitment of executive management, the establishment of standardized roles between suppliers and supply chain entities, the creation of distinct phases within the product life cycle, and the formulation of Threat Analysis and Risk Assessment (TARA) processes equivalent to Hazard Analysis and Risk Assessment (HARA) in ISO 26262.

Cybersecurity Risk Assessment and Management

Cybersecurity hinges on a thorough assessment of a product’s inherent risks and its vulnerabilities when deployed. Four critical factors govern the severity of a cybersecurity risk, enabling an informed approach to risk mitigation. These four key factors are the Threat Scenario, Impact, Attack Vector, and Attack Feasibility. Together, these factors determine the potential harm, enabling a structured evaluation of the risk’s impact and the need for intervention. In essence, the Threat Scenario and its Impact gauge potential damage, the Attack Vector factor maps how an attack could be executed, while the Feasibility factor evaluates the ease of enacting the attack. ISO 21434 offers techniques for calculating the risk score from these four factors and elucidates a structured approach for fostering a proactive stance against cyberattacks.
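To make the mechanics concrete, here is a minimal, hypothetical sketch of how a risk score could be derived from those four factors. The numeric scales and the multiplicative combination are illustrative assumptions of mine; ISO 21434 defines its own impact and attack-feasibility rating tables, which should be consulted for real assessments:

    from dataclasses import dataclass

    @dataclass
    class ThreatScenario:
        name: str
        impact: int       # 1 (negligible) .. 4 (severe) -- illustrative scale
        feasibility: int  # 1 (very hard) .. 4 (easy); folds in how reachable
                          # the attack vector is and how practical the attack is

    def risk_score(ts: ThreatScenario) -> int:
        """Toy risk matrix: risk grows with both impact and feasibility."""
        return ts.impact * ts.feasibility  # 1 .. 16

    ota = ThreatScenario("Tampered OTA update image", impact=4, feasibility=2)
    print(ota.name, "->", risk_score(ota))  # 8 -> mitigate before release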

Security by Design

The Secure Development Lifecycle (SDL) process championed by Microsoft to address cybersecurity permeates all facets of product development. SDL orchestrates a number of measures during the design phase to safeguard products against potential vulnerabilities. At the heart of this phase lies the mandate to generate concrete evidence affirming the integration of the secure practices the team has been trained in. This evidence encompasses a spectrum of reviews and metrics, from security design reviews and verification plan assessments to privacy design reviews. Tools such as Synopsys Coverity and Black Duck play pivotal roles, generating code coverage and composition analysis reports. These reports help gauge the codebase’s maturity while flagging vulnerabilities in third-party components.

Collaboration and Communication

In the interconnected world of today’s product development, cybersecurity cannot operate in isolation. A collaborative approach is imperative, demanding a cohesive and cybersecurity-aware handshake between every link in the supply chain. The collaborative mindset guides the development cycles, necessitating an ongoing flow of cybersecurity information among supply chain entities.

Cybersecurity agreement in supply chain

Continuous Monitoring and Updating

Continuously monitoring products to identify known vulnerabilities, and updating them accordingly, ensures cybersecurity from the product’s release to its decommissioning. Post-release support is a focal point in SDL’s continuum: it mandates the specification of requirements for post-production security controls. This preparation equips the product to navigate the complexities of its operational environment and supply chain.

Summary

Given the surge in electronics adoption in road vehicles and the evolving landscape of cyberattack threats, customers are demanding cybersecurity assurances. Cybersecurity impacts every level of the automotive supply chain, starting with semiconductor SoCs. For component suppliers, embracing standardized cybersecurity principles and processes becomes a strategic imperative to remain competitive in the dynamic automotive market. By adhering to these evolving industry standards, suppliers can not only address the growing cybersecurity concerns but also meet mounting customer expectations for robust cybersecurity assurance.

During development of complex SoCs, partnering with an IP supplier with a structured ISO 21434 development platform minimizes cybersecurity risks and supports the highest levels of success. Synopsys develops IP products per the ISO 21434 standard and rigorously follows the cybersecurity policies, processes and procedures promulgated in the standard. The company deploys cybersecurity teams at all levels of the organization.

Cybersecurity teams through all levels of an organization

For more details, visit Accelerate Your Automotive Innovation page.

You can access the entire whitepaper here.

Also Read:

Key MAC Considerations for the Road to 1.6T Ethernet Success

AMD Puts Synopsys AI Verification Tools to the Test

WEBINAR: Why Rigorous Testing is So Important for PCI Express 6.0


Anomaly Detection Through ML. Innovation in Verification
by Bernard Murphy on 08-31-2023 at 6:00 am

Assertion-based verification only catches problems for which you have written assertions. Is there a complementary approach to find problems you haven’t considered – the unknown unknowns? Paul Cunningham (Senior VP/GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.

The Innovation

This month’s pick is Machine Learning-based Anomaly Detection for Post-silicon Bug Diagnosis. The paper was published at the 2013 DATE conference. The authors are/were from the University of Michigan.

Anomaly detection methods are popular where you can’t pre-characterize what you are looking for, in credit card fraud for example, or in real-time security, where hacks continue to evolve. The method gathers behaviors over a trial period, manually screened to be considered within expected behavior, then looks for outliers in ongoing testing as potential problems for closer review.

Anomaly detection techniques either use statistical analyses or machine learning. This paper uses machine learning to build a model of expected behavior. You could also easily imagine this analysis being shifted left into pre-silicon verification.

Paul’s view

This month we’ve pulled a paper from 10 years ago on using machine learning to try to automatically root-cause bugs in post-silicon validation. It’s a fun read and looks like a great fit for revisiting now using DNNs or LLMs.

The authors equate root-causing post-silicon bugs to credit card fraud detection: every signal traced in every clock cycle can be thought of as a credit card transaction, and the problem of root causing a bug becomes analogous to identifying a fraudulent credit card transaction.

The authors’ approach goes as follows: divide the simulations into time slices and track the percentage of time each post-silicon traced debug signal is high in each time slice. Then partition the signals based on the module hierarchy, aiming for a module size of around 500 signals. For each module in each time slice, train a model of the “expected” distribution of signal %high times using a golden set of bug-free post-silicon traces. This model is a very simple k-means clustering of the signals, using the difference in %high times as the “distance” between two signals.

For each failing post-silicon test, the %high signal distribution for each module in each time slice is compared to the golden model, and the number of signals whose %high time is outside the bounding box of its golden model cluster is counted. If this number is over a noise threshold, then those signals in that time slice are flagged as the root cause of the failure.
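As a rough illustration of the clustering step, here is a minimal sketch in Python. This is not the authors’ code: the synthetic trace, the cluster count, and the use of scikit-learn’s KMeans are my assumptions, and the “bounding box” test is simplified to a per-cluster min/max check on the single %high feature:

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    def pct_high(trace):
        """trace: (n_signals, n_cycles) array of 0/1 samples ->
        fraction of cycles each signal is high in this time slice."""
        return trace.mean(axis=1)

    # Synthetic stand-in for a golden (bug-free) trace of a ~500-signal module.
    n_signals, n_cycles = 500, 2000
    golden = (rng.random((n_signals, n_cycles)) <
              rng.random((n_signals, 1))).astype(int)

    X = pct_high(golden).reshape(-1, 1)          # cluster signals by %high
    km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)

    # Per-cluster min/max "bounding box" of expected %high values.
    lo = np.array([X[km.labels_ == c, 0].min() for c in range(8)])
    hi = np.array([X[km.labels_ == c, 0].max() for c in range(8)])

    def anomalous_signals(failing_trace, noise_threshold=5):
        """Flag signals whose %high falls outside their nearest golden cluster."""
        f = pct_high(failing_trace).reshape(-1, 1)
        c = km.predict(f)                        # nearest golden cluster
        out = (f[:, 0] < lo[c]) | (f[:, 0] > hi[c])
        return np.flatnonzero(out) if out.sum() > noise_threshold else np.empty(0, int)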

It’s a cool idea, but on the ten OpenSPARC testcases benchmarked, 30% of the tests do not report the correct time slice or signals, which is way too high to be of any practical use. I would love to see what would happen if a modern LLM or DNN were used instead of simple k-means clustering.

Raúl’s view

This is an “early” paper from 2013 using machine learning for post-silicon bug detection. For the time this must have been advanced work; it is listed with 62 citations in Google Scholar.

The idea is straightforward: run a test many times on a post-silicon design and record the results. When intermittent bugs occur, different executions of the same test yield different results, some passing and some failing. Intermittent failures, often due to on-chip asynchronous events and electrical effects, are among the most difficult to diagnose. The authors briefly consider using supervised learning, in particular one-class learning (only positive training data is available, since bugs are rare), but discard it as “not a good match for the application of bug finding”. Instead, they apply k-means clustering: similar results are grouped into k clusters consisting of “close” results, minimizing the sum-of-squares distance within clusters. The paper reveals numerous technical details necessary to reproduce the results. Results are recorded as the “fraction of time the signal’s value was one during the time step”. The number of signals in a design, on the order of 10,000, is the dimensionality of the k-means clustering, which is NP-hard with respect to the number of dimensions, so the number of signals is capped at 500 using principal component analysis. The number of clusters can’t be too small (underfitting) nor too large (overfitting), and a proper anomaly detection threshold needs to be picked, expressed as the percentage of the total failing examples under consideration. Finally, time localization of a bug is achieved by two-step anomaly detection: first identifying which time step presents a sufficient number of anomalies to reveal the occurrence of a bug, and then, in a second pass, identifying the responsible bug signals.

Experiments on an OpenSPARC T2 design of about 500M transistors ran 10 workloads, with test lengths ranging between 60,000 and 1.2 million cycles, 100 times each as training. The authors then injected 10 errors and ran 1,000 buggy tests. On average, 347 signals were detected for a bug (ranging from none to 1,000), and it took ~350 cycles of latency from bug injection to bug detection. The number of clusters and the detection threshold strongly influence the results, as does the quantity of training data. False positives and false negatives added up to 30-40 (in 1,000 buggy tests).

Even though the authors observe that “Overall, among the 41,743 signals in the OpenSPARC T2 top-level, the anomaly detection algorithm identified 347, averaged over the bugs. This represents 0.8% of the total signals. Thus, our approach is able to reduce the pool of signals by 99.2%”, in practice this may not be of great help to an experienced designer. Ten years have passed; it would be interesting to repeat this work using today’s machine learning capabilities, for example LLMs for anomaly detection.


RISC-V 64 bit IP for High Performance
by Daniel Payne on 08-30-2023 at 10:00 am

RISC-V as an Instruction Set Architecture (ISA) has grown quickly in commercial importance and relevance since its release to the open community in 2015, attracting many IP vendors that now provide a variety of RTL cores. Roger Espasa, CEO and founder of Semidynamics, has presented at RISC-V events on how their IP is customized for compute challenges that require high-bandwidth, high-performance cores with vector units. Semidynamics was founded in 2016, is headquartered in Barcelona, and already has customers in the US and Asia, offering two customizable RISC-V IPs:

  • Avispado – in-order RISCV64GCV, supporting AXI and CHI
  • Atrevido – out-of-order RISCV64GC, supporting AXI and CHI

A typical CPU has a handful of big cores and large caches, making it easy to program, though not high performance.

GPUs, by contrast, have many tiny cores that provide high performance for parallel code, but are harder to program and add communication latency through the PCIe bus when data needs to be passed back and forth between the CPU and the GPU.

CPU, GPU comparison

The approach at Semidynamics is to use a RISC-V core connected to compute cores, which makes it easy to program, gives higher performance on parallel code, and offers zero communication latency. A CPU plus a vector unit provides the best of both worlds.

CPU plus Vector unit

The RISC-V specification defines 32 vector registers, and you can add a number of vector cores, along with a connection to your cache, inside a vector unit.

Vector Unit

With Semidynamics IP you can customize the number of vector cores: 4, 8, 16 or 32. Each vector core contributes a 64-bit slice of datapath, so 4 vector cores give a 256-bit vector unit and 32 vector cores give a 2,048-bit one.

IP users also choose which data types to support: FP64, FP32, FP16, BF16, INT64, INT32, INT16, INT8. An AI application might choose FP16 and BF16, while an HPC application could select FP64 and FP32.

The third customization is the vector register length: for more performance and lower power, you can make the vector register bigger than the vector unit.

Here’s the block diagram of the Atrevido 423-V8:

Atrevido 423 + V8 Vector Unit

The vector unit is fully out-of-order, which is unique among RISC-V IP vendors. The combination of the vector unit plus the Gazzillion unit can stream data at over 60 bytes/cycle.

High Bandwidth: Vector + Gazzillion

The purple line shows the read performance: in the L1 cache it’s 20-60 bytes/cycle. Other machines show a rapid drop in bandwidth after leaving the L1 cache, while this approach keeps going, flattening at 56 bytes/cycle. Even going to DDR memory shows a bandwidth of 40 bytes/cycle; with a clock rate of 1.0 GHz, that makes 40 GB/s of bandwidth.

IP customers can even add their own RTL code connected to the Vector Unit for their own purposes.

Performance of matrix multiplication is important in AI workloads. On the out-of-order V8 vector unit there is a peak of 16 FP64 FLOPS/cycle, and 99% of peak is reached for matrix sizes >= 400. For a small 24×24 matrix the performance is 7 FP64 FLOPS/cycle, or about 44% of peak. Matrix multiplication in FP16 using a vector unit with 8 vector cores has a peak of 64 FP16 FLOPS/cycle, with 99% of peak for M >= 600.
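Those peak figures are consistent with a simple lane-count calculation. Here is a minimal sketch; the assumptions (a 64-bit datapath slice per vector core, a fused multiply-add counting as 2 FLOPs per lane per cycle) are mine, inferred from the numbers above rather than taken from Semidynamics documentation:

    def peak_flops_per_cycle(vector_cores: int, element_bits: int,
                             flops_per_lane: int = 2) -> int:
        """Peak FLOPS/cycle = SIMD lanes * FLOPs per lane (FMA = 2)."""
        datapath_bits = vector_cores * 64        # 64-bit slice per vector core
        lanes = datapath_bits // element_bits
        return lanes * flops_per_lane

    print(peak_flops_per_cycle(8, 64))   # FP64 on a V8 unit: 16 FLOPS/cycle
    print(peak_flops_per_cycle(8, 16))   # FP16 on a V8 unit: 64 FLOPS/cycle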

A real-time object detection benchmark called YOLO (You Only Look Once) was run on the Atrevido 423-V8 platform, and it showed 58% higher performance per vector core than competitors. These results were for a 24-layer video network requiring 5.56 Gops/frame, with about 9M parameters.

YOLO Comparison

Summary

Choosing a RISC-V IP vendor is a complicated task, so knowing about vendors like Semidynamics can help you better understand how a customized approach could run your specific workloads most efficiently. With Semidynamics you get to choose between architectural options like in-order or out-of-order, with or without vector units. The reported numbers from this IP vendor look promising, and I look forward to their future announcements.

Also Read:

Deeper RISC-V pipeline plows through vector-scalar loops

RISC-V Summit Buzz – Semidynamics Founder and CEO Roger Espasa Introduces Extreme Customization

Configurable RISC-V core sidesteps cache misses with 128 fetches


Modeling EUV Stochastic Defects with Secondary Electron Blur
by Fred Chen on 08-30-2023 at 8:00 am

Extreme ultraviolet (EUV) lithography is often represented as benefiting from its 13.5 nm wavelength (actually a range of wavelengths, mostly ~13.2-13.8 nm), when in fact it works through the action of secondary electrons: electrons released by photoelectrons, which are themselves released by ionization from absorbed EUV (~90-94 eV) photons. The photons are absorbed not only in the photoresist film but also in the layers underneath. The released electrons migrate varying distances from the point of absorption, losing energy in the process.

These migration distances can exceed 10 nm [1-2]. Consequently, images formed by EUV lithography are subject to an effect known as blur. Blur can be most basically understood as the reduction of the difference between the minimum and maximum chemical response of the photoresist. Blur is often modeled as a Gaussian function convolved with the original optical image [3-4].

In such modeling, however, it is often neglected to mention that the blur scale length, often referred to as sigma, is not a fundamentally fixed number but belongs to a distribution [5]. This is consistent with the fact that a higher EUV dose leads to a larger observed blur [2,5]: more released electrons allow a larger range of traveled distances [2,6]. Note that pure chemical blur from diffusion does not show the same dose dependence [3,7].

It was recently demonstrated that secondary electron blur increasing with dose can lead to the observed stochastic defects in EUV lithography [8]: the higher dose leads to a wider allowed range of blur.

Local base blur range at different doses, taken at different probabilities from the base blur probability distribution.

The simulation model combines three stages of random number generation: (1) photon absorption, (2) secondary electron yield, and (3) electron dose-dependent blur range. Unexposed stochastic defects dominate at low doses, where too few photons are absorbed. Exposed stochastic defects dominate at higher doses, where rare (e.g., probability ~1e-8) ultrahigh (>10 nm) blur promotes too much secondary electron exposure near the threshold value for printing.
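To illustrate the structure of such a model, here is a minimal 1D Monte Carlo sketch of the three random stages. It is not the model of [8]: the dose scale, the electron yield of ~4 per photon, and the dose-dependent gamma law for the blur sigma are placeholder assumptions chosen only to show the mechanics:

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    rng = np.random.default_rng(1)

    nm_per_px = 0.5
    x = np.arange(0, 64, nm_per_px)                  # 64 nm 1D field
    aerial = 0.5 + 0.5 * np.cos(2 * np.pi * x / 32)  # 32 nm pitch optical image

    def exposure(dose):
        # (1) photon shot noise: Poisson absorption per pixel
        photons = rng.poisson(dose * aerial)
        # (2) secondary electron yield: Poisson, ~4 electrons/photon (placeholder)
        electrons = rng.poisson(4.0 * photons).astype(float)
        # (3) dose-dependent blur: sigma drawn per trial from a distribution
        #     whose mean and tail widen with dose (placeholder law)
        sigma_nm = rng.gamma(shape=4.0, scale=0.5 + 0.02 * dose)
        return gaussian_filter1d(electrons, sigma=sigma_nm / nm_per_px)

    # An "exposed defect": a nominally dark pixel crosses the printing threshold.
    trials = [exposure(dose=20.0) for _ in range(1000)]
    dark_px = int(np.argmin(aerial))
    thresh = np.mean([t.mean() for t in trials])     # placeholder threshold
    print("exposed-defect rate:", np.mean([t[dark_px] > thresh for t in trials]))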

Higher blur makes it easier for smaller stochastic dose variations to cross the printing threshold, enabling exposed or unexposed defects.

One consequence of defects being caused both by insufficient photon absorption at low dose and by dose-increased blur at high dose is the emergence of a floor or valley for stochastic defects, preventing them from being absent entirely.

At lower dose or exposed CD there tend to be unexposed defects, while at higher dose or exposed CD there tend to be exposed defects. This results in a floor or valley for stochastic defect occurrence.

Another way to interpret the defect floor or valley is that the enlarged blur range at low enough probability increases the entropy significantly and damages the image across all possible printing thresholds.

With the much larger blur range at low enough probabilities (1e-9 in this example), there is significant entropy in the image and the image is damaged regardless of printing threshold. At more commonly observed probabilities (e.g., 1e-1), the image preserves its usual appearance. Note: the raw pixel images were smoothed for better visualization.

It is therefore very risky not to include dose-dependent secondary electron blur ranges in any model of EUV lithography image or defect formation.

References

[1] I. Bespalov, “Key Role of Very Low Energy Electrons in Tin-Based Molecular Resists for Extreme Ultraviolet Nanolithography,” ACS Appl. Mater. Interfaces 12, 9881 (2020).

[2] S. Grzeskowiak et al., “Measuring Secondary Electron Blur,” Proc. SPIE 10960, 1096007 (2019).

[3] D. Van Steenwinckel et al., “Lithographic Importance of Acid Diffusion in Chemically Amplified Resists,” Proc. SPIE 5753, 269 (2005).

[4] T. Brunner et al., “Impact of resist blur on MEF, OPC, and CD control,” Proc. SPIE 5377, 141 (2004).

[5] A. Narasimhan et al., “Studying secondary electron behavior in EUV resists using experimentation and modeling,” Proc. SPIE 9422, 942208 (2015).

[6] M. Kotera et al., “Extreme Ultraviolet Lithography Simulation by Tracing Photoelectron Trajectories in Resist,” Jpn. J. Appl. Phys. 47, 4944 (2008).

[7] M. Yoshii et al., “Influence of resist blur on resolution of hyper-NA immersion lithography beyond 45-nm half-pitch,” J. Micro/Nanolith. MEMS MOEMS 8, 013003 (2009).

[8] F. Chen, “EUV Stochastic Defects from Secondary Electron Blur Increasing With Dose,” https://www.youtube.com/watch?v=Q169SHHRvXE, 8/20/2023.

This article first appeared in LinkedIn Pulse: Modeling EUV Stochastic Defects With Secondary Electron Blur

Also Read:

Enhanced Stochastic Imaging in High-NA EUV Lithography

Application-Specific Lithography: Via Separation for 5nm and Beyond

ASML Update SEMICON West 2023


Arm Inches Up the Infrastructure Value Chain
by Bernard Murphy on 08-30-2023 at 6:00 am

Arm just revealed their compute subsystems (CSS) direction at Hot Chips, led by CSS N2. The intent behind CSS is to provide pre-integrated, optimized and validated subsystems to accelerate time to market for infrastructure system builders. Think HPC servers, wireless infrastructure, and big edge systems for industry, city, and enterprise automation. This, for me, answers how Arm can add more value for system developers without becoming a chip company. They know their technology better than anyone else; by providing pre-designed, optimized and validated subsystems – cores, coherent interconnect, interrupt, memory management and I/O interfaces, together with SystemReady validation – they can chop a big chunk out of the total system development cycle.

Accelerating Custom Silicon

A completely custom design around core, interconnect, and other IPs obviously provides maximum flexibility and ability to differentiate, but at a cost. That cost isn’t only in development but also in time to deployment. Time is becoming a very critical factor in fast-moving markets – just look at AI and the changes it is driving in hyperscaler datacenters. I have to believe current economic uncertainties compound these concerns.

Those pressures are likely forcing an emphasis on differentiating only where essential and standardizing everywhere else, especially when proven experts can take care of a big core component. CSS provides a very standard yet configurable subsystem for many-core compute, including N2 cores (in this case) and the coherent mesh network between those cores, together with interrupt and memory management, the cache hierarchy, chiplet support through UCIe or custom interfaces, a DDR5/LPDDR5 external memory interface, PCIe/CXL Gen5 for fast and/or coherent IO, expansion IO, and system management.

All of this is PPA-optimized for an advanced 5nm TSMC process and proven SystemReady® with a reference software stack. The system developer still has plenty of scope for differentiation through added accelerators, specialized compute, their own power management, etc.

Neoverse V2

Arm also announced the next step in the Neoverse V-series, unsurprisingly improved over the V1 version with better integer performance and a reduction in system-level cache misses. There are improvements on a variety of other benchmarks as well.

Also noteworthy is its performance in the NVIDIA Grace-Hopper combo (based on Neoverse V2). NVIDIA shared real hardware data with Arm on performance versus Intel Sapphire Rapids and AMD Genoa. In raw performance the Grace CPU was mostly at par with AMD and generally faster than Sapphire Rapids by 30-40%.

Most striking for me was their calculation for a datacenter limited to 5MW, important because all datacenters are ultimately power limited. In this case Grace bested AMD in performance by between 70% and 150% and was far ahead of Intel.

Net value

First, on Neoverse’s contribution to Grace-Hopper: wow. That system is at the center of the tech universe right now, thanks to AI in general and large language models in particular. This is an incredible reference. Second, while I’m sure that Intel and AMD can deliver better peak performance than Arm-based systems, and Grace-Hopper workloads are somewhat specialized, (a) most workloads don’t need high-end performance and (b) AI is getting into everything now. It is becoming increasingly difficult to argue that, for cost and sustainability over a complete datacenter, Arm-based systems shouldn’t play a much bigger role, especially as expense budgets tighten.

For CSS N2, based on their own analysis, Arm estimates up to 80 engineering years of effort to develop the CSS N2 level of integration, a number that existing customers confirm is in the right ballpark. In an engineer-constrained environment, this is 80 engineering years they can drop from their program cost and schedule without compromising whatever secret differentiation they want to add around the compute core.

These look like very logical next steps for Arm’s Neoverse product line: faster performance in the V-series, and letting customers take advantage of Arm’s own experience and expertise in building N2-based compute systems while leaving plenty of room for their own special sauce. You can read the press release HERE.


Visit with Easy-Logic at #60DAC
by Daniel Payne on 08-29-2023 at 10:00 am

I had read a little about Easy-Logic before #60DAC, so this meeting on Wednesday in Moscone West was my first in-person meeting with Jimmy Chen and Kager Tsai to learn about their EDA tools and where they fit into the overall IC design flow. A functional Engineering Change Order (ECO) is a way to revise an IC design by updating the smallest portion of the circuit, avoiding a complete re-design. An ECO can happen quite late in the design stage, causing project delays or even failures, so minimizing this risk and reducing the time for an ECO is an important goal, one that Easy-Logic has productized in a tool called EasylogicECO.

Easy-Logic at #60DAC

This EDA tool flow diagram shows where EasylogicECO fits in with logic synthesis, DFT, low-power insertion, place & route, IC layout and tape-out.

EasylogicECO tool flow

Let’s say your engineering team is coding RTL and finds a bug late in the design cycle. They could make an RTL change and then use EasylogicECO to compare the differences between the two RTL versions and implement the ECO changes, where the output is an ECO netlist plus the commands to control the place & route tools from Cadence or Synopsys.

Another usage example for EasylogicECO is post-tape-out, where a bug is found or the spec changes and you want to make a metal-only ECO change in order to keep mask costs lower.

Easy-Logic is a 10-year-old company based in Hong Kong, and their EasylogicECO tool came out about 5-6 years ago. Most of their customers are in Asia and the names have been kept private, although there are quotes from several companies, such as Sitronix, Phytium, Chipone, Loongson Technology, ASPEED and Erisedtek. Users have designed products for the cell phone, HPC, networking, AI, server, and other high-end segments.

EasylogicECO is used mostly on advanced nodes, such as 7nm and 10nm, where design sizes can reach 5 million instances per block, and functional ECOs are applied at the module and block levels. The tool isn’t really replacing other EDA tools; rather, it fits neatly into existing EDA tool flows, as shown above. EasylogicECO runs on both Unix and Linux boxes, and run times depend on the complexity of the design changes. With a traditional methodology it could take 5 days to update a block with 5 million instances, but with the Easy-Logic approach it can take only 12 hours. The methodology aims to make the smallest patch in the shortest amount of time.

Easy-Logic works at the RTL level. After logic synthesis you basically lose the design hierarchy, which makes it hard to do an ECO. Patents have been issued for the unique approach that EasylogicECO takes by staying at the RTL level.

Engineering teams can evaluate the Easy-Logic approach within a day or two. They’ve made the tool quite easy to use, so there’s a quick learning curve: your inputs are just the original RTL, the revised RTL, the original netlist, the synthesized netlist of the revised RTL, and a library.

With 50 people in the company, you can contact an office in Hong Kong, San Jose, Beijing or Taiwan. 2023 was the company’s first year at DAC. Engineers can use this new ECO approach in four use cases:

  • Functional ECO
  • Low power ECO
  • Scan chain ECO
  • Metal ECO

Summary

SoC design is a very challenging approach to product development where time is money, and last-minute changes like ECOs can make or break the success of a project. Easy-Logic has created a methodology that drastically shortens the time it takes to implement an ECO while staying at the RTL level. I expect to see high interest in their EasylogicECO tool this year, and more customer success stories by DAC 2024.

Related Blogs