
Podcast EP150: How Zentera Addresses Development Security with Mike Ichiriu

by Daniel Nenni on 03-31-2023 at 10:00 am

Dan is joined by Michael Ichiriu, Vice President of Marketing at Zentera Systems. Prior to Zentera Mike was a senior executive at NetLogic Microsystems where he played a critical role in shaping the company’s corporate and product strategy. While there, he built the applications engineering team, and helped lead the organization from pre-revenue to its successful IPO and eventual acquisition by Broadcom.

Dan explores the recent updates to the National Cybersecurity Strategy with Mike, covering the structure and implications of the new requirements. Mike describes the impact the new rules will have on development infrastructure and discusses how Zentera has helped organizations achieve compliance in as little as three months.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Chiplets, is now their time?

by Daniel Nenni on 03-31-2023 at 6:00 am

Bapi, Dan and Sagar

Chiplets appeared on SemiWiki in 2020 and have been a top trending keyword ever since. The question is not IF chiplets will disrupt the semiconductor industry, but WHEN. I have certainly voiced my opinion (pro chiplet), but let’s hear it from the experts. Silicon Catalyst recently sponsored a live panel in Silicon Valley. These types of events were quite common before the pandemic, and it is great to see them coming back. Spending time with industry icons, with food and wine for all, is what networking is all about.

Chiplets, is now their time?
Chiplets have gained popularity in the last few years, and recently VCs (Mayfield) have expressed interest in the technology as well. The first industry symposium on chiplets, held a few weeks ago in San Jose, was very well attended. Work on this technology has been going on for more than 20 years. This informal panel discussed whether chiplets are for real or the next industry fad. It was intended to be the first of a series of events and webinars addressing this topic in 2023.

Participants:
Moderated by Dan Armbrust, co-founder, board director, and initial CEO of Silicon Catalyst. Dan has more than 40 years of semiconductor experience: 26 years with IBM at the East Fishkill, NY and Burlington, VT fabs, followed by roles as president and CEO of SEMATECH, board chairman of PVMC (PhotoVoltaic Manufacturing Consortium), and founder of Silicon Catalyst.

Panelist Dr. Bapi Vinnakota holds a PhD in computer engineering from Princeton. Bapi is a technologist and architect (Intel/Netronome) and academic (University of Minnesota/San Jose State University), and is currently with the Open Compute Project Foundation.

Panelist Sagar Pushpala has 40 years of experience, starting with AMD as a process engineer, followed by National Semiconductor, Maxim, Intersil, TSMC, Nuvia, and Qualcomm. He is now an active advisor, investor, and board member.

The panelists shared their personal experiences, which were quite interesting. The audience was Silicon Catalyst advisors, so the question really is WHEN will the commercial chiplet ecosystem be ready for small to medium companies?

I attended the first Chiplet Summit referenced above and was very impressed with the content and attendance. The next one is mid-June, so stay tuned. I have also spent many hours researching and discussing chiplets with the foundries and their top customers. Xilinx, Intel, AMD, NVIDIA, and Broadcom, amongst others, have implemented the chiplet concept in their internal designs. The point being, chiplets have already been proven in R&D and are in production, which answers the questions of IF and WHEN for the top semiconductor companies.

As to when the commercial chiplet ecosystem will be ready, a laundry list of technical challenges was discussed, including: die-to-die communication, die interoperability, bumping, access to packaging and assembly houses, firmware, software, known good die, system test and test coverage, and EDA and simulation tools to cover multi-physics (electrical, thermal, mechanical). More importantly, these different groups and different companies will have to work together in a whole new chiplet way.

In my opinion this is not as hard as it sounds, and this was also covered. The foundry business is a great example: when we first started going fabless there was no commercial IP market, yet today we have a rich IP ecosystem anchored by foundries like TSMC. Chiplets will follow a similar process, but we really are at the beginning of that process, and that was discussed as well.

An interesting discussion point involved DARPA and the Electronics Resurgence Initiative. To me, chiplets are all about high-volume, leading-edge designs and the ability to reduce design time and cost. But now I also see how the US Government can greatly benefit from chiplets and hopefully become a funding source for the ecosystem.

As much as I like Zoom and virtual conferences, there is nothing like a live gathering. The chiplet discussion will continue, and I highly recommend doing it live whenever possible. The next big ecosystem event is the annual TSMC Technology Symposium; I hope to see you there.

Also Read:

CEO Interview: Dr. Chris Eliasmith and Peter Suma, of Applied Brain Research Inc.

2023: Welcome to the Danger Zone

Silicon Catalyst Angels Turns Three – The Remarkable Backstory of This Semiconductor Focused Investment Group


Securing Memory Interfaces

by Kalar Rajendiran on 03-30-2023 at 10:00 am


News of hackers breaking into systems is becoming commonplace these days. While many of the breaches reported to date may have been due to security flaws in software, vulnerabilities exist in hardware too. As a result, the topic of security is getting increased attention within the semiconductor industry around system-on-chip (SoC) and high-speed data interfaces. The goal is to make sure that data moving across these interfaces is protected from being accessed or manipulated by unauthorized agents.

One type of interface that is proliferating rapidly is related to memories. High-bandwidth interfaces such as DDR continue to increase in transfer speeds with every new generation. It is imperative to secure off-chip dynamic random-access memory (DRAM) interfaces which are vulnerable to certain types of attacks. With increasingly complex systems, securing data should be an integral part of hardware design.

DRAM-specific Vulnerabilities

DRAM-specific vulnerabilities include Row hammer, RAMBleed, and cold-boot attacks. Row hammer attacks are executed by repeatedly reading data in a memory row at high speed. This activity causes bits in adjacent rows to flip, enabling the attacker to gain read-write access to the entire physical memory. RAMBleed uses the same principles as Row hammer but reads the information instead of modifying it to extract information from DRAM, thus threatening the confidentiality of the data stored in memory. With cold-boot attacks, attackers can reset a system, access pre-boot physical memory data to retrieve encryption keys and cause damage. The consequences of these attacks can be severe including identity theft, fraud, financial losses, and the liability expenses to clean up the damage. The attacks can compromise the overall system and its data, resulting in significant reputational damage as well.

Securing Against Above Vulnerabilities

Encryption mitigates Row hammer attacks by making it more difficult for an attacker to target specific physical memory locations. Encrypted data is more difficult to manipulate since it is not in its original form, which makes it harder for the attacker to construct a Row hammer attack. Furthermore, authentication can be used to protect the integrity of data, making it more difficult for an attacker to alter the contents of memory locations. Authentication can also be used to enforce controlled access to specific memory locations, which helps prevent unauthorized access.

To safeguard memory interfaces by design, designers can turn to high-performance, low-latency memory encryption solutions, such as AES-XTS based encryption that can be augmented with cryptographic hashing algorithms, to address both the confidentiality and integrity of data. Encryption covers all the bits, making it nearly impossible to create Row hammer patterns. Refreshing keys and memory encryption can also protect against RAMBleed and cold-boot attacks.
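The core property of tweakable encryption such as AES-XTS is that ciphertext depends on the memory address as well as the key, so identical data stored at two locations produces unrelated ciphertext and an attacker cannot lay down the repeating data patterns Row hammer exploits. The toy sketch below illustrates only this per-address tweak concept, using an HMAC-derived keystream; it is not real AES-XTS, and all names and values are hypothetical.

```python
import hashlib
import hmac

def toy_tweaked_encrypt(key: bytes, address: int, block: bytes) -> bytes:
    """Illustrative only -- NOT AES-XTS. Shows how a per-address 'tweak'
    makes ciphertext depend on where data lives, not just what it is."""
    # Derive an address-dependent keystream (the role of the XTS tweak).
    keystream = hmac.new(key, address.to_bytes(8, "big"), hashlib.sha256).digest()
    # XOR the data block with the keystream; applying it again decrypts.
    return bytes(b ^ k for b, k in zip(block, keystream))

key = b"\x01" * 32
data = b"same 16B secret!"                     # identical plaintext...
c1 = toy_tweaked_encrypt(key, 0x1000, data)
c2 = toy_tweaked_encrypt(key, 0x2000, data)    # ...at a different address
assert c1 != c2                                # ciphertexts differ per address
assert toy_tweaked_encrypt(key, 0x1000, c1) == data  # round-trips correctly
```

A real IME datapath would use AES-XTS with the physical address feeding the tweak computation, but the effect shown here is the same: no address-aligned plaintext pattern survives into DRAM.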

Challenges

Implementing memory encryption comes with a cost, including overhead that impacts power, performance, area (PPA), and latency. Designers must consider the tradeoffs and ensure that security is integrated into the design from the beginning. It is critical for keys to be generated and managed in a trusted/secure area of the SoC and distributed via dedicated channels to the encryption module. Control configuration and readback protection of keys should also be part of the overall security architecture.

Optimal Solution Strategy

DDR and LPDDR interfaces would benefit from inline memory encryption (IME) security just as PCI Express® (PCIe®) and Compute Express Link (CXL) interfaces have benefited from integrity and data encryption (IDE). The IME solution should tightly couple the encryption/decryption inside the DDR or LPDDR controller, allowing for maximum memory efficiency and the lowest overall latency. The solution should also allow for ongoing adaptation to an ever-evolving security threat landscape.

Figure: Secure DDR5 Controller with Inline Memory Encryption (IME)

IME Security Module for DDR/LPDDR

Synopsys IME Security Module provides data confidentiality for off-chip memory over DDR/LPDDR interfaces, supporting both write and read channels with AES-XTS encryption. It seamlessly integrates with Synopsys DDR/LPDDR controllers, reducing risk and accelerating SoC integration with ultra-low latency of just two cycles. The module is scalable, FIPS 140-3 certification ready, and supports different datapath widths, key sizes, and encryption modes. It also offers efficient key control, SRAM zeroization, and mission mode bypass. With its standalone or integrated solution, the IME Security Module provides optimal PPA and latency for secure and compliant SoC designs.

For more details, refer to Synopsys IME Security Module page.

Summary

Incorporating security into SoCs is a fundamental requirement for fulfilling privacy and data protection requirements of electronic systems users. Securing the high-speed interfaces is key to addressing this requirement. The deployed mechanisms need to be highly efficient with optimal latency. Authentication and key management in the control plane and integrity and data encryption in the data plane are essential components of a complete security solution. Synopsys provides complete solutions to secure SoCs, their data, and communications.

Also Read:

Power Delivery Network Analysis in DRAM Design

Intel Keynote on Formal a Mind-Stretcher

Multi-Die Systems Key to Next Wave of Systems Innovations


Adaptive Clock Technology for Real-Time Droop Response

by Kalar Rajendiran on 03-30-2023 at 6:00 am

Example Sea of Processor SoC with Distributed Generate Modules for Local Droop Response

In integrated circuit terminology, a droop is a transient drop in supply voltage within a circuit. This well-known phenomenon can happen for several reasons: the power supply falls below the operating range for which the chip was designed; the conductive elements draw more current than they were designed for; or signal interference and noise on the power supply cause voltage fluctuations.
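As a back-of-the-envelope illustration of the resistive component of droop (all numbers below are hypothetical, and the inductive L·di/dt term is ignored):

```python
def supply_voltage_with_droop(nominal_v: float, current_a: float,
                              pdn_resistance_ohm: float) -> float:
    """Resistive (IR) component of supply droop across the power
    delivery network; real droop also includes inductive L*di/dt."""
    return nominal_v - current_a * pdn_resistance_ohm

# Hypothetical example: a 0.9 V rail, a 10 A current step, 5 mOhm PDN path.
v = supply_voltage_with_droop(0.9, 10.0, 0.005)
assert abs(v - 0.85) < 1e-9   # 50 mV droop below nominal
```

Even a few milliohms of power-delivery-network resistance turns a large current step into tens of millivolts of droop, which is why switching-activity spikes matter so much at sub-volt supply levels.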

Droops can impact the operation of a circuit. Reduced chip performance, leading to longer processing times, is one such impact, but other consequences are more serious or even catastrophic. The chip could draw more current to maintain its level of performance, leading to increased power consumption and heat dissipation; this can shorten the life of the chip, and in severe cases setup and hold timing violations can cause a complete failure. Droops can also cause data corruption or errors in the output, a very serious issue for applications that depend on the accuracy and reliability of the chip.

Naturally, the phenomenon of droops is taken into serious consideration when designing chips and systems. The most common methods to mitigate droops are power supply decoupling, voltage regulation, circuit optimization and system-level power management. The conditions and the operating environment in which the chip will be performing are carefully considered when designing droop mitigating solutions.

Modern Day Problems

As SoCs become more complex, droop issues get more complex too. An SoC’s design needs to be optimized for performance, power, cost, form factor, and more, in addition to droop mitigation. Sometimes these optimization goals compete with each other, and tradeoffs have to be made. For example, SoC architects can raise the operating voltage, adding margin to circumvent local and global droop, but this increases power quadratically. Alternatively, designers can have their clock generation adapt to droop, which makes performance a function of clock generation switching time.

Consequently, large SoCs in the datacenter compute and AI space are notably susceptible to droop. Customer workloads are very diverse and dynamic, leading to significant fluctuations in switching activity and current draw. Of course, systems cannot afford to let droop issues go unaddressed. The potential liability from inaccurate output or catastrophic failure of a chip is too high for today’s systems and applications.

Localized Droop Issues

Application-specific accelerators are widely used in tandem with general-purpose processors to deliver the performance and power efficiency required in today’s demanding computing environments. But these accelerators as well as the increasing number of cores and the asymmetric nature of workloads, increase the risk of localized voltage droops. These localized voltage drops are a result of sudden increase in switching activity and can cause transient glitches and potential mission-mode failures.

When localized droops occur, the impact can be mitigated through dynamic frequency scaling, achieved by adjusting the timing of a circuit using a programmable clock. A programmable clock allows the clock frequency and timing to be adjusted dynamically based on the current operating conditions of the circuit.
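The control idea can be sketched in a few lines: when a droop detector flags low voltage, the clock generator immediately stretches its period, then ramps back once the droop has passed. All thresholds, ratios, and cycle counts below are hypothetical illustration values, not parameters of any real product.

```python
def adaptive_clock(nominal_mhz, voltage_samples, vmin=0.72,
                   stretch=0.8, recover_cycles=5):
    """Sketch of droop-adaptive clocking: stretch the clock while the
    supply is below threshold, then restore full speed a few cycles
    after the droop clears (illustrative values only)."""
    freq = nominal_mhz
    hold = 0
    trace = []
    for v in voltage_samples:
        if v < vmin:
            freq = nominal_mhz * stretch   # immediate frequency reduction
            hold = recover_cycles          # stay slow while droop settles
        elif hold > 0:
            hold -= 1
            if hold == 0:
                freq = nominal_mhz         # droop passed: full speed again
        trace.append(freq)
    return trace

# A droop at sample 2 slows the clock, which recovers five samples later.
trace = adaptive_clock(2000, [0.75, 0.75, 0.70, 0.75, 0.75,
                              0.75, 0.75, 0.75, 0.75])
assert trace[2] == 1600 and trace[7] == 2000
```

Running slower during the droop relaxes the critical-path timing exactly when the supply voltage (and hence gate speed) is degraded, trading a few cycles of throughput for avoiding setup/hold failures.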

Movellus Makes it Easy to Address Localized Droops

Movellus, a leading digital system-IP provider, has developed the Aeonic Generate family of products to address localized droops. The Aeonic portfolio offers adaptive clocking solutions that deliver rapid droop response. The building blocks are implemented in synthesizable Verilog, making them intrinsically flexible, and the solutions are configurable, scannable, and process-portable for a wide range of advanced SoC applications.

The Aeonic Generate family of products is also significantly smaller than traditional analog solutions. As a result, designers can instantiate the IP at the granularity required without any significant impact on the area. Additionally, as designs move to finer process geometries, the Aeonic Generate area continues to scale, making it an ideal solution for future designs.

A Couple of Use Cases

The following Figure from Movellus shows an example architecture of an ADAS processor with the Aeonic Generate AWM Platform for localized droop support. An architect would pair an AWM module with an application-specific sub-block or accelerator to respond to workload-driven localized droops within five clock cycles with glitch-less and rapid frequency shifts. This approach provides a reliable and efficient solution for addressing the challenges of localized droops in ADAS, 5G, and data center networking markets.

The following Figure from Movellus shows an example architecture of a sea of processor SoC with Aeonic Generate for localized droop support. An architect would pair an Aeonic Generate AWM module with a droop detector for the processor cluster and associated voltage domain to rapidly respond to workload-driven localized droops. This allows designers to deliver localized and independent droop response without altering the performance of neighboring processor clusters.

Summary

Localized voltage droops can occur in heterogeneous SoCs containing application-specific accelerators. These droops can lead to transient timing glitches and mission-mode failures in ADAS, data center networking, and 5G applications. System architects can implement adaptive clocking to respond to these droops and mitigate their impact.

The Movellus™ Aeonic Generate Adaptive Workload Module (AWM) family of high-performance clock generation IP products are part of the Aeonic Intelligent Clock Network™ architecture. For more information, refer to Movellus’ Aeonic Generate™ AWM page.

Also Read:

Advantages of Large-Scale Synchronous Clocking Domains in AI Chip Designs

It’s Now Time for Smart Clock Networks

Performance, Power and Area (PPA) Benefits Through Intelligent Clock Networks


Automotive Lone Bright Spot

by Bill Jewell on 03-29-2023 at 10:00 am

Automotive Annual Unit Change

Automotive appears to be about the only bright spot in the semiconductor market for 2023. Forecasts for the overall semiconductor market range from a decline of 4% to a decline of 20%. Semiconductor companies generally have bleak outlooks for the start of 2023, citing excess inventories and weak end-market demand. The chart below shows the annual unit change for the key semiconductor market drivers: PCs & tablets, smartphones, and motor vehicles. PCs and tablets boomed in the first two years of the COVID-19 pandemic but declined 17% in 2022, and IDC projects an 11% decline in PCs & tablets in 2023. Smartphones dropped 11% in 2022 after 6% growth in 2021; Gartner expects smartphone units to drop another 4% in 2023. Light vehicle production declined 15% in 2020 after automakers cut back production over concerns related to the pandemic. Light vehicles returned to growth in 2021 at 3% and in 2022 at 6%, and S&P Global Mobility is forecasting a 4% increase in light vehicle production in 2023.

Automotive Semiconductor Companies

The top five semiconductor suppliers to the automotive industry are shown in the chart below. Infineon is the largest at $8.1 billion in automotive semiconductor revenue in 2022. The top five account for about half of the total automotive semiconductor market. These companies experienced strong growth in the automotive market in 2022, ranging from 17% to 46%, compared to only 3.3% growth in the overall semiconductor market. Automotive is a significant portion of these companies’ overall revenues, ranging from 25% to 52%.

The top five companies have all provided revenue guidance for 1Q 2023 calling for a decline in total revenue from 4Q 2022 (except for Renesas, which did not provide guidance). However, each company cited the automotive segment as remaining strong. In its 4Q 2022 earnings conference call, NXP cited a “pricing tailwind” for automotive, implying increasing prices.

Automotive Semiconductor Market

The automotive semiconductor market should show healthy growth in 2023, in contrast to most of the rest of the semiconductor market. We at Semiconductor Intelligence are forecasting 14% growth for the automotive semiconductor market in 2023. Key factors driving this growth are:

· Strong revenue momentum for semiconductor suppliers
· Easing of semiconductor shortages, but some remaining through 2023
· Automotive semiconductor inventories generally below desired levels
· Some price increases for automotive semiconductors
· Growth of 4% or more in vehicle production
· Continued increases in semiconductor content per vehicle

The longer-term outlook for automotive semiconductors is also very healthy. The semiconductor content per vehicle will steadily increase over the next several years. S&P AutoTechInsight in January 2023 projected the average semiconductor content per vehicle will increase 80% over the next seven years from $854 in 2022 to $1,542 in 2029.
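Those content-per-vehicle figures are internally consistent; a quick check of the implied compound annual growth rate (CAGR) using the numbers cited above:

```python
def cagr(begin: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by begin/end values."""
    return (end / begin) ** (1 / years) - 1

# $854 (2022) to $1,542 (2029) spans seven years of growth.
growth = 1542 / 854 - 1          # total growth ~81%, matching the ~80% cited
rate = cagr(854, 1542, 7)        # roughly 8.8% per year
assert abs(growth - 0.806) < 0.01
assert 0.086 < rate < 0.090
```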

A McKinsey & Company report from April 2022 projected the overall semiconductor market will pass a trillion dollars at $1,065 billion in 2030, a compound annual growth rate (CAGR) of 6.8% from 2021. The automotive semiconductor market is expected to reach $150 billion in 2030, a CAGR of 13.0% from 2021. Thus, the automotive semiconductor CAGR is almost twice the growth rate of the overall semiconductor market.

The key drivers of automotive semiconductors through the end of the decade are electric vehicles (EVs), driver-assistance systems and autonomous-driving, and infotainment systems.

Electric Vehicles – Battery Electric Vehicle (BEV) sales were about 10 million units in 2022, about 12% of total vehicles sold, according to Counterpoint Research. Counterpoint estimates about 40% of vehicles sold in 2030 will be BEV. Several major automakers including Honda, Volkswagen and Hyundai are targeting BEVs to account for 50% or more of production by 2030. BEVs require sophisticated battery management systems. BEVs are estimated to have semiconductor dollar content two times (according to X-FAB) to three times (according to Analog Devices) the content of internal combustion engine vehicles. Thus, the shift to high semiconductor value BEVs will significantly contribute to the overall growth of automotive semiconductors.

Driver assistance and autonomous driving – Vehicles are increasingly incorporating technology to aid the driver, such as adaptive cruise control, lane keeping assistance, rearview video, and automatic emergency braking. These features require numerous sensors and controllers. According to Statista Mobility Market Insights, cars with at least some driver assistance features accounted for 86% of sales in 2020, compared to just 49% in 2015. McKinsey & Company estimates the 2022-2030 CAGR of driver assistance systems at 17%.

Fully self-driving cars or autonomous vehicles (AV) will be slower to develop. McKinsey projects only 12% of cars sold in 2030 will be AVs, increasing to 37% in 2035. Adoption of AVs will require advances in technology, changing consumer attitudes and changing government regulations. Tesla reported its vehicles using Autopilot technology in the U.S. averaged over 5.6 million miles per accident in 2022 compared to 652 thousand miles per accident for the U.S. overall. Although the Autopilot accident rate is about one-ninth of the overall rate, one could argue AVs should be orders of magnitude safer than human drivers. A 2023 AAA survey showed 68% of U.S. drivers are afraid of self-driving vehicles, with 23% unsure and only 9% trusting them.
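A quick sanity check on the Autopilot comparison: since accidents per mile is the reciprocal of miles per accident, the figures above imply an Autopilot accident rate of 652/5,600 ≈ 0.12 times the overall rate, consistent with the "about one-ninth" characterization:

```python
autopilot_miles_per_accident = 5_600_000   # Tesla Autopilot, US, 2022 (per article)
us_overall_miles_per_accident = 652_000    # US overall, 2022 (per article)

# Accidents per mile are the reciprocals, so the rate ratio is:
rate_ratio = us_overall_miles_per_accident / autopilot_miles_per_accident
assert 1 / 9 < rate_ratio < 1 / 8   # roughly one-eighth to one-ninth
```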

Infotainment – A combination of information and entertainment, these systems provide services such as navigation, Wi-Fi, smartphone integration, voice commands, audio, and video. The vast majority of cars sold today include infotainment systems, especially in advanced nations. Analysts estimate the CAGR of the automotive infotainment market from 2022 to 2030 at about 9% to 11%.

The automotive semiconductor industry looks strong in 2023 and through the end of the decade. The increasing semiconductor content of vehicles will make automotive the fastest growing major segment of the semiconductor market through 2030.

Semiconductor Intelligence is a consulting firm providing market analysis, market insights and company analysis for anyone involved in the semiconductor industry – manufacturers, designers, foundries, suppliers, users or investors. Please contact me if you would like further information.

Bill Jewell
Semiconductor Intelligence, LLC
billjewell@sc-iq.com

Also Read:

Bleak Year for Semiconductors

CES is Back, but is the Market?

Semiconductors Down in 2nd Half 2022

Continued Electronics Decline


Advanced electro-thermal simulation sees deeper inside chips

by Don Dingee on 03-29-2023 at 6:00 am

Advanced electro-thermal simulation in Keysight PathWave ADS

Heat and semiconductor reliability are inversely related: below the breaking point at the rated thermal junction temperature, every 10°C rise in steady-state temperature cuts predicted MOSFET life in half. Yet heat densities rise as devices plunge into harsher environments like smartphones, automotive electronics, and space-based electronics, while reliability expectations remain high. Higher frequencies and increased power delivery with newer semiconductor materials and packaging schemes push thermal behavior harder. Accurate reliability prediction is only possible with advanced electro-thermal simulation – an emphasis for several Keysight teams working with PathWave Advanced Design System (ADS).
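That rule of thumb (predicted life halving per 10°C rise, below the rated junction temperature) can be written directly; the reference life and temperatures in this sketch are hypothetical:

```python
def predicted_mosfet_life(ref_life_hours: float, ref_temp_c: float,
                          temp_c: float) -> float:
    """Rule of thumb cited in the article: predicted life halves for
    every 10 degC rise in steady-state temperature (valid below the
    rated junction temperature)."""
    return ref_life_hours * 2 ** ((ref_temp_c - temp_c) / 10)

# Hypothetical part: 100k hours predicted at 85 degC, run 10 degC hotter.
life = predicted_mosfet_life(100_000, 85, 95)
assert abs(life - 50_000) < 1e-6   # half the predicted life
```

The exponential form is why even small errors in simulated temperature translate into large errors in predicted lifetime, which is the motivation for tight electro-thermal co-simulation.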

Keysight’s advanced thermal analysis capability stretches back to its 2014 acquisition of Gradient Design Automation and its HeatWave thermal solver. ADS Electro-Thermal Simulation adapted and integrated HeatWave to co-simulate with the industry-leading RF circuit simulators in ADS. Gradient’s former VP of Engineering, Adi Srinivasan, now a Product Owner and Principal Engineer for electro-thermal simulation at Keysight, still works today with the technology he helped pioneer. “Coupling is now at least as important as self-heating,” says Srinivasan, and like EM simulation, the co-simulation must span time and frequency domains.

RF designers widely use ADS Electro-Thermal Simulator to avoid thermal hazards and improve design quality, with a long history of solving complex RF intra-die thermal problems. A PCB-level thermal solver, PE-Thermal, is creating similar problem-solving opportunities for power electronics designers. Increasing the reach of thermal simulation for more customers and larger domains like modules is a priority; thermal modeling in foundry process design kits (PDKs) is a strong driver of broader adoption.


III-V and silicon foundries picking up the thermal pace

Foundries are seeing increased demand for rich-model PDKs as customers crank up more simulations in their EDA workflows. Behind the scenes, Keysight has been working aggressively to evangelize the need for advanced electro-thermal simulation and asking foundries to consider adding thermal models ready for the W3050E PathWave ADS Electro-thermal Simulator to their PDK support.

“We’ve recently seen thermal-enabled PDK announcements from III-V foundries like GCS (Global Communication Semiconductors) in the US, and AWSC (Advanced Wireless Semiconductor) and Wavetek in Taiwan,” says Kevin Dhawan, III-V Foundry Program Manager at Keysight. These foundries are leaders in technology used in power and RF electronics, and Dhawan noted that other III-V foundries such as WIN, UMS, and OMMIC also offer PDKs for ADS Electro-Thermal Simulator. Additionally, Dhawan says several high-volume silicon foundries have PDK support for ADS Electro-Thermal Simulator but have yet to make specific announcements.

“Customers are asking about a single solution for both EM and electro-thermal effects, but the domains are a bit different,” Dhawan observes. “EM focuses more on back-end layout, metallization, and bias, while electro-thermal needs to model additional heat from the semiconductor devices themselves.” ADS supports several types of PDKs: ADS-native PDKs, ADS-Virtuoso interoperable PDKs, and interoperable PDKs (iPDKs) based on OpenAccess and customizable via Python and Tcl. Dhawan says to check with the foundries for the latest on ADS Electro-Thermal Simulator-ready PDKs.

Incredible heat densities in power electronics

Power converter designers are chasing extremely dense “bricks” for applications like aerospace, automotive, and device chargers. This chart shows why more designers are turning to wide-bandgap processes like GaN and SiC: switching speeds are higher, and power delivery increases.

“With tens of kilowatts or more in play, things tend to heat up,” says Steven Lee, Power Electronics Product Manager for Keysight EDA. “Traditional workflows and tools like Spice don’t consider detailed temperature profiles or post-layout thermal impacts.” The result often is a hardware prototyping surprise, where measurements find power transistors pushed beyond their rated junction temperature, triggering an expensive late part change and a re-spin.

Thermal simulation, taking device models, packaging, and board materials and layout into consideration, prevents problems. “Hot transistors next to each other can couple via metal traces and planes and magnify heating problems,” Lee points out. First-pass thermal effects discovered quickly via simulations can validate component selection and guide packaging and layout adjustments before committing to hardware.

PE-Thermal implements a thermal resistance editor and extraction tool for more robust temperature modeling capability. “With a design already in the ADS workspace, designers can select the components for thermal analysis, tune models as necessary, and simulate with PE-Thermal in a few minutes,” explains Uday Ramalingam, R&D engineer at Keysight. “Ultimately, designers will be able to go inside a transistor and explore package properties – maybe they have the right part but not the right package for their context.”

“Just like the EM solver in ADS takes metallization into account, the electro-thermal solver does too, and enhancements in future ADS releases will take us all the way to complete board-level layout thermal effects automatically,” Lee wraps up.

An RF design example of advanced electro-thermal simulation

Reliability prediction tools are only as good as the temperature data that goes into them. Accurate temperature measurements on physical hardware require excruciating setups, which slow design cycles and still don’t preclude a re-spin surprise if bad results appear.

EnSilica (Abingdon, UK) is a fabless chipmaker delivering RF, mmWave, mixed-signal, and high-speed digital designs for their customers in automotive, communications, and other applications. Using Keysight PathWave ADS and PathWave ADS Electro-Thermal Simulator is taking them from a practice of embedding and measuring many temperature sensors on a chip to fully simulating thermal effects with high accuracy.

A Ka-band transceiver project, implemented in an automotive-certified 40nm CMOS foundry process, is EnSilica’s first foray into virtual thermal analysis. An interesting wrinkle is another chip on their board, next to the RF transceiver, creating 3°C boundary heating on one edge, clearly seen on the right side of the heat map produced in the ADS Electro-Thermal Simulator.

Results from the ADS electro-thermal simulation were within 0.7°C of actual measurements (with simulations slightly higher, an excellent conservative result), increasing confidence in meeting 10-year reliability goals. During thermal resistance modeling, EnSilica also found improvements in chip layout and package bumping that lowered operating temperatures in the final product.

Seeing deeper inside chips to avoid design hazards and to enhance packages and layouts is a powerful justification for advanced electro-thermal simulation. Keysight’s ability to fit into multi-vendor workflows means high-accuracy thermal analysis is available to more design teams. Please visit these resources for the EnSilica case study and further background on Keysight’s ADS solutions, including ADS Electro-Thermal Simulator.

 

Design and simulation environment: PathWave Advanced Design System

Thermal simulation add-on: W3050E PathWave Electro-Thermal Simulator

Webinar:  Using Electro-Thermal Simulation in Your Next IC Design

Video:  Using Electro-Thermal Simulation in ADS 2023

Case study:  Predicting Ka-band Transceiver Thermal Margins, Wear, and Lifespan


The Rise of the Chiplet

The Rise of the Chiplet
by Kalar Rajendiran on 03-28-2023 at 10:00 am

Open Chiplet Economy

The emergence of chiplets is an inflection point for the semiconductor industry. The potential benefits of a chiplets-based approach to implementing electronic systems are not in question. Chiplets, smaller pre-manufactured components that can be combined to create larger systems, offer benefits such as increased flexibility, scalability, and cost-effectiveness compared to monolithic integrated circuits. However, chiplets also present new challenges in design, integration, and testing. The technology is still in flux, and many unknowns need to be addressed over the coming years. The success of chiplets will depend on factors such as manufacturing capabilities, design expertise, and the ability to integrate chiplets into existing systems.

While sophisticated packaging and interconnect technologies have been receiving a lot of press, there are many more aspects that are critical too. Designing chiplets-based systems requires a different mindset and skillset than traditional chip design. Many more things need to come together to enable a chiplet-based economy. This was the focus of a recently held webinar titled “The Rise of the Chiplet.” The webinar was moderated by Brian Bailey, Technology Editor/EDA from SemiEngineering.com. The panelists were Nick Ilyadis, Sr. Director of Product Planning, Achronix; Rich Wawrzyniak, Principal Analyst ASIC & SoC, Semico Research Corp; and Bapi Vinnakota, OCP ODSA Project Lead, Open Compute Project Foundation.

The composition of the panel allowed the audience to hear a market perspective, a product perspective, and a collaborative community perspective on designing efficiency into solutions.

What is needed for chiplet adoption

For chiplet adoption, the industry needs to worry not just about the die-to-die interfaces and packaging technology but the whole chiplet economy.

For example, how do you describe a chiplet before building it in order to achieve efficient modularity? The standard items to include in a chiplet’s physical description are area, orientation, thermal map, power delivery, bump maps, etc. This physical part description is very important when integrating chiplets from multiple vendors. OCP is beginning work with JEDEC to create a standard JEP30 part model to physically describe a chiplet. Other areas to be addressed include how to handle known-good-die (KGD) in business contracts, how to accomplish architecture exploration, and how to manage business logistics.

Various workgroups within OCP are focusing on many of these areas and more, and are making downloadable worksheets and templates available for use by designers. For example, designers can download a worksheet that helps them compare a chiplet-based design to a monolithic design on design costs and manufacturing costs. When it comes to chiplet interfaces, Bunch of Wires (BoW) may be the choice for some applications while Universal Chiplet Interconnect Express (UCIe) may be the right one for others. There are tools for comparing the various die-to-die interfaces available in the marketplace.
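The cost trade-off such a worksheet captures can be sketched with a toy yield model. This is not the OCP worksheet itself; the Poisson yield model is standard, but the defect density and cost-per-mm² figures below are invented for illustration.

```python
import math

def die_yield(area_mm2, defect_density=0.002):
    """Poisson yield model: fraction of good dies at a given area.
    defect_density (defects per mm^2) is an illustrative value."""
    return math.exp(-defect_density * area_mm2)

def cost_per_good_die(area_mm2, cost_per_mm2=0.10, defect_density=0.002):
    """Silicon cost of one *good* die: raw die cost divided by yield,
    since defective dies are discarded."""
    return (area_mm2 * cost_per_mm2) / die_yield(area_mm2, defect_density)

# Compare one 400 mm^2 monolithic die with four 100 mm^2 chiplets.
monolithic = cost_per_good_die(400)
chiplets = 4 * cost_per_good_die(100)
print(f"monolithic: ${monolithic:.2f}, 4 chiplets: ${chiplets:.2f}")
```

With these numbers the four smaller dies come out cheaper because yield falls off exponentially with area; a real worksheet would also account for packaging, test, and known-good-die costs, which work against chiplets.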

The following table shows the various areas that need to be addressed.

Another important thing that needs to be understood and addressed is whether all the chiplets to be included in a product need to be from the same process corner. Do chiplets need to be marketed under different speed grades like memories are? If some chiplets are from fast corners and others are from the slow corners, what kind of issues will arise during system simulation and when deployed in the field?

As chiplets technology continues to evolve, companies will be experimenting with different approaches to incorporating chiplets into their products.

eFPGA-Based Chiplet

Embedded FPGA (eFPGA) has been gaining a lot of traction within the monolithic ASIC world. An eFPGA-based chiplet can extend the eFPGA benefits to a full chiplet-based system. Achronix, a leader in the FPGA solutions space, is offering eFPGA IP-based chiplets to deliver the following benefits:

  • A unique production solution (different SKUs)
  • Support for different process technologies in cases where the optimal process technology for the ASIC is not optimal for an embedded FPGA
  • Reuse of the FPGA chiplet across multiple generations of products versus having it in just one monolithic device

Summary

Chiplets offer a promising new direction for the semiconductor industry. The winning solutions will be determined over the coming years; how many years depends on whom you ask. To listen to the entire webinar, check here. The panelists also fielded a number of audience questions that you may find of interest.

Also Read:

Achronix on Platform Selection for AI at the Edge

WEBINAR: FPGAs for Real-Time Machine Learning Inference

WEBINAR The Rise of the SmartNIC


Speculation for Simulation. Innovation in Verification

Speculation for Simulation. Innovation in Verification
by Bernard Murphy on 03-28-2023 at 6:00 am

Innovation New

This is an interesting idea, using hardware-supported speculative parallelism to accelerate simulation, with a twist requiring custom hardware. Paul Cunningham (Senior VP/GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.

The Innovation

This month’s pick is Chronos: Efficient Speculative Parallelism for Accelerators. The authors presented the paper at the 2020 Conference on Architectural Support for Programming Languages and Operating Systems and are from MIT.

Exploiting parallelism using multicore processors is one option for applications where parallelism is self-evident. Other algorithms might not be so easily partitioned but might benefit from speculative execution exploiting intrinsic parallelism. Usually, speculative execution depends on cache coherence, a high overhead especially for simulation. This method bypasses the need for coherence, physically localizing task execution to compute tiles by target read-write object, ensuring that conflicts can be detected locally without global coherence management. Tasks can execute speculatively in parallel; any conflict detected can be unrolled from a task through its child tasks and then re-executed without stalling other threads.

One other point of note here. This method supports delay-based simulation, unlike most hardware acceleration techniques.

Paul’s view

Wow, what a wonderful high-octane paper from MIT! When asked about parallel computation I immediately think about threads, mutexes, and memory coherency. This is of course how modern multi-core CPUs are designed. But it is not the only way to support parallelization in hardware.

This paper proposes an alternative architecture for parallelization called Chronos that is based on an ordered queue of tasks. At runtime, tasks are executed in timestamp order and each task can create new sub-tasks that are dynamically added to the queue. Execution begins by putting some initial tasks into the queue and ends when there are no more tasks in the queue.

Tasks in the queue are farmed out to multiple processing elements (PEs) in parallel – which means Chronos is speculatively executing future tasks before the current task has completed. If the current task invalidates any speculatively executed future tasks then the actions of those future tasks are “undone” and they are re-queued. Implementing this concept correctly in hardware is not easy, but to the outside user it’s beautiful: you just code your algorithm as if the task queue is being executed serially on a single PE. No need to code any mutexes or worry about deadlock.
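The programming model Paul describes, coding as if the task queue runs serially on a single PE, can be sketched as a serial reference implementation: a timestamp-ordered priority queue in which each task may enqueue sub-tasks. The names `run_tasks` and `drive` below are hypothetical illustrations of those semantics, not Chronos code; the hardware executes the same queue speculatively in parallel while preserving this serial behavior.

```python
import heapq

def run_tasks(initial_tasks):
    """Serial reference semantics of an ordered task queue: pop the
    lowest-timestamp task, run it, and push any sub-tasks it creates.
    Hardware like Chronos runs future tasks speculatively in parallel
    but must produce the same result as this serial loop."""
    queue = list(initial_tasks)      # entries: (timestamp, seq, fn, args)
    heapq.heapify(queue)
    seq = len(queue)                 # unique tie-breaker for equal timestamps
    state = {}
    while queue:
        ts, _, fn, args = heapq.heappop(queue)
        for new_ts, new_fn, new_args in fn(state, ts, *args):
            heapq.heappush(queue, (new_ts, seq, new_fn, new_args))
            seq += 1
    return state

# Toy task: driving wire 'a' schedules its inverter output 'b' one tick later.
def drive(state, ts, wire, value):
    state[wire] = value
    if wire == "a":
        return [(ts + 1, drive, ("b", 1 - value))]
    return []

final = run_tasks([(0, 0, drive, ("a", 1))])   # final == {"a": 1, "b": 0}
```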

The authors implement Chronos in SystemVerilog and compile it to an FPGA. Much of the paper is devoted to explaining how they have implemented the task queue and any necessary unrolling in hardware for maximum efficiency. Chronos is benchmarked on four algorithms well suited to a task-queue based architecture. Each algorithm is implemented two ways: first using a dedicated algorithm-specific PE, and second using an off the shelf open source 32-bit embedded RISC-V CPU as the PE. Chronos performance is then compared to multi-threaded software implementations of the algorithms running on an Intel Xeon server with a similar price tag to the FPGA being used for Chronos. Results are impressive – Chronos scales 3x to 15x better than using the Xeon server. However, comparing Table 3 to Figure 14 makes me worry a bit that most of these gains came from the algorithm-specific PEs rather than the Chronos architecture itself.

Given this is a verification blog I naturally zoomed in on the gate-level simulation benchmark. The EDA industry has invested heavily to try and parallelize logic simulation and it has proven difficult to see big gains beyond a few specific use cases. This is mainly due to the performance of most real-world simulations being dominated by load/store instructions missing in the L3-cache and going out to DRAM. There is only one testcase benchmarked in this paper and it is a tiny 32-bit carry save adder. If you are reading this blog and would be interested to do some more thorough benchmarking please let me know – if Chronos can truly scale well on real world simulations it would have huge commercial value!

Raúl’s view

The main contribution of this paper is the Spatially Located Ordered Tasks (SLOT) execution model, which is efficient for hardware accelerators that exploit parallelism and speculation and for applications that generate tasks dynamically at runtime. Dynamic parallelism support is essential for simulation, and speculative synchronization is an appealing option, but coherency overhead is too high.

SLOT avoids the need for coherence by restricting each task to operate (write) on a single object and supports ordered tasks to enable multi-object atomicity. SLOT applications are ordered, dynamically created tasks characterized by a timestamp and an object id. Timestamps specify order constraints; object ids specify the data dependences, i.e., tasks are data-dependent if and only if they have the same object id. (If there is only a read dependency, the task can be executed speculatively.) Conflict detection becomes local (without complex tracking structures) by mapping object ids to cores or tiles and sending each task to where its object id is mapped.
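The locality trick can be illustrated in a few lines: a static map from object id to tile means every task that writes a given object lands in the same place, so conflict checks stay tile-local. The function names and the hash-based mapping below are illustrative assumptions, not the paper’s actual routing logic.

```python
from collections import defaultdict

NUM_TILES = 16

def tile_for(object_id, num_tiles=NUM_TILES):
    """Static object-id -> tile mapping. All tasks that write a given
    object are sent to the same tile, so ordering and conflict checks
    for that object never need global coherence traffic."""
    return hash(object_id) % num_tiles

def route(tasks):
    """Partition (timestamp, object_id) tasks into per-tile queues."""
    tiles = defaultdict(list)
    for ts, obj in sorted(tasks):
        tiles[tile_for(obj)].append((ts, obj))
    return tiles
```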

The Chronos system was implemented in the AWS FPGA framework as a system with 16 tiles, each with 4 application specific processing elements (PE), running at 125MHz. This system is compared with a baseline consisting of 20-core/40-thread 2.4 GHz Intel Xeon E5-2676v3, chosen specifically because its price is comparable with the FPGA one (approx. $2/hour). Running a single task on one PE, Chronos is 2.45x faster than the baseline. As the number of concurrent tasks increases, the Chronos implementation scales to a self-relative speedup of 44.9x on 8 tiles, corresponding to a 15.3x speedup over the CPU implementation. They also compared an implementation based on general purpose RISC-V rather than application specific PEs; PEs were 5x faster than RISC-V.

I found the paper impressive because it covers everything from concept to the definition of the SLOT execution model to the hardware implementation and a detailed comparison with a traditional Xeon CPU across 4 applications. The effort is substantial; Chronos is over 20,000 lines of SystemVerilog. The result is a 5.4x mean speedup (across the 4 applications) over software-parallel versions, due to more parallelism and more use of speculative execution. The paper is also worth reading for its application to non-simulation tasks; it includes three examples.


Power Delivery Network Analysis in DRAM Design

Power Delivery Network Analysis in DRAM Design
by Daniel Payne on 03-27-2023 at 10:00 am

IR drop plot min

My IC design career started out with DRAM design back in 1978, so I’ve kept an eye on the developments in this area of memory design to note the design challenges, process updates and innovations along the way. Synopsys hosted a memory technology symposium in November 2022, and I had a chance to watch a presentation from SK hynix engineers, Tae-Jun Lee and Bong-Gil Kang. DRAM chips have reached high capacity and fast data rates of 9.6 gigabits per second, like the recent LPDDR5T announcement on January 25th. Data rates can be limited by the integrity of the Power Delivery Network (PDN), yet analyzing a full-chip DRAM with PDN will slow simulation times down too much.

The peak memory bandwidth per x64 channel has shown steady growth across several generations:

  • DDR1, 3.2 GB/s at 2.5V supply
  • DDR2, 6.4 GB/s at 1.8V supply
  • DDR3, 12.8 GB/s at 1.5V supply
  • DDR4, 25.6 GB/s at 1.2V supply
  • DDR5, 51.2 GB/s at 1.1V supply
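Each figure follows directly from the generation’s top standard transfer rate multiplied by the 8-byte width of a x64 channel; a quick arithmetic check (the MT/s values are the top JEDEC speed grades of each generation, stated here for illustration):

```python
# Peak bandwidth of a x64 channel = transfer rate (MT/s) x 8 bytes per transfer.
BUS_BYTES = 8  # 64-bit data bus

top_rate_mts = {"DDR1": 400, "DDR2": 800, "DDR3": 1600, "DDR4": 3200, "DDR5": 6400}

bandwidth_gbs = {gen: mts * BUS_BYTES / 1000 for gen, mts in top_rate_mts.items()}
print(bandwidth_gbs)   # matches the 3.2 ... 51.2 GB/s progression above
```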

A big challenge in meeting these aggressive timing goals is controlling the parasitic IR drop caused during the IC layout of the DRAM array. Shown below is a plot of IR drop where red marks the area of highest voltage drop, which in turn slows the performance of the memory.

IR drop plot of DRAM array

The extracted parasitics for an IC are saved in a SPF file format, and adding these parasitics for the PDN to a SPICE netlist causes the circuit simulator to slow down by a factor of 64X, while the number of parasitic RC elements added by the PDN is 3.7X more than just signal parasitics.

At SK hynix they came up with a pragmatic approach to reduce the simulation run times when using the PrimeSim™ Pro circuit simulator on SPF netlists including the PDN by using three techniques:

  1. Partitioning of the netlist between Power and other Signals
  2. Reduction of RC elements in the PDN
  3. Controlling simulation event tolerance

PrimeSim Pro uses partitioning to divide up the netlist based upon connectivity, and by default the PDN and other signals would combine to form very large partitions, which in turn slowed down simulation times too much. Here’s what the largest partition looked like with default simulator settings:

Largest partition, default settings

An option in PrimeSim Pro (primesim_pwrblock) was used to cut down the size of the largest partition, separating the PDN from other signals.

Largest partition, using option: primesim_pwrblock

The extracted PDN in SPF format had too many RC elements, which slowed down circuit simulation run times, so an option called primesim_postl_rcred was used to reduce the RC network, while at the same time preserving accuracy. The RC reduction option was able to decrease the number of RC elements by up to 73.9%.

Circuit simulators like PrimeSim Pro use matrix math to solve for currents and voltages in the netlist partitions, so runtime is directly related to matrix size and how often a voltage change requires recalculation. The simulator option primesim_evtgrid_for_pdn was used; it reduces the number of times a matrix needs to be solved when there are only small voltage changes in the PDN. In the chart below, the purple X marks show each point in time when matrix solving in the PDN was required by default, while the white triangles show each point in time when matrix solving occurs with the simulator option. The white triangles occur much less frequently than the purple X’s, enabling faster simulation.
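The effect of such an event tolerance can be illustrated generically: reuse the previous PDN solution until the supply voltage has drifted past a tolerance. This sketch is not PrimeSim’s actual algorithm, and the voltage trace and tolerance values are made up for illustration.

```python
def count_solves(voltages, tol):
    """Count matrix re-solves when the previous PDN solution is reused
    as long as the supply has moved less than `tol` since the last solve.
    Generic sketch of event-tolerance control, not PrimeSim's algorithm."""
    solves, last = 0, None
    for v in voltages:
        if last is None or abs(v - last) > tol:
            solves += 1   # voltage moved too far: factor and solve again
            last = v
    return solves

trace = [1.10, 1.099, 1.101, 1.05, 1.051, 1.10]   # supply samples (V)
tight = count_solves(trace, tol=0.0001)  # re-solves at nearly every point
loose = count_solves(trace, tol=0.01)    # skips small PDN ripples
```

A looser tolerance trades a little waveform detail in the PDN for far fewer matrix factorizations, which is where the runtime goes.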

Power Event Control, using option: primesim_evtgrid_for_pdn

A final PrimeSim simulator option used to reduce runtimes was primesim_pdn_event_control=a:b, which works by applying an ideal power source for a:b, resulting in fewer matrix calculations for the PDN.

The combined use of all these PrimeSim options yielded a 5.2X simulation speed-up.

Summary

Engineers at SK hynix have been using both the FineSim and PrimeSim circuit simulators for analysis of their memory chip designs. Using four options in PrimeSim Pro has provided sufficient speed improvement to allow full-chip PDN analysis with SPF parasitics included. I expect that Synopsys will continue to innovate and improve their circuit simulator family to meet the growing challenges of memory chips and other IC design styles.

Related Blogs


Siemens Keynote Stresses Global Priorities

Siemens Keynote Stresses Global Priorities
by Bernard Murphy on 03-27-2023 at 6:00 am

Space Perspective

Dirk Didascalou, Siemens CTO, gave a keynote at DVCon, raising our perspective on why we do what we do. Yes, our work in semiconductor design enables the cloud and 5G and smart everything, but these technologies push progress for a select few. What about the big global concerns that affect us all: carbon, climate, COVID and conflict? He made the point that industry collectively has a poor record against green metrics: a 27% contributor to carbon, more than 33% of energy consumption, and less than 13% of products recycled.

We need industries globally, for food and clothing, energy, health, education, and opportunity. Returning to a pastoral way of life isn’t an option, so we must help industries become greener while adapting faster to demands and constraints that evolve rapidly thanks to geopolitical instability. Add in demographic ageing, relentlessly chipping away at the pool of critical skills in manufacturing. Siemens aims at these global challenges by helping industries become more efficient, more automated, and more nimble through digital transformation.

Optimizing industry

Manufacturing industries are very process driven. For conventional production flows, global optimization on-the-fly, reworking flows or product mixes, is very difficult. Improvements in these contexts are more commonly limited to local optimizations, tweaking the process recipe where possible. Global optimization through trial-and-error experiments is simply not practical. Auto manufacturers ran into exactly this problem, intrinsic inflexibility in the Henry Ford manufacturing model. To their credit, they are already adjusting, often with Siemens help.

Digital transformation allows industries to model whole product lines digitally and experiment with options. Not only to model but also to plan how to adapt those lines quickly in real life, and to plan for predictive maintenance. This is the digital twin concept, though going far beyond the familiar autonomous car training example. Here Dirk is talking about a digital twin to model a continuous, context-driven process for business through manufacturing.

Siemens is itself a manufacturing company. They have a factory in southern Germany producing many of the products they use to help other companies meet their automation goals. The Amberg site manufactures 17 million products a year, from a portfolio of 1,200. Each day they must reconfigure the factory 350 times to serve many different types of order. Siemens put their own digital transformation advice and products to work in this factory, delivering 14X productivity improvements on products with 2X the complexity in the same factory with the same number of people. The World Economic Forum has named the site one of their lighthouse factories.

What difference does this make to the big goals and to what we do? Siemens doesn’t need to produce 14X more products today. For the same product volume, those improvements drive lower energy consumption and therefore a lower carbon footprint. Digital transformation also minimizes need for trial-and-error modeling in the real world, a faster turnaround with less waste to produce better, greener manufactured goods. And it allows for more flexibility in quickly switching product features and mixes. Consumers get exactly the options they want at a similar cost, from more eco-friendly manufacturing. All enabled by digital twin models, sensing, compute and communication technologies and of course AI.

Real applications

One example is Space Perspective, a carbon-neutral spacecraft powered by a balloon! It can carry eight people in a 12 mile-per-hour ascent to 100,000 feet. The craft was designed completely digitally using Siemens Simcenter STAR-CCM+. Soon you won’t have to be a billionaire to go to space!

A more widely important example is vertical farming. 80 Acres Farms designed their indoor, vertically stacked farms using Siemens products. An 80 Acres farm can produce up to 300 times more food than a regular farm in the same footprint, using renewable energy, no pesticides, and 95% less water. These farms produce food locally to serve local needs, minimizing trucking costs and consequent impact on the environment.

Where does COVID fit into this story? Remember BioNTech? They produced the Pfizer vaccine, the first widely available shot. Designing the vaccine was a great accomplishment, but it then needed to be ramped to billions of doses in 6 months. That required more research on boosting immune response. Siemens products assisted with solutions to help simulate the impact of modeled molecular structures on immune response. A combination of simulations, AI, and results from clinical trials led to the vaccine many of us received, following a record development and production cycle for biotech.

Northvolt is another example. This is a Swedish company building lithium-ion batteries for EVs and other applications. This is a serious startup with $30 billion in funding, not a wishful one-off. Batteries are integral to making renewable energy more pervasive, but we hear lots of concerns about environmental issues in their manufacture. Northvolt’s mission is to deliver batteries with an 80% lower carbon footprint than those made in other factories, and they recycle material from used batteries into new products. These guys are committed. Again, the whole operation was designed digitally with Siemens – creation, commissioning, manufacture, deployment, and recycling.

There are more examples. Milling machines as a service – yes that’s a real thing. A German company offers a machine which can be de-featured to do just the basics, competing on price with cheap Asian counterparts. When needed you can pay for an upgrade, enabled naturally through an app, which will turn on a more advanced feature. Naturally there are multiple such features 😊.

Closer to home for automotive design, safety analysis and ML training through digital twins are enabled by Siemens EDA. Samsung presented later in the same conference on using Siemens Xcelerator tools to reduce functional safety iterations by 70% and to generate an integrated final validation report across the formal, simulation and emulation engines they used for ISO 26262 certification.

An inspiring keynote. Next time a relative asks you what you do for a living, aim a little higher. Tell them you design products that ultimately drive greener manufacturing, faster response to pandemic crises, and (who knows) maybe ultimately more constructive approaches to resolving conflict.