
Multiphysics Analysis from Chip to System
by akanksha soni on 04-10-2023 at 10:00 am


Multiphysics simulation is the use of computational methods to model and analyze a system and understand its response to different physical interactions like heat transfer, electromagnetic fields, and mechanical structures. Using this technique, designers can generate physics-based models and analyze the behavior of the system as a whole.

Multiphysics phenomena play a key role when designing any electronic device. In the real world, most devices that we use in our day-to-day lives contain electronics. These devices consist of chips, wires, antennas, casing, and many other components responsible for the final operation and execution of the product. The physical phenomena occur not just within an electronic device but also affect nearby devices. Therefore, it’s important to consider the effects of physical interactions from chip to system and the surrounding environment.

Alternative methods and their shortcomings

Understanding the electrical behavior of any device or system isn’t enough. Designers also need to consider Multiphysics aspects like thermal, mechanical stress/warpage, and electromagnetic effects. Designers may use different approaches to understand the Multiphysics behavior of the system at different levels.

Engineers can simulate each physical phenomenon separately and integrate the results to understand the cumulative behavior. This approach is time-consuming, prone to errors, and does not allow for a comprehensive analysis of the interactions between different physical fields. For example, temperature variations in a multi-die IC package can induce mechanical stress, and mechanical stress can impact the electromagnetic behavior of the system. Everything is correlated; therefore, a comprehensive Multiphysics solution is required for simulating the physics of the entire system.

To achieve high performance and speed goals, chip designers are embracing multi-die systems like 2.5D/3D-IC architectures. The number of vectors to be simulated in these systems has reached millions. Conventional IC design tools cannot handle this explosion of data, so chip designers have had to consider a limited set of data to analyze the Multiphysics behavior of the system. This approach might work if the system is not high-speed and not used in critical conditions, but it is definitely not applicable to today’s high-speed systems where reliability and robustness are the major requirements.

Ansys provides a comprehensive Multiphysics solution that can easily solve millions of vectors to thoroughly analyze the Multiphysics of the entire chip-package-system.

Advantages of Multiphysics Simulation from Chip to System

Comprehensive Multiphysics simulation is a powerful method that enables designers to accurately predict and optimize the behavior of complex systems at all levels, including chip, package, and system. Multiphysics simulation has many advantages, but some of the most prominent are:

  1. Enhanced Reliability: Comprehensive Multiphysics simulation analyzes the physics of each complex component in the system and also considers the interactions between different physical domains. This technique provides more accurate results, which ensures the reliability of the system. Ansys offers a wide range of Multiphysics solutions enabling designers to analyze Multiphysics at all levels: chip, package, system, and surrounding environment.
  2. Improved Performance: Multiphysics solutions give insight into the different physics domains, their interactions, and their impact on the integrity of the system. By knowing the design’s response to thermal and mechanical parameters along with its electrical behavior, designers can make informed decisions and modify the design to achieve the desired performance. In a 3D-IC package, the Ansys 3D-IC solution provides clear insight into power delivery, temperature variations, and mechanical stress/warpage around chiplets and the interposer, allowing designers to deliver higher performance.
  3. Design Flexibility: Designers can explore a wide range of design options and tradeoffs and make decisions based on yield, cost, and total design time. For example, in a 3D-IC package, designers can choose chiplets based on functionality, cost, and performance. Multiphysics simulation allows this flexibility without extra cost.
  4. Reduced Cost: Multiphysics simulation allows designers to identify potential design issues early in the development process, reducing the need for physical prototypes and lowering development costs. Using simulation, you can also trade off BOM costs against expected performance.
  5. Reduced Power Consumption: A system consists of multiple parts, and each part might have different power requirements. With Multiphysics simulation, designers can estimate the power consumption in each part of the system and optimize the power delivery network.

Ansys offers powerful simulation capabilities that can help designers optimize their products’ performance, reliability, and efficiency, from chip to system level. By using Ansys Multiphysics solutions, designers can make informed decisions throughout the design process.

Learn more about Ansys Multiphysics Simulation tools here:

Ansys Redhawk-SC | IC Electrothermal Simulation Software

High-Tech: Innovation at the Speed of Light | Ansys White Paper

Also Read:

Checklist to Ensure Silicon Interposers Don’t Kill Your Design

HFSS Leads the Way with Exponential Innovation

DesignCon 2023 Panel Photonics future: the vision, the challenge, and the path to infinity & beyond!


Feeding the Growing Hunger for Bandwidth with High-Speed Ethernet
by Madhumita Sanyal on 04-10-2023 at 6:00 am


The increasing demand for massive amounts of data is driving high-performance computing (HPC) to advance the pace of the high-speed Ethernet world. This, in turn, is increasing the levels of complexity when designing networking SoCs like switches, retimers, and pluggable modules. This growth is accelerating the need for bandwidth-hungry applications to transition from 400G to 800G and eventually to 1.6T Ethernet. In terms of SerDes data rate, this translates to 56G to 112G to 224G per lane.

The Move from NRZ to PAM4 SerDes and the Importance of Digital Signal Processing Techniques

Before 56G SerDes, Non-Return-to-Zero (NRZ) signaling was prevalent. It encodes binary data as a series of high and low voltage levels, without returning to a zero voltage level in between. NRZ signaling is mostly implemented using analog circuitry since the processing latency is low, which is suitable for high-speed applications. However, as data rates increased, the need for more advanced signal processing capabilities emerged. Digital circuits became more prevalent in SerDes designs at 56G, 112G, and 224G. Digital signal processing (DSP) circuits enable advanced functions such as equalization, clock and data recovery (CDR), and adaptive equalization, which are critical in achieving reliable high-speed data transmission.

Furthermore, the demands for lower power consumption and smaller form factor also drove the adoption of digital SerDes circuits. Digital circuits consume less power and can be implemented using smaller transistors, making them suitable for high-density integration. Pulse Amplitude Modulation with four levels (PAM4) has become a popular signaling method for high-speed communication systems due to its ability to transmit more data per symbol and its higher energy efficiency. However, PAM4 signaling requires more complex signal processing techniques to mitigate the effects of signal degradation and noise, so that the transmitted signal is recovered reliably at the receiver end. In this article, we will discuss the various DSP techniques used in PAM4 SerDes.
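
As a simple illustration of the extra bits per symbol, the short sketch below maps bit pairs onto four gray-coded amplitude levels; the level values, mapping, and function names here are illustrative choices, not taken from any particular standard.

```python
import numpy as np

# Gray-coded PAM4 mapping: each pair of bits becomes one of four levels.
# Gray coding means adjacent levels differ by one bit, so a noise-induced
# slip into a neighboring level costs only a single bit error.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}

def bits_to_pam4(bits):
    """Map an even-length bit sequence to PAM4 symbols (2 bits per symbol)."""
    pairs = zip(bits[0::2], bits[1::2])
    return np.array([GRAY_PAM4[p] for p in pairs], dtype=float)

bits = np.random.randint(0, 2, 16)
print(bits_to_pam4(bits))  # 8 symbols carrying 16 bits -- twice NRZ's payload per symbol
```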

Figure 1: PAM4 DSP for 112G and beyond

DSP techniques used in PAM4 SerDes

Equalization

Equalization is an essential function of the PAM4 SerDes DSP circuit. The equalization circuitry compensates for the signal degradation caused by channel impairments such as attenuation, dispersion, and crosstalk. PAM4 equalization can be implemented using various techniques such as:

  • Feedforward Equalization (FFE)
  • Decision-Feedback Equalization (DFE)
  • Adaptive Equalization

Feedforward Equalization (FFE) is a type of equalization that compensates for the signal degradation by amplifying or attenuating specific frequency components of the signal. FFE is implemented using a linear filter, which boosts or attenuates the high-frequency components of the signal. The FFE circuit uses an equalizer tap to adjust the filter coefficients. The number of taps determines the filter’s complexity and its ability to compensate for the channel impairments. FFE can compensate for channel impairments such as attenuation, dispersion, and crosstalk but is not effective in mitigating inter-symbol interference (ISI).
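
A minimal way to picture an FFE is as a short FIR filter applied to the sampled signal. The sketch below shows the basic weighted-sum-per-sample structure; the tap values are made up purely for illustration.

```python
import numpy as np

def ffe(samples, taps):
    """Feedforward equalizer modeled as a simple FIR filter.

    Each output sample is a weighted sum of the current and previous
    received samples; the tap weights boost or attenuate frequency
    content to counteract channel loss.
    """
    out = np.zeros(len(samples))
    for n in range(len(samples)):
        for k, c in enumerate(taps):
            if n - k >= 0:
                out[n] += c * samples[n - k]
    return out

# Illustrative 3-tap FFE: main cursor (tap 1) with small correction taps around it.
taps = [-0.1, 1.0, -0.25]
rx = np.array([0.1, 0.9, 1.1, -0.8, -1.0, 0.2])
print(ffe(rx, taps))
```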

Decision-Feedback Equalization (DFE) is a more advanced form of equalization that compensates for the signal degradation caused by ISI. ISI is a phenomenon in which the signal’s energy from previous symbols interferes with the current symbol, causing distortion. DFE works by subtracting the estimated signal from the received signal to cancel out the ISI. The DFE circuit uses both feedforward and feedback taps to estimate and cancel out the ISI. The feedback taps compensate for the distortion caused by the previous symbol, and the feedforward taps compensate for the distortion caused by the current symbol. DFE can effectively mitigate ISI but requires more complex circuitry compared to FFE.
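
The feedback path is what distinguishes a DFE: symbol decisions already made are weighted by feedback taps and subtracted before the next decision. A minimal PAM4 DFE loop might look like the sketch below; the slicer levels and tap value are illustrative.

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])  # ideal PAM4 levels

def slicer(x):
    """Decide the nearest PAM4 level for one equalized sample."""
    return LEVELS[np.argmin(np.abs(LEVELS - x))]

def dfe(samples, fb_taps):
    """Decision-feedback equalizer: subtract ISI estimated from past decisions."""
    decisions = []
    for x in samples:
        # Estimate trailing ISI from the most recent decisions and remove it.
        isi = sum(c * d for c, d in zip(fb_taps, reversed(decisions)))
        decisions.append(slicer(x - isi))
    return np.array(decisions)

# Illustrative single feedback tap canceling 20% leakage from the prior symbol.
rx = np.array([3.1, -0.4, 1.5, -2.7])
print(dfe(rx, fb_taps=[0.2]))  # -> [ 3. -1.  1. -3.]
```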

Adaptive Equalization is a technique that automatically adjusts the equalization coefficients based on the characteristics of the channel. It uses an adaptive algorithm that estimates the channel characteristics and updates the equalization coefficients to optimize the signal quality. The Adaptive Equalization circuit uses a training sequence to estimate the channel response and adjust the equalizer coefficients. The circuit can adapt to changing channel conditions and is effective in mitigating various channel impairments.
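
One common adaptive algorithm, used here purely as an illustration since the article does not say which algorithm a given SerDes implements, is least mean squares (LMS): each tap is nudged in proportion to the error between the equalizer output and the known training symbol.

```python
import numpy as np

def lms_train(rx, training, n_taps=5, mu=0.01):
    """Adapt FFE taps with LMS against a known training sequence.

    mu is the adaptation step size: larger converges faster but is noisier.
    """
    taps = np.zeros(n_taps)
    taps[0] = 1.0  # start from a pass-through equalizer
    for n in range(n_taps, len(rx)):
        window = rx[n - n_taps + 1:n + 1][::-1]  # most recent sample first
        y = np.dot(taps, window)                 # equalizer output
        err = training[n] - y                    # error vs. known symbol
        taps += mu * err * window                # LMS tap update
    return taps

# Illustrative use: adapt taps toward the inverse of a mild 2-tap channel.
tx = np.random.choice([-3.0, -1.0, 1.0, 3.0], 200)
rx = tx + 0.3 * np.concatenate(([0.0], tx[:-1]))  # add one post-cursor of ISI
print(lms_train(rx, tx))
```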

Clock and Data Recovery (CDR)

Clock and Data Recovery (CDR) is another essential function of the PAM4 SerDes DSP circuit. The CDR circuitry extracts the clock signal from the incoming data stream, which is used to synchronize the data at the receiver end. Clock extraction is challenging in PAM4 because transitions occur between multiple amplitude levels, making it harder to separate the timing information from the data. Various techniques such as Phase-Locked Loop (PLL) and Delay-Locked Loop (DLL) can be used for PAM4 CDR.

PLL is a technique that locks the oscillator frequency to the incoming signal’s frequency. The PLL measures the phase difference between the incoming signal and the oscillator and adjusts the oscillator frequency to match the incoming signal’s frequency. The PLL circuit uses a Voltage-Controlled Oscillator (VCO) and a Phase Frequency Detector (PFD) to generate the clock signal. PLL-based CDR is commonly used in PAM4 SerDes because it is more robust to noise and has better jitter performance compared to DLL-based CDR.

DLL is a technique that measures the time difference between the incoming signal and the reference signal and adjusts the phase of the incoming signal to align with the reference signal. The DLL circuit uses a delay line and a Phase Detector (PD) to generate the clock signal. DLL-based CDR is less common in PAM4 SerDes because it is more susceptible to noise and has worse jitter performance compared to PLL-based CDR.
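
To make the loop concrete, the sketch below models a bare-bones digital CDR loop: a phase detector output drives a proportional-integral loop filter, which steers a numerically controlled oscillator (NCO). It is a behavioral toy rather than any particular product’s architecture, and the gain values are arbitrary.

```python
def cdr_loop(phase_errors, kp=0.05, ki=0.002):
    """Toy proportional-integral CDR loop.

    phase_errors: sequence of phase detector outputs (one per symbol).
    Returns the NCO phase trajectory as the loop pulls toward lock.
    """
    integrator = 0.0
    nco_phase = 0.0
    trajectory = []
    for err in phase_errors:
        integrator += ki * err               # integral path tracks frequency offset
        nco_phase += kp * err + integrator   # proportional + integral correction
        trajectory.append(round(nco_phase, 4))
    return trajectory

# Illustrative input: a constant "early" indication, as if the clock lags the data.
print(cdr_loop([1.0] * 10))
```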

Advanced DSP techniques

Maximum Likelihood Sequence Detection (MLSD) is used to improve signal quality and mitigate channel impairments in high-speed communication systems requiring very long reaches. MLSD is a digital signal processing technique that uses statistical models and probability theory to estimate the transmitted data sequence from the received signal. MLSD works by generating all possible data sequences and comparing them with the received signal to find the most likely transmitted sequence. The MLSD algorithm uses the statistical properties of the signal and channel to calculate the likelihood of each possible data sequence, and the sequence with the highest likelihood is selected as the estimated transmitted data sequence. It is a complex and computationally intensive technique, requiring significant processing power and memory. However, MLSD can provide significant improvements in signal quality and transmission performance, especially in channels with high noise, interference, and dispersion.

Figure 2: Need for MLSD: Channel Library of 40+ dB IL equalized by 224G SerDes

There are several variants of MLSD, including:

  • Viterbi Algorithm
  • Decision Feedback Sequence Estimation (DFSE)
  • Soft-Output MLSD

The Viterbi Algorithm is a popular MLSD algorithm that uses a trellis diagram to generate all possible data sequences and find the most likely sequence. The Viterbi Algorithm can provide excellent performance in channels with moderate noise and ISI, but it may suffer from error propagation in severe channel conditions.
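
As a sketch of how trellis-based MLSD works in practice, the example below runs the Viterbi algorithm for PAM4 over a hypothetical channel with one postcursor tap, where the trellis state is simply the previous symbol. The channel coefficients and test data are made up for illustration.

```python
import numpy as np

LEVELS = [-3.0, -1.0, 1.0, 3.0]   # PAM4 symbol alphabet
H = [1.0, 0.4]                    # hypothetical channel: main cursor + one postcursor

def viterbi_pam4(rx):
    """MLSD via the Viterbi algorithm for a 2-tap ISI channel.

    State = previous transmitted symbol (4 states). The branch metric is the
    squared error between the received sample and the symbol-plus-ISI model.
    """
    n_states = len(LEVELS)
    cost = np.zeros(n_states)               # best path metric ending in each state
    paths = [[] for _ in range(n_states)]   # surviving symbol sequence per state
    for r in rx:
        new_cost = np.full(n_states, np.inf)
        new_paths = [None] * n_states
        for cur in range(n_states):         # candidate current symbol
            for prev in range(n_states):    # candidate previous symbol (state)
                predicted = H[0] * LEVELS[cur] + H[1] * LEVELS[prev]
                metric = cost[prev] + (r - predicted) ** 2
                if metric < new_cost[cur]:
                    new_cost[cur] = metric
                    new_paths[cur] = paths[prev] + [LEVELS[cur]]
        cost, paths = new_cost, new_paths
    return paths[int(np.argmin(cost))]      # most likely transmitted sequence

# Illustrative use: symbols distorted by the 2-tap channel (noise-free for clarity).
tx = [3.0, -1.0, 1.0, -3.0, 1.0]
rx = [H[0] * s + H[1] * p for s, p in zip(tx, [0.0] + tx[:-1])]
print(viterbi_pam4(rx))  # recovers tx in this noise-free case
```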

DFSE is another MLSD algorithm that uses feedback from the decision output to improve sequence estimation accuracy. DFSE can provide better performance than the Viterbi Algorithm in channels with high ISI and crosstalk, but it requires more complex circuitry and higher processing power.

Soft-Output MLSD is a variant of MLSD that provides probabilistic estimates of the transmitted data sequence. It can provide significant improvements in the error-correction performance of the system, especially when combined with FEC techniques such as LDPC. Soft-Output MLSD requires additional circuitry to generate the soft decisions, but it can provide significant benefits in terms of signal quality and error-correction capabilities.

Forward Error Correction Techniques

In addition to DSP methods, Forward Error Correction (FEC) techniques add redundant data to the transmitted signal to detect and correct errors at the receiver end. FEC is an effective technique to improve signal quality and ensure reliable transmission. There are various FEC techniques that can be used, including Reed-Solomon (RS) and Low-Density Parity-Check (LDPC) codes.

RS is a block-code FEC technique that adds redundant parity symbols to the transmitted signal to detect and correct errors. RS is widely used in PAM4 SerDes because it is simple, efficient, and has good error-correction capabilities. LDPC is a more advanced FEC technique that uses a sparse parity-check matrix and iterative decoding to achieve stronger error correction, at the cost of higher complexity and latency.
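
As a rough, back-of-the-envelope illustration of what block-code redundancy costs and buys, the sketch below summarizes an RS(544,514) code over 10-bit symbols, the “KP4” FEC commonly associated with PAM4 Ethernet; treat the parameters as illustrative rather than a statement about any specific 224G standard.

```python
def rs_summary(n, k, symbol_bits):
    """Summarize a Reed-Solomon block code RS(n, k) over symbol_bits-bit symbols."""
    parity = n - k             # redundant symbols added per codeword
    t = parity // 2            # symbol errors correctable per codeword
    overhead = parity / k      # extra bandwidth spent on redundancy
    return {
        "payload_bits": k * symbol_bits,
        "parity_symbols": parity,
        "correctable_symbols": t,
        "overhead_pct": round(100 * overhead, 2),
    }

# RS(544, 514): 30 parity symbols, corrects up to 15 symbol errors per
# codeword, at roughly 5.8% bandwidth overhead.
print(rs_summary(544, 514, 10))
```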

Defining the Future of 224G SerDes Architecture

In summary, the IEEE 802.3df working group and the Optical Internetworking Forum (OIF) consortium are focused on the definition of the 224G interface. The analog front-end bandwidth has increased by 2X for PAM4 or by 1.5X for PAM6 to achieve 224G. An ADC with improved accuracy and reduced noise is required. Stronger equalization is needed to compensate for the additional loss due to the higher Nyquist frequency, with more taps in the FFE and DFE. MLSD advanced DSP will provide significant improvements in signal quality and transmission performance at 224G. MLSD algorithms such as the Viterbi Algorithm, DFSE, and Soft-Output MLSD can be used to improve sequence estimation accuracy and mitigate channel impairments such as noise, interference, and dispersion. However, MLSD algorithms are computationally intensive and require significant processing power and memory, so designers should carefully trade off their use between C2M and cabled host applications.

Synopsys has been a developer of SerDes IP for many generations, playing an integral role in defining the PAM4 solution with DSP at 224G. Now, Synopsys has a silicon-proven 224G Ethernet solution that customers can reference to achieve their own silicon success. Synopsys provides integration-friendly deliverables for the 224G Ethernet PHY, PCS, and MAC with expert-level support, which can make customers’ lives easier by reducing the overall design cycle and helping them bring their products to market faster.

Also Read:

Full-Stack, AI-driven EDA Suite for Chipmakers

Power Delivery Network Analysis in DRAM Design

Intel Keynote on Formal a Mind-Stretcher

Multi-Die Systems Key to Next Wave of Systems Innovations

 


Podcast EP151: How Pulsic Automates Analog Layout for Free with Mark Williams
by Daniel Nenni on 04-07-2023 at 10:00 am

Dan is joined by Mark Williams. Mark is a Founder and Chief Executive Officer at Pulsic. He has over 30 years of experience in the EDA industry and was one of the team that pioneered shape-based routing technology in the 1980s while working at Zuken-Redac.

Mark explains the unique “freemium” model for automating analog layout developed by Pulsic. The Animate product offers this capability. Over the past two years, there have been two versions of the tool released. Mark describes the results of this new and innovative business model and previews what’s coming at DAC this year.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Takeaways from SNUG 2023
by Bernard Murphy on 04-07-2023 at 6:00 am


Synopsys pulled out all the stops for this event. I attended the first full day, tightly scripted from Aart’s keynote kick off to 1×1 interviews with Synopsys executives to a fireside chat between Sassine Ghazi (President and COO) and Rob Aitken (ex-Fellow at Arm, now Distinguished Architect at Synopsys). That’s a lot of material; I will condense heavily to just takeaways. My colleague Kalar will cover the main Synopsys.ai theme (especially Aart’s talk).

AI in EDA

The big reveal was broad application of AI in the Synopsys tool suite, under the Synopsys.ai banner. This is exemplified in DSO.ai (design space optimization), VSO.ai (verification space optimization) and TSO.ai (test space optimization). I talked about DSO.ai in an earlier blog, a reinforcement learning method starting from an untrained algorithm. The latest version is now learning through multiple parallel training runs, advancing quickly to better PPA than customers had been able to find before, in much less time and needing only one engineer. Synopsys claim similar results for VSO.ai, for which the objective is to reduce time to coverage targets and increase coverage, and for TSO.ai, for which the objective is to reduce the number of ATPG vectors required for the same or better coverage.

My discussions with execs including Thomas Anderson (VP AI and Machine Learning at Synopsys) suggest that these are all block level optimizations. DSO.ai wraps around Fusion Compiler (a block-level capability) and TSO.ai wraps around TestMax here advertised for the ATPG feature, not BIST. Similarly, the coverage metric for VSO.ai suggests functional coverage, not including higher level coverage metrics. Don’t get me wrong. These are all excellent capabilities, providing a strong base for implementation and verification at the system level.

Aart did talk about optimization for memories, I think in a system context, indicating they are exploring the system level. I think AI application at that level will be harder and will advance incrementally. Advances in performance and power now depend on architecture rather than process, driving a lot of architecture innovation which will impede approaches to optimization which insufficiently understand the architecture. Further, any system-level optimizer will need to collaborate with in-house and commercial generators (for mesh networks and NoCs for example), inevitably slowing progress. Finally, optimization at the system level must conform to a broader set of constraints, including natural language specifications and software-level use-case tests. Speculating on Aart’s memory example, perhaps optimization can be applied to a late-stage design to replace existing IP instances with improved instances. That would certainly be valuable.

What about pushing AI through the field to customers? Alessandra Costa (Sr. VP WW customer success at Synopsys) tells me that at a time when talent scarcity has become an issue for everyone, the hope of increasing productivity is front and center for design teams. She tells me, “There are high expectations and some anxiety on AI to deliver on its promises.”

DSO.ai has delivered, with over 160 tapeouts at this point, encouraging hope that expectations may outweigh anxiety. DSO.ai is now in the curriculum for implementation AEs, driving wider adoption of the technology across market segments and regions. Verification is an area where shortage of customer talent is even more acute. Alessandra expects the same type of excitement and adoption in this space as proven for DSO.ai. Verification AEs are now being trained on VSO.ai and are actively involved in deployment campaigns.

Multi-die systems

Aart talked about this, and I also talked with Shekhar Kapoor (Sr. Director of Marketing) to understand multi-die as a fast-emerging parameter in design architecture. These ideas seem much more real now, driven by HPC needs, and also by automotive and mobile. Shekhar had a good mobile example. The system form factor is already set, yet battery sizes are increasing, shrinking the space for chips. At the same time each new release must add more functionality, like support for multiplayer video games. That is too much to fit in a single reticle, but you still need high performance and low power at a reasonable unit cost. In HPC, huge bandwidth is always in demand and memory needs to sit close to logic. Multi-die isn’t easy, but customers are now saying they have no choice.

Where does AI fit in all of this? Demand for stacking is still limited but expected to grow. Interfaces between stacked die will support embedded UCIe and HBM connections. These high frequency links require signal and power integrity analyses. Stacking amplifies thermal problems, so thermal analysis is also critical. Over-margining everything becomes increasingly impractical at these complexities, requiring a more intelligent solution. Enter reinforcement learning. Learning still must run the full suite of analyses (just as DSO.ai does with Fusion Compiler), running multiple jobs in parallel to find its way to goal parameters.

There are still open challenges in multi-die as Dan Nenni has observed. How do you manage liability? Mainstream adopters like AMD build all their own die (apart from memories) for their products so can manage the whole process. The industry is still figuring out how to manageably democratize this process to more potential users.

Other notable insights

I had a fun chat with Sassine at dinner the night before. We talked particularly about design business dynamics between semiconductor providers and system companies, whether design activity among the hyperscalers and others is a blip or a secular change. I’m sure he has more experience than I do but he is a good listener and was interested in my views. He made a point that systems companies want vertical solutions, which can demand significant in-house expertise to specify, design and test and of course should be differentiated from competitors.

Rapid advances in systems technology and the scale of those technologies make it difficult for semiconductor component builders to stay in sync. So maybe the change is secular, at least until hyperscaler, Open RAN and automotive architectures settle on stable stacks? I suggested that a new breed of systems startups might step into the gap. Sassine wasn’t so certain, citing even more challenging scale problems and competition with in-house teams. True, though I think development under partnerships could be a way around that barrier.

I had another interesting talk with Alessandra, this time on DEI (diversity, equity and inclusion). Remember her earlier point about lack of talent and the need to introduce more automation? A complementary approach is to start developing interest among kids in high school. University-centric programs may be too late. She has a focus on girls and minorities at that age, encouraging them to play with technologies, through Raspberry Pi or Arduino. I think this is a brilliant idea. Some may simply be attracted to the technology for the sake of the technology. Perhaps others could be drawn in by helping them see the tech as a means to an end – projects around agriculture or elder care for example.

Good meeting and my hat is off to the organizers!

Also Read:

Full-Stack, AI-driven EDA Suite for Chipmakers

Power Delivery Network Analysis in DRAM Design

Intel Keynote on Formal a Mind-Stretcher

Multi-Die Systems Key to Next Wave of Systems Innovations

 


TSMC has spent a lot more money on 300mm than you think
by Scotten Jones on 04-06-2023 at 10:00 am


Up until November of 2022, IC Knowledge LLC was an independent company and had become the world leader in cost and price modeling of semiconductors. In November 2022 TechInsights acquired IC Knowledge LLC and IC Knowledge LLC is now a TechInsights company.

For many years, IC Knowledge has published a database tracking all the 300mm wafer fabs in the world. Compiled from a variety of public and private sources, we believe the 300mm Watch database is the most detailed database of 300mm wafer fabs available. IC Knowledge LLC also produces the Strategic Cost and Price Model that provides detailed cost and price modeling for 300mm wafer fabs as well as detailed equipment and materials requirements. The ability to utilize both products to analyze a company provides a uniquely comprehensive view and we recently utilized these capabilities to do a detailed analysis of TSMC’s 300mm wafer fabs.

One way we check the modeling results of the Strategic Cost and Price Model is to compare the modeled spending on 300mm fabs for TSMC to their reported spending. Since the early 2000s nearly all of TSMC’s capital spending has been on 300mm wafer fabs and the Strategic Model covers every TSMC 300mm wafer fab.

Figure 1 presents an analysis of TSMC’s cumulative capital spending by wafer fab site from 2000 to 2023 and compares it to the reported TSMC capital spending.

Figure 1. TSMC Wafer Fab Spending by Fab.

Figure 1 shows a cumulative area plot by wafer fab, calculated using the Strategic Cost and Price Model – 2023 – revision 01 – unreleased, and a set of bars representing TSMC’s reported capital spending. One key thing to note about this plot is that the Strategic Cost and Price Model is a cost and price model, and fabs don’t depreciate until they are put on-line; therefore, the calculated spending from the model reflects when the fabs come on-line, whereas the reported TSMC spending reflects when the expenditure is made, regardless of when it comes on-line. TSMC’s capital spending also includes some 200mm fab spending, as well as mask and packaging spending. The TSMC reported spending is adjusted as follows:

  1. In the early 2000s, estimated 200mm spending is subtracted from the totals. In some cases, TSMC announced what portion of capital spending was 200mm. In the overall cumulative total through 2022 this is not a material amount of spending.
  2. Recently, roughly 10% of TSMC’s capital spending has been for masks and packaging; TSMC discloses this, and it is subtracted from the total.
  3. When capital equipment is acquired but not yet put on-line, it is accounted for as assets in progress, and this number is disclosed in financial filings. We subtract this number from the reported spending because the Strategic Model calculates on-line capital (the adjustment arithmetic is sketched below).
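
A minimal sketch of that adjustment arithmetic, with placeholder numbers rather than TSMC’s actual figures:

```python
def adjusted_300mm_spending(reported, spend_200mm, masks_pkg_pct, assets_in_progress):
    """Adjust reported capital spending down to on-line 300mm fab spending.

    reported           -- total reported capex for the year
    spend_200mm        -- estimated 200mm portion (early 2000s only)
    masks_pkg_pct      -- fraction of capex spent on masks and packaging
    assets_in_progress -- equipment acquired but not yet put on-line
    """
    return reported - spend_200mm - masks_pkg_pct * reported - assets_in_progress

# Placeholder example, purely to show the mechanics of the adjustment.
print(adjusted_300mm_spending(reported=36.0, spend_200mm=0.0,
                              masks_pkg_pct=0.10, assets_in_progress=4.0))
```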

Note that fabs 12 and 20 are/will be in Hsinchu – Taiwan, Fabs 14 and 18 are in Tainan – Taiwan, Fab 15 is in Taichung – Taiwan, Fab 16 is in Nanjing – China, Fab 21 is in Arizona – United States, Fab 22 is planned for Kaohsiung – Taiwan and Fab 23 is being built in Kumamoto – Japan.

Some interesting conclusions from this analysis:

TSMC has spent roughly $135 billion on 300mm wafer fabs through 2022. This number should break $200 billion in 2024.

Fab 18 is TSMC’s most expensive fab (5nm and 3nm production); we expect that site to exceed $100 billion in investment next year. Interestingly, Fab 18 is right next to Fab 14, where an investment of more than $30 billion has taken place, and the combination will approach $140 billion next year!

The capital investment of roughly $135 billion in 300mm fabs just by TSMC is an amazing number. Perhaps even more amazing is that the investment is accelerating: it should break $200 billion in 2024 and could break $400 billion by 2030.

Customers that license our 300mm Watch channel not only get the 300mm Watch database along with regular updates, they also get access to this recent TSMC analysis and will get access to a similar analysis we are doing of Samsung. For information on the 300mm Watch database or the Strategic Cost and Price Model, please contact sales@techinsights.com.

Also Read:

SPIE Advanced Lithography Conference 2023 – AMAT Sculpta® Announcement

IEDM 2023 – 2D Materials – Intel and TSMC

IEDM 2022 – TSMC 3nm

IEDM 2022 – Imec 4 Track Cell


Interconnect Under the Spotlight as Core Counts Accelerate
by Bernard Murphy on 04-06-2023 at 6:00 am


In the march to more capable, faster, smaller, and lower power systems, Moore’s Law gave software a free ride for 30 years or so purely on semiconductor process evolution. Compute hardware delivered improved performance/area/power metrics every year, allowing software to expand in complexity and deliver more capability with no downsides. Then the easy wins became less easy. More advanced processes continued to deliver higher gate counts per unit area, but gains in performance and power started to flatten out. Since our expectations for innovation didn’t stop, hardware architecture advances have become more important in picking up the slack.

Drivers for increasing core-count

An early step in this direction used multi-core CPUs to accelerate total throughput by threading or virtualizing a mix of concurrent tasks across cores, reducing power as needed by idling or powering down inactive cores. Multi-core is standard today and a trend in many-core (even more CPUs on a chip) is already evident in server instance options available in cloud platforms from AWS, Azure, Alibaba and others.

Multi-/many-core architectures are a step forward, but parallelism through CPU clusters is coarse-grained and has its own performance and power limits, thanks to Amdahl’s law. Architectures became more heterogeneous, adding accelerators for image, audio, and other specialized needs. AI accelerators have also pushed fine-grained parallelism, moving to systolic arrays and other domain-specific techniques. This was working pretty well until ChatGPT appeared: GPT-3 with 175 billion parameters, evolving into GPT-4 with reportedly as many as 100 trillion parameters – orders of magnitude more complex than today’s AI systems – forcing yet more specialized acceleration features within AI accelerators.

On a different front, multi-sensor systems in automotive applications are now integrating into single SoCs for improved environment awareness and improved PPA. Here, new levels of autonomy in automotive depend on fusing inputs from multiple sensor types within a single device, in subsystems replicating by 2X, 4X or 8X.

According to Michał Siwinski (CMO at Arteris), sampling over a month of discussions with multiple design teams across a wide range of applications suggests those teams are actively turning to higher core counts to meet capability, performance, and power goals. He tells me they also see this trend accelerating. Process advances still help with SoC gate counts, but responsibility for meeting performance and power goals is now firmly in the hands of the architects.

More cores, more interconnect

More cores on a chip imply more data connections between those cores. Within an accelerator between neighboring processing elements, to local cache, to accelerators for sparse matrix and other specialized handling. Add hierarchical connectivity between accelerator tiles and system level buses. Add connectivity for on-chip weight storage, decompression, broadcast, gather and re-compression. Add HBM connectivity for working cache. Add a fusion engine if needed.

The CPU-based control cluster must connect to each of those replicated subsystems and to all the usual functions – codecs, memory management, safety island and root of trust if appropriate, UCIe if a multi-chiplet implementation, PCIe for high bandwidth I/O, and Ethernet or fiber for networking.

That’s a lot of interconnect, with direct consequences for product marketability. In processes below 16nm, NoC infrastructure now contributes 10-12% in area. Even more important, as the communication highway between cores, it can have significant impact on performance and power. There is real danger that a sub-optimal implementation will squander expected architecture performance and power gains, or worse yet, result in numerous re-design loops to converge.  Yet finding a good implementation in a complex SoC floorplan still depends on slow trial-and-error optimizations in already tight design schedules. We need to make the jump to physically aware NoC design, to guarantee full performance and power support from complex NoC hierarchies and we need to make these optimizations faster.

Physically aware NoC design keeps Moore’s Law on track

Moore’s law may not be dead but advances in performance and power today come from architecture and NoC interconnect rather than from process. Architecture is pushing more accelerator cores, more accelerators within accelerators, and more subsystem replication on-chip. All increase the complexity of on-chip interconnect. As designs increase core counts and move to process geometries at 16nm and below, the numerous NoC interconnects spanning the SoC and its sub-systems can only support the full potential of these complex designs if implemented optimally against physical and timing constraints – through physically aware network on chip design.

If you also worry about these trends, you might want to learn more about Arteris FlexNoC 5 IP technology HERE.

 


AI is Ushering in a New Wave of Innovation
by Greg Lebsack on 04-05-2023 at 10:00 am


Artificial intelligence (AI) is transforming many aspects of our lives, from the way we work and communicate to the way we shop and travel. Its impact is felt in nearly every industry, including the semiconductor industry, which plays a crucial role in enabling the development of AI technology.

One of the ways AI is affecting our daily lives is by making everyday tasks more efficient and convenient. For example, AI-powered virtual assistants such as Alexa and Siri can help us schedule appointments, set reminders, and answer our questions. AI algorithms are also being used in healthcare to analyze patient data and provide personalized treatment plans, as well as in finance to detect fraud and make investment decisions.

AI is also changing the way we work. Many jobs that used to require human labor are now being automated using AI technology. For example, warehouses are increasingly using robots to move and sort goods, and customer service departments are using chatbots to handle routine inquiries.

The semiconductor industry is a critical component of the AI revolution. AI relies on powerful computing processors, such as graphics processing units (GPUs) and deep learning processors (DLPs), to process massive amounts of data and perform complex calculations. The demand for these chips has skyrocketed in recent years, as more companies invest in AI technology.

AI is beginning to have an impact on the design and verification of ICs. AI can be used to improve the overall design process by providing designers with new tools and insights. For example, AI-powered design tools can help designers explore design alternatives and identify tradeoffs between performance, power consumption, and cost. AI can also be used to provide designers with insights into the behavior of complex systems, such as the interaction between software and hardware in an embedded system.

AI is enabling the development of new types of chips and systems. For example, AI is driving the development of specialized chips for specific AI applications, such as image recognition and natural language processing. These specialized chips can perform these tasks much faster and more efficiently than general-purpose processors and are driving new advances in AI technology.

Semiconductor fabrication is the largest expenditure, and AI has great potential in this area. AI can help optimize the manufacturing process from design to fabrication by analyzing process data, identifying defects, and suggesting optimizations. These insights and changes will allow fabs to detect problems earlier, reducing cost, increasing yield, and improving overall efficiency.

There are also many concerns with a technology this disruptive. While this automation can potentially increase productivity and reduce costs, it also raises concerns about job loss and the need for workers to acquire new skills. There are also a number of ethical concerns associated with AI. AI systems can collect and analyze large amounts of personal data, raising concerns about privacy and surveillance. There are also concerns about the potential for corporations and governments to misuse this data for their own purposes.

AI is transforming many aspects of our lives, from the way we work and communicate to the way we shop and travel. The semiconductor industry is a critical component of the AI revolution, not only providing the computing power to enable AI, but also benefiting from AI for IC design and manufacturing improvements. As AI technology continues to advance, it is likely that it will continue to play an increasingly important role in the semiconductor design process, enabling new levels of innovation and driving new advances in AI technology. It is essential to stay informed about AI’s impact and ensure that its benefits are realized while minimizing the potential risks.

Also Read:

Narrow AI vs. General AI vs. Super AI

Scaling AI as a Service Demands New Server Hardware

MIPI D-PHY IP brings images on-chip for AI inference

Deep thinking on compute-in-memory in AI inference


LIVE WEBINAR: New Standards for Semiconductor Materials
by Daniel Nenni on 04-05-2023 at 6:00 am


This is the 5th webinar in our series of webinars to explore trending topics on materials and semiconductor development. Join us to discover how digital solutions are forming new ways of operating in a fast-paced, highly demanding semiconductor industry.

With data analytics and digital tools, we are setting new standards for the way we develop new materials, manufacture, control our processes, and supply our customers. During this webinar, you will learn how we pair engineering principles with data analytics capabilities to firstly, drive digitalization with digital twin deployment and secondly, establish comprehensive data analytics methods to deploy descriptive, predictive, and prescriptive solutions throughout the organization.

LEARN MORE 

Why attend? 

Our Semiconductor Materials Series attracts professionals, business and technology leaders, researchers, academics, and industry analysts from across the electronics supply chain around the world.

In this webinar, you will gain practical insights on:

  • Data-based operations of a semiconductor materials supplier
  • Digital twin deployment in every-day operations
  • Advanced data analytics methodologies to drive innovation, increase transparency, and act with speed

Who should attend? 

  • Quality experts, process engineers, Supply Chain, Process, and Technology Development teams in semiconductor companies
  • Data enthusiasts and digitalization experts

Register today to access exclusive content and engage in the interactive Q&A session.

You will be able to apply innovative techniques and best practices to solve your unique challenges.

Attendees are invited to submit questions ahead of time at info_semi_webinar@emdgroup.com.

AGENDA

4:00 pm – 4:05 pm
Laith Altimime
President
SEMI Europe

Welcome Remarks

Anand Nambiar
Executive Vice President and Global Head, Semiconductor Materials
The Electronics business of Merck KGaA, Darmstadt, Germany
4:05 pm – 4:45 pm
Dr. Safa Kutup Kurt
Executive Director and Head of Operations of Digital Solutions
The Electronics business of Merck KGaA, Darmstadt, Germany

Presentation

Biography
His organization is a key enabler to design and optimize products by using data analytics methodology in R&D, quality, and supply chain while ensuring data protection in sensitive environments. After gaining a bachelor’s degree in Chemical Engineering and Business Administration in Turkey, Kutup received a Master’s degree in Industry 4.0 Technologies at the TU Dortmund University in Germany. He also gained his Ph.D. in Chemical Engineering at the same university. He and his team have led several data-driven process optimization projects worldwide for the Electronics and Life Science business sectors. He has co-authored over 16 technical papers and patents focused on smart and continuous manufacturing technology, equipment design, and process intensification.

Anja Muesch
Head of Use Case Management of Digital Solutions
The Electronics business of Merck KGaA, Darmstadt, Germany

Presentation

Biography
Anja’s work is focused on portfolio development and expansion of data sharing and analytics engagements. Her team manages Use Cases for customers and suppliers along the use case life cycle. She holds an MSc in Business Chemistry from the Heinrich-Heine University in Düsseldorf, Germany, and the Universiteit van Amsterdam, the Netherlands. She has in-depth experience in project management and strategy development focusing on Data and Digital.

4:45 pm – 5:00 pm

Live Q&A and Conclusions

REGISTRATION

Registration is FREE of charge. If you miss the live session, view the recording on-demand.

LEARN MORE

DIGITAL SOLUTIONS FOR THE SEMICONDUCTOR INDUSTRY

The semiconductor industry demands higher yield, zero-defect production, and reduced time to market to meet the ultimate goal – the “ideal ramp”.

Merck KGaA, Darmstadt, Germany, applies advanced analytics methods to design and optimize our products in R&D, Quality, and High Volume Manufacturing while ensuring data security in our sensitive environments.

Also Read:

Step into the Future with New Area-Selective Processing Solutions for FSAV

Integrating Materials Solutions with Alex Yoon of Intermolecular

Ferroelectric Hafnia-based Materials for Neuromorphic ICs


Autonomy Lost without Nvidia
by Roger C. Lanctot on 04-04-2023 at 10:00 am


Five years ago Uber nearly singlehandedly wiped out the prospect of a self-driving car industry with the inept management of its autonomous vehicle testing in Phoenix which led to a fatal crash. The massive misstep instantly vaporized tens of billions of dollars of Uber’s market cap and sent the company’s robotaxi development arm into a tailspin from which it was unable to recover.

It had only been a few months before – at the CES event in Las Vegas – that Nvidia had proudly trumpeted its newfound relationship with Uber – just one of several dozen autonomous vehicle collaborations announced by Nvidia. But the Uber crash cast a pall that caused Nvidia to pause its own autonomous vehicle testing and, ultimately, dial back on its autonomous vehicle grandstanding.

A measure of that impact was evident in Nvidia CEO Jensen Huang’s keynote at this week’s Nvidia GTC event. Huang’s keynotes have become a tech industry bellwether as the company’s GPU processing platforms have risen to prominence across the spectrum of emerging high-end applications.

While Nvidia remains an actively engaged participant in the development of autonomous vehicle technology – the topic received scant mention in Huang’s keynote this year. Instead, generative AI and large language model inference engines got the spotlight as Nvidia announced its launch of the Nvidia AI Foundation – a cloud-based platform developed in partnership with Microsoft, Google, and Oracle to deliver processing power for a wide range of AI-centric applications.

Huang announced its Omniverse Managed Cloud Service – what he described as AI’s iPhone moment – delivering four configurations from a single architecture. The four configurations included L4 for AI video, L40 for Omniverse and graphics rendering, H100 PCIE for scaling out large language model inference engines, and Grace-Hopper for recommender systems and vector databases.

Huang’s more-than-hour-long presentation was a typical tour de force covering all of Nvidia’s technological advances in GPU and server technology, along with a review of various strategic engagements and technology deployments. The fact that autonomous vehicle tech got short shrift – while automotive factory planning and automation did get a fair bit of attention – was yet another hint that autonomous vehicle tech has been consigned to the sidelines.

The dark cloud of Uber’s failure lingers over the industry. Even semi-autonomous vehicle operator Tesla struggles to explain suspicious Autopilot and full-self-driving misbehavior (crashes – fatal and otherwise) to regulators. Autonomous vehicle developers have been forced to extend their viability forecasts. Some have given up altogether.

Cruise CEO Kyle Vogt told Fortune Magazine this week that “within 10 years driving a car will be a hobby like riding horses is today.” He added that within five years the majority of people would get around cities in autonomous vehicles.

Sadly, Vogt’s sanguine view is shared by few.

While Vogt may foresee a very short time-line to the arrival of millions of autonomous vehicles on city streets – vehicles the deployment of which Vogt believes does not require National Highway Traffic Safety Administration exemptions – the dim reality is manifest in the stuttering performance of Cruise vehicles on the streets of San Francisco today.

What was once sexy and worthy of spotlighted emphasis at Nvidia’s GTC event, has now become an awkward and frightening embarrassment. The promise of autonomous vehicles transforming society is being lost in the focus on the downside – potential catastrophic failures and exorbitant expenditures with little short-term prospect of revenue.

Interestingly, the technology that has seized the spotlight – generative AI – is itself a pricey proposition with ill-defined commercialization opportunities. While autonomous tech transitions through its trough of despair, ChatGPT and its ilk are riding high on the ether of unlimited potential.

In some respects, the collateral damage from the fatal Uber crash five years ago was Nvidia’s diminished enthusiasm for autonomous vehicle tech. The sector is in dire need of leadership and vision – something that Nvidia is imparting to the AI sector in spades.

It might be time for Nvidia to get its robotaxi mojo rolling again. The fatal Uber crash was a devastating blow – but it ought not to be fatal to the entire sector. Autonomous vehicle tech remains a strategic focus for Nvidia and retains the promise of societal transformation. This is not time to throw in the towel.

Also Read:

Mercedes, VW Caught in TikTok Blok

AAA Hypes Self-Driving Car Fears

IoT in Distress at MWC 2023


AI in Verification – A Cadence Perspective
by Bernard Murphy on 04-04-2023 at 6:00 am


AI is everywhere, or so it seems, though often promoted with insufficient detail to understand the methods. I now look for substance, not trade secrets but how exactly they are using AI. Matt Graham (Product Engineering Group Director at Cadence) gave a good and substantive tutorial pitch at DVCon, with real examples of goal-centric optimization in verification. Some of these are learning-based, some are simply sensible automation. In the latter class he mentioned test weight optimization: ranking tests by value and perhaps ordering tests by contribution to coverage, pushing the low contributors to the end or out of the list (a simple greedy version is sketched below). This is human intelligence applied to automation, just normal algorithmic progress.
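
A minimal sketch of that kind of ordering, assuming each test’s covered items are already known from prior regressions; the data and names here are hypothetical.

```python
def order_tests_by_coverage(test_coverage):
    """Greedy ordering: repeatedly pick the test that adds the most new coverage.

    test_coverage: dict mapping test name -> set of coverage items it hits.
    Returns (test, new_items_added) pairs; zero-gain tests land at the end.
    """
    remaining = dict(test_coverage)
    covered, ordered = set(), []
    while remaining:
        best = max(remaining, key=lambda t: len(remaining[t] - covered))
        ordered.append((best, len(remaining[best] - covered)))
        covered |= remaining.pop(best)
    return ordered

# Hypothetical regression data: coverage items each test hits.
tests = {
    "t_smoke":  {"c1", "c2"},
    "t_dma":    {"c2", "c3", "c4"},
    "t_corner": {"c4"},
}
print(order_tests_by_coverage(tests))  # high contributors first, zero-gain last
```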

AI is a bigger change, yet our expectations must remain grounded to avoid disappointment and the AI winters of the past. I think of AI as a second industrial revolution. We stopped using an ox to drag a plough through a field and started building steam driven tractors. The industrial revolution didn’t replace farmers, it made them more productive. Today, AI points to a similar jump in verification productivity. The bulk of Matt’s talk was on opportunities, some of these already claimed for the Verisium product.

AI opportunities in simulation

AI can be used to compress regression, by learning from coverage data in earlier runs. It can be used to increase coverage in lightly covered areas and on lightly covered properties, both worthy of suspicion that unseen bugs may lurk under rocks. Such methods don’t replace constrained random but rather enhance it, increasing bug exposure rate over CR alone.

One useful way to approach rare states is through learning on front-end states which naturally if infrequently reach rare states, or come close. New tests can be synthesized based on such learning which together with regular CR tests can increase overall bug rate both early and late in the bug maturation cycle.

AI opportunities in debug

I like to think of debug as the third wall in verification. We’ve made a lot of progress in test generation productivity through reuse (VIP) and test synthesis, though we’re clearly not done yet. And we continue to make progress on verification engines, from virtual to formal and in hardware assist platforms (also not done yet). But debug remains a stubbornly manual task, consuming a third or more of verification budgets. Debuggers are polished but don’t attack the core manual problems – figuring out where to focus, then drilling down to find root causes. We’re not going to make a big dent until we start knocking down this wall.

This starts with bug triage. Significant time can be consumed simply by separating a post-regression pile of bugs into those that look critical and those that can wait for later analysis. Then sub-bucketing into groups with suspected common causes. Clustering is a natural for unsupervised learning, in this case looking at meta-data from prior runs. What checkins were made prior to the test failing? Who ran it and when? How long did the test run for? What was the failure message? What part of the design was responsible for the failure message?

Matt makes the point that as engineers we can look at a small sample of these factors but are quickly overwhelmed when we have to look at hundreds or thousands of pieces of information. AI in this context is just automation to handle large amounts of relatively unstructured data to drive intelligent clustering. In a later run, when intelligent triage sees a problem matching an existing cluster with high probability, bucket assignment becomes obvious. An engineer then only needs to pursue the most obvious or easiest failing test to a root cause. They can then re-run regression in expectation that all or most of that class of problems will disappear.
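
A minimal sketch of what such metadata clustering could look like, here using TF-IDF over failure messages and k-means from scikit-learn purely as an illustration; the feature choice, cluster count, and messages are hypothetical, not a description of Verisium’s actual method.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical post-regression failure messages; in practice checkins, owners,
# runtimes, and failing design scopes would be folded in as features too.
failures = [
    "assertion fifo_overflow fired in uart_tx",
    "assertion fifo_overflow fired in uart_rx",
    "scoreboard mismatch on axi write response",
    "scoreboard mismatch on axi read data",
    "timeout waiting for dma done irq",
]

vectors = TfidfVectorizer().fit_transform(failures)   # text -> sparse feature vectors
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for label, message in sorted(zip(labels, failures)):
    print(label, message)                             # failures grouped into buckets
```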

On problems you choose to debug, deep waveform analysis can further narrow down a likely root cause: comparing legacy and current waveforms, legacy RTL versus current RTL, or a legacy testbench versus the current testbench. There is even research on AI-driven methods to localize a fault – to a file or possibly even to a module (see this for example).

AI Will Chip Away at Verification Complexity

AI-based verification is a new idea for all of us; no-one is expecting a step function jump into full blown adoption. That said, there are already promising signs. Orchestrating runs against proof methods appeared early in formal methodologies. Regression optimization for simulation is on an encouraging ramp to wider adoption. AI-based debug is the new kid in this group, showing encouraging results in early adoption. Which will no doubt drive further improvements, pushing debug further up the adoption curve. All inspiring progress towards a much more productive verification future.

You can learn more HERE.