
Simulating the Whole Car with Multi-Domain Simulation
by Bernard Murphy on 03-21-2024 at 6:00 am

This is the next significant automotive blog in a string I will be posting (see here for the previous blog).

In the semiconductor world, mixed simulation means mixing logic sim, circuit sim, and virtual sim (for software running on the hardware we are designing) along with emulation and FPGA prototyping. While that span may seem all-encompassing, in fact it’s still a provincial view. OEMs like auto companies develop complete products in which software plays an outsize role, governing what is in effect a highly distributed compute system across the car. Developing and testing this software on (software-based) digital twins allows faster experimentation and higher levels of parallelism than are possible with hardware prototypes, but it requires collaboration between many kinds of domain-specific simulators. A very diverse group of companies is planning to launch a working group under Accellera to define an enabling standard to serve this “Federated Simulation” need.

Who Wants Federated Simulation and Why?

At DVCon I met with an impressive group: Mark Burton (Vice-Chair of the proposed working group, also Qualcomm), Yury Bayda (Principal Software Engineer at Ford Motor Company, previously Intel), Trevor Wieman (System Level Simulation Technologist at Ford Motor Company, also previously Intel), Lu Dai (Accellera Chair and Qualcomm) and Dennis Brophy (needs no introduction). What follows is primarily a synopsis of inputs from Mark, Yury and Trevor.

A car is a network of interconnected computers developed by multiple suppliers; the auto OEM software team must develop/refine and debug software to make this whole system run correctly. Perhaps the infotainment system is based on a Qualcomm chip, communicating with zonal controllers, in turn talking to edge sensors, drivetrain MCUs and other devices around the car, all communicating through CAN or automotive Ethernet. To meet acceptable software simulation times this total system model must run on abstracted virtual models for each component. Suppliers provide such models in a variety of formats: proprietary instruction set simulators or representations based on different virtual modeling tools.

Which raises the perennial problem of blending all these different models into a unified virtual runtime. Maybe someday all suppliers will provide models with TLM-compliant interfaces, but until then, can we build better bridges/wrappers to couple all these models? That’s what the Federated Simulation initiative aims to address. Yury provided a compelling example of why we need to make this work. Over the Air (OTA) software updates are a must-have for software-defined vehicles, but what happens if something goes wrong in an update – if the update bricks your car or some part of your car? System-level scenarios like this must be considered during design to mitigate such problems and must be tested exhaustively.

Bottom line: system software development must start early, before hardware is available, and it is completely dependent on reliable high-level simulation abstractions to underpin total system simulations.

Not just for electronic systems

A car is not just electronic circuits; neither is a plane, a spacecraft, or an industrial robot. Still, electronics plays an increasing role, now interacting with mechanical systems and with the surrounding environment. Antilock braking must behave appropriately under different levels of traction on a dry road, in rain, or in snow. From ADAS to autonomy, driving systems must be tested against a vast array of scenarios. The CARLA simulator is an important component of such testing, modeling urban and other layouts across many environmental conditions and providing streaming video, LIDAR, and other sensor data as input to full system simulations.

A federated simulation solution must couple to simulators like CARLA. Ultimately it must also couple with standards in other verticals, such as OpenCRG to describe road surfaces, VISTAS/VHTNG for avionics, SMP2 for space applications, and FMUs for mechatronics. Each is well established in its own domain and unlikely to be displaced. A federated simulation standard must respect and smoothly interoperate with these standards – I’m guessing in incremental steps. That said, there is already enthusiastic support from many quarters for involvement in this effort.

 

Accellera
Agnisys, Inc.
Airbus
AMD
Aptiv
Cadence Design Systems, Inc.
Collins
Doulos Ltd.
Ford
Huawei Technologies Sweden AB
IEEE
Intel Corporation
IRT-Saint Exupery
Marvell International Ltd
Microsoft Corporation
MachineWare GmbH
NXP Semiconductors
Qualcomm Technologies, Inc.
Robert Bosch GmbH
Renesas Electronics Corp.
S2C
Shokubai
Siemens EDA
Spacebel
STMicroelectronics
Synopsys
Shanghai UniVista Industrial Software Group
Texas Instruments
Vayavya Labs
Zettascale

Core team membership for the initial definition

What will it take?

This is an ambitious goal, but it’s worth noting that the US DoD launched a similar effort called HLA in the 1990s which has continued to grow. Airbus has built its own architecture with similar intent, including a physical prototype of an aircraft for hardware-in-the-loop testing. At the electronic systems level, Mark, Yury, and Trevor have all previously been involved in multi-simulator projects at Intel and Qualcomm, and more recently at Ford (Yury and Trevor). They do not see this as an impossible goal, though I’m guessing it will likely evolve from modest expectations through multiple releases.

The core concept, as described to me, is based on cloud deployment with a container instance for each simulation and Kubernetes for resource allocation (CPUs, GPUs, hardware accelerators, etc.) and orchestration. The Accellera team doesn’t plan to reinvent any standards (or emerging standards) that already work well. Instead they intend to leverage existing transport layers, adding only application layers above that level, such that a simulator instance can publish streams of activity to other subscribing simulators, and subscribers can be selective about what data they want to see.
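
To make the publish/subscribe idea concrete, here is a minimal Python sketch of the concept only – the class names, topics, and in-process broker are illustrative assumptions on my part, not part of any proposed standard, and a real deployment would run each simulator in its own container over an existing transport layer:

```python
# Toy illustration of the publish/subscribe coupling idea described above.
# All class and topic names are hypothetical; a real federated simulation
# would run each simulator in its own container and use an existing
# transport layer rather than this in-process broker.

from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Minimal topic-based broker: simulators publish, subscribers filter by topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(message)


class ZonalControllerSim:
    """Stand-in for a supplier-provided virtual model of a zonal controller."""

    def __init__(self, broker: Broker) -> None:
        # Subscribe only to the traffic this simulator cares about.
        broker.subscribe("can/drivetrain", self.on_can_frame)

    def on_can_frame(self, frame: dict) -> None:
        print(f"zonal controller received CAN frame: {frame}")


class DrivetrainMcuSim:
    """Stand-in for a drivetrain MCU model publishing CAN activity."""

    def __init__(self, broker: Broker) -> None:
        self.broker = broker

    def step(self, t_ms: int) -> None:
        # Publish a stream of activity; subscribers decide what they want to see.
        self.broker.publish("can/drivetrain", {"t_ms": t_ms, "rpm": 1800 + t_ms})


if __name__ == "__main__":
    broker = Broker()
    ZonalControllerSim(broker)
    mcu = DrivetrainMcuSim(broker)
    for t in range(3):
        mcu.step(t)
```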

Very interesting. You can learn more HERE and HERE

Also Read:

An Accellera Functional Safety Update

DVCon Europe is Coming Soon. Sign Up Now

Accellera and Clock Domain Crossing at #60DAC


QuantumPro unifies superconducting qubit design workflow
by Don Dingee on 03-20-2024 at 10:00 am

To create quantum computing chips today, a typical designer must cobble various tools together, switching back and forth between them for different tasks. By contrast, EDA solutions such as Keysight Advanced Design System (ADS) unify a design workflow in a single interface with automated data exchange between features. In an industry first, Keysight QuantumPro brings five different functions for quantum design together in a superconducting qubit design workflow, reducing design cycle time, prototyping risk, and yield risk for optimized quantum chips. Keysight’s Quantum EDA solution focuses on accurate electromagnetic modeling, ensuring that simulation and measurement outcomes align effectively.

Adding superconducting qubits presents a yield challenge

Superconducting quantum computers rely on qubits constructed from Josephson junctions held at cryogenic temperatures, which exhibit non-linear inductance needed in constructing two-level systems. Qubits interconnect by meander line coplanar waveguide resonators with frequencies often in the 4 to 10 GHz range. The resonators serve two primary functions: indirectly reading out the state of the qubits and entangling them with each other. Quantum amplifiers with gain and unique ultra-low noise characteristics amplify the output signal from qubits to improve the readout fidelity. Quantum processing units (QPUs), comprising arrays of qubits and resonators, are positioned to showcase quantum advantage, surpassing the reach of classical CPUs.

The advantage of quantum computing arises from its distinct exponential increase in computational capacity as the number of entangled qubits grows, with n qubits taking on 2^n stable states. Scaling the number of qubits in quantum computing presents a formidable challenge, with the stability and coherence of qubits becoming more complex with each addition. By tying four design themes – structural layout, electromagnetic analysis, complex quantum circuit optimization, and system-level exploration and troubleshooting – into a single quantum design workflow, designers can thoroughly exercise every part of a design, make data-based adjustments, and re-simulate to verify improvements.

Unlike digital circuit design, the challenge of designing quantum chips is more than step-and-repeat replication. For entanglement to work correctly, resonance frequencies must be unique within and between all nearby qubits. If two (or more) resonance frequencies overlap, qubits entangle improperly due to unpredictable cross-coupling, and the quantum chip becomes a yield failure. Factors such as minimizing environmental interference, maintaining qubit entanglement, and managing errors due to decoherence pose significant obstacles. Debugging problems discovered after chip fabrication and cryogenic testing becomes expensive and time-consuming.

Enabling a five-point superconducting qubit design workflow

A better solution is a shift left for quantum chip design – integrating layout and simulation tools to predict and optimize resonance frequencies accurately in virtual space. RF designers are familiar with these workflows, but quantum designers are just beginning their adoption. “QuantumPro bridges the gap from ad-hoc quantum chip design with inherent yield risks to confidence in layout and simulation for predictable parts,” says Mohamed Hassan, Quantum Solutions Planning Lead at Keysight.

QuantumPro integrates five functions built on the ADS platform to streamline superconducting qubit designs, including schematic design, layout creation, electromagnetic (EM) analysis, non-linear circuit simulation, and quantum parameter extraction. Beginning with the schematic interface, users can effortlessly drag and drop components from the built-in quantum artwork. Subsequently, a layout can be generated automatically from the schematic.

Within QuantumPro, two distinct analyses are available. First, the Full EM Analysis facilitates a frequency sweep of circuits, producing S-parameters at input and output ports. The platform supports both the finite element method (FEM) and the method of moments (MoM) for the Full EM Analysis; instead of solving for the electric field over the entire volume, the MoM solves only for the currents on the metal surface, significantly cutting computational costs. Second, the Energy Participation Analysis finds the eigenmodes of the system with the FEM solver.

Quantum parameter extraction is automatic in QuantumPro, with quasi-static, black box quantization, and energy participation ratio (EPR) methods. A simplified layout of a four-qubit design shows the transmon qubits (Q1 through Q4) and meander line resonators (R1 through R4). Note the unique resonance frequency values extracted by each of the three different methods in QuantumPro.

EM simulation of superconducting qubits requires one extra step. Superconductors exhibit kinetic inductance, an additional inductance large enough to sway results compared with perfect electric conductors. “Designers can’t ignore kinetic inductance – it can cause a miss in resonance frequency by as much as 40% in some cases of thin film superconductors,” says Hassan. Superconductor material editors in ADS and EMPro allow designers to describe materials for accurate kinetic inductance capture.
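
As a back-of-the-envelope illustration of why this matters (my own toy numbers, not QuantumPro output), treat a resonator as a lumped LC circuit: its resonance is f = 1/(2π√(LC)), so any kinetic inductance added on top of the geometric inductance pulls the frequency below the value a perfect-conductor simulation would predict:

```python
# Back-of-the-envelope illustration (not QuantumPro output): for a lumped LC
# approximation of a resonator, f = 1 / (2*pi*sqrt(L*C)). Kinetic inductance
# adds to the geometric inductance and pulls the resonance frequency down.
# The component values below are arbitrary placeholders chosen to land in the
# 4-10 GHz range mentioned above.

import math

C = 0.4e-12           # shunt capacitance, farads (illustrative)
L_geometric = 2.0e-9  # geometric inductance, henries (illustrative)


def resonance_ghz(l_total: float, c: float) -> float:
    """Lumped-element resonance frequency in GHz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_total * c)) / 1e9


f_ideal = resonance_ghz(L_geometric, C)

# Sweep the kinetic-inductance fraction to see how far the resonance shifts.
for fraction in (0.0, 0.1, 0.3, 0.5):
    f = resonance_ghz(L_geometric * (1.0 + fraction), C)
    shift_pct = 100.0 * (f_ideal - f) / f_ideal
    print(f"Lk = {fraction:.0%} of Lg -> f = {f:.2f} GHz ({shift_pct:.1f}% below ideal)")
```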

A core feature QuantumPro inherits from ADS is a Python console that gives users more control over workspaces and user interfaces, along with the ability to script and automate repetitive tasks.

Scaling quantum chips to hundreds or thousands of qubits

QuantumPro assures superconducting quantum chip designers get the results they expect at prototyping after simulation and optimization of designs. QuantumPro arrives at a pivotal point in quantum computing development as designers seek to move from designs with tens of qubits into the hundreds and perhaps thousands. As in conventional semiconductors, reducing prototype re-spins and improving yields can help move ideas from research to commercialization more quickly.

Designers will also see a productivity boost in their superconducting qubit design workflow with QuantumPro, which may also lower the learning curve for designers from other disciplines. Resources on the Keysight website explain more about the science behind superconducting qubits and the EM analysis and parameter extraction methods in QuantumPro.

QuantumPro webpages, with links to an application note, technical overview, and videos:

Quantum EDA: Faster design cycles of superconducting qubits

W3037E PathWave QuantumPro


sureCore Enables AI with Ultra-Low Power Memory IP
by Mike Gianfagna on 03-20-2024 at 8:00 am

We all know that AI is becoming pervasive in a wide array of products to make them smarter, safer, and more feature-rich. Just look at the announcements from the recent CES show in Las Vegas for some examples. These AI workloads demand a lot of compute power. Fueling this trend is the need for significant arrays of embedded memory on chip, as close to the compute units as possible. This reduces latency, but it also brings a lot of power challenges and packaging/cooling headaches. sureCore is well-known for its ultra-low power memory solutions, and the company recently announced a significant addition to its arsenal to address the vexing embedded memory problem for AI systems. Let’s see how sureCore enables AI with ultra-low power memory IP.

The New sureCore Solution

sureCore is no stranger to exotic and cutting-edge applications. You can learn about what the company is doing to enable quantum computing here. The new offering from sureCore is called PowerMiser AI. The inferencing performance required by new AI applications demands massively parallel processing arrays, which increase power consumption and thermal load, making packaging and cooling more challenging.

sureCore modified its PowerMiser IP by optimizing it to reduce dynamic power and leveraging the power efficiencies offered by FinFET technology. The result is PowerMiser AI, a new product that reduces thermal impact and addresses the performance demanded by AI workloads.

Digging a bit deeper, embedded SRAM can cause substantial power issues for workloads such as pattern matching. As a result, memory can contribute as much as 50% of the power on a large AI-enabled chip. sureCore estimates that using PowerMiser AI can reduce dynamic power by up to 50%. This reduces the thermal load, so heat sinks or other cooling approaches are either not needed or significantly reduced. The result is increased system reliability and lower cost.
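
As a quick sanity check of what those two percentages imply at the chip level (the power budget below is hypothetical, and for simplicity the memory contribution is treated as all dynamic power):

```python
# Quick sanity check of what the two quoted figures imply at the chip level.
# The 20 W budget is hypothetical, and the memory contribution is treated as
# entirely dynamic power (leakage ignored) to keep the arithmetic simple.

chip_power_w = 20.0        # hypothetical AI accelerator power budget
memory_fraction = 0.50     # "memory can contribute as much as 50% of the power"
dynamic_reduction = 0.50   # "reduce dynamic power by up to 50%"

memory_power_w = chip_power_w * memory_fraction
saved_w = memory_power_w * dynamic_reduction

print(f"Memory power:   {memory_power_w:.1f} W")
print(f"Power saved:    {saved_w:.1f} W")
print(f"New chip power: {chip_power_w - saved_w:.1f} W "
      f"({100 * saved_w / chip_power_w:.0f}% chip-level reduction)")
```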

To Learn More

Beyond memory IP, sureCore offers a custom memory development service called sureFIT. Using this service, designers can get memory subsystems optimized for the target power, performance, and area (PPA) of the project. In addition to custom solutions, sureCore has a range of power-optimized standard products that deliver market-leading power profiles. These include Everon, PowerMiser, and MiniMiser. Further details can be found on sureCore’s product page.

Power savings are available at both the nominal operating voltage and at low to near threshold voltages, providing many choices. sureCore memories offer single rail low voltage operation that allows direct logic connection, which eases system-level design.

There was a recent press release announcing PowerMiser AI. You can read the full release here.

In that release, Paul Wells, sureCore’s CEO, went into some detail about what drove the development of the new product:

“Our typical customer has been using our ultra-low power SRAM IP in battery-powered applications to provide a longer operational life between recharges. The surge in AI augmentation means that whole new areas for our low power memory solutions have appeared in new and exciting areas that are not constrained by battery life and can be mains powered or are even in the automotive space.

Power consumption is still a critical factor for these applications, but the constraining factor is starting to become heat dissipation and potential thermal damage. In order to keep product form factors under control and obviate the need for forced cooling so as to prevent overheating, new low power solutions are needed. Our recent announcements about working on ultra-low power memory IP for use in cryostats in the quantum computing arena, where heat generation by chips has to be minimised, has resulted in enquiries from companies who also need to keep AI chips operating within temperature boundaries albeit at the other end of the scale.”

And that’s how sureCore enables AI with ultra-low power memory IP.

Also Read:

sureCore Enables AI with Ultra-Low Power Memory IP

Agile Analog Partners with sureCore for Quantum Computing

Slashing Power in Wearables. The Next Step


Challenge and Response Automotive Keynote at DVCon
by Bernard Murphy on 03-20-2024 at 6:00 am

Keynotes commonly provide a one-sided perspective of a domain, either customer-centric or supplier-centric. Kudos therefore to Cadence’s Paul Cunningham for breaking the mold by offering the first half of his keynote to Anthony Hill, a TI Fellow, to talk about the outstanding challenges he sees in verification for automotive products. Paul followed with his responses: some already within reach of existing technologies, some requiring more of a stretch to imagine possible solutions, and some maybe out of range of current ideas.

Anthony set the stage with his breakdown of macro trends in automotive systems design. ADAS and autonomy are pushing more automation features like lane assist and driver monitoring, in turn driving more centralized architectures. Connectivity is critical for OTA updates, vehicle-to-vehicle communication, and in-cabin hot spots. And electrification isn’t just about inverters and motor drivers (now charging at higher power levels and frequencies); it’s also about all the electronics around those core functions, for battery monitoring and for squeezing maximum power efficiency to extend range.

These trends drive higher levels of integration to increase capabilities and to reduce latencies for responsiveness and safety. That means pulling in more IPs to build more complex systems, increasing distributed mixed-signal operation around PLLs, DDRs, etc., and adding more complex tiered network-on-chip (NoC) bus fabrics. Chiplet requirements, now central at some big auto OEMs, will add yet more verification complexity.

Anthony’s Challenges

Anthony shared challenges he sees growing in importance with this increased complexity. Integrating IPs from many sources remains challenging – not so much in basic functionality and timing as in non-executable (documentation) guidance on usage limitations or limits to the scope of testing for functionality, performance, safety, configuration options, etc. Relying on documentation to communicate critical information is a weak link, suggesting more opportunities for standardization.

In functional safety (FuSA), standards for data exchange between suppliers and consumers are essential. FMEDA analysis is only as good as the safety data supplied with each IP. Is that in an executable model or buried somewhere in a document? Can formal play a bigger role in safety than it does today, for example in helping find vulnerabilities in a design during the FMEA stage?

For mixed signal, he’s seeing more cases of digital embedded in analog with corresponding digital challenges like CDC correctness. These are already solved in verification for digital but not for mixed A/D. Monte-Carlo simulations are not adequate for this level of testing. Can we extend digital static verification methods to mixed A/D?

As NoCs become tiered across large systems, non-interlocked (by default), and open to user-defined prioritization schemes, it is becoming more challenging to prove there is no potential for deadlock and, more generally, to ensure compliance with required service level agreements (SLAs). Simulation alone is not enough to ensure, say, that varying orders of command arrival and data return will work correctly under all circumstances. Is a formal methodology possible?
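
To make the deadlock concern concrete, here is a toy abstraction of my own (not any vendor's methodology): model outstanding transactions as a directed wait-for graph between fabric agents and check it for cycles. Simulation can only sample particular orderings like the snapshot below, whereas a formal proof would have to show that no reachable state of the fabric ever contains such a cycle:

```python
# Toy abstraction of the NoC deadlock question (illustrative only, not any
# vendor's methodology): represent "agent A is waiting on a resource held by
# agent B" as a directed wait-for graph and check it for cycles. A formal tool
# effectively proves that no reachable state of the fabric ever contains such
# a cycle; simulation can only sample particular orderings.

from typing import Dict, List


def has_cycle(wait_for: Dict[str, List[str]]) -> bool:
    """Detect a cycle in a directed wait-for graph with iterative DFS."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in wait_for}

    def visit(start: str) -> bool:
        stack = [(start, iter(wait_for.get(start, [])))]
        color[start] = GRAY
        while stack:
            node, children = stack[-1]
            for child in children:
                if color.get(child, WHITE) == GRAY:
                    return True          # back edge: circular wait => deadlock
                if color.get(child, WHITE) == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(wait_for.get(child, []))))
                    break
            else:
                color[node] = BLACK
                stack.pop()
        return False

    return any(color[n] == WHITE and visit(n) for n in wait_for)


# One particular ordering of command arrival and data return (hypothetical):
snapshot = {
    "cpu_cluster": ["ddr_ctrl"],    # CPU waits on a DDR controller response
    "ddr_ctrl":    ["noc_bridge"],  # DDR controller waits on a bridge credit
    "noc_bridge":  ["cpu_cluster"], # bridge waits on the CPU to drain a buffer
}

print("deadlock in this snapshot:", has_cycle(snapshot))
```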

Another very interesting challenge is multicycle glitch detection in non-traditional process corners. Gate level simulation across PVT corners may claim a multicycle logic cone is glitch free but still miss a glitch in a non-standard corner. Anthony speculates that maybe some kind of “formal” verification of logic cones with overlapping timing constraints could be helpful.

Finally, in chiplet-based designs, today he sees mostly internal chiplets plus external memory on HPC-class busses, but over time he expects multi-source chiplets to amplify all the above problems and more relative to current IP-level integration, pushing additional requirements onto what we expect from models and demanding more standardization.

Paul’s Responses

Paul opened by acknowledging that Anthony had a pretty good challenge list since he (Paul) was only able to find slides among his standard decks to address about 50% of the list. For the rest he sketched ideas on what could be possible.

For system integration (and by extension multi-die systems), system VIPs and other system verification content now extend the familiar IP-level VIP concept up to more complex subsystems. For example, the Cadence Arm SBSA (Server Base System Architecture) compliance kit provides all the components necessary for that testing. This concept is naturally extensible to other common subsystems, even extending into mixed-signal subsystems, adding further value through stress testing across multiple customers and designs. Connectivity checking is another variant on the system-level verification theme, not only for port-to-port connections but also for connection paths through registers and gating.

Verification management solutions provide a structured approach to system verification, spanning requirements management, spec annotation, and integration with PLM systems. Additionally, verification campaigns today span multiple engines (virtual, formal, simulation, emulation, prototyping) and regression testing through the full evolution of a design. Taking a holistic, big data view together with machine learning enables design teams to understand the whole campaign and how best to maximize throughput, coverage, and debug efficiency. In-house systems with a similar purpose exist of course, but we already see enthusiasm to concentrate in-house R&D on differentiating technologies and to offload these “verification management” functions onto standard platforms.

In safety, Paul sees current verification technologies as a starting point. Fault simulation and support for FMEDA analysis with fault campaign strategies are already available, together with formal reachability analysis to filter out cases where faults can’t be observed/controlled. The EDA and IP industry is tracking the Accellera working group activity on an interoperable FuSA standard and should become actively involved with partners as that work starts to gel. There is also surely opportunity to do more: enabling fault sim in emulation will amplify capacity and performance and will allow us to quantify the impact of safety mechanisms on performance and power. Paul hinted that more can also be done with formal, as Anthony suggests.

Verifying mixed signal hierarchies is an interesting challenge. Analog embedded in digital, digital embedded in analog, even inline bits of digital in analog. DV-like verification in these structures will be a journey, starting with an interoperable database across the whole design. Cadence has put significant investment (20 years) into the OpenAccess standard and can natively read and write both digital and analog circuitry to this database. From this they can extract a full chip flat digital network structure – even for the analog components. From that it should be possible in principle to run all the standard digital signoff techniques: SDC constraints, clock and reset domain crossings, lint checks, connectivity checking, even formal checking. Paul stressed that proving and optimizing these flows is a step yet to be taken 😀

He sees the NoC challenge as something that should be perfect for formal methods. The Jasper group has been thinking about architectural formal verification for many years since these kinds of problem cannot be solved with simulation. Given the refined focus this challenge presents, the problem should be very tractable. (Editor sidenote. This problem class reminds me of formal application to cache coherence control verification, or SDN verification where you abstract away the payload and just focus on control behavior.)

Finally on PVT-related multicycle paths, Paul confessed he was stumped at least for now. A problem that only occurs on a corner that wasn’t tested is very hard to find. He closed by admitting that while automated verification will never be able to find everything, the industry will continue to push on boundaries wherever it can.

Good discussion and good input for more Innovation in Verification topics!

Also Read:

BDD-Based Formal for Floating Point. Innovation in Verification

Photonic Computing – Now or Science Fiction?

Cadence Debuts Celsius Studio for In-Design Thermal Optimization


Unleash the Power: NVIDIA GPUs, Ansys Simulation
by Daniel Nenni on 03-19-2024 at 10:00 am

ANSYS Perceive EM GPU solver for 5G/6G simulations integrated into NVIDIA Omniverse

In the realm of engineering simulations, the demand for faster, more accurate solutions to complex multiphysics challenges is ever-growing.

Simulation is a vital tool for engineers to design, test, and optimize complex systems and products. It helps engineers reduce costs, improve quality, and accelerate innovation. However, the associated high computational demands, large data sets, and multiple physics domains pose significant challenges.

A key technology instrumental in meeting this demand is the graphics processing unit (GPU). With their unique applicability to accelerating multiphysics simulations, GPUs are a game-changer driving innovation. Ansys and NVIDIA partner to deliver solutions that leverage the power of NVIDIA GPUs and the physics-based authority of Ansys software. Together, Ansys and NVIDIA are enabling engineers to solve the industry’s most computationally challenging problems.

NVIDIA GPUs for Multiphysics Simulations

NVIDIA GPUs are designed to accelerate parallel, compute-intensive tasks such as simulations, leveraging thousands of cores and high-bandwidth memory. NVIDIA GPUs deliver orders of magnitude faster performance than CPUs for many simulation applications, enabling engineers to run more simulations in less time and with higher fidelity.

NVIDIA GPUs, now famous for driving the AI revolution, are renowned for their high-performance computing capabilities, making them an ideal choice for engineers and scientists working on multiphysics simulations in highly complex applications in industries such as aerospace, automotive, biomedical, and energy. These simulations account for the interaction of multiple physical phenomena, such as fluid dynamics, structural mechanics, and electromagnetics. By harnessing the parallel processing power of NVIDIA GPUs, engineers can significantly reduce simulation times and achieve more accurate results. More simulation iterations lead to more ideas and ultimately superior end products.

The innovation of 3D-IC designs requires the addition of multiphysics simulations to semiconductor design, such as electromagnetic, thermal, and structural analysis. Stacking of chiplets in close proximity within a single package brings system-level multiphysics challenges into IC design.

Solving for Multiphysics Challenges

Ansys has long been at the forefront of providing cutting-edge simulation software solutions for engineers and researchers worldwide. With a strong focus on multiphysics and semiconductor simulations, Ansys has established itself as the leader in enabling users to simulate a wide range of physical phenomena accurately. Ansys software is trusted by tens of thousands of engineers worldwide who rely on its accuracy, reliability, and scalability.

Ansys software is particularly renowned for its multiphysics simulations enabling engineers to simulate the complex interactions among various physical processes and gain valuable insights into the behavior of their systems. Ansys offers a comprehensive suite of tools for multiphysics simulations, such as Ansys Discovery, Ansys Fluent™, Ansys HFSS™, Ansys LS-DYNA™, and Ansys SPEOS™. These tools enable engineers to perform interactive, real-time, and high-fidelity simulations of various multiphysics phenomena, such as fluid-structure interaction, electromagnetic, shock and impact, and optical performance.

ANSYS LS-Dyna crash test simulation result visualized with NVIDIA Omniverse

Ansys Product Support for NVIDIA Processors: Leveraging Grace and Hopper

Ansys harnesses NVIDIA H100 GPUs to boost multiple simulation solutions and prioritizes NVIDIA’s latest Grace and Hopper processors, along with the newly announced Blackwell architecture, for products across the Ansys portfolio such as Fluent and LS-DYNA. By using these products in conjunction with NVIDIA processors, engineers achieve faster simulation times, increased accuracy, and improved productivity in their work.

Ansys software can leverage the features and benefits of NVIDIA processors, such as:

  • Massive parallelism and high-bandwidth memory enable faster and more accurate simulations leading to better end products.
  • Unified memory and NVLink enable seamless data transfer and communication between CPU and GPU.
  • Tensor cores and ray tracing cores enable advanced simulations of artificial intelligence and optical effects.
  • Multi-GPU and multi-node support enable scalable simulations of large and complex models.

Gearbox CFD simulation using ANSYS Fluent

Driving Innovation: Benchmarks and Performance Gains

Ansys integrates support for NVIDIA processors into its flagship products to harness their immense potential for enhancing simulation performance. This collaboration between Ansys and NVIDIA unlocks new possibilities for engineers seeking to leverage the power of GPU acceleration in their simulations. Ansys has already announced its intent to support NVIDIA’s just-announced Blackwell architecture, presaging even greater simulation acceleration.

ANSYS Perceive EM GPU solver for 5G/6G simulations integrated into NVIDIA Omniverse

Compared to traditional computing methods, benchmarks demonstrate that Ansys simulations run on NVIDIA GPUs deliver significant performance gains. Engineers see a substantial reduction in simulation times, allowing for faster design iterations and more efficient problem-solving. For example:

  • Fluent enables high-fidelity and scalable fluid dynamics simulations on NVIDIA GPUs, allowing engineers to solve challenging problems such as turbulence and combustion phenomena. Fluent runs up to 5x faster on one NVIDIA H100 GPU than on dual 64 cores of a recently released high-end CPU processor.
  • Ansys Mechanical™ enables fast and accurate structural mechanics simulations on NVIDIA GPUs, allowing engineers to model complex phenomena such as acoustics, vibration, and fracture dynamics. Mechanical’s matrix kernel running on 4 CPU cores is up to 11x faster when one NVIDIA H100 GPU is added.
  • Ansys SPEOS enables realistic and high-performance optical simulations on NVIDIA GPUs, allowing engineers to design, measure and assess light propagation in any environment. SPEOS can run optical simulations up to 35x faster on an NVIDIA RTX™ 6000 Ada than on a recently released 8-core processor.
  • Ansys Lumerical enables comprehensive and efficient photonics simulations, allowing engineers to design and optimize photonic devices and circuits. Ansys Lumerical FDTD running on a single NVIDIA A100 GPU solves up to 40% faster than an HPC cluster containing 480 cores of a recently released high-end AMD EPYC 7V12 CPU. This equates to nearly a 6x improvement in price-performance ratio.
  • Other Ansys products such as RedHawk-SC™, Discovery, Ensight™, Rocky™, HFSS SBR+™, Perceive EM™, Maxwell™, AVxcelerate Sensors™, and RF Channel Modeler™ either already or soon will benefit from NVIDIA GPU acceleration.

Caption: ANSYS AVxcelerate Sensors (Physics-based Radar, Lidar, Camera) connected to NVIDIA Drive Sim

Embracing the Future: Grace, Hopper and now Blackwell

The future of simulation is bright, and NVIDIA’s latest innovations including NVIDIA Grace CPU, NVIDIA Hopper GPU, and now Blackwell architectures promise even more impressive performance gains. Ansys is committed to optimizing its software for these next-generation platforms, ensuring engineers have access to the most powerful simulation tools available.

The combination of Ansys’ advanced simulation software and NVIDIA’s GPU technology is revolutionizing engineers’ approach to multiphysics simulations. The recently announced expanded partnership between the two companies promises even greater advances, enhanced by artificial intelligence. Using Ansys software with NVIDIA GPUs, engineers tackle complex multiphysics simulations with unprecedented speed and accuracy, paving the way for new innovations and breakthroughs in engineering and science.

Ansys and NVIDIA Pioneer Next Era of Computer-Aided Engineering  

*CoPilot & ChatGPT, both powered by NVIDIA GPUs, contributed to this blog post.

Also Read:

Ansys and Intel Foundry Direct 2024: A Quantum Leap in Innovation

Why Did Synopsys Really Acquire Ansys?

Will the Package Kill my High-Frequency Chip Design?


Synopsys Enhances PPA with Backside Routing
by Mike Gianfagna on 03-19-2024 at 6:00 am

Comparison of frontside and backside PDNs (Source IMEC)

Complexity and density conspire to make power delivery very difficult for advanced SoCs. Signal integrity, power integrity, reliability, and heat can seem to present unsolvable problems when it comes to efficient power management. There is just not enough room to get it all done with the routing layers available on the top side of the chip. A strategy is emerging to deal with the problem that seems to take a page out of the multi-die playbook: rather than wrestle with the existing single-surface constraints, why not move power delivery to the backside of the chip and get additional PPA benefit out of it? The entire fab and process equipment ecosystem is buzzing about this approach. But what about the design methodology? There is help on the way. A very informative white paper is now available from the leading EDA supplier. Read on to get the details about how Synopsys enhances PPA with backside routing.

Why Use Backside Routing?

In a typical SoC, dedicated power layers tend to be thicker, with wider traces than the signal layers, to reduce loss due to IR drop. The power delivery network, or PDN, is what brings power to all parts of the chip. PDN design requires extensive analysis of electromigration, noise, and cross-coupling effects, as well as IR drop, to ensure power integrity. Solving this problem by adding metal layers increases the cost and complexity of the fabrication process, if it’s even possible given process constraints.
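
As a toy illustration of why grid resistance matters (my own numbers, not from the white paper), the voltage a cell actually sees is the supply minus the I·R drop through the delivery path, and the thicker, shorter rails possible on the backside shrink that resistance:

```python
# Toy IR-drop illustration (numbers are illustrative, not from the white
# paper): the voltage reaching a cell is Vdd - I * R_grid, so the thicker,
# shorter rails available on the backside lower the effective grid resistance
# and tighten the voltage seen at the transistors.

vdd = 0.75             # supply voltage at the bump, volts
cell_current = 0.020   # current drawn by a local cluster of cells, amps

grids = {
    "frontside PDN (thin upper-metal rails)": 1.50,  # ohms, effective path R
    "backside PDN (thick dedicated rails)":   0.40,  # ohms, effective path R
}

for name, r_grid in grids.items():
    ir_drop = cell_current * r_grid
    print(f"{name}: IR drop = {1000 * ir_drop:.1f} mV, "
          f"cell sees {vdd - ir_drop:.3f} V "
          f"({100 * ir_drop / vdd:.1f}% of supply)")
```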

There is more to the story which is explained well in the white paper (a link is coming). Now that manufacturing technology supports it, backside routing for the PDN is a great way to remove the obstacles, opening a new approach to implementation and the opportunity to enhance PPA. The graphic at the top of this post provides a comparison of frontside and backside PDNs. Thanks to IMEC for this depiction.

But, as they say, there’s no free lunch. For backside routing, the design process has to deal with many new problems, such as:

  • Signal integrity
    • Frontside PDN acted as a natural shield for signal integrity
    • Important to have close correlation between pre-route and post-route
  • Thermal impact
    • Thermal aware implementation to reduce impact of backside metal
  • Post-silicon observability
    • Methodologies to support robust observability
  • Multi-die and backside metal
    • Lots of synergy between the two
    • Leverage EDA technology pieces from each other

The white paper also explains what Synopsys is doing for backside routing. Here is a summary.

What Synopsys is Doing

Synopsys has embraced the use of backside routing for the PDN. The approach fits well with its design technology co-optimization (DTCO) methodology. The company has added support for backside PDNs in all relevant EDA products. The result is fast and efficient technology exploration, design PPA assessment, and design closure to accelerate the overall development process. As with many of its programs, the approach allows chip designers to adopt new silicon technology with predictable results.

A large number of additions are part of Synopsys Fusion Compiler, the industry-leading RTL-to-GDSII implementation system. The figure below summarizes the enhancements at a high level. The white paper goes into more detail about these enhancements and their measurable impact on chip design results.

Overview of Synopsys Fusion Compiler Enhancements

The white paper also discusses potential future additions to expand the use of backside routing even further.

To Learn More

Backside routing is here. The foundry ecosystem is delivering this capability and design teams need an enhanced flow to take advantage of the benefits, both today and tomorrow. Synopsys is at the leading edge of this trend and the new white paper provides important details. You can get a copy of the new Synopsys white paper here. And that’s how Synopsys enhances PPA with backside routing.


Afraid of mesh-based clock topologies? You should be
by Daniel Payne on 03-18-2024 at 10:00 am

Digital logic chips synchronize all logic operations by using a clock signal connected to flip-flops or latches, and the clock is distributed across the entire chip. The ultimate goal is to have a clock signal that arrives at the exact same moment in time at all clocked elements. If the clock arrives too early or too late at a flip-flop or latch on its way from the PLL output across the chip, that time difference impacts the critical path delays and the maximum achievable clock frequency. An architect or RTL designer views the clock as a perfectly defined square wave with no delays, while engineers doing timing analysis or physical design know that clock signals are starting to look more like sine waves than square waves, and that there are delays along the clock tree that depend on the topology of the clock network. At small process nodes, On-Chip Variation (OCV) makes delays in logic and clock networks differ from ideal conditions, so clock designers resort to adding timing margins.
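
A small back-of-the-envelope example (illustrative numbers of my own) shows how skew and OCV margin eat into the timing budget: the minimum clock period must cover clock-to-Q, logic, and setup delays plus skew and margin, so every added picosecond lowers the maximum achievable frequency:

```python
# Back-of-the-envelope timing budget (illustrative numbers): the minimum clock
# period must cover clock-to-Q + combinational logic + setup time plus the
# clock skew and any OCV margin, so every added picosecond lowers Fmax.

t_clk_to_q_ps = 60.0
t_logic_ps = 340.0
t_setup_ps = 40.0


def fmax_ghz(skew_ps: float, ocv_margin_ps: float = 0.0) -> float:
    period_ps = t_clk_to_q_ps + t_logic_ps + t_setup_ps + skew_ps + ocv_margin_ps
    return 1000.0 / period_ps  # period in ps -> frequency in GHz


for label, skew, margin in [
    ("ideal clock (no skew)",       0.0,  0.0),
    ("tree topology, OCV margin",  40.0, 30.0),
    ("mesh topology, low skew",     8.0, 10.0),
]:
    print(f"{label:30s} Fmax = {fmax_ghz(skew, margin):.2f} GHz")
```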

Two popular clock topologies are tree and mesh, so a comparison reveals the differences between each.

Tree and Mesh Clock topologies. Source: GLVLSI 10, May 16-18, 2010

                       Tree                      Mesh
Shared path depth      Higher                    Lower
Timing analysis        Static Timing Analysis    SPICE
Power                  Lower                     Higher
OCV                    More sensitive            More tolerant
Clock speed            Lower                     Higher
Clock skew             Higher                    Lower
Routing resources      Lower                     Higher

With a tree topology, the EDA tool flow is quite automated: Clock Tree Synthesis (CTS) is built into popular tool flows that tie logic synthesis to place and route, with timing analysis run on a Static Timing Analysis (STA) tool. The downsides of a tree topology are its sensitivity to OCV, lower clock speeds, and higher clock skew. In modern process nodes, aging of P- and N-channel devices will change the duty cycle of the clock, so it may not be 50% high and 50% low, which then impacts critical path delays.

The mesh topology for clocks provides the highest clock speeds, higher tolerance to OCV, and the lowest clock skew. Its downsides are higher routing resource usage, higher power consumption, and the need for SPICE for timing analysis. An STA tool cannot be used to analyze a mesh, because in a mesh the clock signal reaches each point through paths that combine. The only information an STA tool can provide for a mesh-based clock design is setup and hold analysis, not critical path analysis.

There is also a middle ground, where the best aspects of tree and mesh topologies are combined, so there are choices for clock topology that are driven by your product requirements.

SPICE circuit simulation using an extracted IC netlist with parasitics is required for timing analysis of mesh-based clock topologies. Accurate analysis of OCV effects requires Monte Carlo simulation in SPICE, which is a very time-consuming step, and your SPICE simulator may not have the capacity for such a large extracted netlist. If your chip design group is intimidated by using SPICE to analyze clock timing in a mesh-based topology, then there’s some good news: the EDA vendor Infinisim has an easy-to-use product called ClockEdge for clock timing analysis that doesn’t require you to be a SPICE expert. The analysis provided by ClockEdge will help your team implement a mesh-based clocking topology quickly and with a minimum of training.

Summary

SoC designers tackle many technical issues to reach their Power, Performance and Area (PPA) goals, and choosing a clock topology is one of them. Most modern chip teams will be attracted to the benefits of a combined tree and mesh topology for their clock network, as it provides high clock rates, low skew, acceptable routing resources, and tolerance of OCV effects. Timing analysis of mesh-based clock networks is now simplified by the ClockEdge tool from Infinisim, which provides analysis for metrics like rail-to-rail failures, duty cycle distortion, slew rate and transition distortion, and power-supply induced jitter.

Related Blogs

Can Correlation Between Simulation and Measurement be Achieved for Advanced Designs?
by Mike Gianfagna on 03-18-2024 at 6:00 am

“What you simulate is what you get.” This is the holy grail of many forms of system design. Achieving a high level of accuracy between predicted and actual performance can cut design time way down, resulting in better cost margins, time to market and overall success rates. Achieving a high degree of confidence in predicted performance is not an easy task. Depending on the type of design being done, there are many processes and methods that must be executed flawlessly to achieve the desired result. There was a panel devoted to this topic at the recent DesignCon in Santa Clara, CA. Experts looked at the problem from several different perspectives.  Read on to learn more – can correlation between simulation and measurement be achieved for advanced designs?

About the Panel

The DesignCon panel was entitled, Extreme Confidence Simulation for 400-800G Signal Integrity Design. The event was organized by Wild River Technology, a supplier of products and services for advanced signal integrity design. Samtec also participated in the panel. Samtec and Wild River represented the two companies on the panel that focus on products and services specifically targeted to support advanced signal integrity designs. The balance of the panel included companies that focus on design methodology/tools and advanced product development, so all points of view were represented. Here is a summary of who participated – all have impressive credentials.

I will summarize the comments from Samtec and Wild River Technology on correlation for advanced designs next, since these two points of view are wholly focused on correlation accuracy rather than product design or design methodology.

Al Neves, Founder & Chief Technology Officer, Wild River Technology

Al has over 39 years of experience in the design and application development of semiconductor products and in capital equipment design focused on jitter and signal integrity analysis. He has been successfully involved with numerous business development and startup activities over the last 17 years. Al focuses on measurement-based model development, ultra-high signal integrity serial link characterization test fixtures, high-speed test fixture design, and platforms for material identification and measurement-simulation correlation to 110 GHz.

Scott McMorrow, Strategic Technologist, Samtec

Scott currently serves as a Strategic Technologist for Samtec, Inc. As a consultant for many years, Scott has helped many companies develop high performance products, while training signal integrity engineers. He is a frequent author and spokesperson for Samtec.

Gary Lytle, Product Management Director, Cadence

Gary leads product strategy, positioning, sales enablement and demand generation for Cadence electromagnetic simulation technologies. He has held many positions in the RF and simulation industry, including Technical Director at ANSYS, Inc., Lead Antenna Design Engineer at Dielectric Communications, Combat Systems Engineer at General Dynamics, and Engineering Manager at Amphenol.

Cathy Liu, Distinguished Engineer, Broadcom

Cathy Ye Liu currently heads up Broadcom SerDes architecture and modeling group. Since 2002, she has been working on high-speed transceiver solutions. Previously she has developed read channel and mobile digital TV receiver solutions.

 

Jim Weaver, Senior Design & Signal Integrity Engineer, Arista Networks

Jim is responsible for design and analysis of large switches for cloud computing and high bit rate serial links. Jim has over 40 years of experience in system design, including 20 years of signal integrity experience, and is heavily involved with IEEE802.3dj electrical specification work.

Todd Westerhoff, High-Speed Design Product Marketing at Siemens EDA

Todd Westerhoff moderated the panel. He has over 42 years of experience in electronic system modeling and simulation, including 25 years of signal integrity experience. Prior to joining Siemens EDA, he held senior technical and management positions at SiSoft, Cisco and Cadence. He also worked as an independent signal integrity consultant developing analysis methodologies for major systems and IC manufacturers.

The focus of the panel was defined this way:

What’s the point in running detailed simulations if the PCB test vehicle you fabricate and assemble performs differently than you had predicted? This panel will discuss issues associated with achieving tight and repeatable correlation between simulation and measurement for structures such as vias, connector launches, transmission lines, etc. and the channels that contain them.

This correlation allows us to perform what we call “Extreme Confidence Simulation”. A wide set of simulation topics will be addressed that are focused on the epic signal integrity challenges presented by 400-800G communication.

Key Takeaways – Samtec

Scott provided his views and experience on correlation for advanced designs, beginning with the observation that, in order to correlate measurements to simulation, it is necessary to understand the limits of the methods. We assume our simulations are correct given correct modeling inputs. Further, we assume our measurements are correct given the best measurement methods. But are they?

Scott pointed out that there is a statistical probability of error in both the simulations and the measurements that has nothing to do with correct modeling of materials. Therefore, we need to understand these to improve our measurement to model correlation.

Scott then dove into significant detail to discuss HFSS simulation maximum delta S criteria, HFSS simulation convergence criteria, high frequency phase accuracy, transmission uncertainty, Mcal insertion loss error, and Mcal delay error.

Scott concluded his talk with a summary of what’s needed to understand the limits of measurement. For simulation modeling, understanding the convergence controls needed to achieve the necessary level of correlation is mandatory. He pointed out that for all but metrology-grade VNA measurements, phase (delay) error is low enough that delay is accurate to within several hundred femtoseconds, which is fortunate for material identification problems.

But below 10 GHz, he warned of incorrect phase creeping in, altering the starting point for material identification, and creating time domain causality issues. At low frequencies, he suggested using a separate method to validate the low frequency and DC characteristics of the material, where the accuracy is higher.

A final comment from Scott: Separate correlation to individual structures so that accuracy can be preserved in both simulation and measurement.

Key Takeaways – Wild River Technology

Al took a direct approach to the topic, pointing out that EDA tools are not standards. “There is nothing “golden” about them (sorry). Believing EDA tools are standards can corrupt the path to high-speed design confidence.” He went on to explain that the path to simulation-to-measurement confidence is a hard road that takes a lot of work and it’s uncompromising.

The hard work is EDA calibration/benchmarking and building systematic approaches using advanced test fixtures (material ID, verification of models, etc.) The bottom line is that all EDA tools have issues, and it is our job to identify and work around them.

Al then spent some time on the importance of calibration and metrics. He explained that better calibration is required for simulation-measurement correlation. For example, sliding load cal performance is required for good sim-measurement correspondence. He felt the industry is over-reliant on easy-to-use ECal and has neglected good mechanical cals. Al coined the term EDA Metrics Matter. His concluding points were:

  • Mindset matters
  • You cannot ignore Maxwell
  • The world of >70GHz is not in good shape for signal integrity
  • Metrics will be very useful

Summary, and Next Steps

There were similar messages from Scott and Al at this panel: understanding how to calibrate results and factor in all sources of error, including an understanding of the materials being used, is important.

Samtec offers a vast library of information on calibration and measurement accuracy. You can explore Samtec’s technical library here. I’m a fan of the gEEk spEEk webinars. You can explore the extreme signal integrity products and services offered by Wild River Technology here. So, can correlation between simulation and measurement be achieved for advanced designs? With the right approach and the right partners, I believe it can.


Measuring Local EUV Resist Blur with Machine Learning
by Fred Chen on 03-17-2024 at 10:00 am

Resist blur remains a topic that is relatively unexplored in lithography. Blur has the effect of reducing the difference between the maximum and minimum doses in the local region containing the feature. Blur is particularly important for EUV lithography, since EUV lithography is prone to stochastic fluctuations and is also driven by secondary electron migration, which presents a significant source of blur [1].

While optical sources of blur, such as defocus, flare, and EUV dipole image fading [2], can be considered as independent of wafer location, non-optical sources, such as from electron migration or acid diffusion, can have a locally varying behavior. It is therefore important to have some way to characterize and/or monitor the local blur in a patterned EUV resist.

The most straightforward way is to have a resist pattern that covers the whole exposure field with adequate resolution-scale sampling. A practical choice for a 0.33 NA EUV system would be a 20 nm half-pitch hole or pillar array, which gives equal sampling in the x and y directions. It is also practically at the resolution limit for contact/via patterning due to stochastic variations [3,4]. As shown in the example of Figure 1, a large enough blur, e.g., 20 nm, is enough for the contact to go missing. Such a large blur may result from local resist inhomogeneities as well as an occasionally large electron range.

Figure 1. 20 nm half-pitch via pattern, at 20 mJ/cm² absorbed dose (averaged over a 40 nm x 40 nm cell), with different values of blur. Quadrupole illumination is used with a darkfield mask. Secondary electron quantum yield = 2. A Gaussian was fit to the half-pitch via.

One can envisage that machine learning methods [5] could be used to match via appearance to the most likely blur at a given location, allowing a blur map to be generated for the whole exposure field. It should also be noted that the rare large local blur scenario is consistent with the rare occurrence of stochastic defects [6]. Thus, studying local blur is important for a basic understanding not just of the resist but also of the origin of stochastic defects.
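
As a minimal sketch of that matching idea – assuming a simple Gaussian blur model, using only standard numpy/scipy, and with all geometry and noise values invented for illustration – one could synthesize the expected via image for a grid of candidate blur values and pick the blur that best matches the observed local image in a least-squares sense; a production flow would instead train a machine learning model on SEM images:

```python
# Minimal sketch of the matching idea (assumes a simple Gaussian blur model and
# uses only numpy/scipy; geometry and noise values are invented for
# illustration): synthesize the expected via image for a grid of candidate blur
# values and pick the one that best matches the observed local image in a
# least-squares sense.

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
PIXEL_NM = 1.0                    # 1 nm per pixel for simplicity

# Ideal (unblurred) dose image of one 20 nm half-pitch via in a 40 nm cell.
ideal = np.zeros((40, 40))
ideal[10:30, 10:30] = 1.0


def blurred_via(sigma_nm: float) -> np.ndarray:
    """Expected via image for a given Gaussian blur (sigma in nm)."""
    return gaussian_filter(ideal, sigma=sigma_nm / PIXEL_NM)


# Pretend this is the locally observed image: 12 nm blur plus measurement noise.
true_sigma = 12.0
observed = blurred_via(true_sigma) + rng.normal(0.0, 0.02, ideal.shape)

# Match against a grid of candidate blur values.
candidates = np.arange(2.0, 24.0, 0.5)
errors = [np.sum((blurred_via(s) - observed) ** 2) for s in candidates]
estimate = candidates[int(np.argmin(errors))]

print(f"true blur sigma = {true_sigma:.1f} nm, estimated = {estimate:.1f} nm")
```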

References

[1] P. Theofanis et al., Proc. SPIE 11323, 113230I (2020).

[2] J-H. Franke, T. A. Brunner, and E. Hendrickx, J. Micro/Nanopattern. Mater. Metrol. 21, 030501 (2022).

[3] W. Gao et al., Proc. SPIE 11323, 113231L (2020).

[4] F. Chen, “Via Shape Stochastic Variation in EUV Lithography,” https://www.youtube.com/watch?v=Cj1gfDV7-GE

[5] C. Bishop, Pattern Recognition and Machine Learning, https://www.microsoft.com/en-us/research/publication/pattern-recognition-machine-learning/

[6] F. Chen, “EUV Stochastic Defects from Secondary Electron Blur Increasing with Dose,” https://www.youtube.com/watch?v=Q169SHHRvXE, “Modeling EUV Stochastic Defects with Secondary Electron Blur,” https://www.linkedin.com/pulse/modeling-euv-stochastic-defects-secondary-electron-blur-chen

This article first appeared in LinkedIn Pulse: Measuring Local EUV Resist Blur with Machine Learning

Also Read:

Pinning Down an EUV Resist’s Resolution vs. Throughput

Application-Specific Lithography: Avoiding

Non-EUV Exposures in EUV Lithography Systems Provide the Floor for Stochastic Defects in EUV Lithography

Stochastic Defects and Image Imbalance in 6-Track Cells


Podcast EP212: A View of the RISC-V Landscape with Synopsys’ Matt Gutierrez
by Daniel Nenni on 03-15-2024 at 10:00 am

Dan is joined by Matt Gutierrez. Matt joined Synopsys in 2000 and is currently Sr. Director of Marketing for Processor & Security IP and Tools. His current responsibilities include the worldwide marketing of ARC Processors and Subsystems, Security IP, and tools for the development of application-specific instruction set processors. Prior to joining Synopsys, Matt held various technical and management positions with companies such as Cypress Semiconductor, Fujitsu Limited, and The Silicon Group. Matt has over 25 years of experience in the semiconductor, computer systems, and EDA industries.

Matt provides an overview of what’s happening in custom processors and the impact of the RISC-V ISA. Matt also discusses what Synopsys is doing to enable application-specific processor design, including the recent announcement of its ARC-V processor IP.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.