
AI for the design of Custom, Analog Mixed-Signal ICs
by Daniel Payne on 09-28-2023 at 10:00 am


Custom and Analog Mixed-Signal (AMS) IC design is used when the highest performance is required and digital standard cells just won’t meet the requirements. Manually sizing schematics, doing IC layout, extracting parasitics, then measuring the performance only to go back and iterate again is a long, tedious approach. Siemens EDA offers EDA tools that span a wide gamut, including: high-level synthesis, IC design, IC verification, physical design, physical verification, manufacturing and test, packaging, electronic systems design, electronic systems verification and electronic systems manufacturing. Zooming into the categories of IC design and IC verification is where tools for custom IC come into focus, like the Solido Design Environment. I had a video conference with Wei Tan, Principal Product Manager for Solido, to get an update on how AI is being used.

Designing an SoC at 7nm can cost up to $300 million, and at 5nm can reach $500 million, so a solid design and verification methodology is critical both to the financial budget and to the goal of first-pass silicon success. With each smaller process node, the number of PVT corners required for verification only goes up.

The general promise of applying AI to the IC design and verification process is to reduce the number of brute-force calculations, help engineers be more productive, and pinpoint root causes for issues like yield loss. Critical elements of using AI in EDA tools include:
  • Verifiability – the answers are correct
  • Usability – non-experts can use the tools without a PhD in statistics
  • Generality – it works on custom IC, AMS, memory and standard cells
  • Robustness – all corner cases work properly
  • Accuracy – same answers as brute-force methods
Wei talked about three levels of AI: the first is Adaptive AI, which accelerates an existing process using AI techniques; the next is Additive AI, which retains previous model answers for new runs; and the final level is Assistive AI, which helps circuit designers be more productive with new insights using generative AI.
Solido has some 15 years of experience applying AI techniques to EDA tools used by circuit designers at the transistor level. For Monte Carlo simulations using Adaptive AI there’s up to a 10,000X speedup, so you can get 3 to 6+ sigma results at all corners that match brute-force accuracy. Here’s an example of Adaptive AI where a 7.1 sigma verification that would have required 10 trillion brute-force simulations used only 4,000 simulations, a 2,500,000,000X speedup with SPICE accuracy.
High-Sigma Verifier
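To put the brute-force number in perspective, the one-sided Gaussian tail probability at 7.1 sigma is around 6e-13, so directly observing even a handful of failures takes on the order of ten trillion samples. A quick back-of-the-envelope check in Python (standard normal statistics only, not Solido's algorithm):

```python
from math import erfc, sqrt

def tail_probability(sigma: float) -> float:
    """One-sided Gaussian tail probability beyond `sigma` standard deviations."""
    return 0.5 * erfc(sigma / sqrt(2.0))

for sigma in (3.0, 4.0, 5.0, 6.0, 7.1):
    p_fail = tail_probability(sigma)
    # Seeing ~10 failures by brute force needs roughly 10 / p_fail samples.
    print(f"{sigma:4.1f} sigma: P(fail) ~ {p_fail:.1e}, "
          f"brute-force samples needed ~ {10 / p_fail:.1e}")
```

At 7.1 sigma this prints roughly 1.6e13 required samples, which is why AI-guided sampling that needs only thousands of SPICE runs is such a large win.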

The Solido Design Environment also scales well in the cloud, using AWS or Azure to speed up simulation runs and meet peak demands.

An example of Additive Learning is AI model reuse when there are multiple PDK revisions and you want to characterize your entire standard cell library against each new PDK version. The traditional approach would require 600 hours of PVT and Monte Carlo runs to cover five revisions.
Traditional PVTMC jobs
With AI model reuse this scenario takes much less time to complete, while also reducing the megabytes to gigabytes of data stored on disk.
AI model reuse, saves time
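The arithmetic behind the savings is straightforward. As a minimal sketch, assume the 600 hours from the scenario above split evenly across the five revisions, and assume (an illustrative figure, not a Solido benchmark) that model reuse avoids 90% of the simulations on each subsequent revision:

```python
# Illustrative only: the 600-hour baseline comes from the article; the 90%
# reuse saving on re-runs is an assumed figure, not a published benchmark.
revisions = 5
hours_per_full_run = 600 / revisions          # 120 h of PVT + Monte Carlo per PDK rev
reuse_fraction = 0.9                          # assumed share of simulations avoided

traditional = hours_per_full_run * revisions
with_reuse = hours_per_full_run * (1 + (1 - reuse_fraction) * (revisions - 1))

print(f"traditional: {traditional:.0f} h, with model reuse: {with_reuse:.0f} h, "
      f"speedup: {traditional / with_reuse:.1f}x")
```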
Assistive AI is applied to transistor sizing: it identifies optimization paths to improve PPA, determines the optimal sizing of transistors to achieve the target PPA goals, and provides friendly reports to visualize progress. You can expect your IC team to save days to weeks of engineering time by using AI-assisted optimization.
Assistive AI for circuit sizing
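Conceptually, AI-assisted sizing searches the device width/length space against a PPA cost function, with simulation results standing in for the models below. A toy sketch using scipy (the analytic delay/power/area models are invented stand-ins for SPICE measurements, not Solido's optimizer):

```python
import numpy as np
from scipy.optimize import minimize

# Toy analytic stand-ins for simulated measurements: wider devices are
# faster (lower delay) but burn more power and area.
def delay(w):  return 1.0 / w[0] + 1.0 / w[1]     # ns
def power(w):  return 0.2 * (w[0] + w[1])         # mW
def area(w):   return 0.1 * (w[0] + w[1])         # um^2

def ppa_cost(w, target_delay=0.5):
    # Heavily penalize missing the delay target, then trade off power and area.
    miss = max(0.0, delay(w) - target_delay)
    return 100.0 * miss + power(w) + area(w)

result = minimize(ppa_cost, x0=np.array([2.0, 2.0]),
                  bounds=[(0.5, 20.0), (0.5, 20.0)])
print("widths:", result.x.round(2), "cost:", round(result.fun, 3))
```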

Summary

Custom and AMS IC designers can now apply AI-based techniques in their EDA tool flows during both design and verification stages. Adaptive AI speeds up brute-force Monte Carlo simulation, Additive learning uses retained AI models to speed up runs, and Assistive AI is applied to circuit optimization and analysis.
Yes, you still need circuit designers to envision transistor-level circuits, but they won’t have to wait so long for results when using EDA tools that have AI techniques under the hood.

Related Blogs


Keysight EDA 2024 Delivers Shift Left for Chiplet and PDK Workflows
by Don Dingee on 09-28-2023 at 8:00 am


Much of the recent Keysight EDA 2024 announcement focuses on high-speed digital (HSD) and RF EDA features for Advanced Design System (ADS) and SystemVue users, including RF System Explorer, DPD Explorer (for digital pre-distortion), and design elements for 5G NTN, DVB-S2X, and satcom phased array applications. Two important new features in the Keysight EDA 2024 suite may prove crucial in EDA workflows for chiplets and PDKs (process design kits).

A quick introduction to chiplet interconnects

Chiplets are the latest incarnation of modular chip design, tracing back through multi-chip module (MCM), system-in-package (SiP), package-on-package (PoP), and other approaches, targeting improvements in design cost, performance, yield, power efficiency, and thermal management. Chiplets decompose what would otherwise be a complex SoC, with an expensive and maybe unrealizable single-die solution, into smaller pieces designed and tested independently and then packaged together. Chip designers can grab chiplets from different process nodes in a heterogeneous approach – say, putting a 3nm digital logic chiplet alongside a 28nm mixed-signal chiplet.

Until recently, there has been no specification for die-to-die (D2D) interconnects, leaving chiplet designers with two significant challenges. First is the speed of today’s interconnects, often with gigabit clocks, where the bit error rate (BER) starts creeping up enough to affect performance. Second is the difficulty of modeling and simulating interconnects in digital EDA tools, usually in a do-it-yourself approach, trying to match precise time-domain measurements of eye patterns from high-speed oscilloscopes.

UCIe (Universal Chiplet Interconnect Express) fills the gap for D2D interconnects. It defines three layers: a PHY layer with data paths on physical bumps grouped into lanes by signal exit ordering; a D2D adapter coordinating link states, retries, power management, and more; and a protocol layer building on CXL and PCIe specifications. The Standard Package (2D) drives low-cost, long-reach (up to 25mm) interconnects. Advanced Package (2.5D) variants optimize performance on short-reach (less than 2mm) interconnects with tighter bump pitch, enabling improved BER at higher transfer rates. Bump maps and signal exit routing, combined with scalable diagonal bump pitch requirements, ensure that a UCIe-compliant chiplet places on a substrate with controlled interface characteristics, making interoperable connections possible.


A shift left with Chiplet PHY Designer for UCIe

Keysight EDA teams have been working on modeling and simulating HSD interfaces aligned with industry specifications for some time. Their first major product release was ADS Memory Designer, with an IBIS-AMI modeler for DDR5/LPDDR5/GDDR7 memory interfaces supporting statistical and single-ended bus bit-by-bit simulations. Its rigorous JEDEC compliance testing handles over 100 test IDs with the same test algorithms found in the Keysight Infiniium oscilloscope family.

According to Hee-Soo Lee, DDR/SerDes Product Owner and HSD Segment Lead at Keysight, the HSD R&D squad leveraged four years of effort developing Memory Designer in the creation of Chiplet PHY Designer, the industry’s first chiplet interconnect simulation tool ready for introduction as part of ADS 2024 Update 1.0 in the Keysight EDA 2024 suite. “We saw an opportunity to speed up designs using chiplets by simulating a chiplet subsystem, from one D2D PHY through interconnect channels to another D2D PHY, much earlier in the cycle,” says Lee. “Chiplet PHY Designer precisely computes a voltage transfer function (VTF) to ensure specification compliance and analyzes system BER down to 1e-27 or 1e-32 levels.” Chiplet PHY Designer can also measure eye height, eye width, skew, mask margin, and BER contour.
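For intuition on what BER levels like 1e-27 imply: under an additive-Gaussian-noise model, BER relates to the ratio of eye opening to RMS noise at the sampling point (the Q factor) as BER = 0.5·erfc(Q/√2). A small sketch of that relation (an idealized textbook model, not the VTF-based compliance flow described above):

```python
from math import erfc, sqrt

def ber_from_q(q: float) -> float:
    """BER of a binary link with additive Gaussian noise: 0.5*erfc(Q/sqrt(2))."""
    return 0.5 * erfc(q / sqrt(2.0))

# Q = eye opening at the sampler / RMS noise (idealized single-ended model).
for q in (7, 9, 11, 12):
    print(f"Q = {q:2d} -> BER ~ {ber_from_q(q):.1e}")
```

Q around 11 to 12 already lands in the 1e-28 to 1e-33 range, which is why such BER targets are verified by computation rather than by counting errors on a bench.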


Keysight teams adapted the single-ended bus simulation technology to handle the single-ended signaling and forwarded clocking used in UCIe. They then incorporated the UCIe signal naming convention and connection rules for smart wiring in the schematic. “After placing two dies along with interconnect channels, we can now tell Chiplet PHY Designer to make the automated wiring connections between chiplet components, and the design is ready for simulation right away,” continues Lee. The upcoming November 2023 release of Chiplet PHY Designer puts Keysight ahead of competing EDA environments for chiplet design. Interestingly, Lee hints that support for Bunch of Wires (BoW) and Advanced Interface Bus (AIB) is coming in future releases.

Adapt existing PDK models to new process specifications

Creating accurate and high-quality transistor models can be time-consuming and affect the on-time delivery of PDKs. “In the traditional modeling approach, extracting a transistor model card from mass measurement data takes at least several days, often weeks,” says Ma Long, Manager of Device Modeling and Characterization at Keysight.

Keysight IC-CAP now incorporates a new product for model recentering, where models from prior processes are adjusted to a new process using figures of merit (FOMs). “The biggest challenge is addressing the trend plots in real-time, simulating data points for different geometries and temperatures,” says Long. “From threshold voltage, cutoff frequency, and other FOMs, modeling engineers can modify an existing model to new specifications and save 70% compared to traditional step-by-step model extraction.”
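The recentering idea can be illustrated with a toy model card: pick a FOM target from the new process and solve for the parameter value that reproduces it. A minimal sketch with a square-law MOSFET and scipy's root finder (hypothetical numbers; not IC-CAP's actual algorithm, which handles many FOMs, geometries and temperatures at once):

```python
from scipy.optimize import brentq

# Toy square-law drain current; VTH0 plays the role of a model-card parameter.
def idsat(vth0: float, vgs: float = 0.9, k: float = 2e-3) -> float:
    return k * max(0.0, vgs - vth0) ** 2

target_idsat = 250e-6  # assumed saturation-current FOM measured on the new process

# Solve for the VTH0 that reproduces the target figure of merit.
new_vth0 = brentq(lambda v: idsat(v) - target_idsat, 0.1, 0.8)
print(f"recentered VTH0 = {new_vth0*1e3:.1f} mV -> Idsat = {idsat(new_vth0)*1e6:.1f} uA")
```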


Earlier model quality check reduces later iterations

Keysight has a full-featured model quality assurance (QA) tool, MQA, used for final SPICE model library sign-off and documentation. A newly developed light version of MQA, QA Express, is now integrated into Keysight Model Builder (MBP), allowing modeling engineers to apply a quick model QA check during parameter extraction.

Binning model QA is complicated and can also take days or weeks, and issues showing up late in the process can send teams back to the beginning. “QA Express gives easy-to-use, quick results providing a high-confidence check,” Long continues. A faster result is beneficial when simulators toss warnings about parameter effective ranges or when bin discontinuities are detected. QA Express enables modeling engineers to find QA issues earlier with one-click ease.
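One concrete class of issue here is bin discontinuity: two adjacent binning models should predict essentially the same device behavior at their shared geometry boundary. A minimal sketch of such a check (toy linear bin models and an assumed tolerance; real bin models are SPICE model cards evaluated by a simulator):

```python
# Each bin approximates Idsat (uA) as a linear function of channel length L (um)
# over its own range; the coefficients below are invented for illustration.
bins = [
    {"L_range": (0.1, 0.5), "predict": lambda L: 300.0 - 200.0 * L},
    {"L_range": (0.5, 1.0), "predict": lambda L: 260.0 - 130.0 * L},
]
tolerance_pct = 1.0  # assumed QA limit on mismatch at the bin boundary

L_boundary = bins[0]["L_range"][1]
i_left = bins[0]["predict"](L_boundary)
i_right = bins[1]["predict"](L_boundary)
mismatch = abs(i_left - i_right) / i_left * 100.0
status = "PASS" if mismatch <= tolerance_pct else "FAIL: bin discontinuity"
print(f"Idsat at L={L_boundary} um: {i_left:.1f} vs {i_right:.1f} uA "
      f"({mismatch:.2f}% mismatch) -> {status}")
```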


Learn more at the Keysight EDA 2024 product launch event

Keysight has packed many new capabilities into the Keysight EDA 2024 release. For a brief introduction to trends in the EDA market driving these improvements, watch the video featuring Keysight EDA VP and GM Niels Faché below.

To help current and future users understand the latest enhancements in Keysight EDA 2024, including workflows for chiplets and PDKs, Keysight is hosting an online product introduction event on October 10th and 11th for various time zones.

Registration page:

Keysight EDA 2024 Product Launch Event

Press release for Keysight EDA 2024:

Keysight EDA 2024 Integrated Software Tools Shift Left Design Cycles to Increase Engineering Productivity

Also Read:

Version Control, Data and Tool Integration, Collaboration

Keysight EDA visit at #60DAC

Transforming RF design with curated EDA experiences


Assertion Synthesis Through LLM. Innovation in Verification
by Bernard Murphy on 09-28-2023 at 6:00 am


Assertion-based verification is a very productive way to catch bugs; however, assertions are hard enough to write that assertion-based coverage is not as extensive as it could be. Is there a way to simplify developing assertions to aid in increasing that coverage? Paul Cunningham (Senior VP/GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.

The Innovation

This month’s pick is Towards Improving Verification Productivity with Circuit-Aware Translation of Natural Language to SystemVerilog Assertions. The paper was presented at the First International Workshop on Deep-Learning Aided Verification in 2023 (DAV 2023). The authors are from Stanford.

While a lot of attention is paid to LLMs for generating software or design code from scratch, this month’s focus is on generating assertions, in this case as an early view into what might be involved in such a task. The authors propose a framework to convert a natural language check into a well-formed assertion in the context of the target design which a designer can review and edit if needed. The framework also provides for formally checking the generated assertion, feeding back results to the designer for further refinement. The intent looks similar to prompt refinement in prompt-based chat models, augmented by verification.

Since this is a very preliminary paper, our goal this month is not to review and critique method and results, but rather to stimulate discussion on the general merits of such an approach.

Paul’s view

A short paper this month – more of an appetizer than a main course, but on a Michelin star topic: using LLMs to translate specs written in plain English into SystemVerilog assertions (SVA). The paper builds on earlier work by the authors using LLMs to translate specs in plain English into linear temporal logic (LTL), a very similar problem, see here.

The authors leverage a technique called “few shot learning” where an existing commercial LLM such as GPT or Codex is asked to do the LTL/SVA translation, but with some additional coaching in its input prompt: rather than asking the LLM to “translate the following sentence into temporal logic” the authors ask the LLM to “translate the following sentence into temporal logic, and remember that…” followed by a bunch of text that explains temporal logic syntax and gives some example translations of sentences into temporal logic.
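Mechanically, few-shot prompting is just prompt construction. A minimal sketch of how such a coached prompt might be assembled (the syntax notes and example translations below are invented for illustration; the paper's actual prompt text is not reproduced here):

```python
COACHING = """Translate the following sentence into a SystemVerilog assertion.
Remember that: `|->` is overlapping implication, `##[m:n]` delays m to n cycles,
`$rose(x)` detects a 0->1 transition, and properties are clocked.

Example: "grant must follow request within 3 cycles"
  -> property p1; @(posedge clk) $rose(req) |-> ##[1:3] gnt; endproperty
Example: "ack never asserts while reset is high"
  -> property p2; @(posedge clk) rst |-> !ack; endproperty
"""

def build_prompt(natural_language_spec: str) -> str:
    """Assemble a few-shot prompt to send to an LLM completion API."""
    return f'{COACHING}\nNow translate: "{natural_language_spec}"\n  ->'

print(build_prompt("valid must stay high until ready is seen"))
```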

The authors’ key contribution is to come up with the magic text to go after “remember that…”. A secondary contribution is a nice user interface to allow a human to supervise the translation process. This interface presents the user with a dialog box showing suggested translations of sub-clauses in the sentence and asks the user to confirm these sub-translations before building up the final overall temporal logic expression for the entire sentence. Multiple candidate sub-translations can be presented in a drop-down menu, with a confidence score for each candidate.

There are no results presented in the SVA paper, but the LTL paper shows results on 36 “challenging” translations provided by industry experts. Prior art correctly translates only 2 of the 36, whereas the authors’ approach succeeds on 21 of 36 without any user supervision and on 31 of 36 with user supervision. Nice!

Raúl’s view

The proposed framework, nl2sva, “ultimately aims to utilize current advances in deep learning to improve verification productivity by automatically providing circuit-aware translations to SystemVerilog Assertions (SVA)”. It does this by extending a recently released tool, nl2spec, to interactively translate natural language to temporal logic (SVA are based on temporal logic). The framework requires an LLM (they use GPT-4) and a model checker (they use JasperGold). The LLM reads the circuits in SystemVerilog and the assertions in natural language, and generates SVAs plus circuit meta-information (e.g., module names, input and output wire names) and sub-translations in natural language. These are presented to the developer for evaluation, and the SVAs are run through the model checker. The authors describe how the framework is trained (few-shot prompting), include two complete toy examples (Verilog listings), and show a correctly generated SVA for each of them (“Unless reset, the output signal is assigned to the last input signal”).
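The generate-check-refine loop at the heart of the framework can be sketched as below; `llm_complete` and `run_model_checker` are hypothetical stand-ins for the GPT-4 API and a JasperGold invocation respectively, stubbed out here so the sketch runs:

```python
def llm_complete(prompt: str) -> str:
    # Hypothetical stand-in for an LLM completion call.
    return "assert property (@(posedge clk) req |-> ##[1:2] gnt);"

def run_model_checker(design_sv: str, sva: str) -> tuple[bool, str]:
    # Hypothetical stand-in for invoking a formal tool on design + assertion.
    return True, ""

def refine_assertion(spec_nl: str, design_sv: str, max_iters: int = 3) -> str:
    """nl2sva-style loop: the LLM proposes an SVA, a model checker evaluates it,
    and any counterexample is fed back into the next prompt."""
    feedback = ""
    for _ in range(max_iters):
        prompt = (f"Design:\n{design_sv}\n"
                  f"Check (natural language): {spec_nl}\n{feedback}"
                  "Produce one SystemVerilog assertion.")
        sva = llm_complete(prompt)
        ok, counterexample = run_model_checker(design_sv, sva)
        if ok:
            return sva
        feedback = f"Previous attempt failed; counterexample: {counterexample}\n"
    raise RuntimeError("no passing assertion found; needs human review")

print(refine_assertion("grant follows request", "module m(...); endmodule"))
```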

As pointed out, this is preliminary work. Using AI to generate assertions seems a worthy enterprise. It is a hard problem in the sense that it involves translation; we briefly hit translation back in July when reviewing Automatic Code Review using AI. Translation is a hard problem to score, often done with the BLEU score (bilingual evaluation understudy, which evaluates the quality of machine translation on a scale of 0-100) involving human evaluation. The authors use GPT-4, stating that it has “up to 176 billion parameters” and “supports up to 8192 tokens of context memory”, which is limiting. Using GPT-5 (1.76 trillion parameters; it is not clear why they quote only 8192 tokens) would remove these limits.

In any case, this is a really easy paper, with paragraph-long introductions to both SVA and LLMs and two complete toy examples – fun to read!

Also Read:

Cadence Tensilica Spins Next Upgrade to LX Architecture

Inference Efficiency in Performance, Power, Area, Scalability

Mixed Signal Verification is Growing in Importance


Podcast EP184: The History and Industry-Wide Impact of TSMC OIP with Dan Kochpatcharin
by Daniel Nenni on 09-27-2023 at 2:00 pm

Dan is joined by Dan Kochpatcharin, who joined TSMC in 2007. Prior to his current role heading up the Design Infrastructure Management Division, Dan led the Japan customer strategy team and the technical marketing and support team for the EMEA region in Amsterdam, and was part of the team leading the formation of the TSMC Open Innovation Platform. Prior to TSMC, Dan worked at Chartered Semiconductor, both in the US and Singapore, and at LSI Logic.

The history of TSMC ecosystem collaboration is reviewed, starting with the first reference flow work in 2001. TSMC’s OIP Ecosystem has been evolving for the past 15 years and Dan provides an overview of the activities and impact of this work. Ecosystem-wide enablement of 3DIC design is also discussed with a review of the TSMC 3DFabric Alliance and 3Dblox.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Version Control, Data and Tool Integration, Collaboration
by Daniel Payne on 09-27-2023 at 10:00 am


As a follow-up to my #60DAC visit with Simon Rance of Keysight, I was invited to their recent webinar, Unveiling the Secrets to Proper Version Control, Seamless Data and Tool Integration, and Effective Collaboration. Karim Khalfan, Director of Solutions Engineering, Data & IP Management, was the webinar presenter.

Modern SoC devices can contain hundreds of semiconductor IP blocks, including subsystems for CPU, GPU, security, memory, interconnect, NoC, and IO. Keeping track of such a complex system of subsystems requires automation.

SoC Complexity

Version Control

The goals of a version control tool for SoC design are to capture the objects used in a release, ensure data security, resolve conflicts from multi-user check-ins, maintain design handoffs using labels, and allow reverting to a stable version of the system. Access controls define which engineers can read or modify the system, something required for military projects through ITAR compliance. Authorized engineers can check out IP like hardware or software, work on a branch, resolve conflicts with other team members, then merge changes when ready by checking in or committing.

With version control, designers can update specific objects, revert to previous versions, and use labels to communicate to their team what each update is about. Modern version control tools should offer both command-line and GUI modes to suit the style of each project.

Reuse and Traceability

The first diagram showed just how much IP it can take to design a system, so being able to reuse trusted IP from internal or external sources is required, along with being able to trace where each IP block came from and its version history. Industries like aerospace and automotive have requirements to archive their designs over a long period of time, so thorough documentation is key to understanding the BOM.

IP developers need to know who is using each IP block, and IP users need to be informed when any changes or updates have been made to an IP block. The legal department needs to know how each IP block is licensed, and how many instances of each block are actively used in designs. The design data management tool from Keysight is called SOS. A traceability report should show where each IP block is being used on a global scale, by version and by geography. If two versions of the same IP are referenced in the same project, then a conflict should be detected and reported.

IP by Geography

Storage Optimization

SoC designs continue to grow, so how the data is stored and accessed becomes an issue.

  Design                 # of Files   File Size
  12-bit ADC             25K          150GB
  Mixed-Signal Sensor    100K         250GB
  PDK                    300K         800GB
  Processor              500K         1,500GB

With a traditional storage approach there is a physical copy of the data per user, so for a team of five engineers there would be five copies of the data. Each new engineer grows the disk space linearly, requiring more network storage.

The Keysight SOS approach instead uses a centralized work area: design files in a user’s work area are symbolic links to a cache, except for the files to be edited. This optimizes the use of network storage, saving disk space for the team, and creating a new user work area is quite fast.

SOS storage
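The cache-plus-links scheme can be pictured with a small sketch: link read-only files into the work area and copy only what will be edited (a generic illustration of the concept, not SOS internals or its actual commands):

```python
import shutil
from pathlib import Path

def create_workarea(cache_root: str, work_root: str, files_to_edit: set[str]) -> None:
    """Populate a user work area: symlink read-only files to the shared cache,
    physically copying only the files the user intends to edit."""
    cache, work = Path(cache_root), Path(work_root)
    for src in cache.rglob("*"):
        if src.is_dir():
            continue
        rel = src.relative_to(cache)
        dst = work / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if str(rel) in files_to_edit:
            shutil.copy2(src, dst)   # private editable copy
        else:
            dst.symlink_to(src)      # near-zero-cost read-only view

# e.g. create_workarea("/proj/cache", "/home/alice/work", {"rtl/top.v"})
```

Five engineers then share one cached copy of the unchanged files instead of five full copies, which is where the disk savings in the table above come from.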

Team & Site Collaboration

Without remote sharing of IP blocks, your engineering team may be working on the wrong version of data, wasting time trying to locate the golden data, using stale data that is out of sync, or even handing off out-of-date data to another geography. Keysight recommends using labels as tags to communicate between team members, and also using tags to represent milestones in the IC design process. The following diagram shows a mixed-signal design flow with tags and labels being used to ensure that the correct versions are used by each engineer.

Mixed-signal design flow using tags and labels

Once the design methodology is established, each geography can work concurrently, sharing data through the repository and cache system. SOS supports automatically syncing data across sites, so there is fast access to data at each remote site. Even remote updates are performed quickly, just like at the primary site, as data traffic is reduced, and this approach also works in cloud-based EDA tool flows. Remote design centers and cloud users are both supported, as the data management is built in.

Integration

Over many years the Keysight SOS tool has been integrated with the most popular EDA software vendor flows.

  • MathWorks
  • Siemens
  • Synopsys
  • Keysight
  • Cadence
  • Silvaco
  • Empyrean

These are all native integrations, so the data management and version control are consistent across projects, groups and geographies. The SOS tool runs under Windows or Linux, has a web interface, and can also be run from the command line. Here’s what the SOS interface looks like to a Cadence Virtuoso user:

SOS commands in Cadence Virtuoso

Summary

Having an integrated data management tool within your favorite EDA flow will help your design team’s productivity, as it automatically syncs your data around the world to ensure that all members are accessing the correct IP blocks. Using a tagging methodology to promote data once it is completed communicates to everyone on the team what state each block is in. All of your IP reuse will now have traceability, making data audits easier.

Version control has gone beyond just the simple check-in, checkout and update cycles, as advanced flows need to also support variations of experiments or branches. The archived webinar is online now here.

Related Blogs


WEBINAR: Emulation and Prototyping in the age of AI
by Daniel Nenni on 09-27-2023 at 8:00 am


Artificial Intelligence is now the primary driver of leading-edge semiconductor technology, which means performance is at a premium and time to market is measured in billions of dollars of revenue. Emulation and prototyping have never been more important, and we are seeing some amazing technology breakthroughs, including a unified platform from Corigine.

How does an innovative and unified platform for prototyping and emulation deliver never-before-seen speeds, truly fast enough for system software development?

How is debug made possible with powerful capabilities that shorten validation times by orders of magnitude?

Is push-button automation for real? And how can scalability go from 1 to 128 FPGAs on the fly?

To answer these questions, please join the upcoming Corigine webinar. We will showcase the new MimicPro-2 platform, architected and designed from the ground up by pioneers of emulation and prototyping.

LIVE WEBINAR: Emulation and Prototyping in the age of AI
October 18, 9:00am PDT

The complexity of hardware and software content increases the need for faster emulation and prototyping capacity to achieve hardware verification and software development goals. Identifying the best balance of emulation and prototyping hardware capacity for SoC design teams is always challenging. This is why Corigine set out to build a unified prototyping and emulation platform from the start.

Corigine’s team, hailing from the days of Quickturn and having developed generations of emulation and prototyping products at the big EDA companies, has architected a new unified platform. The new platform breaks barriers in a space that has not kept pace with the needs of AI, processor and communications SoCs, which need software running at system performance levels, pre-silicon. And as the shift-left push shortens R&D cycles, enormous innovations in debug are necessary, with the kind of backdoor access and system-view transparency that is near-impossible with legacy emulation and prototyping platforms. The new Corigine platform will be unveiled in this webinar to showcase and demo what is achievable.

Why attend?

In this webinar, you will gain insights on:

  • What new levels are achievable for software and hardware teams for
    • Pre-silicon emulation performance
    • Debug capabilities as never seen before
    • Multi-user accessibility and granularity
  • What is next on the prototyping and emulation roadmap
LIVE WEBINAR: Emulation and Prototyping in the age of AI
October 18, 9:00am PDT
Speakers:

Ali Khan
VP of Business and Product Development at Corigine. Ali has over 25 years of experience building startups and running businesses, with particular expertise in the semiconductor sector. Before joining Corigine, he was Director of Product Management at Marvell, Co-Founder of Nexsi System, and Server NIC Product Manager at 3Com. Ali earned a bachelor’s degree in Electrical Engineering from UT Austin and an MBA from Indiana University.

Mike Shei
VP of Engineering at Corigine. Mike has over 30 years of experience with emulation and prototyping tools.

About Corigine
Corigine is a leading supplier of FPGA prototyping cards and systems that shift left R&D schedules. Corigine also delivers EDA tools, IP, and networking products, and has worldwide R&D centers and offices in the US, London, South Africa, and China. https://www.corigine.com/

Also Read:

Speeding up Chiplet-Based Design Through Hardware Emulation

Bringing Prototyping to the Desktop

A Next-Generation Prototyping System for ASIC and Pre-Silicon Software Development


Power Supply Induced Jitter on Clocks: Risks, Mitigation, and the Importance of Accurate Verification
by Daniel Nenni on 09-27-2023 at 6:00 am


In the realm of digital systems, clocks play a crucial role in synchronizing components and ensuring the smooth propagation of logic. However, clock accuracy can be significantly affected by power supply induced jitter. Jitter refers to the deviation in the timing of clock signals, in the presence of PDN noise, compared to ideal periodic timing. This article explores the risks associated with power supply induced jitter on clocks, strategies to mitigate its impact, and the crucial role of accurate verification in maintaining reliable clock performance.

Infinisim JitterEdge is a specialized jitter analytics solution designed to compute power supply induced jitter on clock domains containing millions of gates at SPICE accuracy. It computes both period and cycle-to-cycle jitter at all clock nets, for all transitions, using long, millisecond-scale power-supply noise profiles. Customers use Infinisim jitter analysis during physical design iterations and before final tape-out to ensure timing closure.
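The two metrics are defined over successive clock edges: period jitter is the spread of each period around nominal, and cycle-to-cycle jitter is the change between adjacent periods. A small worked sketch of the definitions (toy edge timestamps, nothing like JitterEdge's SPICE-level flow):

```python
import statistics

# Toy rising-edge timestamps (ns) for a nominally 1 ns clock disturbed by PDN noise.
edges = [0.000, 1.002, 1.998, 3.004, 3.997, 5.001, 6.003]

periods = [b - a for a, b in zip(edges, edges[1:])]
period_jitter_rms = statistics.pstdev(periods)                 # spread around mean period
c2c = [abs(p2 - p1) for p1, p2 in zip(periods, periods[1:])]   # adjacent-period change

print(f"mean period: {statistics.mean(periods):.4f} ns")
print(f"period jitter (RMS): {period_jitter_rms*1e3:.2f} ps")
print(f"cycle-to-cycle jitter (peak): {max(c2c)*1e3:.2f} ps")
```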

Understanding Power Supply Induced Jitter

Power supply induced jitter occurs when fluctuations or noise in the power supply voltage affect the timing of a clock signal. In digital systems, clock signals are typically generated by phase-locked loops (PLLs) or delay-locked loops (DLLs). PLL jitter is added to the PDN jitter to compute total clock jitter.

Risks of Power Supply Induced Jitter

  1. Timing Errors: The primary risk associated with power supply induced jitter is the introduction of timing errors. These errors can lead to setup and hold violations, resulting in synchronization errors between different components.
  2. Increased Bit Error Rates (BER): Jitter-induced timing errors can result in data transmission issues, leading to a higher BER in communication channels. This can degrade the overall system’s reliability and performance.
  3. Reduced Signal Integrity: Jitter can cause signal integrity problems, leading to crosstalk, data corruption, and other noise-related issues, especially in high-speed digital systems.
  4. Frequency Synthesizer Instability: In systems that rely on frequency synthesizers for clock generation, power supply induced jitter can cause the synthesizer to become unstable, leading to unpredictable system behavior.

Mitigating Power Supply Induced Jitter

To minimize the impact of power supply induced jitter on clocks, several mitigation strategies can be employed:

  1. Quality Power Supply Design: Implementing a robust and well-designed power supply system is crucial. This includes the use of decoupling capacitors, voltage regulators, and power planes to reduce noise and fluctuations in the supply voltage, as sketched after this list.
  2. Filtering and Isolation: Incorporate filtering mechanisms to remove high-frequency noise from the power supply. Additionally, isolate sensitive clock generation circuits from noisy power sources to limit the propagation of jitter.
  3. Clock Buffering and Distribution: Utilize clock buffers to distribute the clock signal efficiently and accurately. Proper buffering helps to isolate the clock signal from the original source, reducing the impact of jitter.
  4. Clock Synchronization Techniques: Implement clock synchronization techniques that enable multiple components to share a common reference clock, mitigating potential timing discrepancies.
  5. Minimize Load and Crosstalk: Reduce the capacitive load on clock distribution networks and minimize crosstalk between clock and data signals to maintain signal integrity.
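For the power-supply-design item above, a common starting point is the target-impedance estimate: keep the PDN impedance low enough that the worst-case transient current stays within the allowed supply ripple. A quick worked sketch with assumed numbers:

```python
# Classic PDN target-impedance estimate (all numbers assumed for illustration).
vdd = 0.75            # V, core supply
ripple_budget = 0.05  # 5% allowed supply noise
i_transient = 20.0    # A, assumed worst-case current step

z_target = (vdd * ripple_budget) / i_transient   # ohms
print(f"PDN target impedance: {z_target*1e3:.2f} milliohms")
# Decoupling capacitors, regulators and power planes are then chosen so PDN
# impedance stays below this target across the frequencies of interest.
```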

Importance of Accurate Verification

Accurate verification of power supply induced jitter is essential for several reasons:

  1. System Reliability: Accurate verification ensures that the system meets timing requirements, reducing the risk of errors and malfunctions caused by jitter-induced timing variations.
  2. Performance Optimization: By understanding the extent of jitter in the system, designers can optimize clock generation and distribution, maximizing performance without compromising reliability.
  3. Compliance with Standards: Many industries and applications have specific timing requirements, such as in telecommunications or safety-critical systems. Accurate verification ensures compliance with these standards.
  4. Cost and Time Savings: Early identification and mitigation of power supply induced jitter during the verification process save time and resources, preventing potential issues during product deployment.

Conclusion

Power supply induced jitter on clocks poses significant risks to the accurate operation of digital systems. Mitigation strategies, including quality power supply design, filtering, and proper clock distribution, are essential for reducing jitter’s impact. Accurate verification of power supply induced jitter is crucial to maintaining system reliability, optimizing performance, and ensuring compliance with industry standards. By understanding and addressing this challenge, designers can create more robust and dependable digital systems capable of meeting the demands of modern technology.

Characterization is a common technique in the analysis of clock jitter: measuring and quantifying the variations in a clock signal’s timing. Characterization describes the process of measuring and analyzing the behavior of a signal or component to understand its performance characteristics. In the context of clock jitter, characterization-based tools measure the statistical distribution of jitter values, determine key metrics such as RMS jitter and peak-to-peak jitter, and analyze how different factors in the design contribute to jitter.

For designs at 7nm and below, where sub-picosecond levels of jitter need to be identified, a more accurate approach is needed. Running a full circuit simulation at the transistor level, including parasitic interconnect, can provide SPICE-accurate jitter analysis and help identify sources of jitter and their impact. The jitter capability of Infinisim’s ClockEdge provides the accuracy needed to model clock jitter and its effects.

If you have questions feel free to contact the clock experts at Infinisim here: https://infinisim.com/contact/

Also Read:

Clock Verification for Mobile SoCs

CTO Interview: Dr. Zakir Hussain Syed of Infinisim

Clock Aging Issues at Sub-10nm Nodes


Fast Path to Baby Llama BringUp at the Edge
by Bernard Murphy on 09-26-2023 at 10:00 am


Tis the season for transformer-centric articles apparently – this is my third within a month. Clearly this is a domain with both great opportunities and challenges: extending large language model (LLM) potential to new edge products and revenue opportunities, with unbounded applications and volumes yet challenges in meeting performance, power, and cost goals. Which no doubt explains the explosion in solutions we are seeing. One dimension of differentiation in this race is in the underlying foundation model, especially GPT (OpenAI) versus Llama (Meta). This does not reduce to a simple “which is better?” choice it appears, rather opportunities to show strengths in different domains.

Llama versus other LLMs

GPT has enjoyed most of the press coverage so far but Llama is demonstrating it can do better in some areas. First a caveat – as in everything AI, the picture continues to change and fragment rapidly. GPT already comes in 3.5, 4, and 4.5 versions, Google has added Retro, LaMDA and PaLM2, Meta has multiple variants of Llama, etc, etc.

GPT openly aims to be king of the LLM hill both in capability and size, able from a simple prompt to return a complete essay, write software, or create images. Llama offers a more compact (and more accessible) model which should immediately attract edge developers, especially now that the Baby Llama proof of concept has been demonstrated.

GPT 4 is estimated to run to over a trillion parameters, GPT 3.5 around 150 billion, and Llama 2 has variants from 7 to 70 billion. Baby Llama is now available (as a prototype) in variants including 15 million, 42 million and 110 million parameters, a huge reduction making this direction potentially very interesting for edge devices. Notable here is that Baby Llama was developed by Andrej Karpathy of OpenAI (not Meta) as a weekend project to prove the network could be slimmed down to run on a single core laptop.

As a proof of concept, Baby Llama is yet to be independently characterized or benchmarked, however Karpathy has demonstrated ~100 tokens/second rates when running on an M1 MacBook Air. Tokens/second is a key metric for LLMs in measuring throughput in response to a prompt.
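Measuring the metric itself is simple: count generated tokens against wall-clock time, and divide by measured power draw for tokens/second/watt. A generic sketch (`generate_next_token` is a hypothetical stand-in for a model's decode step, and the power figure is assumed):

```python
import time

def generate_next_token(context: list[int]) -> int:
    # Hypothetical stand-in for one LLM decode step.
    return (context[-1] * 31 + 7) % 32000

def tokens_per_second(n_tokens: int = 1000) -> float:
    context = [1]
    start = time.perf_counter()
    for _ in range(n_tokens):
        context.append(generate_next_token(context))
    return n_tokens / (time.perf_counter() - start)

tps = tokens_per_second()
watts = 5.0  # assumed measured device power during the run
print(f"{tps:.0f} tokens/s -> {tps / watts:.0f} tokens/s/W")
```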

Quadric brings Baby Llama up on Chimera core in 6 weeks

Assuming that Baby Llama is a good proxy for an edge-based LLM, Quadric made the following interesting points. First, they were able to port the 15 million parameter network to their Chimera core in just 6 weeks. Second, this port required no hardware changes, only some (ONNX) operation tweaking in C code to optimize for accuracy and performance. Third, they were able to reach 225 tokens/second/watt, using a 4MB L2 memory, 16 GB/second DDR, a 5nm process and a 1GHz clock. And fourth, the whole process consumed 13 engineer-weeks.

By way of comparison, they ran the identical model on an M1-based Pro laptop running the ONNX runtime with 48MB RAM (L2 + system cache) and 200 GB/sec DDR, with a 3.3 GHz clock. That delivered 11 tokens/second/watt. Quadric aims to extend their comparison to edge devices once they arrive.

Takeaways

There are obvious caveats. Baby Llama is a proof of concept with undefined use-rights as far as I know. I don’t know what (if anything) is compromised in reducing full Llama 2 to Baby Llama, though I’m guessing for the right edge applications this might not be an issue. Also performance numbers are simulation-based estimates, comparing with laptop performance rather than between implemented edge devices.

What you can do with a small LLM at the edge has already been demonstrated by recent Apple iOS/macOS releases, which now support word/phrase completion as you type. Unsurprising – next word/phrase prediction is what LLMs do. A detailed review from Jack Cook suggests their model might be a greatly reduced GPT-2 at about 34 million parameters. Unrelated recent work also suggests value for small LLMs in sensing (e.g. for predictive maintenance).

Quadric’s 6-week port with no need for hardware changes is a remarkable result, important as much in showing the ability of the Chimera core to adapt easily to new networks as in the performance claims for this specific example. Impressive! You can learn more about this demonstration HERE.

Also Read:

Vision Transformers Challenge Accelerator Architectures

An SDK for an Advanced AI Engine

Quadric’s Chimera GPNPU IP Blends NPU and DSP to Create a New Category of Hybrid SoC Processor


Optimizing Shift-Left Physical Verification Flows with Calibre
by Peter Bennet on 09-26-2023 at 6:00 am


Advanced process nodes create challenges for EDA both in handling ever larger designs and increasing design process complexity.

Shift-left design methodologies for design cycle time compression are one response to this. And this has also forced some rethinking about how to build and optimize design tools and flows.

SemiWiki covered Calibre’s use of a shift-left strategy to target designer productivity a few months ago, focusing on the benefits this can deliver (the “what”). This time we’ll look closer at the “how” – specifically what Siemens call Calibre’s four pillars of optimization (these diagrams are from the Siemens EDA paper on this theme).

Optimizing Physical Verification (PV) means both delivering proven signoff capabilities in a focused and efficient way in the early design stages and extending the range of PV.

Efficient tool and flow Execution isn’t only about leading performance and memory usage. It’s also critical to reduce the time and effort to configure and optimize run configurations.

Debug in early stage verification is increasingly about being able to isolate which violations need fixing now and providing greater help to designers in quickly finding root causes.

Integrating Calibre Correction into the early stage PV flow can save design time and effort by avoiding potential differences between implementation and signoff tool checks.

Reading through the paper, I found it helpful here to think about the design process like this:

Current design

  • The portion of the design (block, functional unit, chip) we’re currently interested in
  • Has a design state, e.g. pre-implementation, early physical, near final, signoff

Design context

  • States of the other design parts around our current design

Verification intent

  • What we need to verify now for our current design
  • A function of current design state, context and current design objectives and priorities
  • Frequently a smaller subset of complete checks

We’ll often have a scenario like that below.

Sometimes we’ll want to suppress checks or filter out results from earlier stage blocks. Sometimes we might just want to check the top-level interfaces. Different teams may be running different checks on the same DB at the same time.

Verification configuration and analysis can have a high engineering cost. How do you prevent this from multiplying over the wide set of scenarios to be covered as the design matures? That’s the real challenge Calibre sets out to meet here: communicating a precise verification intent for each scenario, minimizing preparation, analysis, debug and correction time and effort.

Extending Physical Verification

Advanced node physical verification has driven some fundamental changes, both in how checks are made and in their increased scope and sophistication in the Calibre nmPlatform.

Equation-based checks (eqDRC), which express complex mathematical relationships in SVRF (Standard Verification Rule Format), are one good example, and one that emphasizes the importance of more programmable checks and of fully integrating both checks and results annotation into the Calibre tool suite and language infrastructure.
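The flavor of an equation-based check, rendered here as Python pseudocode rather than SVRF (the linear voltage-dependent spacing rule and its coefficients are invented for illustration):

```python
# Voltage-dependent spacing: the required space between two nets grows with
# their voltage difference (coefficients invented, not from any real rule deck).
def required_spacing_nm(delta_v: float, base_nm: float = 40.0,
                        k_nm_per_v: float = 15.0) -> float:
    return base_nm + k_nm_per_v * delta_v

pairs = [  # (net A voltage, net B voltage, drawn spacing in nm)
    (0.75, 0.0, 55.0),
    (1.80, 0.0, 60.0),
]
for va, vb, drawn in pairs:
    need = required_spacing_nm(abs(va - vb))
    verdict = "ok" if drawn >= need else f"VIOLATION (need {need:.2f} nm)"
    print(f"dV={abs(va - vb):.2f} V, drawn {drawn:.1f} nm -> {verdict}")
```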

PERC (programmable electrical rule checking) is another expanding space in verification that spans traditional ESD and latch-up to newer checks like voltage dependent DRC.

Then there are thermal and stress analysis for individual chips and 3D stacked packages and emerging techniques like curvilinear layout checks for future support.

The paper provides a useful summary diagram (in far more detail than we can cover here).

Improving Execution Efficiency

EDA tool configuration is a mix of top-down (design constraints) and bottom-up (tool and implementation settings) – becoming increasingly bottom-up and complex as the flow progresses. But we don’t want all the full time-consuming PV config effort for the early design checks in a shift-left flow.

Calibre swaps out the traditional trial-and-error config search for a smarter, guided and AI-enabled one which understands the designer’s verification intent. Designers might provide details on the expected state (“cleanliness”) of the design and even relevant error types and critical parts of a design, creating targeted check sets that minimize run time.

Some techniques used by Calibre are captured below.

Accelerating Debug

Streamlining checks for the design context usefully raises the signal-to-noise ratio in verification reports. But there’s still work to do in isolating which violations need addressing now (for example, a designer may only need to verify block interfaces) and then finding their root causes.

Calibre puts accumulated experience and design awareness to work to extract valuable hints and clues to common root causes – Calibre’s debug signals. AI-empowered techniques aid designers in analyzing, partitioning, clustering and visualizing the reported errors.

Some of Calibre’s debug capabilities are shown below.

Streamlining Correction

If we’re running Calibre PV in earlier design stages, why not use Calibre’s proven correct-by-construction layout modifications and optimizations from its signoff toolkit for the fixes, eliminating risks from potential differences between implementation and signoff tool checks? While Calibre is primarily a verification tool, it has always had some design fixing capabilities and is already tightly integrated with all leading layout flows.

But the critical reason is that layout tools aren’t always that good at some of the tasks they’ve traditionally been asked to do, whether that’s slowness in the case of filler insertion or lack of precision (since they don’t have signoff-quality rule checking), meaning either later rework or increased design margining.

An earlier SemiWiki article specifically covered Calibre Design Enhancer’s capabilities for design correction.

The paper shows some examples of Calibre optimization.

Summary

A recent article about SoC design margins noted how they were originally applied independently at each major design stage. As diminishing returns from process shrinks exposed the costly over-design this allowed, this forced a change to a whole process approach to margining.

It feels like we’re at a similar point with the design flow tools. No longer sufficient to build flows “tools-up” and hope that produces good design flows, instead move to a more “flow-down” approach where we co-optimize EDA tools and design flows.

That’s certainly the direction Calibre’s shift-left strategy is following, building on these four pillars of optimization.

Find more details in the original Siemens EDA paper here:

The four foundational pillars of Calibre shift-left solutions for IC design & implementation flows.


Power Analysis from Software to Architecture to Signoff
by Daniel Payne on 09-25-2023 at 10:00 am


SoC designs use many levels of design abstraction during their journey from ideation to implementation, and now it’s possible to perform power analysis quite early in the design process. I had a call with William Ruby, Director of Product Marketing for the Synopsys Low Power Solution, to hear what they’ve engineered across multiple technologies. Low-power IC designs that run on batteries need to meet battery life goals, which is achieved by analyzing and minimizing power throughout the design lifecycle. High-performance IC designs also need to meet their power specifications, and lowering power during early analysis can also allow for increased clock rates, which boosts performance further. Five EDA products from Synopsys each provide power analysis and optimization capabilities to your engineering team, from software to signoff.

Power-aware tools at Synopsys

The first EDA tool listed is Platform Architect, used to explore architectures and even provide early power analysis, before any RTL is developed, using an architectural model that your team can run different use cases on. With Platform Architect you can build a virtual platform for early software development and start verifying hardware performance.

Once RTL has been developed, an emulator like Synopsys ZeBu can be used to run actual software on the hardware representation. Following the emulation run, ZeBu Empower delivers power profiling of the entire SoC design, so that you know the sources of dynamic and leakage power quite early, before silicon implementation. These power profiles cover billions of cycles, and critical regions are quickly identified as areas for improvement.

Zebu Empower flow

RTL Power Analysis

RTL power analysis is run with the PrimePower RTL tool using vectors from simulation and/or emulation, or even without vectors for what-if analysis. Designers can explore and get guidance on the effects of clock-gating, memory, data-path and glitch power. The power analysis done at this stage is physically-aware, and consistent with signoff power analysis results.

PrimePower – Three Stages

Gate-level Power Analysis

Logic synthesis converts RTL into a technology-specific gate-level netlist, ready for placement and routing during the implementation stage. The golden power signoff is done on the gate-level design using PrimePower. Gate-level power analysis provides average power, peak power, glitch power, clock network power, dynamic and leakage power, and even multi-voltage power. Input vectors can come from RTL simulation or emulation, or the analysis can be run vectorless. The RTL-to-GDSII flow is provided by the Fusion Compiler tool, where engineers optimize their Power, Performance and Area (PPA) goals.
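The quantities these tools report roll up from the standard CMOS power relation: dynamic power P = α·C·V²·f plus leakage. A quick worked example with assumed numbers:

```python
# Standard CMOS power estimate (all numbers assumed for illustration).
alpha = 0.15       # average switching activity per clock cycle
c_total = 2.0e-9   # F, total switched capacitance
vdd = 0.75         # V, supply voltage
freq = 2.0e9       # Hz, clock frequency
i_leak = 0.5       # A, total leakage current

p_dynamic = alpha * c_total * vdd**2 * freq   # alpha * C * V^2 * f
p_leakage = vdd * i_leak
print(f"dynamic: {p_dynamic:.2f} W, leakage: {p_leakage:.2f} W, "
      f"total: {p_dynamic + p_leakage:.2f} W")
```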

Summary

Achieving energy efficiency from software to silicon is now a reality using the flow of tools and technologies provided by Synopsys. This approach takes the guesswork out of meeting your power goals prior to testing silicon, and has been proven by many design teams around the world. What a relief to actually know that your power specification has been met early in the design lifecycle.

Synopsys has a web page devoted to energy-efficient SoC designs, and there’s even a short overview video on low-power methodology. There’s also a White Paper, Achieving Consistent RTL Power Accuracy.

Related Blogs