Alphawave Semi Showcases 3nm Connectivity Solutions and Chiplet-Enabled Platforms for High Performance Data Center Applications
by Daniel Nenni on 05-04-2023 at 6:00 am


There were quite a few announcements at the TSMC Technical Symposium last week, but the most important, in my opinion, were based on TSMC N3 tape-outs. Not only is N3 the leading 3nm process, it is the only one in mass production, which is why all of the top-tier semiconductor companies are using it. TSMC N3 will be the most successful node in the history of the TSMC FinFET family, absolutely.

(Graphic: TSMC)

In order to tape out at 3nm you need IP, and high-speed SerDes IP is critical for HPC applications such as AI, which is now the big semiconductor driver for leading-edge silicon. Enabling chiplets at 3nm is also a big deal, and that is the focus of this well-worded announcement:

Successful launch of 3nm connectivity silicon brings chiplet-enabled custom silicon platforms to the forefront

(Graphic: Business Wire)

LONDON, United Kingdom, and TORONTO, Canada – April 25, 2023 – Alphawave Semi (LSE: AWE), a global leader in high-speed connectivity for the world’s technology infrastructure, today announced the bring-up of its first connectivity silicon platform on TSMC’s most advanced 3nm process with its ZeusCORE Extra-Long-Reach (XLR) 1-112Gbps NRZ/PAM4 serialiser-deserialiser (“SerDes”) IP.

An industry-first live demo of Alphawave Semi’s silicon platform with 112G Ethernet and PCIe 6.0 IP on TSMC 3nm process will be unveiled at the TSMC North America Symposium in Santa Clara, CA on April 26, 2023.

The 3nm process platform is crucial for the development of a new generation of advanced chips needed to cope with the exponential growth in AI-generated data, and enables higher performance, enhanced memory and I/O bandwidth, and reduced power consumption. ZeusCORE XLR Multi-Standard-SerDes (MSS) IP is the highest-performance SerDes in the Alphawave Semi product portfolio and on the 3nm process will pave the way for the development of future high-performance AI systems. It is a highly configurable IP that supports all leading-edge NRZ and PAM4 data center standards from 1 to 112 Gbps, supporting diverse protocols such as PCIe Gen1 to Gen6 and 1G/10G/25G/50G/100 Gbps Ethernet.

This flexible and customizable connectivity IP solution, together with Alphawave Semi’s chiplet-enabled custom silicon platform which includes IO, memory and compute chiplets, allows end-users to produce high-performance silicon specifically tailored to their applications. Customers can benefit from Alphawave Semi’s application-optimized IP subsystems and advanced 2.5D/3D packaging expertise to integrate advanced interfaces such as Compute Express Link (CXL™), Universal Chiplet Interconnect Express™ (UCIe™), High Bandwidth Memory (HBMx), and Low-Power Double Data Rate DRAM (LP/DDRx) onto custom chips and chiplets.

“We are thrilled to be one of the first companies to successfully demonstrate our highest performance silicon platform with our XLR 112G Ethernet and PCIe 6.0 SerDes IP on TSMC’s most advanced 3nm technology”, said Tony Pialis, CEO and co-founder of Alphawave Semi. “This represents a significant step forward in our execution of Alphawave Semi’s strategy to be a vertically-integrated semiconductor leader in high-speed connectivity. Thanks to our rapidly growing partnership with TSMC through the Open Innovation Platform® (OIP), we continue to deliver innovative, high-performance custom silicon and IP solutions to our customers in data center, compute, networking, AI, 5G, autonomous vehicles, and storage applications.”

“Alphawave Semi continues to see growing demand from our hyperscaler customers for purpose-built silicon with very high-speed connectivity interfaces, fueled by an exponential increase in processing of AI-generated data”, said Mohit Gupta, SVP and GM, Custom Silicon and IP, Alphawave Semi. “We’re engaging our leading customers on chiplet-enabled 3nm custom silicon platforms which include IO, memory, and compute chiplets. Our Virtual Channel Aggregator (VCA) partnership with TSMC has provided invaluable support, and we look forward to accelerating our customers’ high-performance designs on TSMC’s 3nm process.”

About Alphawave Semi

Alphawave Semi is a global leader in high-speed connectivity for the world’s technology infrastructure. Faced with the exponential growth of data, Alphawave Semi’s technology services a critical need: enabling data to travel faster, more reliably and with higher performance at lower power. We are a vertically integrated semiconductor company, and our IP, custom silicon, and connectivity products are deployed by global tier-one customers in data centers, compute, networking, AI, 5G, autonomous vehicles, and storage. Founded in 2017 by an expert technical team with a proven track record in licensing semiconductor IP, our mission is to accelerate the critical data infrastructure at the heart of our digital world. To find out more about Alphawave Semi, visit: awavesemi.com.

Also Read:

TSMC 2023 North America Technology Symposium Overview Part 1

Alphawave Semi at the Chiplet Summit

Alphawave IP is now Alphawave Semi for a very good reason!

High-End Interconnect IP Forecast 2022 to 2026

Using ML for Statistical Circuit Verification
by Daniel Payne on 05-03-2023 at 10:00 am


I’ve been following Solido since its start as an EDA vendor in 2005; they were acquired by Siemens in 2017. At the recent User2User event there was a presentation by Kwonchil Kang of Samsung Electronics on the topic, ML-enabled Statistical Circuit Verification Methodology using Solido. High-reliability circuits carry a high-sigma requirement, and 6 sigma equates to 10 failures per 10,135,946,920 samples, or simulations. Using multiple Process, Voltage and Temperature (PVT) corners requires even more simulations, and a brute-force approach that reaches high sigma by Monte Carlo simulation alone simply takes too much time.
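
For reference, that failure rate is just the one-sided tail probability of a standard normal distribution at 6 sigma. A minimal check in Python (assuming SciPy is available):

```python
# One-sided Gaussian tail probabilities for common sigma targets.
from scipy.stats import norm

for sigma in (3, 4.5, 6):
    p_fail = norm.sf(sigma)  # P(X > sigma) for a standard normal
    print(f"{sigma} sigma: p = {p_fail:.3e}, ~1 failure per {1 / p_fail:,.0f} samples")

# 6 sigma gives p ~ 9.87e-10, i.e. roughly 10 failures per 10.1 billion
# samples -- the figure quoted in the presentation.
```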

There is a reduced Monte Carlo approach that tries to scale to 6 sigma, but for a bandgap reference circuit example with 36 PVT corners it requires 3,000 simulations per PVT corner, or 108,000 simulations for all 36 corners, and accuracy is limited when long-tail or non-Gaussian characteristics are introduced.

The Solido approach uses Artificial Intelligence (AI) for variation-aware design and verification with Solido Variation Designer, and there are two components:

  • PVTMC Verifier – finds the worst-case corners for a target sigma and the design’s sensitivities to variation
  • High-Sigma Verifier – high-sigma verification 1,000X to 100,000,000X faster than brute-force simulation

There are several steps to the AI algorithm used in the Solido tools. The first step is to generate Monte Carlo samples, but not simulate them yet. Next, an initial batch of samples is simulated; all of the samples are then sorted and simulated in order. Simulating additional samples around the target sigma then captures the true yield.

Simulate samples around target sigma
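
Below is a minimal sketch of this ordered-sampling idea. It assumes the sorting step ranks unsimulated samples with a cheap surrogate model trained on the initial batch; a toy linear response and an ordinary least-squares fit stand in for the real simulator and for Solido’s proprietary ML model, and all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(x):
    """Stand-in for an expensive SPICE run: a toy linear circuit response."""
    return x @ np.ones(x.shape[-1]) / np.sqrt(x.shape[-1])

# Step 1: generate many Monte Carlo samples, but do not simulate them yet.
n_total, n_dims, n_init = 100_000, 20, 200
samples = rng.standard_normal((n_total, n_dims))

# Step 2: simulate a small initial batch and fit a cheap surrogate model.
y_init = simulate(samples[:n_init])
coef, *_ = np.linalg.lstsq(samples[:n_init], y_init, rcond=None)

# Step 3: rank the remaining samples by predicted severity and simulate
# them in order, worst-first, so the tail region is covered early.
order = np.argsort(samples @ coef)[::-1]
n_budget = 2_000                       # far fewer than n_total simulations
y_tail = simulate(samples[order[:n_budget]])

# Step 4: real simulations now cover the worst-case region, so tail yield
# at the target sigma is estimated without simulating all n_total samples.
spec = 4.0                             # hypothetical failure threshold
print("worst observed:", round(float(y_tail.max()), 2),
      "| failures found:", int((y_tail > spec).sum()))
```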

With this Solido AI approach, the resulting Probability Density Function (PDF) would look like the example below:

Probability Density Function

The dashed blue line is the verified PDF fit. Green dots are the initial samples, and dark dots the Monte Carlo results. The orange dots are ordered samples.

For the actual bandgap reference circuit described in the presentation, Solido Variation Designer achieved verification equivalent to 10 billion brute-force simulations in just 24,100 simulations, translating to a speed-up of some 415,000X.

PVTMC Verifier covers all PVT corners and runs Monte Carlo in a way that requires only a few hundred simulations per corner to capture the target sigma, reducing the total simulation count across the corners. The results are accurate because no extrapolations or Gaussian assumptions are used; real simulations are run at the target sigma. And it covers all PVTs in a single run of the tool.

PVTMC Verifier example results

Inside, PVTMC Verifier identifies ordinal classes for all PVTs, captures a distribution for each class, then verifies the distributions within the known classes. On the bandgap reference circuit described in the presentation, PVTMC Verifier ran a 6-sigma verification across all 36 PVTs in just 11,000 simulations, a speed-up of 32,000,000X compared to brute-force Monte Carlo.

The tool flow for Solido AI is that a circuit netlist is run through PVTMC Verifier, which selects the worst-case statistical points, simulates the samples at multiple scales, observes the response to a change in scale, then builds a model to predict the unscaled yield estimate. These first-pass results are then sent to High-Sigma Verifier, which runs initial samples until model building succeeds, uses AI to generate Monte Carlo samples, then runs tail samples until the result is verified.

Using the Solido AI methodology required only 300 simulations per PVT with PVTMC Verifier (10,800 simulations) plus 24,100 simulations with High-Sigma Verifier, for a total of 34,900 simulations. The accuracy matched brute-force Monte Carlo, yet the results completed some 10,000,000X faster.

Summary

At Samsung they are using Solido AI technology to achieve their high-sigma verification goals across IC applications, with much shorter run times than brute-force Monte Carlo simulation. They used PVTMC Verifier for first-pass results across all PVTs, then followed with High-Sigma Verifier for final verification on the critical, worst-case PVTs.



Gate Resistance in IC design flow
by Maxim Ershov on 05-03-2023 at 6:00 am


MOSFET gate resistance is a very important parameter, determining many characteristics of MOSFETs and CMOS circuits, such as:

• Switching speed
• RC delay
• Fmax – maximum frequency of oscillations
• Gate (thermal) noise
• Series resistance and quality factor in MOS capacitors and varactors
• Switching speed and uniformity in power FETs
• Many other device and circuit characteristics

Many academic and research papers have been written about gate resistance. However, for practical work of IC designers and layout engineers, many important things have not been discussed or explained, for example:

• Is gate resistance handled by SPICE models or by parasitic extraction tools?
• How do parasitic extraction tools handle gate resistance?
• How can one evaluate gate resistance from the layout or from extracted, post-layout netlist?
• How can one identify if gate resistance is limited by the “intrinsic” gate resistance (gate poly), or by gate metallization routing, and what are the most critical layers and polygons?
• Is the distributed gate effect (factors of 1/3 and 1/12 for single- and double-contacted poly) captured in the IC design flow (in the PDK)?
• Is vertical gate resistance component captured in foundry PDKs?
• Should the gate be made wider or narrower, to reduce gate resistance?
• What’s the difference between handling gate resistance in PDKs for RF versus regular MOSFETs or p-cells?

The purpose of this article is to demystify these questions, and to provide some insights for IC design and layout engineers to better understand gate resistance in their designs.

Gate resistance definition and measurement

Gate resistance is an “effective” resistance from the driving point (gate port, or gate driver) to the MOSFET gate instance pin(s) – see Figure 1. (An instance pin is the connection point between a terminal of the SPICE model and the resistive network of a net.)

Figure 1. MOSFET cross-section and schematic illustration of gate resistance.

However, the simplicity of the schematic in Figure 1 can be very misleading. Gate nets can be very large, containing multiple driving points, many (dozens of) metal and via layers, millions of polygons, and up to millions of gate instance pins (connection points for device SPICE model gate terminals) – see Figure 2.

Figure 2. Schematic illustration of the top-view and cross-sectional view of MOSFET gate network

The gate network forms a large distributed system, with one or several driving points and many destination points.

Very often, a gate net looks and behaves like a huge, regular clock network, distributing the gate voltage to the FETs.

Deriving an equivalent, effective gate resistance for such a large and complex system is not a simple, straightforward task, and SPICE circuit simulation does not explicitly report a gate resistance value.

Knowing the value of gate resistance is very useful for estimating switching speed, delay, noise, Fmax, and other characteristics, to see whether they are within spec. Knowing the contributions to the gate resistance – by layer and by layout polygon – is also very useful for guiding layout optimization efforts.

Gate resistance handling by parasitic extraction tools

To understand gate resistance in IC design flow, it’s important to know how parasitic extraction tools treat and model it.

All industry-standard parasitic extraction tools handle gate resistance extraction in a similar way. In layout, the MOS gate structure is represented by a 2D mask traditionally called “poly” – even though the material can be a complex gate metal stack with a complex 3D structure.

They fracture the poly line at the intersection with the active (diffusion) layer, breaking it into “gate poly” (poly over active) and “field poly” (poly outside active), as shown in Figure 3.

Figure 3. R and RC extraction around MOSFET gate.

Gate poly is also fractured at its center point, and the gate instance pin of the MOSFET (SPICE model) is connected to that center point. The gate poly is described by two parasitic resistors connecting the fracture points. A more accurate model of the gate poly, with two positive resistors and one negative resistor, can be enabled in the PDK, but some foundries prefer not to use it (see the Delta Gate Model section below).

Parasitic resistors representing the field poly are connected to the gate contacts or to MEOL (Middle-End-Of-Line) layers and further to upper metal layers.

MOSFET extrinsic parasitic capacitance between the gate poly and the source/drain diffusion and contacts is calculated by parasitic extraction tools and assigned to nodes of the resistive network. Different extraction tools do this differently – some connect these parasitic capacitances to the center point of the gate poly, while others connect them to the end points of the gate poly resistors. The details of how parasitic capacitance connects to the gate resistor network can have a significant impact on transient and AC response, especially in advanced nodes (16nm and below), where gate parasitic resistance is huge.

These details can be seen in the DSPF file, but are not usually discussed in the open literature or in foundry PDK documentation. Visual inspection of text DSPF files is tedious and requires some expertise. Specialized EDA tools (e.g., ParagonX [3]) can be used to visualize RC network connectivity for post-layout netlists (DSPF, SPEF), probe them (inspect R and C values), perform electrical analysis, and do other useful things.

Delta gate model

The MOSFET gate forms a large distributed RC network along the gate width, as shown in Figure 4.

This distributed network has a different AC and transient response than a simple lumped one-R, one-C circuit. It was shown [2-3] that such an RC network behaves approximately the same as a network with one R and one C element, where C is the total capacitance and R = 1/3 · (W/L) · rsh for single-side-connected poly, or R = 1/12 · (W/L) · rsh for double-side-connected poly. These coefficients – 1/3 and 1/12 – enable an accurate reduced-order model of the gate, collapsing a large number of R and C elements into two (or three) resistors and one capacitor.
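
These coefficients can be sanity-checked numerically by discretizing the gate into an RC ladder and looking at the real part of the low-frequency input impedance. A minimal sketch with hypothetical component values (not from any PDK):

```python
import math

def ladder_zin(R_total, C_total, n_seg, omega, double_sided=False):
    """Input impedance of a gate modeled as n_seg series-R segments with a
    shunt C at each node -- a discrete approximation of the distributed line."""
    if double_sided:
        # Driving both ends is two half-width lines (R/2, C/2 each,
        # open-circuited at the middle) in parallel.
        return ladder_zin(R_total / 2, C_total / 2, n_seg, omega) / 2
    r = R_total / n_seg
    zc = 1 / (1j * omega * (C_total / n_seg))
    z = r + zc                        # last segment, open-circuited far end
    for _ in range(n_seg - 1):
        z = r + (z * zc) / (z + zc)   # add the next segment toward the driver
    return z

R, C = 100.0, 1e-15                   # hypothetical total gate R and C
w = 2 * math.pi * 1e9                 # well below the RC corner frequency
print("single-sided:", round(ladder_zin(R, C, 500, w).real, 2), "vs R/3 =", round(R / 3, 2))
print("double-sided:", round(ladder_zin(R, C, 500, w, True).real, 2), "vs R/12 =", round(R / 12, 2))
```

The small residual difference between the computed values and R/3 or R/12 is just the discretization error of the 500-segment ladder.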

To enable these coefficients in a standard RC netlist (SPICE netlist or DSPF), some smart folks invented the so-called Delta Gate Model, in which the gate is described by two positive resistors and one negative resistor – see Figure 5.

Figure 5. MOSFET Delta gate model.

Some SPICE simulators have problems handling negative resistors, which is possibly why this model has not gained wide adoption. Some foundries and PDKs support the delta gate model, while others don’t.

Many people are surprised when they see negative resistors in DSPF files. If these resistors are next to the gate instance pin, they are part of the gate delta circuit.

Distributed effects along the gate length (in the direction from source to drain) are usually ignored at the circuit analysis level, because the gate length is small compared to the gate width.

Impact of interconnect parasitics on gate resistance

In “old” technologies, metal interconnects (metals and vias) had very low resistance, and gate resistance was dominated by the gate poly, making its analysis and calculation very simple.

In the latest technologies (16nm and below), interconnects have very high resistance and can contribute a significant fraction (50% or more) of the gate resistance. Depending on the layout, gate resistance may have significant contributions from any layer – device (gate poly, field poly), MEOL, or BEOL.

Figure 6 shows the results of a gate resistance simulation using ParagonX [3]. A Pareto chart of resistance contributions by layer helps identify the most important layers for gate resistance, and visualizing the contributions of individual layout polygons immediately points to the choke points and bottlenecks, which is very useful for guiding layout optimization efforts.

Figure 6. Simulation results of gate resistance: (a) Gate resistance contribution by layer, and (b) contribution by polygons shown by color over the layout.
Gate resistance in FinFETs

In planar MOSFETs, the gate has a very simple planar structure, and the current flow in the gate is one-dimensional, along the direction of the gate width.

In FinFET technologies, the gate wraps around very tall silicon fins and hence has a very complicated 3D structure. Further, the gate material is selected based on its work function, to tune the threshold voltage (in FinFETs the threshold voltage is tuned not by channel doping but by the gate materials). These materials have very high resistance, much higher than silicided poly (which has a typical sheet resistance of ~10 Ohm/sq). The gate may be formed by multiple layers – an interface layer with the silicon, and one or more layers above it.

However, all these details are abstracted away from IC designers and layout engineers, who see the usual polygons for “poly” and “active” – which makes design work much easier.

Handshake between SPICE model and parasitic extraction

In general, both SPICE models and parasitic extraction tools take gate resistance into account. Parasitic extraction is considered the more accurate way to calculate parasitic R and C values around the devices, since it “knows” about the layout, unlike the SPICE model.

To avoid double-counting parasitic resistance and capacitance (in the SPICE model and in parasitic extraction), there is a handshake mechanism between SPICE modeling and parasitic extraction, based on special instance parameters.

Regular device vs RF Pcell compact models

Regular MOSFET SPICE models do not describe gate resistance accurately enough for high frequencies, high switching speeds, or RF and noise performance. To enable high simulation accuracy, foundries usually recommend using RF P-cells, which have fixed sizes, contain shielding (guard rings and metal cages), and are described by high-accuracy models derived from measurements. However, these RF P-cells have a much larger area than standard MOSFETs, and many designers prefer standard MOSFETs to reduce area.

Vertical component of gate resistance

In “old” technologies (pre-16nm), gate resistance was dominated by the lateral resistance. In advanced technologies, however, multiple interfaces between the gate material layers lead to a large vertical gate resistance. This resistance is inversely proportional to the area of the gate poly. It can be modeled as an additional resistor connecting the gate instance pin to the center point of the gate poly – see Figure 7(a). As a result, as the gate gets narrower (fewer fins), gate resistance initially goes down, but then increases again at very small gate widths, displaying the characteristic non-monotonic behavior seen in Figure 7(b). The old rule of thumb that “the narrower gate has lower gate resistance” no longer works, and designers and layout engineers have to select an optimum (non-minimal) gate width (number of fins) to minimize gate resistance.

Figure 7. (a) Gate model accounting for vertical gate resistance, and (b) measured and simulated gate resistance versus number of fins (ref. [2]).
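
This trade-off is easy to reproduce with a toy model: the lateral term grows with gate width while the vertical term shrinks with it, so their sum has a minimum at an intermediate width. All constants below are illustrative, not from any PDK:

```python
import numpy as np

rsh, L = 20.0, 0.02        # hypothetical sheet resistance (ohm/sq) and gate length (um)
rho_v = 2.0                # hypothetical vertical specific resistance (ohm*um^2)

W = np.linspace(0.05, 2.0, 400)          # candidate gate widths (um)
R_lat = (1.0 / 3.0) * rsh * W / L        # single-side-contacted distributed gate
R_vert = rho_v / (W * L)                 # vertical term, inversely prop. to gate area
R_total = R_lat + R_vert

i = int(np.argmin(R_total))
print(f"minimum R_gate ~ {R_total[i]:.0f} ohm at W ~ {W[i]:.2f} um")
# Below this width the vertical term dominates and R_gate rises again, so
# the old "narrower is better" rule of thumb breaks down.
```
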
Depending on the technology, the PDK, and the foundry, the vertical gate resistance may or may not be included in parasitic extraction. It is easy to check in the DSPF file: if the gate instance pin is connected directly to the center of the gate poly, vertical resistance is not accounted for; if it is connected to the center of the gate poly through a positive resistor, that resistor represents the vertical gate resistance.

Technology trends

With technology scaling, both gate resistance and interconnect resistance increase significantly – by up to one or two orders of magnitude. As a result, layout details that were unimportant for gate resistance in older nodes become very important in advanced nodes.

Other MOSFET gate-like structures

While the discussion of gate resistance in this article is focused on MOSFETs, the same arguments and approaches apply to other distributed systems controlled by a gate or gate-like electrode, such as:
• IGBTs (Insulated Gate Bipolar Transistors)
• Decoupling capacitors
• MOS capacitors
• Varactors
• Deep trench and other MIM-like integrated capacitors

Figure 8 shows a gate structure of a vertical MOSFET, and gate delay distribution over the device area, simulated using ParagonX [3].

Figure 8. (a) Typical layout of vertical FET, IGBT, and other gate-controlled devices. (b) Distribution of gate resistance and delay over area.
References

1. B. Razavi, et al., “Impact of distributed gate resistance on the performance of MOS devices,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 41, pp. 750-754, Nov. 1994.
2. A. J. Scholten et al., “FinFET compact modelling for analogue and RF applications,” IEDM 2010, p. 190.
3. ParagonX User Guide, Diakopto Inc., 2023.

Also Read:

Bizarre results for P2P resistance and current density (100x off) in on-chip ESD network simulations – why?

Your Symmetric Layouts show Mismatches in SPICE Simulations. What’s going on?

Fast EM/IR Analysis, a new EDA Category


Achieving Optimal PPA at Placement and Carrying it Through to Signoff
by Kalar Rajendiran on 05-02-2023 at 10:00 am


Performance, Power and Area (PPA) metrics are the driving force in the semiconductor market and impact every electronic product that is developed. PPA tradeoff decisions are not engineering decisions, but rather business decisions made by product companies as they decide to enter target end markets. As such, the sooner a company knows whether a certain PPA can be achieved, the better for business planning alongside the chip development work. The worst outcome is to realize many months into the development phase that the desired PPA cannot be achieved. Hence, companies seek to establish optimal PPA for a chip as early as possible in the development process. Placement happens to be such a stage, as the physical elements have more or less been finalized by then. But achieving that goal is not simple, given the several PPA-related challenges that must be addressed for a successful design.

What if there were a way to achieve optimal PPA at the placement stage and carry it through to signoff, in spite of the above-mentioned challenges? Siemens EDA’s digital implementation solution Aprisa promises to deliver that benefit and more. This was substantiated by MaxLinear, a Siemens EDA customer, during the recent User2User conference. Ravi Ranjan, MaxLinear’s Director of Physical Design, presented proof points based on real-life experience with N16 and N5 process-based design implementations. MaxLinear plans to extend its adoption of Aprisa to new process nodes on future projects.

Excellent Correlation

One common reason for changing the placement after routing is to fix timing violations. The design needs to meet the required timing constraints while minimizing delay and maximizing clock frequency, which requires careful optimization of placement and routing to ensure that the critical paths are optimized and the timing constraints are met. Another reason is to optimize power consumption: the design needs to minimize power while still meeting performance requirements, using careful power optimization techniques such as clock gating, power gating, and voltage scaling. Yet another reason is to address routing congestion: too many wires or interconnects routed through a limited space result in routing difficulties or a suboptimal routing solution.

Closure is the process of meeting all design requirements, such as timing, power, and area, while also ensuring that the design is manufacturable.

Excellent correlation of timing, latency and skew through the placement, clock tree synthesis (CTS) and routing stages is indicative of placement-stage PPA carrying through to successful routing. As an example, the following figure shows a pre-route vs. post-route signal net length and clock net length correlation sample from an N16-based design.

Automated Flow Setups

Place and route (P&R) tools and methodologies typically need to be adapted for each new technology node to achieve the best PPA for a target process, because each process technology has unique characteristics that can significantly impact the P&R process. To achieve the best results, P&R tools need to be specifically calibrated and optimized for a process node. This step typically calls for engineering expertise and prior experience, and involves trial and error during adoption.

Aprisa’s FlowGen capability reduces the effort needed to set up for a new technology and to adapt the design flow. MaxLinear found it very easy to adapt their flow when moving from N16 to N5 designs. Aprisa FlowGen supports a wide range of design types, including SoC, CPU, timing-critical and congestion-critical designs.

Summary of Aprisa Benefits

  • Placement stage optimal PPA maintained through to signoff
  • Reduced total turn-around-time (TAT)
  • Reduced ECO iterations
  • Ease of adoption to new process nodes

For more details, visit the Aprisa product page.

Also Read:

Mitigating the Effects of DFE Error Propagation on High-Speed SerDes Links

Hardware Root of Trust for Automotive Safety

Siemens EDA on Managing Verification Complexity


Anirudh Keynote at Cadence Live
by Bernard Murphy on 05-02-2023 at 6:00 am


Anirudh is an engaging speaker with a passion for technology. Acknowledging the sign of the times, he sees significant value-add in AI but reminded us that it is still a supporting actor in system design and other applications, where the star roles will continue to be played by computational software founded in hard science, math, and engineering. This is Cadence’s singular focus—continuing to advance computational software methods in EDA and other domains while leveraging AI techniques where appropriate.

Market drivers

I was talking to an analyst recently who thought that because manufacturing activity is down, semiconductor design must also be suffering. So, I’m not surprised that Anirudh kicked off the discussion with a nod to this being a tough year for semi revenues following multiple years of massive growth, attributing the correction to over-stuffed inventories.

And yet (maybe someday analysts will understand this), design continues to be strong, in part because design cycles are much longer than manufacturing cycles, in part because no one can afford to come out of a downturn without new products ready to launch, and in part because design starts continue to grow as systems companies, now delivering 45% of Cadence business, are accelerating their own design activity.

Anirudh believes that the semi-industry will reach $1 trillion in revenue by the end of the decade, a 2X growth from today, and that electronics and related systems will reach $3 trillion in revenue in a similar period. Manufacturing may be in a slump right now, but we already have an appetite for hyperscale computing, 5G, autonomous vehicles, AI and industrial IoT. That appetite won’t disappear, so manufacturing will surge back at some point. The winners at that point will be companies ready with new designs. Cadence is very optimistic about the long-term tailwinds behind design for such products.

Computational software and AI

Computational software is all about precision at massive scale, both in the complexity of the object to analyze (billions, soon trillions, of transistors) and in the nature of the analysis (PPA optimization and/or multi-physics). Foundational methods are grounded in well-established hard science: finite element analysis with origins in the late 1800s, Maxwell’s electromagnetic equations from a similar period, thermal diffusion first described by Fourier even earlier, and so on. The EDA industry has been developing these technologies for at least the last 50 years.

In contrast, AI is all about probabilistic inferencing, delivering impressive responses with, say, 97% certainty, in some cases better than we can manage. But, at the same time, we don’t want to hear that in 97% of cases our car won’t crash, or the robot surgeon won’t make a mistake. We want the precision and reliability of computational software in building and analyzing systems with AI as a layer on top to help us explore more implementation options.

Developing and maintaining that technology is not cheap. Cadence has about 10,000 people, with 65% in R&D and 25% in customer support—90% of its staff in engineering, which is comforting to know. Designs built using these technologies will be reliable, safe, secure, and eco-friendly. But how does that scale? By 2030, designs are expected to grow by at least 10X in transistor count. Technology companies are already struggling to add staff, and none can afford to grow staffing by 10X. We need to become even more efficient by abstracting architecture design to higher levels, parallelizing even more, and relying more on AI-assisted decision-making.

AI in Cadence products

Reinforcement learning has become a dominant technique for optimization in EDA. One significant advantage is that it doesn’t require gradient-based estimation to find good search directions. Gradient methods work well when optimization metrics vary relatively smoothly and can be computed quickly, but not when they vary rapidly or take hours to compute after each change. Cadence has been talking over the past couple of years about advances in AI with the delivery of products like Cadence Cerebrus, Verisium and Optimality, which all use reinforcement learning over multiple runs to guide optimization. These are all cases where computing metrics with precision may take hours, making reinforcement learning essential to advance optimization options.
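
A toy illustration of that point (in no way Cadence’s implementation): when each metric evaluation is an expensive, noisy black box, a sample-based search can still make progress where finite-difference gradients would be swamped by noise:

```python
import numpy as np

rng = np.random.default_rng(1)

def ppa_metric(params):
    """Stand-in for a full synthesis/P&R run: hours per evaluation in real
    life, noisy and non-smooth, so numerical gradients are meaningless."""
    return float(np.sum((params - 0.3) ** 2) + 0.05 * rng.standard_normal())

# Keep-the-best random perturbation: propose, score, keep if improved.
# RL-based flows are far more sophisticated, but share the key property
# of needing only metric *samples*, never gradients.
best_x = rng.uniform(-1, 1, size=8)
best_y = ppa_metric(best_x)
for _ in range(200):
    candidate = best_x + 0.1 * rng.standard_normal(8)
    y = ppa_metric(candidate)
    if y < best_y:
        best_x, best_y = candidate, y
print(f"best metric after 200 evaluations: {best_y:.3f}")
```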

Evidently results are impressive, as judged by numbers Anirudh shared. There have already been 180+ Cadence Cerebrus tapeouts.

Last month, Cadence announced Virtuoso Studio, covered in more detail by my colleague Daniel Payne. Briefly, this offers more place-and-route support in analog and a claimed 3X productivity advance for designers. There’s more support for heterogeneous integration in 2.5D/3D packaging, adding analog and RF into the same package. It also includes integrations with the digital design tool suite, integrity analysis, multi-die packaging, AWR analysis and multi-physics analysis for thermal, electromagnetics, etc.

Cadence also recently announced Allegro X AI for PCB and package design, which automates placement and routing and reduces physical layout and analysis challenges. For 3D-IC, Cadence offers Integrity, whose development started in Allegro back in 1995, long before most of us had even heard of chiplets (remember system-in-package and modules?). Around 2015, more capabilities were developed, though the industry was still not quite ready. More recently, Cadence has been working very closely with its foundry partners to refine Integrity support, leading to their latest AI-driven 3D-IC solution.

Onward and upward

It’s easy to see the computational software focus in everything described above—from chip design and analysis to package and system design and analysis. Where is Cadence going with some of its recent acquisitions? In May of last year, Cadence announced a partnership with the McLaren Formula 1 team, which is looking to its Fidelity CFD software to optimize the aerodynamics of McLaren’s race cars.

Cadence acquired Future Facilities about a year ago. They provide electronics cooling analysis and energy performance optimization for data center design and operations using physics-based 3D digital twins. Just to prove they are even more versatile, only a few days ago Cadence announced a partnership with the 49ers to evaluate ways to optimize energy efficiency and sustainability at Levi’s Stadium.

Last year, Cadence also made an investment in molecular sciences company, OpenEye Scientific. Anirudh is very excited about this, seeing huge synergy in simulating molecules. He sees (of course) significant similarities between OpenEye simulation and the Cadence Spectre platform with physics models for molecules looking rather like BSIM models for circuit simulation!

Energizing stuff. I look forward to next year’s update.

Also Read:

Petri Nets Validating DRAM Protocols. Innovation in Verification

More Software-Based Testing, Less Errata

What’s New with Cadence Virtuoso?


Formal-based RISC-V processor verification gets deeper than simulation
by Don Dingee on 05-01-2023 at 10:00 am


The flexibility of RISC-V processor IP allows much freedom to meet specific requirements – but it also opens the potential for many bugs created during the design process. Advanced processor features are especially prone to errors, increasing the difficulty and time needed for thorough verification. Born out of necessity, Codasip has pioneered a formal-based RISC-V processor verification process, bringing higher-level simulation and formal verification methods to efficiently verify core modifications, as shared in their latest technical paper.

Digital logic simulation only scratches the surface of complexity

Simulation seems like an ideal solution for digital logic, and it does shake out many bugs in a first pass that runs quickly. It’s logical to think that by simulating functional blocks individually, combining them into a complete design, and simulating it again, coverage would be pretty good.

As it turns out, pretty good is not nearly good enough. Simulation is only as good as the models and conditions applied – and in a complex RISC-V setting, uncertainty prevails. What software is running, which asynchronous events occur, and how the processor transitions from state to state in every possible scenario can be extremely time-consuming to model for simulation. If a simulator stumbles on a problematic corner case, it may be only sheer luck.
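
Some rough arithmetic illustrates why. If a bug requires one specific alignment of pipeline state, stalls, and asynchronous events with a small per-cycle probability, the expected number of random hits stays tiny even over long runs (the numbers below are purely illustrative):

```python
# Expected hits of a rare corner case under random simulation.
p = 2 ** -32          # one specific 32-bit alignment of conditions
cycles_per_sec = 1e4  # a plausible RTL simulation throughput
seconds = 24 * 3600   # one full day of simulation
expected_hits = cycles_per_sec * seconds * p
print(f"expected corner-case hits per day: {expected_hits:.2f}")  # ~0.20
# Formal analysis sidesteps the lottery entirely by reasoning over all
# reachable states rather than sampling them.
```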

L31 block diagram. Source: Codasip

A good example is a branch predictor, such as the block shown in the Codasip L31 RISC-V processor IP. A myriad of states needs to be set up to completely simulate how the branch predictor behaves in all cases of caching and pipeline execution.

Codasip uses the word “impossible,” but covering the entire state space in simulation is at least impractical, given finite engineering resources and time limits. They see simulation as an essential but not a standalone tool in the formal-based RISC-V processor verification process.

Combining techniques for better coverage in less time

Codasip outlines several advantages of applying formal verification as a complement to simulation.

  • Design behaviors are specified using assertions. Assertions are used both in simulation and in formal. In formal, however, a mathematical analysis can sometimes uncover corner cases that the random walk of simulation has a low probability of hitting.
  • Formal can also tackle non-functional verification such as clock-gating, X-propagation, and clock domain crossing.
  • Formal verification handles investigations where debugging information is limited, providing a way to classify and identify bugs for detailed analysis.

Manually writing assertions remains a problem, especially if one has to write all the assertions for a complex design from scratch. The nice thing about RISC-V being an open architecture is that many people are looking at its challenges, including verification. If a team could build on existing assertions from the RISC-V community and incrementally define new ones for their adjustments and additions, it would speed up formal verification.

Codasip has gone one step further, grabbing a golden RISC-V instance. The Siemens OneSpin Processor Verification App consists of a single-cycle execution model of the RISC-V ISA and a set of template assertions to check a design against this model. Codasip has instantiated this model together with the design and specialized the assertions to “connect” the two, taking care of the pipeline and other specifics.

End-to-end verification flow for the 3-stage L31 core (when the verified instruction I is neither stalled nor flushed). Source: Codasip.

Automatically is the operative word. In this approach, users can specify RISC-V parameters and extensions, extract design data, add customizations like new instructions, and move to verification without spending time writing assertions manually or trying to split the design into smaller chunks. Codasip has demonstrated verification of the full L31 at once in this approach.

Cutting formal-based RISC-V processor verification down to two hours

The punchline of this technical paper is that Codasip reduced the time to verify the L31 core from thousands of simulation hours – assuming a simulation scenario could even be generated, which could take person-years of effort – to a full proof in two hours of run time after setup.

And they didn’t skip steps. They outline three complex corner cases where the approach spotted bugs they didn’t anticipate from simulation. A branch predictor corruption arose when an undefined instruction generated an exception. Delays and pipeline stalls in the right combination caused multiple executions of the same instruction. And a legal FENCE.I instruction was incorrectly marked as illegal.

Codasip uncovered 15 bugs in the formal-based RISC-V processor verification effort for the L31 that their simulations and other steps missed. Design quality increased, and the process laid the path to faster verification of future customizations for their customers.

To get the whole story, download the Codasip technical paper (registration access for full text):

A formal-based approach for efficient RISC-V processor verification


Podcast EP159: The Fourth Quarter ESDA Market Data Report with Dr. Walden Rhines
by Daniel Nenni on 05-01-2023 at 8:00 am

Dan discusses the recent Q4 2022 ESDA report with Wally Rhines, Executive Sponsor of the SEMI Electronic Design Market Data report. Overall, the report portrays strong growth and positive results. Dan explores some anomalies in the data and possible reasons with Wally.

Dan also discusses some very interesting events at the upcoming Design Automation Conference with Wally. You won’t want to miss those.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Petri Nets Validating DRAM Protocols. Innovation in Verification
by Bernard Murphy on 05-01-2023 at 6:00 am


A Petri nets blog scored highest in engagement last year. This month we review the application of the technique to validating an expanding range of JEDEC memory standards. Paul Cunningham (Senior VP/GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.

The Innovation

This month’s pick is Fast validation of DRAM protocols with timed petri nets. The authors presented the paper at the 2019 MEMSYS conference and are from Fraunhofer and TU Kaiserslautern in Germany.

JEDEC standards for memory protocols describe the complexities of command behaviors and timing through a mix of state machine diagrams, tables, and timing diagrams. Validating an implementation through simulation depends on creating meaningful tests and checks through manual comparison with the standard document. JEDEC itself acknowledges that this reference – a combination of FSMs, tables and timing diagrams – is not fully complete, which makes automated test generation problematic. This paper uses timed Petri nets to provide a full model of the DRAM states, the logical command dependencies and the internal timing dependencies of the system under test, from which a complete SystemC reference model can be automatically generated and used as a reference with popular DRAM simulators for verification.

In addition to the value of the ideas, this paper provides a useful intro to the mechanics of DRAM operation for novices like me!

Paul’s view

This is an easy-to-read, self-contained paper, providing a great example of the opportunity for domain-specific languages in design and verification. The authors tackle the problem of verifying DDR interfaces, building on prior art describing the DDR3 protocol using a Petri net, a wonderful graph-based notation for visually representing interactions between concurrent state machines.

The authors’ key contribution is to upgrade this prior art to a “timed” Petri net containing additional arcs and annotations to model the timing dependencies between commands in the DDR protocol standard. They create a nice little textual language, DRAMml, for describing these timed Petri nets, which is able to represent the complete DDR3 command protocol on one page. Nice!

They also develop a compiler for DRAMml to generate simulatable SystemC code, which can serve as a “golden” reference model for verification. As the final icing on the cake, they use their golden model to find a bug in DRAMSys, a popular, well-cited DRAM simulator in the literature. I would be really interested to see whether this work could be applied to other protocols such as PCIe or Ethernet.

Raúl’s view

This is the first paper on the validation of dynamic memory controllers for DDR that we have reviewed in this series. JEDEC has issued 19 standards since the original DDR in 2000, including DDR5 and HBM3. It is easy to see that the specification of such memories – with multiple banks, 2^12 to 2^18 rows where each row can store 512 B to 2 KB of data, plus precharge and refresh requirements – can get very complex, particularly regarding the timing requirements. As an example, the authors cite that modeling the complete state space of a DDR4 with 16 banks requires 65,536 states with over a billion transitions.

To simplify and formalize memory specifications, the paper builds models using extended Petri Nets. Petri nets are bipartite graphs with places and transitions connected by weighted directed arcs (section 3.2). They are extended by:

  • -> [t1, t2] – a timed arc with guard [t1, t2], meaning the transition can only fire within that time interval
  • ->> – a reset arc, which clears a place of all tokens
  • -o – an inhibitor arc, which prevents a transition from firing

With such extensions, Petri nets become as powerful as Turing machines. These Petri nets model DRAMs with reasonable complexity, e.g. 4 places and 8 transitions, plus a place and 6 transitions per bank. Power can be modeled directly; timing gets a bit more complicated, also requiring timing dependencies between transitions.
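
To make the timed-arc idea concrete, here is a minimal sketch of such a firing rule. The place names and the tRCD value are illustrative only, and the paper’s DRAM nets and DRAMml language are far richer:

```python
from dataclasses import dataclass, field

@dataclass
class TimedPetriNet:
    """Toy timed Petri net: a timed arc [t1, t2] lets a transition fire only
    while the age of the enabling token lies inside the guard interval."""
    marking: dict = field(default_factory=dict)  # place -> token timestamps
    now: float = 0.0

    def can_fire(self, inputs):
        """inputs: list of (place, (t1, t2)) timed-arc guards."""
        return all(
            any(t1 <= self.now - ts <= t2 for ts in self.marking.get(place, []))
            for place, (t1, t2) in inputs
        )

    def fire(self, inputs, outputs):
        assert self.can_fire(inputs)
        for place, (t1, t2) in inputs:      # consume one enabling token per arc
            tokens = self.marking[place]
            tokens.remove(next(ts for ts in tokens if t1 <= self.now - ts <= t2))
        for place in outputs:               # produce fresh tokens, timestamped now
            self.marking.setdefault(place, []).append(self.now)

# Example: a READ may only follow an ACT after tRCD has elapsed
# (timing value is hypothetical, not taken from any JEDEC standard).
tRCD = 13.75  # ns
net = TimedPetriNet({"bank_activating": [0.0]})
guard = [("bank_activating", (tRCD, float("inf")))]
net.now = 5.0
print(net.can_fire(guard))   # False: tRCD not yet satisfied
net.now = 15.0
print(net.can_fire(guard))   # True: the READ transition may fire
net.fire(guard, ["bank_active"])
```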

The paper then goes on to define a DSL (domain-specific language) called DRAMml, built with the MPS software from JetBrains, to describe this Petri net and convert it to SystemC. The generated executable model was simulated with several DRAM simulators, namely DRAMSim2, DRAMSys, Ramulator, and the DRAM controller in gem5, uncovering a timing violation in DRAMSys.

I found the paper interesting and easy to read, given that I have worked with Petri nets before. Following the DRAM specs in detail is more for the specialist, but can be educational. The claim that DRAMml “describes all the timing, state and command information of the JEDEC standards in a formal, short, comprehensive, readable and understandable format” is not obvious: it requires an understanding of Petri nets, which may be a barrier to adoption of the methodology despite the improved simplicity and expressive power. It would be interesting to know what JEDEC thinks of this approach, since in principle it should allow them to provide, or at least build, definitive reference models for new standard releases.


Podcast EP158: The Benefits of a Unified HW/SW Architecture for AI with Quadric’s Nigel Drego
by Daniel Nenni on 04-28-2023 at 10:00 am

Dan is joined by Nigel Drego, the CTO and Co-founder at Quadric. Nigel brings extensive experience in software and hardware design to his role at Quadric. Nigel is an expert in computer architectures, compiler technology, and software frameworks.

Dan explores the unique and unified HW/SW architecture developed by Quadric with Nigel. The benefits of a single architecture programmable approach to on-chip AI is explained, along with specific examples of how to adapt the system to various AI processing challenges.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview: Ravi Thummarukudy of Mobiveil
by Daniel Nenni on 04-28-2023 at 6:00 am


Mobiveil Marks 11th Anniversary

Ravi Thummarukudy is Mobiveil’s Chief Executive Officer and a founder. He and I recently spent an enjoyable afternoon getting acquainted as I learned more about Mobiveil. It’s an inspiring story of a technology company in the semiconductor space helping customers and prospering.

Eleven-year-old Mobiveil is noted for its silicon IP, application platforms and engineering services for flash storage, data center, 5G, AI/ML, automotive and IoT applications. In those 11 years, Mobiveil has thrived and continues to do so, with around 500 employees located around the world – R&D centers in Silicon Valley and in Bangalore, Chennai, Hyderabad, and Rajkot in India.

Product development teams at the world’s largest product companies in the U.S., Europe, China, Japan, Korea, Israel, and Taiwan have leveraged Mobiveil’s IP blocks, hardware platforms and specialized engineering services to accelerate their innovation and product development schedules. Mobiveil is unique in its business in continuously investing in R&D to develop reusable components and platforms that increase the value-add of its engineering capabilities.

What’s your background? What about your co-founders’ backgrounds?

All the founders of Mobiveil have electronics engineering backgrounds with 20-plus years’ experience in the electronic product development marketplace, working for either system OEMs or their supply chains, such as EDA or semiconductor companies. Another thing the founders have in common is our passion for new product development and the opportunity to accelerate this process.

This is the second company we founded together as a leadership team. We founded GDA Technologies in the late 1990s and grew it to a more-than-600-employee organization before it was acquired by L&T Infotech.

I received a Master of Science degree in electrical engineering from IIT Chennai and an MBA from Santa Clara University, and worked at the Indian Space Research Organization (ISRO), Tata Consultancy Services and Cadence Design Systems before venturing into entrepreneurship.

Did you see yourself as an entrepreneur?

Like many people working in Silicon Valley, becoming an entrepreneur was always on my mind, and my co-founders had similar aspirations. During my time at Cadence, I learned the contours of the electronics industry as well as customer-centered business practices, and I established deep working relationships in the EDA and semiconductor industry. The industry was primed for growth due to the standardization of design languages and the advent of IP-centric SoC design methodologies. When we started the first company, it was a boom time in Silicon Valley: VCs were funding several semiconductor startups and there was tremendous demand for outsourced engineering. This helped us scale the business with no external investment, and that jumpstart allowed us to chart our own destiny, following our engineering intuitions and targeting our investments toward the disruptions taking place in the marketplace.

After a successful exit from our first venture, we worked at large public companies but came together again for this venture, which we named Mobiveil. I would say that we are passionate about this business and enjoy being serial entrepreneurs.

What convinced you and your co-founders to start Mobiveil?

It’s our firm conviction that no company, big or small, can do its entire R&D by itself. There was a time when product companies were fully vertically integrated, from product definition to manufacturing. The cost and efficiency offered by specialized companies drove product companies to gradually outsource manufacturing, chip development, EDA tools and finally IP. Once the industry became standardized, outsourcing EDA tools, IP and engineering services became a no-brainer. We helped this process by offering capabilities in the U.S. as well as from India that met the needs of our customers at affordable cost then, and we continue to do so today.

Mobiveil began by targeting the mobile apps space and later moved into the storage area. Why the shift?

Our first company was acquired in 2008, and when we looked at a restart several years later, the main theme of the time was mobility and smartphones, and we wanted to contribute to this space. However, we soon realized that our passion belonged in product development, and we pivoted back to silicon IP, platforms, and engineering services.

During this time, storage technology was transforming from hard disks to flash storage, and the NVM Express standard emerged. We quickly developed NVM Express IP and got it certified by the University of New Hampshire (UNH). That enabled us to help many of our customers accelerate the hard-disk-to-flash or SATA-to-NVMe transformation. This trend was further accelerated by the exponential growth of data center and cloud service providers, who needed the latency and throughput that PCI Express and NVM Express offered. Over the years, we developed many IP blocks, acquired IP assets in data storage, and augmented them with standard platforms and specialized engineering services.

Another example is CXL technology. We were one of the first companies to develop CXL design IP and achieve interoperability with Intel’s Sapphire Rapids platform. Today we have several silicon IP blocks around high-speed interfaces and error-correction technologies for memory and storage, along with several ready-made FPGA platforms. We also hold a few important patents in this space.

How has the SIP market changed and evolved over the past 10 years? How much have the market dynamics changed?

One of my first jobs at Cadence in the 1990s was to work with companies like NEC, LSI Logic and Toshiba to convince them to use Cadence EDA tools instead of their internally developed tools and methodologies. Even though these customers loved their fully customized and locally supported EDA tools, those tools did not stand a chance against more sophisticated and ever-improving third-party EDA tools and the economies of scale they offered.

When we started the first company, we were convinced that SoC customers would sooner or later outsource standard IP to third parties, and we ventured into standards-based IP like HyperTransport, RapidIO and PCI Express. Today, using third-party IP and engineering services is as common a practice as getting a chip manufactured at TSMC or Samsung foundries.

While the IP industry grew as a separate segment, it was quickly absorbed by the EDA companies, as it was an adjacent market for them. They started offering many standards-based IP blocks along with their EDA tools. Conversely, independent IP companies became specialized, offering highly complex IP to differentiate themselves from the EDA companies.

Our approach is to focus on technology verticals like data storage and 5G and offer a portfolio of digital IP combined with engineering services as our major differentiator.

How important are industry standards for IP? Is Mobiveil active in standards organizations?

Standardization is extremely important for IP. In fact, standardization is the main reason the EDA and SIP industries were created. The huge success of the Verilog and VHDL design languages and of the PCIe, USB, DDR and Ethernet protocol and electrical standards fueled the growth of the EDA and IP businesses, along with outsourced engineering services.

Standardization allows engineers working anywhere in the world to design standard IP components in an EDA environment that can be quickly integrated into SoCs. The democratization of the semiconductor industry was further strengthened by the advent of open-source initiatives like RISC-V and the availability of lower-cost manufacturing and engineering talent, primarily in Asian countries.

Mobiveil currently participates in several standards bodies, such as PCI-SIG, the MIPI Alliance and the NVM Express consortium. Being part of these standards bodies and accelerating the adoption of these technologies early is a critical strategy for our growth.

What’s next for Mobiveil?

We continue to be passionate about helping our customers realize their products faster and cheaper. To that end, we continue to invest in innovation that creates standards-based IP blocks and platforms augmented by specialty services. We are confident in the growth of this industry and in our ability to scale the company for greater success in the years to come. We have had several successes entering new spaces like flash storage and making valuable contributions.

Looking forward, I see Mobiveil growing and becoming a major contributor of IP and services in 5G wireless, where we have an initiative to develop mathematics-based data path IP for data encoding, decoding and transformation. We are also developing a platform for offering 5G services for private cells (licensed and unlicensed bands) and gateways. On the AI front, we focus on computer vision, image recognition and processing. For geographic expansion, we will soon open an office in Munich, Germany, where we plan to focus on 5G wireless, automotive, and industrial automation including robotics.

Also Read:

Developing the Lowest Power IoT Devices with Russell Mohn

CTO Interview: Dr. Zakir Hussain Syed of Infinisim

CEO Interview: Axel Kloth of Abacus