SemiWiki – Page 466 – The Open Forum for Semiconductor Professionals

March 12, 2019 | September 15, 2020

Webinar: Addressing Multiphysics Challenges in 7nm FinFET Designs

by Daniel Nenni on 03-12-2019 at 7:00 am
Categories: Ansys, Inc., EDA, FinFET

EDA is big on growth through acquisition; having been acquired many times throughout my career, I know this from experience. In fact, we have a wiki that tracks EDA Mergers and Acquisitions, and it is the most viewed wiki on SemiWiki.com with 101,918 views thus far.

In March of 2017 ANSYS acquired CLK Design Automation, which offered timing variation analysis and FX for transistor-level modeling and simulation. At the time I worked for Solido Design, which had some overlap with CLK, and we actually looked at acquiring them before ANSYS did. The jewel in the crown of CLK was its technologists, and one of those jewels is Dr. Joao Geada.

Bio: Dr. Joao Geada is a chief technologist at ANSYS, with over 20 years of EDA experience. He leads the development of the semiconductor business unit’s FX timing and timing variation products. He is the author of numerous papers and patents around static timing analysis and statistical timing. Before ANSYS, Dr. Geada was CTO and co-founder of CLK Design Automation and one of the lead architects in the verification and simulation group at Synopsys. Before Synopsys, Dr. Geada was a senior researcher at Cadence Design Systems and started his career at the IBM TJ Watson Research Center. Dr. Geada holds a Ph.D. and bachelor’s degree in engineering from the University of Newcastle upon Tyne (UK).

Dr. Geada is the speaker for the upcoming webinar Addressing Multiphysics Challenges in 7nm FinFET Designs. Even if you can’t make the live webinar, sign up and you will be notified when the replay is up:

Date: March 28, 2019
Time: 9 a.m. PST
Presenter: Dr. Joao Geada, Chief Technologist, ANSYS

Webinar Link: http://bit.ly/2C8pF3B

Abstract:
Variability has become the new enemy in 7nm FinFET designs. You can’t fix what you can’t find, and variability takes many forms. For instance, there is variability in process due to smaller geometries, variability in voltage drop due to varying workloads and variability in temperature across the chip due to increased self-heating and joule heating effects. All directly impact silicon performance. Increased cross-coupling of various multiphysics effects such as timing, power and thermal in 7nm designs poses significant challenges for design closure. Power grid design and the profound impact of grid weakness issues on timing-critical paths have become limiting factors for achieving the desired performance and area targets. Power grids consume a significant amount of metallization resources, and with routability becoming a big constraint at advanced nodes, power and timing closure have become a designer’s nightmare.

Traditional margin-based methodologies that have served well in the past are becoming ineffective. These methodologies helped in confining the problem space by decoupling design methodologies to manage complexity and limitations in electronic design automation (EDA) tools that are not architected to solve multiphysics challenges. At 7nm process nodes, however, these siloed methodologies are increasingly failing to achieve the highest performance in silicon. Margins work well only as long as the results are predictable. With the margin-based approach, increased variability makes it hard to predict true silicon behavior and impacts both time-to-result (TTR) and time-to-market (TTM) goals in complex design projects.

Attend this webinar to learn how ANSYS multiphysics simulations can be leveraged for better understanding the true limits of built-in margins and accurately predicting post-silicon behavior. Multiphysics simulations will enable you to achieve the target maximum frequency on silicon, while drastically improving the functional yield of your chips.

About ANSYS, Inc.
If you’ve ever seen a rocket launch, flown on an airplane, driven a car, used a computer, touched a mobile device, crossed a bridge, or put on wearable technology, chances are you’ve used a product where ANSYS software played a critical role in its creation. ANSYS is the global leader in Pervasive Engineering Simulation. We help the world’s most innovative companies deliver radically better products to their customers. By offering the best and broadest portfolio of engineering simulation software, we help them solve the most complex design challenges and create products limited only by imagination. Founded in 1970, ANSYS employs thousands of professionals, many of whom are expert M.S. and Ph.D.-level engineers in finite element analysis, computational fluid dynamics, electronics, semiconductors, embedded software and design optimization. Headquartered south of Pittsburgh, Pennsylvania, U.S.A., ANSYS has more than 75 strategic sales locations throughout the world with a network of channel partners in 40+ countries. Visit www.ansys.com for more information.

March 11, 2019 | July 18, 2025

Accelerating SOC Development for Automobile Applications

by Tom Simon on 03-11-2019 at 12:00 pm
Categories: Automotive, IP, Synopsys

No area of electronics is moving faster than automotive semiconductors. Everyone has been talking about the increasing electronics content of automobiles for decades. With Advanced Driver Assistance Systems (ADAS) and autonomous driving becoming a reality, the pace has picked up even more. These new designs combine just about every single advanced subsystem used in SoC designs. Prior to the giant leap in mobile device technology, people talked about how there was a ‘convergence’ coming that would integrate communications, networking, graphics, processing, etc. That did indeed happen, with the result being the current generation of cell phones. However, a new convergence is coming.

With ADAS and autonomous driving, we are essentially talking about putting advanced supercomputing along with state of the art sensor fusion, multiple modes of wired and wireless networking, high-speed memory, and advanced algorithms into each car. The rigid power, security, and reliability constraints on these new systems make this a more daunting task. Exciting new companies like FABU Technology have come along to address the growing market for AI SoCs with innovative designs targeted at ADAS and autonomous driving.

FABU has set out to rapidly build SoCs for ADAS and autonomous driving that can collect sensor data from gyro, accelerometer, compass, vision, lidar and radar systems, then combine them with maps, real time traffic and road condition data to create an accurate view of the vehicle’s environment. Surrounding vehicles, traffic signs, and pedestrians must be identified. Additionally, the ADAS and autonomous driving systems need to monitor the driver to detect driver attentiveness, distraction or drowsiness.

The market is moving too fast for a company like FABU to set out to build the required automotive IP from scratch. At the same time, much of the IP needed is specialized and has to be built to automotive standards. These standards include ISO 26262 and AEC-Q100. In order to focus on their core competency, FABU chose to license a broad swath of the necessary IP from Synopsys, whose automotive-grade DesignWare IP offerings are ideally suited to FABU’s needs. Going this route allows FABU to focus on where they add value – implementing highly optimized algorithms – while leveraging IP that meets their functional needs as well as all automotive reliability and security requirements.

I had a chance to speak recently with Ron DiGiuseppe, senior marketing manager for automotive IP products at Synopsys, about FABU’s choice for IP. They worked closely with FABU to help them select the optimal interface, security, processors, and foundation IP solutions. Interface IP is being used to collect data from sensors, such as MIPI for image and video.

FABU will be using the Synopsys Safety Island for its ARC processors with dual lockstep cores, which have an independent safety monitor to check for faults and failures. In the event of a safety exception there is an escalation to the host processor where it can be independently processed.

Ron also talked about the need for security. The last thing you want is security intrusions. Synopsys IP offers hardware Root of Trust with encryption and a trusted execution environment (TEE) to prevent tampering and other malicious activities.

By licensing the broad portfolio of IP from Synopsys, FABU will benefit from consistency in the deliverables, especially with respect to documentation for ISO 26262 and IP integration. Ron pointed out that Synopsys will work with FABU to ensure selection of the ideal process node to meet their PPA and safety requirements. Synopsys has built relationships with foundries for automotive processes as part of their commitment to this market. FABU can use the licensed IP as a foundation for creating SoCs that offer breakthrough functionality and performance. The announcement contains detailed information on each of the categories of IP that they licensed and information about how each of them meets the requirements for automotive applications.

Synopsys automotive-grade DesignWare IP comes with documentation for ISO 26262 compliance and their automotive IP is ASIL B or D Ready. For reliability, Synopsys works with foundries to ensure AEC-Q100 compliance. This involves producing GDS layout that meets the more stringent automotive reliability including design rules for Grade 1 and 2 temperature. Another area where Synopsys adds value for the automotive market is with their test and repair tools, which further improve quality and reliability.

March 11, 2019 | September 15, 2020

eSilicon Bucking the Trend at OFC with 7nm SerDes

by Daniel Nenni on 03-11-2019 at 8:00 am
Categories: eSilicon, Events, IP

A recent press release from eSilicon caught my eye. The company has been touting their 7nm SerDes quite a bit lately – reach, power, flexibility, things like that. While those capabilities are important, any high-performance chip needs to work in the context of the system, which usually contains technology from multiple sources. So, interoperability does matter and eSilicon’s press release announcing the addition of an interoperability demo with a mainstream FPGA at a major show is relevant. The release also talked about working with another ecosystem partner – Precise-ITC, to validate that their forward error correction (FEC) IP worked with the eSilicon SerDes as well.

Interoperability demo at OFC: eSilicon 56G SerDes and Precise-ITC 400G FEC

“Our current SerDes demonstration showcases the robustness, low power and flexibility of our 7nm device,” said Hugh Durdan, vice president of strategy and products at eSilicon. “It is also important to demonstrate interoperability with other popular hardware. I am delighted we can showcase this additional aspect of our SerDes capabilities at OFC.”

“Precise-ITC is a leading provider of Ethernet and optical transport (OTN) intellectual property products for ASIC and FPGA,” said Silas Li, Director of Engineering at Precise-ITC. “OFC2019 is a showcase event for the partnerships we have with FPGA vendors, ASIC developers, like eSilicon, and test equipment developers. Together, we’re enabling rapid deployment of 400GbE.”

Digging a bit more, the release announced additions to the demo complement eSilicon will showcase at OFC. The Optical Fiber Communication Conference and Exposition (OFC) is a huge technical conference and trade show that is over 40 years old. According to their website: “OFC is the largest global conference and exhibition for optical communications and networking professionals.” There are over 700 exhibitors on 350,000 square feet of exhibit space. The show takes up the entire San Diego Convention Center, which is also where Comic-Con is held. This is a huge show.

OFC show floor

Digging further, you can find some more interesting news in the press release. In addition to the interoperability demo, eSilicon is demonstrating a complete HBM2 memory subsystem using eSilicon’s latest 7nm HBM2 PHY, Northwest Logic’s memory controller and an HBM DRAM stack from a leading memory supplier. And they’re demonstrating the performance, flexibility and extremely low power consumption of their 7nm SerDes using a five-meter ExaMAX Backplane Cable Assembly from Samtec.

eSilicon booth

Five-meter cable demo

eSilicon is demonstrating high-speed communications over a five-meter copper cable at the biggest optical networking show in the world. I would say that takes a lot of confidence. I had some spies at the show, and they reported quite a bit of interest in eSilicon’s copper cable demo. They appear to be driving the longest electrical cable at the show. Getting high speed and low power with a proven, simpler technology such as copper is certainly appealing. I’ll be watching to see what eSilicon announces next.

March 10, 2019

Ultra low-power Analog Design using a Multi-Project Wafer approach

by Daniel Payne on 03-10-2019 at 1:00 pm
Categories: EDA, General

On SemiWiki we often talk about bleeding-edge technology like 7nm, 5nm or even 3nm, but for analog IC designs there’s a low-cost way to get your ideas validated and prototyped without taking out a multi-million dollar loan, and that’s through the use of Multi-Project Wafers (MPW). A mature process node like 180nm still produces adequate silicon for low-power applications like IoT, where analog sensors and converters are the main part of the chip functionality along with some digital control logic (think big A, little D applications).

My industry contact Wladek Grabinski shared information with me this week about a company in France called CMP (English translation: Multi-Project Circuits) that has been offering MPW foundry services since 1981 to keep costs down for IC designers at universities, research laboratories or industrial companies that want to prototype their analog ideas economically.

For an MPW project you likely want from dozens to thousands of pieces manufactured for you, either packaged or just bare die, ready for testing. In total, some 7,900 projects have been prototyped through 1,043 MPW runs at CMP over the years, helping 614 customers turn their analog ideas into silicon. CMP certainly has its act together and provides a much needed service for companies needing to get quick prototypes for big A, little D designs.

A Swiss company, EM Microelectronic (EM), has an ultra low-power IP library and foundry all ready to use with MPW services provided by CMP. Here’s what EM has to offer you:

Mature 180nm node for ultra low-power analog design (APL018)
NVM (EE or Flash)
Accurate EKV models for near- and sub-Vth operation
Analog and digital IP libraries characterized for low voltage (down to 0.4V), low current (nA bias)
I/O pads, low leakage ESD protection
Design Kit for Cadence
Digital flow for Synopsys

EM really knows IC design, as they’ve been in business since 1975 and their ultra-low power silicon is used in six major application areas:

Energy – harvesting, power management, storage
Interfaces – displays, tactile surfaces, computer peripherals, motion sensing, sound production
Sensing – interfaces, sensors
Communications – RF technologies, RF long range communication, RFID, beacons
Smart Processing – wearables, cryptography and security
Time – watches, fobs

Even though the EM headquarters are in Marin, Switzerland, you can also find their facilities around the globe in:

Colorado Springs, USA
Prague, Czech Republic
Bangkok, Thailand

If you’ve ever shopped for a watch you have likely seen the iconic Swatch brand in retail stores and online; EM is the semiconductor company behind Swatch.

Looking at the most recent press releases at EM I conclude that this company is well suited for IC designs that require Bluetooth, IoT, RF and anything that is battery-powered and requires ultra-low power consumption.

CMP invited EM to present at a seminar last month, so check out the slides here.

March 10, 2019

Lyft IPO Paints Perilous Profitless Picture

by Roger C. Lanctot on 03-10-2019 at 8:00 am
Categories: Automotive

Lyft’s S1 filing for its IPO is a sobering read, as such documents often are, requiring, as they do, the full disclosure of current financial circumstances and, everyone’s favorite: risk factors. Lyft identifies 18 risk factors (below) which could interfere with the long-term success of the operation. I think there are more.

March 8, 2019 | May 6, 2019

Data Centers and AI Chips Benefit from Embedded In-Chip Monitoring

by Daniel Payne on 03-08-2019 at 12:00 pm
Categories: AI, EDA, Moortec

Webinars are a quick way to come up to speed with emerging trends in our semiconductor world, so I just finished watching an interesting one from Moortec about the benefits of embedded in-chip monitoring for data center and AI chip design. My first exposure to a data center was back in the 1960s during an elementary school class where they wheeled in a Teletype machine connected to a telephone line. At the other end was a centralized computer system located in some air-conditioned room that ran a Civil War game app that had us students choosing how to run a campaign with our resources and then predicting the outcome of the battle. In the 1970s at the University of Minnesota our data center was powered by machines from Control Data Corporation, and then at my first job with Intel in 1978 the data center was powered by IBM mainframes in a remote location that we accessed from Oregon.

Living in Oregon we know something about data centers because of the low cost of electricity from our plentiful hydro power generators, the moderate climate, and generous tax breaks for companies like Google to locate here. In 2018 the data centers in the US consumed some 90 billion kilowatt-hours of electricity, while globally that consumption was 416 terawatt-hours, which was 3% of the total electrical output. This growing trend in data center power consumption causes heat-induced reliability issues for each of the semiconductor components mounted on the boards that fill racks of equipment.

Source: Google Data Center

Much new VC money in 2018 has poured into AI chip startups, so let’s just summarize both the data center and AI chip design challenges:

Data Center
· Reliability and long MTBF (Mean Time Between Failures)
· Low service interruption
· Big die sizes at advanced nodes
· High volume with high manufacturing yield required
· Fine grain DVFS (Dynamic Voltage and Frequency Scaling) control
· Chip supply voltage noise

AI
· High data throughput
· Intense and bursty computations
· Constrained power
· Variable CPU core usage, or utilisation
· Continual optimisation of algorithms for data analysis and manipulation
One method to deal with all of these chip design challenges is to place PVT (Process, Voltage, Temperature) monitors in your AI or data center chips, allowing you to measure in real time what’s happening deep within each chip, then use that info to make decisions about changing the Vdd values or local clock speeds to ensure chip reliability and meet MTBF goals. Take the example of a typical AI chip which may have CPU clusters with thousands of cores, as shown below, where 16 cores form each cluster and PVT sensor blocks (colored blocks) are placed around each cluster:

CPU Clusters with PVT monitors

The temperature monitors will let you know if the Junction Temperatures are within specifications, for example 110C. Thermal monitors can be used to:

· Avoid Electrical Over Stress (EOS)
· Mitigate Electromigration effects
· Limit hot carrier aging
· Prevent thermal runaway

Semiconductor processes are not uniform, so you cannot expect that silicon will be centered on the TT corner. Instead you can expect:
· Process variability across each die
· Variation caused by lithography
· Reliability effects like aging
· FinFET variations

IC designers start out with an ideal power supply concept like a Vdd value of 1.1V, but then have to deal with the non-ideal physical realities of on-chip voltages, like:
· Interconnect resistance causing dynamic IR drops along Vdd paths
· Dynamic versus static power
· Electromigration effects on Power, clock and interconnect

Static Timing Analysis (STA) tools are run on chips before tapeout to ensure that your design meets speed criteria across all PVT corners, but with actual physical local variations on advanced nodes it’s conceivable that one die region has a temperature of 50C, Vdd of 0.8V and SS corner, while another region has a slightly different temperature of 65C, Vdd of 0.9V and TT corner. Your STA tool needs to handle these on-chip variations (OCV) while calculating path delays.

Not all thermal monitors are created equal, so if Moortec provides a thermal monitor with +/- 2C accuracy, and another vendor has a +/- 5C accuracy thermal monitor, go with the 2C monitor in order to provide tighter control to your thermal throttling system, which in turn provides greater power savings and allows for the highest data throughput.
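To make the guard-banding argument concrete, here is a minimal sketch of how monitor accuracy moves the throttling point. It assumes the simplest possible policy (throttle as soon as the true junction temperature could have reached the limit); the function names are hypothetical and the 110C limit is just the example figure used earlier in this article.

```python
def throttle_threshold_c(tj_max_c: float, monitor_accuracy_c: float) -> float:
    """Highest monitor reading at which the true junction temperature is
    still guaranteed to be at or below tj_max_c, for a +/- accuracy monitor."""
    return tj_max_c - monitor_accuracy_c


def should_throttle(reading_c: float, tj_max_c: float, monitor_accuracy_c: float) -> bool:
    """Throttle once the reading reaches the guard-banded threshold."""
    return reading_c >= throttle_threshold_c(tj_max_c, monitor_accuracy_c)


TJ_MAX_C = 110.0  # example junction temperature limit from the article

for accuracy_c in (2.0, 5.0):
    print(f"+/-{accuracy_c}C monitor: throttle at {throttle_threshold_c(TJ_MAX_C, accuracy_c)}C")
```

With these assumptions a chip reading 106C keeps running at full speed under the +/- 2C monitor but is already throttled under the +/- 5C one, which is where the throughput difference comes from.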

Consider the power savings for a data center with 100,000 servers (Facebook having ~400,000 for example) and you could save 2W per chip by using a Moortec PVT approach versus a less accurate monitor that requires 6C more thermal guard-banding. The webinar provided a case study with calculations, showing if this saving per chip were scaled upward then a data center could save around $2M per year in electricity costs.
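The energy side of that estimate is simple arithmetic, sketched below. The server count and the 2W per chip come from the paragraph above; the number of chips per server, the cooling overhead factor (PUE) and the electricity price are my own placeholder parameters, not the webinar's inputs, so this is not expected to reproduce the $2M case-study figure.

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_saved_kwh(servers: int,
                            watts_saved_per_chip: float,
                            chips_per_server: int = 1,   # assumption
                            pue: float = 1.0) -> float:  # assumption: no cooling overhead
    """Energy saved per year, in kWh, when every monitored chip draws less power."""
    watts = servers * chips_per_server * watts_saved_per_chip * pue
    return watts * HOURS_PER_YEAR / 1000.0


def annual_cost_saved_usd(energy_kwh: float, usd_per_kwh: float) -> float:
    """Convert saved energy into money for a given (hypothetical) tariff."""
    return energy_kwh * usd_per_kwh


kwh = annual_energy_saved_kwh(servers=100_000, watts_saved_per_chip=2.0)
print(f"{kwh:,.0f} kWh saved per year")
```

The total scales linearly with each parameter, so chips per server, cooling overhead and the local tariff all change the result substantially.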

Just as tighter thermal guard-banding benefits data center chips and systems, so does tighter voltage guard-banding: the highly accurate (1%) voltage monitors from Moortec mean fewer watts wasted on a system compared with less accurate voltage guard-banding. An example system using 0.8V for Vdd and a 20W target and using Moortec voltage monitors shows a worst-case value of 20.4W, while a less accurate voltage monitor has a worst-case value of 22.1W, which is 10% more wasted power than what Moortec provides. Again, Moortec outlined that there were material cost savings to the data center operators.
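The 20.4W figure is consistent with a simple model in which Vdd is raised by the monitor's error and power grows with the square of the voltage. The sketch below works that through. The quadratic power model and the 5% accuracy for the "less accurate" monitor are my assumptions (the article does not state either); only the 1% accuracy and the 20W target come from the text.

```python
def worst_case_power_w(target_power_w: float, monitor_error_fraction: float) -> float:
    """Worst-case power if Vdd must be guard-banded upward by the monitor error,
    assuming power scales with Vdd squared (dynamic power at fixed frequency)."""
    return target_power_w * (1.0 + monitor_error_fraction) ** 2


TARGET_W = 20.0

accurate = worst_case_power_w(TARGET_W, 0.01)  # 1% monitor, from the article
coarse = worst_case_power_w(TARGET_W, 0.05)    # 5% monitor, assumed for illustration

print(f"1% monitor: {accurate:.3f} W, 5% monitor: {coarse:.3f} W")
```

Under these assumptions the 1% monitor gives about 20.4W and the 5% monitor about 22.05W, close to the 22.1W quoted above.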

SoCs that use Adaptive Voltage Scaling (AVS) in closed loop benefit from using embedded Process or Voltage Monitors that tell the PMIC (Power Management IC) what the actual silicon values are.
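A closed AVS loop can be pictured as a small controller that nudges the supply up or down based on what the embedded monitor reports. The sketch below is purely illustrative: the timing-margin input, step size, deadband and voltage limits are all hypothetical values of mine, not Moortec's or any PMIC's actual interface.

```python
def avs_next_vdd_mv(vdd_mv: int,
                    margin_ps: int,
                    target_margin_ps: int = 20,  # hypothetical minimum timing margin
                    deadband_ps: int = 10,       # hypothetical hysteresis band
                    step_mv: int = 5,            # hypothetical PMIC step size
                    vmin_mv: int = 650,
                    vmax_mv: int = 900) -> int:
    """One iteration of a closed-loop AVS controller.

    margin_ps is the timing margin reported by an on-chip monitor.
    Too little margin raises Vdd, comfortably more than needed lowers it,
    and anything inside the deadband leaves the supply unchanged.
    """
    if margin_ps < target_margin_ps:
        vdd_mv += step_mv
    elif margin_ps > target_margin_ps + deadband_ps:
        vdd_mv -= step_mv
    # Never ask the PMIC for a voltage outside the allowed range.
    return max(vmin_mv, min(vmax_mv, vdd_mv))


vdd = 800
for reported_margin in (50, 45, 38, 28, 15):
    vdd = avs_next_vdd_mv(vdd, reported_margin)
    print(reported_margin, vdd)
```

The deadband is there to stop the loop from toggling the supply on every sample when the margin sits right at the target.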

Voltage scaling optimization

Summary
There’s only one IP vendor dedicated 100% to PVT monitoring for ICs and that’s Moortec. They started in the UK back in 2005 and now have customers around the globe using the most popular nodes from the major foundries. You can take the next step and contact the office nearest to your timezone: UK, USA, China, Taiwan, Israel, Europe, South Korea, Russia, Japan.

Watch the entire 35 minute webinar recording online, after a brief registration process.


March 8, 2019 | May 6, 2020

Arm Deliver Their Next Step in Infrastructure

by Bernard Murphy on 03-08-2019 at 7:00 am
Categories: Arm, IoT, IP

Arm announced their Neoverse plans not long ago at TechCon 2018. Neoverse is a brand, launched by Arm, to provide the foundations for cloud to edge infrastructure in support of their vision of a trillion edge devices. To a cynic this might sound like marketing hype. Sure, they’re widely used in communications infrastructure and certainly in edge devices, but they never really cracked the datacenter, or so conventional wisdom held. They put that concern to rest not long after TechCon when AWS announced immediate availability of EC2 A1 instances in their services. These are built on Arm-based Graviton processors, developed by AWS Annapurna Labs.


March 7, 2019 | November 22, 2019

Newer cryptocurrencies highlight need for agile mining strategies

by Tom Simon on 03-07-2019 at 12:00 pm
Categories: Achronix, eFPGA, FPGA

Cryptocurrencies represent a radical departure from traditional forms of money. Currencies like Bitcoin, Ethereum and Monero offer many unique advantages over traditional currencies, and are changing how money is created and used. Bitcoin, the pioneer of cryptocurrencies, relies on pure computational power for so-called mining, which is the process where transactions are verified and providers of this service are rewarded with newly minted bitcoins. Starting with CPUs, then GPUs, this led to an inexorable spiral towards more powerful and dedicated mining hardware. The mining activity moved to FPGAs and then to dedicated ASICs; at the same time, it moved to very specific geographies with low electricity costs. As a result, the democratization of cryptocurrency gave way to a smaller group of niche players.
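For readers who have not looked at what "pure computational power" means here, the following toy sketch shows the brute-force search at the heart of Bitcoin-style mining: try nonces until a double SHA-256 hash falls below a target. The header bytes, nonce encoding and difficulty are simplified placeholders of mine and do not follow the real Bitcoin block format.

```python
import hashlib


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 1_000_000):
    """Search for a nonce whose double SHA-256 hash has at least
    difficulty_bits leading zero bits. Returns (nonce, hex digest) or None."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        payload = header + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None


# Each extra difficulty bit doubles the expected number of hashes.
print(mine(b"toy block header", difficulty_bits=12))
```

Because the inner loop is nothing but a fixed hash function, it maps very well onto dedicated silicon, which is why the hardware race described above ended with ASICs.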

Fortunately, this trend has been challenged by newer cryptocurrencies that have imposed new requirements on mining that make it more democratic. For instance, newer currencies such as Monero regularly perform forks, which change the algorithm for mining, rendering dedicated ASICs obsolete. Another strategy is requiring random memory access in a large address space. Both of these features make it more challenging to develop silicon specifically targeted at gaining an advantage in mining.
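The random-memory-access strategy can be illustrated with a toy memory-hard hash: fill a scratchpad, then make every mixing step read and overwrite a location chosen by the current state. This is only a sketch of the general idea; it is not Monero's algorithm or any real cryptocurrency's, and the scratchpad size and round count are arbitrary values of mine.

```python
import hashlib


def memory_hard_hash(data: bytes, scratchpad_kib: int = 256, rounds: int = 10_000) -> str:
    """Toy memory-hard hash: the result depends on many data-dependent
    reads and writes scattered across a large scratchpad."""
    block_size = 32  # one SHA-256 digest per scratchpad entry
    n_blocks = scratchpad_kib * 1024 // block_size

    # Phase 1: fill the scratchpad with a hash chain seeded by the input.
    scratchpad = []
    state = hashlib.sha256(data).digest()
    for _ in range(n_blocks):
        scratchpad.append(state)
        state = hashlib.sha256(state).digest()

    # Phase 2: each step picks an address from the current state, so the
    # access pattern cannot be predicted or prefetched ahead of time.
    for _ in range(rounds):
        index = int.from_bytes(state[:8], "little") % n_blocks
        state = hashlib.sha256(state + scratchpad[index]).digest()
        scratchpad[index] = state

    return state.hex()


print(memory_hard_hash(b"toy block header"))
```

In a scheme like this, performance is bounded by memory size and latency rather than raw hashing speed, which takes away much of the advantage of a hash-only ASIC.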

Interestingly, Achronix has developed a radical departure from traditional FPGAs in the form of embeddable FPGA (eFPGA) fabric, which coincidentally offers some compelling advantages in the mining of these newer cryptocurrencies. Achronix has written a white paper that outlines how their Speedcore eFPGA is well suited to the task of mining. However, their treatise on how well suited their eFPGA is for mining also speaks indirectly to how eFPGA can be used to solve a wide variety of challenges that either traditional ASICs or FPGAs may struggle with.

Achronix’s Speedcore eFPGA is highly configurable, and at the same time does not drag a lot of unnecessary blocks into the finished design. In an amusing section of their white paper Achronix notes that some writers refer to standard FPGAs as programmable piles of parts. In all seriousness, standard FPGA parts often are mismatched to the task at hand. Nowhere is this truer than in the area of cryptocurrency mining. Things like Ethernet, PCIe, MACs, SerDes, etc. are not needed and just end up taking up valuable real estate for no actual benefit. Also, a multitude of small memories does not suffice for the memory needs associated with mining.

When a precisely configured eFPGA core can be married to custom memory instances, it leads to big performance, power and area advantages. Their white paper compares a case study that uses eFPGA in an ASIC to the performance of GPU or standard FPGA based alternatives. A traditional ASIC based alternative was ruled out because it lacks the re-programmability to deal with forks that require new algorithms for mining.

While perhaps some readers of their white paper may be compelled to embark on designing a new mining chip – the white paper certainly makes clear that it would be a wise choice – the bigger take away is that Speedcore eFPGA offers numerous advantages for a wide range of problems that are currently being addressed with CPUs, GPUs, ASICs or standard FPGAs. It was of course an interesting read on the directions where cryptocurrencies are headed. If you want to learn more, the white paper is available on their website, and makes for good reading.

March 7, 2019 | January 8, 2020

Intelligent Electronic Design Exploration with Large System Modeling and Analysis

by Camille Kokozaki on 03-07-2019 at 7:00 am
Categories: Cadence, EDA

At the recent DesignCon 2019 in Santa Clara, I attended a couple of sessions where Cadence and their research partners provided some insight on machine learning/AI and on large system design analysis. The first one focused on real-world cloud and machine learning/AI deployment for hardware design, and the second one focused on design space exploration for analyzing large system designs.

I. Intelligent Electronic Design and Decision

The first session was kicked off by Dr. David White of Cadence and was entitled Intelligent Electronic Design and Decision. He contrasted the internet-driven image recognition AI problems with EDA-related AI. The characteristics of image recognition include natural or man-made static objects with a rich set of online examples, whereas EDA problems are dynamic and require learning adaptability with sparse data sets, where verification is critical and optimization very important.

White pointed out that not a lot of large data sets exist and verification is essential to all we do in EDA/SoC design, and optimization plays a role in large designs when finding design solutions. The ML/DL space additionally refers to a few different technologies such as optimization and analytics. He also noted that these approaches can be computationally heavy, so massive parallel optimization is used to get the performance back. In the development of design automation solutions, uncertainty arises in one of two forms:

Factors/features that are unobservable
Factors/features that are observable but change over time.

Design intent is not always captured in EDA tools where designers have an objective and intention in mind and then tune to an acceptable solution. This can be problematic at recent silicon technologies where uncertainty is greatest and there is a low volume of designs to learn from. The goal is to use AI technology and tools to learn from a prior design database, to explore, and reach an acceptable solution. At PCB West 2018, auto router results presented from Intel took 120 hours but when using AI-based smart routing the runtime got down to 30 minutes.

There are five challenges for intelligent electronic design:
1. Developing real-time continuous learning systems:

Uncertainty requires the ability to adapt quickly
Limited observability requires ways to determine design intent

2. Creation of contextual learning for hierarchical decision structures:
There is a series of design decisions a designer makes to design a chip, package or board; those decisions drive a number of sub-goals. This leads to a number of complicated objective functions, or a complicated optimization problem, that must be solved in order to automate large chunks of the design flow.

3. Robust flexibility and verification:
Most designs are used behind firewalls, and solutions need autonomy. Formalized verification processes are needed to ensure stable learning and inference. Robust optimization approaches are needed to ensure stable decisions.

4. Cold start issues:
Learning and model development is difficult when a new silicon technology is ramped. Typically very little data is available and there is no model to transfer. This is typical of early silicon nodes (like 7nm) when there are few designs to learn from and overall uncertainty is largest.

5. Synthesizing cost functions to drive large-scale optimization is complex and difficult.

II. Design Space Exploration Models for Analyzing Large System Designs

The second session addressed Design Space Exploration with Polynomial Chaos Surrogate Models for Analyzing Large System Designs.[1] Cadence is collaborating with and supporting the academic work that was presented in that session.

Design space exploration usually involves tuning multiple parameters. Traditional approaches (sweeping, Monte Carlo) are time-consuming, costly, and non-optimal. The challenge is quantifying uncertainty from un-measurable sources. Polynomial Chaos (PC) provides more efficient uncertainty quantification methods and addresses the curse of dimensionality (too many parameters to track which may or may not be significant). In order to address this curse of dimensionality and since the size of the PC surrogate model increases near-exponentially, a dimension reduction of less important variables that have a negligible effect on output can occur as follows:

• Only sensitive variables are considered as random.
• The rest are fixed at their average value.
• A full PC model is developed based on the selected terms.
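
The dimension reduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' implementation: the `simulator` function is a made-up stand-in for an expensive simulation, the inputs are assumed to be independent standard normal variables (hence the Hermite basis), the coefficients are fitted by ordinary least squares, and the 0.01 sensitivity threshold is arbitrary.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # number of uncertain inputs in this toy problem


def simulator(x):
    """Hypothetical stand-in for an expensive simulation (not from the talk)."""
    return (1.0 + 2.0 * x[:, 0] + 0.5 * x[:, 1] ** 2
            + 0.8 * x[:, 0] * x[:, 1] + 0.01 * x[:, 2])


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) for n = 0, 1, 2."""
    return (np.ones_like(x), x, x ** 2 - 1.0)[n]


NORM = (1.0, 1.0, 2.0)  # E[He_n(X)^2] = n! for X ~ N(0, 1)


def univariate_indices(dim, degree=2):
    """Multi-indices with no cross terms: the simplified PC model."""
    idx = [(0,) * dim]
    for j in range(dim):
        for n in range(1, degree + 1):
            idx.append(tuple(n if k == j else 0 for k in range(dim)))
    return idx


def total_degree_indices(dim, degree=2):
    """All multi-indices up to the total degree: the full PC model."""
    return [i for i in itertools.product(range(degree + 1), repeat=dim)
            if sum(i) <= degree]


def fit_pc(x, y, indices):
    """Least-squares estimate of the PC coefficients."""
    basis = np.column_stack([
        np.prod([hermite(n, x[:, j]) for j, n in enumerate(idx)], axis=0)
        for idx in indices
    ])
    coeffs, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return coeffs


def sensitivities(coeffs, indices):
    """Share of output variance attributed to each input (simplified model)."""
    share = np.zeros(len(indices[0]))
    for c, idx in zip(coeffs, indices):
        term_variance = c ** 2 * np.prod([NORM[n] for n in idx])
        for j, n in enumerate(idx):
            if n > 0:
                share[j] += term_variance
    return share / share.sum()


# Fit the simplified model on all inputs and rank them by sensitivity.
x_all = rng.standard_normal((500, DIM))
simple_idx = univariate_indices(DIM)
S = sensitivities(fit_pc(x_all, simulator(x_all), simple_idx), simple_idx)
kept = [j for j in range(DIM) if S[j] > 0.01]  # arbitrary threshold

# Only sensitive inputs stay random; the rest are fixed at their mean (0).
x_red = np.zeros((60, DIM))
x_red[:, kept] = rng.standard_normal((60, len(kept)))

# Full PC model (with cross terms) on the reduced set of inputs.
full_idx = total_degree_indices(len(kept))
full_coeffs = dict(zip(full_idx, fit_pc(x_red[:, kept], simulator(x_red), full_idx)))

print("sensitivity shares:", np.round(S, 4))
print("inputs kept as random:", kept)
```

In this toy case the full model over the reduced inputs needs 6 basis terms instead of the 15 a degree-2 model over all four inputs would require, which is the kind of saving the dimension reduction is after.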

Polynomial Chaos theory was presented (with intimidating math that was well explained, including sensitivity analysis). A multi-stage approach for developing surrogate models was proposed, as follows:

• First, a simplified Polynomial Chaos (PC) model is developed.
• The simplified model is used for sensitivity analysis.
• Sensitivity analysis results are used for dimension reduction.
• The sensitivity of different ranges of variables is evaluated.
• Training samples are placed based on the results.
• A full PC surrogate model is developed and used for design space exploration.

A numerical example with a DDR4 topology was presented for validation, with the results summarized in a table and a diagram.

I had a chance to chat with Ambrish Varma, Sr Principal Software Engineer in the Sigrity High-Speed analysis division, and Ken Willis (product engineering architect, signal integrity). Their products cover system-level topologies end to end, from transmitters to receivers, not just for SerDes but also for parallel buses. Anything on the board can be extracted and models made for the transmitter and receiver, so both pre-layout and post-layout simulations can be done. Now machine learning algorithms can be used to speed up the simulations. Even if each simulation takes only 30 to 90 seconds, a million of them add up to roughly one to three years of back-to-back run time. One needs to figure out which parts of the SerDes to focus on. Otherwise, one could make a model of the layout and then never be able to run a simulation. The R&D here is the first foray into smart technology for simulation analysis.

ML gathers the data and trains on it, and to ensure the training data is not biased, the test uses random data. You then decide which parameters and variables to focus on. This is the first phase of the analysis. Next you abstract to a behavioral model, so that a simulation lasts a couple of minutes; with more training data you can then dial in the accuracy. Final results get within 1% of the predicted value. When sensitivity analysis is run, the models developed have an objective function or criterion. They use a metric called NJN (Normalized Jitter Noise), a measure of how open or closed an eye is within one unit interval, but the metric could also be overshoot, channel operating margin, power ripple, or signal-to-noise ratio.

Picking that objective function is important, because the sensitivity analysis can then focus on the major contributors. Cadence is helping academia as part of a consortium of industry and three universities: Georgia Tech, NC State and UIUC. This is still in the research stage and no release to production has occurred yet. One can tune the R, L, and C values, and the sensitivity analysis helps in choosing the optimum settings. A model will be part of a library of use cases. Design reuse is enhanced with physical content: a snippet of layout, logic, or netlist. If those reusable blocks are augmented with ML models for different objective functions, you can leverage the analysis in the reuse. It is possible that the ML models will be standardized so that they can be used across all EDA tools. The solution space will have different designs with models that can be standardized. Whole solutions could be tool-based or tool-specific.

Cooperation with academia and making the tool smarter are the objectives, for example minimizing the input required from the user. A design cell is used as input and the analysis currently runs locally, at the edge, but one can imagine computations and sampling being sent to an engine in the cloud, which would return the data. A one-step, push-button, computationally intensive flow can be envisioned moving forward. The team is working on firming up the model with tangible applications in mind. There is a tendency to think that ML is replacing traditional methods. It is, however, more an augmentation than a replacement. Advanced analysis becomes far more democratized, more simulation will be needed in the future, and this capability comes at the right time.

[More on Cadence signal integrity with artificial neural networks and deep learning]

[1] Majid Ahadi Dolatsara(1), Ambrish Varma(2), Kumar Keshavan(2), and Madhavan Swaminathan(1)
(1) Department of Electrical and Computer Engineering, Georgia Institute of Technology; Center for Co-Design of Chip, Package, System (C3PS); Center for Advanced Electronics Through Machine Learning (CAEML)
(2) Cadence

March 6, 2019July 18, 2025

PCIe 5.0 Jumps to the Fore in 2019

by Tom Simon on 03-06-2019 at 12:00 pm
Categories: IP, Synopsys

2019 will be a big year for PCIe. With the approval of version 0.9 of the Base Layer for PCIe 5.0, implementers have a solid foundation to begin working on designs. PCIe 4.0 was introduced in 2017; its predecessor, PCIe 3.0, was introduced in 2010, ages ago in this industry. In fact, 5.0 is so close on the heels of 4.0 that many products may simply leapfrog 4.0 and go directly to 5.0. Each version of PCIe has doubled the throughput, with 5.0 coming in at 63 GB/s per direction for a 16-lane implementation. Compare that to the 4 GB/s throughput of the 2003 PCIe 1.0 with 16 lanes.
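
The 4 GB/s and 63 GB/s figures can be reproduced from the per-lane transfer rates and line encodings of each generation. The short Python sketch below does that arithmetic; the function and table names are illustrative, and the result is the raw per-direction bandwidth after line encoding only, ignoring packet and protocol overhead.

```python
# Per-lane transfer rate (GT/s) and line encoding (payload bits, encoded bits).
# PCIe 1.0 and 2.0 use 8b/10b encoding; 3.0 and later use 128b/130b.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8, 10),
    "2.0": (5.0, 8, 10),
    "3.0": (8.0, 128, 130),
    "4.0": (16.0, 128, 130),
    "5.0": (32.0, 128, 130),
}


def throughput_gb_per_s(generation, lanes=16):
    """Per-direction throughput in GB/s after line encoding."""
    rate_gt_s, payload_bits, encoded_bits = PCIE_GENERATIONS[generation]
    gbit_per_lane = rate_gt_s * payload_bits / encoded_bits
    return gbit_per_lane / 8 * lanes  # 8 bits per byte


for gen in PCIE_GENERATIONS:
    print(f"PCIe {gen} x16: {throughput_gb_per_s(gen):.2f} GB/s")
```

The move from 8b/10b to 128b/130b encoding is why PCIe 3.0 could nearly double throughput over 2.0 while raising the transfer rate only from 5 to 8 GT/s.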

It’s even more amazing to go back to the specs of the original PCI from Intel in 1992. Back then the clock rate was 33.33 MHz with data rates of 133 MB/s for a 32-bit bus. Of course, the original PCI used parallel synchronous data lines, which limited throughput due to clocking and bus arbitration issues. All of the PCIe specifications rely on high-speed serial data transfers, with each connected device having a dedicated full-duplex pair of transmit and receive lines. As with other modern serial links, the clock is embedded in the data stream, eliminating the need for external clock lines. Multiple lanes are used to increase throughput, with the added requirement of limited lane skew so that the controller can reassemble the striped data.

Designers of PCIe IP and teams that are integrating PCIe 5.0 need to be mindful of a number of technical considerations. Synopsys recently posted an informative article about PCIe 5.0 on their website that discusses many of these issues. At a rate of 32 GT/s the Nyquist frequency increases to 16 GHz. This higher frequency complicates the channel design. Insertion loss increases at the higher operating frequency, and crosstalk becomes a more serious problem. FR4 as a PCB material is ruled out for most designs unless retimers can be used. The maximum allowed channel loss for PCIe 5.0 is 36 dB. A 16-inch, 100 Ohm differential-pair stripline on FR4 would have a loss of 33.44 dB at 16 GHz, leaving virtually no loss budget for the other elements of the channel, such as packaging, connectors, and cabling. Fortunately, there are alternatives that perform better, if the right design decisions are made.
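
The loss-budget argument is simple arithmetic, sketched here in Python using only the numbers quoted above. Two things are assumptions made for illustration: that trace loss scales linearly with length, and the 10 dB figure passed to the hypothetical `max_trace_length_in` helper for packages and connectors.

```python
DATA_RATE_GT_S = 32.0      # PCIe 5.0 transfer rate per lane
CHANNEL_BUDGET_DB = 36.0   # maximum allowed channel loss
FR4_LOSS_DB = 33.44        # quoted loss of the FR4 stripline at 16 GHz
FR4_LENGTH_IN = 16.0       # length of that stripline in inches

# With two-level (NRZ) signaling, the Nyquist frequency is half the data rate.
nyquist_ghz = DATA_RATE_GT_S / 2

# Budget left for package, connectors, and cabling after the FR4 trace.
remaining_db = CHANNEL_BUDGET_DB - FR4_LOSS_DB

# Assumption: loss grows linearly with trace length.
loss_per_inch_db = FR4_LOSS_DB / FR4_LENGTH_IN


def max_trace_length_in(other_loss_db):
    """Longest FR4 trace that fits once the other channel elements are budgeted."""
    return (CHANNEL_BUDGET_DB - other_loss_db) / loss_per_inch_db


print(f"Nyquist frequency: {nyquist_ghz:.0f} GHz")
print(f"Budget left after 16 in of FR4: {remaining_db:.2f} dB")
print(f"FR4 loss per inch: {loss_per_inch_db:.2f} dB")
# 10 dB for the other elements is an illustrative figure, not from the article.
print(f"Max FR4 length with 10 dB reserved: {max_trace_length_in(10.0):.1f} in")
```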

In their article Synopsys also points out that the interplay between the PHY and controller becomes more interesting. There is an interface, known as the PHY Interface for PCIe (PIPE), for integrating the PHY and controller, with the latest PIPE 5.1.1 supporting the changes for PCIe 5.0. In the latest version, the pin count has been reduced by moving side-band pins into register bits, the Physical Coding Sublayer (PCS) has been moved from the PHY to the controller to permit the use of more general-purpose PHY designs, and a 64-bit option has been added to help reduce the speed needed in the PIPE interface.

The Synopsys white paper offers an excellent description of the trade-offs relating to timing closure on 8- and 16-lane interfaces running at the highest transfer rates. Using a 512-bit controller with a 32-bit PIPE, running at 32 GT/s with 16 lanes, the controller logic timing can be closed with a 1 GHz clock rate. Other options either require much higher clock rates, making timing closure infeasible, or call for a larger controller that is not available in today’s market.
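
The 1 GHz figure follows from dividing the aggregate link rate by the data path width. The Python sketch below shows that relationship in simplified form: it works with raw line rates and ignores encoding overhead, and the function names are illustrative rather than taken from the Synopsys material.

```python
def controller_clock_ghz(lanes, rate_gt_s, datapath_bits):
    """Clock needed for the controller data path to keep up with the link."""
    return lanes * rate_gt_s / datapath_bits


def pipe_clock_ghz(rate_gt_s, pipe_bits):
    """Clock needed on one lane's PIPE interface of the given width."""
    return rate_gt_s / pipe_bits


# Configuration from the text: x16 at 32 GT/s, 512-bit controller, 32-bit PIPE.
print(controller_clock_ghz(16, 32.0, 512))  # 1.0 GHz
print(pipe_clock_ghz(32.0, 32))             # 1.0 GHz

# Narrower options double the required clock rate.
print(controller_clock_ghz(16, 32.0, 256))  # 2.0 GHz
print(pipe_clock_ghz(32.0, 16))             # 2.0 GHz
```

Halving either width doubles the clock that must be closed, which is why the narrower configurations quickly become impractical at 32 GT/s.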

Synopsys also provides a lot of useful information about packaging and signal integrity considerations for PCIe 5.0. They conclude with a section on modeling and testing of the interfaces.

Synopsys offers a complete solution for PCIe 5.0, including controllers, PHYs, and verification IP. This should come as some comfort to design teams that are looking to add the latest generation to their products.

There are a lot of considerations and choices to be made in order to build the right interface for a given application. The Synopsys DesignWare IP for PCIe includes configurability with support for multiple data path widths, including a silicon-proven 512-bit architecture. The article on their website is very informative and helps clarify some of the biggest issues relating to the move to PCIe 5.0.