At the recent DesignCon 2019 in Santa Clara, I attended a couple of sessions where Cadence and their research partners provided some insight on machine learning/AI and on large system design analysis. The first focused on real-world cloud and machine learning/AI deployment for hardware design, and the second on design space exploration for analyzing large system designs.
I. Intelligent Electronic Design and Decision
The first session was kicked off by Dr. David White of Cadence and was entitled Intelligent Electronic Design and Decision. He contrasted the internet-driven image recognition AI problems with EDA-related AI. Image recognition deals with natural or man-made static objects and has a rich set of online examples, whereas EDA problems are dynamic and require learning adaptability with sparse data sets, where verification is critical and optimization is very important.
White pointed out that few large data sets exist, that verification is essential to everything done in EDA/SoC design, and that optimization plays a role in finding solutions for large designs. The ML/DL space also covers a few different technologies, such as optimization and analytics. He noted that these approaches can be computationally heavy, so massively parallel optimization is used to get the performance back. In the development of design automation solutions, uncertainty arises in one of two forms:
- Factors/features that are unobservable
- Factors/features that are observable but change over time.
Design intent is not always captured in EDA tools: designers have an objective and intention in mind and then tune toward an acceptable solution. This can be problematic at the most recent silicon technologies, where uncertainty is greatest and there is a low volume of designs to learn from. The goal is to use AI technology and tools to learn from a prior design database, to explore, and to reach an acceptable solution. At PCB West 2018, Intel presented auto-router results in which a run took 120 hours, whereas AI-based smart routing brought the runtime down to 30 minutes, roughly a 240x reduction.
There are five challenges for intelligent electronic design:
1. Developing real-time continuous learning systems:
- Uncertainty requires the ability to adapt quickly
- Limited observability requires ways to determine design intent
2. Creation of contextual learning for hierarchical decision structures:
A designer makes a series of decisions to design a chip, package or board, and those decisions drive a number of sub-goals. This leads to a number of complicated objective functions, or a complicated optimization problem, that must be solved in order to automate large chunks of the design flow.
3. Robust flexibility and verification:
Most designs are used behind firewalls, and solutions need autonomy. Formalized verification processes are needed to ensure stable learning and inference. Robust optimization approaches are needed to ensure stable decisions.
4. Cold start issues:
Learning and model development are difficult when a new silicon technology is ramped. Typically very little data is available and there is no model to transfer. This is typical of early silicon nodes (like 7nm), when there are few designs to learn from and overall uncertainty is largest.
5. Synthesizing cost functions to drive large-scale optimization is complex and difficult.
II. Design Space Exploration Models for Analyzing Large System Designs
The second session addressed Design Space Exploration with Polynomial Chaos Surrogate Models for Analyzing Large System Designs. Cadence is collaborating with and supporting the academic work that was presented in that session.
Design space exploration usually involves tuning multiple parameters. Traditional approaches (sweeping, Monte Carlo) are time-consuming, costly, and non-optimal. The challenge is quantifying uncertainty from unmeasurable sources. Polynomial Chaos (PC) provides more efficient uncertainty quantification methods, but it runs into the curse of dimensionality (too many parameters to track, which may or may not be significant). Because the size of the PC surrogate model increases near-exponentially with the number of variables, the less important variables that have a negligible effect on the output can be removed through dimension reduction as follows:
• Only sensitive variables are considered as random.
• The rest are fixed at their average value.
• A full PC model is developed based on the selected terms.
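To get a feel for why this dimension reduction matters, the short Python sketch below counts the terms in a total-degree PC expansion, which for n random variables and polynomial order p is the binomial coefficient C(n + p, p). The variable counts (20 parameters reduced to 5) and the third-order expansion are illustrative assumptions, not figures from the session.

```python
from math import comb


def pc_basis_size(num_vars: int, order: int) -> int:
    """Number of terms in a total-degree PC expansion: C(n + p, p)."""
    return comb(num_vars + order, order)


# Hypothetical example: 20 design parameters, third-order expansion.
full = pc_basis_size(20, 3)     # all 20 parameters treated as random
reduced = pc_basis_size(5, 3)   # only 5 sensitive parameters kept as random

print(full, reduced)  # 1771 56
```

Each term needs at least one training simulation to fit, so trimming the random variables shrinks the simulation budget as well as the model.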
Polynomial Chaos theory was presented (with intimidating math that was well explained including sensitivity analysis). A multi-stage approach for developing surrogate models was proposed and goes as follows:
• First, a simplified Polynomial Chaos (PC) model is developed.
• The simplified model is used for sensitivity analysis.
• Sensitivity analysis results are used for dimension reduction.
• The sensitivity of different ranges of variables is evaluated.
• Training samples are placed based on the results.
• A full PC surrogate model is developed and used for design space exploration.
A numerical example with a DDR4 topology was presented for validation, with results summarized in a table and diagram.
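The multi-stage flow listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the presenters' implementation: `toy_response` is a hypothetical stand-in for an expensive channel simulation, the inputs are assumed to be independent standard normal variables (so probabilists' Hermite polynomials form the PC basis), the coefficients are fitted by ordinary least squares, and the 5% sensitivity threshold is arbitrary. The range-dependent sensitivity and sample placement steps are omitted for brevity.

```python
import itertools

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def toy_response(x):
    """Hypothetical stand-in for an expensive simulation with 4 inputs."""
    return (1.0 + 2.0 * x[:, 0] + 0.5 * x[:, 0] ** 2 - 1.5 * x[:, 2]
            + 0.3 * x[:, 0] * x[:, 2] + 0.01 * x[:, 1])


def multi_indices(num_vars, order):
    """All exponent tuples with total degree <= order."""
    return [a for a in itertools.product(range(order + 1), repeat=num_vars)
            if sum(a) <= order]


def design_matrix(x, indices):
    """Products of probabilists' Hermite polynomials evaluated at samples x."""
    cols = []
    for alpha in indices:
        col = np.ones(len(x))
        for j, deg in enumerate(alpha):
            coeffs = np.zeros(deg + 1)
            coeffs[deg] = 1.0          # selects He_deg
            col *= hermeval(x[:, j], coeffs)
        cols.append(col)
    return np.column_stack(cols)


def fit_pc(x, y, order):
    """Least-squares fit of PC coefficients for the given order."""
    indices = multi_indices(x.shape[1], order)
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, indices), y, rcond=None)
    return indices, coeffs


rng = np.random.default_rng(0)
num_vars = 4

# Stage 1: simplified (first-order) PC model from a small sample set.
x_small = rng.standard_normal((60, num_vars))
idx1, c1 = fit_pc(x_small, toy_response(x_small), order=1)

# Stage 2: sensitivity = share of first-order variance (c_i^2, since He_1 has unit norm).
sens = np.zeros(num_vars)
for alpha, c in zip(idx1, c1):
    if sum(alpha) == 1:
        sens[alpha.index(1)] = c ** 2
sens /= sens.sum()

# Stage 3: dimension reduction; variables below the (assumed) 5% threshold are dropped.
keep = np.flatnonzero(sens > 0.05)

# Stage 4: full second-order PC model on the kept variables, others fixed at their mean (0).
x_full = np.zeros((200, num_vars))
x_full[:, keep] = rng.standard_normal((200, len(keep)))
idx_full, c_full = fit_pc(x_full[:, keep], toy_response(x_full), order=2)


def predict(x_reduced):
    """Evaluate the full PC surrogate on the reduced variable set."""
    return design_matrix(x_reduced, idx_full) @ c_full


print("kept variables:", keep.tolist())
```

Once fitted, `predict` is cheap enough to call many times, which is what makes design space exploration on the surrogate practical.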
I had a chance to chat with Ambrish Varma (Sr. Principal Software Engineer), who works in the Sigrity High-Speed analysis division, and Ken Willis (product engineering architect, signal integrity). Their products cover system-level topologies end to end, from transmitters to receivers, not just for SerDes but also for parallel buses. Anything on the board can be extracted, and models can be made for the transmitter and receiver, so pre-layout and post-layout simulations can be done. Now, one can use machine learning algorithms to speed up the simulations. Even if each simulation takes only 30 or 90 seconds, a million of them takes weeks. One needs to figure out which parts of the SerDes to focus on. One could make a model of the layout and then never be able to run a simulation. The R&D here is the first foray into smart technology for simulation analysis.
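A quick back-of-the-envelope check of that simulation budget, as a Python sketch. The count of 16 parallel jobs is a hypothetical figure chosen for illustration; the 30-second and 90-second run times and the million-run sweep come from the conversation.

```python
SECONDS_PER_DAY = 86_400


def sweep_days(num_runs, seconds_per_run, parallel_jobs=1):
    """Wall-clock days needed to complete a simulation sweep."""
    return num_runs * seconds_per_run / parallel_jobs / SECONDS_PER_DAY


print(round(sweep_days(1_000_000, 30)))       # 347 days if run serially
print(round(sweep_days(1_000_000, 30, 16)))   # 22 days across 16 parallel jobs
print(round(sweep_days(1_000_000, 90, 16)))   # 65 days across 16 parallel jobs
```

Even with parallel compute, a brute-force sweep stays in the range of weeks to months, which is the motivation for focusing on the parameters that matter.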
ML gathers the data and trains on it, and to ensure the evaluation is not biased by the training data, the test uses random data. You then decide which parameters and variables to focus on. This is the first phase of the analysis. Next you abstract to a behavioral model, so that a simulation lasts a couple of minutes, and with more training data you can dial in the accuracy. Final results agree with the predicted values to within 1%. When sensitivity analysis is run, the models developed have an objective function or criterion. They use a metric called NJN, Normalized Jitter Noise, which measures how open or closed an eye is within one unit interval, but the metric could also be overshoot, channel operating margin, power ripple, or signal-to-noise ratio.
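The idea of dialing in accuracy with more training data, and checking it on randomly drawn test points, can be sketched as below. Everything here is an assumption for illustration: `reference_metric` is a hypothetical stand-in for a simulated metric, the surrogate is plain linear interpolation rather than the team's model, and the training points sit on a regular grid that is refined until the worst-case relative error on the random test set drops to 1% or less.

```python
import numpy as np


def reference_metric(x):
    """Hypothetical stand-in for a simulated metric (always positive)."""
    return 1.0 + 0.3 * np.sin(2.0 * x)


def max_relative_error(n_train, x_test):
    """Train an interpolation surrogate on n_train points, score on x_test."""
    x_train = np.linspace(0.0, 3.0, n_train)
    y_train = reference_metric(x_train)
    y_pred = np.interp(x_test, x_train, y_train)
    y_true = reference_metric(x_test)
    return float(np.max(np.abs(y_pred - y_true) / np.abs(y_true)))


# Test points are drawn at random, independently of the training grid.
rng = np.random.default_rng(0)
x_test = rng.uniform(0.0, 3.0, size=1000)

n_train = 5
while max_relative_error(n_train, x_test) > 0.01:
    n_train = 2 * n_train - 1   # halve the grid spacing, reusing old points

print(n_train, max_relative_error(n_train, x_test))
```

The same loop structure applies to any surrogate: keep the test set fixed and independent, and add training data until the error target is met.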
Picking that objective function is important, because the sensitivity analysis can then focus on the major contributors. Cadence is helping academia as part of a consortium of industry and three universities: Georgia Tech, NC State and UIUC. This is still in the research stage and no release to production has occurred yet. One can tune the R, L, and C values, and the sensitivity analysis helps in choosing the optimum setting. A model will be part of a library of use cases. Design reuse is enhanced with physical content: a snippet of layout, logic, or netlist. If those reusable blocks are augmented with ML models for different objective functions, you can leverage the analysis in the reuse. It is possible that the ML models get standardized so that they can be used across all EDA tools. The solution space will have different designs with models that can be standardized. Whole solutions could be tool-based or tool-specific.
Cooperation with academia and making the tool smarter are objectives, for example minimizing the input required from the user. A design cell is used as input and the analysis currently runs locally, at the edge, but one can imagine computations and sampling being sent to an engine in the cloud, which would return the data. A one-step, push-button, computationally intensive flow can be envisioned moving forward. The team is working on firming up the model with tangible applications in mind. There is a tendency to think that this is replacing traditional methods. It is, however, more an augmentation than a replacement. Advanced analysis becomes much more democratized, more simulation will be needed in the future, and this capability comes at the right time.
Majid Ahadi Dolatsara(1), Ambrish Varma(2), Kumar Keshavan(2), and Madhavan Swaminathan(1)
(1) Department of Electrical and Computer Engineering, Georgia Institute of Technology; Center for Co-Design of Chip, Package, System (C3PS); Center for Advanced Electronics Through Machine Learning (CAEML)
(2) Cadence