Can machine learning be combined with formal methods to find rare failures in analog designs? ML plus formal for analog – neat! Paul Cunningham (GM, Verification at Cadence), Jim Hogan and I continue our series on research ideas. This one is an idea on smarter sampling for analog simulation. Feel free to comment.
This month’s pick is HFMV: Hybridizing Formal Methods and Machine Learning for Verification of Analog and Mixed-Signal Circuits. The paper was presented at DAC 2018. The authors are from Texas A&M University.
The authors propose a novel approach to analog verification: a machine learning model is built over a relatively small number of simulation samples, then SMT-based formal analysis is applied to refine that model.
The paper analyzes three analog circuits with O(20) transistors. Each transistor has multiple process and electrical parameters with statistical variation, making for an O(60)-dimensional analysis.
The approach iteratively generates a set of parameter sample points with a goal to span a target n-sigma confidence of circuit correctness. At each iteration, they refine an ML-based failure model based on simulating the sample points from the previous iteration. They then linearize this ML model and pass it to an SMT solver, which picks the next batch of sample points to simulate. You can think of the SMT solver as conceptually “inverting” the ML model: picking parameter sample points that the model believes have a high probability of causing the circuit to fail.
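To make the loop concrete, here is a minimal sketch of the refine-invert-resample idea. Everything in it is an assumption for illustration: the “circuit” is a toy threshold function rather than a SPICE run, the failure model is a simple linear least-squares fit rather than the paper’s ML model, and the SMT step is replaced by its closed-form analogue (inside a hyper-cube, a linear failure score is maximized at a corner). It is not the authors’ implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, SIGMA_MAX = 6, 4.0  # toy: 6 parameters, search inside a +/-4-sigma box

def simulate(x):
    """Stand-in for a SPICE simulation: 'fails' when a weighted parameter
    sum crosses a threshold. Purely illustrative, not a real circuit."""
    return x @ np.array([0.9, -0.4, 0.2, 0.0, 0.7, -0.1]) > 2.0

# Iteration 0: a small random sample, labelled by simulation
X = rng.standard_normal((200, DIM))
y = np.array([simulate(x) for x in X])

for it in range(3):
    # 1. Refine a failure model (here a linear fit) on all samples so far
    w, *_ = np.linalg.lstsq(X, y.astype(float) - y.mean(), rcond=None)
    # 2. "Invert" the linearized model: inside the hyper-cube the failure
    #    score w.x is maximized at the corner sign(w)*SIGMA_MAX.
    #    (HFMV does this step with an SMT solver.)
    x_star = np.sign(w) * SIGMA_MAX
    # 3. Simulate a new batch of points near the predicted failure region
    batch = x_star + 0.3 * rng.standard_normal((50, DIM))
    X = np.vstack([X, batch])
    y = np.concatenate([y, [simulate(x) for x in batch]])

print("failures found:", int(y.sum()), "of", len(y), "simulations")
```

Even in this crude form, the later batches concentrate heavily in the failure region, which is the intuition behind the paper’s sample-efficiency claims.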
The authors present results showing their approach achieves 4-sigma confidence with 100X fewer simulations than Monte Carlo, and 10X fewer simulations than a modern importance-based sampling technique (SSS).
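For context on what those sigma levels mean for brute-force sampling, the one-sided Gaussian tail probability Q(n) = 0.5 * erfc(n / sqrt(2)) gives the failure rate at an n-sigma target, and roughly 1/Q(n) Monte Carlo runs are needed to expect even one failure. This back-of-envelope calculation (not from the paper) shows why plain Monte Carlo is hopeless at high sigma:

```python
from math import erfc, sqrt

def tail_prob(n_sigma):
    """One-sided Gaussian tail probability Q(n) = P(X > n*sigma)."""
    return 0.5 * erfc(n_sigma / sqrt(2))

for n in (4, 6, 8):
    p = tail_prob(n)
    print(f"{n}-sigma: failure rate ~{p:.2e}, "
          f"~{1 / p:.1e} Monte Carlo runs to expect one failure")
```

At 4-sigma the failure rate is about 3.2e-5 (tens of thousands of runs per failure); at 8-sigma it is around 6e-16, far beyond any feasible simulation budget, which is exactly the regime HFMV targets.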
What a fun paper! Combine ML with SMT to find n-sigma process variation bugs in analog circuits. The results also look compelling, finding 8-sigma failures with 10X fewer simulations than SSS, a respected importance-sampling method for high-dimensionality problems.
That said, I don’t think that SSS and HFMV are solving quite the same problem: SSS is focused on picking an efficient and representative set of sample points to measure the failure rate with some target confidence level. HFMV is using ML and SMT to intelligently search for sample points which will cause the circuit to fail.
The magic for me in HFMV is the use of SMT to invert the ML failure model – i.e., to find a process variation point for all the transistors in the circuit which, when applied to the ML failure model, results in a circuit failure. I worry a bit that the SMT formulation given in the paper is being satisfied by solutions where all the transistors are at the extremes of their process variation ranges. The paper refers to bounding the SMT search to an n-dimensional hyper-cube, which I believe means bounding each transistor’s process variation independently to 8-sigma. Bounding the composite multi-dimensional distribution of all transistors’ process variations to 8-sigma would instead be an n-dimensional hyper-sphere.
A failure where multiple transistors are all at 8-sigma is way beyond 8-sigma for the circuit as a whole, and so may not be an interesting failure to detect. I wonder if it would be possible to add a constraint to the SMT expression to restrict solutions to the 8-sigma hyper-sphere for the whole circuit? Failures found inside this hyper-sphere would be true 8-sigma failures for the circuit.
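The hyper-cube versus hyper-sphere gap is easy to quantify. Treating the composite n-sigma level as the Euclidean norm of the standardized parameter vector, as the hyper-sphere framing suggests, a corner of the 60-dimensional 8-sigma hyper-cube sits at 8*sqrt(60), roughly 62 sigma from nominal. A short sketch (my own illustration, not from the paper):

```python
import numpy as np

DIM, N_SIGMA = 60, 8.0

# Corner of the hyper-cube: every parameter independently at 8-sigma
corner = np.full(DIM, N_SIGMA)
print("corner distance from nominal:", np.linalg.norm(corner))  # ~62 sigma

def in_sphere(x, n_sigma=N_SIGMA):
    """Hyper-sphere constraint: composite deviation bounded at n-sigma."""
    return np.linalg.norm(x) <= n_sigma

print(in_sphere(corner))  # False: far beyond circuit-level 8-sigma
print(in_sphere(np.r_[N_SIGMA, np.zeros(DIM - 1)]))  # True: boundary case
```

An `in_sphere`-style bound is exactly the kind of extra constraint I am suggesting the SMT formulation could carry, so that satisfying assignments stay inside the circuit-level 8-sigma region rather than piling every transistor into its own extreme.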
Overall, a very innovative and thought-provoking paper.
This reminds me of Solido, which used Monte Carlo with ML to find corners without needing huge numbers of simulations. Back then our challenge was to make sure we were covering those corners with confidence. I talked to some of my expert analog buddies to get their take. Their view was that this is nice, but can the authors do it in less time than other methods? They thought it would take a deeper dive to figure out the added benefit of this method, which is intriguing but not yet investable in my view.
I would like to see more analysis and characterization of the failures they found. First, validate each failure through some independent method of analysis to make sure none are false positives. Second, discuss how the failure distribution maps against, say, a multi-variate normal distribution, back to Paul’s point.
You can read the previous Innovation in Verification blog HERE.