We’re always looking for ways to leverage machine learning (ML) in coverage refinement. Here is an intriguing approach proposed by Google Research. Paul Cunningham (GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.
This month’s pick is Learning Semantic Representations to Verify Hardware Designs. The paper was published at NeurIPS 2021. The authors are from Google and Google Research.
The research uses simulation data as training input to learn a representation of the currently covered subset of a circuit’s state transition graph (STG). At inference time, the method uses this representation to predict whether a newly defined test can hit new cover points, much faster than running the corresponding simulation. The architecture of the reported tool, Design2Vec, blends a graph neural network (GNN) reasoning about the RTL control-data-flow graph (CDFG) structure with an RNN reasoning about sequential evolution through the STG.
The paper positions Design2Vec as an augment to a constrained-random (CR) vector generation process. The method generates CR vectors as usual, then ranks these using a gradient ascent algorithm to maximize the probability of covering target cover points. The simulator then runs tests with highest predicted coverage.
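As a minimal sketch of that select-by-prediction flow (not the authors’ code; `predict_cover_prob` is a hypothetical stand-in for the trained Design2Vec model, here just a fixed logistic scorer):

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_cover_prob(test_params):
    # Stand-in for a trained Design2Vec-style surrogate: maps each
    # test's parameter vector to a predicted probability of hitting
    # the target cover point. Illustrative fixed logistic model only.
    w = np.linspace(-1.0, 1.0, test_params.shape[-1])
    return 1.0 / (1.0 + np.exp(-test_params @ w))

def rank_cr_tests(candidate_tests, top_k=3):
    # Score every constrained-random candidate with the surrogate,
    # then keep only the top_k highest-predicted-coverage tests
    # for (expensive) actual simulation.
    scores = predict_cover_prob(candidate_tests)
    order = np.argsort(scores)[::-1]
    return order[:top_k], scores[order[:top_k]]

candidates = rng.normal(size=(100, 8))  # 100 CR tests, 8 knobs each
best_idx, best_scores = rank_cr_tests(candidates)
print(best_idx, best_scores)
```

The point of the sketch is the economics: one cheap model evaluation per candidate replaces one simulation per candidate, and only the shortlisted tests ever reach the simulator.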
The authors detail evaluations across two RISC-V-based designs plus the Google TPU, and show compelling results in improving coverage over constrained-random methods alone.
This is a great paper, on a center-stage topic in commercial EDA today. The paper studies two very practical opportunities to use ML in mainstream digital verification: first, using ML as a rapid, low-cost way to predict the coverage a test will achieve; and second, using ML to automatically tune test parameters to maximize coverage.
On the first, the paper eloquently demonstrates that predicting coverage without understanding anything about the design (treating the design as a black box) doesn’t work very well (50% accuracy across three testcases). However, if features derived from the design’s control-data-flow graph (CDFG) are also fed into the predictor, then it can work quite well (80-90% accuracy across the same testcases).
The way the CDFG is modeled in their neural network is very slick, building incrementally on other published work that models software program control flow in a neural network using a softmax function.
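To make the softmax idea concrete, here is a toy illustration of the general soft-instruction-pointer technique (my own simplified example, not the paper’s RTL IPA-GNN): a probability distribution over control-flow-graph nodes is propagated each step, with each node’s mass split across its successors by softmax branch weights. The graph and logits below are made up; in the real model the logits come from learned embeddings.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

# Toy 4-node control-flow graph: node 0 branches to 1 or 2,
# both fall through to 3; node 3 (exit) loops to itself.
successors = [[1, 2], [3], [3], [3]]

# Hypothetical branch logits per node (fixed here; learned in practice).
branch_logits = [np.array([0.5, -0.5]), np.array([0.0]),
                 np.array([0.0]), np.array([0.0])]

def step(p):
    # One soft-instruction-pointer step: each node's probability mass
    # is redistributed to its successors via softmax branch weights.
    new_p = np.zeros_like(p)
    for node, succs in enumerate(successors):
        weights = softmax(branch_logits[node])
        for succ, w in zip(succs, weights):
            new_p[succ] += p[node] * w
    return new_p

p = np.array([1.0, 0.0, 0.0, 0.0])  # pointer starts at node 0
for _ in range(3):
    p = step(p)
print(p)  # probability mass accumulates at the exit node
```

Because the softmax keeps every branch weight positive and summing to one, the whole propagation stays differentiable, which is what lets coverage gradients flow back through the control structure.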
On the second opportunity, they compare their CDFG-based neural network with another tool that uses an entirely black-box algorithm based on Bayesian optimization. Here the results are less conclusive, showing data for only one testcase, and for this case showing only marginal benefit from the CDFG-based neural network over Bayesian optimization.
Stepping back for a moment, I believe there are huge opportunities to use ML to improve coverage and productivity in digital verification. We are investing heavily in this area at Cadence. I applaud the Google authors of this paper for investing and sharing their insights. Thank you!
The authors address the problem of coverage: hard-to-cover branches and generating tests to cover them. Their approach is to train a model to predict whether a cover point is activated by an input test vector. The CDFG architecture is captured by four different graph neural networks, of which an enhanced IPA-GNN (Instruction Pointer Attention Graph Neural Network), called RTL IPA-GNN, performs marginally best.
Design2Vec is also used to generate tests for given cover points. The method uses the predicted probability in a gradient-based search to maximize detection probability. Generated tests are then run through an RTL simulator to get the actual coverage. Results compared to Vizier, a Google tool using Bayesian optimization, are not conclusively superior.
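A toy version of that gradient-based search, assuming a differentiable surrogate for coverage probability (the logistic `cover_prob` below is illustrative, not the paper’s model): hold the trained model fixed and run gradient ascent on the test parameters themselves.

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])  # hypothetical frozen model weights

def cover_prob(x):
    # Stand-in differentiable surrogate for "probability that test x
    # hits the target cover point" (a simple logistic model here).
    return 1.0 / (1.0 + np.exp(-x @ w))

def grad_cover_prob(x):
    # Analytic gradient of the logistic surrogate w.r.t. the test knobs.
    p = cover_prob(x)
    return p * (1.0 - p) * w

def optimize_test(x0, lr=0.5, steps=50):
    # Gradient ascent on the test parameters, model weights held fixed.
    x = x0.copy()
    for _ in range(steps):
        x += lr * grad_cover_prob(x)
    return x

x0 = np.zeros(3)
x_opt = optimize_test(x0)
print(cover_prob(x0), cover_prob(x_opt))  # predicted probability rises
```

In practice the optimized parameters would still be fed to the RTL simulator, as the authors do, since the surrogate’s prediction is only a proxy for real coverage.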
They ran coverage prediction experiments on two small RISC-V cores and a TPU. They look at points covered by 10-90% of random tests, to exclude trivial cases. The authors compare results against three baselines: statistical frequency of random patterns; a multi-layer perceptron (MLP) treating the design as a black box; and an MLP with node sequence embedding, allowing generalization across cover points.
Design2Vec beats the other approaches by 20% (against statistical frequency) down to about 3% (against node sequence embedding). Notably, the black-box MLP approach does worse than statistical frequency on the TPU (the large design); in the words of the authors, “the MLP performs catastrophically poorly on the test data”. For me, the main insight is that embedding the design in the architecture is key to building a semantic representation.
The authors stress the “potential of deep learning to make a quantum leap in progress in the area of verification”. Their results back this up.
If you download this paper, you may notice that it is missing some appendices. The appendices are useful though not essential for full understanding. You might find that this live presentation bridges the gap.