Leveraging neural nets and CR testing isn’t as simple as we first thought. But is that the last word in combining these two techniques? Paul Cunningham (GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO) and I continue our series on research ideas. As always, feedback welcome.
This month’s pick is Automation of Processor Verification Using Recurrent Neural Networks. The paper was presented at the 2017 18th International Workshop on Microprocessor and SOC Test and Verification. The authors are from Brno University of Technology, Czech Republic.
The authors start with the reasonable view that, as coverage improves, constrained random (CR) generation will produce more redundant tests for diminishing return. Their paper focuses on a CR CPU instruction generator with a probability-controlled distribution on a fixed set of constraints. The flow generates and runs multiple tests, recording coverage for each run. A neural-network (NN) algorithm uses this information to adjust the generator controls. The whole process repeats for some number of cycles. The authors pre-determine the NN weights by a deductive method rather than by training. One set of weights is essentially random; another is based on the grammatical structure of the CPU ISA.
The authors test their method against two 32-bit RISC CPUs from Codasip, one at 16k gates, another production core at 24k gates. They compare results between their two NNs, a genetic algorithm, and a default CR pattern generator without interference. The NN methods achieve about 5% higher coverage for the same runtime budget versus the default. For higher coverage levels after the knee of the coverage ramp curve, the genetic algorithm does no better than the default.
Reading this paper led me down a wondrous path into early works on neural networks from the 1980s by JJ Hopfield at Caltech’s department of chemistry and biology.
In those papers there was no NN training. Hopfield deductively constructed NN topologies and weights from first principles to solve a particular problem. And he solved some cool problems, including a content-addressable memory and the traveling salesman problem.
In the subject paper of this month’s blog, our authors take inspiration from Hopfield and attempt to deductively construct an NN, here using a topology and weights based on the grammar of a CPU ISA, to increase coverage from a constrained-random CPU instruction generator. It’s a neat idea, but ultimately it doesn’t seem to add any value beyond their control NN. That NN is essentially just a balanced set of random +1 or -1 weights between nodes in their network.
However, what is intriguing from their work is that even the control NN improves coverage significantly compared to the default instruction generator. This control NN can be likened to running the constrained random generator in short bursts, each time randomly adjusting some control knob and either keeping or undoing the knob adjustment depending on whether it improved coverage or not. In essence, if you have some control knobs to a constrained random instruction generator, it’s a good idea to tweak them periodically, even if this tweaking is basically random 🙂
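As a rough illustration of that tweak-and-keep loop, here is a minimal Python sketch. It is not the authors' NN or their generator: the instruction names, the pair-of-consecutive-instructions coverage bins, the halve-or-double tweak and the helper names `simulate_burst` and `tweak_and_keep` are all assumptions made up for the example, and the "simulation" is a toy stand-in.

```python
import random


def simulate_burst(knobs, n_instr, rng):
    """Toy stand-in for one short simulation burst (hypothetical).

    Draws n_instr instructions according to the knob weights and reports
    each pair of consecutive instructions as one hit coverage bin.
    """
    ops = sorted(knobs)
    seq = rng.choices(ops, weights=[knobs[o] for o in ops], k=n_instr)
    return set(zip(seq, seq[1:]))


def tweak_and_keep(initial_knobs, bursts=50, n_instr=40, seed=0):
    """Randomly adjust one knob per burst; keep the change only if it helped."""
    rng = random.Random(seed)
    knobs = dict(initial_knobs)
    covered = set()
    history = []  # cumulative number of covered bins after each burst
    for _ in range(bursts):
        name = rng.choice(sorted(knobs))
        old = knobs[name]
        # Random tweak: halve or double the weight, with a small floor.
        knobs[name] = max(0.05, old * rng.choice([0.5, 2.0]))
        new_bins = simulate_burst(knobs, n_instr, rng) - covered
        covered |= new_bins
        if not new_bins:
            knobs[name] = old  # no new coverage, so undo the adjustment
        history.append(len(covered))
    return knobs, covered, history


if __name__ == "__main__":
    start = {"add": 1.0, "load": 1.0, "store": 1.0, "branch": 1.0}
    knobs, covered, history = tweak_and_keep(start)
    print("final knobs:", knobs)
    print("bins covered:", len(covered), "of", len(start) ** 2)
```

In a real flow the toy burst would be replaced by an actual simulation run and its coverage database; the keep-or-undo decision is the only part the sketch is meant to show.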
One thing I should add is that applying modern ML techniques to make constrained-random stimulus generation smarter is a very hot and active topic in commercial EDA today, either to achieve higher coverage or to achieve the same coverage with dramatically less compute. It absolutely works, and chip and system companies are starting to adopt it widely in production.
First, I like that this method is “non-invasive”. It aligns well with existing verification flows (this paper uses a standard UVM flow). That makes the approach incremental and practical for production use. The approach consists of generating changes to constraints for a pseudo-random generator (PRG), having the PRG generate a set of stimuli, simulating these stimuli, and using the collected simulation data to evaluate the objective function (various types of coverage metrics), in the process optimizing neuron settings.
For the two Codasip processors they run multiple experiments with NNs of 41 and 1020 neurons respectively. From these they determine initial NN states, sigmoid steepness and epoch length. In my view, results are mixed. Compared with the other methods (Default and Genetic Algorithm), the NN is slightly favorable, as measured by the coverage reached. But this comes at the cost of building an RNN. They also compute an “optimal” (small with high coverage) set of stimuli, useful for regression tests. I am worried that the size of the RNN increases significantly for a small increase in design size, and that coverage diminishes for the larger design.
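To make the idea of a small, high-coverage regression set concrete, here is a minimal Python sketch using a greedy set-cover heuristic. The review does not say how the authors compute their set, so this is a swapped-in, well-known technique rather than their method; the function name `select_regression_set`, the test names and the bin numbers are hypothetical.

```python
def select_regression_set(test_coverage):
    """Greedy set cover over coverage bins (illustrative, not the paper's method).

    test_coverage maps a test name to the set of coverage bins it hits.
    Repeatedly picks the test that adds the most not-yet-covered bins.
    """
    remaining = set().union(*test_coverage.values())
    chosen = []
    while remaining:
        # sorted() makes tie-breaking deterministic (first name wins).
        best = max(sorted(test_coverage),
                   key=lambda t: len(test_coverage[t] & remaining))
        gained = test_coverage[best] & remaining
        if not gained:  # defensive: nothing left that any test can add
            break
        chosen.append(best)
        remaining -= gained
    return chosen


if __name__ == "__main__":
    runs = {
        "t1": {1, 2, 3},
        "t2": {3, 4},
        "t3": {4},
        "t4": {1, 2},
    }
    print(select_regression_set(runs))  # ['t1', 't2']
```

Greedy selection does not guarantee the smallest possible set, but it is simple and usually gives a compact one.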
That said, this is very interesting research. Today’s commercial EDA tools are starting to incorporate Machine Learning, and PRG is an application that will likely benefit.
This blog nicely underlines the point of these reviews. Our goal is not to add yet another paper review. It’s to look for intriguing insights. Even if, as in this case, they’re not necessarily the main point of the paper.