Mining assertions from constrained random simulations to localize bugs. Paul Cunningham (GM, Verification at Cadence), Jim Hogan and I continue our series on research ideas. Feel free to comment.
This month’s pick is Symptomatic bug localization for functional debug of hardware designs. This paper was presented at the 2016 ICES. The authors are from the University of Illinois at Urbana Champaign.
There’s wide agreement that tracing bugs to their root cause consumes more effort than any other phase of verification. Methods to reduce this effort are always worth a close look. The authors start with constrained random (CR) simulations. They mine failing simulations for likely invariant assertions, which they call symptoms. These they infer from multiple CR simulations based on common observed behaviors. They use an open-source miner developed in the same group, GoldMine, for this purpose.
Then for each failure they look for commonalities between assertions, looking for common execution paths among symptoms. They claim that common symptoms signal a highly suspicious path which likely localize the failure.
One way to determine what code is covered by an assertion is through static cone-of-influence analysis. These authors instead use simulations to determine dynamically what statements are relevant. These they assume are statements executed in the time slice from the left-hand-side of the assertion becoming true and the right-hand-side becoming true. They acknowledge dynamic analysis is incomplete for this purpose, though in their experiments they say they found all simulated all statements in scope.
The authors ran experiments on a USB 2.0 core and were able to localize bugs to within 5% of the RTL source code lines in one case and within 15% on average.
I first would like to acknowledge that this paper is one of a series of high-quality papers from Dr. Vasudevan’s team at the University of Illinois, building on their GoldMine assertion mining tool. Lots of inspiring research here which I very much enjoyed learning about!
The paper nicely brings together three techniques. In the first phase, they look for common patterns in failure traces using GoldMine. The patterns found are in the form of “A implies B” assertions. Each such assertion identifies some time slice from the trace where whenever a sequence of events “A” happens to some other sequence “B” of events that happens later.
Second, they intersect these assertions mined across all the failure traces to find the important signatures that apply to all failure traces.
Finally, they map the assertions back to lines in the source RTL by re-running the failing simulations and tracking which RTL lines are executed in the time between “A” and “B” happening in the trace.
Overall, this is a very elegant and scalable method to localize faults. On their USB controller example, they localized half of the 22 bugs inserted to within 15% of the source code base, which is impressive.
I struggled a bit with the importance/sensitivity analysis in Figure 4. I expected to see some visual correlation between code zone importance and the actual code zone where the bug was injected but this didn’t seem to be the case.
The other thought I have, which would be a fun and easy experiment: in the second phase, check the signature assertions against traces for good simulations that passed. Then prune any assertion that also matches a good simulation trace. This might improve the quality of the important signature list further.
By default, I would normally look at this and think “part of verification, no new money there, not investable”. However, debug probably has the least EDA support today. It’s also the most expensive part of verification in time consumed by verification experts. And tools in this area can be very sticky. I’m thinking particularly of Verdi. There, users developed a strong loyalty, quite likely because they spend so much time in debug.
Now I think a new capability that could significantly enhance the value of a debugger – reducing total debug time – could attract a lot of interest. I’d want to see proof of course, but I think there might be a case.
Before Atrenta was acquired by Synopsys, we acquired a company that did assertion mining. As a standalone capability, generating new assertions for direct verification, it seemed to struggle find traction. This is a different spin on the technique. The assertions are not the end goal but a means to localize bugs. Clever idea and maybe more immediate market appeal.
You can read the previous Innovation blog HERE.Share this post via: