WP_Term Object
(
    [term_id] => 15
    [name] => Cadence
    [slug] => cadence
    [term_group] => 0
    [term_taxonomy_id] => 15
    [taxonomy] => category
    [description] => 
    [parent] => 157
    [count] => 484
    [filter] => raw
    [cat_ID] => 15
    [category_count] => 484
    [category_description] => 
    [cat_name] => Cadence
    [category_nicename] => cadence
    [category_parent] => 157
)
            
14173 SemiWiki Banner 800x1001
WP_Term Object
(
    [term_id] => 15
    [name] => Cadence
    [slug] => cadence
    [term_group] => 0
    [term_taxonomy_id] => 15
    [taxonomy] => category
    [description] => 
    [parent] => 157
    [count] => 484
    [filter] => raw
    [cat_ID] => 15
    [category_count] => 484
    [category_description] => 
    [cat_name] => Cadence
    [category_nicename] => cadence
    [category_parent] => 157
)

Finding Large Coverage Holes. Innovation in Verification

Finding Large Coverage Holes. Innovation in Verification
by Bernard Murphy on 02-24-2021 at 6:00 am

Is it possible to find and prioritize holes in coverage through AI-based analytics on coverage data? Paul Cunningham (GM, Verification at Cadence), Jim Hogan and I continue our series on research ideas. As always, feedback welcome.

Finding Large Coverage Holes

The Innovation

This month’s pick is Using Machine Learning Clustering To Find Large Coverage Holes. This paper was presented at Machine Learning for CAD, 2020. The authors are from IBM Research, Haifa, Israel.

Improving coverage starts with knowing where you need to improve, especially where you may have significant holes. Getting to what you might call good scalar coverage (covered functions, statements, and the like) is fairly mechanical. Assertions provide a set of more complex checks on interdependencies, high value but necessarily low coverage. These authors look at cross-product checks, relationships between events, somewhat reminiscent of our first blog topic.

It is important first to understand what the authors means by a cross-product coverage task. This might be say a <request,response> pair where <request> may be one of memory_read, memory_write, IO_read, IO_write and <response> may be ack, nack, retry, reject. Coverage is then over all feasible combinations.

Events are assumed related through naming. In their convention, reg_msr_data_read breaks into {reg,msr,data,read} which is close to {reg,msr,data,write}, not quite as close to {reg,pcr,data,write}. (You could easily adapt to different naming conventions.) From these groups they run K-means clustering analysis to group features (reg, msr, etc).

From these clusters, they build cross-product structures. This starts with sets of feature locations, counting from start and end of an event. Then finding anchors, most commonly occurring, and therefore likely most significant features in events (reg for example). The authors call groups of features falling between these anchors dimensions. Though not quite explicit in the paper, it seems these provide a basis for probable event combinations which ought to be covered. From that they can then monitor covered and non-covered events. Better yet, they can provide very descriptive guidance on which combinations they expected to see covered but did not.

Paul’s view

The depth of this paper can be easy to miss on a quick read. It’s actually very thought provoking and draws on ML techniques in text document classification to help with verification. Very cool!

The verification methodology in this paper is based on “coverage events” represented as a concatenation of words, e.g. “reg_msr_data_read”. However, the paper would seem to be equally applicable to any meta-data in the form of semi-structured text strings – it could be debug messages for activity on a bus or even the names of requirements in a functional specification.

The heart of the paper is a set of algorithms that cluster similar coverage events into groups, break apart the concatenation of words and then intelligently re-combine the words to identify other concatenations that are similar but as yet un-covered events. They use a blend of K-means clustering, non-negative matrix factorization (NMF), and novel code to do this. The paper is a bit thin on specifics of how K-means and NFR are applied, but the essence of the overall method still shines through and the reported results are solid.

The more I think about this paper, the more the generality of their method intrigues me – especially the potential for it to find holes in a verification plan itself by classifying the names of functional requirements themselves. The approach could quite easily be added as an app to a couple of the coverage tools in our Cadence verification flow…a perfect opener for an intern project at Cadence – please reach out to me if you are reading this blog and interested.

Jim’s view

Paul made an interesting point (separately). At the block level people are already comfortable with functional coverage and randomization. But at the SoC level, Engineers typically use directed tests and don’t have as good a concept of coverage. They want functional coverage at SoC but it’s too much work.

Maybe this is a more efficient way to get a decent measure of coverage. If so, that would definitely be interesting. I see it as an enhancement to existing verification flows, not investable as a standalone company, but certainly something that would be interesting as a quick acquisition. This would follow a proof of concept of no more than a month or so – a quick yes/no.

My view

Learning techniques usually focus on pure behaviors. As Paul suggests, this method adds a semi-semantic dimension. It derives meaning from names which I think is quite clever. Naturally that could lead to some false positives, but I think those should be easy to spot, leaving the signal to noise ratio quite manageable. Could be a nice augment perhaps to PSS/ software-driven verification.

Share this post via:

Comments

There are no comments yet.

You must register or log in to view/post comments.