May 19, 2020 | January 12, 2021 by Bernard Murphy

Is Mutation Testing Worth the Effort? Innovation in Verification

by Bernard Murphy on 05-19-2020 at 6:00 am
Categories: Cadence, EDA

Mutation testing is an intriguing idea, but is it useful? Paul Cunningham (GM of Verification at Cadence), Jim Hogan and I continue our series on novel research ideas, here looking at a paper examining the pros and cons of this topic. Feel free to comment if you agree or disagree.

The Innovation

This month’s pick is “Which Software Faults Are Tests Not Detecting?” The paper was presented at the 2020 Evaluation and Assessment in Software Engineering conference. The authors are all from Lancaster University in the UK.

The paper’s contribution is an analysis of test efficiency in software, aimed at finding ways to help tests uncover more bugs. The authors measure efficiency through a combination of code coverage and mutation analysis. In mutation testing, functional errors are inserted into the code, the tests are re-run, and test efficiency is judged by the ability to detect each mutation. They apply their analysis to 10 open-source systems with associated unit tests, using a tool to insert faults automatically. From this they analyze the efficiency of the tests by fault type.
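To make the mechanics concrete, here is a minimal sketch of that loop in Python. It is not the tool or the method used in the paper: the `shipping_cost` function, the mutation list and the two unit tests are all invented for illustration, and mutants are produced by simple text substitution.

```python
# Hypothetical code under test, kept as source text so it can be mutated.
SOURCE = """
def shipping_cost(weight, express):
    cost = 5 + 2 * weight
    if express:
        cost = cost + 10
    return cost
"""

# Hypothetical mutation operators: (original text, replacement text).
MUTATIONS = [
    ("5 + 2", "5 - 2"),
    ("2 * weight", "2 + weight"),
    ("if express", "if not express"),
    ("cost + 10", "cost - 10"),
]

# Hypothetical unit tests: (arguments, expected result).
# Note that no test exercises express=True.
TESTS = [((1, False), 7), ((3, False), 11)]


def load(source):
    """Compile the source text and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["shipping_cost"]


def passes_all(func):
    """True if every test passes against the given implementation."""
    return all(func(*args) == expected for args, expected in TESTS)


def run_mutation_analysis():
    """Re-run the tests on each mutant and sort mutants into killed/survived."""
    assert passes_all(load(SOURCE))  # tests must pass on the original code
    killed, survived = [], []
    for old, new in MUTATIONS:
        mutant = load(SOURCE.replace(old, new, 1))
        (survived if passes_all(mutant) else killed).append((old, new))
    return killed, survived


killed, survived = run_mutation_analysis()
score = len(killed) / len(MUTATIONS)  # fraction of injected faults detected
print(f"killed={len(killed)} survived={survived} score={score:.2f}")
```

The surviving mutant points at a gap in the tests (nothing checks the express path), not at a bug in the code, which is the kind of finding the paper is measuring.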

They report that in 6 of the systems, fewer than 50% of the injected faults are detected, and that some fault types are detected more frequently than others, particularly conditional boundary checks. They also find that the lowest performing tests are 10X less efficient in detecting boundary faults.

The authors also discuss challenges in mutation testing. One study finds that most post-release faults are complex and can only be fixed through modifications in several locations. Attempting to model these through mutation would cause the number of mutants to explode rapidly. Also, a study at Google confirms that even simple mutation testing is very expensive. Many mutants are unproductive, being either redundant or equivalent, yet are not easily weeded out.

Paul

This is something we’re looking at closely as a natural area of interest in our metric driven verification (MDV) strategy. We’re always interested in ways to help improve test effectiveness; this paper adds to our understanding.

Testing mutated code is computationally expensive, whether it’s software or hardware, since you have to run all your tests not only on the original code but also on each mutated version. In hardware verification, testing the non-mutated design is already swamping verification resources. If we are going to do mutation testing in hardware, we need to focus on high ROI mutations. A second concern is that mutation testing exposes limitations in tests, not bugs in the design. That is still valuable, but it is not a first-order concern, which makes it a tougher sell for schedule-constrained projects.
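A rough way to see the cost, assuming the simplest case where every test is re-run on every mutant with no pruning or early exit: the number of runs grows as tests times (1 + mutants). The figures below are hypothetical and only show the arithmetic.

```python
def total_test_runs(num_tests, num_mutants):
    """Runs needed if every test runs on the original and on each mutant."""
    return num_tests * (1 + num_mutants)


def overhead_factor(num_tests, num_mutants):
    """Cost of mutation testing relative to one plain regression run."""
    return total_test_runs(num_tests, num_mutants) / num_tests


# Hypothetical regression: 200 tests, 50 mutants.
runs = total_test_runs(200, 50)
factor = overhead_factor(200, 50)
print(runs, factor)
```

Real flows can reduce this, for example by stopping at the first failing test per mutant, but the multiplier is why choosing a small set of high ROI mutations matters.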

Nevertheless, selective use of high ROI mutation coverage could still be helpful in hardware, especially for modules where there is no good functional coverage model available. The paper cites boundary condition mutation, for example replacing “<=” with “<”, as more likely to find useful gaps in tests than replacing “+” with “-”. Buffer overflow security attacks are given as a good example where boundary condition mutation can catch gaps in test suites. This example applies equally to both software and hardware test.
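As a small illustration of why a boundary mutation exposes test gaps, consider a buffer length check. The `fits` function and both test suites are hypothetical; the sketch only shows that a suite with no test exactly at the boundary cannot tell “<=” from “<”.

```python
def fits(length, capacity):
    """Original check: data fits when it is no longer than the buffer."""
    return length <= capacity


def fits_mutant(length, capacity):
    """Boundary mutant: "<=" replaced with "<"."""
    return length < capacity


def suite_kills(tests, original, mutant):
    """A suite kills the mutant if it passes on the original but fails on the mutant."""
    passes_original = all(original(*args) == exp for args, exp in tests)
    passes_mutant = all(mutant(*args) == exp for args, exp in tests)
    return passes_original and not passes_mutant


# Tests well inside and well outside the limit, but none at the limit.
weak_suite = [((3, 8), True), ((12, 8), False)]
# Same tests plus one sitting exactly on the boundary.
strong_suite = weak_suite + [((8, 8), True)]

weak_result = suite_kills(weak_suite, fits, fits_mutant)
strong_result = suite_kills(strong_suite, fits, fits_mutant)
print(weak_result, strong_result)
```

The weak suite lets the mutant survive; adding the single on-the-boundary test kills it. An off-by-one in a check like this is the sort of gap behind buffer overflow problems.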

Very thought-provoking.

Jim

The observation I want to make here, as an investor, is that I have seen a decline in research in functional verification at the RTL level, at least judging by the number of papers we see. Not application-level stuff, how to better use the tools we’ve already got, that’s common. I’m talking about original research, from universities or outfits like Google.

This isn’t because all the problems are solved; they definitely aren’t. I think it’s more that, for universities, grants are directed to problems in other areas, and for the hyperscalers, software is the biggest driver for innovation. What then should we do in functional verification for hardware? Learn from research in software verification! The two domains are very closely related, not identical but the overlap is significant. I want to see more of these software parallels.

On this topic specifically, I want to better understand the associated costs. That’s a huge factor in ROI; the “R” will have to be equally impressive.

Security seems like a good application for mutation testing. Here there may be more willingness to accept the added overhead. Also, the recently released MITRE list of common weaknesses in hardware should provide inspiration for more security-related high-value mutations beyond boundary conditions.

