After reading the Cadence blog post “Dracula, Vampire, Assura, PVS: A Brief History,” Dr. Andrew Moore wrote the article below, in which he gives readers a sense of what “the year of hell” was like, from the viewpoint of one of the key individuals who lived it. Andrew also addresses and corrects some of the “urban legends” about how Calibre came out on top. Sorry, no pictures other than this one of Andrew currently working at NASA. Nonetheless, this is an excellent read!
Dear Cadence: Calibre Didn’t Run Any Dracula Decks
It was refreshing to read “Dracula, Vampire, Assura, PVS: A Brief History,” as it frankly outlines the technology missteps and sales myopia that have made Cadence’s design rule checker products irrelevant since the 0.35-micron design node. However, it neglects to mention a couple of important academic antecedents, misstates the evolution of DRC languages, and greatly understates the magnitude of sweat and toil required from application engineers during “the year of hell” to help designers unshackle themselves from the limitations of Dracula’s command set. Its glib statement that Calibre “would run any Dracula deck” is simply wrong.
Software version instability created a problem for chip designers, and Cadence’s Dracula solved it – for a price. Even though Dracula was not free, it was a welcome alternative to free academic design rule checkers (for example, Magic from UC Berkeley, and runDRC from Caltech). Source code control was still in its infancy in the 1990s, and academic tools changed by the semester, often with no archived version history. How can you risk signing off a chip with a version of a software program that runs differently than the version, no longer available, that you used when you designed the chip’s individual parts? Hundreds of design teams purchased a perpetual license for a commercial, archived, maintained product (Dracula) to remove that risk, and the commercial DRC market was born.
In the era of 1.0- and 0.5-micron designs, semiconductor processing chemistry was rather coarse, and the design rules were simple. As etching, deposition, and implantation technologies became more precise to enable submicron silicon processing, design rules became more complex. This introduced a new risk: even the most up-to-date version of a DRC tool could be inaccurate if design rule checks written in its command language did not encompass all of the complexity of submicron rules. Different academic tools gave different DRC errors, which in turn differed from the errors that Dracula found. Were these differences just false positives? Were there false negatives that none of the available tools were finding? Routinely, designers ran a DRC tool and then crawled over the entire design, manually verifying each violation that DRC uncovered and looking for other design rule violations that were not detected. A design crawl took hours of panning, zooming, inspection, and thinking. The lack of confidence introduced by complex submicron design rules and inadequate software DRC created more productivity problems than DRC software was solving.
This hybrid method (software DRC plus manual inspection) became untenable as layouts got bigger and bigger. Academic tools and Dracula simply did not have the capacity for large layout files. The designs were getting so large that manual traversal was taking more than a day. I kept a running tally of the size in megabytes of the biggest known GDS file from 1997 to 1999; when it exceeded one gigabyte in late 1999, I stopped keeping track. By then, manual inspection took more than two days, and all of the first-generation DRC tools (Magic, runDRC, and Dracula) would run for a few hours and then crash on large 0.25-micron designs. Accuracy and capacity risks were foremost in the minds of product managers and tape-out engineers, who scrambled to find ways to cut the design into manageable pieces and have teams of engineers verify them in parallel.
Mentor and Avanti fielded products (Calibre and Hercules, respectively) to minimize these newly emerging accuracy and capacity risks. Calibre and Hercules had richer commands that allowed a clever designer to find all true errors and to sift through false positives, mitigating the accuracy risk. Both of these tools also took advantage of design repetition, so that multiple instances of the same block needed to be checked just once, except at their boundaries with other blocks of layout. This lowered the effective size of the design and sped up runtimes for most designs, mitigating capacity risk. As a former student of Carver Mead at Caltech, I found this obvious, but for most designers it was a new paradigm. I spent a lot of time with designers manually looking at the boundaries of repeated blocks (e.g., bit cells, flip flops, pads, and shift registers), signing off and coding past false positives and making sure that there weren’t false negatives. They were relieved that it was possible to check the chip manually again, because they could be confident that it was not necessary to peruse large areas that were just arrays of identical parts. To be doubly sure, designers ran the first-generation tools on each of the repeated blocks to verify that they were individually error-free. After all, the “bit cell tools” were free (academic) or on a perpetual license (Dracula) that was already paid for.
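To make the capacity argument concrete, here is a minimal sketch of a toy cost model, not a description of how Calibre or Hercules actually work internally. It assumes a layout that is a rectangular grid of named cell instances, a made-up "area" cost per cell, and a fixed cost per shared edge between neighboring instances. The function names, the grid representation, and the cost units are all hypothetical.

```python
def flat_work(grid, cell_area):
    """Toy flat DRC cost: every placed instance is checked in full."""
    return sum(cell_area[name] for row in grid for name in row)


def hierarchical_work(grid, cell_area, boundary_cost=1):
    """Toy hierarchical DRC cost.

    Each distinct cell is checked once, and every shared edge between
    neighboring instances is checked separately. The boundary_cost value
    is an arbitrary illustrative unit, not a measured quantity.
    """
    unique_cells = {name for row in grid for name in row}
    interior = sum(cell_area[name] for name in unique_cells)

    rows, cols = len(grid), len(grid[0])
    horizontal_edges = rows * (cols - 1)  # left/right neighbor pairs
    vertical_edges = (rows - 1) * cols    # up/down neighbor pairs
    return interior + boundary_cost * (horizontal_edges + vertical_edges)


if __name__ == "__main__":
    # A 4 x 4 array of identical bit cells, each with a made-up cost of 100.
    array = [["bitcell"] * 4 for _ in range(4)]
    areas = {"bitcell": 100}
    print("flat:", flat_work(array, areas))
    print("hierarchical:", hierarchical_work(array, areas))
```

In this model the flat cost grows with the total number of instances, while the hierarchical cost grows only with the number of distinct cells plus the number of boundaries, which is why the remaining manual effort concentrated on the boundaries of repeated blocks.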
Everybody knew that this was just a temporary reprieve, though. The manual inspection of array boundaries was getting more and more time-consuming. Now product managers were pressuring designers to shorten the entire DRC sign-off to a few “spins,” with each spin made up of an overnight DRC run, followed by manual inspection of the results the next day. Ten spins (two five-day work weeks) were usually tolerated – if a company allowed more than ten spins, the product manager complained about slipping product release schedules, while if it insisted on fewer, the designer would not guarantee that the chip would actually work. Semiconductor executives did their homework, learned the funny names of the tools that were alternately creating and resolving this bottleneck, and gave their product teams a year or so to cut the time in half. They also added budget to purchase second-generation (hierarchical) DRC tooling and to hire new people to evaluate and use it.
Mentor and Avanti executives and account managers saw an opportunity to replace the first-generation DRC product, Dracula, with their second-generation products (Calibre and Hercules, respectively) and thereby harvest all of this newly allocated software budget. All that was needed was a little sweat and toil from their application engineers. I trained a lot of those application engineers, and worked side-by-side with them in what became known as “the year of hell,” which ran roughly from summer 1998 to summer 1999.
Designers insisted on running both Dracula and a second-generation tool on each chip layout, to have all possible awareness during this transition. Naturally, it occurred to everyone that it would be more efficient if the second-generation tool could read the Dracula rule set directly. DRC software architects were reluctant to implement this concept. First, they argued, mapping a “flat” command set onto a hierarchical architecture was demonstrably silly. Second, they asked, “Do you really want to take on the liability of actually creating false positives and missing true positives, just to reproduce the inadequacies of a first-generation tool?” Third, Cadence woke briefly from its slumber in 1999 and slightly expanded the Dracula command set, so that direct implementation became a moving target. There were other, more subtle rebuttals, but on the strength of these three points, DRC software architects won the argument.
As a result, several standalone programs were composed (by designers, academics, and enterprising application engineers) to translate, as best as possible, Dracula commands to Calibre and/or Hercules commands. Dracula translator developers communicated by phone, email, and in ad-hoc meetings at design automation conferences to share ideas and code. These translators succeeded, at best, according to a set of 80/20 rules: translate 80% of the commands, or reduce run time by 80%, and let each individual chip design team take care of writing rules to address the other 20% on a case-by-case basis. These translators read in a Dracula rule file, written in Dracula’s command language, and output a Calibre or Hercules rule file composed of Calibre or Hercules commands. The untranslatable 20% was ignored, output as comment lines, or output as unreadable garbage text.
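The general shape of such a translator can be sketched as follows. This is only an illustration of the table-driven approach with comment-line fallback described above, not a reconstruction of any actual translator. Every command name in it (OLD_WIDTH, new_width, and so on) is an invented placeholder and is not real Dracula, Calibre, or Hercules syntax; the comment prefix is likewise an assumption.

```python
# Hypothetical command mapping. These names are placeholders invented for
# this sketch; they are NOT real Dracula, Calibre, or Hercules commands.
TRANSLATION_TABLE = {
    "OLD_WIDTH": "new_width",
    "OLD_SPACE": "new_space",
    "OLD_ENCLOSE": "new_enclosure",
    "OLD_AREA": "new_area",
}


def translate_deck(lines, table=TRANSLATION_TABLE,
                   comment_prefix="// UNTRANSLATED: "):
    """Translate a rule deck line by line.

    Returns (output_lines, untranslated). A line whose first word is in
    the table is rewritten with the mapped command name; any other line is
    emitted as a comment and also collected in the untranslated list.
    """
    output_lines = []
    untranslated = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        command, _, args = stripped.partition(" ")
        if command in table:
            output_lines.append(f"{table[command]} {args}".rstrip())
        else:
            output_lines.append(comment_prefix + stripped)
            untranslated.append(stripped)
    return output_lines, untranslated


if __name__ == "__main__":
    deck = [
        "OLD_WIDTH metal1 0.5",
        "OLD_SPACE metal1 0.6",
        "OLD_ENCLOSE via1 metal1 0.1",
        "OLD_AREA poly 1.0",
        "OLD_ANTENNA gate 400",  # no mapping exists for this placeholder
    ]
    translated, todo = translate_deck(deck)
    print("\n".join(translated))
    print(f"{len(todo)} of {len(deck)} rules still need hand-written checks")
```

Returning the untranslated lines as an explicit list, rather than only burying them in comments, reflects the point that someone on the tape-out team still has to write and test a replacement for every one of them.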
The problem, of course, was that if the translator ignored 20% of Dracula commands, and if the tape-out team did not somehow check the actual design rules corresponding to that 20%, the result was a dead chip, with short circuits, open circuits, intolerably high transistor contact resistance, etc. As a result, designers around the world spent a lot of their time (say, 80%) writing and testing Calibre commands that accurately checked the design rules that were not checked properly by that 20% of the enfeebled Dracula command set. Design software company application engineers were called in to help during “the year of hell.” I was asked to help the newly emerging Asian foundries with the transition to second-generation DRC that year, and I wrote a lot of Calibre rules on plane flights to Taiwan. There were dozens of foundry fabrication processes, and I saw that I was reusing manually written chunks of Calibre code to replace untranslatable, defunct Dracula code across several processes. To alleviate the burden on application engineers and to educate designers, I started capturing these chunks, along with a description of what they were checking, in application notes. I have heard that the same thing happened for people trying to displace Dracula with Hercules across several semiconductor processes.
Calibre never read Dracula commands directly, because to do so would introduce the risk of incorrect chip signoff and because second-generation (hierarchical) DRC is fundamentally different from first-generation (flat) DRC. For the same reasons, it did not read Magic or runDRC commands either. The same is true today: Calibre only reads Calibre commands.
Bio: Andrew Moore earned a BSEE at the University of Illinois, Urbana, and a Ph.D. in Computation and Neural Systems at Caltech. Before joining NASA as a Research Aerospace Technologist in 2012, he served in several technology, sales, and executive roles in industry. These include two tours at Mentor Graphics (Calibre Technical Marketing Manager in the late 1990s and PacRim Technical Director from 2009 to 2012), Deputy Director of Design Marketing at TSMC, and Vice President for North America and Europe at Luminescent.