Modeling and verifying asynchronous systems is a constant challenge. Petri net models may provide an answer. Paul Cunningham (GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.
This month’s pick is Hazard Detection in a GALS Wrapper: a case study. The paper published in the 2005 International Conference on Application of Concurrency to System Design. The authors are/were from Humboldt-Universität zu Berlin and IHP Microelectronics, Frankfurt.
This is an old paper yet interesting for use of Petri nets in modeling asynchronous processes, a problem domain that continues to crop up in modern designs. The authors apply their analysis to a GALS wrapper for a baseband block. Hazards can occur in handoff between a local clock driving the pipeline in the absence of external activity and the clock associated with an external input.
The method constructs Petri net patterns for a range of elementary gates, as state transition graphs (STGs) based on input signal states and edges. These nets are composable into a full circuit. In modeling, if an output edge with more than one marking is reachable from a starting state or might lead to such a marking, that represents a hazard. This is a widely studied problem and the basis of the method. Through a series of abstractions, the authors were able to detect several hazards and a potential deadlock.
This month’s paper was a total blast from the past for me, bringing back memories from my own PhD in the formal verification of asynchronous circuits. The paper describes a method for detecting hazards (the potential for wires to glitch) in control circuits for a “globally asynchronous locally synchronous” (GALS) design style, but the star of the show for me is the authors’ use of a wonderful notation called a Petri net.
A Petri net looks like a state machine diagram, but it overlays the traditional node and arc notation with the concept of a “token”, literally drawn as a black dot that can go inside nodes. These tokens enable Petri nets to elegantly visualize concurrent interactions between state machines. The notation is not new and dates to the 1960s. Much of what we do in modern SOC verification involves wrestling with concurrency and this paper reminded me that Petri nets may warrant some fresh attention to improve complex protocol verification (CXL, USB4, etc.), cache coherency verification, or EDA tools based on the new Portable Stimulus language for transaction level randomization like Cadence’s Perspec.
As a further side comment, we all know about the duality between state machines and regular expressions. Likewise with Petri nets, there is an extension of regular expression syntax called “Trace Expressions” that introduces a special concurrency construct called the “weave” operator enabling a similar duality between Petri nets and Trace Expressions…but now we’re getting way too close to my PhD for comfort.
OK, back to the paper: the authors develop a system to represent every gate in a circuit with a Petri net fragment. They then compose these Petri net fragments together to form a giant Petri net for their GALS control circuits. Hazards can be formally expressed as certain states of this giant Petri net, and so proving that a circuit is hazard-free reduces to proving that these states are unreachable, which can be done with standard off-the-shelf model checking tools.
It’s a tight paper, and the authors’ use of Petri nets to formalize hazards is beautiful. But unfortunately, the scalability of the underlying model checking problem generated by their method is very poor, and in the end the authors have to make some significant simplifying assumptions to get their checker to converge. That said, even with these assumptions they did find some real hazards in the control circuits that needed fixing.
The authors show how to model a chip that implements baseband processing compliant with IEEE802.11a. The implementation is globally asynchronous locally synchronous (GALS). The asynchronous part, which is of interest for the paper, is implemented as a wrapper around synchronous parts. It consists of 5 blocks and ensures that a synchronous pipeline functions properly. This they build up from elementary gate models, each represented by a Petri Net model. Once built, they make a key observation: that a hazard can occur when an edge place c carries more than one token. The problem of determining if a particular token pattern in a Petri-Net can occur is called the reachability problem.
They aim to solve this problem with a model checker, e.g., LoLA  or SMV. The initial model results in a Petri-Net of 288 places and 526 transitions, too complex to be solved. The model is simplified applying hazard preserving abstraction techniques such as merging gates, abstracting structures. Also considering each of the 5 blocks separately. And they use VHDL simulations to confirm if a “dangerous” state transition could really occur. Several hazards could be detected with the simplified model. The authors conclude that “Fortunately, in the usual application scenario for GALS blocks no reachable state generates any of the detected potential hazards. Nevertheless, in order to increase reliability of the asynchronous wrapper we have fixed all those potential hazards in the system. Without the formal verification this would not have been feasible.”
I think this is an interesting modeling technique which allows formal verification. The modeling part is very clear, I found it elegant to model both signal values and edges. Detecting hazards when multiple edges can occur through solving reachability in Petri-Nets. The formal verification part however applies several techniques which are cumbersome and difficult to follow. Concluding thoughts: Petri-Nets were popular in the 80s and 90s, particularly to model asynchronous systems. Asynchronous systems never became ubiquitous. Almost all digital systems are synchronous and just communicate, necessarily, with the outside world in a well understood asynchronous manner. There was no follow up to this paper. But even if not practical, it is a stimulating contribution to a hard problem.
Despite the theoretical elegance of Petri Nets, I struggled to find papers on this topic in our domain. I did find several related papers based on different methods which all commented on scalability problems for Petri Nets. Perhaps this realization accounts in part for the lack of papers.Share this post via: