Back in the microprocessor stone age, government procurement agencies fell in love with the idea of radiation-hardened parts that might survive catastrophic events. In those days, before rad-hard versions of PowerPC and SPARC arrived, there were few choices for processors in defense and space programs.
One of the first rad-hard microprocessors was the Performance Semiconductor PACE P1750A, a product line since acquired by Pyramid Semiconductor. It was born in the Reagan-era “Star Wars” boom, when tolerance to total ionizing dose (TID) and low power consumption were the first two requirements. Thank goodness our project using the PACE P1750A never got past system design and lab prototyping, because I don’t think we fully appreciated what we were up against in creating a totally rad-hard, space-ready system.
What the semiconductor industry has learned about rad-hard and rad-tolerant design since then fills volumes, and the field is still developing. Geometries have shrunk, raising the chance of a disruption from radiation. Processes have improved with technology such as silicon on insulator (SOI). Software content in all projects has swelled, justifying investments in rad-hard processors that deliver a high level of confidence at a high level of cost.
Rad-hard ASICs, however, are another matter. While the technology exists, the volumes around a custom design usually do not. Fortunately, FPGA vendors and some defense firms licensing technology have stepped in with rad-hard parts targeting space-based projects.
Space is not the only place radiation exists. Many applications, including industrial, medical, and automotive, are subject to single event upset (SEU). To provide adequate levels of safety-critical operation without the expense of full rad-hard FPGAs, designers are turning more and more to SEU-tolerant approaches in FPGA logic synthesis. These same approaches are even applicable in full rad-hard FPGAs, as various FPGA technologies present different susceptibility and need additional mitigation techniques in some areas.
The cornerstone of SEU mitigation is triple modular redundancy, or TMR. This is the classic voting scheme, where circuitry is replicated three times and the three outputs feed a majority voter. In theory, if an SEU corrupts one copy, the other two outvote it and the result stays correct. TMR schemes can therefore detect and correct single-bit errors.
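The voting step can be modeled in a few lines of software. The sketch below is only a behavioral illustration in Python, not HDL and not anything a synthesis tool emits; the names `majority_vote` and `flip_bit` and the 8-bit example value are assumptions made for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority across three replica outputs."""
    return (a & b) | (b & c) | (a & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a word."""
    return word ^ (1 << bit)


# Hypothetical 8-bit register value held by all three replicas.
golden = 0b10110010

# One replica takes an upset on bit 5; the other two are untouched.
replicas = [golden, flip_bit(golden, 5), golden]
voted = majority_vote(*replicas)

# Two replicas upset on the same bit defeat the voter.
double_hit = majority_vote(flip_bit(golden, 5), flip_bit(golden, 5), golden)

print(voted == golden)       # single upset is masked
print(double_hit == golden)  # double upset is not
```

The second case is the weakness of any 2-of-3 vote: once two copies agree on a wrong value, the voter passes it through.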
Dialing in TMR by hand in a complex FPGA-based design could take forever, consume a lot of area on the chip, and potentially upset timing. Understanding the tradeoff between safety, area, and timing can make or break a project. Synopsys has invested significant research in its Synplify Premier tool, studying popular FPGA architectures and mitigation approaches, to automate the insertion of TMR during synthesis.
For instance, there are actually three flavors of TMR. Registers can be protected with local TMR (LTMR), a simple replication. However, researchers are finding SRAM-based FPGAs in space-qualified applications are still susceptible to upset using LTMR – geometries are small enough and events rapid enough that radiation strikes two or even all three blocks.
To protect I/O and logic and provide more hardening for space-based designs, distributed TMR (DTMR) physically separates the triplicated circuitry on the chip. Block TMR (BTMR) takes the approach a step further with physical separation and clock synchronization, and can be used with indivisible or encrypted IP blocks.
Synplify Premier handles all three of these TMR types and more mitigation techniques, with automated FPGA-aware synthesis techniques supporting all popular devices. Synopsys application engineer Sharath Duraiswami dives into the details in an archived webinar:
Building Highly Reliable FPGA Designs for Applications Needing Functional Safety
One idea Sharath discusses is “partial DTMR”, where voters around flushable dual flip-flops are optimized to save area when possible. He also shows how the physical separation works, along with Synplify mitigation techniques for each type of functional block in an FPGA including duplicate with compare (DWC), Hamming-3 encoding, and safe case FSM. One example even shows use of a Xilinx Zynq-7000 SoC using DWC techniques for error control.
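Two of the techniques named above can be sketched as software models. The Python below is an assumption-laden illustration, not how Synplify Premier implements them: the state names, the 5-bit codes, and the functions `recover_state` and `dwc` are all hypothetical. It shows state codes chosen with a minimum Hamming distance of 3, so a single flipped bit still lies closest to exactly one valid code, and a duplicate-with-compare check, which flags a mismatch but cannot tell which copy is wrong.

```python
from itertools import combinations

# Hypothetical FSM state codes with pairwise Hamming distance >= 3.
STATE_CODES = {
    "IDLE": 0b00000,
    "LOAD": 0b00111,
    "RUN":  0b11001,
    "DONE": 0b11110,
}


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


def min_distance(codes) -> int:
    """Smallest pairwise Hamming distance in a set of codes."""
    return min(hamming_distance(a, b) for a, b in combinations(codes, 2))


def recover_state(register: int):
    """Return the state whose code is within one bit of the register value.

    Returns None when no code is that close, where a real design would
    fall back to a safe state. Multi-bit upsets may be miscorrected.
    """
    for name, code in STATE_CODES.items():
        if hamming_distance(register, code) <= 1:
            return name
    return None


def dwc(copy_a: int, copy_b: int):
    """Duplicate with compare: return (value, error_flag). Detection only."""
    return copy_a, copy_a != copy_b


print(min_distance(STATE_CODES.values()))
print(recover_state(STATE_CODES["RUN"] ^ 0b00100))  # one bit flipped
print(dwc(5, 4))
```

The contrast is the design point: DWC costs two copies and only detects an error, while a distance-3 encoding spends extra state bits to correct a single upset in place.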
The webinar tips will be helpful for designers working with full rad-hard FPGAs or trying to harden safety-critical applications, whether working with Altera, Lattice, Microsemi, Xilinx, or other parts. It’s evident just how much work Synopsys has put into Synplify Premier to automate synthesis for a wide variety of scenarios, far beyond just blasting away with logic triplication. I like this presentation because it isn’t tied to just one FPGA architecture or vendor – each has its merits and limitations in safety-critical design that engineers need to be aware of.