
Designing SmartCar ICs

by Daniel Payne on 09-30-2014 at 7:00 am

When I upgraded cars from a 1988 to a 1998 Acura, it seemed like my car had become much smarter, with a security chip in the key, security codes in the radio and a connector for computer diagnosis. Today’s modern auto, however, has a lot more mixed-signal design content. Micronas and Synopsys got together and hosted a webinar two months ago, “Advanced Mixed-Signal Design and Verification of Smartcar ICs”, so I watched it today.


BMW with massive amounts of electronics

Marco Casale-Rossi from Synopsys started out by noting that the number of 14nm transistors on a 300 mm wafer exceeds the number of stars in the Milky Way galaxy, the infinitely large, while the pursuit of ever-smaller process nodes is approaching the infinitely small. Car electronics are not designed at the bleeding edge of 16nm because of long design cycle times and cost controls. Automotive ICs can span from 1V power supplies up to tens of volts, operate in extreme temperature regions, and often contain mixed-signal IP blocks. An AMS process for automotive may contain: double-poly, triple-well, few metal layers, bipolar, CMOS, DMOS, SRAM, NVM, high voltage and sensors.


Source: IDM 2013, a 160nm mixed-signal design

Designs can be analog on top, digital on top, or any mixture in between. EDA tools for automotive should enable lower design and silicon costs. An IDM used DC Graphical and IC Compiler to produce an automotive chip in a 110nm process that was 18% smaller in size, with 13% higher utilization and a 77% double-via rate to improve reliability.


IR drop and electromigration analysis are crucial for predicting the reliability of automotive chips, and an IDM used PrimeRail and IC Compiler for their integrity analysis and resistance calculations.


Source: IDM 2013, a 130nm mixed-signal design

Functional verification from Synopsys spans a wide range of tools, now called Continuum Vision. The technical goal is better and faster verification by finding more bugs, sooner:

A feature called CircuitCheck is used within the transistor-level FastSPICE tool CustomSim to help detect bugs during the design phase. Circuit designers have several simulators to use during AMS design, and you choose based on the required accuracy and capacity:

Mario Anton from Micronas was the second speaker. His company has 900 people who design, fabricate, test and supply Hall sensors and sensor-based products to the automotive industry, with a goal of zero-ppm quality levels. Micronas started back in 1952 and has so far shipped some 2.5 billion Hall sensors. X-FAB also has a fabrication partnership with Micronas.

You’ll find Micronas chips in automotive applications like:

  • Powertrain (active pedal, gear position, battery management)
  • Chassis and safety (steering torque, chassis height, steering motor position, braking)
  • Body and comfort (seat position, window position, wiper position, grill shutter)


Outside of automotive, there are Micronas chips for industrial applications like building and home automation, heavy machines and factory automation, plus white goods and home appliances.

Hall-Effect sensors from Micronas are used in automotive as switches, current sensors, angular sensors and linear sensors. Electronic throttle control replaces the mechanical system. Every electric motor in an auto has embedded servo-drives.

For design and verification, the engineers need to meet the safety requirements defined in ISO 26262 and design for zero-ppm levels. The IC design flow includes analog full-custom, mixed-signal and digital.


The final speaker was Gernot Koch of Micronas, who shared their specific design and verification approaches, using a range of tools for analog on top:

  • Full SPICE
  • FastSPICE
  • Verilog + SPICE, co-simulation
  • Verilog + Verilog A + Verilog AMS + SPICE
  • Verilog with Realtype modeling

The typical Hall chip architecture, shown below, includes dozens of different IP blocks:

A small Hall sensor chip with about 5,000 transistors took a full-custom design approach, using FineSim for circuit simulation. A medium-sized Hall sensor with 50,000 transistors had both analog and digital content, so the FineSim and VCS simulators were used. At the high end of Hall sensors there were 100K transistors, so the HSPICE, FineSim, CustomSim and VCS simulators were all part of the design and verification flow.

Simulating with FineSim on multi-cores showed up to a 20X speedup versus an internal SPICE circuit simulator.

Micronas plans to have more teams use the Synopsys tool flow for AMS designs, and to replace its internal simulator with FineSim because of the speedup provided.

View the entire 52 minute archived webinar here, after a brief registration process.


Who Will Lead at 10nm?

by Scotten Jones on 09-29-2014 at 4:00 pm

There has been a lot of discussion on SemiWiki lately around 14nm FinFET technology and who really leads and by how much. I thought it would be interesting to review some process metrics for previous technology generations and then make some forecasts around 10nm.

The focus of this article will be Intel, TSMC and Global Foundries/Samsung as the logic volume leaders:

  • Intel is the world’s largest semiconductor company and far and away the largest IDM logic producer today.
  • TSMC is the world’s largest foundry.
  • Global Foundries is the world’s second largest foundry. We have combined them with Samsung because they are both members of the Common Platform alliance and closely aligned in process technology. In fact, Global Foundries has licensed Samsung’s 14nm FinFET process technology.

The characterization of process density has shifted over the years and nodes have become less reflective of actual feature sizes and density. A more recent metric that Intel has been using is Gate Pitch (GP) multiplied by Metal 1 Pitch (M1P). This same metric has also shown up in a recent paper by the common platform partners disclosing their 10nm process work. GP x M1P will be the metric used for comparison in this paper.

Intel
The following table presents Intel’s GP, GP shrink ratio, M1P, M1P shrink ratio and GP x M1P starting at 130nm and projecting out to 10nm.

| | 130nm | 90nm | 65nm | 45nm | 32nm | 22nm | 14nm | 10nm |
|---|---|---|---|---|---|---|---|---|
| GP (nm) | 319 | 260 | 220 | 180 | 112.5 | 90 | 70 | 55 |
| GP shrink | | 0.82 | 0.85 | 0.82 | 0.63 | 0.80 | 0.78 | 0.78 |
| M1P (nm) | 350 | 220 | 210 | 160 | 112.5 | 90 | 52 | 38 |
| M1P shrink | | 0.63 | 0.95 | 0.76 | 0.70 | 0.80 | 0.58 | 0.74 |
| GP x M1P (nm²) | 111,650 | 57,200 | 46,200 | 28,800 | 12,656 | 8,100 | 3,640 | 2,101 |

All of the pitches down through 14nm are based on Intel public disclosures at IEDM and the IDF. The 10nm forecast is based on applying the average shrink ratio from the previous seven process generations.

TSMC

The following table presents TSMC’s GP, GP shrink ratio, M1P, M1P shrink ratio and GP x M1P starting at 130nm and projecting out to 10nm.

| | 130nm | 90nm | 65nm | 40nm | 28nm | 20nm | 16nm | 10nm |
|---|---|---|---|---|---|---|---|---|
| GP (nm) | 310 | 240 | 160 | 162 | 122 | 87 | 90 | 70 |
| GP shrink | | 0.77 | 0.67 | 1.01 | 0.75 | 0.71 | 1.03 | 0.78 |
| M1P (nm) | 340 | 240 | 180 | 128 | 95 | 67 | 64 | 46 |
| M1P shrink | | 0.71 | 0.75 | 0.71 | 0.74 | 0.70 | 1.00 | 0.72 |
| GP x M1P (nm²) | 105,400 | 57,600 | 28,800 | 20,736 | 11,590 | 5,829 | 5,760 | 3,220 |

TSMC follows the “Foundry” node progression whereas Intel follows more of an “IDM” node transition: 40nm versus 45nm, 28nm versus 32nm and 20nm versus 22nm. At the 14nm node TSMC has also chosen to call their node 16nm where everyone else is calling it 14nm.

We have updated this article with actual measured 28nm and 20nm pitch numbers from Chipworks. At 16nm the pitches are based on TSMC’s 2013 IEDM paper. TSMC’s 16nm is reported to have the same metal pitches as their 20nm so we have used the same pitch for 20nm M1. The 16nm gate pitch is larger than our projected gate pitch for 20nm; this is due to the planar to FinFET transition. The 10nm pitches are based on the average TSMC shrink ratios through 20nm. We have excluded 16nm due to the metal pitch pause and planar to FinFET transition.

Global Foundries/Samsung (GF/S)

The following table presents GF/S’s GP, GP shrink ratio, M1P, M1P shrink ratio and GP x M1P starting at 130nm and projecting out to 10nm.

| | 130nm | 90nm | 65nm | 40nm | 28nm | 20nm | 14nm | 10nm |
|---|---|---|---|---|---|---|---|---|
| GP (nm) | 350 | 245 | 200 | 129 | 90 | 64 | 78 | 64 |
| GP shrink | | 0.70 | 0.82 | 0.65 | 0.70 | 0.71 | 1.22 | 0.82 |
| M1P (nm) | 350 | 245 | 180 | 117 | 96 | 64 | 64 | 48 |
| M1P shrink | | 0.70 | 0.73 | 0.65 | 0.82 | 0.67 | 1.00 | 0.75 |
| GP x M1P (nm²) | 122,500 | 60,025 | 36,000 | 15,093 | 8,640 | 4,090 | 4,992 | 3,072 |

We do not have actual pitch numbers for GF/S 20nm technology, so we have interpolated them based on available data. At 14nm and 10nm the pitches are based on published values, including the 2014 VLSIT 10nm paper from IBM, Samsung, STMicroelectronics and Global Foundries.

Density Comparisons
Having reviewed the three companies/groups we can now compare the GP x M1P metric over the range of nodes studied.

| GP x M1P (nm²) | 130nm | 90nm | 65nm | 45/40nm | 32/28nm | 22/20nm | 16/14nm | 10nm |
|---|---|---|---|---|---|---|---|---|
| Intel | 111,650 | **57,200** | 46,200 | 28,800 | 12,656 | 8,100 | **3,640** | **2,101** |
| TSMC | **105,400** | 57,600 | **28,800** | 20,736 | 11,590 | 5,829 | 5,760 | 3,220 |
| GF/S | 122,500 | 60,025 | 36,000 | **15,093** | **8,640** | **4,090** | 4,992 | 3,072 |

This table has been updated since the original post based on measured TSMC 28nm and 20nm pitches from Chipworks. In the table above I have marked in bold the densest process at each node. It is interesting to see that it has moved around from node to node. Based on what has been disclosed to date and reasonable projections it looks like Intel will have the densest process at 16/14nm and 10nm using the GP x M1P metric. Whether this translates into a denser process for actual designs is a different question but GP x M1P is in our opinion a good measure of pure process density.
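For readers who want to check which process is densest at each node, here is a small Python sketch over the GP x M1P values from the comparison table. The smallest product is treated as the densest. Note that it uses 28,800 for Intel at 45nm, which is 180 x 160 from the Intel table earlier in the article; the variable names are just for illustration.

```python
# GP x M1P in nm^2 per node, taken from the tables in this article.
nodes = ["130nm", "90nm", "65nm", "45/40nm", "32/28nm", "22/20nm", "16/14nm", "10nm"]
gp_x_m1p = {
    "Intel": [111650, 57200, 46200, 28800, 12656, 8100, 3640, 2101],
    "TSMC": [105400, 57600, 28800, 20736, 11590, 5829, 5760, 3220],
    "GF/S": [122500, 60025, 36000, 15093, 8640, 4090, 4992, 3072],
}

# The densest process at a node is the one with the smallest GP x M1P.
densest = {
    node: min(gp_x_m1p, key=lambda company: gp_x_m1p[company][i])
    for i, node in enumerate(nodes)
}

for node, company in densest.items():
    print(f"{node}: {company}")
```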

The same data is also plotted below as the now infamous Intel density comparison:


Place & Route with FinFETs and Double Patterning

by Paul McLellan on 09-29-2014 at 8:00 am

Place & route in the 16/14nm era requires a new approach since it is significantly more complex. Of course, every process generation is more complex than the one before and the designs are bigger. But modern processes have new problems. The two biggest changes are FinFETs and double patterning.

FinFETs, as I assume you know, are vertical transistors that stick up like a shark’s fin from the wafer and then the gate is wrapped around them on 3 sides. The transistor width is quantized, meaning you can have 2, 3 or more FinFETs to make up what in a planar process would be a single transistor. The FinFETs thus end up being laid out in a grid. There are a lot of complications to laying out FinFET cells but place & route doesn’t have to deal with that since the standard cell library is an input file. However, the FinFET architecture makes for dense cells, which can lead to difficulty in finding points to pick up signals. Also, while FinFETs are lower leakage they have relatively high dynamic power, which can lead to power/timing closure difficulties.

Double patterning comes about because we are still stuck with 193nm light for lithography which only allows us to get down to about 80nm pitch. To go lower, as we must for processes at 20nm and below, we have to print half the polygons using one mask, and half using another, so that neither mask violates the 80nm pitch rule but together the two masks generate all the polygons. The most common way to do this is called LELE (litho-etch-litho-etch) although there are other potential approaches. LELE is the cheapest approach but the two masks are not self-aligned so there is increased process variation depending on how accurate the alignment of the masks turns out to be.


The problem with double patterning is that it is possible to design layouts that cannot be split into two masks. This will happen if there is an “odd cycle” in the design where polygon A is close to B (so must be colored differently) and a third polygon C is close to both A and B and so cannot be colored with either of the only two colors we have. This can require either adjusting the layout and moving the polygons, or alternatively a polygon can often be split into two with the two halves being colored differently and overlapping when manufactured.

To make things worse, this is not a local phenomenon. The odd cycle can be a large (odd) number of polygons spread all over a chip or a large block, as in the diagram above. Place & route needs to be double patterning aware to minimize the problems that it creates, but also to locate and correct any odd cycles that are generated.
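The odd-cycle problem is the classic question of whether a graph is two-colorable. The Python sketch below is only an illustration of that idea, not how any particular place & route tool does it: polygons are nodes, a conflict edge joins any two polygons that are too close to share a mask, and a breadth-first traversal tries to alternate masks. The function name and the input format are assumptions made for this example.

```python
from collections import deque


def assign_masks(polygons, conflicts):
    """Try to split polygons across two masks.

    Returns (mask_by_polygon, None) on success, or (None, (a, b)) where a and b
    are two conflicting polygons forced onto the same mask by an odd cycle.
    """
    neighbours = {p: [] for p in polygons}
    for a, b in conflicts:
        neighbours[a].append(b)
        neighbours[b].append(a)

    mask = {}
    for start in polygons:
        if start in mask:
            continue  # already colored as part of an earlier component
        mask[start] = 0
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in mask:
                    mask[other] = 1 - mask[current]  # use the opposite mask
                    queue.append(other)
                elif mask[other] == mask[current]:
                    return None, (current, other)  # odd cycle found
    return mask, None


# Three mutually close polygons cannot be split across two masks.
triangle = assign_masks(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
# A simple chain of four polygons can.
chain = assign_masks(["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D")])
```

Because the traversal follows conflict edges wherever they lead, it also catches the long odd cycles that wander across a whole block, not just tight clusters of three polygons.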

The introduction of both multi-patterning and FinFETs has a huge impact on all the key engines in the place and route flow. The complexity and number of DRC rules, along with the multi-patterning rules, has increased significantly and poses a big challenge to the router. Tighter design rules and FinFET process requirements, such as voltage threshold-aware spacing, implant layer rules, etc., impose restrictions on placement, floorplanning, and optimization engines that directly impact design utilization and area. Multi-patterning closure and timing closure are interdependent; each requires minimal design perturbation, and together they can increase design closure time. In order to account for multi-patterning and FinFETs, the entire place and route flow needs to be completely revamped.

Olympus-SoC provides a comprehensive multi-patterning place and route platform to address the challenges of advanced nodes. It addresses all the routing rules required for 14/16/20nm, including dealing with the interactions between multi-patterning and FinFETs. The routing engine provides complete support for DRC/multi-patterning rules for the leading foundries. The Olympus-SoC database is architected to handle the requirements for multiple masks and supports anchoring and propagation of pre-colored objects. All the key engines in the entire flow, including placement, optimization, timing, and extraction are FinFET and multi-patterning aware. Tight integration with Calibre ensures sign-off clean physical verification with minimal design iterations.

Mentor has a white paper, FinFET and Multi-patterning Aware Place and Route Implementation. You can find it here.


A Complete Timing Constraints Solution – Creation to Signoff

by Pawan Fangaria on 09-28-2014 at 10:00 pm

With the unprecedented increase in semiconductor design size and complexity, design teams must accommodate multiple design constraints such as multiple power domains for low-power design, multiple modes of operation, many clocks, and third-party IP with different SDCs. As a result, timing closure has become extremely complex and tricky. While false paths absorb your attention during timing closure, some real timing issues may get missed, and subtle issues such as incorrect exceptions may show up later in the chips or force a re-spin. Timing constraints are by nature incomplete or inconsistent, and they evolve through the design process. A constraint that is valid at block level may become invalid at chip level, impacting timing closure and the overall design cycle. Things get more complicated still when you have to promote validated constraints from IP to SoC, or push them down from SoC to IP during IP integration, and then sign off at SoC level taking into account all modes of operation, detailed validation and post-layout repair. So it’s high time we had an automated, comprehensive timing solution and constraints signoff flow to accelerate timing closure and reduce design risk.

Earlier this month I had the opportunity to attend an interesting webinar on the SpyGlass Timing Constraints Signoff Flow, presented by Mark Baker, Director of Product Marketing at Atrenta. For me it was an extra pleasure to hear Mark, as I know him from my Cadence days. What I found was a truly comprehensive flow where everything about timing constraints is taken care of, from creation through validation and exception verification, with management of constraints all the way to signoff.

An SDC can be created from scratch or incrementally added to an existing set of design constraints. The creation process works at the RTL level by identifying constraints for primary clocks, generated clocks and primary I/O; of course, uncertainty and latency have to be added by the user. Identification of all clock crossings is automatic, and false paths (FPs) are set across asynchronous clock domain crossings (CDCs). Architectural exceptions or false path constraints are generated to avoid false timing violations between exclusive clocks. Adding these exceptions can lead to faster timing closure.
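To make the idea of false paths between asynchronous clocks concrete, here is a minimal Python sketch. It is not how SpyGlass works internally; it simply assumes you already know which source each clock derives from, and emits one SDC `set_false_path` line for every ordered pair of clocks that do not share a source. The clock names, source names and the helper function are all hypothetical.

```python
from itertools import permutations


def async_false_paths(clock_sources):
    """Return SDC false-path lines for each ordered pair of clocks
    that come from different sources (treated here as asynchronous)."""
    lines = []
    for src, dst in permutations(sorted(clock_sources), 2):
        if clock_sources[src] != clock_sources[dst]:
            lines.append(
                f"set_false_path -from [get_clocks {src}] -to [get_clocks {dst}]"
            )
    return lines


# Hypothetical design: cpu_clk and cpu_clk_div2 share a PLL, usb_clk does not.
clock_sources = {"cpu_clk": "pll0", "cpu_clk_div2": "pll0", "usb_clk": "osc1"}
false_paths = async_false_paths(clock_sources)

for line in false_paths:
    print(line)
```

In a real flow the clock relationships come from analysis of the RTL rather than a hand-written table, and each crossing still needs a proper synchronizer before its path can safely be declared false.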

Constraint validation is done at RTL as well as netlist level, checking completeness, consistency and correctness within a robust debugging environment. There are over 300 rules covering clocks, I/O delays, structural exceptions and methodology based rules with full support of SDC standards. The solution supports commonly used non-SDC constructs as well. All clock constraints are taken into consideration for consistency of clock intent. Any constant propagation conflicts including forward and backward propagation are flagged.

Formal waveform verification is done to avoid a mismatch between design and timing intent, which can give a false sense of timing closure but actually lead to chip failure or a re-spin. Complete clock domain analysis, or relationship reporting between all clocks and generated clocks, is done by extracting clocks, false paths and uncertainties. The setup for CDC, power and exception verification is done automatically, while pointing out any conflict that can lead to incorrect timing or CDC.

Timing Exception Verification is a major step to ensure adequate and correct timing exceptions are applied to the design. Accurate timing exceptions lead to faster timing closure without overdesigning, thus enabling better power and area optimization. The asynchronous FPs are detected using the CDC solution, quasi-static FPs through simulation and synchronous FPs using functional verification. Multi-Cycle Path (MCP) verification is done using patented formal techniques, providing a fast solution with a high rate of completion. The overall idea is to identify exceptions meaningful to design implementation that accelerate back-end timing closure through industry-standard STA flows. Closure of exception verification is achieved faster by intelligently monitoring any changes to the exceptions throughout the implementation, using assertions and incremental verification, and ensuring that a path is either timed or verified for synchronization.

The constraints are gracefully managed throughout the design by detecting and incrementally adding missing constraints at any stage and checking their equivalence at every stage under different scenarios, a patented capability in SpyGlass Timing Constraint Signoff Flow. There can be a scenario where there may be 2 SDCs created for a single design; the equivalence between the two SDCs can be easily checked. Similarly, SDCs at block and chip levels can be checked for their equivalence. Also, equivalence between 2 SDCs for two different flavours of a design can be checked.

A design can have multiple functional and test modes and for each mode of operation there is a separate SDC file. In order to simplify the job and save runtime for implementation tools such as Synthesis, P&R and STA, the SDC files of multiple modes can be merged into a single file representing a virtual mode that has timing constraints of all individual modes. To ensure that the merged mode has all timing aspects of individual modes (with pessimism in merged constraint) the SDC equivalence between an individual mode and the merged mode can also be checked.

The overall health of timing analysis can be measured by a nice mode-wise coverage summary that points out any aspects of timing that are not covered, such as missing clock constraints, unconstrained registers or ports, and so on. The timing coverage analysis report acts as a good indicator of signoff readiness in all modes.

As a concluding remark, this flow covers all aspects of timing constraints for a robust optimized design; it ensures correct, consistent and complete constraints with all clock relationships properly defined, constraints checked against design intent, timing exceptions verified, constraint equivalence ensured at each stage of design, modes merged for faster implementation and exception coverage report generated for timing signoff.

It’s a nice webinar to attend and learn about timing constraint issues in today’s designs and how they can be addressed using a comprehensive timing constraints signoff solution.



ARM TrustZone and Zynq

by Paul McLellan on 09-28-2014 at 10:00 am

Security of embedded devices is becoming more and more important. The requirement for good protection increases as devices become more interconnected: wearable medical devices that connect to the cloud, mobile base stations that are no longer up poles but in much less physically secure areas, cars that communicate among themselves. A programmable device is especially vulnerable since not only can the software running on the SoC potentially be compromised, so can the very hardware on which it is running if the programming bitstream itself is replaced. Your base station router no longer just processes the packets, it also sends a copy to the Chinese military, the NSA or Google. Pick your bogeyman.

To further compound the problem, many devices are open platforms on which additional software such as apps can be run. Your smartphone probably runs a mixture of stuff you don’t care about much, like games, to things that you probably have some concern about, such as your WhatsApp chat history, to things you certainly care a lot about like access to your bank. There are compromises involved in security: very high security may be too complex for the average user to install and maintain, and it may be too expensive (in terms of power dissipation or FPGA fabric use). Minimal security may stop the clueless but it is a waste of time against anyone knowledgeable.

One solution is ARM TrustZone. This is widely used because of the near ubiquity, or at least widespread use, of ARM processors in embedded and other systems. This is a combined hardware/software solution to security that builds up in layers.

The Zynq-7000 AP SoC architecture integrates a dual-core ARM Cortex-A9 along with Xilinx FPGA programmable fabric into a single device built on top of TSMC’s 28nm HPL (low power) process. Like traditional SoCs, the processor-centric approach allows the processor to boot first and then bring up the rest of the device. This approach also allows control and partial reconfiguration of the programmable logic by running software on the processor. In turn, this enables the user to optimize system performance and power management to meet varying operating environments.


The ARM TrustZone architecture makes trusted computing within the embedded world possible by establishing a trusted platform, a hardware architecture that extends the security infrastructure throughout the system design. Instead of protecting all assets in a single dedicated hardware block, the TrustZone architecture runs specific subsections of the system either in a “normal world” or a “secure world.” Such an approach, when combined with software designed to leverage its advantages, enables creation of an end-to-end security solution that includes functional units as well as debug infrastructure.

In the Zynq-7000 AP SoC, a normal world might be defined as a hardware subset consisting of memory regions, caches and specific devices. Non-trusted software can be limited to that environment, which prevents access to, or even knowledge of, the additional hardware dedicated to the support of the TrustZone architecture in the secure world. Trusted applications run on a TrustZone-based system that implements a trusted execution environment. On the Zynq SoC further system-wide security is provided by integrating the TrustZone framework into the processor interconnects and system peripherals.


A key part of ARM’s TrustZone approach is that all AXI interfaces carry an additional bit known as the Non-Secure (NS) bit. During a transaction, every master assigns an appropriate value to this bit and every slave must interpret it to ensure that security separation is not violated. In particular, a non-secure master cannot access a secure slave.
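
As a rough illustration of the NS-bit rule, the following Python sketch models the check in software. The `Slave` and `Transaction` types and the `is_allowed` function are hypothetical names invented for this example, not part of any ARM or Xilinx API. The only detail taken from the AXI protocol is that the NS bit travels on AxPROT[1], with 0 meaning secure and 1 meaning non-secure. The sketch also assumes that a secure master may reach non-secure slaves, which is the usual arrangement but ultimately depends on how the system is configured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slave:
    """Hypothetical model of an AXI slave and its security setting."""
    name: str
    secure_only: bool  # True: the slave accepts only secure transactions


@dataclass(frozen=True)
class Transaction:
    """Hypothetical model of one AXI transaction."""
    master: str
    ns: int  # NS bit as carried on AxPROT[1]: 0 = secure, 1 = non-secure


def is_allowed(txn: Transaction, slave: Slave) -> bool:
    """Return True if the slave should accept the transaction.

    Assumption: secure transactions (ns == 0) may reach any slave, while
    non-secure transactions (ns == 1) may reach only non-secure slaves.
    """
    if txn.ns not in (0, 1):
        raise ValueError("NS bit must be 0 or 1")
    return not (slave.secure_only and txn.ns == 1)


# Example slaves; the names and settings are illustrative only.
key_store = Slave("key_store", secure_only=True)
uart = Slave("uart", secure_only=False)

secure_cpu = Transaction(master="cpu_secure_world", ns=0)
normal_cpu = Transaction(master="cpu_normal_world", ns=1)
```

With these definitions, `is_allowed(normal_cpu, key_store)` is the one combination that is refused; the other three master/slave pairings are accepted. What the interconnect does with a refused transaction (for example, returning an error response) is left out of the sketch.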

It is beyond the scope of an introductory blog entry to go into all the low-level operating details of how a complex SoC design is configured. Luckily, Xilinx has a detailed white paper on the subject, TrustZone Technology Support in Zynq-7000 All Programmable SoCs. You can download it here.


More articles by Paul McLellan…


ARM ♥ Xilinx!

by Daniel Nenni on 09-28-2014 at 7:00 am

The good news is that as a part of SemiWiki we get free media passes to all of the cool conferences. The bad news is that our inboxes get flooded with announcements. ARM TechCon is next week and my delete button is on overtime but it is interesting to see who is active in conferences and who is not. In this case Xilinx is very active and Altera not so much which is surprising since Altera has ARM inside, right?

Also Read: Pigs Fly. Altera Goes with ARM on Intel 14nm

My first FPGA experience was with a start-up called GateField, which was acquired by Actel in 2000; Actel was in turn acquired by Microsemi in 2010. Some of the GateField people are still at Microsemi, others are at QuickLogic and Lattice. For me, the GateField experience of competing against Altera and Xilinx in the trenches was enough for a lifetime, absolutely.

As a Strategic Foundry Relationship Consultant I enjoyed working with Altera down to 20nm. 40nm was a lot of fun since Xilinx and their foundry partner UMC missed a step. Xilinx then moved to TSMC at 28nm and has dominated ever since, just my opinion of course. In my experience both Xilinx and Altera have great technology but Xilinx executes at a higher level and has a much stronger sales and marketing channel. ARM TechCon is a clear example:

Register today for ARM TechCon 2014 to learn how Xilinx® and ARM® are delivering smarter solutions with the ARM processor-based Zynq® All Programmable SoC.

Xilinx In-booth Customer and Ecosystem Partner Demonstrations Will Feature:

  • System on Module presented by NI
  • Integrated Media Processing Platform presented by Cloudium
  • Medical Application Development Platform presented by Topic
  • IC CAM for Personal Identity Recognition by Cornerstone
  • Real-Time Object Recognition and Reconstruction by VanGogh Imaging

Xilinx Product Teardown Presentation

  • October 2nd at 11:30 a.m. in the ARM TechCon Theater: Moderated by Steve Leibson, Editor of Xcell Daily, the product teardown will feature the NI VirtualBench and the Cloudium Integrated Media Processing Platform.

Xilinx Technical Presentation

  • October 2nd at 11:30 a.m. in the Santa Clara Convention Center: Join Carl Cao, Wireless Systems Architect at Xilinx, to learn about “Integrated All Programmable HW and SW Platforms for Wireless Applications”.

To learn more about how Xilinx & ARM are delivering All Programmable Solutions for Smarter Systems, please visit Smarter Systems

I did not get a mailer from Altera but I did reach out to them and was told an Altera wireless expert will present on optimizing wireless radio heads using ARM-based SoC FPGAs:

Wireless DPD (Digital Pre-Distortion) Optimization and Profiling in Altera SoC Devices

In addition, Altera is planning some ARM-related news at the show but, for obvious reasons, I was not offered an advance briefing. Ever since Altera and TSMC divorced, Altera and I don’t speak much. In fact, I was at TSMC Fab 12 when the Altera/Intel relationship was announced and it really did feel like a divorce after a very long marriage.

If you are at ARM TechCon on Wednesday or Thursday look me up. It would be a pleasure to meet you!

You can read more about Xilinx on SemiWiki HERE.

More Articles by Daniel Nenni…..

ARM TechCon 2014 delivers an at-the-forefront comprehensive forum created to ignite the development and optimization of future ARM-based embedded products. By offering three full days of technical tracks, demonstrations, and industry insight from broad and deep levels of industry-leading companies and innovative start-ups, ARM TechCon remains more than a tradeshow; it is a comprehensive learning environment for the entire embedded community, uniting the software and hardware communities.


Mentor at TSMC OIP, 16nm, and 10nm

by Beth Martin on 09-26-2014 at 4:46 pm

On Tuesday, September 30, TSMC hosts another Open Innovation Platform Ecosystem forum at the San Jose Convention Center. Have you registered? This year includes 30 technical sessions from TSMC’s ecosystem partners, divided into three separate tracks. I’ll be hanging out in the EDA track, listening to various takes on 16nm FinFET process design issues and marveling at the prospect of 10nm.

Mentor has three sessions:

  • “Design and Verification of 2.5D/3D IC Architectures Using TSMC 16nm FinFET Technology,” Mentor Graphics
  • “Four Ways an ECO Fill Reference Flow Can Benefit Your Bottom Line,” Mentor Graphics and TSMC
  • “Maintaining Hierarchy and Accuracy for Post-Layout Simulation: Grey/Black Box Flows in LVS->PEX->Simulation,” Oracle and Mentor Graphics.

Your friends at Mentor have made a video about working with TSMC on 10nm, and also have a press release out today about it.

It’s interesting to see engineering work turn towards 10nm when I’m not yet used to the idea of 16nm, but such is the never-ending march of technology, right? I look forward to learning more at the forum.


Synopsys Verification Continuum

by Paul McLellan on 09-26-2014 at 4:00 pm

Verification spans a number of different technologies, from virtual platforms through RTL simulation, formal techniques and emulation to FPGA prototyping. Going back a few years, most of these technologies came from separate companies. One effect of this was that moving the design from one verification environment to another required completely different scripts, changes to the RTL, and a lot of time, sometimes measured in months. Getting a large design up and running in emulation or an FPGA prototype system, in particular, was a major challenge.


Over the last couple of years, Synopsys has assembled a broad portfolio of leading-edge technologies. They have rewritten their static and formal engines and acquired three virtual platform companies, EVE’s emulation technology, and SpringSoft’s Verdi debug environment. But these tools largely still showed their roots as separate product lines with different scripts and different requirements on inputs. It was still too hard to get a design that was running cleanly in, say, VCS into an emulator. It is important for verification to be able to move up and down the chain of engines easily, so that the best tool for the job can be used as the design proceeds through the development process.

Earlier this week, Synopsys announced their Verification Continuum Platform. This is a major rewrite of the front ends of all of the various verification engines so that they have a common input, common scripts and so on. This gives seamless transitions between engines and a consistent interface for setup, runtime and debug. Earlier in the year, Synopsys announced Verification Compiler which pulled together formal and simulation into one environment. With this week’s announcement, that has now been broadened to pull in virtual platforms, emulation and FPGA prototyping too.


The VCS front-end is used for all compilation, analysis, elaboration, debug preparation, optimization, code-generation, synthesis and mapping. One immediate effect is that compile times for emulation are up to 3 times as fast as before.


There is also unified debug with Verdi across all the technologies, from Virtualizer through simulation, formal techniques and ZeBu emulation to FPGA prototyping with HAPS.

The result is a “shift left” of the bug discovery process, enabling bugs to be found earlier and enabling software development to be done earlier. Obviously this has the potential to accelerate product development schedules significantly and, by making the technology easier to use, increase the use of higher-performance technologies such as emulation, FPGA prototyping and virtual prototyping.


More articles by Paul McLellan…


Dominating FPGA clock domains and CDCs

by Don Dingee on 09-26-2014 at 7:00 am

Multiple clock domains in FPGAs have simplified some aspects of designs, allowing effective partitioning of logic. As FPGA architectures get more flexible in how clock domains, regions, or networks are available, the probability of signals crossing clock domains has gone way up.