SemiWiki – Page 773 – The Open Forum for Semiconductor Professionals

January 27, 2015

Shorten the Learning Curve for High Level Synthesis

by Daniel Payne on 01-27-2015 at 4:30 pm
Categories: EDA

When chip designers moved from a gate-level design methodology to coding with RTL there was a learning curve involved, and the same thing happens when you move from RTL to High Level Synthesis (HLS) using C++ or SystemC coding. One great shortcut to this learning curve is the use of pre-defined library functions. I just heard about a new library of 1D signal processing hardware, ready to use in an HLS flow, from Calypto.

The library is called CatWare, and you get a set of filter and FFT models that can be customized with parameters and then synthesized using the Catapult tool. Some designers prefer C++ while others are attracted to SystemC; either way, the CatWare libraries support both languages. You add these library models to your C++ or SystemC source code, and then set the parameters, like:

Input precision
Output precision
Number of stages
Number of taps
Architecture

Synthesis with Catapult is where you constrain the design to use a specific technology and define the clock frequency. Because you receive the source code for each library, you can even change it to better meet your needs or create derivatives.
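To make the idea of a parameterized block concrete, here is a minimal Python sketch of a constant-coefficient FIR filter whose tap count and input/output precision are set through a configuration object. It is not CatWare code (the real library is C++/SystemC and its interfaces are not shown in this article); the names `FirConfig`, `saturate` and `fir_filter` are hypothetical, and saturation is just one possible overflow policy chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FirConfig:
    """Hypothetical parameter set, loosely mirroring the knobs listed above."""
    coefficients: Sequence[int]  # integer tap weights; len() is the number of taps
    input_bits: int              # signed input precision
    output_bits: int             # signed output precision


def saturate(value: int, bits: int) -> int:
    """Clamp value to the range of a signed two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def fir_filter(cfg: FirConfig, samples: Sequence[int]) -> List[int]:
    """Direct-form FIR built on a shift register; one output per input sample."""
    taps = [0] * len(cfg.coefficients)
    out = []
    for x in samples:
        x = saturate(x, cfg.input_bits)
        taps = [x] + taps[:-1]  # shift the newest sample into the delay line
        acc = sum(c * t for c, t in zip(cfg.coefficients, taps))
        out.append(saturate(acc, cfg.output_bits))
    return out


if __name__ == "__main__":
    cfg = FirConfig(coefficients=(1, 2, 3), input_bits=8, output_bits=16)
    print(fir_filter(cfg, [1, 0, 0, 0]))  # impulse response equals the coefficients
```

Changing `coefficients`, `input_bits` or `output_bits` gives a different filter without touching the algorithm, which is the appeal of a parameterized library model.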

Each of the CatWare models has been run through a set of regression tests, so you’ve got something that has been verified in Simulink, C++ and RTL simulations already. Two extra verification techniques are also used: assertion synthesis and SLEC C property checking, helping to verify consistent behavior through synthesis.

Here’s what you get in the FFT library:

[TABLE]
|-
| CatWare FFT Blocks | Radix 2 Fixed Point DIT | Radix 2 Fixed Point DIT | Radix 2 Fixed Point DIF | Radix 2 Fixed Point DIF | Radix 2^2 Fixed Point DIF | Mix Radix Fixed Point (2 and 2^2) DIF | Configurable Radix Fixed Point DIF
|-
| Supported Architectures | Single Delay Feed-back | In Place | Single Delay Feed-back | In Place | Single Delay Feed-back | Single Delay Feed-back | In Place
|-
| Synthesizable C++ Model | √ | √ | √ | √ | √ | √ | √
|-
| Synthesizable SystemC Model | √ | √ | √ | √ | √ | √ | √
|-
| Simulink Model | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Bit Precision | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Stage-wise Scaling | √ | √ | √ | √ | √ | √ | √
|-
| Configurable FFT Point | √ | √ | √ | √ | √ | √ | √
|-
| Streaming Interfaces | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Delay Buffer Impl. (Register/Memory) | √ | NA | √ | NA | √ | √ | NA
|-
| Constant Twiddle Implementation | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Radix | NA | NA | NA | NA | NA | NA | √
|-
| Option to Mix Radix | NA | NA | NA | NA | NA | NA | √
|-
| Configurable Output Order (Natural or Bit-Reversed) | X | X | X | X | X | X | √
|-
| C++ Interface Synthesis | √ | √ | √ | √ | √ | √ | √
|-
| Multiview IO – HLS/TLM | √ | √ | √ | √ | √ | √ | √
|-

For filters, the CatWare library has:
[TABLE]
|-
| CatWare Filter Blocks | FIR Constant Coefficient | FIR Programmable Coefficient | FIR Loadable Coefficient | CIC Interpolator & Decimator | Moving Average | Integrate & Dump | Poly Phase Interpolator & Decimator
|-
| Supported Architectures | Shift Register; Circular Buffer; Rotational Shift; Folded – Even Taps; Folded – Odd Taps; Transpose | Shift Register; Circular Buffer; Rotational Shift; Folded – Even Taps; Folded – Odd Taps; Transpose | Shift Register; Circular Buffer; Rotational Shift; Folded – Even Taps; Folded – Odd Taps; Transpose | Limited Precision^; Full Precision | 1D Windowing | NA | NA
|-
| Synthesizable C++ Model | √ | √ | √ | √ | √ | √ | √
|-
| Synthesizable SystemC Model | √ | √ | √ | √ | √ | √ | √
|-
| Simulink Model | X | X | X | √ | √ | X | √
|-
| Multi-Channel Support | X | X | X | √ | X | √ | X
|-
| Configurable Bit-Widths | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Rate | NA | NA | NA | √ | NA | NA | √
|-
| Configurable Number of Taps | √ | √ | √ | NA | NA | NA | √
|-
| Streaming Interfaces | √ | √ | √ | √ | √ | √ | √
|-
| Configurable Window Type (Clip or Mirror) | NA | NA | NA | NA | √ | NA | NA
|-
| Configurable Differential Delay | NA | NA | NA | √ | NA | NA | NA
|-
| C++ Interface Synthesis | √ | √ | √ | √ | √ | √ | √
|-
| Multiview IO – HLS/TLM | √ | √ | √ | √ | √ | √ | √
|-

Summary

Give HLS a try on your next DSP design, and shorten your learning curve by using parameterized libraries for filters and FFT functions. This approach is sure to save you many days of engineering effort compared to starting from scratch.

January 27, 2015 / June 14, 2019

Sigrity Focuses on LPDDR4 Compliance Analysis in 2015 Release

by Tom Simon on 01-27-2015 at 10:00 am
Categories: Cadence, EDA

It was back in July of 2012 that the acquisition of Sigrity by Cadence was announced. Although Cadence is a dominant player in both IC and board layout tools, they did not have an electromagnetic (EM) signal integrity solution in their portfolio. This acquisition marks a turning point for the EM/SI sector – tight integration into the design flow is now exceedingly important for designers. As always there is a lot of anticipation to see how things go after a smaller technology company is brought in under the wing of a larger player.

Cadence is announcing the features in the new 2015 release of Sigrity at DesignCon this week. This is an ideal time and place for the announcement. DesignCon is a leading event in the board design space. The tag line for DesignCon is “where the chip meets the board”. With that in mind, the main features of the Sigrity 2015 release are right on target. Of the things in their press release, the one that jumps out is added support for analysis and qualification of designs using the new LPDDR4 high-speed low-power memory interface standard.

The driver for this is the growth of mobile devices – more data and more graphics. According to a presentation at Mobile Forum 2013 by Samsung’s JungYong Choi, data traffic on PCs will grow by only 2X from 2012 to 2017, but smartphone data traffic will grow 8X in the same timeframe. Concurrently, higher resolution displays on mobile devices are also driving bandwidth. UHD needs 9X the bandwidth that HD does.

The JEDEC specification for LPDDR4 came out in August of 2014. Not only is it a faster interface standard, but it uses less power. Power is reduced by lowering the operating voltage from 1.2V in LPDDR3 to 1.1V in LPDDR4. Significant power is also saved by using Low Voltage Swing Terminated Logic. This uses a signaling voltage of around 400mV, 50% lower than the previous LPDDR3. Transmitting a ‘0’ requires almost no current. To save even more power, data bus inversion (DBI) is used to invert the data bits when a byte has more ‘1’s than ‘0’s. Overall LPDDR4 is said to use 40% less power than LPDDR3.
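The DBI idea is easy to sketch. The Python below follows only the description in this article (invert a byte when it holds more ‘1’s than ‘0’s, and signal that with one extra flag bit); the function names are hypothetical and the exact rules in the JEDEC specification may differ in detail.

```python
from typing import Tuple


def dbi_encode(byte: int) -> Tuple[int, int]:
    """Return (data_on_bus, dbi_flag) for one 8-bit value.

    Assumption taken from the article: a byte is inverted when it contains
    more '1's than '0's, so that fewer '1's are driven onto the bus.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    ones = bin(byte).count("1")
    if ones > 4:                    # more '1's than '0's in an 8-bit value
        return (~byte) & 0xFF, 1    # send the inverted byte, flag set
    return byte, 0                  # send the byte unchanged, flag clear


def dbi_decode(data: int, dbi_flag: int) -> int:
    """Recover the original byte from the bus data and the DBI flag."""
    return (~data) & 0xFF if dbi_flag else data


if __name__ == "__main__":
    for value in (0x00, 0x0F, 0x1F, 0xFF):
        data, flag = dbi_encode(value)
        print(f"{value:#04x} -> bus {data:#04x}, DBI {flag}")
```

With this scheme no byte on the bus ever carries more than four ‘1’s, which matters because, as noted above, transmitting a ‘0’ requires almost no current.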

LPDDR4 runs at 3.2B transfers per second; at the bus level that translates to a bandwidth of 25.6GB/s (2ch). The bus is divided into 2 parallel 16 bit buses with their own clocking to help improve signal line layout and avoid skew issues.
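The bandwidth figure is simple arithmetic: transfers per second times bytes moved per transfer. The sketch below assumes that the 25.6GB/s number corresponds to 64 data bits in total across the two channels; that width is my assumption, since the text above only says “2ch”. The function name is hypothetical.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: float, data_bits: int) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes) for a given transfer rate and bus width."""
    bytes_per_transfer = data_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # Assumed total width of 64 data bits (not stated in the article).
    print(peak_bandwidth_gb_s(3200, 64))
    # A single 16-bit bus at the same transfer rate.
    print(peak_bandwidth_gb_s(3200, 16))
```

Under that assumption, 3200 MT/s over 64 bits works out to 25.6 GB/s, and each individual 16-bit bus carries a quarter of that.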

With low swing voltage and higher operating frequency comes the need for added diligence on interconnect to memory chips. Signal traces need to be analyzed to ensure that they do not contribute to excessive bit error rates. According to Brad Griffin, Product Marketing Director at Cadence, trace geometry and ground plane structure affect signal integrity and need to be modeled and simulated to understand how the whole system will operate. Ground bounce and power rail integrity will also have a dramatic effect on LPDDR4 compliance.

Sigrity 2015 not only provides modeling for board and package interconnect, but can simulate the system using its power aware system signal integrity feature to ensure that the bit error rate complies with the specification. Sigrity ties into package level simulation and up through chip level models that can be generated in the Cadence suite.

Looking at things from a system perspective, it seems that Cadence is well positioned to leverage its entire flow to help validate and qualify designs from on-chip all the way to the board and system level. After all, things will only get more complicated. The next revision of the LPDDR4 spec will go from the present 3200 MT/s to 4266 MT/s, a much higher transfer rate. Griffin expects channel equalization to become a necessity before long. If so, this will add the requirement to model equalization algorithmically to perform end to end qualification.

I look forward to visiting the Cadence booth at DesignCon this week to learn more about Sigrity 2015.

January 27, 2015

FPGA vendor to buy IC vendor Silicon Image

by Eric Esteve on 01-27-2015 at 9:35 am
Categories: FPGA, IP

Interesting news today: Lattice Semiconductor, an FPGA vendor, is buying Silicon Image. Silicon Image is a chip vendor, but also an innovator, licensing well-known IP like HDMI, MHL and more. If we compare the amount paid for Silicon Image ($600 million) with the last full-year revenue, $276M in 2014, that’s a 2.2X multiple… not really a success story for Silicon Image!

I have been following the company since 2007, when I wanted to understand the SATA IP market and Silicon Image was #1 with 80% market share, or $16M of a $20M market. I have continued to look at the company since then, as HDMI technology was starting a penetration that today stands at 100% in many segments like PC or TV, and is very strong in adjacent segments like mobile. At that time Silicon Image was credited with (almost) 100% of the HDMI IP segment, no surprise.

If you take a look at the above results, you will see that the company was healthy and growing fast between 2005 and 2007, going from $212M to $320M in two years, which is more than 50% growth! But if you look at the results for the last three years (below), you will see that none of the annual revenues surpasses the one achieved in 2007:

So what has happened to Silicon Image?

Looking at the company’s behavior through the prism of IP, we can say that Silicon Image has been a real innovator, inventing HDMI (after DVI). HDMI was a true revolution in the consumer market, as you can transfer images through a high-speed serial link, and that makes a real difference! Just try to use what we call a « Peritel » in France, a parallel link, and you will rush to buy an HDMI cable.

Along with the patent, Silicon Image created « HDMI Licensing LLC » to collect the royalties, 4 cents per HDMI port.

On top of HDMI Licensing LLC, Silicon Image created the Authorized Test Center (ATC) policy: “The HDMI Founders have established Authorized Testing Centers (ATC) where licensed manufacturers can submit their products for compliance testing. Upon successful compliance testing, take advantage of the HDMI Product Finder to promote your Fully-Compliant products.” Good idea? Certainly for Silicon Image, as their direct competitors had to go through these ATCs to have the right to use the “HDMI” logo! Such a policy can be a good way to accelerate Silicon Image’s time-to-market… or slow down competitors’ releases.

From informal discussion that I had with application processor chip makers integrating HDMI –and paying the royalties to HDMI Licensing, it seems that these chip makers were upset by Silicon Image. Arrogance is not a guarantee of success…

To come back to the IP business, Silicon Image was making $50M from licensing in 2007 (SATA + HDMI + DVI) but around $44M in 2013 (replace DVI with MHL). For a product family (Interface IP) exhibiting 10% CAGR between 2007 and 2013, this is clearly an underperformance. To make a comparison, on the same products (SATA and HDMI) an IP vendor like Synopsys made $5M in 2007, but $25M in 2013!

We can’t explain everything by only looking at the IP revenue evolution, but it’s clear that Silicon Image had some nuggets like HDMI in their portfolio and a good strategy back in 2007, but failed to sustain it over the long term and to develop new technologies or new markets. That’s why Silicon Image revenue was 20% lower in 2013 than in 2007, and that’s why Lattice has acquired the company for 2.2X the yearly revenue…

Eric Esteve from IPNEST

January 27, 2015 / June 14, 2019

ANSYS Talks About Multi Physics for Thermal Analysis at DesignCon

by Tom Simon on 01-27-2015 at 9:00 am
Categories: Ansys, Inc., EDA

ANSYS makes a big deal of being a multi-physics company. Still it has taken them a while to fully integrate Apache. Nevertheless it seems like there is a compelling argument for combining technologies to solve SOC design problems. Frankly most chip designers would be hard pressed to think of a reason for using computational fluid dynamics (CFD). However it turns out that there is good reason to use it when looking at a comprehensive solution for determining electromigration.

ANSYS’s Ravi Ravikumar shared with me some slides that they are using at DesignCon this week in Santa Clara which outline the flow for thermal analysis of contemporary system designs, such as cell phone processors and other SOCs. Their flow accounts for cases where 3D ICs are used as well. The thing that stands out is that unless you understand the chip in detail along with the environment it is in, including interposer, package, board and cooling regime, you cannot come up with good thermal data – for example like what is needed to determine electromigration effects.

Sources of self heating include dynamic and static device power, as well as on-chip interconnect, bumps and PCB interconnect. Interconnect power consumption goes up linearly with temperature. The real problem is leakage or static power dissipation. We all know at advanced nodes leakage power has grown as a percentage of overall chip power. What is important for electromigration analysis is that leakage power is temperature dependent, dramatically increasing with temperature. So really we cannot talk about how much power the chip is dissipating until we know the temperature. This is where it starts to get interesting.

Until we take first-cut power numbers and propagate them from the chip to the interposer, through the package, board and whatever thermal environment the board is in, we can’t get to the actual power numbers. In fact, once we update the power numbers, we will have new self heating data to propagate again through this flow. It turns out to be an iterative process that eventually will converge. However it requires a tool chain that can accurately calculate results for each level of the design.
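The iteration described here can be sketched as a fixed-point loop. The Python below is a toy lumped model, not the ANSYS flow: it assumes a single thermal resistance from die to ambient and leakage that doubles every fixed number of degrees. All names and parameter values are hypothetical, chosen only to show how power and temperature are re-evaluated until they stop changing.

```python
from typing import Tuple


def leakage_power(temp_c: float, leak_ref_w: float,
                  ref_temp_c: float, doubling_c: float) -> float:
    """Assumed leakage model: power doubles every `doubling_c` degrees above ref_temp_c."""
    return leak_ref_w * 2.0 ** ((temp_c - ref_temp_c) / doubling_c)


def solve_operating_point(dynamic_w: float, leak_ref_w: float, ambient_c: float,
                          r_th_c_per_w: float, ref_temp_c: float = 25.0,
                          doubling_c: float = 30.0, tol_c: float = 1e-6,
                          max_iter: int = 100,
                          max_temp_c: float = 1000.0) -> Tuple[float, float, int]:
    """Iterate power -> temperature -> power until the temperature stops changing.

    Returns (temperature_c, total_power_w, iterations). Raises RuntimeError if
    the loop diverges (thermal runaway) or does not settle within max_iter.
    """
    temp = ambient_c  # first-cut guess: die at ambient temperature
    for iteration in range(1, max_iter + 1):
        power = dynamic_w + leakage_power(temp, leak_ref_w, ref_temp_c, doubling_c)
        new_temp = ambient_c + r_th_c_per_w * power  # lumped thermal resistance
        if new_temp > max_temp_c:
            raise RuntimeError("temperature diverging: possible thermal runaway")
        if abs(new_temp - temp) < tol_c:
            return new_temp, power, iteration
        temp = new_temp
    raise RuntimeError("did not converge within max_iter iterations")


if __name__ == "__main__":
    temp, power, n = solve_operating_point(
        dynamic_w=2.0, leak_ref_w=0.5, ambient_c=25.0, r_th_c_per_w=10.0)
    print(f"converged after {n} iterations: {temp:.2f} C, {power:.3f} W")
```

With these made-up numbers the first-cut estimate of 2.5 W settles at about 3 W and 55 °C, illustrating why power cannot be quoted until the temperature is known. A real flow replaces the single thermal resistance with chip, package, board and enclosure models.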

In the ANSYS flow here is what this looks like. At the chip level, their Chip Thermal Model (CTM) produced by Totem and/or Redhawk breaks each layer of the chip into small squares and characterizes them at several different temperature points for power consumption. This includes devices and all interconnect layers. With initial temperature information, this can be used by the iterative flow described below to predict power dissipation – leading to better temperature numbers. The chip level information is fed to Sentinel-TI which can take the CTM models to make more compact models that contain thermal information for the die.

Sentinel-TI predicts the thermal behavior of the package. Next we have to consider power dissipation on the board. ANSYS SIwave is used for this purpose. However, unless the housing and external cooling are accounted for, these numbers don’t mean anything. This is where ANSYS Icepak comes in with the computational fluid dynamics. You were probably wondering when this was going to get brought up again. Icepak looks at things like airflow and heat transfer in the board housing – i.e. cell phone chassis.

It should be mentioned that different chip thermal models are produced for different modes of operation of the chip. Clearly the power consumed by a chip depends on activity information. Viewing a video will consume much more power than reading email, for instance. The ANSYS flow can accommodate different modes of operation and can even give information about temperature rise given certain usage profiles, such as bursts of more compute-intensive usage, etc.

Delving in further, there are internal issues in interposer designs that require rigorous analysis. Heat will flow from internal sources through the microbumps and/or interposer to get to the exterior of the package. For example, when there are TSVs, there will be metal on the back side of the substrate. This will affect thermal flux. Sentinel-TI can analyze for these and many other cases.

Getting back to the reliability issues related to properly analyzing for electromigration, this flow looks to do a much better job than using guidelines that do not include accurate operating temperature information. Below is a graphic showing calculated electromigration effects with and without consideration of on-chip operating temperature.

ANSYS makes a strong case for using multi-physics for analysis of semiconductor designs. One would be hard pressed to think of another company that can provide a solution that combines such breadth of analysis in solving these tough design problems.

I expect the DesignCon presentation to go into much more detail than I have been able to cover here.

January 26, 2015 / April 10, 2025

Silvaco TCAD Webinar

by admin on 01-26-2015 at 4:50 pm
Categories: EDA

TCAD is a somewhat specialized area since not that many people design semiconductor processes compared to the number who design chips. But without TCAD there would be no chips. One area where the two domains intersect is that of SEE, where neutrons (mainly) can cause a flop or a memory bit to change. Since we live on a radioactive planet, that is not going away, and the smaller a transistor, the less it takes to flip it.

Silvaco have a webinar coming up on the topic titled Simulating Total Dose, Prompt Dose, Damaging Fluence and SEU Using TCAD. It is Tuesday February 17th from 10-11am Pacific time. Here is what it will cover:

Introduction to a newly available and recently declassified Total Dose model
Description of the physical mechanisms accounted for in the Total Dose Model, including radiation induced de-trapping of trapped oxide holes.
How certain bias conditions during radiation can reduce the trapped hole concentration in radiation hardened oxides, leading to radiation induced threshold voltage recovery (this is NOT the normal “rebound” effect caused by the slow formation of interface traps).
How to simulate a particle fluence that creates damage in the semiconductor
How to simulate transient, very high dose rate “prompt” events.
Simulating other, more traditional high energy Single Particle Events (SEE)
Examples include, threshold voltage shift and inter device leakage from Total Dose oxide charging, Image Sensor Damage from a fluence of protons, Prompt Dose effects on a circuit, Single Event Burnout (SEB) of a power PiN diode, Single Event Upset (SEU) of a 22nm SRAM

The presenter is Derek Kimpton, Principal Applications Engineer at Silvaco, who spent four years characterizing radiation effects on devices at Plessey Semiconductors in Lincoln, England. Whilst there he published the paper in Solid-State Electronics on a new and predictive total dose oxide charging model, that is the basis for the code implemented in Silvaco’s latest TCAD Victory Device simulator.

Who should attend? Well, all in the radiation effects community, with an interest in the simulation of radiation effects on electronic devices using physics based (TCAD) tools. But given that all chips are affected by radiation, everyone is at least peripherally affected by this. So this is an area of increasing importance, even to people who think it doesn’t affect them.

More details and registration are here.

30+ Years of Semiconductors – The base matters!

by Pawan Fangaria on 01-25-2015 at 10:00 am
Categories: EDA, Foundries

Although CMOS technology in semiconductors was patented in the 1960s, commercial ICs and electronic systems based on CMOS ICs started picking up in the 1970s, and the real growth came with the personal computer (PC) market in the 1980s. Then Intel microprocessors started dominating the semiconductor market with increasing processing speed and correspondingly high memory demand. Catalyzing that growth from the software side was Microsoft Windows; we have all heard the “Wintel” jargon that ruled the PC market, with PCs having Intel processors and Windows running on them. The market grew most rapidly in the 1980s (CAGR 16.8% in 1989) and 1990s (CAGR 13.6% in 1999) before two recessions hit the market in the first decade of the new century, bringing the CAGR down to 0.5% in 2009.

Thanks to IC Insights for providing these statistics in its report. The CAGR in this decade is expected to be at 4.1% (that is less than 9% averaged over 30 years), even though wireless networks and IoT connections are expected to rise at a maximum CAGR of ~19% to ~22.5% while the cellphone market is expected to maintain a CAGR of around 9%.

The point to note here is that even after the global recessions in 2001 due to the dot-com bust and in 2008 due to the subprime crisis, base semiconductor earnings have remained steady and have been rising in this decade. True, the 1980s and 1990s showed maximum growth, but the base was very low, ~$10B – $43B in the 1980s, and went up to ~$139B in the 1990s. Today, in 2014, it’s expected to be at $333B. So even a 4.1% CAGR over this whole decade is not bad (in my opinion; of course I would be happy if it were more); the base is what matters. Any more growth due to IoT and smartphones would be icing on the cake! IoT is expected to see maximum growth, but with a low base at ~$3.3B, and wireless at ~$9.4B.
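The “base matters” argument comes down to two small formulas: the compound annual growth rate, and the absolute amount a given rate adds to a given base in one year. The sketch below uses hypothetical function names and plugs in figures quoted in this article ($333B at 4.1%, and the IoT base of ~$3.3B at roughly 22%); it is only an illustration of the arithmetic, not a reproduction of the IC Insights analysis.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def one_year_gain(base: float, rate: float) -> float:
    """Absolute amount added in one year when `base` grows at `rate`."""
    return base * rate


if __name__ == "__main__":
    # Figures in $B and rates as quoted in the article.
    total_market_gain = one_year_gain(333.0, 0.041)
    iot_gain = one_year_gain(3.3, 0.22)
    print(f"total market: +${total_market_gain:.1f}B, IoT: +${iot_gain:.2f}B")
```

A modest rate on the large base adds roughly $13.7B in a year, while a rate more than five times higher on the small IoT base adds well under $1B.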

It’s a common phenomenon: a small-cap company grows much faster than a mid-cap, and a mid-cap faster than a large-cap. There are several pitfalls for a mid-cap company on the way to becoming large-cap, but after attaining large-cap status, it becomes steady. The semiconductor industry is in a sweet spot; not company-wise, but sector-wise it is seeing a large-cap phenomenon. So it will definitely not see double-digit growth, but the base is high enough that it has to sweat to retain single-digit growth.

Today, semiconductors have entered every aspect of our lives and have become an essential ingredient. Why was the semiconductor market more or less steady during the recessions, with no growth or minimal decline in particular years? The essentials had to be maintained for consumption; expansions were curtailed and businesses consolidated, which stopped growth; however, people had to keep their PCs running over the network, office equipment working, cars with all their electronics running, home appliances working, medical and healthcare devices taking care of them, and so on. Semiconductors have become a daily consumable, like food.

It will not go down from here. Yes, it can rise with new growth drivers like IoT. Okay, in this decade IoT can rise at ~22% with low base, but in the next decade we can see the same story – the base will steadily increase with high % CAGR initially and then moderating before decreasing! Let’s see.

IBM to Humiliate 20% of Workforce?

by Daniel Nenni on 01-25-2015 at 7:00 am
Categories: General

There are four big technology companies that I grew up with: Intel, Apple, Microsoft, and IBM. I still follow all four but it is sometimes hard to watch. Last week there was talk of a massive layoff at IBM and I have just confirmed it with my Upstate sources. According to an article in Forbes it will be 25% of the more than 400,000 workforce. According to my sources it will be closer to 20% but that is still more than 80,000 people who will lose their jobs. IBM is calling it a volunteer Transition to Retirement (T2R) which is a nice name for being terminated I guess. You can read about IBM’s 11th quarter of declining revenue HERE but take some aspirin first.

Let’s take a look at what happened after the x86 Server Division acquisition by Lenovo last October. Here is a comment from Alliance@ IBM (the official IBM Employees’ Union website) that suggests IBM “stuffed” the acquisition with employees that should not have been included:

Comment 01/06/15: Just heard from a former co-worker who was transitioned to Lenovo last year… As they returned to work for 2015 on 1/5/15 approximately 70% of the IBM’ers who transitioned to Lenovo as part of the System X buyout were told that they would be offered a package if they leave voluntarily. The majority of folks affected are said to be in the Development group. Folks in development who have been with IBM for 15 years are being offered one year of salary plus a year of health benefits. Folks outside development who had been with IBM for 25 years are being offered the same package. People who received the notice have one week to decide, if they volunteer to leave then their last day will be January 30th. If not enough people volunteer to leave then Lenovo will likely do mandatory layoffs. Lenovo is blaming IBM for sending too many duplicate and non-essential positions over with the buyout and not adequately reporting on what folks actually did relating to System X. I believe this to be accurate based on things I saw during the phase to “determine if you were in scope for the transition”. Our manager at the time told us you had to work on System X for at least 51% of your time to even be considered, but friends in other divisions were saying that people on their teams who didn’t even touch System X were being sent over. The common factor (anecdotal) was the person’s age. The theory at the time: “let the layoff be on Lenovo’s books and spare IBM from further claims of ageism”

You gotta love open forums! They are the best sources of information, absolutely. If you read some of the more recent comments you will see other claims of IBM abusing senior staff with Performance Improvement Programs as a motivation for the T2R “volunteer” program.

Comment 01/23/15: I’m also on T2R —- was planning to retire at the end of this year. I also got a 3 today without warning and told I would need to be on a PIP. Totally shocked. According to the T2R documentation, we are exempt from resource actions but not performance based termination. I think IBM is trying to figure out a way to get rid of us all early.

The question I have is: How is this T2R going to affect the IBM Semiconductor Operations acquisition by GlobalFoundries? If I was GF I would be VERY wary of employee stuffing!

January 24, 2015

Industrial Internet “In-Security” – Awaiting a Cyber Pearl Harbor?

by Charles DiLisio on 01-24-2015 at 7:00 pm
Categories: IoT

You feel violated when internet intruders (hackers) cause digital harm (theft of social security numbers, credit cards, logins, e-mails or addresses); however, it’s frightening when organized cyber attacks destroy critical physical infrastructure (disrupt water, power or gas). It’s annoying having to update passwords or get a new credit card. How unnerved would you be if the power is out for weeks or you don’t have gas for your car? This is the new age of cyber-terrorism awaiting us as we connect more of our critical infrastructure to the Internet.

January 24, 2015

Measuring Metastability

by Jerry Cox on 01-24-2015 at 7:00 am
Categories: EDA

Measuring metastability is just 50 years old this year. In 1965 my colleague Tom Chaney took a sampling ‘scope picture of an ECL flip-flop going metastable. S. Lubkin had made mention of the phenomenon over a decade before that, but at that time most engineers were unaware of the phenomenon or did not believe it actually existed. Later many who saw the sampling ‘scope picture doubted the method’s validity. Subsequently, flip-flop output-voltage traces, patiently photographed by Tom in a darkened room, began to turn the tide. This led to a paper that was rejected because one reviewer (perhaps an electrical engineer) saw a simple analog circuit that he felt was old and uninteresting and another reviewer (perhaps a computer scientist) said metastability could never occur so the paper should be rejected. Later in 1973 the classic Chaney and Molnar paper was accepted for publication in the IEEE Transactions on Computers.

Reliable measurement of metastability in synchronizers and arbiters has been hard to realize. Many subtle problems deceive the unwary and adequate simulation tools and models became available only in the 1980s. Here is a list of several of the major problems:

Measurements in silicon can only be made in circuits specially designed for that purpose. Simulation in advance of fabrication should be done for any other circuit and is preferable to avoid product re-spins.
Shorting the metastable nodes, either in silicon or in simulation, seems painless, but actually yields erroneous results. This is because the method is only valid if the circuit behaves symmetrically. However, hardly any synchronizers really do.
Most synchronizers today are master-slave designs and the recovery characteristics of the master and slave latches are usually quite different. This requires measurement of the characteristics of both latches and the calculation of an effective settling time-constant.
The effective settling time-constant is a function of the clock waveform and duty-cycle.
The load on the output of a synchronizer affects its behavior so care must be taken to include the subsequent circuit in simulations.
The component stages in a multistage synchronizer interact with each other invalidating the simple concept of multiplying their failure probabilities.
Besides clock domain crossings, the potential for metastability hides in many surprising places: initialization logic, flip-flop reset signals, memory interfaces and analog input circuits.
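For readers new to the topic, the first-order textbook estimate of synchronizer reliability is MTBF = exp(t / tau) / (T_w * f_c * f_d), where t is the time allowed for settling, tau is the settling time-constant, T_w is the metastability window, and f_c and f_d are the clock and data-transition rates. The sketch below implements only that simple model, with hypothetical function and parameter names; as the list above explains, it does not capture master-slave differences, clock duty-cycle, output loading or stage interaction, and it is not a description of how MetaACE works.

```python
import math


def synchronizer_mtbf(settling_time_s: float, tau_s: float, window_s: float,
                      clock_hz: float, data_hz: float) -> float:
    """First-order mean time between failures, in seconds.

    Textbook model: MTBF = exp(t / tau) / (T_w * f_c * f_d).
    All arguments are in SI units (seconds and hertz).
    """
    if min(tau_s, window_s, clock_hz, data_hz) <= 0 or settling_time_s < 0:
        raise ValueError("parameters must be positive (settling time may be zero)")
    return math.exp(settling_time_s / tau_s) / (window_s * clock_hz * data_hz)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    mtbf_s = synchronizer_mtbf(settling_time_s=0.8e-9, tau_s=20e-12,
                               window_s=30e-12, clock_hz=1e9, data_hz=1e8)
    print(f"estimated MTBF: {mtbf_s / (3600 * 24 * 365):.3g} years")
```

The exponential term is why the settling time-constant matters so much: each additional tau of settling time multiplies the estimated MTBF by e, so a small error in tau produces a very large error in the prediction.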

Synchronizer designers need a tool for metastability analysis that overcomes these problems, has been proved in silicon and is easy to use. MetaACE from Blendics fits that need, but does not help with the collateral need for an educational tool that makes it easy for synchronizer designers, SoC engineers and engineering students to learn about metastability in its many manifestations.

To meet this educational need, Blendics is announcing MetaACE LTD, a version of MetaACE that limits the number of netlist nodes it can analyze to 250 or fewer. This node limitation does not otherwise narrow the functionality of the tool. MetaACE LTD is sufficient to handle most unextracted netlists and many netlists with capacitance-only extractions. Doing capacitance-only simulations also improves the run-time with only a small loss in accuracy. Other than the node limit, the two tools are the same and share GUIs and file formats.

MetaACE LTD is available for free download. There will be a webinar describing its use on Wednesday 18 Feb 2015 at 11 AM PT. Also, a public synchronizer is soon to be available for download including an extracted netlist and transistor model. This public synchronizer was developed as a master’s thesis project at Southern Illinois University Edwardsville. It can provide a benchmark for comparison with your favorite synchronizer circuit and is a great way to try out MetaACE LTD.

January 23, 2015 / July 18, 2025

How Imagination tested the PowerVR Series6XT

by Don Dingee on 01-23-2015 at 10:00 pm
Categories: EDA, Synopsys

We have been hearing for some time about the Synopsys HAPS-70 and how they have co-created the hardware and software architecture for FPGA-based prototyping with their customers. Now, we see details published by Synopsys on how they collaborated with Imagination on the design of the PowerVR Series6XT GPU.

The first thing to come to grips with is just what a beast the PowerVR Series6XT GPU is. With up to eight unified shader clusters and an array of diverse co-processor units, testing all the configurations and concurrent execution of IP blocks pre-silicon is a tall order. The danger, as designs get larger and larger, is making an error in partitioning the design onto a prototype. This hazard multiplies when customers put the PowerVR Series6XT GPU into their own designs with other IP around it.

Synopsys and Imagination worked together to tackle the partitioning of a basic two-shader cluster, some of the GPU logic, and test logic allowing synchronization of stimuli from DDR3 storage and a connection to a PC host. This spanned four Virtex-7 FPGAs on a HAPS-70 S48. The biggest part of the two-week, manual effort was iterating the partitioning to get the right combination of logic and I/O multiplexing. The result was a prototype running at 8 MHz, which allowed 7000 regression tests to be run successfully – all pre-silicon.

When attempting to scale up to the full Series6XT GPU design, it became evident that the test logic that swallowed 90% of an FPGA in the initial prototype was going to exceed 100% quickly. The logical choice would be repartitioning again, but issues with I/O multiplexing using the “manual” synthesis rules would cut the system performance to 2 MHz. This would make evaluation of the full-up GPU with live video output excruciatingly slow.

Automation came to the rescue. ProtoCompiler has the ability to synthesize code versus HAPS-aware constraints, including interconnect. The teams upped the FPGA count to six, dialed in constraints including keeping FPGA utilization to 80%, and selected a pin-muxing strategy. By using the abstraction flows feature to explore FPGA-to-FPGA interconnects quickly, typically in less than a minute, ProtoCompiler was able to pick the best possible multiplexing ratio. The result was a full-up live video analysis prototype in five, not six, FPGAs running at 7.3 MHz.

One more performance tweak would make the difference. With the partitioning set, the chance to optimize interconnects came into play. The HAPS-70 supports a high-speed time-domain multiplexing feature on all connectors. ProtoCompiler understands how to assign source synchronous clocks, split multi-source nets, and other details to use the HSTDM feature. After a day of exploration of an HSTDM scheme, full-up performance was 12 MHz.
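Two pieces of arithmetic sit behind these numbers. The first is the multiplexing ratio: when more signals cross between two FPGAs than there are physical pins, several signals must share each pin. The second is how prototype clock speed translates into wall-clock time for a fixed workload. The sketch below is a generic illustration with hypothetical function names; it does not describe how ProtoCompiler chooses a ratio, and the signal and pin counts in the usage example are invented.

```python
def tdm_ratio(crossing_signals: int, physical_pins: int) -> int:
    """Smallest number of signals that must share each pin (at least 1)."""
    if crossing_signals < 0 or physical_pins <= 0:
        raise ValueError("need crossing_signals >= 0 and physical_pins > 0")
    return max(1, -(-crossing_signals // physical_pins))  # integer ceiling division


def run_time_s(cycles: float, clock_hz: float) -> float:
    """Wall-clock seconds needed to execute `cycles` clock cycles at `clock_hz`."""
    return cycles / clock_hz


if __name__ == "__main__":
    # Invented counts: 1,001 signals over 250 pins need 5 signals per pin.
    print(tdm_ratio(1001, 250))
    # The same workload at the article's 2 MHz and 12 MHz clock rates.
    cycles = 12e6
    print(run_time_s(cycles, 2e6) / run_time_s(cycles, 12e6))
```

The second calculation shows why the final number matters: a test that runs at 12 MHz would take six times longer on a 2 MHz prototype.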

This successful effort retains all of the benefits of FPGA-based prototyping. Executing design changes in RTL is quick and easy. A host connection and debug tools allow control and visibility into the design and the test environment, facilitating sophisticated tests such as video analysis via a compressor/decompressor and frame buffer. The power of a synthesis environment that has detailed knowledge of the prototyping platform also shows the potential.

Synopsys published these results via a presentation at the SNUG Japan sessions in September 2014, and a short article in the 4Q2014 edition of Synopsys Insight (on page 7). The author, Andy Jolley of Synopsys, who worked directly with the Imagination teams, is presenting a live webinar to discuss his findings on February 4, 2015 – the event is now open for registration:

Successful GPU IP Implementation on Synopsys HAPS Platforms using ProtoCompiler

Whether you are looking to use the PowerVR Series6XT GPU or just facing a design of similar complexity, the lessons learned from this development are worth a look.