
eSilicon and IDT Collaborate on Next-generation RapidIO Switches

by Paul McLellan on 07-18-2014 at 9:01 am

Earlier in the week, eSilicon and IDT announced a collaboration to accelerate development of next-generation RapidIO switches. These are used to meet the higher performance demands required for new wireless, embedded and computing infrastructures. The two companies will initially work together to develop RapidIO switches operating at 40Gbps per port, based on the RapidIO 10xN specification.

The switches developed under this program will enable the next generation of wireless base stations, such as cloud-RAN (which I previously wrote about here if you don’t know what it is), LTE-Advanced (LTE-A), and 5G. They will also find use in emerging architectures such as base stations co-located with high-performance computing (HPC) platforms.

IDT’s production 20 Gbps per port switches are currently the de facto standard for clustering DSPs, microprocessors and ASICs in deployed 3G and 4G base stations. Existing RapidIO switches from IDT are in virtually every 4G base station in the world. But a new generation of base stations is on the way, requiring higher performance and scalability. Indeed, the new switches will offer not only 40 Gbps performance but also 100ns latency and scalability to 4 billion nodes in a network. Wow, that’s a large network.
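
A quick sanity check on those headline numbers, as a minimal sketch. It rests on two assumptions that the article does not state: that a 40 Gbps port is built from four 10 Gbps lanes (one reading of the "10xN" name), and that "4 billion nodes" reflects a 32-bit device ID field. The constant and function names are purely illustrative.

```python
# Back-of-envelope check of the port speed and network size quoted above.
# ASSUMPTIONS (not from the article): 10 Gbps per lane, 4 lanes per port,
# and a 32-bit device ID field.

LANE_RATE_GBPS = 10   # assumed per-lane rate suggested by the "10xN" name
LANES_PER_PORT = 4    # assumed port width (N = 4)
DEVICE_ID_BITS = 32   # assumed width of the node address field


def port_bandwidth_gbps(lane_rate_gbps: int, lanes: int) -> int:
    """Aggregate raw bandwidth of one port in Gbps."""
    return lane_rate_gbps * lanes


def addressable_nodes(id_bits: int) -> int:
    """Number of distinct node IDs an id_bits-wide field can encode."""
    return 2 ** id_bits


if __name__ == "__main__":
    print(port_bandwidth_gbps(LANE_RATE_GBPS, LANES_PER_PORT))
    print(f"{addressable_nodes(DEVICE_ID_BITS):,}")
```

Under those assumptions the port comes out at 40 Gbps, and a 32-bit ID space holds about 4.29 billion nodes, which would be consistent with the "4 billion" figure.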

The plan is to combine eSilicon’s experience with 28nm implementation, including development of fast SerDes and custom memories, and complement that with IDT’s expertise in RapidIO design.


The requirements for RapidIO are largely driven by its need for use in wireless base stations, although it does have applicability in other systems. But base stations used to implement 4G, LTE or WiMAX are a particularly demanding application. They must:

  • Maximize the number of subscribers per antenna array/base station
  • Support more data bits per subscriber in the form of data and video (beyond narrowband voice)
  • Provide real-time data, video and voice aggregating to beyond 1 Mbps per subscriber
  • Minimize power consumption
  • Deliver more data per subscriber, up to 100 Mbps
  • Perform more processing per data bit per user in the FPGA/ASIC/DSP cluster
  • Support higher-speed handoffs between base stations
  • Handle more onerous Orthogonal Frequency-Division Multiple Access (OFDMA) Physical Layer Protocol (PHY)-based processing compared with current 3G platforms

RapidIO is not proprietary to IDT; it is an open standard developed by the RapidIO Trade Association. Here is the current RapidIO roadmap:

The eSilicon press release is here. The RapidIO Trade Association website is here.


More articles by Paul McLellan…


SemiWiki Big Data Exposed!

by Daniel Nenni on 07-17-2014 at 7:00 pm

You will be hard pressed to attend a conference and NOT hear the term Big Data these days. What is Big Data? One example is the data SemiWiki has collected over the past 3.5 years while more than one million users have passed through our site. My summer project, with my daughter the math major, is to harness this massive pile of data and make something useful out of it, absolutely.

The first interesting thing that I filtered out is the list of top-viewed blogs from the past 3.5 years:

2014
For I have seen the shadow of the curved touchscreen
by Don Dingee, Published on 01-10-2014

MakerSpace at CES, Atmel inside
by Paul McLellan, Published on 01-13-2014

If you still think that FDSOI is for low performance IC only…
by Eric Esteve, Published on 02-11-2014 09:02 AM

Carbon Design Systems – Secret of Success
by Pawan Fangaria, Published on 02-19-2014

The Great 28nm Debacle!
by Daniel Nenni, Published on 07-06-2014

2013
Interface Protocols, USB3, PCI Express, MIPI, SATA… the winners and losers in 2012
by Eric Esteve, Published on 01-08-2013

SEMulator3D – A Virtual Fab Platform
by Pawan Fangaria, Published on 05-30-2013

An EDA Acquisition that Worked
by Daniel Payne, Published on 08-14-2013

Intel Really is Delaying 14nm Move-in. 450mm is Slipping Too. EUV, who knows?
by Paul McLellan, Published on 08-24-2013

High-Sigma Standard Cell Optimization!
by Daniel Nenni, Published on 10-03-2013

2012
Interface Protocols, USB3, HDMI, MIPI… the winner and losers in 2011
by Eric Esteve, Published on 01-07-2012

TSMC 28nm Yield Explained!
by Daniel Nenni, Published on 03-04-2012

The best graphics chip is the one seen the most
by Don Dingee, Published on 04-10-2012 12:48 PM

A Brief History of Semiconductors
by Paul McLellan, Published on 10-25-2012

FinFET Process Modeling and Extraction at 16-nm and Below
by Daniel Payne, Published on 12-18-2012

2011
Clock Domain Crossing (CDC) Verification
by Paul McLellan, Published on 02-21-2011 04:12 PM

ARM vs Intel…Performance? Power? OS support? Or ubiquity?
by Eric Esteve, Published on 03-22-2011

TSMC Versus Intel: The Race to Semiconductors in 3D!
by Daniel Nenni, Published on 06-26-2011

AMS Design using Dongbu HiTek foundry and Tanner EDA Tools
by Daniel Payne, Published on 10-27-2011

3D Transistors @ TSMC 20nm!
by Daniel Nenni, Published on 11-06-2011

Also read: SemiWiki Exceeds One Million Users!


Palladium’s Little Brother Protium

by Paul McLellan on 07-17-2014 at 8:00 am

Today, Cadence announced Protium, a new FPGA prototyping platform for software development. During development of an SoC, the most appropriate methodology changes. In the early days, developing RTL, the primary tool is simulation. Then, as the blocks get bigger or as the whole chip starts to come together, typically simulation runs out of steam. Time to switch to emulation using Palladium. Then, once the RTL is pretty much stable, it becomes attractive to use FPGA prototyping, which is where Protium comes in (I guess the name is a hybrid of prototype and palladium). Protium can then be used for software development and debugging. Protium is not really intended for hardware debugging. If a hardware issue is detected then it is best to fall back to either emulation or simulation where the debug tools are much more powerful.


One problem with FPGA prototyping is that bring-up can be a challenge, typically measured in months. This severely detracts from the usefulness of the FPGA prototype since software development can’t really get going during this period. With Protium, Cadence have put a lot of effort into making this process much smoother, bringing what used to be a 3-month process down to 2 weeks. They actually use the Palladium front-end integrated compile engine (ICE). Halfway through the compilation there are then two possible routes for the design: into Palladium to verify the FPGA functionality, or into Xilinx’s compilation flow to actually create the FPGA bitstreams to program Protium.


Protium comes in various configurations. It is based on Xilinx Virtex-7 2000T FPGAs. There are two baseboard options, either 2 or 4 FPGAs. One or two boards can be used per chassis giving the following configurations:

  • 2 FPGAs: 25M gates (single board)
  • 4 FPGAs: 50M gates (single board)
  • 6 FPGAs: 75M gates (dual board)
  • 8 FPGAs: 100M gates (dual board)
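
The configuration list above can be read as a small lookup table. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the quoted gate counts can be compared directly against a design's size, which in practice depends on how well the design maps onto the FPGAs.

```python
from typing import Optional, Tuple

# (number of FPGAs, capacity in millions of gates, boards) from the list above.
PROTIUM_CONFIGS = [
    (2, 25, 1),
    (4, 50, 1),
    (6, 75, 2),
    (8, 100, 2),
]


def gates_per_fpga(fpgas: int, capacity_mgates: float) -> float:
    """Capacity contributed by each FPGA, in millions of gates."""
    return capacity_mgates / fpgas


def smallest_config(design_mgates: float) -> Optional[Tuple[int, int, int]]:
    """Return the smallest listed configuration that fits the design,
    or None if the design exceeds the largest listed capacity."""
    for config in PROTIUM_CONFIGS:
        if config[1] >= design_mgates:
            return config
    return None


if __name__ == "__main__":
    for fpgas, capacity, _boards in PROTIUM_CONFIGS:
        print(fpgas, "FPGAs:", gates_per_fpga(fpgas, capacity), "M gates each")
    print(smallest_config(60))
```

Every listed configuration works out to 12.5M gates per FPGA, and a hypothetical 60M-gate design would need the 6-FPGA dual-board option.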

There are two 150-pin daughtercard connectors per FPGA, up to 8 per board, with 1.8V signalling at speeds of up to 1.25Gbps. Bulk memory daughtercards can be added for DDRx, SRAM, flash, SD card etc, or custom memory cards. There is also SpeedBridge product interface capability, which handles the speed mismatch to buses such as PCI.

This fully automatic flow gives a prototype that runs at 3-10MHz. With manual guidance and adding memory cards (instead of implementing the memory in the FPGAs themselves) the performance is in the 10-30MHz range. And by doing a lot more manual work, guiding the FPGA partitioning and synthesis, mapping the clock directly and so on, the performance can be over 100MHz, although this is a slower and more manpower intensive process, roughly equivalent to how FPGA prototyping had to be done in the past.

So Protium offers 4X faster bring-up than the first-generation FPGA prototyping system, with 4 times the capacity and 3 times the memory. Compile time using the Palladium/Xilinx flow is 5X faster.


Cadence announced a second addition to the system development suite, to address low power verification of power-shut-off domains, voltage scaling domains and so on. This allows verification to be done using either simulation or emulation using Palladium. There is full support for power policy being expressed either in Si2 CPF or IEEE 1801 (UPF). Simulation can be used with a full 0/1/X/Z signal model to check for corruption, and check isolation and retention. Palladium can be used for system validation and analysis, measuring average and peak power in the context of running a software load.


More articles by Paul McLellan…


Intel Second Quarter Results

by Paul McLellan on 07-16-2014 at 4:03 pm

Intel announced their quarterly results earlier this week. Their mainline microprocessor business is doing well, especially the highest performance segments for servers, datacenters and cloud computing. Broken down by segment the numbers come out like this:

  • PC Client Group revenue of $8.7 billion, up 9 percent sequentially and up 6 percent year-over-year.
  • Data Center Group revenue of $3.5 billion, up 14 percent sequentially and up 19 percent year-over-year.
  • Internet of Things Group revenue of $539 million, up 12 percent sequentially and up 24 percent year-over-year.
  • Mobile and Communications Group revenue of $51 million, down 67 percent sequentially and down 83 percent year-over-year. Ouch, that has to hurt.
  • Software and services operating segments revenue of $548 million, down 1 percent sequentially and up 3 percent year-over-year.
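
To put the percentage changes above in context, the comparison-period revenue can be backed out from each figure. This is a rough sketch, not Intel's reported numbers: the quoted percentages are rounded, so the results are approximate, and the function name is illustrative.

```python
def implied_prior(current: float, pct_change: float) -> float:
    """Revenue in the comparison period, given the current revenue and the
    percentage change from that period (e.g. -67 for 'down 67 percent')."""
    return current / (1 + pct_change / 100)


# Segment revenue in $M with (sequential %, year-over-year %) as quoted above.
SEGMENTS = {
    "PC Client": (8700, 9, 6),
    "Data Center": (3500, 14, 19),
    "Internet of Things": (539, 12, 24),
    "Mobile and Communications": (51, -67, -83),
    "Software and services": (548, -1, 3),
}

if __name__ == "__main__":
    for name, (revenue, seq_pct, yoy_pct) in SEGMENTS.items():
        prior_quarter = implied_prior(revenue, seq_pct)
        year_ago = implied_prior(revenue, yoy_pct)
        print(f"{name}: prior quarter ~${prior_quarter:,.0f}M, "
              f"year ago ~${year_ago:,.0f}M")
```

For the mobile group this implies roughly $155M in the prior quarter and roughly $300M a year earlier, which shows how steep the drop to $51M is.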

The PC business is being driven by a couple of things, such as the end of Microsoft’s support for Windows-XP driving a reinvestment cycle. Intel reckon that the installed base of PCs that are over 4 years old is 600 million and they are seeing clear signs of a refresh cycle in small and medium sized businesses.

In tablets, they shipped 10M devices last quarter, which puts them on track for the 40M unit goal for the year. But this is really just seeding the market since they are not making any money on them, and are recognizing contra revenues against them (although they expect to get costs down enough to be breakeven by the end of the year). But they have more in the pipe:

“We qualified the first Broadwell-based Core M processors and at Computex we highlighted the form factor innovation that 14-nanometer Core M product family will enable. Systems like Llama Mountain reference design, a fanless detachable two-in-one that is razor thin at 7.2 millimeters and weighs just 24 ounces.”


Mobile is still doing horribly, down to $51M (which is down over 80% from the same quarter last year). Since Intel is investing of the order of $1B per quarter in this segment that means they have lost over $2B so far this year. On the call they said that they are on-track for having their SoFIA integrated baseband and apps processor in Q4 of this year. This is Atom-based but built by TSMC (I assume that their existing LTE modem will be incorporated too). They talked about a new LTE product too, which I assume will also be built by TSMC: “We are working towards qualification of our 7260, our Category 6 LTE product with carrier aggregation early this quarter.”

In reply to a question, Brian Krzanich (Intel’s CEO) said that they intend to bring these products inside, that is, into Intel’s own fabs, late in 2015 or early 2016 since they see it as important to be able to leverage their process technology in every market. Talking of process technology, they confirmed that 10nm production would start in 2015 with volume in 2016.

Margins are expected to come down a little in Q4 since they will be ramping multiple 14nm fabs simultaneously. And another datapoint that is important for the stock price is that they announced they would buy back $4B in stock in Q3 with more in Q4 (they bought back $2B in Q2). The total buyback will eventually be $20B.


More articles by Paul McLellan…


Catching IC Manufacturing Defects With Slack-Based Transition Delay Testing

by Daniel Payne on 07-16-2014 at 3:00 pm

Test engineers are often the unsung heroes in the semiconductor world, because they have the tough job of deciding if each IC is good or bad, while taking the least amount of time on a tester and ensuring that the tests are actually finding all manufacturing and process variation defects. Simple stuck-at fault models are no longer sufficient to catch all of the actual defects, so to achieve the highest quality and lowest Defective Parts Per Million (DPPM), Transition Delay (TD) patterns are now being used. Extending TD testing with Slack-Based Transition Delay (SBTD) testing is a new approach for even higher defect coverage. A recent webinar presented by Synopsys and Avago focused on this test topic.

Continue reading “Catching IC Manufacturing Defects With Slack-Based Transition Delay Testing”


Andes Plays an ACE

by Paul McLellan on 07-16-2014 at 9:01 am

There is a perception that ARM is the only microprocessor game in town due to their strong position in many markets, especially mobile. In areas where the instruction set shows through, then this is probably true. There is no rush to build smartphones where the application processor is something else. But even in a phone there are perhaps ten more processors where the instruction set doesn’t show through since the user has no access to the code (bluetooth, audio decode, and so on). For these processors the decision matrix is different. Power, cost and configurability are the important dimensions. As the dominant IP supplier of microprocessors, ARM is not going to have a strategy to be the lowest price and commodify their market. It turns out that they are not the lowest power supplier either. And they are less interested in configurability than others since it doesn’t play to their strength which is that ARM is a standard.

Andes, which I like to describe as the biggest microprocessor company that you’ve never heard of (although you should have by now, I’ve been writing about them since the time I first ran across them at the Linley Mobile Conference about 18 months ago) is a Taiwanese company that historically has done most of their business in Asia. But now they are moving into the US and already have several licensees.

Up until now, AndesCores have had two of the three attributes that users require: they are not as pricey as ARM and they are lower power than equivalent cores. They also have a range of cores from simple low performance, very low power and small up to multi-stage pipeline, high performance and, while not as low power obviously as the slower cores, still very low when measured by MIPS/W.

The reason that customization of the instruction set is so important is that increasingly functionality that used to be implemented in hardware (so Verilog or SystemVerilog) is moving into software for time to market and flexibility reasons. But running software implementations of many DSP and video-processing functions on a general purpose microprocessor is too expensive in terms of power (and sometimes the performance is not enough). For example, MP3 decode on a general purpose microprocessor consumes much more power than doing it on a core with the right additional instructions. And trying to implement an LTE modem or a lot of video processing algorithms on a general purpose microprocessor will fall short on the performance available when running the processor flat out. It seems to now be received wisdom that most of these “offload” functions are best implemented in a processor core optimized for either the specific algorithm or at least for the domain (e.g. video). This gives you 90% of the flexibility of pure software and 90% of the hardware performance/power of a pure hardware implementation.


Now Andes have the EN801 which is the first extensible AndesCore. This is accomplished partially in the way the core itself is configured using ACE, the Andes Custom Extension framework and partially through the software environment used to do the configuration called COPILOT, which, pushing acronyms to some sort of asymptotic limit, stands for Custom OPtimization Instruction develOPment tools.


The EN801 is based on the highly efficient AndesCore N801, which has a 3-stage pipeline. The basic core remains important since it is used to implement the 80% of the code that is rarely executed with excellent power characteristics and reasonable performance. For the other code, the inner loops and so on, additional instructions can be added to implement them very efficiently. Added instructions can be single cycle or multi-cycle, interruptible or non-interruptible, with up to 3 reads and 2 writes from the registers. When different instructions have some overlap in functionality, logic sharing is possible.


One example is building a finite impulse response (FIR) filter. Using pure C code and no instruction extensions takes 175 cycles. Adding a FIR instruction reduces this to 10 cycles. The pure C code also consumes 28 times as much power. Of course there is a cost in terms of added hardware, nearly 7K gates. But you can trade off area, power and performance to hit what you consider the sweet spot; that is one of the attractions of configurability.
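
To make the FIR example concrete, here is a minimal sketch of a direct-form FIR filter. It is written in Python for readability rather than in the C the article refers to, and it is not Andes code. The inner multiply-accumulate loop is the kind of hot spot a custom instruction would replace. The cycle counts are the ones quoted above; everything else is illustrative.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        # Multiply-accumulate inner loop: the candidate for a custom
        # instruction on an extensible core.
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        output.append(acc)
    return output


# Cycle counts quoted in the article for the software-only and extended cases.
BASELINE_CYCLES = 175
EXTENDED_CYCLES = 10


def cycle_speedup(baseline: int, extended: int) -> float:
    """How many times fewer cycles the extended version needs."""
    return baseline / extended


if __name__ == "__main__":
    # Impulse response of a 2-tap filter reproduces the coefficients.
    print(fir_filter([1, 0, 0, 0], [0.5, 0.25]))
    print(cycle_speedup(BASELINE_CYCLES, EXTENDED_CYCLES))
```

Going from 175 cycles to 10 is a 17.5x reduction in cycle count, bought with the roughly 7K gates of extra hardware mentioned above.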

Andes website is here.


More articles by Paul McLellan…


Thread is why Nest has extra 802.15.4 goodies

by Don Dingee on 07-15-2014 at 4:00 pm

From last week: “Chipmakers can’t afford to wait on the sidelines, hoping their standard fare gets picked up and fits in with one of these [#IoT] teams.” This week, it’s ARM, Freescale, and Silicon Labs joining with Google and others on Thread. Yet another consortium? A lot more to this story. Continue reading “Thread is why Nest has extra 802.15.4 goodies”


Keywords: FD-SOI, Cost, FinFET

by Eric Esteve on 07-15-2014 at 4:21 am

How to synthesize a pretty good article, “Is SOI Really Less Expensive”, and, even more important, the impressive number of comments (56) it generated? Let’s start with the initial article. It is pretty good, but slightly biased when you dissect it carefully, as I did in one of the comments (you can find it in extenso at the end of this post). In short, if FD-SOI goes into high-volume production (remember that Samsung has just licensed the technology), raw SOI wafer pricing will benefit: the raw wafer should then account for 9 to 10% of the fully loaded wafer cost, not the 15% used in the above-mentioned article. The second point (which does not depend on production volume) is even more important: the article compares FD-SOI offering 3 Vt with Bulk also offering 3 Vt.

If you don’t really know the technology, you may think that that’s a fair comparison. Unfortunately, it’s not! An FD-SOI library requires only two Vt, whereas an equivalent library (optimizing the leakage current) on Bulk requires at least four Vt. FD-SOI processing therefore requires fewer mask levels, which again reduces the fully loaded wafer cost. That is, you pass from par (100% normalized price for each technology) to 100% for Bulk compared with 90% for FD-SOI at 28nm. The industry consensus is that 28nm will be around for a long time and is the preferred node for cost-sensitive products (like low-end wireless application processors), thus a 10% cost difference is really important. Now, the question is: should a marketing campaign be based on cost only? As a former ASIC PMM, I strongly think that using cost as the only argument is at first not enough and in the end dangerous. Highlighting the differentiators is much more efficient!

We have described in this post how to benefit from the Forward Body Bias (FBB) effect in FD-SOI to increase the performance, or decrease the power consumption at the same performance level (just take a look at the article to get the complete picture). The biasing capability is a real differentiator, and the reason is that you can’t use it with FinFET!

If you don’t trust me, just look at this article from Ed Sperling in SemiconductorEngineering, “IP and FinFet at Advanced Nodes”, quoting Bernard Murphy, CTO of Atrenta: “The standard MCU guys, when they want to dial down power, don’t want to mess with the architecture because that has ripple impacts on a lot of other areas. So they use biasing, which is a great way to reduce leakage. But that doesn’t work with finFETs, and if you still have a power problem with your MCU you have to change the architecture.”

This biasing differentiation leads to probably the most important feature coming with FD-SOI: lower power consumption. It can be ultra-low power if you design an SoC for mobile applications, or simply more power efficient (at the same performance level) if you design a high-performance networking IC. Lower power consumption will have a tremendous impact on chip packaging and cooling and, further along the chain, on the massive power consumption of a server farm.

So, why did the SC industry not jump into FD-SOI technology? We get the answer in this excellent comment from IanD:

At the point where it became clear that something had to replace planar bulk technology and the industry had to decide whether to go with bulk FinFET or FDSOI (or even SOI FinFET), several things happened to influence this. The first was Intel announcing FinFET (and how fantastic it was) at least a couple of years before anyone expected it to happen, which caused a bit of a panic reaction largely driven by customers screaming “We must have FinFET to stay competitive!”. Also at this point FDSOI (especially the UTBB substrate capability) was not really ready and seen as risky and expensive, and FinFET was seen as the safe option. There was also a concern about future scaling to 10nm and below on the assumption that EUV would be ready and that these would be mass-production low-cost processes, with FinFET seen as scaling better to 7nm and 5nm which would all follow on from 10nm at the usual rate. So the industry rushed towards FinFET, probably much faster than they would have liked to without Intel’s bombshell, which, as has been said many times, was a good choice for them making x86 CPUs, but not so obviously for the rest of the industry.

Since then things have changed somewhat. Some of the disadvantages of FinFET have started to emerge when SoC designers have started to actually use it — remember there isn’t such a thing as a free lunch, and the good points of FinFET in some applications (high drive, high speed, high density) are also bad points in others (high power density, higher gate capacitance, worse hotspot and EM issues) since the two are inextricably linked. Process variability which was not a killer issue for Intel CPUs is an issue for most other applications, and unlike for FDSOI can’t be trimmed out with FinFETs. …
Nobody’s suggesting that TSMC are stupid, they made the right decision based on what their customers were claiming for and given the state of FinFET and FDSOI at the time. Whether this was right given what we know now can be debated for years — hindsight is a wonderful thing, isn’t it? — but at least it now seems that FDSOI will be available and supported by multiple suppliers and IP providers for those people who are working in application spaces where it is better than FinFET, so at least the choice is there. For sure FinFET will be dominant initially as the juggernaut rolls on, but if FDSOI delivers on its low-power promises it should attract an increasing share of the market for applications where this is the #1 priority.

Another comment (see below) is not very positive about people promoting FD-SOI, but I think it is very smart in pointing the reason why FDSOI was not successful in the past: uncertainty.

From: Kencweng
Stop beating around the bush. The major challenge in FDSOI is not the substrate cost, the design ecosystem or even the stupidness of certain persons. According to the substrate providers, the cost of SOI will soon be reduced. Based on my personal experience, porting a physical IP to FDSOI is not as difficult as people think. Designing a physical IP for FinFET is harder. The design ecosystems are not that much different. Granted, we are all smart engineers.

IMHO, it is the uncertainty or the unknown fact(s) associated with any “promising” or “innovative” technology that makes the adoption so difficult. Unfortunately, the groups of persons who have been promoting SOI either ignore this or have their own agenda. What do we expect us to do after observing the struggle of persons who have chosen SOI? The persons who stay with bulk seem to be more successful.

All I can say is good luck for all persons who are pushing for SOI. May they succeed this time!

Uncertainty is not a scientific, quantifiable feature; it’s human, but it’s real! This is why the latest news about FD-SOI, “Samsung Endorse FDSOI”, is also the most important in years. Put yourself in the decision maker’s shoes. You were previously cautious about FD-SOI because of uncertainty. Don’t you think that the same person will change his mind after seeing the 2nd largest and best-performing SC player adopting this technology? Just allow a couple of quarters, the time to start consolidating an IP ecosystem and for Samsung to familiarize itself with the technology, and check on FD-SOI adoption, let’s say during 2015…

From Eric Esteve from IPNEST

The pricing comparison seems to be the result of deep research; nevertheless, we can make two comments about the Tables, which apply to 28nm and 14nm alike:

  • Starting Wafer Cost is over-evaluated for SOI wafers in full production
  • Mask Layers are under-estimated for Bulk in 28nm or FinFET in 14nm if you take into account the various Vt you need to implement (4 to 5 Vt) to offer the same latitude as FD-SOI (with 2 Vt only) with respect to the leakage current.


Wafer Cost
Since 2011, the agreement is $500 (in 28nm) for an SOI wafer and $130 for a bulk wafer. The ratio is 3.8, whereas the Table (28nm) indicates 3% vs 15%, or a ratio of 5. Moreover, the bulk wafer pricing is at its lowest, as 28nm Bulk is in full production. It is not unreasonable to foresee a price decline for SOI wafers when FD-SOI is in full production at both STM and Samsung! Instead of 3% (bulk) and 15% (FDSOI), we may use 3% and 9-10% when in full production, or 12% as of today… Let’s say that the cost impact for FDSOI in full production should be (-5%)!
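
The wafer-cost arithmetic above can be reproduced in a few lines. This is a simplified sketch: it assumes the substrate shares for bulk and SOI are measured against the same normalized loaded-wafer cost, which is an approximation. Variable and function names are illustrative.

```python
# Raw wafer prices quoted in the comment, in USD.
SOI_WAFER_USD = 500
BULK_WAFER_USD = 130

# Substrate share of the fully loaded wafer cost used in the article's table (%).
TABLE_BULK_SHARE = 3.0
TABLE_SOI_SHARE = 15.0


def price_ratio(soi_usd: float, bulk_usd: float) -> float:
    """How many times more expensive the raw SOI wafer is."""
    return soi_usd / bulk_usd


def implied_soi_share(bulk_share_pct: float, ratio: float) -> float:
    """SOI substrate share implied by the bulk share and the price ratio,
    ASSUMING both shares refer to the same normalized wafer cost."""
    return bulk_share_pct * ratio


if __name__ == "__main__":
    ratio = price_ratio(SOI_WAFER_USD, BULK_WAFER_USD)
    print(f"agreed price ratio: {ratio:.2f}")
    print(f"ratio implied by the table: {TABLE_SOI_SHARE / TABLE_BULK_SHARE:.1f}")
    print(f"implied SOI share today: {implied_soi_share(TABLE_BULK_SHARE, ratio):.1f}%")
    # Full-production estimate of 9-10% versus the table's 15%.
    print(f"difference at 10%: {10.0 - TABLE_SOI_SHARE:+.0f} points")
```

With the agreed prices, the implied SOI share is about 11.5%, in line with the "12% as of today" figure, and the 9-10% full-production estimate sits 5 to 6 points below the table's 15%.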

Multiple Vt in 28nm
Inserting the same number of Vt (3) in the table for Bulk and FD-SOI is not accurate: the FD-SOI offer consists of only 2 Vts, and wide leakage control/optimization is reached through the use of multichannel libraries, which accommodate channel lengths from 24nm to 40nm in the std-cell pitch. For the same leakage control with bulk technology, you would have to use 4 Vts. The same applies to bitcells: to get the same range as the FD-SOI offer, bulk technologies have to differentiate several specific implants, adding masks. Why use a multiple-Vt approach? FD-SOI technology allows the std-cell pitch to accommodate multi-channel libraries with Lpoly ranging from Lmin=24nm to 40nm. This gives great control over leakage when optimizing an implementation for power. The 2 Vts offered by FD-SOI provide a leakage control range of 200x (RVT at L=40nm has 1/200 of the leakage of LVT at Lmin=24nm, the leakiest device). To get the same leakage control range in a bulk technology (which allows a much smaller multichannel range), you need to use 4 Vts in an SoC.
An FD-SOI library requires only two Vt, whereas an equivalent library (optimizing the leakage current) on Bulk requires at least four Vt. The bottom line is that the table counts more masks for FDSOI than needed (to be removed) and fewer masks for Bulk than needed (to be added), the difference being 4 to 5 mask levels, or another (-5%)!
That is, you pass from par (100% normalized price for each technology) to 100% for Bulk compared with 90% for FD-SOI at 28nm. The industry consensus is that 28nm will be around for a long time and is the preferred node for cost-sensitive products (like low-end wireless application processors), thus a 10% cost difference is really important.
The same argument applies to 14nm technologies. Even if the decision to select FinFET is not purely based on cost, but on better performance or leakage behavior, a 10% cost difference may have a certain impact when making the decision. Moreover, if we consider on-the-edge devices processed in 14nm, the unit price is expected to be high. If you remember, we have mentioned the Forward Body Bias (FBB) capability available with FD-SOI. FBB allows either decreasing the power or increasing the performance (frequency) of the same chip. But in a wafer fab, the same device is not always processed the same way: it can come out Slow, Typical or Fast, and you only know which after test. When dealing with high variability, you may end up losing some yield because of it. FD-SOI has FBB as a plus for process compensation, to eventually recover slow parts.

If your application requires chips in the high range of performance, that means that you have to trash a part of the production… except if you use FBB to compensate. Thus, you can keep the chip price in the acceptable range, as you can increase the number of good die (with respect to performance) per wafer… This cost impact is difficult to quantify, as it will depend on the unit price and the level of performance, but we know from Intel marketing for processors that it can be high.
To summarize, even if a few percent cost difference may look negligible, the aggregation of these “small” differences leads to a 10% difference in the processed wafer cost. The 28nm technology node is expected to stay mainstream for a very long time, due to the interruption of Moore’s law, especially for high-volume, cost-sensitive devices like low-cost application processors. As far as I remember from my ASIC PMM days, a 10% difference is anything but negligible, thus I suggest using the right wafer price from the beginning. The devices implemented in 14nm are probably not cost sensitive, but rather chase high performance and/or ultra-low power. But is that a reason for pricing FD-SOI 10% higher than it should be?


Cadence Announces Quantus Next Generation Extraction

by Paul McLellan on 07-14-2014 at 7:00 pm

Today Cadence announced their next generation extraction solution called Quantus QRC. Actually they are technically announcing it tomorrow, since it is being announced at CDNLive in Korea where it is already Tuesday morning.

As with the other recently announced tools that end in -us, Tempus (timing signoff) and Voltus (power integrity), there is a lot of emphasis on being scalable to large numbers of CPUs giving up to 5X faster performance for single and multi-corner extraction runs. A lot of emphasis has been put on getting best-in-class accuracy especially for FinFET processes. It is backwards compatible with the prior version of QRC in that it uses the same technology files and the results correlate. There is also a random walk field solver that allows correlation to be checked (obviously only on a tiny part of a design, field-solvers are not full-chip tools).

Cadence have also worked closely with TSMC to ensure that the results correlate with silicon. Quantus QRC is fully certified at TSMC for 16nm FinFET and technology files are already available. It is also the first extraction solution certified to support 3D-IC (TSV-based designs).

Quantus is integrated into the Encounter platform (so there is the same engine under the hood during implementation as there is during signoff, which reduces spurious ECO loops).


It is also integrated into the Virtuoso platform. The integration is tight, in that there is no stream in and out between the two tools and so the extracted view flow allows for faster circuit performance debugging. Quantus executes right from the Virtuoso UI. It handles all the things you would expect such as RF, substrate noise analysis, inductance extraction, powerMOS extraction, RC and RLCK reduction and more.

It turns out I know a lot more than you might expect about circuit extraction since in about 1983 I wrote VLSI Technology’s circuit extractor. It didn’t do anything especially clever, it ran a scan-line across the whole (flat) chip that might have as many as…wait for it…10,000 gates, identified the transistors and the interconnectivity. I didn’t even have to worry about lateral capacitance or even resistance. But we only had 10 MIPS of CPU power and a couple of megabytes of memory, so it was non-trivial to get good performance. Even though computers are several hundred thousand times faster (and you can have hundreds of them) I can appreciate that keeping both the accuracy and the performance up given how many things interact with everything else (especially in FinFET) is an achievement.

Customer experience at AppliedMicro and Open-Silicon validates that it scales to 100s of CPUs and does deliver both accuracy and much faster run times.


For example, on a 20nm design of 39M gates, running on 32 CPUs it runs in 6 hours versus 15 for plain old QRC for a speedup of 2.5X. If the CPUs are increased to 64 then the speedup is 4.3X, so close to the 5X. Other designs (see the table) get even bigger speedups.
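
The speedup figures in that example follow from simple ratios, sketched below. The runtimes and speedups are the ones quoted above; the 64-CPU runtime and the scaling efficiency are derived values rather than numbers from Cadence, and the function names are illustrative.

```python
def speedup(baseline_hours: float, new_hours: float) -> float:
    """Ratio of the old runtime to the new runtime."""
    return baseline_hours / new_hours


def runtime_from_speedup(baseline_hours: float, factor: float) -> float:
    """Runtime implied by a quoted speedup over the baseline."""
    return baseline_hours / factor


def scaling_efficiency(speedup_small: float, speedup_large: float,
                       cpus_small: int, cpus_large: int) -> float:
    """Fraction of ideal scaling achieved when going from cpus_small
    to cpus_large (1.0 would be perfectly linear)."""
    return (speedup_large / speedup_small) / (cpus_large / cpus_small)


if __name__ == "__main__":
    QRC_HOURS = 15.0         # plain QRC runtime quoted above
    QUANTUS_32_HOURS = 6.0   # Quantus on 32 CPUs
    QUANTUS_64_SPEEDUP = 4.3 # quoted speedup on 64 CPUs

    s32 = speedup(QRC_HOURS, QUANTUS_32_HOURS)
    print(f"32 CPUs: {s32:.1f}X")
    print(f"64 CPUs: ~{runtime_from_speedup(QRC_HOURS, QUANTUS_64_SPEEDUP):.2f} hours")
    print(f"scaling efficiency 32 -> 64: "
          f"{scaling_efficiency(s32, QUANTUS_64_SPEEDUP, 32, 64):.0%}")
```

On these numbers, doubling the CPU count from 32 to 64 cuts the run to roughly three and a half hours, about 86% of ideal linear scaling.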

So, in summary:

  • 5 times faster, scalable to 100s of cores
  • best-in-class accuracy, silicon proven
  • in-design convergence to accelerate design closure
  • fully qualified in TSMC 16FF


More articles by Paul McLellan…