Save the Dates

by Paul McLellan on 08-13-2013 at 3:22 pm

There are several events in Silicon Valley coming up of general interest to people working in EDA and the semiconductor industry.

SEMI 16th Annual Valley Lunch Forum. August 22nd, 11.30am to 1.30pm, Santa Clara Marriott

  • What are the Opportunities for Advanced Semiconductor Devices?
  • Where will the year end for 2013?
  • Will we have a double-digit increase in 2014?

Speakers: Dan Freeman (Gartner), Mike Corbett (Lynx Consulting), Brian Matas (IC Insights)
Details and registration here.

GSA Executive Forum. September 25th, 12pm to 6pm, Rosewood Sand Hill Hotel

  • Keynote: David Small, Verizon
  • Keynote: Condoleezza Rice
  • Panel session moderated by Aart de Geus

Details and registration here.

EDAC Back to the Future. October 16th, 5.30pm to 9pm, Computer History Museum
Join your colleagues on Wednesday, October 16, 2013 for an evening of networking to celebrate the EDA industry!
Details (not many yet) here.


Wanna Buy A Blackberry?

by Paul McLellan on 08-13-2013 at 2:26 pm

So Blackberry (formerly known as Research In Motion or RIM) is up for sale. Basically, apart from some cash in the bank, its main value now seems to be patents and, perhaps, some security technology. The murderers are in Cupertino and Mountain View: Apple’s iPhone (and iPad) and Google’s Android along with its licensees, most especially Samsung.

When iPhone was first released, Blackberry’s CEO said “In terms of a sea-change for Blackberry, I think that is overstating it.” Of course he wasn’t the only person to underestimate the impact that iPhone would have on the entire mobile market. Both Kallasvuo, then CEO of Nokia, and Ballmer, CEO of Microsoft, made disparaging remarks about iPhone just being a handset and unappealing without a physical keyboard (like Blackberry had). Within a very short time Apple was making over half the profits of the entire mobile industry (with just 4% market share) and Samsung was making most of the rest.

Blackberry’s stock peaked at $236 soon after the iPhone launch, has gradually declined, and is now under $10. They did several things wrong, not least assuming that their strong position in the enterprise market would not be affected by whatever happened in the consumer market. In fact the era of BYOD, bring your own device, started, and people were not prepared to carry an iPhone for personal use and a Blackberry for business use. After all, email is email, and when Blackberry was just about the only mobile device to support it (Nokia had some devices too, but mainly in Europe) it was a killer App. When every Android and iPhone had email, not to mention internet access, maps, videos and more, email was no longer enough to get every venture capitalist and company executive to use a Blackberry. The addictiveness went out of the Crackberry.

They made an ill-advised move into tablets with the PlayBook: astoundingly, the only tablet that didn’t support email, Blackberry’s killer App. The PlayBook needed to be paired with a Blackberry phone for that. Soon they were out of tablets.

Finally they introduced their first touchscreen devices and a completely reworked version 10 of their operating system (built on top of QNX, which they had by then acquired). But it was too little, too late: consumers had already got smartphones and, by and large, would upgrade to newer versions of the phones that they were already used to.

So now, Blackberry is like a twenty year old Triumph TR-7, only useful for spare parts. And worth next to nothing. So who might buy them? I’ve no real idea. Other analysts have the usual suspects: Microsoft, Samsung, Amazon, Cisco, HP and IBM. All make some sense, but Blackberry is a tarnished brand based on aging technology.


How Resistant to Neutrons Are Your Storage Elements?

by Paul McLellan on 08-13-2013 at 1:01 pm

There are two ways to see how resistant your designs are to single-event errors (SEE). One is to take the chip or even the entire system and put it in a neutron beam and measure how many problems occur in this extreme environment. While that may be a necessary part of qualification in some very high reliability situations, it is also too late in the design cycle in most circumstances. What is needed is software to estimate the reliability during design when there is still time to do something about it.

iRoc has two tools for doing this. TFIT is used to evaluate individual cells such as flip-flops and memory elements and assess their failure rate, known as FIT (failures in time, the number of failures expected per billion device-hours). The second program, SoCFIT, is used at the chip level once TFIT has been run on all the cells. It works out the FIT for the entire design based on how the various cells have been connected. A neutron that misses a flop may still cause a problem if it hits a cell connected to the flop and the resulting current spike causes the storage element to change its value.
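The chip-level roll-up can be pictured, to first order, as summing per-cell FIT rates weighted by instance counts, with a derating factor for upsets that never propagate to a visible error. The sketch below is purely illustrative: the cell names, FIT values and derating factor are invented, and this is not iRoc's actual tool or data model.

```python
# Toy illustration of a chip-level FIT roll-up (all numbers and names
# are hypothetical, not iRoc's actual tools or data).
# FIT = failures per 1e9 device-hours.

# Per-cell FIT rates, as a cell-level tool like TFIT might report them.
cell_fit = {"dff_x1": 120.0, "dff_x2": 95.0, "sram_bit": 0.8}

# Instance counts for the whole design.
instance_count = {"dff_x1": 50_000, "dff_x2": 20_000, "sram_bit": 64_000_000}

# Chip FIT is, to first order, the sum over all instances; a derating
# factor models the fraction of raw upsets that are masked and never
# become an observable error.
derating = 0.4  # assume 60% of raw upsets never propagate

chip_fit = derating * sum(cell_fit[c] * n for c, n in instance_count.items())
mtbf_hours = 1e9 / chip_fit

print(f"chip-level FIT: {chip_fit:.3g}")
print(f"mean time between failures: {mtbf_hours:.3g} hours")
```

Even with tiny per-bit FIT rates, the 64M SRAM bits dominate the total in this made-up example, which is why memory protection (ECC, interleaving) is usually where mitigation effort goes first.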


iRoc have just released TFIT 3.0, the latest version of the cell-level analysis tool with some major changes:

  • the necessary design layout parameters are automatically extracted from the cell layout
  • the device temperature can now be specified and is taken into account during soft-error simulation
  • it can analyze a named cell in the middle of a design without the cell having to be moved to a separate file
  • output can now be exported as an XML file which can later be read by SoCFIT or by other analysis programs


Most SoC designers do not create their own cell libraries and so they are unlikely to use TFIT directly themselves. TFIT is intended to be used by library and memory designers so that they can create libraries with acceptable reliability. Note that it is not possible to design a library that is completely immune to SEE; what is important is to create a library with fairly uniform FIT scores. Like building a chain, you want all the links to have roughly the same strength. A cell with an especially good FIT is like an extra-strong link in the chain: probably a waste of resources, since it is the weakest links that determine the strength of the chain. In the same way, it is the cells with the poorest FIT scores that it makes sense to improve in order to improve the reliability of the library.
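The chain analogy is easy to make concrete: rank the cells by FIT and see how much of the total exposure the worst one contributes. The cell names and FIT numbers below are made up purely for illustration.

```python
# Sketch of the "weakest link" argument: one outlier cell can dominate
# a library's summed soft-error exposure. FIT numbers are invented.
library_fit = {
    "dff_a": 110.0, "dff_b": 105.0, "dff_c": 98.0,
    "latch_a": 102.0,
    "dff_weak": 400.0,   # the one weak link
}

total = sum(library_fit.values())
worst = max(library_fit, key=library_fit.get)
share = library_fit[worst] / total

print(f"{worst} alone contributes {share:.0%} of the library's summed FIT")
# Hardening the single outlier buys far more reliability than further
# improving an already-good cell, which is the chain-link point above.
```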


Electronic System Level: Gary Smith

by Paul McLellan on 08-12-2013 at 5:07 pm

Gary Smith has been talking about how the electronic system level (ESL) is where the future of EDA lies as design teams move up to higher levels encompassing IP blocks, high level synthesis, software development using virtual platforms and so on. At DAC this year in Austin he talked about how the fact that EDA controls the modeling process for semiconductors is the secret sauce that should allow EDA to start to move up into the embedded software space and start to improve their productivity in the same way as semiconductor design has improved over the last few decades.

I don’t think that the transition to ESL took place the way we expected, nor did it take place as early as we expected. I worked for two virtual platform software development companies in the last decade, and Gary himself was famous for calling the move up to ESL imminent several times before it really happened.

I think most of us expected that high-level synthesis (HLS) from C/C++/SystemC would take over from RTL synthesis, and originally that was what people envisaged when they talked about ESL. Although HLS is indeed growing, and it has certain niches such as video processing where it is very strong, it turned out that for most SoC design, IP-based design was the way that we moved up a level. Many chips today contain very little “original” RTL, consisting largely of lots of IP blocks connected up using a network-on-chip (NoC), itself a form of IP.

Up another level from the IP blocks is the software component of a modern system. For many designs, this is a huge proportion of the entire design effort, often dwarfing the semiconductor design. Software is also longer lived. Any particular system, such as a smartphone, will go through many iterations of the underlying hardware, in this case what we call the application processor, while much of the software will be inherited from iteration to iteration. iOS, Android and their Apps have obviously continued to develop, but large amounts of code from several generations back are still shipped with each phone.

In fact there is a view that the only purpose of the application processor SoC is to run the software efficiently, fast enough and without consuming too much power. In this view, the specification of the design is almost all software to be run on a microprocessor such as an ARM. Only when that is either too slow or, more likely, consumes too much power, is specialized hardware used, either by creating a custom block or by using a customizable specialized processor such as CEVA, Tensilica or ARC that can offload the main microprocessor and implement special functions such as wireless modem processing, video encoding/decoding and so on, at a much superior PPA point.

On Monday August 19th from 11am to 11.45am Gary will be presenting a webinar entitled ESL—are you ready? along with Jason Andrews and Frank Schirrmeister from Cadence and Mike Gianfagna from Atrenta.

The ESL flow has been evolving and Gary believes that there have been significant breakthroughs that now mean that the ESL flow is real. Gary will review these breakthroughs and go into details of what today’s ESL tools look like and what they are capable of. “Vendors will be named and ESL heroes will be recognized.”

Registration for the webinar is here.


The Most Disturbing Economic Graphs Ever!

by Daniel Nenni on 08-11-2013 at 7:00 pm

After driving to Silicon Valley for the past 30 years I am acutely aware of traffic patterns, and to me they directly relate to the economy. The recession of 2009 really hit traffic patterns, with what I would estimate as a 20% unemployment rate in Silicon Valley. I could leave my home in Danville at any time of day and have no traffic problems. That is certainly no longer the case, and I blame the mobile electronics boom, absolutely.

One of the websites I frequently visit, second only to SemiWiki, is Business Insider. Henry Blodget has a staff of researchers and puts out the most interesting content on the internet today, with lots of interesting graphs too.

In the back of my mind I wonder: “Where does all the money my family is spending go?” Mobile electronics and the monthly service plans, home electronics and the monthly service plans. Everywhere I look money is being spent, profits are being made, so why are so many American families still struggling financially?

According to Henry Blodget, Corporate Greed is the culprit, and given these graphs I agree 100%:

Corporate profits and margins are spiking:

Wages as a percentage of the economy are at an all-time low:


Employment rates fell off a cliff in 2009 and still have not recovered:

The majority of the national income is going to the executive ranks:

Graphs were created by Henry Blodget’s minions using FRED, the Federal Reserve Economic Data tool.

As I mentioned before, I think the stock market is a racket where insiders profit at the expense of the masses. Publicly traded companies are at the mercy of Wall Street so by my definition they are part of the racket. One of the reasons why I favor GlobalFoundries is that they are privately held and can make decisions based on the greater good of the fabless semiconductor ecosystem versus the short term gains Wall Street favors. Just my opinion of course.

Wall Street the Movie, Gordon Gekko:

The richest one percent of this country owns half our country’s wealth, five trillion dollars. One third of that comes from hard work, two thirds comes from inheritance, interest on interest accumulating to widows and idiot sons and what I do, stock and real estate speculation. It’s bullshit. You got ninety percent of the American public out there with little or no net worth. I create nothing. I own. We make the rules, pal. The news, war, peace, famine, upheaval, the price per paper clip. We pick that rabbit out of the hat while everybody sits out there wondering how the hell we did it. Now you’re not naive enough to think we’re living in a democracy, are you buddy? It’s the free market. And you’re a part of it. You’ve got that killer instinct. Stick around pal, I’ve still got a lot to teach you.



RTL Design For Power

by Daniel Payne on 08-11-2013 at 2:25 pm

My Samsung Galaxy Note II lasts about two days on a single battery charge, which is quite an improvement over the Galaxy Note I with only one day per charge. Mobile SoCs are being constrained by battery life limitations, and consumers love longer-lasting devices.

There are at least two approaches to Design For Power:

  • Gate-level techniques
  • RTL-level techniques


Continue reading “RTL Design For Power”


Robust Design <- Robust Flow <- Robust Tools

by Pawan Fangaria on 08-10-2013 at 6:00 pm

I could have written the sequence of the title in reverse order, but no: design is what initiates the need for a particular flow, and the flow needs the support of EDA tools to satisfy that need. It’s okay if the design is small; some manual procedures and workarounds/scripts may be able to perform certain jobs. However, as the design becomes complex and its size increases, it needs a systematic, established, fast, accurate and automated set of steps which can complete the chip in a reasonable time and provide high yield.

This week, I had another interesting opportunity, listening to a DAC 2013 presentation (in the form of a webinar) by GLOBALFOUNDRIES in association with ANSYS-Apache. It’s a typical collaboration in the semiconductor industry, where a chip designer as customer and an EDA tool provider as supplier work closely as a team throughout the design cycle to produce something whose end consumer is several links down the chain.

[Simplification of blocks by abstraction – schematic diagram]

Dr. Hendrik Mau of GLOBALFOUNDRIES explained in simple terms the complexity of their power-gated, multi-domain design at the 20nm node and how they were able to abstract it into simpler blocks to determine the overall IR drop within acceptable limits of accuracy and in reasonable time. It’s a 64Mbit SRAM with 128 blocks, 6528 power domains and more than 2.3M pratio entries per block. Determining 6528 internal power nets and analyzing IR drop at transistor level for VMIN (the minimum voltage at which an array of bits can successfully be written and read at a specified yield target) characterization of the design is a huge task. Run flat on the full design, a tool would consume more than 512GB of main memory and take several days to complete. Hence the techniques to simplify blocks by abstraction and to use a hierarchical approach, with automated tools assisting at each step. As we see in the picture above, a block can be simplified into a coarse block that reduces the number of power domains and restricts analysis to the higher levels of metal.
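To see what the tools are actually computing, here is a deliberately tiny illustration of IR drop on a one-dimensional power rail fed from one end. The real analysis works on an enormous resistive mesh with switch models and multiple metal layers; every number below is invented just to show where the voltage drop comes from.

```python
# Toy IR drop calculation on a 1-D power rail: N taps, each drawing
# equal current, fed from a supply at one end. This only illustrates
# the meaning of "IR drop"; the real hierarchical flow solves a huge
# 2-D/3-D resistive network.
N = 10            # taps along the rail
i_tap = 0.005     # 5 mA drawn at each tap
r_seg = 0.02      # 20 milliohm per rail segment

# The current in segment k (counting from the supply) is the sum of
# all tap currents downstream of it; the drop at tap n is the running
# sum of I*R along the path from the supply.
drop = 0.0
drops = []
for k in range(1, N + 1):
    downstream = N - k + 1          # taps fed through segment k
    drop += downstream * i_tap * r_seg
    drops.append(drop)

print(f"worst-case IR drop at the far end: {drops[-1] * 1000:.2f} mV")
```

The far end of the rail always sees the worst drop, which is why the fine-grained analysis in the GLOBALFOUNDRIES flow concentrates on the blocks at the center of the array, furthest from the wire-bond supply.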

[Flow established at GLOBALFOUNDRIES]

In the above flow, Calibre from Mentor has been used to extract the hierarchical netlist, which specifies the actual locations and orientations of the cells. Apache tools have been used in the successive steps. APLMMX identifies all internal power nets connected to transistors, reads the extracted netlist and generates GDSII files. APLSW performs switch cell characterization and generates a model of the switch with its actual resistance. Totem then reads in the GDSII file and generates LEF/DEF for the blocks and the top level. Finally, Totem reads the LEF/DEF and the switch cell model to compute the IR drop.

[Hybrid Approach – GDSII view and view in Totem; IR drop results of 64Mbit SRAM]

GLOBALFOUNDRIES used a hybrid approach with 4 fine blocks in the middle surrounded by 124 coarse blocks. With all blocks consuming the same power and the whole design connected through wire bond, the IR drop in metal M6 is elliptical, whereas it increases in M1 and M2 in the fine blocks at the center.

[Comparison of run time and memory requirements in flat and hybrid approach]

The hierarchical hybrid analysis of a smaller, 8Mbit design (which could be run in flat mode) with 4 fine blocks and 12 coarse blocks shows that, compared to the flat run, it consumes about 7.5x less peak memory and takes about 4x less run time, while the maximum IR drop remains close to that of the flat run.

It’s a classic example of how automatic switch tracing can simplify the handling of large designs, and how a hierarchical hybrid approach can reduce the memory requirements and execution time of IR drop analysis. GLOBALFOUNDRIES has successfully used this flow on 28nm and 20nm designs and is now using it on 14nm designs. Details about the design, flow, and tools can be found in the presentation titled “Hierarchical Voltage Drop Analysis Techniques for Complex Power-Gated Multi-Domain 20nm Designs” here.


Intel Is Continuing to Scale While Others Pause

by Paul McLellan on 08-09-2013 at 11:52 am

Back in May, William Holt, EVP of technology and manufacturing at Intel, gave a presentation to analysts entitled Advancing Moore’s Law, Imperatives and Opportunity. A pdf of the presentation is available here. I just saw it for the first time today and I’m not sure how to get my head around it. It starts off with a lot of historical stuff about how Intel has delivered process generations every couple of years (or maybe that the industry has; it’s not quite clear).

But the really interesting stuff is in the middle of the presentation. I have blogged before about how one of the challenges the semiconductor industry faces going forward is that the cost per transistor is not coming down. Although there are more die per wafer at 20nm, 14/16nm etc, the cost of manufacturing that wafer is rising fast due to the increasing complexity of the process and, especially, due to the need for double patterning. The rule of thumb for a process generation in the past has been twice as many die per wafer (so a 50% reduction in area per transistor) but an increase in wafer cost of about 15%, leaving roughly a 35% cost reduction per transistor.
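That rule-of-thumb arithmetic is worth doing explicitly, because subtracting the percentages and multiplying them give slightly different answers. A quick sketch:

```python
# The cost-per-transistor rule of thumb from the paragraph above,
# computed explicitly. Both numbers are the rough historical figures
# quoted in the article, not data from any specific foundry.
area_scale = 0.5         # transistor area halves -> 2x die per wafer
wafer_cost_scale = 1.15  # wafer cost rises ~15% per generation

cost_per_transistor_scale = wafer_cost_scale * area_scale
reduction = 1 - cost_per_transistor_scale

print(f"cost per transistor falls by {reduction:.1%} per generation")
# Multiplying the factors gives ~42.5%; the commonly quoted ~35%
# figure comes from simply subtracting 15% from 50%. Either way, the
# point stands: historically each node made transistors much cheaper,
# and that is exactly what double patterning now threatens.
```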


But going forward, the public information available up to now has either shown no reduction in cost per transistor or even a small increase. For example, the above montage, from an ASML presentation at Semicon in July, shows data from GlobalFoundries, Broadcom and nVidia. And at the Common Platform forum earlier this year, Gary Becker of IBM said in the press Q&A that costs per transistor will come down but “the reduction will be less than we have been used to.”

Both TSMC’s and GlobalFoundries’ 16/14nm processes basically have 20nm metal on top of a FinFET process, so there will be lots of speed/power improvements due to the improved transistors, but the effect on area scaling will be small.


As the above graph from the Intel presentation shows, there is a pause in area reduction for Intel’s competitors (on the left), whereas Intel sees none, since they have already done the heavy lifting to get FinFETs (which they call TriGate, but I’m going to stick to the more generic term) into production at 22nm. But my understanding of Intel’s 22nm process was that it too was not aggressive on metal pitch, to avoid double patterning, so I’m surprised they don’t show any flattening at all between 32nm and 22nm. Further, I suspect that the flatness of the competitor graph is exaggerated: even with the same metal pitch, faster transistors allow smaller standard cells to be used some of the time, so I would expect to see some reduction in area.

As I said above, area reduction does not automatically result in a cost per transistor reduction, since the cost per wafer may go up faster than the density comes down. This is especially true at 14/16nm, when the metal does not shrink. Double patterning adds to the cost of each layer that uses it: twice through the stepper, and all the associated litho steps. For self-aligned double patterning, there are many more process steps to build the mandrel and remove it. But Intel sees none of this.


The cost per transistor is completely linear from 65nm down to 10nm, despite the fact that at 65nm there is no double patterning and at 10nm there will need to be lots. And it is not an artifact of EUV: Intel have already said publicly that EUV is too late for 10nm.

I don’t understand how the above graph can be accurate. The cost per transistor is coming down completely linearly (actually, at 14nm they are predicting an even bigger reduction since the triangle is just below the line). As a presentation to financial analysts, this comes with all the caveats about forward looking statements, and clearly there may be unknown unknowns about 10nm. But no company is going to present data that is known to be false at the time it is presented, so I have to assume that this is an accurate (if simplified) view of Intel’s best estimate of their current and future costs.

I would like to know what TSMC, GF and Samsung think of these graphs. If they are true, Intel’s 14nm process has slightly better area than everyone else’s 10nm process (the top graph) and obviously hugely lower costs per transistor. I’m not sure I can believe it though.

Once again, Intel’s presentation is available here.


How To Connect Your Cell-phone Camera

by Paul McLellan on 08-08-2013 at 5:31 pm

Your cell-phone contains a camera. In fact, it probably contains two: one front-facing for video calls and one rear-facing for taking photographs and videos. The rear-facing one typically has a much higher pixel count than the front-facing one. The capabilities of cell-phone cameras are getting “good enough” that the point-and-shoot digital camera market is already in decline. In fact Nokia have recently announced a phone with a 41 megapixel camera, which is about 3 times the count on my point-and-shoot Canon. Lens size counts for something, though. A lot, actually. My old 2001-era 2-megapixel Canon PowerShot G3 takes far better pictures than my cell-phone despite having a much lower pixel count, thanks to its serious-sized lens.


Since the companies that make the cameras and the companies that make the application processors (AP) are usually different, there is a need for standardization of the camera/AP interface, and the MIPI Alliance has been on top of this for several years. The main connection is a fast serial interface known as CSI (nothing to do with crime scenes; it stands for Camera Serial Interface). This consists of a D-PHY (well, two, one at each end) with a CSI-2 transmitter at the camera and a receiver on the AP. The D-PHY provides the physical interface, and the transmitter and receiver cover encoding, packing, error handling, lane distribution, assembly of the image data stream and so on.

However, increasing sensor size (pixel count) and frame rate are driving the need for even higher bandwidth, hence CSI-3. Conceptually CSI-3 and CSI-2 are similar, both providing a high-speed link between camera and AP. But under the hood they are very different. CSI-3 has a new M-PHY (the successor to D-PHY) and there can be more than one on each side. Each M-PHY has a bandwidth of up to 6Gb/s per lane, with up to 4 lanes.
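As a back-of-envelope sanity check on those link figures, the bandwidth arithmetic for a camera is straightforward. The sensor parameters below (resolution, frame rate, bit depth) are illustrative choices, not values from the MIPI specification:

```python
# Back-of-envelope camera bandwidth check against the CSI-3 link
# figures quoted above (up to 6 Gb/s per lane, up to 4 lanes).
# The sensor parameters are hypothetical, for illustration only.
mpix = 13            # 13-megapixel sensor
fps = 30             # frames per second
bits_per_pixel = 10  # e.g. 10-bit RAW sensor output

payload_gbps = mpix * 1e6 * fps * bits_per_pixel / 1e9
max_link_gbps = 6.0 * 4   # four lanes at 6 Gb/s each

print(f"pixel payload: {payload_gbps:.1f} Gb/s")
print(f"raw link capacity: {max_link_gbps:.0f} Gb/s")
# ~3.9 Gb/s of pixels fits within a single 6 Gb/s lane, before
# protocol overhead; it is higher resolutions, frame rates and bit
# depths that push designs to multiple lanes.
```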

The next level up is the Unified Protocol layer (UniPro). This defines a unified protocol for connecting devices and components (not necessarily cameras). It is designed to have high speed, low power, low pin count, small silicon area, high reliability and so on.

Above that is the Camera Abstraction Layer (CAL) which does get specific to camera, images and video. In addition to defining how the images are transported, there is also a camera command set (CCS) extension which provides standardized mechanisms for controlling the camera sub-system.

Of course you can take the CSI-3 and M-PHY standards and implement them yourself (I would link to them but you have to be a MIPI member to access them), just as with any other standard interface. However, when good IP is available it makes more sense to buy rather than make.

Arasan have a complete portfolio, including D-PHY and M-PHY, and digital controller IP (CSI-2, CSI-3, DSI etc.), providing the smallest power and area footprints and the highest quality. These are customer-proven and thus the low-risk path for fast time-to-market designs (which would be… well, all designs).

Arasan’s white paper on CSI-3 is here.

Andy Haines of Arasan will also be presenting at the Flash Memory Summit next week, although not on cameras: his talk is on Mobile Storage Designs for Compliance and Interoperability. FMS is at the Santa Clara Convention Center, Tuesday to Thursday, August 13th-15th. Details here. And if you want to talk CSI-3 or anything else Arasan at the summit, they will be exhibiting at booth 610 and also on the UFSA standards organization booth, which is 800.


SEMICON Taiwan 3D

by Paul McLellan on 08-08-2013 at 3:10 pm

SEMICON Taiwan is September 3rd to 6th in TWTC Nangang Exhibition Hall. Just as with Semicon West in July in San Francisco, there is lots going on. But one special focus is 3D IC. There is a 3DIC and substrate pavilion on the exhibit floor and an Advanced Packaging Symposium. Design tools, manufacturing, packaging and testing solutions for 2.5D-IC process are available this year, and the most important issue is how to improve its throughput to enable 2.5D-IC mass production in 2014.

3DIC is one of the key “More Than Moore” technologies to increase system capability in ways other than technology scaling (28nm, 20nm, 14/16nm etc). Although in the long term true 3D systems may be designed, with logic on all the layers, in the shorter term there are two particular areas showing promise:

  • 3D memories, stacking memory die, either to put them into a package as with Micron’s Hybrid Memory Cube, or to stack memory on top of logic, probably using JEDEC’s Wide I/O standard
  • 2.5D interposer designs, where various chips, probably from different technologies, are flipped and attached to a silicon (or perhaps glass) interposer

Although there are some design issues with both of these, pipe-cleaner designs have successfully been done so the real roadblocks are economic.

The first economic problem is the known-good-die problem. With a single die in a package, if a bad die slips through wafer test, gets packaged and then fails final test, you have wasted the cost of the package and the cost of putting one die in a package and bonding it out. You didn’t waste the die; it was bad anyway. Since wafer test costs money, there is a crossover point beyond which doing more testing at the wafer stage outweighs the cost of discarding the occasional package. With a 2.5D interposer-based design, a bad die that slips through means you waste a very expensive package, an interposer and all the other die in the package, which were good, plus all the cost of putting everything together. It is a lot more important that bad die do not survive that long, and so the economics of wafer sort change completely.
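The crossover argument can be sketched as a toy cost model. All the dollar figures and escape rates below are invented purely to show how the value at risk moves the break-even point; they are not real industry numbers.

```python
# Toy model of the known-good-die crossover described above: compare
# the expected scrap cost of an escaped bad die against the cost of
# extra wafer-level test. All numbers are invented for illustration.
def expected_scrap_cost(escape_rate, scrap_value):
    """Expected cost per assembled part from bad die that slip through."""
    return escape_rate * scrap_value

# Single-die package: an escape wastes only package + assembly (~$1).
single_die = expected_scrap_cost(escape_rate=0.005, scrap_value=1.0)

# 2.5D assembly: an escape also wastes the interposer, the expensive
# package and the other good die in it (~$40 of value at risk).
interposer_2p5d = expected_scrap_cost(escape_rate=0.005, scrap_value=40.0)

extra_test_cost = 0.05  # extra wafer test that would halve the escapes

print(f"single-die scrap cost per part: ${single_die:.3f}")
print(f"2.5D scrap cost per part:       ${interposer_2p5d:.3f}")
# Spending $0.05 to save half of $0.005 makes no sense; spending it
# to save half of $0.20 clearly does. The value at risk moves the
# break-even point, which is why wafer sort economics change for 2.5D.
```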

The second economic problem is the cost of the assembly process. Wafers need to be thinned, glued to something strong enough that it can be handled, bumped, cut up, the backing removed, the die put in the package, the bumps bonded etc. If this is too expensive then it makes the whole idea of using a silicon interposer unattractive versus just using separate packages or doing some sort of multi-die bonded package.

Taiwan is ground zero of the packaging and assembly world. It has the world’s largest packaging and testing company, ASE, as well as SPIL, PTI, and ChipMOS; together they hold over 50 percent of the global packaging and testing foundry market. Amkor (Korea) and STATS ChipPAC (Singapore) have also set up plants in Taiwan.

Full details including registration here.