"Cook’s Law" supersedes "Moore’s Law"-its impact on Apple, Samsung, TSMC & Intel
by Robert Maire on 05-29-2015 at 7:00 am

Apple drives the semi industry harder than Wintel ever did: Is winning Apple’s chip business a pyrrhic victory? Is 14nm done before it starts? Too short to be profitable?

Chips marching to an Apple cadence…

In the “old days” when Wintel ruled the roost and drove the semi industry, it was driving spending cycles based on new versions of Windows that stimulated unit volume of PCs and thus chips.

New versions of Windows did not specifically demand or require new technology nodes of Intel processors, which were released at the standard “Moore’s Law” cadence. Windows releases and Intel technology nodes were not interdependent and were only loosely linked. It was a “nice to have” if new processors came out at the same time as a new version of Windows, but it wasn’t a “must have.”

In today’s world, a new version of the iPhone can’t be released unless a new processor is inside to drive it to new heights. The product, processor and software are inextricably linked.

Given that Apple is driving the train with its seasonal fall rollout of the new iPhone every year, everybody who supplies Apple (this means semi suppliers) has to be on board or be left behind at the station. In essence, this means that Apple is setting the schedule for the next semiconductor technology node rollout, not the semiconductor industry itself or Moore’s Law as had previously been the case.

Apple is forcing and imposing a schedule upon its suppliers, which may be different than a “natural” cadence and likely negatively impacts those who are forced to follow.

Is 14nm done before it even starts?

We are amazed by the level of BS in the industry that competing players are throwing around about 14nm and now 10nm. Both Samsung and TSMC are pushing their competing press releases about 10nm in 2016 before the ink is even on the paper for 14nm orders.

Going by what’s in the trade rags and around the industry, we have moved on so quickly from the issues of 14nm and FinFET to who has the lead at 10nm that it makes my head spin. Apple won’t have a product out until the fall and we have already started to talk about who will win the A10 for 2016.

If we believe the hype (and I’m not sure whether we do or not…) it sounds like 14nm will be another “lite” node much like 22/20nm was, the last “good” node being 28nm. However, there are those in the industry who say that 10nm will be another “good” node much like 28nm.

It feels like 14nm is already “old news” and the PR wars and jockeying for position at 10nm are even more severe than they were at 14nm… and who does this all benefit? Apple.

Is Apple’s chip business a “loss leader”?

When you take into account the massive effort to ramp, the less than ideal yields and the competitive positioning needed to win Apple’s business, it’s not likely to be very profitable at the end of the day.

One of the main reasons we would suggest this is that the cost of manufacturing semiconductors is primarily the amortization of the manufacturing investment over as many years and products as possible. If Apple forces chip makers to move on before they get a chance to amortize the cost of the equipment and R&D needed to get to that technology node, then how do you make money? Certainly not on Apple. The only way you can make money is by trying to amortize that cost on the backs of trailing technology companies, and no one wants to pay up for what are perceived as trailing edge devices.
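To make the amortization argument concrete, here is a minimal sketch in Python. Every figure in it (the node investment, the wafer volume, the node lifetimes) is a made-up assumption for illustration, not data from any foundry. It only shows how the fixed cost carried by each wafer rises when a node's productive life is cut short.

```python
def fixed_cost_per_wafer(node_investment, wafers_per_year, years_in_production):
    """Spread a one-time node investment evenly over every wafer produced."""
    total_wafers = wafers_per_year * years_in_production
    return node_investment / total_wafers


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
investment = 10_000_000_000      # equipment plus R&D for one node, in dollars
wafers_per_year = 500_000

long_lived = fixed_cost_per_wafer(investment, wafers_per_year, 5)   # 4000.0
short_lived = fixed_cost_per_wafer(investment, wafers_per_year, 2)  # 10000.0

print(f"5-year node: ${long_lived:,.0f} of fixed cost per wafer")
print(f"2-year node: ${short_lived:,.0f} of fixed cost per wafer")
```

Under these assumed numbers, shortening the node's life from five years to two raises the fixed cost carried by each wafer by 2.5X, which is the squeeze the paragraph above describes.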

We think that Apple has made it a more dangerous, potentially much less profitable game by both compressing the technology nodes and forcing them to their own cadence.

Cook’s Law..
“Supplier competition goes up exponentially with each new supplier or technology node added”

The semiconductor industry may be just as much a slave to Apple’s whims as are the Apple slaves at the Foxconn factories in China. Walmart may have a million employees in the US but Apple has more if you count suppliers globally.

If you are going to be a slave at least be a high priced slave. We have a hard time seeing the semiconductor industry getting better profitability out of Apple given the current competitive supplier dynamics involved.

We don’t see this changing soon as neither TSMC nor Samsung is likely to drop out of the race. Maybe Apple kicks Samsung to the curb again just to remind them of their place as a supplier, but they will keep coming back. Maybe GlobalFoundries has the right idea, as they are currently working on Qualcomm 14nm parts and not the Apple A9. Maybe they figured out it was a bad game to play or maybe they were just too late. Apple has been the maestro of playing its suppliers, and it continues to write the rules and set the standards.

Can equipment companies win?

One would think with technology nodes coming fast and furious that equipment companies would be rolling in orders, but that is obviously not the case. So where is the disconnect? Business is good but not great on the foundry side of life. Could it be that chip companies recognize that we have “lite” technology nodes that are relatively short lived, and are spending accordingly so as not to invest too much money in a node that’s over as soon as it starts? Could it also be that the equipment for older nodes can get rolled over into new nodes and “reused” more quickly, as not as much capacity is needed at trailing nodes as used to be the case?

Even given these two factors, it’s still going to be hard not to spend incremental money when you start talking about quadruple patterning at 10nm and below. Lots of etch and dep tools, lots of stuff to go wrong needing yield management. EUV is nowhere to be seen at 10nm, and 7nm may be “iffy”.

Likely positive WFE spend trends at 10nm…
If 10nm turns out to be more than the “lite” 22/20nm node or what seems like a “lite” 14nm node that would obviously be good for the likes of Lam, AMAT & KLAC. Less so for ASML.

As far as the stocks go, we remain positive on Lam and KLAC, feel that AMAT is fully valued and ASML is overvalued…..based on these longer term trends. These should be interesting topics at the upcoming SemiCon West show……

Robert Maire
Semiconductor Advisors LLC


Also Read:
Why does Apple do business with Samsung?


Virtual HIL and the 100M LOC car

by Don Dingee on 05-28-2015 at 7:00 pm

Aerospace and defense applications have traditionally leveraged hardware-in-the-loop (HIL) testing to overcome several issues. A big one is how expensive the physical system is. Even breaking down the system into subsystems for test can still be too expensive when fielding more than a couple test stations. Modeling elements of the “plant” for testing control electronics is essential to achieving reasonable development schedules and reducing risks through more complete testing at both the subsystem and system levels.

Automotive companies, many of whom moonlighted as defense suppliers in varying degrees, borrowed the HIL approach to improve testing of vehicle designs. While the platform isn’t nearly as expensive, the compressed development schedules of model-year releases dictate a more efficient testing approach.

Three other effects are adding to the automotive problem. First is complexity. By many estimates, the traditional metric of lines of code (LOCs) in a luxury vehicle is now surpassing 100M, and it isn’t much less at the midrange and low end as electronics content increases. That would be challenging enough if the code were a single lump, but in fact the problem is much larger. Those LOCs are distributed across perhaps 100 or more subsystems, each with its own software and many running on different processor architectures. Manufacturers are trying to rein in that complexity by consolidating systems and standardizing around AUTOSAR and other architectures, but the problem is still large.

Second is degree of difficulty. Simulation of most electromechanical and hydraulic systems used to be a relatively easy task. Much faster response times in power electronics have made simulation a challenge, with many designers turning to FPGA-based acceleration of test platforms. Also factored in is the asynchronous nature of subsystems, loosely coupled on a vehicle bus such as CAN – accurately simulating and reproducing timing under all conditions is critical to assessing system operation.

Third is liability. The cost of failure in a car is much greater than it used to be, given the escalation in lawsuits and insurance costs. Even more dramatic is the expectation that manufacturers maintain vigilance through product recalls and warranty repairs. This has shifted the burden of test from straightforward functional verification to mitigating defect escalation, and the response is to “shift left” with earlier software testing.


For higher integrity levels in ISO 26262, the recommended approach involves fault tree analysis, fault insertion, and failure mode and effects analysis (FMEA). Physical fault insertion is expensive, time consuming, and hard to reproduce. When scaled across numerous scenarios and subsystems, it becomes difficult to sustain on an aggressive development timeline. As SoC integration is increasing, physical fault insertion is also becoming less feasible for chip users.

In a recent webcast, Synopsys has taken a fresh look at the ISO 26262 problem in the context of virtual HIL, using their experience in virtual prototypes. Using advanced tools combined with detailed, accurate models of popular automotive microcontrollers, many of the limitations of physical assessment of subsystems can be avoided. For instance, faults can be introduced virtually at the SoC level, providing rapid testing with reproducible, fully documented results.
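As a rough illustration of why virtual fault insertion is reproducible, here is a minimal Python sketch of a seeded fault-injection campaign. It is a toy model and not a representation of any Synopsys tool: the 16-bit "register", the single-bit-flip fault model and the parity check standing in for a safety mechanism are all assumptions made for this example.

```python
import random


def flip_bit(value, bit):
    """Fault model: invert one bit of a register value."""
    return value ^ (1 << bit)


def parity(value):
    """Even/odd parity of a word; stands in for a safety mechanism."""
    return bin(value).count("1") % 2


def run_campaign(seed, trials=100, width=16):
    """Inject one single-bit fault per trial and record whether it is caught."""
    rng = random.Random(seed)          # fixed seed makes the campaign repeatable
    detected = 0
    log = []
    for _ in range(trials):
        word = rng.getrandbits(width)  # hypothetical register contents
        stored_parity = parity(word)
        bit = rng.randrange(width)     # which bit the fault hits
        faulty = flip_bit(word, bit)
        caught = parity(faulty) != stored_parity
        detected += caught
        log.append((word, bit, caught))
    return detected, log


detected, log = run_campaign(seed=42)
print(f"{detected} of {len(log)} injected faults detected")
```

Because the campaign is driven by a seed, rerunning it with the same seed replays exactly the same faults and produces the same log, which is the documentation and reproducibility benefit described above and is hard to get from physical fault insertion.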

Synopsys gives an overview of the changes in the automotive environment, the ISO 26262 standard and the FMEA philosophy, and how their tools work, in this SAE-moderated event:

“Shift Left” Functional Safety for Automotive System Development

Synopsys has combined their Virtualizer Development Kits with their Saber simulation environment and third-party tools such as Vector CANoe for network simulation to create simulation capability that can handle these larger, more diverse automotive systems. The examples shown in two videos in the webcast are focused on a single ECU for simplicity, but it is evident how the concept could scale.

For teams working on automotive SoCs, ECUs, or in designs targeting safety-critical systems in general, the ideas explored in this webcast may help keep up with the testing challenge.


SITRI and Coventor Partner to Scale Up MEMS in China

by Pawan Fangaria on 05-28-2015 at 12:00 pm

When it comes to wearable technology and the rapidly emerging world of IoT, sensors and MEMS are on the frontlines. They collect and transfer raw data such as pressure, temperature and motion and process it with algorithms critical to making sure the right information gets to humans and/or machines so the right reaction is enabled. In less than a decade, there is expected to be approximately 1 trillion sensors deployed worldwide – yet the MEMS market is fragmented and there is as yet no standard process in place for MEMS development. Change is needed; a standard approach for MEMS design and manufacturing needs to evolve in order to sustain the massive growth prospects ahead.

The significance of MEMS has not gone unnoticed, especially by Chinese companies who are eager to jump into this rapidly growing market. At the intersection of MEMS and China sits a company called SITRI, which is announcing a partnership with MEMS tool leader Coventor.

SITRI is an innovation center for accelerating the development and commercialization of “More than Moore” solutions to power the Internet of Things. In partnership with Coventor, the two companies are working together to help scale up MEMS in China. They recognize the need for an automated process for MEMS design and manufacturing. As part of this partnership, SITRI will provide representation, training and support for Coventor’s MEMS products within China.

Coventor’s tools for MEMS design and integration, MEMS+ and CoventorWare, offer a seamless environment for designing MEMS devices. Also, Coventor’s SEMulator3D provides a virtual fabrication platform for process development and integration for MEMS, CMOS, FinFET and many other semiconductor technologies.


[Shanghai Industrial µTechnology Research Institute]

SITRI is a research and innovation centre that accelerates the development of semiconductor devices and their commercialization in-house and through its network of partner organizations. It has a large presence in China and partners across the world including Taiwan, Korea, US and Europe.

SITRI provides 360-degree solutions to start-up companies to help them grow and become successful. It provides technical expertise, infrastructure support, prototype development, process development and integration, design and simulation, market engagement, and even investment in startups. SITRI has strong ties with academic institutions, research centers, and the Chinese semiconductor industry, which make it an important player in the overall ecosystem.

Today, SITRI is heavily focused on providing expertise in MEMS design, test, process, yield, predictability and packaging that accelerate MEMS time-to-market.

China is a fast-growing market with numerous fabless companies dealing in SoCs and IP, and several foundries. It’s one of the largest consumers of MEMS and ICs. With a rapidly expanding market, research institutions, talent and expertise in semiconductor and MEMS development, and the right infrastructure, China provides an excellent ecosystem to scale up MEMS design and manufacturing to meet the rising demand for MEMS-based devices.

Coventor’s products are sophisticated 3-dimensional modeling and simulation tools that automate design and process development for MEMS. By using these tools, designers can work from specific fab process capabilities to qualify a particular MEMS design for manufacturing before committing to actual fab processing, dramatically shortening development time and increasing the quality and reliability of the MEMS design. SITRI will have access to all MEMS products from Coventor and will represent Coventor in the China market. SITRI’s engineers will provide high-level expertise and support to Coventor’s customers in accelerating MEMS development through the use of Coventor tools.

Both companies are very excited about this partnership to develop newer processes for MEMS and to take advantage of the larger opportunity provided by the IoT market in China. Coventor’s vision is to expand its solution for faster semiconductor and MEMS design and manufacturing across the world. Coventor recently opened a sales and support office in Taiwan as well, and the company already has a good presence in the U.S. and Europe.

Read more in the press release about Coventor and SITRI partnership here.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


WarpStor, the Data Tardis: Small on the Outside, Large on the Inside

by Paul McLellan on 05-28-2015 at 7:00 am

There is a data explosion:

  • IBM says that 90% of all data was created in the last 2 years
  • Smartphone processor development requires 100GB of data per engineer
  • Android testing requires 30GB times the number of tests times the number of testers
  • Biotech simulation, game development and more all require enormous amounts of data

This is a huge problem. Disk drives are cheap, but reliable enterprise-class storage is expensive. Gigabit Ethernet connections are too slow and are not scaling, and most tech environments are based on NFS, which is slow and has a high overhead. With hundreds of users on projects, another challenge is to reduce needless duplication of the same files.

Methodics is introducing WarpStor to address this problem. It is a content-aware network-attached storage (NAS) optimizer built on top of ProjectIC’s abstraction model. It is vendor agnostic, co-existing with storage solutions from IBM, EMC, NetApp and more. It doesn’t require weird stuff like kernel-level patches, and it integrates seamlessly with existing OS infrastructure.

Although this is being announced today, it is actually mature technology. It has been in use at Methodics internally for over a year with great results in their build and regression process. For example, the disk space requirements for the Methodics internal regression suite have been reduced from 300GB to 1GB, with a similar reduction in network I/O and a big reduction in wall-clock time for running the regressions.

This sort of reduction sounds too good to be true, so how does it work? There is an IP master workspace. A workspace shrink takes place to reduce the storage requirements. Changes in the workspace are handled by copy-on-write, the same technique most operating systems use for virtual memory. Data that has not been changed is shared. Only when one user makes a change is the data copied into their workspace (and altered), while other users continue to share the original unchanged version in their workspaces. As a result, creating a new workspace before any changes have been made is nearly instantaneous. Eventually the changes are (normally) released and then become visible to others. So the first workspace requires some disk space and time to populate, but subsequent workspaces consume almost no disk space and take less than a second to create.
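The copy-on-write idea can be sketched in a few lines of Python. This is only a conceptual model and says nothing about how WarpStor is actually implemented: the `Workspace` class, the dictionary standing in for the master workspace and the file names are all hypothetical.

```python
class Workspace:
    """A user workspace that shares unchanged files with a common master."""

    def __init__(self, master):
        self._master = master   # shared data, never modified via a workspace
        self._local = {}        # private copies, created only on first write

    def read(self, path):
        # Prefer the private copy if this user has changed the file.
        if path in self._local:
            return self._local[path]
        return self._master[path]

    def write(self, path, data):
        # Copy-on-write: the change lands in this workspace only.
        self._local[path] = data

    def private_bytes(self):
        """Storage consumed by this workspace beyond the shared master."""
        return sum(len(data) for data in self._local.values())


# Hypothetical master workspace with two files.
master = {"top.v": b"module top; endmodule", "lib.v": b"x" * 1000}

alice = Workspace(master)   # creation copies nothing
bob = Workspace(master)

alice.write("top.v", b"module top2; endmodule")

print(alice.read("top.v"))    # Alice sees her own edit
print(bob.read("top.v"))      # Bob still sees the shared original
print(bob.private_bytes())    # 0: Bob has changed nothing
```

In this model a new workspace costs nothing until its owner edits a file, and the extra storage grows only with the amount of data actually changed, which matches the behavior described above.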

WarpStor is seamlessly integrated into ProjectIC. There is no change at all in the conceptual data model, just a major increase in efficiency both in disk space usage and in the network bandwidth required to move it in and out of users’ own workspaces.

In summary, ProjectIC’s abstraction model enables smart data management and WarpStor is seamlessly integrated with it. It can create 100GB+ workspaces in seconds, requiring negligible disk space at create time. Copy-on-write is used for changed files so the disk space requirements are tied to how much of the design is actually changed. Result: a huge saving in disk space, disk reads/writes, network file transfers. This provides a true turbo-boost to ProjectIC.

“Scotty, I need warp speed in 3 minutes or we’re all dead.” No problem Captain, MethodICs can do it.

The WarpStor webpage is here.


Why Design Data Management: A View from CERN

by Majeed Ahmad on 05-27-2015 at 10:00 pm

On July 4, 2012, the European Organization for Nuclear Research, or CERN, announced that the ATLAS and CMS experiments had each observed a new particle, which is consistent with the Higgs boson predicted by the Standard Model of particle physics. The Compact Muon Solenoid (CMS) is a general-purpose detector with a broad physics program that includes the Higgs boson. The CMS experiment is one of the largest international scientific collaborations in history, involving 4,300 particle physicists, engineers, technicians, students and support staff from 182 institutes in 42 countries.

Another prominent feature of the CMS experiment has been the extensive use of semiconductor devices (around 1 million chips, of which nearly 700,000 are ASICs) that range from pixel detectors to Si sensors to calorimeter chips. The ASIC design work is imperative in high energy physics (HEP) community experiments because commercial off-the-shelf (COTS) components don’t meet the high-radiation, high-magnetic-field and low-power requirements.


A CMS experiment consumes nearly 1 million chips

The large-scale ASIC development is a giant challenge in its own right. However, the collaborative nature of work carried out at CERN brings a new conundrum that goes beyond the labyrinth of technical challenges generally associated with integrated circuit (IC) design work. There are around 30 engineers in CERN’s microelectronics team, and they are collaborating with 70 to 120 chip designers from 20 to 30 universities and research institutes. So, typically, design teams involved in the CERN projects are dispersed geographically as well as institutionally.

And here comes the design conundrum for CERN: The chip industry is facing a number of challenges in dealing with traditional ad hoc ASIC design methodologies. And CERN’s situation, where it’s hard to get all the stakeholders in an ASIC design project in one room, further exacerbates design challenges. In addition, the sheer scale of the number of ASICs used for such a scientific undertaking means that huge data volumes are circulating in a project.

Traditional ASIC Design Challenges

Wojciech Bialas is an IC design engineer in CERN’s microelectronics group. He shared his views on the ASIC design flow problems that chip designers face at CERN at CDN Live EMEA in Munich, Germany at the end of April. He also explained the solution that allowed ASIC designers at all sites to have access to all design data as well as changes in real time: a multi-site design collaboration scheme. A quick recap of the design problems first.

In a traditional ASIC design flow, chip designers have scratch libraries for development, and they share master libraries that contain the finished cells. Chip designers create or edit cells in their personal scratch libraries, and then they verify them using the master libraries. ASIC designers verify the design by using the limited remote site collaboration available through either rsync or FTP, which is relatively time consuming. Access control is mostly based on trust, and if there is a new design release, it’s archived in the master library.


Ad hoc ASIC methodologies offer no traceability of design changes

However, traditional design flow is quickly running out of steam for collaborative ASIC design work. For a start, design changes are usually tracked through e-mails and meetings. Chip designers can accidentally overwrite each other’s changes. Moreover, it’s hard for them to know when changes are lost, so they end up losing track of what different versions mean.

As a result, libraries become cluttered with unwanted cells and design versions. The notion that any user can make changes in libraries at any time also creates doubts about the quality of simulation and verification work. All this leads to a high risk of miscommunication and delayed turnaround for design fixes.

Design Data Management

EDA toolmakers like Cadence Design Systems have started integrating an API layer for third-party data management solutions. That allows all design data and revisions to be managed through a project repository like SOS from ClioSoft Inc. SOS is a design data and IP management platform that is integrated with leading design flow tools such as Cadence’s Virtuoso, Mentor’s Pyxis, Keysight’s ADS and both Synopsys Custom Designer and Synopsys Laker.

The SOS project repository allows each user to have a separate and isolated work area, and each work area has a read-only linked copy of the project libraries. ASIC designers check out a writable copy of a cell before editing it. They can edit existing cells or create new cells in their work area. When a user changes the history of an object, the SOS data management tool automatically updates the entire project. Given the sensitive nature of the data, SOS provides administrative controls to manage design access.


SOS provides tracking and accountability of design changes

The primary SOS server, located at the CERN headquarters in Switzerland, manages the project repository that contains the entire project data and revisions. The distributed architecture of the SOS tool, however, permits different repositories to be set up at different locations as needed. At each remote site, cache servers are set up and automatically update with changes. In other words, ASIC designers at all sites have access to design data and changes in real time. Designers without access to cache server infrastructure are allowed CERN computing accounts with encryption tunnels for appropriate access control.

CERN’s Bialas acknowledged that the use of SOS data management frees CERN engineers from the need for periodic syncs and artificial partitioning of ASIC designs. Moreover, it allows design managers to accomplish an optimum use of resource bandwidth among different sites.

Bialas also shared his views on the Visual Design Diff software, another ClioSoft product that displays design differences in schematics, layout and RTL. It’s particularly useful in ECO flows to track the changes made between different versions of the same design on which different design engineers are working. Bialas noted that the use of Visual Diff allows ASIC designers to quickly see the difference between the revisions. Moreover, users can take snapshots to record new configurations, and these snapshots can be recreated at any time in the design cycle.

As the 100+ designers in the CERN project discovered, ClioSoft allows companies and institutions to use the best talent, worldwide, on every project. Collaboration in real time and worldwide revision control enable exploration in new areas of growth.

Majeed Ahmad is the former Editor-in-Chief of EE Times Asia and is the author of six books about the electronics industry.

Also Read

ClioSoft Celebrates 2014 with 30% Revenue Growth!

Secret Sauce for Successful Mixed-signal SoCs

DNA Sequencing Eyes SoCs for Stability and Scale


Will Dark Silicon Dictate Server Blade Architecture?

by Tom Simon on 05-27-2015 at 7:00 pm

Does the evil sounding phenomenon known as Dark Silicon create a big opportunity for FPGA vendors as was predicted recently by Pacific Crest Securities? John Vinh posits that using multiple cores as a method of scaling throughput is flattening out, and the use of FPGA’s to perform computation can help off-load and thus overcome this issue.

The root cause of the so-called Dark Silicon phenomenon has nothing to do with evil Sith Lords or a post-apocalyptic Mad Max world left without any power to run IC’s. Any ASIC designer will tell you that it is essential to manage on-chip power by controlling clocks and voltages and by turning modules on and off as needed. Clock gating is the main tool in this camp, given that clock trees and flops consume a tremendous share of an ASIC’s power. Of course lowering clock rates helps, but this comes at a direct cost in performance, the very thing we are trying to squeeze out of these designs.

Dark Silicon is more of an effect than a cause. When there are more gates on an ASIC than can be run within the thermal constraints of the design, the silicon that cannot be run is called Dark Silicon. It’s really better to think of it as a percentage of what needs to be switched off, rather than specific blocks that never run. However this is nothing new.
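The "percentage that needs to be switched off" view can be written down as a one-line calculation. The Python sketch below assumes a deliberately crude model in which every active gate draws the same power; the gate count, per-gate power and thermal budget are invented numbers used only to show the arithmetic.

```python
def dark_fraction(total_gates, watts_per_active_gate, thermal_budget_watts):
    """Fraction of gates that must stay off to remain inside the power budget."""
    full_power = total_gates * watts_per_active_gate
    if full_power <= thermal_budget_watts:
        return 0.0                      # everything can run at once
    return 1.0 - thermal_budget_watts / full_power


# Hypothetical chip: 2 billion gates, 100 nW per active gate, 120 W budget.
fraction = dark_fraction(2_000_000_000, 100e-9, 120.0)
print(f"About {fraction:.0%} of the silicon must be dark at any moment")
```

With these assumed figures, running every gate would need roughly 200 W against a 120 W budget, so about 40% of the gates have to be idle at any given time, although which 40% can change from moment to moment.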

What is new is the disparity between gates available and the ability to run them. Multicores helped push through the performance barrier when clock rates for CPU’s plateaued between 3 and 4 GHz. But multicores are also running out of steam. But does adding FPGA’s programmed to perform computation in server farms really solve this problem?

The rule of thumb for converting a software task to an FPGA is that it provides about a 10X improvement in performance. But when Microsoft used a hybrid FPGA-CPU combination for its Bing search engine, they realized a 2X improvement.

So there is clearly a cost in the hybridization process. What the Pacific Crest piece overlooks is the silicon utilization question for FPGA’s. Yes they can get higher throughput than a general purpose CPU with software algorithms, but where do they stand with regards to gate utilization and power consumption per computational operation? I bet that a CPU can perform more hardware work per unit power and silicon than an FPGA. They are optimized to do just that. FPGA’s certainly run slower than dedicated CPU’s.

FPGA’s in this case are just able to solve an algorithm with less computation. So they have an advantage, but apparently not as big as you’d expect: witness the 2X gain in this scenario rather than the 10X of the rule of thumb. And, as you might guess, an ASIC would do even better for a hard-wired algorithm. The easiest example for understanding this is Bitcoin mining. People quickly stopped using processors and went to FPGA’s for generating Bitcoin hashes. FPGA’s were a lot faster than software for generating hashes, but the dedicated ASIC’s are orders of magnitude faster.

Dark Silicon is real and is causing people to design differently, but the move to FPGA’s is more about how many gates need to be toggled to solve a particular problem based on how general purpose the hardware is. Companies like Google, Microsoft and the other search and big data providers have enough clout to build their own ASIC’s for search and computation. And let’s not forget Oracle with some significant in house chip design expertise. And these ASIC’s will probably run pretty fast – and yet they will still need to worry about Dark Silicon.


Getting the Best Dynamic Power Analysis Numbers

by Daniel Payne on 05-27-2015 at 1:00 pm

On your last SoC project, how well did your dynamic power estimates match up with silicon results, especially while running real applications on your electronic product? If your answer was, “Well, not too good”, then keep reading this blog. A classical approach to dynamic power analysis is to run your functional testbench on some RTL code or even a gate-level netlist, then look at the switching activity as a function of time. This approach is shown in red on the following chart:

Some of the issues with using a testbench approach to getting switching activity are:

  • Functional simulation takes a long time, so you cannot really boot an OS or run an app
  • Your stimulus may not uncover the worst case scenarios, giving you a false sense of security

Another approach is to use a hardware emulator and then actually boot the OS and run your real apps to see switching activity, shown in blue on the chart above. For this SoC it is clear that the testbench approach led to false power peaks, which would’ve meant that silicon power consumption was much higher than expected, causing a re-spin or even cancellation of the project. The emulator-based approach, which was able to run the OS and live apps, provided the truest switching activity numbers, leaving no surprises for silicon.

So, who has created such an emulator-based approach to dynamic power analysis? It turns out that it’s not a single company, but rather two companies:

  • Mentor Graphics providing the emulation and integration
  • ANSYS with the PowerArtist tool for dynamic power analysis

Related – Mentor’s New Enterprise Verification Platform

I spoke with Jean-Marie Brunet of Mentor Graphics last week by phone to learn about this new capability for counting the switching activity on RTL or gate-level SoC designs using their Veloce emulator.

Mentor Graphics has offered a couple of previous flows with their emulator that created files for switching activity using the UPF and SAIF file formats. What’s new this month is that their emulator can now create this switching activity and make the results available through a dynamic API, which provides benefits like:

  • Faster time to power analysis results (no reading and writing of file interfaces)
  • Integration with the popular PowerArtist tool from ANSYS which provides the dynamic power analysis numbers

Using the SAIF (Switching Activity Interchange Format) approach is one possible power flow, and it will get you average power numbers. Another approach is to use an FSDB (Fast Signal Database) file flow; however, you will quickly find that your hard disk is filling up with large files that also take a lot of CPU time to process, but at least you are getting dynamic power values. The new, recommended approach is one where the emulator has a Power Application that can connect to PowerArtist from ANSYS with a dynamic API, allowing:

  • Fastest speed, sufficient to boot an OS and run real Apps on your SoC
  • Quickest time to results
  • Average or peak power
  • Eliminates large FSDB files
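To illustrate why an averaged activity flow and a time-resolved flow can tell different stories, here is a minimal Python sketch. It assumes the textbook approximation that each signal toggle dissipates about 0.5 * C * V^2. The toggle counts, capacitance, voltage, and window length are made-up values, and the function names are hypothetical; nothing here corresponds to the Veloce or PowerArtist interfaces.

```python
def window_power_watts(toggles, window_seconds, cap_farads, vdd_volts):
    """Average dynamic power over one time window.

    Assumes each toggle dissipates roughly 0.5 * C * V^2, so power is that
    energy multiplied by the toggle rate inside the window.
    """
    energy_per_toggle = 0.5 * cap_farads * vdd_volts ** 2
    return energy_per_toggle * toggles / window_seconds


def average_and_peak(toggle_counts, window_seconds, cap_farads, vdd_volts):
    """Return (average, peak) power across equal-length windows."""
    powers = [
        window_power_watts(t, window_seconds, cap_farads, vdd_volts)
        for t in toggle_counts
    ]
    return sum(powers) / len(powers), max(powers)


# Hypothetical activity: three quiet windows and one burst.
toggle_counts = [100, 100, 900, 100]
avg, peak = average_and_peak(
    toggle_counts, window_seconds=1e-6, cap_farads=1e-15, vdd_volts=1.0
)
print(f"average = {avg:.3e} W, peak = {peak:.3e} W, ratio = {peak / avg:.1f}")
```

With these illustrative numbers the burst window draws three times the average, which is the kind of peak that a single averaged figure cannot show.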

Related – Improving Verification by Combining Emulation with ABV

Here’s what the new flow looks like:

This approach was requested by leading-edge customers, and you can expect Mentor to do additional integrations in the future, even with other EDA vendors. Actual performance numbers for this API-based, dynamic power approach compared to the older, slower file-based approach show a speedup of 2X to 4.25X, depending on the type of design that you have. That means that you can expect to save weeks of CPU time on a large SoC project to get accurate, dynamic power numbers from either RTL or gate-level netlists.
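As a back-of-the-envelope check on what a 2X to 4.25X speedup means in CPU time, the short Python sketch below computes the time saved for a given baseline. The six-week baseline is a hypothetical figure chosen only for illustration, and the function name is made up.

```python
def cpu_time_saved(baseline_weeks, speedup):
    """Time saved when a job that took baseline_weeks runs speedup times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_weeks * (1 - 1 / speedup)


# Hypothetical file-based flow that needs 6 weeks of CPU time.
baseline = 6.0
for speedup in (2.0, 4.25):
    saved = cpu_time_saved(baseline, speedup)
    print(f"{speedup}X speedup saves {saved:.2f} of {baseline:.0f} weeks")
```

At 2X half of the baseline is saved; at 4.25X a little over three quarters of it is.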

Summary
If you already have the Veloce emulator and PowerArtist tool, then it would make sense to give this new Veloce Power App an evaluation. If your last dynamic power estimate was dramatically different than silicon, then maybe it’s time to consider using this emulator-based flow instead of your old approach.


FinFET: The Miller’s Tale

by Paul McLellan on 05-27-2015 at 7:00 am

In Chaucer’s Canterbury Tales, the second of the tales told by the pilgrims is The Miller’s Tale. Since this is a family blog, I’ll leave you to research the tale yourself. But FinFETs hide another Miller’s Tale, due to Miller capacitance, sometimes called the Miller effect. This is significant since in FinFET designs Miller capacitance can be the dominant capacitive effect, even more than the wire loads themselves. Existing STA tools and library models miss or understate this effect, so circuit performance can be overestimated, leading to yield or clock frequency problems in the unforgiving silicon.

So what is Miller capacitance? It is an effect well known to analog designers but largely ignored by digital designers who can ignore many second order effects for process node after process node until…well…until they can’t, which in this case is when we move to FinFETs. The Miller effect is the increase in input capacitance of an inverting amplifier (such as an inverter or a NAND gate) due to the amplification of the capacitance between the input and output terminals. For example, the diagram on the right shows an inverter with amplification A_v and an impedance Z between the input and output. If that impedance is a capacitance C, then the effective capacitance seen at the input, due to the Miller effect, is not just C but is actually C * (1 + A_v).
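The relation C * (1 + A_v) is easy to check numerically. The Python sketch below is only an illustration: the function name is made up, and the 0.1 fF capacitance and gain of 5 are hypothetical values, not figures for any real FinFET process.

```python
def miller_input_capacitance(c_farads, gain_magnitude):
    """Effective input capacitance of an inverting stage: C * (1 + A_v).

    c_farads is the capacitance being amplified by the Miller effect and
    gain_magnitude is the magnitude of the stage's voltage gain A_v.
    """
    if gain_magnitude < 0:
        raise ValueError("gain_magnitude is a magnitude and must be >= 0")
    return c_farads * (1 + gain_magnitude)


# Hypothetical values: 0.1 fF with a gain magnitude of 5.
c = 0.1e-15
c_eff = miller_input_capacitance(c, 5)
print(f"effective input capacitance is {c_eff / c:.1f} times C")
```

With no gain the result collapses back to C, and as the gain grows the capacitance seen at the input grows in proportion.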

The Miller capacitance becomes much larger with FinFETs than planar, simply because of their three-dimensional structures. The gate, which wraps around the channel, has a much larger exposed surface area, which in turn creates greater potential for capacitance. The diagram to the right shows all of the parasitic capacitance points created by the surfaces in a FinFET. Capacitance will be a function of either overlap or surface area, so the increase in both overlap and surface creates more potential Miller capacitance in the FinFET than in older planar transistors.

The Miller capacitance has two major impacts on circuit timing, both of which must be accounted for by delay calculation during static timing analysis:

  • the Miller capacitance acts as an active load that can dramatically impact the delay from driver to receiver
  • the relationship between the input waveform and the output waveform (which is, of course, also the input to the next stage) can be very non-linear

    Pretty much the simplest possible circuit is two inverters connected back to back, as on the right. The plot below shows what happens when the first inverter is driven with a rising signal (shown in red). The output from the first stage is shown in green and is clearly non-linear. The “bump” in the waveform is due to the Miller capacitance. This has two effects. First, the delay has changed: the Miller capacitance has slowed the signal. Second, the output signal is distorted and is not a simple linear ramp.

    The output from the second inverter, driven by the green signal as its input, is shown in blue. It is very non-linear. If we added a third inverter (or more), the waveform would be even more distorted.

    Most production sign-off flows today ignore the Miller effect, even in FinFET technologies. They simplify the receiver model to a basic pin capacitance, and waveforms are linearized rather than correctly propagated. The result is that delay is understated. This is most acute on nets with multiple receivers (fanout) where the Miller effect is largest.

    There are two reasons that timing using the CCS (composite current source) receiver model is not accurate:

  • the CCS receiver model itself is not accurate. Instead of reporting one capacitance value the receiver model creates two capacitance values to attempt to capture the Miller effect. However, this completely ignores the way the Miller effect actually works. The Miller capacitance is a function both of the physical capacitances in the receiver, and of the current driving the cell. As the current changes, so does the Miller capacitance. None of this is captured during characterization, so while there may be two capacitance values instead of one, they are both wrong.
  • propagating the waveform is essential. While STA tools claim support for waveform propagation, in practice it is unusable because the runtimes degrade to a level where only a few paths at best can be evaluated.

    Properly accounting for the Miller Capacitance requires correctly modeling the impact of an active load at both the driver and the receiver, and propagating the real waveform that results through the rest of the path.

    A CLKDA white paper on the Miller effect is here.


    Synopsys Earnings Call
    by Paul McLellan on 05-27-2015 at 12:00 am

    Synopsys had their earnings announcement and call last week. They were good. In Aart’s own words: “I’m happy to report that our second quarter results were very strong and solidify our outlook for the full year. We delivered revenue of $557 million, non-GAAP earnings per share of $0.68 and $155 million in operating cash flow. We’re raising the midpoint of our revenue guidance, with a range of $2.210 billion to $2.235 billion, and our non-GAAP EPS objective to a range of $2.76 to $2.81, double-digit growth at the midpoint.”

    Aart reiterated Synopsys’ 3-pronged product strategy:

    • Leadership in EDA for next-generation chips
    • Grow the IP offering
    • Invest in software quality and security solutions

    He gave some interesting color on tapeouts: “The number of active FinFET designs and tape-outs to-date again grew almost 15% in just the last quarter, to well over 200. Synopsys is relied on for approximately 95% of these, and our momentum continues, as more and more enterprises commit to FinFET and count on us for success.”

    I think you have to take the “relied on” with a grain of salt since I’m sure many of those tapeouts also “relied on” other vendors since big companies often have all the physical design tools and use whatever seems best for that part of the design.

    One thing that may be significant or not is: “Our custom design solution is also gaining strength, and in fact we successfully displaced the incumbent at a global medical device company, who is now using a complete Synopsys digital-and-custom flow.”

    Synopsys has tried for years to build a strong bridgehead in custom design with only limited success and Virtuoso has remained the incumbent almost everywhere.

    On to place and route. Here Synopsys claims that ICC2 is faster than any of the competition (including the newly announced stuff): “Customers report that IC Compiler II is dramatically faster than any tool on the market today, including next generation offerings touted by competitors.”

    Who knows what the reality is since those competitors claim speed advantage too. Again, the reality is probably that on some designs ICC2 is a lot faster and on other designs, not so much.

    Next up, verification: “where approximately 80% of advanced designs rely on the Synopsys solution as their primary simulator.”

    Again I’m not sure that “rely on” means what it says. I can believe 80% of advanced designs use Synopsys VCS but not that they do so exclusively.

    As to IP, Synopsys: “are the number one supplier of interface, analog, memory, and physical semiconductor IP”

    If RAMBUS isn’t counted (it doesn’t seem to be, and its business is multi-faceted, not just IP), then ARM is #1 and Synopsys is #2 in the IP market.

    The software quality and security space is built around the core of the Coverity acquisition and includes the recent Codenomicon acquisition. The group has been renamed and is now called the Software Integrity Group: “A year after acquiring and integrating Coverity, we’ve learned the following: Coverity was a great acquisition, a compelling combination of the familiar and the new, and a platform we can build on. Specifically, we acquired excellent technology, the expanded customer base, and a brand new TAM. In order to scale the operations to a grander level, it’ll take ongoing investments in sales and marketing, as well as in R&D… We expect the Software Integrity Group to be slightly dilutive in the second half of the year.”

    What they are saying is that they are going to invest more in the space at the expense of profitability. This got asked about in the Q&A: “when we originally acquired it, in our initial plan, we expect it to be breakeven in the second half of the year that – now, that we know much more and now that we’ve also decided to make some additional investments specifically in broadening the language coverage that is why we specifically communicated at the end of the second half it would be slightly dilutive. But it’s very small”

    Another thing asked in the Q&A was about the lawsuit with Mentor. Aart said that he could not say much but what he did say was: “I will minimize my comments on this, after the verdict of course there was a set of issues we have to deal with, the version that’s on the market today does not violate any of that. And so, we see that our business is doing well.”

    I’m not sure what that means about the installed base of Zebu boxes that the court found did infringe. But at least going forward things seem to be clean. Aart was asked how fast emulation was growing but he wasn’t biting on that bait: “we said that it’s doing very well. We don’t disclose the growth rates of individual product. They tend to go up and down, but I would reiterate that we think that we have a very strong emulation solution.”

    So all in all, a great quarter for Synopsys. No big surprises on the call that I noticed. A little bit of additional color on Synopsys’ overall business.


    Why does Apple do business with Samsung?

    by Daniel Nenni on 05-26-2015 at 10:00 pm

    The Apple and Samsung relationship is an interesting one. On one hand they have co-developed some of the most innovative products on the market today (iPod, iPhone, iPad, iWatch), yet they are fierce competitors in the mobile market. Some call this type of business relationship “frenemies”; others refer to the old Italian proverb “keep your friends close, but your enemies closer.” Personally I refer to it as “foundry business as usual.” Let’s take another look at the Apple/Samsung relationship and see if we can get a better picture of what is really going on here. This of course is based on my experience, observations, and opinions so feel free to correct me if I’m wrong, but I’m not.

    Apple became a chip company in the early 1990s with the assistance of VLSI Technology. This was using the ASIC business model where Apple could “toss” an RTL level design over to VLSI and have them deliver finished chips. The first chip was for Apple’s PDA, the Newton, which lost out to the much easier to use BlackBerry and Palm Pilot.

    The smartphone (iPhone) was the next device to usher in semiconductor design at Apple. In 2007 the first iPhone was powered by the APL0098 SoC designed by Apple and the newly created Samsung Foundry Division using the same ASIC business model that VLSI Technology pioneered. The first chip used Samsung’s 90nm technology which was one process behind TSMC’s 65nm that offered twice the gate density and a power reduction of up to 50 percent.

    The next two iterations of the Apple SoC were released in 2008 and 2009 using Samsung’s 65nm technology. At the same time TSMC was delivering 40nm chips with twice the density of 65nm and significantly reduced power requirements. In 2009, 2010, and 2011 Apple used Samsung’s 45nm, which delivered density and power requirements just below TSMC’s 40nm. In 2012 and 2013 Apple used Samsung’s 32nm process, but TSMC was already at 28nm, which again offered increased density and lower power. At the end of 2013 (iPhone 5+ and iPad Air) Apple used Samsung’s 28nm. Apple also ushered in the 64-bit smartphone with the iPhone 5s, beating industry SoC leader Qualcomm.

    For the iPhone 6 and iPad Air 2 in 2014, Apple switched to TSMC’s 20nm, which offered a 1.9x density and 25% power advantage over 28nm. The switch from Samsung Foundry to TSMC is a hotly debated topic, especially since Apple is now back at Samsung for the 14nm A9 to be released in September of 2015. According to analyst estimates, Apple paid Samsung $2.7 billion for chips in 2014, which is significantly lower than the $4.3 billion Apple paid Samsung in 2013. So yes, the Apple business is a very big deal for the foundries, absolutely.

    Apple claimed its semiconductor manufacturing independence with the 2008 acquisition of P.A. Semi and the 2010 acquisition of Intrinsity, which enabled them to move from the ASIC business model to the fabless semiconductor powerhouse they are today. If you want my opinion, which clearly you do if you are reading this, Apple bases the process technology decisions on technology and the ability to deliver said technology, simple as that.

    I know that Apple evaluated TSMC’s 28nm for the A6 and A6x SoCs, but since TSMC was the only foundry yielding at the time, TSMC’s 28nm pricing and capacity were in question. At 20nm, however, Apple wrote TSMC a very large check to get right-of-first-refusal and most-favored-nation pricing, which squeezed out competing SoC vendors (QCOM, MEDIATEK).

    At 14nm, Samsung developed an LP process specifically for Apple, which started risk production in Q4 of 2014, making it viable for the Apple A9 SoC (iPhone 6+) release in Q3 2015. The big shocker here is that Samsung released their own 14nm SoC (Exynos) for their flagship mobile device, the Galaxy S6, in the first half of 2015, beating everyone’s 14nm delivery expectations, including my own.

    TSMC was two quarters behind Samsung with their higher performance 16nm FinFET++ implementation which will be used in the lower volume Apple A9x SoC business for the iPad refresh in Q4 2015 (the A9 versus A9x volumes are reportedly 70% versus 30%). I also heard that Apple evaluated Intel Custom Foundry 14nm, but to no avail.

    10nm will be the next foundry battleground. Samsung and TSMC have both discussed taping out 10nm customer designs in the fourth quarter of 2015 which fits the timeline for Apple’s next product refresh using the A10 and A10x SoCs. Intel on the other hand has been very quiet which is not necessarily a good sign for the competition. Intel surprised the industry with 22nm FinFETs. Another 10nm surprise could certainly be in the making. My guess is that Apple will go to TSMC for 10nm but at this point it is just a guess.

    Bottom line: Today, Apple is clearly the most influential foundry customer worth billions of dollars in revenue annually. Apple’s regular product refresh is now driving the foundries harder than I have ever seen and that includes Intel and Samsung. Competition is what makes the fabless semiconductor ecosystem strong and who better than Apple to lead that effort?