
Diversity of chip segments had tempered downturns but no more?

by Robert Maire on 10-29-2015 at 12:00 pm

* The Current Downturn is more Broadly Based…
* Back to the “bad old days” of Lemmings off a cliff???
* It's not just a Foundry thing…
* All corners of the chip industry are impacted!!!


New CoreLink IP ties in mobile GPU coherently

by Don Dingee on 10-29-2015 at 7:00 am

A mobile GPU is an expensive piece of SoC real estate in terms of footprint and power consumption, but critical to meeting user experience demands. GPU IP tuned for OpenGL ES is now a staple in high performance mobile devices, rendering polygons with shading and texture compression at impressive speeds.

Creative minds in the desktop space long ago figured out that GPUs can be viewed as vector engines and put to work accelerating computational tasks that general-purpose CPUs grapple with. This was an ideal use case for the mobile space, with tasks like facial recognition, computational photography, and embedded vision growing in popularity.

There are a few problems, however. First is the programming model; CPUs and GPUs are radically different. To try to solve that, Apple and others got their arms around OpenCL, providing parallel constructs with the hope of getting heterogeneous processing units to work together. OpenCL made significant progress for many tasks.

The second problem is memory space – CPUs have theirs, GPUs have theirs, and betwixt is a performance problem usually solved by copying data between the two spaces. AMD and others brought HSA (Heterogeneous System Architecture) to the table, redefining the interface between CPU and GPU (or other execution units) around a shared memory space.

Which brings us to the third problem. Shared memory is fantastic, but real performance in a multicore CPU architecture means lots of cache, and with it cache coherence. Cache miss penalties can be brutal – especially on large files like images. Tossing GPUs into the processing mix without cache coherence may produce gains on very particular benchmarks. For greater gains and consistent performance, we need new IP that maintains coherence.

ARM has rethought their interconnect architecture, introducing two new IP blocks to bring in a new crop of fully coherent GPUs. We should mention here that ARM has a three-tiered product strategy for interconnect: a low-end CoreLink NIC for basic SoCs, a high-end CoreLink CCN for the AMBA 5 CHI server-class multicore crowd, and the mid-range where this announcement lives.


The new CoreLink CCI-550 is shown with six ACE interfaces, two for a big.LITTLE cluster and four for the GPU. This is the scaled-up configuration, offering up to 60% higher peak interconnect bandwidth than the CCI-500. The CCI-550 also scales down with fewer ACE and memory interfaces for more optimized solutions. The key feature of the CCI-550 is the integrated snoop filter, which forgoes the need to send all snoops to all processors, instead using one central snoop lookup. This lowers snoop latency, relieves what would otherwise be quadratic scaling, and removes speculative DRAM accesses.
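The scaling problem a snoop filter solves is easy to see in a toy model. The sketch below is illustrative of the general technique only, not ARM's CCI-550 microarchitecture, and the sharing fraction is an assumed parameter:

```python
# Toy model of snoop traffic: broadcast snooping vs. a central snoop filter.
# Illustrative only -- not a model of ARM's CCI-550 microarchitecture.

def broadcast_snoops(n_agents, txns_per_agent):
    # Without a filter, every coherent transaction snoops all other agents:
    # total snoops grow quadratically with agent count.
    return n_agents * txns_per_agent * (n_agents - 1)

def filtered_snoops(n_agents, txns_per_agent, sharing_fraction=0.1):
    # A central snoop filter tracks which lines may be cached elsewhere and
    # forwards a snoop only for genuinely shared lines (assumed 10% here).
    total_txns = n_agents * txns_per_agent
    return int(total_txns * sharing_fraction)

for n in (2, 4, 8):
    print(f"{n} agents: broadcast={broadcast_snoops(n, 1000)}, "
          f"filtered={filtered_snoops(n, 1000)}")
```

With eight agents, the broadcast model generates 56,000 snoops for 8,000 transactions, while the filtered model forwards only the few aimed at lines actually shared – the linear-versus-quadratic gap a snoop filter targets.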

Those DRAM accesses come through a new DRAM controller, the CoreLink DMC-500. Tuned for up to LPDDR4-4267, the DMC-500 ups memory bandwidth by 27% and drops CPU latency by 25%. These solutions have been qualified to work together, reducing integration issues.

There is also some intrigue over the GPU itself in this diagram. During our pre-briefing, ARM declined to provide details on the Mali “Mimir” GPU other than confirming new IP is in the works. My guess is to stay tuned; details should be announced at ARM TechCon, coming up in a few weeks. I also asked if other GPU vendors are working on coherent IP; ARM said only that they are currently sharing information under the auspices of the HSA.

Full press release for this announcement:

New ARM CoreLink System IP Provides the Foundation for Next-Generation Heterogeneous SoCs

Fully coherent ARM CPU/GPU combinations could get interesting, although the chances of something like the Quake-Catcher Network emerging on distributed mobile devices have an expensive metered 4G pipe sitting in the way. Still, removing the barrier of coherence for mobile SoCs means new algorithms taking full advantage of GPU compute power are up for grabs. This could also introduce an interesting dynamic for HSA and alternative CPU and GPU core solutions, beyond just an ARM offering.

More articles from Don…


My View of Contextual Leadership

by Sunit Rikhi on 10-28-2015 at 4:00 pm

“Go off and do something wonderful,” said Robert (Bob) Noyce, co-founder of Fairchild Semiconductor and Intel Corporation. He was heard, and the quality of our lives has been elevated by the wonderful pursuits of skilled pioneers. Notice that Bob did not say, “Go off and become great leaders.” This is because great leaders of the past did not set out to be that. They set out to do wonderful things.

Bob and others became known as great leaders by those who were awed by their achievements. Many of us studied their lives in order to extract recipes of leadership. With such recipes, we try to cook ourselves into leaders. But we don’t need to cook ourselves into anything. We just need to focus our attention on cooking a finger-licking good meal alongside other cooks in the kitchen. We will do well to get rid of our fascination with how to become a good leader. That is not the goal.

Defining our career aspiration
Bob is fondly and justifiably remembered as the mayor of Silicon Valley. Back in the 1980s, I was fortunate enough to witness Bob pondering alone or talking to people in the backyard of Intel’s Santa Clara building 9, on the same campus where we later built a monument in his memory and called it the Robert Noyce Building. I was too young then to have had a meaningful conversation with him personally. The closest I came to interacting with him was a few pleasant exchanges, one of which involved helping him light his cigarette on a windy day.

I wonder how Bob would have defined “wonderful.” I think he would have said that a wonderful thing is an impactful result beneficial to humanity. And after we have done a wonderful thing, I imagine he would advise us to do the next wonderful thing and to never stop.

In line with this thinking, I have harbored a career aspiration of increasingly impactful results that are beneficial to humanity and are achieved through methods which generate goodwill with people. I believe this is a pretty good definition of effectiveness as it encompasses results, positive impact and goodwill – all crucial components of effectiveness.

At its core, effectiveness is made of three basic elements: Crave, Act, and Care. Craving drives passion for impact and the courage to pursue it. Action applies the sweat and discipline necessary to achieve the desired impact. Care helps us achieve it in a way that generates good will.

By the time we enter the workforce, we either have these three basic elements of effectiveness or we don’t. I don’t know of a way to teach an adult how to crave, act, and care.

I do think it is possible to optimize the balance of these elements when they exist in varying degrees, however. When you see the right balance in a person, you are witness to a powerful force fueling a limitless spiral of effective contributions from that person.

When you next meet with yourself, ask yourself these three questions: Do I crave? Do I act? Do I care? If the answer to these questions is an honest yes, go off and do something wonderful! And never stop. Humanity will thank you for it.

Enabling behaviors of effectiveness
Beyond the three elements of effectiveness lie three essential behaviors, each of which can be learned and honed through practice: Lead, Follow, and Collaborate. Leading is the act of pointing to a wonderful thing, initiating the journey to it, and directing and guiding along the way.

Following is to understand what is being pointed to and to do our part under guidance on the journey to it. Part of following is to lead the leader with one’s questions and vigilance.

Collaborating is the act of co-envisioning and co-creating a wonderful thing. Collaborating melds leading and following together into a beautiful dance. The dance fully leverages the motivations, knowledge and experience of participants as they hand the baton of leading to each other in real time.

Contextual leadership
Contextual leadership is knowing when and how to lead, follow and collaborate. What we do and when we do it is based on contexts at various levels.

The first, and the most important, context is the individual context. For each individual involved, this context includes the attributes of functional role, mandate, capability and career motivation for that individual. These attributes, taken together and applied to the topic at hand, determine the placement of leading or following batons in their hands for that particular topic. This does not imply that an individual’s behavior is constrained by these attributes. The attributes’ impact can be managed and the attributes themselves change, but it is important to understand them to set the level of empowerment an individual exercises in any situation. Additionally, it is important for us to know not only our own context but also that of our colleagues. And for that, curiosity and strategic disclosure in a trust-cushioned environment are essential.

Next is the topical context. The topic in a discussion, the problem being solved, or the new paradigm under creation demand very specific contributions for their own success. The intersection of this context with the individual context, when understood well by all involved, leads to the gliding ease with which an individual takes and gives the leading and following batons.

Finally, there is the environmental context. The attributes of this context include trends, constraints and opportunities in the socio-political, enterprise and market environments. A good understanding of this context shines the right light on the topical context to determine the nature and timing of wonderful pursuits. We need to know when to follow hard constraints and when to lead and collaborate on opportunities to bust current paradigms.

Wonderful results, in my experience, show up when creators crave, act and care. When they do, they gracefully take turns leading, following, and collaborating as directed by context.

Happy pursuits!

Sunit Rikhi
www.ReachforInfinity.com


Delivering Zero Defect Products – EVS Testing

by Mark Rioux on 10-28-2015 at 12:00 pm

Competition in the semiconductor product marketplace has grown increasingly difficult as suppliers constantly search for ways to differentiate their products. Customers expect low cost, problem-free product performance. Automotive manufacturers in particular expect zero defects as field failures can prove very costly.

Since no manufacturer is currently capable of producing zero defect products, this goal would seem unrealistic. However, the customer expects zero defects in the products they receive, not necessarily in the products as produced. This suggests that while defect reduction remains important, effective screening of latent defects is paramount.

Elevated voltage stress (EVS) testing provides a cost-effective approach to screening out defects. Burn-in testing is used by some manufacturers to detect and remove latent defects that pass undetected through the standard electrical test methods employed at the supplier’s wafer sort and final test operations. However, burn-in testing is extremely expensive and takes days, if not weeks, to complete.

Many latent defects act to thin dielectric isolation between two conductors in the semiconductor device, such as in the gate oxide or intermetal dielectric. Since dielectric strength of these films is well understood, EVS testing can be used to detect and screen these defects to ensure surviving units last as long as needed in field use.

Consider the figure above. Two types of defect populations are plotted in Weibull format. Population type 1 maintains straight-line behavior from the first failure to the last, with all defects occurring due to device wearout. These defects are not consequential since wearout occurs well beyond the useful life of the product. Population type 2, on the other hand, has multiple slopes with many extrinsic defects. These defects are cause for concern. The defects located in zone A fail at low charge levels and are likely removed during standard wafer sort or final testing at the component supplier. Those defects shown in zone B, conversely, would likely survive internal testing and possibly fail in field use as the cumulative charge (Q) increases.
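The straight-line behavior described above comes from the standard Weibull linearization: plotting ln(−ln(1−F)) against ln(t) turns a Weibull distribution into a line whose slope is the shape parameter β. The sketch below uses made-up parameters, not data from the article's figure; it recovers β for a single wearout population, while a type-2 population would show two distinct slopes on the same axes.

```python
import math

def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull CDF: F(t) = 1 - exp(-(t/eta)^beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def linearize(t, f):
    """Weibull plot transform: returns (ln t, ln(-ln(1 - F)))."""
    return math.log(t), math.log(-math.log(1.0 - f))

# A single wearout population (beta > 1, here beta = 3) plots as one
# straight line whose slope is beta.
beta, eta = 3.0, 1000.0
points = [linearize(t, weibull_cdf(t, beta, eta)) for t in (100.0, 500.0, 2000.0)]
(x0, y0), (x1, y1) = points[0], points[-1]
slope = (y1 - y0) / (x1 - x0)
print(f"recovered shape parameter: {slope:.2f}")  # prints 3.00
```

An extrinsic defect population shows up on such a plot as an early region with slope β < 1, which is exactly the zone A/zone B signature the article describes.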

To screen zone B latent defects, EVS testing can prove effective. The governing equation illustrating the role of EVS testing appears below.

In this example, the standard test voltage used at wafer sort is 5.2 V and the EVS test voltage is 10 V. As indicated, if a 3-second EVS test is applied, the corresponding field lifetime is 189,467 years! Of course, testing at such an elevated voltage may not always be possible, since other circuit devices may be damaged, but the potential benefits of EVS testing are clear. Lower voltages and/or shorter test times can be employed to achieve the desired screening effect under more practical conditions.
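The quoted numbers are consistent with an exponential voltage-acceleration (E-) model of dielectric breakdown, t_field = t_stress · exp(γ·(V_stress − V_field)). The sketch below illustrates that model; it is not the article's actual governing equation (which appears as an image), and the acceleration factor γ ≈ 5.9/V is an assumed value chosen to reproduce the 189,467-year figure.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

def field_lifetime_years(t_stress_s, v_stress, v_field, gamma):
    """Exponential E-model: t_field = t_stress * exp(gamma * (Vs - Vf))."""
    return t_stress_s * math.exp(gamma * (v_stress - v_field)) / SECONDS_PER_YEAR

# gamma (in 1/V) is an assumed dielectric acceleration factor; real values
# are film- and process-specific. 5.9/V reproduces the article's example.
years = field_lifetime_years(t_stress_s=3.0, v_stress=10.0, v_field=5.2, gamma=5.9)
print(f"equivalent field lifetime: {years:,.0f} years")  # roughly 190,000 years
```

Re-running with a lower stress voltage shows how quickly the equivalent field lifetime collapses, which is the trade-off behind choosing less aggressive EVS conditions.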

Follow the adventures of SemiWiki on LinkedIn HERE!


Do 8 Cores Really Matter in Smartphones?

by Amit Sharma on 10-27-2015 at 4:00 pm

As the smartphone industry has begun to mature, one-upmanship among smartphone manufacturers and SoC vendors has bred a dangerous trend: ever-increasing processor core counts and an assumed link between higher CPU core count and greater performance. This association originated as SoC vendors and OEMs tried to find ways to differentiate themselves from one another through core counts. Some vendors are creating confusion, as phones today have core counts from 2 up to 8 and vary wildly in performance and, even more importantly, experience. One reason for this confusion is that many users and reviewers have used inappropriate benchmarks to illustrate smartphone user experience and real-world performance. As a result, we believe that some consumers are misled in their buying decisions and may end up with the wrong device and the wrong experience.

The 8 Core Myth…
The 8 Core Myth, also known as the Octacore Myth, is the perception that more CPU cores are better and having more cores means higher performance. Today’s smartphones range from 2 cores up to 8 cores, even though performance and user experience are not a function of CPU core count. The myth, however, will not be limited to 8 cores, as there are plans for SoCs with up to 10 cores, and we could even see more in the future.

Not All Cores Are the Same
In some phones, users are getting Octacore designs with up to 8 ARM Cortex-A53 cores. These 8 cores perform differently than 4 ARM Cortex-A57 cores paired with 4 ARM Cortex-A53 cores in what is called a big.LITTLE configuration. Core designs vary wildly from ARM’s own A53 and A57 64-bit CPUs to Intel’s x86 Atom 4-core processors to Apple’s 2-core A8 ARM processor. All these processors are designed differently and behave differently across application workloads and operating systems. Some cores are specifically designed for high performance, some for low power. Others are designed to balance the two through dynamic clocking and higher IPC (instructions per clock). As a result, no two SoCs necessarily perform the same when you take clock speed and core count into account.

Through the different benchmarks, tools, and applications, we showed that CPU core count in a modern smartphone is not an accurate measurement of performance or experience. More CPU cores are not always better. We do acknowledge that having many smaller cores is one way to simplify power management, but these tests are not focused on power; they are focused on performance and user experience.

CPU core counts are not the way that phone manufacturers or carriers should be promoting their devices. CPU core count is only one factor in Android when the SoC has fewer than 4 cores. The marketing of core counts as a primary driver of performance and experience must end and be replaced with improved benchmarking practices and education.

Hopefully this will be the start of a meaningful discussion in the comment section…




To err is runtime; to manage, NoC

by Don Dingee on 10-27-2015 at 12:00 pm

Software abstraction is a huge benefit of a network-on-chip (NoC), but with flexibility comes the potential for runtime errors. Improper addresses and illegal commands can generate unexpected behavior. Timeouts can occur on congested paths. Security violations can arise from oblivious or malicious access attempts.

Runtime errors also tend not to happen in isolation, especially if the first error in a sequence goes unmitigated. If there are natural causes such as congestion, further errors are likely to pile up as operation continues. For unnatural causes such as a malicious app, small errors can be a precursor to larger exploits. A chain of runtime errors can eventually render part or all of a SoC unable to function.

Not all errors are created equal. Many errors simply happen silently, producing an incorrect response but otherwise going undetected. Others are seen but not acted upon. Depending on the source and severity of the error condition, recovery might be possible, or it might be prohibitively expensive in terms of extra gates and layers of software. The last resort is the dreaded hardware reset, an increasingly archaic response that irritates users to no end.

Without the right NoC infrastructure, even the first few phases of error management are difficult, making simple errors hard to handle. In architectures such as automotive and the IoT, where real-time and safety-critical operation becomes more important, error management is taking on more importance in SoC design. With the right NoC architecture, built-in features make robust error management easier.


There are five phases in error management: detection, aggregation, logging, reporting, and recovery. In the SonicsGN architecture, detection starts with configurable initiator agents and target agents. A transaction begins at an initiator, flows through routers, is received at a target, and is acknowledged with a response that flows back to the initiator. Each agent has what amounts to a watchdog timer, looking at four situations: burst failure, target flow control, return ack fail, and initiator flow control.

Other types of in-band errors can occur. Each initiator agent has a map of the targets it is permitted to reach; an access attempt can fall into an address “hole” in the map, or might be trying to access a powered-down domain. An initiator agent might see an unsupported command, a target agent might see an access violation, or both might report some type of safety error (as in a firewall, or what Sonics terms a protection mechanism). Another common error is the out-of-band variety, such as a violation of the AXI non-modifiable burst rule. When possible, errors are handled at the initiator agent to minimize network traffic.

The SonicsGN agents detect, aggregate, and log errors – but what happens then? Reporting is configurable, with responses ranging from simple in-band messages to sideband techniques up to processor interrupt. One interesting scenario is an attack on a sensitive IP block. It may be futile to report those errors back to the initiator, who would be generating the attack entirely on purpose. Recovering errors is also up to the customer. Software can go into the agents and sweep the error logs, looking at different classes of severity and frequency, then decide what to do.
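The sweep-and-decide recovery software described above might look something like the sketch below. The error kinds, severity policy, log format, and thresholds here are all hypothetical, since the actual SonicsGN error-management microarchitecture is under NDA per the article:

```python
# Hypothetical error-log sweep for NoC agents. Error kinds, thresholds,
# and recovery actions are illustrative -- not the SonicsGN interface.
from collections import Counter

def sweep(agent_logs, violation_threshold=3):
    """Classify logged errors and pick a recovery action per agent."""
    actions = {}
    for agent, entries in agent_logs.items():
        counts = Counter(e["kind"] for e in entries)
        if counts["access_violation"] >= violation_threshold:
            actions[agent] = "quarantine"   # repeated violations: possible attack
        elif counts["timeout"] > 0:
            actions[agent] = "retry"        # timeouts: likely congestion
        else:
            actions[agent] = "log_only"     # low severity: record and move on
    return actions

logs = {
    "gpu0": [{"kind": "timeout"}],
    "dma1": [{"kind": "access_violation"}] * 4,
}
print(sweep(logs))  # {'gpu0': 'retry', 'dma1': 'quarantine'}
```

The point of structuring it this way is that policy lives entirely in software: the same logged data can drive anything from simple logging to taking an agent offline mid-attack.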

The point is customers can use the SonicsGN capability to engineer as little or as much error management into their product as needed. Much of the original work on NoC error management was done in conjunction with TI on various OMAP family members, and Sonics has a detailed error management microarchitecture (under NDA).

There are always tradeoffs. For a fully certifiable, safety-critical system, the investment in both hardware and software for a SoC with robust error reporting and recovery in some scenarios may be well worth it. Even for less hardened systems where recovery might be expensive in silicon, the ability to recognize and report suspicious activity could be instrumental in IoT and other applications. Imagine an IoT edge device that could tell the provisioning system it is being hacked and going offline – while the attack is in progress, rather than after the fact when bad data has propagated all over the network.

To me, this seems like the early days of the Internet, when IT types were looking through logs of traffic from routers, firewalls, packet shapers, load balancers, and other appliances looking for who was trying to do what to whom. The difference is now it is all happening within a single chip running a NoC. Without the type of visibility SonicsGN provides, errors could easily run out of control all over a chip – and users would never know until it was too late. With the error management capability in SonicsGN, SoC designers have a lot more control.

More articles from Don…


Are FinFETs too Expensive for Mainstream Chips?

by Daniel Nenni on 10-27-2015 at 7:00 am

One of the most common things I hear now is that the majority of the fabless semiconductor business will stay at 28nm due to the high cost of FinFETs. I wholeheartedly disagree, mainly because I have been hearing that for many years and it has yet to be proven true. The same was said about 40nm since 28nm HKMG was more expensive, which is one of the reasons why 28nm poly/SiON was introduced first.


SpyGlass World 2015 User Group Meeting

by Bernard Murphy on 10-26-2015 at 4:00 pm

I attended SpyGlass World this week – to give you an update, to catch up with old friends, including users, and to meet some of the new (to me) players from the Synopsys side of the event. The event was held in the United Club at Levi’s Stadium, just like last year. I don’t know if this will continue; merging the SpyGlass User Group into SNUG would be logical. Also, attendance wasn’t as strong as last year, perhaps because no one expected significant news such a short time after the merger. The marketing guys confirmed it was indeed too soon to share well-developed merger plans. I’ll spare you a blow-by-blow for the event and will focus just on a few presentations that captured my interest. The detailed schedule is here. As always, it’s a lot more useful to learn about tool usage from real users than from a product company, and each of the presenters delivered.

Philippe Magarshack, CTO of ST, gave an enlightening keynote on IoT and FD-SOI. Some highlights:

  • For an IoT example, he mentioned a UK telematics-based insurance company called “Drive like a Girl”. You plug a device into your car which tracks your driving habits. The company uses that information to adjust your insurance rates. Not a new idea but interesting for the company name!
  • Claims FD-SOI has better resilience to soft errors (radiation-induced errors) than FinFET. This is interesting not just for obvious applications like satellites but also in cars, which demand higher reliability standards than found in consumer electronics. The claim is that FinFETs have to compensate with logic triplication (with higher area, cost and power), which is not necessary with FD-SOI.
  • To help jumpstart the nascent IoT market, ST is incubating startups in Grenoble; this also helps ST tune their products to real applications. An example is sevenhugs, an application which monitors sleep habits and adjusts temperature and humidity to improve sleep.

Vitor Antunes from the (Synopsys) DesignWare group gave a presentation on the group’s use of SpyGlass:

  • Has been used for internal quality checking since 2012
  • Became the default choice for DW customer validation of configured cores, through coreConsultant, since 2014
  • Today run CDC and Lint, but plan to extend this to other Guideware checks over time
  • The DW group aims to stick close to the SpyGlass GuideWare 2.0 ruleset. This should be a hint to end-users still custom-blending their own SpyGlass rulesets that the need to be different is looking increasingly difficult to defend.

Nathan Hsiung from Broadcom explained their use in validating CDC correctness in large networking chips. What I found especially interesting was their use of the hierarchical CDC flow. Designers generally do everything they can to avoid hierarchical verification flows anywhere – in CDC, timing, you name it. Nathan offered several reasons for why they went hierarchical.

  • They had no choice. They build huge networking chips – many hundreds of millions of gates. Flat CDC analysis on designs this size would take too long to complete.
  • Hierarchical approaches allow running top-level CDC verification many more times
  • Analysis is much simpler when you follow a bottom-up flow. If you run everything flat, you are deluged with warnings and errors with no obvious place to start debug. If you clean up bottom-up, each stage of debug is manageable.
  • They checked carefully that abstracted models used in higher-level runs do not make unreasonable assumptions, which gives them high confidence that those higher-level runs are not masking potential problems.

Finally, Michael Sanie from Synopsys presented the Synopsys Verification vision. This is worthy of a separate blog, so I won’t detail it here. You can learn more about the SpyGlass products, now on the Synopsys website, HERE.

More articles by Bernard…


GlobalFoundries 14nm Process Update

by Scotten Jones on 10-26-2015 at 12:00 pm

Last Monday Daniel Nenni and I had a conference call with Jason Gorss and Shubhankar Basu of GlobalFoundries to get an update on their 14nm process. Shubhankar is the product line manager for 14nm.

GlobalFoundries’ 14nm process is a FinFET-on-bulk process licensed from Samsung, and both companies supply the same process, although, as Shubhankar pointed out, they have different targets for it, especially in light of Global acquiring IBM’s chip business.

The 14nm process is run in Global’s Fab 8 in upstate New York. The 14LPE process was the first generation and was qualified in January. A second-generation 14LPP process was qualified in September. They are now shipping 14nm parts to customers.

Shubhankar said that Global is succeeding at getting customers to design for its 14nm process and isn’t just a “second source”. In the mobility space, a lot of consumer parts need high performance. Global has a huge IP library for LPE and LPP, and it is having success in mobile diversifying its customer base.

14LPE and 14LPP share the same design rules, and most of the equipment is the same. 14LPP offers a 10% to 14% performance boost over 14LPE. The Back-End-Of-Line (BEOL) is the same, but 14LPP has some transistor enhancements. I asked about the transistor enhancements and Shubhankar said he couldn’t give specifics. I mentioned enhancements such as taller fins. Shubhankar would only comment that you can make geometry enhancements and you can reduce parasitics by tailoring things such as implants.

My analysis of his comments is as follows: He did say the pitches are the same so my guess would be a combination of taller fins and implant adjustments. This would suggest to me that manufacturing costs aren’t very different for LPE and LPP. Taller fins would require a longer etch and likely have some yield impact but I would expect the costs to be similar, say within 10% (just my opinion). I also think this is basically what TSMC did with 16FF and 16FF+, 16FF+ is a tuned version.

The production qualification threshold is greater than 60% yield on a 128Mb SRAM. Yields on LPP are now more than 20 points higher than that (>80%), and LPE is ahead of that.

Daniel mentioned that processes used to be performance first but are now mobile-power first. He asked how the FPGA and processor guys get what they want.

Shubhankar notes that FinFET changes the game: performance is so much better versus planar that it is a no-brainer. Further, the 3D FinFET structure has much lower leakage than planar (it is fully depleted). Their IP is also characterized for high performance.

Shubhankar believes 14nm will be a long-lived node; there is a lot more to be gotten out of it, and they aren’t standing still. I asked him if this would be like what we see at 28nm, where companies such as TSMC have HP, HPL, HPM, LP, HPC and other variants. He said they would continue to tune performance and cost, and that tier-two and even tier-three customers are adopting the process.

I asked how they segment 14nm versus the 22nm SOI family Global recently announced. Shubhankar said that certain IoT applications that are mid-spectrum or at the lowest end of mobility are still on 28nm and reluctant to move to FinFETs. 22nm SOI is an intermediate process and can be pushed close to FinFET performance. You can also run 22nm SOI at 0.4 volts, and 14LPP is not ready for that space yet.

In terms of cost a 22nm SOI wafer is less expensive than a 14nm FinFET wafer but die cost depends on how much shrink you can get. Some die will be cheaper in 22nm SOI and some die will be cheaper in 14nm FinFET if you get enough die size shrink. If you need the longest battery life and performance is less important 22nm SOI wins, if you need maximum performance 14FF wins.
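That wafer-cost-versus-shrink trade-off can be sketched with basic die-cost arithmetic. All the numbers below (wafer prices, die area, yield) are illustrative placeholders, not GlobalFoundries figures:

```python
# Break-even sketch: a pricier 14nm FinFET wafer can still yield cheaper
# die than 22nm SOI if the design shrinks enough. All numbers are
# illustrative placeholders, not GlobalFoundries pricing.

WAFER_AREA_MM2 = 70685  # gross area of a 300mm wafer (pi * 150^2)

def die_cost(wafer_cost, die_area_mm2, yield_fraction):
    dies_per_wafer = WAFER_AREA_MM2 // die_area_mm2
    return wafer_cost / (dies_per_wafer * yield_fraction)

cost_22 = die_cost(wafer_cost=4000, die_area_mm2=100, yield_fraction=0.85)
# Assume the FinFET wafer costs 60% more than the 22nm SOI wafer.
for shrink in (0.9, 0.7, 0.5):
    cost_14 = die_cost(wafer_cost=6400, die_area_mm2=100 * shrink,
                       yield_fraction=0.85)
    winner = "14nm FinFET" if cost_14 < cost_22 else "22nm SOI"
    print(f"die shrunk to {shrink:.0%} of area: {winner} is cheaper")
```

The break-even shrink shifts with the assumed wafer-price premium, which is exactly the "die cost depends on how much shrink you can get" point Shubhankar made.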

Daniel commented that Qualcomm and others are doing server chips. Will a foundry do a very high performance process for server chips? This led to a discussion about the IBM chip business acquisition and whether IBM’s 14nm FinFET-on-SOI process will be available to outside customers. Global is committed to supporting IBM’s SOI technology for 10 years, but beyond that they can’t comment on IBM technology plans, although they did say they think it is a game changer.

My analysis: An interesting thing here is IBM’s 14nm FinFET on SOI process is a server process with embedded DRAM for very large on-chip cache. This could potentially be an interesting process for very high performance applications if Global could or would offer it externally. Once again this section is just my opinion, they wouldn’t comment on this.

Daniel also commented that he thinks 10nm will be kind of a short node like 20nm because 10nm and 7nm will use the same equipment (the same way that 20nm and 16nm used the same equipment).

More information HERE.
