
Will the Apple A9 Fall Flat?

by Robert Maire on 01-23-2015 at 12:00 pm

Several months ago we voiced concern that Apple’s A9 processor would wind up being 20nm planar (maybe 14nm planar) rather than the expected 14nm FinFET. With fewer than 9 months to go before the likely launch of Apple’s next-generation iPhone, getting a 14nm FinFET processor on board the phone in time looks much more difficult. The generally held expectation is that the A9 will be 14nm FinFET, following closely on the heels of Intel’s 14nm FinFET release last year, and a significant upgrade from the A8, which itself was a huge uptick in transistor count, density and overall performance over the prior A7 and helped make the iPhone 6 a significant hit.

Also Read:
The TSMC iPhone 6!

The math doesn’t add up…
If we assume a September rollout and work back from there, adding up the time to produce the processors, test them, ship them in volume, solder them onto circuit boards and assemble them into phones, we are likely talking about volume production of the A9 by the end of the June quarter. Given the ongoing news about slow spending by TSMC and Samsung and the recent delay at GloFo, it seems hard to put together enough capacity at a high enough yield in the time left (maybe 5 months at best) to satisfy the rollout of a new Apple phone and its associated volumes. While we wouldn’t rule it out completely, it seems increasingly difficult to get 14nm up to snuff in time without a huge risk, one that Apple is not likely to want to take given the potential embarrassment and fallout of supply issues.

KLAC is a leading indicator…
Last night’s lackluster guidance for foundry spending in H1 2015 continues and underscores the sluggish rollout of 14nm FinFET at the foundries. Remember that 14nm FinFET stumped even the great Intel, so it’s no surprise that it has slowed everyone else down as well. It’s hard to get the 14nm process up to yield without yield management tools in significant volume. Right now a ramp in mid-2015 looks dubious.

A “6S” would fit Apple’s “tick tock” pattern…
Much like Intel’s “tick tock” practice of “shrink & exploit”, Apple seems to come out with a more significant upgrade on every other iPhone model. The iPhone 6 was a biggie, which suggests that the one this fall will be less significant. This would seem to imply that the next model will be a “6S” rather than an “iPhone 7”.

20nm capacity could fit the situation…
Both Samsung and GloFo have nicely working 20nm capacity, with GloFo supplying Qualcomm out of Malta. The news reports over the last several months point to Samsung winning the lion’s share of the A9, along with sidekick GloFo and a potential TSMC chaser (already building the A8 at 20nm). We found those news reports curious, as it seemed to make little sense for Apple to commit so early in the process. However, those reports make more sense if Apple gave up on the thought of going to 14nm FinFET in time for September and instead settled on the safe 20nm capacity that is readily available today.

Also Read: Who will Manufacture Apple’s Next SoC?

This would also help explain why the foundries don’t appear to be in much of a hurry over 14nm spending, if the A9 deal is already done with existing technology and capacity. Furthermore, it would support the backfilling of 28nm and 20nm capacity that has been talked about. Though there is the potential of 14nm planar, we don’t think that is a likely scenario (though stranger things have happened). The pieces all seem to fit together….

Core Wars…
Recently there has been a lot of buzz about the number of cores in the processors of Android phones, which are now touting 8-core designs. Maybe rather than a 14nm shrink and shift to FinFET, Apple could stick with 20nm planar and increase the die size a bit to squeeze in more cores? Though Apple probably does not want to be seen as following Android, there may not be a choice here. Obviously the Apple OS would have to be capable of using more cores (something we have no real clue about). It could be, but we just don’t know… but it makes some sense.

Apple counting on things Big and Small???
It may be that Apple is counting on the iWatch and the foot-long iPad Pro to carry the momentum in 2015 rather than an iPhone refresh. Logically this seems to make sense, as the iWatch will likely roll from the spring into the fall holiday selling season as 2015’s holiday gift idea, while the iPad Pro attacks the business market. If this is the case, it may take the pressure off needing a big iPhone refresh in 2015. Better to wait until 2016 for the iPhone 7 and a step up in processor power.

Slowing Moore’s Law forces choices…
It feels to us that 28nm was the last “good node”, as per-transistor cost increased from there in its first upward excursion ever after the long downward curve of Moore’s Law. 20nm has been OK, albeit with increasing costs due to multi-patterning. But 14nm FinFET looks to be a major cost dislocation, causing a significant jump in wafer and per-transistor costs that will set the industry on its ear and cause heartache. Most of the blame can be laid at the feet of the delay in EUV and next-generation litho improvements, which would allow shrinks without as much of the multi-patterning we are now facing. While not the only issue in the continuation of Moore’s Law, it is clearly the core culprit.

What we heard from KLAC last night, that actinic, “at wavelength” mask inspection will not be available until 2020, underscores the view that EUV will not be ready for high-volume manufacturing for another 5 years and into the 7nm node, forcing more pain again at 10nm. (Don’t cry for ASML, as they are more profitable with current tool sales and EUV delays.) End users such as Apple and Qualcomm will have to deal with the previously reliable cadence of Moore’s Law slowing down, and figure out how to roll out new and better products to an ever more discerning consumer base that always wants the next great thing.

Robert Maire
Semiconductor Advisors LLC


How the iPhone Ended Nokia’s Reign!

by Daniel Nenni on 01-23-2015 at 7:30 am

The origin of ARM’s success in mobile phone space is largely traced to Symbian’s decision to exclusively support the ARM Instruction Set Architecture (ISA). This in turn was the consequence of a mid-1990s decision by Texas Instruments to use ARM in its mobile phone ASICs for Nokia, the driving force behind the inception of the Symbian smartphone project.

When the GSM cellular standard was about to enter the commercial arena, TI’s Gilles Delfassy sat in a sauna with executives of Nokia, then a troubled conglomerate, and agreed on a DSP-centric approach to build the upcoming digital cell phones. Digital signal processors or DSPs, which later became the foundation of TI growth, were developed unnoticed at its European division until this meeting took place in Helsinki in 1992. By sealing a business pact to supply specialized chips for Nokia’s cellular products, Delfassy placed TI’s DSP technology squarely in the middle of the emerging GSM products.

What happened next at TI was reminiscent of Nokia’s own blossoming from a messy electronics giant into a telecommunications specialist. TI had just about sewn up the mobile handset silicon market by devoting vast engineering resources to Nokia for the development of platforms based on its chipsets. On the other hand, the transformation of Nokia from a Victorian-era industrial conglomerate to a wireless powerhouse was a Finnish fable in its own right.

Also Read: New book untangles the Internet of Things (IoT)!

Fast forward to 2010 and the Nokia fairy tale had come down to earth. What happened to one of the most celebrated corporate champions from tiny Finland? According to Henry Blodget, former research analyst and founder of news blog Business Insider, the iPhone happened.

How did the Finnish mobile phone giant reach this crossroads? Is Nokia the next Kodak? A new book chronicles Nokia’s lost decade in which the venerable handset champion found itself in the clutches of a vicious cycle. “Nokia’s Smartphone Problem: The End of an Icon?” delves into one strategic blunder after another to provide a vivid account of this tale of management indecision. It provides a riveting look at how this comedy of errors took one of the world’s most global companies to a near-death experience.

“Nokia’s Smartphone Problem” is written to educate and inform managers in the IT, wireless, semiconductor, and consumer electronics industries. It’s a groundbreaking book that examines the past, present, and future of Nokia and the smartphone business at large to find the pertinent answers regarding smartphone product development cycles. That translates into a detailed treatment of the smartphone industry’s business models and basic building blocks like hardware, operating systems, apps, and ecosystems. And that makes the book a must-read for managers tasked with formulating a mobile strategy for their businesses.

The Nokia story is engulfed in a plethora of misconceptions. A lot of information about the mobile phone pioneer is cluttered, and a number of facts are out of place. “Nokia’s Smartphone Problem” aspires to clear the air, develop a comprehensible picture, and thus set the record straight. Nokia is no longer the master of the mobile game, but it is still an important company. The book digs deep into Nokia’s heritage, strategy blunders, major stumbling blocks, and bailout efforts. That way, it attempts to collect notes from this epic moment in Nokia’s life and create an authentic document that not only recounts Nokia’s breathtaking transformation, but also provides a discourse on the Finnish company’s turnaround plan.

The book was first published in May 2013 at the height of Nokia’s chaotic relationship with Microsoft. The second edition of Nokia’s Smartphone Problem, published in October 2014, covers Nokia’s formal exit from the smartphone business while Microsoft takes over its mobile phone unit to carry on with its unfinished business of reinvigorating the Windows-based smartphones.

The book takes a microscopic look at Nokia’s turbulent relationship with Microsoft and provides an insider look into Nokia’s multi-layer tie-up with the Redmond, Washington based software giant. It further reconstructs how Nokia is aiming to reinvent itself in the mobile infrastructure business.

The book also argues that chipmakers, a crucial part of the smartphone value chain, wouldn’t want the market to go polarized between Apple’s iPhone and Samsung’s Android handsets. Semiconductor firms are an important source of smartphone innovation and they have a crucial stake in the mobile game.

Nokia’s Smartphone Problem features 20 images to highlight defining moments in the company’s smartphone and post-smartphone era. The book is available on Amazon and Barnes & Noble in both paperback and e-book formats.


Windows on a TV

by Daniel Payne on 01-23-2015 at 12:00 am

This month I upgraded my TV at home with a 40″ LED set from Samsung, Denon AV receiver and Samsung Blu-ray player. Also being a Google fan I bought a Chromecast device.




At CES there were multiple announcements from Intel, and one that caught my eye was the Intel Compute Stick because it reminded me of the Google Chromecast device by plugging into a TV set.

This consumer electronics area is filled with devices from many manufacturers that connect to a TV, and Intel wants to offer us Windows 8.1 apps on a TV with this new Compute Stick. Convergence between the Internet and TV has been quite the rage for years now. It’s hard to compete with Chromecast because it is priced at $35.00 and I got it on sale at Best Buy for just $29.00, and then Google sweetens the offer by giving out a $20.00 Google Play credit, so in reality I paid only $9.00 for my Chromecast device.

Here’s what’s inside of the Intel Compute Stick:

  • Quad-core Atom Processor
  • 2 GB of RAM
  • 32 GB of storage
  • MicroSD support
  • WiFi
  • Bluetooth 4.0
  • USB connector
  • Mini-USB for power
  • Windows 8.1 or Linux

With such a device connected to a TV you could:

  • Browse the web with Bing
  • Social networking
  • Stream content: Netflix, Hulu
  • Play games
  • Run Windows Remote Desktop

Details are still sparse from Intel at the moment, but the retail price is set at $149, with the actual product release coming later this year. I can see that geeks will be more interested in Linux than Windows 8.1, while most consumers will opt for the Windows version because it is most familiar. The Linux version of the Intel device will cost just $89 and comes with 8 GB of storage and 1 GB of RAM. The Compute Stick even reminds me of the popular Raspberry Pi computer aimed at hobbyists and DIY makers, as they both run Linux and have an HDMI connector.

Related: ARM + Broadcom + Linux = Raspberry Pi

To really use Windows 8.1 on a TV would require a Bluetooth mouse and keyboard combination. I look forward to the first shipment of the Intel Compute Stick, and I plan to try one out at my local Best Buy store.


Tracing Insight into Advanced Multicore Systems

by Pawan Fangaria on 01-22-2015 at 7:00 am

Don Dingee’s article “Tracing methods to multicore gladness”, based on the first part of the Mentor Embedded multicore whitepaper series, explained the challenges involved in validating multicore systems and the domains of system- and application-level tracing. It’s now time to take a deeper look at what has to be considered and done to trace a multicore system and its application software effectively.

Software tracing can be based on static instrumentation or dynamic breakpoints, while hardware-based tracing uses probing technology that needs extra hardware. Technical merit is not necessarily the deciding factor in choosing a tracing approach; rather, strategic decisions need to be made, often with a combined approach, depending on the target system and its architecture. The parameters to be considered in the trade-off include intrusiveness, performance, capacity, granularity and the availability of hardware and software resources for the trace infrastructure.

The tracing cycle consists of certain pertinent steps that need to be performed with the trace event data; the steps include collection of data, import of data into an analysis and visualization tool, analysis, exploration and post-processing of the collected data, and deficiency identification and improvement. The event data collection includes trace instrumentation, configuring tracing options, start and stop of running applications on the target system. The collected data is loaded into a trace analysis/visualization tool on a host system, where it generally is transformed and optimized into manageable visual representation, according to the tool’s preference.

Mentor’s Sourcery Analyzer supports trace event data generated by the LTTng framework. It also extracts symbolic names from the binary image of the user application that has been traced and maps these to trace event data, thus enabling the look-up of source code from trace data. Sourcery Analyzer supports the import of custom trace event data into its native event format by text importer facilities and JavaScript scripted data imports.

The analysis tools abstract data from the raw trace event data in user understandable form such as function hit count, heat map, and top time-in function chart as shown above. In order to extract the resource usage per CPU in a multicore system, an analysis is required that is able to deliver such meta-information from the raw trace event data.
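As a rough illustration of this kind of abstraction, the sketch below derives function hit counts, time-in-function totals and per-CPU busy time from a list of raw entry/exit events. The event tuple layout and the function names are assumptions made up for the example; this is not the LTTng event format or Sourcery Analyzer’s internal model.

```python
from collections import Counter, defaultdict

# Hypothetical raw trace events: (timestamp_ns, cpu, kind, function_name).
EVENTS = [
    (100, 0, "entry", "decode_frame"),
    (150, 0, "entry", "memcpy"),
    (180, 0, "exit", "memcpy"),
    (400, 0, "exit", "decode_frame"),
    (120, 1, "entry", "render"),
    (320, 1, "exit", "render"),
    (500, 0, "entry", "memcpy"),
    (520, 0, "exit", "memcpy"),
]


def summarize(events):
    """Return (hit counts, inclusive time per function, busy time per CPU)."""
    hits = Counter()
    time_in = defaultdict(int)
    busy = defaultdict(int)
    stacks = defaultdict(list)  # one call stack per CPU
    for ts, cpu, kind, func in sorted(events):
        stack = stacks[cpu]
        if kind == "entry":
            hits[func] += 1
            stack.append((func, ts))
        elif kind == "exit" and stack and stack[-1][0] == func:
            _, start = stack.pop()
            time_in[func] += ts - start
            if not stack:
                # An outermost call finished: count it as CPU busy time.
                busy[cpu] += ts - start
    return hits, dict(time_in), dict(busy)


hits, time_in, busy = summarize(EVENTS)
# Rank functions by inclusive time, as a "top time-in-function" chart would.
top = sorted(time_in.items(), key=lambda kv: kv[1], reverse=True)
```

Keeping one call stack per CPU is what lets the same pass produce both the per-function view and the per-CPU resource usage.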

The visualization tools can represent the data in appropriate forms such as logs, graphs, charts, and tables as per the need; a log view can provide details about event sequences and associated information; a graph can provide trends, patterns, min/max information and so on; and a chart can provide statistical information about main factors such as producers, consumers, polluters, etc.

Then comes the most important part, exploration of the data, where trace data from different domains, such as kernel and user space, is correlated and synchronized via a common time axis. The data section corresponding to any anomaly or peculiarity needs to be located and its initiator identified, perhaps through user-space traces.

The accuracy and success of analysis of a complex multicore system depends highly on the availability of solutions to correlate trace data from different domains that have been recorded during the execution of the examined application code. Generally, system and application time sources are not synchronized, and time is counted with different resolutions by different methods such as time-stamp count registers (in x86 architectures) and memory mapped external timers (in embedded architectures), which vary in their handling of numeric value overflow, linearity and resolution. The above illustration shows a time-correlated kernel and user-space traces scroll-synchronized through a selection cursor, the red vertical bar.

Sourcery Analyzer supports a palette of facilities to correlate time synchronized (scroll-synchronized by placing dedicated synchronization cursors at desired time-stamps) as well as time offset traces (that can be rebased to align their time scales).
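To make the rebasing idea concrete, here is a minimal sketch that first undoes the overflow of a fixed-width timer and then maps its ticks onto a reference time axis using one event seen in both traces. The 16-bit width, the 1000 ns tick and the sync values are illustrative assumptions, and the functions are hypothetical helpers, not part of Sourcery Analyzer.

```python
def unwrap(ticks, bits):
    """Undo counter wrap-around, assuming at most one wrap between samples."""
    modulus = 1 << bits
    out, base, prev = [], 0, None
    for t in ticks:
        if prev is not None and t < prev:
            base += modulus  # the counter overflowed since the last sample
        out.append(base + t)
        prev = t
    return out


def rebase(ticks, ns_per_tick, sync_tick, sync_ns):
    """Map raw ticks to the reference axis (ns) via one shared sync event."""
    return [sync_ns + (t - sync_tick) * ns_per_tick for t in ticks]


# User-space trace stamped by an assumed 16-bit timer at 1000 ns per tick.
user_ticks = unwrap([65530, 65534, 2, 10], bits=16)
# Assume tick 65530 was observed at 5,000,000 ns on the kernel time axis.
user_ns = rebase(user_ticks, ns_per_tick=1000, sync_tick=65530, sync_ns=5_000_000)
```

A single sync point corrects only offset and resolution; clocks that drift relative to each other would need at least two sync points to estimate the slope as well.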

The abstracted data from analysis and visualization tools provide a good ground for the trace tools to do further measurements and computations to determine the system behavior, for example event blocks spending most execution time.

Measurement and calculation tools for post-processing the data are seamlessly integrated into Sourcery Analyzer with an easy and useful user interface. The above illustration shows frame events of different durations with annotations applied by the measurement tool. The ‘PulseWidth’ graph indicates the absolute frame duration values across the x-axis.

From the visualization graphs, the calculation tool can extract and compute derived data such as load curves for individual CPUs and load trend over all CPUs obtained from moving averages of individual CPU load graphs.
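The derived data described here can be sketched in a few lines: smooth each CPU’s load samples with a moving average, then average the smoothed curves across CPUs to get an overall trend. The sample values, window size and function names below are assumptions for illustration only.

```python
def moving_average(samples, window):
    """Trailing moving average; early points use the samples seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out, total = [], 0.0
    for i, s in enumerate(samples):
        total += s
        if i >= window:
            total -= samples[i - window]  # drop the sample leaving the window
        out.append(total / min(i + 1, window))
    return out


def load_trend(per_cpu_load, window):
    """Average the smoothed per-CPU load curves into one system-wide curve."""
    smoothed = [moving_average(load, window) for load in per_cpu_load.values()]
    return [sum(col) / len(col) for col in zip(*smoothed)]


# Hypothetical load samples (fraction busy per interval) for two CPUs.
per_cpu = {
    "cpu0": [0.0, 1.0, 1.0, 0.0],
    "cpu1": [1.0, 1.0, 0.0, 0.0],
}
trend = load_trend(per_cpu, window=2)
```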

Finally, the user should be able to look up the source code (through accurate and comprehensive source code information display) from the trace event data in case of any encountered problem or concern and apply appropriate improvement.

Trace data formats and customization options are other criteria to look at when choosing a particular tracing tool and technology. Mentor Embedded Sourcery Analyzer provides a versatile platform for trace data analysis, visualization, extraction, correlation, measurement and computation. It supports, but is not limited to, LTTng instrumented Linux kernels and software trace instrumented user applications. Read the second part of Mentor Embedded multicore whitepaper series – “Software Tracing Tools and Techniques for Advanced Multicore Development” to know more. Manfred Kreutzer provides a great detailed description about all the procedures required for tracing.



Managing Semiconductor IP

by Daniel Payne on 01-21-2015 at 5:00 pm

SemiWiki blogger Eric Esteve does an excellent job writing about all of the semiconductor IP available, and the popularity of IP is only growing more each year. Here’s a projection from IBS about semiconductor IP showing revenues of $4.7B by 2020:

Analyst Gary Smith divides IP into three broad categories: Functional, Foundation and Application.


An example of functional platform would be IP provided by ARM, foundation platform would be IP like TI’s OMAP, and finally an application platform example would be IP or software from Audi for their navigation and infotainment systems.

Related – Filling the Gap between Design Planning & Implementation

The number of IP blocks on a modern SoC is about 200 or so, making about 80% of a chip re-used. Here’s the chart from Semico Research:

Another trend with increased IP use is the rising cost of software with each new node. Data from IBS shows that at the 22 nm node we have SoC costs dominated by software development compared with hardware design or manufacturing.

Related – Smart Collaborative Design Reduces Business Risk

One approach to manage these increased product costs is to use a functional virtual prototype:


Functional Virtual Prototype (Dassault Systemes)

 

This approach enables early hardware-software co-development, shortening the product life cycle. You can actually verify a new design before detailed implementation begins. The SoC, along with all of the related software IP, can become a system virtual prototype and be managed with specialized software provided by Dassault Systemes.

Related – Enterprise IP Management – A Whole New Gamut in Semiconductor Space

GSA Working Group

Industry experts in IP will meet for two panel discussions and webcast on Thursday January 22, 2015 from 9AM to Noon (PT) at Synopsys in Mountain View, or by phone at 1-719-352-2630:

 

  • IP Management
    • Warren Savage, CEO, IPextreme, Moderator
    • Ranjit Adhikary, Director of Marketing, ClioSoft
    • Shiv Sikand, VP Engineering, IC Manage
    • Vishal Moondhra, VP Applications, Methodics
    • Michael Munsey, Director ENOVIA Semiconductor Strategy, Dassault Systemes
    • Kands Manickam, Senior VP & GM, IPextreme
  • IP Business Models
    • Warren Savage, CEO, IPextreme, Moderator
    • John Koeter, VP Marketing, Synopsys
    • Brian Gardner, VP Business Development, True Circuits
    • Oliver Gunasekara, CEO, NGCodec
    • Frank Ferro, Senior Director Product Management, Rambus
    • Marty Kovacich, CFO, Sonics



Analyzing Power Nets

by Paul McLellan on 01-21-2015 at 7:00 am

One of the big challenges in a modern SoC is doing an accurate analysis of the power nets. Different layers of metal have very different resistance characteristics (since they vary so much in width and height). Even vias can cause problems due to high resistance. Typically power is distributed globally on high-level metal layers, which have the lowest resistance, but eventually, of course, the power has to get all the way down to the transistors through the much higher resistance metal 1 and 2 and the associated vias. A full analysis requires accurate resistance in order to do IR drop analysis, electromigration (EM) analysis and thus the implications for reliability and timing.

Silicon Frontline’s P2P (it stands for point-to-point) performs full-chip transistor level IR drop and EM analysis for the power net design. It is focused on providing easy-to-setup and easy-to-use analysis that has the speed and unlimited capacity to handle the whole chip.

Power nets form very complex systems, reaching all parts of the chip. One of the guiding principles of P2P is progressive verification—find the gross errors first, in a straightforward way, and save the compute-intensive verification for the troublesome layout issues.


First, for a qualitative view of the power net the user performs resistance analysis. The user provides the GDS and top-level pad(s), and P2P automatically calculates resistance across the complete net and provides graphical and textual output of the results, color coded from blue to red as to how high the resistance is. The user can see at a glance the absolute resistance to every point on each power net, as well as the resistance gradient, making problem identification easy.

For a more detailed quantitative view, the user performs IR drop/EM analysis. The user just specifies the voltage sources and current sinks and P2P does the rest, completing a full power net analysis in minutes. Static currents can be given for any level of hierarchy and refinement: for a block (e.g. IP), for a cell (e.g. a cell library element), or for a transistor (in which case P2P automatically characterizes device current according to device width).
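To show the arithmetic behind a static IR drop and EM check, here is a deliberately simplified sketch: a single rail fed from one pad, modeled as series resistances with a current sink at each node. This is an assumption-laden toy, not how P2P works; a real power net is a mesh that needs a full resistive network solve, and all values and function names below are hypothetical.

```python
def rail_ir_drop(vdd, seg_res_ohm, sink_ma):
    """Node voltages along one rail fed from a single pad.

    seg_res_ohm[k] is the resistance of the segment feeding node k and
    sink_ma[k] is the static current drawn at node k.
    """
    if len(seg_res_ohm) != len(sink_ma):
        raise ValueError("need one sink current per segment")
    # Current in segment k is everything drawn at node k and beyond.
    seg_ma, downstream = [], 0.0
    for i in reversed(sink_ma):
        downstream += i
        seg_ma.append(downstream)
    seg_ma.reverse()
    volts, v = [], vdd
    for r, i_ma in zip(seg_res_ohm, seg_ma):
        v -= r * i_ma / 1000.0  # V = I * R, with I converted from mA to A
        volts.append(v)
    return volts, seg_ma


def em_violations(seg_ma, width_um, limit_ma_per_um):
    """Indices of segments whose current per unit width exceeds the limit."""
    return [k for k, (i, w) in enumerate(zip(seg_ma, width_um))
            if i / w > limit_ma_per_um]


volts, seg_ma = rail_ir_drop(1.0, [0.5, 1.0, 2.0], [10, 20, 10])
bad = em_violations(seg_ma, width_um=[20, 20, 2], limit_ma_per_um=3.0)
```

Even in this toy, the narrow far-end segment is flagged for current density although it carries the least current, which is the kind of issue a layer-by-layer view is meant to expose.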

There are several ways of making use of this capability. One is full-chip, with currents for each circuit block defined as needed (determined by model availability and stage of design refinement). An alternative method is box based where the various blocks on the SoC are analyzed one after the other, in isolation, before all blocks are analyzed together in context. Additionally, the user can annotate arbitrary currents obtained from a variety of sources. For example, pick the peak block current from a SPICE simulation for one block, and a maximum IP block current from the provider’s datasheet. In this way, the user can analyze different time points with the SoC in different modes of operation, where these modes may have very different demands on the power distribution networks.


P2P visualization allows display of resistance mapping, voltage distribution, and current density in metal interconnect and vias. It can also highlight excessive current and produce layer-by-layer resistance reports making it easy to zero-in on critical contributors to total resistance.

In summary, P2P provides:

  • easy to set-up and highly configurable, with no perturbation to existing flow
  • fast and easy to use analysis of power nets, with unlimited capacity
  • accurate resistance extraction for pad to pad, point to point, and multipoint to multipoint resistance, with layer by layer reporting for each power net
  • resistance mapping of interconnect
  • fast, accurate static IR drop analysis
  • current density analysis highlighting EM issues

P2P can be used at any stage of design and verification. By providing accessible data covering resistance, potential and current distribution, it guides users toward robust power nets that can reliably provide the needed currents to all areas of the die.


Not All RTL Synthesis Approaches are the Same

by Daniel Payne on 01-20-2015 at 7:00 pm

My first experience with logic synthesis was at Silicon Compilers in the late 1980s using a tool called Genesil. Process technology since that time has moved from 3 um down to 20 nm, so there are new challenges for RTL synthesis. Today you can find logic synthesis tools being offered by the big three in EDA: Synopsys, Cadence, Mentor Graphics. Since RTL synthesis has been around for decades you may be lulled into thinking that all approaches are about the same, and that the market is mature and kind of static. If that was really true, then why did Synopsys have to recently re-write their tool from scratch in order to meet the challenges of capacity, speed and Quality Of Results (QOR)?

Mentor Graphics acquired Oasys Design Systems about 13 months ago, and with that move filled a gap in their digital implementation tool flow by adding RTL synthesis. Engineers at Mentor authored an 8-page white paper to explain their approach and how it differs from anything else out there. In general, a modern synthesis tool must provide SoC designers with:

  • High capacity, 100 million+ gates or cells
  • Fast runtimes in hours, not days or weeks
  • Acceptable QoR
  • Physical awareness to decrease design closure times
  • Standard inputs: Verilog, SystemVerilog, VHDL

Related – Oasys Bakes a PIE

Old Approaches
A traditional logic synthesis approach translates RTL code into gates, then optimizes the gates to meet your design specifications. More modern approaches start to take physical information like estimated routing capacitances back into the optimization phase. Optimizing the design at the gate level is a low-level approach, is very localized, and can require long run times as the design size increases.

This approach can force users to break their design up into smaller pieces and have separate synthesis runs, which in turn will increase design closure times.

Without using a full-chip floorplan, a traditional synthesis tool will cause many iterations between front-end and back-end designers trying to reach design closure. You don’t even know where the congestion bottlenecks are with this approach.

New Approach
When Oasys was founded back in 2004, they knew that there had to be a better way to approach RTL synthesis. The RealTime Designer product came to life starting in 2009, and Oasys was then acquired by Mentor in December 2013. Along the way Oasys received funding from Intel Capital and Xilinx, certainly two very large customers with some of the highest-complexity SoC devices. Here’s what makes the new approach different:

  • Includes full chip-level physical synthesis
  • Placement-first
  • Floorplanning
  • Optimization at RTL level
  • Identifies and resolves timing, routability and power issues earlier in the design cycle

Related – Speeding Design Closure at DAC

An immediate benefit of this approach is that you can run RTL synthesis on a design with tens of millions of gates in just a few hours, not days. With physical synthesis, the tool divides the RTL into partitions that are placed, using physical library cells and accurately estimating the interconnect between cells. Both placement and timing information get updated with every optimization transformation. Congestion maps even give early feedback to an RTL designer about routing issues that may limit the physical implementation.


Routing Congestion Map

The RealTime Designer tool will automatically create a floorplan based upon each high-level module and other design data. Modules from the RTL are then assigned to physical regions of the floorplan. All of this physical placement info creates accurate interconnect estimates. You can add in any custom blocks or other hard macros required for your design, along with RTL source code. Here’s a picture of the floorplanning, placement and optimization steps:

Related – Oasys Announces Floorplan Compiler

Because RealTime Designer produces results so quickly, you can now afford to do some explorations to trade off power, performance, area, congestion and DFT goals.


Design Exploration

Instead of separating logical and physical design, with this approach you can actually begin to cross-probe between RTL source code and critical paths in the physical design after floorplanning:


Cross-probing between design views

Within RealTime Designer you’ll find both static and dynamic power analysis, plus support for:

  • Multiple Vt libraries
  • Advanced clock gating
  • Multi-Corner Multi-Mode (MCMM)
  • Power density driven placement
  • UPF and multi-VDD
  • Interactive and batch analysis

DFT engineers can use the scan insertion feature, which minimizes interconnect in the scan chains and creates a standard scandef file for Place and Route or ATPG tools:


Left: design without scan chain ordering.
Right: design with scan chains ordered with physical placement.
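The effect of placement-aware scan ordering can be illustrated with a small sketch. It uses a greedy nearest-neighbor heuristic with Manhattan distance, which is an assumption chosen for brevity and not necessarily what RealTime Designer does; the flop names and coordinates are made up.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) placement coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order, pos, start):
    """Total scan-chain wire estimate when visiting flops in this order."""
    total, here = 0, start
    for name in order:
        total += manhattan(here, pos[name])
        here = pos[name]
    return total


def order_by_placement(pos, start):
    """Greedy ordering: always hop to the nearest flop not yet in the chain."""
    remaining = dict(pos)
    order, here = [], start
    while remaining:
        # Ties are broken by name so the result is deterministic.
        name = min(remaining, key=lambda n: (manhattan(here, remaining[n]), n))
        here = remaining.pop(name)
        order.append(name)
    return order


# Hypothetical placed flops; the scan-in pin is assumed to sit at (0, 0).
placed = {"ff_a": (0, 9), "ff_b": (1, 0), "ff_c": (0, 8), "ff_d": (2, 1)}
netlist_order = ["ff_a", "ff_b", "ff_c", "ff_d"]
placed_order = order_by_placement(placed, start=(0, 0))
before = chain_length(netlist_order, placed, (0, 0))
after = chain_length(placed_order, placed, (0, 0))
```

A greedy pass like this is not optimal in general, but it shows why ordering by physical position rather than netlist order shortens scan interconnect.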

Conclusion
Engineers at Oasys, now Mentor Graphics, have developed a new approach to RTL synthesis that can handle 100M+ gate capacity, and produce results up to 10X faster than older architectures, while meeting your PPA (Power, Performance, Area) specifications. Read the complete 8 page White Paper here for more details.


Methodics Access Controls

by Paul McLellan on 01-20-2015 at 7:00 am

My PhD thesis is titled The Design of a Network Filing System. Yes, that was a research topic back then (and yes, we did call them filing systems not file-systems). One big chapter was on access controls. There are several problems with designing an access control system:

  • it needs to be possible to implement it efficiently
  • it needs to be flexible enough to provide the controls that the organization requires
  • it needs to be comprehensible to the people who have to set up the controls

These are hard goals to reconcile. When I worked at VLSI Technology I was responsible for all the data management infrastructure and I soon discovered that the elegant (to me!) access control systems that a software engineer might dream up were not really usable by design engineers. They only wanted very basic capability, namely that most stuff should be shareable and alterable, except they also wanted the capability to take a snapshot of the design and keep it in a form that could never be changed. I invented something I called a read-only-library (actually a large file containing all the smaller files that made up the snapshot). This was especially useful at tapeout to capture the precise design that was taped-out and ensure nobody could change it (since then it would not match the masks), and for standard cell libraries that were also “released” in specific versions. That was about as far as IP went in the early 1980s when a state-of-the-art chip was 10,000 gates.

Now we are in a different world. As a couple of the speakers at SEMI’s ISS this week stated, the semiconductor industry is becoming a dumb-bell, with one end being very large companies with broad IP portfolios (think Broadcom say, or Synopsys) and at the other are tiny companies with a business of selling one product, either as IP or a chip. And nothing really in-between. Those large companies with huge IP portfolios have blocks that are generic and others that are crown-jewels. They need very different access controls since some products (an LTE-modem say) should be restricted on a need-to-know basis to ensure that a random engineer leaving for the competition hasn’t had access to it (unless working on the product, obviously) but others (a standard cell library perhaps) need to be widely available to every design engineer.

These days we have powerful source control systems such as git and subversion. But these are designed for software projects and don’t really map directly onto the capabilities that are required for managing the IP lifecycle or the design of semiconductors with myriad views, hierarchy and a variety of requirements that don’t map well onto software (you can see the netlist but not the layout). This is where Methodics comes in, providing the capabilities for managing the IP lifecycle in a layer between the basic underlying data management system and the designers themselves.

For example, engineers integrating an IP block (say that LTE-modem) need one level of access, whereas engineers that are designing that block need another level of access. In particular, the first group of engineers need read-only access to some subset of the views of the block, whereas the designers of the block need to be able to alter it and fix bugs in it and create releases of it.

At the IP level there is some hierarchy in the sense that large IP blocks will pull in smaller IP blocks. For example, an Ethernet controller is an integration of a MAC (digital logic) and a PHY (largely analog) probably designed by different groups. It needs to be possible to control access to the high level (designers can alter it, for example) without automatically granting the same access to the lower-level IP blocks (the integration engineers cannot change the MAC).


Methodics ProjectIC offers access control commands with this level of flexibility through the pi perm command. It is possible to do things like:

  • add a new user to the project with read access to everything
  • selectively remove access to lower-level IP blocks
  • preview what permission changes a particular command would make without making the changes, to check that no unwanted changes will happen
  • construct complex access controls that are not wholly hierarchical (a contrived example: allow access to the US, deny it to California in general, but permit it for Palo Alto)
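To make the last two bullets concrete, here is a minimal Python sketch of one way such rules could be resolved. It is not ProjectIC's implementation and it does not use the pi perm syntax. The names Rule, is_allowed and preview are hypothetical, and the sketch assumes that the most specific matching rule wins and that access is denied when no rule matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical access rule attached to a prefix of the hierarchy."""
    path: tuple   # e.g. ("US", "California")
    allow: bool   # True grants access, False denies it


def is_allowed(rules, target):
    """Return the decision of the longest rule prefix that matches target."""
    best = None
    for rule in rules:
        if target[:len(rule.path)] == rule.path:
            if best is None or len(rule.path) > len(best.path):
                best = rule
    # Assumption: deny by default when no rule applies.
    return best.allow if best is not None else False


def preview(rules, new_rule, targets):
    """Dry run: list the targets whose decision would change, without applying anything."""
    proposed = rules + [new_rule]
    return [t for t in targets if is_allowed(rules, t) != is_allowed(proposed, t)]


# The contrived example from the text: allow the US, deny California, permit Palo Alto.
rules = [
    Rule(("US",), True),
    Rule(("US", "California"), False),
    Rule(("US", "California", "Palo Alto"), True),
]
```

With these rules, a target under ("US", "Texas") is allowed, anything else in California is denied, and Palo Alto is allowed again. Calling preview with a proposed rule returns only the targets whose access would flip, which is the kind of check the third bullet describes.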

Read the white paper on IP permissions management here


More articles by Paul McLellan…


Aldec increasing the return on simulation

by Don Dingee on 01-19-2015 at 10:00 pm

Debate rages about which approach is better for SoC design: simulation, or emulation. Simulation proponents point to software saving the need for expensive hardware platforms. Emulation supporters stake their claims on accuracy and the incorporation of real-time I/O. A few years back, some creative types coined the term SEmulation, a hybrid utilizing both approaches. A quick search turns up an Altera white paper on that exact topic, circa 2007, and an even older reference of first usage of the term in EDA around the year 2000.

Funky names aside, the drawback with most early-stage approaches to any complex problem is they are proprietary. A particular simulator environment was lashed to a certain model of FPGA-based hardware prototyping platform, with a high degree of knowledge about the internals of each required to make it work. A handy idea, but potentially expensive, inflexible, and locked in.

2007 is right about the time the SCE-MI specification was emerging in original form from Accellera. SCE-MI standardizes the co-emulation modeling interface. It also spends a lot of effort on minimizing the interaction, or synchronizations, required between the simulator and emulator platform. Simulators like events, where emulators prefer timed sequences.

SCE-MI establishes the idea of transactors, connecting an untimed testbench in the simulator to timed modules in the emulator. This allows the emulator to run with its faster timing intact. Transactors form a pontoon bridge of sorts. By adhering to the SCE-MI standard, a simulator and emulator are loosely coupled, allowing replacement of one or both sides by adding the proper transactors. Internal knowledge is reduced, flexibility is greatly increased, and lock-in is avoided.
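The transactor idea can be illustrated with a toy Python model. This is a conceptual sketch only and does not use the SCE-MI API: the bus with valid, addr and data pins, the three-cycle write protocol, and the names write_transactor, ToyDut and run are all assumptions made for illustration.

```python
def write_transactor(addr, data):
    """Expand one untimed write message into per-cycle pin values (hypothetical bus)."""
    return [
        {"valid": 1, "addr": addr, "data": data},  # cycle 0: drive address and data
        {"valid": 1, "addr": addr, "data": data},  # cycle 1: hold while the DUT samples
        {"valid": 0, "addr": 0, "data": 0},        # cycle 2: release the bus
    ]


class ToyDut:
    """Stand-in for the timed hardware side: stores data on every cycle where valid is high."""

    def __init__(self):
        self.mem = {}
        self.cycles = 0

    def clock(self, pins):
        self.cycles += 1
        if pins["valid"]:
            self.mem[pins["addr"]] = pins["data"]


def run(messages):
    """Untimed testbench side: each message crosses the boundary once."""
    dut = ToyDut()
    for addr, data in messages:
        for pins in write_transactor(addr, data):
            dut.clock(pins)  # the timed side runs its cycles without further messages
    return dut
```

Sending two messages through run produces six clock cycles on the hardware side, so the testbench synchronizes once per transaction rather than once per clock. That ratio is the point of keeping the testbench untimed.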

Aldec has taken a big step forward with the latest release of their hardware emulation solution software. HES-DVM brings a powerful simulation environment compliant with SCE-MI. It allows connection of the Aldec HES-7 FPGA-based prototyping system, or another third-party SCE-MI compliant platform, or custom in-house FPGA hardware to provide the hardware acceleration.

Users of UVM need hardware acceleration, desperately. Even with continual improvements in constrained random solvers and other algorithms, HDL simulators are still compute-intensive beasts. As the size of the design increases, the execution time of the simulation goes up dramatically. Using an FPGA-based system provides speed effective acceleration for an HDL simulator without the massive costs of a full-blown hardware emulator.

HES-DVM 2014.12 brings improvements in three main areas:

  • Significant improvements in the SCE-MI 2 Compiler expand capability to convert behavioral code into synthesizable RTL targeting an FPGA. For example, the compiler now supports SystemVerilog DPI-C import and exports as a SCE-MI function-based interface. Support for plusargs has been added, allowing arguments to be passed within transactors, aiding in run-time configurable parameterization. Turbo Mode allows compression of design clocks so edge sequences are preserved, but periods are shortened. Force signal values can send values to force any design net in RTL. Optimization has been applied to constant arguments of DPI-C function calls, reducing synchronization and speeding up emulation.
  • Scalability improvements allow jobs to be scheduled against load sharing facility (LSF) compute farms. This also applies to scaling of acceleration clusters using multiple HES-7 or similar platforms.
  • Support has been added for the Cadence “NCSim” simulator, part of Cadence Incisive Enterprise Simulator. The DVM generates the SCE-MI DPI emulation bridge compatible with NCSim, and NCSim can be selected when creating a new project or in the simulation options.
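The Turbo Mode description in the first bullet (edge sequences preserved, periods shortened) can be pictured with a small Python sketch. How HES-DVM actually implements it is not described here, so this is only one possible reading. The functions clock_edges and compress are hypothetical, and clocks are modeled as rising edges at integer times.

```python
def clock_edges(name, period, phase, until):
    """Rising-edge times of a clock with integer period and phase, before time `until`."""
    return [(t, name) for t in range(phase, until, period)]


def compress(edges):
    """Renumber distinct edge times as consecutive ticks.

    The order of edges is kept, and edges that coincided still coincide,
    but the idle time between edges is squeezed out.
    """
    times = sorted({t for t, _ in edges})
    tick = {t: i for i, t in enumerate(times)}
    return sorted((tick[t], name) for t, name in edges)


# Two hypothetical design clocks with periods of 10 and 25 time units.
edges = clock_edges("clk_a", 10, 0, 60) + clock_edges("clk_b", 25, 0, 60)
compressed = compress(edges)
```

In this example the original edges span 50 time units, while the compressed schedule spans 6 ticks with the same edge order. Fewer idle time steps between edges is the intuition behind shortening periods without disturbing behavior that depends on edge ordering.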

Embracing SCE-MI and adding hardware acceleration dramatically increases the “return on simulation” for ASIC developers. Aldec continues to open their environment, combining their tools, popular tools from other EDA vendors, and custom hardware platforms into a complete solution for RTL verification.



Analyze Substrate Noise in SoC Design?

by Pawan Fangaria on 01-19-2015 at 4:00 pm

Substrate noise analysis is often performed only once the whole chip is assembled, but that stage comes close to tape-out, which is too late to make major changes to the architecture or placement, or to introduce noise protection circuitry for the victims. That was acceptable when there was very little analog content on the chip. In today’s SoCs, however, substantial analog and RF content (possibly in the form of specialized IP) is intermingled with large digital content, and waiting until tape-out puts the schedule at risk because increasing substrate noise can pose severe risks to those sensitive IP blocks. There may be multiple RF circuits operating in close proximity to fast-switching circuits, particularly in wireless applications. If not controlled properly, substrate noise can severely affect the performance of SoCs.

Although analog and digital blocks have separate power and ground supply structures, they are etched on a common silicon substrate, thus allowing the generated noise to propagate through the substrate. There are isolation techniques to separate these two types of circuits, but how to determine which technique and how much control is appropriate in a particular situation, such that the SoC is neither overdesigned nor left susceptible to substrate noise that can limit its performance? There is a need to accurately model and predict the substrate coupled noise and add appropriate isolation structures to design a robust SoC.

Ansys’s Totem-SE supports several isolation structures in its analysis, including P+ guard-ring, N+ guard-ring, N-well wall, Deep N-well wall, and Deep N-well pocket. Totem-SE considers all substrate layers and the necessary technology parameters in constructing the substrate RC network, and it models all pertinent noise injection elements such as standard cells, memories, IOs, and specific analog and custom circuits. This lets it assess accurately which isolation structures are the right fit in typical scenarios.
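For intuition about why an isolation structure helps, here is a deliberately crude Python sketch. It is a lumped, purely resistive toy model with made-up values, not the distributed RC network that Totem-SE builds. The function victim_noise and its parameters are hypothetical: the aggressor couples through r1 to a midpoint, the midpoint couples through r2 to the victim, the victim sees r_load to ground, and an optional guard ring ties the midpoint to ground through r_guard.

```python
def victim_noise(v_noise, r1, r2, r_load, r_guard=None):
    """Noise voltage at the victim node of a toy resistive substrate model.

    Assumed topology: aggressor --r1-- mid --r2-- victim --r_load-- ground,
    with an optional guard ring from mid to ground through r_guard.
    """
    z = r2 + r_load                       # path from mid to ground via the victim
    if r_guard is not None:
        z = z * r_guard / (z + r_guard)   # guard ring in parallel with that path
    v_mid = v_noise * z / (r1 + z)        # voltage divider at the midpoint
    return v_mid * r_load / (r2 + r_load)


# Hypothetical values: 100 mV of aggressor noise and 1 kOhm substrate segments.
without_ring = victim_noise(0.1, 1000.0, 1000.0, 1000.0)
with_ring = victim_noise(0.1, 1000.0, 1000.0, 1000.0, r_guard=100.0)
```

In this toy case the low-resistance guard ring shunts most of the noise current to ground, so with_ring comes out well below without_ring. Real substrates add capacitance, multiple layers and layout-dependent geometry, which is why a dedicated extraction and analysis tool is needed for actual designs.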

Experimental results with different isolation structures between a digital processor core and an analog block show close correlation between measurement from silicon (blue waveform) and the prediction from Totem-SE (pink waveform). The waveform indicates the worst noise amplitude starting from digital circuit, going through different isolation structures, into the analog circuit. Totem-SE can also provide DvD (Dynamic Voltage Drop) maps for various substrate layers for designers to determine optimal locations for isolation and protection structures in their designs.

Totem-SE can be used for full-chip analysis in which the noise injection from various digital and analog components is modeled and propagated through the on-die, package and substrate parasitic network. The noise waveforms from the full-chip analysis can be captured at specific locations of an analog block and used as PWL input to the IP level timing and functional simulations using Spice to improve accuracy of the IP level simulation. By using the true voltage noise signature, the impact of the coupled noise on the IP can be explored to determine if additional changes in the layout or protection schemes are needed, thereby preventing silicon issues after tape-out.
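As a rough illustration of the PWL hand-off, the Python sketch below formats sampled noise points as a standard SPICE piecewise-linear voltage source. It does not read any Totem-SE output format. The function name to_pwl_source, the node names and the sample values are all hypothetical.

```python
def to_pwl_source(name, node_pos, node_neg, samples):
    """Format (time_s, volts) samples as a SPICE PWL voltage source line."""
    times = [t for t, _ in samples]
    # This sketch requires strictly increasing time points.
    if any(b <= a for a, b in zip(times, times[1:])):
        raise ValueError("PWL time points must be strictly increasing")
    points = " ".join(f"{t:.6g} {v:.6g}" for t, v in samples)
    return f"V{name} {node_pos} {node_neg} PWL({points})"


# Hypothetical noise samples captured at one location of an analog block.
samples = [(0.0, 0.0), (1e-9, 0.012), (2e-9, -0.008), (3e-9, 0.0)]
line = to_pwl_source("sub", "vsub_noise", "0", samples)
```

The resulting line can be placed in the IP level netlist so that the simulation sees the captured noise signature on the chosen node instead of an ideal, quiet substrate.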

In an SoC design, the noise generated by a digital circuit can be controlled by several means including isolation/protection structures, decoupling capacitances, power grid robustness, number of active blocks and activities on the blocks, and distance from active elements. Totem-SE is versatile enough to fit into various customer-specific SoC design flows, so that substrate noise analysis can start as early as possible. Ansys provides a full suite of tools for power, noise, and reliability analysis and optimization of SoCs.

Hagay Guterman from CSR and Jerome Toublanc from Ansys will be presenting a joint paper at DesignCon 2015, in which they will present a novel, proven design flow that starts substrate analysis very early in the design stage, even with very basic chip information. The models of noise generation and noise propagation through the substrate can be developed in parallel with each other as relevant data becomes available.

Do reserve your seat for the following session/paper in DesignCon 2015 to know more –

Session Code: 2-TH4
Track Name: 02 Analog and Mixed-Signal Modeling and Simulation Challenges
Paper Title: Substrate Noise Full-chip Level Analysis Flow from Early Design Stages Till Tapeout
Date: January 29, Thursday
Time: 11:05 AM – 11:45 AM

More Articles by Pawan Fangaria…