
Arteris Flexes Networking Muscle in TI’s Multi-standard IoT Chip
by Majeed Ahmad on 03-14-2015 at 7:00 am

Arteris Inc., a network-on-chip (NoC) interconnect IP solution provider, has teamed up with Texas Instruments Inc. on an ultra-low-power chip that lets Internet of Things (IoT) devices go battery-less with energy harvesting or run from a coin cell for multiple years.

Another low-power MCU story built around energy-efficient ARM cores? Well, yes. However, what's new in this tale is that TI's new chip supports multiple wireless standards. Another prominent highlight is how TI has implemented Arteris' FlexNoC interconnect fabric in its SimpleLink ultra-low-power wireless MCU portfolio to handle the communications side of the IoT device.

TI’s wireless MCU platform for IoT applications

The FlexNoC fabric serves as the system-on-chip (SoC) backbone in these low-power MCUs and helps implement IoT wireless communications standards such as Bluetooth low energy, ZigBee, 6LoWPAN and sub-1GHz. TI claims it's the first SoC product to support multiple wireless connectivity standards using a single chip and an identical RF design. And the fact that a single die can be used in multiple IoT products allows TI to pursue a unique SoC derivatives strategy.

The Dallas, Texas-based chipmaker's SimpleLink ultra-low-power wireless platform comprises the CC2620 (ZigBee RF4CE), CC2630 (6LoWPAN or ZigBee), CC2640 (Bluetooth low energy), CC2650 (2.4GHz technologies), and CC1310 (sub-1GHz operation) IoT chips.

TI's CC26xx family of low-power RF devices integrates an ARM Cortex-M3 MCU, flash/RAM, an analog-to-digital converter (ADC), peripherals, a sensor controller and built-in robust security on a single chip. There is speculation that one of the first CC26xx chips is going into an Oral-B toothbrush for kids.

The FlexNoC Part
TI has used FlexNoC power disconnect and firewall features to fuse off select digital I/O controllers, enabling various combinations of ZigBee/IEEE 802.15.4, ZigBee RF4CE, Bluetooth low energy, 6LoWPAN, and the proprietary SimpliciTI wireless protocols. As a result, TI was able to produce an entire IoT product lineup with a minimal number of digital logic components.
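The derivative strategy is easy to picture as a configuration table. The sketch below is illustrative only: the SKU-to-protocol mapping comes from TI's published line-up listed earlier, but the bitmask/fuse encoding is my own invention to show the idea of one die plus disable fuses serving a whole product family, not TI's implementation.

```c
/* Illustrative sketch only: the SKU-to-protocol mapping follows TI's
 * published line-up, but the bitmask/fuse encoding here is invented. */

#include <stdio.h>

enum proto {
    BLE       = 1 << 0,
    ZIGBEE    = 1 << 1,
    RF4CE     = 1 << 2,
    SIXLOWPAN = 1 << 3,
    SUB1GHZ   = 1 << 4
};

struct sku {
    const char *name;
    unsigned    enabled;   /* protocol blocks left powered and connected */
};

static const struct sku derivatives[] = {
    { "CC2620", RF4CE },
    { "CC2630", ZIGBEE | SIXLOWPAN },
    { "CC2640", BLE },
    { "CC2650", BLE | ZIGBEE | RF4CE | SIXLOWPAN },  /* all 2.4GHz stacks */
    { "CC1310", SUB1GHZ },
};

int main(void)
{
    for (size_t i = 0; i < sizeof derivatives / sizeof derivatives[0]; i++)
        printf("%s: fuse/enable mask 0x%02X\n",
               derivatives[i].name, derivatives[i].enabled);
    return 0;
}
```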

Moreover, TI brought down power consumption by exploiting the inherent low latency of zero-cycle paths, which are enabled by fully combinatorial logic in the FlexNoC interconnect IP. The FlexNoC fabric has also allowed TI to use firmware effectively in meeting the most stringent power and timing requirements.

FlexNoC safely disconnects power to parts of the SoC and clock tree

TI's IoT chips are an important design win for Arteris, which claims that the number one benefit of its FlexNoC interconnect technology is power management. TI's General Manager for Wireless Connectivity Solutions, Oyvind Birkenes, acknowledges the role of FlexNoC's power management features in creating what he calls the lowest-power IoT communication devices in the world: "FlexNoC allowed us to create a small set of digital logic SoC dies that serve as the brains of more than one hundred different products, each customized for its particular market."

The implementation of FlexNoC technology inside TI's wireless MCUs is crucial for Arteris for two reasons. First, IoT is a high-growth market, and an IP socket in a strategic market like IoT is a significant testimonial for Arteris' network-on-chip technology. Second, it counters the common perception within IC design circles that NoC interconnect IP is only used in highly complex SoC designs.

RF and Power Conundrums
According to TI, its SimpleLink family of ultra-low-power MCUs for IoT devices boasts an energy footprint small enough for a coin cell battery to power a light switch in a smart home for 10 years. TI's wireless MCUs even promise support for battery-less operation of energy harvesting-based sensor nodes.
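As a rough sanity check on the 10-year claim, consider the average current such a node can draw (assuming a typical CR2032 coin cell of roughly 225 mAh, a figure not stated by TI):

```latex
I_{avg} \approx \frac{225\ \text{mAh}}{10\ \text{yr} \times 8760\ \text{h/yr}} \approx 2.6\ \mu\text{A}
```

In other words, the entire device – radio, MCU and sensors included – has to average only a few microamps, which is why the standby and shutdown currents discussed below matter so much.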

So how were Arteris and TI engineers able to pull off this power management feat? Power management success stories in the wireless domain are typically traced back to the RF part. It’s usually the drain caused by transmit and receive currents inside wireless chips that puts constraints on the battery.

However, a closer look reveals that RF devices like radio transceivers usually don't contribute much to power consumption within a wireless chip. In fact, it's the small sensors and wireless protocol stacks within an SoC device that account for most of the power drain. To counter that, for a start, TI's CC26xx family of IoT chips uses two energy-efficient processors: an ARM Cortex-M3 and a sensor controller.

The ARM Cortex-M3 is the main system CPU inside the CC26xx device and consumes less than 3mA while running at its maximum speed of 48MHz. Next up, the sensor controller—which comprises a 16-bit CPU coupled with peripherals such as an ADC, analog comparators, SPI/I2C and capacitive touch sensing—interfaces with external analog or digital sensors autonomously while the rest of the SoC sleeps.

Ultra-low-power footprint despite support for major wireless protocols

And that's only part of the story. TI's complete chip can stay in standby mode at only 1uA using memory retention and real-time clock (RTC) techniques. According to EEMBC's ULPBench, this lets the CC26xx platform run at half the power of other MCUs. Here, TI has put to work many of the FlexNoC power management features to safely disconnect power to individual IP blocks as well as to parts of the on-chip fabric and clock tree.

Designers of portable wireless products can keep the standby currents between transmissions to a minimum and save enough juice in the battery for active use. For that purpose, in the CC26xx family of IoT chips, TI uses an ultra-low-leakage SRAM that can be fully retained while the RTC keeps running. Meanwhile, the CPU state is retained in standby mode, which consumes as little as 1uA. Moreover, in shutdown, the CC26xx can wake up on external I/O events while drawing as little as 150nA.

Take a heart-rate monitor, for instance, which needs to run the ADC 10 times per second to capture the heart rate accurately. An ultra-low-power CC26xx MCU can let the sensor controller perform all the ADC measurements and wake up the ARM Cortex-M3 on every 10th ADC sample for optional processing and a grouped RF transmission of the data.
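To make the duty-cycling concrete, here is a minimal, host-runnable sketch of that scheme in C. It is not TI driver code – the ADC read and heart-rate functions are stubbed placeholders – but it shows the batching logic: samples are collected on behalf of the sleeping CPU, and only every tenth sample triggers processing and a single grouped transmission.

```c
/* Host-runnable sketch of the duty-cycling scheme described above.
 * The hardware interactions are stubbed so the batching logic itself
 * can be compiled and run anywhere; nothing here is TI driver code. */

#include <stdio.h>
#include <stdint.h>

#define SAMPLES_PER_WAKEUP 10          /* main CPU wakes every 10th sample */

/* Stand-in for the autonomous sensor controller reading the ADC at 10 Hz. */
static uint16_t read_adc_sample(unsigned n)
{
    return (uint16_t)(2048 + (n % 7) * 10);    /* fake pulse-sensor data */
}

/* Stand-in for the processing the Cortex-M3 does when it wakes up. */
static unsigned estimate_heart_rate(const uint16_t *buf, unsigned len)
{
    unsigned long sum = 0;
    for (unsigned i = 0; i < len; i++)
        sum += buf[i];
    return 60 + (unsigned)(sum % 40);          /* placeholder beats-per-minute */
}

int main(void)
{
    uint16_t batch[SAMPLES_PER_WAKEUP];

    for (unsigned sample = 0; sample < 50; sample++) {
        /* The sensor controller collects samples while the main CPU sleeps. */
        batch[sample % SAMPLES_PER_WAKEUP] = read_adc_sample(sample);

        /* Every 10th sample, "wake" the Cortex-M3: process the batch and
         * send one grouped RF transmission instead of ten small ones. */
        if ((sample + 1) % SAMPLES_PER_WAKEUP == 0) {
            unsigned bpm = estimate_heart_rate(batch, SAMPLES_PER_WAKEUP);
            printf("wakeup at sample %u -> transmit heart rate %u bpm\n",
                   sample + 1, bpm);
        }
    }
    return 0;
}
```

The power saving comes from the ratio: the Cortex-M3 is active for one short burst out of every ten sample periods, and the radio is powered once per batch rather than once per sample.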

Image credit: Texas Instruments Inc.

Majeed Ahmad is the author of the books Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics and The Next Web of 50 Billion Devices: Mobile Internet's Past, Present and Future.


RF on SOI at GF
by Paul McLellan on 03-13-2015 at 7:00 am

Unless you have been living under a rock for the last decade, you can't help but notice the increased importance of RF: Bluetooth, WiFi, 3G, LTE, NFC, RFID and more. There is a lot of digital design associated with these standards, especially the highest-bandwidth ones, but they all also contain a radio, often called a modem. Some, like LTE, have to have enough power to reach a cell-phone tower several miles away; others, like NFC, only need to work over short distances. At the FD-SOI and RF-SOI Forum in San Francisco at the end of last month, Peter Rabbeni of GlobalFoundries presented on SOI: An Enabler for RF Innovation and Wireless Market Disruption.

Some RF can be done using standard CMOS, but a lot requires more esoteric technologies such as SiGe (silicon-germanium) or GaAs (gallium arsenide). Or SOI. In fact, RF on SOI is growing at a CAGR of over 20%, and that is before the Internet of Things (IoT) really takes off with its almost universal requirement for wireless connectivity. Currently SOI has about 65-70% market share of RF switches.

SOI brings some advantages that other approaches lack, primarily due to the isolation that comes from the high-resistivity substrate, and the fact that it is basically a mainstream process that can be manufactured in a standard CMOS fab at comparatively low cost. Since standard logic can also be built on the same wafer, it is easy to integrate the control with the RF on a single die, and even to integrate the power amplifier on the same die. Further, due to the isolation, it is possible to stack devices to handle higher voltages. One of the challenges in radios is being able to create tunable filters, a prerequisite for using a single radio across multiple frequency bands. This is something enabled by SOI, as shown by cutting-edge work done by ST and the University of Twente.

GlobalFoundries has various technologies suitable for RF, one of the most important being the 130nm RF SOI process manufactured in Singapore. This process is targeted at the RF front-end market for RF switches, integrated PAs and high-voltage devices. A number of customers are designing in the standard process today, and there are also customer-specific SOI processes such as the one used for the Peregrine Global1 product.

In addition to GF's own RF business, IBM has a large RF business manufactured in its Burlington, VT fab, where it is an industry leader. It is complementary to the GF portfolio, and the two roadmaps will be integrated once GlobalFoundries' acquisition of IBM's semiconductor business is completed. IBM also has a large share of the SiGe market for very high performance; GF has some SiGe business, but nowhere near as extensive. Again, the two businesses will be integrated post-acquisition.

The presentation should soon appear here.


CDC Verification: A Must for IP and SoCs
by Pawan Fangaria on 03-12-2015 at 1:00 pm

In the modern SoC era, verification is no longer a post-design activity. The verification strategy must be planned much earlier in the design cycle; otherwise verification closure can become a never-ending problem. Moreover, verification that appears to be complete may actually be incomplete because of undetected issues that can resurface during tape-out or even in the field after fabrication. Among the most difficult issues to detect and verify are those related to CDC (clock domain crossing). These issues appear when signals in a circuit cross asynchronous clock boundaries without being synchronized. A single unresolved CDC issue can render the whole chip useless, and the problem grows as the number of clocks in the design increases. In today's SoC, there can be hundreds of asynchronous clocks driving different IP blocks and complex functions spread across the design.


Example of a Typical SoC Block Diagram

Today, a typical SoC can have billions of gates and multiple power/voltage domains driven by different clocks, and it can operate in different modes with particular portions of the design active at different times. Such a design needs a verification methodology defined according to the design implementation and an intelligent solution for complete identification and verification of all CDC issues.

Traditional tools such as RTL simulators or static timing analyzers cannot precisely detect CDC issues. They often end up either under-reporting the real issues or over-reporting false violations, thus wasting a verification engineer's time. A comprehensive approach is needed that can pinpoint the real issues at lower levels and re-use that information at higher levels, thus optimizing the overall verification flow and improving the quality of verification.

Atrenta's SpyGlass CDC uses protocol-independent analysis technology and provides a comprehensive methodology for CDC verification. The software performs state-of-the-art structural analysis, using a suite of rule-sets to verify all kinds of structural CDC issues and avoid metastability. The protocol-independent analysis identifies and filters out false violations upfront, thus saving verification time. It can identify properly designed synchronizers, such as FIFO and handshake protocols, in a generic way. It can also identify signals that synchronize crossings between clock domains and check whether the crossings are functionally correct. The SpyGlass CDC solution also includes functional analysis that complements the structural analysis and ensures proper operation of the circuit without any data loss, incoherency, or glitches. The functional checks are done either by formal verification or by simulation.
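For readers less familiar with what a handshake synchronizer actually does, here is a software analogy in C – threads standing in for the two asynchronous clock domains, not RTL. The key property is the one a CDC tool looks for: only a single control bit toggles across the boundary at a time, and the multi-bit payload is read only after the request toggle is seen and rewritten only after the acknowledge comes back.

```c
/* Software analogy of a request/acknowledge handshake across two
 * asynchronous domains. Compile with: cc -pthread handshake.c */

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_uint req = 0, ack = 0;   /* single-bit toggles: safe to synchronize */
static unsigned payload;               /* multi-bit "bus": must be held stable    */

static void *sender(void *arg)
{
    (void)arg;
    for (unsigned i = 1; i <= 5; i++) {
        payload = i * 100;                               /* drive the data bus      */
        atomic_store_explicit(&req, i & 1u,              /* then toggle the request */
                              memory_order_release);
        while (atomic_load_explicit(&ack, memory_order_acquire) != (i & 1u))
            ;                                            /* wait for acknowledge    */
    }
    return NULL;
}

static void *receiver(void *arg)
{
    (void)arg;
    for (unsigned i = 1; i <= 5; i++) {
        while (atomic_load_explicit(&req, memory_order_acquire) != (i & 1u))
            ;                                            /* wait for request toggle */
        printf("received %u\n", payload);                /* bus is stable here      */
        atomic_store_explicit(&ack, i & 1u, memory_order_release);
    }
    return NULL;
}

int main(void)
{
    pthread_t s, r;
    pthread_create(&s, NULL, sender, NULL);
    pthread_create(&r, NULL, receiver, NULL);
    pthread_join(s, NULL);
    pthread_join(r, NULL);
    return 0;
}
```

In real silicon the req and ack bits would each pass through a two-flop synchronizer in the receiving clock domain; because only one bit changes at a time, metastability can delay the handshake but never corrupt the multi-bit payload.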

With these powerful CDC verification capabilities integrated into the SpyGlass Platform, designers have the flexibility to do the verification in multiple ways – flat, hierarchical bottom-up or hierarchical top-down. This flexibility allows designers to handle different situations while designing an SoC, which can vary in complexity and size from a few million to more than a billion gates. Also, not all IP blocks may be ready before integration; some of them may come from third parties without their functional views. The SoC designer can take appropriate action on the level of verification required for such IP and its interfaces within the SoC. Let's take a look at the scenarios.

For small SoCs or smaller blocks of IP, the verification can be done in flat mode where the entire SoC is verified in a single run without missing any CDC bugs. The advantage in this case is ease of setup, as all clock modes and design constraints are available at the chip level.

Hierarchical bottom-up CDC verification is highly scalable and can handle billion-gate designs. However, in this case, blocks need to be set up and verified gradually as they are built. The SoC designer works with only CDC-clean blocks, verifies CDC at inter-block interfaces, and then creates an abstracted smart model for each block. In this approach, verification quality is ensured by maintaining completeness at each block and its higher-level integration, along with coherency of constraints between each block and its top level. Use of these smart models can reduce verification time and memory footprint by up to 10X.

In hierarchical top-down CDC verification, the constraints are created and driven from the top. This makes it possible to refine the SoC constraints early in the design cycle, and the ownership of satisfying those constraints goes to the IP or block owners. Verification closure happens gradually as and when the blocks become ready. For any third-party IP whose functional view is not available, the SoC integrator needs to decide whether that IP should be fully verified or partially verified at the boundary.

The SpyGlass CDC verification flow also provides a closed loop between RTL and netlist level verification. At the RTL, substantial structural and functional analysis is done to find all CDC issues. In the netlist, insertion of clock gating, power optimization logic, and other changes may introduce new CDC issues. Therefore, it is mandatory to perform complete structural analysis again at the netlist level. The functional verification is done as required depending upon the fixes implemented during structural analysis. Depending upon the design hierarchy and complexity, it’s important that the verification methodology is defined upfront and that CDC verification be done for the complete SoC in the most optimal manner.

Read a detailed methodology description for CDC verification of billion gate SoCs in a whitepaper here at the Atrenta website. You will need to complete a short registration process in order to download these whitepapers. To get more insight into SpyGlass CDC verification methodology, you can attend an upcoming webinar with the following schedule:

Topic – Signoff Quality CDC Solution for Billion+ Gate Designs
Date/Time – March 19, 2015, 4:00pm CET (8:00am PDT) and 10:00am PDT
Registration link – http://www.atrenta.com/events/?series=webinar-series-2015

The SpyGlass CDC verification flow is also tailored for FPGA designs. Atrenta provides a SpyGlass-FPGA-Kit that can be used in a Xilinx FPGA-based design to make it Lint and CDC clean. Look for more details in another whitepaper, SpyGlass Lint/CDC Analysis for Xilinx FPGA, here.


Getting a Grip on the Internet of Things
by Paul McLellan on 03-12-2015 at 7:00 am

QuickLogic's CTO Tim Saxe gave a keynote, Getting a Grip on the Internet of Things, at the IoT Summit last week.

He started by relating how things have changed over the last 3 years when he talks to customers.

  • Three years ago it was sensor hubs in smartphones and the power budget was 3mW (so one day between re-charging, something we are all well-trained to do).
  • Two years ago it was sensor-hubs with a power budget of 500uW, almost an order of magnitude lower (one month between charges).
  • Then IoT came along and we dropped almost another order of magnitude, with enterprise wearables wanting 80uW power budgets, which will last for six months or is low enough to make the various energy harvesting approaches workable (so battery life becomes effectively infinite).


In the past, systems were primarily built around software. If the processor isn't fast enough, up the frequency; if that can't be done, add more cores; if that can't be done either, add a big FPGA to accelerate some algorithms. This approach is very power hungry, and IoT turns everything upside down with its extremely limited power budgets.

Let's look at what 80uW allows:

  • accelerometer takes 14uW
  • Bluetooth Smart takes 12uW
  • power management takes 20uW
  • which leaves 34uW for processing

A representative microprocessor takes about 100uA/MHz so you can afford around ⅓MIPS.
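Working that budget through (assuming a core supply of roughly 1V, which the talk did not specify):

```latex
\frac{34\ \mu\text{W}}{1\ \text{V}} = 34\ \mu\text{A}, \qquad
\frac{34\ \mu\text{A}}{100\ \mu\text{A/MHz}} \approx 0.34\ \text{MHz} \approx \tfrac{1}{3}\ \text{MIPS}
```

for a core that retires roughly one instruction per cycle.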

BTW a pet peeve of mine. If your processor runs 333K instructions per second then it is ⅓MIPS, not ⅓MIP. The S in MIPS is not making it plural, it is the “second” of “per second”. End of pet peeve.

A basic pedometer takes about ⅓MIPS, but anything more sophisticated needs more. Dedicated hardware is too complex to build high-level decision-making on top of, but pure software is too energy intensive. What is required is to move the few energy-intensive parts into hardware and keep the flexibility of software everywhere else, to get both accuracy and low power.

Accuracy turns out to be a nebulous concept because more accurate measurements take more power. In fact, much as the three most important factors in real estate are all "location," Tim repeated this point three times. In practice, more power means more data lost when the battery needs replenishing, so the power-hungry, very accurate pedometer may be outclassed by a less accurate but less power-hungry approach.

A solution to a lot of complicated issues like natural language processing is to use machine learning. This isn’t always appropriate: machine learning is going to kill a lot of pilots and destroy a lot of money before it learns how to fly a plane. But for non-critical applications it is often a superb way to build an application that soon outclasses even the best hand-constructed models.

So Tim wrapped up by reiterating that the IoT requires a change in how we think about designs. The two big takeaways are:
1. Embrace uncertainty: overall accuracy is more important than point accuracy, and machine learning is a great way to get to answers that are hard to explain in advance.
2. When power is critical, putting some functionality into hardware is vital.

Video of Tim's keynote is available (21 minutes).


    Cadence’s New Implementation System Promises Better TAT and PPA
    by Tom Simon on 03-12-2015 at 1:00 am

On Tuesday Cadence made a big announcement about their new physical implementation offering, Innovus, during the keynote address at the CDNLive event in Silicon Valley. Cadence CEO Lip-Bu Tan alluded to it during his kickoff talk, and next up Anirudh Devgan, Senior Vice President of the Digital & Signoff Group, filled in more of the details. I was fortunate enough to have a briefing on Innovus with Anirudh and Cadence Marketing Director Rahul Deokar before the public announcement.

    Before I go into the details, I’d like to talk about my experiences with new EDA products. Over the years I have held sales and marketing positions at Cadence and Mentor as well as at smaller companies. In these roles I talked to a lot of customers, and certain themes came up over and over again. The cost of moving to new tools is high; there are risks and the results of moving need to justify the time and effort required. Thus management and engineers will only switch if the benefits are significant, or address a new and otherwise unmanageable design issue.

    Usually new tools offer either a turnaround time (TAT) advantage or an improved quality of result, such as performance, power and area (PPA), but not both. Lastly new products were often announced before challenging real world designs had been thrown at them. Now let’s talk about Innovus.

Innovus adds a new placer technology called GigaPlace. This rounds out their updated implementation technology by complementing GigaOpt, Tempus and NanoRoute. Placement is crucial for optimal design results. During his keynote presentation, even Anirudh acknowledged that placement had previously been a weak spot for them. He has had two years to improve on the technology he inherited when he took over the implementation flow.

In Innovus they are capitalizing on the technology from their Azuro acquisition by incorporating Clock Concurrent Optimization (CCOpt) in the flow. Azuro technology was always strong, but when it was sold standalone the integration hurdles made for a difficult sale. I know; I was there. With this technology fully integrated into the rest of the P&R flow, using it is much easier. Plus, Cadence has polished the useful-skew technology that Azuro was starting to roll out at the time of the acquisition. For the clocks, they are using a hybrid approach with H-trees at the higher levels to maintain symmetry, breaking out into classic CTS-based clocks at the lower levels. The H-tree-based portions help reduce variation-induced clock timing issues.

The other big integration for Innovus is with Tempus, the Cadence signoff solution. Faster signoff is an obvious win with direct database integration. But Anirudh also talked about signoff-based ECOs, which are made more efficient because they can be done without tool iterations.

Touting what Cadence calls massively parallel computing, Innovus is said to be able to work on much larger data sets and do so much more quickly. One way to gain from this is to take advantage of larger gate counts in blocks: Cadence says it works well with block sizes of 5-10M instances or more. This reduces the number of blocks, removing channel routing areas and reducing congestion. These larger blocks will run faster in Innovus too. See the table they provided below for their numbers.

    From the chart it seems that there is a bigger win at 28nm than at 16nm, but this is understandable in that 16nm designs have many more constraints, and variation effects grow with additional masks and patterning requirements.

    But what about quality of results? Cadence provided the above chart to show improvement in PPA. One of the most interesting aspects of improved TAT and PPA is that designers might have time to improve their designs beyond specs, if they can reach initial design targets faster and then have more iterations available in the time remaining before tape out. One example Cadence cited shows this being done by one of their beta customers.

    Speaking of beta customers, it seems Cadence has been working with many of their customers on this technology. They are able to point to numerous design examples and have customer endorsements from the likes of ARM, Freescale, Juniper, Renesas, Spreadtrum and Maxlinear. An impressive list indeed.

Anirudh spoke about how chip companies are no longer easily divided along the lines of analog or digital. Today's SOCs are predominantly mixed-signal, which means a winning flow must allow analog and digital content to be integrated and optimized together. Leveraging Cadence's strength in analog design, they have added hooks to integrate Virtuoso and Innovus in the flow. This includes a common database and a GUI that uses Tcl for scripting.

Cadence appears to have addressed the main objection to moving to a new tool – will the change provide sufficient benefit to warrant the cost of moving? They seem to have their ducks in a row regarding having enough miles under their belt before rolling out a major new update. ARM in particular talked very positively about their results with Innovus on their high-end Cortex-A72 core implementation. And Cadence went to lengths to assure me that even for older nodes, all the technology and PDK information will still work. This means that Innovus will be useful not only for cutting-edge designs but also for a lot of IoT and mobile designs that must have the lowest possible power and are implemented on nodes ranging from 180nm down to 40nm.

For more information go to the Cadence Innovus page.


    How many coats cover this SoC?
    by Don Dingee on 03-11-2015 at 7:00 pm

    “Most interior paint covers with one coat.” Back when there was something called a newspaper, this was an actual blurb in the home improvement pages, section 3, part 8, page 5 of the Chicago Tribune on Sunday, August 13, 1961. Even then, marketers were catering to consumers looking to cut corners and save time, and one-coat coverage was a popular claim among pigment providers. The column filler went on to say yellow and pink have low covering power, and may need a primer. Let’s face it, pretty much nothing covers dark green in a single application.

    I suspect most SoC verification teams don’t trust that claim any more than they trust code coverage tools. To counter the lack of confidence, teams usually keep applying coats of testing until they see a functional coverage metric asymptotically approaching 100%. There is no arguing with 100%, right? Nobody gets away with walking into a meeting and saying we’re good, coverage is at 85%. The effort to get to 100% is usually accepted – expected, in fact.

    Cutting corners on painting can be somewhat hazardous. One risks the scornful review of a significant other, who visually inspects the work on two dimensions: “you missed a spot”, and how much of a mess was made in the process. Too little paint is bad; leftover paint is a bonus for future maintenance. The actual amount of paint used, and the number of brushes or rollers chewed up, is rarely a consideration if the budget was met and the aesthetic outcome is right.

    Cutting corners on verification can be deadly. However, verification differs from house painting in one important way: different tests deliver varying amounts of effectiveness and coverage. A difficult-to-cover dark green spot in an SoC design may require a focused verification routine. Wider swaths of HDL may be covered more easily and quickly.

The question for SoC verification becomes how to provide the greatest code coverage in the least amount of time and effort. Some tests may prove better than others, and some tests may prove to be completely redundant, adding no value. Most verification tools only report on the final result – coverage. Teams may be trying to fill in coverage while expending significant resources on duplicative or ineffective tests, and never know it.

Aldec has released Riviera-PRO 2015.02 with a major new feature: test ranking. By looking at how each test contributes to coverage, comparisons between tests become easy. Teams can put more energy into higher-ranked tests, and less into tests that are redundant or contribute poorly.

    There is also a cost factor to the ranking. Parameters can include how much simulation time is required; where two tests provide equivalent coverage, the faster one should be chosen. A longer test may be worth the CPU cycles if it provides substantially better coverage than alternatives.
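To illustrate the idea (and only the idea – this is not Aldec's algorithm, and the tests, coverage bins and runtimes below are invented), here is a greedy ranking by "new coverage gained per second of simulation" in C:

```c
/* Greedy test ranking by new-coverage-per-second. Illustration only:
 * the tests, coverage bitmasks and runtimes are made up. */

#include <stdio.h>

#define NUM_TESTS 4
#define NUM_BINS  8                     /* coverage bins, one bit each */

struct test {
    const char *name;
    unsigned    cov;                    /* bitmask of coverage bins this test hits */
    double      seconds;                /* simulation runtime */
};

static int popcount8(unsigned x)
{
    int n = 0;
    while (x) { n += x & 1u; x >>= 1; }
    return n;
}

int main(void)
{
    struct test tests[NUM_TESTS] = {
        { "smoke",       0x0F,  10.0 },
        { "random_long", 0x3F, 600.0 }, /* broad but slow                */
        { "corner_dma",  0xC0,  40.0 }, /* hits the hard-to-reach bins   */
        { "regress_old", 0x0F, 120.0 }, /* fully redundant with "smoke"  */
    };
    unsigned covered = 0;
    int used[NUM_TESTS] = { 0 };

    for (int rank = 1; rank <= NUM_TESTS; rank++) {
        int best = -1;
        double best_score = 0.0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (used[i]) continue;
            int gain = popcount8(tests[i].cov & ~covered);
            double score = gain / tests[i].seconds;
            if (gain > 0 && score > best_score) { best = i; best_score = score; }
        }
        if (best < 0) break;            /* whatever is left adds no new coverage */
        int gain = popcount8(tests[best].cov & ~covered);
        covered |= tests[best].cov;
        used[best] = 1;
        printf("rank %d: %-12s +%d bins in %.0f s\n",
               rank, tests[best].name, gain, tests[best].seconds);
    }
    printf("total: %d of %d bins covered\n", popcount8(covered), NUM_BINS);
    return 0;
}
```

On these made-up numbers, the fast smoke test ranks first, the slow but unique corner-case test second and the long random test third, while the fully redundant legacy regression never gets ranked at all – exactly the insight a single coverage percentage cannot provide.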

    Test ranking provides a unique and valuable view of the verification process. Aldec continues to expand Riviera-PRO, combining simulation, debug, and reporting into a single productivity tool for advanced verification.



    SoCs More Vulnerable to ESD at Lower Nodes
    by Pawan Fangaria on 03-11-2015 at 1:00 pm

Electrostatic discharge (ESD) has been a major cause of failures in electronic devices. As electronic devices have moved towards high-density SoCs accommodating an ever-increasing number of gates at lower process nodes, their vulnerability to ESD effects has only increased. Among the causes of ESD failures in SoCs, device breakdown and interconnect meltdown account for more than 70% of the overall failures. Long-term reliability is at stake due to such ESD issues, and they can even jeopardize first-silicon success.

Look at how the ESD design window for MOS devices has shrunk as the technology node moved from 130nm to 32nm. The margin between operating voltage and oxide breakdown voltage has continued to decrease, making devices more prone to breakdown due to ESD. High current flow through unintended paths due to any kind of ESD event (Human Body Model, Machine Model or Charged Device Model) can cause the device to fail. To protect devices from ESD, it is essential to introduce effective clamp circuits with the I/O and P/G pads that can handle large transient currents, provide efficient discharge paths for ESD currents and prevent any pin voltage from exceeding the oxide breakdown voltage.

In a DAC 2014 presentation I found that PathFinder from ANSYS is a state-of-the-art tool that can precisely check the resistances of the signal bus, power bus and power-to-ground path to help designers appropriately plan protection circuits against device breakdown due to any ESD event.

The interconnect, too, has become extremely vulnerable at lower technology nodes, with ESD current limits going down significantly. Any current crowding on ESD devices or insufficient wire width on ESD pathways can cause melting of the associated interconnect. Appropriate signal and power buses and clamps need to be planned to prevent interconnect meltdown.
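The arithmetic behind such a check is simple; the hard part, which is what a tool automates, is doing it exhaustively across every discharge path in a full chip. Here is a toy version in C – the HBM peak current follows from the standard 1.5 kohm HBM network, but the per-micron current limit and the drawn width are invented placeholders, not foundry numbers:

```c
/* Back-of-the-envelope version of an ESD path width check.
 * The numbers below are illustrative placeholders; a real check would
 * use the foundry's metal ESD current limits for each layer. */

#include <stdio.h>

int main(void)
{
    double i_peak_a       = 2000.0 / 1500.0; /* ~2 kV HBM through the standard 1.5 kohm resistor */
    double limit_a_per_um = 0.05;            /* assumed metal ESD current limit per um of width  */
    double drawn_width_um = 12.0;            /* width of the routed discharge path               */

    double min_width_um = i_peak_a / limit_a_per_um;

    printf("required width: %.1f um, drawn width: %.1f um -> %s\n",
           min_width_um, drawn_width_um,
           drawn_width_um >= min_width_um ? "OK"
                                          : "VIOLATION: risk of interconnect melt");
    return 0;
}
```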

ANSYS PathFinder can precisely define current density (CD) limits for interconnect and check them for the signal bus, power/ground bus, and power-to-ground path, thus helping designers appropriately plan the buses and power-to-ground paths and prevent their meltdown due to any ESD event.

    Today, an SoC can have multiple IPs and blocks under different power domains. This also can give rise to cross-domain ESD issues if not properly analyzed and taken care of.

Above is an example where an ESD discharge can pass through an unintended path, causing failure. The circuit must be analyzed for such eventualities and protected from ESD.

    PathFinder performs cross-domain CD checks, bus resistance checks, and clamp connectivity checks to prevent such ESD issues that can arise at the boundaries of different domains.

Thus PathFinder provides a powerful solution for accurate ESD analysis and prevention of ESD issues in both IP and SoCs. It takes the layout, technology information, SPICE netlist (including clamp models), and ESD rules as inputs, and then performs all kinds of checks, including resistance checks, interconnect failure checks, layout connectivity checks, and dynamic CDM checks for IP. PathFinder can be used from the very early stages of a design – I/O pad planning, I/O ring planning, and floorplanning – through final sign-off. It has an easy-to-use GUI for debugging, finding root causes of issues, and fixing them. The tool is robust, with the capacity to handle full-chip analysis including the package.

This ESD solution using PathFinder is part of the ESDA (ESD Association) reference flow and the TSMC reference flow.


    Innovus: Cadence’s Next Generation Implementation System
    by Paul McLellan on 03-11-2015 at 7:00 am

Yesterday was the first day of CDNLive. There were three keynotes. The first was by Lip-Bu Tan, Cadence's CEO (and, as he will be the first to remind you, Chairman of Walden International). The most interesting tidbit was that Cadence now has over 1000 people working on IP and that it represents 11% of their revenue. Then he announced Innovus, Cadence's next generation of physical design (much more below).

    The second keynote was by Simon Segars, the CEO of ARM. He painted a vision of how the mobile phone will eventually become the only device you need, holding your plane tickets, passport, car keys, house keys, thermostat control and so on.

He also outlined how the datacenter environment is changing from simply being mobile devices and huge cloud datacenters to having intelligence distributed through the network too. Of course, Intel does not necessarily have the ideal products for this, and it is a big opportunity for 64-bit ARM. Although ARM does not have the single-thread performance of a high-end Intel processor, it has much lower power, lower cost, easier integration and thus smaller physical size, and very high throughput due to the large number of cores possible.

    But the most interesting keynote was the third, by Anirudh Devgan, SVP of Digital and Signoff. I think that means that if it ends in “-us” then he has it in his organization. And he has one more product in his portfolio after today, Innovus. This is Cadence’s next generation physical design, rebuilt from the ground up and tied in tightly with all the already-announced analysis tools such as Tempus and Voltus. The key big picture numbers are that it is 10-20% better PPA than Encounter and 5-10X faster.

Anirudh jokingly said that when he joined Cadence, if he went to a bar people would say "Cadence is pretty good, but they need to improve their placement." Well, Innovus has a completely new placement engine, GigaPlace, and a new optimization engine, GigaOpt. There is also a new advanced clock design system based on the Azuro acquisition from a couple of years ago.

    Innovus has been with initial customers for quite some time, so that now that the public announcement is here it already has success stories and a track record of results.


On run-time, above are several designs. The top one is 9.3M cells at 28nm, where the speedup is almost ten times: from 700 hours to 70 hours in round numbers. The last design is a mobile SoC which went from 150 hours to 29 hours (not quite scraping in under a day) for a 5X speedup.


Running fast is nice, but ultimately results are what count: power, performance, area or PPA. The above graph shows a microprocessor design (no idea whose, but the previous methodology was manual, which has to narrow the field quite a bit; being 16nm rather than 14nm means it is not the company I first thought of). The old manual design took a long time and gradually crept closer to the goal (the hard-to-see blue line). With Innovus, not only was the goal reached much faster, it was exceeded (the red line).

    There are rumors around that Cadence has won Apple away from Synopsys. Of course nobody can comment on this. When I heard the rumor it made little sense since Cadence is also rumored to have lost pretty much every benchmark below 28nm with Encounter so it seemed unlikely Apple would be an anomaly. But if it is Innovus then the rumor is much more credible.

ARM have been using it with the Cortex-A72 (their new core announced a month ago). As Noel Hurley, general manager of the CPU group, said: "At ARM, we push the limits of silicon and EDA tool technology to deliver products on tight schedules required for consumer markets. We partnered closely with Cadence to utilize the Innovus Implementation System during the development of our ARM Cortex-A72 processor. This demonstrated a 5X runtime improvement over previous projects and will deliver more than 2.6GHz performance within our area target."

    Apologies for the poor quality of the images. I wasn’t given a copy of the presentations so these are all photos of the screen. But I figured getting the information out quickly is more important.

Details of Innovus are already on the Cadence website here.


    On-Chip Power Integrity Analysis Moves to the Package
    by Tom Simon on 03-11-2015 at 1:00 am

Power regimes for contemporary SOCs now include a large number of voltage domains. Rail voltages are matched closely to the performance and power requirements of various portions of the design. Indeed, some of the supply voltages are so low that the noise margins in those domains are exceedingly small. Higher-voltage domains still have tight power rail noise margins because of performance needs. In either case, designers have good reason to be concerned with power integrity: if the dynamic operation of the chip causes power or ground excursions, chip operation could be imperiled.

    EDA companies have been working on di/dt on-chip analysis solutions for a long time. When I was at Sequence around 2002, this was a hot and heavy area of product planning and development. Sequence was ultimately acquired by Apache, which has since become part of ANSYS. But the initiative that emerged back then is still going full force.

ANSYS has long had a solid and commercially well-accepted offering in this space – RedHawk. But as a recent white paper from ANSYS points out, looking at power integrity on the chip by itself, assuming ideal connections to the power pins, is not sufficient to ensure design success. The package plays a significant role in determining the power integrity of a design. The power grid is fed by numerous C4 bumps on the die, and there is nothing ideal about power delivery through today's flip-chip packages.

ANSYS points out that the package nets need to be modeled individually, not lumped together or aggregated, and the modeling should include full RLCK effects as well. The approach that ANSYS's RedHawk-CPA (chip package analysis) uses is to extract the package with a 3D FEM extractor. A big benefit of this extraction-based approach is that it yields highly accurate results in a SPICE netlist format, instead of the S-parameters you would get from a full field solver. Field solvers are necessary in the RF and high-frequency range, but for power integrity the extraction approach is fine, and offers real benefits.

S-parameter data have return-path information embedded in them, so isolating power and ground issues is complicated. S-parameters are good for frequency-domain analysis, but what is needed here is transient analysis, which RLCK models excel at. The RedHawk-CPA flow, used in conjunction with RedHawk, provides for dynamic or static analysis of power integrity in various modes of operation. The difference when package information is included is dramatic.
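A first-order reminder of why the package matters so much (the textbook view, not a formula from the white paper): the supply droop seen at the die is roughly

```latex
\Delta V \approx I \cdot \left( R_{die} + R_{pkg} \right) + L_{pkg} \frac{di}{dt}
```

so with the fast di/dt of a large SOC, even a small effective package inductance can consume a meaningful share of an already-thin noise margin – which is exactly what an idealized power-pin assumption hides.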

    Reading in package data is made simple for the IC designer with the ability to import SIP, ODB++ and MCM formats. Bump net assignments are handled by reading in a PLOC file. Of course this can be modified manually as well. By working with actual bump impedance values and package level decaps, RedHawk-CPA affords more realistic voltage drop and dynamic power grid performance results than would be achieved with estimates or less rigorous analysis.

    At the end of the process a chip level model for power grid analysis is created that the package designer can use to optimize the package.

There are some nice visualizations available after the analysis; I am including some below. The resulting data can be used to visualize voltages, current maps, or waveforms. It also supports near- and far-field EMI computations, which is where ANSYS's multiphysics expertise comes in.

    SOC designs are operating with speed and power requirements that push them to the limit of present technology. The main side effect of this is that nothing can be designed without consideration of other elements of the whole system. To read more about RedHawk-CPA for including package information in your power integrity analysis download this white paper.


    Altera 14nm and 10nm Update!
    by Daniel Nenni on 03-10-2015 at 7:00 pm

    In preparation for this blog I Googled around to get the latest information made available by Altera to see if it matches up with what I know from discussions amongst the fabless semiconductor ecosystem companies. Unfortunately when I Googled Altera+20nm+14nm the first three entries from the Altera website were Error 404 Page Not Found. Just a heads up, no big deal, sitemap errors happen to the best of us.

    The real reason for this blog was to make it perfectly clear that I do not now nor have I ever believed that Altera would leave Intel for TSMC at 10nm. The Altera CEO made an unfortunate comment during the last conference call that Paul McLellan mentioned in his blog Altera Back to TSMC at 10nm? Xilinx Staying There. To me this was a leadership fail more than anything else. Publicly shaming your foundry partner is really a bad idea. It is like publicly shaming the chef before your meal has arrived.

    Here are my top 10 reasons why Altera will NOT leave Intel at 10nm:

1. Millions of dollars have been invested in Intel design enablement
2. Intel has a density advantage at 14nm and again at 10nm
3. Moving from Intel 14nm to Intel 10nm should not be difficult
4. There is no way Intel Custom Foundry would allow it to happen
5. Intel's CEO endorsed the Altera relationship at the Goldman Sachs Conference
6. If the Intel relationship fails it could cost the Altera CEO his job
7. Altera is at a competitive disadvantage using the same foundry as Xilinx
8. Altera would be better off moving to Samsung Foundry than to TSMC
9. Because I said so
10. Altera hired the Intel FinFET design implementation team from Tabula

Tabula was the first Intel Custom Foundry customer, right? Seriously, would Altera have hired the Tabula design implementation team if they were switching back to TSMC? Great move by Altera hiring a team with serious FPGA FinFET experience, by the way. Xilinx should have been all over the Tabula carcass the day SemiWiki broke the story: Tabula Closes its Doors

If you think I'm wrong please say so in the comments section, but I'm not, so you probably shouldn't waste your time.

The other thing mentioned on the conference call was that Altera has not taped out at 14nm yet. From what I hear it won't happen until later in Q2, so do not expect to hear champagne bottles popping on the coming conference call (April). What is the problem, you ask? Let me share my opinion, observation, and experience, as we bloggers do:


    In any big fabless semiconductor company there are different design implementation teams. Let’s call them the A-Team and the B-Team. The A-Team does the first implementation and the B-Team does the follow-on design iterations. From what I understand the Altera A-Team, which is here in Silicon Valley, was still busy with 20nm so the Altera Penang (Malaysia) team was tasked with 14nm and the result was a delayed schedule. Now that Altera has the Tabula design implementation team here in Silicon Valley they will finish the 14nm tape-out then start Intel 10nm. Just my opinion of course.

    Also Read: 2015, the Year of the Sheep…And the 16nm FPGA