
Xtensa core in Qualcomm low-power Wi-Fi
by Don Dingee on 08-31-2015 at 4:00 pm

Wi-Fi has this reputation as being a power hog. It takes a relatively big processor to run at full throughput. It is always transmitting all over the place, and it isn’t very efficient at doing it. Most of those preconceived notions arose from older chips targeting the primary use case for Wi-Fi in enterprise and residential environments. There is a wall-powered, always-on access point for high performance clients like PCs and smartphones to connect with, streaming data faster and faster.

Those Wi-Fi SoCs were optimized for performance, not power. The pervasiveness and ease of IP connectivity drove many companies to try to reduce Wi-Fi power and footprint to make it more compatible with embedded devices, and more recently, maker modules and IoT devices. Some have been more successful than others – particularly Atmel, Electric Imp, GainSpan, Microchip (from the ZeroG acquisition), Redpine Signals, Silicon Labs, and TI come to mind.

There is a new specification in development called IEEE 802.11ah. It changes frequencies from the 2.4 or 5GHz bands into the sub-GHz band, giving it better range and an improved link budget. Several power-saving mechanisms are being considered, such as Target Wake Time which allows napping until a transmit slot arrives. Proponents of other wireless sensor network technology rained all over the idea, saying competing specifications already deliver the same benefits without all the hassle.

However, none of the companies we’ve mentioned so far are Qualcomm.

One of the more fascinating aspects of our research for Chapter 9 of our upcoming SemiWiki book on mobile has been just how much resistance Qualcomm got from industry naysayers in the early days of CDMA. It was said to “violate the laws of physics”, that it could never be shrunk into a mobile handset, and that it wouldn’t work reliably under real-world signaling and loading. Fortunately, there were a few supporters like William C. Y. Lee who understood how powerful digital encoding technology and the Viterbi decoder were. We see how that turned out, though it did take a few years and a detour to another processor after the first six million chips.

Klein Gilhousen of Qualcomm said a few years back, “We have always recognized that the key to feasibility of CDMA was VERY aggressive custom chip design.” We have been saying for some time here at SemiWiki that we are waiting for SoCs specifically designed for IoT tasks to appear. If there is a company that can pull off low-power Wi-Fi designs, it’s Qualcomm – just substitute “IoE” (the term Qualcomm prefers) for “CDMA” in Gilhousen’s statement.

We are seeing what the Atheros acquisition does for Qualcomm. In a world where mobile growth rates are settling down and IoT is the next field of opportunity, Qualcomm has chips for low-power Wi-Fi. Note these designs are not for 802.11ah as of yet – Qualcomm is a sponsor of the spec, but it is still in early development.

Rather, Qualcomm has taken experience from the CDMA portfolio and added new ideas for Wi-Fi in existing 2.4/5GHz settings, including its AllJoyn thin client software. For example, let’s look at the QCA4004, implementing 802.11n 1×1 SB/DB low power Wi-Fi. It is flexible enough for three Wi-Fi use cases:

The first is the traditional one, a peripheral to a host. The second is the embedded case, with a local MCU. The third is the IoE case where the onboard core is powerful enough to run AllJoyn.

Atheros first licensed the Tensilica Xtensa core back in 2005, way before Cadence acquired Tensilica in 2013. The QCA4004 was introduced almost two years ago, and its press release and datasheet have no details on the processor core inside. A more recent Qualcomm presentation at IoE Day 2014 in Shenzhen reveals a Cadence Xtensa core at the center of the QCA4004, running at 130MHz with the ThreadX RTOS from Express Logic. That allows the same software architecture to be used whether an external MCU is present or not, including a dual IPv4/IPv6 stack (take that, TI) and AllJoyn API.

The Qualcomm influence is more evident in two areas of the QCA4004 design. The first is the antenna diversity feature, with an RSSI algorithm in hardwired DSP logic to determine which antenna gives the best results. The second is a Green Tx feature, allowing the Tx power to be reduced to the lowest acceptable level for a given throughput. The active power figures for the QCA4004 are worth noting. At 2.4GHz, transmit current at 18dBm output is 230mA, while receive current is 60mA. Sleep mode gets down to 130µA.
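To put those currents in perspective for a battery-powered IoT node, here is a minimal back-of-the-envelope sketch. The Tx/Rx/sleep currents are the figures quoted above; the duty cycle and the 2000mAh battery are illustrative assumptions, not figures from Qualcomm.

```python
# Hypothetical duty-cycle estimate for a QCA4004-class low-power Wi-Fi node.
# Tx/Rx/sleep currents are from the article; duty cycle and battery capacity
# are illustrative assumptions only.

TX_MA, RX_MA, SLEEP_MA = 230.0, 60.0, 0.130   # mA (sleep = 130 uA)

def average_current_ma(tx_frac, rx_frac):
    """Average current for a node transmitting tx_frac and receiving rx_frac
    of the time, sleeping the rest."""
    sleep_frac = 1.0 - tx_frac - rx_frac
    return TX_MA * tx_frac + RX_MA * rx_frac + SLEEP_MA * sleep_frac

# Example: a sensor that spends 0.1% of its time in Tx and 0.4% in Rx.
avg_ma = average_current_ma(0.001, 0.004)
battery_mah = 2000.0                           # assumed 2000 mAh battery
hours = battery_mah / avg_ma
print(f"average current: {avg_ma:.3f} mA, battery life: {hours / 24:.0f} days")
```

Even this crude estimate shows why the sleep current, not the peak Tx current, dominates battery life at IoT duty cycles.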

The core choice of Cadence Xtensa puts Qualcomm in new territory for smaller, cheaper, more configurable chips for low-power Wi-Fi. Naysayers beware. While other solutions might have traction in industrial IoT environments, the combination of Wi-Fi ubiquity, the right chip design expertise, AllJoyn software, and an ecosystem that includes Microsoft makes Qualcomm a strong player in consumer IoT use.

We’re tracking what else Qualcomm is up to as they remake the company.


Optimizing SRAM IP for Yield and Reliability
by Daniel Payne on 08-31-2015 at 12:00 pm

My IC design career started out with DRAM at Intel, and included SRAM embedded in GPUs, so I recall some common questions that face memory IP designers even today, like:

  • Does reading a bit flip the stored data?
  • Can I write both 0 and 1 into every cell?
  • Will read access times be met?
  • Does the cell retain its data as the supply voltage is lowered?
  • How does my memory perform across variation?

If you are buying your SRAM IP then maybe you don’t have to be so concerned about these questions; however, the circuit designers responsible for the design and verification of memory IP are very focused on getting the answers. Consider the typical, six-transistor SRAM bit cell in a CMOS technology:

The basic memory element has a pair of cross-coupled inverters (devices PD1, PL1 and PD2, PL2) to store a 1 or 0; to read or write the cell you activate the Word Line (WL) through NMOS devices PG1 and PG2. A circuit designer chooses device sizes and can start to optimize this SRAM bit cell by running a worst-case analysis, where one objective is to minimize the read current along the path shown above by the red arrow. During this worst-case analysis the Vth values for devices PG1 and PD1 are adjusted iteratively, your favorite SPICE circuit simulator is run, and then the results can be determined by a tool like WiCkeD from MunEDA:

The axes of this plot are the variations in Vth for devices PG1 and PD1, while the circles represent the amount of variation measured in sigma. The read current contour values are shown as dashed lines. To minimize the read current we look at the intersection of the dashed lines and circles, so there’s a red dot showing our worst-case point at a 6-sigma distance.
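As a rough illustration of that worst-case search (not MunEDA’s actual algorithm), the sketch below scans a 6-sigma circle in the plane of local Vth shifts for PG1 and PD1 and keeps the point with the lowest read current. A real flow would call a SPICE simulator at each point; here a toy square-law model stands in, and the sigma and device values are made up.

```python
import math

# Illustrative worst-case search on a 6-sigma circle of local Vth variations
# for PG1 and PD1. The read-current model is a toy square-law expression,
# standing in for a SPICE simulation of the bit cell.

SIGMA_VTH = 0.030      # assumed 30 mV local Vth sigma per device
VGS, VTH0, K = 0.9, 0.45, 1.0e-3

def read_current(dvth_pg1, dvth_pd1):
    """Toy model: read current limited by the weaker of the PG1/PD1 overdrives."""
    ov_pg1 = max(VGS - (VTH0 + dvth_pg1), 0.0)
    ov_pd1 = max(VGS - (VTH0 + dvth_pd1), 0.0)
    return K * min(ov_pg1, ov_pd1) ** 2

def worst_case_on_circle(n_sigma=6.0, steps=3600):
    """Scan the n-sigma circle and return the point with minimum read current."""
    worst = None
    for i in range(steps):
        theta = 2.0 * math.pi * i / steps
        d_pg1 = n_sigma * SIGMA_VTH * math.cos(theta)
        d_pd1 = n_sigma * SIGMA_VTH * math.sin(theta)
        i_read = read_current(d_pg1, d_pd1)
        if worst is None or i_read < worst[0]:
            worst = (i_read, d_pg1, d_pd1)
    return worst

i_min, d_pg1, d_pd1 = worst_case_on_circle()
print(f"worst-case read current {i_min*1e6:.1f} uA at "
      f"dVth(PG1)={d_pg1*1e3:+.0f} mV, dVth(PD1)={d_pd1*1e3:+.0f} mV")
```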

Related – High Sigma Yield Analysis and Optimization at DAC

Moving up one level of hierarchy from a single memory cell to the actual memory array we have a typical architecture that combines cells into columns, where at the bottom is a mux, equalizer and Sense Amp (SA) circuit:

To run a hierarchical analysis and understand how variation affects this SRAM there are challenges:


  • For each SA there are N cells
  • In Monte-Carlo sampling we have to take N cell samples for each SA sample
  • Count the failure rate whenever t_SA > t_max

    Brute Force Sampling
    One approach is brute force sampling, so if our SRAM had 256 cells per SA and we wanted 10,000 results, that would require 256 * 10,000 simulation runs in a SPICE circuit simulator. Using a brute force Monte-Carlo analysis for SRAM design isn’t really feasible for anything beyond 4 sigma, because it would require millions to billions of SPICE runs, something that we don’t have enough time to wait for.
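A minimal sketch of that nested sampling, with a toy Gaussian delay model standing in for the SPICE runs and made-up sigma values, shows both the failure-counting rule from the list above and why the run count explodes:

```python
import random

# Toy brute-force nested Monte-Carlo for an SRAM column: for every sense-amp
# sample we draw N cell samples and count a failure whenever t_SA > t_max.
# The delay model and sigmas are illustrative stand-ins for SPICE runs.

N_CELLS_PER_SA = 256
N_SA_SAMPLES   = 10_000
T_NOMINAL_NS, SIGMA_CELL_NS, SIGMA_SA_NS = 1.00, 0.05, 0.08
T_MAX_NS = 1.35

random.seed(1)
fails = 0
spice_runs = 0
for _ in range(N_SA_SAMPLES):
    sa_offset = random.gauss(0.0, SIGMA_SA_NS)           # one SA sample
    for _ in range(N_CELLS_PER_SA):                      # N cell samples per SA
        cell_delay = random.gauss(T_NOMINAL_NS, SIGMA_CELL_NS)
        spice_runs += 1
        if cell_delay + sa_offset > T_MAX_NS:             # t_SA > t_max
            fails += 1

total = N_SA_SAMPLES * N_CELLS_PER_SA
print(f"{spice_runs:,} 'SPICE' runs, failure rate = {fails / total:.2e}")
```

Even these 2.56 million samples only resolve failure rates down to roughly the 10^-4 range; pushing to 6 sigma would need billions of runs.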

    Scaled Sampling Approximation
Another method is for the circuit designer to use only 4-sigma variation in the SA while the memory cell uses 6 sigma. This approach takes less effort than brute force and is easier to run; however, it yields an incorrect approximation.

    Two Stage Worst-Case Analysis (WCA)
The approach recommended by MunEDA is to first calculate the 6-sigma worst-case condition of the cell using the voltage on the bit line (VBL) at the moment that the SA is enabled. The second stage is to calculate the 4-sigma worst-case condition of t_SA for the sense amp, equalizer and MUX. Here are two charts showing the SA offset versus cell current for a variation where the worst-case point is in spec (green region), and out of spec (red region).


    Worst-case point is in spec, then out of spec
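To make the two-stage idea concrete, here is a rough numerical sketch. The linear cell-current-to-VBL model, the sigma values and the SA offset figure are all illustrative assumptions, not MunEDA’s implementation, but the structure mirrors the description above: stage one produces a 6-sigma worst-case VBL, and stage two checks it against the 4-sigma sense-amp path.

```python
# Two-stage worst-case analysis sketch. Toy linear models stand in for SPICE;
# sigma values and coefficients are illustrative assumptions.

SIGMA_CELL_UA = 2.0          # cell read-current sigma (uA)
I_CELL_NOM_UA = 20.0
VBL_PER_UA    = 5.0e-3       # bit-line differential developed per uA of cell current (V)

# Stage 1: 6-sigma worst-case cell -> smallest bit-line differential at SA enable.
i_cell_wc = I_CELL_NOM_UA - 6.0 * SIGMA_CELL_UA
vbl_wc = i_cell_wc * VBL_PER_UA            # worst-case VBL handed to stage 2

# Stage 2: 4-sigma analysis of the SA/equalizer/mux path with VBL fixed at vbl_wc.
SIGMA_SA_OFFSET_V = 8.0e-3
sa_offset_wc = 4.0 * SIGMA_SA_OFFSET_V
margin = vbl_wc - sa_offset_wc             # positive margin -> in spec
print(f"worst-case VBL = {vbl_wc*1e3:.1f} mV, "
      f"4-sigma SA offset = {sa_offset_wc*1e3:.1f} mV, "
      f"margin = {margin*1e3:.1f} mV ({'in spec' if margin > 0 else 'out of spec'})")
```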

    You can also compare a sampling approach against the two stage WCA by looking at the following charts for SA offset versus cell current:


The sampling approach estimates the failure rate from sampling points: red dots for failing and green dots for passing. On the downside, sampling relies on tail distribution accuracy and suffers from sampling error. The distribution of local variation variables in the tails beyond 5 sigma is not well characterized, so the true tail distribution in silicon can differ significantly from the ideal Gaussian distribution used inside the model files. Running a global process Monte-Carlo does not guarantee coverage of the full corner range that can be seen in silicon.

So a large local plus global Monte-Carlo run is infeasible because of long run times, and it is sensitive to distribution errors. Even speeding up Monte-Carlo is not really sufficient, because it will just produce unsafe results in a shorter period of time. So what we really need is a new method that can:


  • Handle the large, structured, hierarchical netlist of SRAM arrays.
  • Adjust conservatism in the local variation tails
  • Run quickly, so that local variation analysis can be repeated over multiple PVT corners, design options and layout options


With two-stage WCA we estimate the failure rate with a large-sample Monte-Carlo approximation in the pink region, use a conservative estimate in the pink plus green region, and show the worst-case point check passing as a green dot. The tool flow GUI in WiCkeD makes it quite easy for a circuit designer to specify their own memory array size and target failure rate, and to trade off the array failure rate against read time:

Comparing all three analysis techniques for a 28nm SRAM block shows how the two-stage WCA approach uses the least CPU effort in SPICE circuit simulations, scales well to high-sigma analysis, and gives results close to full Monte-Carlo:

    Related – When it comes to High-Sigma verification, go for insight, accuracy and performance

    Summary

    It is possible to design, analyze and optimize SRAM IP blocks using a two stage WCA approach, while taking much less circuit simulation time than a brute force Monte-Carlo, and at comparable accuracy. All you need to add into your existing EDA tool flow is the WCA capabilities in the MunEDA WiCkeD tool.

    To find out more about MunEDA, there’s a 30 minute webinar coming up on September 9th at 10AM (PST), register here.


    Michael Sanie Plays the Synopsys Verification Variations
    by Paul McLellan on 08-31-2015 at 7:00 am

I met Michael Sanie last week. He is in charge of verification marketing at Synopsys. I know him well since he worked for me at both VLSI Technology and Cadence. In fact his first job out of college was to take over support of VLSIextract (our circuit extractor), which I had written. But we are getting ahead of ourselves.

    Michael was born in Iran and came to the US as a teen. He was a very good pianist, good enough to go on music tours and appear on TV shows, an experience he describes as “not necessarily recommended for any kid”. He expected to go to a music conservatory but instead he went to Purdue and studied CEE (what everyone calls CS/EE today). As he likes to say “I studied to be a conductor, and I must have been only semi-good, as I ended up in semi-conductors.”

    While supporting VLSIextract and more, Michael went to Santa Clara and got an MBA and also made the transition from engineering into marketing.

    In 1999 he joined Numerical Technologies, one of the many DFM startup companies of that era, doing marketing and business development. It was the most successful of those DFM startups, going public in 2000 and being acquired in 2003 by Synopsys for $250M.

Next stop was Cadence where he did various things, including, as I mentioned, working for me for a time in strategic marketing.

    After a couple of years at Cadence, Michael moved to Calypto to be their first VP marketing. They were bringing to market sequential formal verification technology, which was new and innovative technology. Michael drove the first deals and also created a partnership with Mentor which eventually ended up with Mentor taking a majority position in the company and adding their Catapult synthesis to the product portfolio. Of course Calypto still exists today.

Michael joined Synopsys almost six years ago, in the Verification business group, which has grown steadily and now offers simulation, emulation, static analysis, formal, debug and more.

    Michael reckons that verification is an area where lots of innovation is happening, at least partially because you can never have enough verification. There has been a big shift in the last few years towards emulation, which has gone from an esoteric segment for the occasional group who can afford a seven-figure sum to part of the mainstream. It has also turned out to be the complementary piece to virtual prototyping.

A lot of the focus of verification these days is to make sure software can be run before tapeout. This is firstly so that the software engineers can start development before silicon is back from the fab. But it is also because if your chip has to run, say, Android, then you really really want to boot Android on a model of the chip before tapeout, no matter how much verification has already been done. Though 90%+ of bugs are still found by simulation, booting Android takes around 40B cycles, so you are going to need emulation for it.

    The next step looks like systematic power analysis in the context of verification, in particular emulation with power. So not just “can I boot Android?” but “can I boot Android without frying the chip?”

    Another challenge is how best to intelligently combine formal and simulation. If formal verification has proved something, then it is not necessary to prove it again with simulation. On the other hand, simulation results can be intelligently used to lead to good assertions for use in formal.

Synopsys recently expanded its verification investment by acquiring Atrenta, and now they are looking at how to combine static analysis with other techniques to get results faster. One great thing about SpyGlass (Atrenta’s anchor platform) is that it is used on RTL when the design has only just started to be developed, long before it is in any sense complete. It gives early feedback on power, DFT and physical issues while the RTL is being created.

Verification does have a big and direct impact on time-to-market, especially due to software bringup. Compared to the days when you had to wait for the chip before software development started, today the schedule can be pulled in by weeks if not months.

On emulation, as you probably know, Synopsys acquired EVE a couple of years ago. The ZeBu hardware technology was and is based on Xilinx FPGAs so that they can leverage Xilinx’s R&D investment with faster chips every 2-2½ years, and with lower cost than designing their own chips. Many industry leaders are aligning with this strategy as it enables them to keep up with their demands for performance, capacity, and cost. Historically one challenge with emulation (any emulator) is that it took a good amount of time and effort (including possible changes to the RTL) to bring a design up in emulation. Synopsys has unified the VCS compile process with ZeBu to significantly shorten this bring-up time. As another result of this project, ZeBu compile time is 3x faster (most designs take only 4 hours and often much less), and if it compiles in VCS it should compile on ZeBu. Synopsys is still #3 in emulation but is bridging the gap fast.

    VIP is another area that is doing well since the interfaces have got so complex that no design group can do verification on their own. For example, the latest USB 3 spec is insanely complex, and there are a dozen interfaces like that. There is an obvious synergy with Synopsys’s IP business since they are #1 in interface IP.

    Of course Michael is not going to make some product announcement in the middle of an interview like this, but he did say that, as you would expect, Synopsys is working on all of these areas. “Watch this space” as the saying goes.

Yes, Michael continues to study piano, although with his travel schedule he can’t exactly go on concert tours. He studied for many years (starting when he was at VLSI) with a former professor of the Vienna Conservatory. More recently, he’s been taking lessons at Stanford. It turns out that there are a surprising number of people at Stanford studying both music and engineering. Apparently math, music, language and computer science all use the same part of the brain, so often people who are good at one on the list are good at another.

    See also John Koeter: How To Be #1 in Interface IP
    See also Synopsys’ Andreas Kuehlmann on Software Development
    See also Antun Domic, on Synopsys’ Secret Sauce in Design
    See also Bijan Kiani Talks Synopsys Custom Layout and More


    China: drag on global semiconductor market?
    by Bill Jewell on 08-30-2015 at 7:00 pm

    The Chinese stock market (as measured by the Hang Seng Index) dropped 11% from August 14 to August 24 over concerns of a slowing economy. In reaction, the U.S. stock market (as measured by the S&P 500) dropped 11% from August 17 to August 25. The China market has since rebounded 2% while the U.S. market rebounded 5%. Will a slowing China drag down the global economy? China accounts for about half of the global semiconductor market. Will slowing semiconductor demand in China lead to a major slowdown in the global semiconductor market?

    Chinese electronics production data presents a mixed picture as shown below. Unit production three-month-average change versus a year ago shows mobile phones went negative in March 2015. Mobile phones also went negative in 2012 before bouncing back to strong growth in 2013. PCs turned negative in September 2014, reflective of the weak global demand for PCs. Televisions were negative in December 2014 and May 2015, but have generally had modest growth in 2015. In contrast to the volatility of unit production, the overall change in electronics production (communications equipment, computers and other electronics) as measured in Chinese yuan has been steadier. Overall electronics production growth has been below 10% for the last three months after averaging 12% for the years 2012 to 2014.


Over the last ten years, China GDP growth has been gradually slowing down. Following double-digit growth in 2006-2007, GDP dropped to 9.6% in 2008 and 9.2% in 2009 – still strong growth especially since most of the rest of the world was experiencing a major recession. Growth picked back up to 10.4% in 2010 and moderated to 7.4% in 2014. The International Monetary Fund (IMF) projects China GDP will continue to decelerate to 6.0% in 2017 before increasing to 6.3% in 2019 and 2020. China electronics production growth (as measured in yuan) has averaged 4 percentage points above GDP growth from 2006-2014. Our forecast at Semiconductor Intelligence (SC IQ) is electronics will grow in the 9% to 10% range through 2020.


Thus China should still be a major growth driver of the global economy and semiconductor market for at least the next few years. The behavior of stock markets is almost impossible to predict and difficult to explain. Stock markets are influenced by economic factors, short term computer trading, greed and fear. We believe the recent behavior of the China and U.S. stock markets is not a sign of a significant slowing of the growth of the Chinese economy in the next few years.


    Simulating to a fault in automotive and more
    by Don Dingee on 08-30-2015 at 12:00 pm

    We’re putting the finishing touches on Chapter 9 of our upcoming book on ARM processors in mobile, this chapter looking at the evolution of Qualcomm. One of the things that made Qualcomm go was their innovative use of digital simulation. First, simulation proved out the Viterbi decoder (which Viterbi wasn’t convinced had a lot of practical value at first) prior to the principals forming Linkabit, then it proved out enhancements to CDMA technology (which was working in satellite programs) before Qualcomm launched into mobile. Continue reading “Simulating to a fault in automotive and more”


    4 Design Tips for AVB in Car Infotainment
    by Majeed Ahmad on 08-30-2015 at 7:00 am

    Audio Video Bridging (AVB) is a well-established standard for in-car infotainment, and there is a significant amount of activity for specifying and developing AVB solutions in vehicles. The primary use case for AVB is interconnecting all devices in a vehicle’s infotainment system. That includes the head unit, rear-seat entertainment systems, telematics unit, amplifier, central audio processor, and rear-, side- and front-view cameras.

    The fact that these units are all interconnected with a common, standards-based technology that is certified by an independent market group—AVnu—is a brand new step for the automotive OEMs. The AVnu Alliance facilitates a certified networking ecosystem for AVB products built into the Ethernet networking standard.


    AVB is an established technology for in-car infotainment

    According to Gordon Bechtel, CTO, Media Systems, Harman Connected Services, AVB is clearly the choice of several automotive OEMs. His group at Harman develops core AVB stacks that can be ported into car infotainment products. Bechtel says that AVB is a big area of focus for Harman.

    AVB Design Considerations

Harman uses Atmel’s SAM V71 microcontrollers as communications co-processors to work on the same circuit board with larger Linux-based application processors. The design firm writes code for the customized reference platforms that automotive OEMs need beyond the common reference platforms.

Based on his experience with automotive infotainment systems, Bechtel outlined the following AVB design dos and don’ts for automotive products:

1) Sub-microsecond accuracy: Every AVB element on the network is hooked to the same accurate clock. The Ethernet hardware should feature a timestamp unit to ensure packets arrive in the right order. Here, Bechtel mentioned Atmel’s SAM V71 microcontroller, which boasts screening registers for advanced hardware filtering of inbound packets, routing them to the correct receive-end queues.

    2) Low latency: There is a lot of data involved in AVB, both in terms of bit rate and packet rate. AVB allows low latency through reservations for traffic, which in turn, facilitate faster packet transfer for higher priority data. Design engineers should carefully shape the data to avoid packet bottlenecks as well as data overflow.

    Bechtel once more pointed to Atmel’s SAM V71 microcontrollers that provide two priority queues with credit-based shaper (CBS) support that allows the hardware-based traffic shaping compliant with 802.1Qav (FQTSS) specifications for AVB.
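For readers who have not worked with 802.1Qav, a simplified software model of the credit-based shaper shows how reservations spread Class A frames out on the wire. The link speed, idleSlope and frame size below are made-up values, and the V71 does this in hardware rather than software.

```python
# Simplified 802.1Qav credit-based shaper model. Credit accumulates at
# idle_slope while a frame is waiting (or credit is negative), is consumed at
# send_slope while a frame is on the wire, and a queued frame may only start
# transmitting when credit >= 0. All values here are illustrative assumptions.

LINK_BPS   = 100_000_000        # 100 Mbit/s link
IDLE_SLOPE = 25_000_000         # bandwidth reserved for the Class A queue
SEND_SLOPE = IDLE_SLOPE - LINK_BPS
TICK_S     = 1e-6               # 1 us simulation step

def simulate_cbs(frame_bits, arrivals_us, duration_us):
    """Return the start time (us) of each queued frame under CBS shaping."""
    credit, queue, starts = 0.0, [], []
    tx_remaining = 0.0                          # bits left in the frame on the wire
    for t in range(duration_us):
        queue += [frame_bits for a in arrivals_us if a == t]
        if tx_remaining > 0:                    # transmitting: consume credit
            credit += SEND_SLOPE * TICK_S
            tx_remaining -= LINK_BPS * TICK_S
        elif queue and credit >= 0:             # allowed to start the next frame
            tx_remaining = queue.pop(0)
            starts.append(t)
        elif queue or credit < 0:
            credit += IDLE_SLOPE * TICK_S       # replenish while waiting
        elif credit > 0:
            credit = 0.0                        # idle queue: positive credit resets
    return starts

# Three 1500-byte frames arrive back-to-back; shaping spreads their start times.
print(simulate_cbs(frame_bits=12_000, arrivals_us=[0, 0, 0], duration_us=3000))
```

With idleSlope set to 25% of the line rate, three back-to-back 1500-byte frames end up starting roughly 480µs apart, which is the shaping behavior that keeps Class A traffic from creating packet bottlenecks.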


    Gordon Bechtel: V71 MCU has a number of capabilities that directly aid in efficient AVB support

3) 1588 timestamp unit: IEEE 1588 hardware timestamping is needed for correct and accurate 802.1AS (gPTP) support, as required by AVB for precision clock synchronization. IEEE 802.1AS carries out time synchronization and is synonymous with the generalized Precision Time Protocol, or gPTP.

A timestamp compare unit and a large number of precision timer counters are key for the synchronization AVB needs for listener presentation times and talker transmission rates, as well as for media clock recovery.

    4) Tightly coupled memory (TCM): It’s a configurable high-performance memory access system to allow zero-wait CPU access to data and instruction memory blocks. A careful use of TCM enables much more efficient data transfer, which is especially important for AVB class A streams.

    It’s worth noting that MCUs based on ARM Cortex-M7 architecture have added the TCM capability for fast and deterministic code execution. TCM is a key enabler in running audio and video streams in a controlled and timely manner.

    AVB and Cortex-M7 MCUs

    The Cortex-M7 is a high-performance core with almost double the power efficiency of the older Cortex-M4. It features a 6-stage superscalar pipeline with branch prediction—while the M4 has a 3-stage pipeline. Bechtel of Harman acknowledged that M7 features equate to more highly optimized code execution, which is important for Class A audio implementations with lower power consumption.

Again, Bechtel referred to Atmel’s SAM V71 microcontrollers—which are based on the Cortex-M7 architecture—as particularly well suited for smaller ECUs. “Rear-view cameras and power amplifiers are good examples where the V71 microcontroller would be a good fit,” he said. “Moreover, the V71 MCUs can meet the quick startup requirements needed by automotive OEMs.”


    Atmel’s V71 is an M7 chip for Ethernet AVB networking and audio processing

The infotainment connectivity is based on Ethernet, and most of the time the main processor does not integrate Ethernet AVB, so an M7 microcontroller like the V71 brings this feature to the main processor. In the head unit it drives the face plate, and in the telematics control unit it contains the modem to make calls, where echo cancellation is a must and DSP capability is therefore required.

For instance, take the audio amplifier, which receives a specific audio format that has to be converted, filtered and modulated to match the requirements of each specific speaker in the car. So infotainment system designers need both Ethernet and DSP capability at the same time, which Cortex-M7 based chips like the V71 provide at low power and low cost.

    Also read:

    Atmel Tightens Automotive Focus with Three New Cortex-M7 MCUs

    3 Design Hooks of Atmel MCUs for Connected Cars


    IoT and OTP are Like Peanut Butter and Jelly!
    by Daniel Nenni on 08-29-2015 at 7:00 am

Have you ever had a peanut butter and bacon sandwich? Everything goes better with bacon! Which brings me to one of my favorite sayings: “(insert two complementary things) go together like peanut butter and jelly”. How about this: “low power and IoT”, “IoT and OTP”, and “Low Power OTP and Sidense go together like peanut butter and jelly!”

    Programmable memory started with PROMs in the 1950s and moved to antifuse one time programmable memory in the late 1960s. Texas Instruments brought OTP to MOS technology in the 1970s, Kilopass brought OTP to CMOS in 2001, and in 2005 Sidense introduced a low power split channel antifuse device. A Split Channel bit cell combines the thick and thin gate oxide devices into one transistor (1T) with a common polysilicon gate. That little history lesson was more for me than you by the way since I have not worked with non-volatile memory (NVM) since the Virage Logic days.

During my recent worldly travels FinFETs were all the rage, but it was repeatedly mentioned that 14nm and more importantly 10nm is “challenging” for both EDA tools and semiconductor IP. This time it was not just “will our design yield?”, which is always a concern, but also “will our IP work?” Getting the answers to those and other modern semiconductor design questions is of course the whole point behind the TSMC Open Innovation Platform Ecosystem Forum to be held this year on September 17th at the Santa Clara Convention Center. Remember, TSMC has completed 15 reference flows with 7,500+ tech files, 200+ PDKs, and more than 8,600 silicon-proven IP titles from 0.35µm to 10nm. If you have EDA or IP questions this is the place to be, absolutely.

    Back to OTP, one of the first TSMC IOP partner presentations is by Sidense R&D Director, Betina Hold, and is titled:

    Ultra Low Power OTP Design for Smart Connected Universe Applications

    Betina has spent the majority of her 25+ year career at ARM so she knows low power. Here is the abstract:

    Sidense innovative low-voltage Non-Volatile Memory (NVM) designs targeting TSMC Ultra Low Power (ULP) and FinFET process nodes enable a wide range of Smart Connected ICs, spanning several key market segments including IoT, mobile computing, wearable technology, automotive, industrial and medical.

    Smart Connected applications need embedded NVM to meet stringent power and reliability requirements. These requirements often include operation from low-voltage battery sources, extended battery life, and operation in safety-critical and/or harsh environmental conditions, and high reliability and extended temperature range are necessary attributes.

    This presentation will discuss how the latest OTP IP developments from Sidense address these demands with innovative designs and a 3D 1T-OTP bit cell developed for the most advanced TSMC process nodes.

    Along with the low-power properties of Sidense’s patented antifuse-based 1T-OTP bit cell, Betina will also discuss how the right macro design can result in low read voltages along with low power, critical attributes for many Smart Connected applications. She will also cover double-fin FinFET design which has shown significantly lower leakage current, higher programmed cell current, and very high read margin compared to 28nm/20nm bulk CMOS.

    I hope to see you there!


    Why You Really Need Chip-Package Co-analysis
    by Daniel Payne on 08-28-2015 at 12:00 pm

    There’s only one software company that I know of that covers four major disciplines: Fluids, Structures, Electronics and Systems. That company is ANSYS and when they acquired Apache Design Automation back in 2011 they filled out their products for electronics design, and more specifically in the area of integrated chip-package co-analysis. I just reviewed a presentation from ANSYS given at DAC back in June titled, Achieving Faster Power Design Closure with Integrated Chip-Package Co-analysis. The historic approach in EDA was to have separate tools for the IC designer, package designer and PCB designer, leading to silos of data that didn’t easily talk to each other, let alone do any co-analysis. Apache saw the opportunity and basically created a new category of EDA tool to support integrated chip-package co-analysis.

    Related – Will your next SoC fail because of power noise integrity in IP blocks?

    The actual co-analysis is for the Power Delivery Network (PDN) and thermal across the IC and Package domains combined. The premise is that at the IC level you cannot simplify and assume an ideal package for PDN and thermal, because you would be missing the interactions between IC and package, leading to under-design and failed silicon. Having to re-spin silicon is simply too expensive today, so having a co-analysis for PDN and thermal across IC and Package during the design phase helps ensure first silicon success.

    At ANSYS the software tool used for power closure is called RedHawk and it provides quite a wide range of checks for power noise and reliability:

    With the ANSYS approach you create both a chip model and a package model for PDN and thermal co-analysis. This then allows a designer to do a package-aware chip simulation, plus a chip-aware package optimization:

    The benefits of this co-analysis are many:

    • Measuring the package impact on IC
    • Knowing the IC impact on package
    • The IC and package can be co-designed, instead of designed separately, in less time
    • System level transient analysis
    • System AC impedance
    • System resonance is known
    • System decap requirements can be validated and optimized

    If you ran a transient analysis on your IC and didn’t include package modeling, then the simulated results would look much better than what silicon reported. Here’s a quick comparison of transient analysis for an IC without package models and with package models:

The hot-spots shown on the right, where the package models are included during transient analysis on the IC, clearly show that you must do co-analysis including both IC and package models to get accurate results. When you run IC and package co-analysis it provides:

    • Support for IR drop, DvD (Dynamic Voltage Drop) and Power-up analysis
    • DC-IR static analysis of the package
    • AC-hotspot analysis of the package

    Likewise, by adding package modeling during SoC analysis you get:

    • 3D full-wave accuracy
    • Chip/Package connection is automatically specified
    • Modeling of the power and ground supplies independently
    • Per-bump resolution granularity

    Related – How PowerArtist Interfaces with Emulators

Consider what happens if you assume that all your power or ground package bumps are lumped together versus modeled independently, per bump. With a lumped approach the bump voltage drop is only 13.8mV; however, with the per-bump model you get a more accurate worst case of 19.2mV.
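The reason a lumped model understates the problem is easy to see with a toy example. The per-bump currents and bump resistance below are hypothetical and unrelated to the 13.8mV/19.2mV figures above, but they show how lumping averages away the worst individual bump.

```python
# Toy comparison of lumped vs. per-bump voltage drop across package bumps.
# Currents and resistance are illustrative assumptions only.

bump_currents_ma = [10, 12, 9, 25, 11, 8, 14, 30]   # uneven per-bump currents
r_bump_ohm = 0.5                                     # series resistance per bump

# Lumped model: all bumps in parallel, sharing the total current equally.
total_ma = sum(bump_currents_ma)
r_lumped = r_bump_ohm / len(bump_currents_ma)
v_lumped_mv = total_ma * 1e-3 * r_lumped * 1e3

# Per-bump model: each bump sees its own current, so the worst bump dominates.
v_worst_mv = max(i * 1e-3 * r_bump_ohm * 1e3 for i in bump_currents_ma)

print(f"lumped drop: {v_lumped_mv:.1f} mV, per-bump worst case: {v_worst_mv:.1f} mV")
```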

    Summary

Divide and conquer is an approach that no longer works with IC and package design, so today you should consider using a co-analysis approach to get the most accurate results. ANSYS has been engineering software in our industry for a long time, with tools like RedHawk that support a chip-package co-analysis flow.


    Strategic Materials
    by Paul McLellan on 08-28-2015 at 7:00 am

    Here on SemiWiki we spend a certain amount of time discussing semiconductor equipment, especially the big sea-change items like EUV and 450mm, where everyone wants to know when (and if) they will happen. But there is another aspect to next generation processes other than equipment and that is materials. When the received wisdom is that Hafnium is important for transistors going forward, there need to be people whose reaction is not “Hafnium, great name for a heavy metal band” but how do we get it, how do we make it pure, how do we deliver it to fabs on multiple continents. If it is something that is required in large quantities, such as many gases, then the supply chain is even more complicated. In fact, the reality is that process advances depend on material innovation as well as equipment innovation.

On September 22nd-23rd the annual SEMI Strategic Materials Conference (SMC) takes place at the Computer History Museum. This year’s theme, Materials for a Smart and Interconnected World, will take a broad look at what is driving the demand for new materials, how material suppliers are being impacted by the value chain they serve, and how this affects the smart and interconnected world we live in.

The opening keynote on Tuesday morning is by Gary Patton. He was the head of R&D at IBM Semiconductor but decided to come over with the acquisition and is now the CTO and head of R&D for GlobalFoundries. His topic is The Importance of Accelerating Material Innovation. I always enjoy Gary’s keynotes since he has a deep technical knowledge but an ability to talk about technology in ways that are accessible to more than the deep specialists.

    Gary is followed by Mark Thirsk of Lynx Consulting.

The rest of the morning is taken up with the Economics/Material Trends session. Market forces that drive demand for semiconductor process materials involve not only the influence and demand from chip fabricators, but also end-use applications, largely influenced by consumer demand, locally and globally. The semiconductor business environment will be presented from various vantage points – from the materials perspective through the chip fabricator, to the global economic viewpoint. Information on materials and business trends for a broad range of semiconductor device technologies, and the driving forces behind these trends, will be presented.

In the afternoon it is a session on Material Enabling Silicon Everywhere. That’s material speak for IoT! Critical to the IoT vision is the interconnection of tens or hundreds of billions of systems that will supply information for analysis and action. These devices will combine existing and novel capabilities such as new types of sensors, low-power operation, energy harvesting, and interconnectivity to mobile or fixed wire communications. Many of these devices will build off well-known technologies, but “More than Moore” integration will likely be necessary to fulfill the vision. This session of SMC 2015 will review the process material requirements and device manufacturing implications of distributing IC-based devices to all aspects of our lives.

    From 5-7pm there will be a reception.


    The next day, 23rd, has 3 major sessions:

    • New Emerging Materials Technology and Opportunities at the Edge
    • Sustainable Manufacturing: Sustainability Considerations of Advanced Materials
    • Advanced Interconnect Technologies

    Finally, A View from the Fabs: Executive Panel Session. This consists of three short executive presentations followed by a panel session moderated by Kurt Carlsen of Air Liquide Electronics. The three executives are:

    • Vin Menon of Texas Instruments
    • Hans Stork of ON Semiconductor
    • Gary Patton (our keynoter from the day before) of GlobalFoundries

    The conference wraps up at 5pm.

    Full details of SMC are on the SEMI website here. Registration is discounted until September 11th.


    Moore’s law observations and the analysis for year 2019.
    by Vaibbhav Taraate on 08-27-2015 at 4:00 pm

As semiconductor professionals we are all familiar with Moore’s Law. Between 1965 and 1975, Gordon Moore observed and stated that the number of transistors in a dense integrated circuit doubles approximately every two years. In the present scenario, if we consider the complexity of integrated circuits and apply mathematical analysis with the fundamentals of physics and relativity theory, then for shrinking process nodes the law can be restated as: “Below 14nm, the number of transistors in a dense integrated circuit doubles approximately every 32 months.” This may hold true until 2019; the reasons are many: exponential logic depth and computational efficiency, low-power issues and needs, on-chip variation, latency, constraints at the system level, parallelism, noise margins, crosstalk, and so on.

It has been my observation and analysis over the past couple of years that at lower process nodes the real limitations are material properties, atomic spacing, and data transfer constraints arising from fabrication issues. The technology shift can happen with the evolution of the integrated circuit design and process flow, driven by the issues of shrinking process nodes and by the need for analytical, mathematical and numerical modeling at the system, architecture and even design levels.

At the engineering level the real bottleneck is specification complexity, plus implementation and validation of the design at the system level. Even the CAP theorem imposes a practical limitation on shrinking. According to the CAP theorem, “it is impossible for a distributed computer system to simultaneously provide consistency, availability and partition tolerance.” So there is a limit on the computing efficiency of SoCs at the system architecture level.

But the real limitation on shrinkage and computing performance comes from space, energy and time. If we consider Einstein’s theory of relativity, no particle can travel faster than the speed of light. Carrier mobility, limited by dielectric constants and the conductivity of the material, is the real limitation on information transfer between carriers. Another important limitation at shrinking process nodes is physical integration and synchronization to achieve parallelism with high computational efficiency.

Other important limitations at the device level are aging, leakage, interface and contact sizes, and delay variation. So the really challenging phase for semiconductor professionals is below the 10nm process node. The era of miniaturization may face real challenges at the 8nm process node, and the design and process flow may have to evolve.

Probably around the year 2019 one can expect a modified Moore’s Law observation, reflecting the technological shift and changes in design and manufacturing processes, where the number of transistors in a dense integrated circuit doubles approximately every 36 to 38 months; this pace may continue for almost a decade after 2019.
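As a quick sanity check on what those doubling periods imply, here is a small sketch comparing 24-, 32- and 38-month doubling over the decade after 2019; the 10-billion-transistor starting point is an arbitrary assumption.

```python
# Compare transistor-count growth under different doubling periods.
# The 2019 starting count is an arbitrary illustrative assumption.

START_YEAR, END_YEAR = 2019, 2029
START_TRANSISTORS = 10e9          # assume a 10-billion-transistor SoC in 2019

def projected_count(doubling_months, years):
    """Transistor count after `years` with the given doubling period."""
    doublings = years * 12 / doubling_months
    return START_TRANSISTORS * 2 ** doublings

for months in (24, 32, 38):
    count = projected_count(months, END_YEAR - START_YEAR)
    print(f"{months}-month doubling: {count / 1e9:.0f}B transistors by {END_YEAR}")
```

Stretching the doubling period from 24 to 38 months cuts the projected 2029 count by roughly a factor of 3.5, which is the practical meaning of the slowdown described above.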

Although there are limitations, we are still capable of designing and innovating complex SoCs. Let us hope for a great era of miniaturization!