
GPS Chronicle: The Early History

by Majeed Ahmad on 07-19-2015 at 4:00 am

There is really nothing new about GPS: the technology was reinvented from the old. After satellite communications was established, scientists and engineers started to look for different ways of utilizing this fascinating space marvel. Radio navigation systems such as Loran had been developed during World War II for ship and aircraft operations, and the same principles subsequently evolved into satellite-based navigation.

In 1958, the U.S. Navy began working on a satellite navigation system called “Transit” for indicating the position of a receiver on the ground. Two years later, the Navy launched the Transit-1B satellite to demonstrate the feasibility of using satellites as navigational aids. A receiver on a ship used the measured Doppler shift of the satellite’s radio signal, along with known characteristics of the satellite’s orbit, to calculate the ship’s position.


GPS has been reinvented from satellite communications

A practical system was born out of the need of U.S. troops to pinpoint their locations during the Vietnam War. However, this system had limited accuracy and was difficult to use because of its bulky terminals. So, in the mid-1970s, the U.S. Department of Defense began a project to upgrade the navigation devices built around this concept for classified military use. The solution they developed required two dozen satellites, atomic clocks, microwave radio transmitters and some heavy-duty number-crunching hardware.

A more portable unit could now pinpoint an object’s exact location anywhere on the globe by receiving signals from a network of satellites in orbit and triangulating them to determine latitude and longitude. The military called it Navstar, after the satellite constellation it used, but the industry and users ignored this nomenclature, and the technology became known to the world as the Global Positioning System, or GPS.

The operational system contained twenty-one active satellites plus three spares, distributed across six orbital planes. This collection of twenty-four GPS satellites orbited about twelve thousand miles above the Earth. The satellites constantly transmitted their precise time and position in space. With GPS, a receiver on the ground or in the air could calculate its position using time signals from the satellites.

The calculation itself was based on a kind of triangulation, a math technique used to locate an object based on its distance from three known points (strictly speaking, distance-based positioning of this kind is called trilateration). So signals from three satellites were necessary, although in practice a fourth satellite was used to correct the error in the receiver’s clock and so improve the accuracy of the other three measurements. The result was that a GPS receiver could produce highly accurate coordinates of latitude, longitude, and altitude.
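The distance-based calculation can be illustrated with a simplified two-dimensional sketch. It is not how a real GPS receiver is implemented: real receivers work in three dimensions and also solve for their own clock error. The function names `range_from_delay` and `trilaterate_2d` and the anchor coordinates are assumptions made up for this illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_delay(delay_s):
    """Convert a signal travel time in seconds into a distance in metres."""
    return SPEED_OF_LIGHT_M_S * delay_s


def trilaterate_2d(anchors, distances):
    """Locate a point from its distances to three known 2D anchor points."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two removes the
    # quadratic terms and leaves a 2x2 linear system in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")
    # Solve the linear system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    # Hypothetical anchors and a receiver located at (3, 4).
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    distances = [5.0, math.sqrt(65.0), math.sqrt(45.0)]
    print(trilaterate_2d(anchors, distances))
```

With the hypothetical anchors above, the recovered position is approximately (3, 4). In a real receiver each distance would come from a measured signal delay, as in `range_from_delay`, which is why an error in the receiver’s clock shifts all the ranges at once and a fourth measurement is needed to remove it.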


GPS was originally developed for military use

The U.S. Air Force played a crucial role in nurturing the GPS technology by incorporating features like accurate digital maps and satellite photographs. As a result, pilots were able to spot the key target areas and hit them effectively. Precision-guided munitions, dubbed “smart bombs,” increasingly used GPS to home in on a fixed target such as a military installation or an airfield.

Content of this article is based on excerpts from Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics.


TSMC (Apple) Update Q2 2015!

by Daniel Nenni on 07-18-2015 at 8:00 pm

The TSMC quarterly conference call was last week and of course it stirred up quite a bit of controversy. Let me share with you my experience, observations, and opinions and maybe together we can come up with an accurate prediction for 2016. First let’s take a look at 20nm and what people now call the “Apple effect.”

Correct me if I’m wrong here but this is how I remember it: The TSMC 20nm process was highly criticized for cost, power leakage, and yield prior to the arrival of the Apple A8 and A8X SoCs. As we now know, 20nm was the fastest ramping process in the history of TSMC and the A8-powered iPhone 6 is a huge success. This much is now well documented.

Next came TSMC 16nm. Unfortunately, the first 16nm process did not meet the expectations of the fabless semiconductor ecosystem when compared to Intel and Samsung 14nm. Intel 14nm was faster and denser and Samsung 14nm was lower power. This was clearly a misstep for TSMC but they learned from it and came back with 16FF+ (second generation FinFETs), which is now the best performing process of its kind. TSMC openly makes this claim but I have confirmed it with several early access IP and fabless companies and they would know. 16FF+ based mobile products will hit the market in Q4 2015 and you will be impressed, absolutely.

TSMC 16FF+ does use the same BEOL (back end of line, the second half of the chip manufacturing process) as 20nm. The FEOL (front end of line), however, is quite different. In fact, you will see a difference between the original TSMC 16nm and 16FF+, which has resulted in a significant PPA (performance, power, and area) improvement. So when Morris Chang claims that 16FF+, which is technically their second generation FinFET, will be an even faster ramp than 20nm, I believe it to be true.

As I predicted last year, Apple chose Samsung 14nm LPE for the iPhone 6S (A9 SoC) and TSMC 16FF+ for the iPads (A9X). I stand by that prediction even though on the conference call Morris Chang said that in 2016 TSMC 16nm market share will be much greater than “our next competitors.” Given that Apple and QCOM, TSMC’s two largest customers, are currently using Samsung 14nm, there is really only one way this prediction can come true: Apple and QCOM will use 16FFC (TSMC’s third FinFET generation) for their SoCs in 2016.

TSMC also mentioned that 10nm will be in production in Q1 2017, which supports the above prediction that the iProducts released in 2016 will not be 10nm. The other interesting thing to note is the PPA numbers for 10nm: a 15% speed gain at the same total power, or more than a 35% power reduction at the same speed, with a density of 2.2 times that of 16nm FinFET. I can tell you that Apple will not accept a 15% speed gain for a new process. I was told that the new 16FFC process due out mid 2016 was built “with” Apple, so I would expect the same for 10nm. 16FF+ provides 40% higher speed and 60% power savings over 20nm. My prediction is that the Apple version of 10nm for the 2017 iProducts will offer a minimum 25% speed increase.

Sound reasonable?

The conference call transcript is HERE.


Why Drones Love Atmel SAM E70

by Eric Esteve on 07-18-2015 at 7:00 am

Avionics is by nature a mature market that requires validated system solutions: safety is an absolute requirement, and innovative systems go through a stringent qualification phase. That’s why the very fast adoption of drones as an alternative to human-piloted planes is impressive. It took 10 years or so for drones to be widely developed and used for applications ranging from war to entertainment, with prices ranging from a few hundred dollars to several hundred thousand dollars. But even for consumer-oriented, rather cheap drones, the processing needs call for an MCU that is not only high performance but also versatile, able to manage a gyroscope, accelerometer, geomagnetic sensor, GPS, rotational station, 4 to 6 axis control, optical flow and so on.

When I was designing for avionics, namely the electronic control of the CFM56 engine (this jet engine, jointly developed by GE in the US and Snecma in France, was the worldwide leader, equipping Boeing and Airbus planes), the CPU was a multi-hundred dollar Motorola 68020, leading to a cost of $20 per MIPS! I don’t know the precise Atmel SAM E70 price (I would guess that it costs a few dollars), but what I do know is that the MCU offers in excess of 600 DMIPS. This very high performance, together with the very large on-chip memory (up to 384 Kbytes of SRAM and 2 Mbytes of Flash), is the main reason why this MCU has been selected to support the “Drone with integrated navigation control to avoid obstacle and improve stability”.

In fact the key design requirements for this application were: 600+ DMIPS, a camera sensor interface, dual ADC and PWM for motor control, dual CAN and a small package offering. Looking at the block diagram below helps link the MCU features with the various application capabilities: Gyroscope (SPI), Accelerometer (SPI x2), Geomagnetic sensor (I2C x2), GPS (UART), 1 or 2 channel rotational station (UART x2), 4/6 axis control communication (CAN x2), Voltage/current (ADC), Analog sensor (ADC), Optical Flow sensor (through the Image Sensor Interface, or ISI) and Pulse Width Modulator (PWM x8) to support the rotational station and 4/6 axis speed PWM control.

SAM E70 is based on the Cortex-M7, a versatile core that combines high performance with extensive peripheral sets supporting multi-threaded processes. In the future, this multi-thread support will open up many more drone capabilities than simply flying…

Today’s drones are able to fly or hold a stationary position and take pictures or movies, and it is already very impressive to see sub-kilogram devices offering such capabilities! But the drone industry is already preparing for the future, with the desire to get more application stacks into drones so they can take on automation, routing, cloud connectivity (when available), 4G/5G, and various optional connectivity to enhance data pulling and posting. Just imagine a small town of a few thousand inhabitants which, for a couple of days or weeks per year, because of a special event or simply holidays, suddenly receives a hundred thousand people. These people want to feed their smartphones with multimedia or share the live experience by sending movies or pictures, most of them at the same time. The 4G/5G and cloud infrastructure is not sized for such a crowd, so the communication system may simply break. This could be fixed simply by sending drones to reinforce the communication infrastructure.

This is just one example of advanced drone usage. Such innovative applications will share a common set of requirements: high processing performance, large SRAM and Flash memory capacity, and extensive peripheral sets supporting multi-threaded processes. The ARM Cortex-M7 based SAM E70 MCU from Atmel is a good example, offering processing power in excess of 600 DMIPS, large on-chip SRAM (up to 384 Kbytes) and Flash (up to 2 Mbytes), and the ability to manage all sorts of sensors, navigation, automation, servos, motors, routing, adjustments, video/audio, and more.

More products and design kit on Atmel Sales portal:

By Eric Esteve from IPNEST


How ARM Implemented a Mali GPU using Logic Synthesis and Place/Route Tools

by Daniel Payne on 07-17-2015 at 12:00 pm

ARM is a well-known semiconductor IP provider and they often create a reference design so that SoC companies can have a starting point to work with. On the GPU side of IP the ARM engineers have an architecture called Mali, and a recent webinar hosted by Synopsys reviewed how the physical design area was minimized by using a combination of tools:

  • Logic Synthesis – Design Compiler Graphical
  • Place/Route – IC Compiler

Front-end design engineers should be attracted to Design Compiler Graphical over the standard Design Compiler tool for logic synthesis because of the promises of: improved QoR like up to 10% higher clock frequency, congestion prediction and optimization, floorplan exploration, and providing physical guidance to IC Compiler that gives 1.5X faster placement.

Pierre-Alexandre Bou-Ach from ARM talked about how the Mali GPUs were designed and optimized for smallest area or lowest power. The ARM Mali-T820 was a GPU optimized for smallest area. The Implementation Reference Methodology (iRM) for the Mali GPU is based on Synopsys tools and shows how to achieve a specific PPA (Power, Performance, Area) result.

Related – Synopsys Eats Their Own Dog food

A multitude of both front-end and back-end factors affect silicon area for a GPU.

For an area-centric design the strategy is to continuously track area using multiple metrics:

  • Core area
  • Die area
    • Physical only cells area
    • Hard macro area
    • Memories area
  • Combinational cells area
  • Repeaters area
  • Sequential standard cells area
  • Standard cells area

Related – ARM A57 (A53) Virtualizer + IP Accelerated = ?

An area Pareto chart shows that the largest area contribution was coming from the combinational cells without repeaters. The grey line is the cumulative area contribution.
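A Pareto view like this is simple to compute from a per-category area report. The sketch below sorts the categories by area and accumulates their share of the total. The category names and area values are hypothetical placeholders, not numbers from the webinar, and `pareto` is a name chosen for this illustration.

```python
def pareto(area_by_category):
    """Return (name, share %, cumulative share %) rows, largest area first."""
    total = sum(area_by_category.values())
    rows = []
    cumulative = 0.0
    for name, area in sorted(
        area_by_category.items(), key=lambda item: item[1], reverse=True
    ):
        cumulative += area
        rows.append((name, 100.0 * area / total, 100.0 * cumulative / total))
    return rows


if __name__ == "__main__":
    # Hypothetical areas in arbitrary units, for illustration only.
    example = {
        "combinational": 50.0,
        "sequential": 25.0,
        "memories": 15.0,
        "repeaters": 10.0,
    }
    for name, share, cumulative in pareto(example):
        print(f"{name:15s} {share:6.1f}% {cumulative:6.1f}%")
```

The cumulative column corresponds to the grey line in the chart: it shows how few categories account for most of the area, which is where optimization effort pays off first.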

An analysis of area by design hierarchy was performed so that any change to the RTL could be directly related to an area impact, and the biggest modules were identified during the earliest stages of development. The placement of blocks within the hierarchy was studied to understand how to minimize repeater insertions. The IC Compiler tool helps in area reduction by reporting why any new cells are being inserted; for the shader core, the new cells were added to fix hold time violations.

Some best practices in the iRM flow when using the 28HPM process node:

  • Apply dont_use constraints on high drive repeaters and complex cells
  • Use memories from the ARM compiler
  • Manage the cell density with placer_max_cell_density_threshold 0.80
  • Design Compiler Graphical
    • Use the SPG flow
    • Try hierarchy reduction and flattening
    • Increase area priority
    • Set a realistic clock latency
    • Use area recovery
  • IC Compiler
    • Control repeater insertion during placement
    • Refine path group control
    • Area recovery enabled
    • Layer optimizations


Using multibit registers (2 bit and 4 bit cells) versus no multibit showed a savings up to 32% with standard cell implementation. Using ultra high density memories where appropriate in the shader core provided 25.46% area reduction of the memory, while using UHD memories on the top-level L2 had a 16.37% area reduction. Total area reduction using UHD memories was 4.57% for the shader core and 6.87% for the top-level L2.
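The two pairs of UHD memory numbers can be related with a little arithmetic. Assuming the total saving comes entirely from the memories and that both percentages are measured against the same baseline (an assumption on my part, not something stated in the webinar), the ratio of the two gives the share of each block’s area occupied by the affected memories. The function name `implied_memory_share` is made up for this sketch.

```python
def implied_memory_share(total_reduction_pct, memory_reduction_pct):
    """Fraction of block area taken by memories, inferred from two savings.

    Assumes the whole block-level saving comes from shrinking the memories.
    """
    return total_reduction_pct / memory_reduction_pct


if __name__ == "__main__":
    shader_core = implied_memory_share(4.57, 25.46)
    top_level_l2 = implied_memory_share(6.87, 16.37)
    print(f"shader core memory share:  {shader_core:.1%}")
    print(f"top-level L2 memory share: {top_level_l2:.1%}")
```

Under that assumption the affected memories would make up roughly 18% of the shader core area and roughly 42% of the top-level L2 area, which is consistent with the L2 seeing the larger total saving despite the smaller per-memory reduction.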

Adding up all of the optimizations, the Mali-T820 GPU team was able to achieve >4% area savings across the total cell area, while at the same time leakage power was reduced by >4%.

Summary
ARM has created an iRM flow that provides a reference Mali-T820 design for minimum area when using the Synopsys tools for logic synthesis and place/route. Watch the entire 25 minute archived webinar online here.


GlobalFoundries 22nm FD-SOI: What Happens When

by Paul McLellan on 07-17-2015 at 7:00 am

Earlier in the week I wrote about GlobalFoundries’ announcement of 22nm FD-SOI. At SEMICON West there were three events that filled in some more details. First, on Tuesday, there was a lunch presentation given by SOITEC, who make the wafer blanks that FD-SOI requires. Then on Wednesday I sat down for an hour with Gary Patton and Subi Kengeri to get more details. And finally, on Wednesday evening there was a meeting with many of the people who are participating in the 22nm FD-SOI ecosystem.

See also GlobalFoundries FD-SOI. Yes, it’s true

Gary Patton used to be the head of R&D at IBM Semiconductor. Since IBM is retaining semiconductor R&D, he was what was called a “voluntary” and could decide whether to remain with IBM or join GlobalFoundries. He decided to join GF as the CTO and says he is “all in”. He is impressed with Sanjay Jha (who I assume was also the person who closed the deal with Gary to bring him over).

A bit of history. IBM used SOI for its high-end processors, but partially-depleted rather than fully-depleted. That is a process that is very high performance, but expensive and hard to deal with. There is also an RF-SOI process used by both IBM and GlobalFoundries (in Singapore). This has become the substrate of choice for building radios in the modern era of multiband phones. Your phone has some in it, since 100% of phones do these days (it is a lot cheaper to manufacture than SiGe or GaAs). The IBM RF process (where they are world leaders) will run in Burlington, Dresden and Singapore.

See also GlobalFoundries Adds RF to 28nm

Gary said that despite deciding to go forward primarily with FinFET, IBM continued to do research on FD-SOI at Albany. Plus, of course, STMicroelectronics developed 28nm FD-SOI, which GlobalFoundries licensed. But when they went to customers, they were told the performance wasn’t high enough. So they decided to develop a 22nm version with the aim of getting very close to FinFET performance but with a manufacturing cost the same as 28nm, and much lower power. As I said in my earlier blog, there are actually 4 different processes that make up the 22FDX process family, although it is very modular. Each process has a couple of extra masks but it is almost the same basic process.

Why did they do this? After all, GlobalFoundries already has 14nm FinFET (licensed from Samsung). The business driver is that volume and growth are both higher at the low end. Yes, the most advanced application processors for mobile need FinFET but the price is too high for the mainstream. Think of a cheap application processor with battery life of a week for emerging markets. So what are the key features:

  • operation as low as 0.4V. It is the only process in the world that can do this, including any that are known to be in development
  • integrated RF. The insulating substrate makes this a lot easier
  • body bias allowing for tradeoffs between power and performance under software control
  • up to 70% lower power than 28HKMG
  • performance up to 70% faster than 28HKMG (with FBB at 1.5V, can actually go to 1.8V) although not at the same time as the lowest power
  • 50% fewer immersion layers than FinFET (hence the significantly lower manufacturing cost and lower mask cost)
  • 20% smaller die than 28 planar
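To see why the low operating voltage matters so much for power, recall the textbook relation for dynamic switching power, which scales with the square of the supply voltage and linearly with frequency. The sketch below applies that general relation; it is not a model of 22FDX, it ignores leakage, and the 0.8 V reference point is a hypothetical value chosen only for illustration.

```python
def relative_dynamic_power(v_new, v_ref, f_new=1.0, f_ref=1.0):
    """Dynamic power ratio from P ~ C * V^2 * f, with capacitance unchanged."""
    return (v_new / v_ref) ** 2 * (f_new / f_ref)


if __name__ == "__main__":
    # Hypothetical comparison: 0.4 V against an assumed 0.8 V reference,
    # first at the same clock frequency, then at half the frequency.
    print(relative_dynamic_power(0.4, 0.8))
    print(relative_dynamic_power(0.4, 0.8, f_new=0.5, f_ref=1.0))
```

Halving the supply voltage cuts dynamic power to a quarter at the same frequency. In practice the achievable frequency also drops at low voltage, which is why the lowest power and the highest performance figures in the list are not available at the same time.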

So what about availability? The initial PDKs exist and have gone to early customers and IP developers. Because 22FDX is similar to 28nm FD-SOI, doesn’t require double patterning, and doesn’t have the complexity of FinFET, the program is on an accelerated schedule with design enablement (EDA, IP etc) working in parallel with technology development. They expect early tapeouts soon after the technology is qualified.

So when will that be? They are running internal shuttles already (they call them TQE). They will start external shuttles in Q1 of 2016. Risk production is planned for the end of 2016. Apparently the first silicon run of the “lightning” testchip was closer to the target than anyone had ever seen before, with N transistors on the dot and P just 2% off.

The process will run in the Dresden fab (where Dan has been this week, along with CEO Sanjay Jha, not to mention Angela Merkel). It uses the same toolset as 28nm. It could also run in Malta or East Fishkill, but not Singapore. There is plenty of capacity for high volume customers.

I asked Gary about 10nm and 7nm. He pointed out that the IBM semiconductor acquisition brought a huge infusion of talent who have done leading edge TD for decades. Most of them are now in Malta to accelerate 10/7nm, plus there is the Albany Nanotech Center just 20 minutes away.

Later in the evening there was a one hour panel session, moderated by Subi, with:

  • Marie-Noëlle Semeria, CEO, Leti (research on FD-SOI)
  • Paul Boudre, CEO, Soitec (manufacturer of base wafers)
  • Ron Moore, VP Marketing, ARM (physical libraries and microprocessors)
  • Juan Rey, Senior Engineering Director, Mentor Graphics (EDA)
  • Brandon Wang, Group Director, Strategic Programs, Cadence (EDA and IP)
  • Jamil Kawa, Group Director, Synopsys (EDA and IP)
  • Bill Wang, VP and GM, VeriSilicon Holdings (design services)
  • Patrick Soheili, VP, Product Management and Corporate Development, eSilicon (fabless ASIC)
  • Dasaradha Gude, CEO, INVECAS (design services)

I won’t go into what everyone said. The main conclusion was that forward body bias (FBB) is the only thing that requires special attention. Obviously physical verification rule decks need to be created, but since DRC/LVS already supports 20nm planar, 16nm FinFET and 28FD-SOI, no issues are anticipated. The IP people all had 28FD-SOI experience and also don’t expect any issues. Ron Moore of ARM confirmed that they had PDKs and were investigating the performance of ARM processors (which also means they must have built a preliminary standard cell library).

So, it’s been FD-SOI week all week. Given everything I’ve seen and heard, this is a real announcement of something significant.


How PowerArtist Interfaces with Emulators

by Pawan Fangaria on 07-16-2015 at 5:00 pm

Last month at DAC I saw some of the top innovations in the EDA world. EDA is a key enabler for advances in semiconductor designs. Among a number of innovations worth mentioning (about which I blogged just after DAC), the integration of Mentor’s Veloce with ANSYS’ PowerArtist for power analysis of live applications caught my attention. We already know Veloce as a versatile tool for hardware emulation and PowerArtist as a versatile tool for power analysis of SoCs from the RTL level. What makes the combination of the two interesting is that the power consumption of a device while it is actually running an application can be accurately measured and analyzed much faster. So I was interested in learning how exactly the interface between these tools works.

Before I go into the interface details, let me briefly describe the ANSYS PowerArtist functionality. PowerArtist provides power analysis of SoCs at the RTL level using different measures such as average or time-based power. It can also select power-critical vectors. PowerArtist uses an RTL Power Model (RPM) for RTL-driven physical power integrity. The PowerArtist Calibrator and Estimator (PACE) technology ensures that early RTL power estimates track the final gate-level power numbers. PowerArtist provides interactive debugging for power and employs various techniques for power reduction at the clock, memory and logic levels.

The activity data for computing power is typically acquired from a simulation testbench and stored in files with standard formats such as SAIF (Switching Activity Interchange Format), VCD (Value Change Dump) and FSDB (Fast Signal Database). PowerArtist reads the data from these files for power analysis. Clearly the file-based interface only allows post-simulation power analysis and brings its own overhead, making the analysis slow and error-prone. Moreover, each of these formats falls short in either accuracy or capacity: SAIF does not include temporal information; VCD has temporal information but is inefficient because it is a textual format; FSDB is both temporal and binary, but its generation slows down emulators and simulators.
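The difference between an aggregate format and a temporal one can be shown with a toy example. The sketch below is not a SAIF or VCD parser: it takes a hypothetical list of time-stamped value changes (VCD-like) and reduces it either to one toggle count per signal (a SAIF-like aggregate) or to toggle counts per time window. The record layout and function names are assumptions made for this illustration.

```python
from collections import Counter


def toggle_counts(changes):
    """Collapse (time, signal, value) changes into per-signal toggle counts."""
    last = {}
    counts = Counter()
    for _time, signal, value in changes:
        # The first record for a signal sets its initial value, not a toggle.
        if signal in last and last[signal] != value:
            counts[signal] += 1
        last[signal] = value
    return dict(counts)


def toggles_per_window(changes, window):
    """Keep the temporal view: per-signal toggle counts in each time window."""
    last = {}
    windows = {}
    for time, signal, value in changes:
        if signal in last and last[signal] != value:
            windows.setdefault(time // window, Counter())[signal] += 1
        last[signal] = value
    return {index: dict(counts) for index, counts in windows.items()}


if __name__ == "__main__":
    # Hypothetical value changes: (time, signal name, new value).
    changes = [
        (0, "clk", 0), (0, "en", 0),
        (5, "clk", 1), (10, "clk", 0), (12, "en", 1),
        (15, "clk", 1), (20, "clk", 0),
    ]
    print(toggle_counts(changes))
    print(toggles_per_window(changes, window=10))
```

The aggregate form is enough for average power, but once the time stamps are discarded there is no way to see in which window the activity peaked, which is what time-based power analysis needs.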

To overcome these issues, ANSYS and Mentor developed an innovative approach in which the activity data stream of an application running in the Veloce emulator is captured directly by PowerArtist through a streaming interface. Because the file-based interface is eliminated, both the emulator hardware and the power analysis software run an order of magnitude faster, with the accuracy of the actual consumed power. A key advantage of this approach is that it enables early RTL power visibility and budgeting for live applications, which is not possible with the traditional file-based approach.

PAVES (PowerArtist Vector Streaming) is a new RTL power socket that can connect with emulators and simulators, enabling streaming activity transfer. The PAVES socket interface with the Veloce emulator’s DRW (Dynamic Read Waveform) API was demonstrated working well at the 52nd DAC. This enables early gate-level power verification for live applications and therefore decisions on power budgeting of derivative designs. Since PAVES can process activity in parallel with the application running in Veloce, the power analysis can be much faster and more accurate.
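The general idea of processing activity while it is being produced, instead of after a file has been written, can be sketched with Python generators. This is only a conceptual illustration and has nothing to do with the actual PAVES or DRW interfaces: `emulator_activity` is a made-up stand-in for an emulator, and the toggle counts and the energy-per-toggle value are invented.

```python
def emulator_activity(cycles):
    """Hypothetical stand-in for an emulator: yields (cycle, toggles) records."""
    for cycle in range(cycles):
        yield cycle, (cycle * 7) % 5  # made-up toggle count for this cycle


def streaming_power(activity, energy_per_toggle, window):
    """Consume activity records as they arrive; yield average energy per window."""
    total = 0.0
    count = 0
    for _cycle, toggles in activity:
        total += toggles * energy_per_toggle
        count += 1
        if count == window:
            yield total / window
            total, count = 0.0, 0
    if count:
        # Flush a final, partially filled window.
        yield total / count


if __name__ == "__main__":
    # Results for early windows are available before the run has finished.
    for average in streaming_power(emulator_activity(10), 1.0, window=5):
        print(average)
```

Because the consumer pulls records one at a time, no intermediate file is written and only one window of data is held in memory, which is the property that lets analysis keep pace with a long-running live application.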

This approach of power analysis and budgeting for live applications has been tested by early access partners and customers. The runtime performance improvement with this new approach compared to the file-based approach can be up to 4.25x among the designs shown in the table above. This performance improvement is without any compromise on RTL-to-gate power accuracy.

With PAVES, PowerArtist can read switching data directly from any supported emulator running a live application and provide visibility into RTL power as well as perform gate-level power verification without any overhead of file-based transfer. This is another feather in the cap of emulation-based verification of SoCs, whose importance keeps growing.

Read a technical paper at ANSYS website to know more about this methodology.
Also read:
Eyes Meet Innovations at DAC
Getting the Best Dynamic Power Analysis Numbers
Benefits of RTL Power Budgeting

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


Leveraging Power Reduction Techniques for MCU Based SoCs

by Daniel Nenni on 07-16-2015 at 12:00 pm

Dolphin Integration has launched a new 32-bit microcontroller, the RISC-351 Zephyr, targeting low-power SoCs for competitive markets such as IoT. It approaches the optimization of power consumption from three angles: architecture, memory and software.

Architecture Angle
As a reminder, 8-bit versus 16-bit versus 32-bit applies to 3 dimensions independently: instruction code, addressing space, and word width. The Arithmetic Logic Unit (ALU) performs operations on the word width. Thanks to an innovative instruction set and core micro-architecture, the RISC-351 Zephyr offers the unique flexibility of dealing with 8, 16, and 32-bit words using dedicated instructions and minimum sufficient data path in order to achieve low power consumption and small silicon area at subsystem-level (including program and data memories).

Beyond clock gating, which has been carefully implemented so that most functional blocks can be separately gated, RISC-351 Zephyr is available in a Retention Ready (RR) version which supports efficient power gating in ‘Deep Sleep’ mode. The advantage of this mode is that only the registers, which hold the needed information to wake-up in the same state, are kept in retention. All other logic is completely switched off.

Memory Angle
Memories play a major part in the overall power consumption of any microcontroller based subsystem. The Reduced Instruction Set of Zephyr achieves unsurpassed code density thanks to a smart use of variable instruction sizes whenever possible. This either enables adding more functionalities in the program or selecting smaller program memories, whether RAMs or NVMs (and thus saving leakage power).

The RISC-351 Zephyr also features an innovative pre-fetch interface dedicated to minimizing the number of accesses to the program memory by eliminating unnecessary ones. The number of accesses is reduced by 15% compared to conventional 32-bit low-power MCUs.

In addition, Dolphin Integration proposes an instruction cache-controller (R-Stratus-LP) which has been specifically designed to reduce power consumption and access time of embedded Flash and EEPROM memories by more than three times. The R-Stratus-LP offers highest hit rates because of its on-the-fly parameterized associativity ways and line size change capabilities.

Software Angle
A complete and innovative Integrated Development Environment (IDE) and compiler are essential to fully optimize any MCU subsystem.

The RISC-351 Zephyr is delivered with an innovative compiler SmartCC, the first compiler in the low-power MCU market to be based on the widely acclaimed LLVM framework. In addition to the wide compatibility of SmartCC with GCC and the latest ANSI-C standards, the compiler has been designed to maximize the use of the internal registers and thus reduce dynamic power consumption by minimizing energy consumption of the memory accesses.

Last but not least, Dolphin Integration enables developers to go further with its new IDE, SmartVision™, which quantifies the energy consumed by each function during program execution, allowing a designer to identify and optimize the most energy-consuming functions.

More information on: RISC-351 Zephyr and its IDE SmartVision™


Coventor SEMulator3D, Now With Added Dopant, Diffusion, Illumination and More

by Paul McLellan on 07-16-2015 at 7:00 am

Coventor just rolled out the latest version of SEMulator3D, their virtual fabrication tool. Very conveniently it is SEMICON West this week and they have a booth. I dropped by and got a demo of all the new stuff from David Fried, Coventor’s CTO. He’s very proud of SEMulator3D’s new logo, but mostly he is proud of several major improvements in their ability to do virtual fabrication of wafers. The value proposition of SEMulator3D is that you can avoid the cost, and especially the time, of running a lot of wafers, especially for design of experiments (DOE) type work where many wafers need to be run with slightly different parameters. He told me that customers say time is probably the biggest factor since, in practice, in a new fab, not running wafers just means the equipment sits idle and the only saving is in materials, so they tend to run wafers of some sort continuously. As we all know, the primary way to get yield up is to run a lot of wafers.

As processes get more complex, and especially get more vertical, we need to model very complex structures such as vertical III-V nanowires, octuple patterning, vertical flash. Doing it the old way, by running experimental wafers is not enough on its own. Cost and development time stretch out and eliminating systematic structural defectivity becomes the key to a successful ramp. But the vertical structures cause problems due to shadowing, complex doping, deep etching and more.

So what’s new in version 5.0?

SEMulator3D has always had a module for handling dopants, basically implant and diffusion, but it was inadequate for the types of processes now being created. There is a lot of interaction between the physical structure and the electrical effect of the dopants. The new module handles ion implant, thermal diffusion, doped diffusion and doped epitaxy. It further includes visualization of the dopant concentration gradients and the dopant-type concentrations, as in the diagram below of a 20nm SRAM. For example, look at the NFETs (top row) where you can see the source/drain implants penetrating, and shadowed by the gate/spacer structure. In the PFETs (bottom row) you can see the Boron (blue) and how it finally diffuses out (the Arsenic in the NFET stays put better due to its diffusion properties).

Another new module allows analysis of visibility limitations, such as shadowing and off-axis effects. In some cases these off-axis effects are intentional, such as an angled etch (as in the picture on the left below). And sometimes they are unintentional, such as on a 300mm wafer where the central die may have a perfectly vertical etch but near the edge of the wafer there may be a few degrees of error (as in the picture on the right). This sort of analysis can be key to high yield at the edge of the wafer.
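A little trigonometry shows why a few degrees matter for deep structures: the bottom of a tilted etch is displaced sideways by the etch depth times the tangent of the tilt angle. This is plain geometry rather than anything taken from SEMulator3D, and the depth and angle values below are hypothetical.

```python
import math


def lateral_offset(depth_nm, tilt_deg):
    """Sideways displacement at the bottom of an etch tilted off vertical."""
    return depth_nm * math.tan(math.radians(tilt_deg))


if __name__ == "__main__":
    # Hypothetical 100 nm deep etch at several tilt angles.
    for tilt in (0.0, 1.0, 2.0, 3.0):
        print(f"{tilt:.0f} deg -> {lateral_offset(100.0, tilt):.2f} nm")
```

For a 100 nm deep feature, a 2 degree tilt already moves the bottom of the etch by about 3.5 nm, a significant fraction of the feature size at advanced nodes, and the displacement grows in proportion to depth as structures become more vertical.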

The final big change, other than lots of incremental improvements in user interface, performance and so on, is in being able to link SEMulator3D to other tools. It is the best tool for the type of modeling it does but there is a whole selection of other tools that handle various detailed analysis, typically starting from a mesh of the structure. SEMulator3D has always supported output interfaces, and SEMulator3D 5.0 adds to the library of available modeling platforms.


Coventor are at SEMICON West at booth 2531.


Internet of Things Bubble?

by Bill Jewell on 07-15-2015 at 10:00 pm

Speaking at the World Business Forum in Sydney, Australia in May, Steve Wozniak (cofounder of Apple) said about the Internet of Things: “I feel it’s kind of like a bubble, because there is a pace at which human beings can change the way they do things. There are tons of companies starting up.” According to market research firm Gartner, the Internet of Things (IoT) is “the network of physical objects that contain embedded technology to communicate and sense or interact with their internal states or the external environment.”

The obvious comparison to a potential IoT bubble is the Internet bubble in the 1990s. The bursting of the Internet bubble in 2000 put many marginal companies out of business and led to major declines in the stock market value of all Internet-related companies. However, it did not change the inevitable trend of the Internet becoming pervasive in our lives. The same is probably true of IoT. Some companies and technologies are overhyped and could crash and burn in a market correction. However, the trend is inevitable: devices will increasingly communicate with the Internet, many without any human interaction.

How big will IoT be in the next few years? There is surprising agreement among forecasters.

[TABLE]
|-
| colspan="3" | Internet of Things Installed Base, Billions of Units
|-
| Source
| 2019
| 2020
|-
| Gartner, Nov. 2014
|
| 25
|-
| McKinsey, Dec. 2014
|
| 26-30
|-
| Business Insider, April 2015
| 23
|
|-
| PwC, May 2015
|
| 30
|-
| IDC, June 2015
|
| 29.5
|-

The consulting and market research firms are closer in their projections for 2019-2020 (23 billion to 30 billion unit installed base) than they are in their estimates of 2014 (3 billion to 10 billion). My theorem about high technology markets is that forecasters are always closer to each other than they are to the final reality. This is not surprising, since the companies are operating with the same general set of assumptions. Unforeseen events often drive the market higher or lower than the consensus forecast.
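
One way to make "closer" concrete is to compare the ratio of the highest to the lowest figure in each range. The short sketch below does that with the ranges quoted above; the choice of a high/low ratio as the measure of spread is an assumption made here for illustration, not a method from the forecasters.

```python
# Ranges quoted in the article, in billions of units.
estimates_2014 = (3.0, 10.0)            # low and high estimates of the 2014 installed base
projections_2019_2020 = (23.0, 30.0)    # low and high projections for 2019-2020


def spread_ratio(low, high):
    """Return how many times larger the highest figure is than the lowest."""
    return high / low


ratio_2014 = spread_ratio(*estimates_2014)
ratio_2020 = spread_ratio(*projections_2019_2020)

print(f"2014 estimates:        high/low = {ratio_2014:.2f}x")
print(f"2019-2020 projections: high/low = {ratio_2020:.2f}x")
```

By this measure the highest 2014 estimate is more than three times the lowest, while the highest 2019-2020 projection is only about 1.3 times the lowest.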

The IoT is made up of numerous devices across many applications. The media focuses on consumer purchased devices such as wearables and connected home devices. The hype was obvious at International CES 2015 in January. The top two categories for exhibitor press releases were wearables and connected home.

However, most of the IoT is driven by business and government. Business Insider estimates that in 2019 about 74% of the IoT installed base will be business and government applications and about 26% will be consumer applications. IDC expects digital signage will be the biggest IoT growth driver in 2015. Businesses and governments can usually justify IoT investment with eventual cost savings. A utility company can reason that the cost of installing smart electric meters at its customer sites will be paid off by the savings from not sending human meter readers to each location every month.

Consumer IoT applications will continue to command most of the media attention. Business Insider projects that about one-third of the 6 billion unit IoT installed base among consumers will be connected home devices, including energy, security and appliances. Another key consumer IoT driver is wearable devices, which can monitor health, fitness and activity. They can also be an interface to a smartphone. Wearable devices are primarily worn on the wrist, including smart watches and bands.

Prognosticators are also remarkably close in their projections for the wearable device market in four years. Estimates of the 2019 wearable device market are in a narrow range of 145 to 156 million units.

[TABLE]
|-
| colspan="2" | Global Wearable Device Market, Millions of Units
|-
| Source
| 2019
|-
| Analysys Mason, Sep. 2014
| 145
|-
| Business Insider, May 2015
| 148
|-
| IDC, June 2015
| 156
|-

Two high-profile wearable devices are the Apple Watch, launched on April 24, and the Fitbit Charge HR, launched on January 6 at CES. Fitbit is the global leader in wearable devices, with a 34% share in the first quarter of 2015 according to IDC. The Apple Watch is being “watched” closely due to Apple’s success in redefining markets with its iPad and iPhone. Slice Intelligence estimates 3 million Apple Watches were sold online in the U.S. in the three months from April 10 (the first day of pre-orders) through July 10. Slice Intelligence estimated Fitbit sold 850 thousand total devices in U.S. online sales in May, beating Apple Watch sales of 777 thousand devices. Apple Watch is expected to be a major factor in total year 2015 wearable device sales, with an estimated share of 27% (CCS Insight) to 40% (Business Insider).


How will wearables affect the semiconductor market? IHS estimated the costs of the components in an Apple Watch Sport at $81. We at Semiconductor Intelligence estimate the semiconductor portion of the cost at $50, 14% of the watch retail price of $349. IHS estimated the component costs of an iPhone 6 at $196. We estimate semiconductor content at $130 for the iPhone 6, 20% of the retail price of the phone.
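
The percentages above can be checked with a few lines of arithmetic. The sketch below uses only the dollar figures quoted in this paragraph; the `share` helper is a name introduced here for illustration. The iPhone 6 retail price is not stated in the text, so only the semiconductor share of component cost is computed for the phone.

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Dollar figures quoted in the article.
watch_semis, watch_components, watch_retail = 50.0, 81.0, 349.0
iphone_semis, iphone_components = 130.0, 196.0

watch_share_of_retail = share(watch_semis, watch_retail)
watch_share_of_components = share(watch_semis, watch_components)
iphone_share_of_components = share(iphone_semis, iphone_components)

print(f"Apple Watch Sport: semiconductors are {watch_share_of_retail:.0f}% of retail price")
print(f"Apple Watch Sport: semiconductors are {watch_share_of_components:.0f}% of component cost")
print(f"iPhone 6: semiconductors are {iphone_share_of_components:.0f}% of component cost")
```

On these estimates, semiconductors make up roughly three-fifths of the component cost of the watch and about two-thirds of the component cost of the phone.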

Below are IDC’s recent projections of unit shipments of PCs, tablets/2-in-1 devices, smartphones and wearables for 2019 compared to 2014. Smartphones will continue as the dominant device people use to connect to the Internet, with an expected 1.9 billion units shipped worldwide in 2019. The combined total for PCs and tablets in 2019 is 563 million units, a slight increase from 539 million in 2014. Wearable devices are forecast to show six-fold growth from 26 million units in 2014 to 156 million in 2019. Even with the strong growth, wearables shipments will be less than one-tenth of smartphone shipments in 2019 and less than either PCs or tablets. The semiconductor content of wearables per device will continue to be significantly less than in the other devices.
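
As a rough check on these figures, the sketch below computes the growth multiple and the implied compound annual growth rate (CAGR) over the five years from 2014 to 2019. The shipment numbers are the ones quoted above; the CAGR framing and the helper names are additions made here for illustration, not part of IDC's projections.

```python
def growth_multiple(start, end):
    """Return how many times larger the end value is than the start value."""
    return end / start


def cagr(start, end, years):
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end / start) ** (1.0 / years) - 1.0


# Unit shipments quoted in the article, in millions.
wearables_2014, wearables_2019 = 26.0, 156.0
pcs_tablets_2014, pcs_tablets_2019 = 539.0, 563.0
smartphones_2019 = 1900.0
years = 2019 - 2014

wearables_multiple = growth_multiple(wearables_2014, wearables_2019)
wearables_cagr = cagr(wearables_2014, wearables_2019, years)
pcs_tablets_cagr = cagr(pcs_tablets_2014, pcs_tablets_2019, years)
wearables_vs_smartphones = wearables_2019 / smartphones_2019

print(f"Wearables growth multiple: {wearables_multiple:.1f}x")
print(f"Wearables CAGR: {wearables_cagr:.1%}")
print(f"PCs + tablets CAGR: {pcs_tablets_cagr:.1%}")
print(f"Wearables as a share of smartphone units in 2019: {wearables_vs_smartphones:.1%}")
```

Six-fold growth over five years works out to a little over 40% per year, while the combined PC and tablet total grows by under 1% per year.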

Earlier this month Gartner forecast wearable devices will account for only one percent of the semiconductor market in 2019. The total IoT semiconductor market will be more significant, with Gartner data showing it could account for eight percent of the semiconductor market in 2019.

The Internet of Things may not be the “Next Big Thing” to drive the semiconductor market, but it does represent a meaningful growth opportunity. The key semiconductor products in IoT devices are sensors, microcontrollers, processors and communication & connectivity devices. IoT devices will also drive a market for systems and infrastructure to support the devices. If there is an IoT bubble, it is more likely to slowly deflate than burst. Many companies currently focusing on IoT will go out of business, but the remaining companies will have an opportunity to profit from a strong growth market.


How Emulation Enables Complex Power Intent Modeling

by Pawan Fangaria on 07-15-2015 at 12:00 pm

As the number of CPUs, GPUs, and IP blocks in an SoC grows, power management is becoming an increasingly complex task in itself. A single tool or methodology may not be enough for complete power management and verification of an SoC. In an SoC, there can be multiple modes of operation involving hardware and software interactions, different applications running, and complex dynamic power profiles of those applications. One key aspect is power intent modelling and verification. The power intent is described in CPF (Common Power Format) or UPF (Unified Power Format) files. The design has to honour the rules described in these files and work properly even when certain parts of the design are inactive or in sleep mode.

Broadly speaking, the UPF files describe components such as switches, level shifters, isolation cells, and retention cells. Different verification tools, including simulation and emulation, can use these files in different situations. Emulation is a unique verification method for real-world testing. AMD has been using In-Circuit Emulation (ICE) for power modelling and verification of its GPU (Graphics Processing Unit) and APU (AMD Accelerated Processing Unit). The GPU is the simpler case: it resides inside Cadence’s Palladium system and is divided into multiple power domains. A test PC is connected to the GPU through a PCIe speed-bridge. The test PC is also connected to a debug PC through FireWire. The switching of power domains is dynamically controlled by hardware and software. The main purpose is to keep the power dissipation of the overall GPU low. The APU is more complex than the GPU; an example is below.

Inside the emulator, the power domains are shown in orange. There are many power domains in the Graphics & Multimedia Engine and other units, which need extensive power management with many complex power scenarios to test. There are multiple CPU cores. There is a separate CPU for system management and security. There are power-safe operations in the I/O subsystem and system management units which need to be unified. The Operating System (OS) and test system are on a separate hard disk sitting outside the emulator and connected to it through a SATA speed-bridge.

In such a scenario, emulation is the most powerful method to test the APU in the real world with real use-cases. Application and OS level testing is nearly impossible with the usual simulation method. Dynamic Frequency and Voltage Scaling (DFVS) is a powerful technique to reduce power dissipation by independently scaling frequency and voltage in each power domain. However, it requires significantly more complex system management, because in larger and faster chips there can be multiple clock domains to propagate clock signals across the chip. Such a system can be best tested with emulation in real-life scenarios for best accuracy.
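
To illustrate why scaling voltage and frequency per power domain reduces power, here is a minimal sketch based on the standard first-order relation for dynamic power, P = activity × C × V² × f. The `PowerDomain` class, the `rescale` helper and all numeric values are hypothetical and are not taken from AMD's design or from Palladium; leakage power is ignored.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PowerDomain:
    """One independently scaled power domain (all values are illustrative)."""
    name: str
    activity: float        # switching activity factor, between 0 and 1
    capacitance_f: float   # effective switched capacitance, in farads
    voltage_v: float       # supply voltage, in volts
    frequency_hz: float    # clock frequency, in hertz
    powered_on: bool = True

    def dynamic_power_w(self) -> float:
        """Dynamic power from P = activity * C * V^2 * f; zero when gated off."""
        if not self.powered_on:
            return 0.0
        return (self.activity * self.capacitance_f
                * self.voltage_v ** 2 * self.frequency_hz)


def rescale(domain: PowerDomain, voltage_v: float, frequency_hz: float) -> PowerDomain:
    """Return a copy of the domain at a new voltage/frequency operating point."""
    return replace(domain, voltage_v=voltage_v, frequency_hz=frequency_hz)


# Hypothetical domains; the numbers are not taken from any real chip.
graphics = PowerDomain("graphics", 0.2, 1e-9, 1.0, 1.0e9)
multimedia = PowerDomain("multimedia", 0.1, 1e-9, 1.0, 0.5e9)

# Scale one domain down and switch the other one off, independently.
graphics_low = rescale(graphics, voltage_v=0.8, frequency_hz=0.5e9)
multimedia_off = replace(multimedia, powered_on=False)

before = graphics.dynamic_power_w() + multimedia.dynamic_power_w()
after = graphics_low.dynamic_power_w() + multimedia_off.dynamic_power_w()

print(f"before scaling: {before:.3f} W")
print(f"after scaling:  {after:.3f} W")
```

Because voltage enters as a square, dropping the hypothetical graphics domain to 0.8 of its voltage and half its frequency leaves it at about 32% of its original dynamic power (0.8² × 0.5). Each domain having its own operating point is what makes the system management, and therefore its verification, complex.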

AMD uses UPF 2.0 with full construct-level support and the latest UPF 2.1 with semantic-level support in the Palladium emulation system. The memory scrambling test has been automated with UPF directives that describe the power domain where the scrambling takes place. With waveform support for objects, one can see the states of power-related objects. With SDL trigger support in UPF, Palladium can create dynamic triggers on any power-related object in the design to switch it on or off. This is a unique and powerful feature for power-safe transitions.

AMD takes a balanced approach to selecting which tests need to be run on simulation and which on emulation. Usually, long-running directed tests and those that do not require complex testbench interaction are run on UPF-enabled emulation. The pre-silicon emulation workloads comprise power management, BIOS, firmware, OS, and application level workloads. In emulation, multiple passes of power sequences can be done, as against a single pass in simulation. This enables complex interaction scenarios and repeated power state entry or exit with variable external stimulus.

The APU based testbench can be simplified by connecting a system model with the design in the Palladium system through transactors.

The system model can be in C++. The executing code communicates with the system model where a power event request can be sent to the system model and the system model responds accordingly. This scheme is configurable and aligns uniquely with AMD’s RTL simulation methodology.

This emulation methodology at AMD has proved to be very effective at drastically reducing long simulation runtimes. Emulation makes complex SoC power interactions and closer-to-real-life workloads possible. Runtime failures can be detected through self-checking stimulus and power-based assertions instrumented into the emulation build.

Alex Starr from AMD presented this methodology in detail at the 52nd DAC. The same presentation has been posted HERE with the title “Experiences with SoC Deployment of Hardware Emulation Based Power Intent Modeling”. There is no registration required.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com