
Powering the IoT – Wishful Thinking versus Reality

by Bernard Murphy on 12-11-2015 at 12:00 pm

There’s a lot of discussion these days on IoT applications, architectures, communication, security and more, all very good stuff, but little debate on how these devices will be powered. If you can plug them in, this may not be an issue (though we may need to think about increased demand on our overstrained power generation infrastructure). However, for mobile and remote applications, the question is commonly dismissed as something that energy harvesting will resolve, without detailed investigation of how practical that option really is, at least in what I have read.

I too thought that energy harvesting was a really exciting direction, but was frustrated by many popular articles that, while starting from real research and applications, seem to point without support to wildly extrapolated implications. So I did my own literature survey, favoring journal articles over popular articles wherever possible, and what I found wasn’t quite so promising.

A quick summary:

· Inductive charging and wireless-beamed power are very practical but limited by heat generation and safety and of course by proximity to a wall-plug powered source

· Piezo- and thermo-electric charging are limited to order of µA/cm²

· Biochemical charging can get to mA/cm² but the enzymes required for this method have to be replaced every couple of years

· Ambient wireless charging value is negligible unless very close to an antenna (if we could get this to a realistic charging level, I would worry about being cooked while I charge)

· Nuclear, surprisingly, is order of µA/cm² or less apart from radio-thermal which is not considered safe for public use

· Biomechanical varies widely depending on the method used but suffers both from being a very intermittent source and potentially being tiring to use (you are the real power source)

· There are some scaled-up sources, such as piezo harvesting of traffic pressure in roadways, building vibrations and vibration in railway tracks, that can generate meaningful power (from ~100 W for regenerative shock absorbers in cars and on tracks to ~10 kW from skyscraper building dampers).

· One bright spot is photovoltaic (solar). Power density is still at the µA/cm² level but we know how to scale up solar panels. A 10 cm² panel (a large watch face) could in principle power an LTE radio with a 1% duty-cycle. But of course solar is limited to outdoor applications.
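The duty-cycle argument in the solar bullet can be sketched as simple arithmetic. This is a minimal illustration only: the function names are hypothetical, and any wattage you feed in is your own assumption, since the summary above gives no radio power or panel output figures.

```python
def average_load_power(active_w, sleep_w, duty_cycle):
    """Average power of a load that is active for a fraction of the time.

    duty_cycle is the active fraction (0.01 for the 1% case in the text).
    active_w and sleep_w are assumed inputs, not figures from the article.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_w + (1.0 - duty_cycle) * sleep_w


def harvest_covers_load(harvest_w_per_cm2, area_cm2, load_w):
    """True if a harvester of the given area meets the average load."""
    return harvest_w_per_cm2 * area_cm2 >= load_w
```

With a 1% duty cycle the average load is roughly one hundredth of the active power plus the sleep floor, which is why duty cycling matters so much more than panel area in this kind of budget.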

None of these numbers is very exciting. Apart from solar, the problem seems to be that tapping new sources starts with a relatively small amount of practically and theoretically accessible energy which is then substantially reduced by unavoidable limits and inefficiencies in conversion. One way to offset this problem is through massive scale-up, but then you are limited to platforms like buildings and bridges. And even in those cases, the amount of energy that can be generated is dwarfed by the daily needs of the structure (a skyscraper for instance). I might wish the reality were different, but I suspect outside of the commercial successes we already know, this aspect of the green movement will have a rather short life. All of which may mean we’ll be stuck with the grid, batteries and proximity charging (inductive and beamed) for the foreseeable future.

I published my findings in a couple of LinkedIn blogs, the first on limitations to powering wearables and the second on limitations in locally-generated power (power generated close to the consumer of power, without connection to the grid). What I hope is different between my research and what appears in popular articles is that I tried to survey widely and I documented all my sources, so you can check what I found. I would be very happy to see supported counter-examples, because it really would be fantastic to find that there are practical local-scale harvesting technologies.

The link on limits to wearable power is HERE (the blog starts with power consumption in wearables; the generation part is in the second half of the blog). The link on limits to locally-generated power is HERE.



IEDM 2015 Blogs – Part 1 – Overview

by Scotten Jones on 12-11-2015 at 7:00 am

The International Electron Devices Meeting (IEDM) is one of the premier conferences for semiconductor process technology, if not the premier conference. The 2015 meeting just finished up on Wednesday, December 9th.

This year’s meeting was held from Saturday, December 5th through Wednesday, December 9th in Washington DC. As a side note, the conference has historically alternated between Washington DC and San Francisco, but after this year it will be held exclusively in San Francisco. With so much of the semiconductor industry now located in the Far East, attendance is better when the conference is in San Francisco.

I find the conference to be very helpful in terms of understanding the latest process technologies. Not only is the conference very good but there are a lot of side events held around it during lunch and the evenings.

Saturday

A series of six tutorials was held although I didn’t attend any of them.

Sunday
On Sunday there were two all-day short courses, one on CMOS for 5nm and beyond and the other on memory technology. I attended the memory technology course and I will cover it in a subsequent blog.

Sunday night CEA-Leti and IMEC each held technology forums. I attended the IMEC forum and I will blog about that. I am also trying to set up a briefing from LETI on their recent work and if that comes about I will blog about that as well.

Monday

Monday opened the conference technical session with the plenary session. I will blog about Greg Yeric’s excellent address on Moore’s Law at 50 in a follow on blog.

At lunch on Monday I attended the press luncheon. General attendance for the conference this year is expected to be around 1,400, with 365 attendees for the tutorials and 505 attendees for the short courses. There were also two luncheons, both expected to sell out. David Lamers and I both asked about platform CMOS papers and industrial participation. In general, our impression is that the conference has become more academic and less oriented toward industrial practice. The organizers felt that the mix has been steady since 2005. They did agree that before 2005 there were more “platform” papers, and they attributed the change to consolidation. There was also a feeling that platform papers now tend to appear every other year.

Monday afternoon I got a briefing from Global Foundries on their 22FDX technology and I will be blogging about that.

Micron and Intel presented their floating gate 3D NAND technology Monday afternoon and I will likely blog about that paper as well.

Tuesday

The conference continued on Tuesday with more technical papers. I am still organizing my thoughts on Tuesday’s papers but I will likely blog about what I saw.

Tuesday night I attended Coventor’s “the last half nanometer” panel discussion and I will blog about that.

Wednesday
The conference wrapped up Wednesday. There were a couple of excellent DRAM papers from Samsung and SK Hynix and I will blog about them plus possibly some other papers from Wednesday.

Wednesday at lunch I attended ASM’s luncheon where Dino Triyoso from Global Foundries presented. Unfortunately I was asked not to blog about that event.

Conclusion
IEDM 2015 was another excellent and informative conference. I will follow up this blog with eight or more blogs on what I saw.


3 flavors of TMR for FPGA protection

by Don Dingee on 12-10-2015 at 4:00 pm

Back in the microprocessor stone age, government procurement agencies fell in love with the idea of radiation hardened parts that might survive catastrophic events. In those days, before rad-hard versions of PowerPC and SPARC arrived, there were few choices for processors in defense and space programs.

One of the first rad-hard microprocessors was the Performance Semiconductor PACE P1750A, a product line since acquired by Pyramid Semiconductor. It was born in the Reagan-era “Star Wars” boom, where total ionizing dose (TID) and low power consumption were the first two requirements. Thank goodness, our project using the PACE P1750A never got past system design and lab prototyping, because I don’t think we fully appreciated what we were up against in creating a totally rad-hard, space-ready system.

What the semiconductor industry has learned about rad-hard and rad-tolerant design since fills volumes of books and is still developing. Geometries have shrunk, worsening the chance of a disruption from radiation. Processes have improved with technology such as silicon on insulator (SOI). Software content in all projects has swelled, justifying investments in creating rad-hard processors delivering a high level of confidence at a high level of cost.

Rad-hard ASICs, however, are another matter. While the technology exists, the volumes around a custom design usually do not. Fortunately, FPGA vendors and some defense firms licensing technology have stepped in with rad-hard parts targeting space-based projects.

Space is not the only place radiation exists. Many applications, including industrial, medical, and automotive, are subject to single event upset (SEU). To provide adequate levels of safety-critical operation without the expense of full rad-hard FPGAs, designers are turning more and more to SEU-tolerant approaches in FPGA logic synthesis. These same approaches are even applicable in full rad-hard FPGAs, as various FPGA technologies present different susceptibility and need additional mitigation techniques in some areas.

The cornerstone of SEU mitigation is triple modular redundancy, or TMR. This is the classic voting scheme, where circuitry is replicated three times and combined into a majority voter. In theory, if an SEU occurs in one block, the other two provide correct results. TMR schemes can detect and correct single-bit errors.
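The voting scheme described above can be sketched in a few lines. This is a minimal software illustration of a bitwise 2-of-3 majority voter, not how any particular FPGA tool implements it; `majority_vote` and `flip_bit` are hypothetical helper names.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (b & c) | (a & c)


def flip_bit(word, bit):
    """Model a single event upset by inverting one bit of a word."""
    return word ^ (1 << bit)
```

If one of the three copies takes an upset, the voter still returns the original word. If two copies take the same upset, the voter returns the corrupted value, which is the weakness that physical separation of the replicas is meant to address.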


Dialing in TMR by hand in a complex FPGA-based design could take forever, take a lot of area on the chip, and potentially mess up timing. Understanding the tradeoff between safety, area, and timing can make or break a project. Synopsys has invested significant research in its Synplify Premier tool, studying popular FPGA architectures and mitigation approaches, to automate the insertion of TMR during synthesis.

For instance, there are actually three flavors of TMR. Registers can be protected with local TMR (LTMR), a simple replication. However, researchers are finding SRAM-based FPGAs in space-qualified applications are still susceptible to upset using LTMR – geometries are small enough and events rapid enough that radiation strikes two or even all three blocks.

To protect I/O and logic and provide more hardening for space-based designs, distributed TMR (DTMR) physically separates the triplicated circuitry on the chip. Block TMR (BTMR) takes the approach a step further with physical separation and clock synchronization, and can be used with indivisible or encrypted IP blocks.

Synplify Premier handles all three of these TMR types and more mitigation techniques, with automated FPGA-aware synthesis techniques supporting all popular devices. Synopsys application engineer Sharath Duraiswami dives into the details in an archived webinar:

Building Highly Reliable FPGA Designs for Applications Needing Functional Safety

One idea Sharath discusses is “partial DTMR”, where voters around flushable dual flip-flops are optimized to save area when possible. He also shows how the physical separation works, along with Synplify mitigation techniques for each type of functional block in an FPGA including duplicate with compare (DWC), Hamming-3 encoding, and safe case FSM. One example even shows use of a Xilinx Zynq-7000 SoC using DWC techniques for error control.
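One of the techniques named above, Hamming-3 encoding, can be illustrated under an assumption: that it means choosing FSM state codes whose pairwise Hamming distance is at least 3, so that any single-bit upset still lies closest to the intended state. The sketch below follows that reading only; the state codes and function names are hypothetical and are not taken from Synplify.

```python
from itertools import combinations

# Hypothetical 5-bit codes for a 4-state FSM (illustrative values only).
STATE_CODES = [0b00000, 0b00111, 0b11001, 0b11110]


def hamming_distance(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def min_distance(codewords):
    """Smallest pairwise Hamming distance across a set of codes."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def correct_state(observed, codewords):
    """Return the code within one bit of the observed value, else None."""
    for code in codewords:
        if hamming_distance(observed, code) <= 1:
            return code
    return None
```

With a minimum distance of 3, a register value that is one bit away from a valid state cannot also be one bit away from another valid state, so the correction is unambiguous. A value two or more bits from every code returns `None`, which a safe-case style default branch could then handle.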


The webinar tips will be helpful for designers working with full rad-hard FPGAs or trying to harden safety-critical applications, whether working with Altera, Lattice, Microsemi, Xilinx, or other parts. It’s evident just how much work Synopsys has put into Synplify Premier to automate synthesis for a wide variety of scenarios, far beyond just blasting away with logic triplication. I like this presentation because it isn’t tied to just one FPGA architecture or vendor – each has its merits and limitations in safety-critical design that engineers need to be aware of.



The Mobility Imperative and the Untethered Consumer

by Alex Lidow on 12-10-2015 at 12:00 pm

Consumers want to be able to go where they want, when they want. They want televisions to be seamlessly synchronized with tablets, phones, laptops, and automobiles. They want all their communication, information, and entertainment to be available immediately, with high resolution, all the time. Recently the automobile industry has caught on to this trend and has begun to show its vision of the future for the fully mobile lifestyle.

They also do not want to worry about running out of battery life – no more looking for an outlet at the airport. This untethered life is the Mobility Imperative and it is driving innovation in consumer products, which in turn, is pushing the limits of silicon-based semiconductor technology.

As silicon power transistors (MOSFETs) run out of gas, gallium nitride transistors are the next generation semiconductor devices in the world of power conversion and data transmission. Enhancement-mode gallium nitride transistors (eGaN® FETs) from Efficient Power Conversion Corporation (EPC) have been in production for over five years. These devices are smaller in size, superior in performance, and lower in cost when compared with their aging silicon ancestor, the power MOSFET. GaN’s high-speed capability, coupled with lower production costs and smaller size, makes this technology ideal for accomplishing the Mobility Imperative.

Increasing Wireless Bandwidth – Increased Data Transmission, Increased Battery Life
Envelope tracking is a power supply technique for improving the energy efficiency of radio frequency power amplifiers by precisely tracking the power demand, in contrast to today’s fixed-power systems. In cell phones, envelope tracking means longer talk time; in base stations, it means smaller, cheaper amplifiers that consume far less energy and cost less to operate.

As our demand for wireless data grows, the value provided by envelope tracking increases dramatically. More transmitters alone cannot solve the problem; rather, more data transmission bandwidth per power amplifier is required. As the data transmission bandwidth increases, the efficiency of the transmitter’s power amplifier sharply declines unless the system adopts envelope-tracking methods.

Gallium nitride is being seen as an enabling technology for both envelope tracking converters and wide bandwidth RF Power Amplifier designs. The ultra-fast switching capabilities of eGaN FETs enable the high frequency, multi-phase buck converters used in envelope tracking power systems.
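The efficiency benefit of envelope tracking can be illustrated with a toy model. The sketch below assumes an ideal class-B output stage driving a 1 ohm load, which is a textbook idealization and not a description of any product mentioned here; the function names are hypothetical. In that model, output power is e²/2 and supply power is 2·V·e/π for envelope amplitude e and supply voltage V.

```python
import math


def class_b_efficiency(envelope, supply):
    """Energy efficiency of an ideal class-B stage over sampled amplitudes.

    envelope: output amplitude samples (volts, into an assumed 1 ohm load)
    supply:   supply voltage at each sample (volts)
    """
    if len(envelope) != len(supply):
        raise ValueError("envelope and supply must have the same length")
    p_out = sum(e * e / 2.0 for e in envelope)
    p_dc = sum(2.0 * v * e / math.pi for e, v in zip(envelope, supply))
    return p_out / p_dc


def fixed_supply(envelope):
    """Supply held at the peak envelope value for every sample."""
    peak = max(envelope)
    return [peak] * len(envelope)


def tracking_supply(envelope, headroom=0.0):
    """Supply that follows the envelope, plus optional fixed headroom."""
    return [e + headroom for e in envelope]
```

In this idealized model a supply that follows the envelope exactly stays at the class-B limit of π/4 at every amplitude, while a fixed supply only reaches that limit at the peak and wastes the difference at lower amplitudes.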


Figure 2: An example of an envelope tracking system using eGaN FETs. eGaN FETs are the tiny blue rectangles on the circuit board. (Photo courtesy of NewEdge Signal Solutions.)

Wireless Power Transfer Cuts the Cord…No Need to “find an outlet!”
Since Nikola Tesla first experimented with wireless power during the early years of the 20th century, there has been a quest to “cut the cord” of electrical power – and go wireless! Now, more than 100 years later, the technological capability to achieve Tesla’s vision is a reality.

Highly resonant wireless power transfer, based on the generation of magnetic fields, has proven to be a viable path. Magnetic fields offer the requisites for implementing wireless power: ease of use, robustness and, most importantly, they are considered safe. Applications for wireless power are endless, from charging cell phones and computers, to powering systems in hazardous environments and implantable medical devices.

With the explosion in the variety and number of mobile devices, wireless power transfer offers the convenience of charging batteries without the annoyance of cumbersome cables and the inconvenience of looking for outlets to “plug in.” Figure 3 is an illustration of what the home of the future might look like with all electrical appliances powered without power cords.

Over the past several years, three standards for wireless power transmission have emerged. These standards, put forth by industry consortia, include the Wireless Power Consortium’s Qi, the Power Matters Alliance standard and the AirFuel Alliance standard, also known as Rezence®. Only the technical approach embodied in the Rezence standard allows multiple gadgets to simultaneously charge from a single transmitter at a significant distance.


Figure 3: In the future electrical power cords may become obsolete as illustrated in this vision of the home of the future

The Rezence® standard for wireless power transmission is about to see rapid adoption in mobile phone and tablet charging applications. For example, several automotive manufacturers are planning to embed wireless charging systems in the center console of their vehicles so the smartphone, as well as other mobile devices, can remain charged despite intense and continuous usage while the automobile is in operation. Given that the Rezence standard requires a high frequency, 6.78 MHz, for power transmission, eGaN FETs are the heavy favorite for adoption over the slower and less efficient silicon power MOSFET.
Figure 4: Wireless power transfer will be used in automobiles to keep smartphones charged despite continuous usage as part of the infotainment system. (Photo courtesy of Gill Electronics.)
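To give a feel for what operating at 6.78 MHz implies for a resonant coil, the standard LC resonance relation f = 1/(2π√(LC)) can be turned around to find the tuning capacitance for a given coil. This is a minimal sketch; the 1 µH inductance in the usage note is a hypothetical value chosen only for illustration.

```python
import math

REZENCE_FREQUENCY_HZ = 6.78e6  # frequency cited for the Rezence standard


def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tank, in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def tuning_capacitance(frequency_hz, inductance_h):
    """Capacitance that resonates with the given inductance at frequency_hz."""
    omega = 2.0 * math.pi * frequency_hz
    return 1.0 / (omega * omega * inductance_h)
```

For an assumed 1 µH coil, `tuning_capacitance(REZENCE_FREQUENCY_HZ, 1e-6)` comes out at roughly 550 pF. Real designs also have to account for coil resistance, coupling and component tolerances, none of which this ideal formula captures.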

Wireless charging for electric vehicles is also becoming more available as electrically powered cars become more prevalent. Although there is no universal standard yet, loosely coupled magnetic energy transfer, similar to the method used in the Rezence standard, is common to all implementations. This is due to its ability to transfer power without precise alignment of transmitter and receiver units. eGaN FETs are certainly a good candidate technology for this application.

Automotive Sensing and Autonomous Control – Collision Avoidance or “Relax and Enjoy the Ride”
For safety reasons, it is critical that a car know what is around it at all times. This becomes even more essential as the car evolves into a self-driving machine. Further, the higher the speed of the vehicle and the more complex the surroundings, the faster the environmental sensing system needs to be, and the more precisely it needs to interpret the distance to a potential collision.

Today automotive manufacturers use a variety of sensors in these functions, including Light Detection and Ranging (LiDAR) sensors that have only recently begun to emerge in automotive sensing and autonomous driving applications.
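The link between sensing speed and distance precision comes from the time-of-flight principle that pulsed LiDAR relies on: a light pulse travels to the target and back, so the range is half the round-trip time multiplied by the speed of light. A minimal sketch, with hypothetical function names:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_s):
    """Distance to a target given the round-trip time of a light pulse."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def round_trip_for_range(distance_m):
    """Round-trip time of a light pulse to a target at distance_m."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S
```

One nanosecond of timing uncertainty corresponds to about 15 cm of range uncertainty, which is why fast pulse edges matter for precise distance measurement.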

Summary
In conclusion, a “Mobility Imperative” is upon us…the modern consumer is demanding that:
· they be free of the range anxiety caused by worrying about running out of battery life and having to “find an outlet”
· all their information and entertainment be available all the time via their smartphone…all in high resolution and all “right now”

Gallium nitride is the fundamental technology bringing the “Mobility Imperative” to reality since it provides:
· increased switching speed leading to higher resolution and less power consumption.
· smaller size, thus enabling product miniaturization and weight reduction.
· low product costs, thus stretching the consumer dollar farther.

Consumers want to be able to go “mobile” wherever and whenever they want…this is today’s Mobility Imperative and it is driving innovation throughout consumer electronics.

The current semiconductor technology, silicon power transistors (MOSFETs), has reached its performance limits; fortunately, gallium nitride transistors, with their high-speed capability coupled with lower production costs and smaller size, have come of age. It is GaN technology that will make Tesla’s vision come to fruition…and make it possible for us to achieve the Mobility Imperative.


Does Managing Tools as if they are IP Make Sense?

by Tom Simon on 12-10-2015 at 7:00 am

Years ago I thought that chip design companies would embrace the latest technology and be eager to adopt new tools. What I learned was that the people implementing and managing design projects were taking a lot of risks with almost every aspect of their projects. What they most wanted was to minimize risk from the design process – especially from design tool changes.

The reluctance to change goes much deeper. In the middle of a project a design team would never be willing to change tools, or even tool versions. Even minor updates from vendors can have subtle algorithmic changes that affect results. Beyond the obvious possibility of an outright bug, there can be variations in results that can affect every downstream step. This is true for implementation and sign off tools.

Chip companies spend significant resources on correlation and validation of tools. In some cases, known bugs in software are compensated for, and if a tool vendor were to suddenly fix the bug it could break the flow. Pretty much the only reason a design team will change any tool or tool version is to fix a show-stopper issue.

Now, think about how many tools there are in the typical design flow. Each one of these tools has configuration files, rule decks, libraries or stack-up information, and command scripts that drive the tool. If anything changes it can ripple downstream.

Broadening our scope, the same reasoning applies to all the PDK data. PDKs contain thousands of files. Stability of the PDK through a project is essential. Nevertheless, some projects cannot avoid PDK changes because the foundry is refining the process, and those changes need to be adopted across the entire project prior to tape-out.

Presently, large team projects usually already use data management for the design data, maybe even rule files. As we can see from the discussion above the same kind of management that is used for design data could be beneficial when applied to the tools in the flow.
Methodics Inc., a data management company for EDA, has just written about how they support complete management of the design environment using their software. They point out that large teams spread out in locations around the world need to have consistent and well managed tool environments. Treating the design environment as if it were IP allows a systematic way of managing all the tools in the flow.

Using a variety of techniques, it is possible to make setting up the user environment efficient and fast. One frequent concern is whether making multiple copies of all the tool installations is necessary. Methodics gives customers the choice of instantiating the files or using soft links to save space and copying time. Another important aspect of their solution is that it handles user-specific customization while ensuring, before the final project release, that known versions of all the tools are used in the final tape-out flow. It is also possible to switch tool release versions and keep old tool environments available in case there is a need to roll back a tool update.
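The idea of ensuring that known tool versions are used before release can be sketched as a check of an environment against a pinned manifest. This is a generic illustration, not the Methodics implementation or API; the function name and the tool names in the usage note are hypothetical.

```python
def find_tool_mismatches(pinned, actual):
    """Compare the tool versions in use against a pinned manifest.

    pinned: dict of tool name -> required version string
    actual: dict of tool name -> version string found in the environment
    Returns a dict of tool name -> (required, found), where found is None
    if the tool is missing from the environment.
    """
    mismatches = {}
    for tool, required in pinned.items():
        found = actual.get(tool)
        if found != required:
            mismatches[tool] = (required, found)
    return mismatches
```

For example, with a manifest of `{"synth_tool": "2015.03", "drc_tool": "4.2"}` and an environment reporting `{"synth_tool": "2015.09"}`, the check flags both the changed synthesis version and the missing DRC tool. An empty result would be the gate for a tape-out flow.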

The Methodics white paper goes into more detail describing the different ways their solution can be deployed. But there is no question that managing the software used for a design project is just as important as managing the design data itself.


5 Verification Challenges of IoT Solved by Emulation

by Pawan Fangaria on 12-09-2015 at 4:00 pm

Software-centric emulation environments are moving to the forefront of modern SoC verification. As more and more devices become IoT-enabled, SoCs have to make special provisions for communication, power usage, network switching, and so on. Also, the demand for an SoC (specifically for the smartphone, which is pivotal for IoT) to handle multiple functions pertaining to audio, video, data, mobile, etc. increases the size and complexity of the SoC significantly. Considering the three connection points for IoT (endpoint, gateway and cloud), the complexity of the chipsets increases in that order. An SoC for a cloud (data center) application, or even for a gateway application, has to handle what we call “Big Data” coming from multiple sensors on all connected devices through the system.

Traditional verification methods such as simulation with a testbench, or even traditional in-circuit emulation (ICE), are not sufficient for verifying such SoCs. There is a need for a more robust, software-based, virtual emulation solution that is scalable, flexible (with job sharing and remote access so that multiple teams can work at the same time), easy to use, and reliable without much cabling in the system.

By now we are aware of Mentor’s Veloce emulation system coming out of the closet for remote teams to run their live applications on it. During this year’s DAC, Mentor also announced Veloce’s integration with ANSYS’ PowerArtist for advanced real-time power analysis early in the design cycle.

Let’s review Veloce in the context of an emulation data center and see how it addresses the key verification challenges of an IoT-centric SoC or any other mobile SoC for that matter.


Veloce VirtuaLAB is a software-based virtual emulation data center with Enterprise Server capability that requires only the emulator and workstations to execute the software versions of the protocol models. The emulator has a single operating system, Veloce OS, on which all applications (internal or third party) run. The Enterprise Server optimizes resource utilization and job sharing through LSF software.

Multiple users from remote sites across the world can concurrently access this emulation system for multiple projects from their desktops. The VirtuaLAB models can be easily reconfigured by simply changing their compile parameters as required. Let’s see how this system addresses the five key challenges –

Protocol Solutions – Chips are accommodating an increasing number of protocols. Veloce offers software-based solutions including host/peripheral models, protocol exerciser/analyzers, and software debug for multiple market segments such as mobile, networking, multimedia, and storage interconnect. These are mostly IP-based. The solution is expected to be extended to other popular IoT frameworks as well.

Larger Designs – Veloce provides a scalable platform for increasing sizes of SoCs – Quattro for up to 256 million gates / system and up to 16 users; Maximus for up to 1 billion gates and up to 64 users; Double Maximus for up to 2 billion gates and up to 128 users.
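The capacity tiers listed above lend themselves to a simple lookup. The sketch below uses only the gate and user limits quoted in this paragraph; the table and function names are hypothetical and this is not a Mentor sizing tool.

```python
# (platform, maximum gates, maximum concurrent users), as quoted in the text
VELOCE_PLATFORMS = [
    ("Quattro", 256_000_000, 16),
    ("Maximus", 1_000_000_000, 64),
    ("Double Maximus", 2_000_000_000, 128),
]


def smallest_platform(gate_count, user_count):
    """Return the first listed platform that fits both limits, else None."""
    for name, max_gates, max_users in VELOCE_PLATFORMS:
        if gate_count <= max_gates and user_count <= max_users:
            return name
    return None
```

A 300-million-gate design with 10 users lands on Maximus because it exceeds the Quattro gate limit, while a small design shared by 100 users needs Double Maximus for the user count alone.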

Lower Power – Veloce can boot the OS and run billions of cycles to fully exercise live software applications running on the target hardware. As said above, third party power analysis tools such as PowerArtist can be integrated with Veloce to get accurate power analysis numbers for real applications early in the design flow. This is a unique capability of Veloce.

Debugging large software content – The software in IoT context can be varied for edge, gateway, and cloud. Veloce provides a virtual environment where different applications can run on the same Veloce OS. Virtual Probes which provide virtual connection with the software debugger are used for live interactive debugging.


When the emulator is off, Veloce Codelink supports offline and re-playable debugging for multiple, concurrent users. Multiple databases are generated by the emulator which can be used offline for debugging, thus freeing up the emulator for other tasks and users.

Network Switch and Router Ports – In IoT applications an SoC can have thousands of ports, making it impossible to provide connections in a hardware test environment. Veloce VirtuaLAB moves most of the test environment into software, where the emulator is connected to the user environment on a workstation through one or more software connections that enable the user to interact with the DUT running in the emulator. As an example, for Ethernet, there is an Ethernet Packet Generator and Monitor (EPGM) application that runs on the workstation to generate virtual Ethernet traffic and provide visibility, analysis, and user control of the traffic.

The Veloce virtual emulation data center is a step in the right direction to support the massive amount of verification needed to design the products and networks for the IoT world. It provides better reliability, scale of operation with a multi-user environment and remote access, lower cost of operation, and better quality with higher debug visibility in a software-based environment.

More details can be found in a whitepaper written by Richard Pugh on the Mentor Graphics website HERE.

Also read: Power Analysis Needs Shift in Methodology

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


Cadence Enters the RTL Power Estimation Game

by Bernard Murphy on 12-09-2015 at 12:00 pm

At the Cadence front-end summit last week, Jay Roy presented the Cadence Joules solution for RTL (and gate-level) power estimation. Jay is ex-Apache, so he knows his way around RTL power estimation, which should make Joules a product to watch. Joules connects natively to Palladium for power characterization under realistic software loads, which I’ll cover in a separate blog. Here I want to focus on Joules as a characterization competitor to Apache/Ansys, Atrenta/Synopsys and other products.

Jay’s claim, and I think he’s right today, is that Joules has all the pieces to get high accuracy for RTL power estimation. They have Genus for synthesis and Innovus for implementation so they can do (somewhat) production quality fast estimations straight from RTL and know they are (somewhat) going to correlate with the real implementation, and therefore they can get power estimates from RTL simulations which will correlate within ~15% of gate-level estimation. Jay showed a comparison table which indeed supports this assertion.
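To make the ~15% correlation figure concrete, here is a minimal sketch of how an RTL estimate might be compared against a gate-level reference. The function names are hypothetical, any power numbers you pass in are your own, and this is not how Joules reports correlation.

```python
def relative_deviation(rtl_power, gate_power):
    """Deviation of an RTL estimate from the gate-level figure, as a fraction."""
    if gate_power <= 0:
        raise ValueError("gate_power must be positive")
    return abs(rtl_power - gate_power) / gate_power


def within_tolerance(rtl_power, gate_power, tolerance=0.15):
    """True if the RTL estimate is within the given fraction of gate level."""
    return relative_deviation(rtl_power, gate_power) <= tolerance
```

Both arguments must use the same unit. Tracking this deviation per block and per workload, rather than as a single chip-level number, is one way to start separating systematic from design-dependent miscorrelation.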

You may notice I am (somewhat) hedging my support for the attainable level of accuracy. I also know a little about this domain and some of the challenges in RTL estimation. Part of the problem is indeed in using the same tools for fast physical synthesis as you use for production implementation. But that’s not all of the problem. Fast physical synthesis is fast because it cuts corners and that can lead to correlation problems between RTL and gate-level estimates, even if you use the same physical synthesis tools you use for production.

It seems obvious that the way to understand this problem should be a detailed analysis of sources of miscorrelation between RTL and gate-level estimates. But I have yet to see such an analysis from any provider and that’s a problem because it leads to unscientific trial and error approaches to improving correlation, with no deep understanding. Scientific approaches (you know, start with a hypothesis, test against data) would provide a credible basis for knowing how to repeatably improve correlation or, just as important, knowing that perhaps 15% is as low as you can go and you cannot repeatably improve on that. This would be a lot of work, but whoever does this first will be able to claim the laurels of true expertise in this domain.

I don’t think it is necessary to test every conceivable design – that would not be a scientific approach. Useful hypotheses are simple – I’ll offer a couple to get the ball rolling. First, I believe the harder you push performance, the worse the correlation will become. The harder you push, the more buffers have to be upsized; also there are implications for routing in the presence of factors not considered in fast estimation (DFT, detailed routing, signal integrity, …), leading to yet more buffer upsizing, further impacting power. A related but not identical hypothesis is that accuracy will negatively correlate with the number of near-critical paths. As you get into implementation, some of these will become critical, requiring (probably) buffer upsizing; the more of these you have, the more implemented circuit power will deviate from initial estimates. Cadence has a running start with a fully integrated solution which should minimize known systematic sources of error from the estimation tool – they could lead the field with a detailed correlation analysis.

None of this is intended to diminish the role Joules can play today. As far as I know, Cadence today has the only full in-house flow for estimation based on implementation-class physical synthesis, so it is likely to be best-in-class for estimation until Synopsys inevitably releases something similar. And then they will both have a significant edge in accuracy over Apache and Calypto for the foreseeable future.



New Book: Mobile Unleashed!

by Daniel Nenni on 12-08-2015 at 6:00 pm

This is the origin story of technology super heroes: the creators and founders of ARM, the company that is responsible for the processors found inside 95% of the world’s mobile devices today. This is also the evolution story of how three companies – Apple, Samsung, and Qualcomm – put ARM technology in the hands of billions of people through smartphones, tablets, music players, and more.

It was anything but a straight line from idea to success for ARM. The story starts with the triumph of BBC Micro engineers Steve Furber and Sophie Wilson, who make the audacious decision to design their own microprocessor – and it works the first time. The question becomes, how to sell it? Part I follows ARM as its founders launch their own company, select a new leader, a new strategy, and find themselves partnered with Apple, TI, Nokia, and other companies just as digital technology starts to unleash mobile devices. ARM grows rapidly, even as other semiconductor firms struggle in the dot com meltdown, and establishes itself as a standard for embedded RISC processors.

Apple aficionados will find the opening of Part II of interest the moment Steve Jobs returns and changes the direction toward fulfilling consumer dreams. Samsung devotees will see how that firm evolved from its earliest days in consumer electronics and semiconductors through a philosophical shift to innovation. Qualcomm followers will learn much of their history as it plays out from satellite communications to development of a mobile phone standard and emergence as a leading fabless semiconductor company.

If ARM could be summarized in one word, it would be “collaboration.” Throughout this story, from Foreword to Epilogue, efforts to develop an ecosystem are highlighted. Familiar names such as Google, Intel, Mediatek, Microsoft, Motorola, TSMC, and others are interwoven throughout. The evolution of ARM’s first 25 years as a company wraps up with a shift to its next strategy: the Internet of Things, the ultimate connector for people and devices.

Research for this story is extensive, simplifying a complex mobile industry timeline and uncovering critical points where ARM and other companies made fateful and sometimes surprising decisions. Rare photos, summary diagrams and tables, and unique perspectives from insiders add insight to this important telling of technology history.

The foreword by Sir Robin Saxby alone is worth the price of admission, not to mention the picture of Simon Segars as a young engineer when he first joined ARM… 😉 There is also a cameo by Wally Rhines from his TI days.

I truly believe you need to fully understand, as a semiconductor professional, how we got to where we are today to better understand where we are going tomorrow and that is what this book is all about. On a personal note, writing books like this is a lot like giving birth (although my wife may disagree). It was nine months of hard work but let me tell you one thing, Don Dingee made this whole process a lot easier. Don is the most dedicated, thorough, and hardworking researcher I have ever worked with, absolutely!


The Twists and Turns of Xilinx vs Altera!

by Daniel Nenni on 12-08-2015 at 12:00 pm

The battle between Xilinx and Altera continues to be one of the more interesting stories to cover. It really is the semiconductor version of a reality TV show. In the beginning it was two fabless companies partnered with rival foundries going head-to-head controlling a single market that touches a variety of industries.

Then things got interesting when Xilinx left UMC to share TSMC with Altera, taking the foundry differences out of the equation. Next Altera left TSMC for Intel? Say what!?!?! Then Altera did a head fake back to TSMC and Intel bought them for a 56% premium! Now that process differences are back in the equation, let’s take another look.

In the beginning FPGA vendors had very close relationships with the foundries. FPGAs were used during process development and ramping because their dense design blocks are repeated throughout the chip. It was a very intimate partnership, one that made FPGAs bleeding-edge chips at the process level. That all changed of course at 28nm when Xilinx joined Altera at TSMC, which brought a level playing field where design and implementation were key.

It is well documented that the first FPGA vendor to a new process node is awarded majority market share. Even Intel CEO Brian Krzanich recently called it out as one of the three reasons why Intel bought Altera during a fireside chat with John Pitzer of Credit Suisse:

“there is strong data that suggests that the percentage — the company that’s had leadership position, first products on the first node, you have to have the right design, right architectural point and all, but they’ve tended to gain share.”

For the record, Xilinx won the 28nm battle by out-implementing Altera at TSMC 28nm and again at TSMC 20nm. Altera moving to Intel Custom Foundry made the race to FinFET interesting but Xilinx again won that one. In fact, I have yet to see Altera/Intel 14nm silicon while Xilinx started shipping FinFET parts at the end of Q3 2015. Last word on the Altera roadmap had “Cedar” (replacing Cyclone) fabricated using TSMC 16nm (due to “cost/power” reasons) to be delivered 1H 2016. “Oak” is Intel 14nm and is due in 2H 2016. “Sequoia” is Intel 10nm due sometime in 2018.

The good news is that technically Altera will win the 10nm process node race since Xilinx is skipping 10nm in favor of an accelerated 7nm process. The bad news is that TSMC 7nm will be in production at about the same time as Intel 10nm so it will be a hollow win.

It is too soon to tell if the Intel process is favorable to FPGAs; 14nm will tell us that next year. I highly doubt Intel will use Altera FPGAs to ramp their 10nm process, so it really is a coin toss. Even so, it may not be enough advantage over Xilinx/TSMC if they are a process node ahead. Remember, I’m not a journalist reporting the news here; this is the opinion, observation, and experience of a 30+ year fabless semiconductor professional who also likes to write.


Syncing Up CDC Signals in Low Power Designs

by Ellie Burns on 12-08-2015 at 7:00 am

So far in my blog series on low power we’ve looked broadly at what’s changing in the low power verification landscape and focused on a new methodology developed by Mentor Graphics and ARM called successive refinement, which is now included in the UPF standard. Power management techniques create their own brand of clock domain crossing (or CDC) problems, so it is important to include CDC verification in the successive refinement, or any, verification flow.

In order to better understand these new CDC challenges and how to deal with them, I’ve asked our resident CDC expert, Ping Yeung, to join me. Ping has studied these issues and helped develop their solutions for 15 years and presented several papers on the topic.

Hello Ping. How about starting out with what CDC is in general, and what is it trying to address?

Certainly, Ellie. CDC signals are having a greater impact because people are breaking up their designs into more clock and power domains. Partitioning an ASIC or SoC into multiple power domains is a very effective way to reduce power consumption. The power of these domains is then controlled by either switching off power or reducing voltage levels.

The tricky thing is that partitioning a design creates various challenges because all of the signals going to and from these different domains need to be synchronized. If they are not, they will behave unpredictably. When it comes to power specifically, the interdependence of logic between power domains requires designers to add isolation, retention, and level shifter components at the power domain interfaces. The addition of power control logic, which is becoming increasingly prevalent in all sorts of designs, introduces new challenges to both the design and verification efforts.

There are two main challenges.

The first problem introduced by splitting the design into multiple power domains is that when clock signals cross power domains, they are not synchronous anymore because of the level shifters or isolation cells that have been inserted at the domain interfaces. The power domains affect the clock tree and the reset tree, and every part of the design requires the clock and reset signals to operate correctly. So when you’re partitioning your design into multiple power domains, you have to be very careful about how those clock and reset trees are distributed.

This is not a big concern with clock gating, but when you start using multi-voltage techniques, that’s where real problems come in. The same goes for dynamic voltage and frequency scaling (or DVFS), which is useful for making tradeoffs between performance and power by scaling the voltage either up or down. But when you shift the voltages, signals crossing from one power domain to the other must move from one voltage level to another. As a result, clocks in different power domains with significant voltage differences are no longer synchronous even though they may have the same source.

Now let’s talk about the second issue. When you introduce power domains in a design, they need to be controlled by various control signals that are generated by a power controller. This entails introducing new modules and new signals into the RTL code (i.e., the power controller and all the power management architecture that goes with it). These are not part of the original specification or the RTL code, but are captured in the Universal Power Format, the UPF file.

All of those control signals are generated by the power controller, so they belong to the power controller clock domain. And the power controller clock often runs independently from the design clock, resulting in a major clock domain crossing between the power controller and the rest of the design. So you have to make sure all these power control signals, which are introduced by the UPF, are synchronized with the rest of the design before they’re used. If you don’t, you have synchronization problems and unpredictable behaviors again.
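As a rough illustration of what "synchronized before they’re used" means in practice, here is a small behavioral sketch in Python. It assumes a two-flop synchronizer in the destination clock domain, which is a common structure but is not named in the interview, so treat that choice as an assumption. The class name and the `iso_enable_from_controller` signal are hypothetical, and the model only captures the two-cycle latency; it does not simulate metastability.

```python
class TwoFlopSynchronizer:
    """Behavioral model of a two-flop synchronizer in the destination clock domain.

    Only the two-edge latency is modeled; real metastability is not simulated.
    """

    def __init__(self, reset_value=0):
        self.stage1 = reset_value  # first flop: samples the asynchronous input
        self.stage2 = reset_value  # second flop: output that downstream logic may use

    def clock_edge(self, async_in):
        """Advance one destination-clock edge and return the synchronized output."""
        # Both flops update on the same edge, so stage2 takes the OLD stage1 value.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2


# Hypothetical power-controller signal (for example an isolation enable),
# sampled at successive design-clock edges; it rises at the second edge.
sync = TwoFlopSynchronizer()
iso_enable_from_controller = [0, 1, 1, 1, 1]
seen_by_design = [sync.clock_edge(v) for v in iso_enable_from_controller]
```

In this model the design-side logic only ever reads `seen_by_design`, never the raw controller signal, which is the property a CDC check is looking for.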



Figure 1. Power Aware CDC Flow

So, what is the solution?

In order to verify this type of design, you need to represent the design with all of the domain information in place. The designer needs to know which power domain a signal belongs to, which clock domain it belongs to, and which reset domain it belongs to. That way they can decide whether there is a CDC problem or not. Because the power domain will impact the clock and reset domains, we want to present this information so that when the designer makes a partitioning decision, they immediately understand the impact on the rest of the design.
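The idea of tagging every signal with its three domains can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Endpoint` type, the function names, and the review rule (flag any path that crosses a clock domain, or crosses a power domain even on the same clock) are hypothetical and deliberately conservative. The actual criteria a tool such as Questa CDC applies are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """A signal endpoint tagged with the three kinds of domain it belongs to."""
    name: str
    clock_domain: str
    power_domain: str
    reset_domain: str


def crossing_kinds(src, dst):
    """Return the set of domain boundaries that a src -> dst path crosses."""
    kinds = set()
    if src.clock_domain != dst.clock_domain:
        kinds.add("clock")
    if src.power_domain != dst.power_domain:
        kinds.add("power")
    if src.reset_domain != dst.reset_domain:
        kinds.add("reset")
    return kinds


def needs_cdc_review(src, dst):
    """Conservative rule (an assumption of this sketch).

    Review any path that crosses a clock domain, and also any path that
    crosses a power domain, since clocks from the same source may no longer
    be synchronous once the domains run at different voltages.
    """
    kinds = crossing_kinds(src, dst)
    return "clock" in kinds or "power" in kinds


# Hypothetical example: a power-controller output driving a register in the core.
ctrl = Endpoint("iso_enable", "pc_clk", "PD_AON", "rst_aon")
core = Endpoint("core_reg", "core_clk", "PD_CORE", "rst_core")
flagged = needs_cdc_review(ctrl, core)
```

With all three tags on every endpoint, a change to the power partitioning shows up directly as new entries in `crossing_kinds`, which is the kind of immediate feedback described above.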

Designers now have a new flow available to help them get these tricky domain crossings right. Mentor’s Questa CDC solution supports a new concept in UPF 2.0 and UPF 2.1 called the power supply set (not to be confused with the power supply net). This is a power network grouping option that allows designers to define and test the power distribution network earlier in the project cycle, before the power distribution network has been implemented. So you don’t need to know the physical implementation yet. Using the supply set, you can define the power domain at the RTL level early in the design cycle without knowing which voltage or power supply net it is actually connected to. So you can verify this regardless of how the supply net will be implemented.

The new power distribution network in UPF 2.1 is an example of the successive refinement methodology you’ve been talking about in your blogs, Ellie. In this context, the power network can be incrementally built over the duration of the project cycle by the different design project teams. The block and system designers can begin to verify the power management logic before the power distribution network has been implemented, then the final power management logic verification will occur later in the design flow when the physical designers add the power distribution network.

That’s very useful because when you are integrating multiple IP blocks, every IP block can come with its own supply set. Now when you integrate, you can look at the supply sets at the IP level and start finding common ground that satisfies all of them. And then you can design the supply net.

When we do the verification, we can now actually look at the supply set and figure out, by looking at the supply set, how many power domains you have in your design. Using the supply set information you can make sure the design can support those power domains later on. You use the supply set to verify whether your clock and reset trees were partitioned correctly. Once you have the tree information refined, you can run the CDC analysis using Questa CDC.

Our customers have already had success using our CDC solution in production. For example, AMD presented a Power Aware CDC (PA-CDC) verification paper at DAC in 2014. One of the things they uncovered was that control signals were not synchronized. They reported that complete PA-CDC checking at the RTL allowed them to find issues earlier in the design cycle and make turnaround times faster, and they used the PA-CDC flow to double check their generated UPF files against the RTL design.

Thank you, Ping, for giving us a quick insight into the effects of advanced low power management techniques on CDC design and verification. For those wanting to learn more, please check out the new whitepaper, Power Aware CDC Verification of Dynamic Frequency and Voltage Scaling (DVFS) Artifacts, which was presented at DVCon Europe 2015.

I’ll see you next year when we talk about what to look for in a low power debugging flow.