
Extending Moore in Silicon

by Daniel Nenni on 04-30-2015 at 7:00 am

A year ago many eulogized the death of Moore’s Law at 28nm due to higher prices per transistor at more advanced nodes, but now that we have celebrated the 50th anniversary let’s look ahead to technology scaling and electronic systems miniaturization for the next decade. Despite our industry’s bipolar tendencies and daunting technical challenges, I remain soberly optimistic about the future of semiconductors. Here is one reason why:

“The technology shift from planar to 3D MOSFETs has created new opportunities for disruptive device and process innovations that can extend Moore’s Law.” –Jeff Wolf, CEO of FinScale Inc.

Has technology scaling brought us into a physical domain where new approaches are needed? Many think so, and it seems counter-intuitive (to me) that we could capture all of the potential gains offered by the shift to 3D MOSFETs using the industry’s now institutionalized tick-tock approach to device and process development. A break from past paradigms should be explored when transitioning from two to three dimensions in any context – as we’ve already experienced in graphics, CAE and video games.

I recently met with founders of FinScale, a device and process innovation start-up, who introduced me to their 3D quantum FinFET. As CTO Victor Koldyaev described how FinScale’s qFinFET harnesses positive quantum effects and mitigates negative QE, I had to stop him and ask, “Do you really talk to customers like this?” It was refreshing to hear scientific support for how we can continue to scale forward with Moore’s Law benefits, and made me realize we haven’t been getting as much technical explanation from manufacturers about their process and yield improvement learning as in the past.

We should expect changes in the way we describe semiconductor technology in the quasi-ballistic regime. I don’t yet hear many talking about the fundamentals of sub-20 nm 3D MOSFET operation, which are increasingly dominated by the atomic properties of materials and quantum effects. Beyond our vital discussions about litho, punch-through and electrostatic control, I’m looking forward to hearing more discussions about quantum inversion layers, ballistic transport, scattering centers and carrier relaxation times.

FinScale touts qFinFET’s many-node Moore’s Law scaling roadmap in silicon down to the 5 nm node, on either bulk or SOI, with dimensions specified for critical device features at each node. The FEOL process sequence is defined, including many FinScale innovations that work together to improve performance, density and power efficiency, and lower the manufacturing cost of fin-based devices. FinScale claims that qFinFET can be readily fabricated using existing advanced node process modules, equipment and materials, and provide fin-based device solutions for logic, embedded and stand-alone memories, analog, RF and image sensors.

While this all sounds good and is well-supported by scientific and advanced manufacturing technology research, FinScale’s inventions need to be validated in practice by a high-volume foundry or IDM early adopter, which is FinScale’s goal.

I think a compelling case can be made for start-ups as enablers of the industry’s path forward – to incubate potential breakthroughs with focused intensity on accelerated timelines. The figure below shows the recent results for the all-in-house approach to R&D. For the past five years the time gaps between node-to-node transitions at leading manufacturers are increasing. It doesn’t appear that business-as-usual is getting the industry to where it needs to be according to Moore. These gaps are opportunities that FinScale aims to fill.

http://www.semi.org/en/node/50391, accessed 13 April 2015

I’m encouraged to see renewed investment interest in semiconductor start-ups after declining deal flow in recent years. Our industry needs to keep seeding and cultivating promising start-ups and disruptive technologies so that they’re ready when needed to solve difficult challenges and capitalize on tomorrow’s world-changing opportunities.


Single Chip MCU + DSP Architecture for Automotive: SAMV71

by Eric Esteve on 04-29-2015 at 7:00 pm

It’s all about Cost of Ownership (CoO) and system-level integration. If you target automotive-related applications, like audio or video processing or system control (motor control, inverters…), you need to integrate a high-performance MCU with a DSP. In fact, if you expect your system to support an Audio Video Bridging (AVB) MAC on top of the targeted application and to obtain automotive qualification, the ARM Cortex-M7 processor-based Atmel SAMV70/71 should be your selection: it offers the fastest clock speed of its kind (300 MHz), integrates a DSP with a Floating Point Unit (FPU), supports AVB and is qualified for automotive.

Let’s have a closer look at the SAMV71 internal architecture:

When developing a system around a Micro Controller Unit (MCU), you expect this single chip to support as many peripherals as your application needs, to minimize the global cost of ownership. That’s why you can see the long list of system peripherals (top left of the block diagram).
The Atmel SAMV71 is dedicated to automotive infotainment applications, hence the dual CAN and Ethernet MAC support (bottom right). If we dig into these functions, the supported features include:

  • 10/100 Mbps, IEEE1588 support
  • MII (144-pin), RMII (64-, 100-, 144-pin)
  • 12 KB SRAM plus DMA
  • AVB support with 802.1Qav & Qas hardware support for audio traffic
  • 802.3az Energy Efficient Ethernet support
  • Dual CAN-FD
  • Up to 64 SRAM-based Mailboxes
  • Wake-up from Sleep or Wake-up modes on RX/TX

The automotive-qualified SAM V70 and V71 series also offer high-speed USB with integrated PHY and MediaLB, which, when combined with the Cortex-M7 DSP extensions, make the series ideal for infotainment connectivity and audio applications. Let’s take a look at this DSP benchmark:

If you are not limited by budget considerations and can afford to integrate a standalone DSP along with an MCU, you will probably select the SHARC 21489 DSP (from Analog Devices), which offers best-in-class benchmark results for FIR, Biquad and real FFT. But such performance has a cost, not only a dollar cost but also in terms of power consumption and board footprint; let’s call it “cost of ownership”. Automotive applications run in production at millions of units per year, and unit cost is absolutely crucial in this market segment, so you will quickly decide to go with an integrated solution.

To support audio or video infotainment applications, you expect the DSP integrated in the Cortex-M7 to be “good enough”, and you can see from these benchmark results that this is the case for Biquad, for example: the ARM CM7 is equal to or better than any other DSP (TI C28x, Blackfin 50x or 70x) except the SHARC 21489… but much cheaper! Good enough means that the SAMV70 will support automotive audio (Biquad in this case) and keep enough DSP headroom for Ethernet MAC (10/100 Mbps, IEEE 1588) support.
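For readers unfamiliar with the benchmark kernel: a Biquad is simply a second-order IIR filter section, ubiquitous in audio processing (equalizers, tone controls). A minimal sketch in Python conveys the per-sample arithmetic a DSP must sustain (illustrative only; the actual benchmarks are optimized C/assembly, and the coefficients below are arbitrary):

```python
def biquad(samples, b0, b1, b2, a1, a2):
    """Direct Form I biquad: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
    Each output sample costs five multiply-accumulates, which is why
    FPU/DSP extensions matter so much for this kernel."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

# Identity coefficients (b0=1, rest 0) pass the signal through unchanged.
print(biquad([1.0, 2.0, 3.0], 1.0, 0.0, 0.0, 0.0, 0.0))  # → [1.0, 2.0, 3.0]
```

On a Cortex-M7, the inner loop can map onto the core’s floating-point multiply-accumulate instructions, which is what such benchmarks are really exercising.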

The picture above shows the logical SAMV71 architectures for Ethernet AVB support, and how the DSP capabilities can be used for a Telematics Control Unit (TCU) or an audio amplifier.

Integrating a DSP means that you need to develop the related DSP code. Because the DSP is tightly integrated into the ARM CM7 core, you may use the MCU development tools (rather than specific DSP tools) to develop your code. Since February, the ATSAMV71-XULT (full-featured Xplained board) has been available from Atmel. As this board is built around the feature-rich SAMV71, you can develop your automotive application on exactly the same MCU architecture as the part going into production:

More information about this evaluation/development board is available on Atmel web here.

By Eric Esteve from IPNEST


Automating Timing Closure Using Interconnect IP, Physical Information

by Majeed Ahmad on 04-29-2015 at 1:00 pm

Timing closure is a “tortoise” for some system-on-chip (SoC) designers, much the way many digital designers call RF design a “black art”. Chip designers often tell horror stories of doing up to 20 back-end physical synthesis place & route (SP&R) iterations, with each iteration taking a week or more. “Timing closure”, a largely inefficient process mired in uncertainty, can effectively turn into “timing experimentation” if the factors that delay or prevent closure are not addressed in the early stages of the design flow.

Today’s new, larger and more complex SoC designs are developed in even shorter timeframes and increasingly demand higher productivity in each phase of the design cycle. Increasing on-chip interconnect design efficiency has historically been key to the recent reductions in SoC design times because the on-chip network touches every IP on the chip and must adapt as the SoC design changes.

The interconnect fabric is akin to an SoC’s skeleton and nervous system, holding the SoC together and managing its on-chip communications while living in the narrow confines of the floorplan “white space” lanes between IP blocks. Increasing the efficiency of interconnect design automatically increases overall SoC design efficiency. However, as chips have grown in size and the number of IP blocks increased, the interconnect IP has become the major source of long timing paths that must be stretched through congested areas of the SoC. Because of these long paths, the interconnect fabric has become a major source of timing closure issues that until now were only uncovered in the back-end SP&R phases of the chip design process.


Arteris has launched FlexNoC Physical interconnect IP to automate timing closure

Physically Aware Interconnect IP?

Trying to solve physical timing closure problems by improving the on-chip interconnect IP seems strange at first, but makes sense once you understand that most timing issues arise from within the interconnect IP itself. Furthermore, using the interconnect IP as a foundation for front-end technology that eases back-end design issues makes sense when you consider that Arteris Inc. claims to have 52 active customers and nearly 200 SoC projects. Moreover, the Silicon Valley interconnect IP supplier is adding eight to 10 new customers every year (Arteris added nine in 2014).

Interesting fact: Most of the world’s smartphones use the Arteris FlexNoC IP fabric, and these application processors and digital baseband modems are exactly the types of complex SoCs that take the longest to close timing.

Arteris’ NoC technology has been on the forefront in minimizing wire routing congestion and reducing silicon area, cost and power consumption of all kinds of chips. Now Arteris is taking on the next interconnect challenge in the SoC design flow: Timing closure. The interconnect IP firm is doing this by leveraging its network-on-chip architecture to accelerate physical design.

The Campbell, California–based supplier of interconnect IP claims that it has developed a way for chip design teams to cut months off development time by automating the timing closure process through the use of physically aware network-on-chip interconnect fabric IP.


Interconnect is skeleton and nervous system of SoCs

SoC designs typically change eight to 10 times during a chip design lifecycle due to market and technical factors, and the problem of SoC design iterations is likely to get worse at smaller nodes like 16nm, 14nm, and 10nm. The fact that physical design is getting in the way of SoC development makes it imperative for chip developers to visualize the SoC project and see what’s going on from a physical perspective.

Arteris announced the availability of FlexNoC Physical interconnect IP at the Linley Mobile Conference on April 22, 2015. Arteris’ FlexNoC Physical interconnect IP has automated the placement of pipeline stage IP on long paths that cause timing closure issues.

Arteris President and CEO K. Charles Janac says that the FlexNoC Physical interconnect IP can help the RTL implementation team to automatically add pipelines for timing closure, cutting months off complex SoC development cycles. He added that the new interconnect solution helps SoC architects to visualize the physical implications of their topologies early in the design cycle and address physical layout and floorplan issues prior to completing the back-end design.

How FlexNoC Physical Works

Usually, SoC teams over-design their chips in the front-end stage to avoid timing problems in the back-end, creating excess timing slack and increasing die area and power consumption. So FlexNoC Physical IP intelligently estimates and predicts in the front-end phase where timing issues will occur in the back-end, allowing design teams to implement the minimum number of pipeline stages needed to achieve the desired frequencies. Moreover, SoC designers can generate interconnect floorplan outlines and treat the interconnect as a separate IP to carry out physical synthesis place-and-route independently. That simplifies the job of the layout team.

The physically aware interconnect IP determines where pipeline stages must be used and adds pipelines automatically. It evaluates all timing arcs in the NoC interconnect because distance and logic depth dictate number of pipeline stages. The interconnect IP first predicts and then implements pipeline location while minimizing area and latency.
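Arteris has not published its algorithm, but the flavor of the estimate can be sketched as follows (the function, parameters and numbers here are hypothetical assumptions for illustration, not the FlexNoC Physical implementation):

```python
import math

def min_pipeline_stages(path_length_um, reach_per_cycle_um,
                        logic_levels=0, levels_per_cycle=12):
    """Estimate pipeline stages needed so that both wire distance and
    logic depth fit within one clock cycle per stage.
    reach_per_cycle_um: how far a signal can travel in one cycle.
    levels_per_cycle: how many logic levels fit in one cycle.
    All parameters are illustrative assumptions."""
    cycles_for_wire = path_length_um / reach_per_cycle_um
    cycles_for_logic = logic_levels / levels_per_cycle
    cycles = max(cycles_for_wire, cycles_for_logic)
    # A path needing N cycles requires N-1 intermediate pipeline stages.
    return max(0, math.ceil(cycles) - 1)

print(min_pipeline_stages(3000, 1000))  # 3 cycles of wire → 2 stages
print(min_pipeline_stages(500, 1000))   # fits in one cycle → 0 stages
```

The interesting part of the real tool is choosing *where* along the path each stage goes, since placement affects area, congestion and latency, which a distance-only model like this one ignores.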


Timing closure with physical synthesis

Additionally, Arteris has developed a more effective data exchange between front-end and back-end design tools and IP, accelerating physical design by feeding it better data. That speeds up layout with physically aware interconnect IP, which in turn allows isolating the NoC physical instance to optimize timing and routing.

Design teams can now optimize the NoC physical instance independently of the rest of SoC. The FlexNoC Physical solution leverages the architectural knowledge of the SoC interconnect to accelerate timing closure as well as improve layout quality of results (QoR) by using less slack to meet timing.

Using FlexNoC Physical IP can Help Avoid Disaster

Timing closure can take months, and mistakes can cause subsequent delays in the market introduction. Arteris claims that accelerating timing closure through the use of NoC interconnect fabric enables SoC designers to cut months off their development time and gets them to market faster.

Take the case of an SoC maker that ended up designing an un-manufacturable chip. The company had a very firewalled relationship between its front-end design, architecture team and back-end layout team. The front-end team created a design that couldn’t be placed by the back-end while meeting timing and frequency requirements. The chipmaker had to redo the design, which was eventually canceled.

The bottom line was that the firm paid more than twice the normal R&D costs for a chip that was never made. Obviously, $200 million is a lot of money to pay for a timing closure failure.

Also read:

Rockchip Bets on Arteris FlexNoC Interconnect IP to Leapfrog SoC Design

Got FPGA Timing Problems?

What’s New with Static Timing Analysis


How Good Are Your Clocks?

by Paul McLellan on 04-29-2015 at 7:00 am

One of the trickiest tasks in designing a modern SoC is getting the clock tree(s) right. The two big reasons for this:

  • the clocks can consume 30% or more of the power of the whole chip, so minimizing the number of buffers inserted is critical to keeping power under control
  • the clock insertion delay and clock skew have a major impact on timing. If a flop on the early side of the skew window drives a flop on the late side, or vice versa, it can consume a large part of the setup/hold margin and so affect the maximum clock frequency that the chip will work at
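The second point can be made concrete with a toy setup-slack calculation (all numbers illustrative):

```python
def setup_slack(period, launch_clk, capture_clk, clk_to_q, data_delay, setup):
    """Setup slack for a flop-to-flop path. launch_clk/capture_clk are the
    clock arrival times at each flop; their difference is the skew."""
    data_arrival = launch_clk + clk_to_q + data_delay
    data_required = capture_clk + period - setup
    return data_required - data_arrival

# 1 ns clock, a path that just meets timing with zero skew...
print(setup_slack(1.0, 0.0, 0.0, 0.1, 0.85, 0.05))   # ~0.0 ns
# ...fails once the capture clock arrives 100 ps early (unfavorable skew).
print(setup_slack(1.0, 0.0, -0.1, 0.1, 0.85, 0.05))  # ~-0.1 ns
```

To make the second path work, the clock period has to stretch until the slack is non-negative again, which is exactly how skew eats into the maximum clock frequency.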

The clock tree is actually constructed in physical design, during the clock-tree synthesis (CTS) phase. This is driven by constraints provided by the design team, so a large part of producing a good clock tree is creating good constraints.

An additional issue is that increasingly SoCs are built out of blocks of IP assembled together. Typically the IP blocks are designed by a “front-end” design team, often overseas, and the physical design and assembly is done by a “back-end” team at the headquarters.

But this leads to another problem. The front-end designers have to come up with good constraints, plus avoid producing inherently unbalanced logic that will be difficult to clock. However they don’t think like back-end designers and don’t understand the physical CTS process well.

Meanwhile the back-end team doesn’t understand the clock structure well, and by that stage in the design process has little time for interaction. They will typically run with whatever the front-end teams gave them and do their best to close timing with what they have. But it is frustrating and may be impossible to close timing with a suboptimal clock tree.

ICScape has a tool, ClockExplorer, that addresses these problems. It provides front-end designers with feedback on the quality of the clock tree to find errors or suboptimal design. Structure and constraint checking can also evaluate clock quality, and help front-end and back-end designers to identify design problems that should be fixed early.

It then allows the front-end designers to communicate this information to the back-end designers and gives them similar feedback. It can also be used after CTS to do a more in-depth analysis taking the physical information into account. Of course at this point it can display a layout view, showing where the actual clock-paths run on the physical chip. For each problem, ClockExplorer can identify the problem, detail what issue it will cause and explain what needs to be changed to fix the problem. In this way it allows less experienced designers to be effective and avoid creating problems that will only show up later.

Note that ClockExplorer does not create the actual clock tree; that is still left to CTS. ClockExplorer is a tool that allows front-end and back-end designers together to create good clock constraints, which in turn will lead to better clocks, lower power, and a faster timing closure process. In short, better CTS QoR.

ClockExplorer allows designers to look at a schematic of the clock tree. Since all the datapath elements are suppressed, it can handle extremely large designs very fast. For front-end designers it produces a timing dependency report, reports suboptimal structures, missing constraints and so on. It can automatically identify false paths or unnecessary balancing, and so minimize the number of buffers that will need to be inserted. The clock tree can be displayed by level or by delay.

As an example of its use on a 28nm design with 600K instances it reduced the clock tree buffer count by 40%, the hold time total negative slack (TNS) by 80% and so on. See the table below.

In summary, ClockExplorer is a tool offering structure and constraint checking, constraint optimization, and clock tree debugging.

More details on ClockExplorer are available on the ICScape website here.


Motley Fooled by FinFETs!

by Daniel Nenni on 04-28-2015 at 10:00 pm

There was an article on Motley Fool recently detailing Intel’s 14nm FinFETs and comparing them to TSMC. Unfortunately the author has zero semiconductor education or experience even though he writes with authority on all things semiconductor. He also has no shame in using outdated papers from conferences he did not even attend to make his misguided point. The things people do for a penny per click… and yes I did speak to him privately about this but he stands by his article and left it to me to prove him wrong, which is why I write this now.

Intel Corporation to Detail 14-Nanometer System-on-Chip Technology at VLSI Symposium

According to SemiWiki experts, Motley Fool’s article misrepresents some of the intricacies associated with FinFETs and how drive currents are defined. On the face of it, Intel’s 14nm announcement looks impressive: 37-50% drive current improvements over 22nm; who could complain about that? Unfortunately a slightly deeper dive reveals some issues with this conclusion. Intel, at various meetings, including their analyst meeting back in November 2014, proudly announced that their fin pitch scaled from 60nm to 42nm, while their fin height increased from 34nm to 42nm. All good, assuming similar current per micron of fin perimeter (also called Weff, which for one fin is 2*height + top width; more on that later). With more fins per micron and taller fins, I should be able to get much better performance.

However, if you read their IEDM 2014 paper carefully, you will notice that all of Intel’s drive current numbers are quoted per micron of drawn width, i.e. for one micron of top-view silicon width. Now, taking the assumption above of the same current per micron of fin perimeter, how much performance improvement should I get per drawn micron? Using Intel’s own numbers, we have 60/42 = 1.43X more fins per micron, and fins are 42/34 = 1.24X taller, so all in all we should get 1.76X more drive current per drawn micron. In other words, at equal drive current per micron of fin perimeter, we should have seen 76% more current from these tighter, taller fins, but Intel is reporting only 37-50%. Clearly the drive current per effective micron is going down. Intel struggled with their 14nm yield; this suggests they may also have struggled with their device performance.
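The arithmetic is easy to check; this snippet simply reproduces the numbers quoted above from Intel’s own disclosures:

```python
# Fin dimensions (nm) as disclosed by Intel and quoted in this article.
pitch_22nm, pitch_14nm = 60, 42    # fin pitch: more fins fit per drawn micron
height_22nm, height_14nm = 34, 42  # fin height: taller fins give more Weff per fin

fins_gain = pitch_22nm / pitch_14nm      # ≈ 1.43x fins per drawn micron
height_gain = height_14nm / height_22nm  # ≈ 1.24x Weff per fin
expected_gain = fins_gain * height_gain  # ≈ 1.76x drive current per drawn micron

print(f"expected {expected_gain:.2f}x vs. reported 1.37-1.50x")
```

The gap between the expected 1.76x and the reported 1.37-1.50x is the article’s evidence that drive current per effective micron went down.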

The article goes on to compare Intel to TSMC 16nm FinFET; however, the author does not realize that the TSMC 2013 IEDM paper was quoting drive currents per Weff as described above, not per drawn micron. TSMC actually pointed this out in their 2014 IEDM presentation. TSMC also showed even better performance in their 2014 paper than in the earlier 2013 version, hence the new process name 16FF+. So in the end, how do they stack up? If you use Intel’s per-drawn-micron metric, TSMC 16FF+ has ~10% more drive current than Intel 14nm (all other things being equal, including leakage and voltage). If you use another metric like current per fin, or current per Weff, TSMC has an even stronger advantage.

That is why during the TSMC symposium last month Dr. BJ Woo emphatically stated TSMC had “the best” transistor in the 14-16nm technologies. It will be interesting to watch how this unfolds as 10nm process details are disclosed. In my 30 years in the semiconductor industry I don’t remember a more exciting time, absolutely.


PDK Generation Needs Paradigm Shift

by Pawan Fangaria on 04-28-2015 at 4:00 pm

For any semiconductor technology node to be adopted in actual semiconductor designs, the very first step is to have a Process Design Kit (PDK) developed for that particular node and qualified across the design tools used in the design flow. PDK development has never been easy; it’s a tedious, time-consuming, and repetitive process to develop and validate the various design rules in silicon during technology development. Even after a technology node is ready, PDK development may take several months before any real design can be developed and tested on that node. Usually, the process flow is unstable during the early phases of technology development, which adds a certain amount of inaccuracy to the results; this is later corrected over several iterations.

Now consider today’s advanced process nodes, for example a FinFET 14nm FEOL or a Self-Aligned Quad-Patterned (SAQP) 10nm BEOL. These are 3D structures that introduce several complex process rules and constraints into technology development. The design rules can vary with the context of the design constructs. In such a situation, if we keep using the same old method of design rule analysis based on spreadsheet calculations and various process assumptions, it no longer remains predictive and can further delay PDK creation. It’s time to use a predictive Virtual Fabrication Platform to develop and validate design rules for faster and more accurate PDK generation. Coventor’s SEMulator3D provides an excellent platform for this new way of PDK generation that can prove extremely productive for advanced complex technology nodes. It provides accurate models and simulations for predictive analysis of design rules for different design constructs under different contexts.

Consider a self-aligned via V1 placed at the crossing between metals M1 and M2 as shown in figure 1. As is evident from the SEMulator3D predictive structural model and its simulation graphs, the contact area between V1 and M1 is dictated by V1-M2 overlay errors, and not by M2-M1 overlay errors.

Now consider another design situation (shown in figure 2) where the V1 is between the adjacent corners of M1 and M2. In this case, the contact area between V1 and M1 is dictated by both the V1-M2 overlay errors and the M2-M1 overlay errors. Even under tighter overlay controls, the contact area is significantly reduced compared to the crossing case in figure 1. The reduction in contact area has to be limited in order to preserve the yield and reliability criteria; otherwise such a construct is disqualified. Improvements in design finishing, OPC, process or design rules can all be driven from such an analysis.
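A simplified 1-D model illustrates why the corner case is worse (illustrative Python; the widths, overlay values and the 1-D simplification are hypothetical, and SEMulator3D’s physical models are far more sophisticated):

```python
def contact_length(via_width, v1_m2_overlay, m2_m1_overlay, corner):
    """1-D overlap (nm) between a self-aligned via V1 and metal M1.
    Crossing case: M1 runs continuously under the via, so only the V1-M2
    overlay shifts the via off its target.
    Corner case: the via sits at the M1 line end, so the M2-M1 overlay
    also moves the M1 edge and both errors subtract from the overlap."""
    shift = abs(v1_m2_overlay)
    if corner:
        shift += abs(m2_m1_overlay)
    return max(0.0, via_width - shift)

print(contact_length(20, 3, 4, corner=False))  # 17 nm overlap at a crossing
print(contact_length(20, 3, 4, corner=True))   # 13 nm overlap at a line end
```

SEMulator3D performs this kind of analysis on full 3D structural models rather than a 1-D abstraction, but the qualitative conclusion, that corner-placed vias are exposed to both overlay errors, is the same.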

The SEMulator3D Virtual Fabrication Platform can develop such design-sensitive rule analysis in a physically predictive sense to help produce more robust high quality design rule checking.

The SEMulator3D’s predictive modeling along with Virtual Metrology and Expeditor can also provide accurate process variations that can be used to gain better yield and equip designers to accurately predict electrical behavior.

Parasitic extraction is another key task in PDK development. SEMulator3D has the capability to produce meshes for finite element analysis based on its process-predictive physical models. The parasitic values extracted from SEMulator3D models are significantly more accurate than those from typically used geometry-based models.

The above capabilities in SEMulator3D make it an ideal and powerful platform for generation of PDKs for complex technology nodes. It saves several months of time in producing PDKs for new technologies and at the same time provides robust and accurate design rules, thus providing huge benefits to early adopters as well as foundries and IDMs.

The success of SEMulator3D has led Coventor to expand in Taiwan, a region buzzing with semiconductor technology development and chip manufacturing. Last week Coventor issued a press release announcing the opening of its office near Hsinchu Science Park in Taiwan. The idea is to better serve Coventor’s large customer base in that region and also expand the use of the SEMulator3D Virtual Fabrication Platform among other foundries, IDMs and memory manufacturers.

Advanced 3D memory technologies for DRAM and NAND Flash, as well as 3D FEOL logic processes for FinFETs, represent significant development challenges for foundry and fabless teams. The SEMulator3D Virtual Fabrication Platform enables design and process architects to effectively collaborate at the physical process level and integrate new technologies into the manufacturing process. Thus SEMulator3D also provides an excellent platform for new technology development. It significantly reduces silicon learning cycles and capital expenditure in new technology development.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


Linley: Mobile Peaked in 2013?

by Paul McLellan on 04-28-2015 at 7:00 am

Last week was the Linley Mobile Conference. Mobile is a huge semiconductor market and, outside of Intel, is the main driver for next generation process technologies. A new generation of mobile phones comes along, fills the leading edge fabs for a year or two and then moves on to the next generation. Nothing comes close to requiring so many wafers and requiring such a steep ramp.

Linley Gwennap (whose name always makes me think of those puzzles for kids: rearrange these letters to make a famous Hollywood star…Gwyneth Paltrow…so close) gave the opening keynote as usual. The reality, though, is that although the mobile market is huge and a lot of time is burned up on whether Apple’s next part will be made by TSMC, Samsung or Global, the market has become rather boring. Smartphone growth is still significant but it is slowing. Broadcom followed TI and Freescale out of the market and, despite having a world-class team, could find no buyers. Spreadtrum merged with RDA Micro.

The high-end phones all seem to be going internal. Apple has made its own application processors (AP) for some time and Samsung has switched from Qualcomm to its own Exynos. Huawei and probably others seem to be going that route too. The merchant AP market has 4 vendors that have, along with internal, a combined 95% market share: Qualcomm, Mediatek, Samsung and Spreadtrum.

The tablet market is both more boring and more interesting. There are more suppliers but the market has basically stopped growing. Every year the leader in the tablet AP market seems to change. Over the last four years the leader was nVidia, then Allwinner, then Intel. Last year it was Intel only because they shipped a lot of “negative revenue” with each part. They have reduced that, and so in 2015 the leader is expected to be Mediatek. And, of course, the true market leader every year was Apple, with their own processor used only in the iPad.

The result of all of this is that the merchant AP market (excluding internal) peaked two years ago in 2013. A year or so ago only Qualcomm seemed to have a viable LTE baseband (BB) modem but now there are 11 vendors and it is no longer any sort of differentiator. Going forward, integrated AP+BB will be obligatory in a couple of years (perhaps outside of tablets). It will be interesting to see if Apple adds BB to their Ax line or continues to have a separate BB, historically from Qualcomm but rumors about Intel abound here.

The big unknown going forward is how much Moore’s Law is stalling. If transistors stay at a fixed cost, then going to a new process generation will result in faster, lower-power but more expensive chips (assuming more capability is added). Apple probably doesn’t care too much if their Ax part costs $25 or $30 but if it goes to $50 next year and $100 the year after that will require a lot of additional capability to justify the cost. Just a few more cores on the GPU is probably not enough.

See also Demler: Quadcore is Just For Marketing

The APs contain differing numbers of cores, from four big and four little at the high end, to two small at the low end. Having four big cores seems a waste since they cannot really be used simultaneously much, for thermal reasons; as I have written about before, four big cores (and eight total cores) is just for marketing. Linley reckons that the mainstream will settle down on four cores, with eight as a premium configuration. There is also a fast shift to 64-bit going on, with almost all APs by 2018 being ARM v8 (apart from a tiny sliver of Intel cores if they still play in the market).

New things coming to mobile, enabled by off-load processors:

  • always on functionality
  • voice recognition without needing to push a button
  • pedometer (fitbit in your phone)
  • indoor inertial navigation
  • vision processing (augmented reality)

Wearables were a 5 million unit market in 2014, with high-end devices around $79 and knock-offs at $15. Smart watches are obviously coming but it remains to be seen how popular they get. The Apple Watch today costs $349 but others are $50. At that lower price point, bundling a watch with a high-end phone becomes a no-brainer. Linley’s forecast is that by 2019 smart watch units will be 17% of smartphone units, growing to 400M units. Smart glasses will remain a tiny niche and fitness bands will be replaced by watches. Some medical wearables may exist but the market size is limited by the number of people who are sick in the right way.

My own feeling, based on owning a Pebble watch (which I pretty much never wear, since it only shows SMS messages and nothing from WhatsApp, WeChat, etc.) and a Fitbit (which I often wear until I need to recharge the battery and then forget for days), is that the current offerings are just not enough. But if they could do a good job of monitoring true health (blood pressure, ECG, etc.), that would be the killer app. After all, who wouldn’t pay a few hundred dollars to get a few hours’ advance warning of a serious health problem such as an impending heart attack?


SoC Debugging Just Got a Speed Boost

by Daniel Payne on 04-28-2015 at 4:00 am

Sure, design engineers can get more attention than verification engineers, but the greater number of verification engineers on SoC projects means that verification is a bigger schedule bottleneck than pure design work. A recent survey conducted at Cadence shows how verification effort divides into several distinct tasks:

The largest portion of verification time is spent in debugging, followed by actual test execution run time. Speeding up debug would directly benefit any SoC schedule. You can debug by printing out internal node values and staring at waveforms while rerunning verification tests to try to pinpoint each error. If you didn’t capture the right internal node, you have to find it, add it and rerun the tests, creating a slow, iterative debug process. There has to be a more elegant approach.

Related – ARM & Cadence IP Partnership for Faster SoC Design

Engineers at Cadence have come up with a better, patented methodology to quickly find bugs using Root Cause Analysis (RCA) technology. With the RCA approach you start by running a verification test; when your testbench detects an output mismatch, the underlying bug is found using Big Data concepts. You then fix the identified bug and re-run verification, reducing the number of brute-force iterations previously required.

Big Data analysis looks at the entire design space in a single verification run while performing three functions to identify the actual root cause of the mismatches:

  • SmartLog – capture all messages (SystemC, Verilog, etc.)
  • Reverse debugging – ability to go forward and backward in time
  • Multi-engine data – from functional simulation, formal tools, etc.

This totally new debugging platform is named Indago, and it has three apps that may be used either stand-alone or concurrently based on what you are looking for:

  • Indago Debug Analyzer App – multi language testbench debug (SystemVerilog, e, SystemC), reverse debug, UVM debug, macro debug
  • Indago Embedded SW Debug App – for embedded SW/HW integration debugging
  • Indago Protocol Debug App – works with Cadence Verification IP (ARM AMBA AXI and ACE, DDR4)

Related – Cadence’s New Implementation System Promises Better TAT and PPA

Using the GUI in the Debug Analyzer you can pinpoint what caused the mismatched output value to change, which helps guide you and automates the bug-tracking process. SmartLog lets you preview and filter results quickly.

SW engineers can debug to the source-code level and see how their code impacts output waveforms in the hardware.

With the channel viewer you can see VIP (Verification IP) results, step through each FSM (Finite State Machine) state and transitions, and use the source code debugger. Expect Cadence to grow the number of supported protocols over time.

Users of Indago would include SW engineers, HW design engineers and verification engineers.

Related – Is Cadence the Best EDA Company to Work for?

There are basically three generations of debugging available: Basic, Mainstream and Advanced. Indago gets you to the advanced debugging capabilities.

The learning curve for Indago is typically just an hour or two before you get your first results. Early customers say they have cut their debug times in half; imagine what that means on your SoC project. When I spoke with Kishore Karnane and Adam Sherer of Cadence last week, they said that early adopters are using Indago now, with general availability expected in June, around the time of the 52nd annual DAC show. The internal VIP group at Cadence is also a big user of Indago, helping it get products to market faster and with higher quality.


The 2015 DAC Designer and IP Track

by Anne Cirkel on 04-27-2015 at 8:00 pm

What an exciting year for DAC with record submissions in nearly every category. Most impressive is the increase in Designer and IP Track submissions, content that is helping to continue to evolve and improve the show. If you haven’t already registered, why not do so now?

A brief bit of background about the conference: DAC’s roots are in academia, and the ACM/IEEE refereed research content remains the backbone of the conference to this day. However, DAC is also a conference for day-to-day designers, and in recent years much work has been done to expand this part of the show. This effort is surely one reason the conference has remained so vibrant and had such staying power in an industry that embodies creative destruction unlike any other. Think of the technology conferences (and publications, for that matter) that have come and gone through the years. Meanwhile, this June thousands will descend on Moscone for DAC’s 52nd year!

As the conference general chair, I can report that DAC in its fifth decade is vibrant across the board, especially when it comes to Designer and IP Track submissions, which are up 27% compared to 2014. This is an amazing success, and we owe a shout-out of thanks to all the volunteers for getting the word out and motivating their industry peers to submit. We’ve received the highest number of submissions since we started the Designer Track in 2010, with most of the growth coming in IP and embedded. The focus on embedded dates back to the 48th DAC, when IBM’s Leon Stok, that year’s general chair, set a goal that 30% of conference content should be on embedded systems and software. That rule of thumb has applied ever since, and I’m happy to say that if embedded was once DAC’s best kept secret, that’s surely no longer the case. Evidence includes the fact that Open Systems Media will be running the collocated Embedded TechCon at this year’s DAC, an exciting complementary program I first announced on my DAC blog.

In DAC’s Designer and IP Tracks we have 16 different sessions in areas such as low-power IP, subsystem IP and IP management, security and analytics for IoT, planar to FinFET, and embedded systems design (models and optimization). We are bringing back the popular session “New Chips on the Block” with an exciting lineup of talks ranging from biotech to sensors to an application of the OpenPOWER initiative.

And I’m very excited about our Monday morning opening session designer keynote: Google [x] director Brian Otis will give an update on the Google smart contact lens project, which might revolutionize treatment of diabetes for the 1 in 19 people on Earth who suffer from the disease. (That’s 9:20 a.m. in the Gateway Ballroom. Mark your calendars now! We’ve made some changes in the DAC schedule so that nothing in the program competes with the keynotes, all of which are sure to be stellar.)
Check out the Designer and IP track information in the conference program in more detail or just take my word for it and register today. The Designer Special rate is just $95, a bargain for some truly amazing content.

And all registrants, including those signed up for the free “I Love DAC” registration, are invited to our Tuesday networking session at 4:30 p.m. Mingle with your peers, check out their posters and make some new friends. This Designer and IP Track social will flow seamlessly into the overall DAC networking session from 6:00 to 7:00 p.m.

See you in just seven weeks!


Four Reasons Why Atmel is Ready to Ride the IoT Wave

by Majeed Ahmad on 04-27-2015 at 1:00 pm

In 2014, a Goldman Sachs report took many people by surprise when it picked Atmel Corp. as the company best positioned to take advantage of the rising Internet of Things (IoT) tsunami. At the same time, the report omitted tech industry giants like Apple and Google from the list of companies that could make a significant impact on the rapidly expanding IoT business. So what makes Atmel so special in the IoT arena?

The San Jose, California–based chipmaker has been proactively building its ‘Smart’ brand of 32-bit ARM-based microcontrollers, which boasts an end-to-end design platform for connected devices in the IoT realm. The company, with two decades of experience in the MCU business, was among the first to license ARM’s low-power processors for IoT chips targeting the smart home, industrial automation, wearable electronics and more.


Goldman Sachs named Atmel a leader in the Internet of Things (IoT) market

A closer look at the IoT ingredients and Atmel’s product portfolio shows why Goldman Sachs called Atmel a leader in the IoT space. For a start, Atmel is among the handful of chipmakers that cover all the bases in the IoT hardware value chain: MCUs, sensors and wireless connectivity.

1. A Complete IoT Recipe

The IoT recipe comprises three key technology components: sensing, computing and communications. Atmel offers sensor products and is a market leader in MCU-centric sensor fusion solutions that encompass context awareness, embedded vision, biometric recognition and more.

For computation—handling tasks related to signal processing, bit manipulation, encryption, etc.—the chipmaker from Silicon Valley has been offering a diverse array of ARM-based microcontrollers for connected devices in the IoT space.


Atmel has reaffirmed its IoT commitment through a number of acquisitions

Finally, for wireless connectivity, Atmel has assembled a broad portfolio of low-power Wi-Fi, Bluetooth and ZigBee radio technologies. Atmel’s $140 million acquisition of Newport Media in 2014 was a bid to accelerate the development of low-power Wi-Fi and Bluetooth chips for IoT applications. Moreover, Atmel could use Newport’s expertise in Wi-Fi communications for TV tuners to make the TV an integral part of smart home solutions.

Furthermore, communication across the Internet depends on the TCP/IP stack, and a 32-bit processor is a natural fit for running it: IPv4 itself is built around 32-bit addresses, and a full TCP/IP stack is a heavy load for smaller parts. Atmel’s microcontrollers are based on 32-bit ARM cores and are well suited to a TCP/IP-centric Internet communications fabric.
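To make that concrete, here is a minimal sketch (not Atmel-specific; the sensor command and reply are hypothetical) of the kind of TCP request/response exchange an IoT node’s stack performs, shown over loopback with Python’s standard library:

```python
# Minimal sketch of an IoT-style TCP exchange: a "device" answers a sensor
# query from a "client". The command protocol here is invented for illustration.
import socket
import threading

def sensor_server(srv):
    conn, _ = srv.accept()
    with conn:
        cmd = conn.recv(64)            # e.g. b"READ temp"
        if cmd == b"READ temp":
            conn.sendall(b"23.5")      # hypothetical sensor reading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))             # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

t = threading.Thread(target=sensor_server, args=(srv,))
t.start()

cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"READ temp")
reading = cli.recv(64)
cli.close()
t.join()
srv.close()

print(reading.decode())                # -> 23.5
```

On a real MCU the same pattern runs over an embedded TCP/IP stack rather than an OS socket API, but the request/response structure is identical.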

2. Low Power Leadership

In February 2014, Atmel announced entry-level ARM Cortex-M0+-based microcontrollers for the IoT market. The SAM D series of low-power MCUs (comprising the D21, D10 and D11 versions) included Atmel’s signature high-end features such as a peripheral touch controller, a USB interface and the SERCOM module. The peripherals work seamlessly with the Cortex-M0+ CPU through the Event System, which lets developers configure event chains in software and use an event to trigger a peripheral without CPU involvement.

According to Andreas Eieland, Director of Product Marketing for Atmel’s MCU Business Unit, IoT design is largely about three things: battery life, cost and ease of use. The SAM D microcontrollers aim to bring ease of use and a strong price-to-performance ratio to IoT products like smartwatches, where energy efficiency is crucial. Atmel’s SAM D family was steadily building a case for the IoT market when the company’s SAM L21 microcontroller rocked the semiconductor industry in March 2015 by claiming leadership in low-power Cortex-M IoT design.

Atmel’s SAM L21 became the lowest-power ARM Cortex-M microcontroller when it topped the EEMBC benchmark measurements. It’s plausible that another MCU maker will take over the EEMBC top spot in the coming months. However, according to Atmel’s Eieland, what matters is the range of power-saving options an MCU offers product developers.


Eieland: “Low-power dynamics change every 2-3 years amid new features and changes in chip design”

“There are many avenues to go down on the low path, but they are getting complex,” Eieland added. He cited features like multiple clock domains, the event management system and SleepWalking, which provide additional levels of configurability for IoT product developers. Such a set of low-power technologies, evolving across successive MCU families, gives product developers a common platform and control over their efforts to lower power consumption.

3. Coping with Digital Insecurity

In the IoT environment, multiple device types communicate with each other over a multitude of wireless interfaces like Wi-Fi and Bluetooth Low Energy, and IoT product developers are largely on their own when it comes to securing the system. IoT security is a new domain with few standards, so product developers rely heavily on the security expertise of chip suppliers.


Atmel offers embedded security solutions for IoT designs

Atmel, with many years of experience in crypto hardware and Trusted Platform Modules, is among the first to offer specialized security hardware for the IoT market. It recently shipped a crypto authentication device that integrates the Elliptic Curve Diffie-Hellman (ECDH) key agreement protocol. Atmel’s ATECC508A chip provides confidentiality, data integrity and authentication in systems whose MCUs or MPUs run encryption/decryption algorithms like AES in software.
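ECDH is the elliptic-curve form of Diffie-Hellman key agreement: each party combines its own private key with the peer’s public key, and both arrive at the same shared secret without ever sending it over the wire. A toy finite-field version (illustrative only; real deployments, including the ATECC508A, use vetted elliptic curves, and this prime is far too small for actual security) shows the idea in a few lines of Python:

```python
# Toy finite-field Diffie-Hellman, illustrating the key-agreement idea behind
# ECDH. NOT for production: real systems use standard elliptic curves or
# 2048-bit+ MODP groups, and constant-time hardware like the ATECC508A.
import secrets

p = 2**64 - 59   # a prime modulus (far too small for real security)
g = 5            # public generator

# Each side picks a secret exponent and publishes g^secret mod p.
alice_priv = secrets.randbelow(p - 2) + 1
bob_priv   = secrets.randbelow(p - 2) + 1
alice_pub  = pow(g, alice_priv, p)
bob_pub    = pow(g, bob_priv, p)

# Each side combines its own secret with the peer's public value.
alice_shared = pow(bob_pub, alice_priv, p)   # (g^b)^a mod p
bob_shared   = pow(alice_pub, bob_priv, p)   # (g^a)^b mod p

assert alice_shared == bob_shared            # same secret on both sides
```

The hardware version performs the same exchange on an elliptic curve (P-256 in the ATECC508A’s case), keeping the private key inside the chip so it is never exposed to the host MCU’s memory.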

4. Power of the Platform

The popularity of 8-bit AVR microcontrollers is a testament to the power of the platform: once you learn to work on one MCU, you can work on any microcontroller in the AVR family. The same goes for Atmel’s Smart family of microcontrollers aimed at the IoT market. Just as ARM maintains consistency across its processors, Atmel does the same across its peripherals.


Low-power SAM L21 builds on features of SAM D MCUs

A design engineer who has learned the instruction set of the Cortex-M4 can conveniently work on Cortex-M3 and Cortex-M0+ processors. Likewise, Atmel’s set of peripherals for low-power IoT applications complements the ARM core’s benefits. Atmel’s standard features like sleep modes, SleepWalking and the event system are optimized for ultra-low-power use, and they can extend IoT battery lifetime from years to decades.

Atmel, a semiconductor outfit once focused on memory and standard products, began its transformation into an MCU company about eight years ago. That’s also when it started to build a broad portfolio of wireless connectivity solutions. In retrospect, those were all the right moves. Fast forward to 2015, and Atmel seems ready to ride the market wave created by the IoT technology juggernaut.

Also read:

Atmel’s L21 MCU for IoT Tops Low Power Benchmark

Atmel’s New Car MCU Tips Imminent SoC Journey

Atmel’s Sensor Hub Ready to Wear

Majeed Ahmad is the author of the books Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics and The Next Web of 50 Billion Devices: Mobile Internet’s Past, Present and Future.