
TSMC: Semiconductors in the next ten years!
by Daniel Nenni on 10-23-2017 at 6:00 am

The TSMC 30th Anniversary Forum just ended, so I will share a few notes before the rest of the media chimes in. The forum was live-streamed on tsmc.com; hopefully it will be available for replay. The ballroom at the Grand Hyatt in Taipei was filled with cameras, semiconductor executives, and security personnel.

Here is the replay

The event started with a video about TSMC over the last 30 years followed by comments from Chairman Morris Chang. The keynotes were by Nvidia CEO Jensen Huang, Qualcomm CEO Steve Mollenkopf, ADI CEO Vincent Roche, ARM CEO Simon Segars, Broadcom CEO Hock Tan, ASML CEO Peter Wennink, and Apple COO Jeff Williams. Next was a panel discussion led by Chairman Morris Chang.

First, let's start with the jokes. Jensen Huang was supposed to go first, but his presentation was not ready and Morris roasted him a bit over it. Jensen replied that it took him longer because he actually prepared for the event, a joke with a bit of truth to it, since the other presentations were standard stock. Jensen gave the best presentation, all about AI, which is in fact the future of semiconductors in the next ten years.

The best joke however was in response to a question about legal matters, if AI goes wrong who is held accountable? Morris pointed out that Steve Mollenkopf probably has the most legal experience of the group referring to Qualcomm’s massive legal challenges of late. Steve recused himself from the question of course. Even at 86 years old Morris still has a quick wit and provided most of the humor for the evening.

As I have mentioned before, AI will touch almost every chip we make in the coming years which will bring an insatiable compute demand that general purpose CPUs will never satisfy. This year Apple put a neural engine on the A11 SoC that’s capable of up to 600 billion operations per second. Nvidia GPUs do trillions of operations per second so we still have a ways to go for edge devices.

A couple more interesting notes: the Apple-TSMC relationship started in 2010 but didn't produce silicon until the iPhone 6 in 2014. Morris described the Apple-TSMC relationship as intense, but Jeff Williams (Apple) said that you cannot double-plan for the volumes of technology that Apple requires, so partnerships are key. My take is that the TSMC-Apple relationship is very strong and will continue for the foreseeable future. Who else is going to be able to do business the Apple (non-competing) way and still make big margins?

Jeff also predicts that medical will be the most disruptive AI application, to which Morris agreed, suggesting mediocre doctors will be replaced by technology. This is something I feel VERY strongly about. Medical care is barbaric by technology standards and we as a population are suffering as a result. Apple is focused on proactive medical care versus the reactive care you see in most hospitals. Predicting strokes or heart events is possible today, for example. AI-enabled medical imaging systems are another example for tomorrow.

Security and privacy were discussed with Apple insisting that your data is more secure on your device than it is in the cloud. Maybe that’s why the new phones have a huge amount of memory (64-256 GB) while free iCloud storage is still only 5 GB. We use a private 1 TB cloud for just that reason by the way, our data stays in our possession. I certainly agree about security but privacy seems to be lost on millennials and they are the target market for most devices.

Bottom line: Congratulations to the TSMC support staff, this event was well done and congratulations to TSMC for an amazing 30 years. The room was filled with C level executives and a smattering of media folks like myself. It really was an honor to be there, being part of semiconductor history, absolutely.


Webinar: Optimizing QoR for FPGA Design
by Bernard Murphy on 10-22-2017 at 12:00 pm

You might wonder why, in FPGA design, you would go beyond simply using the design tools provided by the FPGA vendor (e.g. Xilinx, Intel/Altera and Microsemi). After all, they know their hardware platform better than anyone else, and they’re pretty good at design software too. But there’s one thing none of these providers want to support – a common front-end to all these platforms. If you want flexibility in device providers, making a vendor change will force you back to an implementation restart. Which is one reason why tools like Synplify Premier from Synopsys have always had and always will have a market.


REGISTER HERE for this webinar on October 25th at 10am PDT

The other reason is that a company whose primary focus is design software, and which started and still leads the design synthesis market, is likely to have an edge in synthesis QoR, features and usability over the device vendors. Of course, the physical design part of implementation still comes from the vendors, but Synplify tightly couples with these tools, not just in the sense of "you can launch Vivado from Synplify" but also in the sense that you can iteratively refine the implementation, as you'll see soon.

As an example of what you get in synthesis from a tool in the Synopsys stable, Synplify Premier will handle optimization for state-machines (including recoding to other styles such as Gray encoding), resource-sharing, pipelining and retiming. And of course, they support DesignWare IP.

This webinar provides a fairly detailed overview of what is possible using Synplify Premier as your FPGA design front-end. Much of this will be familiar to ASIC designers or to FPGA designers already familiar with tools from device vendors. One topic covers optimal RTL coding styles: for FSMs (for optimization to the target device, mapping away unreachable states, adding safe recovery from invalid states or changing the encoding), for math and DSP functions for efficient packing (filters, counters, adders, multipliers, etc.), and for optimized RAM inference based on the availability of resources (block RAMs, etc.).
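To make the FSM recoding idea concrete, here is a small Python sketch of binary-to-Gray conversion, the kind of state recoding the synthesis tool performs on the state register in hardware (this is an illustration of the encoding itself, not of Synplify's internals):

```python
def binary_to_gray(n: int) -> int:
    """Convert a binary state index to its Gray-coded equivalent.
    Adjacent states differ in exactly one bit, reducing switching
    activity and glitch risk on state transitions."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the Gray coding by cascading XORs down the word."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# A 4-state FSM recoded: successive states differ in one bit only
states = [binary_to_gray(i) for i in range(4)]
print([format(s, "02b") for s in states])  # ['00', '01', '11', '10']
```

Note how each successive state code flips a single bit, which is why Gray encoding is attractive for counters and FSMs whose states are traversed sequentially.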

Static timing analysis will look very familiar, except that the Synopsys constraint format is called FDC (FPGA design constraints) rather than SDC. Synplify Premier provides a nice feature to automatically create a quick set of constraints early in the design to help you get through a first flush of the flow. Naturally you'll want to work on developing real constraints (real clocks, clock groups, I/O constraints, timing exceptions, etc.) before you move to physical design.
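For reference, "real constraints" of the kind listed above are written as standard Synopsys SDC commands (FDC is SDC-based, though exact usage may differ); the port and cell names below are hypothetical:

```tcl
# Two clocks: a 100 MHz system clock and a ~60 MHz USB clock
create_clock -name sys_clk -period 10.0 [get_ports clk]
create_clock -name usb_clk -period 16.6 [get_ports usb_clk]

# Treat the two domains as asynchronous (no timing analyzed between them)
set_clock_groups -asynchronous -group {sys_clk} -group {usb_clk}

# I/O constraints relative to the system clock
set_input_delay  -clock sys_clk 2.5 [all_inputs]
set_output_delay -clock sys_clk 2.5 [all_outputs]

# A timing exception: configuration registers may take two cycles
set_multicycle_path 2 -from [get_cells cfg_reg*]
```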

I mentioned earlier that interoperability between Synplify Premier and the vendor physical design tools isn’t just about compatibility in libraries, tech files and data passed from the synthesis tool to the vendor tool. A great example is in congestion and QoR management. These problems happen for well-known reasons – high resource utilization, over-aggressive constraints, logic packing problems and others.

One particularly important root cause can occur on Xilinx devices which are multi-die (each die is known as a super logic region, or SLR) on an interposer, connected by super long line (SLL) interconnects. You already know where this is going; there are only so many SLLs, which means they can be over-used (I assume there might also be reduced timing margin on SLLs). So lots of congestion and timing-closure problems can happen; no news to implementation experts. What is interesting here, though, is that Synplify Premier can take this information from Xilinx or Intel project files and use it to drive re-synthesis to reduce congestion and timing-closure problems. It can also drive many runs in parallel on a server farm so you can quickly explore different implementation strategies. That's real and very useful interoperability.

If you’re not familiar with Synplify Premier, this should be a must-see. Remember to
REGISTER HERE for this webinar on October 25th at 10am PDT


The Interface IP Market has Grown to $530 Million!
by Eric Esteve on 10-22-2017 at 7:00 am

According to IPnest, the Interface IP market, including the USB, PCI Express, (LP)DDRn, HDMI, MIPI and Ethernet IP segments, reached $532 million in 2016, growing from $472 million in 2015. This is an impressive 13% year-over-year growth rate, and a 12% CAGR since 2012!



Who integrates functions to interface a chip with other ICs or connectors? The answer is that for any application, you need to interact with another chip (DRAM, SSD, application processor or ASSP), or with the outside world through a connector (from HDMI to PCIe or USB). If you consider chip design, you quickly realize that two kinds of functions are ubiquitous: processing and interfacing. This observation is confirmed by their respective weight in the IP license market (before royalties). In 2016, the license-only market was worth $1,930 million. The license revenues generated by processor IP (CPU, GPU, DSP), at $680 million, represent 35% of the total, while interface IP licenses, at $532 million, represent 27.5%. Together, processor and interface IP generate 62.5% of the total (license) IP market.



If ARM is known to be the king of the processor IP market, and consequently, thanks to royalties, of the total IP market with 48.4% market share, then Synopsys is the duke. Synopsys is now the clear #2 in the design IP market, and the undisputed leader of the interface IP market, with 51% market share and over $270 million in revenues. In fact, if we look at this market by segment, Synopsys is also the leader in each of the five IP segments, USB, PCI Express, MIPI, HDMI and DDRn, with a market share between 50% and 75%.



In the survey, IPnest makes a very comprehensive analysis by protocol, including a ranking of IP vendors, a competitive analysis, and a review of all the IP vendors active in the segment. It's always possible to find a niche where a vendor, not necessarily the leader, will enjoy good business. IPnest also analyzes market trends to predict the future adoption of a specific protocol in new applications. For example, the PCI Express protocol was initially developed to support the PC, computing and networking segments. We have since seen it pervade mobile, with the Mobile Express definition in 2012, and storage (NVM Express), and its adoption in automotive is now a fact.

Such comprehensive analysis helps IPnest build a 5-year forecast, taking into account the growth in the number of design starts including a PCIe function, and also what we call the "externalization factor". The externalization factor is the increase in the proportion of PCIe IP being externalized, and this factor may change every year, even though the proportion of commercial IP only grows, year after year.

Competitive analysis: IPnest proposes, by protocol, a competitive analysis and a ranking, as in this example for PCI Express:


Being part of the DAC IP committee and running IPnest, Eric Esteve was also the chairman of the panel "The IP Paradox" (the semiconductor industry is consolidating and the number of potential customers is shrinking, but the IP market is still growing, in particular the interface IP market; how do we explain this growth?). If we can answer this question, we will be able to more accurately forecast the IP market's growth.

John Koeter, VP of Marketing for Synopsys, has proposed an explanation: "We study the market and 60-70% of the IP is outsourced. When I look at IP, I think it is potentially the same size as the EDA market. EDA is fully outsourced, but IP is not there yet which means there is growth available." IPnest agrees 100% with this! If we try to model the IP market's growth, we see that there are 10 to 15 years of growth in reserve before the IP market is fully outsourced (assuming a +/-3% value for the externalization factor).



A graphic view of the market evolution, by protocol, for 2012 to 2021:


It's important to note that IPnest is now the only company offering the "Design IP Report" (2015 and 2016 rankings of all the IP vendors by category, from CPU to GPU, DSP, mixed-signal, memory compilers, libraries, interface, etc.), as Gartner stopped producing it in 2016. IPnest is also the only analyst to offer the "Interface IP Survey & Forecast". In fact, this is the 9th version of this report, and it was launched last week.

If you are interested in the Table of Contents for the 2017 version of the report (2012-2016 Survey, Forecast 2017-2021), just send me a message on SemiWiki, or on LinkedIn: Eric Esteve

We can also meet during ARM TechCon in Santa Clara (10/24 to 10/26); I will stay until 10/27.

Eric Esteve


IEDM 2017 Preview
by Scotten Jones on 10-20-2017 at 7:00 am

The 63rd annual IEDM (International Electron Devices Meeting) will be held December 2nd through 6th in San Francisco. In my opinion, IEDM is one of, if not the, premier conferences on leading-edge semiconductor technology. I will be attending the conference again this year and providing coverage for SemiWiki. As a member of the press I received some preview materials today and wanted to share some of them with you.

Leading Edge Logic
As anyone who has read my articles on SemiWiki knows I follow the latest advances in logic process technology very closely. In the Platform Technology Session there will be papers from Intel on their 10nm technology and GLOBALFOUNDRIES on their 7nm technology and I am really looking forward to these papers:

  • Intel: Intel researchers will present a 10nm logic technology platform with excellent transistor and interconnect performance and aggressive design-rule scaling. They demonstrated its versatility by building a 204Mb SRAM having three different types of memory cells: a high-density 0.0312µm² cell, a low-voltage 0.0367µm² cell, and a high-performance 0.0441µm² cell. The platform features 3rd-generation FinFETs fabricated with self-aligned quadruple patterning (SAQP) for critical layers, leading to a 7nm fin width at a 34nm pitch, and a 46nm fin height; a 5th-generation high-k metal gate; and 7th-generation strained silicon. There are 12 metal layers of interconnect, with cobalt wires in the lowest two layers that yield a 5-10x improvement in electromigration and a 2x reduction in via resistance. NMOS and PMOS current is 71% and 35% greater, respectively, compared to 14nm FinFET transistors. Metal stacks with four or six workfunctions enable operation at different threshold voltages, and novel self-aligned gate contacts over active gates are employed. (Paper 29.1, “A 10nm High Performance and Low-Power CMOS Technology Featuring 3rd-Generation FinFET Transistors, Self-Aligned Quad Patterning, Contact Over Active Gate and Cobalt Local Interconnects,” C. Auth et al, Intel)
  • GLOBALFOUNDRIES (GF): GF researchers will present a fully integrated 7nm CMOS platform that provides significant density scaling and performance improvements over 14nm. It features a 3rd-generation FinFET architecture with SAQP used for fin formation, and self-aligned double patterning for metallization. The 7nm platform features an improvement of 2.8x in routed logic density, along with impressive performance/power responses versus 14nm: a >40% performance increase at a fixed power, or alternatively a power reduction of >55% at a fixed frequency. The researchers demonstrated the platform by using it to build an incredibly small 0.0269µm² SRAM cell. Multiple Cu/low-k BEOL stacks are possible for a range of system-on-chip (SoC) applications, and a unique multi-workfunction process makes possible a range of threshold voltages for diverse applications. A complete set of foundation and complex IP (intellectual property) is available in this advanced CMOS platform for both high-performance computing and mobile applications. (Paper 29.5, “A 7nm CMOS Technology Platform for Mobile and High-Performance Compute Applications,” S. Narasimha et al, Globalfoundries)
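As a quick sanity check on what those bitcell figures imply, the raw array area of Intel's 204Mb test SRAM can be estimated per cell type. This back-of-the-envelope sketch assumes all 204Mb use a single cell type and ignores sense amps, decoders, redundancy and other periphery (the real macro mixes all three cells):

```python
# Raw bitcell array area, ignoring all periphery; "if all 204Mb
# were built from one cell type" is an illustrative assumption.
MEGABIT = 1_000_000  # decimal Mb, as typically quoted for memory macros

cells_um2 = {
    "high-density":     0.0312,
    "low-voltage":      0.0367,
    "high-performance": 0.0441,
}

for name, area_um2 in cells_um2.items():
    total_mm2 = 204 * MEGABIT * area_um2 / 1e6  # 1 mm^2 = 1e6 um^2
    density_mb_per_mm2 = 1.0 / area_um2         # raw Mb per mm^2
    print(f"{name}: {total_mm2:.1f} mm^2 raw array, "
          f"{density_mb_per_mm2:.1f} Mb/mm^2")
```

The high-density cell works out to roughly 32 Mb/mm² of raw bitcell area, which gives a feel for why foundries lead process announcements with SRAM cell size.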

Silicon Photonics
Silicon Photonics is an area of great interest in the industry today and in my cost modeling business I am getting a lot of interest in Silicon Photonics costs. Session 34 will focus on Silicon Photonics.

Silicon Photonics: Current Status and Perspectives (Session #34) – Silicon photonics integrated circuits consist of devices such as optical transceivers, modulators, phase shifters and couplers, operating at >50 GHz for use in next-generation data centers. This session describes the latest in photonics IC advances in state-of-the-art 300mm fabrication technology; integrated nano-photonic crystals with fJ/bit optical links; and advanced packaging concepts for the specialized form factors this technology requires.

  • “Developments in 300mm Silicon Photonics Using Traditional CMOS Fabrication Methods and Materials,” by Charles Baudot et al, STMicroelectronics
  • “Reliable 50Gb/s Silicon Photonics Platform for Next-Generation Data Center Optical Interconnects,” by Philippe Absil et al, Imec
  • “Advanced Silicon Photonics Technology Platform Leveraging the Semiconductor Supply Chain,” by Peter De Dobbelaere, Luxtera
  • “Femtojoule-per-Bit Integrated Nanophotonics and Challenge for Optical Computation,” by Masaya Notomi et al, NTT Corporation
  • “Advanced Devices and Packaging of Si-Photonics-Based Optical Transceiver for Optical Interconnection,” by K. Kurata et al, Photonics Electronics Technology Research Association

Nanowires
With FinFETs coming to the end of their scaling potential, nanowires are garnering a lot of interest as the next-generation technology. In session 37 there will be a couple of papers on nanowires, including:

First Circuit Built With Stacked Si Nanowire Transistors: As scaling continues, gate-all-around MOSFETs are seen as a promising alternative to FinFETs. They are nanoscale devices in which the gate is completely wrapped around a nanowire, which serves as the transistor channel. Nanosheets, meanwhile, are sheets of arrays of GAA nanowires. A talk by Imec and Applied Materials will describe great progress in several key areas to make vertically stacked GAA nanowire and/or nanosheet MOSFETs practical. The team built the first functional ring oscillator test circuits ever demonstrated using stacked Si nanowire FETs, with devices that featured in-situ doped source/drain structures and dual-workfunction metal gates. An SiN STI liner was used to suppress oxidation-induced fin deformation and improve shape control; a high-selectivity etch was used for nanowire/nanosheet release and inner spacer cavity formation with no silicon reflow; and a new metallization process for n-type devices led to greater tunability of threshold voltage. (Paper 37.4, “Vertically Stacked Gate-All-Around Si Nanowire Transistors: Key Process Optimizations and Ring Oscillator Demonstration,” H. Mertens et al, Imec/Applied Materials)
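For context on why ring oscillators are the standard first test circuit: an N-stage ring oscillator's frequency follows directly from the per-stage delay, f = 1/(2·N·t_pd), so measuring f characterizes the new transistors. A small sketch with illustrative numbers (not figures from the IEDM paper):

```python
def ring_osc_freq_hz(num_stages: int, stage_delay_s: float) -> float:
    """An N-stage ring oscillator completes one full period after a
    transition propagates around the loop twice (once rising, once
    falling), hence f = 1 / (2 * N * t_pd)."""
    return 1.0 / (2 * num_stages * stage_delay_s)

# Illustrative only: 101 stages at 10 ps per inverter stage
f = ring_osc_freq_hz(101, 10e-12)
print(f"{f / 1e6:.0f} MHz")  # ~495 MHz
```

Inverting the same formula is how per-stage delay (and hence device speed) is extracted from a measured oscillation frequency.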

Conclusion

These papers are just a sampling of what will be presented that is of interest to me. I highly recommend attending IEDM for anyone interested in staying current on the state of the art.

https://ieee-iedm.org/


How standard-cell based eFPGA IP can offer maximum safety, flexibility and TTM
by Eric Esteve on 10-19-2017 at 12:00 pm

Writing a white paper is never tedious, and when the product or the technology is emerging, it can become fascinating. That is the case for this white paper I have written for Menta, “How Standard Cell Based eFPGA IP are Offering Maximum Flexibility to New System-on-Chip Generation”. eFPGA technology is no longer really emerging, but it's fascinating to describe such a product: if you want to clearly explain eFPGA technology and highlight the differentiators linked with a specific approach, you must be subtle and crystal clear!


Let's assume that you need to provide flexibility to a system. Before the emergence of eFPGA, the only ways were to implement the design in an FPGA, or to add a programmable companion device (the FPGA) next to an ASIC (the SoC). Menta has designed a family of FPGA blocks (the eFPGA) which can be integrated like any other hard IP into an ASIC. It's important to realize that designing an eFPGA IP product is not just carving out an FPGA block and delivering it as-is to an ASIC customer.

The eFPGA is a new IP family that designers will integrate into a SoC, and in this case, every IP instance may be unique. Menta offers the SoC architect the ability to define a specific eFPGA in which the logic and memory size and the MAC and DSP count are completely customizable, along with the possibility to include customer-defined blocks inside the eFPGA.

Menta has recently completed the 4th generation of its eFPGA IP (the company was started 10 years ago), and the vendor offers some very specific features that make its solution more attractive than those offered by the competition. Why is Menta eFPGA IP more attractive? We will see that the solution is more robust, the architecture provides maximum flexibility, and porting to a different technology node is safer and faster, allowing faster time-to-market. This solution also allows smoother integration into the EDA flow, including easier testability.

While most FPGAs (and most eFPGAs) are programmed via internal SRAM, Menta has decided to rely on D flip-flops (DFFs) for the programming. This approach makes the eFPGA safer, for two reasons. First, SRAM is known to be prone to single-event upsets (SEUs), while DFFs show better SEU immunity. The reason is very simple: the most significant factor is the physical size of the transistor geometries (smaller means less SEU energy required to trigger them), and the DFF geometry is larger than the equivalent storage cell in SRAM. That's why the Menta eFPGA architecture is well suited for automotive applications, for example.

The second safety argument is that programming SRAM is designed with a full-custom approach, requiring new characterization every time you change technology node, whereas Menta uses DFFs from a standard-cell library, already pre-characterized by the foundry or the library vendor.

In the white paper, you will learn why the Menta eFPGA architecture provides maximum flexibility: the designer can include logic, memory, and internal I/O banks, infer pre-defined (by Menta) DSP primitives, or include custom (designer-made) DSP blocks.

Really, the key differentiator is linked to the decision to base the eFPGA architecture only on standard blocks. The logic is based on standard cells, as are the DSP primitives and internal I/O banks. Once Menta has validated eFPGA IP on a certain technology node, any customer-defined eFPGA will be correct by construction. When a "mega cell" is made only of standard cells characterized by the foundry or the library vendor, the two direct consequences are safety and ease of use.

Safety, because there is no risk of failure when using a pre-characterized library, and ease of use, because the "mega cell" will integrate smoothly into the EDA flow. All required models and deliverables are already provided and guaranteed accurate by standard-cell library providers. There is a subtler consequence, which may have a significant impact on safety and time-to-market. If the SoC customer, for any reason, has to target a different technology node, the porting is accelerated because there are no full-custom blocks: no complete characterization is needed, as this has already been done by the library provider. The absence of full-custom blocks also greatly reduces the risk of failure during the porting.



Menta has developed a patented technology (System and Method for Testing and Configuration of an FPGA) to offer the designer a standard DFT approach. The eFPGA testability is based on multiplexed scan, using a boundary-scan isolation wrapper. Once again, the selected approach allows a standard design flow to be followed.

By reading this white paper, you will also learn about the specific design flow used to define the eFPGA itself. No surprise, this flow interfaces via industry standards (Verilog, SDF annotation, GDSII, etc.) with the SoC integration flow from the EDA vendor.

As far as I am concerned, I really think that the semiconductor industry will adopt eFPGA whenever adding flexibility to a SoC is needed. The multiple benefits in terms of solution cost and power consumption should be the drivers, and Menta is well positioned to get a good share of this new IP market, thanks to the key differentiators offered by its architecture.

You can find the white paper here: http://www.menta-efpga.com

From Eric Esteve from IPnest


Accelerating Accelerators
by Bernard Murphy on 10-19-2017 at 7:00 am

Accelerating compute-intensive software functions by moving them into hardware has a long history, stretching back (as far as I remember) to floating-point co-processors. Modern SoCs are stuffed with these accelerators, from signal processors to graphics processors, codecs and many more functions. All of these accelerators work extremely well for functions with broad application, where any need for ongoing configurability can be handled through switches or firmware/software upgrades in aspects which don't significantly compromise performance.


But that constraint doesn’t always fit well with needs in the very dynamic markets which are common today, where competitive differentiation continually changes targets for solution-providers. That’s why FPGAs have become hot in big datacenter applications. Both Amazon Web Services (AWS) and Microsoft Azure have announced FPGA-based capabilities within their datacenters, for differentiated high-speed networking and to provide customizable high-performance options to cloud customers. The value proposition is simple – as demands change, the FPGA can be adapted more quickly than you could build a new ASIC, and often more cheaply given relatively low volumes in these applications.

Naturally there is a middle ground between ASIC and FPGA options. FPGA SoCs might be an answer in some cases, but when you’re stretching for a differentiated edge or wanting to offer an SoC solution to those who are, it’s not hard to imagine cases where an application-specific ASIC shell around an embedded FPGA core might be just right. You get all the flexibility of the FPGA core, combined with high performance plus low power and area of the fit-to-purpose ASIC functionality around the core. Target applications include data intensive AI / machine learning, 5G wireless, automotive ADAS and datacenter and networking applications.


As in any good FPGA, you expect support for logic and ALUs, DSP functions, also block RAMs (BRAM) and smaller RAM blocks (LRAMs in the picture above). When you want to customize the embedded FPGA (eFPGA) in your SoC, you go through the usual design cycle to map a logic design onto the primitives in the eFPGA. If you are using the Achronix Speedcore technology, you will use their ACE design tools.

Now take this a step further. When you write a piece of software, you can profile it to find areas where some additional focus could greatly speed up performance. The same concept can apply in your eFPGA design. By profiling benchmark test cases (Achronix works collaboratively with customers to do this), you can identify performance bottlenecks. Based on this analysis, Achronix can then build custom blocks for certain functions, which can be tiled into the eFPGA. Now you have the advantage of the high-performance shell along with configurability in the eFPGA, yet with significantly better PPA than you would get in a conventional eFPGA.
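The profiling step described above has a direct analogy in everyday software tuning. Here is a minimal sketch using Python's built-in cProfile — the matmul workload and function names are invented for illustration; the point is simply that profiling surfaces the hot function that would become a custom block:

```python
import cProfile
import pstats
import io

def matmul(a, b):
    """Naive matrix multiply -- the kind of hot kernel that profiling surfaces."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def pipeline():
    """A toy workload: a little setup plus a compute-heavy kernel."""
    a = [[i + j for j in range(40)] for i in range(40)]
    b = [[i * j % 7 for j in range(40)] for i in range(40)]
    for _ in range(20):
        matmul(a, b)

profiler = cProfile.Profile()
profiler.enable()
pipeline()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # matmul dominates -- a candidate for acceleration
```

The same reasoning carried into an eFPGA flow points at which functions are worth hardening into custom blocks.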


Achronix offer several application examples where the benefit of their Speedcore Custom Blocks is quite obvious. The first is for a YOLO (you only look once) function supporting a convolutional neural net (CNN) in real-time object detection. By converting a matrix-multiply operation to a custom block they have been able to reduce the size of the eFPGA by 35%.


In another example for networking, they have been able to build custom functions which can examine network traffic at line speed (400Gb/s line rate), for example to do header inspection. In this example, the purple blocks are the custom packet segment extraction/insertion blocks.


Another especially interesting example is the use of this capability in building TCAMs (ternary content-addressable memories). These functions are widely used in networking but are typically considered very expensive to implement in standalone FPGAs. However, implemented as Custom Blocks, they become quite feasible for application-specific uses in an eFPGA.
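To make the TCAM idea concrete, here is a small behavioral model in Python. The entries, masks and widths are invented; a real TCAM compares every entry in parallel in a single cycle, whereas this software model loops:

```python
# Behavioral model of a TCAM: each entry is (value, mask); a key matches an
# entry when all bits under the mask agree. Hardware compares every entry in
# parallel and returns the first (highest-priority) hit.
TCAM_ENTRIES = [
    (0b1010_0000, 0b1111_0000),  # matches any key of the form 0b1010xxxx
    (0b1010_1100, 0b1111_1111),  # exact match on 0b10101100
    (0b0000_0000, 0b0000_0000),  # wildcard: matches everything
]

def tcam_lookup(key):
    """Return the index of the first matching entry, or None."""
    for index, (value, mask) in enumerate(TCAM_ENTRIES):
        if (key & mask) == (value & mask):
            return index
    return None

print(tcam_lookup(0b1010_0111))  # 0: hits the 0b1010xxxx prefix entry
print(tcam_lookup(0b0101_0101))  # 2: falls through to the wildcard
```

The per-entry compare-under-mask is trivial logic; it is the all-entries-at-once parallelism that makes TCAMs area-hungry, and that a fit-to-purpose Custom Block can provide efficiently.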


One final example – string search. This has many applications, not least in genome matching, another hot area. (If you don’t like that example, think of how many programs contain string-equality operations, how that operation dominates many profiles, and how it is therefore likely to be a bottleneck in real-time matching on streams or fast matching on giant datasets.) FPGAs are already used to accelerate these operations but are still not fast enough, which makes this a great candidate for Custom Block acceleration. Achronix show an example where they can reduce the time to do a match from 72 cycles to 1 cycle and massively reduce area.
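The many-cycles-to-one-cycle win comes from comparing the whole pattern window at once instead of character by character. A rough software sketch of the two views — assuming a character-per-cycle sequential baseline, with invented data:

```python
def match_sequential(text, pattern, offset):
    """Character-at-a-time compare: roughly one cycle per character in hardware."""
    for i, ch in enumerate(pattern):
        if text[offset + i] != ch:
            return False
    return True

def match_wide(text, pattern, offset):
    """Wide compare: a custom block checks the whole window in one shot."""
    return text[offset:offset + len(pattern)] == pattern

text = "GATTACAGATTACA"
pattern = "ACAG"
hits = [i for i in range(len(text) - len(pattern) + 1)
        if match_wide(text, pattern, i)]
print(hits)  # [4]
```

In hardware the wide compare is just a bank of parallel comparators plus an AND tree, which is why a custom block can collapse a multi-cycle loop into a single cycle.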

No big surprise in a way – we all know that custom is going to be much faster and smaller than FPGA. The difference here is that now you can embed custom in eFPGA – pretty neat. Of course, this takes work. Robert Blake, the CEO of Achronix, told me that you might typically expect a 6-month cycle for profiling and custom block development. And there will be an NRE (you didn’t think it would be free, did you?). But if it can deliver this kind of advantage, it may be worth the investment.


The Achronix business is growing very nicely, thanks to development in each of their FPGA accelerator lines. They expect to close 2017 at >$100M, with a strong pipeline apparently well-balanced between their standalone FPGA (Speedster) and embedded applications. Speedcore, introduced to customers in 2015, is their fastest-growing product line and is already in production on TSMC 16nm, with test chips and first designs on TSMC 7nm.

You can read more HERE. You can also see Achronix present at ARM TechCon on:

· Reprogrammable Logic in an Arm-Based SoC, presented by Kent Orthner, Systems Architect
· Smaller, Faster and Programmable – Customizing Your On-Chip FPGA, presented by Steve Mensor, VP of Marketing
· Customize Your eFPGA – Control Your Destiny for Machine Learning, 5G and Beyond, presented by Kent Orthner, Systems Architect


Rethinking IP Lifecycle Management

Rethinking IP Lifecycle Management
by Daniel Payne on 10-18-2017 at 12:00 pm

We recently saw both Apple and Samsung introduce new smartphones, a reminder that the annual race to introduce sophisticated, attractive and differentiated devices is highly competitive. If either of these companies misses a market window then fortunes can quickly change. For SoCs with billions of transistors, like smartphone processors, semiconductor IP re-use is a central approach to design productivity, instead of starting from scratch for each new generation.

Tracking and managing hundreds of IP blocks in an SoC is a task best suited to a purpose-built tool, not an Excel spreadsheet and manual email notifications. I’ve written before about Methodics and how the IP Lifecycle Management (IPLM) approach in their Percipient tool is optimized for IP-centric design flows. One aspect of Percipient that is worthy of attention is its Graph Database (white paper here), which is the key technology for fast and seamless IP reuse.

My first introduction to Relational Database Management Systems (RDBMS) was in the 1990s, while learning MySQL and PHP to build custom, data-driven web sites. Oracle now owns MySQL and it powers many web sites today, like WordPress sites with some 150,000,000 users. MySQL stores rows of information in tables, where each row has multiple columns and some index field. Tables can be related to each other by joining them, which enables complex queries.
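To make the table-and-join model concrete, here is a small sketch using Python's built-in sqlite3 in place of MySQL — the schema and data are invented, but the foreign-key-plus-join pattern is the one described above:

```python
import sqlite3

# Two tables related by a foreign key, joined to answer a query --
# the classic RDBMS pattern.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE ip_blocks (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE releases  (id INTEGER PRIMARY KEY,
                            block_id INTEGER REFERENCES ip_blocks(id),
                            version TEXT);
    INSERT INTO ip_blocks VALUES (1, 'usb_phy'), (2, 'ddr_ctrl');
    INSERT INTO releases  VALUES (1, 1, '1.0'), (2, 1, '1.1'), (3, 2, '2.0');
""")

rows = db.execute("""
    SELECT ip_blocks.name, releases.version
    FROM releases JOIN ip_blocks ON releases.block_id = ip_blocks.id
    WHERE ip_blocks.name = 'usb_phy'
""").fetchall()
print(rows)  # [('usb_phy', '1.0'), ('usb_phy', '1.1')]
```

Flat joins like this are fast; the trouble starts when you need to walk an arbitrarily deep hierarchy, which takes repeated self-joins or recursive queries.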

Percipient instead uses a Graph Database, which stores data as nodes and relationships with key-value properties. A relationship connects two nodes and is both typed and directed. The beauty of this graph database approach is that relationships can be traversed in either direction. SoCs use hierarchy to define how IP is placed, and a graph database models hierarchy natively.

In contrast, a RDBMS doesn’t natively support or use hierarchy at all. Sure, you could use a series of MySQL database tables to store and traverse all of your IP but the performance would begin to suffer as the data scales up in size.

Related blog – Something new in IP Lifecycle Management

Each IP block in your system has dependencies; for example, a USB component depends on PDKs, libraries and test-benches. An IPLM tool has to understand and track all of these dependencies efficiently. Your system may even use different versions of the same component in the same design, so knowing how to avoid conflicts is essential. Dependencies map directly into a graph database, so it’s straightforward to add, delete or manage conflicts. The Percipient tool is used on SoC hierarchies spanning several hundred nodes and up to eight levels of hierarchy.
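A minimal sketch of how dependencies and conflicts fall out of a graph representation, using plain Python dictionaries rather than a real graph database (component names and versions are invented):

```python
# Directed dependency graph: component -> list of (dependency, version).
# A graph store keeps edges typed and navigable in both directions;
# plain dicts are enough to sketch the idea.
DEPENDS_ON = {
    "soc_top":  [("usb_ctrl", "2.0"), ("ddr_ctrl", "1.3")],
    "usb_ctrl": [("stdcell_lib", "7a")],
    "ddr_ctrl": [("stdcell_lib", "7b")],  # conflicting library version
}

def flatten(component, resolved=None):
    """Walk the graph 'downward', collecting every (dep, version) pair."""
    resolved = resolved if resolved is not None else []
    for dep, version in DEPENDS_ON.get(component, []):
        resolved.append((dep, version))
        flatten(dep, resolved)
    return resolved

def conflicts(component):
    """Report dependencies pulled in at more than one version."""
    versions = {}
    for dep, version in flatten(component):
        versions.setdefault(dep, set()).add(version)
    return {dep: vs for dep, vs in versions.items() if len(vs) > 1}

def used_by(target):
    """Reverse traversal: which components depend directly on 'target'?"""
    return [c for c, deps in DEPENDS_ON.items()
            if any(dep == target for dep, _ in deps)]

print(conflicts("soc_top"))    # {'stdcell_lib': {'7a', '7b'}}
print(used_by("stdcell_lib"))  # ['usb_ctrl', 'ddr_ctrl']
```

Both queries are simple edge walks; in a native graph store they stay fast as the hierarchy deepens, where an RDBMS would need ever more self-joins.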

The team at Methodics chose the Neo4j graph database for their Percipient tool because of its popularity, speed and scalability. The previous-generation IPLM tool from Methodics was called ProjectIC and used SQL tables with Postgres, which worked fine for smaller designs but didn’t scale up with enough speed. Let’s take a quick look at speed comparisons between the older ProjectIC approach and the newer Percipient through the following scatter plot, showing response time in seconds on the Y-axis versus calendar time on the X-axis:

Notice the general increase in time to manage IP as the hierarchy grew to six levels and about 290 nodes while using the older ProjectIC tool; then the customer started using Percipient with a graph database, which dramatically lowered their response times and continued to scale well. This graph comes from actual customer usage on a production IC, not a benchmark. Using the graph database approach the customer can now query the status of a hierarchical IP in a workspace, or even view conflicts, in just seconds instead of minutes. These speed improvements will scale into the hundreds, thousands or millions of nodes.

Related blog – New concepts in semiconductor IP lifecycle management

Summary

Methodics has been offering their ProjectIC IPLM tool for many years, then took the next step and re-engineered their approach to exploit a graph database in their newest Percipient IPLM tool. The speed improvements and scalability with the Neo4j graph database look excellent, which means that SoC designers save time and are more likely to meet critical deadlines.


ARM TechCon 2017 Preview with Mentor!

ARM TechCon 2017 Preview with Mentor!
by Daniel Nenni on 10-18-2017 at 7:00 am

Next week is ARM TechCon which is one of my favorite conferences (SemiWiki is an event partner). This year is lucky number thirteen for ARM TechCon and it includes more than sixty hours of sessions plus more than one hundred ARM partners in the exposition. I will be signing free copies of our new book “Custom SoCs for IoT: Simplified” in the Open-Silicon booth #918. Please stop by Wednesday morning and get a book. It would be a pleasure to meet you!

This year Mentor Graphics, A Siemens Business, is a platinum sponsor and has an impressive line-up. Emulation of course is featured due to the conference emphasis on silicon design to software development. Emulation is a fast growing market and, coincidentally, is the topic of our next book (a collaboration with Mentor) which is due out early next year:

Mentor Graphics delivers the most comprehensive Enterprise Verification Platform™ available for ARM based SoCs and Interfaces: including the Visualizer™ Debug Environment for common debug across simulation, formal, emulation and prototyping, Questa® for high performance simulation, verification management and coverage closure, low-power verification with UPF, CDC, Formal Verification and Veloce® for high-performance system emulation, hardware/software co-verification or integration, system-level prototyping, and power estimation and performance characterization. This comprehensive platform supports UVM. Come check out our latest demos…

Tanner EDA is also featured this year which makes complete sense considering their focus on AMS and MEMs design. I worked with Tanner prior to the acquisition and am a big fan of their tools. I pushed for the acquisition believing that Mentor and Tanner would be a 1+1=3 proposition and I was right, absolutely.

Tanner EDA offers complete design flows for the design, implementation and verification of Analog, Mixed Signal and RF integrated circuits, as well as MEMS. Tanner enables the next generation of IoT edge devices by making it easier for designers of sensors, MEMS and actuators to create custom SoCs…

Low power design has also been a top trending keyword on SemiWiki, as it touches just about every market segment, and I do not expect that to ever change. Low power design is also prevalent at Arm events for obvious reasons, which is why Catapult is also featured:

The Catapult® High-Level Synthesis (HLS) and PowerPro® Register Transfer Level (RTL) Low-Power family of products enables ASIC, SoC and FPGA designers to quickly create fully-verified, power-optimized RTL for downstream synthesis and physical design…

And of course Mentor Embedded is featured due to the more than four thousand embedded designers that are expected to attend:

Mentor solutions for Arm® processors enable the development of advanced embedded systems, scalable footprint for Cortex®-M and Cortex®-A applications targeting single to heterogeneous multicore devices for high-performance, power-efficient, secure and safety certified embedded devices. Embedded developers can create systems with the latest Arm processors and micro-controllers with commercially supported and customizable Linux®-based solutions including the industry-leading Sourcery™ CodeBench and Mentor® Embedded Linux products. For real-time systems, developers can take advantage of the small-foot-print and low-power-capable Nucleus® real-time operating system (RTOS)…

See the full Mentor ARM TechCon landing page HERE

See the ARM TechCon Website HERE

See SemiWiki ARM content HERE

See SemiWiki Mentor Content HERE