
Designing at 7nm with ARM, MediaTek, Renesas, Cadence and TSMC

by Daniel Payne on 07-11-2017 at 12:00 pm

The bleeding edge of SoC design was on full display last month at DAC in Austin as I listened to a panel session whose members talked about their specific experiences so far designing with the 7nm process node. Jim Hogan was the moderator, and the panel quickly got into what their respective companies are already doing with 7nm technology. Earlier this year we heard about the first 10nm chip being used for the Qualcomm Snapdragon 835, so I was quite interested to hear what the next smaller node at 7nm is going to bring us.



Synopsys New EV6x Offers 4X More Performance to CNN

by Eric Esteve on 07-11-2017 at 7:00 am

When Synopsys bought Virage Logic in 2010, the ARC processor IP came in the basket, but at the time the ARC processor core was far from the most powerful on the market. The launch of the EV6x vision processor suggests Synopsys has improved the ARC core's processing power by orders of magnitude. The EV6x delivers up to 4X higher performance on common vision processing tasks than the previous generation, reaching up to 4.5 TMAC/s in a 16nm process.

In fact, even though the EV6x is part of the ARC processor IP family, this vision processor is a completely new product, defined to address high-throughput applications such as ADAS, video surveillance and virtual/augmented reality. If an architect is still hesitating between a CPU, GPU, hardware accelerator or DSP-based solution, the performance, power efficiency and flexibility of a solution like the EV6x should greatly ease the decision.

Why is the Convolutional Neural Network (CNN) becoming a key part of a vision processor? Because CNNs support deep learning, and this approach outperforms other vision algorithms. Attempting to replicate how the brain sees, a CNN recognizes objects directly from pixel images with minimal pre-processing. If we look at the relative performance of the various algorithms since 2012 and compare the results with the human error rate, we notice two important points. Since 2014, deep convolutional network algorithms have given better results than humans; moreover, the deeper the network, the better it performs (see ResNet with its 152 layers). We can also conclude from this evolution over time that for such a fast-moving technology, flexibility is mandatory, and that custom solutions based on hardwired accelerators quickly become obsolete…

But the latest vision requirements need increasing computational bandwidth and accuracy. Image resolution and frame rate requirements have moved from 1MP at 15 fps to 8MP at 60 fps, and neural network complexity is increasing greatly, from 8 layers to more than 150 layers. That's why the new EV6x has been boosted to deliver up to 4.5 TMAC/s while keeping power consumption in mind. For CNN, the power efficiency is up to 2,000 GMAC/s per Watt in 16nm FinFET technology (worst-case conditions). Because performance is key, architects can integrate up to four vision CPUs for scalable performance, and the vision CPU supports complex computer vision algorithms including pre- and post-CNN processing.

Synopsys has used techniques to reduce data bandwidth requirements and decrease power consumption. For example, the coefficients and feature maps are compressed/decompressed. The EV6x solution includes CNN engines using 12-bit computations, leading to less power and area but with the same accuracy as 32-bit floating point. This is a wise choice, if you consider that a 12-bit multiplier is almost half the area of a 16-bit multiplier! To be ready for the next technology jump, the EV6x also supports neural networks trained for 8-bit precision… and we have seen that CNN vision is a fast-moving domain.
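To make the 12-bit trade-off concrete, here is a minimal Python sketch (my own illustration, not the Synopsys implementation): it quantizes a handful of hypothetical coefficients to 12-bit fixed point to show how small the rounding error is, and computes the rough area ratio of a 12-bit versus 16-bit array multiplier, whose area scales roughly with the square of the operand width.

```python
# Illustrative sketch only (my own code, not the Synopsys implementation):
# quantize floating-point coefficients to 12-bit fixed point, and estimate
# the relative area of n-bit array multipliers, which scales roughly as n^2.

def quantize(values, bits, max_abs=1.0):
    """Map floats in [-max_abs, max_abs] to signed fixed point and back."""
    scale = (2 ** (bits - 1) - 1) / max_abs   # max code 2047 for 12 bits
    return [round(v * scale) / scale for v in values]

def multiplier_area_ratio(bits_a, bits_b):
    """Rough area ratio of two array multipliers (area ~ n^2)."""
    return (bits_a ** 2) / (bits_b ** 2)

weights = [0.731, -0.248, 0.015, -0.964]      # hypothetical coefficients
q12 = quantize(weights, 12)
worst_err = max(abs(w - q) for w, q in zip(weights, q12))

print(f"12-bit worst-case rounding error: {worst_err:.6f}")
print(f"12-bit vs 16-bit area ratio: {multiplier_area_ratio(12, 16):.2f}")
```

With area proportional to n-squared, 12²/16² ≈ 0.56, which is the "almost half the area" the article mentions, while the worst-case quantization step stays below 2.5e-4 for unit-range coefficients.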

For vision, CNNs can be used to process multiple tasks, like image classification, searching for similar images, or object detection, classification and localization. These tasks support automotive ADAS systems, for example, but not only those. The EV6x vision processor will support surveillance applications as well as drones, virtual or augmented reality, mobile, digital still cameras, multi-function printers, medical devices… and probably more to come!

Availability
The DesignWare EV61, EV62 and EV64 processors are scheduled to be available in August 2017. The MetaWare Development Toolkit and EV SDK Option (which includes the OpenCV library, OpenVX runtime framework, CNN Graph Mapping tools and OpenCL C compiler) are available now.

From Eric Esteve from IPnest


Mentor & Phoenix Software Shed Light on Integrated Photonics Design Rule Checking

by Mitch Heins on 07-10-2017 at 12:00 pm

Just prior to the opening of the 54th Design Automation Conference, Mentor, a Siemens company, and PhoeniX Software issued a press release announcing a new integration between their tools to help designers of photonic ICs (PICs) close the loop for manufacturing sign-off verification. This is a significant piece of the overall flow puzzle that until now has been missing. While at the conference, I was able to sit in on a presentation and demonstration to get first impressions of the new flow.


Anyone who has ever taped out a PIC knows it's prudent to leave time in the schedule to work through a host of false violations seen by foundries trying to use standard CMOS design rule checks for photonics. Curvilinear PIC layouts are particularly challenging for design rule checking (DRC) and layout vs. schematic (LVS) applications because curvilinear shapes are represented in GDSII as multi-point polygons.

To get a feel for the number of errors one might have to wade through, Mentor and PhoeniX used a typical photonic layout module known as a 90-degree hybrid, used for communications protocols like QPSK. They had Calibre run both a normal CMOS DRC deck and an equation-based DRC deck on the layout and compared the violations found. The results were astounding: 6,420 errors flagged by the regular deck versus only 15 flagged by the equation-based rule deck. The 15 violations found by the equation-based deck were real errors; the rest were not. Lest anyone think this is a demo set up to make a point – well, it is – but I can also tell you it is quite indicative of what you would see in real life. Remember, this isn't even a full chip; it's only a small module, part of a photonics chip.
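To see why a standard CMOS deck floods a curvilinear layout with false violations, here is a toy Python model (entirely hypothetical; real Calibre rule decks are far more elaborate): a curved edge discretized into many short polygon segments trips a naive minimum-edge-length rule on every single segment, while an equation-based rule that also evaluates the local turn angle recognizes the smooth curve and reports nothing.

```python
import math

# Hypothetical toy model (not actual Calibre rules): a curved waveguide
# edge is discretized into many short segments. A naive CMOS-style rule
# that flags every edge shorter than MIN_EDGE reports a "violation" per
# segment; an equation-based rule waives short edges when the turn angle
# to the next edge is small, i.e. when the shape is locally a smooth curve.

MIN_EDGE = 0.10      # µm, minimum edge length in the naive deck (invented)
SMOOTH_DEG = 10.0    # max turn angle still considered a smooth curve

def arc_points(r, n, span=math.pi / 2):
    """Discretize a quarter-circle arc of radius r into n segments."""
    return [(r * math.cos(span * i / n), r * math.sin(span * i / n))
            for i in range(n + 1)]

def edge_len(a, b):
    return math.hypot(b[0] - a[0], b[1] - a[1])

def turn_angle(a, b, c):
    """Heading change, in degrees, between edges a->b and b->c."""
    h1 = math.atan2(b[1] - a[1], b[0] - a[0])
    h2 = math.atan2(c[1] - b[1], c[0] - b[0])
    return abs(math.degrees(h2 - h1))

pts = arc_points(r=10.0, n=200)               # 200 chords, each ~0.079 µm
naive = sum(edge_len(pts[i], pts[i + 1]) < MIN_EDGE for i in range(200))
eqn = sum(edge_len(pts[i], pts[i + 1]) < MIN_EDGE
          and turn_angle(pts[i], pts[i + 1], pts[i + 2]) > SMOOTH_DEG
          for i in range(199))

print(f"naive deck violations:          {naive}")   # every segment flagged
print(f"equation-based deck violations: {eqn}")     # none: curve is smooth
```

The exact thresholds are invented, but the ratio mirrors the 6,420-versus-15 result above: the naive rule counts one violation per chord of the discretized arc, while the equation-based rule evaluates the geometry the chords approximate.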

Foundries are working to upgrade their DRC decks to use more of Calibre’s advanced features such as pattern matching and equation-based design rule checking to reduce the number of false errors flagged. However, even with these changes, the DRC runs are still being done at the foundry right before fabrication which means that real problems are still being found far too late in the design cycle where changes are expensive.


A better solution would be one where the sign-off rule checks could be done incrementally during the design process so that any required circuit changes could be quickly made without having to re-engineer the entire layout. Most PIC design flows today use PhoeniX Software's OptoDesigner platform for layout synthesis. While OptoDesigner does a great job of correct-by-construction layout, designers are now moving from photonic module development to full-blown photonic circuit layout, and in that effort they are cutting corners to create smaller, more area-efficient designs, which inevitably causes inadvertent design rule violations.

The integration between OptoDesigner and the Calibre nm Platform enables designers to run Calibre verification directly from the OptoDesigner design tool using the Calibre Interactive GUI. Results can then be viewed in OptoDesigner using the Calibre RVE results viewing environment. The Calibre RVE interface enables violation highlighting, zooming, waiver management and error debugging, all while making corrections in OptoDesigner. This was shown live in the demo as the final 15 real errors were quickly changed in OptoDesigner and Calibre was run in real time showing a clean result.

OptoDesigner can also import the Calibre results directly into the tool, if the designer so desires. The interface was developed by PhoeniX Software through Mentor’s OpenDoor program, a mechanism that supports the development, certification, and distribution of interfaces among EDA vendors to promote open interoperability.

The new interface brings some much-needed formalism to the verification of PIC layout for manufacturing design rules. Additional work is underway between the two companies to formalize a better mechanism for photonic LVS. This is still early days for PIC LVS as most PIC designers are still doing the design in a layout tool as opposed to a schematic, but this too is changing as PICs become more complex.


Mentor Calibre is also quite famous for its ability to model manufacturing effects such as Chemical Mechanical Polishing (and the requisite fill patterns needed for it) as well as lithography. These areas look to be natural extensions of the work already done by Mentor and PhoeniX Software. While photonic curvilinear shapes are large in comparison to electronic components such as FinFET transistors, the performance of the photonics is highly susceptible to manufacturing variations such as differences in layer thicknesses and the fidelity of the printed curved shapes.

Calibre SmartFill is Mentor’s tool for handling constraint-driven fill. SmartFill can be used to post-process the design layout to add foundry-specified metal-fill that meets density targets needed to ensure good layer planarity and thickness control while at the same time ensuring that the fill does not impact optical behavior of the photonic circuit.
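As a rough illustration of what constraint-driven fill involves (a hypothetical sketch of the general idea, not SmartFill's actual algorithm or API): measure metal density per analysis window, add fill tiles only where density falls below the foundry floor, and never place fill inside keep-out regions around the optical paths.

```python
# Toy sketch of constraint-driven fill (not Calibre SmartFill): divide the
# layout into windows, measure metal density per window, and plan fill
# tiles for any window below a minimum density target, while keeping fill
# out of designer-specified keep-out windows around photonic waveguides.

MIN_DENSITY = 0.25     # hypothetical foundry density floor per window
FILL_TILE = 4.0        # area of one fill tile, arbitrary units

def fill_plan(windows, keep_out):
    """windows: {name: (metal_area, window_area)}; keep_out: set of names."""
    plan = {}
    for name, (metal, area) in windows.items():
        if name in keep_out:
            continue                       # never fill near the waveguides
        deficit = MIN_DENSITY * area - metal
        if deficit > 0:
            # number of tiles needed to reach the density floor
            plan[name] = int(-(-deficit // FILL_TILE))   # ceiling division
    return plan

windows = {"W0": (10.0, 100.0), "W1": (30.0, 100.0), "W2": (5.0, 100.0)}
plan = fill_plan(windows, keep_out={"W2"})   # W2 holds the waveguide
print(plan)   # only W0 needs tiles; W1 is already above 25% density
```

Here MIN_DENSITY, FILL_TILE and the window bookkeeping are invented for illustration; the real tool works on layout geometry against foundry-specified rules while also checking that added fill does not perturb the optical behavior.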

Similarly, Calibre LFD (Litho Friendly Design) is Mentor’s tool for simulating the patterned shapes based on the lithography setup of the foundry. Resulting contours from simulations can be used to analyze the impact of lithography printing effects such as rounding and pinching on the photonic device. Used in conjunction with OptoDesigner, these capabilities would enable designers to adjust the layout prior to tape-out to counter any negative impacts the lithography may have on the design.

All in all, this was a very impressive presentation and demonstration that was well received by those attending the session. It appears that the design flows for integrated photonics are coming together nicely.

See also:
Mentor / PhoeniX Software Press Release
Mentor Calibre nm Platform Product Page

PhoeniX Software OptoDesigner Product Page


Cadence Explores Smarter Verification

by Bernard Murphy on 07-10-2017 at 7:00 am

Verification as an effectively unbounded problem will always stir debate on ways to improve. A natural response is to put heavy emphasis on making existing methods faster and more seamless. That’s certainly part of continuous improvement but sometimes we also need to step back and ask the bigger questions – what is sufficient and what is the best way to get there? Cadence hosted a panel at DAC this year on that topic, moderated by Ann Mutschler of SemiEngineering. Panelists were Christopher Lawless (Intel Director of customer-facing pre-silicon strategies), Jim Hogan (Vista Ventures), Mike Stellfox (Cadence fellow) and David Lacey (Verification scientist at HP Enterprise). I have used Ann’s questions as section heads below.

What are the keys to optimizing verification/validation?
Chris said that the big challenge is verifying the system. We’re doing a good job at the unit level, both for software and hardware components, but complexity at the (integrated) system level is growing exponentially. The potential scope of validation at this level is unbounded, driving practical approaches towards customer use-cases. David highlighted challenges in choosing the right verification approach at any point and how best to balance these methods (e.g. prototyping versus simulation). He also raised a common concern – where is he double-investing and are there ways to reduce redundancy?

Jim saw an opportunity to leverage machine learning (ML), citing an example in a materials science company he advises. He sees potential to mine existing verification data to discover and exploit opportunities that may be beyond our patience and schedules to find. Mike echoed and expanded on this saying that we need to look at data to get smarter and we need to exploit big data analytics and visualization techniques, along with ML.

Of the verification we are doing today, what’s smart and what’s not?
David said a big focus in his team is getting maximum value out of the licenses they have. Coverage ranking (of test suites) is one approach; cross-coverage ranking may be another. Mike said that formal is taking off, with interest in using these methods much earlier for IP, and that creates a need to better understand the coverage contribution and how it can be combined with simulation-based verification. He added that at the system level, portable stimulus (PS) is taking off, more automation is appearing, and it is becoming more common to leverage PS across multiple platforms. Chris was concerned about effectiveness across the verification continuum and wants to move more testing earlier in that continuum. He still sees a need for hardware platforms (emulation and prototyping) but wants them applied more effectively.

What metrics are useful today?
Chris emphasized that synthetic testing will only take you so far in finding the bugs that may emerge in real OS/applications testing. Real workloads are the important benchmark. The challenge is to know how to do this effectively. He would like to see ML methods extract sufficient coverage/metrics from customer use-cases ahead of time, and to propagate appropriate derived metrics from this across all design constituencies to help each understand impact on their objectives.

Mike felt that, while it may seem like blasphemy, over-verifying has become a real concern. Metrics should guide testing to be sufficient for target applications. David wants to free up staff earlier so they can move on to other projects. In his view, conventional coverage models are becoming unmanageable. We need analytics to optimize coverage models to address high-value needs.

Is ML driving need for new engineers?
Jim, always with the long view, said that the biggest engineering department in schools now is CS, and that in 10 years it will be cognitive science. He believes that we are on the cusp of cognitive methods which will touch most domains. Chris wouldn't commit to approaches but said Intel is exploring ways to better understand and predict. He said that they have many projects stacked up and need to become more efficient, for example in dealing with diverse IoT devices.

David commented that they are not hiring in ML but they are asking engineers to find new ways to optimize, more immediately to grow experience in use of multiple engines and to develop confidence that if something is proven in one platform, that effort doesn’t need to be repeated on other platforms. Following this theme, Chris said that there continues to be a (team) challenge in sharing data across silos in the continuum. And Mike added that, as valuable as UVM is for simulation-centric work, verification teams need to start thinking about building content for multiple platforms; portable stimulus is designed to help break down those silos.

Where are biggest time sinks?
David said, unsurprisingly, that debug is still the biggest time sink, though the tools continue to improve and make this easier. But simply organizing all the data, how much to keep, what to aggregate, how to analyze it and how to manage coverage continues to be a massive problem. This takes time – you must first prioritize, then drill down.

Chris agreed, also noting the challenge in triage to figure out where issues are between silos and the time taken to rewrite tests for different platforms (presumably PS can help with this). Jim wrapped up noting that ML should provide opportunities to help with these problems. ML is already being used in medical applications to find unstructured shapes in MRI data – why shouldn’t similar approaches be equally useful in verification?

My take – progress in areas we already understand, major moves towards realistic use-case-based testing, and clear need for big-data analytics, visualization and ML to conquer the sheer volume of data and ensure that what we can do is best directed to high-value testing. I’ll add one provocative question – since limiting testing to realistic customer use-cases ensures that coverage is incomplete, how then do you handle (hopefully rare) instances where usage in the field strays outside tested bounds? Alternatively, is it possible to quantify the likelihood of escapes in some useful manner? Perhaps more room for improvement.


Exclusive – GLOBALFOUNDRIES discloses 7nm process detail

by Scotten Jones on 07-08-2017 at 7:00 am

In a SemiWiki EXCLUSIVE – GLOBALFOUNDRIES has now disclosed the key metrics for their 7nm process. As I previously discussed in my 14nm, 16nm, 10nm and 7nm – What we know now blog GLOBALFOUNDRIES licensed their 14nm process from Samsung and decided to skip 10nm because they thought it would be a short-lived node. At 7nm GLOBALFOUNDRIES has taken advantage of the additional technical resources they acquired from IBM to develop their own process.


GLOBALFOUNDRIES 7LP
As Dan Nenni previously discussed in his GlobalFoundries 7nm and EUV Update! blog 7LP (Leading Performance) will offer a greater than 40% performance improvement relative to 14nm or greater than 60% lower power. Area scaling will be approximately 2x and the die cost reduction will be greater than 30%, with greater than 45% in target segments. Initial customer products on 7LP are expected to launch in the first half of 2018 with volume production in the second half of 2018.

The 7LP process will be produced with optical lithography, and what we now know is that the Contacted Poly Pitch (CPP) will be 56nm and the Minimum Metal Pitch (MMP) will be 40nm, produced with Self-Aligned Double Patterning (SADP). A 6-track cell will be offered with a cell height of 240nm. The high-density 6T SRAM cell size is 0.0269 square microns. A 7LP+ process is also planned that will take advantage of EUV when it is ready, offering improved performance and density.
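The disclosed numbers are self-consistent, and a quick calculation shows how they relate: cell height is the track count times the minimum metal pitch, and CPP times cell height gives the density proxy used in the comparison below (the function names are mine; the figures are from the article).

```python
# Quick arithmetic check of the disclosed GLOBALFOUNDRIES 7LP numbers:
# cell height = track count x minimum metal pitch, and CPP x cell height
# is a common proxy for standard-cell logic density.

def cell_height_nm(tracks, metal_pitch_nm):
    return tracks * metal_pitch_nm

def density_proxy_um2(cpp_nm, height_nm):
    """Footprint of one placement site in square microns."""
    return (cpp_nm * height_nm) / 1e6

CPP_NM, MMP_NM, TRACKS = 56, 40, 6            # per the article

height = cell_height_nm(TRACKS, MMP_NM)
print(f"cell height: {height} nm")            # 240 nm, matching the article
print(f"CPP x height: {density_proxy_um2(CPP_NM, height):.5f} um^2")
```

The 6-track height of 6 × 40nm = 240nm matches the disclosed figure, and 56nm × 240nm ≈ 0.0134 square microns is the CPP × Cell Height value compared against the other foundries in Table 1.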

GLOBALFOUNDRIES is also in the unique position of providing an in-house ASIC platform, FX-7, on their 7LP process. FX-7 provides a comprehensive suite of tailored interface IP including High Speed SerDes (60G, 112G), differentiated memory solutions including low-voltage SRAM, high-performance embedded TCAM, integrated DACs/ADCs, ARM processors, and advanced packaging options such as 2.5D/3D.

Comparison to other processes

In my 14nm, 16nm, 10nm and 7nm – What we know now blog I compared Intel's 10nm process to Samsung's and TSMC's 7nm processes. Due to the lack of available information on GLOBALFOUNDRIES' 7nm process I didn't include it. I can now add it to the comparison, but first I need to make a few updates to the data previously discussed.

Previously I used a 44nm CPP for Samsung, basing it on the IBM, Samsung, GLOBALFOUNDRIES IEDM 2016 paper. I am now hearing their actual CPP is 54nm. Of the four processes being compared, Samsung's is the only one that will use EUV initially, and therefore it has the latest risk production date of the four and likely the highest risk of missing that date (something Samsung appears to recognize with their recent announcement of an optically based 8nm process due to enter risk production in late 2017). The use of EUV should result in a lower mask count than the competing processes, and we are currently forecasting that Samsung will use EUV for contacts, vias and metal block masks as part of a Self-Aligned Quadruple Patterning (SAQP) scheme for 1x metal layers.

In the same article, I used 54nm for TSMC’s CPP and although that is a claimed value for the process, I am hearing that their actual libraries have a 57nm CPP.

The following table compares the latest data we have for 10nm/7nm:

[1] IC Knowledge estimates.
Table 1. 10nm/7nm process comparison.

The data in the table illustrates the need for design-technology co-optimization (DTCO) at the leading edge. Intel and Samsung have the smallest CPP and MMP values but because GLOBALFOUNDRIES and TSMC offer 6T cells, they achieve smaller cell heights and ultimately GLOBALFOUNDRIES has the smallest CPP x Cell Height value. Samsung achieves the smallest SRAM cell size and through the use of EUV we expect Samsung to have the lowest mask count.

Conclusion
GLOBALFOUNDRIES 7LP adds a competitive 7nm process to customer options for leading edge design and production. The process parameters are competitive across the board and provide leading density. The availability of the FX-7 ASIC platform offers customers an additional engagement path not available at other foundries.


Embedded FPGAs create new IP category

by Tom Simon on 07-07-2017 at 12:00 pm

FPGAs are the new superstars in the world of machine learning and cloud computing, and with new methods of implementing them in SoCs there will be even more growth ahead. FPGAs started out as a cost-effective method for implementing logic without having to spin an ASIC or gate array. With the advent of the web and high-performance networking hardware, discrete FPGAs evolved into heavy-duty workhorses. The market has also matured and shaken out, leaving two large gorillas and a number of smaller players. However, the growth of AI and the quest for dramatically improved cloud server hardware seems to be expanding the role of FPGAs.


At DAC in Austin I came across Achronix, a relatively new FPGA company that is experiencing a renaissance. I stopped by to speak with Steve Mensor, their VP of Marketing. There was reason enough to speak with him because of their recent announcement that their YTD revenues for 2017 are already over $100M. This is largely the result of solid growth in their Speedster 22i line of FPGA chips. Achronix originally implemented this line at the debut of Intel's Custom Foundry on the then state-of-the-art 22nm FinFET node. This gave them the distinction of being the first customer of Intel's Custom Foundry.


Building on this, Steve was eager to talk about their game-changing IP offering of embedded FPGA cores – aptly named Speedcore eFPGA. These are offered as fully customized embedded FPGA cores that can be integrated right into system-level SoCs. To understand why this is important, we have to look at a recent research project by Microsoft called Catapult, whose goal was to significantly boost search engine performance. Microsoft discovered that there was a big advantage in converting a subset of the search engine software into hardware optimized for the specific compute operation. This advantage is amplified when these compute tasks can be made massively parallel – exactly the kind of thing that FPGAs are good at. They also studied the same approach for cloud computing with Azure and found performance benefits there too.

The next market factor that makes embedded FPGA cores look extremely attractive is neural networks. Both training and recognition require massive computing that can be broken into parallel operations. The recognition phase – such as the one running in an autonomous vehicle – can be implemented largely with integer operations. Once again this aligns nicely with FPGA capabilities. So if FPGAs can boost search engine and AI applications, what are the barriers to implementing them in today's systems?

If you look at the current marketing materials for Altera and Xilinx, you can see that they dedicate a lot of energy to developing and promoting their IO capabilities. Getting data in and out of an FPGA is a critical function. Examining the floor plan of an FPGA chip, you will see a large area used for programmable IOs. Of course, along with the large area resources used comes higher power consumption.


Embedding an eFPGA core means that interface lines can be directly connected to the rest of the design. With less area needed for each signal, wider buses can be implemented. Interfaces can run faster, now that interface SI and timing issues have been reduced with on-chip integration.

The other benefit, alluded to earlier, is that an eFPGA can be configured to achieve optimal performance. The adjustable parameters include the number of LUTs, embedded memories and DSP blocks. Customers get GDSII that is ready to stitch into their design. The tool chain for Speedcore eFPGAs can accommodate the custom configurations.


Steve told me that today the largest share of their impressive revenue is standalone chips, but by 2020 he expects 50% of their sales to be embedded. Another application for FPGAs is as chiplets for 2.5D designs. But more on that in future writings.

Steve emphasized that designing FPGAs is pretty tricky. There are power and signal integrity issues that need to be solved due to their massive interconnect. Real improvement only comes over time, with years of experience optimizing and tuning the architecture. Steve suggested that many small improvements over time have added up to much better results in their FPGAs.

Right now it looks like Achronix is positioned to break away from the pack of smaller FPGA providers and potentially revolutionize the market. With this new approach, FPGAs can be said to have decisively transitioned from their early days as a glue logic vehicle to a pivotal component of advanced computing and networking applications. For more details on Achronix eFPGA cores, take a look at their website.


LETI Days 2017: FD-SOI, Sensors and Power to Sustain Auto and IoT

by Eric Esteve on 07-07-2017 at 7:00 am

Last week I attended the LETI Days in Grenoble, a two-day event marking the 50th anniversary of the CEA subsidiary. Attending the LETI Days is always a rich experience: LETI is a research center with about 3,000 research engineers, but LETI is also a start-up nursery. The presentations range from high-level communication to panels involving actors from the semiconductor and electronics industry (like "Future Applications and New Technologies" with Rajesh PANKAJ, Qualcomm, Alain MUTRICY, Globalfoundries and Antun DOMIC, Synopsys) and start-up sessions, with all of the start-ups having spun out of LETI!

We also had a very interesting keynote presentation from Jean-Marc CHERY, STMicroelectronics, delivering a very realistic message to the industry. STM no longer plays in the smartphone application processor game; that market is dominated by Samsung, Apple and the Chinese chipmakers on the design side, and by a couple of foundries (TSMC, Samsung…) heavily investing to support the most advanced FinFET nodes. But STM sensors are integrated into the major smartphones. STM is now developing innovative technology to support new power electronics devices, like SiC (Silicon Carbide), and supporting automotive with plenty of new devices, going up to ADAS through Mobileye support on FDSOI.

Instead of burning cash trying to stay in the very competitive race for maximum integration (application processors for smartphones, set-top boxes, etc.), capitalizing on the company's strengths in sensors, microcontrollers and power electronics to address fast-growing markets (automotive, industrial IoT) looks like a wise decision for a mature semiconductor company… which is itself a start-up spun out of LETI long ago (1972)!

The next event was the press conference given by CEA-LETI CEO Marie SEMERIA and Fraunhofer Group for Microelectronics Chairman Hubert LAKNER, who had just signed an agreement to "…develop next generation microelectronics to strengthen European strategic and economic sovereignty." This announcement is linked with the recent billion-euro investment from Bosch to build a 12" wafer fab in Dresden to deliver chips for IoT and mobility, backed by the German government, as well as a €300 million investment recently made by the French government in microelectronics…

We (the press) listened carefully as Marie Semeria and Hubert Lakner explained that the collaboration will focus on specific R&D projects like:

– Silicon-based technologies for next generation processes and products, including design, simulation, unit process as well as production techniques
– Extended More than Moore technologies for sensing and communication applications
– Advanced-packaging technologies

What I like about press conferences, as opposed to keynote talks, is that you can ask for clarification after the talk. My question was about the level of investment made in semiconductors in Europe, compared with the $100 billion claimed by the Chinese government, or even the investment made by TSMC in Taiwan to build a new fab for FinFET – in the $10 billion range for a single fab!

The answer from Marie Semeria was frank, and very interesting. She said that European electronics and semiconductor companies are no longer involved in the smartphone race, and shouldn't try to pursue Moore's law to the technology limit; that simply requires too much cash. Let's be realistic, and develop the technologies supporting the applications and systems where the European industry is strong, namely FDSOI, sensors and power devices.

This strategy will allow Europe to move up the value chain and manufacture in Europe the systems based on the above-mentioned devices and technologies. At this point, you may want to make a comparison with IMEC, the well-known Belgian research center. IMEC is involved in the development of the most advanced technologies (Moore's law), and does it very well. Unfortunately, these technologies are transferred to wafer fabs and semiconductor companies based outside of Europe…

FD-SOI: Power consumption, Performance and Cost
FDSOI technology can be a key differentiator; GlobalFoundries is implementing fabs supporting 22FDX and 12FDX in Europe, thanks to its licensing agreement with LETI.

Sensors
Many emerging applications, like ADAS in automotive or industrial IoT, require more and more sensors. The next step is to put intelligence into the sensors. Companies like STMicroelectronics are focusing on sensors and ship them everywhere, including to the European industry.

Power Electronics
These devices will be used more and more, for example with the development of electric cars or industrial equipment and IoT. Europe is well positioned in both the automotive and industrial segments. See also STM's investment in SiC.

To conclude, I would like to remind you of the development of LiOT by LETI. LiOT is an IP platform that developers can use to design ASICs for their IoT applications. I saw a paper from LETI during IP-SoC in Grenoble and it was very good, so I suggested that LETI present a paper describing LiOT at DAC in Austin. The paper was well received by the audience and demonstrates LETI's strong involvement in IoT.

And this paper was good enough to win the "DAC Best Paper Award" in the Design/IP track category!

From Eric Esteve from IPNEST


Webinar: Synopsys on Clock Gating Verification with VC Formal

by Bernard Murphy on 07-06-2017 at 12:00 pm

Clock gating is arguably the most widely used design method to reduce power since it is broadly applicable even when more sophisticated methods like power islands are ruled out. But this style can be fraught with hazards even for careful designers. When you start with a proven-correct logic design and add clock gating, the logic (and timing) can change in ways which are not an intuitively simple extension of the original design, raising the need to validate that those changes have not broken the logic intent.

REGISTER HERE for Webinar on July 11th at 10am PDT

The problem sounds similar to validating that synthesized logic is functionally equivalent to the original RTL, and we know how to deal with that problem – run logic equivalence checking between the RTL and the synthesized netlist. But that doesn’t work for clock gating checks, because conventional equivalence checking (mostly) checks combinational logic equivalence between reference points. Clock gating structures are inherently sequential; traditional equivalence engines don’t work with this kind of problem – you need sequential equivalence checking (SEQ), still a formal-based check but one allowing for cycle-shifted equivalence.

Why not simply use dynamic verification to verify the correctness of this gating? You could but (a) you have to extend your testbenches to add directed tests for clock-gating, not only increasing the complexity of testing but also dramatically increasing verification run-times and (b) you have to consider all possible variations of switching versus functionality in each case. Or you could verify formally, which doesn’t impact your dynamic verification setup or run-times at all and is intrinsically complete. And, by the way, VC Formal SEQ is an app, so much easier to use than traditional formal. Hmm – which way to go? Maybe you should check out the webinar.
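To make the sequential-equivalence idea concrete, here is a toy Python sketch – not VC Formal, and every name in it is illustrative – comparing an always-clocked recirculating-mux register against a clock-gated version, cycle by cycle. It also shows a classic gating bug (the enable sampled one cycle late, as happens without a gating latch) being caught as a trajectory divergence:

```python
def mux_reg(state, d, en):
    # Original RTL intent: register clocks every cycle, a recirculating
    # mux holds the old value when 'en' is low
    return d if en else state

def gated_reg_ok(state, d, en):
    # Correct clock gating: the clock is simply suppressed when 'en' is
    # low, so the next-state function matches the mux version exactly
    return d if en else state

def make_buggy_gated_reg():
    # Buggy gating: the gate uses the previous cycle's enable, a classic
    # hazard when the gating signal isn't latched properly
    prev_en = [1]
    def reg(state, d, en):
        gate = prev_en[0]
        prev_en[0] = en
        return d if gate else state
    return reg

def sequential_equiv(reg_a, reg_b, stimuli, init=0):
    """Compare the two registers' state trajectories cycle by cycle;
    return the first cycle where they diverge, or None if equivalent."""
    sa = sb = init
    for cycle, (d, en) in enumerate(stimuli):
        sa, sb = reg_a(sa, d, en), reg_b(sb, d, en)
        if sa != sb:
            return cycle
    return None

stimuli = [(1, 1), (0, 0), (1, 0), (0, 1)]  # (d, en) per cycle
print(sequential_equiv(mux_reg, gated_reg_ok, stimuli))            # None
print(sequential_equiv(mux_reg, make_buggy_gated_reg(), stimuli))  # 1
```

A real SEQ engine proves this over all reachable states and input sequences rather than a handful of stimulus vectors, but the failure mode it hunts for is exactly this kind of cycle-level divergence.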

REGISTER FOR THIS WEBINAR to learn how you can use VC Formal to formally verify the sequential equivalence of a clock gated circuit with the original logic, to have full confidence that your logic intent has been preserved.


ADAS and Vision from Cadence

ADAS and Vision from Cadence
by Daniel Payne on 07-05-2017 at 12:00 pm

A huge theme at #54DAC this year was all things automotive, and in particular the phrase ADAS (Advanced Driver Assistance Systems), so I followed up with Raja Tabet, a corporate VP of emerging technology at Cadence. We met on Monday in a press room, where I quickly learned that Cadence has been serving the automotive industry for the past 15 years with IC tools, IP and services.

Cadence acquired Tensilica back in March 2013, gaining instant access to an application-specific acceleration platform with DSP features that is easy to customize through generated software and custom instructions. This technology fits quite well with ADAS features for vision, radar, Lidar and even sensor fusion. I’ve been following Tensilica over the years because my former roommate Chris Rowen founded the company, and we both worked at Intel starting in 1978.

Related blog – A Brief History of Tensilica

The latest Tensilica vision processor (Vision P6) supports both imaging and vision functions, and the recently announced Vision C5 is a CNN (Convolutional Neural Network) processor that gives you 1 TMAC/sec of computational capacity. What makes the C5 a bit unique is that in a single chip you get both a CNN engine and a CPU together, instead of separate components from different vendors. Here’s a quick snapshot of the Tensilica product offerings:

Customers would first conceive of a neural network for their application, train the network, then embed it by using software to compile the network onto Tensilica hardware. Competitive approaches in this space include vendors that use GPUs or hardware accelerators. With the C5’s approach you get a power/performance ratio that is quite high, and the device is easy to program. With Cadence you now have the technology to create your own ADAS system.
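To get a rough sense of what 1 TMAC/sec buys you, here is a back-of-envelope Python calculation of the MAC cost of a convolution layer. The layer dimensions and layer count are illustrative assumptions for the sake of arithmetic, not figures from any Tensilica datasheet:

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for one conv layer: each output pixel of each output channel
    needs a k*k*c_in dot product (stride 1, 'same' padding assumed)."""
    return h * w * c_out * (k * k * c_in)

# Example: one 3x3 convolution on a 224x224 feature map, 64 -> 128 channels
macs = conv_macs(224, 224, 64, 128, 3)
print(f"{macs / 1e9:.1f} GMACs per layer")  # → 3.7 GMACs per layer

# Assuming a network with ~30 layers of this cost, the frame rate
# achievable at the C5's quoted 1 TMAC/s:
per_frame = macs * 30
print(f"~{1e12 / per_frame:.0f} frames/s at 1 TMAC/s")  # → ~9 frames/s
```

Real networks mix cheaper and more expensive layers, but the exercise shows why embedded vision quickly pushes into the TMAC/s range.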

Related blog – Cadence to acquire Tensilica

The automotive infotainment world is well served by standards; however, ADAS is so new that there are no standards out there to quickly choose from. Expect de facto standards to begin emerging as automotive companies form alliances or adopt specific vendor tools and flows.

Related blog – CPU, GPU, H/W Accelerator or DSP to Best Address CNN Algorithms?

From a design viewpoint there’s a lot to consider for the automotive market, like:

  • Functional safety
  • Compliance with the ISO26262 requirements, ASIL A to ASIL D
  • Fault simulation
  • Static and formal analysis
  • Fault analysis with emulation

The technology at Cadence has matured to the point of offering a comprehensive methodology for automotive design. Adjacent to the automotive market is a related market for robotic technology where imaging and vision are fundamental requirements for industrial robots. What intrigues me the most about neural networks is the ability for the machine to get smarter with each new encounter, or a new release of firmware that is downloaded over a wireless network.

Related blog – The CDNLive Keynotes

Along the path to fully autonomous driving, which is Level 5, are the four lower levels of automation. No matter which level you are designing for, there are common questions:

  • Which architecture is the best one?
  • Centralized sensors or distributed intelligence per sensor?
  • Hybrid architecture?

The approach at Cadence is to let you decide how to implement ADAS by making your own architectural decisions. In 2017 there are some 17 ADAS events that Cadence will participate in, so this opportunity is large and growing, which is always a good thing.