A Forbidden Pitch Combination at Advanced Lithography Nodes

by Fred Chen on 03-06-2020 at 10:00 am

The current leading edge of advanced lithography nodes (e.g., “7nm” or “1Z nm”) features pitches (center-to-center distances between lines) in the range of 30-40 nm. Whether EUV (13.5 nm wavelength) or ArF (193 nm wavelength) lithography is used, one thing is certain: the minimum imaged pitch will be less than the wavelength divided by the numerical aperture of the corresponding tool (0.33 for EUV, 1.35 for ArF immersion).
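As a rough check of this statement, the sketch below computes wavelength/NA for both tool types from the values quoted above. The function names and the example imaged pitches (36 nm and 40 nm for EUV, 80 nm for a single ArF immersion exposure) are illustrative assumptions, not values from a specific process.

```python
def wavelength_over_na(wavelength_nm: float, na: float) -> float:
    """Return wavelength / NA in nm."""
    return wavelength_nm / na


def normalized_pitch(pitch_nm: float, wavelength_nm: float, na: float) -> float:
    """Express a pitch in units of wavelength / NA."""
    return pitch_nm / wavelength_over_na(wavelength_nm, na)


if __name__ == "__main__":
    # Wavelength and NA come from the text; the pitches are assumed examples.
    cases = [
        ("EUV", 13.5, 0.33, 36.0),
        ("EUV", 13.5, 0.33, 40.0),
        ("ArF immersion", 193.0, 1.35, 80.0),
    ]
    for name, wl, na, pitch in cases:
        print(f"{name}: wavelength/NA = {wavelength_over_na(wl, na):.1f} nm, "
              f"{pitch:.0f} nm pitch = {normalized_pitch(pitch, wl, na):.2f} wavelength/NA")
```

A normalized pitch below 1 places the pattern in the regime discussed in the rest of the article.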

1X Pitch
Under these conditions, the targeted minimum pitch, labelled “1X” here, is imaged as the interference of exactly two beams. Furthermore, only certain illumination angles allow this interference to occur; light from other angles produces no image at all and only adds unwanted background.

2X Pitch
Commonly, a processor layout also includes 2X pitches, i.e., lines separated by twice the minimum pitch. These occur naturally when design grids are used, with the grid spacing tied to the minimum pitch. However, when 2X pitches are imaged with the same illumination as 1X pitches, both the image and the depth of focus differ. The reason is that 2X pitches are imaged as the interference of three beams rather than two.

Figure 1. Two-beam interference for 1X pitch and three-beam interference for 2X pitch.

1X vs. 2X Depth of Focus
The difference in optical path length between the middle beam and the side beams is large for the three-beam case, while for the two-beam case there is no middle beam and the two side beams have similar optical paths at the appropriate angle. Consequently, the depth of focus (DOF) is worse for the three-beam case than for the two-beam case. On the other hand, at a non-optimum angle, even the two-beam image tolerates defocus poorly, for the same reason: the two beams have different optical path lengths (Figure 2).

Figure 2. Different optical paths, indicated by the gap between the vertical positions of the different wavefronts in red and blue, for three-beam interference (left) and two-beam interference with non-optimum angle (right).
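The path-length argument can be made concrete with a small calculation. The sketch below assumes plane waves in vacuum, so a beam tilted by an angle theta from the optical axis accumulates an on-axis path of z*cos(theta) over a defocus distance z. The beam angles, the 50 nm defocus and the function names are illustrative assumptions.

```python
import math


def defocus_path_nm(sin_theta: float, defocus_nm: float) -> float:
    """On-axis optical path of a tilted plane wave over a defocus distance."""
    return defocus_nm * math.sqrt(1.0 - sin_theta ** 2)


def path_gap_nm(sines, defocus_nm: float) -> float:
    """Largest path difference among the interfering beams."""
    paths = [defocus_path_nm(s, defocus_nm) for s in sines]
    return max(paths) - min(paths)


if __name__ == "__main__":
    z = 50.0  # assumed defocus in nm
    beams = {
        "two-beam, symmetric": [-0.1875, 0.1875],
        "two-beam, non-optimum angle": [-0.10, 0.275],
        "three-beam": [-0.1875, 0.0, 0.1875],
    }
    for label, sines in beams.items():
        print(f"{label}: path gap = {path_gap_nm(sines, z):.3f} nm")
```

In this model the symmetric two-beam case has zero gap at any defocus, while the other two cases accumulate a gap that grows with defocus.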

However, for the 2X pitch, it is still possible to find different illumination angles that result in something close to two-beam interference. These correspond to different source points in the pupil (Figure 3). The 2X pitch points have x angular coordinates that are half those of the optimum points for the 1X pitch and, at the same time, y angular coordinates high enough to ensure that only two diffracted beams are captured within the numerical aperture.

Figure 3. Different source points in the pupil give rise to the desired two-beam interference patterns, for the 1X case (left: pitch = 0.88 wavelength/NA; right: pitch = 0.98 wavelength/NA) and the 2X case (left: pitch = 1.76 wavelength/NA; right: pitch = 1.96 wavelength/NA). Moreover, for the lower k1, the best DOF is a limited subset of all possible two-beam interference source points.
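The source-point argument can be sketched numerically. The code below assumes lines running along y, a circular pupil of radius 1 in NA-normalized coordinates, and diffraction orders displaced along x by m/pitch, with the pitch given in units of wavelength/NA. The function name and the specific source points are illustrative assumptions; the pitches 0.88 and 1.76 are taken from Figure 3.

```python
import math


def captured_orders(pitch: float, sx: float, sy: float, max_order: int = 5):
    """Return the diffraction orders that land inside the unit-radius pupil.

    pitch is in units of wavelength/NA; (sx, sy) is the source point in
    pupil coordinates normalized to NA.
    """
    return [m for m in range(-max_order, max_order + 1)
            if math.hypot(sx + m / pitch, sy) <= 1.0]


if __name__ == "__main__":
    p1x, p2x = 0.88, 1.76
    sx_opt = -0.5 / p1x  # symmetric two-beam point for the 1X pitch

    print(captured_orders(p1x, sx_opt, 0.0))        # [0, 1]: two beams
    print(captured_orders(p2x, sx_opt, 0.0))        # [0, 1, 2]: three beams
    print(captured_orders(p2x, sx_opt / 2.0, 0.8))  # [0, 1]: two beams again
```

Halving the x coordinate and raising the y coordinate pushes the outer orders of the 2X pitch outside the pupil, which is the behavior described in the text.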

The different illumination conditions mean that the defocus tolerances are mutually exclusive: a single exposure cannot offer the same focus window to both 1X and 2X pitches.

Subresolution Assist Features (SRAFs)
A widely suggested solution [1] for accommodating both 1X and 2X pitches is to use subresolution assist features (SRAFs) on the 2X pitches to make them appear more similar to 1X features (Figure 4). This essentially suppresses the middle beam of the three beams, causing side lobes to grow in between the lines. Care must be taken, however, that these side lobes do not print. The 2X pitch itself is not changed, and the middle beam cannot be completely eliminated, so the focus window will still be narrower than in the 1X case. SRAFs would also be more vulnerable to stochastic effects due to their smaller sizes [2]. In situations where multiple patterning is already planned to add or remove lines, the use of SRAFs to match 2X to 1X pitches would be unnecessary.

Figure 4. Subresolution assist features (SRAFs) make the 2X pitch look more like a 1X pitch. However, side lobe printing at the SRAF locations is a risk that cannot be neglected.

EUV Pupil Rotation Sensitivity
It has been reported [3,4] that the distribution of EUV illumination source points rotates azimuthally about the optical axis across the exposure slit, over a range of more than +/- 18 degrees (Figure 5); this is a natural outcome of using reflective ring-field optics [5]. As a result, illumination angles with the best depth of focus are rotated to angles with inferior depth of focus. To avoid this outcome, a large portion of the exposure slit would have to be excluded, resulting in a much smaller effective exposure field width. To compensate, the scan across the wafer must stop at many more locations, reducing throughput substantially (Figure 6). Even worse, some chips are already very wide, taking up almost the entire maximum 26 mm field width. For these chips, the wide exposure would have to be divided into multiple exposures with separately optimized illuminations. One patent from TSMC [6] even rotates different parts of a wide chip layout accordingly, in order to keep a single exposure.

Figure 5. Pupil rotation across slit (~ +/-20 deg) does not preserve depth of focus at all slit locations, as the illumination source points are displaced from optimum locations.

Figure 6. Pupil rotation with a full field would suffer from illumination rotation (left); this can be mitigated with a smaller exposure field width (right), which requires more stopping time.
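The effect of pupil rotation on a carefully chosen source point can be illustrated with a short sketch. It assumes the same simplified model as a circular, NA-normalized pupil with diffraction orders displaced along x by m/pitch, and it treats the slit-dependent rotation as a plain rotation of the source point about the optical axis. The source point, the +/- 18 degree angles applied to it, and the function names are illustrative assumptions; the 1.76 wavelength/NA pitch is the 2X pitch from Figure 3.

```python
import math


def rotate(sx: float, sy: float, angle_deg: float):
    """Rotate a source point about the optical axis."""
    a = math.radians(angle_deg)
    return (sx * math.cos(a) - sy * math.sin(a),
            sx * math.sin(a) + sy * math.cos(a))


def captured_orders(pitch: float, sx: float, sy: float, max_order: int = 5):
    """Diffraction orders inside the unit-radius pupil (pitch in wavelength/NA)."""
    return [m for m in range(-max_order, max_order + 1)
            if math.hypot(sx + m / pitch, sy) <= 1.0]


if __name__ == "__main__":
    pitch_2x = 1.76
    source = (-0.5 / pitch_2x, 0.8)  # assumed two-beam point for the 2X pitch
    for angle in (-18.0, 0.0, 18.0):
        sx, sy = rotate(*source, angle)
        orders = captured_orders(pitch_2x, sx, sy)
        print(f"rotation {angle:+5.1f} deg: orders {orders}")
```

In this toy model the unrotated point gives two beams, while the rotated points give either three beams or a single beam that forms no image, so the same source shape no longer provides the intended two-beam imaging at the slit edges.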

How It Is Dealt With

The 1X vs 2X pitch incompatibility for depth of focus can be handled in four different ways:

(1) Design rule restrictions: Exclude the 2X pitch as forbidden. This is by far the simplest approach. But it may be too restrictive.

(2) SRAFs: This has been implemented successfully for DUV lithography, with care taken to not print the SRAFs in the process. For EUV, though, stochastic effects are aggravated, and pupil rotation is not addressed.

(3) Multiple Patterning: Splitting out 1X and 2X pitches can occur as part of multipatterning.

(4) EUV pupil rotation: To avoid multiple patterning, the EUV pupil rotation would require either limiting the exposure field or pre-rotating parts of the layout.

References

[1] J. G. Garofalo et al., Proc. SPIE 2440, 302 (1995).

[2] https://www.linkedin.com/pulse/stochastic-printing-sub-resolution-assist-features-frederick-chen/

[3] R. Capelli et al., Proc. SPIE 9231, 923109 (2014).

[4] A. Garetto et al., J. Microlith/Nanolith. MEMS MOEMS 13, 043006 (2014).

[5] M. Antoni et al., Proc. SPIE 4146, 25 (2000).

[6] US Patent 9091930, assigned to TSMC.

This article first appeared in LinkedIn Pulse: A Forbidden Pitch Combination at Advanced Lithography Nodes


TSMC’s 5nm 0.021um2 SRAM Cell Using EUV and High Mobility Channel with Write Assist at ISSCC2020

by Don Draper on 03-06-2020 at 6:00 am

Technological leadership has long been key to TSMC’s success. The company is following up its leading 5nm development with the world’s smallest SRAM cell, at 0.021 µm², and has disclosed circuit design details of the write assist techniques needed to achieve the full potential of this technology. In addition to groundbreaking device developments such as the High Mobility Channel (HMC), TSMC is a leading implementer of Extreme Ultra-Violet (EUV) patterning, which enables higher yield and shorter cycle time at this advanced node.

Semiconductor technology evolution has been driven by the application landscape. The current phase of High-Performance Computing (HPC), Artificial Intelligence (AI) and 5G communication requires the highest performance with limited power dissipation, as illustrated in Fig. 1.

Fig. 1 Semiconductor Technology Application Evolution

TSMC described this technology at IEDM 2019. The 5 nm process uses more than 10 Extreme Ultra-Violet (EUV) mask patterning steps, each replacing three or more immersion mask steps, and High Mobility Channel (HMC) technology for higher performance. The process has been in risk production since April 2019 and will be in full production in 1H2020.

The implementation of this technology for the development of high-performance SRAM bit cells and arrays was described by Jonathan Chang, et al. at ISSCC2020.

The quantizing of FinFET transistor sizing continues to be a major challenge and forces all transistors in the high-density 6T SRAM cell to use only a single fin. The design is optimized through Design-Technology Co-Optimization (DTCO) to give high performance and density as well as high yield and reliability. SRAM bit cell scaling from 2011 to 2019 is shown in Fig. 2.

Fig. 2. SRAM bit cell scaling is shown for 2011 to 2019.

It can be noted that the cell size reduction rate from 2017 to 2018 to 2019 is much slower than the rate for preceding years 2011 to 2017, showing that SRAM cells have not been scaling at the same rate as logic in general. At IEDM 2019, the 5nm process was quoted to have 1.84x logic density improvement compared to 1.35x SRAM density improvement. Further area reduction utilizing Flying Bit Line (FBL) architecture is implemented for 5% area savings. The layout of the 5nm cell is shown in Fig. 3.

Fig. 3. Layout of the high-density 6T SRAM bit cell.

For power reduction, a key approach is lowering the minimum operating voltage Vmin of the SRAM array. The increased random threshold voltage variation in this latest technology limits Vmin, which in turn limits the opportunities for power reduction. The SRAM voltage scaling trend is shown in Fig. 4, where the blue line indicates Vmin without write assist and the red line indicates Vmin with write assist, showing the great benefit of write assist in each generation. Vmin shows very little improvement from 7nm to 5nm, indicating that further power reduction must come from improvements in write assist circuits. This article describes the two major write assist methods that enable lower Vmin in operation: negative bit line (NBL) and Lower Cell VDD (LCV).

Fig. 4. SRAM cell voltage scaling trend without write assist (blue line) and with write assist (red line).

The SRAM cell schematic is shown in Fig. 5 showing contention during write between the PU and pass-gate transistor PG. A stronger PU transistor would yield a higher read stability, but it degrades the write margin significantly and results in a contention write Vmin issue.

Fig. 5. SRAM cell schematic showing contention during write between the PU and pass-gate transistor PG.

The first method to improve the write Vmin is to lower the bit line voltage during write, called Negative Bit Line (NBL). This method has been employed for several years, using a MOS capacitor to generate a negative bias signal on the bit line, but this write assist circuitry results in area overhead. Furthermore, a fixed amount of MOS capacitance induces an over-boosted NBL level for short BL configurations and may lead to dynamic power overhead in short bit lines, as shown in Fig. 6.

Fig. 6. Fixed amount of MOS capacitance induces over-boosted NBL level for short BL configuration and may lead to dynamic power overhead avoided by the metal cap NBL.

The overboost and the MOS capacitor area issues can be avoided by using a metal capacitor-coupled scheme based on coupled metal tracks laid out on top of the upper metal of the SRAM array. To avoid the overboost, the metal capacitor length can be modulated with the SRAM array bit line length, saving dynamic power. Furthermore, the coupled NBL level can also be adjusted to compensate the loss of write ability induced by BL IR drop on the far-side bit cell.
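The overboost effect and its mitigation can be sketched with an idealized capacitive divider. The model below assumes the boosted node is floating and loaded only by the bit line capacitance, so a negative swing on one plate of the coupling capacitor appears on the node scaled by C_couple / (C_couple + C_load). All capacitance values, the 750 mV swing and the function name are hypothetical and chosen only for illustration.

```python
def nbl_level_mv(swing_mv: float, c_couple_ff: float, c_load_ff: float) -> float:
    """Negative level coupled onto a floating node by a capacitive divider."""
    return -swing_mv * c_couple_ff / (c_couple_ff + c_load_ff)


if __name__ == "__main__":
    SWING_MV = 750.0          # assumed swing of the enable signal
    C_BL_PER_CELL_FF = 0.1    # assumed bit line load per cell
    C1_FIXED_FF = 6.4         # assumed fixed capacitor, sized for 256 cells
    C1_PER_CELL_FF = C1_FIXED_FF / 256  # capacitor that tracks bit line length

    for cells in (64, 128, 256):
        c_bl = cells * C_BL_PER_CELL_FF
        fixed = nbl_level_mv(SWING_MV, C1_FIXED_FF, c_bl)
        tracked = nbl_level_mv(SWING_MV, cells * C1_PER_CELL_FF, c_bl)
        print(f"{cells:3d} cells: fixed cap {fixed:7.1f} mV, "
              f"tracking cap {tracked:7.1f} mV")
```

With a fixed capacitor the short bit line is boosted far more negative than the long one, whereas a capacitor that scales with bit line length keeps the coupled level constant in this simplified model.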

The NBL enable signal (NBLEN) in Fig. 7 drives one side of the metal capacitor C1 negative, which couples a negative bias signal at the virtual ground node NVSS, which then passes through the write driver WD and column multiplex to the selected bit line.

Fig. 7. The NBL enable signal (NBLEN) couples the configurable metal capacitor C1 to NVSS.

The NBL coupling level for different bit line configurations is shown in Fig. 8. The configurable metal capacitor C1 tracks the bit line length, so the variation of the coupled NBL level with bit line length is mitigated.

Fig. 8. NBL coupling level with different bit line configurations showing the longer 256-bit bit line (blue) having an extended NBL boosted level.

The second write assist method is to Lower the Cell VDD (LCV). Conventional LCV techniques require a strong bias or an active divider to adjust the column-wise bit cell power supply during write operation, but these techniques consume a large amount of active power throughout operation. Pulse Pull-down (PP) and Charge Sharing (CS) are two alternatives, but precise timing is difficult for PP, so CS is proposed, using metal wire charge sharing capacitors on top of the array as shown in Fig. 9.

Fig. 9. Charge Sharing (CS) for Low Cell VDD (LCV) for write assist using CS metal tracks on top of the SRAM array.

In a write operation, the LCV enable signal (LCVEN) goes high and turns off the pull-low NMOS (N1), isolating the charge sharing capacitor C1 from ground. A column is selected by COL[n:0], which turns the header P0 off and isolates the array virtual power rail CVDD[0] from the true power rail VDDAI. Because the metal wire capacitance scales with the size of the bit-cell array, this scheme also benefits SRAM compiler design and provides a relatively constant charge sharing voltage level across varied BL configurations. The charge sharing level is determined by the ratio of the metal capacitance of CVDD to that of the charge sharing metal track. Fig. 10 shows that three LCV-VDD ratios are implemented: 6%, 12% and 24%.
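The dependence on the capacitance ratio can be sketched with ideal charge conservation. The model below assumes the charge sharing capacitor starts at 0 V (it was held at ground before the write) and that the quoted percentages denote the fractional drop of CVDD; both are assumptions for illustration, as are the function names and the capacitance values.

```python
def cvdd_after_sharing(vdd: float, c_cvdd: float, c_share: float) -> float:
    """CVDD after sharing charge with a capacitor that starts at 0 V."""
    return vdd * c_cvdd / (c_cvdd + c_share)


def share_cap_ratio(drop_fraction: float) -> float:
    """C_share / C_CVDD needed for a given fractional drop of CVDD."""
    return drop_fraction / (1.0 - drop_fraction)


if __name__ == "__main__":
    vdd = 0.75  # assumed supply in volts
    for drop in (0.06, 0.12, 0.24):
        ratio = share_cap_ratio(drop)
        # Scaling both capacitances with array size leaves the level unchanged.
        small = cvdd_after_sharing(vdd, 10.0, 10.0 * ratio)
        large = cvdd_after_sharing(vdd, 40.0, 40.0 * ratio)
        print(f"drop {drop:.0%}: C_share/C_CVDD = {ratio:.3f}, "
              f"CVDD = {small:.3f} V (small array), {large:.3f} V (large array)")
```

Because only the ratio enters the result, capacitances that both scale with array size give the same lowered level, which is consistent with the relatively constant level described in the text.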

Fig. 10. Three LCV-VDD ratios are implemented for 6%, 12% and 24%.

With write assist turned off, Vmin is constrained by write failure. Measured results with write assist in Fig. 11 show that NBL improves Vmin by 300 mV and that 24% LCV independently improves Vmin by over 300 mV.

Fig. 11. Measured results of (a) metal capacitor-boosted Write Assist WAS-NBL scheme and (b) metal charge-sharing capacitor WAS-LCV scheme.

Performance of the 5nm process is enhanced by the High Mobility Channel with ~18% drive current gain shown in Fig. 12. This technology was described in detail at IEDM2019.

Fig. 12. High Mobility Channel (HMC) performance gain of ~18%.

This performance gain is exemplified by the high-speed SRAM array for L1 cache application achieving 4.1 GHz at 0.85 V, shown in the shmoo plot in Fig. 13.

Fig. 13. Shmoo plot of the HD SRAM array for use as a high performance L1 cache showing 4.1 GHz at 0.85 V.

The measured results are based on the 135 Mb test chip shown in Fig. 14.

Fig. 14. 135 Mb test chip in 5 nm HK-MG FinFET with High Mobility Channel (HMC) and 0.021 µm² SRAM bit cell.

In summary, the detailed circuit design techniques described here enable the product developer to get the maximum advantage from this leading technology. An important device development approach is to do Design-Technology Co-Optimization (DTCO) between product/circuit designers and process developers responsible for product yield and reliability.

ALSO READ: TSMC Unveils Details of 5nm CMOS Production Technology Platform Featuring EUV and High Mobility Channel FinFETs at IEDM2019


The Story of Ultra-WideBand – Part 2: The Second Fall

by Frederic Nabki & Dominic Deslandes on 03-05-2020 at 10:00 am

Over-engineered to perfection, outmaneuvered by Wi-Fi
In Part 1 of this series, we recounted the birth of wideband radio at the turn of the 20th century, and how superheterodyne radio killed wideband radios for messaging after 1920. But RADAR kept wideband research alive through World War 2 and the Cold War. Indeed, the story of wideband radios was not over…

Continuing the story, the benefits of ultra-wideband (UWB) became more apparent as demand for wireless communications grew in the 1990s. But commercial deployment of UWB systems required worldwide agreement on frequency allocations, harmonic and power restrictions, etc. As interest in the commercialization of UWB increased, developers of UWB systems began pressuring the Federal Communications Commission (FCC) to approve it for commercial use. In 2002 the FCC finally allowed the unlicensed use of UWB systems. The European Telecommunications Standards Institute (ETSI) followed a few years later with its own regulations, which unfortunately differ slightly from the FCC regulations. Other regions followed, often aligning with the FCC or ETSI.

UWB systems use short-duration (i.e., picosecond to nanosecond) electromagnetic pulses for transmission and reception of information. They also have a very low duty cycle, defined as the ratio of the time that an impulse is present to the total transmission time. Based on emission regulations set in the 2000s, a UWB signal is defined as a signal having a bandwidth larger than 500 MHz. Most countries have now agreed on the maximum output power spectral density for UWB, defined as -41.3 dBm/MHz.
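To put these numbers in perspective, the sketch below converts the -41.3 dBm/MHz limit into a total power for a given bandwidth, assuming the signal sits exactly at the limit across a flat spectrum, and computes a duty cycle from the definition above. The 2 ns pulse width, 1 µs repetition period and function names are illustrative assumptions; 500 MHz is the minimum UWB bandwidth and 7500 MHz is the full 3.1 to 10.6 GHz FCC allocation.

```python
import math


def total_power_dbm(psd_dbm_per_mhz: float, bandwidth_mhz: float) -> float:
    """Total power of a flat spectrum at the given density over the bandwidth."""
    return psd_dbm_per_mhz + 10.0 * math.log10(bandwidth_mhz)


def dbm_to_mw(power_dbm: float) -> float:
    """Convert dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def duty_cycle(pulse_width_s: float, period_s: float) -> float:
    """Fraction of time during which an impulse is present."""
    return pulse_width_s / period_s


if __name__ == "__main__":
    for bw_mhz in (500.0, 7500.0):
        p_dbm = total_power_dbm(-41.3, bw_mhz)
        print(f"{bw_mhz:6.0f} MHz: {p_dbm:6.2f} dBm = {dbm_to_mw(p_dbm) * 1000:.1f} uW")
    # Assumed example: a 2 ns pulse repeated every microsecond.
    print(f"duty cycle = {duty_cycle(2e-9, 1e-6):.3%}")
```

Even over the full allocation, the total comes out to roughly half a milliwatt in this flat-spectrum estimate.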

With regulations now in place, alliances of companies started to form in order to standardize the physical layers and media access control (MAC) layers. The WiMedia Alliance, formed in 2002, was a non-profit industry trade group that promoted the adoption, regulation, standardization and multi-vendor interoperability of UWB technologies. It was followed, in 2004, by the Wireless USB Promoter Group and the UWB Forum.

In order to understand the choices made by these alliances, we should contextualize them. In 2002, Wi-Fi was a relatively new technology. An 802.11b router, available since 1999, had a theoretical maximum speed of 11 Mbit/s using the 2.4 GHz frequency band. The 802.11a standard, also defined in 1999 and promising a theoretical maximum speed of 54 Mbit/s in the 5 GHz band, was not getting traction in the consumer space, mainly due to its higher chipset cost. In 2003, the 802.11g standard was introduced, providing a theoretical maximum speed of 54 Mbit/s in the 2.4 GHz band. Even though the 802.11g standard proved to be a great success, the data rate was still limited by the crowded 2.4 GHz band, which was the backbone of wireless LANs at the time and was shared with microwave ovens and well-marketed (a.k.a. more trouble than they were worth) cordless phones!

It is with these limitations in mind that a new generation of UWB radios was proposed. With emission regulations now in place, it was hard to resist the promise of UWB-enabled high data rates. Indeed, the 7.5 GHz of bandwidth allocated between 3.1 and 10.6 GHz by the FCC was an extremely valuable resource for wireless communication engineers. This is how specifications for short-range (i.e., a few meters) file transfers at data rates of 480 Mbit/s were proposed, based on UWB multi-band orthogonal frequency-division multiplexing (OFDM). After a few years of development, the first retail product started shipping in mid-2007. This was very much an overengineered radio that multiplexed multiple wide-bandwidth carriers in a relatively classical way, and was not per se an impulse-based radio akin to the spark-gap radio.

Even though OFDM UWB was making a lot of noise at the time and the products were promising, its introduction to the market faced a perfect storm in the late 2000s. 2008 marked the beginning of the Great Recession, leading to a significant decline in retail sales of consumer electronics. In addition, while the different UWB alliances were working on novel products, the Wi-Fi Alliance was not standing still. In 2006, after years of development and negotiations, the first draft of the 802.11n standard was published. Supporting the Multiple-Input and Multiple-Output (MIMO) concept to multiplex channels, it was developed to provide data rates of up to 600 Mbit/s. Although the final version of the standard was not published until October 2009, routers supporting the draft standard started pre-emptively shipping in 2007.

The last nail in the coffin of OFDM UWB came from the technology itself. The complexity of the OFDM UWB transceiver RF architecture proposed at the time and its stringent timing requirements led to a relatively high product cost and lackluster power consumption.

This combination of events and a technologically over-engineered chipset sealed the demise of high-speed UWB radios. WiQuest, the leader in UWB chipsets at the time with 85% of the market in early 2008, ceased operations on October 31, 2008. The UWB Forum was disbanded after failing to agree on a standard due to contrasting approaches with the WiMedia Alliance. The WiMedia Alliance ceased operations in 2009 after transferring all its specifications and technologies to the Wireless USB Promoter Group and the Bluetooth Special Interest Group. The Bluetooth Special Interest Group, however, dropped development of UWB as part of Bluetooth 3.0 in the same year.

Unfortunately, almost exactly a century after the retirement of the first UWB systems based on spark-gap radios, this new iteration of UWB radios based on the OFDM radio architecture was falling out of favor. However, against all odds, the world would not have to wait another century before seeing a new and improved implementation of a UWB radio. Indeed, the spark-gap radio would become even more of an inspiration for this UWB resurgence, and this resilient nature of UWB will be discussed in the third part of this series.

About Frederic Nabki
Dr. Frederic Nabki is cofounder and CTO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He directs the technological innovations that SPARK Microsystems is introducing to market. He has 17 years of experience in research and development of RFICs and MEMS. He obtained his Ph.D. in Electrical Engineering from McGill University in 2010. Dr. Nabki has contributed to setting the direction of the technological roadmap for start-up companies, coordinated the development of advanced technologies and participated in product development efforts. His technical expertise includes analog, RF, and mixed-signal integrated circuits and MEMS sensors and actuators. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada. He has published several scientific publications, and he holds multiple patents on novel devices and technologies touching on microsystems and integrated circuits.

About Dominic Deslandes
Dr. Dominic Deslandes is cofounder and CSO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He leads SPARK Microsystems’s long-term technology vision. Dominic has 20 years of experience in the design of RF systems. In the course of his career, he has managed several research and development projects in the fields of antenna design, RF system integration and interconnections, sensor networks and UWB communication systems. He has collaborated with several companies to develop innovative solutions for microwave sub-systems. Dr. Deslandes holds a doctorate in electrical engineering and a Master of Science in electrical engineering from École Polytechnique of Montreal, where his research focused on high frequency system integration. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada.


COVID-19 Collateral Chip Collision – Will Fabs & Foundries Flounder?

by Robert Maire on 03-05-2020 at 6:00 am

Corona Fab Impact: lower production/raise prices
Chip production supply chain may break
It could temporarily fix memory oversupply
Could it risk the fall rollout of the next iPhone?

The “Two week tango”: Waiting games at fabs
When a highly specialized piece of semiconductor equipment misbehaves to the point where fab workers can’t fix it, they pick up the phone and call their friendly tool maker field service techs who show up relatively quickly and fix the issue with on site or nearby spare parts and get the tool back up very quickly.

That is until Corona……

We have heard that at least TSMC and Intel fabs have instituted a two week quarantine period for outside service people.

Which means a tech shows up to fix or install a tool and has to cool his or her heels in a local hotel for two weeks until they prove they are not infected.

What happens to the tool for those two weeks? It stays down or not installed….

When you have dozens of Dep or Etch tools, losing a few may slow things down. But lose a litho tool, especially a highly complex, problem-prone EUV tool in need of lots of preventative maintenance, and it will ruin a fab manager’s day, and month, for sure.

As you can imagine in a fab with literally hundreds of tools, this could and will easily “snowball” into a major league problem.

Yields will fall, throughput will suffer, lots of wafers will wind up in the trash.

If we take the cost of a $7B fab and try to calculate the hourly operating cost….it’s a lot….The lost productivity will be a big number.

Installs will suffer as well
Aside from impacting ongoing production, the Corona related delays and problems also obviously impact new tool installs, perhaps even more so than ongoing operations.

If the tool maker has to send a team to help install a new tool and the team has to cool its heels at TSMC for two weeks, it means that those same experts can’t be at another fab on time to install another tool for the next customer, and installs go to hell in a hurry.

Basically the amount of time spent waiting around in hotels and not installing or fixing tools will be a very significant productivity loss.

It’s also not like you can hire a warm body off the street and have them install an EUV scanner the next day. You can’t ramp up the number of service people overnight.

If you think this is a good reason for remote diagnostics….think again. There is no such thing as a semiconductor tool hooked up to the internet at TSMC. No data connection whatsoever is allowed to the outside world, lest it get hacked and secret recipes stolen or machines hijacked. The tools are in isolation, with only highly supervised, hands-on visits allowed.

Could it push out Apple’s iPhone launch?
TSMC and Apple have developed a very predictable working rhythm to get iPhones launched every fall like clockwork.

TSMC orders new semiconductor tools for the next generation of Apple processors roughly Q4 of the prior year, they get installed in Q1, initial production starts in Q2 with full production in Q3 for the September or October launch in time for holiday sales.

We are currently in what is likely peak new tool installation time at TSMC for the next-gen semiconductor process.

If enough new tool installations get delayed, it could very easily push the whole schedule back.

Slow down the install of a few EUV tools today and suddenly you don’t have enough tools to do enough layers in high enough volume to make enough chips to go into enough iPhones for the fall launch dates.

The domino effect could be quite large….

AMD and Intel not immune either
You might think that Intel and AMD are not as impacted but AMD gets its parts from TSMC and Intel has the same Corona protocol as TSMC.  Intel has already been short of production and has also farmed out to TSMC.

Supply chain has lots of single points of failure
It rarely comes to light, but the semiconductor industry has a lot of single points of failure in its highly complex supply chain.

It was pointed out in painful detail how the trade spat between Korea and Japan got ugly quickly as Japan had a monopoly as one of those single points of failure in the supply chain of photoresist and certain chemicals.

If those sole suppliers are in the wrong place at the wrong time due to Corona, it will ripple through the industry.

Though there are many chemicals and materials, one such chemical is TMAH (tetramethylammonium hydroxide), used in silicon etch and other applications. A nasty substance supplied mainly by China….home to Corona.

Maybe Corona will hit memory fabs
Maybe the semiconductor industry will get lucky and the oversupply of memory chips will get fixed by the Corona slow down hitting a few memory fabs and take them off line, putting supply and demand back in balance…… Idaho may luckily be the last place Corona will show up at.  Memory prices may get a boost…..

The stocks
We should start to see some pre-announcements of missing numbers over the next few weeks as the electronics food chain grinds slower and slower. Though things will obviously recover, it may take a while for the supply chain to recover, and in some cases the lost time will never be recovered.

The damage may be contained within the calendar year for some but maybe not all.  Those further down the food chain, the users of chips, such as Apple will likely see the most impact as they have the widest exposure to many, many parts and suppliers. Fabs and other complex manufacturers clearly will have issues.


Designing Next Generation Memory Interfaces: Modeling, Analysis, and Tips

by Mike Gianfagna on 03-04-2020 at 10:00 am

At DesignCon 2020, there was a presentation by Micron, Socionext and Cadence that discussed design challenges and strategies for using the new low-power DDR specification (LPDDR5). As is the case with many presentations at DesignCon, ecosystem collaboration was emphasized. Justin Butterfield (senior engineer at Micron) discussed the memory aspects and Daniel Lambalot (director of engineering at Socionext) discussed the system aspects. I was able to spend some time with one of the other authors, Zhen Mu (senior principal product engineer at Cadence) as well. Zhen provided background on the tool platform used in this program, which is completely supplied by Cadence.

The LPDDR5 spec was finalized and published last year and is the cutting edge of DDR memory interfaces. Increased speed and lower power don’t come for free. There are many challenges associated with using LPDDR5, including channel bandwidth, reduced voltage margin, the need to route multiple parallel channels, dealing with crosstalk and ensuring proper return currents, multi-drop configurations (2 DRAM loads) and limited equalization capability.

The key features of LPDDR5 can be summarized as follows:

  • Higher data rates (up to 6.4Gbps)
    • Data transfer rate about 1.5 times that of the previous LPDDR4 interface
  • Power-isolated LVSTL interface with:
    • VDD2H=1.05V for the DRAM core
    • VDDQ=0.5V for the I/O
  • New packaging
  • Non-targeted on-die termination (ODT)
  • New eye mask specifications
    • Change from rectangular mask in LPDDR4 to hexagonal mask in LPDDR5
    • Two timing measurements – tDIVW1/tDIVW2 and vDIVW
    • See diagram, below
  • Data bit inversion (DBI)
  • Separate Read strobe (RDQS) and Write strobe (WCK)
  • Advanced equalization technologies such as feed-forward equalization (FFE), continuous time linear equalization (CTLE), and decision feedback equalization (DFE) for the controller and the memory
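The headline data rate in the list above translates directly into a timing budget. The sketch below computes the unit interval (the time available per bit on one data line) and the speedup over LPDDR4. The 4.266 Gbps LPDDR4 rate is an assumption that is not stated in the presentation summary, and the function names are illustrative.

```python
def unit_interval_ps(data_rate_gbps: float) -> float:
    """Time per bit in picoseconds for a per-pin data rate in Gbps."""
    return 1000.0 / data_rate_gbps


def speedup(new_rate_gbps: float, old_rate_gbps: float) -> float:
    """Ratio of two data rates."""
    return new_rate_gbps / old_rate_gbps


if __name__ == "__main__":
    LPDDR5_GBPS = 6.4    # from the feature list above
    LPDDR4_GBPS = 4.266  # assumed top LPDDR4 rate, not from the source
    print(f"LPDDR5 unit interval: {unit_interval_ps(LPDDR5_GBPS):.2f} ps")
    print(f"LPDDR4 unit interval: {unit_interval_ps(LPDDR4_GBPS):.2f} ps")
    print(f"speedup: {speedup(LPDDR5_GBPS, LPDDR4_GBPS):.2f}x")
```

At 6.4 Gbps each bit occupies only about 156 ps, which helps explain why controller jitter and eye mask margins receive so much attention.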

Timing is one of the most challenging aspects of LPDDR5; controller jitter must be considered. Accurate modeling is key for success. The presentation discussed the details of modeling and analysis approaches to optimize the use of LPDDR5 in actual designs. Items to be considered include device modeling, system-level design with typical topology, channel simulation for parallel bus analysis, bus characterization, modeling filtering functions implemented in LPDDR5 DRAMs and crosstalk simulation.

IBIS (I/O Buffer Information Specification) and the Algorithmic Modeling Interface (AMI) extensions are standards typically used in SerDes design and analysis. IBIS-AMI modeling can also be applied to parallel bus analysis for LPDDR5 designs. The benefits of this modeling approach include interoperability (different models work together), portability (models run in multiple simulators), accuracy (results correlate to measurements), IP protection (circuit details are not exposed) and performance (million-bit simulations are practical).

Applying SerDes modeling to a DDR interface poses challenges, including the non-symmetrical nature of LPDDR5 timing (see diagram below).

From a big picture point of view, channel simulations and circuit transient analysis are correlated, including IBIS-AMI models, using the Cadence Sigrity Explorer tool. To complete the analysis, memory models were supplied by Micron and controller models were supplied by the Cadence IP team. Socionext and Micron provided package models for controller and memory, respectively. See diagram below for some results.

Green: Circuit simulation; Blue: Channel simulation

For crosstalk simulation, two approaches were used: characterizing each bus signal individually, as is done in SerDes channel simulations, and characterizing the entire bus with practical stimulus patterns. Using the following conditions, the effect of channel length on performance can be modeled.

The analysis suggests a trace length of one inch is desirable. This presentation highlighted the modeling and simulation techniques needed to help achieve a fully functional system with LPDDR5 memories. Accurate modeling and good ecosystem participation are required for success.


LithoVision – Economics in the 3D Era

by Scotten Jones on 03-04-2020 at 6:00 am


Each year on the Sunday before the SPIE Advanced Lithography Conference, Nikon holds its LithoVision event. This year I had the privilege of being invited to speak for the third consecutive year. Unfortunately, the event had to be canceled due to concerns over the COVID-19 virus, but by then I had already finished my presentation, so I thought I would share it with the SemiWiki community.

Outline
The title of my talk is “Economics in the 3D Era”. In the talk I will discuss the three main industry segments: 3D NAND, Logic and DRAM. For each segment I will discuss the current status and then get into roadmaps with technology, mask counts, density and cost projections. All the status and projections will be company specific and cover the leaders in each segment. All the data for this presentation (technology, density, mask counts and cost projections) comes from our IC Knowledge – Strategic Cost and Price Model – 2020 – revision 00 model. The model is basically a detailed industry roadmap that provides simulations of cost, equipment and materials requirements.

You can read about the model here.

3D NAND
3D NAND is the most “3D” segment of the industry with a layer stacking technology that provides density improvement by adding layers in the third dimension.

Figure 1 illustrates the 3D NAND TCAT Process.

Figure 1. 3D NAND TCAT Process.

In the 3D NAND segment, the market leader is Samsung and they use the TCAT process illustrated in this slide. Number two in the market is Kioxia (formerly Toshiba Memory) and they use an essentially identical process. Micron Technology is also adopting a charge trap process that we expect to be similar, making this process representative of the majority of the industry. SK Hynix uses a different process that still shares many key elements with this process. The only company not using a charge-trap process has been Intel-Micron, and now that Intel and Micron have split apart on 3D NAND, Intel will be the only company pursuing floating gate.

The basic process has three major sections:

  1. Fabricate the CMOS – the CMOS writes, reads and erases the bits. Initially everyone except Intel-Micron fabricated the CMOS next to the memory array with Intel-Micron fabricating some of the CMOS under the memory array. Over time other companies have migrated to CMOS under the array and we expect within a few generations that all companies will migrate to CMOS under the array because it offers better die area utilization.
  2. Fabricate the memory array – for charge trap the array fabrication takes place by depositing alternating layers of oxide and nitride. A channel hole is then etched down through the layers and refilled with an oxide-nitride-oxide (ONO) layer, a poly silicon tube (channel) and filled with oxide. A stair step is then fabricated using a mask – etch – mask shrink – etch approach. A slot is then etched down through the array and the nitride film is etched out. Blocking films and tungsten are then deposited to fill the horizontal openings where the nitride was etched out. Finally, vias are etched down to the horizontal sheets of tungsten.
  3. Interconnect – the CMOS and memory array are then interconnected. For CMOS under the array, some interconnect takes place before the memory array fabrication.

This approach is very mask efficient because many layers can be patterned with a few masks. The overall process requires a channel mask and several stair-step masks, depending on the number of layers and process generation. In early-generation processes a single mask could produce approximately 8 layers, but some processes today can reach 32 layers with a single mask. The slot requires a mask, sometimes there is also a second shallow slot that requires a mask, and finally the vias require a mask.

The channel hole etching is a very difficult high-aspect-ratio (HAR) etch and once a certain maximum number of layers is reached the process must be broken up into “strings” in something called “string stacking”. In string stacking, basically, a set of layers is deposited, a mask is applied, and the channel is etched and filled. Then another set of layers is deposited and masked, etched and filled. In theory this can be done many times. Intel-Micron use a floating gate process that uses oxide-polysilicon layers that are much more difficult to etch than oxide-nitride layers, and they were the first to string stack. Figure 2 illustrates Intel-Micron string stacking.

Figure 2. Intel-Micron String Stacking.

Each company has their own approach to channel etching and their own limit in terms of when they string stack. Because they use oxide-poly layers, Intel-Micron produced a 64-layer device by stacking two 32-layer strings and then produced a 96-layer device by stacking two 48-layer strings. Intel has announced a 144-layer device that we expect to be three 48-layer strings. SK Hynix began string stacking at 72 layers and Kioxia at 96 layers (both charge trap processes with alternating oxide-nitride layers). Samsung is the last holdout on string stacking, having produced a 92-layer device as a single string, and they have announced a 128-layer single-string device.

Memory density can also be improved by storing multiple bits in a cell. NAND Flash has moved through a progression from 1 bit – single-level cell (SLC), to 2 bit – multi-level cell (MLC), to 3 bit – triple-level cell (TLC), to 4 bit – quad-level cell (QLC). Companies are now preparing to introduce 5 bit – penta-level cells (PLC) and there is even discussion of 6 bit – hexa-level cells (HLC). Increasing the number of bits per cell helps with density but the benefit is decreasing: SLC to MLC is 2.00x the bits, MLC to TLC is 1.50x the bits, TLC to QLC is 1.33x the bits, QLC to PLC will be 1.25x the bits and, if we get there, PLC to HLC will be 1.20x the bits.
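
The diminishing return is simple arithmetic: going from n-1 to n bits per cell multiplies capacity by n/(n-1). The short sketch below reproduces the ratios quoted above. The number of charge levels per cell (2 to the power of the bit count) is added context rather than something stated in the talk, and the variable names are hypothetical.

```python
# Bits per cell for each NAND cell type named in the text.
CELL_TYPES = [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4), ("PLC", 5), ("HLC", 6)]


def density_gain(bits_before: int, bits_after: int) -> float:
    """Capacity multiplier from storing more bits in the same cell."""
    return bits_after / bits_before


def levels_required(bits: int) -> int:
    """Distinct charge levels a cell must resolve to store `bits` bits."""
    return 2 ** bits


gains = {}
for (prev_name, prev_bits), (name, bits) in zip(CELL_TYPES, CELL_TYPES[1:]):
    gains[f"{prev_name}->{name}"] = density_gain(prev_bits, bits)
    print(f"{prev_name} -> {name}: {gains[f'{prev_name}->{name}']:.2f}x the bits, "
          f"{levels_required(bits)} levels per cell")
```

Each step adds less capacity than the one before it while the number of levels the cell must resolve doubles.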

Figure 3 presents string stacking by year and company on the left axis and maximum bits per cell on the right axis.

Figure 3. Layers, Stacking and Bits per Cell.

Figure 4 presents our analysis of the resulting mask counts by exposure type, company and year. The dotted line is the average mask count by year, which increases from 42 in 2017 to 73 in 2025. This contrasts with the layers increasing from an average of 60 in 2017 to 512 in 2025. In other words, only a 1.7x increase in masks is required to produce an 8.5x increase in layers, highlighting the mask efficiency of the 3D NAND processes.
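
As a quick check of the mask-efficiency claim, the sketch below recomputes the ratios from the averages quoted for 2017 and 2025. The layers-per-mask figure is a derived illustration, not a number from the presentation.

```python
# Average values quoted in the text for Figure 4.
masks_2017, masks_2025 = 42, 73
layers_2017, layers_2025 = 60, 512

mask_ratio = masks_2025 / masks_2017      # growth in mask count
layer_ratio = layers_2025 / layers_2017   # growth in layer count

# Derived metric (an illustration only): memory layers per mask.
layers_per_mask_2017 = layers_2017 / masks_2017
layers_per_mask_2025 = layers_2025 / masks_2025

print(f"Mask count growth:  {mask_ratio:.1f}x")
print(f"Layer count growth: {layer_ratio:.1f}x")
print(f"Layers per mask: {layers_per_mask_2017:.1f} -> {layers_per_mask_2025:.1f}")
```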

Figure 4. Mask Count Trend.

Figure 5 presents actual and forecast bit density by company and year for both 2D NAND and 3D NAND. This is the bit density for the whole die or in other words the die bit capacity divided by the die size.

Figure 5. NAND Bit Density.

From 2000 to 2010, 2D NAND bit densities were increasing by 1.78x per year, driven by lithographic shrinks. Around 2010 the difficulty of continuing to shrink 2D NAND led to a slowdown to 1.43x per year until around 2015, when 3D NAND became the driver and continued at a 1.43x per year rate. We are projecting a slight slowdown from 2020 to 2025 to 1.38x per year. This is an improvement in our forecast from last year because we are seeing the companies push the technology faster than we originally expected. Finally, SK Hynix has talked about 500 layers in 2025 and 800 layers in 2030, resulting in a further slowdown after 2025.
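
To see what these annual rates mean when compounded, here is a minimal sketch. The function names are hypothetical. Note that the SK Hynix figures are layer counts, so the implied rate is only a rough proxy for bit density growth (bits per cell and other factors also contribute).

```python
def cumulative_growth(rate_per_year: float, years: int) -> float:
    """Total multiplier after compounding an annual growth rate."""
    return rate_per_year ** years


def implied_annual_rate(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years)


# Annual bit density growth rates quoted in the text.
for label, rate, years in [
    ("2000-2010 (2D NAND)", 1.78, 10),
    ("2015-2020 (3D NAND)", 1.43, 5),
    ("2020-2025 (projected)", 1.38, 5),
]:
    print(f"{label}: {cumulative_growth(rate, years):.1f}x over {years} years")

# Layer-count growth implied by 500 layers in 2025 and 800 layers in 2030.
layer_rate = implied_annual_rate(500, 800, 5)
print(f"Implied layer growth 2025-2030: {layer_rate:.2f}x per year")
```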

Figure 6 presents NAND Bit Cost Trends.

Figure 6. NAND Bit Cost Density.

In this figure we have taken wafer costs calculated using our Strategic Cost and Price Model and combined them with the bit density from figure 5 to produce a bit cost trend. In all cases the fabs are new greenfield 75,000 wafer per month fabs because that is the average capacity of NAND fabs in 2020. The countries where the fabs are located are Singapore for Intel-Micron, China for Intel, Japan for Kioxia and South Korea for Samsung and SK Hynix. These calculations do not include packaging and test costs, do not take into account street width and have only rough die yield assumptions in them.

The first three nodes on the chart are 2D NAND, where we see a 0.7x per node cost trend. With the transition to 3D NAND the bit cost initially increased for most companies but has now come down below 2D NAND bit costs and is following a 0.7x per node trend until around 300 to 400 layers. At 300-400 layers we project the cost per bit will level out, possibly placing an economic limit on this technology unless there are some breakthroughs in process or equipment efficiency.

Logic
For 3D NAND, “nodes” are easy to define based on physical layers. For DRAM, nodes are the active half-pitch. For logic, nodes are pretty much whatever the marketing guys at a company want to call them.

Some people consider the current leading edge FinFET processes to be 3D because the FinFET is a 3D structure but in the context of this discussion we consider 3D to be when device stacking allows multiple active layers to be stacked up to create stacks of devices. In this context 3D Logic will really come in once CFETs are adopted.

Figure 7 presents the nodes by year for the 3 companies pursuing the state of the art.

Figure 7. Logic Roadmap.

The node comparisons in this chart are complicated by the split between Intel and the foundries. Intel has followed the classic node names, 45nm, 32nm, 22nm, 14nm, whereas the foundries have followed the “new” node names of 40nm, 28nm, 20nm, 14nm. Furthermore, Intel has shrunk more per node, so Intel 14nm has similar density to foundry 10nm and Intel 10nm has similar density to foundry 7nm.

At the top of the figure I have outlined a consistent node name series based on alternating 0.71 and 0.70 shrinks.
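
A node name series of this kind can be generated by alternately applying the two shrink factors. The sketch below is only an illustration: the 14nm starting point and the function name are assumptions, and the values are left unrounded, whereas published node names are rounded to convenient numbers.

```python
def node_series(start_nm: float, count: int, shrinks=(0.71, 0.70)) -> list:
    """Generate node names by alternately applying the given linear shrinks."""
    nodes = [start_nm]
    for i in range(count - 1):
        nodes.append(nodes[-1] * shrinks[i % len(shrinks)])
    return nodes


# Assumed starting point for illustration; the talk does not state one here.
series = node_series(14.0, 9)
print(", ".join(f"{node:.2f}nm" for node in series))
```

Because 0.71 x 0.70 is roughly 0.5, the node name halves about every two nodes.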

In the bottom of the figure I have nodes by company and year with transistor density for each node. The transistor density is calculated based on a weighting of NAND and Flipflop cells as I have previously discussed. Next to each node in parentheses is either FF for FinFET, HNS for horizontal nanosheet, HNS/FS for horizontal nanosheets with a dielectric wall (Forksheet) to improve density based on work Imec has done, or CFET for complementary stacked FETs where nFETs and pFETs are vertically stacked. CFETs will be when logic crosses over into a layer-based scaling approach and becomes a true 3D solution; in principle CFETs can continue to scale by adding more layers.

Bold indicates leading density or technology. In 2014 Intel takes the density lead with their 14nm process. In 2016 TSMC takes the density lead with their 10nm process and maintains the lead in 2017 with their 7nm process. TSMC and Samsung have similar densities at 7nm but going to 5nm TSMC is producing a much larger shrink than Samsung and in 2019 TSMC maintains the process density lead with their 5nm technology. If Samsung delivers on their HNS technology in 2020 that we are calling a 3.5nm node, they may take the density lead and be the first company to manufacture HNS. In 2021 the TSMC node we are calling 3.5nm may return them to the density lead. If Intel can deliver on their two-year cadence with the kind of shrinks they typically target we believe they could take the density lead in 2023 with their 5nm process. In 2024 we may see a first CFET implementation from Samsung taking the density lead until 2025 when Intel may regain the lead with their first CFET process.

Figure 8 presents the logic mask counts for these processes. The introduction of EUV is mitigating mask layers; without EUV we would likely see over 100 masks on this chart. As we did with the NAND mask count figure, the dotted line is average mask counts. We have also grouped processes based on “similar” densities, so for example Intel 14nm is combined with the foundry 10nm process and Intel 10nm with the foundry 7nm processes.

Figure 8. Logic Mask Counts.

Figure 9 presents the logic density in transistors per millimeter squared based on the NAND/Flipflop weighting metric mentioned previously.

Figure 9. Logic Density Trend.

There are six types of processes plotted on this chart.

Planar transistors were the primary leading-edge logic process until around 2014 and produced a density improvement of 1.33x per year. FinFETs then took over at the leading edge and have provided a 1.29x per year density improvement. In parallel to FinFETs we have seen the introduction of FDSOI processes. FDSOI offers simpler processes with lower design costs and better analog, RF and power but cannot compete with FinFETs for density or raw performance. When HNS takes over from FinFETs we expect the rate of density improvement to slow further to 1.16x per year, and eventually CFETs will take over and increase density at 1.11x per year. We have also plotted SRAMs produced by vertical transistors based on work by Imec that may provide an efficient solution for cache memory chiplets.
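
One way to read these annual rates is as density doubling times. The sketch below converts each rate quoted above using ln(2)/ln(rate). The doubling-time framing and the names are mine, not part of the presentation.

```python
import math

# Annual density improvement rates quoted in the text.
ANNUAL_RATES = {"planar": 1.33, "FinFET": 1.29, "HNS": 1.16, "CFET": 1.11}


def doubling_time_years(rate_per_year: float) -> float:
    """Years needed to double density at a constant annual growth rate."""
    return math.log(2.0) / math.log(rate_per_year)


doubling = {name: doubling_time_years(rate) for name, rate in ANNUAL_RATES.items()}
for name, years in doubling.items():
    print(f"{name}: density doubles roughly every {years:.1f} years")
```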

Figure 10 illustrates the trend in logic transistor cost.

Figure 10. Logic Transistor Cost.

Figure 10 presents the cost per billion transistors by combining wafer cost estimates from our Strategic Cost and Price Model with the transistor densities in figure 9. All fabs are assumed to be new greenfield fabs with 35,000 wafers per month capacity because that is the average size of logic fabs in 2020. The assumed countries are Germany for GLOBALFOUNDRIES (except for 14nm, which is done in the United States), the United States for Intel (except for 10nm in Israel), South Korea for Samsung and Taiwan for TSMC.

This plot does not include mask set or design cost amortization, so while manufacturing cost per transistor is coming down, the number of designs that can afford to access these technologies is limited to high-volume products.

This plot does not include any packaging, test or yield impact.
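
The calculation behind a cost-per-billion-transistors figure can be sketched as wafer cost divided by transistors per wafer. The example below is a simplified illustration, not the Strategic Cost and Price Model: it assumes a 300mm wafer with its full area usable and no yield loss, and the wafer cost and density inputs are hypothetical placeholders.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumption: standard 300mm wafer


def cost_per_billion_transistors(wafer_cost_usd: float,
                                 density_mtx_per_mm2: float) -> float:
    """Wafer cost divided by billions of transistors on the wafer.

    Simplification: the whole wafer area is treated as usable and yield,
    packaging and test are ignored.
    """
    wafer_area_mm2 = math.pi * (WAFER_DIAMETER_MM / 2.0) ** 2
    billions_per_wafer = density_mtx_per_mm2 * wafer_area_mm2 / 1000.0
    return wafer_cost_usd / billions_per_wafer


# Hypothetical inputs, chosen only to exercise the formula.
example_cost = cost_per_billion_transistors(wafer_cost_usd=10_000.0,
                                            density_mtx_per_mm2=100.0)
print(f"${example_cost:.2f} per billion transistors")
```

With this form it is clear that cost per transistor falls only when density grows faster than wafer cost.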

From 130nm down to the i32/f28 (Intel 32nm/foundry 28nm) node, costs were coming down by 0.6x per node. Then at the i22/f20 and f16/f14 nodes the cost reductions slowed because the foundries decided not to scale for their first FinFET processes. This slowdown led many in the industry to erroneously predict the end of cost reduction. From the f16/f14 node down to the i5/f2.5 node we expect costs to decrease by 0.72x per node and then slow to 0.87x per node thereafter. The g1.25 and g0.9 nodes are generic CFET processes with 3 and 4 decks respectively.

Figure 11 examines the impact of mask set amortization on wafer cost.

Figure 11. Mask Set Amortization.

The wafer costs in figure 11 are based on a new greenfield fab in Taiwan running 40,000 wafers per month. The amortization is mask set only and does not include design costs.

The table presents the 2020 mask set cost for 250nm, 90nm, 28nm and 7nm mask sets. Please note that at introduction these mask sets were more expensive. The mask set cost is then amortized over a set number of wafers and the resulting normalized costs are shown in the figure. In the table the wafer cost ratio is the wafer cost with amortization for 100 wafers run on a mask set divided by the wafer cost with amortization for 100,000 wafers run on a mask set.

From the figure and table, we can see that mask set amortization has a small effect at 250nm (1.42x ratio) and a large effect at 7nm (18.05x ratio). Design cost amortization is even worse.
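
The wafer cost ratio defined above can be written out directly, and inverted to see what the quoted ratios imply. In this sketch the mask set cost is expressed in units of the unamortized wafer cost. The function names are hypothetical and the inversion is my own reading of the definition, not a figure from the table.

```python
LOW_VOLUME_WAFERS = 100
HIGH_VOLUME_WAFERS = 100_000


def wafer_cost_ratio(mask_cost_in_wafers: float) -> float:
    """Amortized wafer cost at 100 wafers divided by that at 100,000 wafers.

    mask_cost_in_wafers is the mask set cost divided by the unamortized
    wafer cost.
    """
    low_volume = 1.0 + mask_cost_in_wafers / LOW_VOLUME_WAFERS
    high_volume = 1.0 + mask_cost_in_wafers / HIGH_VOLUME_WAFERS
    return low_volume / high_volume


def implied_mask_cost_in_wafers(ratio: float) -> float:
    """Invert wafer_cost_ratio: mask set cost in wafer-cost units for a ratio."""
    return (ratio - 1.0) / (1.0 / LOW_VOLUME_WAFERS - ratio / HIGH_VOLUME_WAFERS)


for node, ratio in [("250nm", 1.42), ("7nm", 18.05)]:
    implied = implied_mask_cost_in_wafers(ratio)
    print(f"{node}: ratio {ratio}x implies a mask set costing about "
          f"{implied:.0f} wafers' worth of manufacturing cost")
```

Under this reading, a 250nm mask set costs about as much as a few dozen wafers, while a 7nm mask set costs as much as well over a thousand wafers.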

The bottom line is that design and mask set costs are so high at the leading edge that only high-volume products can absorb the resulting amortization.

DRAM
Leading edge DRAMs have capacitor structures that are high aspect ratio “3D” devices but similar to current logic devices, DRAM doesn’t have scaling by stacking of active elements.

Figure 12 presents DRAM nodes by company in the top table and on the bottom of the figure are some of the key structures.

Figure 12. DRAM Nodes.

As DRAM nodes proceeded below the 4x nm generation the buried saddle fin access transistor with buried word line came into use (see bottom left). The bottom right illustrates the progression of the capacitor structure to higher aspect ratio structures with two layers of silicon nitride “MESH” support. DRAM capacitor structures are reaching the mechanical stability limits of the technology and with dielectric k values also stalled, DRAM scaling is evolving into single nanometer per node scaling.

Figure 13 illustrates mask counts by exposure type and company.

Figure 13. DRAM Mask Counts.

From figure 13 it can be seen that from the 2x to 2y generation there is a big jump in mask counts. The jump is driven by performance and power requirements that led to the need for more transistor types and threshold voltages in the peripheral logic.

At the 1x node Samsung is the first company to introduce EUV to DRAM production and the number of EUV layers grows at the 1a, 1b and 1c nodes. SK Hynix is also expected to implement EUV, but we do not currently expect Micron to implement EUV.

Figure 14 illustrates the trend in DRAM bit density by year.

Figure 14. DRAM Bit Density.

In figure 14 the bit density is the product capacity in gigabytes divided by the die size in square millimeters.

From figure 14 it can be seen that there has been a slowdown in bit density growth beginning around 2015. DRAM bit density is currently constrained by the capacitor and it isn’t clear what the solution will be. Long term a new type of memory may be needed to replace DRAM. DRAM requires relatively fast access with high endurance and currently MRAM and FeRAM appear to be the only options that have the potential to meet the speed and endurance requirements. Because MRAM requires relatively high current to switch, large selector transistors are required, constraining the ability to shrink MRAM to competitive density and cost. FeRAM is also a potential replacement and is getting a lot of attention at places like Imec.

Figure 15 illustrates the DRAM bit cost trend.

Figure 15. DRAM Bit Cost Trend.

Figure 15 is based on combining wafer cost estimates from our Strategic Cost and Price Model with the bit densities in figure 14. All fabs are assumed to be new greenfield fabs with 75,000 wafers per month capacity because that is the average size of DRAM fabs in 2020. The assumed countries are Japan for Micron and South Korea for Samsung and SK Hynix.

These calculations do not include packaging and test costs and do not take into account street width or die yield.

In this plot the combination of higher mask counts and slower bit density growth leads to a slowdown from a 0.70x per node cost trend to a 0.87x per node cost trend.
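
The difference between the two per-node trends compounds quickly. The sketch below compares them over the same number of nodes. The choice of four nodes is an arbitrary illustration and the function names are hypothetical.

```python
import math


def relative_cost(factor_per_node: float, nodes: int) -> float:
    """Bit cost relative to the starting node after `nodes` transitions."""
    return factor_per_node ** nodes


def nodes_to_halve(factor_per_node: float) -> float:
    """Number of node transitions needed to cut bit cost in half."""
    return math.log(0.5) / math.log(factor_per_node)


for factor in (0.70, 0.87):
    print(f"{factor:.2f}x per node: cost after 4 nodes is "
          f"{relative_cost(factor, 4):.2f} of the start; "
          f"halving takes about {nodes_to_halve(factor):.1f} nodes")
```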

Conclusion
NAND has successfully transitioned from 2D to 3D and now has a scaling path until around 2025. After 2025 scaling may be possible with very high layer counts, but unless a breakthrough in process or equipment efficiency is made, cost per bit reductions may end.

Leading edge logic today utilizes 3D FinFET structures but won’t be a true stacked device 3D technology until CFETs are introduced around 2025. Logic has the potential to continue to scale until the end of the 2020s by transitioning from FinFET to HNS to CFET although the cost improvements will likely slow down.

DRAM is the most constrained of the 3 market segments. Scaling and cost reductions have already slowed down significantly and no solution is currently known. Slower bit density and cost scaling will likely continue until around 2025, when a new memory type may be needed.

Here is the full presentation:

Lithovision 2020

Also Read:

IEDM 2019 – Imec Interviews

IEDM 2019 – IBM and Leti

The Lost Opportunity for 450mm


The Story of Ultra-WideBand – Part 1: The Genesis

by Frederic Nabki & Dominic Deslandes on 03-03-2020 at 10:00 am


In the middle of the night of April 14, 1912, the R.M.S. Titanic sent a distress message. It had just hit an iceberg and was sinking. Even though broadcasting an emergency wireless signal is common today, this was cutting edge technology at the turn of the 20th century. This was made possible by the invention of a broadband radio developed over the previous 20 years: the spark-gap transmitter.

Developed by Heinrich Hertz in the 1880s, the spark-gap radio was improved by Guglielmo Marconi, who succeeded in sending the first radio transmission across the Atlantic Ocean in 1901. After the Titanic disaster, wireless telegraphy using spark-gap transmitters quickly became universal on large ships, with the Radio Act of 1912 requiring all seafaring vessels to maintain a 24-hour radio watch. The spark-gap radio was then the most advanced technology enabling wireless communication between ships, and was used through World War 1.

The architecture of the spark-gap radio was significantly different than what is currently used in wireless transceivers, including our cellphones, WiFi networks and Bluetooth devices. Modern narrowband communications systems modulate continuous-waveform radiofrequency (RF) signals to transmit and receive information. But at the turn of the 20th century, the spark-gap transmitter generated electromagnetic waves by means of an electric spark and no narrowband radiofrequency signal was being modulated. The spark was generated using a capacitance discharged through electric arcing across a gap between two conductors. These very short time discharges generated oscillating currents in the wires, which then excited an electromagnetic wave that radiated out and could be picked up electromagnetically at a great distance. From the well-known time-frequency duality principle, a short impulse in time, analogous to the electric spark, gives a wideband signal in frequency and this was the basis of communications for two decades.
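
The time-frequency duality mentioned above can be checked numerically. The sketch below is an idealized illustration, not a model of a real spark-gap transmitter: it takes the discrete Fourier transform of a rectangular pulse and finds the first spectral null, a common measure of main-lobe width. The sample count, pulse widths and function names are all assumptions.

```python
import cmath

N = 256  # number of samples in the analysis window (assumption)


def spectrum_magnitudes(pulse_width: int) -> list:
    """Magnitude of the DFT of a rectangular pulse of `pulse_width` samples."""
    signal = [1.0 if n < pulse_width else 0.0 for n in range(N)]
    magnitudes = []
    for k in range(N // 2 + 1):
        total = sum(signal[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                    for n in range(N))
        magnitudes.append(abs(total))
    return magnitudes


def first_null_bin(pulse_width: int, tol: float = 1e-9):
    """Index of the first non-DC frequency bin where the spectrum reaches zero."""
    for k, magnitude in enumerate(spectrum_magnitudes(pulse_width)):
        if k > 0 and magnitude < tol:
            return k
    return None


short_pulse_null = first_null_bin(4)    # short impulse
long_pulse_null = first_null_bin(32)    # pulse 8x longer
print(f"Short pulse: first null at bin {short_pulse_null}")
print(f"Long pulse:  first null at bin {long_pulse_null}")
```

The pulse that is eight times shorter has a main lobe eight times wider, which is the sense in which a short impulse is a wideband signal.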

An interesting point to note is that the spark-gap radio could not support a continuous transmission, such as a sound signal. A message had to be composed of a series of sparks, transmitting discrete pieces of information, making it the first digital radio. This characteristic was ideal to transmit Morse code. However, it was then believed that it was not possible with the spark-gap radio to transmit a continuous signal like voice or music, without loss of information. It was decades before Shannon and Nyquist showed how to do that with digital modulation techniques.

This gap in digital modulation knowledge, coupled with the difficulty of generating high-power spark-gap transmissions, proved fatal to the spark-gap radio. After World War 1, carrier-based transmitters were developed using vacuum tubes, producing continuous waves that could carry audio. Nowadays, virtually all wireless transceivers use the same architecture, based on the work of US engineer Edwin Armstrong in 1918. Called the superheterodyne radio, this architecture uses frequency mixing to convert a received narrowband signal to a relatively low intermediate frequency (IF) that is then processed in baseband circuitry. This innovation gave rise, starting around 1920, to the AM radio, which was followed a decade later by the FM radio. By the late 1920s the only spark transmitters still in operation were legacy installations on naval ships. Wideband radio was effectively dead.

Wideband’s Rebirth after 100 years: A Detective Story
Why then did Apple release the iPhone 11 in 2019 with an ultra-wideband (UWB) transceiver, implemented in silicon on their new U1 wireless processor chip? The answer requires some detective work into clues stretching back to the middle of the last century.

The first clue was another impulse-based wideband radio technology developed in top secret laboratories around the world in the 1930’s and during World War 2: RADAR. The story of RADAR has been told many times; it provided a pivotal advantage in both the Battle of Britain and naval battles in the Pacific.

For the purposes of this discussion, RADAR is able to determine the range, angle and velocity of objects. After the war, impulse-based transceivers started once more to gain momentum, now in military applications. From the 1960s to the 1990s, this technology was restricted to military applications under classified programs, both as a location finding and a communication technology. By the mid-1980s, a wide range of research papers, books and patents from UWB pioneers like Harmuth at Catholic University of America and Ross and Robbins at Sperry Rand Corp became available. This great source of information revived the interest in UWB systems because of wideband’s unique ability to deliver location data.

Apple’s first use for UWB is to provide positioning data. Positioning enables many applications in augmented reality (AR), virtual reality (VR), gaming, device recovery, file sharing and advertising beacons. We will explore UWB positioning technology further in Part 3. But positioning by itself is not sufficient reason for Apple to build a custom silicon UWB implementation.

Future articles in this series will discuss five clues to understanding Apple’s adoption of UWB for the iPhone 11:

  • UWB can provide positioning data
  • Its very low power emissions ensure that UWB does not interfere with other communications
  • Low power output also makes UWB signals difficult to detect by unintended users
  • The low duty cycle enables ultra-low power and increases resistance to jamming or interference
  • The very short impulses enable the reduction of the communication latency.

The story continues in Part 2

About Frederic Nabki
Dr. Frederic Nabki is cofounder and CTO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He directs the technological innovations that SPARK Microsystems is introducing to market. He has 17 years of experience in research and development of RFICs and MEMS. He obtained his Ph.D. in Electrical Engineering from McGill University in 2010. Dr. Nabki has contributed to setting the direction of the technological roadmap for start-up companies, coordinated the development of advanced technologies and participated in product development efforts. His technical expertise includes analog, RF, and mixed-signal integrated circuits and MEMS sensors and actuators. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada. He has published several scientific publications, and he holds multiple patents on novel devices and technologies touching on microsystems and integrated circuits.

About Dominic Deslandes
Dr. Dominic Deslandes is cofounder and CSO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He leads SPARK Microsystems’s long-term technology vision. Dominic has 20 years of experience in the design of RF systems. In the course of his career, he managed several research and development projects in the field of antenna design, RF system integration and interconnections, sensor networks and UWB communication systems. He has collaborated with several companies to develop innovative solutions for microwave sub-systems. Dr. Deslandes holds a doctorate in electrical engineering and a Master of Science in electrical engineering from Ecole Polytechnique of Montreal, where his research focused on high frequency system integration. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada.


Trends in AI and Safety for Cars

by Bernard Murphy on 03-03-2020 at 6:00 am

AI at the Edge

The potential for AI in cars, whether for driver assistance or full autonomy, has been trumpeted everywhere and continues to grow. Within the car we have vision, radar and ultrasonic sensors to detect obstacles in front, behind and to the side of the car. Outside the car, V2x promises to share real-time information between vehicles and other sources so we can see ahead of vehicles in front of us, around corners to detect hazards, and see congested traffic and emergency vehicles. Also, this AI can improve on the fly, adapting to new conditions through training updates from the cloud.

This all sounds wonderful, but of course implementation is much more complex than the vision. It demands a lot of specialized devices, each with its own constraints in performance, latency (how quickly it can respond to a change) and power consumption. Put them all together in the car and more constraints emerge: How well can the central brain respond to the massive flow of data being generated by all these sensors? Will it become bogged down and not be able to respond quickly enough to a pedestrian ahead of the car? Will AI be more reliable if object-based or rule-based or a combination of the two? Most important of all, will it be safe?

Taken together, it’s not surprising that the nirvana of full autonomy isn’t right around the corner. But progress is being made, bottom-up as it should be. A good place to see this in action is at the edge of the car, in sensors, sensor fusion and local AI.

Memory Implications

AI is being pushed to the edge. This is not new. Transmitting raw video or radar or ultrasonic streams would bog down the car network, create massive latency and burn a lot of power. Doing all the object detection and fusion close to the sensor reduces those problems. For cost, reliability and again, power reasons, it’s better per sensor cluster to do all of that in one chip.

Now you need to integrate onto a single chip support for multiple AI subsystems along with other administrative and safety functions. This has some interesting design implications. To manage power and latency it is important to share memory between central compute and the AI accelerators. You don’t want to have to go out to external memory any more than absolutely necessary because that’s much slower and burns more power than staying on chip. Further, as more accelerators and compute clusters are added to the system-on-chip it becomes nearly impossible to efficiently manage the data flow using software only. Therefore the AI accelerator subsystems must be cache coherent with the rest of the chip (CPU cluster, memory subsystem, communication, etc).

Cache coherence goes further. Within that group of AI accelerators it may be important to share memory, in fusion for example. Which means you need cache coherence between accelerators. But you don’t necessarily want to pay the penalty every time for coherence with the whole system; most of the time coherence is needed just between accelerators. Now you want hierarchical cache coherence – between accelerators at one level, then to the full system at the top. Kurt Shuler (VP Marketing at Arteris IP) told me cache coherence requirements of this kind are becoming more common in automotive applications, because they’re dealing with big images across more accelerators, yet they still need to manage to a tight power and performance budget.
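The idea of hierarchical coherence can be illustrated with a toy model. This is a sketch under heavy simplifying assumptions, not a real coherence protocol and not a description of any Arteris IP product: it only tracks which agents hold a copy of a line and records whether a read could be satisfied inside the accelerator cluster or had to go to the system level. Writes and invalidations are not modeled, and the class and agent names are hypothetical.

```python
# Toy model: tracks line holders only, to show where a read gets resolved.

class HierarchicalDirectory:
    def __init__(self, cluster_agents):
        self.cluster_agents = set(cluster_agents)  # accelerators sharing the lower level
        self.holders = {}                          # address -> set of agents with a copy
        self.cluster_lookups = 0                   # reads resolved inside the cluster
        self.system_lookups = 0                    # reads escalated to the system level

    def read(self, agent, address):
        """Return where the read was resolved: 'local', 'cluster' or 'system'."""
        holders = self.holders.setdefault(address, set())
        if agent in holders:
            return "local"                         # already cached by this agent
        peers = holders & self.cluster_agents
        if agent in self.cluster_agents and peers:
            self.cluster_lookups += 1              # a cluster peach would be wrong; a cluster peer has it
            level = "cluster"
        else:
            self.system_lookups += 1               # nobody nearby has it, go to the top level
            level = "system"
        holders.add(agent)
        return level


directory = HierarchicalDirectory({"camera_acc", "radar_acc", "fusion_acc"})
first = directory.read("camera_acc", 0x1000)   # first touch goes to the system level
second = directory.read("fusion_acc", 0x1000)  # fusion finds the line in a cluster peer
third = directory.read("cpu", 0x1000)          # the CPU is outside the cluster
fourth = directory.read("fusion_acc", 0x1000)  # fusion already holds the line
print(first, second, third, fourth)
```

In this sketch the fusion accelerator's read never leaves the cluster, which is the cost saving the hierarchy is meant to deliver. Only traffic that genuinely involves the rest of the chip pays for system-level coherence.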

Safety

What about safety? There is a larger question of how you quantify safety in non-deterministic systems, as most machine-learning based systems are. This is where SOTIF (ISO/PAS 21448:2019) and UL 4600 are headed. But even before we get there, how do ISO 26262 and AI interact? Most accelerators so far have not been ASIL-rated, so they must be managed in a larger system aiming for ASIL D compliance.

This mix of safety standards and levels is pushing a trend to a safety island on the system on chip (SoC) to monitor system safety, along with an ability to isolate each IP on the interconnect in turn for temporary in-service testing, or longer-term isolation if an IP is found not to be performing to expectations.

This level of monitoring acknowledges a few realities. We may never be able to build large SoCs in which each component can be brought up to ASIL D. Components will fail; systems must self-monitor to determine if this may be about to happen or has happened and must provide means for self-correction where possible (through say a soft reboot of a subsystem). And where a problem cannot be corrected, systems must provide notification to the central control to enable a fail-operational response – maybe the driver should retake control of the car.
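The monitor, self-correct, then notify sequence can be sketched as a small supervision routine. This is a minimal illustration of the flow described above, assuming a hypothetical interface in which each IP exposes a self-test and a soft reboot; real safety islands are hardware and firmware, and none of these names come from a real product.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    RECOVERED = "recovered"
    ISOLATED = "isolated"


def supervise(ip_name, self_test, soft_reboot, notify):
    """Check one IP; try a single soft reboot on failure; isolate and notify if that fails."""
    if self_test():
        return Status.OK
    soft_reboot()
    if self_test():
        return Status.RECOVERED
    notify(f"{ip_name} isolated: self-test failed after soft reboot")
    return Status.ISOLATED


class FakeIP:
    """Hypothetical stand-in for an IP whose fault clears after a number of reboots."""

    def __init__(self, reboots_needed):
        self.reboots_needed = reboots_needed

    def self_test(self):
        return self.reboots_needed <= 0

    def soft_reboot(self):
        self.reboots_needed -= 1


messages = []  # stands in for the channel to central control
results = []
for name, ip in [("npu0", FakeIP(0)), ("npu1", FakeIP(1)), ("npu2", FakeIP(2))]:
    results.append(supervise(name, ip.self_test, ip.soft_reboot, messages.append))

print(results)
print(messages)
```

Here the first block passes, the second recovers after one soft reboot, and the third is isolated with a notification that central control could use to trigger a fail-operational response.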

Could AI accelerators be brought up to ASIL D? This is still very much a research topic. A lot of work has been done on the software side. In hardware, Kurt tells me that attention is mostly on conventional functional safety (FuSa) for the various elements inside the accelerator. One interesting observation he made was that FuSa seems to be more important in later planes in the neural net. Sensitivity to errors in earlier planes is not as strong. Interesting topic to follow!

Interconnect Implications

One thing is clear. The interconnect becomes the backbone for mediating all this activity – coherent caching and safety. Coherent caching because a full SoC is inevitably going to depend on a mix of IPs from multiple suppliers, yet caching must still be managed coherently across all those IPs.

And safety because the NoC or NoCs running through these systems must interconnect a wide range of IPs with differing ASIL capabilities. Some of those IPs can be very exotic indeed, serving the needs of some of the most advanced AI suppliers. NoCs must enable and mediate this trend to safety islands, self-testing and isolation support, while also providing safety control and monitoring within the network itself.

This is a complex and fast-moving range of needs. Arteris IP is clearly working to keep up with these needs. Kurt is on the ISO 26262 working group, and they work with a lot of AI companies, including some of the most prominent in automotive applications. Check them out.

Also Read:

Autonomous Driving Still Terra Incognita

Evolving Landscape of Self-Driving Safety Standards

Safety and Platform-Based Design


Reliability Challenges in Advanced Packages and Boards

by Herb Reiter on 03-02-2020 at 10:00 am

Today’s Market Requirements
Complex electronic devices and (sub)systems work for us in important applications, such as aircraft, trains, trucks and passenger vehicles, as well as building infrastructure, manufacturing equipment, medical systems and more. Very high reliability (the ability of a product to meet all requirements in the customer environment over the desired lifetime) is becoming increasingly important. Big Data and AI (Artificial Intelligence) are making humans even more reliant on electronic systems and will make insufficient reliability more painful, costly, even deadly. At the recent DesignCon 2020 I had the opportunity to learn how ANSYS is enabling engineers to design highly reliable products.

Brief ANSYS history, focus and key acquisitions
ANSYS, based near Pittsburgh, Pennsylvania, was founded in 1970 and now employs about 4,000 experts in finite element analysis, computational fluid dynamics, electronics, semiconductors, embedded software and design optimization. ANSYS is well known as a partner for very demanding customers in space and aircraft applications. ANSYS grew rapidly, in part by acquiring other EDA suppliers. They bought and integrated Ansoft Corp. in 2008 and Apache Design Solutions in 2011. In May 2019 ANSYS acquired DfR Solutions to deepen their capabilities in electronics reliability, including simulation of semiconductor packaging and PCBAs, as well as a physical laboratory capable of characterization, library generation, analysis and testing of a broad range of electronic parts (semiconductors, displays, batteries, etc.). DfR’s best known product is Sherlock, a powerful pre- and post-processor for reliability prediction of dice, packages, and PCBs subjected to thermal, thermo-mechanical, and mechanical environments.

The value of FEA tools and accurate inputs (libraries)
Analyzing the reliability of prototypes and/or pre-production units with a test-fail-fix approach is costly, time consuming and provides results very late in a product’s life cycle. ANSYS’ Sherlock applies finite element analysis (FEA) and enables engineers to easily assess a hardware design’s reliability, accurately and at the beginning of a design cycle. This also allows designers to evaluate trade-offs (e.g. different architectures, geometries and materials) early and across a wide range of conditions, to achieve optimal results.

Summary of the ANSYS Design for Reliability presentation at DesignCon 2020
In a fully-packed conference room, ANSYS’ Kelly Morgan, Lead Application Engineer, presented three examples of failure mechanisms where Sherlock can add significant value. Sherlock and ANSYS Mechanical apply physics of failure principles to predict hardware reliability for: 1) Low-k cracking, 2) Solder joint fatigue and 3) Micro-via separation. The pointers lead to much more information than the paragraphs below can provide.

To 1) Low-k cracking: Dielectric material with a low dielectric constant (k) reduces parasitic capacitance, enabling higher circuit performance and lower power dissipation. However, its low mechanical strength sometimes leads to cracks in the dielectric, caused by thermal-mechanical forces that arise from differences in coefficient of thermal expansion (CTE) during reflow or thermal cycling. Acoustic inspection can reveal these cracks. If the low-k material is found to be cracked at this late stage of a product’s introduction, it can trigger a dreaded redesign cycle. In contrast, Sherlock and ANSYS Mechanical allow an IC designer – at the beginning of a project – to predict such failures, take corrective action right away and prevent such problems from occurring.

Figure 1: CTE differences between a copper pillar and a die lead to both compressive and tensile stress — and impact adjacent transistors’ performance as well as reliability  Courtesy: ANSYS

To 2) Solder joint fatigue: Many integrated circuits (ICs) have traditionally used lead (Pb) free solder bumps as connections to other dice, the package, and even the printed circuit board (PCB). Different CTEs and temperatures in adjacent layers make the materials expand and contract differently. These thermal-mechanical forces, as well as vibrations, mechanical shock, etc., cause strain on the solder bumps and may lead to cracks within the solder bumps and/or at the interconnect surfaces. More recently, copper pillars have become popular because they allow much tighter spacing than solder bumps. However, these interconnects are more rigid and can fail faster, depending on the strain being applied. Sherlock’s and ANSYS Mechanical’s multi-physics capabilities allow users to easily and accurately predict the reliability of such interconnects and, if needed, drive changes early in the design cycle.

Figure 2: Cross section of a solder joint and how different CTEs cause materials to shrink or expand     Courtesy: ANSYS
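As a rough illustration of how a CTE mismatch turns into strain on a joint, the common first-order textbook estimate for cyclic shear strain is the mismatch displacement divided by the joint height: strain ≈ (distance from neutral point × CTE difference × temperature swing) / joint height. The sketch below evaluates that estimate. It is not how Sherlock or ANSYS Mechanical compute anything (those tools use finite element analysis), and all the input values are assumed for illustration.

```python
# First-order estimate only: ignores joint and board compliance, warpage,
# underfill, creep and everything else a finite element model would capture.

def shear_strain(distance_from_neutral_m, cte_a_per_c, cte_b_per_c, delta_t_c, joint_height_m):
    """Approximate cyclic shear strain in a joint between two mismatched materials."""
    mismatch = abs(cte_a_per_c - cte_b_per_c)
    displacement = distance_from_neutral_m * mismatch * delta_t_c
    return displacement / joint_height_m


# Assumed values: joint 5 mm from the neutral point, 100 um tall, 100 C swing,
# with rough CTEs of 3 ppm/C (silicon-like) and 17 ppm/C (organic-board-like).
strain = shear_strain(5e-3, 3e-6, 17e-6, 100.0, 100e-6)
taller = shear_strain(5e-3, 3e-6, 17e-6, 100.0, 200e-6)

print(f"strain with 100 um joint: {strain:.3f}")
print(f"strain with 200 um joint: {taller:.3f}")
```

The estimate shows the trends that matter for design trade-offs: strain grows with die size, CTE mismatch and temperature swing, and shrinks with joint height. That is consistent with the observation above that shorter, stiffer interconnects can fail sooner under the same conditions.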

To 3) Micro-via separation: As spacings in electronics get smaller and smaller, the use of micro-via technology in PCBs has exploded. Micro-vias stacked as much as three or four high have become very common. However, if these designs do not use the right materials and geometries, the micro-vias can experience unexpected cracking and delamination.

Thermal-mechanical stress, moisture, vibration and other forces can lead to separation of micro-vias, as well as delamination from copper traces at the top or bottom of plated through-hole vias (PTHs). Sherlock analyzes these problem areas, considers overstress conditions during reflow and/or operation and can predict when fatigue will lead to interconnect failures between vias, PTHs and routing layers and/or under bump metal (UBM) contact points.

Figure 3: Likely reliability risks in electronic products during manufacturing and operation  Courtesy: ANSYS

Design flow integration of Sherlock
Even a best-in-class point tool, like Sherlock, needs to be integrated into a user-friendly and high productivity design flow, to provide its full value in a customer’s design environment. Only smooth data exchanges with up- and down-stream tools enable engineers to utilize Sherlock’s many capabilities quickly and efficiently. Flow integration minimizes scripting, data format translations as well as error-prone and time-consuming manual interventions. Sherlock interacts with ANSYS’ Icepak and ANSYS Mechanical to combine these tools into a high productivity and very reliable design flow for reaching the “ZERO DEFECTS” goal more and more applications require.  Learn more about ANSYS Sherlock HERE.

Figure 4: Important stages where, in a hardware design process, Sherlock can avoid surprises   Courtesy: ANSYS


GLOBALFOUNDRIES Sets a New Bar for Advanced Non-Volatile Memory Technology

by Mike Gianfagna on 03-02-2020 at 6:00 am

Whether it’s the solid-state disk in your laptop, IoT/automotive hardware or edge-based AI, embedded non-volatile memory (eNVM) is a critical building block for these and many other applications. The workhorse technology for this capability has typically been NOR flash (eFlash), but a problem looms as eFlash presents challenges to scale economically below the 28nm node. That’s why a recent press release from GLOBALFOUNDRIES (GF) caught my attention:

GLOBALFOUNDRIES Delivers Industry’s First Production-ready eMRAM on 22FDX Platform for IoT and Automotive Applications.

Embedded magnetoresistive non-volatile memory (eMRAM) is a mouthful. I did a bit of research, and MRAM was presented back in 1974, when IBM developed a component called a Magnetic Tunnel Junction (MTJ). The device had two ferromagnetic layers separated by a thin insulating layer, and a memory cell was created by the intersection of two wires (i.e., a row line and a column line) with an MTJ between them. MRAMs can combine the high speed of SRAM, the storage capacity of DRAM, and the nonvolatility of eFlash at low power, so a production embedded implementation of the technology below 28nm is a big deal.

First, a bit about the implementation technology. 22FDX is a 22nm fully depleted silicon-on-insulator (FD-SOI) technology from GF. Another mouthful. FD-SOI delivers near FinFET-like performance without the design and manufacturing complexities of FinFET. The figure at the right summarizes the benefits of GF’s 22FDX.

“We continue our commitment to differentiate our FDX platform with robust, feature rich solutions that allow our clients to build innovative products for high performance and low power applications,” said Mike Hogan, senior vice president and general manager of Automotive and Industrial Multi-market at GLOBALFOUNDRIES. “Our differentiated eMRAM, deployed on the industry’s most advanced FDX platform, delivers a unique combination of high performance RF, low power logic and integrated power management in an easy-to-integrate eMRAM solution that enables our clients to deliver a new generation of ultra-low power MCUs and connected IoT applications.”

I caught up with Martin Mason, senior director automotive, industrial and multi-market BU at GF to get a bit more detail about their new, production-ready eMRAM. He took me through a very robust qualification process for the device, including a bit error rate in the 6E-6 range, robust data retention after 5X solder reflows, stand-by data retention sufficient for industrial-grade and automotive-grade 2 applications and multiple magnetic immunity tests. Martin summed up our discussion like this, “22FDX with embedded MRAM is an enabling technology platform for Intelligent IoT (IIoT), wearables, MCUs and advanced automotive products. We have a qualified Flash-like robust eMRAM process with our first client single product MRAM tape out in fab, multiple clients running MRAM test chips and many silicon validated MRAM macros (4Mb-48Mb). Unlike other eMRAM solutions we built GF’s 22FDX MRAM to be very robust with -40C to 125C operating range, high endurance and long data retention, passing five rigorous real-world (5x) solder reflow tests while maintaining leading magnetic immunity. The GF eMRAM is very much like eFLASH – only better, with faster read and write times and reduced mask count manufacturing compared with traditional embedded Flash technologies.” The diagram to the right summarizes GF’s new eMRAM vs. eFlash.

GF reports they are working with several clients with multiple production tape-outs scheduled in 2020 using the new, production-ready eMRAM technology in 22FDX. GF’s state-of-the-art 300mm production line at Fab 1 in Dresden, Germany will support volume production for these projects. They also report that custom design kits featuring drop-in, silicon-validated MRAM macros ranging from 4 to 48 megabits, along with optional MRAM built-in self-test support, are available today from GF and their design partners.

Looking ahead, GF expects its scalable eMRAM to be available on both FinFET and future FDX platforms as a part of the company’s advanced eNVM roadmap. If you need an eFlash alternative below 28nm, this is definitely something to look into.

Also Read:

Specialized Accelerators Needed for Cloud Based ML Training

The GlobalFoundries IPO March Continues

Magnetic Immunity for Embedded Magnetoresistive RAM (eMRAM)