
The Need for Low Pupil Fill in EUV Lithography
by Fred Chen on 03-15-2020 at 10:00 am


Extreme ultraviolet (EUV) lithography targets sub-20 nm resolution using a wavelength band of roughly 13.3-13.7 nm (with some light, including DUV, falling outside this band) and a reflective ring-field optics system. ASML has been refining the EUV tool platform since the NXE:3300B, the very first platform with a numerical aperture of 0.33. With the current NXE:3400B platform, a pupil fill ratio as low as 20% is now possible without loss of light, offering improved contrast for smaller features [1]. This article presents the basic reasons behind this improvement.

Illumination locations in the pupil

The pupil is essentially an aperture or window through which the specified angles of illumination pass on their way to the EUV mask. The central angle lies along the optical axis, which naturally folds through the reflective optical system and happens to meet the mask at 6 degrees from its normal. A pupil can be represented as an assembly of points whose x and y coordinates are the sines of the angles made with the optical axis in the x and y directions, respectively.

Optical projection systems basically operate on spatial frequencies, which in turn are indicators of feature pitches (spatial frequency = 1/pitch). Images containing larger pitches, i.e., more isolated features, span a wide range of spatial frequencies, including very low ones. Images containing the densest features, i.e., the smallest pitches, contain only the highest spatial frequencies. When a spatial frequency is too high, it is cut off by the numerical aperture of the final lens (or mirror, in the case of EUV); the numerical aperture is essentially a low-pass filter. To retain these high spatial frequencies, the illumination must be shifted by selecting pupil points that are off-center. This shifts the target spatial frequencies away from the cutoff edge of the numerical aperture.
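As a back-of-envelope illustration of this low-pass behavior (a simplified coherent point-source model of my own, not the article's simulations), the following sketch checks whether the 0th and one 1st diffraction order of a line grating both fit inside the 0.33 NA pupil, with and without shifting the illumination off-center:

```python
# Rough sketch: the pupil as a low-pass filter on spatial frequencies.
wavelength, NA = 13.5, 0.33  # nm, and the pupil radius in sine space

def pitch_resolved(pitch_nm, src_x=0.0):
    """True if the 0th and one 1st diffraction order both land in the pupil."""
    f = wavelength / pitch_nm  # pupil-coordinate shift of the 1st order
    zeroth_in = abs(src_x) <= NA
    first_in = abs(src_x - f) <= NA or abs(src_x + f) <= NA
    return zeroth_in and first_in

for pitch in (45, 36, 28, 22):
    print(f"{pitch} nm pitch: on-axis {pitch_resolved(pitch)}, "
          f"shifted (src_x=0.30) {pitch_resolved(pitch, src_x=0.30)}")
```

On-axis illumination cuts off just above a 40 nm pitch (wavelength/NA is about 41 nm), while shifting the source point toward the pupil edge recovers pitches down to wavelength/(2 NA), about 20 nm.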

Stochastic Effect of Large Pupil Fill

Complex 2D patterns will contain both minimum pitches and much larger pitches. In that case, it is necessary to check each point in the pupil for the image that results from illumination at that point.

Figure 1 shows an example where the minimum pitch is 28 nm and there are larger pitches of 84 nm in x and 168 nm in y. The wavelength is taken to be 13.5 nm and the numerical aperture 0.33.

Figure 1. Images from two different neighboring source points in the pupil. The target pattern is on the lower left. Although quite similar, the two images are distinctly different. In a realistic exposure condition, both would be represented by very small fractions of the total dose.

The images produced by two neighboring source points were shown to be similar but still different. A conventional illumination condition with sigma=1 (the maximum on the NXE:3400B [2]) uses the entire displayed collection of source points and represents the largest possible pupil fill. Under this condition, the images from these two selected source points are each supported by only 1-3% of the dose applied at the wafer. Hence few photons emanate from these points (<1 mJ/cm2, i.e., <1 photon/nm2), which aggravates shot noise.
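The photon-density figure follows directly from the energy of a 13.5 nm photon (about 92 eV); here is a quick sanity check of the conversion:

```python
# Back-of-envelope: converting an EUV dose in mJ/cm^2 to photons per nm^2.
h, c = 6.626e-34, 2.998e8           # Planck constant (J*s), speed of light (m/s)
E_photon = h * c / 13.5e-9           # ~1.47e-17 J (~92 eV) per 13.5 nm photon
dose_J_per_nm2 = 1.0e-3 / 1.0e14     # 1 mJ/cm^2, using 1 cm^2 = 1e14 nm^2
print(dose_J_per_nm2 / E_photon)     # ~0.68 photons/nm^2, i.e., <1 photon/nm^2
```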

If fewer differently diffracted images need to be added together to form the final image, the stochastic impact is reduced. Instead of the conventional illumination using all the displayed points, only a certain optimized subset would be used; this is the basic principle behind source-mask optimization (SMO) [3]. Naturally, the pupil fill is thus reduced. Ideally, only one diffracted image would be used, with all the exposure dose going there. However, the pupil fill fraction may then go very low, even as low as 1-3% if we used one of the two source points above as examples. On current NXE:3400 systems, pupil fill fractions below 20% result in loss of throughput [2].

Rotation Sensitivity for Dense Features

For a 0.33 NA and central wavelength of 13.5 nm, dense features with pitches of 40 nm or less can only be resolved by illuminations at wide enough angles, i.e., angles that deviate far enough from the optical axis. As an example, a 28 nm vertical line pitch can only be resolved by a distribution of angles indicated by the blue dots in the left part of Figure 2.

However, since the EUV optical system is reflective, it requires a ring-field to minimize aberrations. This, in turn, requires the exposure field to be shaped like an arc. The pupil is rotated azimuthally through this arc across the field, in some reported cases by over 24 degrees [4, 5].

Figure 2. Left: Dipole illumination for 28 nm vertical line pitch. Right: after 18 degree rotation at a different slit position, some illumination points are outside the originally allowed range. This would lead to more background light washing out the image. A reduced pupil fill (including points only within the dotted lines) avoids this outcome.

The right part of Figure 2 compares the original illumination distribution with the same distribution after an 18 degree rotation. Some points in the original distribution have now been rotated into the forbidden illumination zone. These illumination points degrade the final image: they no longer let the required spatial frequencies pass through to the wafer and only contribute background light. To avoid this, a subset of the original distribution (marked by the dotted lines) should be used. However, this pupil fill will be less than 20%.
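To see the mechanism numerically, here is a toy sketch (my own construction with a hypothetical five-point pole, not the article's model): rotating the pole azimuthally, as the ring field does across the slit, pushes some source points into positions where the first diffraction order no longer fits in the pupil:

```python
import numpy as np

# Toy model: a dipole pole for a 28 nm vertical-line pitch, rotated as in a
# ring-field slit. Vertical lines diffract along x, so the -1st order sits at
# (sx - f, sy) while the 0th order sits at (sx, sy).
wavelength, NA, pitch = 13.5, 0.33, 28.0
f = wavelength / pitch  # ~0.48 pupil-coordinate shift of the 1st order

def resolves(sx, sy):
    return sx*sx + sy*sy <= NA*NA and (sx - f)**2 + sy*sy <= NA*NA

pole = [(f / 2, dy) for dy in np.linspace(-0.20, 0.20, 5)]  # pole at sx = f/2

for deg in (0, 18):
    t = np.radians(deg)
    ok = sum(resolves(x*np.cos(t) - y*np.sin(t), x*np.sin(t) + y*np.cos(t))
             for x, y in pole)
    print(f"{deg:2d} degree rotation: {ok}/{len(pole)} points still resolve")
```

At 0 degrees all five points support two-beam imaging; after an 18 degree rotation, the points at one end of the pole fall out of the allowed zone and would only add background light.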

Alternatively, to use a larger pupil fill, the rotation range can be reduced by limiting the field width [6]. Instead of the full 26 mm width of the exposure field, half of it or less can be used. The tradeoff is that the scanner must now make more exposure field stops across the wafer, which also limits throughput.

Conclusions

The need for pupil fill reduction can now be understood as conferring the benefits of reduced impact from stochastics and from pupil rotation in EUV lithography. However, with the likely need to go to pupil fill ratios below 20%, throughput concerns cannot be ignored.

References

[1] A. Pirati et al., Proc. SPIE 9776, 97760A (2016).

[2] M. van de Kerkhof et al., Proc. SPIE 10143, 101430D (2017).

[3] A. E. Rosenbluth et al., Proc. SPIE 4346, 486 (2001).

[4] https://sst.semiconductor-digest.com/2014/02/the-impact-on-opc-and-sraf-caused-by-euv-shadowing-effect/

[5] R. Capelli et al., Proc. SPIE 10957, 10950X (2019).

[6] https://www.linkedin.com/pulse/forbidden-pitch-combination-advanced-lithography-nodes-frederick-chen/

This article first appeared in LinkedIn Pulse: The Need for Low Pupil Fill in EUV Lithography


There is No Easy Fix to AI Privacy Problems
by Matthew Rosenquist on 03-14-2020 at 8:00 am


Artificial intelligence – more specifically, the machine learning (ML) subset of AI – has a number of privacy problems.

Not only does ML require vast amounts of data for the training process, but the derived system is also provided with access to even greater volumes of data as part of the inference processing while in operation. These AI systems need to access and “consume” huge amounts of data in order to exist and, in many use cases, the data involved is private: faces, medical records, financial data, location information, biometrics, personal records, and communications.

Preserving privacy and security in these systems is a great challenge. The problem grows in sensitivity as the public becomes more aware of the consequences of their privacy being violated and misused. Regulations are continually evolving to restrict organizations and penalize offenders who fail to respect users’ rights. British Airways, for example, was recently fined $228 million under the European Union’s GDPR for privacy violations.

There is currently a fine line that AI developers must walk to create useful systems to benefit society and yet avoid violating privacy rights.

For example, AI systems are an excellent candidate to help law enforcement rescue abducted and exploited children by identifying them in social media posts. Such a system would be relentless in scouring all posts and matching images to missing persons, even accounting for the likely changes as years pass, something impossible for humans to accomplish accurately or at scale. However, such a system would need to run facial recognition analysis on every picture posted in a social network. That could identify, and ultimately help track, everyone, even bystanders in the background of images. That sounds creepy, and you would likely object. This is where privacy regulations and ethics must define what is allowable. Bringing home kidnapped kids or those forced into sex trafficking is very worthwhile, but it still requires adherence to privacy fundamentals so that greater harms aren’t inevitably created.

To accomplish such a noble feat, a system would need to be trained to recognize the faces of children. For accuracy, it would require a training database with millions of children’s faces. To follow the laws in some jurisdictions, the parents of each child in the training data set would need to approve the use of their child’s image as part of the learning process. No such approved database currently exists and it would be a tremendous undertaking to build one. It would probably take many decades to coordinate such an effort, leaving the promise of an efficient AI solution for finding kidnapped or exploited children just a hopeful concept for the foreseeable future.

Such is the dilemma of AI and privacy. This type of conflict arises when AI systems are in training and also when they are put to work to process real data.

Take that same facial recognition system and connect it to both a federal citizen registry and millions of surveillance cameras. Now the government could identify and track people wherever they go, regardless of whether they have committed a crime, which is very Orwellian.

But innovation is coming to help – federated learning, differential privacy, and homomorphic encryption are technologies that can assist in navigating such challenges. However, they are just tools and not complete solutions. They can help in specific usages but always come with drawbacks and limitations, many of which can be significant.

  • Federated learning (aka collaborative learning) makes it possible to train algorithms without local data sets being exchanged or centralized. It’s all about compartmentalization, which is great for privacy, but it is difficult to set up and scale. Additionally, it can be limiting for data researchers who need massive data sets containing the rich information required to train AI systems.
  • Differential privacy takes a different approach, attempting to obfuscate the details by providing aggregate information without sharing specific data, i.e., “describe the forest, but not individual trees”. It is often used in conjunction with federated learning. Again, there are privacy benefits, but it can result in serious degradation of accuracy for the AI system, thereby undermining the overall value and purpose.
  • Homomorphic encryption, one of my favorites, is a promising technology that allows data to remain encrypted while still permitting useful computations as if it were unencrypted. Imagine a class of students being asked who their favorite teacher is: Alice or Bob. To protect the privacy of the answers, an encrypted database is created containing the names of individual students and the corresponding name of their favorite teacher. While in an encrypted state, calculations could be done, in theory, to tabulate how many votes there were for Alice and for Bob, without actually looking at the individual choices of each student. Applying this to AI development, data privacy remains intact while training can still proceed. It sounds great, but in real-world scenarios it is extremely limited and takes tremendous computing power. For most AI applications it is simply not a feasible way to train the system (a toy numeric sketch of the vote-tally idea follows below).
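To make the teacher-vote example concrete, here is a minimal toy sketch of an additively homomorphic scheme (a textbook Paillier cryptosystem with deliberately tiny primes; this illustrates the principle, not any production system’s implementation):

```python
import math, random

# Toy Paillier tally: multiplying ciphertexts adds the underlying plaintexts,
# so votes can be counted without decrypting any individual ballot.
p, q = 293, 433                      # illustration only; real keys use ~2048-bit primes
n, g = p * q, p * q + 1
n2 = n * n
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
L = lambda x: (x - 1) // n
mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse (Python 3.8+)

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:       # r must be coprime to n
        r = random.randrange(2, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def decrypt(c):
    return L(pow(c, lam, n2)) * mu % n

# Each student encrypts 1 for Alice or 0 for Bob; only the sum is ever decrypted.
ballots = [encrypt(v) for v in [1, 0, 1, 1, 0]]
tally = math.prod(ballots) % n2
print("Votes for Alice:", decrypt(tally))  # -> 3
```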

For now, there is no perfect solution on the horizon. It currently takes the expertise of, and committed partnerships between, privacy, legal, AI development, and ethics professionals to evaluate individual use cases and determine the best course of action. Even then, most of the focus is placed only on current concerns rather than on the more difficult strategic question of what challenges will emerge in the future. The only thing that is clear is that we need to achieve the right level of privacy so we can benefit from the tremendous advantages that AI potentially holds for mankind. How that is achieved in an effective, efficient, timely, and consistent manner is beyond what anyone has figured out to date.


SPIE 2020 – Applied Materials Material-Enabled Patterning
by Scotten Jones on 03-13-2020 at 10:00 am


I wasn’t able to attend the SPIE Advanced Lithography Conference this year for personal reasons, but Applied Materials was kind enough to set up a phone briefing for me with Regina Freed to discuss their Materials-Enabled Patterning announcement.

At IEDM, Applied Materials (AMAT) put together a panel across the entire semiconductor ecosystem on how to shrink the technology. Author’s note: you can read my write-up on the panel here.

There is a need to look at all of the factors when shrinking, with the latest focus being power, performance, area and cost (PPAC). On the panel, TSMC also mentioned the need to consider time. Part of what AMAT is announcing is process simplification, which helps with cost and time by eliminating steps.

The three pieces to the announcement are:

  1. Square spacers
  2. Lateral etching
  3. Selective processing

Square spacers

SAxP is a widely used patterning technology, with Self-Aligned Double Patterning (SADP) and Self-Aligned Quadruple Patterning (SAQP) being the most common. The basic premise is to create a mandrel pattern and then deposit sidewall spacers on the edges of the mandrel to double the pitch density (i.e., halve the pitch). Typically, these sidewall spacers have rounded top edges. When running SAQP, one way to compensate for the rounded top edges is to deposit two mandrels and use the spacers to define the second mandrel, but this adds process complexity.

In the past, people have tried to control the spacer top-edge rounding by optimizing the etch and have added a second mandrel, but this increases cost and complexity. AMAT has changed the spacer material to get square spacers, allowing them to reduce the number of process steps. Figure 1 illustrates the conventional double-mandrel SAQP process (top) and the square-spacer SAQP process (bottom).

Figure 1. Square Spacer SAQP Versus Double Mandrel SAQP.

Providing square spacers can reduce the major process steps from 15 to 11 because the square spacer is of high enough quality to serve as the next mandrel. You do lose some ability to have multiple critical dimensions (CDs) with this technique.

Lateral Etch

SAxP processes create lines and spaces at double the pitch density for SADP and quadruple the pitch density for SAQP. The resulting lines need to be cut in the orthogonal direction. The distance between the cut line ends is referred to as tip-to-tip (T2T), and there is a fundamental trade-off between the line-space pitch and the T2T.

AMAT’s new lateral etch process enables lateral etching with control of the direction, so T2T can be reduced. Figure 2 illustrates the ability to reduce T2T by etching preferentially in one direction.

Figure 2. Reduction in T2T by Lateral Etching.

During the call, I suggested to Regina that this lateral etching technique could be useful for 3D NAND stair-step etching, where shrinking lateral dimensions without reducing the photoresist could potentially reduce the number of masks required, and she agreed that could be very interesting.

Selective Processing

Edge Placement Error (EPE) is a serious problem, particularly for complex multi-patterning schemes where the interactions of multiple masks add together to drive up EPE.

Previous selective materials grew up and out, creating a kind of mushroom structure that limited their use to thin films. The new selective deposition from AMAT grows straight up, allowing thicker films. The new selective deposition material is also etch-selective to titanium nitride (TiN) hard masks, so self-aligned patterns can be created where etching is selective first to TiN and then to the new material, eliminating EPE and allowing maximum sizing of critical features like vias for lower resistance.

An example of the process would be:

  1. The wafer already has a metal pattern on it.
  2. Selective deposition creates tall layers over the existing metal pattern.
  3. Gap-fill deposition fills between the tall features and up over their tops, and is then planarized by CMP.
  4. Deposit the TiN hard mask.
  5. Metal lithography defines the pattern for the next metal layer.
  6. Etch the metal pattern into the TiN and into the gap-fill film; this exposes the underlying selective deposition film wherever the current metal pattern overlaps the previous metal pattern.
  7. Via lithography opens where the vias will be formed; this mask can be oversized because the vias will be self-aligned.
  8. Etch the vias; the via pattern is constrained by the TiN metal hard mask in one direction and only etches where the selective deposition material is exposed, creating vias that are self-aligned to the original metal layer.

Figure 3 illustrates the EPE and cost advantage of the self-aligned process.

Figure 3. Selective Processing Enables EPE and Cost Improvements.

Conclusion

The three process innovations described here improve process latitude, reduce cost and time and improve performance.

  1. Square spacers eliminate process steps in SAQP processes, reducing cost and process time.
  2. Directional etch improves T2T spacing, enabling more compact layouts and improving cost.
  3. Selective processing reduces EPE and enables maximum via sizes, improving performance.

Also Read:

LithoVision – Economics in the 3D Era

IEDM 2019 – Imec Interviews

IEDM 2019 – IBM and Leti


Why IP Designers Don’t Like Surprises!
by Daniel Nenni on 03-13-2020 at 6:00 am


If it’s your job to get a SoC design through synthesis, timing/power closure and final verification, the last thing you need is surprises in new versions of the IP blocks that are integrated into the design. If your IP supplier sends a new version, the best possible scenario is that it is only a small incremental change from what you had before, fixing only those issues that are in the way of final tape-out.

What you really don’t want are unhappy surprises in, for instance, a new hard IP release. Suppose you requested some sleep-mode power-modeling improvements because the original version showed doubtful values for several process corners. The new IP release comes in, and indeed all process corners now show consistent trends for sleep-mode power usage. But, to your horror, all the timing characterization was updated as well, setting your design closure back by several weeks.

Were the delay modeling updates valid at all? If so, and had you known about them to begin with, their impact could have been studied on a sub-component; workarounds and synthesis adjustments could have been developed before re-running the entire design.

The problem with IP releases is often simply not knowing what is in there. In particular over time, as subsequent improvements to IP blocks are delivered, these incremental changes need to be just that: incremental. Anything else carries the risk of breaking the iterative synthesis cycles that take the design closer to final verification.

A new tool called IPdelta™ has been released by Fractal Technologies to address just this problem of identifying what has changed from one IP revision to the next.

IPdelta
When new IP is received, an incoming inspection needs to be performed. Part of this task is to verify that the IP is internally consistent and complete and meets the quality requirements agreed with the IP supplier. This task is typically done by Fractal Crossfire™, the industry-standard tool for IP qualification. If the IP release is intended to replace a previous version in a design, IPdelta™ must be used on top of Crossfire™ to bring out the changes introduced in the new version.

The objective of IPdelta™ is to inventory all aspects in which one IP revision may differ from the next. Every database and file format supplied is compared, and deltas are reported for every relevant category of design data. This includes basic elements like cells and terminals, but extends to delay, power and noise arcs, their conditions and associated characterization data. Physical layout (LEF, OASIS, OA, MilkyWay) is also covered, as are schematics, netlists, synthesis properties and functional models.
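The flavor of such a delta report is easy to sketch. The data below are hypothetical, pre-parsed stand-ins (not IPdelta™’s actual formats or API), just to show the kind of per-category differences a revision compare surfaces:

```python
# Illustrative only: comparing two pre-parsed IP revisions by category.
rev_a = {"cells": {"INVX1", "NANDX2"},
         "arc_delay_ps": {("NANDX2", "A->Y"): 42.0}}
rev_b = {"cells": {"INVX1", "NANDX2", "BUFX4"},
         "arc_delay_ps": {("NANDX2", "A->Y"): 47.5}}

print("cells added:  ", sorted(rev_b["cells"] - rev_a["cells"]))
print("cells removed:", sorted(rev_a["cells"] - rev_b["cells"]))
for arc, new in rev_b["arc_delay_ps"].items():
    old = rev_a["arc_delay_ps"].get(arc)
    if old is not None and old != new:
        print(f"arc {arc}: delay {old} -> {new} ps")
```

A report like this is exactly what lets a design team confirm that a release which was supposed to touch only sleep-mode power tables did not also move timing arcs.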

ABOUT FRACTAL
Fractal Technologies is a privately held company with offices in San Jose, California and Eindhoven, the Netherlands. The company was founded by a small group of highly recognized EDA professionals. Fractal Technologies is dedicated to providing high-quality solutions and support that enable its customers to validate the quality of internal and external IPs and libraries. Thanks to its validation solutions, Fractal Technologies maximizes value for its customers, whether at the sign-off stage, for incoming inspection, or on a daily basis within the design flow. Fractal Technologies’ goal is to become the de facto IP and library validation solutions provider of reference for the semiconductor industry, while staying independent to keep its intrinsic value by delivering comprehensive, easy-to-use and flexible products.


The Story of Ultra-WideBand – Part 4: Short latency is king
by Frederic Nabki & Dominic Deslandes on 03-12-2020 at 10:00 am


How Ultra-wideband aligns with 5G’s premise

In part 3, we discussed the time-frequency duality, or how time and bandwidth are interchangeable. If one wants to compress a wireless transmission in time, more frequency bandwidth is needed. This property can be used to increase the accuracy of ranging, as we saw in part 3. Another very interesting capability enabled by this time-frequency duality is the possibility of reducing latency in systems. But first, a primer on latency and how it affects wireless connectivity.

As engineers, we understand latency as the time interval between a triggering action and its response. From a wireless link point of view, this is the time delay between sending a frame of data and receiving it. But consumers have a visceral reaction to latency. Gamers playing combat and sports games experience latency as the lag between pressing a button and seeing the expected action on the screen. This delay can be a matter of in-game life or death! Monitors and peripherals are aggressively marketed on reduced latency (e.g., 240 Hz gaming monitors), and as a result wired peripherals are, perhaps surprisingly, still ubiquitous in gaming circles. The wire, as old a contraption as one can remember, remains undisputed with regard to latency.

The quest for low latency is intensifying today, as applications that are more sensitive to latency become mainstream. For example, designers and gamers wearing augmented reality (AR) or virtual reality (VR) headsets experience latency as the disconcerting lag between their motions and the visual response. AR and VR make users prone to motion sickness at the slightest onset of latency. Moreover, home theater owners curse latency when the characters’ lips on screen go out of sync with their voices, and while recorded video can be carefully delayed to calibrate out the latency, feeds that require live intervention cannot benefit from this tactic. This wireless latency problem with live interactions shows up as easily as when typing on your smartphone and seeing the keys move out of sync with the keys’ audio feedback coming through a wireless headset. Some phone makers hide this limitation by not sending the keyboard audio feedback through the wireless headset. Ironically, using the now-defunct audio jack on an antiquated phone with a barebones wired headset poses no latency problem! The issue goes even deeper: industrial engineers experience latency as unacceptable lag in critical sensor and control systems. All in all, current wireless technologies cannot deliver acceptable gaming, AR/VR, live video or industrial IoT experiences, so those applications remain wired in 2020.

The brain can typically discern latencies of tens of milliseconds or more, and some instrument players are able to “feel” as little as 3 ms of latency. Wireless latency has multiple causes. It is first a consequence of the speed of light, just as with wires. However, at the human scale, the speed of light is not the limiting factor: a 100-meter wireless link incurs only 333 ns of latency. The second cause is processing time in the transceiver. But again, this is usually not the limiting factor, as processors can often finish operating on a frame in microseconds. The third cause, and one of the most important, is the speed at which the transceiver can communicate its data. In a wireless transceiver, each data frame must be completely received before it can be processed. This means that the speed at which data is transmitted and received is a significant factor contributing to latency. As an example, transmitting a thousand-bit frame at a data rate of 1 Mbps results in a 1 ms latency. This is known as the airtime. In addition to the airtime, there is also the time required by the Media Access Control layer, or MAC-time, which is related to the communication stack used by the protocol and may include time for carrier sensing, frame acknowledgment, frame retransmission, flow control, etc. The MAC-time varies greatly depending on the application and can go from being negligible to being the dominating factor compared to the airtime. Ultimately, MAC-time is often correlated to the airtime, such that radios that can condense airtime are able to provide shorter latencies.

Combining all these factors makes it difficult to fairly compare the latencies of different wireless radios. Each technology has its targeted applications, which means the MAC layer has been developed accordingly. A wireless link that requires 99.999% reliability will not have the same latency as a best-effort broadcast system. Nevertheless, the latency will always be bounded from below by the airtime of the radio, which makes airtime a good point of comparison. The IEEE 802.15.4 standard, behind the ZigBee specification, provides a data rate of 250 kbps, while BLE 4.2 supports 1 Mbps and BLE 5 supports 2 Mbps. These data rates give airtimes on the order of a few milliseconds for BLE and tens of milliseconds for IEEE 802.15.4. These airtimes are further “amplified” by the MAC layer, causing much longer overall latencies that can go beyond 100 ms and are easily noticed by users.
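The airtime arithmetic is simple enough to check directly. A short sketch (the 20 Mbps UWB figure below is an assumed example of the “tens of Mbps” discussed later, and MAC overhead is ignored):

```python
# Airtime = frame size / raw data rate, ignoring preambles and MAC overhead.
frame_bits = 1000
rates_bps = {"IEEE 802.15.4 (250 kbps)": 250e3,
             "BLE 4.2 (1 Mbps)": 1e6,
             "BLE 5 (2 Mbps)": 2e6,
             "UWB (assumed 20 Mbps)": 20e6}
for radio, bps in rates_bps.items():
    print(f"{radio}: {frame_bits / bps * 1e3:.3f} ms airtime")
```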

A good way to reduce latency is to increase the data rate, a method well applied by Wi-Fi. With the 802.11 standard now supporting hundreds of Mbps over a single link, we now see sub-millisecond latencies for single frames. This latency comes at the expense of power consumption, however. The Wi-Fi standard supports large packets of over 2000 bytes and uses complex modulation requiring power-hungry circuitry.

Latency is in fact one of the main drivers behind the development of 5G networks. Promising a few milliseconds of latency, 5G will provide a 10X improvement over LTE. However, 5G radios have a drawback similar to Wi-Fi: very high power consumption, preventing their use in most IoT devices. As such, we are in a situation where we can route data over hundreds of kilometers in a few milliseconds, but it takes longer to cover the last one hundred meters using a lower-power radio.

UWB bridges the gap between long-range, high data rate transceivers (Wi-Fi and 5G) and short-range, low data rate solutions like BLE and Zigbee. UWB uses fast 2 ns pulses to reach data rates of tens of Mbps. This provides an airtime one order of magnitude shorter than BLE, reaching sub-ms latency. UWB is thus a strong candidate to provide low-latency connectivity over the last 100 meters when combined with 5G.

The sub-ms latency and relatively large data rate of UWB could enable multiple new interactive experiences and applications that were previously not possible with other short-range radios. However, one very important aspect of UWB, one aspect required for the IoT revolution, has not yet been discussed: low-power operation. In the next part, we will see how UWB can reduce power consumption to a level not yet achieved by any other wireless transceiver.

About Frederic Nabki
Dr. Frederic Nabki is cofounder and CTO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He directs the technological innovations that SPARK Microsystems is introducing to market. He has 17 years of experience in research and development of RFICs and MEMS. He obtained his Ph.D. in Electrical Engineering from McGill University in 2010. Dr. Nabki has contributed to setting the direction of the technological roadmap for start-up companies, coordinated the development of advanced technologies and participated in product development efforts. His technical expertise includes analog, RF, and mixed-signal integrated circuits and MEMS sensors and actuators. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada. He has published several scientific publications, and he holds multiple patents on novel devices and technologies touching on microsystems and integrated circuits.

About Dominic Deslandes
Dr. Dominic Deslandes is cofounder and CSO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He leads SPARK Microsystems’s long-term technology vision. Dominic has 20 years of experience in the design of RF systems. In the course of his career, he managed several research and development projects in the field of antenna design, RF system integration and interconnections, sensor networks and UWB communication systems. He has collaborated with several companies to develop innovative solutions for microwave sub-systems. Dr. Deslandes holds a doctorate in electrical engineering and a Master of Science in electrical engineering from École Polytechnique de Montréal, where his research focused on high-frequency system integration. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada.


Turbo-Charge Your Next PCIe SoC with PLDA Switch IP
by Mike Gianfagna on 03-12-2020 at 6:00 am


SemiWiki has a new IP partner, PLDA, and they bring a lot to the party. Peripheral component interconnect express (PCIe) is a popular high-performance data interface standard. Think GPUs, RAID cards, WiFi cards or solid-state disk (SSD) drives connected to a motherboard. The protocol offers much higher throughput than previous standards such as serial ATA. The last of these applications has spawned a new standard that bundles SSDs with PCIe, creating non-volatile memory express (NVMe) drives.

With all that jargon out of the way, let’s explore how PLDA makes a difference in the NVMe market. One approach to system design with NVMe drives is to use discrete interfaces, as shown below.

While this approach reduces SoC design complexity, it has some significant drawbacks – lower performance and the ability to differentiate only in software are two. Here is where PLDA makes a difference.

Thanks to their embedded PCIe switch IP, the SoC can now handle all the interface tasks previously done by discrete parts. This means improved performance, an optimized bill-of-materials (BoM) and power profile, system design flexibility (e.g., host and/or flash controller implementation in one device) and future-proof capabilities thanks to PLDA’s support of the latest standards, up to PCIe 5.0.

Integrating the PCIe switch into the SoC opens up other opportunities for further differentiation. For example, hardware-based accelerators can be added to the SoC for tasks such as encryption and compression, resulting in lower latency, lower power and a reduced BoM. Thanks to PLDA’s support for non-transparent bridging (NTB), multiple guest hosts can be added to the configuration as well. Host-to-host communication is supported by NTB, and guest-host-to-device communication can be implemented with address translation.

Differentiation at the appliance level is also possible with a modular design to support low-cost to high-end applications through multiple instances of the same SoC/FPGA.

PLDA summarizes the key features of their IP for these applications as follows:

  • Low latency switch (cut-through architecture)
  • Support of PCIe 4.0 or 5.0 and SR-IOV
  • RAS features include ECRC, Parity, ECC, AER, Hot Plug
  • Support for inline and/or embedded processing via pipelining and/or embedded endpoints
  • Any port to any port high-speed communication including peer-to-peer between endpoint devices
  • Ability to support different link widths and speeds on downstream ports
  • Downstream Port Containment (DPC)

Keep an eye on SemiWiki for more information on PLDA’s flexible IP portfolio. You can also get the big picture on PLDA’s full line of products and support at the PLDA website.


Where have all the Leaders gone – a profile of T.J. Rodgers
by Erach Desai on 03-11-2020 at 10:00 am


Silicon Valley has morphed from the days of semiconductor fabs interspersed among strawberry farms, with 3:00 pm rush-hour traffic during the shift change for the fabrication facility engineers and technicians. The leadership of technology companies has also arguably devolved, from people who inspired employees and stewarded the growth of enterprises to individuals who count eyeballs and hob-nob with Hollywood elites.

Very few technology leaders have inspired me as much as the iconic T.J. Rodgers. Trust me, this is not a fluff piece intended to ingratiate myself with T.J.

T.J. co-founded Cypress Semiconductor in December 1982, originally a designer and manufacturer of SRAMs (static random access memories). Silicon Valley was born against the backdrop of Stanford University, through Frederick Terman, William Shockley and many others, circa the 1950s. So in today’s lingo, T.J. would be from the 2.0 generation of the valley. Remarkably, Rodgers served as the CEO of Cypress for 34 years, stepping down in August 2016, and logs in as the longest-tenured CEO of all publicly traded technology (real tech) companies.

The first time I heard about T.J. Rodgers was in 1990 when I read his bombastic article in the Harvard Business Review titled No Excuses Management (in a real paper format, mind you). This makes me sound like some elitist snob (bragging about reading HBR). The publication that I loved at the time was a Silicon Valley trade rag called Upside (it was allegedly run out of business by the legendary Steve Jobs, but that’s a story for another day!).

It was quite a controversial article at the time. T.J.’s premise was that “most companies don’t fail for lack of talent or strategic vision. They fail for lack of execution…” He went on to elaborate – in great detail – about the management systems (sometimes referred to as management information systems, back then) that Cypress had implemented to track and manage: recruitment, goals management, resource allocations, and rewarding employees (compensation). Some observers viewed this as “Big Brother” or micro-management. In a sense, it may have been that. But, the fact remains that Cypress has employed hundreds to thousands of employees under this system – in Silicon Valley, where job mobility is legend.

Personally, I had mixed feelings about this management style. I liked the transparency and accountability. I knew that I would not like my every move being monitored. But what I most disagreed with was T.J.’s expectation that management, especially executives, had to sacrifice their personal lives to succeed at Cypress. While this was not a centerpiece of his article, I was a youngish, idealistic, and naive fool at the time and believed that “the next generation” could have it all! Nonetheless, I walked away very impressed with this gadfly executive and his chutzpah.

The next time Rodgers appeared on my radar in a meaningful way was in 1995, when I was working in product marketing at ArcSys (which became Avant! through the acquisition of Integrated Silicon Systems). I distinctly remember the morning that ArcSys’ offices were raided by the FBI and the San Jose PD – horrified and disheartened. A few days later I went to the San Jose District Court to pay for my own copy of the complaint that had been filed by Cadence Design Systems, alleging intellectual property theft. The complaint alleged that a Cypress design engineer had gone to T.J. when he had encountered unique error messages running ArcCell software – error messages that were identically poorly worded and mis-spelt as those from running Cadence’s Symbad tools (the Taiwanese-born software engineers at Cadence and ArcSys themselves referred to this as “Chinglish”; so, the thought police and hyper-ventilating PC folks can keep their pitchforks at bay!). The story goes that T.J. then contacted Joe Costello, CEO of Cadence, and therein began a series of cloak-and-dagger investigations in conjunction with the District Attorney’s office (there is an ArcSys version of the story that alleges that Costello approached Rodgers to do a software execution comparison; but, …).

The point is not to rehash the case of Cadence vs. Avant! right here. The point is that T.J. was an upstanding leader, willing to fight for what he felt was right in what became a very public case at the time (but, dwarfed by the O.J. Simpson trial). The real thug who master-minded the actual theft of intellectual property from Cadence to ArcSys is Gerald (Gerry) C. Hsu, who dodged prosecution and flourishes in mainland China. My personal investigation and digging into the series of events revealed a more insidious plan that Avant! was never really held accountable for (a book on this very intriguing case is on my bucket list).

But, the story that really galvanized T.J. into my hall of fame of executives is the story of SunPower and Cypress Semiconductor. Phew, they say! He’s finally getting around to the crux of this article.

It was 2001 and Rodgers ran into an old classmate from Stanford. Richard Swanson had ventured away from semiconductors into solar power. Swanson had started SunPower way back in 1988 to make efficient solar cells from silicon. At their chance encounter, Swanson informed Rodgers that he was a few weeks away from laying off half his workforce in order to meet payroll. Curious about the technology and perhaps having empathy for his friend, T.J. decided to meet up with Swanson to learn more and do some initial due diligence. He was so sold on the potential of SunPower’s technology that he approached the board of directors at Cypress to make an equity investment. The board was unconvinced, thinking that it was not core to Cypress’ memory and communications chip business. In characteristic form, T.J. then informed the board that he would make the investment himself, and cut Swanson a $750,000 personal check for an equity investment.

Up to this point this would be a ho-hum narrative. But, there is more.

A year later, SunPower’s technology was proving to be significantly better than that of overseas rivals like Siemens, Kyocera, Sharp and Sanyo. It made sense to plow further investment into SunPower, and Cypress’ board agreed to take a controlling stake with an $8 million investment. SunPower became a subsidiary of Cypress Semiconductor and brought on Tom Werner as CEO in 2003. Sometime in 2004 it became clear that shareholder value for SunPower would be optimized by taking it public. But the board, rightly, was concerned: T.J.’s original $750,000 investment could be viewed as self-serving or a conflict of interest, even though it clearly wasn’t as events had transpired (Rodgers had brought the deal to the board, they had rejected it, he asked to invest personally, and the board had okayed his outside investment). In anticipation of the IPO in 2005, the board urged T.J. to sell his personal equity stake to Cypress at the original valuation; in other words, Rodgers would get a $750,000 check from Cypress, netting a 0% return. T.J. went ballistic and angry words were supposedly exchanged. In the end, T.J. did “the right thing” even though he had no legal obligation to do so. Kudos to a board of directors that proactively wanted to avoid any perception of conflict. But a double-kudos to T.J. for being the bigger person when he did not have to be!

This story was relayed to me at some point in 2004 by Manny Hernandez, then Cypress’ CFO, in my role as a semiconductor equity analyst covering Cypress. However, the narration is mine and I take full responsibility for any unintentional factual misstatements or misrepresentations.

I cannot profess to know T.J. Rodgers personally. As the analyst on Cypress’ stock for a period of a few years, I had some opportunities to meet with him in a group of analysts and investors. Quite frankly he would not know me from Joe the Plumber! When Brian Fuller (editor-in-chief of EETimes, and a friend) interviewed him in a town hall setting at the Embedded Systems Conference in 2006, I grabbed a front seat like a crazed fan!

If there is an underlying thread as to why I truly admire Rodgers’ leadership, it is his integrity: a person who stands by what he believes, embodies it, and built a long-lasting enterprise under his stewardship. In doing some background research for this article, I ran into a more recent story, circa 2017. Basically, T.J. (retired from Cypress in 2016, but at the time the largest individual shareholder of Cypress stock) took Cypress’ board of directors to task in a proxy fight, in which he prevailed. The issue that T.J. fought for, and won on, was about the integrity of the board of directors and accountability to shareholders. Go figure!


An Objective Hardware Security Metric in Sight
by Bernard Murphy on 03-11-2020 at 6:00 am


Security has been a domain blessed with an abundance of methods to improve in various ways, not so much in methods to measure the effectiveness of those improvements. With the best will in the world, absent an agreed security measurement, all those improvement techniques still add up to “trust me, our baby monitor camera is really secure.” Software security has made some progress on this front, as we’ll see. Hardware not so much. That’s a problem. Where’s the UL seal of security approval or something of that nature?

Now there’s hope that will change. The MITRE corporation has released its first documented taxonomy for Common Weakness Enumeration (CWE) in hardware design.

A little background first. MITRE is a not-for-profit organization charged with managing federally funded R&D centers supporting a number of government agencies. Among their well-known contributions is their development and maintenance of the list of Common Vulnerabilities and Exposures (CVE), which documents known cybersecurity vulnerabilities. For example, the recent Meltdown attack has CVE-2017-5754, and Spectre has CVE-2017-5753 and CVE-2017-5715.

MITRE also maintains a related CWE list, which until recently only concerned itself with software weaknesses. For example, you can find buffer overflow weaknesses in this list. The software CWE is very well developed at this point, so much so that there are now over 100 products (including Synopsys’ Coverity) which analyze software for CWE weaknesses.

Intel has a very sophisticated security team (they’re a very big target for attacks) and has been working with MITRE for a while now to develop an equivalent weakness list for hardware design. Tortuga Logic has worked with Intel and was invited to contribute weaknesses it had found using its Radix software. So what you’ll find in this starting list is a combination of Intel wisdom on what they know to be common weaknesses, plus Tortuga Logic wisdom on additional weaknesses they have found.

That’s a pretty darn impressive accomplishment for Tortuga Logic. This CWE list for hardware is likely to follow the same path as the list for software, becoming a definitive standard for best practices in hardware design for security. Even more important, Tortuga Logic’s Radix-* software is already set up to find many or most of these weaknesses.

Where will this lead over time? First, just as for software CWE, we should expect the list to grow over time. MITRE has a formal process to review and approve new submissions. I don’t imagine designers will want to check through each weakness one at a time in a security signoff (there are already ~840 weaknesses documented). Hardware security tools will be essential.

Second, where is this going in terms of enforcement versus defacto adoption? Jason Oberg (CEO of Tortuga Logic) doesn’t yet see any kind of enforcement, though I suspect government agencies and particularly the DoD will expect vendors to demonstrate they are clean.

Along those lines it is worth noting that the National Institute for Standards and Technology (NIST) has adopted CWE in their cybersecurity framework. It’s perhaps too early to talk about that also including the just-released hardware component, but it’s difficult to see why it wouldn’t ultimately be incorporated.

So unless you are going to ignore government business or build separate hardware for the government, get ready to have to prove CWE compliance at some point. And when the commercial industry is looking around for a security standard on which to hang its hat, I would guess the MITRE CWE will look like a pretty good place to start.


The Story of Ultra-WideBand – Part 3: The Resurgence
by Frederic Nabki & Dominic Deslandes on 03-10-2020 at 10:00 am


In Part 2, we discussed the second false start of Ultra-WideBand (UWB), which leveraged over-engineered orthogonal frequency-division multiplexing (OFDM) transceivers, launched at the dawn of the great recession and was surpassed by a new generation of Wi-Fi transceivers. These circumstances spelled the end of the proposed application, a short-range, very high data rate (i.e., a few hundred Mbps) wireless link, but not of the technology. In fact, the history of UWB is a little more complex: by the time the high-speed wireless UWB proposal was starting to fade, other UWB applications were thriving.

Starting in World War II, the rapid development of microwave systems paved the way for UWB systems. In the 1960s, the Lawrence Livermore National Laboratory (LLNL) and the Los Alamos National Laboratory (LANL) were working on pulse transmitters, receivers and antennas. These research projects were not pure academic research; there was a strong incentive to develop impulse systems: UWB could provide ultra-high resolution, which could then be used for object positioning, characterization and identification. By the 1970s, UWB radars were being developed, mainly for military applications. As research continued to progress, other applications were found and, by the end of the 1990s, UWB radars were used for a wide range of applications: forestry, through-wall detection in urban areas, imaging for search and rescue operations, and obstacle avoidance.

In order to really understand the appeal of UWB, we first have to grasp the time-frequency duality, well encapsulated by the Fourier transform. In simple terms, this duality states that an infinitely long periodic time signal has an infinitely small bandwidth, while an infinitely short impulse signal has an infinitely large bandwidth. In other words, you can trade time for bandwidth. Why would you want to do that? There are multiple reasons, but a very important one is to enable ultra-high-resolution positioning.

There are two basic ways to determine the distance between RF devices: you can either use the Received Signal Strength (RSS) or the Time of Flight (ToF) of the signal. RSS is a very simple technique to implement and can be used by any wireless transceiver, which explains why it is so widely used. However, it is severely limited in its accuracy: the perceived distance between two immobile objects changes according to obstacles in their direct path. As an example, if you have two devices placed 10 meters apart but separated by a brick wall that provides 12 dB of attenuation, you would conclude that the devices are 40 meters apart. ToF solves this issue. By measuring the time it takes the signal to travel from one device to the other, you can precisely extract the distance between the objects. In our previous example, the speed of light is indeed reduced inside the brick wall, but this only induces an error of about 10 cm (due to the slight reduction in the speed of light in the brick).
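The brick-wall example follows directly from free-space path loss, under which received power falls off as the square of distance; a quick sketch:

```python
# RSS-based ranging error: with power ~ 1/d^2, an extra 12 dB of wall loss
# inflates the inferred distance by 10^(12/20), i.e., about 4x.
def rss_inferred_distance(true_distance_m, extra_loss_db):
    return true_distance_m * 10 ** (extra_loss_db / 20)

print(rss_inferred_distance(10, 12))  # ~39.8 m for devices actually 10 m apart
```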

ToF is clearly the way to go in order to accurately position objects in space. One drawback, however, is that you need to deal with the speed of light, which is pretty fast, to say the least. In fact, light takes only 333 picoseconds to travel 10 cm. If one wants to measure distances between objects with centimeter precision, sub-nanosecond accuracy is needed in the system. The easiest way to achieve this accuracy is to send a signal that is very short in time, which, thanks to the time-frequency duality, requires a UWB signal.
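The timing requirement is easy to verify (a two-line check of the numbers above):

```python
c = 2.998e8                     # speed of light in m/s
print(0.10 / c * 1e12, "ps")    # ~334 ps for light to cross 10 cm
print(0.01 / c * 1e12, "ps")    # ~33 ps of timing accuracy per cm of precision
```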

The possibility of accurately measuring distance with ToF explains to a large extent the resurgence of UWB in the last few years. The market for accurate positioning is growing rapidly in multiple sectors and should continue to see double-digit growth in the coming years. Multiple companies are now jumping on the UWB bandwagon, the latest being Apple, which equipped its iPhone 11 with a UWB chip, the U1, seemingly of its own design. With the capability to implement Real-Time Location Systems (RTLS), UWB enables a wealth of new applications in a wide variety of markets: Industry 4.0, IoT, and vehicular.

As we saw in this article, time can be traded for bandwidth, which can advantageously be used to do positioning. But it can also provide other advantages. In Part 4, we will explore another key advantage to UWB in many wireless applications: very low latency.

About Frederic Nabki
Dr. Frederic Nabki is cofounder and CTO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He directs the technological innovations that SPARK Microsystems is introducing to market. He has 17 years of experience in research and development of RFICs and MEMS. He obtained his Ph.D. in Electrical Engineering from McGill University in 2010. Dr. Nabki has contributed to setting the direction of the technological roadmap for start-up companies, coordinated the development of advanced technologies and participated in product development efforts. His technical expertise includes analog, RF, and mixed-signal integrated circuits and MEMS sensors and actuators. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada. He has published several scientific publications, and he holds multiple patents on novel devices and technologies touching on microsystems and integrated circuits.

About Dominic Deslandes
Dr. Dominic Deslandes is cofounder and CSO of SPARK Microsystems, a wireless start-up bringing a new ultra low-power and low-latency UWB wireless connectivity technology to the market. He leads SPARK Microsystems’s long-term technology vision. Dominic has 20 years of experience in the design of RF systems. In the course of his career, he managed several research and development projects in the field of antenna design, RF system integration and interconnections, sensor networks and UWB communication systems. He has collaborated with several companies to develop innovative solutions for microwave sub-systems. Dr. Deslandes holds a doctorate in electrical engineering and a Master of Science in electrical engineering from École Polytechnique de Montréal, where his research focused on high-frequency system integration. He is a professor of electrical engineering at the École de Technologie Supérieure in Montreal, Canada.


Viewing the Largest IC Layout Files Quickly
by Daniel Payne on 03-10-2020 at 6:00 am


The old adage “time is money” certainly rings true today for IC designers, so the entire EDA industry has focused on the challenging goal of making tools that help speed up design and physical verification tasks like DRC (Design Rule Checking) and LVS (Layout Versus Schematic). Sure, the big three EDA vendors have adequate IC layout editors; however, they are general-purpose tools, not really optimized for loading and viewing the largest IC designs, which can now approach 1TB in size. This creates an opportunity for a focused point tool that excels at quickly reading an IC layout and doing chip-finishing tasks, which is what Empyrean has done with their Skipper tool.


I had a WebEx session last week with Chen Zhao, AE Manager at Empyrean to see what the Skipper tool was all about. Here’s a diagram showing what the input and output file formats are for Skipper:

Let’s say you have a 200GB OASIS layout file that you need to load and browse. With a general-purpose IC editor, just loading that file would take about 5 hours; with Skipper, the same file loads in just 30 minutes. Now that’s what I call automation. The following six IC design tasks all benefit from a fast tool like Skipper:

  1. Visualizing IC layout during DRC and LVS debugging.
  2. Point 2 Point (P2P) resistance analysis.
  3. Net tracing for VDD, VSS, Clock, etc. to find shorts and opens.
  4. Merging multiple IP blocks as part of chip finishing.
  5. Comparing two or more versions of the same layout.
  6. Focused Ion Beam (FIB) processing for defect analysis and circuit modification.

Using Skipper

There are three ways to use the Skipper tool, and each method is optimized for certain tasks.

  • Normal mode – reads the IC layout file into RAM at speeds up to 1 GB/s, depending on whether your drive is SSD or magnetic.
  • Cache mode – reads only parts of an IC layout file into RAM using an index file. Useful if you have cells that don’t need to be loaded.
  • Share mode – the second user to invoke Skipper on the same machine shares the first user’s RAM image, allowing the quickest viewing, typically ready within seconds for a 100+ GB file (a rough code analogy follows below).
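Empyrean hasn’t published how share mode is implemented, but the payoff is easy to picture with a memory-mapping analogy: parse the file once, publish the in-memory image, and let later users attach to it instead of re-parsing. A hypothetical sketch (the /dev/shm path and file name are made up for illustration):

```python
import mmap

IMAGE = "/dev/shm/layout_image.bin"  # hypothetical shared-memory location

def first_user_load(parsed_image: bytes):
    # Parse the OASIS file once (slow), then publish the parsed image.
    with open(IMAGE, "wb") as f:
        f.write(parsed_image)

def later_user_attach() -> mmap.mmap:
    # Attaching is near-instant regardless of layout size: no re-parsing.
    f = open(IMAGE, "rb")
    return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```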

The first two modes sound like a typical EDA tool; however, the share mode was something I had never heard of before, and just watching how fast the IC layout appears is quite exciting.

Let’s look at some scenarios for using Skipper, starting with Normal mode, where each of the following three users independently loads an IC layout and each has to wait patiently:

Normal mode

With Cache mode, User A creates an index file containing just the cells of interest instead of the entire IC, so Users B and C load only the pertinent cells, saving time.

Finally, with Share mode, User A is the first to load the IC layout on Server 1; then Users B and C use the same Server 1 with Skipper and share the RAM image, allowing a near-instantaneous viewing experience in just seconds.

Share mode

To get some speed perspective, consider an actual IC design with a 1.6GB flattened OASIS layout. Here are the loading times to start viewing (a quick arithmetic check follows the list):

  • Normal mode: 110 seconds (at 14MB/s reading speed)
  • Cache mode: 20 seconds (5.5X faster)
  • Share mode: <1 second (100X faster)
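The Normal-mode figure is consistent with the stated reading speed:

```python
# Consistency check: 1.6 GB read at 14 MB/s.
print(1.6e9 / 14e6, "s")  # ~114 s, in line with the quoted 110 seconds
```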

Customer Examples

The following two tables compare the speed of Skipper versus another IC layout viewer on an FPGA design, showing improvements of 3X to 24X:

FPGA benchmark times

For golden layout signoff using four of the Skipper capabilities, a customer found that designs ranging from 28nm to 7nm benefited from IP merging that was 5X faster, and they have used Skipper on 100+ chips so far.

Customer Case

The final customer case involved a CAD group that integrated Skipper features into their flow using the C++ API, coding 200+ API features in just 3 months’ time.

API integration

Another way to control Skipper is with Tcl code, which should keep your CAD team happy.

Speaking of customers, Empyrean has about 300 customers using Skipper, so it’s a proven technology, and a tool category worth taking a closer look at.

TSMC Symposium

If you live in Silicon Valley, consider attending the TSMC Symposium to watch what Skipper can do in real time and to talk with the experts at the Empyrean booth. Visit the Santa Clara Convention Center on Wednesday, April 29th.

Summary

There’s a new category of EDA tool for speeding up your LVS/DRC debug times and LVL (layout versus layout) checking, where the largest IC designs can be browsed most quickly using a point tool like Skipper, saving you time and improving productivity.
