
Achronix Blog Roundup!

by Daniel Nenni on 07-09-2020 at 10:00 am

Achronix Speedcore

Blogging is not an easy thing to do. It takes time, patience, commitment, and creativity. SemiWiki brought blogging to the semiconductor industry and many companies have followed. Very few have been successful with personal or corporate blogs but as a premier semiconductor blogger I have developed a proven recipe over the last ten years and can spot a winner when I see one.

As a corporate blog success story I will point to the Achronix blog site. Over the last three years Achronix has posted 28 blogs. My preference would be one per month without fail, but 28 in 37 months is a serious commitment. There is a nice mix of authors from different parts of the company (engineering, marketing, applications, C-level, etc.):

– Kent Orthner, Systems Architect
– Alok Sanghavi, Sr. Marketing Manager
– Steve Mensor, Vice President, Marketing
– Katie Purcell, Senior Staff Applications Engineer
– Volkan Oktem, Sr. Director of Applications
– Manoj Roge, VP of Strategic Planning & Business Development
– Raymond Nijssen, Vice President and Chief Technologist
– Bob Siller, Director, Product Marketing
– Huang Lun, Sr. Field Applications Engineer

While the author is important, the blog title and first paragraph are everything for both direct and search traffic. You need to speak to a specific problem to attract high-quality traffic. For semiconductor sites, quality trumps quantity, and be very careful with clickbait because it is a double-edged sword. Again, the Achronix blogs are a great example for titles and summaries:

Embedded FPGAs for Next-Generation Automotive ASICs
Bob Siller, Sr. Marketing Manager
For anyone who has looked at new cars lately, it’s hard not to notice how quickly automotive electronics are advancing. Looking at automotive safety technology from just three years ago vs. today, you see a significant increase in the number of cameras to support applications such as surround-view display, driver distraction monitors, stereo vision cameras, forward-facing and multiple rearview cameras.

Increase Performance Using an FPGA with 2D NoC
Huang Lun, Sr. Field Applications Engineer
Achronix Speedster7t FPGAs feature a revolutionary new two-dimensional network on chip (NoC), which provides >20 Tbps ultra-high bandwidth connectivity to external high-speed interfaces and for routing data within the programmable logic fabric. The NoC is structured as a series of rows and columns spread across the Speedster7t FPGA fabric. Each row or column has two 256-bit data paths using industry standard AXI data format, which support 512 Gbps data rates.
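The figures in that summary are internally consistent: a 256-bit data path at 512 Gbps implies a 2 GHz NoC clock (the clock frequency is an inference for illustration; only the path width and data rate come from the blog). A quick sanity check:

```python
# Sanity-check the Speedster7t NoC bandwidth figures quoted above.
# The 2 GHz NoC clock is an assumption used for illustration; only the
# 256-bit path width and 512 Gbps rate come from the blog summary.
PATH_WIDTH_BITS = 256
NOC_CLOCK_HZ = 2e9  # assumed

per_path_gbps = PATH_WIDTH_BITS * NOC_CLOCK_HZ / 1e9
print(per_path_gbps)  # 512.0, matching the quoted Gbps figure

# Each row or column carries two such paths:
per_row_gbps = 2 * per_path_gbps
print(per_row_gbps)  # 1024.0 Gbps per row or column

# With on the order of 20 rows and columns combined (hypothetical
# count), aggregate bandwidth exceeds the quoted 20 Tbps.
rows_and_cols = 20
print(rows_and_cols * per_row_gbps / 1000)  # 20.48 Tbps
```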

What is an FPGA and Why the Answer is Changing?
Bob Siller, Director, Product Marketing
What is an FPGA? With the advent of new FPGA architectures, the answer has changed more in the last two years than ever before. Traditionally, an FPGA, or field programmable gate array, is a reconfigurable semiconductor device comprising programmable logic gates and interconnect (routing), connected to multipurpose I/O pins. An FPGA can be reprogrammed to perform any function, and its functionality can be changed over time.

Insights from the Next FPGA Platform Event
Manoj Roge, VP of Strategic Planning & Business Development
It was exciting to participate in Next FPGA Platform on January 22nd at the Glasshouse in San Jose. I found it was particularly exciting to have Achronix share in a panel discussion with Xilinx and Intel. The Next Platform co-editors Nicole Hemsoth and Timothy Prickett Morgan did a great job in interviewing experts from FPGA ecosystem with insightful questions. The best part of Next Platform events is their format, where they keep marketing pitches to minimum with no presentations, just discussions.

FPGAs in the 2020s – The New Old Thing
Bob Siller, Director, Product Marketing
FPGAs are the new old thing in semiconductors today. Even though FPGAs are 35 years old, the next decade represents a growth opportunity that hasn’t been seen since the early 1990s. Why is this happening now?

Mine Cryptocurrencies Sooner, Parts 1–3
Raymond Nijssen, Vice President and Chief Technologist
Cryptocurrency mining is the process of computing a new cryptocurrency unit based on all the previously found ones. The concept of cryptocurrency is recognized almost universally thanks to the publicity of the original cryptocurrency, Bitcoin. Cryptocurrencies were supposed to be a broadly democratic currency vehicle not controlled by any one entity, such as banks, governments, or small groups of companies. Much of a cryptocurrency’s acceptance and trustworthiness rests on that proposition. However, with Bitcoin, that is not how it unfolded.

Getting the first wave of blog views is actually the easy part. Keeping readership is critical, and that is all about the quality of content. If I had to credit one thing for the success of SemiWiki over the last 10 years, it would be excellent content. All content on a company website is important, but done correctly, blogs can bring a consistent stream of high-quality traffic and improve your website’s SEO and rankings.

You should also check out the Achronix videos and webinars, very well done!

 

About Achronix Semiconductor Corporation
Achronix Semiconductor Corporation is a privately held, fabless semiconductor corporation based in Santa Clara, California, offering high-performance FPGA and embedded FPGA (eFPGA) solutions. Achronix’s history is one of pushing the boundaries in the high-performance FPGA market. Achronix offerings include programmable FPGA fabrics, discrete high-performance and high-density FPGAs with hardwired system-level blocks, datacenter and HPC hardware accelerator boards, and best-in-class EDA software supporting all Achronix products. The company has sales offices and representatives in the United States, Europe, and China, and has a research and design office in Bangalore, India.

Follow Achronix
Website: www.achronix.com
The Achronix Blog: /blogs/
Twitter: @AchronixInc
LinkedIn: https://www.linkedin.com/company/57668/
Facebook: https://www.facebook.com/achronix/


Interface IP Category to Overtake CPU IP by 2025?

by Eric Esteve on 07-09-2020 at 6:00 am

Top 5 Forecast 2020 2024

The interface design IP market is exploding: it grew 18% in 2019 to $870 million, while the CPU IP category grew 5% to $1,460 million. In fact, the interface IP market is forecast to sustain a high growth rate for the next five years, as calculated by IPnest in the “Interface IP Survey 2015-2019 & Forecast 2020-2024”, reaching $1,800 million by 2025. Obviously, the CPU IP category will not stay at the 2019 level and is expected to grow as well, but we think the 2020-2025 CAGR for CPU will be more modest, in the 4% range.

Why such a modest growth rate for the CPU IP category? The first reason is that the CPU IP market is shaky: the licensing business model has been upended by the arrival of RISC-V CPUs. The second reason is the uncertainty about ARM’s future IP revenues from China (estimated to be in the 30% range), following the exit from the JV built to support ARM IP sales in the country. The post “Tears in the Rain – ARM and JVs in China” by Jay Goldberg on SemiWiki gives a very detailed explanation of the complete story. I strongly suggest you read it, because it turns what we were only guessing at, a feeling, into clear wording.

But the goal today is to explain why the interface IP category will see such a high growth rate until 2025. The picture below shows that CPU IP market share has been declining since 2017 (40.8% to 37.2%) while interface IP share has grown over the same period from 18% to 22.1%. This trend has held for the last three years, and we will see why this behavior will continue through the 2020s.

In the 2010s, the smartphone was the strongest driver for the IP industry, pushing the CPU/GPU categories and some interface protocols like LPDDR, USB and MIPI. The smartphone industry is still active but has reached a peak. The new growth drivers for IP sales are data-centric applications, including servers, datacenters, wired and wireless networking, and emerging AI.

All of these applications share the need for ever-higher bandwidth for in-system data exchange (with memory and between chips) as well as across the global network, to support faster and wider interconnects between datacenters and networks.

This translates into high-speed memory controllers (DDR5, HBM or GDDR6), faster releases of interface protocols (PCIe 5, 400G and 800G Ethernet, 112G SerDes), and the emergence of protocols supporting chiplets (HBI or SerDes).

If we look at the interface IP segments, this will directly impact the memory controller, PCI Express, Ethernet and SerDes segments, plus a new segment that we could call “Die2Die” (D2D). We already saw significant IP revenue growth in these segments in 2019, ranging from 12% (memory controller) to 20% (PCIe), and even more for the Ethernet and SerDes segments.

The drivers have been linked to the adoption of emerging protocols as well as new technology nodes, like 7nm and 5nm. For PCIe, the driver has been the adoption of PCIe 4 (16 Gbps data rate per lane). For the memory controller segment, we have seen several drivers, like DDR4 adoption in the datacenter, as well as the adoption of High Bandwidth Memory (HBM2) and graphics memory (GDDR6) in numerous applications, some of them new and linked to AI.

When a design project starts on the latest available technology node (7nm in 2019) and integrates the latest release of a protocol, the license ASP is higher than before (for the n-1 release on the n-1 node). So the growth for a specific IP segment is generated by the number of design starts (higher than before, because there are more developments in applications like the datacenter and AI) multiplied by the license ASP increase, because the protocol is more complex and the target node is advanced.
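This multiplicative growth model can be made concrete. A minimal sketch with hypothetical numbers (neither the design-start counts nor the ASPs below come from the survey):

```python
# Hypothetical illustration of the IP-segment revenue model described
# above: segment revenue = design starts x license ASP, so growth
# compounds when both factors rise together.

def segment_revenue(design_starts, license_asp_millions):
    """Revenue for one IP segment in one year, in $M."""
    return design_starts * license_asp_millions

# Year N: 100 design starts at a $0.5M ASP (made-up numbers)
base = segment_revenue(100, 0.5)
# Year N+1: 15% more starts (more datacenter/AI projects) and a 10%
# higher ASP (newer protocol release on a more advanced node)
nxt = segment_revenue(115, 0.55)

growth = nxt / base - 1
print(f"{growth:.1%}")  # 26.5% revenue growth from two modest factors
```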

What we start to see clearly is that data-centric applications (servers, datacenter, networking, AI, …) are strongly pushing the interface IP market, more specifically memory controller, PCIe, Ethernet and SerDes.

With SerDes, we can consider 2019 the year when 112G PAM4 SerDes started to be adopted, positively impacting SerDes IP category revenues, but also Ethernet, as 400G MAC IP (and 800G MAC) started to sell.

In fact, we have seen growth in the high-30% range for this category, illustrated by Synopsys (thanks to the Silabtech acquisition), Cadence (thanks to the Nusemi acquisition) and the three-year-old SerDes start-up Alphawave reaching $25 million in revenues in 2019!

Don’t forget other protocols like USB: the introduction of USB 4 should boost USB IP sales in 2021 and beyond. USB 4 offers much higher bandwidth at 40 Gbps (compared with 10 Gbps for USB 3.2 Gen 2 or 20 Gbps for USB 3.2 Gen 2x2) and clarifies the USB nomenclature, making it easier for the end user (the consumer) to understand. It also supports DisplayPort and Thunderbolt, a new capability that makes life easier for consumers who want to watch movies.

The MIPI protocol, part of the top five interfaces, is used massively in smartphones. The change is coming from the automotive segment, with the adoption of MIPI CSI (camera) and MIPI A-PHY, defined to support long-reach (LR) SerDes-based interconnects within a car.

Nevertheless, the IPnest forecast for USB and MIPI predicts a CAGR in the 10% range for 2020-2024 for these two protocols, slightly less than the 15% CAGR associated with the other three protocols.

IPnest has used a methodology based on design starts by protocol, forecasting new-project growth with respect to the target market segment (datacenter, networking or ADAS) and predicting the license price (as a function of the technology node for the PHY, and linked to the protocol release for the controller). This approach is quite complex, but we expect it to deliver accurate results and, more importantly, a realistic forecast.

This is the 12th edition of the survey, which started in 2009 when the interface IP category was a $250 million market ($870 million in 2019), and we can affirm that the five-year forecasts have stayed within a +/- 5% error margin! So when IPnest predicts in 2020 that the interface IP category will be in the $1,800-$2,000 million range in 2025, passing the CPU IP category, this claim is backed by experience…
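The headline numbers imply a growth rate that can be checked directly: going from $870 million in 2019 to the low end of the 2025 range requires roughly 13% per year, consistent with the per-protocol CAGRs quoted earlier.

```python
# Back out the CAGR implied by the forecast quoted above:
# $870M (2019) -> $1,800M (2025), i.e. six years of compounding.
start, end, years = 870, 1800, 2025 - 2019

cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # ~12.9% per year

# At the top of the quoted range ($2,000M) the implied CAGR is higher:
cagr_high = (2000 / start) ** (1 / years) - 1
print(f"{cagr_high:.1%}")  # ~14.9% per year
```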

If you’re interested in this “Interface IP Survey” released in June 2020, just contact me:

eric.esteve@ip-nest.com.

Eric Esteve from IPnest

Also Read:

Design IP Revenue Grew 5.2% in 2019, Good News in Declining Semi Market

Chiplet: Are You Ready For Next Semiconductor Revolution?

IPnest Forecast Interface IP Category Growth to $2.5B in 2025


Arm Rings the Bell in Supercomputing

by Bernard Murphy on 07-08-2020 at 6:00 am

Fugaku

Late last year I wrote about Arm’s efforts to play a role in servers, in AWS, and particularly in Arm-based supercomputing, in the Sandia Astra roadmap and in partnering with NVIDIA, whose GPUs power the Oak Ridge Summit supercomputer. These steps came, at least for me, with an implicit “Good for them, playing a role on the edges of these challenging applications.”

Well, they just blew right past that theory. The Arm-based Fugaku supercomputer was just named this year’s fastest in the world. Arm isn’t helping in some peripheral role; Arm cores are the CPUs in this supercomputer. What’s more, Fugaku had earlier also topped the list of the world’s most efficient supercomputers.

Some Fugaku specs

Fujitsu and RIKEN developed Fugaku jointly, around the Fujitsu A64FX processor. Fujitsu built these processors around a many-core Arm CPU, with 48 compute cores connected through a NoC, together with either two or four helper cores. In addition, each processor connects in-package to 32GB of high-bandwidth memory (HBM) supporting streaming memory accesses, which are also the types of accesses common in AI applications. The processor uses the Armv8.2-A architecture, plus the Scalable Vector Extension with a 512-bit vector implementation. One processor alone is a serious machine.

384 of these go in a full rack and there are 396 of those racks in the system, plus a number of half racks. Together these add up to a total of nearly 160k nodes in Fugaku. These interconnect through a torus-architecture network they call TofuD (a neat name for a Japanese supercomputer network).
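The rack counts can be checked against the headline node total; the arithmetic below treats the half-rack contribution as the remainder, since the exact number of half racks isn't given in the article.

```python
# Check the Fugaku node arithmetic quoted above.
NODES_PER_RACK = 384
FULL_RACKS = 396

full_rack_nodes = NODES_PER_RACK * FULL_RACKS
print(full_rack_nodes)  # 152064 nodes in the full racks alone

# The article says the total is "nearly 160k nodes"; the difference
# (a few thousand) comes from the unspecified number of half racks.
approx_total = 160_000
print(approx_total - full_rack_nodes)  # 7936 nodes left for half racks
```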

Theoretical peak performance is eye-watering. In boost mode, the system reaches 1.07 exaflops in 32-bit single precision, 2.15 exaflops for AI training (16-bit) and 4.3 exaops for 8-bit inference, with a theoretical peak memory bandwidth of 163 petabytes/second. Peak power for this monster is about 28 MW and, no surprise, it depends on a closed-circuit water-cooling system.

COVID applications

RIKEN is working with the Japanese Ministry of Education, Culture, Sports, Science and Technology to use Fugaku on a number of projects targeting COVID. One is a project to search for new drug candidates using molecular dynamics modeling to find candidates with a high affinity for the spike proteins on the virus. They are applying this analysis to 2000 existing drug candidates.

A different analysis is looking at the molecular dynamics of the spike protein to find features which may not be experimentally detectable. This is to gain a better understanding of the mechanisms behind connection to ACE2 receptors on cell surfaces.

A third team plans to model infection in indoor environments through virus droplets, with a view to testing possible countermeasures, such as airflow control. I like this one simply because it’s an incredibly complex many-body fluidics problem. How else would you model this other than on a monster supercomputer?

Cray announces their Arm supercomputer

Fugaku isn’t the only Arm-based supercomputer on record. HPE/Cray have announced the Cray CS500, based on the Fujitsu A64FX processor. This product provides a Cray programming environment on the system. Already SUNY Stony Brook, DOE Los Alamos National Laboratory and ORNL have signed up for these systems.

No more patronizing Arm in supercomputing. They’re on the leader board and one of their customers is at the top of the leader board. I’ve heard that Cray plans to reclaim that spot next year. Wow!

You can read more about Arm’s journey in supercomputing HERE and you can learn more about Fugaku HERE.


Siemens Acquires UltraSoC to Drive Design for Silicon Lifecycle Management

by Mike Gianfagna on 07-07-2020 at 10:00 am

Some Key Executives from UltraSoC

As reported recently by Dan Nenni, Siemens has signed an agreement to acquire Cambridge, UK-based UltraSoC Technologies Ltd. We’ve all seen plenty of mergers and acquisitions in EDA.  Some transactions perform better than others. The best ones enhance an existing product or service by blending non-overlapping technologies. This one is different. The combination of two non-overlapping technologies is creating a whole new category.

The Details

The acquisition integrates UltraSoC’s embedded monitoring hardware with the Tessent product suite, a comprehensive silicon test and yield analysis solution from Mentor Graphics, now part of Siemens Digital Industries Software. I had a chance to explore the details of the deal with Brady Benware, Tessent vice president and general manager at Siemens Digital Industries Software. Brady joined Mentor almost 14 years ago and has worked on the development of the Tessent product suite the entire time. He is a great source of detail and color about this acquisition.

Brady explained that UltraSoC started off in 2009 with monitoring IP that could be embedded in an SoC. The focus was to assist with silicon debug. Around 2015, UltraSoC began to implement a change in direction. They realized that their embedded monitoring technology could be used in a much broader range of applications. Cyber security, safety, system optimization and predictive analysis were some target areas. This change in direction put UltraSoC on a path that would ultimately intersect with Siemens and their Tessent product suite.

Mentor is a leader in test and the Tessent product suite brings a lot of silicon test and yield analysis solutions together. Areas such as automotive, logic, memory and mixed signal test are covered. Silicon learning tools to address test bring-up, silicon characterization, diagnosis-driven yield analysis and failure analysis are also addressed. These tools all focus on structural verification of the design during the manufacturing phase and while it’s deployed in the field.

In some important ways, the UltraSoC product family picks up where Tessent ends. Their embedded functional monitoring and analysis technology goes beyond structural verification, which enables a wider range of capabilities for monitoring and optimization of the part in the field. Brady discussed some of these capabilities. The first thing to realize is that applications such as automotive, IoT, data center and AI are all pushing performance to the limits of the silicon. These applications also demand optimal power and performance over the lifetime of the device and safe, reliable operation in a highly secure envelope.

UltraSoC’s embedded monitoring and analysis technology addresses all these requirements in a unique, hardware-driven way. The power and performance of the device can be monitored and analyzed, allowing modifications to the operating parameters of the device to compensate for aging effects. Bus traffic can be monitored to ensure there are no out of spec packets, which can indicate a security intrusion. The data collected from this embedded technology can also be used to perform predictive analysis for maintenance and provisioning.

An embedded approach provides some real advantages for this kind of analysis. Since all monitoring is done in hardware, the process is less intrusive to system operation, freeing capacity to address mission-critical tasks. A hardware-level approach is also less prone to hacking and external interference. Brady described a hierarchical communications backbone to leverage the data and analytics. It begins with the device and extends to the local system and collections of systems. From a security and safety standpoint, this kind of structure enables the identification of a system component that isn’t behaving like other similar system components. This could be an early warning for a compromised or failing device.
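The fleet-level comparison Brady describes, flagging a device whose telemetry diverges from its peers, can be sketched as a simple outlier test. This is an illustration of the idea only: UltraSoC's actual analytics are not public, and the metric, device names, and threshold below are made up.

```python
# Sketch of fleet-level anomaly detection: flag devices whose
# monitored metric (e.g. a bus-error rate) sits far from the fleet
# mean. Data and the sigma threshold are hypothetical.
from statistics import mean, stdev

def flag_outliers(readings, n_sigma=3.0):
    """Return device IDs whose reading deviates by more than n_sigma
    sample standard deviations from the fleet mean."""
    values = list(readings.values())
    mu, sigma = mean(values), stdev(values)
    return [dev for dev, v in readings.items()
            if abs(v - mu) > n_sigma * sigma]

fleet = {"ecu-01": 0.9, "ecu-02": 1.1, "ecu-03": 1.0,
         "ecu-04": 0.95, "ecu-05": 9.0}   # ecu-05 misbehaving
print(flag_outliers(fleet, n_sigma=1.5))  # ['ecu-05']
```

An early warning like this would surface the compromised or failing device long before it fails outright, which is the point of the hierarchical monitoring backbone described above.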

The Synergy

Putting all this together creates something Siemens is calling silicon lifecycle management, and this is the new category enabled by the acquisition. Testing and validation no longer ends when the part goes into production. It is rather the beginning of a monitoring and analysis process that extends over the entire lifecycle.

To bring these added capabilities to customers quickly, the deal has already closed and integration is progressing. The two companies share a vision, complementary technology and a similar customer base. This acquisition should bring substantial new capabilities to the market.


Waking Up to the Requirements of Voice Activity Detection

by Tom Simon on 07-07-2020 at 6:00 am

Dolphin Design Waking Up

There is a famous scene in the 1976 movie Taxi Driver in which Robert De Niro’s character Travis, pretending to have a conversation while looking in the mirror, repeatedly says “Are you talking to me?”. I think about this scene every time I use a voice-activated device – Hey, are you talking to me? Yes, I am, but are you listening?

Voice command, which was the stuff of fantasy not that many years ago, has become a staple of smart products and systems. Even though many of these systems use computational processes similar to those our brains use for voice recognition, electronic systems must operate under a set of tight constraints to make their use feasible. Chief among these are power limitations and the need to maintain privacy, primarily when a conversation is not intended for the voice-operated smart device. As a result, designers must take extra care to ensure these requirements are met.

Consumers will not tolerate voice systems that send all of their conversations over the internet to the cloud for analysis and potential recording. Furthermore, it is simply too costly to transmit that much audio information. It would require too much bandwidth and power consumption. Ideally voice activated systems would largely be in sleep mode with the absolute minimum circuitry active – listening for potential voice commands.

With that in mind, Dolphin Design has developed several IPs that help systems detect valid voice input locally to start the process of interpreting voice commands. Voice activity detection (VAD) starts with the detection of a keyword that triggers overall system activation. Only once a voice and a correct keyword are detected will the entire voice recognition chain be switched on. Dolphin has a new white paper titled “Why VAD and what solution to choose?” that discusses different architectures for VAD-based systems and their relative merits.

One of the most important metrics is the detection latency for various phonemes that can come at the beginning of a command phrase. VAD systems need to reject ambient noise yet respond quickly to valid voice input. Dolphin has developed the MiWok benchmarking platform to allow designers to compare key metrics.
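Dolphin does not publish WhisperTrigger's internals, but the basic gating idea, keeping everything asleep until a frame's energy suggests speech, can be sketched with a simple RMS threshold. This is entirely illustrative: real VAD hardware uses far more robust spectral features, and the threshold below is arbitrary.

```python
# Minimal energy-based voice-activity gate, purely illustrative of the
# "wake the rest of the chain only on likely speech" idea described
# above. Real VAD IPs use more robust features than raw frame energy.
import math

def frame_rms(samples):
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def vad_gate(frames, threshold=0.1):
    """Return True for frames whose energy suggests voice activity."""
    return [frame_rms(f) > threshold for f in frames]

silence = [0.01, -0.02, 0.015, -0.01]   # low-energy frame
speech = [0.4, -0.5, 0.45, -0.38]       # high-energy frame
print(vad_gate([silence, speech]))      # [False, True]
```

Only when the gate fires would the downstream keyword detector and full recognition chain be powered up, which is where the power savings come from.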

Some systems use analog microphones, which means that most of the system can be in sleep mode, with only a small IP, such as the Dolphin WhisperTrigger, active to detect valid voice input. Other systems use digital microphones, which necessarily require more supporting circuitry, in addition to the WhisperTrigger IP, to remain in wake mode so the microphone input can be converted to a usable signal. The Dolphin white paper describes each type of system and their tradeoffs.

Regardless, their analysis shows that adding the WhisperTrigger IP to a voice activated system allows for significant power reductions, versus maintaining DSPs in an on state to analyze incoming audio data. The Dolphin WhisperTrigger IP offers extensive configurability to let designers fine tune sensitivity and performance for the specific application.

The white paper offers benchmark comparisons to help illustrate the alternatives available and their overall power consumption figures. If you don’t want the users of your system to feel like they are talking to themselves in the mirror, it might be worth reading the white paper to understand the options available for power efficient and reliable VAD system design. The white paper is located on the Dolphin Design Website for download and reading.


The Future of Chip Design with the Cadence iSpatial Flow

by Mike Gianfagna on 07-06-2020 at 10:00 am


A few months ago, I wrote about the announcement of a new digital full flow from Cadence. In that piece, I focused on the machine learning (ML) aspects of the new tool. I had covered a discussion with Cadence’s Paul Cunningham a week before that explored ML in Cadence products, so it was timely to dive into a real-world example of the strategy Paul described. Since then, I also covered a position paper from Cadence on Intelligent System Design, which provides more details on advanced technology and ML for EDA.

The new digital full flow from Cadence is called iSpatial. Beyond ML, it also features unified placement and physical optimization engines that Cadence describes as an industry first. That’s a lot of integrated functionality. Questions that come to mind include:

How does the use model for a new tool like this compare to the prior generation? 

How is the workflow different, and what are the benefits of doing things a new way? 

I had the opportunity recently to explore these questions with Vivek Mishra, corporate VP, product engineering and Kam Kittrell, senior product management group director in the Digital & Signoff Group at Cadence. I was treated to a detailed tour of the use model for iSpatial and some actual results.

Vivek started our discussion by explaining that a key benefit of a flow like this is superior forward visibility (for the front-end synthesis team). We explored this statement further.

The front-end design team needs to know the power, performance and area of a given design iteration. This information drives optimization, and before iSpatial, the front-end team needed to wait for a completed design iteration from the back-end team to know these results. That could take many days.

Instead, with the iSpatial flow, the front-end design team gets meaningful and actionable information very quickly on things like overall performance, size and power as well as details on items such as routing congestion, critical path delays and clock insertion delays. The information is also presented in a format that is familiar to the front-end design team, avoiding the need to get an interpretation of the data from the back-end team. This contributes to efficiency as well as quality of results.

So, the integrated iSpatial flow minimizes turnaround time and maximizes efficiency for design iterations. But there’s more—the flow can reduce the overall number of design iterations as well. This is one application of ML. In this case, the tool will “learn” from prior design iterations and apply that knowledge in the form of suggestions for the next design iteration. Vivek provided some examples, things like modified pin placement to avoid DRC errors or a different choice of cell library elements that will improve performance. These suggestions are provided in the form of scripts that can be run to implement the various suggestions. This technology can actually help reduce design iterations by avoiding errors, which is headline news from a schedule perspective. Cadence calls these learning and optimization techniques “ML outside”.

There’s another ML use model which applies the technology to the core algorithms to optimize the results achieved. Cadence calls this “ML inside”. I explored some examples of these techniques with Vivek as well. Delay calculation was one we discussed. This is a very iterative and time-consuming process, requiring simulation. ML can optimize this process to increase both the speed of results as well as accuracy. Synthesis mapping is another example, where the best choice for a given implementation can be “learned” to avoid additional iterations.

Kam provided some more color on “ML inside” techniques at Cadence. Consider that many EDA algorithms are iterative in nature and the starting point for those iterations can impact the time to a converged result, or even if there is convergence at all. Finding the right starting point is something of a pattern-matching problem, and ML is quite good at those kinds of tasks.

As a final point, I asked about actual results on real customer designs. Kam reminded me that some detailed statistics were shared in the original press release, an unusual level of detail for a press release actually. MediaTek reported, “… we were able to automatically and quickly train a model of our CPU core, which resulted in an improved maximum frequency along with an 80% reduction in total negative slack. This enabled 2X shorter turnaround time for final signoff design closure.”

Samsung Electronics reported, “(iSpatial) enabled us to achieve 3X faster design turnaround time by quickly iterating on RTL, constraints and floorplan while improving total power by 6%. Furthermore, Cadence’s unique ML capabilities allowed us to train a model of our design on Samsung Foundry’s 4nm EUV node, which helped us further achieve a 5% performance improvement and 5% leakage power savings.”

Kam further mentioned that on several advanced customer designs, a double-digit total negative slack (TNS) improvement, often 50 percent or more, was achieved. On these same designs, power was improved by 1 to 3.5 percent. If you consider that a design team could spend months looking for a three percent power improvement, these numbers are quite impressive. Kam also explained that design groups using older technology nodes are also seeing benefits from the new flow in terms of reduced design iterations and a more finely tuned methodology.

At this point, I felt like I had seen the future (of chip design). You can learn more about the Cadence suite of digital design and signoff products here.

 


A Compelling Application for AI in Semiconductor Manufacturing

by Tom Dillinger on 07-06-2020 at 6:00 am

AI opportunities

There have been a multitude of announcements recently relative to the incorporation of machine learning (ML) methods into EDA tool algorithms, mostly in the physical implementation flows.  For example, deterministic ML-based decision algorithms applied to cell placement and signal interconnect routing promise to expedite and optimize physical design results, without the iterative cell-swap placement and rip-up-and-reroute algorithms.  These quality-of-results and runtime improvements are noteworthy, to be sure.

Yet, there is one facet of the semiconductor industry that is (or soon will be) critically-dependent upon AI support – the metrology of semiconductor process characterization, both during initial process development/bring-up, and in-line inspection driving continuous process improvement.  (Webster’s defines metrology as “the application of measuring instruments and testing procedures to provide accurate and reliable measurements”.)  Every aspect of semiconductor processing, from lithographic design rule specifications to ongoing yield analysis, is fundamentally dependent upon accurate and reliable data for critical dimension (CD) lithographic patterning and material composition.

At the recent VLSI 2020 Symposium, Yi-hung Lin, Manager of the Advanced Metrology Engineering Group at TSMC, gave a compelling presentation on the current status of semiconductor metrology techniques, and the opportunities for AI methods to provide the necessary breakthroughs to support future process node development.  This article briefly summarizes the highlights of his talk. [1]

The figure below introduced Yi-hung’s talk, illustrating the sequence where metrology techniques are used.  There is an initial analysis of fabrication materials specifications and lithography targets during development.  Once the process transitions to manufacturing, in-line (non-destructive) inspection is implemented to ensure that variations are within the process window for high yield.  Over time, the breadth of different designs and, specifically, the introduction of the process on multiple fab lines require a focus on dimensional matching:  wafer-to-wafer, lot-to-lot, and fab line-to-fab line.

The “pre-learning” opportunities suggest that initial process bring-up metrology data could be used as the training set for AI model development, subsequently applied in production.  Ideally, the models would be used to accelerate the time to reach high-volume manufacturing.  These AI opportunities are described in more detail below.

Optical Critical Dimension (OCD) Spectroscopy
I know some members of the SemiWiki audience fondly (or, perhaps not so fondly) recall the many hours spent in the clean room looking through a Zeiss microscope at wafers, to evaluate developed photoresist layers, layer-to-layer alignment verniers, and material etch results.  At the wavelength of the microscope light source, these multiple-micrometer features were visually distinguishable – those days are long, long gone.

Yi-hung highlighted that OCD spectroscopy is still a key source of process metrology data.  It is fast, inexpensive, and non-destructive – yet, the utilization of OCD has changed in deep sub-micron nodes.  The figure below illustrates the application of optical light sources in surface metrology.

The incident (visible, or increasingly, X-ray) wavelength is provided to a 3D simulation model of the surface, which solves electromagnetic equations to predict the scattering.  These predicted results are compared to the measured spectrum, and the model is adjusted – a metrology “solution” is achieved when the measured and EM simulation results converge.
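
This measure, simulate, and adjust loop is, at its core, a parameter-fitting problem. Below is a minimal sketch of the idea in Python; the `simulate_spectrum` function is a toy analytic stand-in invented for illustration, not a real EM solver, and a brute-force search over candidate geometries stands in for the solver's iterative adjustment:

```python
import numpy as np

wavelengths = np.linspace(200.0, 800.0, 50)  # nm, illustrative range

def simulate_spectrum(cd, height):
    """Toy stand-in for the EM solver: reflectance vs. wavelength as a
    smooth function of trench CD and height (both in nm)."""
    return (0.5 + 0.3 * np.sin(wavelengths / (10.0 * cd))
                + 0.2 * np.exp(-height / wavelengths))

# The "measured" spectrum comes from a known ground-truth geometry
measured = simulate_spectrum(30.0, 60.0)

# Adjust the model until simulated and measured spectra converge; a
# brute-force search over candidate geometries stands in for the
# iterative solver adjustment described above
candidates = [(cd, h) for cd in np.arange(20.0, 40.0, 0.5)
                      for h in np.arange(40.0, 80.0, 1.0)]
errors = [float(np.sum((simulate_spectrum(cd, h) - measured) ** 2))
          for cd, h in candidates]
cd_fit, h_fit = candidates[int(np.argmin(errors))]  # recovers (30.0, 60.0)
```

A metrology "solution" is declared when the mismatch between simulated and measured spectra drops below tolerance; here that happens exactly at the ground-truth geometry.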

OCD illumination is most applicable when an appropriate (1D or 2D) “optical grating-like” pattern is used for reflective diffraction of the incident light.  However, the challenge is that current surface topographies are definitely three-dimensional, and the material measures of interest do not resemble a planar grating.  Optical X-ray scatterometry provides improved analysis accuracy with these 3D topographies, but is an extremely slow method of data gathering.

Yi-hung used the term ML-OCD, to describe how an AI model derived from other metrology techniques could provide an effective alternative to the converged EM simulation approach.  As illustrated below, the ML-OCD spectral data would serve as the input training dataset for model development, with the output target being the measures from (destructive) transmission electron microscopy (TEM), to be discussed next.
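
The ML-OCD idea can be sketched as a plain supervised regression: non-destructive spectra in, TEM-measured CDs out. The sketch below uses synthetic data with an assumed hidden linear relation (purely for illustration; real spectrum-to-CD mappings would need a richer model) and a ridge fit in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: each row is an OCD spectrum (50 wavelength bins);
# each label is the CD measured by destructive TEM at the same site (nm)
n_train, n_bins = 200, 50
hidden_w = rng.normal(size=n_bins)            # assumed linear relation
spectra = rng.normal(size=(n_train, n_bins))
cd_labels = spectra @ hidden_w + 30.0

# Fit a ridge-regularized linear model mapping spectrum -> CD
X = np.hstack([spectra, np.ones((n_train, 1))])   # bias column
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(n_bins + 1), X.T @ cd_labels)

# Predict the CD for a new, non-destructively measured spectrum
new_spectrum = rng.normal(size=n_bins)
cd_pred = np.hstack([new_spectrum, [1.0]]) @ w
cd_true = new_spectrum @ hidden_w + 30.0
```

Once trained, the model replaces both the destructive TEM measurement and the converged EM simulation for in-line use, which is the appeal of the approach.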

ML for Transmission Electron Microscopy (TEM)
TEM utilizes a focused electron beam that is directed through a very thin sample – e.g., 100nm or thinner.  The resulting (black-and-white) image provides high-magnification detail of the material cross-section, due to the much smaller electron wavelength (1000X smaller than an optical photon).

There are two areas that Yi-hung highlighted where ML techniques would be ideal for TEM images.  The first would utilize familiar image processing and classification techniques to automatically extract CD features, especially useful for “blurred” TEM images.  The second would be to serve as the training set output for ML-OCD, as mentioned above.  Yi-hung noted that one issue with the use of TEM data for ML-OCD modeling is that a large amount of TEM sample data would be required as the model output target.  (The fine resolution of the TEM image compared to the field of the incident OCD exposure exacerbates the issue.)
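
As a toy illustration of the classical image-processing baseline (which ML would extend to blurred images where a fixed threshold fails), the sketch below extracts a CD from a clean synthetic TEM intensity profile by thresholding halfway between background and feature; the profile and the 0.5 nm/pixel calibration are invented for the example:

```python
import numpy as np

NM_PER_PIXEL = 0.5   # assumed image calibration

# Toy TEM cross-section intensity profile: a dark feature on a bright
# background, 56 pixels wide (i.e., a 28 nm CD at 0.5 nm/pixel)
profile = np.ones(200)
profile[80:136] = 0.2

# Classical extraction: threshold halfway between background and feature
threshold = 0.5 * (profile.max() + profile.min())
dark = profile < threshold
edges = np.flatnonzero(np.diff(dark.astype(int)))  # edge pixel positions
left, right = edges[0] + 1, edges[1] + 1
cd_nm = (right - left) * NM_PER_PIXEL               # -> 28.0 nm
```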

ML for Scanning Electron Microscopy (SEM)
The familiar SEM images measure the intensity of secondary electrons (emitted from the outer atomic electron shell) that are produced from collisions with an incident primary electron – the greater the number of SEs generated in a local area, the brighter the SEM image.  SEMs are utilized at deep submicron nodes for (top view) line/space images, and in particular, for showing areas where lithographic and material patterning process defects are present.

ML methods could be applied to SEM images for defect identification and classification, and to assist with root cause determination by correlating the defects to specific process steps.
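
A sketch of the classification idea on made-up SEM-derived features follows; the two defect classes, their (area, eccentricity) feature values, and the k-nearest-neighbor rule are all illustrative assumptions, not anything from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (area, eccentricity) features from labeled SEM defect images:
# class 0 = line bridges (large, elongated); class 1 = particles (small, round)
bridges = rng.normal([5.0, 0.9], 0.1, size=(20, 2))
particles = rng.normal([2.0, 0.2], 0.1, size=(20, 2))
X = np.vstack([bridges, particles])
y = np.array([0] * 20 + [1] * 20)

def classify(features, k=5):
    """k-nearest-neighbor vote over the labeled defect library."""
    distances = np.linalg.norm(X - features, axis=1)
    votes = y[np.argsort(distances)[:k]]
    return int(np.bincount(votes).argmax())

label = classify(np.array([4.9, 0.85]))   # lands in the bridge cluster -> 0
```

Correlating the predicted class with the process step active when the defect appeared is then a bookkeeping exercise on top of the classifier.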

Another scanning electron technique uses a variable range of higher-energy primary electrons, which will have different landing distances from the surface, and thus, provide secondary electrons from deeper into the material.  However, an extremely large primary energy will result in the generation of both secondary electrons and X-ray photons, as illustrated below.  (Yi-hung noted that this will limit the image usability for the electron detectors used in SEM equipment, and thus limit the material depth that could be explored – either more SE sensitivity or SE plus X-ray detector resolution will be required.)   The opportunities for a (generative) machine learning network to assist with “deep SEM” image classification are great.

Summary
Yi-hung concluded his presentation with the following breakdown of metrology requirements:

  • (high-throughput) dimensional measurement:
      • OCD, X-ray spectroscopy  (poor on 3D topography)
  • (high-accuracy, destructive) reference measurement:  TEM
  • Inspection (defect identification and yield prediction):  SEM
  • In-line monitoring (high-throughput, non-destructive):
      • hybrid of OCD + X-ray, with ML-OCD in the future?

In all these cases, there are great opportunities to apply machine learning methods to the fundamental metrology requirements of advanced process development and high-volume manufacturing.  Yi-hung repeated his cautionary note that semiconductor engineering metrology currently does not have the volume of training data associated with other ML applications.  Nevertheless, he encouraged data science engineers potentially interested in these applications to contact him.  🙂

Yi-hung also added that there is a whole other metrology field to explore for potential AI applications – namely, application of the sensor data captured by individual pieces of semiconductor processing equipment, as it relates to overall manufacturing yield and throughput.  A mighty challenge, indeed.

-chipguy

 

References

[1]  Yi-hung Lin, “Metrology with Angstrom Accuracy Required by Logic IC Manufacturing – Challenges From R&D to High Volume Manufacturing and Solutions in the AI Era”, VLSI 2020 Symposium, Workshop WS2.3.

Images supplied by the VLSI Symposium on Technology & Circuits 2020.

 


Teaching AI to be Evil with Unethical Data

Teaching AI to be Evil with Unethical Data
by Matthew Rosenquist on 07-05-2020 at 2:00 pm

Teaching AI to be Evil with Unethical Data

An Artificial Intelligence (AI) system is only as good as its training. For AI Machine Learning (ML) and Deep Learning (DL) frameworks, the training data sets are a crucial element that defines how the system will operate. Feed it skewed or biased information and it will create a flawed inference engine.

MIT recently removed a dataset that has been popular with AI developers. The training set, 80 Million Tiny Images, was scraped from Google in 2008 and used to train AI software to identify objects. It consists of images labeled with descriptions. During the learning phase, an AI system ingests the dataset and ‘learns’ how to classify images. The problem is that many of the images are questionable and many of the labels are inappropriate. For example, women are described with derogatory terms, body parts are identified with offensive slang, and racial slurs are sometimes used to label images of minorities. Such training data should never be allowed.

AI developers need vast amounts of training data to train their systems. Collections are often created out of convenience, without consideration for the appropriateness of the content, copyright restrictions, compliance with licensing agreements, people’s privacy rights, or respect for society. Unfortunately, many of the available sets were haphazardly created by scraping the internet, social sites, copyrighted content, and human interactions without approval or notice.

Many of the most used training datasets have issues. A large number were created by unethically acquiring content, some contain derogatory or inflammatory information, and for others, the sample is not representative because it excludes certain groups that would benefit from inclusion.

The problem has become worse over time. Flawed datasets that were made openly available to the developer community early on became so popular that they are now considered standards. These benchmarks are used to check accuracy and performance across different AI systems and configurations.

Too few datasets are vetted for inclusiveness, accuracy, or socially acceptable content. Using such flawed records is simply unethical because the resulting systems can be racially charged, biased, and prone to promoting inequality.
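
As a minimal illustration of what vetting could look like in practice, the sketch below screens dataset labels against a blocklist before training; the records and blocklist entries are placeholders, and real vetting would of course go well beyond string matching:

```python
# Placeholder blocklist; a real one would be curated by reviewers
BLOCKLIST = {"offensive_term_a", "offensive_term_b"}

dataset = [
    ("img_001.png", "bicycle"),
    ("img_002.png", "offensive_term_a"),   # should be rejected
    ("img_003.png", "coffee mug"),
]

def vet(records, blocklist):
    """Keep only records whose labels pass the blocklist check; route
    the rest to human review rather than silently training on them."""
    kept, flagged = [], []
    for path, label in records:
        target = flagged if label.lower() in blocklist else kept
        target.append((path, label))
    return kept, flagged

kept, flagged = vet(dataset, BLOCKLIST)   # kept: 2 records, flagged: 1
```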

We cannot have good AI if the commonly used datasets create unethical systems. All files should be vetted and both the creators and product developers held responsible. Just as chefs are held accountable for the ingredients they put into their prepared dishes, so should the AI community be held responsible for allowing poor data to result in harmful AI systems.


Application-Specific Lithography: The 5nm 6-Track Cell

Application-Specific Lithography: The 5nm 6-Track Cell
by Fred Chen on 07-05-2020 at 10:00 am

Application Specific Lithography The 5nm 6 Track Cell

An update is now available here: Application-Specific Lithography: Patterning 5nm 5.5-Track Metal by DUV

The 5nm foundry node (e.g., TSMC’s) may see the introduction of 6-track cells (two double-width rails plus four minimum-width dense lines) with a minimum metal pitch in the neighborhood of 30 nm. IMEC studied a representative case as its ‘7nm’ node [1], and TSMC has published 5nm test structures that look like extended 6-track cells [2]. Even with EUV lithography, the use of highly specialized patterning techniques is expected. We consider various options here for patterning the lines of the 6-track cell (Figure 1).

Figure 1. Reference example for 5nm 6-track cell with four 14 nm metal lines spaced 14 nm apart, and two 28 nm wide rails at the upper and lower boundaries.
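
The dimensions in Figure 1 can be tallied against the 154 nm cell pitch used below, assuming the boundary rails are shared between abutting cells so that each repeating period contains one 28 nm rail, four 14 nm lines, and five 14 nm spaces (the tiling assumption is mine, not stated in the figure):

```python
rail, line, space = 28, 14, 14   # nm, widths from Figure 1

# One shared rail, four lines, and five spaces per repeating period
cell_pitch = rail + 4 * line + 5 * space   # -> 154 nm
min_metal_pitch = line + space             # "in the neighborhood of 30 nm"
```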

Single Exposure EUV

At first glance, a single exposure technique using EUV should be easiest to carry out without much yield consideration. However, EUV has many added concerns uncovered over the years such as stochastic variation [3-5]. Figure 2 shows the map of pupil sources correlated with the possible diffraction patterns for a 154 nm pitch 6-T cell (14 nm internal half-pitch, 28 nm rail width). Unfortunately, each individual diffraction pattern takes less than 20% pupil fill, leading to throughput loss for a dedicated diffraction pattern on the current NXE:3400B systems [6].

Figure 2. All possible diffraction patterns for a single EUV 0.33 NA exposure of the 6-track cell line pattern in Figure 1. Each different symbol represents a diffraction pattern produced by the corresponding EUV source point, labeled by a 7-digit string. The nth digit indicates how many -nth, nth orders are included (0, 1 or 2).
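
The seven-digit labeling is consistent with simple grating arithmetic: a ±nth order pair can land in the pupil only if its separation n·λ/pitch fits within the pupil diameter 2·NA for some source point. A quick sanity check (standard scalar diffraction reasoning on my part, not taken from the figure):

```python
wavelength = 13.5   # nm, EUV
pitch = 154.0       # nm, 6-track cell pitch
NA = 0.33           # numerical aperture, NXE:3400B

# A +/- nth order pair fits in the pupil only if its separation
# n * wavelength / pitch is no larger than the pupil diameter 2 * NA
n_max = int(2 * NA * pitch / wavelength)   # -> 7, hence the 7-digit strings
```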

When the illumination is expected to spread the photon number over at least several diffraction patterns, each source point effectively becomes noisier [5].

SAQP

A more familiar alternative, but not without its own technical challenges, is self-aligned quadruple patterning (SAQP), using immersion lithography. Conducting features are best defined between spacers to make cutting more efficient; this is also known as the “spacer-is-dielectric” (SID) approach [7]. The biggest hurdle is that the number of spacers is naturally even, so that the features defined between spacers will naturally come in odd numbers. As a workaround, some spacers may be made to disappear or merge, with a deliberately narrowed space between some starting features. Effectively, this removes one spacer to leave an odd number of spacers, with an even number of features in between. For the 154 nm pitch case being studied, the starting pattern proposed in [1] could actually be drawn as a 308 nm pitch pattern as shown in Figure 3.

Figure 3. SAQP integration for 6-track cell. Blue: starting (core or mandrel) features. Green: 1st spacer. Red: 2nd spacer. Purple: dielectric masked by 2nd spacer. Gray: metal filling in between spacers. In some schemes, the material filled at the 1st spacer locations is different from that filled at other locations between the 2nd spacers. This is to facilitate self-aligned blocking [1].

Pitch Walking

Due to the breaking of symmetry in the pattern of Figure 1, “pitch walking” is likely to occur in the patterning process. This is the effect where the spacing between some lines is decreased or increased relative to the spacing between other lines. This can occur in the lithography process itself, due to defocus, for either the EUV or SAQP options described above.

For the EUV case, the different illumination source points can produce different effects (Figure 4). This is again aggravated by stochastic sensitivity.

Figure 4. Defocus can cause pitch walking to an extent dependent on the source points for illumination. The conditions assumed for this 6-track cell example are 154 nm pitch, 13.5 nm wavelength, 0.33 NA, 50 nm defocus.

For the SAQP case, there are additional potential contributors to pitch walking from process steps following the lithography, such as spacer deposition thickness and spacer overetch. These extra conditions force tighter tolerances on defocus for the starting features, so that even 30 nm could be limiting (Figure 5).

Figure 5. For SAQP, defocus tolerance needs to be tighter to prevent pitch walking caused by post-litho process steps.

A potential mitigation of the defocus impact is to use a multi-patterning technique instead of a symmetry-breaking single exposure for the SAQP starting features. In the most brute-force case, 3 exposures, 3 etches may be used to each pattern one of the three starting features (the central 56 nm and the two side 42 nm features) within the 308 nm pitch. A more efficient way would be to use self-aligned triple patterning (SATP) [8] to define all three features with one mask (Figure 6). For this process, the lithography will maintain repeating feature symmetry.

Figure 6. SATP flow [8] using two spacers for producing the starting features for SAQP shown in Figure 3.

Other steps that can be taken to address pitch walking for SAQP include thickness optimization [9] and process control loop feedback [10]. Presumably, the same issues have been encountered for self-aligned double patterning (SADP) already, so as SADP matures, SAQP should benefit.

Self-aligned blocking or cutting

The breaks in the line tracks also need to be patterned. Within the same EUV exposure as the lines, the extra pitches in the second dimension create more diffraction patterns among which the EUV photon number will be divided [5], further aggravating the stochastic effects. Line ends are already small collection areas for photons [11], leading to extra tip-to-tip variation. The classical resolution limit for line end gaps is ~ 0.6 wavelength/NA [12], where NA is the numerical aperture of the lithography system (~25 nm for the NXE:3400 with 0.33 NA). Thus, a separate exposure for cutting the lines, or blocking the etch at some locations, is preferred.
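
Plugging the NXE:3400 numbers into the quoted limit reproduces the ~25 nm figure:

```python
wavelength = 13.5   # nm, EUV
NA = 0.33           # NXE:3400 numerical aperture

# Classical resolution limit for line-end gaps, ~0.6 * wavelength / NA [12]
min_gap = 0.6 * wavelength / NA   # ~24.5 nm, i.e., the ~25 nm quoted
```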

The self-aligned block (SAB) approach is preferred over a single exposure block or cut, due to its being more robust against overlay and edge placement errors [1,13]. However, the SAB approach necessitates the use of two masks, as two oppositely selective etches will be used for different block/cut locations. While EUV is commonly discussed for use in the SAB approach, immersion lithography can also be used, with self-aligned double patterning (SADP) as needed [14].

Summary of approaches

The pros and cons of the approaches covered above are summarized in the table below:

References

[1] J. U. Lee, S. H. Choi, Y. Sherazzi, R. R. H. Kim, “SAQP spacer merge and EUV self-aligned block decomposition at 28nm metal pitch on imec 7nm node,” Proc. SPIE 10962, 109620N (2019).

[2] G. Yeap, S. S. Lin, Y. M. Chen, H. L. Shang, P. W. Wang, H. C. Lin, Y. C. Peng, J. Y. Sheu, M. Wang, X. Chen, B. R. Yang, C. P. Lin, F. C. Yang, Y. K. Leung, D. W. Lin, C. P. Chen, K. F. Yu, D. H. Chen, C. Y. Chang, H. K. Chen, P. Hung, C. S. Hou, Y. K. Cheng, J. Chang, L. Yuan, C. K. Lin, C. C. Chen, Y. C. Yeo, M. H. Tsai, H. T. Lin, C. O. Chui, K. B. Huang, W. Chang, H. J. Lin, K. W. Chen, R. Chen, S. H. Sun, Q. Fu, H. T. Yang, H. T. Chiang, C. C. Yeh, T. L. Lee, C. H. Wang, S. L. Shue, C. W. Wu, R. Lu, W. R. Lin, J. Wu, F. Lai, Y. H. Wu, B. Z. Tien, Y. C. Huang, L. C. Lu, J. He, Y. Ku, J. Lin, M. Cao, T. S. Chang, S. M. Jang, “5nm CMOS Production Technology Platform featuring full-fledged EUV, and High-Mobility Channel FinFETs with densest 0.021 um2 SRAM cells for Mobile SOC and High Performance Computing Applications,” IEDM 2019.

[3] A. Frommhold, D. Cerbu, J. Bekaert, L. Van Look, M. Maslow, G. Rispens, E. Hendrickx, “Predicting Stochastic Defects across the Process Window,” Proc. SPIE 11147, 1114708 (2019).

[4] P. De Bisschop, E. Hendrickx, “Stochastic Printing Failures in EUV Lithography,” Proc. SPIE 10957, 109570E (2019).

[5] https://www.linkedin.com/pulse/stochastic-variation-euv-source-illumination-frederick-chen/

[6] M. van de Kerkhof, H. Jasper, L. Levasier, R. Peeters, R. van Es, J-W. Bosker, A. Zdravkov, E. Lenderink, F. Evangelista, P. Broman, B. Bilski, T. Last, “Enabling sub-10nm node lithography: presenting the NXE:3400B EUV scanner,” Proc. SPIE 10143, 101430D (2017).

[7] Y. Ban, D. Z. Pan, “Self-aligned double-patterning layout decomposition for two-dimensional random metals for sub-10-nm node design,” J. Micro/Nanolith. MEMS MOEMS 14, 011004 (2014).

[8] J-Y. Lee, J-S. Park, S-G. Woo, US Patent 7842601, assigned to Samsung, filed Apr. 20, 2006.

[9] T. Yang, D. Yim, “SAQP Pitch walking improvement pathfinding by simulation,” 41st International Symposium on Dry Process, 2019. http://www.dry-process.org/2019/poster_program.html

[10] H. Ren, A. Mani, S. Han, X. Li, X. Chen, D. Van Den Heuvel, “Advanced process control loop for SAQP pitch walk with combined lithography, deposition and etch actuators,” Proc. SPIE 11325, 1132523 (2020).

[11] https://www.linkedin.com/pulse/photon-shot-noise-impact-line-end-placement-frederick-chen/

[12] https://www.linkedin.com/pulse/lithography-resolution-limits-line-end-gaps-frederick-chen

[13] A. Raley, N. Mohanty, X. Sun, R. A. Farrell, J. T. Smith, A. Ko, A. W. Metz, P. Biolsi, A. Devilliers, “Self-Aligned Blocking Integration Demonstration for Critical sub 40nm pitch Mx Level Patterning,” Proc. SPIE 10149, 101490O (2017).

[14] E.g., see A. J. deVilliers, US Patent 9240329, assigned to Tokyo Electron Limited, filed Feb. 17, 2015.

The original article first appeared in LinkedIn Pulse: Application-Specific Lithography: The 5nm 6-Track Cell

Related Lithography Posts


What is Zero Trust Model (ZTM)

What is Zero Trust Model (ZTM)
by Ahmed Banafa on 07-05-2020 at 10:00 am

What is Zero Trust Model ZTM

The Zero Trust Model of information security simplifies how information security is conceptualized by assuming there are no longer “trusted” interfaces, applications, traffic, networks, or users. It takes the old model— “trust but verify”—and inverts it, because recent breaches have proven that when an organization trusts, it doesn’t verify [6].

This model requires that the following rules be followed [6]:

  • All resources must be accessed in a secure manner.
  • Access control must be on a need-to-know basis and strictly enforced.
  • Systems must verify and never trust.
  • All traffic must be inspected, logged, and reviewed.
  • Systems must be designed from the inside out instead of the outside in.

The zero-trust model has three key concepts:

  • Ensure all resources are accessed securely regardless of location.
  • Adopt a least privilege strategy and strictly enforce access control.
  • Inspect and log all traffic.
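
As a concrete illustration, the sketch below encodes the three concepts in a toy authorization check; the policy structure, roles, and resource names are invented for the example and do not come from any particular product:

```python
from dataclasses import dataclass

# Need-to-know policy: which roles may access which resources
POLICY = {
    "payroll-db": {"hr-admin"},
    "build-server": {"developer", "release-eng"},
}

audit_log = []   # every request is logged, allowed or not

@dataclass
class AccessRequest:
    user: str
    role: str
    resource: str
    authenticated: bool   # identity verified for this request, never assumed

def authorize(req: AccessRequest) -> bool:
    """Verify and never trust: check identity and least-privilege policy
    on every request, and log all traffic for review."""
    allowed = req.authenticated and req.role in POLICY.get(req.resource, set())
    audit_log.append((req.user, req.resource, allowed))
    return allowed

ok = authorize(AccessRequest("alice", "hr-admin", "payroll-db", True))
blocked = authorize(AccessRequest("bob", "developer", "payroll-db", True))
```

Note that bob's request is denied even though he is authenticated and inside the network: location and identity alone grant nothing without a matching need-to-know policy entry.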

“Outside-In” to “Inside-Out” Attacks
According to a Forrester Research report, information security professionals should readjust some widely held views on how to combat cyber risks. Security professionals emphasize strengthening the network perimeter, the report states, but evolving threats—such as increasing misuse of employee passwords and targeted attacks—mean executives need to start buffering internal networks. In the zero-trust security model, companies should also analyze employee access and internal network traffic. One major recommendation of the Forrester report is for companies to grant minimal employee access privileges. It also emphasizes the importance of log analysis; another recommendation is for increased use of tools that inspect the actual content, or data “packets,” of internal traffic [1].

Teams within enterprises, with and without the support of information technology management, are embracing new technologies in the constant quest to improve business and personal effectiveness and efficiency. These technologies include virtualization; cloud computing; converged data, voice, and video networks; Web 2.0 applications; social networking; smartphones; and tablets. In addition, the percentage of remote and mobile workers in organizations continues to increase and reduce the value of physical perimeter controls [2].

The primary vector of attackers has shifted from “outside-in” to “inside-out.” Formerly, the primary attack vector was to directly penetrate the enterprise at the network level through open ports and to exploit operating system vulnerabilities. We call this attack methodology “outside-in.” In “inside-out” attacks, the user inside the “protected” network reaching out to an external website can be just as vulnerable as the user accessing the Internet from home [5].

Zero Trust Recommendations:

  • Update network security with next-generation firewalls.
  • Use a “sandbox” control to detect unknown threats in files.
  • Establish protected enclaves to control user access to applications and resources.
  • Use a specialized anti-phishing email protection service.
  • Use threat intelligence to prioritize vulnerability remediation.
  • Analyze logs using advanced machine learning algorithms to detect compromised and malicious users.
  • Implement an incident management system to minimize the impact of individual incidents.
  • Deploy a cloud services manager to discover, analyze, and control shadow IT. (Shadow IT is hardware or software within an enterprise that is not supported by the organization’s central IT department.)
  • Monitor your partners’ security postures using a cloud-based service.
  • Deploy an enterprise key & certificate management system.
  • Deploy a backup, cloud-based DDoS mitigation service.
  • Deploy a non-signature-based endpoint malware detection control.
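
As a minimal illustration of the log-analysis recommendation above, the sketch below flags users whose failed-login counts sit far outside the population norm using a simple z-score; the counts and the 2-sigma threshold are illustrative, and real deployments would use far richer models than this:

```python
import statistics

# Failed-login counts per user over the last hour (illustrative data)
failures = {"u1": 1, "u2": 0, "u3": 2, "u4": 1, "u5": 0,
            "u6": 1, "u7": 2, "u8": 1, "mallory": 50}

counts = list(failures.values())
mean = statistics.mean(counts)
stdev = statistics.pstdev(counts)

# Flag users whose activity sits far outside the population norm
suspects = [user for user, count in failures.items()
            if stdev > 0 and (count - mean) / stdev > 2.0]
```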

Just remember: the zero-trust model of information security means “verify and never trust.”

Ahmed Banafa, Author of the Books:
Secure and Smart Internet of Things (IoT) Using Blockchain and AI

Blockchain Technology and Applications

Read more articles at: https://medium.com/@banafa

References:

[1] http://www.securitymanagement.com/article/zero-trust-model-007894

[2] http://www.securityweek.com/steps-implementing-zero-trust-network

[3] http://spyders.ca/reduce-risk-by-adopting-a-zero-trust-modelapproach-to-security/

[4] http://www.cymbel.com/zero-trust-recommendations/

[5] http://csrc.nist.gov/cyberframework/rfi_comments/040813_forrester_research.pdf

[6] https://go.forrester.com/research/