Integrating Materials Solutions with Alex Yoon of Intermolecular
by Daniel Nenni on 03-04-2022 at 6:00 am

I had a follow-on discussion with Alex Yoon from our podcast last year. He is Head of Strategic and Emerging Technologies and Partnerships at Intermolecular, part of EMD Electronics.

Prior to joining EMD Electronics, he was a Senior Technical Director at Lam Research, where he led activities in emerging memory and novel materials in the New Product Development, Corporate Technology Development and Advanced Technology Development groups. Prior to Lam, he was the technology manager for WCVD/WALD at Applied Materials. He holds a BS in Chemistry/Materials Science from UCLA and a PhD in Chemistry from UC Berkeley.

In our podcast we discussed semiconductor materials innovations. This time we are talking about integrating materials solutions.

What is the rationale for integrating materials solutions? Why now?
The semiconductor industry is facing many challenges and opportunities – all of which demand faster product innovation. Integrating the right materials solutions helps enable differentiation.

Point solutions may not solve present-day device, integration and process technology needs in an efficient, rapid and complete manner. Integrated materials solutions aim to solve complex integration challenges through collaboration and co-optimization of individual solutions, yielding better, faster results.

The severe reduction in design, device and process margins in present-day semiconductors creates that need now.

How did customers solve these problems before integrated materials solutions were available? Why is that approach not completely viable?
Previously these problems were solved sequentially, especially when materials suppliers were involved. For example, first develop the precursors, the deposition technique and the resulting thin film. Then develop the slurry, pad and CMP process. This sequential process is inefficient and could potentially lead to unnecessary/inadequate optimization especially when integration considerations are missed during optimization of just one material.

What are the top 3 benefits to adopting integrated materials solutions approach?
Better convergence on integration-driven needs, better and faster materials solutions, and tackling integration issues early in the process all help achieve higher-order key performance indicators. All of this leads to faster product development and innovation for customers.

How do the previous approach and the proposed integrated solutions approach compare? What are the benefits and risks?
Traditionally, the approach was siloed between the different processing steps. This allows deep technical experts to push the technology further; the downside, however, is that the solutions only come together in the integration team. With integrated solutions, we leverage the position of EMD Electronics with all the various in-house materials, using our understanding of the materials and the ability to quickly iterate, so that we keep the deep technical experts involved while breaking up the silos between the different disciplines.

The benefits of this approach are that we can offer better, faster solutions and better convergence on customers' integration needs. On the other hand, the risk is that this model changes the engagement model between chip makers/foundries and materials suppliers. It requires transparent, trust-based interactions to understand capabilities, and customers may be somewhat reluctant to share their insights.

What is an example where you can co-optimize multiple steps towards an integrated solution? 
We have multiple examples of co-optimizing multiple steps, such as deposition of a thin film and its planarization, deposition materials and etch gases, and atomic layer deposition (ALD) and atomic layer etch (ALE) processes.

How do customers engage with you specifically to take advantage of this integrated approach?  How long is the engagement?
Customers highlight a process technology need, including a discussion of the integration challenges and the specifications for individual process steps. EMD Electronics will then assess and respond with specific options, actions and timelines. Once we mutually align on a Statement of Work or Joint Development Agreement, we collaborate via frequent, regular meetings to realize the integrated materials solution. The engagement length depends on the scope and deliverables, but is typically around six months for a simpler project and longer for one that is more complex or involves new materials development.

What is the future for integrated materials solutions? Do you expect to go beyond the process module level?
Integrated materials solutions are not limited to any specific level of abstraction. Currently we are focusing on integrated materials solutions based on device and process integration needs, and we see that our customers find value in this. We will listen carefully to feedback from our customers and use our core strengths in materials innovation to best serve them.

Thank you Alex! 

Also read:

Podcast EP42: Semiconductor Materials Innovations

Ferroelectric Hafnia-based Materials for Neuromorphic ICs

Webinar: Rapid Exploration of Advanced Materials (for Ferroelectric Memory)


Intel Evolution of Transistor Innovation
by Daniel Nenni on 03-03-2022 at 10:00 am

Intel Transistor Innovations 1971

Intel recently released an exceptional video providing an insightful chronology of MOS transistor technology.  Evolution of Transistor Innovation is a five-minute audiovisual adventure, spanning 50 years of Moore’s Law.  Some of the highlights are summarized below, with a few screen shot captures – the full video is definitely worth viewing.

The speaker is 16+ year Intel veteran Marisa Ahmed, a Member of the Technology Leadership Marketing Team. Marisa is responsible for building technology marketing strategies and activities in support of Intel’s process, packaging and manufacturing capabilities.

1971

The figure above establishes a baseline for the MOS field effect transistor, circa 1971.

(Note the additional supplemental info provided with the transistor cross-sections that follow – e.g., the total number of transistors released;  the number of metal layers for the process generation;  the exposing wavelength for lithographic patterning;  the wafer size;  and, the related Intel product families.)

Polycide and Salicide: 1979-81

With ongoing Dennard scaling of the device gate length, the sheet resistivity of the polysilicon gate material was increasing.  Similarly, the transistor drain/source series resistance (Rs, Rd) was increasing.  The contact resistance (Rc) to the metal layer was also increasing, due to the scaling of the S/D junction depth.  To address these problematic parasitics, a process innovation emerged to create a silicide.  A refractory metal such as Titanium was deposited and alloyed at elevated temperature with the exposed silicon.  (Salicide is a composite term for “self-aligned silicide” – the deposited metal does not react with the adjacent dielectric materials.)

STI:  1995  

The device electrical isolation and surface topography underwent a significant change, in the transition from local oxidation of silicon (LoCoS) to shallow trench isolation (STI).

LoCoS was a process method where the field oxide isolation between devices was formed by patterning a hard mask over the device area and exposing the field to an oxidation environment.  Oxygen would diffuse from the high-temperature environment through the growing field oxide layer to react with the silicon crystal at the oxide-substrate interface.  The resulting oxide profile was a tapered (“bird’s beak”) surface topography, better for metal traversal between devices.

To facilitate further scaling, a new process for field oxide separation was introduced.  STI leveraged major improvements in anisotropic dry etching technology (with near vertical sidewalls) combined with chemical vapor deposition of dielectric materials.

Aluminum → Copper

A watershed (non-device) process enhancement in the late 1990s was the transition from Aluminum metallization to Copper.  Dennard scaling continued to enable greater device current and lower device capacitances.  This era was marked by the transition from gate fanout load-dominated circuit delay to significant contributions from the R*C interconnect delay from the driving gate output to the fanouts.  The need for interconnects with improved resistivity and electromigration robustness necessitated the transition from Al to Cu.

Concurrent with that material transition, a major shift in interconnect patterning was required.  Aluminum as the primary interconnect involved a rather straightforward deposition, lithography, and subtractive removal process flow.  Due to the difficult chemistry associated with dry etching of copper – e.g., corrosive gases, few volatile copper-based reaction products to pump out – a damascene patterning method was required.  The dielectric to surround the metal was deposited, a trench was etched in the dielectric (and interlevel dielectric below for the vias), then copper was deposited in the trench through electroplating.

In addition to the additive damascene process replacing the subtractive Al etch method, it was also necessary to evolve the chemical-mechanical polishing (CMP) process step.  The wafer surface with the deposited Cu is placed face down onto a polishing pad, which rotates at a low speed.  A rotating piston at higher RPM provides an appropriate downward force on the wafer (in N/cm²), and a slurry is introduced onto the pad.  The slurry consists of both a chemical solution and a fine grit.  The chemical is intended to selectively react with the material to be removed – Cu, in this case – while the mechanical polish removes the result of the reaction.  An extremely flat surface topography is produced.  As shown in the figure above, as well as the succeeding figures, CMP has enabled a much-needed increase in the number of metal layers available for interconnecting the scaled circuit density.

Gate and gate oxide enhancements


Device evolution encountered issues with continued scaling of the gate oxide thickness.  The influence of the input gate electric field on the device channel requires scaling the gate oxide capacitance:  Cg ~ (K*ε0)/t, where K is the relative dielectric constant, ε0 is the permittivity of free space, and t is the gate oxide thickness.  As the gate oxide became thinner, gate tunneling current through the device input increased.  To equivalently increase Cg without the issues of reducing the thickness, alternative high-K dielectric materials replaced SiO2 for the gate oxide.
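
To make the high-K benefit concrete, the relation below restates the gate capacitance expression and the resulting equivalent oxide thickness (EOT); the numerical example is illustrative only and is not taken from Intel process data.

```latex
% Gate oxide capacitance per unit area and equivalent oxide thickness (EOT)
C_{g} \approx \frac{\kappa\,\varepsilon_{0}}{t}
\qquad\Longrightarrow\qquad
t_{\mathrm{EOT}} = t_{\mathrm{high\text{-}K}}\cdot\frac{\kappa_{\mathrm{SiO_2}}}{\kappa_{\mathrm{high\text{-}K}}}
% Illustrative numbers (not process data): a 3 nm hafnium-based film with kappa ~ 20
% gives t_EOT = 3 nm x (3.9 / 20) ~ 0.6 nm, i.e. the capacitance of ~0.6 nm of SiO2,
% while remaining physically thick enough to suppress gate tunneling current.
```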

Scaling the traditional polycrystalline silicon gate material was resulting in higher resistivity and greater sensitivity to non-uniformity in the polySi grain size, distribution, and impurity concentration.  A replacement metal gate process step was introduced, displacing polySi as the gate material.  (For more info on this rather difficult step, do a follow-on search for high-K, metal gate “HKMG gate-first versus gate-last” process;  the term replacement in the figure above refers to a gate-last flow.)

FinFET:  2011

Intel amazed the industry with its aggressive adoption of a new transistor topology at the 22nm process node – the FinFET (also known as the “tri-gate FET”).

The traditional planar S/D channel topology had an increasing issue with (sub-channel) leakage current between source and drain when the device was “off”.  To reduce the sub-threshold leakage, a device topology was required where the gate input provided greater electrostatic control over the channel.  The vertical channel “fin” has the input gate traversing over the sidewalls and top.  In the figure above, a single gate input traverses over three silicon fins connected in parallel – the channel current flows through the vertical fins.  The thickness of the fin is sufficiently small that the gate input electric field reduces the sub-threshold leakage substantially, which has enabled much greater battery life for laptop and mobile electronics.

Gate-All-Around (GAA) Ribbon FET:  Intel 20A in 2024

To further improve the electrostatic gate control over the channel, another major evolution in the transistor topology is emerging to replace the FinFET.  A gate-all-around configuration involves a vertical stack of electrically isolated silicon channels.  The gate dielectric and gate input utilize an atomic layer deposition (ALD) process flow to surround all channel surfaces in the stack.

Intel will be releasing their GAA Ribbon FET 20A process in 1H 2024.

Summary

The evolution of the field effect transistor over the past 50 years is rather amazing.  The figure below illustrates this progress, with the devices drawn to scale.

This evolution was enabled by the innovative ideas and hard work of research and development teams throughout the industry, with expertise ranging from materials science to chemistry to optical lithography to the physics of deposition/etch process steps.  Rather incredibly, this progress shows no signs of stopping anytime soon.

Also read:

Intel 2022 Investor Meeting

The Intel Foundry Ecosystem Explained

Intel Discusses Scaling Innovations at IEDM

 


Passion for Innovation – an Interview with Samtec’s Keith Guetig
by Mike Gianfagna on 03-03-2022 at 6:00 am

Keith Guetig

There’s a lot to be said for staying power. I’ve met many people in my career who simply resonate with a company – their products, culture, and direction. People like this build an organic knowledge of the company’s products and its customers. They contribute to a culture that helps enable great companies. Keith Guetig is one such person at Samtec. He joined the company 26 years ago as an engineer, and today he directs the new product roadmap. I had the opportunity to interview Keith recently about his journey at Samtec. Read on to learn what a passion for innovation can accomplish.

What drew you to Samtec? 

Samtec has a service-driven culture combined with a passion for innovation and new technology.  These are desirable attributes for a young engineer looking to kick off a career.

How long have you been there, and how has the journey been so far?

I began working at Samtec as an engineering co-op in my fall semester of 1995.  Therefore, I’m homegrown.  I’ve now been with Samtec for 26 years.  My career has included three phases.  I began in manufacturing engineering. This portion of my career included automation design, which I found fascinating.  My second phase was technical application engineering for our newest, most advanced products.  Both hands-on and directing a growing team.  And today it’s Product Management leadership.  Our future growth is dependent on correctly setting the new product directional compass today.

With regard to high-performance cables, what are the trends between traditional and optical technologies?

The trend isn’t pointing toward a “winner” between Copper and Optics High Performance Cables. The trend is more of both.  And the applications and industries that drive the demand – AI/ML, Data Center, ASIC evaluation and development, Automotive and Transportation, Embedded Computing, etc. – are growing and diversifying, which is a good thing.

Do certain applications drive one type of technology vs. the other?

The easy answer here is the required reach of the channel (advantage: optical fiber, for long reach).  But others include size and flexibility within a bundle (advantage: optical fiber), thermals (advantage: copper), and cost (advantage: copper, usually).

Over the past decade, we’ve seen explosive growth in design complexity at the chip level. Have you seen similar trends for high performance channels?

Certainly yes.  A second-order effect of this is the challenge of signal loss when routing away from the chip package.  This challenge creates an inflection point in our market: the use of inside-the-rack, high-speed, ultra-low-skew twinax copper cable assemblies like Samtec Flyover systems to carry traffic that was previously handled in the PCB.  This enables a full rethink of legacy architectures; this is the story.

What types of requirements have seen the most demand for improvement?

As signal integrity challenges increase for chip package substrates and PCBs, it translates to less allowable performance margin for connectors and cable.  There’s also a growing demand for one hundred percent signal integrity characterization of finished products, right on the line, not in the lab. This is a big challenge in the connectivity industry, although we already have this capability at Samtec thanks to forward planning many years ago.

Simulation models for channel components seem quite important. What is Samtec’s view of this?

We utilize Ansys HFSS for all our new product development; the correlation with measured signal integrity performance is highly impressive.  The bottom line is this: We can iterate a connector/cable design 100+ times before we ever tool anything up.  Therefore, we precisely tune the design to exactly meet the target performance attributes.  Our customers love this.

Looking out five years, what will be the most impactful new design requirements that Samtec will address?

It’s a bundle of applications, requirements and solutions that overlap.  Some of these include:

  • Connectivity interfacing directly to the chip package
  • Advancements in low loss and phase stability cable plus material advancements that further mitigate crosstalk, all enabling 224G and beyond
  • Flexible mmWave Waveguides
  • Improved board-to-board power density
  • Backplane transition to CablePlane

That’s a short story of what a passion for innovation means at Samtec. You can learn more about this unique company at https://www.samtec.com. You can also access insightful background on how Samtec helps semiconductor and system design here.


Sondrel explains the 10 steps to model and design a complex SoC
by Daniel Nenni on 03-02-2022 at 10:00 am

Sondrel just released a position paper on how to model and design a complex ASIC. We have been following Sondrel for the past year and I have found their collateral to be excellent. Here is the position paper overview, a description of the new Sondrel modeling tool, the 10 steps, and of course a link to download the paper:

Overview
It is important to model an SoC well in advance to avoid costly overdesign or insufficient performance, and to create a hardware emulation on which representative end-user applications can be run. Detailed architectural modelling provides reasonable estimates of the performance, power, memory resources, and the NoC (Network on Chip) configuration that will be required, along with an indicative size of the die and what it is likely to cost. With this information, a customer can decide whether to proceed with the design, whether it needs to be adjusted, or even whether to cancel it. Sondrel™ has created unique, proprietary modelling flow software, initially for use with Arm® and Synopsys® tools, that dramatically reduces the time to do this from months to a few days, which Sondrel claims to be an industry first for a services organisation. This article discusses how modelling is used in the ten steps of modelling and designing a complex SoC architecture.

Sondrel’s new tool
Modelling tools are available as standard items from leading vendors but what Sondrel does is to wrap the vendor’s offerings with its own custom flow. The vendor’s tools are limited in terms of automation and ways that they can be adjusted but Sondrel’s new modelling flow tool adds a framework with a much greater number of settings that can be tweaked by the Sondrel Systems Architect who is working on the project. This is added using hooks into the vendor’s software that are provided for this very purpose. Typically, users create customisation wrappers that are specific to the designs that they work on if not already present in a library of an ever-growing number of such wrappers. However, because Sondrel works on a wide variety of projects for a plethora of customers, it has defined a methodology and flows that are unique and broader in scope so that they can be used for almost any architectural exploration project.

The biggest benefit of the modelling flow’s dramatic reduction in the time it takes to create a model and run simulations is that Sondrel can provide customers with data on the likely performance of a proposed ASIC in a matter of a few days, to determine whether the proposed architecture gives an appropriate set of numbers. If not, it is very easy and quick to run variants of the model simply by changing the settings of the existing model, to decide which is the best one for the customer’s application use case. Running each variation takes anywhere between a few minutes and an hour, so the whole process of model creation and running variants can still be done in a few days.

For comparison, converging on a candidate architecture without Sondrel’s modelling flow tool would rely heavily on static spreadsheet modelling which would take several weeks and then each variant of the model to evaluate different architectures would each take weeks as each variant model would have to be created from scratch. Overall, that could total a number of months.

 

The 10 steps are:

  1. Determines what the data is and what the I/O constraints are, such as burstiness, latency, timing, and data formatting, to decide on the buffer requirements, which are captured in a spreadsheet.
  2. Breaks the processing down into sub-tasks and groups parts of the SoC into common pieces of functionality.
  3. Identifies what third party IP blocks will be required to perform the steps of an algorithm and how much memory and compute power they require from their datasheets that can be fed into the modelling environment to give a more accurate representation of what all the IP blocks will be doing.
  4. Covers the method of exchanging data between parts of an algorithm, such as on-chip SRAM or external DDR memory, as well as FIFOs, which are small areas of on-chip memory. The decision between SRAM and DDR depends on the size of the data and how often it needs to be accessed, with large pieces of data going to external memory and small pieces of data to SRAM or FIFO.
  5. Creates a software representation of the different stages, linking the conceptual view of the algorithm to the actual simulation objects that correspond to the algorithm’s software stages. These require settings such as latency and processing cycles, and are joined by objects known as channels that indicate the sequencing.
  6. Having constructed all the simulation objects for the full algorithm, simulations can be run to see if the right sequencing of the algorithm has been captured.
  7. Uses models of the hardware platform with VPUs (Virtual Processor Units) that will run the software of step 5, each with its own local memory. Here the interface timing can be considered and communication domains defined with their assigned channels and evaluated. It also enables the configuration of the VPUs to be verified as correct.
  8. Takes the memory available to each VPU and remodels it as being connected to external memory via a common memory controller. This gives a more accurate representation of the connectivity of all the VPUs and memories in the final system.
  9. Adds the interconnect fabric. Instead of the direct connections between the VPUs and the memory controller, these are replaced by the interconnect fabric and the effects on the timing and performance evaluated. The interconnect fabric is then adjusted to meet the performance required, with previous stages being redone to achieve the required results.
  10. This is a good working model so, by simply adjusting settings, various simulations can be run to identify bottlenecks, what constraints there are in the system, and which parameters should be adjusted to improve the throughput and reduce the latency of the SoC. These take a few minutes to an hour to run so that it is straightforward and quick to test variants.

The first four steps can be done on paper or on a spreadsheet by calculation to understand the input/output dataflows into the SoC and what their characteristics are. The last six steps are simulation-based where software models are constructed and simulations run to generate results that inform about the system.
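
As a simple illustration of the kind of spreadsheet-level arithmetic behind the first four steps, the sketch below estimates a buffer requirement from an assumed input data rate, burst size and worst-case drain latency. The function and numbers are hypothetical and only show the style of calculation, not Sondrel's actual flow.

```python
# Hypothetical sketch of a step 1-4 style calculation: estimating an on-chip
# buffer requirement from I/O characteristics. Not Sondrel's tool, just the
# kind of arithmetic a spreadsheet model captures.

def buffer_bytes(data_rate_gbps: float, drain_latency_us: float,
                 burst_bytes: int) -> int:
    """Worst-case buffer = data arriving while the consumer is stalled,
    plus one full burst of slack."""
    bytes_per_us = data_rate_gbps * 1e9 / 8 / 1e6   # Gb/s -> bytes per microsecond
    in_flight = bytes_per_us * drain_latency_us      # data accumulated during the stall
    return int(in_flight + burst_bytes)

# Illustrative numbers only: a 16 Gb/s sensor stream, 5 us worst-case DDR latency
# and 4 KB bursts suggest a buffer of roughly 14 KB.
if __name__ == "__main__":
    print(buffer_bytes(16.0, 5.0, 4096))
```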

An article covering this in more depth is available by clicking here.

About Sondrel
Founded in 2002, Sondrel is the trusted partner of choice for handling every stage of an IC’s creation. Its award-winning, define and design ASIC consulting capability is fully complemented by its turnkey services to transform designs into tested, volume-packaged silicon chips. This single point of contact for the entire supply chain process ensures low risk and faster times to market. Headquartered in the UK, Sondrel supports customers around the world via its offices in China, India, Morocco and North America. For more information, visit www.sondrel.com

Also read:

Build a Sophisticated Edge Processing ASIC FAST and EASY with Sondrel

Sondrel Creates a Unique Modelling Flow to Ensure Your ASIC Hits the Target

Get a Jump-Start on Your Next IoT Design with Sondrel’s SFA 100

 


Emergency Response Getting Sexy
by Roger C. Lanctot on 03-02-2022 at 6:00 am

For 20 years the concept of emergency response has been one of the most tired, uninteresting sectors of the automotive industry. General Motors introduced OnStar automatic crash notification 26 years ago and that application has long been considered the end of the story.

The shutoff of 3G wireless networks and the resulting loss of automatic crash notification built into cars from multiple brands has suddenly changed everything. Emergency response is now a red hot topic – and auto makers are scrambling to respond. (A similar scenario is unfolding in Europe where automatic crash notification – called “eCall” – was mandated beginning four years ago.)

Of course, OnStar was not the end of the story. The onset of the smartphone introduced the prospect of smartphone-based emergency calling while simultaneously negating the perceived need for “built-in” automatic crash detection and notification.

More importantly, the arrival of the smartphone removed the “fear factor” marketing methods in support of a built-in system that was capable of summoning emergency assistance in the event of a crash (after detecting an airbag deployment). In fact, the arrival of the smartphone introduced smartphone-based usage-based insurance and indirect driver distraction detection.

The introduction of smartphone-based usage-based insurance also brought forth the use of smartphones for insurance claims management including the uploading of pictures from crash scenes and the use of artificial intelligence to streamline the claims process. The latest innovation, though, is the detection of crashes by the smartphone further eliminating the need for the built-in system, or so it seems.

The vision of an entirely smartphone-based crash detection and claims management solution has insurers drooling with anticipation. This is especially true in view of the post-pandemic explosion in car crashes and highway fatalities.

More car crashes mean big bucks and potential market share increases for car insurers in an otherwise mature and slowly evolving sector. Nothing short of a revolution is set to sweep the industry as insurers target smartphone-centric insurance capable of serving consumers at home, in their cars, and anywhere on the go.

One of the latest developments is the partnership of ADT and Agero to deliver the SoSecure application for smartphone-based emergency response suitable to all circumstances. There is no question that this concept will be leveraged across the insurance sector as soon as the underwriters can eliminate all foreseeable liability drawbacks.

Apple and Google are already on board. Android-based phones are already capable of detecting car crashes via the Android Auto platform. Apple’s own CarPlay automotive platform can be expected to follow quickly – while Apple is already advertising the automated emergency calling now available on its phones.

It’s not completely clear that either Apple or Google are interested in taking on the insurance opportunity in its entirety. Google already dipped its toe in once, and quickly withdrew.

What is clear is that auto makers are at risk of being excluded from a critical customer bonding experience if built-in systems are not enhanced and advanced to compete with these handheld invaders. Smartphone-based emergency response is an important tool, but it diminishes the value of built-in systems capable of transmitting vital information to first responders including the identity and condition of drivers and passengers, the severity and nature of the crash, and so much more.

The sudden industry-wide recognition of the importance of built-in automatic crash notification in cars has been highlighted by the shutdown, this month, of AT&T’s 3G wireless network. Millions of cars were sold in the U.S. with built-in 3G connections for emergency crash notification – including cars from luxury makes such as Audi, Mercedes-Benz, BMW, and Lexus. Those systems will be instantly disabled when those networks are shut off.

With crashes and fatalities on the rise in the U.S., the importance of a rapid response to a crash scene with accurate location, severity, and driver condition information is more important than ever. Do drivers and passengers really want to rely on a smartphone-based solution? Do auto makers really want to abdicate their responsibility for customer care?

Let’s remember: The interest of auto makers in providing a robust built-in emergency response system is not founded on altruism. A reliable on-board emergency response system, a la OnStar, serves the auto maker’s vital brand building and customer retention needs. Thousands of lives are at stake, but also, billions of dollars and all-important market share.

I’ve said it before and I’ll say it here again: A customer in a car crash is at a low point of customer satisfaction and a high point of customer defection. The industry can’t afford to leave those customers in the lurch or, worse, in the hands of Apple, Google, State Farm, Allstate, Geico, or Progressive.

Also read:

Waymo Collides with Transparency

Apple and OnStar: Privacy vs. Emergency Response

Musk: Colossus of Roads, with Achilles’ Heel


Using a GPU to Speed Up PCB Layout Editing
by Daniel Payne on 03-01-2022 at 10:00 am

I can remember back in the 1980s how Apollo workstations were quite popular, because they accelerated the graphics display time for EDA tools much better than competitive hardware. Fast forward to 2022 and we have the same promise of speeding up EDA tools like PCB layout editing by using a GPU. At the 58th DAC there was a session called Accelerating EDA Algorithms with GPUs and Machine Learning, where Patrick Bernard and Anton Kryukov of Cadence presented.

The Cadence PCB layout tool is called Allegro, and they added support to detect an Nvidia GPU to speed up rendering, something that benefits projects with large designs, like hundreds of millions of graphical objects and up to 200 layers. Just take a look at this 3D example from a small portion of a modern PCB to get an idea of the density of objects:

3D PCB Layout

Every time a PCB layout designer performs a pan, zoom or fit operation, each object is re-rendered, which takes time because the CPU must recalculate the geometries. To speed up rendering times, Cadence developers cache geometry in GPU memory, minimizing the calculations required.
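
The sketch below illustrates the general cache-on-first-use pattern described here: geometry is tessellated once, kept in a GPU-resident buffer, and reused on subsequent pan/zoom redraws. The classes and data are invented stand-ins and do not represent the Allegro implementation.

```python
# Hypothetical sketch of the caching pattern described above: tessellate each
# object once, keep the result in a "GPU-resident" buffer, and reuse it on
# every pan/zoom redraw. Stand-ins only, not the Allegro implementation.

class PadStack:
    """Toy layout object; tessellate() stands in for the expensive CPU geometry work."""
    def __init__(self, uid, outline):
        self.uid, self.outline, self.dirty = uid, outline, True

    def tessellate(self):
        return [(x * 2, y * 2) for x, y in self.outline]   # placeholder computation


class GeometryCache:
    def __init__(self):
        self.buffers = {}   # object id -> cached vertex data (stand-in for a GPU buffer)

    def buffer_for(self, obj):
        """Tessellate only on first use or after an edit; otherwise reuse the cache."""
        if obj.uid not in self.buffers or obj.dirty:
            self.buffers[obj.uid] = obj.tessellate()
            obj.dirty = False
        return self.buffers[obj.uid]


def redraw(objects, cache, zoom):
    """A pan/zoom changes only the view transform; cached geometry is reused as-is."""
    return [[(x * zoom, y * zoom) for x, y in cache.buffer_for(o)] for o in objects]


pads = [PadStack(i, [(0, 0), (1, 0), (1, 1)]) for i in range(3)]
cache = GeometryCache()
redraw(pads, cache, zoom=1.0)    # first redraw pays the tessellation cost
redraw(pads, cache, zoom=2.5)    # subsequent pan/zoom reuses cached geometry
```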

Anton went into some of the details of how Allegro uses the Nvidia GPU boards to accelerate rendering times, using a Scene Graph (SG) data structure. A PCB has many layers, each shown in a different color below:

PCB Layers

Accelerated rendering is done through a pipeline of several internal steps:

  • Allegro – incremental changes
  • Scene Mapper
  • Abstract Interface
  • NV Plugin – renderer
  • NV Plugin – QWindow + OpenGL context
  • Create and place rendering window

An example of how fast this GPU-based acceleration operates was shown with a 15 layer PCB design with 32,423 stroked paths, and Allegro had frame rates from 144 fps up to a whopping 349 fps, depending on the zoom level.

PCB NV Path Rendering

Even the text layers have acceleration for TrueType fonts with NV Path rendering. A technique called Frame Buffer Optimization (FBO) was also applied that understands the difference between a static and a dynamic scene.

Results

Patrick shared that GPU results often rendered instantly when using an Nvidia Quadro P2000 card, compared to a few seconds with the old graphics engine. The quality of zooming into graphics was also much improved:

Quality improvements

With the old graphics approach there was a filter that identified objects smaller than 5-8 pixels and simply didn’t show them at all. With the new GPU approach every single object is rendered and there is no filtering, so there are fewer visual surprises for the designer when looking at their high-resolution monitors.

The Allegro tool ships with a demo board, and Patrick loaded that design and began to pan and zoom all around the board, with very little time spent waiting for all of the layers to render. The text was always crisp, and all objects were turned on.

Demo PCB

You can expect the GPU-based acceleration to be applied to future PCB challenges, like:

  • Shape engine
  • Design Rules Checker
  • Manufacturing output
  • Simulations

Allegro automatically detects if your workstation is using one of the popular Quadro series of GPUs (P, GP, GV, T, RTX) or the Tesla (P, V, T), so you just enjoy faster productivity.

Summary

Over the years in EDA I’ve watched CPU performance improve, cloud computing emerge, and GPU acceleration techniques get added. They all have their place in making engineers and designers more productive by not leaving them waiting for results to become visible. Development engineers at Cadence in the Allegro group have done a good job of speeding up graphical rendering times for PCB designers by supporting GPU cards from NVIDIA.

Now the CAD department can buy NVIDIA GPU cards for their PCB designers and see immediate productivity improvements in Allegro operations. The bigger the project, the bigger the time benefits.

View the full 38 minute video online at Nvidia.

Related Blogs


WEBINAR: Balancing Performance and Power in adding AI Accelerators to System-on-Chip (SoC)
by Daniel Nenni on 03-01-2022 at 6:00 am

Mirabilis Webinar AI SoC

Among the multiple technologies that are poised to deliver substantial value in the future, Artificial Intelligence (AI) tops the list.  An IEEE survey showed that AI will drive the majority of innovation across almost every industry sector in the next one to five years.

As a result, the AI revolution is motivating the need for an entirely new generation of AI systems-on-chip (SoCs).  Using AI in chip design can significantly boost productivity, enhance design performance and energy efficiency, and focus expertise on the most valuable aspects of chip design.

Watch Replay HERE

 

AI Accelerators
Big data has led data scientists to deploy neural networks to consume enormous amounts of data and train themselves through iterative optimization. The industry’s principal pillars for executing software – standardized Instruction Set Architectures (ISA) – however aren’t suited for this approach. AI accelerators have instead emerged to deliver the processing power and energy efficiency needed to enable our world of abundant-data computing.

There are currently two distinct AI accelerator spaces: the data center on one end and the edge on the other.

Hyperscale data centers require massively scalable compute architectures. The Wafer-Scale Engine (WSE) for example can deliver more compute, memory, and communication bandwidth, and support AI research at dramatically faster speeds and scalability compared with traditional architectures.

On the other hand, with regards to the edge, energy efficiency is key and real estate is limited, since the intelligence is distributed at the edge of the network rather than a more centralized location. AI accelerator IP is integrated into edge SoC devices which, no matter how small, deliver the near-instantaneous results needed.

Webinar Objective

Given this situation, three critical parameters for project success using AI accelerators will be discussed in detail in the upcoming webinar on Thursday, March 10, 2022:

  • Estimating the power advantage of implementing an AI algorithm on an accelerator
  • Sizing the AI accelerator for existing and future AI requirements
  • The latency advantage between ARM, RISC, DSP and Accelerator in deploying AI tasks

An architect always thinks of the performance or power gain that can be obtained with a proposed design.  There are multiple variables, and many viable options available, with a myriad different configurations to choose from.  The webinar will focus on the execution of an AI algorithm in an ARM, RISCV, DSP-based system; and in an AI accelerator-based system.  The ensuing benefits of power, sizing and latency advantages will be highlighted.

Power Advantage of AI Algorithm on Accelerator using VisualSim
Mirabilis Design’s flagship product VisualSim has library blocks that have power incorporated into the logic of the block.  Adding the details of power does not slow down the simulation; it also provides a number of important statistics that can be used to further optimize the AI accelerator design.

VisualSim AI Accelerator Power Designer
VisualSim AI Accelerator Designer uses state-based power modeling methodology.  The user inputs two pieces of information – the power in each state (Active, Standby, Idle, Wait, etc.) and the power management algorithm.  As the traffic flows into the system and the tasks are executed, the instruction executes in the processor core and requests data from the cache and memory.  At the same time, the network is also triggered.

All these devices in the system move from one state to another.  VisualSim PowerTable keeps track of the power in each state, the transition between states, and the changes to a lower state based on the power management algorithm.

The power statistics can be exported to a text file, and to a timing diagram format.
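
As a rough illustration of state-based power accounting, the sketch below tallies energy from the time spent in each power state; the states, power values and transition trace are hypothetical and are not taken from VisualSim.

```python
# Hypothetical sketch of state-based power accounting, in the spirit of a
# power table: each state has a power value, and energy is accumulated from
# the time spent in each state. Values are illustrative, not VisualSim data.

STATE_POWER_MW = {"Active": 450.0, "Standby": 120.0, "Idle": 35.0, "Wait": 60.0}

def energy_mj(trace):
    """trace: list of (state, duration_ms) pairs. Returns total energy in millijoules."""
    total = 0.0
    for state, duration_ms in trace:
        total += STATE_POWER_MW[state] * duration_ms / 1000.0   # mW * s = mJ
    return total

# Illustrative trace: a burst of matrix multiplies, a memory wait, then idle.
trace = [("Active", 2.0), ("Wait", 0.5), ("Active", 1.5), ("Idle", 10.0)]
print(f"Total energy: {energy_mj(trace):.2f} mJ")
```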

Advantages of sizing the AI accelerator
AI accelerators perform repetitive operations with large buffers.  These IPs occupy significant silicon area and thus increase the overall cost of the SoC, of which the accelerator is just a small section.

The other reason for right-sizing the accelerator is that, depending on the application, functions can be executed either in parallel or serially, and data sizes differ.  The buffers, cores and other resources of the IP must be sized accordingly.  Hence right-sizing is important.

Workloads and Use Cases
The SoC architecture is tested for a variety of workloads and use-cases.  An AI accelerator receives a different sequence of matrix multiplication requests, based on the input data, sensor values, task to be performed, scheduling, queuing mechanism and flow control.

For example, the reference data values can be stored off-chip in the DRAM or can be stored in an SRAM adjacent to the AI block.  Similarly the math can be executed inline, i.e., without any buffering, or buffered and scheduled.

New VisualSim Insight Methodology and its Application
Insight technology connects the requirements to the entire product lifecycle by tracking the metrics generated at each level against those requirements.  The insight engines work throughout the process, from planning and design to validation and testing.  In the case of the AI accelerator, the initial requirements can be memory bandwidth, cycles per AI function, power per AI function, etc.  Functional correctness and flow control correctness can be added later.  The goal of the Insight Engine is to carry the metrics of system planning all the way to product delivery, so there is a reference to verify against at each stage.

Building of AI Accelerators
AI accelerators can be built using a variety of configurations, whether single or multi-core.  A number of open-source concepts are available.  Companies such as Nvidia and Google have published their own accelerators.  The core IP from Tensilica provides AI acceleration as a primary feature.

Mirabilis Design and AI Accelerators
Mirabilis Design has experimented with performance and power analysis of TensorFlow ver 2.0 and 3.0. In addition, we are working on a model of the Tensilica AI accelerator.

Workload Partitioning in Multi-Core Processors
The user constructs the models in two parts: the hardware architecture and a behavior flow that resembles a task graph.  Each element of a task can perform multiple functions: execute, trigger another task, or move data from one location to another.  Each of these tasks gets mapped to a different part of the hardware.  Other aspects also affect the partition, for example storing coefficients locally, increasing parallel processing of the matrix multiply, or masking unused threads to reduce power.  The goal is to determine the number of operations per second.
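
A minimal sketch of this idea follows: tasks from a hypothetical task graph are mapped onto processing elements, and throughput is estimated from per-task operation counts and core rates. The task graph, mapping and rates are invented for illustration and do not come from VisualSim.

```python
# Hypothetical sketch: map task-graph nodes onto processing elements and
# estimate aggregate operations per second. Illustrative numbers only.

tasks = {                        # task -> operations per invocation
    "conv1": 4.0e6,
    "matmul": 9.0e6,
    "activation": 0.5e6,
}
mapping = {                      # task -> processing element
    "conv1": "accelerator",
    "matmul": "accelerator",
    "activation": "cpu",
}
pe_rate_ops_per_s = {            # assumed peak rate of each processing element
    "accelerator": 2.0e12,
    "cpu": 5.0e10,
}

def frame_time_s(tasks, mapping, rates):
    """Serial execution estimate: sum of each task's ops divided by its PE's rate."""
    return sum(ops / rates[mapping[t]] for t, ops in tasks.items())

t = frame_time_s(tasks, mapping, pe_rate_ops_per_s)
total_ops = sum(tasks.values())
print(f"Frame time ~ {t*1e6:.1f} us, effective throughput ~ {total_ops/t/1e12:.2f} TOPS")
```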

Configuration Power and Performance Metrics
The power and performance do not follow the same pattern.  They can diverge for a number of reasons.  Memory accesses to the same bank group or writing to the same block address or using the same load/store unit can reduce die space and in some cases be faster, but the power consumed could be much higher.

Summary
Finally, beyond highlighting the topics above with regard to the AI accelerator, this webinar will also show how to arrive at the best configuration and detect any bottlenecks in the proposed design.

Watch Replay HERE

Also Read:

System-Level Modeling using your Web Browser

Architecture Exploration with Mirabilis Design

CEO Interview: Deepak Shankar of Mirabilis Design

 


An Ah-Ha Moment for Testbench Assembly
by Bernard Murphy on 02-28-2022 at 10:00 am

Sometimes we miss the forest for the trees, and I’m as guilty as anyone else. When we think testbenches, we rightly turn to UVM because that’s the agreed standard, and everyone has been investing their energy in learning UVM. UVM is fine, so why do we need to talk about anything different? That’s the forest and trees thing. We don’t need to change the way we define testbenches – the behavior and (largely) the top-level structure. But maybe there’s a better way to assemble that top level through a more structured assembly method than through hand-coding or ad-hoc scripting.

A parallel in design assembly

This sounds just like SoC design assembly. IPs are defined in RTL already, and you also want the top level in RTL because that’s the standard required by all design tools. But while top-level designs can be and should be defined in RTL, that is a cumbersome representation for assembly. Which is why so many design teams switch to spreadsheets and scripts to pull it together. Creation and updates are simpler through spreadsheets and scripts that handle the mechanical task of generating the final RTL.

UVM top levels present a similar problem for a different reason. The UVM methodology is very powerful and amply capable of representing a testbench top level. But it is a methodology defined by and for software engineers, full of object-oriented design and complex structures. All of which is foreign to the great majority of hardware verifiers who are not software experts. Worse still, UVM is sufficiently powerful that it does not constrain how components – VIPs, sequencers, scoreboards, etc. – define their interfaces. Which makes instantiating, configuring and hooking up these components a problem to be solved by the testbench integrator. Redundantly repeated between verification teams. This problem is well known. Verification teams typically spend weeks debugging testbenches before they can turn to debugging the design.

SoC verification requires that many testbenches be generated in support of the wide range of objectives defined in the test plan. Those objectives will be farmed out to multiple verification teams, often distributed across the globe. Most of whom are production verification engineers, not UVM experts. It is easy to see how effort you can’t afford to waste is wasted in support of an unstructured approach to testbench assembly.

Testbench assembly cries out for standardization

The Universal Verification Methodology is the foundation of any modern verification strategy. But few would deny that UVM, as a complex methodology designed around class-based design, is mystifying to the great majority of hardware verification engineers who are not experts in modern software programming concepts. A small team of UVM experts bridges the gap. They know how to construct the complex functions needed in SoC verification while also hiding that complexity behind functions or classes to make them more accessible to non-UVM-experts.

Complexity hiding is logical but is compromised by the diversity of sources for modern VIPs. Without a standard to align packaging methods, disconnects at the integration level are inevitable. In design assembly, the assembly problem has been significantly alleviated through the IP-XACT standard, defining a constrained and unified interface between components and the top-level assembly. Design and testbench structure have much in common, therefore IP-XACT should also be a good starting point to assemble testbench top levels.

Potential problems and solutions

One drawback is that there is no accepted standard today for packaging testbench components. Not that we don’t try. The in-house UVM team will develop components with interfaces to the in-house standard. Commercial VIP developers will each develop to their in-house standard. And there are legacy VIPs developed before any standard was considered. All well-intentioned, but this is a tower of Babel of interfaces. Some UVM teams go further, wrapping all VIPs in an interface following their protocol. A practical solution, though obviously, it would be better if we were all aiming at a standard and that redundant rework could be avoided. IP-XACT would be an excellent starting point, already well established for IP packaging.

A second potential problem is that IP-XACT has been defined for design, not testbenches. This is not nearly as big a problem as it might seem. Top levels should be mostly structural; IP-XACT already handles this very well through instantiations, ports, interfaces, connections, parametrization, and configurations. A couple of exceptions are class information exposed at the block level and SystemVerilog interfaces, both of which can be managed through vendor extensions. In the upcoming 2022 release, interfaces will be incorporated in the standard, leaving only class support to be handled in a later release, a goal which Arteris IP continues to push in the working group.
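
To make the idea concrete, here is a small, hypothetical sketch of metadata-driven testbench assembly: a declarative component list, in the spirit of IP-XACT instances and interfaces, drives generation of a UVM environment skeleton. The component names, classes and generated code are invented for illustration and are not Arteris Magillem UTG output.

```python
# Hypothetical sketch: generate a UVM environment skeleton from a declarative
# component description, in the spirit of IP-XACT-style instances/interfaces.
# Names and classes are invented; this is not Magillem UTG output.
# (Constructor, connect_phase and interface hookup omitted for brevity.)

components = [
    {"instance": "axi_agent",  "vip_class": "axi_master_agent"},
    {"instance": "apb_agent",  "vip_class": "apb_slave_agent"},
    {"instance": "scoreboard", "vip_class": "pkt_scoreboard"},
]

def gen_env(env_name, components):
    decls  = [f"  {c['vip_class']} {c['instance']};" for c in components]
    builds = [f"    {c['instance']} = {c['vip_class']}::type_id::create("
              f"\"{c['instance']}\", this);" for c in components]
    return "\n".join(
        [f"class {env_name} extends uvm_env;",
         f"  `uvm_component_utils({env_name})"]
        + decls
        + ["  function void build_phase(uvm_phase phase);",
           "    super.build_phase(phase);"]
        + builds
        + ["  endfunction", "endclass"])

print(gen_env("soc_tb_env", components))
```

In practice the same metadata would also drive interface connections and configuration, which is where a constrained packaging standard such as IP-XACT pays off.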

Interesting idea. What’s next?

Standardizing (and automating) testbench assembly is the only way to go to bring scalability to this task. UVM experts can work on building standard VIPs, leaving assembly (with much easier to understand scripting) to all those diverse teams in support of their needs.

Arteris IP has been developing a solution around this concept, with feedback from key customers. The result is Arteris Magillem UTG (UVM Testbench Generator). If you are intrigued and wonder if this approach could accelerate your SoC verification efforts, contact Arteris IP.

Also read:

Business Considerations in Traceability

Traceability and ISO 26262

Physically Aware SoC Assembly

 


Breker Verification Systems Unleashes the SystemUVM Initiative to Empower UVM Engineering
by Daniel Nenni on 02-28-2022 at 6:00 am

SystemUVM Language Characteristics

The much anticipated (virtual) DVCon 2022 is happening this week, and functional verification plus UVM is a very hot topic.  Functional verification engineers using UVM can enjoy a large number of benefits by synthesizing test content for their testbenches. Abstract, easily composable models, coverage-driven content, deep sequential state exploration, pre-execution randomization for test optimization and configurable reuse are just some examples of the advantages afforded by test suite synthesis.

However, a specification model is required and there are few alternatives that a UVM/SystemVerilog engineer can simply pick up and use.

Enter SystemUVM™, a UVM class library built on top of Accellera’s Portable Stimulus Standard that looks and feels like SystemVerilog with UVM, but enables the level of abstraction and composability required for this specification model with an almost negligible learning curve.

Breker Verification Systems Unleashes the SystemUVM Initiative to Empower UVM Engineering

Enhances Bug Hunting by Simplifying Specification Model Composition for Test Content Synthesis in Existing UVM Environments

SAN JOSE, CALIF. –– February 28, 2022 –– Breker Verification Systems used the opening of DVCon U.S. today to unveil SystemUVM™, a framework designed to simplify specification model composition for test content synthesis with a UVM/SystemVerilog syntactic and semantic approach familiar to universal verification methodology (UVM) engineers.

Developed in partnership with leading semiconductor companies, SystemUVM’s UVM-style specification model drives test content synthesis, leveraging artificial intelligence (AI) planning algorithms for deep sequential bug hunting in existing UVM environments.

A coverage-driven approach simplifies test composition and employs up-front randomization for efficient simulation and accelerated emulation. It enhances test content reuse through configurable scenario libraries and portability for system-on-chip (SoC) integration verification and beyond.

For more information go to: www.brekersystems.com/SystemUVM

The Breker Approach
“UVM is an effective standard for block-level verification,” remarks David Kelf, Breker’s CEO. “As blocks and subsystems get larger and more complicated, composing test content for the UVM environment becomes more difficult and harder to scale. By leveraging synthesis for test content generation, a 5X improvement for larger components and multi-IP subsystems is common in composition time combined with significant coverage increases. SystemUVM makes this easily accessible for verification specialists with a minimal learning curve, dramatically changing the nature of functional verification.”

Breker’s SystemUVM layers UVM class libraries onto Accellera’s Portable Stimulus Standard (PSS) to provide the look and feel of SystemVerilog/UVM and its procedural use model. Models can be composed rapidly, efficiently reused, and easily understood and maintained through UVM’s register abstraction layer (RAL), a library of common verification functions and abstract “path constraints.”

SystemUVM code offers an alternative to generic PSS while still being built on the industry standard, specifically targeting the needs of UVM engineers and recognizable to them, unleashing the power of PSS Test Content Synthesis tools, such as Breker’s TrekUVM™ and TrekSoC™ products.

SystemUVM-based Test Suite Synthesis allows the simplified generation of self-checking test content from a single abstract model complete with high-level path constraints for manageable code. Synthesis AI planning algorithms allow for specification state-space exploration, uncovering complex corner-cases that lead to potential complex bugs.

The coverage-driven nature of the process eliminates the need for coverage models and post-execution coverage analysis that results in test respins. With test randomization performed before execution, simulation is accelerated, and emulation can be used without an integrated testbench simulator, which increases its performance. The tests can also be reused in system verification via the Synthesizable VerificationOS layer without any change or disruption to the UVM testbench.

Availability and Pricing
SystemUVM is available today and is included in Breker’s Test Suite Synthesis product line. Pricing is available upon request. For more information, visit the Breker website or email info@brekersystems.com.

Breker at DVCon U.S.
DVCon’s tutorial “PSS In The Real World” opens this year’s virtual conference at 9 a.m. P.S.T., showcasing the power and flexibility of Accellera’s Portable Stimulus Standard by highlighting several real-world examples. Adnan Hamid, Breker’s executive president and CTO, is a speaker.

At “In-emulator UVM++ Randomized Testbenches for High Performance Functional Verification,” a Breker-sponsored workshop also on Monday at 11:30 a.m. P.S.T., attendees will learn proven, practical methods to verify complex blocks, SoCs and sub-systems with a high degree of quality.

“The Meeting of the SoC Verification Hidden Dragons,” a panel organized by Breker and featuring Hamid, will address the gap in semiconductor verification between block functional verification and system SoC validation. The panel will be held Wednesday, March 2, at 8:30 a.m. P.S.T.

About Breker Verification Systems
Breker Verification Systems is a leading provider of verification synthesis solutions that leverage SystemUVM, C++ and Portable Stimulus, a standard means to specify reusable verification intent. It is the first company to introduce graph-based verification and the synthesis of high-coverage test sets based on AI planning algorithms. Breker’s Test Suite Synthesis and TrekApp library allows the automated generation of high-coverage, powerful test cases for deployment into a variety of UVM, SoC and Post-Silicon verification environments. Case studies that feature Altera (now Intel), Analog Devices, Broadcom, IBM, Huawei and other companies leveraging Breker’s solutions are available on the Breker website. Breker is privately held and works with leading semiconductor companies worldwide.

Engage with Breker at:
Website:
 www.brekersystems.com
Twitter: @BrekerSystems
LinkedIn: https://www.linkedin.com/company/breker-verification-systems/
Facebook: https://www.facebook.com/BrekerSystems/

Also read:

Breker Attacks System Coherency Verification

Breker Tips a Hat to Formal Graphs in PSS Security Verification

Verification, RISC-V and Extensibility


Intel’s Investor Day – Nothing New
by Doug O'Laughlin on 02-27-2022 at 6:00 am

Intel’s big investor day was anything but big. The stock reacted poorly, down 5% on a day that saw a widespread sell-off anyway.

I want to briefly summarize what matters for the stock. There was very little incremental news to the technology roadmap, and the financial outlook was underwhelming, to say the least.

The revenue guide was low single-digit growth, transitioning to mid-single-digit growth, to then double-digit growth in 2026.

They expect to improve gross margins when they get leadership products. They provided this simple bridge. David Zinsner (the new CFO) reiterated multiple times he thinks that this gross margin bridge was conservative.

The real meat of the financial outlook was given in a two-part long-term model. First is the investment-phase model, where they spend heavily to catch up to the industry and bootstrap their foundry business.

And then the longer-term model, when Intel presumably is back in the lead during their fateful 2024 crossover timeline. 2024 is when Pat Gelsinger believes they can take process leadership, and have better revenue growth and margins.

The thing is that the total vision was pretty underwhelming if you ask me. The reason why? Well, it’s consensus already.

It’s Consensus Already

One of the things that frustrated me is that I believe that the gross margin and revenue goals from the analyst day were within 50 bps of street consensus until 2025, and this is why the market found this model so particularly uncompelling. This is what we expected already, and stating it again felt pointless. This comment in particular was extremely off-putting as well.

“I want to double the earnings and double the multiple of this company”.

Historically commenting on the multiple is pretty icky if you ask me. Also if you compare it to the historical multiple it feels pretty unlikely. Will Intel really trade at 28x forward earnings?

Trading at 20x forward earnings would put it in a multiple class Intel hasn’t seen since pre-2008. Albeit growing double digits would also be something Intel hasn’t done since 2018 (~13% YoY revenue growth) and hasn’t done for three consecutive years since 2003-2005. There has to be a lot fundamentally different for Intel to deserve that doubling in multiple. We will talk about the doubling in earnings a bit.

Financial Model & Free Cash Flow Levers

The thing that also is a bit frustrating is looking at this company on an earnings basis when they guided for 3 years of flat to negative FCF. Here is my model backing out of their long-term guidance.

Notice doubling earnings from the 2022 level, not the 2021 level. For simplicity’s sake, I took the top end of their ranges. The conservatism that David and Pat stress I think is reflected by me assuming share count doesn’t grow.

So while it trades at “just 10x 2024 earnings” it will make no FCF in that year. There is so much implied in the model for the huge technical and financial turnaround in 2024. Everything in the model hinges on 2024 which is a few years away and implies a huge margin acceleration. From 0% FCF margin to 20% in 2 years, this seems unlikely.

The only thing that really made me believe this is possible is if they play the government incentives game really hard. Their initial capex guidance is based on 10% savings from net spend (their guide) to gross.

David Zinsner also thinks that a 30% cost reduction seems like the number they will get and would be “surprised” if Intel didn’t get there.

So from my reading of this situation, Intel could hit the FCF number if they bag the entire savings from a 10% cost reduction to a 30% cost reduction, and then improve total EBIT margins by 800 bps. I think at least 1/3rd of the FCF bridge will be driven by playing around with the net and gross capex assumptions. Pretty heroic assumptions, which brings me to the entire crux of the investor day. We are waiting for 2024.
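
To see how sensitive the free cash flow math is to the capex-offset assumption, here is a toy calculation; every number in it is an illustrative placeholder, not Intel guidance and not my model.

```python
# Toy sensitivity check: how government incentives (gross -> net capex offsets)
# swing free cash flow. All figures are illustrative placeholders, not guidance.

def free_cash_flow(revenue_b, op_cash_margin, gross_capex_b, capex_offset):
    """FCF = operating cash flow - net capex, with net capex = gross * (1 - offset)."""
    op_cash_flow = revenue_b * op_cash_margin
    net_capex = gross_capex_b * (1.0 - capex_offset)
    return op_cash_flow - net_capex

revenue_b, op_cash_margin, gross_capex_b = 80.0, 0.35, 27.0   # hypothetical inputs
for offset in (0.10, 0.30):
    fcf = free_cash_flow(revenue_b, op_cash_margin, gross_capex_b, offset)
    print(f"capex offset {offset:.0%}: FCF ~ ${fcf:.1f}B ({fcf / revenue_b:.0%} of revenue)")
```

Moving the offset from 10% to 30% in this toy setup swings FCF by several billion dollars, which is why the net-versus-gross capex assumption carries so much of the bridge.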

Waiting for Godot (aka 2024)

The Intel turnaround is waiting for 2024. We kind of already knew that; now we just have financial clarity until that fateful year. The financials were as bad as we expected, and for an investor day, there was almost no surprise.

Another thing that I keep coming back to is that everyone expects that Intel’s turnaround will mean some drastic and amazing returns for the stock. To be clear if the turnaround happens with no hitches I think that the stock could easily do a 20%+ CAGR to 2026, but I also think that can be said of multiple companies in the semiconductor space at these current prices. I think the problem is that most investors are mentally comparing Intel to their notable competitor: AMD.

I want to be clear, this is not AMD. AMD went from almost no share to meaningful amounts of share and massively improved their economics in that time period. The torque of 10% to 40% share and 20% gross margins to 50% margins will not be repeated. The best case for Intel I think is something that is a 20%+ CAGR, or a rough tripling. That’s a great return! But I want to say this is the BEST CASE and is still contingent on a meaningful amount of execution risk and zero FCF for multiple years along the way. Investor day didn’t really give us much hope other than to wait until 2024.

We are still just waiting for 2024 to see the process leadership be regained.


Also, it looks like the crux of what I said about the Tower deal tipping Intel’s hand about becoming a holding company is true.

The other thing I’ve said is that, “Hey, I’d like to do a Mobileye-like spin on our foundry business at some point as well.” I’m going to keep the structure, as opposed to integrating as much, I’m going to keep it more separate to enable that, which means I’m going to leverage a lot more of Tower and the expertise that it builds over time as part of it.

À la the venerable Stratechery.

The screenshots from above are from a very simplistic model that I used to replicate the investor day goals. It is behind the paywall for paying subscribers only. Feel free to mess with the assumptions yourself. The FCF part of the model is disconnected from earnings, mostly because of how meaningful the D&A cost ramp will be for the next few years.


Also Read:

Semiconductor Earnings Roundup

Tower Semi Buyout Tips Intel’s Hand

The Rising Tide of Semiconductor Cost

TSMC Earnings – The Handoff from Mobile to HPC