
Podcast EP56: sureCore Memory, From Ultra-Low Power to Ultra-Low Temperature

by Daniel Nenni on 01-07-2022 at 10:00 am

Dan is joined by Paul Wells, CEO of sureCore. Paul describes a variety of new and innovative applications for sureCore memory products, including ultra-low power applications for consumer connected devices and new applications for quantum computing.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Integrity and Data Encryption for PCIe and CXL Security
by Tom Simon on 01-07-2022 at 6:00 am

Security for Cloud Applications

Privacy and security have always been a concern when it comes to computing. In prior decades, for most people, this meant protecting passwords and locking your computer. However, today more and more users are storing sensitive data in the cloud, where it needs to be protected at rest and in motion. In a Synopsys webinar, Dana Neustadter, Senior Marketing Manager for Security IP, cites figures from Skyhigh Networks showing that as much as 21% of files uploaded to file sharing services contain sensitive data, such as medical, financial or personal information.

While we would all like to believe that data centers and related infrastructure are secure, if you have ever watched a video of a “penetration tester”, you will see just how easy it is for bad actors to physically access some sites. If you want to see one of these videos, search for “pen tester” on YouTube. Fortunately, the industry is responding to this issue by adding security to specifications such as PCIe and CXL (Compute Express Link). These additions go a long way toward meeting new laws and regulations that mandate system security wherever sensitive data is present.

Security for data in motion over PCIe and CXL, of course, depends on proper on-chip security within SoCs. A trusted execution environment should offer power-on, runtime and power-off security through a Hardware Security Module (HSM). The real key to PCIe and CXL security is the addition of an Integrity and Data Encryption (IDE) component. In the Synopsys webinar, Dana does a thorough job of describing the function and operation of an IDE in conjunction with authentication and key management. The PCIe 5.0/6.0 and CXL 2.0/3.0 specifications call for this additional functionality to afford increased security.

Security for Cloud Applications

The IDE is intended to sit within the PCIe Transaction Layer. This is a critical aspect of the design, because while added security is an essential requirement, it needs to have minimal impact on latency and performance. Right now, the specs support IDE for the handling of TLP streams; FLIT mode will be included in the PCIe 6.0 release. Packets are protected by AES-GCM with 256-bit keys and 96-bit MAC tags. Ideally, adding IDE should be plug-and-play, and this is the case for the Synopsys PCIe IDE and Controller IP. Finally, FIPS 140-3 certification is gaining importance in the industry and should be supported through a certification test mode.
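To make the cryptography concrete, here is a minimal Python sketch of AES-GCM authenticated encryption as an IDE applies it to a packet, using the third-party `cryptography` package. The packet fields below are illustrative assumptions: a real IDE does this in dedicated hardware on a per-stream counter-based IV, and this library emits 128-bit tags where the spec truncates to 96 bits.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit key, per the IDE spec
aesgcm = AESGCM(key)

nonce = os.urandom(12)                      # 96-bit IV (in IDE, a per-stream counter)
tlp_header = b"illustrative TLP header"     # authenticated but not encrypted (AAD)
tlp_payload = b"illustrative TLP payload"   # encrypted and authenticated

# Ciphertext with the authentication tag appended
ciphertext = aesgcm.encrypt(nonce, tlp_payload, tlp_header)
# Decryption fails with InvalidTag if header, payload, or nonce were tampered with
plaintext = aesgcm.decrypt(nonce, ciphertext, tlp_header)
assert plaintext == tlp_payload
```

The important property for IDE is that the header travels in the clear (switches must route it) yet is still covered by the MAC, while the payload is both encrypted and authenticated.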

CXL operation and support mirrors that of PCIe. Dana includes the flow for both PCIe and CXL when IDE is included. Of course, with CXL there are some differences because of the three types of protocols it supports. IP for the CXL IDE needs to include containment and skid modes, and additions for PCRC when running CXL.cache/mem. Dana also discusses the ins and outs of key management for the large number of streams that can be operating in a design.
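The per-stream key bookkeeping Dana describes can be sketched in a few lines of Python. This is a toy model of the idea only; the class name, 256-bit key size per stream, and rotation policy are my illustrative assumptions, not the spec's key-management protocol.

```python
import os

class IdeKeyStore:
    """Toy per-stream key table: one 256-bit key per IDE stream, rotated independently."""
    def __init__(self):
        self._keys = {}

    def provision(self, stream_id: int) -> bytes:
        """Install a fresh 256-bit key for a stream (e.g. after authentication)."""
        self._keys[stream_id] = os.urandom(32)
        return self._keys[stream_id]

    def key_for(self, stream_id: int) -> bytes:
        return self._keys[stream_id]

    def rotate(self, stream_id: int) -> bytes:
        """Re-key one stream without disturbing the others."""
        return self.provision(stream_id)

store = IdeKeyStore()
k0 = store.provision(stream_id=0)
k1 = store.provision(stream_id=1)
k0_new = store.rotate(stream_id=0)   # stream 1's key is untouched
```

The point of the sketch is the isolation: with many streams active in a design, each stream owns its key material and can be re-keyed without interrupting traffic on the others.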

This webinar is comprehensive in that it discusses the needs and requirements for PCIe and CXL security in cloud applications. It also goes in depth on the components, architecture and related standards that are supported in the Synopsys DesignWare IP. Near the conclusion of the webinar, Dana shows how several different SOCs for AI or networking can be constructed largely from IP available from Synopsys. The webinar is available for replay on the Synopsys website.

Also read:

High-Performance Natural Language Processing (NLP) in Constrained Embedded Systems

Lecture Series: Designing a Time Interleaved ADC for 5G Automotive Applications

Synopsys’ ARC® DSP IP for Low-Power Embedded Applications


Cliosoft and Microsoft to Collaborate on the RAMP Program

by Kalar Rajendiran on 01-06-2022 at 6:00 am

Cliosoft RAMP SemiWiki

We have all heard of many advanced technological inventions and products from the defense sector that were subsequently commercialized. While most Defense Advanced Research Projects Agency (DARPA) projects are classified, many military innovations have had great influence on the commercial sector in the fields of electronics, communications and computer science. A well-known invention that we all rely on every day, the Internet, had its beginnings in such a project. Since its commercialization, many advances have happened with and through the Internet, including cloud computing.

While the commercial sector has made incredible advances over time, the defense sector has been mostly limited by security concerns and related matters. The Department of Defense (DoD) and the traditional Defense Industrial Base (DIB) are still following obsolete practices and outdated processes in some fields, in particular as they relate to state-of-the-art (SOTA) custom IC and System on a Chip (SoC) design and associated physical design. The Navy and Air Force, having recognized this, have embarked on an initiative to leverage commercial capabilities to demonstrate secure enhanced design. The initiative is called the Rapid Assured Microelectronics Prototypes (RAMP) program. Its purpose is to facilitate the rapid development of IC hardware for further evaluation and technology enablement of the DoD. The RAMP program is now in its second phase. Microsoft has been tasked with leading the program by collaborating with companies in the electronics, EDA, semiconductor and related fields.

Microsoft has selected the following industry leaders to collaborate with: Ansys, Applied Materials, Inc., BAE Systems, Battelle Memorial Institute, Cadence Design Systems, Cliosoft, Inc., Flex Logix, GlobalFoundries, Intel Federal, Raytheon Intelligence and Space, Siemens EDA, Synopsys, Inc., Tortuga Logic, and Zero ASIC Corporation.

This article will focus on what Cliosoft brings to the RAMP program.

Cliosoft’s sole focus is helping semiconductor companies manage their design data and their IP.

Its SOS family of design management solutions serves as the backbone for design collaboration at many of the largest semiconductor companies. Cliosoft also provides an enterprise IP management platform called HUB that is used by companies to easily create, publish and reuse their design IPs. Their Visual Design Diff (VDD) platform allows design teams to quickly compare two versions of a schematic or layout by graphically highlighting the differences directly in the design editor. Together, these three data platforms enable easy and secure handling of data and IP through all aspects of microelectronics development and workflow. Let’s take a closer look.

Designing Flexibility

Design teams need the flexibility to use multi-vendor tool flows on their designs. SOS is integrated and production tested with design tools from multiple vendors. Efficient management of design data and upkeep of proper documentation require disciplined effort from the team members. Making it easy for them to invoke SOS revision control & design data management features directly from their preferred tools helps achieve success.

Creativity, IP ReUse and Return on Investment

Cliosoft HUB lets people across an enterprise share their IP and expertise with others. It enables problems to be solved quickly by crowdsourcing and designs to be completed faster without reinventing the wheel. Cliosoft HUB helps manage and track these collaborative efforts.

Effective IP reuse requires an IP-based design methodology and a good software infrastructure to enable it. It also requires an easy way for designers to find the right IP and gauge its quality. When reusing an IP, designers need the ability to get help with the IP if needed, report issues found, and be notified if there are updates. Cliosoft HUB addresses all of these requirements.

Engineers needing a piece of IP may find that it has already been developed in another division. They can now quickly access the IP and leverage the expertise of the IP creators. They can also benefit from other users in the enterprise who may have integrated that IP into their designs. All the interaction is recorded within Cliosoft HUB and becomes a knowledge base that future users of the IP can leverage.

IP Assembly

For situations when a piece of higher-level IP must be developed internally, it is usually a matter of assembly using lower-level IP blocks. In order to successfully complete this assembly process, hierarchical visibility is required along with access to the knowledge base and issues tracking of all IP blocks.

IP Traceability

Traceability is key to understanding the evolution of an IP block and the modifications that were made for bug fixes or new features. Cliosoft HUB provides IP traceability through a knowledge base that describes the evolution, reuse and integration of IP into various products. This kind of traceability is especially required for compliance reasons in the defense, automotive and medical device markets. Standards such as ISO 26262 and MIL-STD-882 mandate this kind of documentation. All of Cliosoft’s products are ISO 26262 certified.

Also Read

DAC 2021 – Cliosoft Overview

Cliosoft Webinar: What’s Needed for Next Generation IP-Based Digital Design

Webinar – Why Keeping Track of IP in the Enterprise Really Matters


CES 2022 and the Electrification of Cycling

by Daniel Payne on 01-05-2022 at 10:00 am


With the Omicron variant of the COVID-19 virus in the news, some big corporate names withdrew from CES (Peloton, Super73); however, the cycling innovation companies assembled once again in Las Vegas this year for CES 2022. Data from Statista show the strong growth in bicycle revenues in March 2020, when the pandemic started in the US, including an 85% jump in electric bike revenue:

eMobility Experience

This year visitors to CES could go for a test ride on an outdoor track, with about a hundred electric models to ride.

eMobility Test Track

e-Bikes

Alta Cycling Group showcased their eBike brands: Diamondback, IZIP, Redline and Haibike. The Diamondback Union 2 has a class 3 Bosch Performance Line Speed motor, fenders, lights and a rack for commuting and shopping trips:

The IZIP brand categorizes their eBikes into: Adventuring, Commuting, Cruising.

IZIP
IZIP
IZIP
Tern Bicycles
Hyper Bicycles
Coaster Cycles
VAAST Bikes
iX Rider
Bianchi Aria E-Road
Magnum Scout
Totem USA – Zen Rider
RKS Motor
SoFlow
Aventon – Adventure Ebike
Dongguan CXM
Euphree – City Robin
Fiil Bikes
Giant – Road E+ 1 Pro
Go Power Bikes – GO Express
Hongji Bike: E-CityMM01
Hyper Bicycles
Rad Power Bikes
Tern Bicycles
LeMond Bicycles

I noticed how some of these eBikes conceal the batteries within the frame, while on others the batteries look simply bolted on. I much prefer the concealed look.

From China there’s NIU, with an e-bike called the BQi, designed with a step-through frame, concealed batteries and a 62-mile range, boasting a top speed of 28 mph and attractively priced at just $1,075, which is quite low for an e-bike.

NIU BQi-C1 Pro

Bird started out with scooter rentals, but this year had two new e-Bikes, one with a step-through frame, and the other with a traditional top tube. I liked the built-in light features, concealed batteries, and these look to be commuter bikes.

BirdBike

Hailing from my home state of Minnesota is Benno Bikes, and they showed a lineup of four models:  Boost, Ejoy, Escout, Remidemi.  Each of these models is aimed at carrying cargo in the back and front.

Benno Bikes – BOOST

If speed is your ultimate goal, then consider Delfast, which boasts a 50 mph top speed and a 200-mile range.

Delfast (Source: UPI)

Another city e-bike, but with a twist, Urtopia sports a single-speed, carbon frame, fingerprint scanner, LED lighting, turn signals, and a built-in display on the stem.

Urtopia
OKAI EB20

Wise-integration has an e-bike charger that is 6X smaller and 6X more energy efficient by using GaN technology.

Wise-integration

GPS Tuner provides the infrastructure that an e-bike system needs: an IoT Adapter, white label apps, and the cloud.

GPS Tuner

All of those e-bikes need to be parked and charged, especially for commuters, so ParkENT has developed a secure charging station.

ParkENT – secure charging station

Carbon frames provide the lightest weight for bicycles; however, the traditional process to make them is quite labor intensive, which drives prices up. Superstrata addresses this with a 3D-printed carbon composite frame, in both traditional and e-bike versions.

Superstrata E

CES 2022 Innovation Awards Honoree

We all love to win awards, right? Bosch got an Honoree award for their eBike Systems, which consist of an eBike Flow app, an LED user interface, a color display, a rechargeable battery and a drive unit. It’s smart enough to support over-the-air updates, something that we take for granted with our smartphones and other electronic devices. With an eBike you really need to know how low the battery charge is, so that you don’t get surprised mid-ride. You’ve likely heard of Bosch as a supplier of automotive parts, but they’ve also been supporting eBikes with electric motors for several years now.

JBL has a portable Bluetooth speaker called the Wind 3 that mounts to your handlebars while cycling.

JBL Wind 3

I wear a heart rate monitor while cycling, but now there’s a new sensor product from CORE that measures your core body temperature during a workout. It’s also used by 8 professional cycling teams.

CORE

Indoor Trainers

LG showed off their Virtual Ride, a stationary bike concept along with three vertical 55-inch OLED displays, spanning quite the range of vision to make you feel more immersed while working out:

Echelon has their EX-8S, a Peloton competitor, sporting a 24″ curved display and priced at $2,399, plus a $34.99 monthly subscription.

Echelon EX-8S

AI workouts targeted to just your fitness level is the goal of Renpho and their new Smart Bike Pro.

Renpho – Smart Bike Pro

Cultbike comes with a 22″ touchscreen to view your spin class workouts, and you can view actual outdoor video scenery to pass the time.

Cultbike

Cycling Cameras

There are two main use cases for adding a camera system to your bike: safety – you have a record of approaching vehicles in case of a collision or near miss – and social – sharing video clips or photos from the route with your cycling buddies.

apeman debuted the SEEKER series of 4K action cameras, in rear-facing or front-facing configurations.

Smart Helmets

How about adding programmable lights to the front and back of a cycling helmet, then adding Bluetooth speakers? That’s what OKAI did with the SH10 smart helmet.

OKAI SH10

Electricity Generation

Growing up in Minnesota, I recall seeing a 3-speed English bike with a wheel-mounted generator that powered a front light. Now a company called WITHUS & EARTH generates electricity from a device placed near your rotating wheel, yet not touching it; magnets placed inside the wheel turn the dynamo. The company has won a CES award for the third year in a row.

WITHUS & EARTH

Cycling App

From Korea comes a cycling app called Veloga Cycle, sporting lots of data fields, analytics, and a way to share your ride with others. Here in the US we’ve already seen many similar apps: Strava, MapMyRide, RideWithGPS. My cycling journey with apps started out with MapMyRide, but then I switched to Strava, because all of my buddies used it, and I wanted to fit into the community.

Veloga Cycle

Daniel’s 2021 Cycling

Here are my Strava stats for cycling in 2021, and you’re invited to follow me on Strava; I will follow you back, so let’s stay in shape together.

My epic endurance ride was from Tualatin, OR to Pacific City and back, 206.5 miles, yes, in one ride.

Here’s a list of all the electronics that I ride with:

On rainy days in Oregon I cycle indoors with:

Zwift

I did a virtual Everest on February 6, 2021 with a buddy, climbing over 29,029 feet, and covering 132 miles on Zwift. Follow me on Zwift, Daniel Payne (VV), and I will follow you back.

Summary

The electrification of the bicycle continues in 2022, with the e-bike category continuing to grow across a wide range of models. Gamification of fitness is another mega-trend, with spin bikes and smart trainers leading the way. Traditional bike companies are trying to catch up with new e-bike models, while the number of untraditional bike competitors continues to rise.

Related Blogs


Self-Aligned Via Process Development for Beyond the 3nm Node

by Tom Dillinger on 01-05-2022 at 6:00 am

TEM DoD

The further scaling of interconnect and via lithography for advanced nodes is challenged by the requirement to provide a process window that supports post-patterning critical dimension variations and mask overlay tolerances.  At the recent International Electron Devices Meeting (IEDM) in San Francisco, TSMC presented a research update on their process development activities to realize a “self-aligned via” (SAV) for upcoming nodes, with an interconnect + via flow that provides improved manufacturability.[1]  This article summarizes the highlights of their presentation.

Introduction

The manufacturability of vias needs to address multiple litho, electrical, and reliability measures:

  • tolerance to overlay variation (aka, “edge placement error”, or EPE)
  • consistency of via resistance
  • robustness of via-to-adjacent metal dielectric properties
    • leakage current
    • maximum applied voltage before breakdown (Vbd)
    • dielectric reliability, measured as time-dependent dielectric breakdown (TDDB)

and, of course,

  • exceptional yield

(Note that these issues are most severe for the scaling of lower level metals and vias, denoted as “Mx” in the figures in this article.)

The overlay positioning between a via and an adjacent metal line impacts the dielectric breakdown – both Vbd and TDDB.  The figure below illustrates the overlay versus dielectric breakdown issue of a conventional via, for a representative EPE.

A “self-aligned” via (with a unique dielectric to an adjacent metal line) would provide greater process latitude to address the challenges listed above.

TSMC SAV Process

There are two key steps to the TSMC SAV process flow – the deposition of a “blocking layer” on metal lines and the selective deposition of dielectric-on-dielectric.

  • self-assembled monolayer (SAM) deposition on metal

A unique process chemistry step deposits a monolayer of a blocking material on an exposed metal surface.  This process is based on the affinity of organic chemical chains suspended in a solution to the metal.  The molecular chains are adsorbed on the metal surface, and self-assemble into an organized domain.  As the molecules adsorb over time, they will nucleate into groups and grow until the metal surface is covered with a monolayer.  (The monolayer packs tightly due to the van der Waals forces, the weak net attractive electric force between neutral organic solids.)
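The adsorb-and-saturate behavior described above is often approximated with first-order Langmuir kinetics, where coverage grows as open metal sites are consumed. The small Python sketch below shows this model; the rate constant is an arbitrary assumption for illustration, and TSMC's actual SAM kinetics may follow a different law.

```python
import math

def sam_coverage(t_seconds: float, k: float = 0.05) -> float:
    """Fractional SAM coverage under first-order Langmuir kinetics:
    theta(t) = 1 - exp(-k * t).  Adsorption slows as fewer open
    metal sites remain, approaching a full monolayer asymptotically."""
    return 1.0 - math.exp(-k * t_seconds)

# Coverage rises quickly at first, then saturates toward 1.0
for t in (10, 60, 300):
    print(f"t = {t:4d} s  coverage = {sam_coverage(t):.3f}")
```

The self-limiting saturation is the useful property here: given enough exposure time, the metal ends up uniformly covered by exactly one molecular layer of the blocking material.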

This SAM monolayer will serve as a blocking material.  Its composition needs to withstand the thermal exposure of the next step – the selective dielectric deposition on oxide.

  • selective dielectric-on-dielectric (DoD) deposition

Advanced nodes have leveraged atomic layer deposition (ALD) steps for several generations.  A gas phase “pre-cursor” is introduced into the process chamber.  Due to chemisorption, a unique pre-cursor monolayer is deposited on the wafer surface.  The pre-cursor adheres to the surface, but not to itself – no successive pre-cursor layers are deposited.  The chamber is then purged of the excess pre-cursor, and a co-reagent is subsequently introduced.  The chemical reaction results in a final monolayer of the desired reaction product that remains on the surface, while the excess co-reagent and reaction by-products are pumped out.  The cycle can be repeated to deposit multiple “atomic” layers.  ALD has been widely adopted for the deposition of metal and thin-oxide dielectric materials. A key advantage of current ALD processes is they operate uniformly and conformally on the exposed wafer surface.
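Because each pulse-purge-react-purge cycle is self-limiting, film thickness scales linearly with cycle count, which is what gives ALD its angstrom-level control. A sketch of that relationship (the growth-per-cycle value is an illustrative assumption; real values depend on the chemistry and temperature):

```python
def ald_film_thickness(cycles: int, growth_per_cycle_nm: float = 0.1) -> float:
    """Model one ALD run: each cycle (pre-cursor pulse, purge,
    co-reagent pulse, purge) leaves exactly one self-limited
    monolayer, so thickness grows linearly with cycle count."""
    thickness = 0.0
    for _ in range(cycles):
        # pre-cursor chemisorbs until the surface saturates (self-limiting),
        # then the co-reagent converts it to one monolayer of product
        thickness += growth_per_cycle_nm
    return thickness

print(ald_film_thickness(50))   # ~5 nm after 50 cycles at 0.1 nm/cycle
```

In practice, process engineers dial in a target thickness simply by choosing the cycle count, rather than by timing a continuous deposition.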

An active area of research is selective area atomic layer deposition, where the pre-cursor only adheres to a specific material surface. The goal is to suppress pre-cursor adsorption on specific areas – in this case, on the SAM molecules covering the metal.

TSMC explored a selective deposition chemical process, for dielectric-on-dielectric layer buildup.  The images in the figure below depict the process flow to raise a dielectric layer above the existing surface oxide.

The SAM blocking layer on the metal confines the selective deposition to the exposed dielectric.  As mentioned earlier, the blocking layer must withstand exposure to the elevated temperature of the dielectric-on-dielectric selective deposition.  TSMC indicated that higher DoD process temperatures improve the etch selectivity of the dielectric pedestal relative to the surrounding low-K inter-level dielectric for the via etch, to be discussed next.

The image labeled “DoD” in the figure above illustrates the wafer after dielectric-on-dielectric deposition and after removal of the SAM blocking material over the wafer, prior to the addition of the low-K dielectric.

The image on the right shows the final via connection, after low-K dielectric dep/etch and via patterning.  The added DoD material serves as a suitable “etch stop”, due to the lower etch rate compared to the low-K material.  This image illustrates the via-to-adjacent metal dielectric, in the presence of a significant overlay shift.

The figure below illustrates how the added dielectric-on-dielectric layer improves via robustness.  The “control” transmission electron microscopy image (without the DoD) shows excessive via etch of the original dielectric, with little isolation to the adjacent Mx line – not particularly tolerant of overlay error.  The DoD TEM image shows vastly improved isolation.

Experimental Electrical and Reliability Data for the SAV Process

The various figures below show the experimental data from the TSMC SAV process development team.  The Control data reflects the standard via patterning process without the selective DoD layer deposition.

  • via resistance

Both single via and via chain (yield assessment) resistance values show no difference between the control and DoD processes.

  • via-to-adjacent Mx reliability (leakage current, Vbd, TDDB)

To assess the process window, the TSMC team evaluated the leakage current and Vbd with an intentional via-to-Mx overlay shift.  Note that the control process would not support a 4nm overlay tolerance.

To ensure the additional DoD process steps did not adversely impact the characteristics of the existing Mx metal, TSMC shared evaluation data of metal lines with and without the DoD process.  The graphs below show there was no impact to metal line resistance or TDDB/electromigration reliability.

Summary

Continued interconnect scaling below the 3nm node will necessitate unique process development research to maintain electrical and reliability specs in the presence of (up to 4nm) overlay error.  The need for low-K interlevel dielectrics is a given – yet, the via etch in these materials is not especially tolerant of EPE.

TSMC has demonstrated a potential process flow for a “self-aligned via” with an additional DoD material.  The etch rate differential of the DoD results in more robust via-to-adjacent metal reliability.  This process flow utilizes two unique steps – the SAM of a blocking material on metal surfaces, and the selective ALD of a dielectric-on-dielectric.

Hopefully, selective ALD flows will transition soon from R&D to production fabrication – the potential impact of this chemistry for advanced node scaling is great.

-chipguy

References

[1]   Chen, H.-P., et al, “Fully Self-Aligned via Integration for Interconnect Scaling Beyond 3nm Node”, IEDM 2021, paper 22-1.

Note:  All images are copyright of the IEEE.

 


A User View of Efabless Platform: Interview with Matt Venn

by Kalar Rajendiran on 01-04-2022 at 10:00 am

Matt Venn Photo from LinkedIn

A few months ago, SemiWiki published an interview with Mike Wishart, CEO of Efabless. That interview provided insights into Efabless’ vision and its platform strategy. If you haven’t already read that post, please refer to it for background details. In essence, Efabless is about the democratization of chip design and manufacturing. Whether you are a professional, a company, an academic, or a hobbyist, the Efabless platform is a fast, simple, inexpensive way to produce your custom chips. Just as YouTube enables everyone to become a creator of published video content, Efabless helps software and hardware developers create their customized chips.

The following interview with Matthew Venn is to bring out a user’s perspective of the Efabless platform. Matt is a science & technology communicator and electronic engineer.  Matt has used the Efabless platform to create multiple designs and has submitted designs on all 4 open MPW runs to date.  After applying to MPW1 he created a course that aims to teach you everything you need to know to go from zero to ASIC. You can find out more about him and his course at ZeroToASICcourse.com.

Also, feel free to connect with Matt on LinkedIn: https://www.linkedin.com/in/matt-venn/ or Twitter: https://twitter.com/matthewvenn

What is your motivation behind the chip projects you are working on?

It started off purely as personal interest. I saw Tim Edwards from Efabless presenting a chip he had designed using QFlow, an open-source ASIC flow. I was working on an FPGA project at that time, so I tried a couple of small designs just to see how QFlow worked.  It was really eye opening to see the different parts of the ASIC flow. I posted some screenshots on Twitter and lots of people were interested. I started looking into running some kind of training that would result in people getting chips in hand. At the time, this meant finding 10 people who’d be willing to pay a few thousand dollars each for a course.

A few months later we had the announcement from Tim Ansell about the Google sponsored free shuttle in collaboration with Efabless and Skywater. I watched the FOSSi dial-up talks and tried the OpenLane flow when it became available. It looked like I’d be able to put my design onto some real chips – a really exciting moment!

The interest I saw from my peers led me to create the Zero to ASIC course, and that has gone on to train 130 people around the world. We’ve applied to all of the MPW runs, with about 30 people from my course submitting designs to the tapeouts.

To summarize, my motivation is about learning and teaching, being involved in this exciting moment of open-source chip design, and taking advantage of getting free chips!

What caused you to look at an open-source approach for your projects?

I’ve worked independently for so long, so a big part of the appeal of open-source to me is the community. I can learn and share, write docs, create videos and be part of a movement without needing to physically be in a company that is using those specific tools.

Of course, the Efabless Open MPW shuttles sponsored by Google are only available if the designs are open-source, so for getting chips made for free, open sourcing my designs was essential.

How did you learn of the Efabless platform and what pushed you to try this platform?

I heard about it through the FOSSi dialup YouTube series. It was the opportunity to get some of my designs in silicon that pushed me to get involved. If I had known how hard it was going to be at the beginning, I might not have done it! You know that saying – sometimes you have to not know the journey in order to embark on it. And that’s also fed into me creating my course. I wanted to have a journey from end to end that would take someone from zero to ASIC.

What was your first project using the Efabless platform? How would you describe your experience? How many projects have you done so far?

My first project was on the first shuttle, the Open MPW-1. I had a couple of open-source FPGA designs already and I thought I’d try them out. This was when I realized how large a space the Efabless Caravel harness provides (or how small my designs were!). So, I asked a few friends if they’d like to get involved, and that was how I came up with the idea of what I call a “multi-project-multi-project-wafer” or MPMPW! In the first run, we used a MUX to allow all the designs to have access to all the pins.

One of the cool things that Efabless have done is to create the Caravel harness – which includes a lot of basic building blocks that we can reuse to save time. A key part of it is a RISC V CPU that can configure the GPIOs, and we can also use it to select which design is active.

By the time of the second shuttle, the Open MPW-2 – I had created my course and 14 participants made designs to put on the MPMPW.  One thing that’s really attractive about the new ecosystem is how practical the course can be. It’s unusual to have the tools locally installed and to also be able to take part in a real tapeout.

In MPW-3 we managed to place a 1kB SRAM generated by the OpenRAM memory compiler along with 7 other designs from the course.

For MPW-4, we made that memory available via wishbone to the other designs on the MPMPW, and included 10 designs from people on the course. Two designs made use of the local fast memory, a CPU by Uri Shaked and a demonstration project by me.

What are the speed bumps and hurdles that you faced when trying to build your own chips?

As there is a lot that is new in this process – new tools, a new PDK, a lot of people new to the field – there are a lot of problems being discovered and solved.  In the first few shuttle runs, we had a lot of changes to the tooling at the last moment, even after the scheduled tapeout dates. Mostly, this was due to discovering critical issues that needed patches to the tools, with participants then needing to re-run their designs on the updated tools.  As a result, this has led to long feedback times.  By the time this article goes out, I’ll have submitted to 4 MPWs, but I am only now getting the first chips back to test. I could have made mistakes but not learnt about them till now.  The good news is that the issues are being solved. The better news is that this open model has given us visibility into the issues and a way to help with the solutions.

I’ve learnt a lot over the last year – and there’s still more to learn. I like Feynman’s philosophy – if you want to learn, you should teach. It really exposes what I don’t know.  For example, I didn’t really fully understand what hold violations were, until recently.

Tell us about your multi project tooling that lets you aggregate more designs into an MPW submission. How cheap is it to build a chip through this platform?

When I discovered how small my designs were, I thought it was a shame to waste all the extra space. On MPW-1, I manually wired everything up, but it was pretty scary making changes. Since then, I’ve developed some open-source Python tooling that helps to join lots of smaller designs together. The tools can run a set of automatic tests and do some basic checks to make sure the designs are safe. All the outputs of the designs are tri-stated, and the firmware of the RISC V processor in the Efabless Caravel harness keeps them turned off until it’s their turn. This means that each design can use all 38 IOs.
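A minimal Python sketch (all names invented) of the aggregation idea described above: every design tri-states its outputs, and only the design enabled by the Caravel firmware actually drives the shared pads.

```python
# Hypothetical model of shared-IO aggregation: a disabled design tri-states
# its outputs, so at most one enabled design drives each of the shared pads.
class Design:
    def __init__(self, name):
        self.name = name
        self.enabled = False

    def drive(self, io_index):
        # A disabled design drives nothing (high impedance).
        return f"{self.name}:{io_index}" if self.enabled else None

def resolve_io(designs, io_index):
    """Resolve one shared IO pad: at most one enabled design may drive it."""
    drivers = [d.drive(io_index) for d in designs if d.drive(io_index) is not None]
    assert len(drivers) <= 1, "bus contention: two designs driving the same pad"
    return drivers[0] if drivers else "Z"   # 'Z' = nobody driving

designs = [Design("pwm"), Design("cpu"), Design("puf")]
designs[1].enabled = True          # firmware enables exactly one design
print(resolve_io(designs, 7))      # cpu:7
print(resolve_io([designs[0], designs[2]], 7))  # Z
```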

At the moment we are applying to the free shuttles. Efabless also has a paid version of the shuttle called ChipIgnite, where your design doesn’t need to be open-source. So, if we combined 16 designs into one application, that would mean each applicant would pay just over $600 for 18 chips. Everything gets very affordable and fabrication gets accessible to a broader group.
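As a rough check of the arithmetic (the ~$10,000 slot price here is an assumption inferred from the per-applicant figure, not a quoted price):

```python
# Back-of-envelope split of one paid shuttle slot across participants.
# The $10,000 slot price is an assumed, illustrative figure.
slot_price = 10_000
designs_per_slot = 16
per_applicant = slot_price / designs_per_slot
print(f"${per_applicant:.0f} per applicant")   # $625 per applicant
```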

You are an educator as well, teaching people to design and build their own chips. Tell us more.

By working in various startups over the years, I realized that when it comes to engineering, I’m much more of a generalist than a specialist. I’ve never focused on one area long enough to become a true expert. That used to bother me, but I’ve realized that it’s actually very valuable to have someone on the team that can help to bridge different areas, and especially to communicate difficult ideas to a broader audience.

I’ve always loved sharing what excites me about science and technology, which has led to a lot of science communication work. I think everyone knows the difference between really first-rate training and the truly awful boring kind. I’m always looking for ways to make my training accessible, entertaining, and practical.

I think one of the reasons there is so much interest from people about the open-source ASIC tooling is because this makes such a difference to the accessibility of training. Now I can have a VM with all the open-source tooling ready to go, and people can come on my course without having to do complicated setup or signing any NDAs.

Give us examples of exciting innovative chips that you’re developing through the Efabless platform.

You can see all the designs that went on MPW-2, MPW-3 and MPW-4 here:

https://www.zerotoasiccourse.com/post/mpw2-submitted/

https://www.zerotoasiccourse.com/post/mpw3/

https://github.com/mattvenn/zero_to_asic_mpw4

It’s hard to choose the most exciting or innovative among them, but some interesting experimental work includes a PUF and an EM glitch detector:

OpenPUF:

hoggephase

What would you tell software engineers who would like to try their hand at building their own chips to run their software?

I’d say it depends on your interest. I’d break it into two recommendations for different people.

If you are coming to this out of an interest to know how chips are designed and made, then start off simple and build something very basic at a low level. For example, make a PWM driver in Verilog or VHDL.

If you want to build a custom SoC for a RISC V CPU, then use a framework like LiteX that magically automates a lot of the pain away.
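For the first path, here is a behavioral Python model of a counter-based PWM – the kind of logic one would then rewrite in Verilog or VHDL (the period and duty values are just examples):

```python
# Behavioral model of a counter-based PWM: a free-running counter is
# compared against a duty threshold; output is high while count < duty.
def pwm_samples(period, duty, n_cycles=1):
    """Return n_cycles * period output samples for a given duty threshold."""
    out = []
    for _ in range(n_cycles):
        for count in range(period):
            out.append(1 if count < duty else 0)
    return out

wave = pwm_samples(period=8, duty=3)
print(wave)                   # [1, 1, 1, 0, 0, 0, 0, 0]
print(sum(wave) / len(wave))  # 0.375 duty cycle
```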

What misconceptions or apprehensions did you have before you tried the open ASIC shuttle project first-hand?

It was one of those cases where I really didn’t know what I didn’t know. And I’ve still a lot to learn. You have to remember I only have 1 year of experience with ASICs!

What would you tell experienced hardware engineers who might hesitate to try their hand at using the open-source tools and the Efabless platform?

I would set expectations. Don’t expect the open-source tools to have parity with the proprietary ones in creating a fully optimized design for power, performance and area. In my opinion, that won’t happen for a long time, at least for more advanced technologies. That being said, the open-source tools plus the functionality provided by Efabless knock down the barriers that stopped most of us from getting to a real design.  For example, I think users will find the solution very effective for proofs of concept, and it is a viable path for creating the custom silicon that you need for your hardware.  In any case, do get involved and do keep an eye on it. You will be joining a lot of like-minded people, and I expect there will be lots of new opportunities for the tools outside of education.

One of the really cool things is the transparency of the Efabless solution. You can see what works and what doesn’t, and you can even contribute to the fixes and become part of the evolution of the solution.  My MPMPW idea and my videos are just a couple of examples of a user adding value.

In your mind, how would the semiconductor supply chain benefit from this open-source movement?

I think that open-source will disrupt the traditional chip, tool and IP markets. Eventually, this will lead to a much lower barrier to entry for new people to get involved, and to more people making more chips. I would imagine that, just like software, there will be winners and losers, with the ones leveraging the movement reaping big benefits.

I’ve also heard about dramatic performance gains from designing special purpose SoCs for specific applications or using chiplets. I think both of these areas will be ones where the open-source tooling can compete on a more even footing by lowering the upfront cost or NRE, and focusing on community and simplicity.

In the case of domain specific accelerators,  open-source tools and frameworks make it easier to quickly build something. The LiteX example I gave earlier is a perfect example.

For chiplets, it allows the use of older and cheaper processes for digital IO or peripherals. These don’t need to be as performant in PPA terms, so why pay for the expensive tooling or use advanced nodes to develop them?

Then we have the security side; that’s always been an area where open-source tooling is ahead. How can you trust proprietary tools if you can’t inspect how they work? The same ‘more eyes’ reasoning that makes sense for developing secure software and cryptographic algorithms applies here.

Also read:

CEO Interview: Mike Wishart of Efabless

The Importance of Low Power for NAND Flash Storage

Security Requirements for IoT Devices


Technology Design Co-Optimization for STT-MRAM

by Tom Dillinger on 01-04-2022 at 6:00 am


Previous SemiWiki articles have described the evolution of embedded non-volatile memory (eNVM) IP from (charge-based) eFlash technology to alternative (resistive) bitcell devices.  (link, link)

The applications for eNVM are vast, and growing.  For example, microcontrollers (MCUs) integrate non-volatile memory for a variety of code and data storage tasks, from automotive control to financial bankcard security to IoT/wearable sensor data processing.  The key characteristics of eNVM are:

  • performance (read access time, write-verify cycle time)
  • data retention, over voltage and (especially) temperature extremes
    • bitcell “drift” over time (e.g., changes in device resistance leading to increasing bit-error rate)
  • write endurance (# of write cycles)
  • reliability (e.g., susceptibility to bit storage fails from external radiation or magnetic fields)
  • sensitivity to process variability
  • cost (e.g., # of additional mask lithography steps, compatibility of the embedded memory fabrication with existing FEOL and BEOL process steps)
  • yield (assume a double-error correction data width encoding will be used)

(Note that embedded flash requires a large number of extra masks, and requires exposure to high programming voltages.)

STT-MRAM

One of the leading eNVM technologies is the magnetic tunnel junction (MTJ) device, which uses a spin-torque transfer write current mechanism to toggle the MTJ between “parallel” (P) and “anti-parallel” (AP) states.  During a read cycle, the resistance difference between these states is sensed.

The figure below illustrates the process integration of STT-MRAM into a BEOL process for an advanced logic node. [1]

This STT-MRAM process offers a considerable cost advantage over scaling existing eFlash device technology.

In the image on the right above, the word lines run through the array, connected to access devices.  During a read cycle, the column select line is grounded, and the resistance of the active MTJ determines the bitline current.  For a write cycle, the MTJ programming current flows in opposite directions for a write to the AP state versus a write to the P state, so the roles of the bitlines and column select lines are reversed depending on the data value – i.e., write_1:  BL = 0V, CS = VPP;  write_0:  BL = VPP, CS = 0V.
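The read/write bias conditions above can be captured in a small, illustrative sketch (the VPP value is a placeholder, not a process spec):

```python
# Sketch of the array bias conditions described above (VPP is illustrative).
VPP = 1.2  # programming voltage, placeholder value

def bias(op):
    """Return (bitline, column-select) drive for each array operation."""
    table = {
        "read":    ("precharged", "0V"),   # MTJ resistance sets bitline current
        "write_1": ("0V", f"{VPP}V"),      # current flows CS -> BL
        "write_0": (f"{VPP}V", "0V"),      # current reversed: BL -> CS
    }
    return table[op]

print(bias("write_1"))   # ('0V', '1.2V')
```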

STT-MRAM technology does present some challenges, however.

  • small read sensing window

The read cycle needs to sense the difference in MTJ resistance between parallel and anti-parallel states.  Process variation in MTJ characteristics results in narrowing of this resistance contrast.  Sophisticated sense amplifier design methods are needed to compensate for a tight resistance margin.

  • strong MTJ sensitivity to temperature

The embedded MTJ IP will be subjected to temperature extremes both during assembly and during its operational lifetime.  The solder ball reflow and package-to-PCB attach process temperature is far higher than the maximum operational temperature, albeit only once and for a relatively short duration.  (Solder reflow temperatures are ~245C-260C.)  The operational environment for demanding MCU applications typically spans -40C to 125C.  The composition and diameter of the MTJ materials – i.e., the fixed and free magnetic layers, the tunneling oxide – are selected to maintain the spin-transfer torque properties throughout both assembly and operating temperature cycles.

Yet, due to the MTJ sensitivity to temperature, any attempt to pre-program data into the embedded STT-MRAM array prior to exposure to the assembly process temperatures would be fruitless.  Special technology-design co-optimization (TDCO) methods are needed to initialize (a portion of) the STT-MRAM array with key data measured at wafer test – more on these methods shortly.

Also, the read sensitivity – i.e., the resistance difference of P and AP states – is reduced at high temperature.  At cold temperature, the write current required to set the state of the bitcell is increased.  Again, TDCO techniques are required to compensate for these reduced margins at different temperature extremes.

  • process variation in MTJ characteristics

Sensing of the resistance differential also needs to address the process variation in MTJ devices, and the range of P and AP resistance states.

At the recent International Electron Devices Meeting (IEDM) conference in San Francisco, TSMC presented their TDCO approaches to address the STT-MRAM challenges above. [2] The rest of this article summarizes the highlights of their presentation, leading to the production release of STT-MRAM IP in their N22 ultra-low leakage process (N22ULL) node.

TSMC TDCO for N22ULL STT-MRAM

  • read sensing

When an address word line is raised along a set of bitcells in the MRAM array, current flows through the MTJ from the (pre-charged) bitline to the (grounded) select line.  The magnitude of the current on the bitline depends upon the P or AP state of the individual bitcell, and needs to be accurately sensed.  The MTJ process variation across the array suggests that each bitline sense circuit must be individually “trimmed” to match the specific local characteristics of the devices.  And, the strong temperature dependence of the MTJ needs to be dynamically compensated.

The optimized TSMC solution to MRAM bitline read sensing is illustrated below.

The read sense circuitry shown above is differential in nature, intended to amplify the voltage difference on lines Q and QB that evolves during the read cycle.  Prior to the start of the read, both nodes Q and QB are pre-charged.  When the address word line is raised, bitline current flows through the MTJ – in the figure above that is represented by current source Icell.

Note that the bitcells in the memory array are “single-ended” – i.e., there is only one connection to the sense amplifier.  (This is in contrast to a conventional 6T SRAM bitcell, for example, which provides connections to both Q and QB of the sense amplifier.)  As a result of the single connection, it is necessary to provide the QB line with a reference current, which needs to be between the Icell_P and Icell_AP values which may be flowing in the opposite side of the sense amplifier.  Further, this reference current needs to adapt to the local die temperature.

TSMC developed a unique design approach to provide the Iref value to a set of N bitcells + sense amplifiers on a word line in the array.

The figure above depicts N/2 reference MTJs that have been initialized to a P resistive state and N/2 reference MTJs in an AP state.  Their outputs have been dotted to provide a “merged” reference current.  The WL_REF signal is raised in a balanced timeframe as the active wordline – the resulting merged reference current is connected to the N sense amplifiers.  As a result, the Iref current to an individual SA is:

  (((N/2) * I_P) + ((N/2) * I_AP)) / N = (I_P + I_AP) / 2

or the ideal “midpoint” current on the QB line.  After an appropriate duration into the read cycle, when a Q and QB voltage difference has been established, the Latch enable signal is raised to amplify the differential and drive Dout to the read value.

The approach to generate Iref for the sense amplifiers in an MRAM array bank provides both temperature compensation and some degree of “averaging” over process variation.
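A quick numeric sanity check of the merged-reference current, using assumed (illustrative) P and AP cell currents:

```python
# Numeric check of the merged-reference scheme: N/2 P-state and N/2 AP-state
# reference cells are dotted together and shared among N sense amplifiers.
# The cell currents below are illustrative values, not TSMC's.
import math

def iref(i_p, i_ap, n):
    merged = (n // 2) * i_p + (n // 2) * i_ap   # total dotted reference current
    return merged / n                            # share seen by each of N SAs

i_p, i_ap = 40e-6, 15e-6   # assumed parallel / anti-parallel cell currents
print(math.isclose(iref(i_p, i_ap, 16), (i_p + i_ap) / 2))   # True: the midpoint
```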

  • sense amplifier trimming

Nevertheless, MTJ process variation necessitates a per-sense-amplifier correction technique.  In the sense amplifier circuit figure above, devices N1A through N1X are in parallel with the sense pulldown transistor, all connected to Vclamp.  The switches in series with these devices provide the capability to trim the resistance of the Q line during a read cycle.  (The N2A through N2X devices provide a comparable, symmetric capability on the QB line, matching the loading on the Q line.)  During wafer-level testing, the memory BIST macro IP includes programming support to adjust the “trim code” to realize the lowest number of bit read failures during BIST, with error-correction circuitry disabled.  (This testing is performed at elevated temperature.)
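The trim procedure amounts to a small search loop. Here is a hypothetical sketch, with a placeholder cost function standing in for the real BIST error count:

```python
# Hypothetical sketch of trim-code selection at wafer test: sweep each sense
# amplifier's trim code and keep the one with the fewest bit errors (error
# correction disabled). bist_errors() is a stand-in for the real BIST run.
import random

random.seed(0)

def bist_errors(trim_code, best=5):
    # Placeholder cost function: errors fall near some unknown optimum code.
    return abs(trim_code - best) + random.randint(0, 1)

def select_trim(codes=range(16)):
    return min(codes, key=bist_errors)

print(select_trim())   # a code near the (assumed) optimum of 5
```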

  • OTP-MRAM

It was mentioned earlier that the elevated temperatures to which the MTJ is subjected during assembly preclude any attempt to write data into the array during die test.  Yet, the trim code values for each sense amplifier derived during memory BIST need to be retained in the array.  (Also, any built-in array self-repair BISR codes identified after BIST testing need to be stored.)

To address this issue, TSMC developed a unique approach, where some of the MTJ cells are subjected to a one-time programming (OTP) write sequence.  These cells retain their OTP values after exposure to the solder-reflow assembly temperature.

For these storage locations, a (tunnel oxide) “breakdown” voltage is applied to the MTJ to represent a stored ‘0’ value; the cell current will be high.  As illustrated above, any OTP junction that does not receive an applied breakdown voltage during programming will remain (P or AP) resistive, and thus will be sensed as storing a fixed ‘1’ value.

  • temperature-compensated write cycle

Whereas the sense amplifier (Rp versus Rap) read margin is reduced at high temperature, the MTJ write cycle is a greater challenge at low temperatures, where higher currents are required to alter the MTJ state.  TSMC developed an operational write-verify cycle, in which the applied write voltage is dynamically adapted to temperature.  The figure below shows a shmoo plot indicating the (wordline and bitline) write voltage sensitivity versus temperature (for AP-to-P and P-to-AP), and thus the need for compensation.

TSMC noted the “wakeup time” of the analog circuitry used to generate the corresponding write voltages adds minimally to the write cycle time.
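A simplified sketch of such a temperature-compensated write-verify loop; the linear compensation coefficient, thresholds, and retry policy are illustrative assumptions, not TSMC's values:

```python
# Illustrative write-verify loop with a temperature-adapted write voltage.
def write_voltage(temp_c, v_nom=1.0, coeff=0.002):
    # Colder die -> higher switching current needed -> raise the write voltage.
    return v_nom + coeff * (25 - temp_c)

def write_verify(target, read_back, temp_c, max_tries=4):
    v = write_voltage(temp_c)
    for attempt in range(1, max_tries + 1):
        if read_back(target, v):        # did the MTJ reach the target state?
            return attempt, round(v, 3)
        v += 0.05                        # bump the voltage and retry
    raise RuntimeError("write-verify failed")

# Toy read-back model: succeeds once the applied voltage clears a threshold.
ok_above = lambda target, v: v >= 1.10
print(write_verify(1, ok_above, temp_c=-40))   # (1, 1.13)
```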

Summary

At advanced process nodes, STT-MRAM IP offers an attractive evolution from eFlash for non-volatile storage – e.g., high retention, high durability, low additional process cost.  TSMC recently presented their TDCO approach toward addressing the challenges of this technology, adopting several unique features:

  • improved read sensing between Rp and Rap
    • derivation of read sense amplifier reference current compensated for temperature, with process variation averaging
    • per sense amplifier “trimming” for optimal read bit error rate
  • one-time programming cell storage prior to solder reflow assembly, to retain trim codes and array repair values
  • a temperature-compensated write voltage applied to the MTJ (as part of the write-verify cycle)

The characteristics and specs for the TSMC N22ULL STT-MRAM IP are appended below.

To quote TSMC, “Each emerging memory technology has its own unique advantages and challenges.  Design innovation is essential to overcome the new challenges and bring the memory IP to market.”

-chipguy

References

[1] Shih, Yi-Chun, et al., “A Reflow-capable, Embedded 8Mb STT-MRAM Macro with 9ns Read Access Time in 16nm FinFET Logic CMOS Process”, IEDM 2020, paper 11.4.

[2] Chih, Yu-Der, et al., “Design Challenges and Solutions of Emerging Nonvolatile Memory for Embedded Applications”, IEDM 2021, paper 2.4.

Note:  All images are copyright of the IEEE.

 


Demand for High Speed Drives 200G Modulation Standards

by Tom Simon on 01-03-2022 at 10:00 am


Right now, the most prevalent generation of Ethernet for data centers is 400 Gbps, with the shift to 800 Gbps coming rapidly. It is expected that by 2025 there will be 25 million units of 800 Gbps shipped. Line speeds of 100G are used predominantly for 400 Gbps Ethernet – requiring 4 lanes each. Initially 800 Gbps will simply move to 8 lanes, but the bulk of 800 Gbps will ultimately use 200G lanes. This move to 800 Gbps and the expected use of 200G lanes is adding huge impetus to the development of modulation standards for 200G lanes.

For long reach connections line loss and power are major factors in determining what modulation method should be used in going to 200G. 100G uses 4-PAM. There is a recent video and white paper from Alphawave IP, a developer of advanced communications IP, on this topic that compares the various options. The video, titled “Connecting the Digital World—The Path to 224 Gbps Serial Links” also looks at other methods that can be used to improve data rates while not consuming excessive power. In the video Alphawave IP  President and CEO Tony Pialis reviews the options that the industry has for moving forward on 200G standards.

Tony goes through 2-PAM, 4-PAM, 6-PAM, 8-PAM, QPSK and 16-QAM, comparing the tradeoffs for each. For each option there is a penalty in power, SNR or Eb/No to maintain the needed bit error rate. Doubling the symbol rate to reach 200G with 4-PAM requires more power and more bandwidth than 6-PAM, and 2-PAM requires even more power. 6-PAM and 8-PAM suffer from decreased SNR due to the smaller constellation spacing. QPSK and 16-QAM require more channel capacity than PAM modulation techniques, and they also suffer from increased Eb/No or power. There is more to this, and I suggest viewing the video to get the full picture.
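The bandwidth side of these tradeoffs follows directly from bits per symbol. A quick calculation (ignoring coding and FEC overhead):

```python
# Bits per symbol and required symbol rate for a 200 Gbps lane under each
# PAM order. An M-level PAM symbol carries log2(M) bits.
import math

def symbol_rate_gbaud(bit_rate_gbps, pam_levels):
    bits_per_symbol = math.log2(pam_levels)
    return bit_rate_gbps / bits_per_symbol

for levels in (2, 4, 6, 8):
    print(f"{levels}-PAM: {symbol_rate_gbaud(200, levels):.1f} GBaud")
# 2-PAM: 200.0, 4-PAM: 100.0, 6-PAM: 77.4, 8-PAM: 66.7
```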


Other novel methods can be utilized to help reduce errors and increase line efficiency. Tony starts by describing an advanced DSP technique that can improve the integrity of the received signal with virtually no penalty: the use of Decision Feedback Equalizers (DFE) to remove inter-symbol interference (ISI). With the receipt of each identified symbol, the remnant of that symbol can be factored out of the subsequent symbol using DSP techniques. This makes it easier to interpret the incoming symbol.
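A minimal one-tap DFE sketch illustrates the idea: subtract the estimated remnant of the previous symbol before slicing the current one (2-PAM symbols of ±1; the 0.4 post-cursor tap is an illustrative channel value, not from the video):

```python
# Minimal one-tap decision feedback equalizer: cancel the trailing ISI of the
# previous decision before slicing the current sample.
def dfe(received, post_cursor=0.4):
    decisions = []
    prev = 0.0
    for sample in received:
        corrected = sample - post_cursor * prev   # remove trailing ISI
        prev = 1.0 if corrected >= 0 else -1.0    # slicer decision
        decisions.append(int(prev))
    return decisions

# Transmit +1 -1 +1 +1 through a channel with 0.4 post-cursor ISI:
tx = [1, -1, 1, 1]
rx = [tx[0]] + [tx[i] + 0.4 * tx[i - 1] for i in range(1, len(tx))]
print(dfe(rx) == tx)   # True: ISI removed before each decision
```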

Another application of advanced DSP techniques is Maximum Likelihood Sequence Detection (MLSD), which creates a usable model of the channel to predict what various incoming symbol sequences would look like after transiting the channel. By comparing the actual received signal with the predictions for various possible data patterns, the data pattern with the lowest mean square error relative to the actual signal identifies the most likely transmitted data.
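A toy MLSD sketch of the same idea: run every candidate symbol sequence through a simple model of the channel and pick the one whose prediction has the lowest mean square error against what was received (the channel model and values are illustrative):

```python
# Toy maximum-likelihood sequence detection over 2-PAM symbols.
from itertools import product

def channel(seq, post_cursor=0.4):
    # Simple channel model: each symbol leaks into the next sample.
    return [s + post_cursor * (seq[i - 1] if i else 0) for i, s in enumerate(seq)]

def mlsd(received, length):
    def mse(cand):
        model = channel(list(cand))
        return sum((r - m) ** 2 for r, m in zip(received, model))
    # Exhaustive search over all +/-1 sequences; real MLSDs use Viterbi-style
    # dynamic programming instead of brute force.
    return list(min(product((-1, 1), repeat=length), key=mse))

tx = [1, -1, -1, 1]
rx = [x + 0.05 for x in channel(tx)]     # channel output plus a small offset
print(mlsd(rx, 4) == tx)                 # True for this noise level
```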

Amplifying a signal boosts the noise just as much as the data, so smarter methods like the two described above have a lot to offer as data rates push harder against the absolute limits of Ethernet connections and board/package losses. The methods above are also very power efficient.

Tony closes with his thoughts on the use of error correction and modulation methods. 4-PAM is really not able to support longer channels, which leaves 6-PAM as a good alternative for long range links. He even hints at a standard that could mix modulation methods based on the channel. There is no doubt that the push to 200G lanes is on, and we can expect to see its use in 800 GbE and even 1.6 TbE. The full video is available here on the Alphawave IP website.

Also Read:

The Path to 200 Gbps Serial Links

Enabling Next Generation Silicon In Package Products

Alphawave IP is Enabling 224Gbps Serial Links with DSP


Advanced 2.5D/3D Packaging Roadmap

by Tom Dillinger on 01-03-2022 at 6:00 am


Frequent SemiWiki readers are no doubt familiar with the advances in packaging technology introduced over the past decade.  At the recent International Electron Devices Meeting (IEDM) in San Francisco, TSMC gave an insightful presentation sharing their vision for packaging roadmap goals and challenges, to address the growing demand for greater die integration, improved performance, and higher interconnect bandwidth.[1]  This article summarizes the highlights of the presentation.

Background

2.5D packaging

2.5D packages enable multiple die to be laterally positioned in close proximity, with signal redistribution interconnect layers (RDL) fabricated on a silicon interposer that sits between the die and the package substrate.  Through silicon vias (TSVs) provide the connectivity to the substrate.

The TSMC implementation of this technology is denoted Chip-on-Wafer-on-Substrate (CoWoS); it was introduced a decade ago, using multiple FPGA die in the package to expand the effective gate count.

The emergence of high bandwidth memory (HBM) stacked die as a constituent of 2.5D integration offered system architects new alternatives for the memory hierarchy and processor-to-memory bandwidth.

The development investment in 2.5D technology grew, now enabling the silicon interposer area to greatly exceed the “1X maximum” reticle size, to accommodate more (and more diverse) processing, memory and I/O die components (aka, “chiplets”).

Additional package fabrication steps incorporate local “trench capacitors” into the interposer.  Oxide-poly-oxide-poly material layers fill the trench, with the poly connected to the RDL supply metal.  The resulting decoupling capacitance reduces power supply droop considerably.

Alternative technologies have also been developed, replacing the full area silicon interposer with a local “silicon bridge” (CoWoS-L) between adjacent die embedded in an organic interposer, thus reducing cost (albeit with relaxed RDL interconnect dimensions).

Concurrently, for very low cost applications, the demand for higher I/O count die than could be supported with the conventional wafer-level chip-scale package (WLCSP) led to the development of a novel technology that expands the die surface area with a “reconstituted wafer”, on which the redistribution to a larger number of I/O bumps could be fabricated.

This Integrated FanOut (InFO) technology was originally developed for single die (as a WLCSP-like offering).  Yet, the application of this technique is readily extended to support the 2.5D integration of multiple heterogeneous die placed adjacent, prior to the reconstitution step. (The InFO_oS technology will be discussed shortly.)

3D die stacking

3D die stacking technology has also evolved rapidly.  As mentioned above, the fabrication of TSVs spanning between layers of DRAM memory die with “microbumps” attached at the other end of the TSV has enabled impressive levels of vertical stacking – e.g., eight memory die plus a base logic controller die in an HBM2e configuration.

Similarly, through-InFO vias (located outside the base die in the reconstituted wafer material) have enabled additional micro-bumped die to be vertically stacked above the base InFO die – e.g., a memory die on top of a logic die.

The most recent advancement in 3D stacking technology has been to employ bump-less “direct bonding” between two die surfaces.  Applying a unique thermal + compression process, two die surfaces are joined.  The metal pad areas on the different die expand to form an electrical connection, while the abutting dielectric surfaces on the two die are bonded.  Both face-to-face (F2F) and face-to-back (F2B) die orientations are supported.  The planarity and uniformity (warpage) requirements of the surfaces are demanding; particulates present on the surface are especially problematic.  TSMC denotes their 3D package technology as System-on-Integrated Chips, or “SoIC”.

As product architects are exploring the opportunities available with these packaging technologies, there is growing interest in combining “front-end” 3D stacked SoIC configurations with 2.5D “back-end” (InFO or CoWoS) RDL patterning and assembly.  The collective brand that TSMC has given to their entire suite of advanced packaging offerings is “3D Fabric”, as illustrated below.

TSMC 3D Fabric Roadmap

At IEDM, TSMC shared their strategy for improving performance, power efficiency, signal bandwidth, and heat dissipation for these technologies.  (The majority of the focus was on bonding technology for SoIC.)

CoWoS (2.5D)

    • increase package dimensions to 3X maximum reticle size for the Si interposer
    • expectation is that stacked SoIC die will be integrated with multiple HBM stacks

InFO_oS (2.5D)

The original InFO offering was an evolution of WLCSP, first as a single die, and then as a base die with another die added on top, connected to the through-InFO vias.  TSMC is also expanding the InFO offering to support multiple adjacent die embedded in the reconstituted wafer; the RDL layers are then fabricated and microbumps added for attach to a substrate (InFO-on-Substrate, or InFO_oS).  A projection for the InFO_oS configurations to be supported is illustrated below.

SoIC (3D)

The roadmap for 3D package development is shown below, followed by a table illustrating the key technical focus – i.e., scaling the bond pitch of the (F2F or F2B) stacked connections.

The bond pitch (and other metrics) for microbump technology evolution are included with the SoIC direct bonding measures in the table above for comparison.

As shown in the table above, TSMC has defined a new (relative comparison) metric to represent the roadmap for 3D stack bonding technology – an “Energy Efficiency Performance” (EEP) calculation.  Note that the target gains in EEP are driven by the aggressive targets for scaling of the bond pitch.

EEP = (bond_density) * (performance) * (energy efficiency)

Much like the IC scaling associated with Moore’s Law, there are tradeoffs in 3D bond scaling for performance versus interconnect density.  And, like Moore’s Law, the TSMC roadmap goals are striving for a 2X improvement in EEP for each generation.
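To see why bond pitch dominates the EEP roadmap, note that bond density scales with the inverse square of the pitch. A rough illustration (the pitch values, and holding the performance and energy-efficiency terms fixed, are assumptions for the sketch, not TSMC roadmap numbers):

```python
# Rough illustration of the EEP metric: bond density scales as 1/pitch^2,
# so pitch scaling alone can drive large generational EEP gains.
def eep(pitch_um, performance=1.0, energy_eff=1.0):
    bond_density = 1.0 / pitch_um**2        # bonds per unit area (relative)
    return bond_density * performance * energy_eff

# Scaling the pitch by 1/sqrt(2) (e.g., an assumed 9um -> 6.3um step)
# roughly doubles EEP with everything else held constant:
gen_over_gen = eep(pitch_um=6.3) / eep(pitch_um=9.0)
print(f"{gen_over_gen:.2f}x")   # 2.04x
```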

SoIC Futures

As an illustration of the future potential for 3D stacking, TSMC provided an example of a three-high stacked structure, as shown below.

Note that the assumption is that future HBM stacks will migrate from a microbump attach technology within the stack to a bonded connection – the benefits of this transition on performance, power, and thermal resistance (TR) are also shown in the figure.

heat dissipation

Speaking of thermal resistance, TSMC emphasized the importance of both the bonding process for low TR and design analysis of the proposed 3D stack configuration, to ensure the junction temperature (Tj) across all die remains within limits.

The IEDM presentation referred to additional research underway at TSMC to evaluate liquid-cooling technology options. [2] As illustrated below, “micro-pillars” can be etched into a silicon lid bonded to the assembly, or even directly into the die, for water cooling.

Summary

Advanced 2.5D and 3D packaging technologies will provide unique opportunities for systems designers to optimize performance, power, form factor (area and volume), thermal dissipation, and cost.  TSMC shared their development roadmap for both 2.5D and 3D configurations.

The 2.5D focus will remain on support of larger substrate sizes for more (heterogeneous) die integration;  for markets focused on cost versus performance, different interposer/bridge (CoWoS) and reconstituted wafer (InFO) technology options are available.

3D stacking technology will receive the greatest development focus, with an emphasis on scaling the interface bond pitch.  The resulting “2X improvement in EEP” for each SoIC generation is the target for the new “More than Moore” semiconductor roadmap.

-chipguy

References

[1] Yu, Douglas C.H., et al, “Foundry Perspectives on 2.5D/3D Integration and Roadmap”, IEDM 2021, paper 3-7.

[2]  Hung, Jeng-Nan, et al., “Advanced System Integration for High Performance Computing with Liquid Cooling”, 2021 IEEE 71st Electronic Components and Technology Conference (ECTC), p. 105-111.

Note:  All images are copyright of the IEEE.


Webinar: AMS, RF and Digital Full Custom IC Designs need Circuit Sizing

by Daniel Payne on 01-02-2022 at 10:00 am


My career started out by designing DRAM circuits at Intel, and we manually sized every transistor in the entire design to get the optimum performance, power and area. Yes, it was time consuming, required lots of SPICE iterations and was a bit error prone. Thank goodness times have changed, and circuit designers can work smarter by using EDA tools that size transistors to meet goals, without all of that manual sizing and SPICE iterations.

I’ve been following EDA vendors with transistor sizing tools for many years now, and MunEDA has this technology. They hosted a webinar on Optimal Circuit Sizing Strategies for Performance, Low Power, and High Yield of Analog and Full-custom IP. You can see the replay HERE.

I asked some questions about their circuit sizing technology to learn more, prior to the webinar.

Circuit Sizing Q&A

Q: Does the circuit sizing work for any IC technology: Planar CMOS, FinFET, GAA, Bipolar, BiCMOS, SiC ?

Yes, the optimization algorithms we use for circuit sizing have been developed and adapted to all of today's typical semiconductor process technologies, like the ones you mention. This is enabled by smart combinations of continuous and discrete sizing methods that have been continuously improved across process generations over the years and are by now highly applicable and efficient.

Q: How large of an IP block can I optimize sizes for, in terms of MOS transistors and Resistors?

There is not really a limit on the number of single devices in your circuit. Nevertheless, circuit sizing is more practical when you have circuits or blocks with a reasonable simulation time, from a few seconds to a few minutes for a single simulation. Typical IP blocks used for sizing and optimizing are between a few dozen and several hundred devices. You have to consider that a nominal optimization run typically requires a few hundred simulations; a full yield optimization including worst-case and degradation effects can require a few thousand simulations. Whether you expect a result within 1-2 hours, or can run the optimization over a weekend or a whole week, will have a great influence on which circuits, or even whole chips, are useful candidates for such optimization runs.
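Those figures make it easy to estimate a run's wall-clock time as (simulations × time per simulation) / parallel engines. A back-of-envelope sketch, with example counts chosen within the quoted ranges:

```python
# Back-of-envelope runtime for a sizing run, using the ranges quoted above.
# The specific counts and engine count below are illustrative assumptions.
def runtime_hours(n_sims, sim_minutes, engines=1):
    return n_sims * sim_minutes / engines / 60

# A nominal optimization (~300 sims of ~1 min each) on 8 parallel engines:
print(round(runtime_hours(300, 1, engines=8), 2))    # 0.62 hours
# A full yield optimization (~3000 sims of ~2 min each) on 8 engines:
print(round(runtime_hours(3000, 2, engines=8), 1))   # 12.5 hours
```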

Q: Do I use my own SPICE circuit simulator along with your optimization tool?

MunEDA’s tools are simulator-agnostic, which means they are integrated with and run on the standard industrial SPICE simulators from the large simulator vendors. We have also integrated and run our tools with customers’ in-house simulators for many years. We do not force customers to use a specific simulator with our tools. Customers like to work in their own quality-proven and certified design framework and simulation environment, into which tools like MunEDA’s should integrate smoothly and seamlessly. That is guaranteed for MunEDA’s tools for circuit migration, verification and optimization.

Q: Does your approach take advantage of multi-core CPUs?

Yes, all simulation runs can be parallelized over a network and run simultaneously on parallel simulation engines using multi-core CPUs for further speed-up.
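From a script's point of view, this works because independent corner or sample simulations are embarrassingly parallel. A hedged sketch (the function below is a mock stand-in for launching a real external SPICE job; the actual MunEDA/simulator integration will differ):

```python
from concurrent.futures import ThreadPoolExecutor

def run_spice(corner):
    # Mock stand-in for one external SPICE job; a real script would
    # invoke the simulator via subprocess here and parse its output.
    process, temp_c = corner
    penalty = {"tt": 0.0, "ss": 0.05, "ff": -0.03}[process]
    return 1.0 + 0.001 * temp_c + penalty  # mock delay in ns

# 3 process corners x 3 temperatures = 9 independent simulations
corners = [(p, t) for p in ("tt", "ss", "ff") for t in (-40, 25, 125)]

with ThreadPoolExecutor(max_workers=4) as pool:
    delays = list(pool.map(run_spice, corners))

worst_delay = max(delays)  # slow-slow at 125 C dominates in this mock
```

Threads are sufficient here because each worker would spend its time waiting on an external simulator process; across a compute farm, the same pattern maps onto a job scheduler instead of a local pool.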

Q: Can I run optimization in the cloud as a service?

MunEDA offers EDA tools for automated migration, verification, sizing and optimization for direct installation by our customers. We do not offer optimization as a cloud service, but our customers can install and use our software in the cloud. In practice, our customers, fabless design houses and IDMs (Integrated Device Manufacturers), often use our tools to migrate their own or their customers’ IP from an existing foundry process to a new process technology. After migration, running circuit optimization helps meet the new specifications for the transferred IP much faster and more efficiently.

Q: Is your sizing technology patented?

We have no patents on our sizing technology. There are many publications about circuit sizing and optimization, but only a very few EDA vendors have managed to implement these complex methods in easy-to-use tools the way MunEDA has.

Q: Has there been correlation with silicon results to prove that the sizing was optimized, or do we just compare SPICE simulation results?

We have many such cases, some of which our customers have published at our regular MUGM (MunEDA User Group Meeting) events. Customers compare correlations between simulation runs and silicon runs. Our methods and software can often also detect problems with the technology data in the PDK, and comparisons between PCM measurements and simulation data can be checked with our tools. This gives both the designer and the process engineer higher confidence about effects that can arise between simulation and manufacturing of circuits and chips.

Q: What is the learning curve like for your circuit sizing tool?

It is usually quite easy. The circuit designer typically knows her/his circuit well, so the relevant performances, specifications and other targets are usually already known. The designer simply defines these sizing and optimization targets in the tool (they can also be partially imported from the design framework) and starts the optimization algorithms. The sizing tool takes all constraints and circuit restrictions into account and optimizes the given circuit as far as possible. The optimization procedure follows exactly the structure the designer knows from manual design optimization: constraint check and optimization, performance optimization, worst-case corner optimization, optimization for statistical variation and yield, and even degradation and reliability effects. The setup is fast and easy, and the optimization itself can run automatically in the background without much designer attention. Since all tools can be run through an easy-to-use GUI (Graphical User Interface) as well as in batch mode, the designer can easily select her/his preferred way of working.
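The staged flow described here can be sketched in a few lines. This is a toy mock (placeholder delay model and step size, not the tool's actual logic), but the ordering of stages mirrors the answer:

```python
def mock_delay(design, corner=1.0):
    # Placeholder testbench: delay improves as the width W grows
    return corner * 2.0 / design["W"]

def check_constraints(design):
    assert design["W"] >= design["W_min"], "width below allowed minimum"

def optimize_for(design, corner):
    # Crude stand-in for one optimization stage: grow W until spec is met
    while mock_delay(design, corner) > design["spec_ns"]:
        design["W"] += 0.1
    return design

design = {"W": 1.0, "W_min": 0.5, "spec_ns": 1.0}
check_constraints(design)                  # 1. constraint check
design = optimize_for(design, corner=1.0)  # 2. nominal performance
for k in (1.1, 1.2):                       # 3. worst-case corners
    design = optimize_for(design, corner=k)
# 4. statistical yield / degradation stages would follow the same pattern
```

Each stage tightens the design further, so by the end the circuit meets its spec even at the most pessimistic corner considered.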

Q: Are the optimization results displayed numerically, graphically or both?

The optimization results are always available both ways. Beyond that, you can also compare the values and curves with the waveforms extracted from the SPICE simulations. The designer can easily see how much trade-off headroom remains in the circuit for further improvement (e.g. less area, less power, or higher performance and speed). There are many GUI and display functions through which the designer can extract information for design and quality reports, and numerous export and printing functions for transferring the results to other tools.

Q: Can I optimize for both time-domain and frequency-domain analysis?

Yes, this can be done by simultaneously running the same DUT (device under test) in different domains using our powerful multi-testbench environment.
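One hypothetical way to picture simultaneous time- and frequency-domain optimization (mock models, not the tool's internals) is a single cost function that penalizes missing either testbench's spec, so one optimizer trades both off at once:

```python
def settling_time_us(bias_ua):   # mock transient (time-domain) testbench
    return 5.0 / bias_ua

def bandwidth_mhz(bias_ua):      # mock AC (frequency-domain) testbench
    return 2.0 * bias_ua

def cost(bias_ua):
    # One objective across both domains: settling <= 1 us AND bandwidth >= 8 MHz
    miss_settling = max(0.0, settling_time_us(bias_ua) - 1.0)
    miss_bandwidth = max(0.0, 8.0 - bandwidth_mhz(bias_ua))
    return miss_settling + miss_bandwidth

# Sweep candidate bias currents from 0.5 uA to 20 uA in 0.5 uA steps
candidates = [0.5 * k for k in range(1, 41)]
best_bias = min(candidates, key=cost)  # ties resolve to the smallest bias
```

In this mock, any bias of 5 uA or more satisfies both specs, so the sweep settles on the cheapest one; a real multi-testbench run would evaluate actual transient and AC simulations instead of these closed-form stand-ins.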

Q: How do I control the optimization process, are there any settings that I need to learn?

You can follow directly in the GUI the changes the tool makes to your circuit performances and other parameters during the optimization process. Graphical plots are also available.

Q: How is ML applied during circuit sizing?

Our sizing methods contain intelligent ML-based decision algorithms that continuously measure and simulate the current status and automatically compute directions for improving the circuit in the desired way. For this reason, the designer’s attention during the sizing and optimization process can be reduced to an absolute minimum. The ML-based algorithms can also run circuit optimization for the same testbench under various, sometimes conflicting, conditions.
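MunEDA's actual ML algorithms are proprietary, but the "measure, compute a direction, improve" loop can be illustrated with a toy two-parameter example, where central differences on a mock objective stand in for whatever learned local model the tool uses:

```python
def power_uw(w1, w2):
    # Mock objective: power is minimized at widths (3.0, 1.5)
    return (w1 - 3.0) ** 2 + (w2 - 1.5) ** 2 + 10.0

def local_direction(f, x, h=0.01):
    # Central-difference gradient as a stand-in for a learned surrogate model
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        dn = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(*up) - f(*dn)) / (2 * h))
    return grad

x = [1.0, 1.0]           # initial sizing guess
for _ in range(200):     # measure -> direction -> improve, repeated
    g = local_direction(power_uw, x)
    x = [xi - 0.1 * gi for xi, gi in zip(x, g)]
```

Each iteration "measures" the circuit around the current point and steps downhill, which is why the designer can leave the loop running unattended.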

Q: Who are some of the customers using your circuit sizing?

Our circuit sizing algorithms are used by numerous large, midsize and small IDMs (integrated device manufacturers), fabless design and IP houses, and also by the IP and design-services departments of silicon foundries. We have numerous publications and presentations from customers such as Samsung, STMicroelectronics, SK Hynix, Infineon, Novatek, ROHM, Fraunhofer, inPlay, SMIC, and many others, from our MUGM MunEDA User Group Meetings as well as international conferences such as DAC, CICC, ANALOG and others.

Q: How does the circuit sizing optimization take into account all of the layout parasitics?

After layout, you can run our highly efficient circuit sizing tools on the extracted, flat netlists to check the parasitics and reduce their influence through very small sizing steps, especially to improve the final yield and reduce sensitivity to statistical process variations.

Q: Can a junior IC circuit designer be successful with your tool, or do I need to be a senior IC circuit designer?

All designers can easily run our tools for circuit migration, verification, sizing and optimization, regardless of whether they have a few or many years of experience. The tools are used by graduate and PhD students at universities as well as by design fellows at industrial semiconductor design and manufacturing companies. Whether through GUI-based step-by-step improvements or fully automatic circuit sizing, the tools deliver this expertise to the designer and adapt to her/his individual experience level.

Summary

Learn how to automate the circuit sizing portion of your transistor-level IC designs to get the best performance in a reasonable amount of time in this webinar. You can see the replay HERE.

Related Blogs