
Predicting EUV Stochastic Defect Density

by Fred Chen on 12-04-2022 at 6:00 am

Extreme ultraviolet (EUV) lithography targets patterning pitches below 50 nm, beyond the resolution of an immersion lithography system without multiple patterning. In exposing these smaller pitches, stochastic patterning effects, i.e., random local pattern errors from unwanted resist removal or insufficient exposure, have emerged due to the smaller effective pixel size and the smaller number of photons absorbed per pixel. In this article, I present a way to visualize the defective pixel rate and how it may be tied to stochastic defect density.

Here, for the most straightforward analysis, we consider an idealized image: a 1:1 duty cycle line grating with binary amplitude. We also focus on the pitch range of 50 nm and below for a 0.33 NA EUV system. Consequently, the normalized image can be represented mathematically as 0.25 + (1/pi)^2 + (1/pi)*cos(2*pi*x/pitch). The absorbed dose profile in the resist is therefore proportional to this expression, scaled by the average absorbed dose. To keep things simple, we ignore the polarization- and angle-based 3D mask effects which are actually present, as well as electron blur, which would become much more significant for the 0.55 NA EUV systems [1].
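As a quick sketch, the normalized image expression can be evaluated directly; the 0.25 term is the squared zeroth-order amplitude, 1/pi^2 the squared first-order amplitude, and the cosine their interference (the 50 nm pitch and 101-point sampling below are illustrative choices):

```python
import numpy as np

def normalized_image(x, pitch):
    """Normalized two-beam image of a 1:1 binary-amplitude grating:
    0th order (amplitude 0.5) interfering with one 1st order (amplitude 1/pi)."""
    return 0.25 + 1/np.pi**2 + (1/np.pi) * np.cos(2*np.pi*x/pitch)

x = np.linspace(0.0, 50.0, 101)      # one 50 nm pitch period, illustrative sampling
img = normalized_image(x, 50.0)
# peak ~0.670 at the line center, minimum ~0.033 between lines
```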

This absorbed dose profile is plotted on a preset grid. I used a 99 x 101 pixel grid, where the pixel size is 1/100th of the pitch. Poisson statistics are used to obtain the random absorbed dose at each pixel. A pixel in a nominally exposed region is considered defective if its dose falls below the exposure threshold, producing an unexposed defect, while a pixel in a nominally unexposed region is defective if its dose exceeds that threshold, producing a potential bridge defect. By changing the dose, improperly unexposed or exposed pixels can be visualized (Figure 1).
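A minimal Monte Carlo sketch of this experiment follows. The dose value, photon-energy conversion, and threshold placement at the image mean are my assumptions for illustration, not the article's exact settings; the dose is deliberately low so that defective pixels appear on a small grid:

```python
import numpy as np

rng = np.random.default_rng(0)

pitch = 50.0                     # nm
px = pitch / 100.0               # pixel size = 1/100th of the pitch
ny, nx = 99, 101                 # pixel grid
dose = 30.0                      # assumed absorbed dose, mJ/cm^2 (illustrative, low)
ev_per_photon = 91.8             # 13.5 nm EUV photon energy, eV
# 1 mJ/cm^2 = 6.24e15 eV per 1e14 nm^2 = 62.4 eV/nm^2
photons_per_nm2 = dose * 62.4 / ev_per_photon

x = (np.arange(nx) - nx // 2) * px
profile = 0.25 + 1/np.pi**2 + (1/np.pi) * np.cos(2*np.pi*x/pitch)
mean = np.broadcast_to(profile * photons_per_nm2 * px**2, (ny, nx))
sampled = rng.poisson(mean)      # random absorbed photon count per pixel

# assumed exposure threshold at the image mean level
thr = (0.25 + 1/np.pi**2) * photons_per_nm2 * px**2
nominally_exposed = mean > thr
unexposed = nominally_exposed & (sampled < thr)    # unexposed defects
bridging = ~nominally_exposed & (sampled > thr)    # potential bridge defects
defective_rate = (unexposed | bridging).mean()
```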

Figure 1. Stochastic defects at lower doses (left) tend to be unexposed pixels (blue in the central orange area), while at higher doses (right) they tend to be improperly exposed pixels (orange in the top/bottom blue areas).

By scanning the dose, the defective pixel rate may be plotted as a function of absorbed dose. Unexposed pixels decrease with increasing dose, while beyond some dose, improperly exposed pixels leading to bridging start increasing (Figures 2 and 3). The smallest defective pixel rate detectable on this small grid is ~1e-4 (one pixel in roughly 10,000). The defective pixel rate is not a direct measure of predicted defect density. Instead, we rely on a formula from de Bisschop [2] used for inspection image pixels: defects/cm2 = 1e14 * pixNOK/(N*P*R), where pixNOK is the defective pixel rate, N is the average number of pixels per defect, P is the pitch, and R is the pixel size in nm. For the 50 nm pitch case, a 3e-10 defective pixel rate with 0.5 nm/pixel and 100 pixels/defect gives 12 defects/cm2. For the 40 nm pitch case, a 1e-9 defective pixel rate with 0.4 nm/pixel and 125 pixels/defect gives 50 defects/cm2. These values are comparable to recently published values [3].
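The two worked examples are easy to check with a direct transcription of the quoted formula:

```python
def defects_per_cm2(pix_nok, n_pixels_per_defect, pitch_nm, pixel_nm):
    """de Bisschop [2]: defects/cm^2 = 1e14 * pixNOK / (N * P * R)."""
    return 1e14 * pix_nok / (n_pixels_per_defect * pitch_nm * pixel_nm)

print(defects_per_cm2(3e-10, 100, 50, 0.5))   # 50 nm pitch case, ~12 defects/cm^2
print(defects_per_cm2(1e-9, 125, 40, 0.4))    # 40 nm pitch case, ~50 defects/cm^2
```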

Figure 2. Defective pixel rate (out of 99 x 101 0.5 nm pixels) for 25 nm half-pitch vs. absorbed dose.

Figure 3. Defective pixel rate (out of 99 x 101 0.4 nm pixels) for 20 nm half-pitch vs. absorbed dose. Optimum absorbed dose and minimum defective rate are higher for the reduced pitch.

At the same average absorbed dose, the smaller pitch shows larger variations due to the smaller pixel size. It is therefore to be expected that larger doses are preferred to maintain a given defective pixel rate. The relative standard deviation is also larger for the smaller pitch (due to fewer photons within the grid area) within a given dose range, which also leads to a higher minimum defect rate.

The immensely greater photon density of ArF immersion systems has allowed them to avoid visible stochastic effects down to the 80 nm pitch (Figure 4), even with a relatively low absorbed dose in mJ/cm2.

Figure 4. Negligible stochastic effects show at 80 nm pitch for ArF immersion lithography, even with only 3 mJ/cm2 absorbed.

References

[1] T. Allenet et al., “EUV resist screening update: progress towards High-NA lithography,” Proc. SPIE 12055, 120550F (2022).

[2] P. de Bisschop, “Stochastic printing failures in extreme ultraviolet lithography,” J. Micro/Nanolith. MEMS MOEMS 17, 041011 (2018).

[3] S. Kang et al., “Massive e-beam metrology and inspection for analysis of EUV stochastic defect,” Proc. SPIE 11611, 1161129 (2021).

This article originally appeared in LinkedIn Pulse: Predicting EUV Stochastic Defect Density

Also Read:

Electron Blur Impact in EUV Resist Films from Interface Reflection

Where Are EUV Doses Headed?

Application-Specific Lithography: 5nm Node Gate Patterning

Spot Pairs for Measurement of Secondary Electron Blur in EUV and E-beam Resists

EUV’s Pupil Fill and Resist Limitations at 3nm


Podcast EP128: Secure-IC’s Vision For Cybersecurity
by Daniel Nenni on 12-02-2022 at 10:00 am

Dan is joined by Hassan Triqui, who has over 20 years of experience in the technology sector. Prior to spearheading Secure-IC’s development into a major player in embedded cybersecurity solutions, Hassan was a senior executive at Thales and Thomson.

Dan explores Secure-IC’s vision and strategy to deploy integrated cybersecurity capability across many products and markets. The Company’s chip to cloud vision is discussed as well as its recent acquisition of Silex Insight. The impact of the complete portfolio is examined.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Samsung Versus TSMC Update 2022
by Daniel Nenni on 12-02-2022 at 6:00 am


After attending the TSMC and Samsung foundry conferences, I wanted to share some quick opinions about the foundry business. Nothing earth-shattering, but interesting just the same. Both conferences were well attended; if we are not back to pre-pandemic numbers, we are very close.

TSMC and Samsung both acknowledged that there could be a correction in the first half of 2023, but over the next five years semiconductors and the foundry business will see very healthy growth rates. Very good news, and I agree completely. The strength and criticality of semiconductors have never been more evident and the foundry ecosystem has never been stronger, absolutely.

At their recent Foundry Forum Samsung forecasted (citing Gartner) that by 2027 the semiconductor industry will approach $800B at a 9% Compound Annual Growth Rate and the foundry industry will experience a 12% CAGR. Samsung Foundry predicts advanced nodes (< 7nm) to outgrow the foundry industry at a 21% CAGR over the next five years and predicts its business will grow to approximately $26B by 2027 with a 20% CAGR.

It will be interesting to see what TSMC guides for 2023 during the Q4 2022 earnings call. Any guesses? Double digits (10-20%) growth is my guess. N3 will be in full production and it will be the biggest node in the history of TSMC, my opinion.

According to the World Semiconductor Trade Statistics (WSTS) the semiconductor industry is now expected to grow 4% in 2022 and drop 4% in 2023. After a 26% gain in 2021 this should not be a surprise. TSMC still expects 35% growth in 2022 and based on their monthly numbers that sounds reasonable.

For Samsung the FinFET era has come to an end with the R&D focus being on GAA. Samsung had a good run at 14nm even getting a piece of the Apple iPhone 6s business. And let’s not forget that Globalfoundries licensed Samsung 14nm so that success belongs to Samsung as well as GF.

Unfortunately, Samsung 10nm was an utter failure in both yield and PPAC (performance, power, area, cost). TSMC 10nm did not fare well either, with the exception of Apple. The ROI between 14/16nm and 10nm just was not enough for most customers, and the promise of 7nm was worth the wait.

7nm did much better and Samsung again came back to the competitive table. Samsung 14nm was still a stronger node but 8/7nm is doing very well. This can be seen with the current TSMC 7nm slump as Samsung is a cheaper alternative. Unfortunately, Samsung 5/4nm had serious PDK and yield problems so the lion’s share of the leading edge FinFET market went back to TSMC and will stay there, my opinion.

This leaves the door wide open for Intel Foundry Services to get back in the foundry game. IFS will be spending time with us at IEDM this coming week so we can talk more after that. If Intel executes on their process roadmap down to 18A this could get really interesting.

All three foundries are talking about GAA and Samsung is even in very limited production at 3nm GAA but personally I think the FinFET era will continue on for a few more years as we get the kinks worked out of GAA. In talking to the ecosystem at the conferences, HVM GAA is still years away and the PPAC (power/performance/area and cost) is still a big question. Based on the papers I have seen we should get a pretty good GAA update next week at IEDM. Scott Jones and I will be there amongst the media masses.

One of the more interesting battles between Samsung and TSMC became clear at the conferences, and that is RF. I fully expect IFS to hit this market hard as well. Based on the talk inside the ecosystem, Samsung 8nm RF is a cheaper non-EUV alternative to TSMC N6F and it seems to be experiencing a surge in popularity. TSMC N6F, however, is set to fill the N7 fabs, so we should see a big push from TSMC in that direction. At the recent TSMC OIP, analog automation, optimization, and migration were popular topics (TSMC OIP – Enabling System Innovation, TSMC Expands the OIP Ecosystem!). But again, RF chips are very price sensitive, so if the design specs can be met at Samsung 8RF and the ecosystem is willing, then that is where the chips will go, my opinion.

Source: Samsung

Capacity plans were discussed in detail at both conferences. If you look at TSMC, Samsung, and Intel fab plans, you will wonder how they will be filled. TSMC builds fabs based on customer demand, which now includes prepayments, so I have no worries there. Samsung and Intel, however, seem to be following the Field of Dreams strategy, as in “build it and they will come”. I have no worries there either. If all of the fab expansion and build plans that I have seen announced do actually happen, we will have oversupply in the next five years, which is a good thing for the ecosystem and customers. TSMC, Samsung, and IFS can certainly weather a pricing storm, but the 2nd, 3rd, and 4th tier foundries may be in for rougher times.

Just my opinion of course but since I actively work inside the semiconductor ecosystem I am more than just a pretty face.

Also Read:

TSMC OIP – Enabling System Innovation

TSMC Expands the OIP Ecosystem!

A Memorable Samsung Event

Intel Foundry Services Forms Alliance to Enable National Security, Government Applications


Hyundai’s Hybrid Radio a First

by Roger C. Lanctot on 12-01-2022 at 10:00 am

Current owners of a wide range of Hyundai connected cars spanning multiple model years are receiving, or have received, notification of eligibility for a software upgrade that will connect their in-vehicle radio to the Internet. Owners who have this update installed may not realize they are experiencing a first of its kind: a connected, or hybrid, radio experience in a mass-market vehicle.

Notably, the experience is enabled in the Hyundai Ioniq 5 – an electric vehicle. I write “notably” because we just learned this week that Ford Motor Company’s F-150 Lightning comes with no AM radio. The Hyundai Ioniq 5 proudly preserves AM radio along with Internet access that allows the system to display station ID and logo, streaming promos for the station and its Website, music track and artist, and streaming elements of broadcast advertising.

All of this extra information related to the broadcast is referred to as “metadata” and broadcasters have struggled to deliver the information in a consistent manner – while automakers have struggled to render the information consistently. Working with Xperi, Hyundai is delivering all of it – now – in millions of dashboard systems.

The integration of this metadata – what Xperi calls DTS AutoStage – is only the first step in the transformation of broadcast radio. The availability of the Internet connection and the data ultimately means that future software updates could add content search capability to the radio experience or even alerts and links to further information, Internet content, or e-commerce opportunities.

It was just three years ago that NextRadio gave up on trying to deliver a hybrid radio experience via activated FM chips in Android phones with cellular service from Sprint – now part of T-Mobile. It was a valiant effort and a clever solution leveraging HD Radio technology to create a searchable broadcast solution. The complexity of the NextRadio solution and Apple’s refusal to activate the FM chips in its own phones doomed this ambitious effort.

Audi was next with its own concept of a hybrid radio, with a system developed entirely in house and originally deployed only in the most expensive Audi, the A8, and originally only in Europe. Parent Volkswagen has since indicated its plans to bring its hybrid radio platform to all Volkswagen brands – and that rollout is steadily proceeding.

The focal point of the Audi hybrid solution – connecting the radio to the Internet – was the ability of the in-car radio to grab a radio station’s Internet stream more or less seamlessly when the terrestrial signal was lost due to a car driving out of range. The idea was as clever as the NextRadio solution, but it was not scalable beyond Volkswagen vehicles and it, too, was and is a bit of a Rube Goldberg proposition with inconsistent execution across platforms.

Mercedes Benz arrived on the hybrid radio scene last year with its own hybrid radio enabled by Xperi’s DTS AutoStage technology. The Mercedes offering added unique HMI elements such as a kind of carousel of radio station logos that made manual searching for a station in a car a truly unique experience and potentially less distracting than turning a knob – though maybe more distracting than pressing a button.

To be clear, Mercedes was first to deploy the DTS AutoStage solution. Hyundai is the first to bring DTS AutoStage to the masses. In fact, Hyundai is simultaneously bringing DTS AutoStage to multiple Kia and Genesis models as well. Tesla owners may also soon discover their vehicles infused with DTS AutoStage.

The onset of electric vehicles has thrust the in-car experience of radio into the spotlight. Lucid Motors announced this week that its full line-up of vehicles will be equipped with SiriusXM satellite radio technology. The announcement cut through the growing suspicion that emerging EV makers – like Tesla – were somewhat ambivalent about including SiriusXM reception in all their vehicles.

TuneIn has lately been pointing the way to an exclusively Internet-based in-vehicle radio experience. TuneIn recently added Rivian to its existing roster of automotive partners which already includes Tesla, Mercedes, Polestar, and Jaguar Land Rover. TuneIn is also available via Amazon’s Alexa digital assistant wherever it is available in an embedded system.

Mercedes, Hyundai, and Tesla are all pointing the way toward a digital, searchable, delightful embedded radio experience that includes FM AND AM. Hyundai is first to bring the DTS AutoStage hybrid radio technology to the masses, but it won’t be the last.

Also Read:

Mobility is Dead; Long Live Mobility

Configurable Processors. The Why and How

Requiem for a Self-Driving Prophet


INNOVA PDM, a New Era for Planning and Tracking Chip Design Resources is Born

by Daniel Nenni on 12-01-2022 at 6:00 am

No doubt the design success of today’s systems-on-chip (SoCs) is directly linked to successful cost control. More market opportunities open up for less expensive SoCs and electronic systems.

Both design cost prediction and resource tracking during the design process are key to such success.

Predicting design cost needs to cover all aspects: design (EDA) tools, computing servers, human resources, external IP cores, etc. All of these aspects need to be tracked automatically, reporting any problem such as a resource that is no longer available and measuring the impact on subsequent design steps. Otherwise, the financial impact can be significant, especially against tight tape-out schedules.

INNOVA Advanced Technologies, through its PDM (Project & Design Management) tool, offers the first software solution on the market that considers all of the above aspects simultaneously and automatically.

INNOVA Advanced Technologies was founded in 2020 by seasoned veterans of the semiconductor industry. Its solution is intended for designers as well as design managers of complex and multi-domain projects, ranging from microelectronics to computer science, helping them manage projects and resources in one place.

The INNOVA Project and Design Management (PDM) software platform offers a single portal linking areas that were, until now, considered separately. This covers all resource management for a complex design project: design flows and tools, computing servers, and human resources.

Being fully compatible with the design and IT systems in place, this disruptive and non-intrusive solution serves as a single portal. It helps reduce the complexity of using design tools and dedicated design environments. Thanks to its rich reports, including alarms and dashboards, optimal decisions can be made for design resource planning, monitoring, and adjustment throughout the complete design life cycle.

Each design step traditionally requires dozens of software tools, and often several hundred design engineers are involved throughout a project such as designing a communication chip or a microprocessor.

There are also significant intangible resources involved: predesigned blocks, various software and design flows, computer resources (server farms, etc.), and libraries in connection with companies manufacturing electronic components.

PDM, an open and secure platform, correlates design projects directly with the design resources involved. The tool is fully customizable, and both graphical and script-based APIs are open to the users.

Thanks to the INNOVA PDM platform, it is possible to consult information related to current projects: progress, human resource utilization, anticipation of possible delays, and the effects those delays may have on the rest of the design chain. This multidimensional tracking of EDA tool licenses and servers, whether local or cloud-based, is real-time.

Capitalization on past experiences is made possible through consultation and deep reporting of past projects. PDM’s ML-based prediction and tracking answer the fundamental questions of how many design resources are needed to start a design project, and how to track design task execution in real time and report any problem. In addition to easy tracking, PDM provides scheduling capabilities to automatically manage design tasks and jobs based on resource availability.

Compared to traditional and ad-hoc internal solutions, INNOVA claims up to 30% cost reduction with PDM in place within a corporation.

A webinar is planned by INNOVA where INNOVA experts will be presenting typical cases of how to reduce the cost of EDA licenses and computing servers and also how to plan the most optimal and cost-effective package of tool licenses for a design project. You can register for this webinar here: Reduce design cost by better managing EDA tool licenses and servers

For more information about INNOVA Advanced Technologies you can visit their website here: https://www.innova-advancedtech.com/

Also Read:

IDEAS Online Technical Conference Features Intel, Qualcomm, Nvidia, IBM, Samsung, and More Discussing Chip Design Experiences

TSMC OIP – Enabling System Innovation

2023: Welcome to the Danger Zone


Podcast EP127: MITRE Engenuity – Reshaping the Future of Semiconductors and Innovation
by Daniel Nenni on 11-30-2022 at 10:00 am

Dan is joined by Dr. Raj Jammy of MITRE Engenuity. As Chief Technologist, Raj is responsible for incubating and accelerating technologies in partnership with the private sector, and for developing strategic frameworks that promote technologies for the public good. A seasoned semiconductor/electronics industry executive, Dr. Jammy brings 25 years of experience to his role at MITRE Engenuity.

Raj explains the role and vision of MITRE Engenuity. As a hub for transformative innovation, this organization partners with research and technology organizations in the US and their partners around the world. A part of MITRE Corporation, MITRE Engenuity focuses on bringing the entire ecosystem together to build world-changing innovation in America. Dan explores the strategies and goals of the organization with Raj, including an assessment of how the CHIPS Act fits into the overall strategy.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


WEBINAR: FPGAs for Real-Time Machine Learning Inference
by Don Dingee on 11-30-2022 at 6:00 am

A server plus an FPGA-based accelerator for real-time machine learning inference reduces costs and energy consumption by up to 90 percent.

With AI applications proliferating, many designers are looking for ways to reduce server footprints in data centers – and turning to FPGA-based accelerator cards for the job. In a 20-minute session, Salvador Alvarez, Sr. Manager of Product Planning at Achronix, provides insight on the potential of FPGAs for real-time machine learning inference, illustrating how an automatic speech recognition (ASR) application might work with acceleration.

High-level requirements for ASR

Speech recognition is a computationally intensive task and an excellent fit for machine learning (ML). Language differences aside, speakers have different inflections and accents and vary in their use of vocabulary and grammar. Still, sophisticated ML models can produce accurate speech-to-text results using cloud-based resources. Popular models include connectionist temporal classification, listen-attend-spell, and recurrent neural network transducer.

A deterministic, low-latency response is essential. Transit time from an edge device to the cloud and back is low enough on fast 5G or fiber networks to make speech processing the dominant term in response time. Interactive systems add natural language processing and text-to-speech features. Users expect a normal conversation flow and will accept short delays.

Accuracy is also a must, with a low word error rate. Correct speech interpretation depends on what words are present in the conversational vocabulary. Research continues into ASR improvements, and flexibility to adopt new algorithms with a better response in speed or accuracy is a must-have for an ASR system.

While cloud-based resources offer the potential for more processing power than most edge devices, they are not infinitely scalable without tradeoffs. Capital expenditure (CapEx) costs and energy consumption can be substantial in scaled-up, high-throughput configurations that simultaneously take speech input from many users.

FPGA-based acceleration meets the challenge

Multiply-accumulate workloads with high parallelization, typical of most ML algorithms, don’t fit CPUs well, requiring some acceleration to hit performance, cost, and power consumption goals. Three primary ML acceleration vehicles exist: GPUs, ASICs, and FPGAs. GPUs offer flexibility but tend to drive power consumption through the roof with efficiency challenges. ASICs offer tuned performance for specific workloads but can limit flexibility as new models come into play.

FPGA-based acceleration checks all the boxes. By consolidating acceleration in one server with high-performance FPGA accelerator cards, server counts drop drastically while determinism and latency improve. Flexibility for algorithm changes is excellent, requiring only a new FPGA bitstream for new model implementations. Eliminating servers reduces up-front CapEx, helps with space and power consumption, and simplifies maintenance and OpEx.


High-performance FPGAs like the Achronix Speedster7t family have four features suited for real-time ML inference: logic blocks provide multiply-accumulate resources; high-bandwidth memory keeps data and weighting coefficients flowing; high-speed interfaces provide the connection to the host server platform; and FPGA logic supports various computational precision needs, preserving ML inference accuracy while lowering ML training requirements.

Overlays help non-FPGA designers

Some ML developers may be less familiar with FPGA design tactics. “An overlay can optimally configure the hardware on an FPGA to create a highly-efficient engine, yet leave it software programmable,” says Alvarez. He expands on how accelerator IP from Myrtle.ai can be configured into the FPGA, abstracting the user interface, upping the clock rate, and utilizing hardware better.


Alvarez wraps up this webinar on FPGAs for real-time machine learning with a case study describing how an accelerated ASR appliance might work. With the proper ML training, simultaneously transcribing thousands of voice streams with dynamic language allocation becomes possible. According to Achronix:

  • One server with a 250W PCIe Speedster7t-based accelerator card can replace 20 servers without acceleration
  • Each accelerated server delivers as many as 4000 streaming speech channels
  • Costs and energy consumption both drop by up to 90% by using an accelerated server
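The bullets above are consistent on a back-of-the-envelope basis. The per-server power figure below is my assumption for illustration (only the 250W card figure comes from the webinar):

```python
baseline_servers = 20      # servers replaced by one accelerated server (webinar claim)
server_w = 500.0           # assumed power draw per server, W (not a webinar figure)
card_w = 250.0             # PCIe Speedster7t accelerator card, W (webinar figure)

baseline_power = baseline_servers * server_w         # 20 plain servers
accelerated_power = server_w + card_w                # 1 server + accelerator card
saving = 1.0 - accelerated_power / baseline_power    # fraction of energy saved
# comes out a little above 0.9 under these assumptions, in line with "up to 90%"
```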

Although the example in this webinar is specific to ASR, the principles apply to other machine learning applications where FPGA hardware and IP accelerate inference models. When time-to-market and flexibility matter and high performance is required, FPGAs for real-time machine learning inference are a great fit. Follow the link below to see the entire webinar, including the enlightening case study discussion.

Achronix Webinar: Unlocking the Full Potential of FPGAs for Real-Time Machine Learning Inference

Also Read:

WEBINAR The Rise of the SmartNIC

A clear VectorPath when AI inference models are uncertain

Time is of the Essence for High-Frequency Traders


IDEAS Online Technical Conference Features Intel, Qualcomm, Nvidia, IBM, Samsung, and More Discussing Chip Design Experiences
by Daniel Nenni on 11-29-2022 at 10:00 am


Ansys is hosting IDEAS Digital Forum 2022, a no-cost virtual event that brings together industry executives and technical design experts to discuss the latest in EDA for Semiconductors, Electronics, and Photonics.

See the full online conference agenda and list of speakers at www.ansys.com/IDEAS. The free registration will allow you to attend the event on December 6th or on-demand any time after that.

IDEAS will start with keynote addresses from Raja Koduri of Intel and Pankaj Kukkal of Qualcomm, and insights into the metaverse from DP Prakash of start-up Youtopian.

Keynote Speakers and Panelists at IDEAS on December 6th, 2022

You can also attend the IDEAS panel discussion in the afternoon on the topic of “Thermal Management: How to Keep Your Cool When Chips Get Hot.” The moderated panel discussion will include Jean-Philippe Fricker from Cerebras, Roopashree HM from Texas Instruments, and Bill Mullen, senior director of R&D at Ansys.

Following the keynotes, there are eight technical tracks covering Thermal Integrity, Power Integrity, Timing Closure, Electromagnetics, Machine Learning, Hardware Security, and Photonics. Over 20 companies are participating in IDEAS to present case studies of their production designs, including:

Intel, Qualcomm, Nvidia, Samsung, MediaTek, IBM, GUC, HP Enterprise, and NXP

Select authors will be available for Q&A chat with the event attendees after their presentations – don’t miss this opportunity to interact with industry experts.

To see the full agenda, Register now for IDEAS and add this premier event to your calendar.

For more information, contact Marc Swinnen

Ansys is at the forefront of electronic design enablement in partnership with the world’s leading companies for 2.5D/3D-IC, AI and machine learning, high-performance computing, 5G, telecommunications, aerospace and autonomous vehicles.

Join us for the IDEAS Digital Forum — a place to catch up on industry best practices and the latest advances in semiconductor, electronic and photonic design. IDEAS will explore future trends with keynotes from industry leaders and offer technical insights by expert chip designers from many of the world’s largest electronic and semiconductor companies. IDEAS will give you a close-up view of some of the world’s leading companies’ most advanced electronic design projects.

Meet your industry peers and fellow designers from around the world at this premier virtual event for networking, sharing and learning the latest in multiphysics technology for electronic, photonic, and semiconductor design.

This free event will be hosted as a virtual, on-line event.

About Ansys

When visionary companies need to know how their world-changing ideas will perform, they close the gap between design and reality with Ansys simulation. For more than 50 years, Ansys software has enabled innovators across industries to push boundaries by using the predictive power of simulation. From sustainable transportation to advanced semiconductors, from satellite systems to life-saving medical devices, the next great leaps in human advancement will be powered by Ansys.

Take a leap of certainty … with Ansys.

Also Read:

Whatever Happened to the Big 5G Airport Controversy? Plus A Look To The Future

Ansys’ Emergence as a Tier 1 EDA Player— and What That Means for 3D-IC

What Quantum Means for Electronic Design Automation


Integration Methodology of High-End SerDes IP into FPGAs

by Kalar Rajendiran on 11-29-2022 at 6:00 am

Over the last couple of decades, the electronics communications industry has been a significant driver behind the growth of the FPGA market, and it continues to be. A major reason is the many different high-speed interfaces built into FPGAs to support a variety of communications standards and protocols. The underlying input-output PHY technology implementing these standards is serializer-deserializer (SerDes) technology. FPGA technology is complex and challenging to begin with, even before high-speed interfaces are taken into account, and SerDes PHY designs are complex and challenging in their own right. When the two are brought together, implementation gets trickier, which is generally why there is a lag in incorporating the most advanced SerDes designs into FPGAs. But what if the status quo could be changed? That was the objective behind a collaborative effort between Alphawave IP and Achronix, the results of which were presented at the TSMC OIP Forum in October.

Challenges in Integrating High-End SerDes into FPGAs

Interdependencies between the SerDes and the FPGA fabric may lead to floorplanning challenges for the integrated chip. In addition to the layout challenges, even minor differences in metal stack choices between the fabric and the SerDes may adversely impact the power, performance and area (PPA) of either of these components.

FPGAs have to support a large number of line rates and protocols and protocol variants with diverse electrical channel requirements. The line rates range from 1Gbps to 112Gbps using NRZ or PAM4 signaling schemes to deliver the speed performance. This combinatorial requirement places a heavy burden on the modeling used for simulations. Each line rate/protocol combination needs to be validated pre-silicon and post-silicon based on highly accurate models.

Requirements for Successful Integration

Whether on the SerDes side or the FPGA fabric side, architectural enhancements will impact how the SerDes integrates with the fabric. To avoid surprises at integration time, architectures need to be discussed and agreed upon early so that proper simulation models can be developed for validation. An overly optimistic model would force a radical change in the architecture, while an overly pessimistic model would deliver a solution with uncompetitive PPA. Neither situation is desirable.

A close collaboration between the SerDes IP vendor and the FPGA integrator is required early on for developing accurate models. The close partnering is also needed for ensuring optimal floorplanning, power planning, bump map planning, timing, etc.

Scope of Alphawave IP and Achronix Collaboration

Achronix’s high-end FPGAs support multi-standard protocols from 1GbE through 400GbE, PCIe Gen5, etc., including custom protocols for non-standard speeds such as 82Gbps. The 112Gbps SerDes uses a different architecture than the 56Gbps SerDes and uses the PAM4 signaling scheme. The design uses a digital ADC and is built around a DSP-based architecture.

The goal of the collaborative effort was to achieve successful integration of Alphawave IP’s AlphaCORE100 multi-standard SerDes with Achronix’s Speedster7t FPGA fabric.

Test Chip

A test chip was built to validate the early simulation models. It was implemented in TSMC’s N7 process and included four data channels, a full AFE, digital PLLs and DLLs, BIST, and additional test circuitry for characterization.

Successful Results

As presented in the plots below, the simulation results based on the early models developed through the collaborative effort correlated very well with test chip measurements in the lab. The high-accuracy models enabled Achronix to produce first-time-right Speedster7t FPGAs with Alphawave IP’s AlphaCORE100 SerDes IP supporting PCIe Gen5 x16 and Gen5 x8 as well as 400GbE.

The results of full simulation also correlated well with BER measurements from the lab for a wide range of channel loss conditions.

For more details, please connect with Achronix and Alphawave IP.

Also Read:

WEBINAR The Rise of the SmartNIC

A clear VectorPath when AI inference models are uncertain

Time is of the Essence for High-Frequency Traders


The Role of Clock Gating

The Role of Clock Gating
by Steve Hoover on 11-28-2022 at 10:00 am

The Role of Clock Gating

Perhaps you’ve heard the term “clock gating” and you’re wondering how it works, or maybe you know what clock gating is and you’re wondering how to best implement it. Either way, this post is for you.

Why Power Matters

I can’t help but laugh when I watch a movie where the main characters are shrunk down to the size of grains of sand and have to fight off ants the size of T-Rexes. It’s not that I’m amused by giant ants; it’s the unquestioned assumption that our bodies would act the same at such a small scale that gets to me. I know, it’s just a movie, but it still makes me cringe. Things just wouldn’t scale that way. If I’ve got my physics right, our mass would scale cubically, while our surface area would scale quadratically, as would our strength. As a result, we’d be super strong, but it wouldn’t matter, since we’d freeze to death almost instantly from heat loss. Plenty of other factors come into play, and I’m sure I’d get them wrong if I tried, but you get the idea.

But, what does “Honey I Shrunk the Kids” have to do with clock gating? Well, I began designing silicon in the 90s. At that time the only thing that mattered was performance. Since then, transistors have shrunk a bit–a lot actually–just like Rick Moranis. And their properties scale by different factors. One that’s getting out of control is power. According to “The Dark Silicon Problem and What it Means for CPU Designers”, heat generation per unit of silicon area is “somewhere between the inside of a nuclear reactor and the surface of a star,” and that was in 2013. Power is now a first-order concern. In fact, we find ourselves in a new situation where we have more transistors available to us than we can afford to use. In a very real sense, the best way to get more performance is now to save more power.

I was motivated to write this post today by a Linkedin notification I received this morning, letting me know that I had been quoted in a post by Brian Bailey of Semiconductor Engineering entitled “Taking Power More Seriously”. Bailey provides an excellent high-level overview of the myriad challenges of designing for power. One of those challenges is to implement fine-grained clock gating. As an EDA developer myself, of tools that, among other things, automate clock gating, I felt it timely to dive deeper into the topic.

What is clock gating?

Several factors contribute to a circuit’s power consumption. The logic gates have static or leakage power that is roughly constant as long as a voltage is applied to them, and they have dynamic or switching power resulting from toggling wires. Flip-flops are rather power-hungry, accounting for maybe ~20% of total power. Clocks can consume even more, perhaps ~40%! Global clocks go everywhere, and they toggle twice each cycle. As we’ll see, clock gating avoids toggling the clock when clock pulses are not needed. This reduces the power consumption of clock distribution and flip-flops, and it can even reduce dynamic power for logic gates.

Even in a busy circuit, when you look closer, most of the logic is not doing meaningful work most of the time. In this trace of a WARP-V CPU core, for example, the CPU is executing instructions nearly every cycle. But the logic computing branch targets isn’t busy. It is only needed for branch instructions. And floating-point logic is only needed for floating-point instructions, etc. Most signal values in the trace below are gray, indicating that they aren’t used.

CPU waveform showing clock gating opportunity

As previously noted, a significant portion of overall power is consumed by driving clock signals to flip-flops so the flip-flops can propagate their input values to their outputs for the next cycle of execution. If most of these flip-flop input signals are meaningless, there’s no need to propagate them, and we’re wasting a lot of power.

Clock gating cuts out clock pulses that aren’t needed. (Circuits may also be designed to depend on the absence of a clock pulse, but let’s not confuse matters with that case.) The circuit below shows two clock gating blocks (in blue) that cut out unneeded clock pulses and only pulse the clock when a meaningful computation is being performed.

Illustration of clock gating

In addition to reducing clock distribution and flip-flop power, clock gating also guarantees that flip-flop outputs are not wiggling when there are no clock pulses. This reduces downstream dynamic power consumption. In all, clock gating can save a considerable amount of power relative to an ungated circuit.
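To make this concrete, here is a minimal Verilog sketch of the latch-based "integrated clock gate" (ICG) pattern that synthesis tools and library ICG cells implement. This is an illustration of the standard technique, not the specific circuit in the figure above; production flows instantiate a characterized library ICG cell rather than hand-built gates, but the behavior is the same:

```verilog
// Latch-based integrated clock gate (ICG) sketch.
// The transparent-low latch samples the enable only while the clock
// is low, so the enable cannot change mid-pulse and glitch the
// gated clock.
module clock_gate (
    input  wire clk,     // free-running clock
    input  wire enable,  // high when downstream logic has meaningful work
    output wire gclk     // gated clock: pulses only when enable is high
);
    reg enable_latched;

    // Level-sensitive latch, transparent when clk is low.
    always @(clk or enable)
        if (!clk)
            enable_latched <= enable;

    // AND the held enable with the clock to drop unneeded pulses.
    assign gclk = clk & enable_latched;
endmodule
```

Gating the clock with a plain AND of the raw enable would risk glitches whenever the enable changed while the clock was high; the latch is what makes the gated clock safe.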

Implementing Clock Gating

A prerequisite for clock gating is knowing when signals are meaningful and when they are not. This is among the aspects of higher-level awareness inherent in a Transaction-Level Verilog model. The logic of a “transaction” is expressed under the condition that indicates its validity. Since a single condition can apply to all the logic along a path followed by transactions, the overhead of applying validity is minimal.

Validity is not just about clock gating. It helps to separate the wheat from the chaff, so to speak. The earlier CPU waveform, for example, is from a TL-Verilog model. Debugging gets easier because the majority of signal values have been automatically filtered out, having been identified as meaningless. And we know they are meaningless because automatic checking ensures that these values are not inadvertently consumed by meaningful computations.

With this awareness in our model, default fine-grained clock gating comes for free. The valid conditions are used by default to enable our clock pulses.
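As an illustrative sketch (the pipeline and signal names here are hypothetical, not from the WARP-V model), a TL-Verilog "when" condition (`?$valid`) scopes the logic under it to the cycles where the transaction is valid. That one condition is what lets the tools gray out meaningless signal values in waveforms and enable clock pulses only for meaningful computation:

```
\TLV
   |calc
      @1
         // ?$valid scopes the logic below to cycles where $valid asserts.
         ?$valid
            $product[31:0] = $a[31:0] * $b[31:0];
      @2
         ?$valid
            // Flip-flops staging $product from @1 to @2 can be clock
            // gated by default using the same $valid condition.
            $result[31:0] = $product + $offset[31:0];
```

Because one condition covers all the logic along the transaction's path, the designer states validity once rather than plumbing enable terms into every assignment and register.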

The full implications of having clock gating in place from the start may not be readily apparent. I’ve never been on a project that met its goals for clock gating. We always went to silicon with plenty of opportunity left on the table. This is because power savings is always the last thing to be implemented. Functionality has to come first. Without it, verification can’t make progress, and verification is always the long pole. Logic designers can’t afford to give clock gating any real focus until they have worked through their functional bug backlogs, which doesn’t happen until the end is in sight. At this point, many units have already been successfully implemented without full clock gating. The project is undoubtedly behind schedule, and adding clock gating would necessitate implementation rework including the need to address new timing and area pressure. Worse yet, it would bring with it a whole new flood of functional bugs. As a result, well, let’s just say we’re heating the planet faster than necessary. Getting clock gating into the model from the start, at no cost, completely flips the script.

Conclusion

Power is now a first-order design constraint, and clock gating is an important part of an overall power strategy. Register transfer level modeling does not lend itself to the successful use of clock gating. A transaction-level design can have clock gating in place from the start, having a shift-left effect on the project schedule and resulting in lower-power silicon (and indirectly higher performance and lower area as well). If you are planning to produce competitive silicon, it’s important to have a robust clock-gating methodology in place from the start.

Also Read:

Clock Aging Issues at Sub-10nm Nodes

Analyzing Clocks at 7nm and Smaller Nodes

Methodology to Minimize the Impact of Duty Cycle Distortion in Clock Distribution Networks