
A Fast Path to Better ARC PPA through Fusion Quickstart Implementation Kits and DSO.AI
by Bernard Murphy on 11-07-2023 at 6:00 am

QIK+DSO.AI flow

Synopsys recently presented a webinar on using their own software to optimize one of their own IPs (an ARC HS68 processor) for both performance and power through what looks like a straightforward flow from initial configuration through first level optimization to more comprehensive AI-driven PPA optimization. Also of note they emphasized the importance for one IP team to be able to efficiently support multiple different design objectives, say for high performance versus low power, for derivatives and for node migrations. The flow they describe is supported both by capabilities provided by the ARC IP team and by Synopsys Fusion Compiler with DSO.AI.

The reference design flow (RDF)

RDF provides a complete RTL to GDS flow, in this example targeting the TSMC 5nm (TSMC-5FF) technology library. The RDF configures the IP and generates the RTL files. It builds and instantiates required closely coupled memories through the Synopsys memory compiler and configures the implementation flow scripts which can be used in the Fusion Compiler flow to generate a floor plan. It also replaces clock gates and synchronizer cells with options drawn from the technology library.

John Moors (ASIC Physical Design Engineer, Synopsys) mentioned that RDF can optionally do DFT insertion and will generate RTL and gate-level simulation scripts, low-power checks, formal verification scripts, and SDA and power analysis flows, though the team didn't exercise these options for this test.

The Fusion quick start implementation kit (QIK)

Highly configurable IP like the ARC cores offers SoC architects and integrators significant flexibility in meeting their goals, but that flexibility often adds complexity in meeting PPA targets. Quick start implementation kits aim to simplify the task, building on R&D experience, IP team experience and collaboration with customers. QIKs are currently offered for the HS5x, HS6x, VPX and NPX cores.

Kits integrate know-how for best achieving PPA and floorplan goals, including preferred choice for implementation (flat, hierarchical, etc.), library, technology and design customizations, and custom flow features to maximize the latest capabilities in an IP family. They also include support for all the expected collateral checks including RC extraction, timing analysis, IR drop analysis, RTL versus gate equivalence checks and so on.

Frank Gover (Physical design AE at Synopsys) mentioned that these kits continue to evolve with experience and with enhancements in IP, in tools and in technologies. QIKs also include information to leverage DSO.AI. DSO.AI has been covered abundantly elsewhere so I will recap here only that this uses AI-based methods together with reinforcement learning to intelligently explore many more tradeoffs between tunable options than could be found through manual search.

Comparing Baseline, QIK and QIK+DSO.AI

The team found that QIK alone improved frequency by 17% and QIK+DSO.AI improved frequency by 23%. QIK improved total power by 49% (!) and leakage power by just over 9%. These numbers got a little worse after running DSO.AI – total power improvement dropped to 44% and leakage power improvement dropped to -5% (worse than the baseline). However, Frank pointed out that their metric for optimization was based on a combination of worst negative slack, total negative slack, and leakage power, in that order, so performance optimization was preferred over leakage. Still, a 90MHz gain in performance for a power increase of only 0.6mW is pretty impressive.
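As a rough illustration of that ordering (not the actual DSO.AI cost function, which isn't disclosed in the webinar), candidates can be ranked lexicographically on worst negative slack, then total negative slack, then leakage; all names and values below are hypothetical.

```python
# Hypothetical sketch: rank candidate runs by WNS, then TNS, then leakage.
# Illustrative only; this is not the DSO.AI metric itself.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    wns_ns: float      # worst negative slack (less negative is better)
    tns_ns: float      # total negative slack (less negative is better)
    leakage_mw: float  # leakage power (lower is better)

def rank_key(c: Candidate):
    # Better slack dominates; leakage only breaks ties, as described above.
    return (-c.wns_ns, -c.tns_ns, c.leakage_mw)

candidates = [
    Candidate("baseline", wns_ns=-0.12, tns_ns=-35.0, leakage_mw=10.0),
    Candidate("qik",      wns_ns=-0.05, tns_ns=-12.0, leakage_mw=9.1),
    Candidate("qik_dso",  wns_ns=-0.01, tns_ns=-2.0,  leakage_mw=10.5),
]

print("preferred candidate:", min(candidates, key=rank_key).name)
```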

One interesting observation here: Slack and leakage measurements are based on static analyses. Dynamic power estimation requires simulation, very unfriendly to inner loop calculations in machine learning, even for the specialized approaches used by DSO.AI. From this I suspect that all the dynamic power improvement in this flow comes from the QIK kit, and that incorporating any form of dynamic power estimation into ML-based methods is still a topic for research.

When looking at efficiency in applying this flow to multiple SoC objectives, Frank talks about both a cold start approach with no prior learning (only user guidance) and a warm start approach (starting from a learned model developed on prior designs/nodes). First time out of the gate you will have a cold start and might need to assign say 50 workers to learning, from which you would choose the best 10 results and pass those on to the next epoch. Learning would then progressively explore the design space, ultimately converging on better results.

Once that reference training model is developed, you can reuse it for warm starts on similar design objectives – a node migration or a derivative design for example. Warm starts may require only 30 workers per epoch and quite likely fewer epochs to converge on satisfactory results. In other words you pay a bit more for training at the outset, but you quickly get to the point that the model is reusable (and continues to be refined) on other design objectives.
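For intuition only, here is a minimal sketch of the epoch-style search loop described above, where workers explore tool settings and the best results seed the next epoch. It is not DSO.AI code; the evaluation function, knob names, worker counts and selection size are placeholders taken loosely from the numbers quoted in the webinar.

```python
import random

def evaluate(settings):
    """Placeholder for a full implementation run returning a PPA score (lower is better)."""
    return sum((v - 0.5) ** 2 for v in settings.values()) + random.gauss(0, 0.01)

def run_epochs(n_epochs=5, workers=50, keep=10, warm_seeds=None):
    # Cold start: random settings; warm start: perturb seeds learned on a prior design/node.
    seeds = warm_seeds or [{f"knob{i}": random.random() for i in range(4)} for _ in range(keep)]
    for epoch in range(n_epochs):
        trials = []
        for _ in range(workers):
            base = random.choice(seeds)
            trial = {k: min(1.0, max(0.0, v + random.gauss(0, 0.1))) for k, v in base.items()}
            trials.append((evaluate(trial), trial))
        trials.sort(key=lambda t: t[0])          # rank the workers' results
        seeds = [t[1] for t in trials[:keep]]    # best results seed the next epoch
        print(f"epoch {epoch}: best score {trials[0][0]:.4f}")
    return seeds

best = run_epochs()                               # cold start: 50 workers, keep the best 10
refined = run_epochs(workers=30, warm_seeds=best) # warm start: fewer workers per epoch
```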

A nice summary of applying the flow to a real IP, with clearly valuable benefits to PPA. You can review the webinar HERE.


Unlocking the Power of Data: Enabling a Safer Future for Automotive Systems
by Kalar Rajendiran on 11-06-2023 at 10:00 am

SDVs New Monetization Opportunities

The automotive industry is undergoing a major transformation; it is not just about connectivity and convenience anymore. Data is emerging as the driving force behind innovation and safety, with vehicles becoming sophisticated data-driven machines. By unlocking the power of data, we can create safer vehicles and roads and usher in the era of semi- and fully autonomous vehicles.

The Data Revolution in the Automotive Industry

The automotive industry has always been at the forefront of technological advancement through the integration of electronics and computer systems. Over recent years, the proliferation of sensors, connectivity, and advanced computing has ushered in a new era of data-driven transformation. Advanced Driver Assistance Systems (ADAS) technologies, such as adaptive cruise control, lane-keeping assist, and automatic emergency braking, rely heavily on data from various sensors, cameras, and radar systems. These systems help drivers by providing real-time data about their surroundings and assist in avoiding accidents. Autonomous vehicles, of course, have pushed the industry to the pinnacle of data-driven automotive technology. These vehicles rely on a constant stream of data from sensors, cameras, lidar, and radar to make real-time decisions and navigate safely. Features such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication are crucial to improving safety and rely on modern vehicles being connected to the internet. Cloud-based platforms enable communications between vehicles and infrastructure for sharing real-time data.

How Data Enables Automotive Safety

Avoiding unexpected breakdowns

Maintenance alerts have come a long way in the automotive industry over the last couple of decades, from primitive warning lights on the dashboard to flexible maintenance alerts. With the data revolution discussed in the earlier section, automotive safety can be taken to a whole new level. Telematics systems can collect information about a vehicle's performance and usage patterns. The data collected can be analyzed not only to predict maintenance needs but also to prescribe measures that extend the useful life of vehicles. This reduces the chances of mechanical failures causing accidents.

Accident prevention

Data can be used to monitor driver behavior, such as distraction and drowsiness detection. Advanced systems can alert drivers when they exhibit dangerous behaviors, reducing the risk of accidents. Data-driven safety systems, like ADAS, can detect potential collisions and provide warnings or take corrective actions, such as applying brakes or steering, to prevent accidents. Autonomous vehicles use vast amounts of data to make split-second decisions. They can analyze the behavior of nearby vehicles, pedestrians, and road conditions to make safer driving choices. Road conditions data collected can help reduce the risk of accidents by enabling efficient traffic flow and routing management.

This topic is the context for a talk given by Nir Sever at the recent TSMC Open Innovation Platform (OIP) Forum that was held in Santa Clara, California. Nir is a senior director of business development at proteanTecs. His presentation focused on how software-defined vehicles are driving innovation, as well as enabling new levels of safety to future automotive systems.

Software-Defined Vehicles (SDV) Drive Innovation

At its core, a SDV relies on software for controlling various vehicle functions. SDVs are driving innovation in the automotive industry by redefining the role of software in vehicles. With over-the-air updates, they enable continuous enhancements and adaptability without physical modifications. SDVs also lead autonomous driving innovation through advanced software algorithms, while their connectivity enables innovations in traffic management and safety. Vast data generated by SDVs fuel advancements in analytics and machine learning, enhancing vehicle performance and safety features.

Decoupling software functions from hardware

Decoupling software functions from hardware is a cornerstone of SDVs. This separation allows SDVs to embrace hardware-agnostic software, ensuring compatibility with diverse platforms. It promotes modularity, facilitating the reuse of software modules across different vehicle models, ultimately reducing development time and costs. SDVs exhibit scalability, effortlessly adapting to evolving hardware configurations and accommodating various feature levels. Customers can personalize their SDVs without hardware modifications, and AI and machine learning enhancements become simpler.

Modular development practices

SDVs lend themselves to modular development practices that simplify the integration of hardware and software components. They become highly adaptable, accommodating various configurations, and feature sets while facilitating over-the-air updates for ongoing enhancements. The modular approach encourages third-party integration, fostering a healthy ecosystem of software offerings for customization purposes. This practice helps future proof SDVs by allowing easy integration of new modules as technology advances.

Enabling new business models

SDVs are reshaping the automotive industry by introducing innovative business models. Mobility-as-a-Service (MaaS) is flourishing through smart-hailing, ridesharing, and subscription-based models that offer on-demand access to SDVs, reducing the need for vehicle ownership. Data generated by SDVs are monetized through analytics services and personalized advertising. Manufacturers generate post-sale revenue with software upgrades and app stores. In essence, SDVs unlock a new era of business opportunities, adapting to evolving consumer demands and technological advancements.

Health and Performance Monitoring of SDV Electronics

The primary objectives and growing challenges of Electronic Control Units (ECUs) and Systems-on-Chip (SoCs) in SDVs are reliability and functional safety, extension of mission profiles, power usage reduction, and security/authentication. These objectives can be achieved only through continuous health and performance monitoring.

By combining chip telemetry with advanced ML-driven analytics, proteanTecs provides embedded and cloud SW solutions to predictively monitor vehicle electronics during their lifetime operation, under functional workloads.

At the TSMC OIP forum, proteanTecs showed a demo of their health and performance monitoring solution for SDVs. The following is a screenshot from that demo, showing operational diagnostics in mission mode against pre-defined thresholds, thereby providing real-time safety signals and system optimization.

In the example above, the die-to-die (D2D) interconnect lanes are monitored in an advanced chiplet-based package. When the health score of any of the many data lanes on the D2D interconnect that are being monitored falls below a threshold, the software is programmed to switch out the failing data lane to a pre-identified spare lane. This prevents faulty interconnect lanes from being the cause behind faults elsewhere in the system. Customers can easily implement such a solution utilizing proteanTecs on-chip monitors and deep data analytics software.
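As a conceptual illustration of that repair policy only (not proteanTecs' actual software or APIs), a threshold-based lane switchover could look like the hypothetical sketch below; the health scores, threshold value and lane map are all made up.

```python
# Hypothetical sketch of threshold-based D2D lane repair; not proteanTecs code.
HEALTH_THRESHOLD = 0.70  # assumed threshold for illustration

def repair_lanes(health_scores, spare_lanes, lane_map):
    """Switch any monitored lane whose health score falls below threshold to a spare.

    health_scores: {physical_lane: score in [0, 1]} from on-chip monitors (assumed input)
    spare_lanes:   list of pre-identified spare lane ids
    lane_map:      {logical_lane: physical_lane} currently in use
    """
    for logical, physical in list(lane_map.items()):
        score = health_scores.get(physical, 1.0)
        if score < HEALTH_THRESHOLD and spare_lanes:
            spare = spare_lanes.pop(0)
            lane_map[logical] = spare
            print(f"lane {physical} (health {score:.2f}) switched to spare {spare}")
    return lane_map

# Example usage with invented values
lane_map = {0: "L0", 1: "L1", 2: "L2"}
health = {"L0": 0.95, "L1": 0.62, "L2": 0.88}
repair_lanes(health, spare_lanes=["S0", "S1"], lane_map=lane_map)
```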

proteanTecs offers an Automotive SW stack and an extensive IP portfolio of on-chip agents that are part of the TSMC, Samsung SAFE and Intel IFS IP Alliances.

For more details, visit www.proteanTecs.com, or download proteanTecs' white paper on predictive maintenance in the context of automotive functional safety.

Summary

The automotive industry is in the midst of a data-driven revolution that has the potential to significantly enhance safety on our roads. From advanced driver assistance systems to autonomous vehicles, data is the cornerstone of future innovations in automotive safety. Health and performance monitoring solutions implemented in SDVs unlock the power of data to enhance the safety of future automotive systems.

Also Read:

proteanTecs On-Chip Monitoring and Deep Data Analytics System

Predictive Maintenance in the Context of Automotive Functional Safety

Semico Research Quantifies the Business Impact of Deep Data Analytics, Concludes It Accelerates SoC TTM by Six Months


Make Your RISC-V Product a Fruitful Endeavor
by Daniel Nenni on 11-06-2023 at 6:00 am

RISC V Chip

Consider RISC-V ISA as a new ‘unforbidden fruit’. Unlike other fruits (ISAs) that grow in proprietary orchards, RISC-V is available to all, i.e. open-source. Much like a delicious fruit can be transformed into a wide array of delectable desserts, so can RISC-V be utilized to create a plethora of effective applications across industries—consumer, industrial, medical, and mil-aero, to name a few.

The farm-to-table journey bears a great resemblance to the design-to-application journey, with the hardware acting as the pots and pans that define and determine the scope of the end-application. There is no dearth of industry titans manufacturing these implements. Storage giant Western Digital is not only moving production hardware to RISC-V, it has also released its own RISC-V-based SweRV Core™ to the open-source community; graphics leader NVIDIA has decided to base its next-generation Falcon logic controller on RISC-V. Semico Research estimates that by 2025 there will be over 62 billion RISC-V-based devices shipped globally.

In this analogy, if your hardware is the pots and pans used to make said dessert, your software plays the role of the recipe and the cooking medium. A recipe describes the full set of processes to be applied to your ingredient(s). Somewhat similarly, the compiler, libraries, linker and debugger tool set used to convert application source code to binary code and get it running on the target hardware is collectively referred to as a toolchain. Toolchains integrated into an application development suite to support a specific product are known as a Software Development Kit (SDK), which is a key part of the ecosystem needed to support commercial RISC-V product deployment.

Now, you can download any old recipe from the internet and with some skill, come out with something delicious. Software developers targeting RISC-V devices are certainly free to download GCC or LLVM source code directly from the projects’ public open-source repositories. However, this is not usually an efficient first step in porting application software to a RISC-V processor. Building and validating your own open-source-based toolchain is a complex and lengthy process fraught with many potential pitfalls, especially for those new to building their own toolchains.

Fortunately, many RISC-V chip and IP vendors provide open-source-based toolchain reference distributions to help developers evaluate their IP. While these reference toolchains are useful for evaluating RISC-V IP, they are not productized solutions that enable developers and end users to deploy software applications to commercial RISC-V-based products. There are three key requirements to consider when procuring customer-ready open-source-based toolchains for a RISC-V-based platform: commercialization, customization, and support.

Commercialization

Toolchain commercialization is the process of transforming publicly available open-source code into a tested and supportable distribution of executable toolchain components and libraries configured to support the target devices. The two most important toolchain commercialization considerations are the selection of the compiler version and of the target device variants to be supported. The commercialization process leverages the open-source compiler project test suites and test frameworks, such as DejaGNU, to provide toolchain test coverage. Such coverage ensures high-quality toolchains built from a rapidly changing code base.

Customization

Toolchain customization provides feature extensions and performance optimizations beyond what is currently available from the open-source community. For RISC-V, toolchain customization can take the form of compiler extensions to support hardware vendor-specific RISC-V ISA or customizations enabled via the RISC-V open ISA. Target-specific application performance optimization is another area of toolchain customization that can help maximize target hardware application performance. In addition, toolchain customization can also include compiler bug fixes and new feature completion not yet publicly available from the open-source community.

Support

Toolchains regularly have CVEs reported against them, including issues in the compiler runtime libraries. Undiscovered CVEs in compiler runtime libraries get linked into application software and can thus be inadvertently deployed to customers via a product release. This, in turn, creates the risk of having to provide a hot-fix software update to address newly discovered CVEs at some point in the future. In such cases having a toolchain support contract in place that provides ongoing CVE remediation will ensure your toolchain is ready to build the software update needed to resolve such a crisis quickly and effectively without having to first sort out potentially years of toolchain patches and updates. Toolchain long-term support is intended to keep older version toolchains updated with critical CVE patches and bug fixes while minimizing the total number of overall toolchain changes to reduce the risk of introducing new issues.

As RISC-V enters the mainstream of embedded commercial products, a customer-ready software development ecosystem will be essential for marketplace success. Toolchain commercialization, customization and support are essential ingredients for deploying commercial grade open-source based toolchains in SDKs supporting RISC-V based product development and post release software updates. Siemens has a 20+ year proven track record in delivering commercial-grade toolchains and toolchain support services for a wide variety of processor architectures.

Managing a RISC-V operation yourself is hard. A commercialized product can be the difference between a home-cook and a professional chef. The latter has significantly more expertise and can make life far easier. A customized, commercialized product with ongoing support can deliver a smooth execution that will make you want to celebrate with dessert.

Article provided by the Embedded Software Division, Siemens Digital Industries Software.

Also Read:

Ensuring 3D IC Semiconductor Reliability: Challenges and Solutions for Successful Integration

The Path to Chiplet Architecture

Placement and Clocks for HPC



Podcast EP191: The Impact of Evolving AI System Architectures and Samtec’s Role with Matt Burns
by Daniel Nenni on 11-03-2023 at 10:00 am

Dan is joined by Matthew Burns, who develops go-to-market strategies for Samtec's Silicon to Silicon solutions. Over the course of 20+ years, he has been a leader in design, technical sales and marketing in the telecommunications, medical and electronic components industries.

Matt and Dan discuss revelations from the recent AI Hardware & Edge Summit and the OCP Global Summit. The deployment of new AI system architectures is enabling many new capabilities, including those based on large language models (LLMs).

These architectures demand tremendous data communication performance, an area where Samtec can make a huge difference. In this broad and informative discussion Matt describes how Samtec’s products can help to deploy new AI system architectures, with a look at what the future holds.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Executive Interview: Tony Casassa, General Manager of METTLER TOLEDO THORNTON
by Daniel Nenni on 11-03-2023 at 6:00 am

TC MT Thornton

Tony Casassa has served as the General Manager of METTLER TOLEDO THORNTON since 2015, after having joined the company in 2007 to lead the US Process Analytics business. Prior to spending the last 16 years with METTLER TOLEDO, Tony held various business leadership positions for two decades with Rohm and Haas Chemical. A common thread throughout his 35+ year career has been bringing new innovations to market that deliver enhanced process control and automation for the Semiconductor and Life Science industries.

Tell us about Mettler Toledo Thornton?

METTLER TOLEDO THORNTON takes important or critical Pure water process measurements that are currently OFFLINE, and brings them online in a real-time, continuous measurement approach.  Measuring in-situ on a real-time and continuous basis eliminates sampling errors/contamination risks and provides increased process transparency.

Our focus is on segments that use pure water for critical process stages and where water quality has high value and process impact.  Our customers benefit from the deep segment competency & process expertise that our experts have developed over decades in the industry.

What application areas are your strongest?

Mettler Toledo Thornton is the global leader in the measurement and control of the production and monitoring of Pure and Ultrapure water (UPW) for the Semiconductor (Microelectronics) industry.  We provide a full portfolio of the process analytics that are required for the real time monitoring of these critical waters in the UPW loop, distribution system or reclaim.

What problems are you solving in semiconductor?

The semiconductor industry is constantly developing new ICs that are faster, more efficient and have increased processing capability. To accomplish these advances, the architecture and line widths on the ICs have become narrower and narrower, now approaching 2 to 3 nanometers. To produce these state-of-the-art ICs, the SEMI Fab requires vast quantities of Ultrapure water that is virtually free of any ionic or organic impurities. Mettler Toledo Thornton R&D has developed, and we have introduced to the market, the UPW UniCond® resistivity sensor that exceeds the measurement accuracy of previous and current resistivity sensors on the market. This sensor exceeds the stringent recommended measurement limits published by the SEMI Liquid Chemicals committee. Measurement and control of ionic impurities in Ultrapure water is critical, but so is the measurement and control of organics. Our online Low PPB 6000TOCi provides the semiconductor industry a TOC instrument that not only accurately measures in the sub-ppb TOC range but also uses 80% less water than other instruments. Dissolved oxygen and silica are also critical measurements for the SEMI Fab, and our analytics provide industry-setting accuracy with longer service life and reduced maintenance.

Another key priority for the industry is sustainability and the need to reclaim water & reduce overall water consumption.   To minimize adverse impact on wafer yield,  it is imperative to fully understand the water quality & risk of contaminants to enable informed decision making on where & how the various reclaimed water streams can be recycled or reused.  Mettler Toledo Thornton’s continuous, real-time measurements enable fast, confident decisions in the water reclaim area.

This full range of process analytics helps the SEMI industry monitor and control the most critical solvent for wafer quality, increased yield and reduced defects while achieving sustainability goals.

How are your customers using your products?

For the semiconductor facility the production of Ultrapure water is the first step in the SEMI water cycle, but the UPW is also the universal solvent used in the Tools for cleaning the wafers after photolithography, epitaxy, the RCA process and other processing steps. Our instruments are critical in monitoring the Tools process and ensuring the cleaning of the wafers. The next step in the water cycle for the UPW is reclaim/reuse/recycle, where the SEMI Fab utilizes Mettler Toledo Thornton process analytics to decide if the UPW can be brought back to the UPW purification process to reduce water discharge and improve industry sustainability.

What does the competitive landscape look like and how do you differentiate?

The current super cycle in the semiconductor industry has resulted in numerous companies pursuing opportunities for their products in this space. Some are historical suppliers while others are new competitors in the market. Mettler Toledo Thornton has been involved in the semiconductor industry for over 40 years and active in working with the industry to establish the accuracy and technical specifications for UPW. The research work at Mettler Toledo Thornton conducted by Dr. Thornton, Dr. Light and Dr. Morash established the actual resistivity value of Ultrapure water, which is the current standard for the industry. We have been a partner with all the leading global semiconductor companies to develop the most accurate online analytical instruments. Our focus on continuous, real-time measurements provides our customers with the highest level of process transparency relative to other batch analyzers or offline measurements.

A key factor in the establishment of Mettler Toledo Thornton as a global leader in the semiconductor industry has been our active participation and membership of the SEMI committees that are establishing the recommended limits for the SEMI industry.  Our global presence and service organization provides the semiconductor facility with the ability to standardize on an analytical parameter and employ it across all their facilities and locations.

What new features/technology are you working on?

Our focus is responding to the needs of the market as they strive to improve yield, minimize water consumption, increase throughput and reduce costs. We have recently brought a rapid, online microbial detection product to market that replaces the need for slow, costly and error-prone offline test methods. Our new Low PPB TOC continuous online sensor reduces water consumption by 80% versus other companies' batch instruments. Our latest innovation is the UPW UniCond® sensor that delivers the highest accuracy on the market today. We continue to develop new technologies that enable the industry to achieve sustainability goals without sacrificing wafer yield.

How do customers normally engage with your company?

Mettler Toledo Thornton has a global presence with our own sales, marketing, and service organizations in all the countries with semiconductor facilities and Tool manufacturers.  The local sales and service teams have been trained on the complete water cycle for semiconductor which gives them the expertise to engage with the semiconductor engineers to provide the technical solutions for UPW monitoring and control.  We conduct numerous global webinars and in-person seminars to provide the semiconductor industry the opportunity to learn about the most recent advances in analytical measurements.  These seminars also provide the local semiconductor facilities the opportunity to be updated on the most recent UPW recommended limits because of our participation in the technical committees.  With our industry specialists we produce and publish numerous White Papers and Technical documents that give the industry the opportunity to gain further insight into the latest advancements.

Also Read:

A New Ultra-Stable Resistivity Monitor for Ultra-Pure Water

Water Sustainability in Semiconductor Manufacturing: Challenges and Solutions

CEO Interview: Stephen Rothrock of ATREG


Webinar: Fast and Accurate High-Sigma Analysis with Worst-Case Points
by Daniel Payne on 11-02-2023 at 10:00 am

Worst-case point

IC designers are tasked with meeting specifications like robustness in SRAM bit cells, where the probability of a violation must be lower than 1 part per billion (1 ppb). Another example of robustness is a flip-flop register that must have a probability of specification violation lower than 1 part per million (1 ppm). For normally distributed performance, Monte Carlo simulation at the SPICE level with a relatively small sample size can be used: achieving 1 ppm requires 4.75-sigma analysis, while reaching 1 ppb increases that to 6.0-sigma analysis. The problem is that for non-normally distributed performance the standard Monte Carlo approach requires a sample size that is simply too large to simulate, so a more efficient approach is required, and that's where high-sigma analysis and worst-case points come into use.
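As a quick sanity check on those sigma figures (my own back-of-the-envelope calculation, not material from the webinar), the one-sided normal quantiles for 1 ppm and 1 ppb can be computed with the Python standard library:

```python
from statistics import NormalDist

nd = NormalDist()
# One-sided tail probabilities for the target failure rates
for label, p_fail in [("1 ppm", 1e-6), ("1 ppb", 1e-9)]:
    sigma = nd.inv_cdf(1.0 - p_fail)   # quantile of the standard normal
    print(f"{label}: {sigma:.2f} sigma")
# Prints roughly 4.75 sigma for 1 ppm and 6.00 sigma for 1 ppb,
# matching the figures quoted above.
```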

Register for this MunEDA webinar scheduled for November 14th at 9AM PST, and be prepared to have your questions answered by the experts.

MunEDA is an EDA vendor with much experience in this area of high-sigma analysis methods, and they will be presenting a webinar on this topic in November. I'll describe some of the benefits of attending this webinar for engineers who need to design for robustness.

In the non-normal distribution case, proving that the failure rate is below a required limit of 1 ppm, or 4.75 sigma, requires 3 million simulations. Estimating the failure rate with 95% confidence as lying between 0.5 ppm and 1.5 ppm requires a much larger 15.4 million simulations. Trying to achieve 6.0 sigma with the same math then requires billions of simulations, something impractical to even consider.
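Those sample-size figures can be reproduced with standard binomial statistics (again my own check, not MunEDA's derivation): the "rule of three" gives the zero-failure sample size for a ~95% upper bound, and the normal approximation to the binomial gives the sample size for a ±0.5 ppm confidence interval.

```python
p = 1e-6                      # target failure rate (1 ppm)

# Rule of three: N ~ 3/p samples with zero observed failures demonstrate,
# at ~95% confidence, that the true failure rate is below p.
n_prove = 3.0 / p
print(f"prove p < 1 ppm: ~{n_prove:,.0f} simulations")              # ~3,000,000

# Normal approximation: 95% CI half-width of 0.5 ppm around p
z = 1.96
half_width = 0.5e-6
n_estimate = z**2 * p * (1 - p) / half_width**2
print(f"estimate p to +/-0.5 ppm: ~{n_estimate:,.0f} simulations")  # ~15,400,000
```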

The webinar goes into detail on how parameter variation and yield are handled by Monte Carlo techniques like brute-force random sampling versus using an optimizer to search the failure region for the highest density of failing points. The worst-case point is the location in the failure region with the highest density of failing points, and it is the failing point closest to the mean point of the passing values.

Worst-case Point

Just knowing where this worst-case point is located helps guide where SPICE simulations should be run and even helps during analog yield optimization. Failure rates can be estimated from worst-case distances. Different sampling methods at the worst-case point are introduced and compared. The First Order Reliability Model (FORM) is a straight line drawn through the worst-case point, and serves as a boundary between passing and failing regions.
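For intuition, under the FORM approximation the failure rate follows directly from the worst-case distance (in sigma) via the standard normal tail. A minimal sketch, assuming the worst-case distance has already been found by the optimizer:

```python
from statistics import NormalDist

def form_failure_rate(worst_case_distance_sigma: float) -> float:
    """FORM approximation: the failure probability is the one-sided normal tail
    beyond the worst-case distance (the linear boundary through the worst-case point)."""
    return NormalDist().cdf(-worst_case_distance_sigma)

for beta in (4.75, 6.0, 6.5):
    print(f"worst-case distance {beta} sigma -> failure rate ~{form_failure_rate(beta):.2e}")
# ~1e-6 at 4.75 sigma and ~1e-9 at 6.0 sigma, consistent with the ppm/ppb targets above.
```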

First Order Reliability Model (FORM)

The error rate of using the FORM approximation is shown to be small. The algorithms for finding the worst-case point are presented, and they show how few simulation runs are required to find 6-sigma values with small errors.

The shapes of the performance functions of the SRAM bit cell are shown to be continuous and only slightly non-linear, so using the FORM approach results in small errors. MunEDA has applied these high-sigma Worst Case Analysis (WCA) algorithms in its EDA tools, resulting in the ability to scale to high-sigma levels like 5, 6 or even 7 sigma using only a small number of simulation runs. A typical 6.5-sigma SRAM bit cell analysis completes in under 2 minutes, using just one CPU.

The MunEDA high-sigma methods build models that are then used by machine learning (ML), and they scale nicely to handle large circuits, up to 110,000 mismatch parameters in a memory read-path analysis.

Cases where you should still run brute-force Monte Carlo analysis were also presented: strong non-linearity, a large number of variables, testbench complexity, or low-sigma targets. Results from customer examples that all used high-sigma analysis were shared.

Summary

If you ever wondered how an EDA vendor like MunEDA arrives at its results for high-sigma analysis, then this webinar is another must-see. It covers the history of various analysis methods, and how MunEDA chose its worst-case point method. Real numbers are shared, so you know just how fast their tools operate.

Register for this MunEDA webinar scheduled for November 14th at 9AM PST, and be prepared to have your questions answered by the experts.

Related Blogs


Arm Total Design Hints at Accelerating Multi-Die Activity
by Bernard Murphy on 11-02-2023 at 6:00 am

multi die

I confess I am reading tea leaves in this blog, but why not? Arm recently announced Arm Total Design, an expansion of their Compute Subsystems (CSS) offering, which made me wonder about the motivation behind this direction. They have a lot of blue-chip partners lined up for this program yet only a general pointer to multi-die systems and the applications that might drive the need. Neither Arm nor their partners will make this investment simply for PR value, so I have to assume there is building activity they are not ready to announce. I'm guessing that in a still shaky economy the big silicon drivers (in hyperscalers, AI, automotive, and maybe communication infrastructure) are already engaged in faster, cost-managed paths to differentiated custom silicon, likely in multi-die systems.

Arm CSS and Total Design

I wrote about CSS recently. CSS N2, as Arm describes it, is a customizable compute subsystem that is configured, verified, validated and PPA-optimized by Arm. Think of a multi-core cluster objective for which you don't just get the Lego pieces (CPU core, coherent interconnect, memory subsystem, etc.) but a complete customizable compute subsystem configured with up to 64 Neoverse N2 cores, multiple DDR5/LPDDR5 channels and multiple PCIe/CXL PHYs/controllers. All verified, validated, and PPA-optimized by Arm to a target foundry and process.

Most recently Arm revealed Arm Total Design, a comprehensive ecosystem of ASIC design houses, IP vendors, EDA tool providers, foundries, and firmware developers, assembled to accelerate and simplify the development of Neoverse CSS-based systems. EDA tools and IP are supplied by Cadence, Synopsys, Rambus and of course Arm, among others. Design services come from companies including ADTechnology, Alphawave Semi, Broadcom, Capgemini, Faraday, Socionext and Sondrel. For silicon process and packaging technology they call out Intel Foundry Services and TSMC (though curiously not Samsung; maybe they are still working on that partnership). And AMI is in this ecosystem to provide software and firmware support.

Reading the tea leaves

I recently blogged on a Synopsys-hosted panel on multi-die systems which suggested that at least 100 such systems are already in development. Representatives from Intel and Samsung voiced no objections to that estimate. At the same time there was consensus that these are technologies still very much in development, requiring close collaboration between system company, EDA, IP, chiplet, design services, foundry, and software development. This is not something that an in-house design team, even a hyperscaler design team, can handle on its own.

Arm mentions multi-die chiplet SoC designs in their release, though in a fairly general way, as the next frontier. I suspect the need is more pressing. Multi-die systems are becoming essential to support state-of-the-art designs driven by the latest AI innovations, especially around transformer-based techniques. We already know that datacenters are pushing these technologies, automotive applications are looking for differentiation through improved natural language recognition and visual transformers for better global recognition, and even wireless infrastructure sees applications for more intelligent services and more efficient radio communication.

All these applications are pushing higher levels of integration between compute, accelerators and memory, the kind of integration which requires multi-die packaging. This demands experts from foundries to design services to EDA tooling. We also need a ramp-up in available high-value chiplet designs, where the press release offers another hint. Socionext has built a multi-core CPU chiplet around CSS and is aiming it at TSMC 2nm for markets in server CPUs, data center AI edge servers, and 5G/6G infrastructure.

More momentum behind multi-die systems. You can read the press release HERE.



Generative AI for Silicon Design – Article 2 (Debug My Waveform)
by Anshul Jain on 11-01-2023 at 10:00 am

Generative AI for Silicon Design Article 2

Generative AI has been making waves across various industries, and its potential continues to expand. Among its many applications, one particularly intriguing area is the capacity of GenAI to explain digital design waveforms and act as a co-pilot for hardware engineers in the debugging process. In this article, we will explore the fascinating possibilities that this technology brings to the table and its impact on the field of silicon development.

Experiment 2: Explaining Design Waveforms

This experiment showcases the ability of generative AI to recognise patterns and trends in data and revolutionise the way we understand and interact with waveforms. Here, Google's generative AI model BARD was used to generate an explanation of write transfers in a system complying with the AMBA APB protocol by looking at a sample waveform (sourced from https://documentation-service.arm.com).

Write transfer with wait states

Generated Explanation (tweaked to make its language inclusive)

Disclaimer: This explanation is not reviewed for accuracy

The waveform in the image shows the following signals:

TCLK: The APB3 clock signal.
PADDR: The APB3 address bus.
PWRITE: The APB3 write signal.
PSEL: The APB3 chip select signal.
PENABLE: The APB3 enable signal.
PWDATA: The APB3 write data bus.
PREADY: The APB3 ready signal.
Data 1: The data being written to the APB3 completer device.

The waveform shows the following behavior:

At time T1, the PWRITE signal goes high and the PADDR bus is driven with the address of the APB3 completer device to be written to.

At time T2, the PSEL signal goes high. This indicates that the APB3 initiator is ready to transfer data to the completer device.

At time T3, the PENABLE signal goes high. This indicates that the APB3 initiator is ready to drive the PWDATA bus.

At time T4, the PREADY signal goes high. This indicates that the APB3 completer device is ready to receive the data on the PWDATA bus.

At time T5, the PWRITE signal goes low. This indicates that the APB3 initiator has finished transferring data to the completer device.

The PSEL and PENABLE signals remain high until the APB3 completer device has finished processing the data.

What Does the Future Look Like – A GenAI-Powered Debugger Copilot

One of the most exciting applications of generative AI is the potential it holds to be a co-pilot for developers during the debugging process. Waveforms are the lifeblood of complex systems in hardware design. A debugger copilot, powered by generative AI, can dissect waveforms, identify anomalies, and correlate patterns, making it an invaluable tool for engineers seeking to comprehend complex design behavior.
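To make the copilot idea concrete, here is a minimal, hypothetical sketch (standard library only, no real EDA tool or LLM API) of one way such a tool could turn sampled signal values into a textual summary and a prompt for a generative model; the signal data and prompt wording are invented for illustration.

```python
# Hypothetical sketch of a waveform-to-prompt step for a GenAI debug copilot.
# Signal samples and prompt text are invented; no real LLM or EDA API is called.

def summarize_waveform(samples):
    """samples: {signal_name: [(time_ns, value), ...]} -> list of human-readable edge events."""
    events = []
    for name, trace in samples.items():
        for (t0, v0), (t1, v1) in zip(trace, trace[1:]):
            if v0 != v1:
                events.append(f"{t1} ns: {name} changes {v0} -> {v1}")
    return sorted(events, key=lambda e: int(e.split()[0]))

def build_prompt(events, protocol="AMBA APB"):
    bullet_list = "\n".join(f"- {e}" for e in events)
    return (f"You are a hardware debug assistant. Explain the following {protocol} "
            f"signal activity and flag anything that violates the protocol:\n{bullet_list}")

samples = {
    "PSEL":    [(0, 0), (10, 1), (40, 0)],
    "PENABLE": [(0, 0), (20, 1), (40, 0)],
    "PREADY":  [(0, 0), (30, 1), (40, 0)],
}
print(build_prompt(summarize_waveform(samples)))
# The resulting prompt would then be sent to a generative model for explanation.
```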

Challenges in Developing a Debugger Copilot

While the prospects of a debugger copilot powered by generative AI are enticing, there are challenges that need to be addressed. The accuracy and reliability of the AI’s explanations are paramount. We must ensure that the AI’s insights are trustworthy and not just algorithmically generated responses. Moreover, privacy and data security concerns are significant. Handling sensitive information in the debugging process requires a robust framework to protect data integrity.

Conclusion

The journey of generative AI into the realm of explaining waveforms and acting as a debugger copilot is an exciting one. This technology has the potential to transform the way we develop and debug software, saving time, reducing costs, and increasing accuracy. As engineers continue to explore its possibilities, the future of debugging looks promising.

Also Read:

Generative AI for Silicon Design – Article 1 (Code My FSM)


WEBINAR: How to Achieve 95%+ Accurate Power Measurement During Architecture Exploration
by Daniel Nenni on 11-01-2023 at 6:00 am

AVFS IP SOC

Today's power modeling solutions are aimed at measuring power using the micro-events captured from detailed RTL simulation, or at studying the electromagnetic radiation relevant to IR drop and side-channel attacks. These solutions are fantastic for debugging and verification of the implementation. There are both open-source and commercial products for power modeling. For example, Synopsys PrimeTime can provide the percentage of activity in a stage for each device, Cadence Joules can debug the power intent, and Ansys Totem can help with transistor-level power integrity and reliability analysis. There is also an open-source power modelling add-on to SystemC. All of these are static solutions that take a simulation, process it and generate reports, or alternately work in conjunction with a debugger.

As power requirements get more stringent and complex, designers need to predict power consumption, the impact of power intent and the savings from power management units during architecture exploration. System designers and integrators want power consumption metrics in the datasheet for the entire activity range, not just one or two scenarios. The marketing team must be able to quickly generate power metrics for the requested scenarios. The thermal team requires heat and temperature data to plan the cooling, chassis design and board layout. System integrators need guidelines on battery lifecycle and peak power to plan the supply and capacitance. Thus, a good modelling solution must support power generation, storage, analog and digital consumption, and power management. The IP or semiconductor vendor must be able to share a dynamic model that quickly simulates scenarios. The accuracy of the power measurements during architecture exploration must be within 10% of the value at RTL or board level.

Power exploration starts by examining the tangible hardware and software choices that significantly affect SoC power consumption. These include:

  • Use cases
  • Architectural details (number of cores, choice of accelerators)
  • Power-saving features (DVFS, clock gating, power gating)
  • Firmware/software algorithms
  • Operating frequency and voltage modes
  • Process technology aspects that impact power

Modern SOCs employ multiple power management strategies either at individual IP level or full SOC level. One such strategy is adaptive voltage and frequency scaling (AVFS). In AVFS, a closed-loop feedback system is implemented between the voltage scaling power supply and delay-sensing performance monitor on the SoC. The on-chip performance monitor not only sees the actual voltage delivered on-chip but also understands whether the silicon is slow, typical or fast, and the effect of the temperature of the surrounding silicon. It can also track the utilization parameters to decide the choice of operating points.
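As a conceptual illustration only (not VisualSim code, and with invented numbers), a closed-loop AVFS controller of the kind described above can be sketched as follows: the monitor reports delay margin, the controller nudges the supply voltage, and dynamic power scales roughly with CV²f.

```python
# Conceptual AVFS feedback-loop sketch; not VisualSim code, all values invented.

def dynamic_power_w(c_eff_f, voltage_v, freq_hz, activity=0.2):
    """First-order dynamic power estimate: P ~ alpha * C * V^2 * f."""
    return activity * c_eff_f * voltage_v**2 * freq_hz

def avfs_step(voltage_v, delay_margin_ns, target_margin_ns=0.05, step_v=0.01):
    """Closed-loop rule: raise voltage if timing margin is too small, lower it if there is slack."""
    if delay_margin_ns < target_margin_ns:
        return voltage_v + step_v          # slow silicon / hot die: boost voltage
    if delay_margin_ns > 2 * target_margin_ns:
        return voltage_v - step_v          # plenty of margin: save power
    return voltage_v

voltage = 0.80
for margin in [0.02, 0.03, 0.08, 0.15, 0.12]:   # margins reported by the on-chip monitor
    voltage = avfs_step(voltage, margin)
    p = dynamic_power_w(c_eff_f=1e-9, voltage_v=voltage, freq_hz=1.5e9)
    print(f"margin {margin:.2f} ns -> V = {voltage:.2f} V, dynamic power ~ {p:.3f} W")
```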

The figure below represents an example of this feedback system between power manager and performance monitor on a chip.

System-level modelling solutions such as VisualSim Power Modeler 2.0 can be used to model the effect of power management strategies on the system performance and hence decide on a power-performance trade-off in early system design phases.

In the intricate realm of System-on-Chip (SoC) design, optimizing power is paramount. VisualSim Power Modeler 2.0 provides dynamic power exploration, optimization and measurement for an intellectual property block, semiconductor device, embedded system, chassis or data center. VisualSim enables system-level SoC power measurement even before delving into the intricacies of Register Transfer Level (RTL) development. Using this solution, designers and power architects can study ways to reduce power consumption, measure the power consumed by a software application, and assess the impact of hardware configurations, added hardware, changes in the interconnect, and new workloads/use cases. As the power modeling toolkit is common from IP vendors to system deployment, suppliers and system integrators can share encrypted power information along with the model.

VisualSim System-Level Power Modeling is used during the specification phase, before development. Thus, it is possible to optimize the architecture to achieve the power-performance trade-off before development and also reuse the model for downstream debugging.

The built-in AI engine continuously monitors the power activity to identify failures and the causes behind them.

See Webinar Replays Here
Also Read:

System Simulation, Architecture Exploration, Power Management, Performance Analysis

Mirabilis Invites System Architects at DAC 2023 in San Francisco

Mapping SysML to Hardware Architecture


The Significance of Point Spread Functions with Stochastic Behavior in Electron-Beam Lithography
by Fred Chen on 10-31-2023 at 10:00 am

Electron Beam Lithography

Electron beam lithography is commercially used to directly write submicron patterns onto advanced node masks. With the advent of EUV masks and nanometer-scale NIL (nanoimprint lithography), multi-beam writers are now being used, compensating for the ultralow throughput of a single high-resolution electron beam with the use of over 100,000 beams [1-2]. Unlike optical or EUV scanners, the resolution is not determined by the numerical aperture (NA) of the system. Instead, it starts with how small the electron beam is focused, followed by how the electrons are scattered and release more electrons, which are scattered in turn.

Commonly, Gaussian functions are used to model the electron beam and scattering profiles. However, a single Gaussian function is not sufficient to build the effective point spread function. This is because the backscattered electrons travel far beyond the range of the immediately scattered electrons. At least two Gaussian functions (and generally more) are needed for an accurate profile. Some published results from the literature [3-4], in fact, indicate exponential functions as more appropriate for fitting the energy deposited by a high-energy electron beam. Compared to a Gaussian function, an exponential function has a sharper, narrower peak, a sharper drop-off, and a more extended tail.

Figure 1. Exponential functions more realistically describe the combination of a narrow beam with extended backscattering.
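To visualize the qualitative difference described above (my own illustrative comparison, with arbitrary normalization and length scales rather than parameters fitted to the referenced data), a Gaussian and an exponential profile with the same characteristic width can be compared numerically:

```python
import math

def gaussian(r_nm, sigma_nm=10.0):
    return math.exp(-r_nm**2 / (2 * sigma_nm**2))

def exponential(r_nm, decay_nm=10.0):
    return math.exp(-abs(r_nm) / decay_nm)

# Arbitrary, illustrative length scales (not fitted to the data in refs [3-4]).
print(f"{'r (nm)':>8} {'Gaussian':>10} {'Exponential':>12}")
for r in (0, 5, 10, 20, 40, 80):
    print(f"{r:>8} {gaussian(r):>10.4f} {exponential(r):>12.4f}")
# The exponential falls off faster near the peak but retains a far larger tail
# at large radii, mimicking the extended backscattering contribution.
```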

The narrow peak and sharp drop-off are consistent with the highest energy electrons (>>10 keV) not deviating far laterally from the entry point, while the extended tail comes from all the secondary electrons (many generations, no doubt) from electrons which are backscattered from the substrate.

The resolution limit from this point spread function can be studied (as in the optical cases) by considering the space between two exposed points. As may be expected, the shot noise from the electron dose plays a critical role (Figure 2).

Figure 2. Reducing the dose to 16% of the nominal value increases the impact from shot noise visibly. The two exposed spots here are separated by 40 nm. The number of Poisson events is considered proportional to the exponential deposited energy profile.
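A minimal sketch of the kind of shot-noise experiment described in Figure 2 (my own toy model, not the author's actual calculation): the expected deposited energy from two exponential point spread functions 40 nm apart is converted into Poisson counts, and reducing the dose makes the valley between the spots relatively noisier.

```python
import numpy as np

def exponential_psf(x_nm, x0_nm, decay_nm=10.0):
    # Toy exponential point spread function (arbitrary decay length, not a fitted value).
    return np.exp(-np.abs(x_nm - x0_nm) / decay_nm)

def expose(dose_scale, spot_separation_nm=40.0, peak_events=400,
           rng=np.random.default_rng(0)):
    """Expected events along a 1-D cut through two spots, Poisson-sampled as counts."""
    x = np.arange(-40.0, 81.0, 4.0)                      # positions in nm
    mean = dose_scale * peak_events * (
        exponential_psf(x, 0.0) + exponential_psf(x, spot_separation_nm))
    return x, rng.poisson(mean)

for scale in (1.0, 0.16):                                # nominal dose vs 16% of nominal
    x, counts = expose(scale)
    valley = counts[np.argmin(np.abs(x - 20.0))]         # midpoint between the two spots
    print(f"dose x{scale}: peak count ~{counts.max()}, valley count ~{valley}")
# At 16% dose the valley count is small enough that its relative Poisson
# fluctuation becomes visibly larger, consistent with the trend in Figure 2.
```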

What also should not be a surprise is the impact of blur. Here blur comes from a wider range of electron energies (many keVs) than expected from EUV lithography (only up to ~92 eV).

Figure 3. Increasing blur reduces the peak-valley dose difference. Since the peak dose here is fixed, the valley dose is actually increasing with blur. As in Figure 2, the two exposed spots here are separated by 40 nm, and the number of Poisson events is considered proportional to the exponential deposited energy profile.

From Figure 3, we note two trends. First, the peak-valley dose difference diminishes with increasing blur. Second, while the peak dose is fixed, the valley dose increases with increasing blur. In terms of the impact on stochastic dose variations, these two trends counter each other. The first trend increases sensitivity to fluctuations, but the second trend reduces the amplitude of such fluctuations in accordance with Poisson statistics. Consequently, the CD variation due to these fluctuations is comparable among the three cases (Figure 4).

Figure 4. CD variation for 20 points repeatability run, for the cases shown in Figure 3. The amount of variation is comparable among the three cases.

The 3-sigma uniformity is on the order of 3-4 nm (~15-20%) for the three cases examined. Obviously, the dose needs to be increased to reduce this to smaller values. However, this will proportionally increase write time if the number of beams is not increased; a doubling is considered unacceptable [2]. Larger CDs (>20 nm) will not be impacted as significantly [5], as the peak-valley difference is increased and more point spread functions are overlapped together in writing the spots [1,6]. However, for nanoimprint masks and High-NA EUV masks with subresolution assist features (SRAFs), the stochastic variations cannot be ignored.

References

[1] M. Tomandl et al., Proc. SPIE 12802, 1280204 (2023).

[2] C. Klein et al., Proc. SPIE PC12497, PC1249707 (2023).

[3] V. R. Manfrinato et al., Nano. Lett. 14, 4406 (2014).

[4] V. R. Manfrinato et al., Nano. Lett. 13, 1555 (2013).

[5] K. Ichimura et al., Proc. SPIE 12293, 122930Q (2022).

[6] K. Ichimura et al., Proc. SPIE 10584, 105840P (2018).

This article first appeared on LinkedIn Pulse: The Significance of Point Spread Functions with Stochastic Behavior in Electron-Beam Lithography

Also Read:

KLAC- OK quarter in ugly environment- Big China $ – Little Process $ – Legacy good

ASML- Longer Deeper Downcycle finally hits lithography – Flat 2024 – Weak Memory – Bottom?

ASML- Absolutely Solid Monopoly in Lithography- Ignoring hysteria & stupidity