
DAC 2021 – Accellera Panel all about Functional Safety Standards

by Daniel Payne on 12-14-2021 at 10:00 am


Functional safety has been at the forefront of the electrification of our vehicles, with new ADAS features and the push toward autonomous driving all requiring compliance with the ISO 26262 functional safety standard. I attended the Accellera-hosted panel discussion on Monday at DAC, hearing from functional safety panelists who work at AMD, Arm, Texas Instruments, and DARPA. Alessandra Nardi, Accellera Functional Safety Working Group Chair, moderated the panel discussion and started with a big-picture overview.

The Functional Safety Working Group (FS WG) started out as a proposed working group back in October 2019, and soon had a kickoff in December 2019. By February 2020 the working group was officially formed from member company representatives, and about 30 companies are now working on the standard. You can expect a Data Model definition white paper to come out in Q1 2022, and a Draft Language (LRM) release by Q2 2022.

The big idea is to exchange the same FS data across different automation tools, with a connection between the FS data and the design info. To do that, Accellera will define a data format/language.

Today there are many Functional Safety standards, such as ISO 26262 for automotive, along with others for industries like medical, industrial, aviation, railways, and machinery. The Accellera FS WG will create a data standard, then collaborate with the IEEE as the standard matures and the IEEE publishes P2851.

AMD – Alexandre Palus

Functional Safety (FuSa) is concerned with both Systematic Failures and Random Failures. For vehicles, the Automotive Safety Integrity Level (ASIL) defines four classes, A through D, where a failure at ASIL-D can result in human death, and ASIL-B failures can cause human injury.

Design for safety costs more on a project up front, but in the long run is less costly, based on experience, because there are fewer mask spins to correct safety issues. To be successful, teams really need to have a FuSa architect, and must not treat FuSa as an afterthought. If everyone does their job right, then nobody ends up in court.

Arm – Ghani Kanawati

Does FuSa add more complexity to the development of a product? Yes, it’s a new requirement, but we need to convince design managers that FuSa is simple to implement. There may be conflicts during development between security and safety goals, and each project has unique requirements, but safety concerns need to come into the project at the very beginning.

Having a lifecycle process in place ensures a higher chance of success when designing for FuSa. Here’s a typical lifecycle process for Soft IP development at Arm:

The FuSa design process does add more steps, and they are well understood, and sure, there’s a learning curve. If a random fault happens in your hardware design, then what happens to the system behavior? Does your system respond safely?

Texas Instruments – Bharat Rajaram

There are over a dozen FS standards spanning many inter-related industries, and ISO 26262 is the gold standard for semiconductor companies. At TI they’ve been designing for safety since the 1980s. Automotive examples that require FuSa include:

  • ASIL A – Rear lights, Vision ADAS
  • ASIL B – Instrument cluster, headlights, rear-view cameras
  • ASIL B&C – Active suspension
  • ASIL C – Radar cruise control
  • ASIL C&D – Engine management
  • ASIL D – Antilock braking, electric power steering, airbag

Industrial systems have their own FuSa levels: SIL 1, SIL 2, SIL 3. Safety engineers are also concerned about Failure In Time (FIT) rates, where there are standards to follow, like IEC TR 62380, SN 29500, and JESD85.
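As a quick illustration of the FIT metric mentioned above: one FIT is conventionally one failure per billion (10^9) device-hours. The sketch below converts a FIT rate into a mean time to failure and an expected failure count for a fleet. The 100 FIT rate, the fleet size, and the operating hours are made-up numbers for illustration, not figures from the panel.

```python
# 1 FIT = 1 failure per 10^9 device-hours (standard definition of the unit).
HOURS_PER_FIT_UNIT = 1e9

def mttf_hours(fit):
    """Mean time to failure, in hours, for a constant failure rate in FIT."""
    return HOURS_PER_FIT_UNIT / fit

def expected_failures(fit, devices, hours_each):
    """Expected number of failures across a fleet of identical devices."""
    return fit * devices * hours_each / HOURS_PER_FIT_UNIT

# Hypothetical part rated at 100 FIT, one million units, 5,000 hours each.
print(mttf_hours(100))
print(expected_failures(100, 1_000_000, 5_000))
```

Even a part with a very long mean time to failure produces a steady trickle of field failures once it ships in volume, which is why FIT budgets matter to safety engineers.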

DARPA – Serge Leef

In Serge’s group they are mostly focused on security, not safety, but the two are related, because security breaches can cause safety failures. At DARPA one big concern is how to secure chips in the supply chain from gray market devices and hacking attempts. An electronic system has an attack surface reference model, and security exploits happen in four categories:

  • Side Channel
  • Reverse Engineering
  • Supply Chain
  • Malicious Hardware

The DARPA proposal is to harden new chips with an on-chip security engine:

Both Synopsys and Arm are working on the specification of this on-chip security engine, so stay tuned for more details.

Serge was skeptical of the FuSa concepts, based on experience working in EDA, because companies couldn’t explain clearly enough what compliance to the ISO 26262 standard meant for EDA tools that are used to develop IP. In government circles they talk about Quantifiable Assurance, which is FuSa for defense systems.

For most digital systems, with only 5-10% of the state space even being simulated, can you really assure that the chips will operate safely under all conditions?

Panel Discussion

Q: Are we going beyond 5% coverage of the digital state space?

A: Alessandra – optimistic about progress of continued FuSa best practices across many industries, and safety is becoming more mainstream now.

A: Alex – we started asking vendors back in 2002 and 2003 for FuSa, but they weren’t offering much. There have been gradual improvements from Arm and EDA vendors, and we feel that cores are safe, and most IP is becoming safer. Most EDA and IP vendors have checkboxes for safety compliance. Yes, security is unsolved at the moment.

A: Bharat – Do you know how the aileron controller operates safely in a jet? They have 3 out of 5 microcontrollers vote. Even the S-Class Mercedes comes with about 200 controllers, and they have to meet FuSa standards to get into the car ecosystem. In the early days air gap approaches stopped hacking. By 2030 about 90% of cars will have connectivity, which also provides a huge attack surface for criminals.

A: Ghani – FuSa is really a tier of things to consider, while automotive vendors often are only willing to spend quite frugally. The ISO 26262 standard is about 300 pages in length now.

A: Serge – Hackers have breached cars with WiFi, and also through entertainment and CAN bus attacks, but where is the economic gain?
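Bharat's aileron example, where 3 out of 5 microcontrollers have to agree, can be sketched as a simple majority voter. This is only an illustration of the voting idea, not how any real flight controller is implemented; the function name `vote`, the quorum parameter, and the sample values are all hypothetical.

```python
from collections import Counter

def vote(outputs, quorum=3):
    """Return the value that at least `quorum` channels agree on, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= quorum else None

# Five redundant controllers compute the same command; one has a fault.
print(vote([12, 12, 13, 12, 12]))

# No value reaches the quorum, so the voter reports no valid result.
print(vote([1, 1, 2, 2, 3]))
```

Returning `None` when no quorum is reached leaves the fallback decision to the surrounding system, which is where the safety analysis would focus.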

Q: How do you separate security from safety standards?

A: Ghani – yes, we’ve seen hacking in cars, so security and safety are always interrelated.

A: Alex – the security of SW in our cars is quite a recent development. Hackers could even place a Trojan into an auto company during development, and then trigger a ransom attack.

A: Serge – with the market push for autonomous vehicles, the security aspects are quite high.

Also Read:

Accellera Unveils PSS 2.0 – Production Ready

Functional Safety – What and How

An Accellera Update. COVID Accelerates Progress


Intel Discusses Scaling Innovations at IEDM

by Scotten Jones on 12-14-2021 at 6:00 am


Standard Cell Scaling

Complex logic designs are built up from standard cells, so to continue scaling logic we need to continually shrink the size of standard cells.

Figure 1 illustrates the dimensions of a standard cell.

Figure 1. Standard Cell Dimensions.

From figure 1 we can see that shrinking standard cell sizes requires shrinking the cell height, or the cell width, or both. The height of the cell is Metal 2 Pitch (M2P) multiplied by the number of Tracks. The cell width is determined by the Contacted Poly Pitch (CPP) and whether the cell has single or double diffusion breaks.
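The cell dimension relationships above can be written as a small calculation. This is a rough sketch: the pitch values are hypothetical, and treating a double diffusion break as costing one extra CPP of width is an assumption made for illustration rather than something stated in the article.

```python
def cell_height_nm(m2p_nm, tracks):
    """Cell height = Metal 2 Pitch x number of tracks."""
    return m2p_nm * tracks

def cell_width_nm(cpp_nm, gate_pitches, double_diffusion_break=False):
    """Cell width as a whole number of Contacted Poly Pitches.

    Assumption: a double diffusion break adds one CPP to the cell width.
    """
    extra = 1 if double_diffusion_break else 0
    return cpp_nm * (gate_pitches + extra)

# Hypothetical process: 30nm M2P, 50nm CPP, 6-track cell, 3 gate pitches.
height = cell_height_nm(30, 6)
for ddb in (False, True):
    width = cell_width_nm(50, 3, double_diffusion_break=ddb)
    print(ddb, height, width, height * width)
```

Reducing the track count shrinks the height term, and reducing CPP or moving to single diffusion breaks shrinks the width term, which is how the two Intel papers map onto the two cell dimensions.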

Shrinking cell height and width impacts the underlying device structures. In figure 1 on the right side is a simple cross section of the fins that must fit in the cell and at the bottom of the figure is a simple cross section of the elements that make up CPP. The two Intel papers I want to discuss in this write-up are the 3-D CMOS paper that can enable reduced cell height and the 2D Monolayer CMOS paper that can enable reduced cell width.

3-D CMOS (CFET)

Figure 2 illustrates the FinFET device dimensions that must fit into the cell height.

Figure 2. Cell Height Scaling.

From figure 2 we can see that the cell height includes two cell boundaries, some number of fin pitches (depends on number of fins) and the n-p spacing between the nFET and pFET fins.

Figure 3 illustrates that once a transition is made to horizontal nanosheets the n-p spacing can be reduced by various options.

Figure 3. n-p Spacing.

On the left side of figure 3 is a standard horizontal nanosheet that needs the same n-p spacing we would see with a FinFET. This type of configuration supports a 6-track cell height or, with the addition of buried power rails (BPR), a 5-track cell (BPR reduce the cell boundary width). The middle of the figure illustrates adding a dielectric wall in between the nFET and pFET to reduce n-p spacing and enable track heights of 4.33 to 4.00. Finally, on the right side of the figure a 3-D CMOS device (CFET) is illustrated, and the n-p spacing is now zero in the lateral dimension because the FETs are stacked. This approach can support track heights of 4.00 to 3.00.

In their “Opportunities in 3-D stacked CMOS transistors” paper at IEDM, Intel provided an overview of 3D-CMOS.

The basic idea behind 3D-CMOS is illustrated on the left side of figure 4.

Figure 4. 3D-CMOS.

There are two main options for 3D-CMOS.

In a sequential approach the bottom device layer is fabricated up through gates and contacts on one wafer, a top layer of devices is separately fabricated on a second wafer and then deposited onto the bottom devices through layer transfer or bonding, followed by interconnect for the resulting two-layer structure. The sequential approach is illustrated in the top right-hand side of figure 4.

The sequential approach requires extra processing because the bottom and top layers are fabricated independently, but it offers the ability to mix and match various materials, for example Germanium PMOS devices with Silicon NMOS devices or even introducing Gallium Nitride devices. It does require combining the two device layers without degrading either layer and a critical bonding step.

The second approach is the self-aligned approach, where both the bottom and top layers are fabricated on the same wafer. This approach can in theory reduce the process complexity but does present integration challenges to achieving good device performance for both layers. The self-aligned approach is illustrated in the bottom right-hand side of figure 4.

3D-CMOS is a promising solution to continue scaling after horizontal nanosheets enter production.

2D Monolayer CMOS

If we look at the CPP cross section at the bottom of figure 1 in more detail, we get the diagram on the left side of figure 5. CPP is made up of Gate Length (Lg), contact width and twice the contact to gate spacer thickness.

Figure 5. Contacted Poly Pitch Scaling.

As we can see from the table on the right side of figure 5, TSMC for example, has been shrinking CPP by reducing all three dimensions.
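The CPP decomposition described above (gate length, plus contact width, plus a spacer on each side of the gate) is simple enough to express directly. The dimensions in this sketch are placeholders chosen for illustration and are not taken from the table in figure 5.

```python
def cpp_nm(gate_length_nm, contact_width_nm, spacer_thickness_nm):
    """Contacted Poly Pitch = Lg + contact width + 2 x spacer thickness."""
    return gate_length_nm + contact_width_nm + 2 * spacer_thickness_nm

# Hypothetical dimensions in nm: Lg = 16, contact width = 14, spacer = 6.
baseline = cpp_nm(16, 14, 6)

# Shrinking only the gate length to 10nm shows how much CPP Lg scaling buys.
shorter_gate = cpp_nm(10, 14, 6)
print(baseline, shorter_gate)
```

Because the spacer term counts twice, a 1nm reduction in spacer thickness is worth as much as a 2nm reduction in gate length or contact width.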

Lg is limited by the type of device being used. The more constrained the channel is, and the more gates that are used to control it, the shorter the minimum gate length can be. Figure 6 presents the limits for different device types, plotting the minimum gate length versus channel thickness and number of gates for silicon.

Figure 6. Gate Length Scaling.

From the bottom left part of figure 6 we can see that for a planar transistor with a single gate, the channel is poorly controlled, and the gate length limit is around 30nm (theoretically it is less, but all the logic manufacturers moved off single gate planar devices by 30nm). Moving to a planar device with a thin channel and two gates, as is seen with FDSOI, reduces the minimum gate length to approximately 23nm. FinFETs, with the channel constrained to a thin fin and three gates, enable gate lengths down to approximately 16nm; this is one of the reasons FinFETs have been adopted as the logic mainstream. As we move forward, horizontal nanowires/nanosheets with four gates offer minimum gate lengths of approximately 13nm. Finally, beyond nanosheets, the Intel work discussed here addresses 2D devices that can enable gate lengths of less than 10nm, providing further CPP scaling. This is a promising next step beyond 3D-CMOS or may be integrated with 3D-CMOS. There were many papers presented at the conference representing a variety of companies and research groups, illustrating the interest in this technology.

The Intel paper discussing this is entitled: “Advancing 2D Monolayer CMOS Through Contact, Channel and Interface Engineering”.

As silicon is scaled down, the channel must get thinner and mobility degradation eventually occurs; the silicon limit for good mobility is approximately 5nm. Transition Metal Dichalcogenide (TMD) materials show mobility in ~1nm monolayer films similar to their bulk mobility, making them attractive candidates for 2D devices. TMD films will have lower mobility and higher contact resistance than current generation silicon CMOS devices, but simulations indicate that even with these drawbacks, stacking enough 2D layers will provide a performance and scaling improvement over silicon horizontal nanosheets. For example, if 2D layers are stacked 6 high with a Lg of 5nm and metallic contacts, significant scaling, power, and performance improvements can be achieved.

Figure 7 illustrates 2D Devices.

Figure 7. 2D Devices.

The paper reviews three key areas: channel material quality, contact resistance, and gate stack quality.

The best channel results in the literature are from deposition techniques that haven’t been demonstrated on 300mm wafers. MOCVD and nucleated CVD on pre-patterned seeds have the potential for 300mm deposition. MOCVD offers the prospect of a wide temperature range, including 300°C deposition that could open up TMD channel deposition compatible with the Back End of Line (BEOL). Nucleated CVD offers grain boundary free devices, and Intel has achieved the best published WS2 mobility.

Low resistance contacts to NMOS and PMOS remain a challenge for 2D FETs. The authors show promising results for Sb on MoS2, and Sb offers a higher melting point than Bi (the other leading NMOS contact material). PMOS contacts remain far more challenging; the authors showed some results with Ru, but there are still a lot of challenges.

The 2D materials of interest here are well known to collect organic processing residues that can inhibit ALD deposition of gate oxides. The authors compared a vacuum anneal and a forming gas anneal. Both reduced the carbon contamination levels, and the forming gas anneal was shown to improve measured electrical performance for a MOSCAP.

By comparing the work done here with previously published work the authors have shown where 2D devices currently stand and introduced promising new contact materials and deposition techniques.

2D devices are far from being ready for manufacturing but several groups are pursuing them, and steady progress is being made.

Conclusion

Samsung is currently trying to be the first in the industry to put horizontal nanosheets (HNS) into production. Intel and TSMC are also working on HNS. It is likely that HNS will carry the industry at least through 2025. By around 2028, 3D-CMOS (called CFETs by others) may be ready for production, incorporating vertical stacks of n and p nanosheets. As a follow-on to 3D-CMOS, or even an extension of 3D-CMOS, 2D devices are a potential path for continued scaling past the end of the decade. Intel is clearly trying to reassert itself as a semiconductor technology leader.

Also Read:

SISPAD – Cost Simulations to Enable PPAC Aware Technology Development

TSMC Arizona Fab Cost Revisited

Intel Accelerated


DAC 2021 – Joe Sawicki explains Digitalization

by Daniel Payne on 12-13-2021 at 10:00 am


Monday at DAC this year started off on a very optimistic note as Joe Sawicki from Siemens EDA presented in the Pavilion on the topic of Digitalization, a frequent theme in the popular press because of the whole Work From Home transition that we’ve gone through during the pandemic. Several industries are benefiting from the digitalization trend: semiconductor, aerospace, defense, automotive, heavy machinery, medical, consumer products, energy, utilities and even marine.

The ever-present Tesla leads the way in EV sales as owners enjoy the benefits of SW updates over the air, adding new features and fixing bugs, much like our smart phone apps, all enabled by semiconductors. Even President Biden extols the virtues of semiconductor production in the US, and how national policy should benefit the semiconductor industry.

During the pandemic we’ve experienced trauma because of illness and death, yet the move to digital commerce has sharply risen, and cloud-based services are flourishing, like Zoom and Microsoft Teams. Most modern businesses are quickly moving their services and support to digital and cloud platforms.

Semiconductor content in electronic products has moved from 16% up to 25% as a mega-trend, think: Amazon, Google, Tesla, Bosch, ZTE, Huawei, Apple and Facebook.

Foundry revenue trends are showing that systems companies are growing at a 26.8% CAGR, which is a big shift in who does SoC designs. Some of the drivers at foundries include: Sensors, edge computing, 5G, wireless, cloud and data center. Just in the past 5 years there has been a 5X increase in the number of sensors connected to the Internet, projected to reach 29.6 billion devices by 2025. An example of connected sensors is the Ring Doorbell.

There’s growth in how 5G and IoT markets are linked, and even data centers are forecasted to have a 14% CAGR during the 2028-2030 period. The flow of information is from sensors, to edge processing, to 5G or wireless, and it is all ending up in a data center.

Within the next 10 years the projection is that 95% of our vehicles will be connected, which creates big demands on the wireless infrastructure. ADAS is expected to have a 22% CAGR from 2020 to 2030. The value of Electronic Control Units (ECUs) in a car went from $302 in 2010, to $499 in 2020, and is expected to reach $758 by 2030. The electrification trend in automotive is in growth mode, so one challenge is to hire enough system designers to keep pace with competitive demands.
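Several of the figures in this talk are quoted as CAGR, so it can help to see how a compound annual growth rate follows from two data points. The sketch below applies the standard CAGR formula to the ECU values quoted above; the function name is just for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1

# ECU value per car quoted in the talk: $302 (2010), $499 (2020), $758 (2030).
past_decade = cagr(302, 499, 10)
next_decade = cagr(499, 758, 10)
print(f"2010-2020: {past_decade:.1%}")
print(f"2020-2030: {next_decade:.1%}")
```

Both decades work out to roughly 4% to 5% per year, well below the 22% CAGR quoted for ADAS, which suggests that ADAS is a fast-growing slice of a more slowly growing electronics content per car.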

AI in semiconductor has a steep growth of some 31% CAGR to 2030, and it’s becoming a pervasive technology. We now have chips optimized for AI training and inferencing. Semiconductor companies are starting new domain-specific AI chips in diverse areas like: voice, video, cyber security, IoT, odor detection, etc.

The overall semiconductor revenues are predicted to grow at a 9.5% CAGR over 2020-2025, while the GDP CAGR is a bit lower at 6.2%, so let the good times continue to roll. The US government is considering legislation to fund research in the semiconductor industry to help gain critical market share. VC funding for fabless companies reached some $8B in 2021, setting another record year, and foundry investments are expected to reach $79.6B in 2021.

Historically the R&D funding as a percentage of revenue for semiconductor companies has been about 14.2%, while it has ranged from 12% to 18% over the years. During the pandemic our industry has seen shortages in the supply chain, as demand is strong, however the basic materials can be scarce. The shortage is slowing down our consumer and industrial productivity, but how much double ordering is really going on right now?

The majority of VC investing in semiconductor has focused on China markets recently, at 53%.

In our EDA world the revenues are driven by new design starts, not semiconductor revenues, so just how many new designs will start using the 5nm node? We’ve seen failed predictions in the past, like some who said that only 3 companies would even attempt a 5nm design. In reality, IC design starts are quite strong in many areas: wearables, IoT, ADAS, industrial, smart grid, 5G. In Q2 of 2021 the EDA revenues were at $3B, a growth of 14.6%, so that’s a healthy increase, and EDA has seen four quarters now of double-digit growth. In fact, EDA revenues have recently seen their highest growth rate in the past 10 years.

Diving a bit deeper, the following areas are driving EDA growth:

  • AMS, RF – 12.4%
  • DFT – 9.7%
  • IC Full Custom – 8.4%
  • Formal – 8.4%
  • Logic – 7.4%
  • Layout Verification – 7.4%
  • P&R – 7%
  • PCB – 10%

Challenges going forward into the next decade include three areas: technology scaling, design scaling, and system scaling. Consider Apple as an example of a systems company designing its own A-series processors over the past 8 years: the A7 processor had about 1 billion transistors, while the latest A15 processor includes some 15 billion transistors, almost in perfect alignment with the 16X increase predicted by Moore’s Law.
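The 16X figure follows from the usual statement of Moore’s Law, transistor counts doubling roughly every two years, applied over 8 years. The two-year doubling period in this sketch is the common rule of thumb and an assumption here, since the talk summary does not state the period explicitly.

```python
def moores_law_factor(years, doubling_period_years=2):
    """Growth factor if transistor count doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

predicted = moores_law_factor(8)   # four doublings over 8 years
observed = 15e9 / 1e9              # A15 (~15B) versus A7 (~1B) transistors

print(predicted, observed)
```

Four doublings give a factor of 16, and the observed A7 to A15 ratio of about 15 sits just under that prediction.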

Machine learning as a technique is being applied to EDA tools to enable accelerations in yield ramp, pattern analytics, and metrology. Predicted costs for design scaling are that 3nm SoC costs could range from $535M – $626M. In EDA the use of High Level Synthesis (HLS) continues, and Siemens has offered HLS for about 25 years now. With HLS systems companies like NVIDIA are designing new chips with a small group of just 10 engineers in just 6 months. As AI techniques become more ubiquitous, we can expect to see even more accelerator products announced that are application specific, and during the design phase the engineers will explore to find the optimum architecture.

Another new acronym is STCO, short for System Technology Co-Optimization, where multiple die and chiplets are being used to assemble new systems, along with the use of 3D die stacking, and other advanced packaging concepts.

For system scaling the path forward calls for mixed-mode, virtual simulations, along with hardware-assisted verification techniques like using emulation, where a systems designer can actually run real apps even before the silicon has been manufactured. Emulation will also be used more for the debug of HW/SW integration issues early in the design process.

Within Siemens there’s something called PAVE360, which allows a systems approach to model, sense, compute, analyze and actuate as a model, prior to implementation. In the pursuit of autonomous vehicles, using a PAVE360 methodology is more practical than driving billions of actual miles to uncover safety issues.

Summary

Mr. Sawicki was quite upbeat about the semiconductor industry trends, and the number of IC design starts is great news for EDA vendors of all types, although there are formidable challenges ahead to meet the scaling demands.

Also Read:

System Technology Co-Optimization (STCO)

Siemens EDA will be returning to DAC this year as a Platinum Sponsor.

Machine Learning Applied to IP Validation, Running on AWS Graviton2


A Practical Approach to Better Thermal Analysis for Chip and Package

by Daniel Nenni on 12-13-2021 at 6:00 am


Thermal modeling has become a hot topic for designers of today’s high-speed circuits and complex packages. This has led to the adoption of better and more sophisticated thermal modeling tools and flows as exemplified in this presentation by Micron at the IDEAS Digital Forum. The presentation is titled “Thermal Aware Memory Controller Design with Chip Package System Simulation” and covers the latest developments in both power modeling and thermal modeling by the Controller design team at Micron.

The first presenter is Shiva Shankar Padakanti, a senior physical design manager at Micron with over 17 years of experience in backend design and more than 33 tape-outs down to 7nm. Shiva introduces the two major thermal issues faced by his team: (a.) avoiding overly pessimistic thermal limits that degrade a chip’s performance, and (b.) avoiding thermal runaway – a reliability issue where local hotspots cause increased device leakage, which increases the temperature yet further.

Shiva sets the stage by discussing their traditional thermal analysis flow, which assumed a uniform temperature across the entire chip based on total power and relied on simple power/temperature limits with a large safety margin. This constrained power signoff to use unrealistically pessimistic temperature limits because the analysis under-reported the true maximum temperature. This could lead to compromises in the design’s specification and significant loss in chip performance due to over-design. The first attempt to improve their analysis capability was to analyze the power on a block-by-block basis instead of full-chip. This gave a more realistic non-uniform temperature distribution but was still unable to account for temperature-dependent leakage power.

Working with Ansys, Micron developed a new analysis flow that uses the Chip Thermal Model (CTM) technology augmented with the APL Leakage Model. A CTM cuts each layer in the chip into a fine grid and then describes the power output of each grid square as a function of the temperature. The APL Leakage files capture how device leakage varies with temperature. These models are generated by Ansys RedHawk™ or Ansys Totem™ power integrity signoff and give a much more accurate and fine-grained power model. This was then handed off to the Thermal team to enable their package and system thermal analysis.

Fig.1 Thermal analysis flow using Chip Thermal Models (CTM) generated by Ansys RedHawk or Ansys Totem power integrity signoff tools, and then used for package and system thermal analysis by Ansys Icepak.

The advantage of the CTM technology is that it accurately predicts the location of thermal hotspots and, in this test case, predicted a temperature 12% higher than the simpler block-based approach (see Fig.2). This higher temperature results from the accurate modeling of temperature-dependent leakage, which was not considered in the block-based or traditional flows.

Fig.2 Shows a comparison of the temperature profile using the simpler block-based thermal modeling approach against the more accurate Chip Thermal Model that relies on a per-layer gridded model. The CTM technology accurately identifies the hotspot locations and predicts a 12% higher temperature based on temperature-dependent leakage

The second part of the presentation is narrated by Ravi Kumar, senior principal engineer at Micron with over 9 years’ experience in thermal management of electronics. Ravi starts by pointing out that chip, package, and system analyses are each at a different scale, from microns to centimeters, and thus require a range of simulation technologies to span this range. Also, simulating a complete stack as shown in Fig.3 is very computationally expensive for each temperature point, often limiting the scope of thermal analysis.

Fig.3 Cross section of the complete chip-package-system stack for the Micron controller under thermal analysis, including the PCB substrate and the external heat sink. The cooling airflow over the heatsink is modeled by Icepak using Ansys’ computational fluid dynamics technology.

However, by using the CTM modeling approach, Ravi’s team was able to speed up the thermal simulation time by 90% due to the higher efficiency and faster convergence of the CTM approach. The final operating temperature depends, of course, on its power output. But the power output is also temperature dependent. Icepak executes internal iterations using the CTM to arrive at a stable operating temperature. In this test case, the heat sink was designed to radiate an estimated 50W, but the system actually ended up generating closer to 60W. Failure to anticipate the real heat flow can heat stress the package and impact the performance and reliability of the entire system.
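The circular dependency described above, where temperature depends on power and power depends on temperature, is typically resolved by iterating until the two agree. The sketch below shows the idea with a single lumped thermal resistance and an exponential leakage model. It is not how Icepak or the CTM is implemented, and every constant in it is a made-up value chosen only so the loop converges.

```python
import math

# All values below are hypothetical, for illustration only.
T_AMBIENT_C = 25.0        # ambient temperature
R_THETA_C_PER_W = 0.5     # lumped thermal resistance, junction to ambient
P_DYNAMIC_W = 45.0        # temperature-independent switching power
P_LEAK_REF_W = 5.0        # leakage power at the reference temperature
T_REF_C = 25.0            # reference temperature for the leakage model
LEAK_COEFF_PER_C = 0.02   # exponential leakage growth per degree C

def power_w(temp_c):
    """Total power at a given temperature: dynamic plus leakage."""
    leakage = P_LEAK_REF_W * math.exp(LEAK_COEFF_PER_C * (temp_c - T_REF_C))
    return P_DYNAMIC_W + leakage

def solve_operating_point(tol_c=1e-6, max_iter=100):
    """Iterate temperature -> power -> temperature until it stops changing."""
    temp_c = T_AMBIENT_C
    for _ in range(max_iter):
        next_temp_c = T_AMBIENT_C + R_THETA_C_PER_W * power_w(temp_c)
        if abs(next_temp_c - temp_c) < tol_c:
            return next_temp_c, power_w(next_temp_c)
        temp_c = next_temp_c
    raise RuntimeError("no stable operating point (possible thermal runaway)")

temp, power = solve_operating_point()
print(f"stable at {temp:.2f} C dissipating {power:.2f} W")
```

With these numbers, a first estimate at ambient temperature gives 50W, but the converged answer is higher because leakage grows as the die heats up. If the leakage term grows too quickly the loop never settles, which is the thermal runaway case the presenters want to avoid.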

A final benefit highlighted by the Micron team was the ability to optimize the placement of thermal sensors on the chip. The traditional techniques had not accurately placed the sensors at the true maximum hotspots and under-measured the hotspot temperature by 8.1°C. The new CTM-based approach optimized their placement and reduced the risk of thermal runaway.

Shiva concluded the presentation by outlining future projects by his team to consider thermal-aware electromigration analysis, and the mechanical warpage of package and PCB due to thermal gradients.

You can view the entire Micron presentation on-demand at Ansys IDEAS Digital Forum under the Electrothermal Analysis track. Registration is free.

Also Read

Ansys CEO Ajei Gopal’s Keynote on 3D-IC at Samsung SAFE Forum

Ansys to Present Multiphysics Cloud Enablement with Microsoft Azure at DAC

Big Data Helps Boost PDN Sign Off Coverage


Edge Computing Paradigm

by Ahmed Banafa on 12-12-2021 at 6:00 am


Edge computing is a model in which data, processing and applications are concentrated in devices at the network edge rather than existing almost entirely in the cloud.

Edge Computing is a paradigm that extends Cloud Computing and services to the edge of the network. Similar to Cloud, Edge provides data, compute, storage, and application services to end-users.

Edge Computing reduces service latency and improves QoS (Quality of Service), resulting in a superior user experience. Edge Computing supports emerging Metaverse applications that demand real-time, predictable latency (industrial automation, transportation, networks of sensors and actuators). The Edge Computing paradigm is well positioned for real-time Big Data and real-time analytics: it supports densely distributed data collection points, hence adding a fourth axis to the often-mentioned Big Data dimensions (volume, variety, and velocity).

Unlike traditional data centers, Edge devices are geographically distributed over heterogeneous platforms, spanning multiple management domains. That means data can be processed locally in smart devices rather than being sent to the cloud for processing.

Edge Computing Services cover:

  • Applications that require very low and predictable latency.
  • Geographically distributed applications
  • Fast mobile applications
  • Large-scale distributed control systems

Advantages of Edge computing

  • Bringing data close to the user. Instead of housing information at data center sites far from the end-point, the Edge aims to place the data close to the end-user.
  • Creating dense geographical distribution. First of all, big data and analytics can be done faster with better results. Second, administrators are able to support location-based mobility demands and not have to traverse the entire network. Third, these (Edge) systems would be created in such a way that real-time data analytics become a reality on a truly massive scale.
  • True support for mobility and the Metaverse. By controlling data at various points, Edge computing integrates core cloud services with those of a truly distributed data center platform. As more services are created to benefit the end-user, Edge networks will become more prevalent.
  • Numerous verticals are ready to adopt. Many organizations are already adopting the concept of the Edge. Many different types of services aim to deliver rich content to the end-user. This spans IT shops, vendors, and entertainment companies as well.
  • Seamless integration with the cloud and other services. With Edge services, we’re able to enhance the cloud experience by isolating user data that needs to live on the Edge. From there, administrators are able to tie-in analytics, security, or other services directly into their cloud model.

Benefits of Edge Computing

  • Minimize latency
  • Conserve network bandwidth
  • Address security concerns at all levels of the network
  • Operate reliably with quick decisions
  • Collect and secure a wide range of data
  • Move data to the best place for processing
  • Lower expenses by using high computing power only when needed and by consuming less bandwidth
  • Better analysis and insights of local data

Real-Life Example:

A traffic light system in a major city is equipped with smart sensors. It is the day after the local team won a championship game and it’s the morning of the day of the big parade. A surge of traffic into the city is expected as revelers come to celebrate their team’s win. As the traffic builds, data are collected from individual traffic lights. The application developed by the city to adjust light patterns and timing is running on each edge device. The app automatically makes adjustments to light patterns in real time, at the edge, working around traffic impediments as they arise and diminish. Traffic delays are kept to a minimum, and fans spend less time in their cars and have more time to enjoy their big day.

After the parade is over, all the data collected from the traffic light system would be sent up to the cloud and analyzed, supporting predictive analysis and allowing the city to adjust and improve its traffic application’s response to future traffic anomalies. There is little value in sending a live steady stream of everyday traffic sensor data to the cloud for storage and analysis. The civic engineers have a good handle on normal traffic patterns. The relevant data is sensor information that diverges from the norm, such as the data from parade day.
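The filtering idea in this example, handling normal readings locally and forwarding only readings that diverge from the norm, can be sketched in a few lines of Python. This is a minimal illustration rather than the city's actual application: the function names, the z-score test, the threshold of 3.0, and the sample vehicle counts are all assumptions made for the sketch.

```python
from statistics import mean, stdev


def is_anomalous(reading, baseline, z_threshold=3.0):
    """Return True when a reading diverges from the baseline of normal traffic.

    Assumption: "diverges from the norm" is modeled as a z-score test.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) / sigma > z_threshold


def split_readings(readings, baseline, z_threshold=3.0):
    """Partition readings into (handled locally, queued for cloud upload)."""
    local, upload = [], []
    for reading in readings:
        if is_anomalous(reading, baseline, z_threshold):
            upload.append(reading)
        else:
            local.append(reading)
    return local, upload


# Hypothetical vehicles-per-minute counts at one traffic light.
baseline = [40, 42, 38, 41, 39, 43, 40, 37]   # everyday traffic
parade_morning = [41, 39, 95, 120, 44]        # includes the parade surge

local, upload = split_readings(parade_morning, baseline)
print("kept at the edge:", local)
print("sent to the cloud:", upload)
```

Only the surge readings end up in the upload list, which mirrors the point that everyday traffic data adds little value in the cloud.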

Future of Edge Computing

As more services, data and applications are pushed to the end-user, technologists will need to find ways to optimize the delivery process. This means bringing information closer to the end-user, reducing latency and being prepared for the Metaverse and its applications in Web 3.0. More users are utilizing mobility as their means to conduct business and their personal lives. Rich content and lots of data points are pushing cloud computing platforms, literally, to the Edge – where the user’s requirements are continuing to grow.

With the increase in data and cloud services utilization, Edge Computing will play a key role in helping reduce latency and improving the user experience. We are now truly distributing the data plane and pushing advanced services to the Edge. By doing so, administrators are able to bring rich content to the user faster, more efficiently, and – very importantly – more economically. This, ultimately, will mean better data access, improved corporate analytics capabilities, and an overall improvement in the end-user computing experience.

Moving the intelligent processing of data to the edge only raises the stakes for maintaining the availability of these smart gateways and their communication path to the cloud. When the Internet of Things (IoT) provides methods that allow people to manage their daily lives, from locking their homes to checking their schedules to cooking their meals, gateway downtime in the Edge Computing world becomes a critical issue. Additionally, resilience and failover solutions that safeguard those processes will become even more essential. Generally speaking, we are moving toward a localized, distributed model and away from the current strained centralized system that defines the Internet infrastructure.

Ahmed Banafa, author of the books:

Secure and Smart Internet of Things (IoT) Using Blockchain and AI

Blockchain Technology and Applications

Read more articles at: Prof. Banafa website



Performance, Power and Area (PPA) Benefits Through Intelligent Clock Networks

by Kalar Rajendiran on 12-10-2021 at 10:00 am

One of the sessions at the Linley Fall Processor Conference 2021 was the SoC Design session. With a horizontal focus, it included presentations of interest to a variety of different market applications. The talk by Mo Faisal, CEO of Movellus, caught my attention as it promises to solve a chronic issue relating to synchronizing clock networks. While clock synchronization reduces the chance of signal hazards, the act of synchronization leads to performance, power and area inefficiencies. Over the years, many different approaches have been deployed to reduce these inefficiencies. But most of these techniques still depend on clock mesh and/or clock tree trunks and traces and use clock buffers for fanning out the clock signals.

While Mo’s talk was titled “Clock Networks in a multi-core AI SoC,” the solution he presented is applicable to all SoCs. The following is a synthesis of what I gathered from his presentation.

Drawbacks of Traditional Solutions

Traditional clock networks are either a mesh or a tree implemented with wires and buffers. The buffers have no insight into what is going on in the SoC, so the implementation is typically overdesigned with clock buffers. Movellus claims that SoCs lose about 30%-50% of their performance due to inefficiencies introduced by clock networks. In addition, clock networks add latency and impose a significant power overhead on the SoC's total dynamic power (TDP) budget. Improving the quality of clock distribution networks can improve the PPA of the entire SoC.

Movellus’ Solution

Through its intelligent clock network technology named Maestro, Movellus can ameliorate or eliminate the inefficiencies introduced by traditional clock networks. Maestro technology consists of multiple components to achieve this. In his presentation, Mo showed a smart clock module (SCM) that is aware of on-chip variation (OCV), skew and temperature drift, and that dynamically compensates for them to align the clock network across the entire SoC. It pushes the common clock point very close to the flops that the clocks drive.

Movellus’ architectural innovation drives the delivery of the following three benefits.

      • Latency Reduction
      • Energy Efficiency
      • Max Throughput

While the above attributes are typical requirements for most applications, these are particularly critical for today’s AI driven edge applications.

The Maestro solution is offered in soft IP form and fits into any EDA tool flow, making it easy to integrate into any SoC.

Some Use Cases

The Maestro technology can bring benefits to both heterogeneous SoCs and homogeneous SoCs. A heterogeneous SoC consists of many different subsystems with different priorities, whether speed, power or timing closure. Refer to Figure below.

While Mo showcases the value of Maestro technology using a homogeneous SoC example through the bulk of his presentation, the insights gained can be directly applied to the different subsystems of a heterogeneous SoC such as the one shown above. One example is the ability to do multi-rate communication without clock-domain-crossing (CDC) FIFOs: consider an SoC with a compute core running at a higher frequency and the rest of the chip running at half that clock rate. With the Maestro solution, data can be moved from I/O flop to I/O flop without having to add retiming flops and CDC FIFOs. In an AI SoC where the data bus is very wide, the Maestro solution will save a lot of retiming flops, reducing latency and improving PPA.

Mo calls the Maestro solution a very high-quality large-scale synchronization method at the lowest power possible.

Higher Speed

With Maestro, the common clock point is pushed very close to the flops by using the SCM. Refer to Figure below for the intra-core example used. The core is 3 sq. mm in the N7 node, running at 2.5GHz. The divergent insertion delay was reduced from 750ps to 200ps. Even with the 5ps Maestro overhead, the OCV-driven speed sacrifice is driven down from 26% to 8.3%, delivering about an 18% gain in useful cycle time.
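To put the percentages above in picoseconds, the short Python sketch below converts the 2.5GHz clock into a cycle time and applies the 26% and 8.3% figures quoted in the talk. It only restates that arithmetic. It does not model how Movellus derives those percentages from the insertion delays, and the function names are illustrative.

```python
def cycle_time_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def margin_ps(period_ps, sacrifice_pct):
    """Portion of the clock period given up to OCV margin, in picoseconds."""
    return period_ps * sacrifice_pct / 100.0


period = cycle_time_ps(2.5)            # 400 ps at 2.5 GHz
before = margin_ps(period, 26.0)       # margin with the traditional clock network
after = margin_ps(period, 8.3)         # margin quoted with Maestro
recovered_pct = 26.0 - 8.3             # share of the cycle handed back to logic

print(f"cycle time:       {period:.1f} ps")
print(f"OCV margin before: {before:.1f} ps")
print(f"OCV margin after:  {after:.1f} ps")
print(f"recovered:         {before - after:.1f} ps ({recovered_pct:.1f}% of the cycle)")
```

The recovered share works out to roughly 17.7% of the cycle, which is the "about 18%" figure cited above.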

Lower Power

Traditional global clock networks typically use some variation of a clock mesh to bring the clock to all the cores, and the mesh is always on and consuming power. Refer to the Figure below for the example used. In this example, the traditional approach burns 2.5W all the time, independent of the SoC run-time utilization level. The total dynamic power (TDP) of the example SoC is 50W. Under the traditional approach, the global clock distribution power of 2.5W is 5% of the TDP. At a 20% utilization level, the 2.5W is 25% of the 10W dynamic power consumption. Generally speaking, average utilization levels are well below 100%.
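The way a fixed clock power grows as a share of the budget when utilization drops can be checked with a small Python sketch. It assumes, as the example does, that dynamic power scales linearly with utilization while the traditional clock network stays at a constant 2.5W. The function name is illustrative.

```python
def clock_share_pct(clock_power_w, tdp_w, utilization):
    """Always-on clock power as a percentage of dynamic power at a utilization level.

    Assumption: dynamic power scales linearly with utilization (0 < utilization <= 1).
    """
    dynamic_power_w = tdp_w * utilization
    return 100.0 * clock_power_w / dynamic_power_w


for utilization in (1.0, 0.5, 0.2):
    share = clock_share_pct(clock_power_w=2.5, tdp_w=50.0, utilization=utilization)
    print(f"utilization {utilization:.0%}: clock network is {share:.1f}% of dynamic power")
```

At full utilization the share is 5%, and at 20% utilization it rises to 25%, matching the figures in the example.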

For this example, a Maestro implementation helps keep the global clock distribution power at or below 2.5% of the TDP under various utilization levels.

Resultant Benefits

While the above examples quantified the efficiency gains along speed and energy dimensions, there are other tangible benefits from using the Maestro technology. For example, the ease of handling multi-rate clocks in a heterogeneous SoC. Another example is the ease of implementing the global level clock network. Once the intra-core clock network is fixed, the global clock network gets automatically corrected. All that is needed is to hook it up with a normal global level clock tree straight out of clock tree synthesis. There is no need to balance the global clock distribution. The die area savings and latency reduction through the avoidance of a large number of buffers and/or retiming flops could be significant too.

New Opportunities to Innovate

Mo encourages SoC architects and implementation specialists to think of new use cases Maestro technology could enable in their designs. What can one do with a large-scale synchronization capability like this? Does this help with simplification of software? What can you do with extra timing margin?

Mo closes his talk with the following teaser. He suggests that the amount of performance that is sacrificed to accommodate OCV effects is only 1/3 of the performance gain that the Maestro solution can deliver to an SoC. There are other details of the Maestro architecture which were not disclosed during the presentation. For more details, contact Movellus.

Also Read:

Advantages of Large-Scale Synchronous Clocking Domains in AI Chip Designs

It’s Now Time for Smart Clock Networks

CEO Interview: Mo Faisal of Movellus


Podcast EP52: A Preview of the Upcoming IEDM Meeting

by Daniel Nenni on 12-10-2021 at 10:00 am

Dan is joined by Srabanti Chowdhury, the publicity co-chair for IEDM, which will be an in-person conference December 11-15 at the Hilton San Francisco Union Square. Dan explores the topics to be discussed at the upcoming meeting and what they suggest about the future of semiconductors.

Srabanti Chowdhury is an associate professor of Electrical Engineering (EE) and a Senior Fellow of the Precourt Institute at Stanford. She leads the Wide Bandgap (WBG) Lab at Stanford, where her research focuses on wide bandgap (WBG) and ultra-wide bandgap (UWBG) materials and device engineering for energy-efficient and compact system architectures for various applications, including power, RF, computation, and emerging ones. Besides Gallium Nitride, her group is exploring Diamond for various active and passive electronic applications, particularly thermal management.

Srabanti received her M.S. and Ph.D. in Electrical Engineering from the University of California, Santa Barbara, working on Vertical GaN Switches.

She received the DARPA Young Faculty Award, NSF CAREER, and AFOSR Young Investigator Program (YIP) in 2015. In 2016 she received the Young Scientist award at the International Symposium on Compound Semiconductors (ISCS). She is a senior member of IEEE and an alumna of NAE Frontiers of Engineering. She received the Alfred P. Sloan fellowship in Physics in 2020. To date, her work has produced over 6 book chapters, 90 journal papers, 110 conference presentations, and 26 issued patents. She serves on the program committees of several IEEE conferences, including IRPS and the VLSI Symposium, and on the executive committee of IEDM. She serves as an Associate Editor of IEEE Transactions on Electron Devices as well as on two committees under the IEEE Electron Devices Society (the Compound Semiconductor Devices & Circuits Committee and the Power Devices and ICs Committee).

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Ansys CEO Ajei Gopal’s Keynote on 3D-IC at Samsung SAFE Forum

by Tom Simon on 12-09-2021 at 10:00 am

System on chip (SoC) based design has long been recognized as a powerful method to offer product differentiation through higher performance and expanded functionality. Yet, it comes with a number of limitations, such as high cost of development.  Also, SoCs are monolithic, which can inhibit rapid adaptation in the face of changing market needs. Furthermore, integration of mixed elements into a single die, such as memory, RF, FPGA, CMOS, optical etc. can complicate product delivery. These factors have led to the growth of 2.5D and 3D-IC which can offer a high degree of package level integration while providing flexibility and freedom from yield risks and extra costs associated with single die SoCs.

At the recent Samsung Advanced Foundry Ecosystem Forum, Ajei Gopal, President and Chief Executive Officer of Ansys, gave a keynote address that focused on this issue and the new types of analysis that will be needed to enable system growth through 3D-IC. Ajei spoke about how Samsung’s eXtended-Cube (X-Cube) can offer integration of multi-die assemblies to create compact high-performance systems. According to Ajei, X-Cube is suitable for 5G, AI, high performance computing, as well as wearables and IoT.

Ajei said that, to facilitate rapidly building 3D-ICs, physics-based simulation can be used to account for all the effects that need to be considered in these new designs. The twist is that now many differing materials are being combined in a single package. There are new requirements for structural and fluids simulations that are critical to predict cooling and thermal warping and to ensure reliable solder ball connections. Also, electromagnetic interactions will become more significant.

Ajei cited an example where a customer used RedHawk-SC to model current flowing through thousands of microbumps and predicted that in some locations there would be enough heat to melt the bumps. This would have led to a catastrophic failure of the 3D-IC module.

The real crux of what Ajei had to say was that while 3D-ICs are necessary for the innovations that the market calls for, to meet these needs a partnership is needed between multiple vendors to offer a complete and comprehensive solution. Not only has Ansys partnered with Samsung in areas like sign-off for EM effects in 3D-IC modules, but a broader partnership is required to satisfy design needs.

Ansys has partnered with Synopsys to integrate RedHawk, HFSS and Icepak into Synopsys 3DIC Compiler to provide highly accurate signal, thermal and power data. This combination of tools assures faster design closure with fewer iterations. Designers can also use Ansys SeaScape to apply machine learning algorithms to help filter analysis scenarios and dramatically trim analysis time.

It’s been widely understood for decades that no single vendor can provide the optimal solution for the complexities of IC, and now 3D-IC module, design. Ajei emphasized that any given analysis tool for simulation of multi-physics can take decades of effort to implement and validate. It makes the most sense to leverage several vendors to create an optimal solution. It’s best for designers when vendors work together proactively, instead of asking users to cobble something together. It was heartening to see this spirit of cooperation emphasized at this Samsung event. The only way designs that meet market needs will be produced is through multilateral cooperation. The Samsung SAFE event is available for on-demand viewing online, including the keynote address and the individual partner presentations.

Also Read

Ansys to Present Multiphysics Cloud Enablement with Microsoft Azure at DAC

Big Data Helps Boost PDN Sign Off Coverage

Bonds, Wire-bonds: No Time to Mesh Mesh It All with Phi Plus


RedCap Will Accelerate 5G for IoT

by Bernard Murphy on 12-08-2021 at 6:00 am

You could be forgiven for wondering why I should push 5G when it might seem marketing is still ahead of deployment. While we may not all have it today, GlobeNewswire (September 22, 2021 12:30 ET) estimates there will be 700 million 5G connections across the world by the end of this year. That’s pretty rapid growth already, though still mostly driven by subscriber adoption. However, a key goal of 5G was always to extend cellular far beyond our phones, to trillions of IoT endpoints. Release 17 and later 18 from 3GPP are already moving to make these use models much more real, especially around a new standard called 5G RedCap.

Redefining the network

5G is a major advance over LTE, designed not only for performance but also for scalability to trillions of nodes. The classical cellular model, endpoints communicating with base stations which then communicate with central stations, is not scalable to that level. The way to resolve this problem is through disaggregation, distributing compute and radio management within the network. A central unit communicates with distributed units, which in turn connect with radio units (base stations or small cells) which in turn connect to endpoints. The tree structure is more scalable than a star structure. In a tree, each gateway and radio head can make manageably large numbers of connections.
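The scalability argument for a tree over a star can be made concrete with a back-of-the-envelope Python sketch. The fan-out numbers below are purely hypothetical, chosen only to show the shape of the argument, and are not figures from 3GPP or any operator.

```python
def tree_capacity(fanouts):
    """Endpoints reachable through a tree whose levels have the given fan-outs."""
    total = 1
    for fanout in fanouts:
        total *= fanout
    return total


def max_direct_connections(fanouts):
    """Largest number of direct connections any single node in the tree must manage."""
    return max(fanouts)


# Hypothetical fan-outs: central unit -> distributed units -> radio units -> endpoints.
fanouts = [100, 50, 2000]

endpoints = tree_capacity(fanouts)
print("endpoints served by the tree:     ", endpoints)
print("max connections per tree node:    ", max_direct_connections(fanouts))
print("connections a single star hub needs:", endpoints)
```

With these assumed numbers the tree reaches ten million endpoints while no node manages more than 2,000 direct connections, whereas a single hub in a star would have to manage all ten million itself.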

But the infrastructure must locally handle a lot more processing, because each node must condense raw traffic for upstream nodes. More compute, AI/ML capabilities and beamforming features move into those cells and distributed units to handle many different classes of traffic, from safety-critical functions for cars and remote surgeries, to factory automation, to mobile gaming and 8K streaming on your phones. Network operators are then able to provide software-driven network slicing to tier these services, so cat videos don’t override traffic or surgical safety.
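The tiering idea behind network slicing can be illustrated with a toy strict-priority scheduler in Python. Real network slicing involves far more than queue ordering (isolation, resource reservation, policy), so this is only a simplified sketch. The slice names, the priority values, and the SliceScheduler class are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical slice tiers: a lower number is served first.
SLICE_PRIORITY = {
    "remote_surgery": 0,
    "vehicle_safety": 0,
    "factory_automation": 1,
    "video_streaming": 2,
}


class SliceScheduler:
    """Toy strict-priority queue: higher tiers are always served first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a tier

    def enqueue(self, slice_name, payload):
        priority = SLICE_PRIORITY[slice_name]
        heapq.heappush(self._heap, (priority, next(self._seq), slice_name, payload))

    def dequeue(self):
        _, _, slice_name, payload = heapq.heappop(self._heap)
        return slice_name, payload

    def __len__(self):
        return len(self._heap)


scheduler = SliceScheduler()
scheduler.enqueue("video_streaming", "cat video frame")
scheduler.enqueue("factory_automation", "robot arm command")
scheduler.enqueue("remote_surgery", "instrument telemetry")

served = []
while len(scheduler) > 0:
    served.append(scheduler.dequeue()[0])
print(served)
```

Even though the video frame arrives first, the surgery traffic is served ahead of it, which is the behavior the cat-video remark describes.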

The hardware supporting these functions can’t be general purpose CPUs. The hardware must provide a lot of horsepower certainly, but also AI/ML and signal processing, as well as general compute. Which is why you see the big mobile network equipment makers (and even operators) getting active again in chip design and chip partnerships. Open-RAN accelerates competition in this area, stimulating product advances not only from existing infrastructure builders but also new players.

Next RedCap

The IoT is not a monolithic producer and consumer of mobile traffic. Some devices can get by with short and infrequent bursts, suitable for a standard like NB-IoT. But some endpoints need more bandwidth. Surveillance cameras and AI glasses will work with video streams. Or at least frequent abstracted streams (detected objects and AR overlays). Vehicle V2X and telemetry on the other hand aim to support safety, traffic updates, emergency reporting and over the air software updates. All of which require decent performance and bandwidth.

This is where RedCap, short for Reduced Capability, comes in. Nir Shapira (Director Strategic Technologies at CEVA) explained RedCap to me this way. The 5G triangle splits usage into enhanced mobile broadband (eMBB) at the top. Ultra-reliable low latency communications (URLLC) and massive machine type communications (mMTC) form the bottom two corners of the triangle. RedCap sits somewhere between eMBB and URLLC. RedCap offers performance similar to LTE while also being able to take advantage of the 5G infrastructure features. Features such as network slicing and local intelligence in nearby infrastructure.

More disaggregation, more options, more opportunity

Disaggregation and Open-RAN create a lot of opportunity for chip and module makers in the infrastructure. RedCap adds opportunity for IoT solution builders who need bandwidth and potentially some of the 5G infrastructure services. At lower power/energy consumption than a 5G mobile phone. That’s likely to cover a lot of use-cases. Maybe you should talk to CEVA when you’re building your 5G cellular IoT product plans😀. They already have an impressive footprint in endpoint and infrastructure applications.

Also read:

CEVA Fortrix™ SecureD2D IP: Securing Communications between Heterogeneous Chiplets

AI at the Edge No Longer Means Dumbed-Down AI

Ultra-Wide Band Finds New Relevance