IBM and HPE Keynotes at Synopsys Verification Day
by Bernard Murphy on 10-06-2021 at 6:00 am

I have attended several past Synopsys verification events, which I remember as engineering-conference-room affairs: all-engineer pitches and debates, effective but aiming for content rather than polish. This year’s event was different. First, it was virtual, like most events these days, which certainly made the whole event feel more prime-time ready. Also, each day of the two-day event started with a keynote, further underlining the big-conference feel. Finally, many of the pitches, mostly from big-name customers, looked very topical – verification in AI, continuous integration, cloud architectures. Exciting stuff!

IBM on AI hardware, implications for verification

Kailash Gopalakrishnan spoke on this topic. Kailash is an IBM Fellow and Sr. Manager in the accelerator architectures and ML group at the T.J. Watson Research Center. He started by noting the rapid growth in AI model size over the past 3-5 years, more than 4 orders of magnitude. One direction he is pursuing to attack this is approximate computing: reducing word sizes for both integer and floating point, in training and in inference. This helps with both performance and power, critical for embedded applications such as in-line fraud prevention.

For such AI accelerators a wider range of word sizes increases complexity for formal methods. He sees rising formal complexity also in use of software managed scratchpads with complex arbiters. Advanced accelerators have more high bandwidth asynchronous interfaces, driving yet more increases in verification runtimes and coverage complexity. Such designs commonly build on rapidly evolving deep learning primitives and use-cases. Many more moving parts than we might normally expect when building on more stable IP and workloads for regular SoCs.

Big AI designs for datacenters are following similar paths to servers: massively arrayed cores on a chip, with PCIe support for DMA and coherent die-to-die interfaces, ganging together many die (or chips) for training. These giants must support virtualization, potentially running multiple training tasks in a single socket. All of this needs verification: complex software stacks (TensorFlow, PyTorch, on down to the hardware) running on a virtual platform, together with emulation or FPGA prototyping for the hardware.

In next-generation chips, modeling and verification will need to encompass AI explainability and reasoning, also secure execution. Analog AI will become more common. Unlike mixed-signal verification today (e.g. around IO cores), this analog will be sprinkled throughout the accelerator, which may raise expectations for AMS verification fidelity and performance. Finally, also for performance and power, 3D stacking will likely drive the need for more help in partitioning between stacked die. Not a new need, but likely to become even more important.

HPE on growing design team methodologies

David Lacey is Chief Verification Technologist in HP Enterprise Labs and was making a plea for more focus on methodologies. In part he was referring to opportunities for EDA vendors to provide more support, but much more to the need for verification teams to graduate up the verification maturity curve. Here I imagine vigorous pushback – “our process is very mature, it’s the EDA vendors that need to improve!” David isn’t an EDA vendor so his position should carry some weight. I’m guessing he sees a broad cross section, from very sophisticated methodologies to quite a few that are less so. Especially, I would think, in FPGA design, and even in ASIC teams with backgrounds in small devices.

David walked through 5 levels of maturity, starting from directed testing only. I won’t detail these here, but I will call out a few points I thought were interesting. At level 3, where you’re doing constrained random testing, he mentioned really ramping up metrics: coverage certainly, but also compute-farm metrics to find who may be hogging an unusual level of resource, and performance metrics especially in regressions. Generally he is looking for big-picture problems across the project as well as trends by block (coverage not converging, for example).

He stresses automation, taking more advantage of tool features, adding in-house scripting to aggregate data after nightly runs so you can quickly see what needs extra attention. Eventually moving to continuous integration methodologies, using Jenkins or similar tools. Mature teams no longer practice “Everybody stop, now we’re going to integrate all checkins and regress”. He also stressed working with your EDA vendor to implement upgrades to simplify these and other tasks.
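As an illustration of the kind of nightly aggregation scripting being described (a hypothetical sketch; the file layout, thresholds and names are invented, not an HPE flow), something like this is often enough to surface the blocks that need attention each morning:

```python
# Hypothetical nightly-regression aggregator: scans per-block result summaries
# (CSV files with columns block,tests_run,tests_failed,coverage_pct) and flags
# anything that needs attention before the morning triage meeting.
import csv
import glob

COVERAGE_FLOOR = 80.0   # assumed project threshold, not a number from the talk
STALL_DELTA = 0.5       # flag blocks whose coverage moved less than this overnight

def load_results(path_pattern):
    results = {}
    for path in glob.glob(path_pattern):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                results[row["block"]] = {
                    "failed": int(row["tests_failed"]),
                    "coverage": float(row["coverage_pct"]),
                }
    return results

def triage(last_night, previous_night):
    report = []
    for block, today in last_night.items():
        yesterday = previous_night.get(block, {"coverage": 0.0})
        if today["failed"] > 0:
            report.append(f"{block}: {today['failed']} failing tests")
        stalled = (today["coverage"] - yesterday["coverage"]) < STALL_DELTA
        if today["coverage"] < COVERAGE_FLOOR and stalled:
            report.append(f"{block}: coverage stalled at {today['coverage']:.1f}%")
    return report

if __name__ == "__main__":
    for line in triage(load_results("results/2021-10-05/*.csv"),
                       load_results("results/2021-10-04/*.csv")):
        print(line)
```

A script like this typically feeds a dashboard or an email, and the same data drops straight into a Jenkins-style continuous integration job once the team reaches that stage.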

Finally, the ultimate stage in maturity: using emulation to shift left and enable SW/HW co-development for system design, and taking advantage of the ML options we now see in some verification flows. These don’t require much ML understanding on your part but can offer big advances in getting to higher coverage quicker, simplifying static testing, accelerating root-cause analysis on bugs and reducing regression run-times. Consider also the ROI of working with your current compute farm versus upgrading servers, exploiting the cloud or a hybrid approach. From one generation to the next, server performance advances by 50%, so per unit of throughput a server upgrade is much cheaper than adding licenses. Moving to the cloud has flexibility advantages but you need to actively manage cost. And EDA vendors should add as-a-service licensing models to make EDA in the cloud a more attractive option.
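To make the compute-ROI point concrete, here is a back-of-the-envelope sketch; the 50% generation-to-generation speedup is the only figure taken from the talk, and every cost below is an invented placeholder:

```python
# Back-of-the-envelope ROI sketch for growing regression throughput.
# Only the 50% per-generation server speedup comes from the talk;
# the dollar figures are invented placeholders, not quoted prices.
SERVER_COST = 10_000        # assumed cost of one server
LICENSE_COST = 40_000       # assumed annual cost of one extra simulator license
GEN_SPEEDUP = 1.5           # new server ~50% faster than the one it replaces

# Option A: replace an existing server with a new one (same license count).
cost_per_unit_a = SERVER_COST / (GEN_SPEEDUP - 1.0)

# Option B: add a license plus another current-generation server.
cost_per_unit_b = (LICENSE_COST + SERVER_COST) / 1.0

print(f"Upgrade servers: ${cost_per_unit_a:,.0f} per unit of added throughput")
print(f"Add licenses   : ${cost_per_unit_b:,.0f} per unit of added throughput")
```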

Lots of good material. The whole session was recorded; I believe you can watch any of the talks through the end of the year. I’ll be posting more blogs over the next 3 months on other sessions in this valuable virtual conference.

Also Read:

Reliability Analysis for Mission-Critical IC design

Why Optimizing 3DIC Designs Calls for a New Approach

Using Machine Learning to Improve EDA Tool Flow Results


Blur, not Wavelength, Determines Resolution at Advanced Nodes
by Fred Chen on 10-05-2021 at 10:00 am

Lithography has been the driving force for shrinking feature sizes for decades, and the most easily identified factor behind this trend is the reduction of wavelength. G-line (436 nm wavelength) was used for 0.5 um in the late 1980s [1], and I-line (365 nm wavelength) was used down to 0.3 um in the 1990s [2]. Then began the era of deep-ultraviolet (DUV), during which two wavelengths, KrF (248 nm) and ArF (193 nm), dominate even today. Subsequent wavelengths were practically impossible to find. Both F2 (157 nm) and EUV (13.2-13.8 nm) suffer from strong absorption in air. 157 nm required a dry nitrogen ambient [3], whereas EUV could not even be operated using projection lenses. 157 nm was eventually dropped as it was supplanted by the extension of 193 nm using water as an immersion fluid. Today, EUV requires all-reflective optics in a vacuum with a background hydrogen ambient [4].

Resolution in optical projection systems can be pushed down to roughly 0.3-0.4 x wavelength/(numerical aperture). Numerical aperture (NA) cannot be increased beyond the refractive index of the imaging medium (1.44 for water, 1 for air or vacuum), so wavelength reduction is the expected solution. As the EUV wavelength is so much smaller than DUV, it has been expected to support many generations of process nodes. However, for advanced processes, wavelength has lost its influence over determining resolution.

The reason is that feature sizes have started approaching scales where blur becomes important. Blur here refers to the smoothing of chemical distributions within the resist after it has been exposed to light in the lithography tool. Smoothing makes edge definition, and hence feature size, more difficult to control by dose. The largest source of blur has been acid diffusion in chemically amplified DUV resists [5]. However, in the case of EUV, another blur mechanism arises, namely secondary electrons [6]. Blur in EUV resists has been measured at over 5 nm [7,8], but more recently blur as low as 2 nm has been considered [9].

To quantitatively assess the impact of blur, it is most convenient to first fit images of interest with Gaussian curves. In Figure 1, dense line pitches of 20 nm, 30 nm, and 40 nm were fit with Gaussian curves with sigma=4.3 nm, 6.3 nm, and 8.4 nm, respectively. The fitted sigma is very well described by a linear function of pitch.

Figure 1. Gaussian fits to dense line images from 20 nm to 40 nm pitch.

The blur is itself modeled as a Gaussian function. The blur function is convolved with the original image function to produce a blurred image curve. The higher the blur sigma, the more the image is changed. The degradation of image quality is quantified by the NILS (Normalized Image Log-Slope) metric: d[log(dose)]/d[x/feature width], evaluated at the full-width half-maximum (FWHM). A NILS value of 2 has a special meaning: a 10% change in dose results in a 10% change in feature width. This can be considered a dividing line between “bad” and “good” images. Figure 2 shows a reference Gaussian and the effect of blurring with another Gaussian whose sigma is 0.66x the reference sigma (taken to be 1).

Figure 2. The image is degraded to borderline case (NILS=2) by a blur value of 0.66 times the original image sigma.

Combining the results of Figures 1 and 2, it is found that for a given pitch, the maximum allowed blur sigma is 0.14x pitch in the 20-40 nm pitch range. Conversely, the minimum pitch for a given resist is given by blur/0.14. For example, a 5 nm blur sigma would limit resolution to 36 nm pitch.

Figure 3. Minimum dense line pitch (at which NILS=2) as a function of blur sigma.
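For readers who want to play with the numbers, below is a minimal numerical sketch of the procedure described above. It reflects my own reading of the NILS evaluation point (the edge of the unblurred reference feature); the grid, units and code are illustrative, not the author’s script.

```python
# Convolve a Gaussian image with a Gaussian blur and evaluate the normalized
# image log-slope at the reference feature edge (units assumed to be nm).
import numpy as np

def gaussian(x, sigma):
    return np.exp(-0.5 * (x / sigma) ** 2)

def blurred_image(x, image_sigma, blur_sigma):
    dx = x[1] - x[0]
    kernel = gaussian(x, blur_sigma)
    kernel /= kernel.sum() * dx                  # unit-area blur kernel
    return np.convolve(gaussian(x, image_sigma), kernel, mode="same") * dx

def nils(x, profile, feature_width):
    # d[ln(dose)]/d[x/feature width], evaluated at the feature edge
    slope = np.gradient(np.log(profile), x)
    edge = np.argmin(np.abs(x - feature_width / 2.0))
    return abs(slope[edge]) * feature_width

x = np.linspace(-40.0, 40.0, 4001)
image_sigma = 6.3                    # Gaussian fit for 30 nm pitch (Figure 1)
feature_width = 2.355 * image_sigma  # FWHM of the unblurred reference image

for blur_sigma in (2.0, 4.0, 0.66 * image_sigma):
    profile = blurred_image(x, image_sigma, blur_sigma)
    # the 0.66x case lands near the NILS=2 borderline shown in Figure 2
    print(f"blur sigma {blur_sigma:4.2f} nm -> NILS ~ {nils(x, profile, feature_width):.2f}")

# Rule of thumb from Figure 3: minimum resolvable pitch ~ blur sigma / 0.14
print(f"5 nm blur sigma -> minimum pitch ~ {5.0 / 0.14:.0f} nm")
```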

The most important point to take away from this is that resist blur, which is not directly related to wavelength, is becoming the key factor in determining resolution. The electron blur is itself not easily pinned down, and contributes to the stochastic nature of EUV lithography.

References

[1] K. Eguchi et al., “0.5µm Lithography Using A G-Line Stepper With A 0.6 Numerical Aperture Lens,” Proc. SPIE 0922 (1988).

[2] K-Y. Kim et al., “Implementation of I-line lithography to 0.30 µm design rules,” Proc. SPIE 2440, 76 (1995).

[3] https://www.laserfocusworld.com/optics/article/16556523/challenges-remain-for-157nm-lithography

[4] https://www.spiedigitallibrary.org/journals/journal-of-micro-nanopatterning-materials-and-metrology/volume-20/issue-03/033801/EUV-induced-hydrogen-plasma–pulsed-mode-operation-and-confinement/10.1117/1.JMM.20.3.033801.full

[5] G. M. Gallatin, “Resist Blur and Line Edge Roughness,” Proc. SPIE 5754 (2005).

[6] T. Kozawa and S. Tagawa, “Normalized Image Log Slope with Secondary Electron Migration Effect in Chemically Amplified Extreme Ultraviolet Resists,” Appl. Phys. Exp. 2, 095004 (2009).

[7] R. Gronheid et al., “EUV Secondary Electron Blur at the 22 nm Half Pitch Node,” Proc. SPIE 7969, 796904 (2011).

[8] A. Chunder et al., “Separating the optical contribution to line edge roughness of EUV lithography using stochastic simulations,” Proc. SPIE 10146, 101460B (2017).

[9] Z. Belete et al., “Stochastic simulation and calibration of organometallic photoresists for extreme ultraviolet lithography,” J. Micro/Nanopattern. Mater. Metrol. 20, 014801 (2021).


On-Device Tensilica AI Platform For AI SoCs
by Kalar Rajendiran on 10-05-2021 at 6:00 am

During his keynote address at the CadenceLIVE 2021 conference, CEO Lip-Bu Tan made some market trend comments. He observed that most of the data nowadays is generated at the edge but only 20% is processed there. He predicted that by 2030, 80% of data will be processed at the edge, and most of it will be processed on edge devices as AI workloads. This migration is already happening rapidly and is calling for different types of on-device AI SoCs.

During the same conference, president Anirudh Devgan presented Cadence’s strategy guiding their next wave of innovations and technology offerings. One of the three prongs of the strategy is enabling design excellence for its customers. Cadence delivers it through its EDA tools, emulation platforms, semiconductor IP and productivity software tools.

Cadence has been executing its strategy and announcing new capabilities at a nice pace. In July, it announced Cerebrus, the Intelligent Chip Explorer. Cerebrus falls under the EDA tool category of its design excellence drive. This month, Cadence announced its Tensilica AI Platform for accelerating AI SoC development.  The related press release mentioned three product families optimized for varying on-device AI requirements. The products fall under the category of semiconductor IP solutions for design excellence. This is the context for this blog.

I had an opportunity to discuss this product announcement with Pulin Desai. Pulin is the group director of product marketing and management for Tensilica Vision & AI at Cadence. The following is a synthesis of what I gathered from our conversation.

Pulin stated that Cadence is focused on bringing the most energy-efficient on-device IP solutions for AI SoCs. And it wants to do this for all types of workloads over a broad range of market segments. This in turn requires the product to be configurable, extensible and scalable across performance and energy parameters. Cadence took a platform approach that can deliver solutions to address these varying requirements.

Market Requirements

The supported market segments range from simple intelligent sensors to IoT audio/vision, mobile and automotive/ADAS. These market segments are looking for a solution that offers/enables low cost, rapid development, longer battery life and quick product differentiation. Customers also want a configurable and extensible software-hardware platform that makes it easy for them to scale their products to meet different segments of their markets.

Product Requirements

The above identified market requirements translate into the following product-level requirements. It is no longer just the Tera Operations Per Second (TOPS) rating that matters. It is optimum TOPS per Watt and TOPS per sq.mm of silicon area. It is performance/speed at the lowest latency. It is the ability to interface with varying types of workloads that operate on fixed or floating-point data. It is the capacity to process data from single or multi-mode sensors and execute concurrently. Refer to the figure below to see the very wide range of performance, power and workload interfaces that need to be addressed.

Tensilica AI DSPs

AI SoCs that implement the above on-device AI product requirements need AI engines as well as DSP capabilities. The DSP functions are needed for processing the inputs received from multiple sensors. Cadence already has a long track record of successful Tensilica DSPs with AI ISA extensions based on the time-tested Xtensa® configurable and extensible processor platform. Tensilica DSPs are shipping in volume production in numerous SoCs and end products. It made strategic sense to expand these AI solutions further, building on this strong DSP foundation to offer a wider range of AI-enabled products. Refer to the figure below.

Tensilica AI Platform

The Tensilica AI Platform supports three AI product families to satisfy a broad range of market requirements: AI Base, AI Boost and AI Max. A comprehensive common software platform enables ease of scalability across these product families. The configurability and extensibility features allow some markets/applications to be addressed by multiple Tensilica AI solutions. The platform includes a Neural Network Compiler which supports industry-standard frameworks such as TensorFlow, ONNX, PyTorch, Caffe2, TensorFlow Lite and MXNet for automated end-to-end code generation; the Android Neural Network Compiler; TFLite Delegates for real-time execution; and TensorFlow Lite Micro for microcontroller-class devices.
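As a rough illustration of the framework-level hand-off such a compiler consumes, here is a standard PyTorch-to-ONNX export of a toy model; the model and file names are invented, and the downstream Cadence compiler invocation is not shown since it isn’t described in the source.

```python
# Illustrative only: export a small PyTorch model to ONNX, one of the standard
# interchange formats listed above. The Cadence NN compiler step that would
# consume the ONNX file is an assumption and is not shown here.
import torch
import torch.nn as nn

class TinyKeywordSpotter(nn.Module):          # hypothetical always-on audio model
    def __init__(self, n_mels=40, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = TinyKeywordSpotter().eval()
dummy = torch.randn(1, 40, 100)               # one sample, 40 mel bins, 100 frames
torch.onnx.export(model, dummy, "kws.onnx", opset_version=13,
                  input_names=["mel"], output_names=["scores"])
```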

AI Base: The low-end product, including Tensilica DSPs with AI ISA extensions, targets voice/audio/vision/radar/LiDAR related applications. It can deliver up to 30x better performance and 5x-10x better energy efficiency compared to a regular CPU-based solution. Refer to the figure below for some benchmark results.

AI Boost: The mid-level product can be used with any of the AI Base applications when performance and power need to be optimized. It integrates the AI Base technology with a differentiated sparse compute Neural Network Engine (NNE). The initial version NNE 110 can scale from 64 to 256 GOPS and provides concurrent signal processing and efficient inferencing. It consumes 80% less energy per inference and delivers more than 4X TOPS per Watt compared to industry-leading standalone Tensilica DSPs. Refer to figure below for some benchmark results.

AI Max: The high-end product integrates the AI Base and AI Boost technology with a Neural Network Accelerator (NNA) family. The NNA family currently includes single-core (NNA 110), 2-core (NNA 120), 4-core (NNA 140) and 8-core (NNA 180) options. The multi-core NNA accelerators can scale up to 32 TOPS, while future NNA products are targeted to scale to 100s of TOPS. Refer to the figure below for some benchmark results.

Summary

The Cadence Tensilica AI Platform enables industry-leading performance and energy efficiency for on-device AI applications. It is built upon the mature, volume-production proven Tensilica Xtensa architecture. The low-end, mid-level and high-end product families cover the full spectrum of PPA and cost points for various market segments. The solution currently scales from 8 GOPS to 32 TOPS, with additional products expected  to deliver 100s of TOPS to meet future requirements.

For more information on the Tensilica AI Platform and new AI IP solutions, visit their product page by clicking here. To read the full press release, click here.

Also Read:

Cadence Tensilica FloatingPoint DSPs

Features of Short-Reach Interface IP Design

112G/56G SerDes – Select the Right PAM4 SerDes for Your Application


Heterogeneous Package Design Challenges for ADAS
by Tom Simon on 10-04-2021 at 10:00 am

Increasingly complex heterogeneous packaging solutions have proved essential to meeting the rapidly scaling requirements for automotive electronics. Perhaps there is no better example of this than the advanced driver-assistance systems (ADAS) found in most new cars. A recent paper published by Siemens EDA details the current technology trends that are creating design challenges. The paper, titled “Methodology and Process for Heterogeneous Automotive Package Design”, mentions shrinking bump pitch, increasing bump density, decreasing ball pitch, high current consumption and high frequency issues as factors that must be addressed to meet system-level requirements in this market.

The Siemens paper is focused on the adoption of Xpedition Substrate Integrator (xSI) from Siemens by the Back-End Manufacturing Technology Team at STMicroelectronics in Agrate, Italy. The design of their current and future ADAS products requires a tool that can handle increasing package connectivity density and deal with design data from chip and PCB tools as well as the package design. The paper outlines a well-structured three-phase flow for package design.

In the early stage there is focus on design rule identification, implementation technology, cost optimization, and design strategy verification. All aspects of the design are explored to come up with a preliminary implementation. The areas considered are package dimensions, substrate stack up, ball-out definitions, break out strategies and preliminary connectivity assignment. With this information, cost and performance metrics can be estimated.

This is followed by the intermediate stage, where attention is paid to physical design, debugging and optimization. A lot happens here, starting with finalization and optimization of the substrate routing. Key aspects such as manufacturability, interfaces, power requirements and more are assessed. The final stage involves verification and sign off of every aspect of the finished design. This includes thermal, signal integrity, power integrity and manufacturing verification leading to the tape-out handoff to manufacturing.

The Siemens paper points out that each of these design stages rely heavily on co-design, co-simulation and co-optimization. Without these a siloed non-convergent design process would result. It is necessary for the package design flow to take a system level approach. This is the only way that system connectivity can be optimized to meet all requirements. Data must flow from IC and PCB tools. xSI helps eliminate the use of spreadsheets for passing preliminary data by providing a feature known as connectivity tables to capture system definition during development. This has the added benefit of permitting easy what-if analysis.
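To picture what replaces the spreadsheet hand-off, here is a toy connectivity table in code form with a simple what-if swap; all net, bump and ball names are invented, and the actual xSI table format is not described in the paper.

```python
# Toy system-level connectivity table: die bump -> package ball, keyed by net.
# Names are invented for illustration; a real ADAS package has thousands of rows.
from dataclasses import dataclass

@dataclass
class Connection:
    net: str
    die: str
    bump: str
    ball: str

connectivity = [
    Connection("MIPI_CSI0_P", "vision_soc", "A12", "BGA_F7"),
    Connection("MIPI_CSI0_N", "vision_soc", "A13", "BGA_F8"),
    Connection("LPDDR4_DQ0",  "vision_soc", "C05", "BGA_J2"),
    Connection("VDD_CORE",    "vision_soc", "D01", "BGA_A1"),
]

def what_if_swap(table, net_a, net_b):
    """Swap the package balls of two nets to explore an alternate break-out."""
    by_net = {c.net: c for c in table}
    by_net[net_a].ball, by_net[net_b].ball = by_net[net_b].ball, by_net[net_a].ball
    return table

for c in what_if_swap(connectivity, "MIPI_CSI0_P", "MIPI_CSI0_N"):
    print(c)
```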

Siemens xSI was attractive to ST because of its extensive support for early prototyping during the package design process. Missing devices can be created from scratch, and complex bump and ball pitch and positions can be defined for sophisticated design exploration. Because the design is not static, xSI supports fast connectivity updates when design elements are changed via ECO. The article describes several other innovative capabilities that influenced ST’s decision.

The article also describes in some detail the test case that was used by ST. At the end of the article the authors review the most important aspects of the flow that xSI offers. They point out that xSI enables hierarchical construction of the complete package assembly with step-by-step handling of multiple parts, including reuse of parts. In order to meet the needs of high-speed interface design and accommodate complex bump-out geometries, xSI includes efficient integration with external tools, and flexibility during IC-package connectivity planning and optimization.

Package design is by its very nature heavily constrained: physically, by its role between die and board, and also by interface and interconnection requirements. There are thermal, power, signal integrity, area and many other requirements to satisfy. There is also no argument that ADAS systems impose some of the strictest requirements on system operation. They often include high-speed video streams, display output, as well as other sensor input and control output. ADAS systems also include extremely advanced conventional and AI processors. Siemens offers a lot of information about why ST chose xSI in the article, which is available here on the Siemens website.

Also Read:

Electromigration and IR Drop Analysis has a New Entrant

Formal Methods for Aircraft Standards Compliance

Verifications Horizons 2021, Now More Siemens


What the Heck is Collaborative Specification?
by Daniel Nenni on 10-04-2021 at 6:00 am

It’s been quite a while since I talked with Agnisys CEO and founder Anupam Bakshi, when he described their successful first user group meeting. I reached out to him recently to ask what’s new at Agnisys, and his answer was “collaborative specification.” I told him that I wasn’t quite sure what that term meant, and he offered to spend some time with me to explain. I’d like to share some of our conversation.

Can you please tell us what collaborative specification means?

It’s a term used in slightly different ways in multiple industries, but to us it’s similar to parallel design. The concept of multiple design engineers working together on a single chip is well established, and in fact it’s essential given the size and complexity of today’s devices. Distributed design teams and the ongoing pandemic mean that engineers are rarely working side by side physically, so a highly automated online system is required. The same argument applies to chip verification; modern testbenches are incredibly complicated and require many verification engineers collaborating. We’ve extended this idea to specification of the design, with many architects and designers working together virtually.

Why is collaborative specification different than collaborative documentation?

Well, if your design specification is just another text document, there’s really no difference at all. But the Agnisys approach is based on executable specifications that our tools use to automatically generate RTL designs, Universal Verification Methodology (UVM) testbenches, assertions for formal verification, C/C++ code for firmware and drivers, and end user documentation. Collaborating on executable specifications is a rigorous process that offers more opportunities for automation.

Are you talking about multiple engineers working on a single specification at the same time?

Yes, that’s sometimes the case, so it’s important to have a great change tracking and control system in place. I should comment that the way these systems have evolved is one of the biggest changes in software and hardware development over the years. Revision control systems used to be based on a locking model, where any engineers wishing to edit a file (schematics, RTL code, testbenches, programs, etc.) would “check out” the file and have exclusive access to it during the editing period. If someone else wanted to make a change, the first engineer would have to “check in” the edits first. For quite a few years now, the dominant approach has been the merge model, in which multiple engineers might be editing the same file at the same time. Resolving any inconsistencies has become an essential part of any version control system flow. Of course, we support the merge model in our IDS NextGen  (IDS-NG) solution.

So is IDS-NG a version control system?

No, there are excellent solutions out there and we saw no reason to “reinvent the wheel.” In talking with our customers and partners, we found that the open source Git distributed version control system is extremely popular and very powerful. IDS-NG now has a tight, native integration with Git so that our users can create and edit executable specifications that are saved in a repository and managed by an industry-standard system.

Can you give an example of how this works?

Sure! The user creates and edits a specification using the intuitive graphical environment and language-aware editor within IDS-NG. They may import existing specification information from common standard formats such as IP-XACT and SystemRDL, or they may leverage the highly configurable standard IP blocks available in our SLIP-G library. Once they have reached a good checkpoint, they can easily “commit” their changes to a local branch of the design specification and then “push” these changes to the main development branch in the project-wide repository. Other users can then “pull” the new version of the file for use in their local branches.

What if someone else has edited the same file in parallel and already pushed those changes?

In that case, Git reports that there are conflicting edits, but it cannot resolve the differences by itself. This is where the cleverness of IDS-NG really kicks in. If the user pulls the other updated version of the file, IDS-NG performs a comparison (“diff”) of the two design specifications and reports the results. Some changes can be merged automatically, but others require user input. For example, if two users change the width of the same register in their respective edits, one of the users needs to decide which is the correct value. This information is displayed very clearly in the IDS-NG interface and it’s easy for the user to review and resolve any differences. This unique approach is easier and more intuitive than doing a simple text comparison on two SystemRDL files.
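To make the register-width example concrete, here is a minimal sketch of a structural diff between two versions of a specification, with plain Python dictionaries standing in for the SystemRDL/IP-XACT content that IDS-NG actually compares; the register names and fields are invented.

```python
# Minimal structural diff of two register specs. Each spec is a dict of
# register name -> properties; this stands in for the richer SystemRDL/IP-XACT
# model that a tool like IDS-NG would compare.
mine = {
    "CTRL":   {"offset": 0x00, "width": 32, "access": "RW"},
    "STATUS": {"offset": 0x04, "width": 16, "access": "RO"},
}
theirs = {
    "CTRL":   {"offset": 0x00, "width": 32, "access": "RW"},
    "STATUS": {"offset": 0x04, "width": 32, "access": "RO"},   # conflicting width
    "IRQ_EN": {"offset": 0x08, "width": 8,  "access": "RW"},   # added remotely
}

def diff_specs(a, b):
    auto_merged, conflicts = dict(a), []
    for name, props in b.items():
        if name not in a:
            auto_merged[name] = props             # addition merges automatically
        elif props != a[name]:
            changed = {k for k in props if props[k] != a[name][k]}
            conflicts.append((name, changed))     # needs a human decision
    return auto_merged, conflicts

merged, conflicts = diff_specs(mine, theirs)
print("merged registers:", sorted(merged))
print("needs review:", conflicts)   # -> [('STATUS', {'width'})]
```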

It seems to me that this capability would be very useful in a CI-based workflow; is that correct?

Yes indeed. In continuous integration (CI), which Git supports well, all committed changes are pushed to the main development branch frequently, perhaps at the end of every work day. The motivation for this approach is finding conflicting edits sooner rather than later, so that they can be resolved before two versions of the specification diverge wildly. You could argue that the “compare documents” function in Microsoft Word played a big role in enabling collaborative documentation. With the recent Git integration and diff capability in IDS-NG, we’ve done the same to enable collaborative specification.

Will you be demonstrating these new features at the upcoming virtual DVCon Europe show?

Yes, we are a Silver Sponsor for this important event, and we have a very nice collaborative specification demo and video available. We are looking forward to sharing them with the online DVCon Europe attendees. We’re excited at the advanced project workflows we now enable, and we expect a lot of user interest.

Anupam, thank you for the updates and the technical information.

Thanks, Dan. It’s always a pleasure to talk with you.

Also read:

AUGER, the First User Group Meeting for Agnisys

Register Automation for a DDR PHY Design

Automatic Generation of SoC Verification Testbench and Tests


Autonomous Vehicle Rationale Breaks Down
by Roger C. Lanctot on 10-03-2021 at 6:00 am

The latest SmartDrivingCars podcast raised fundamental questions regarding the rationale for developing autonomous cars while debating the various paths to market adoption. The discussion took place between Alain Kornhauser, faculty chair of autonomous vehicle engineering at Princeton University, and Adriano Alessandrini, a professor at the University of Florence.

Ostensibly, the conversation between Kornhauser and Alessandrini was to be focused on the need to improve road systems and infrastructure to support autonomous mobility. The wide-ranging discussion detoured into the various assumptions and business models intended to justify and enable the adoption of autonomous vehicle tech.

The discussion ultimately and inadvertently challenged the fundamental assumptions behind the efficacy and purpose of autonomous driving. The conversation pointed toward a single justification for developing autonomous vehicle tech: to serve physically or financially disadvantaged populations.

One tends to arrive at this conclusion by considering the various autonomous vehicle adoption scenarios, most of which simply break down upon closer scrutiny.

  1. Communities will dedicate lanes for autonomous vehicles along existing roadways providing privileged access for these vehicles. These vehicles might be equipped with technology designed to surrender vehicle control to infrastructure-based guidance systems.  Counter Argument: If communities choose to dedicate lanes to autonomous vehicles – as in the case of Michigan creating its “Road of the Future” to Ann Arbor – it is clear that tracks for trains would be a superior choice.  “Specially” equipped cars suitable to operate in such privileged lanes would end up being more expensive and likely only accessible to the rich.
  2. Provide incentives for robotaxis or roboshuttles to operate alongside existing transit solutions. Counter Argument: Robotaxis or roboshuttles are likely to gravitate to the most popular and profitable routes of an already subsidized public transit system, further undermining the already challenging finances of that system. (An example of this is the Lyft-MBTA transit trial in Boston which revealed this tendency.) This will, in turn, put pressure on the financial viability of less popular routes – including those serving disadvantaged neighborhoods. (The MBTA estimates it loses $20M in fares annually to Uber/Lyft riders – and that Uber and Lyft contribute to traffic congestion.)
  3. Launch autonomous vehicles in suburbs. Counter Argument: Waymo has already demonstrated that offering robotaxis or roboshuttles in suburbs is not viable due to the saturation of vehicle ownership. Consumers in typical middle class suburbs simply aren’t interested in the robotaxi proposition.
  4. Introduce robotaxis in cities. Counter Argument: Robotaxis are too slow and are likely to be too expensive. Human driven taxis do a better job. Also, robotaxis are likely to skim off the most popular and lucrative routes (putting financial pressure on providers serving less popular routes and disadvantaged neighborhoods)  unless they are programmed to serve disadvantaged neighborhoods.
  5. Exclude cars from city centers and only allow robotaxis and roboshuttles. Counter Argument: In such circumstances, the cities that choose this path will have, in effect, created a public concession with related bidding processes and funding. By its nature, this becomes public transportation, subject to the financial challenges of existing public transportation, potentially in conflict with existing solutions, and subject to the same demand to serve the entire population regardless of financial constraints.

The unavoidable conclusion is that autonomous vehicle technology is most ideally suited to serving financially or physically disadvantaged populations. Bringing autonomous vehicle tech to other parts of the city (suburban deployment does not appear to make much sense) only works as a direct replacement for public transportation, not as a competing alternative.

In this context, the most notable conclusion from the podcast was Professor Kornhauser’s return to his two main themes, consistent throughout all of his podcasts:

  • Existing infrastructure to support autonomous vehicle operation is lousy. The problem starts with the poor application and maintenance of paint, and local authorities and industry constituencies need to start fixing that first.
  • Cars need to do a better job assisting drivers with the driving task. Professor Kornhauser is outraged at the failure to see wider and more aggressive application of emergency braking technology to avoid collisions at higher speeds.

Ultimately, the key takeaway from the SmartDrivingCars podcast is that more attention needs to be paid to adopting and deploying ADAS-type (advanced driver assist system) technologies for lane keeping, blind spot detection, adaptive cruise control, and emergency braking. It’s likely that robotaxis intended for wide deployment in cities will evolve as public transit propositions.

The long development cycle of autonomous technology and the enduring interest in vehicle ownership are likely to render suburban areas hostile territory for autonomous vehicles indefinitely. All driving environments are fertile ground for deploying driver assistance technology. On that we can agree.

SmartDrivingCars Podcast Episode 233 – Making mobility happen in Europe, Trenton and beyond – https://tinyurl.com/tw7jf7na


Podcast EP40: The Semiconductor Supply Chain and the Real Cause of Semiconductor Shortages
by Daniel Nenni on 10-01-2021 at 10:00 am

Dan and Mike are joined by Malcolm Penn, 50-year semiconductor industry veteran and founder and CEO of Future Horizons. Dan and Mike explore the evolution of the semiconductor supply chain, how we got to the current state of shortages and what the future may hold. Drawing on his substantial knowledge of the industry, Malcolm makes some enlightening comments about the industry and what may lie ahead.

https://www.futurehorizons.com/

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


The Semiconductor Shortage False Narrative!
by Daniel Nenni on 10-01-2021 at 6:00 am

The semiconductor shortage has really caught the world’s attention. Friends and family who don’t really know what a semiconductor is now ask me to explain it. This is great news for the semiconductor industry for different reasons which I will discuss here. We can also discuss the downside risks.

First, let’s look at a bit of semiconductor history. Malcolm Penn of Future Horizons did his “IFS2021-MT Mid-Term Semiconductor Industry Update & Forecast”, which was definitely worth an hour of my time. Malcolm went through 237 slides in that hour, which he shared with me. This one is my favorite:

I’m a big fan of history; we really need to understand how we got to where we are today to better understand where we will go tomorrow. The takeaway here is that the semiconductor industry has weathered storms like this in the past and will continue to do so in the future. In fact, Malcolm said that this is the 14th such event in his 50 years.

Malcolm joined us for a Semiconductor Insiders Podcast; you should also catch last week’s episode with Wally Rhines. We discuss the history of TSMC and a few other topics including the current semiconductor shortage.

After listening to Malcolm’s take on the semiconductor shortage I took a close look at the supply chain from wafer to packaged parts to finished electronic products. I talked to colleagues and attended multiple calls with industry experts. We do not have a semiconductor manufacturing problem, it’s a supply chain problem and that problem is not limited to semiconductors, it’s hitting many sectors and will continue to do so for many quarters to come.

Bottom line: COVID and climate change have devastated the worldwide supply chain. I mentioned a while back that I noticed a growing number of ships backed up in the San Francisco Bay and out the Golden Gate. The word on the docks is that not only is COVID causing personnel shortages but COVID protocols are killing productivity. The airline industry is facing a different kind of personnel issue. In addition to COVID complications, pilots are aging out and they can’t hire and train replacements fast enough.

Unfortunately, the dominant narrative led by Intel and others, including politicians, is that we need more fabs and those fabs need to be in the country of demand origin. This really started with China a few years ago with the Made in China 2025 strategic plan. As a result, China has bought massive amounts of semiconductor equipment over the past few years. Unfortunately, the majority of this equipment sits idle for the time being. When it does in fact get into high-volume production, China will become somewhat self-sufficient in its massive semiconductor consumption, which will add to the coming semiconductor glut, absolutely.

But please do build more fabs all over the world and try to replicate the entire semiconductor supply chain many times over. It will be an exercise in futility but we all need exercise. The end result of course will be a glut and that means lower prices which will bring about a whole new generation of semiconductor products and services. All for the greater good of semiconductors and modern life, I’m looking forward to it!


Synopsys’ ARC® DSP IP for Low-Power Embedded Applications
by Kalar Rajendiran on 09-30-2021 at 10:00 am

On Sep 20th, Synopsys announced an expansion of its DesignWare® ARC® Processor IP portfolio with new 128-bit ARC VPX2 and 256-bit ARC VPX3 DSP Processors targeting low-power embedded SoCs. In 2019, the company launched the 512-bit ARC VPX5 DSP processor for high-performance signal processing SoCs. Due to their length, format and style, press releases are limited in what they capture. Typically, there is a story behind every product announcement, and learning this story gives us better insights into the announced products.

In order to gain these insights, I had a meeting with Matt Gutierrez, Sr. Director of Marketing, Synopsys Processor Solutions, and Markus Willems, Sr. Product Marketing Manager, ARC VPX DSP Processors. This blog is a synthesis of what we discussed.

Unwavering Focus on Embedded Applications

One thing that has been steady and constant from the 1990s to today is ARC technology’s focus on supporting embedded applications. Historically, ARC processors did not target the mobile applications processor segment. The markets for embedded applications have been evolving and ARC processor technology has been transforming accordingly. ARC processors have moved up from being used for just simple and dedicated tasks such as power management to even running a 64-bit Linux operating system.

After ARC became part of Synopsys in 2010, the burgeoning IoT market gave impetus to build a new generation of embedded ARC processors. A family of very small, highly efficient, low-power processors was needed to support the IoT market. A new architecture and ISA were born, and a couple of processors were developed and marketed. Early IoT devices needed only a minimal amount of DSP capability, so some DSP functions were added to the processors to support the IoT requirements.

Fast forward to today, Synopsys offers five different ARC product families, each with extensive lineups. Each product family of embedded processors addresses the varying and tight requirements of a broad range of applications. The current announcement is about their VPX DSP family of processors for Language processing, Radar/LiDAR, Sensor Fusion and High-end IoT applications.

Focus Drives Highly Efficient ARC Architecture

The instruction set architecture (ISA) has been designed with the embedded market in mind. For example, unique instructions such as compare & control transfer and branch & loop make it easy to efficiently implement common embedded program behaviors. Another example is 16-bit encodings for popular 32-bit instructions. The ARC ISA has many such features for reducing code size as memory space is at a premium on embedded devices.

Every microarchitectural decision is also made with the embedded market in mind. For example, built-in shadow registers are important for real-time embedded applications to enable fast context switching. These kinds of architectural decisions make a big difference for embedded applications. Something not easily replicated by taking a processor designed for some other applications and tweaking it to support embedded applications.

Other important aspects of ARC’s value proposition are the configurability of the design and the extensibility of the instruction set. Configurability enables implementing just the minimum hardware that is needed for a SoC and nothing more. Extensibility enables adding custom instructions to accelerate application code, increase code density and reduce power consumption.

Customers are effectively able to create customized processor hardware, supported by a singular, standard MetaWare toolchain, that delivers the optimal PPA and code density for their application needs. The majority of ARC customers extend the instruction set by adding custom instructions for their specific algorithms.

Addressing Expanding Market Requirements

Until the introduction of the VPX family of processors, ARC processors could be categorized as “big CPU, little DSP” IP solutions. Embedded workloads such as IoT sensor fusion, Radar and LiDAR processing, voice/speech recognition, and natural language processing call for full-fledged DSP capabilities. As Synopsys saw this rising market need, they launched the VPX line of processors, which uses an extended ARC ISA to implement highly vectorized DSPs.

Product Requirements for these Markets

Floating point support is becoming more important for signal processing applications. The data processing algorithms being developed for these markets use floating point to support a wide dynamic range. Staying in floating point instead of converting to fixed point makes mapping an algorithm to a design architecture quicker. The DSP libraries and linear algebra libraries that are supporting these applications are represented in floating point format. Strong support for programming with vector floating point operations is becoming more of a requirement than in the past.
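A small numeric sketch of the dynamic-range argument (illustrative values only, not from Synopsys material): quantizing a wide-dynamic-range signal to a single fixed-point scale either clips the large values or buries the small ones, which is exactly the extra algorithm work that staying in floating point avoids.

```python
# Illustration of the dynamic-range argument: a radar-like signal spanning many
# orders of magnitude survives float32 essentially untouched, while int16 fixed
# point forces one global scale factor that loses the small components.
import numpy as np

signal = np.array([1e-4, 3e-2, 0.5, 40.0, 9000.0], dtype=np.float64)

as_float32 = signal.astype(np.float32)                  # per-element exponent

scale = 32767.0 / signal.max()                          # one global Q-format scale
as_int16 = np.clip(np.round(signal * scale), -32768, 32767).astype(np.int16)
reconstructed = as_int16 / scale

print("float32 relative error:", np.abs(as_float32 - signal) / signal)
print("int16   relative error:", np.abs(reconstructed - signal) / signal)
# The smallest entries come back with near-100% error in fixed point.
```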

Efficient execution of AI algorithms is another must-have for any modern DSP. This implies support for short Integer datatypes such as Int8, combined with a dedicated programming environment that allows for a smooth mapping of graphs to the DSP architecture. And of course, the DSP has to come with a rich library of machine learning kernels optimized for the hardware to ease software development.

A dedicated hardware accelerator for linear and non-linear algebra operations significantly speeds up these increasingly used math functions.

Configurability, extensibility and scalability are becoming key requirements as product companies start offering multiple variants. Each variant may be optimized differently for PPA and code density.

VPX Family of DSP IP

With the availability of three different VPX families representing 7 different DSPs, customers now have greater flexibility for implementing specific application requirements. The latest two additions are based on the same VLIW/SIMD architecture as the higher-performance 512-bit ARC VPX5 DSP processor launched two years ago. As the new additions target low-power embedded SoCs, they are designed for smaller vector lengths, resulting in smaller, lower power footprints. As ultra-high floating-point performance is a focus for the VPX DSPs, a Vector Floating Point Unit (VFPU) is offered as an option. The VFPU is implemented with multiple pipelines capable of executing up to 512 FLOPs per clock cycle. Along with the launch of the two new additions, Synopsys has also announced some enhancements to the VPX5 processor.

Easy Migration and Scalability of Products

The ARC VPX processors are supported by the Synopsys ARC MetaWare Development Toolkit, which provides a vector length-agnostic (VLA) software programming model. From a programming perspective, the vector length is identified as “n” and the value for n is specified in a define statement. The MetaWare compiler does the mapping and picks the right set of software libraries for compilation. The compiler also provides an auto-vectorization feature which transforms sequential code into vector operations for maximum throughput.

In combination with the DSP, machine learning and linear algebra function software libraries, the MetaWare Development Toolkit delivers a comprehensive programming environment.

Together, the above features enable customers to easily migrate and/or scale their products across all members of the VPX family.

Opportunity for Optimizing Current ARC VPX5-based Designs

In all the talk about VPX2 and VPX3 in the press announcement, mention of the VPX5 enhancements may have gotten lost. Refer to Figure below.

The VPX5 enhancements include double-wide vector load/store, wider AXI interfaces, ISA extensions, and machine learning, DSP and linear algebra libraries that support a VLA-based programming model. These enhancements have enabled VPX5 (as well) to double its performance compared to the earlier version for common DSP functions such as FFT, dot product and windowing. In many applications, this removes the need for designers to implement a separate external accelerator for these functions.

For the Automotive Market

To satisfy the enhanced safety requirements of the automotive market, Synopsys offers a functional safety (FS) series for their entire portfolio including the VPX family of processors. The FS series of processors meet random fault detection and systematic functional safety development flow requirements for full ISO 26262 compliance up to ASIL D.

Summary

Delivering design efficiencies, optimizing for PPA and maximizing software code density are at the root of what ARC is about. Synopsys’ ARC VPX DSP family of processors provides customers with a full range of scalable solutions to address their varying requirements.

You can access the full press release here. You can review details of the VPX DSP processors at their product page.

Also read:

Synopsys’ Complete 800G Ethernet Solutions

Safety + Security for Automotive SoCs with ASIL B Compliant tRoot HSMs

IoT’s Inconvenient Truth: IoT Security Is a Never-Ending Battle


Verification Completion: When is enough enough?  Part I
by Dusica Glisic on 09-30-2021 at 6:00 am

Verification is a complex task that takes the majority of time and effort in chip design. Veriest shares customer views on what this means. We are an ASIC services company, and we have the opportunity to work on multiple projects and methodologies, interfacing with different experts.

In this “Verification Talks” series of articles, we aim to leverage this unique position to analyze various approaches to common verification challenges, and to contribute to making the verification process more efficient.

As product life cycles are getting shorter, you need to be very effective in your development flow to meet deadlines. What does that mean in terms of design maturity? Since we can never say that verification is fully complete, I was interested in who decides, and when, that “enough is enough”: Is there some predefined date? Is it adjustable in relation to the success of the development process?

Even after a 10-year career as a verification engineer, I was very interested to discuss this topic with other respected professionals and get their perspectives. My interlocutors were: Mirella Negro from STMicroelectronics in Italy, Mike Thompson from the OpenHW Group in Canada and Elihai Maicas from NVIDIA in Israel. We also talked about different sign-off criteria, progress tracking and measuring metrics against the criteria as the development progresses.

How do we set a project timeline?

Everyone agrees that verification completion criteria should be jointly defined by all stakeholders in the project.

Mike believes that it should all start from the question “what are our quality goals?” and that primarily depends on the specific project. “For example, many ASIC projects will budget for one metal re-spin before going to production, so the quality requirement of first silicon could be written as ‘no functional bugs that gate sample testing and initial customer testing’.  This is still a very high bar, but it is not ‘zero functional defects’”.

Elihai believes it depends on the product application. Although he has worked only on consumer electronics projects, where the tape-out date is usually specified at the beginning, he noticed that in the development of, for instance, medical or automotive devices all stages are much more strictly defined and reviewed. So hopefully these chips are not taped-out until the confidence in the quality is very high.

“This is market driven”, says Mirella. She explains that after the marketing team understands customer demands, they come to the R&D team with a description of the product and technology required. The R&D team then evaluates whether that goal is achievable and in how much time. There are two possible scenarios. The first: this is an innovation, in which case you are usually not limited in time. The second: the product is currently in demand in the market; here the development makes sense only if your delivery date is aligned with expected customer demand. If you miss a very short window of opportunity, you may miss out on the time to sell.

Mirella emphasizes the importance of risk planning and having a business continuity plan in place. Risk planning should include both general risks as well as project-specific risks. Project scope can change because projects are very complex and involve a lot of innovation. But you can also be completely stuck due to unforeseen circumstances such as a pandemic, a bug in a tool, or a technical leader who leaves the company.

Sign-Off Criteria

According to Mike, the quality goal should be defined at the start of the project, but the “measurement question” must typically wait until the requirements and/or functional specification is available. And the answer to that question should be more qualitative and usually is a variation of “100% coverage and no test failing”. “There are a lot of re-use opportunities for completion criteria when the projects are in the same domain.  For instance, in our case, two different 32-bit RISC-V cores implement the RVIMC instruction extensions.  So, the RVIMC completion criteria for these cores are (probably) the same,” he adds.” However, the two cores have very different interfaces to instruction and data memory, so the completion criteria for these aspects are very different.  This illustrates why the ‘measurement question’ is gated by the Requirements and/or Functional Specification.”

Elihai’s experience indicates that in most cases there are some “magic” numbers (for example: phase 1: 85% code coverage, phase 2: 90% code coverage + 90% functional coverage, tape-out: 95% code coverage + 100% functional coverage) that are passed from project to project, plus some criteria specific to the current project, block or IP. This should all be defined before the development starts. However, he adds: “I never saw that happening. In real life, the specs are constantly being edited as the development progresses, and so are the flows and exact numbers (performance for example) that need to be simulated”. “The criteria are defined by the technical managers and project leaders, based on personal experience and legacy criteria used in the team,” Elihai summarizes.

At STMicroelectronics, however, there is a common sign-off criterion for all projects: anything less than 100% of functional and code coverage means there is still work to do. Once full coverage is reached with no failing tests, there is still a possibility of missing some functional bugs, but it is not obvious how to define better criteria. They try to guarantee at least what is measurable today with the available tools.
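As a simple illustration of how such phase gates might be encoded and tracked (the coverage thresholds follow the “magic numbers” Elihai quotes and the no-failing-tests gate follows the STMicroelectronics criterion; the measured values are invented):

```python
# Sketch of encoding phased sign-off criteria and checking measured metrics
# against them. Thresholds follow the examples quoted above; the "measured"
# numbers are invented for illustration.
CRITERIA = {
    "phase1":   {"code_cov": 85.0},
    "phase2":   {"code_cov": 90.0, "func_cov": 90.0},
    "tape_out": {"code_cov": 95.0, "func_cov": 100.0, "failing_tests": 0},
}

def check_phase(phase, measured):
    gaps = []
    for metric, target in CRITERIA[phase].items():
        value = measured.get(metric)
        ok = value <= target if metric == "failing_tests" else value >= target
        if not ok:
            gaps.append(f"{metric}: have {value}, need {target}")
    return gaps

measured = {"code_cov": 93.2, "func_cov": 97.5, "failing_tests": 3}
for phase in CRITERIA:
    gaps = check_phase(phase, measured)
    print(phase, "PASS" if not gaps else f"BLOCKED ({'; '.join(gaps)})")
```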

Takeaways

Since the quality criteria depend on the market, competition and budget, I’m not sure that we can or should aim to create common criteria, because this dictates the success of the company. Therefore, companies that find themselves less successful compared to competitors could revise their approach in this regard. Still, there seems to be a minimum that everyone respects. It is mandatory to have a very high percentage (100% or almost 100%) of all success indicators that can be measured with today’s tools. Also, no one compromises on quality when it comes to sensitive applications such as medical and automotive devices. There the verification sign-off happens only once all internal and external auditors approve it. On the other hand, in consumer electronic devices some features might be abandoned due to pressure to meet a deadline.

After understanding what the different criteria are and how they were created, in the follow-up article we will look at how you can track progress and what can cause problems along the way.

To view more about Veriest, please visit our web site.

Also Read:

Verification Completion: When is Enough Enough?  Part II

On Standards and Open-Sourcing. Verification Talks

Agile and DevOps for Hardware. Keynotes at DVCon Europe