
Chip Enabler and Bottleneck ASML

by Robert Maire on 04-22-2022 at 6:00 am


-ASML reported an “in line” Q1- Orders remain super strong
-Ongoing supply chain issues will limit growth and upside
-ASML targets 2025 for supply fixes- We are not so sure
-Intel, TSMC, Samsung won’t be able to build all fabs they plan

ASML has “in line-ish” Q1, orders still off the charts

ASML reported Euro3.5B in revenues and EPS of Euro1.73. Revenues were slightly light while earnings were a slight beat. Margins were 49%. More importantly orders remain very strong at Euro7B including Euro2.5B of EUV and multiple high NA systems.

Orders continue to outstrip ability to supply so more of the focus of both management and investors will be on ASML’s ability to ramp their supply chain to meet demand.

Talking about 2025 as target to fix supply issues

“Let’s keep our fingers crossed and see what 2025 brings us” -Peter Wennink on the earnings call. Not very confidence building. 2025 was mentioned many times on the call as the target to fix the supply chain issues that are limiting ASML’s ability to ship tools. We think many investors misunderstand the factors limiting ASML’s supply chain and therefore growth.

ASML is not limited by chips, by current issues in Europe due to Ukraine, or even by Covid related issues. The supply chain issues are unique and specific to ASML and its suppliers. Suggesting that 2025 will be the answer is more of a current hope than a definitive plan that is in place to ensure that issues will be fixed.

Zeiss is the key bottleneck and immovable object in the road to growth

While ASML is the key enabler to the chip industry, Zeiss is the key enabler to ASML. Zeiss makes the key optics that are the differentiator that makes ASML tools work. There is no second source; ASML is totally dependent upon Zeiss.

Most investors do not understand that Zeiss is not a normal company with shareholders. Zeiss is a foundation. The stated target of the foundation is the furthering of science and ensuring the employees’ well-being and continued employment. Profit and growth are an afterthought. It is essentially run for the betterment of employees, not profit. It is German labor unions and labor relations taken to an extreme.

Being the oldest such foundation in Germany also makes it slower to change.
One of the current issues is that Zeiss does not have enough space to increase production and doesn’t want to ruffle the feathers of neighbors with construction.

There is also the fact that not a lot of young Germans want to apprentice for years to polish glass for the rest of their lives. ASML just may be stuck with a supplier that can’t respond as quickly as needed, doesn’t care to respond as quickly as needed, and doesn’t have to. It’s like trying to get a 175-year-old Galapagos tortoise to run a sprint. Not gonna happen.

This all trickles down to Intel, TSMC & Samsung not building fabs

The demand for EUV tools, let alone High NA tools, far outstrips ASML’s ability to deliver them. Something’s gotta give. Intel, TSMC and Samsung can place all the orders they want and they will just pile up on ASML’s desk.

TSMC is far ahead of Samsung in EUV tool count and Intel is a distant third. Other companies in the memory space are also entering the EUV club and placing orders as well. This means order growth likely well in excess of the roughly 20% limit of production growth…in other words a significant shortfall.

It is likely that TSMC could take up 100% of ASML’s EUV production by itself. Intel can never hope to catch TSMC until it can get more EUV tools.

Basically there is no way the chip industry will be able to get enough tools for all the plans of fabs today and either fabs will not be built or remain empty shells until EUV tools become available. Intel recently uprooted an EUV tool from Portland to send to Europe which is something you never want to do unless you are very desperate.

In a way this is likely a good thing for the industry in that it will put off the oversupply cyclicality that has historically plagued the industry. It will allow prices to remain higher for chips due to shortages and will allow ASML to charge whatever it wants.

Maybe ASML could charge more for a “Fastpass” to cut the EUV line much like Disney does. Could financial buyers take a place in line and “scalp” EUV tools?

Chip industry needs alternatives to current lithography process

The chip industry is clearly limited by ASML and Zeiss. The industry desperately needs either an alternative to existing lithography process and tools such as E beam direct write and DSA (directed self assembly) or process enhancements to existing lithographic process that can speed or make more efficient use of existing lithographic tools to get more output.

We don’t need more dep and etch tools. Maybe some more yield management to help optimize EUV and litho tools. Necessity is the mother of invention.

The Stock

ASML is in the enviable position of being a monopoly in an industry desperate for their tools. This will not change any time soon, in fact the gap between demand and supply of litho tools will likely only get worse over the medium term as orders pile on without a corresponding increase in supply.

While the quarter reported was only OK, the order news says that the shortage of ASML tools will continue. ASML will be able to increase capacity but we think it will take far longer than most investors expect or understand. We would be careful not to extrapolate and expect too much growth out of ASML due to the limitations inherent in their system.

We expect that ASML’s stock, while down for the year, will respond positively as there were some concerns about demand slowing. The issue is clearly not demand but ability to supply… it’s a better problem to have as it is much more fixable, but will take time.

Also read:

DUV, EUV now PUV Next gen Litho and Materials Shortages worsen supply chain

AMAT – Supply Constraints continue & Backlog Builds- Almost sold out for 2022

Intel buys Tower – Best way to become foundry is to buy one


Truechip’s DisplayPort 2.0 Verification IP (VIP) Solution

by Kalar Rajendiran on 04-21-2022 at 10:00 am


Integrating IP to build SoCs has been consistently on the rise. Growth in complexity and time-to-market pressures are some of the primary drivers behind this phenomenon. Consequently, the IP market segment has also been enjoying tremendous growth. While this is great news for chip design schedules, it does highlight the increased demand for quick, easy and accurate verification. Without a time and cost efficient way to verify an IP solution, the cost of verifying can end up being higher than the cost of developing the IP itself, and an SoC’s development schedule would be adversely impacted. Naturally, the Verification IP (VIP) segment of the IP market has seen high growth rates.

There are IP verification solutions offered by a number of companies. One company that was introduced in late 2020 to the SemiWiki audience is Truechip. Founded in 2008, Truechip characterizes itself as the Verification IP Specialist. It offers an extensive portfolio of VIP solutions to verify IP components interfacing with industry-standard protocols integrated into ASICs, FPGAs and SoCs.

Salient Aspects of Truechip’s VIP Solutions

Truechip’s Verification IPs are fully compliant to standard specifications and come with an easy plug-and-play interface to enable a rapid development cycle. The VIPs are highly configurable by the user to suit the verification environment. They also support a variety of error injection scenarios to help stress test the device under test (DUT). Their comprehensive documentation includes user guides for various scenarios of VIP/DUT integration. Truechip’s VIP solutions work with all industry-leading dynamic and formal verification simulators. The solutions also include Assertions that can be used in formal and dynamic verification as well as with emulations.

And their solutions come with the TruEYE GUI-based tool that makes debugging very easy. This patented debugging tool reduces debugging time by up to 50%.

Truechip’s DisplayPort 2.0 VIP Solution

One interface IP that is gaining a lot of attention these days is the DisplayPort IP. Truechip has been supporting the Display market segment with VIP solutions for HDMI, HDCP and DisplayPort. They recently expanded their portfolio with the addition of a DisplayPort 2.0 VIP solution. Their DisplayPort 1.4 VIP has a long track record within the customer base. Their DisplayPort 2.0 VIP has brought a lot of upgrades to keep up with the enhancements from DisplayPort 1.4 to 2.0. The following Figure depicts a block diagram of the corresponding VIP environment.

The DisplayPort 2.0 VIP is fully compliant with Standard DisplayPort Version 2.0 specifications from VESA. Nonetheless, it is a lightweight VIP with an easy plug-and-play interface for a rapid design cycle and reduced simulation time. The solution is offered in native SystemVerilog (UVM/OVM/VMM) and Verilog, with availability of compliance and regression test suites.

Some Salient Features of Truechip’s DisplayPort 2.0 VIP Solution

  • Supports High Bandwidth Digital Content Protection System Version 1.4, 2.2 and 2.3.
  • Supports Multi-Stream Transport (MST)
  • Supports Link Training (LT) Tunable PHY Repeaters (LTTPR)
  • Supports Reed-Solomon Forward Error Correction RS(254,250)
  • Supports multi lane configuration (up to 4 lanes)
  • Supports DSC v1.2a (Compressed Display Stream Transport Services)
  • Supports DisplayPort Configuration Data (DPCD) version 1.4
  • Support of legacy EDID is provided
  • Supports I2C over AUX Channel and Native AUX
  • Supports dynamically configurable modes.
  • Supports Dynamic as well as Static Error Injection scenarios.
  • On the fly protocol checking using protocol check functions, static and dynamic assertions
  • Built in Coverage analysis.
  • TruEYE GUI analyzer tool to show transactions for easy debugging
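To put the RS(254,250) entry in the feature list above into numbers, here is a minimal sketch that derives the basic properties of any Reed-Solomon RS(n, k) block code from n and k alone. The function name `rs_parameters` is hypothetical and has nothing to do with Truechip's VIP; the only facts used are the standard relations for Reed-Solomon codes (n - k parity symbols, up to half that many correctable symbol errors per block).

```python
def rs_parameters(n: int, k: int) -> dict:
    """Return basic properties of an RS(n, k) block code, counted in symbols."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    parity = n - k  # parity symbols appended to each block of k data symbols
    return {
        "parity_symbols": parity,
        "correctable_symbol_errors": parity // 2,  # t = floor((n - k) / 2)
        "code_rate": k / n,        # fraction of each block that is payload
        "overhead": parity / k,    # extra symbols relative to the payload
    }


# RS(254, 250) as listed in the feature list.
params = rs_parameters(254, 250)
print(params)
```

For RS(254,250) this gives 4 parity symbols per block and up to 2 correctable symbol errors, at a cost of 1.6% extra symbols over the payload.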

Deliverables

  • DisplayPort 2.0 BFMs for:
    • Source – Link Layer
    • Source – MAC Layer
    • Source – PHY Layer
    • Sink – Link Layer
    • Sink – MAC Layer
    • Sink – PHY Layer
    • Branching Devices
  • DisplayPort layered monitor & scoreboard
  • Test Environment & Test Suite :
    • Basic and Directed Protocol Tests
    • Random Tests
    • Error Scenario Tests
    • Assertions & Cover Point Tests
    • Compliance Test Suite
    • User Test Suite
  • Integration guide, user manual, and release notes
  • TruEYE GUI analyzer to view simulation packet flow

About Truechip

Truechip, the Verification IP specialist, is a leading provider of Design and Verification solutions. It has been serving customers for more than a decade. Its solutions help accelerate the design cycle, lower the cost of development and reduce the risks associated with the development of ASICs, FPGAs and SoCs. The company has a global footprint with sales coverage across North America, Europe and Asia. Truechip provides the industry’s first 24×5 support model with specialization in VIP integration, customization and SoC Verification.

For more information, refer to the Truechip website.

Also Read:

Bringing PCIe Gen 6 Devices to Market

PCIe Gen 6 Verification IP Speeds Up Chip Development

USB4 Makes Interfacing Easy, But is Hard to Implement


The ASIC Business is Surging!

by Daniel Nenni on 04-21-2022 at 6:00 am


Application Specific Integrated Circuits were the foundation of the semiconductor industry up until the IDMs came to power in the 1980s and 90s. Computer companies all had their own fabs (I worked in one) until startup companies like SUN Microsystems started using off-the-shelf chips from Motorola. SUN moved to the fabless model and designed their own SPARC chips, but Intel was too powerful; with Windows and Linux, they took over the CPU space forthwith.

During this transition quite a few semiconductor companies adopted the ASIC model and designed and manufactured chips for other systems companies. IBM, NEC, and Toshiba come to mind, and there were also dedicated ASIC companies like VLSI Technology and LSI Logic who had their own fabs.

The fabless transformation changed all of this of course and now the ASIC market is dominated by fabless companies. The ASIC business today is split into two categories: pure-play fabless ASIC companies (GUC, Faraday, Alchip, Sondrel, Verisilicon, Alphawave, SemiFive, eInfochips, etc…) and chip companies that also do ASICs (Broadcom, Marvell, MediaTek). Broadcom has the former Avago ASIC business and Marvell acquired eSilicon and the Globalfoundries/IBM ASIC business. MediaTek grew their ASIC ambitions organically.

Why is the ASIC business surging you ask? The same reason EDA, IP and TSMC are surging. Systems companies from all walks of life are now doing their own ASICs. It really has come full circle.

Alchip recently released numbers that are quite telling with four consecutive years of record setting performance. Alchip, founded in 2003, is headquartered in Taipei but has roots in the US and Japan.

One of the more telling pieces of data from Alchip is that the majority of their revenue (88%) came from FinFET designs from 16nm down to 5nm, some with complex packaging (CoWoS and MCM).

Alchip President and CEO Johnny Shen expects strong demand in 2022 to come from AI, HPC, and IoT. He also pointed out that a select number of large production-quantity leading edge AI devices entered mass production in 2021, accounting for the company’s record performance.

Also in 2021, Alchip, a TSMC-certified Value Chain Aggregator, taped out a significant number of 16nm, 12nm and 7nm designs. Several of the 7nm designs involved advanced packaging technology. A number of the 5nm designs will tape out in 2022 and a 3nm test chip is under development, with expected tape out in late 2022.

Bottom line: The ASIC business is the semiconductor world’s oldest profession, much of which is done undercover. The chip supply constraints of the roaring 20s have done more for ASICs than anyone could have imagined and will continue to do so, absolutely.

About Alchip
Alchip Technologies Ltd, headquartered in Taipei, Taiwan, is a leading global provider of silicon design and production services for system companies developing complex and high-volume ASICs and SoCs. The company was founded by semiconductor veterans from Silicon Valley and Japan in 2003 and provides faster time-to-market and cost-effective solutions for SoC design at mainstream and advanced processes, including 7nm. Customers include global leaders in AI, HPC/supercomputer, mobile phones, entertainment devices, networking equipment and other electronic product categories. Alchip is listed on the Taiwan Stock Exchange (TWSE: 3661) and the Luxembourg stock exchange and is a TSMC-certified Value Chain Aggregator.

Also read:

Alchip Reveals How to Extend Moore’s Law at TSMC OIP Ecosystem Forum

Alchip is Painting a Bright Future for the ASIC Market

Maximizing ASIC Performance through Post-GDSII Backend Services


Podcast EP72: Analog AI/ML with Aspinity

by Daniel Nenni on 04-20-2022 at 10:00 am

Dan is joined by Tom Doyle, CEO of Aspinity. Dan explores the benefits of Aspinity’s analog signal processing technology with Tom. The ultra-low power analog computing capability delivered by Aspinity has significant implications for the design and deployment of AI/ML systems.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Visual Debug for Formal Verification

by Steve Hoover on 04-20-2022 at 6:00 am


Success with Open-Source Formal Verification

The dream of 100% confidence is compelling for silicon engineers. We all want that big red button to push that magically finds all of our bugs for us. Verification, after all, accounts for roughly two-thirds of logic design effort. Without that button, we have to create reference models, focused tests, random stimuli, checkers, coverage monitors, regression suites, etc.

Of course, there is no big red button, and I’d be crazy to suggest that we could abandon all of that work altogether. But, at the same time, that’s not far from what Akos Hadnagy and I did, several years ago, in developing the WARP-V CPU generator.

I wrote WARP-V initially to explore code generation using the emerging Transaction-Level Verilog language. I brought the model to life with a simple test program that summed numbers from 1 to 10. Then, Akos put RISC-V configurations of WARP-V through the wringer, as a student in Google Summer of Code, using the open-source RISC-V Formal framework. By completing formal verification (which has now also been done independently by Axiomise using different tools and checkers), we felt no inclination to bother with any of the standard RISC-V tools and compliance tests.

Formal Verification Hurdles

While our formal-focused approach helped eliminate a considerable amount of work, in other ways it did add some effort, too. Of course it did. If formal verification were a panacea, everyone would be taking this approach, and while formal verification has been around for a long time, it still struggles to attain first-class status in the verification landscape. This has little to do with the core science and everything to do with usability.

The first big leap in usability for formal verification came with the provision of counterexample traces. These let you debug formal failures much like simulation failures, using a waveform viewer. This, however, is not enough to put formal verification on a level playing field with dynamic verification. For one thing, simulations can produce log files in addition to waveforms. These provide high-level context about simulations to help with debugging. For aggressive use of formal verification, getting big picture context is important. Here’s why:

Traditionally, focused testing plays a major role in stabilizing the model. The myriad basic bugs are identified by focused tests, which are written for a specific purpose in a very controlled context. You know what they are doing. You know what to look for. Formal verification, however, will identify a counterexample that could be doing absolutely anything (within your constraints). Fortunately, the trace will be short, but formal tools have a way of finding really gnarly corners you would never expect or never be able to hit in a controlled fashion. That’s what’s so great about formal!
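To make the idea of a short counterexample concrete, here is a toy sketch of the search a formal tool performs in spirit. This is not how RISC-V Formal or any production tool works internally (those use SAT/SMT solvers); it is a brute-force breadth-first search over the inputs of a tiny, deliberately buggy FIFO occupancy counter. The design, the property and all names (`next_state`, `prop`, `find_counterexample`, `DEPTH`) are hypothetical.

```python
from collections import deque
from itertools import product

DEPTH = 4  # hypothetical FIFO depth


def next_state(count, push, pop):
    """Buggy occupancy counter: a push is not gated by 'full'."""
    if push and not pop:
        return count + 1          # bug: can exceed DEPTH
    if pop and not push and count > 0:
        return count - 1
    return count


def prop(count):
    """Safety property we want to hold in every reachable state."""
    return count <= DEPTH


def find_counterexample(init=0, max_depth=20):
    """Breadth-first search over input sequences.

    Returns the shortest list of (push, pop) inputs that drives the design
    into a state violating prop, or None if none is found within max_depth.
    """
    queue = deque([(init, [])])
    seen = {init}
    while queue:
        state, trace = queue.popleft()
        if not prop(state):
            return trace
        if len(trace) >= max_depth:
            continue
        for push, pop in product((False, True), repeat=2):
            nxt = next_state(state, push, pop)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [(push, pop)]))
    return None


trace = find_counterexample()
print(trace)
```

Because the search is breadth-first, the first violation found is also a shortest one: here, five consecutive pushes with no pop. A real counterexample is just such a sequence of input values per cycle, with no hint as to why those inputs matter, which is the debugging problem the paragraph above describes.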

So if we’re going to find a significant portion of our bugs using formal methods, we’d better make it easier to figure out what’s going on in the counterexamples. That’s where visualization comes in.

Streamlining Debugging with Visualization

WARP-V utilizes the Visual Debug framework, now freely available to open-source projects in the Makerchip.com IDE. Visual Debug (or VIZ) makes it easy to define simulation visualizations. These aid in the debugging process of any digital circuit developed using any hardware description language and any design environment that produces industry-standard (.vcd) trace files. You may have seen screenshots of visualizations similar to those of WARP-V in various of my posts about my RISC-V CPU design courses, in which hundreds of students have developed their own RISC-V CPUs.
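For readers who have not looked inside a .vcd file, the following is a minimal sketch of what such a trace contains and how a tool might read it. It has no connection to how Visual Debug itself is implemented. The sample trace and the `parse_vcd` helper are made up for illustration, and the parser handles only a small subset of the format (scalar and binary vector value changes).

```python
import re

# A tiny hand-written VCD trace: a header declaring signals, then
# timestamped value changes keyed by each signal's short identifier code.
SAMPLE_VCD = """\
$timescale 1ns $end
$scope module top $end
$var wire 1 ! clk $end
$var wire 4 " count $end
$upscope $end
$enddefinitions $end
#0
0!
b0000 "
#5
1!
b0001 "
#10
0!
"""


def parse_vcd(text):
    """Return (names, changes).

    names maps identifier code -> signal name; changes is a list of
    (time, signal_name, value_string) tuples in file order.
    """
    names, changes, time = {}, [], 0
    in_header = True
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if in_header:
            m = re.match(r"\$var\s+\S+\s+\d+\s+(\S+)\s+(\S+)", line)
            if m:
                names[m.group(1)] = m.group(2)
            elif line.startswith("$enddefinitions"):
                in_header = False
            continue
        if line.startswith("#"):                 # timestamp
            time = int(line[1:])
        elif line[0] in "bB":                    # vector: b<bits> <id>
            value, ident = line[1:].split()
            changes.append((time, names[ident], value))
        elif line[0] in "01xXzZ":                # scalar: <value><id>
            changes.append((time, names[line[1:]], line[0]))
    return names, changes


names, changes = parse_vcd(SAMPLE_VCD)
for change in changes:
    print(change)
```

A visualization layer sits on top of exactly this kind of data: a stream of (time, signal, value) events that it maps onto a picture of the design rather than onto waveforms.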

Using Visual Debug for the first time is like turning the lights on in a room you didn’t realize was dark. Just as you wouldn’t walk into a dark room to find your car keys without turning on the lights, you shouldn’t start debugging without first enabling Visual Debug. Though it wasn’t the case at the start of WARP-V development, as you’ve undoubtedly guessed by now, VIZ now works for formal counterexamples as well as it does for simulation.

Implications of Easier Debugging

Let’s put these visualization benefits in the context of WARP-V’s design methodology. This means, I first get to talk about the benefits of TL-Verilog–my favorite topic. Utilizing TL-Verilog, WARP-V is able to support different pipeline depths and even different instruction set architectures from the same codebase. And it is able to do so in less code (and correspondingly fewer bugs) than a single RTL-based CPU core. Furthermore, transaction-level design greatly simplifies the task of creating test harnesses to connect any RISC-V hardware configuration to the RISC-V Formal checkers. As described in “Verifying RISC-V in One Page of Code!”, the reduction in modeling effort across four different CPU configurations was arguably a factor of 70x or more! (These benefits would apply to test harnesses for dynamic verification as well.)

In the face of these TL-Verilog benefits, the effort to debug formal verification failures became a significant portion of the remaining work, and Visual Debug would have streamlined this effort. More generally, being able to easily decipher formal counterexamples can be the boost in productivity that tips the scales for formal verification. This, in turn, makes our resulting hardware more robust and secure. And security is quite possibly the biggest challenge faced by design teams today.

Visual Debug in Action

I leave you with a screen-capture, narrated by yours truly, demonstrating debugging of a register bypass (aka register forwarding) bug in WARP-V.

Related Links: Makerchip.com, Visual Debug, WARP-V CPU generator, RISC-V Formal, RISC-V CPU design courses, “Verifying RISC-V in One Page of Code!”


White Paper: Advanced SoC Debug with Multi-FPGA Prototyping

by Daniel Nenni on 04-19-2022 at 10:00 am


S2C EDA recently released a white paper written by a good friend of mine, Steve Walters. Steve and I have worked together many times throughout our careers and I consider him to be one of my trusted few, especially with regard to prototyping and emulation. Steve is also my co-author on the book “Prototypical II The Practice of FPGA-Based Prototyping for SoC Design”. Prototypical II and this 10 page white paper are available on the S2C EDA website.

Introduction

As SoC designs advance in complexity and performance, and software becomes more sophisticated and SoC-dependent, SoC designers face a relentless push to “shift left” the co-development of the SoC silicon and software to improve time-to-market.  Consequently, SoC verification has evolved to include multi-FPGA prototyping, and higher prototype performance, to support longer runs of the SoC design prototype, running more of its software, prior to silicon – in an effort to avoid the skyrocketing costs associated with silicon respins.  While FPGA prototyping for SoC design verification by its nature remains a “blunt instrument”, FPGA prototyping is still the only available pre-silicon verification option, beyond hardware emulation, for achieving longer periods of SoC design operation capable of running software, and, in some cases, “plugging” the SoC design prototype directly into real target-system hardware.  Not surprisingly, commercial FPGA prototype suppliers are using the latest FPGA technology to implement FPGA prototyping, offering multi-FPGA prototyping platforms, and advancing FPGA prototyping debug tool capabilities, to meet customer demands for more effective SoC verification.

Ideally, SoC design debug tools for FPGA prototyping would enable software simulation-like verification and debug at silicon speeds: they would provide visibility of all internal SoC design nodes, not impede prototype performance, provide unlimited debug trace-data storage, and be quickly reconfigurable for revisions to the SoC design and/or the debug setup. In reality, today’s SoC design debug tools for FPGA prototyping fall short of the ideal, and multi-FPGA prototyping adds to the challenge of achieving ideal SoC design debug tool capabilities. As a result, today’s FPGA prototyping for SoC design debug offers tradeoffs among the ideal debug tool capabilities, and it is left to the SoC design verification team to configure an “optimal” verification strategy for each SoC design project, with consideration for future scaling-up and improved verification capabilities.

This white paper reviews some of the multi-FPGA prototyping challenges for SoC design verification and debug, and reviews one example of a commercially available multi-FPGA prototyping debug capability offered by S2C Inc., a leading supplier of FPGA prototyping solutions for SoC design verification and debug (s2ceda.com).

Summary and Conclusions

S2C’s MDM Pro hardware, together with S2C’s Prodigy FPGA prototyping platforms and S2C’s Player Pro software, implements a rich set of debug features that provides SoC designers with the flexibility to optimize the FPGA prototype debug tools for a given FPGA prototyping project. MDM Pro combines off-FPGA hardware for “deep” trace-data storage and complex hardware trigger logic with probe multiplexing IP in the FPGA, which accesses a large number of debug probes over a few high-speed FPGA GTY connections to minimize the consumption of FPGA I/O. It also allows more probe connections to be set up than need to be viewed at the same time, so that additional probes may be viewed when needed without recompiling the FPGA or degrading debug performance. Player Pro software for debug complements the debug hardware with a powerful user interface for managing the debug setup, configuring advanced trace-data trigger conditions, initiating debug runs of the FPGA prototype, and viewing the debug trace-data from multiple FPGAs in a single viewing window.
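The general idea behind probe multiplexing can be sketched in a few lines. The following is a generic time-division multiplexing illustration and makes no claim about how MDM Pro is actually implemented: one wide sample of probe bits is cut into narrower frames that fit the available link width, sent one frame at a time, and reassembled on the other side. The function names, the probe count and the frame width are all hypothetical.

```python
from math import ceil


def mux_frames(sample_bits, frame_width):
    """Split one wide probe sample into fixed-width frames.

    The last frame is zero-padded so every frame has frame_width bits.
    """
    n_frames = ceil(len(sample_bits) / frame_width)
    padded = sample_bits + [0] * (n_frames * frame_width - len(sample_bits))
    return [padded[i * frame_width:(i + 1) * frame_width]
            for i in range(n_frames)]


def demux_frames(frames, num_probes):
    """Reassemble the original probe sample and drop the padding."""
    flat = [bit for frame in frames for bit in frame]
    return flat[:num_probes]


# Hypothetical numbers: 10 probed signals carried over a 4-bit-wide link.
sample = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
frames = mux_frames(sample, frame_width=4)
restored = demux_frames(frames, num_probes=len(sample))
print(len(frames), frames)
```

The tradeoff is visible in the frame count: the narrower the link relative to the number of probes, the more frames are needed per sample, so the link has to run correspondingly faster than the probed logic to capture every cycle.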

Also read:

Prototype enables new synergy – how Artosyn helps their customers succeed

S2C’s FPGA Prototyping Solutions

DAC 2021 Wrap-up – S2C turns more than a few heads


SoC Application Usecase Capture For System Architecture Exploration

by Sondrel on 04-19-2022 at 6:00 am


Sondrel is the trusted partner of choice for handling every stage of an IC’s creation. Its award-winning define and design ASIC consulting capability is fully complemented by its turnkey services to transform designs into tested, volume-packaged silicon chips. This single point of contact for the entire supply chain process ensures low risk and faster times to market. Headquartered in the UK, Sondrel supports customers around the world via its offices in China, India, Morocco and North America.

Introduction

Early in the SoC development cycle, Product Managers, Systems Architects and relevant technical stakeholders discuss and elaborate product requirements. Each group tends to have a specific mental model of the product, with Product Managers typically focusing on the end-use and product applications, while Systems Architects focus on functionality and on the execution and implementation of the requirements.

The ‘Requirements Capture Phase’ identifies, formulates and records all known functionality and metrics, including performance, in a clear and complete proposal. In addition, this exercise identifies functionality that is not fully understood or may be included later and seeks to determine and plan what tasks are required to complete the qualification and quantification of such functions.

On completion, or as complete as possible at the program’s start, the system architecture team’s requirements go through an analysis phase with appropriate inputs from design and implementation teams. The outcome of this iterative process is an architecture design specification that includes an architecture design for which all functionality is determined, along with estimates of power, performance and area.

The inclusion of design and implementation effort at the initial phase ensures better accuracy and validation for the specification and architecture. In addition, it identifies the sensitivities needed to guide design choices.

The architecture analysis includes the architecture exploration, IP selection/specification, verification of requirements, and generation of the project execution plan with major tasks to be elaborated in later phases.

The architecture exploration of the candidate architecture is a significant component. It refines the architecture design by modelling the proposal and evaluating known or reference use cases, dynamically allowing the system topology to be defined and provisioning of resources to be allocated (memory, bus fabric data/control paths etc.).

While it allows aspects of the functionality to be evaluated and validated (connectivity, timing, performance etc.) for confidence in the correctness of the design, later phases using more detailed and accurate models are used to determine and correct potential errors during the implementation of the architecture.

The remaining sections of this article cover the use of modelling in the architecture phase of the program.

SoC application use case capture for system architecture exploration

The initial part of SoC Architecture Exploration is a rigorous way of capturing one or more application use cases and dataflows which an SoC is required to perform.  Accurate and complete description of use cases is necessary to communicate with stakeholders and agree on requirements early in the product definition phase.

The Systems Architect seeks to draw out the product requirements and express them so that technical and non-technical stakeholders can keep up with the product intent and architectural choices without excessive technical detail.

Figure 1 shows an overview of this collaboration process in 8 steps:

  1. Market analysis, industry trends, product requirements definition carried out by the Product Manager for a potential SoC solution
  2. Product Usecase requirements are communicated to the System Architect, usually by presentations, spreadsheets or documents.
  3. Requirements translation to DSL format required by modelling flows
  4. Tools generate an Executable Specification and visualisations of the use case
  5. Tools also generate the cycle-accurate SystemC model required for use case architecture exploration
  6. Systems architect inspects results of an exploration exercise and progressively converges to an optimal architecture for the SoC
  7. System Architect communicates findings with Product Manager
  8. The Product Manager may decide to modify requirements or collaborate with the Systems Architect to further refine the candidate SoC Architecture.

Industry trends show that vision-based applications are becoming more common to incorporate classical computer vision techniques and neural-net-based AI inferencing, with a fusion step to combine results from the two stages.

Figure 2 shows a typical autonomous vision use case data flow graph, with nodes representing processing functions and edges representing data flow.  The specific stages are:

  • Frame Exposure – The interval during which a camera sensor takes a snapshot of its field of vision. The image sensor may be configured in either global shutter or rolling shutter mode, and each mode has an exposure period associated with it.
  • Frame RX – The interval over which pixels of an image grouped in lines are sent to the SoC over a real-time interface such as MIPI CSI-3.
  • Image Conditioning – Any image pre-processing, filtering or summarisation steps performed on the received data before the actual compute stages.
  • Classical Computer Vision – Well-known vision processing algorithms, for example, camera calibration, motion estimation or homography operations for stereo vision.
  • Computational Imaging – Vision algorithms are augmented with custom processing steps such as Pixel Cloud or Depth Map estimation
  • AI Inferencing – Neural Net based image processing for semantic segmentation, object classification and the like.
  • Data Fusion – Final stage sensor fusion and tracking. May also include formatting or packetisation processing.
  • Data TX – Can be over PCIE or a real-time interface such as MIPI CSI-3 at a constant or variable data rate.

Associated with every processing stage are parameters that need to be specified so that the dynamic simulation model can be configured correctly.  These parameters generally describe:

  1. Read DMA characteristics: Number of blocks, block sizes, memory addresses and memory access patterns
  2. Processing characteristics: The delay which the task will require in order to perform its processing.
  3. Write DMA characteristics: Number of blocks, block sizes, memory addresses and memory access patterns

Figure 3 shows that this information is best described in tabular format, where rows represent processing tasks and columns are parameters associated with the task.
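As a rough illustration of the tabular format, the sketch below captures one row per processing task, with the read DMA, processing and write DMA parameters described above. The class names, field names, access-pattern labels and all numbers are assumptions made for this example; they are not the format used by any particular modelling tool.

```python
from dataclasses import dataclass


@dataclass
class DmaSpec:
    """DMA characteristics for one direction (read or write) of a task."""
    num_blocks: int
    block_size_bytes: int
    base_address: int
    access_pattern: str  # illustrative label, e.g. "linear" or "strided"

    def total_bytes(self) -> int:
        return self.num_blocks * self.block_size_bytes


@dataclass
class TaskRow:
    """One row of the use case table: a processing task and its parameters."""
    name: str
    read_dma: DmaSpec
    processing_delay_us: float
    write_dma: DmaSpec


# Two example rows; every value here is made up for illustration.
usecase_table = [
    TaskRow("image_conditioning",
            DmaSpec(8, 4096, 0x8000_0000, "linear"),
            120.0,
            DmaSpec(8, 4096, 0x8010_0000, "linear")),
    TaskRow("ai_inferencing",
            DmaSpec(8, 4096, 0x8010_0000, "linear"),
            900.0,
            DmaSpec(1, 1024, 0x8020_0000, "linear")),
]

for row in usecase_table:
    print(row.name, row.read_dma.total_bytes(),
          row.processing_delay_us, row.write_dma.total_bytes())
```

Keeping each task as a flat record like this is what makes the spreadsheet-style view possible: one row per task, one column per parameter.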

The use case graph may also have an embedded sub-graph, which is often the case with AI applications that describe the algorithm in terms of a Neural Network computation graph.  Figure 4 shows a sub-graph within a larger use case graph.  The method of describing the sub-graph is in the same tabular format, which may be present in any part of the larger graph, not just with AI processing.

Usecase parameters captured in tabular format as shown in Figure 3 are sufficient to describe the application intent regarding dataflows between processing stages and the processing delay of a given stage.  The added benefit of having the graph drawn to the left of the table is that it becomes intuitive to understand the data flow, hence the relationship between nodes as processing stages.  Even for large graphs, the method is applicable and offers supplementary information readily available if required.

Separate to the Application Usecase is a model of the Hardware Platform, which will perform the data transfers and processing delays as prescribed by the Usecase model.  The Hardware Platform model will typically have the following capabilities:

  1. Generate and initiate protocol compliant transactions to local memory, global memory or any IO device
  2. Simulate arbitration delays in all levels of a hierarchical interconnect
  3. Simulate memory access delays in a memory controller model as per the chosen JEDEC memory standard.

Figure 5 shows a block diagram of one such Hardware Platform, which, in addition to a simulation model, forms the basis for elaborating an SoC architecture specification.

So far we have defined two simulation constructs – the Application Usecase Model and the Hardware Platform Model.  What is now required is a specification of how the Usecase maps on to the Hardware Platform subsystems, that is, which tasks of the application usecase model are run by which subsystems in the hardware platform model.  Figure 6 shows the full simulation model with usecase tasks mapped on to subsystems of the hardware platform.

The Full System Model in Figure 6 is the dynamic performance model used for Usecase and Hardware Platform Exploration.

Every node in the Usecase graph is traversed during simulation, with the Subsystem master transactor generating and initiating memory transactions to one or more slave transactors. As a result, delays due to contention, pipeline stages or outstanding transactions are applied to every transaction, and these delays accumulate to give the total duration for which the task is active.

The temporal simulation view in Figure 7 shows how long each task is active during a single traversal of the Application Usecase.  The duration of the entire chain is defined as the Usecase Latency.  Having one visualisation that shows the Hardware Platform, Application Usecase and Temporal Simulation view often works very well for the various stakeholders because it is intuitive to follow.

A single traversal is not very useful, besides providing some sanity checks on the setup of the environment.  For thorough System Performance Exploration, multiple traversals need to be run, and in this setup we see the two phases of the simulation: a transient phase, when the pipeline is filling up, followed by the steady state, when the pipeline is full and the system is therefore at maximum contention.
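The fill and steady-state phases can be sketched with a very small pipeline model. This is only an illustration under strong assumptions: each stage handles one frame at a time, all frames are available at time zero, the per-stage delays are made-up constants, and memory contention is not modelled at all, whereas a real performance model would derive the delays from simulated transactions.

```python
def simulate_pipeline(stage_delays, num_frames):
    """Return finish[f][s]: the time at which frame f leaves stage s.

    A frame enters a stage once it has left the previous stage and the
    stage has finished the previous frame.
    """
    finish = []
    for f in range(num_frames):
        row = []
        for s, delay in enumerate(stage_delays):
            ready = row[s - 1] if s > 0 else 0.0        # frame left previous stage
            free = finish[f - 1][s] if f > 0 else 0.0   # stage done with previous frame
            row.append(max(ready, free) + delay)
        finish.append(row)
    return finish


def busy_stages(finish, stage_delays, t):
    """Count the stages that are processing some frame at time t."""
    count = 0
    for s, delay in enumerate(stage_delays):
        if any(row[s] - delay <= t < row[s] for row in finish):
            count += 1
    return count


stage_delays = [2.0, 5.0, 3.0]   # hypothetical per-stage delays in ms
finish = simulate_pipeline(stage_delays, num_frames=6)

completion = [row[-1] for row in finish]
intervals = [b - a for a, b in zip(completion, completion[1:])]

print("first-frame latency (ms):", completion[0])
print("interval between completed frames (ms):", intervals)
print("busy stages during fill, t=1 ms:", busy_stages(finish, stage_delays, 1.0))
print("busy stages when full,  t=8 ms:", busy_stages(finish, stage_delays, 8.0))
```

In this toy model only one stage is busy early on, all three are busy once the pipeline has filled, and frames then complete at an interval set by the slowest stage. That full-pipeline window is the part of the run in which metrics are worth collecting.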

Figure 8 highlights a portion of the simulation when the system is at maximum contention. During the steady-state, metrics are gathered to understand the performance characteristics and bounds of the system.  This guides further tuning and exploration of the use case and hardware platform.

Figure 9 shows two configurations of the hardware platform and the resulting temporal views.  One system is set up for low latency by using direct streaming interfaces to avoid data exchange through the DDR memory.

Yet again, showing the two systems visually brings clarity, so that all stakeholders can understand them with a bit of guidance.

The complete architecture exploration methodology relates to use case and platform requirements, simulation metrics, key performance indicators and reports.

Figure 10 shows the flow of information in the following order:

  1. Application Usecase is defined first. The tabular format for capturing the use case is crucial here, as shown previously in Figure 3
  2. Usecase Requirements associated with the Application Usecase are stated.
  3. Usecase Requirements are converted into Key Performance Indicators, which are thresholds on metrics expected from simulation runs.
  4. Simulation metrics are collected from simulation runs
  5. Usecase performance summary report is produced by checking if metrics meet their Key Performance Indicators or not.

A similar flow applies to Hardware Platform Requirements whereby:

  1. Hardware Platform defined first
  2. Platform Requirements stated
  3. Platform KPIs extracted from Requirements
  4. Platform simulation metrics collected
  5. Platform performance summary generated by comparing metrics with KPIs.
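The last step in both flows, comparing collected metrics with their KPIs, amounts to a threshold check. The sketch below shows one way this could look; the metric names, limits and measured values are hypothetical and chosen only to show a pass and a fail.

```python
import operator

# Each KPI is a threshold on a named simulation metric.
# Names and limits are hypothetical, for illustration only.
kpis = {
    "usecase_latency_ms": ("<=", 33.0),
    "frames_per_second": (">=", 30.0),
    "ddr_bandwidth_utilisation": ("<=", 0.80),
}

# Metrics as they might be gathered during the steady-state phase of a run.
metrics = {
    "usecase_latency_ms": 28.4,
    "frames_per_second": 30.0,
    "ddr_bandwidth_utilisation": 0.86,
}

COMPARATORS = {"<=": operator.le, ">=": operator.ge}


def summarise(kpis, metrics):
    """Return {metric: (value, relation, limit, passed)} for every KPI."""
    report = {}
    for name, (relation, limit) in kpis.items():
        value = metrics[name]
        passed = COMPARATORS[relation](value, limit)
        report[name] = (value, relation, limit, passed)
    return report


report = summarise(kpis, metrics)
for name, (value, relation, limit, passed) in report.items():
    status = "PASS" if passed else "FAIL"
    print(f"{name}: {value} {relation} {limit} -> {status}")
```

The same function serves the use case and the platform flow; only the KPI and metric dictionaries differ.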

Also read:

Sondrel explains the 10 steps to model and design a complex SoC

Build a Sophisticated Edge Processing ASIC FAST and EASY with Sondrel

Sondrel Creates a Unique Modelling Flow to Ensure Your ASIC Hits the Target

 


Power Transistor Modeling for Converter Design

by Tom Simon on 04-18-2022 at 10:00 am

Magwel PTM Field Viewer

Voltage converters and regulators are a vital part of pretty much every semiconductor-based product. They play an outsized role in mobile devices such as cell phones where there are many subsystems operating at different voltages with different power needs. Many portable devices rely on Lithium Ion batteries whose output voltage can vary from 4.2 volts down to 3.0 volts as they discharge. The power distribution systems in these devices need to operate with extremely high efficiency to meet battery life requirements.

As an example, a typical cell phone contains CPU cores, DRAM, RF radio, display backlight, camera, audio codec and other subsystems which need voltages ranging from 0.8V to ~4V – all from a single voltage source, the lithium ion battery. A combination of buck and boost converters is needed to precisely produce all these voltage levels from the battery regardless of its state of charge. Because switching-based converters can be noisy, low drop out (LDO) voltage regulators are also needed for several power supplies.

In the converters and regulators listed above, one of the most important elements is the pass device, which handles all the current to the load and controls the final output voltage. Pass devices can be made from a wide range of materials and can be designed as bipolar or MOS devices. Regardless of material and device type, the design of the pass device has a major effect on power loss and thermal behavior.


Power devices typically have many fingers and large channel widths (W). Connections to the semiconductor layers are made through a complex interconnection of metal and via layers that connect all the active areas in parallel. The size and topology of these devices leads to complex electrical behaviors. There are a large number of gate/base contacts which often have maze like connections to the external device terminal(s). The same is often true for connections to the source and drain, or emitter and collector.

These complex metal connections contribute to device resistance and can also introduce non-uniform delays within the device. To model this electrical behavior, designers need tools like the Magwel Power Transistor Modeler (PTM) suite. Traditional circuit extractors are not designed to deal with the wide metal, large via arrays and unusual shapes found in power devices. Likewise, point-to-point resistance values are needed, along with efficient and accurate ways to model the channel.

Magwel’s PTM tools use a solver based extractor that is optimized for the complex metal shapes and vias found in power devices. PTM can automatically identify the channel and will segment it according to user settings to create multiple parallel devices that can be used for full device modeling.

Usually, when power devices are used for switching converters, the active area can be modeled effectively as a linear resistance based on the foundry device model and operating conditions, such as temperature and stimulus. However, Low Drop Out (LDO) regulators are often used to get as much working voltage as possible out of a discharging battery. The lower the drop-out voltage, the longer the LDO regulator can use a battery and the less overall power is wasted in internal resistance and converted to heat. For this reason, LDO regulator pass device performance is extremely important, necessitating the use of more sophisticated device modeling for the active region. Magwel’s PTM has the option to use non-linear models to accurately predict the behavior of the active area during LDO power device operation.
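To see why drop-out voltage matters, consider the standard first-order relations for a linear regulator: it stays in regulation while the input is at least the output plus the drop-out voltage, the pass device dissipates (Vin − Vout) × Iload, and efficiency is roughly Vout / Vin when quiescent current is ignored. The sketch below applies these to a battery discharging from 4.2 V to 3.0 V. The 2.8 V rail, 100 mA load and the two drop-out values are assumptions for illustration, not data from any particular device.

```python
def ldo_operating_point(v_in, v_out, i_load, v_dropout):
    """Return (in_regulation, pass_device_power_w, efficiency).

    Quiescent (ground-pin) current is ignored, so efficiency is simply
    v_out / v_in while the regulator is in regulation.
    """
    in_regulation = v_in >= v_out + v_dropout
    if not in_regulation:
        return False, None, None
    power = (v_in - v_out) * i_load
    return True, power, v_out / v_in


V_OUT = 2.8    # regulated rail in volts (assumed)
I_LOAD = 0.1   # load current in amps (assumed)

for v_dropout in (0.15, 0.30):
    for v_bat in (4.2, 3.6, 3.0):
        ok, power, eff = ldo_operating_point(v_bat, V_OUT, I_LOAD, v_dropout)
        if ok:
            print(f"dropout={v_dropout:.2f} V, battery={v_bat:.1f} V: "
                  f"pass device dissipates {power * 1000:.0f} mW, "
                  f"efficiency {eff:.0%}")
        else:
            print(f"dropout={v_dropout:.2f} V, battery={v_bat:.1f} V: "
                  f"out of regulation")
```

With these assumed numbers, the lower drop-out part keeps regulating all the way down to a 3.0 V battery, while the higher drop-out part falls out of regulation before the battery is fully discharged.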

Another important aspect of power transistor modeling is the stimulus used at the external device pins for simulation. Magwel’s PTM offers a wide range of easy-to-use options for this. The most basic method is to simply set a constant voltage or current. The user can select the operating temperature for each simulation. There is also a voltage controlled voltage source (VCVS) mode for modeling the device pin voltage as a proportional function of a probe voltage in the device. This is exceptionally useful for working with circuits that have replica or sensing devices.

With the inputs described above, PTM can provide voltage values at every point in the device. Designers can also view the current density throughout each layer. Thresholds for current density can be set to flag potential electromigration violations. In addition to output reports and exportable csv files, users can view a field view for full visualization of the device for easy debugging and optimization.
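The idea of flagging current density against a threshold can be sketched in a few lines. This is a generic illustration, not PTM's interface or report format: the segment names, dimensions, currents and the limit are all hypothetical, and current density is taken simply as current divided by the metal cross-section.

```python
def flag_em_violations(segments, j_limit):
    """Return (name, density) for segments whose current density exceeds j_limit.

    segments: iterable of (name, current_mA, width_um, thickness_um)
    j_limit:  allowed current density in mA per square micron
    """
    violations = []
    for name, current_ma, width_um, thickness_um in segments:
        density = current_ma / (width_um * thickness_um)
        if density > j_limit:
            violations.append((name, density))
    return violations


# Hypothetical metal segments of a power device.
segments = [
    ("drain_bus_m3", 40.0, 20.0, 1.0),     # 2.0 mA/um^2
    ("source_strap_m2", 12.0, 2.0, 0.5),   # 12.0 mA/um^2
]

violations = flag_em_violations(segments, j_limit=5.0)
for name, density in violations:
    print(f"{name}: {density:.1f} mA/um^2 exceeds limit")
```

A narrow strap carrying a modest current can violate the limit while a wide bus carrying much more does not, which is why a per-segment view of the device is useful.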

Magwel’s PTM is used by many leading converter circuit design companies. Silicon validation results show correlation within a percent or two. Designers can make provisional changes to the device geometry and pin locations and quickly rerun simulations without iterating back through the layout tools to perform what-if analysis when optimizing the design. More information on the PTM suite of tools is available on the Magwel website.


Bespoke Silicon is Coming, Absolutely!

by Daniel Nenni on 04-18-2022 at 6:00 am


It was nice to be at a live conference again. DesignCon was held at the Santa Clara Convention Center, my favorite location, and to me the crowd looked back to normal. The sessions I attended were full and the show floor was busy. Masks and vaccinations were not required, so maybe that was it. Or was there pent-up demand to get re-engaged with the semiconductor ecosystem? Either way it was a great conference, absolutely.

SemiWiki stalwart companies Cadence, Ansys, Siemens, and Samtec were all there. We will have more coverage of their talks over the next week or two. SemiWiki newcomer Xpeedic was there and we will be covering their new announcement as well.

The first panel I attended in the Chip Head Theater was titled Bespoke Silicon: How System Companies are Driving Chip Design. The panelists were John Lee, GM, Semiconductor, Electronics, Optics BU, Ansys; Rob Aitken, Arm Fellow and Director of Technology, Arm Research; and Prashant Varshney, Head of Product, Silicon Vertical, Microsoft Azure.

This panel was set up to explore the trend of system/software companies deciding they need semiconductor solutions that cannot be bought off the shelf. Some prominent examples of this are Meta, Amazon, Microsoft, and Google who are all defining and designing their own chips.  An understanding of what is driving this market trend also gives insights on how it impacts the technical demands on Ansys’ simulation/analysis products.

Why are these companies doing this? The background enabler is, of course, the internet and the pervasive digitalization of society and the economy. But more specifically, it is a confluence of advances in AI/ML algorithms together with semiconductor systems that have become big and complex and capable enough to actually move the needle for an entire business division. Take, for example, Meta’s vision for a VR-enabled future: it all depends critically on the technical capability of the optical headset as well as the power of the AI algorithms driving it – which itself requires a lot of silicon to execute.

Microsoft’s gaming division is only competitive to the degree that its Xbox can stay at the cutting edge of graphics processing. Amazon Web Services finds its cost structure is tied to the price, performance and power profile of the CPUs it uses to power its data centers, so it developed its proprietary Graviton2 microprocessor in collaboration with Arm.  There are very interesting business dynamics resulting from this that the panel explored.

At the lower, technical level this evolution is driven on the one hand by advances in AI/ML techniques, and on the other hand by advances in integration density with 3D-IC that have accelerated past reliance on just Moore’s Law. We see that the latest HPC products from AMD, Nvidia and Intel are all multi-die chiplet systems. The recent industry collaboration on the release of the UCIe spec indicates how seriously these companies take the 3D-IC revolution as an enabler for the systems they want to build. Not to mention that AI/ML algorithms are driving a leap in design sizes all on their own – see the wafer-scale engine from Cerebras, which is explicitly targeted at ML training.

What this means from the Ansys point of view is that they are being called on to analyze increasingly large and complex multi-die systems. That is where the analysis/signoff market is going. However, the technical challenge extends well beyond simple massive capacity (which makes a cloud strategy a must-have for EDA tools). Even more challenging is the emergence of new physical effects that need to be simulated. So, the 3D-IC problem is not just quantitatively bigger, it is also qualitatively different. We call this the multiphysics challenge of 3D-IC.

The primary new physics is thermal analysis, since heat dissipation is often the #1 limiting factor on these advanced designs (part of Cerebras’ secret sauce is how they manage to cool their ~15kW wafer). Of course, thermal analysis is not new, but it is new to most chip designers. It is an example of how chip, package, and PCB design are collapsing into a single design problem. Furthermore, thermal analysis screams out for a computational fluid dynamics simulation engine to model how the air flow and heatsink interact to set boundary conditions for the 3D-IC module. That’s another modeling physics pulled into the mix. And then there are the mechanical stress/warpage issues from having differential thermal expansion in various parts of the 3D-IC stack. Add a mechanical modeling engine to the mix.

One last example of new physics being jammed into the 3D-IC design problem space: electromagnetic analysis of high-speed signals. You see, what makes a 3D-IC integration fundamentally different from just placing two packaged chips next to each other on a PCB is that the inter-chip communication is very low-power and very high-bandwidth. If that can be done, then we can minimize the power/performance cost of going off-chip. But these interconnect traces absolutely require electromagnetic simulation for interference and coupling. How many digital designers are familiar with EM simulation?

Bottom line: The manufacturing process allows us to produce very fine-grain electrical integration of multiple chips. But the success of this market, which is driven in large part by bespoke silicon projects, is gated by the ability of designers to model, simulate, and verify the electrothermal interactions. I believe that is where the true bottleneck to adoption lies, and something Ansys tools are uniquely positioned to alleviate.

Also read:

Webinar Series: Learn the Foundation of Computational Electromagnetics

5G and Aircraft Safety: Simulation is Key to Ensuring Passenger Safety – Part 4

The Clash Between 5G and Airline Safety


Quantum Computing Trends

by Ahmed Banafa on 04-17-2022 at 10:00 am


Quantum Computing is the area of study focused on developing computer technology based on the principles of quantum theory. Tens of billions of dollars of public and private capital are being invested in quantum technologies. Countries across the world have realized that quantum technologies can be a major disruptor of existing businesses, and they collectively invested $24 billion in quantum research and applications in 2021 [1].

A Comparison of Classical and Quantum Computing

Classical computing relies, at its ultimate level, on principles expressed by Boolean algebra. Data must be processed in an exclusive binary state at any point in time, in what we call bits. While the time that each transistor or capacitor needs to be in either 0 or 1 before switching states is now measurable in billionths of a second, there is still a limit as to how quickly these devices can be made to switch state.

As we progress to smaller and faster circuits, we begin to reach the physical limits of materials and the threshold for classical laws of physics to apply. Beyond this, the quantum world takes over. In a quantum computer, a number of elemental particles such as electrons or photons can be used, with either their charge or polarization acting as a representation of 0 and/or 1. Each of these particles is known as a quantum bit, or qubit, and the nature and behavior of these particles form the basis of quantum computing [2]. Classical computers use transistors as the physical building blocks of logic, while quantum computers may use trapped ions, superconducting loops, quantum dots or vacancies in a diamond [1].

Physical vs Logical Qubits

When discussing quantum computers with error correction, we talk about physical and logical qubits. Physical qubits are the actual qubits implemented in the hardware of a quantum computer, whereas logical qubits are groups of physical qubits that we use as a single qubit in our computation to fight noise and enable error correction.

To illustrate this, let’s consider an example of a quantum computer with 100 qubits. Let’s say this computer is prone to noise. To remedy this, we can use multiple qubits to form a single, more stable qubit. We might decide that we need 10 physical qubits to form one acceptable logical qubit. In this case we would say our quantum computer has 100 physical qubits which we use as 10 logical qubits.
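The arithmetic in this example is a simple integer division, sketched below. The function name is made up for illustration, and the ratio of 10 physical qubits per logical qubit is the illustrative figure from the example above, not a property of any real error-correcting code.

```python
def logical_qubits(physical_qubits, physical_per_logical):
    """Number of whole logical qubits that can be formed."""
    return physical_qubits // physical_per_logical


# The example from the text: 100 physical qubits, 10 per logical qubit.
print(logical_qubits(100, 10))

# Leftover physical qubits that do not fill a complete group are unused.
print(logical_qubits(105, 10))
```

When reading a qubit estimate, it helps to ask which of the two numbers is being quoted, since they can differ by whatever overhead factor the error correction requires.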

Distinguishing between physical and logical qubits is important. There are many estimates as to how many qubits we will need to perform certain calculations, but some of these estimates talk about logical qubits and others talk about physical qubits. For example, to break RSA cryptography we would need thousands of logical qubits but millions of physical qubits.

Another thing to keep in mind: in a classical computer, compute power increases linearly with the number of transistors and clock speed, while in a quantum computer compute power increases exponentially with the addition of each logical qubit [4].

Quantum Superposition and Entanglement

The two most relevant aspects of quantum physics are the principles of superposition and entanglement.

Superposition: Think of a qubit as an electron in a magnetic field. The electron’s spin may be either in alignment with the field, which is known as a spin-up state, or opposite to the field, which is known as a spin-down state. According to quantum law, the particle can enter a superposition of states, in which it behaves as if it were in both states simultaneously. Each qubit utilized could take a superposition of both 0 and 1. Where a 2-bit register in an ordinary computer can store only one of four binary configurations (00, 01, 10, or 11) at any given time, a 2-qubit register in a quantum computer can store all four numbers simultaneously, because each qubit represents two values. As more qubits are added, the capacity expands exponentially.
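The 2-qubit register can be made concrete with a small state-vector calculation. The sketch below is a plain-Python illustration, not any quantum library's API: it starts the register in 00 and applies a Hadamard gate (the standard gate that turns 0 into an equal superposition of 0 and 1) to each qubit. The function name and the qubit ordering are choices made for this example.

```python
import math


def apply_hadamard(state, target, num_qubits):
    """Apply a Hadamard gate to one qubit of a state vector.

    state is a list of 2**num_qubits real amplitudes; qubit 0 is the
    leftmost bit of the basis-state label.
    """
    h = 1.0 / math.sqrt(2.0)
    new_state = [0.0] * len(state)
    bit = 1 << (num_qubits - 1 - target)   # mask selecting the target qubit
    for index, amp in enumerate(state):
        if amp == 0.0:
            continue
        low, high = index & ~bit, index | bit
        if index & bit:    # target is 1: |1> -> (|0> - |1>) / sqrt(2)
            new_state[low] += h * amp
            new_state[high] -= h * amp
        else:              # target is 0: |0> -> (|0> + |1>) / sqrt(2)
            new_state[low] += h * amp
            new_state[high] += h * amp
    return new_state


num_qubits = 2
state = [1.0, 0.0, 0.0, 0.0]   # the register starts in |00>
for q in range(num_qubits):
    state = apply_hadamard(state, q, num_qubits)

# Probability of reading each 2-bit value if the register is measured.
probabilities = {format(i, "02b"): round(a * a, 6) for i, a in enumerate(state)}
print(probabilities)
```

All four configurations carry an amplitude at once, which is the sense in which the register holds them simultaneously. A measurement still returns only one of the four values, so algorithms have to be designed to make the useful answer the likely one.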

Entanglement: Particles that have interacted at some point retain a type of connection and can become entangled with each other in pairs, so that their states are correlated. Knowing the spin state of one entangled particle – up or down – allows one to know that the spin of its mate is in the opposite direction. These correlations hold no matter how great the distance between the particles and appear instantaneously on measurement, although they cannot be used to send information faster than the speed of light. The particles remain entangled as long as they are isolated. Taken together, quantum superposition and entanglement create an enormously enhanced computing power [3].

Quantum computers fall into four categories [1]

  1. Quantum Emulator/Simulator
  2. Quantum Annealer
  3. Noisy Intermediate Scale Quantum (NISQ)
  4. Universal Quantum Computer – which can be a Cryptographically Relevant Quantum Computer (CRQC)

Quantum Emulator/Simulator

These are classical computers that you can buy today that simulate quantum algorithms. They make it easy to test and debug a quantum algorithm that someday may be able to run on a Universal Quantum Computer (UQC). Since they don’t use any quantum hardware, they are no faster than standard computers.
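One reason emulators are limited to small qubit counts is memory. A rough estimate, assuming the simulator stores the full state vector with one complex amplitude per basis state and 16 bytes per amplitude (two 64-bit floats), is sketched below. Other simulation methods exist that trade memory for time or accuracy, so this is an upper-bound style illustration rather than a statement about any specific product.

```python
BYTES_PER_AMPLITUDE = 16   # one complex number as two 64-bit floats (assumed)


def statevector_bytes(num_qubits):
    """Memory needed to hold all 2**n amplitudes of an n-qubit state."""
    return (2 ** num_qubits) * BYTES_PER_AMPLITUDE


for n in (10, 30, 50):
    gib = statevector_bytes(n) / 2 ** 30
    print(f"{n} qubits: {gib:,.6f} GiB")
```

Each added qubit doubles the requirement, which is the classical-side view of the exponential growth described earlier.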

Quantum Annealer

A special purpose quantum computer designed to only run combinatorial optimization problems, not general-purpose computing, or cryptography problems. While they have more physical Qubits than any other current system they are not organized as gate-based logical qubits. Currently this is a commercial technology in search of a future viable market.

Noisy Intermediate-Scale Quantum (NISQ) Computers

Think of these as prototypes of a Universal Quantum Computer – with several orders of magnitude fewer qubits. They currently have 50-100 qubits, limited gate depths, and short coherence times. As they are several orders of magnitude short of the qubits required, NISQ computers cannot perform any useful computation. However, they are a necessary phase in the learning, especially to drive total system and software learning in parallel with the hardware development. Think of them as the training wheels for future universal quantum computers.

Universal Quantum Computers / Cryptographically Relevant Quantum Computers (CRQC)

This is the ultimate goal. If you could build a universal quantum computer with fault tolerance (i.e., millions of error-corrected physical qubits resulting in thousands of logical qubits), you could run quantum algorithms in cryptography, search and optimization, quantum systems simulations, and linear equations solvers.

Post-Quantum / Quantum-Resistant Codes

New cryptographic systems would be secure against both quantum and conventional computers and could interoperate with existing communication protocols and networks. The symmetric key algorithms of the Commercial National Security Algorithm (CNSA) Suite were selected to be secure for national security systems usage even if a CRQC is developed. Cryptographic schemes that commercial industry believes are quantum-safe include lattice-based cryptography, hash trees, multivariate equations, and super-singular isogeny elliptic curves [1].

Difficulties with Quantum Computers [2]

  • Interference – During the computation phase of a quantum calculation, the slightest disturbance in a quantum system (say a stray photon or wave of EM radiation) causes the quantum computation to collapse, a process known as de-coherence. A quantum computer must be totally isolated from all external interference during the computation phase.
  • Error correction – Given the nature of quantum computing, error correction is ultra-critical – even a single error in a calculation can cause the validity of the entire computation to collapse.
  • Output observance – Closely related to the above two, retrieving output data after a quantum calculation is complete risks corrupting the data.

Ahmed Banafa, author of the books:

Secure and Smart Internet of Things (IoT) Using Blockchain and AI

Blockchain Technology and Applications

Quantum Computing

References

1.     https://www.linkedin.com/pulse/quantum-technology-ecosystem-explained-steve-blank/?

2.     https://www.bbvaopenmind.com/en/technology/digital-world/quantum-computing-and-ai/

3.     https://phys.org/news/2022-03-technique-quantum-resilient-noise-boosts.html

4.     https://thequantuminsider.com/2019/10/01/introduction-to-qubits-part-1/

Also read:

Facebook or Meta: Change the Head Coach

The Metaverse: A Different Perspective

Your Smart Device Will Feel Your Pain & Fear