
Altair at #59DAC with the Concept Engineering Acquisition

by Daniel Nenni on 07-07-2022 at 10:00 am


The Design Automation Conference has been the pinnacle for semiconductor design for almost 60 years. This year will be my 38th DAC and I can’t wait to see everyone again. One of the companies I will be spending time with this year is Altair.

Last month Altair acquired our friends at Concept Engineering, the leading provider of electronic system visualization software. Prior to that, Altair acquired our friends at Runtime Design Automation. The Runtime people are still at Altair, which is a very good sign. Before the Runtime acquisition I had little contact with Altair, but over the last two years I have developed a great deal of respect for what they have accomplished.

Altair will be at DAC this year in a very big way, which I greatly appreciate. Here is a quick preview from their DAC landing page:

Compute Intelligence for Breakthrough Results: Visit us at #59DAC!

Join us to learn more about Altair’s world-class, high-throughput solutions for every step of the semiconductor design process (and more!).

Altair solutions are used by leading companies all over the globe to keep EDA, HPC, and cloud compute resources running smoothly and efficiently. We care about the same critical components you do — including cores, licenses, and emulation — and know that even the most capable hardware can’t do its job without the right tools to enable top performance and high throughput.


Rapidly Advancing Electronics: Altair Solutions Enable Rapid Growth for Wired Connectivity Leader Kandou

Faced with rapid growth, the team at Kandou needed to manage workloads and licensing for their expensive EDA tools. The team chose Altair® Accelerator™ for job scheduling and Altair® Monitor™ for real-time license monitoring and management, resulting in improved product development, getting to market faster, and saving money on expensive EDA tools.


Inphi Corporation Speeds Up Semiconductor Design with Altair Accelerator

The team at Inphi understands the importance of HPC and EDA software performance optimization better than most. They evaluated several competing solutions before selecting Altair Accelerator, which stood out among the competition for superior performance and Altair’s reputation for excellent customer service.


CEA Speeds Up EDA for Research: Powering R&D at the French Alternative Energies and Atomic Energy Commission

CEA Tech, the Grenoble-based technology research unit for the French Alternative Energies and Atomic Energy Commission (CEA) is a global leader in miniaturization technologies that enable smart digital systems and secure, energy-efficient solutions for industry.


Using I/O Profiling to Migrate and Right-size EDA Workloads in Microsoft Azure

Semiconductor companies are taking advantage of Microsoft Azure HPC infrastructures for their complex electronic design automation (EDA) software. When one of the largest semiconductor companies asked for help using Azure to run its EDA workloads, Microsoft teamed up with Altair. This presentation outlines how Microsoft used Altair Breeze™ to diagnose I/O patterns, choose the workflow segments best suited for the cloud, and right-size the Azure infrastructure. The result was better performance and lower costs for our semiconductor customer.


Measuring Success in Semiconductor Design Optimization: What Metrics Matter?

There are few fields in the world as competitive as semiconductor design exploration and verification. Teams might run tens of millions of compute jobs in a single day on their quest to bring new chips to market first, requiring vast quantities of compute and, increasingly, cloud and emulator resources, as well as expensive EDA licenses, and the all-important resource, time. In this roundtable, experts will discuss the license-, job-, compute-, and host-based metrics, highlighting the optimization strategies that edge out the competition and drive up profitability.


I hope to see you there!

Also Read:

Future.HPC is Coming!

Six Essential Steps For Optimizing EDA Productivity

Latest Updates to Altair Accelerator, the Industry’s Fastest Enterprise Job Scheduler


CXL Verification. A Siemens EDA Perspective

by Bernard Murphy on 07-07-2022 at 6:00 am


Amid the alphabet soup of inter-die/chip coherent access protocols, CXL is gaining a lot of traction. Originally proposed by Intel for cross-board and cross-backplane connectivity to accelerators of various types (GPU, AI, warm storage, etc.), a who’s who of systems and chip companies now sits on the board, joined by an equally impressive list of contributing members. The standard enables coherent memory sharing between a central processor/CPU cluster, with its own cache-coherent memory subsystem, and the memory/caching on each of multiple accelerator systems. This greatly simplifies life for software developers since memory consistency is managed in hardware. No need to worry about this in software; it’s all just one unified memory model, whether software is running on the processor or on an accelerator.

CXL and PCIe

As an Intel-initiated standard, CXL layers on top of PCIe (as does NVMe, but that’s another story). PCIe already provides the physical interface standard, also the protocols and traffic management for IO communication. CXL builds on top of this for memory and cache communication between devices. This makes it a complex protocol to verify out of the gate, requiring PCIe compliance just as a starting point.

CXL layers three protocols on top of PCIe:

  • CXL.io for configuration and a variety of administrative functions
  • CXL.cache providing peripherals with low-latency access to host memory
  • CXL.mem allowing the host to coherently access memory attached to CXL devices

The coherency requirement adds more complications, such as compliance with the associated coherency protocol (e.g., MESI). Also add in Integrity and Data Encryption (IDE) to ensure secure connection and computing. Put all of this together and it is clear that CXL protocol checking is a very complex beast, for which a well-defined VIP would be enormously helpful.
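
To give a feel for what a coherency checker has to track, here is a minimal sketch of MESI state transitions for one cache line in one cache, plus the kind of invariant a checker might assert across caches. It is a textbook simplification, not the CXL specification and not the Questa VIP API; the event names and the `others_have_copy` flag are illustrative assumptions.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state: State, event: str, others_have_copy: bool = False) -> State:
    """Return the next MESI state of one cache line in one cache.

    Event names are hypothetical labels for this sketch:
      local_read / local_write   - access by the agent that owns this cache
      remote_read / remote_write - snooped access by another agent
    """
    if event == "local_read":
        if state is State.INVALID:
            # A read miss fills the line as Shared if another cache holds it.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state
    if event == "local_write":
        # Any local write ends in Modified (other copies get invalidated).
        return State.MODIFIED
    if event == "remote_read":
        if state in (State.MODIFIED, State.EXCLUSIVE):
            # Data is supplied or written back; the line is now shared.
            return State.SHARED
        return state
    if event == "remote_write":
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


def check_coherent(states: list[State]) -> bool:
    """Invariant: a Modified or Exclusive copy excludes every other valid copy."""
    valid = [s for s in states if s is not State.INVALID]
    owners = [s for s in valid if s in (State.MODIFIED, State.EXCLUSIVE)]
    return not owners or len(valid) == 1
```

A real protocol checker also has to cover transient states, ordering, and the link-level rules inherited from PCIe, which is a large part of why a packaged VIP is attractive.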

Questa VIP for CXL

Siemens EDA have built a Questa VIP to address this need. QVIP can model any or all of the CXL-compliant components in a system, including IDE, generating fully compliant stimulus in host, device, or passive device roles. The VIP comes with a comprehensive verification plan covering simple and complex scenarios, along with predefined sequences to generate those scenarios. Checkers are provided to validate compliance with the coherency protocol of choice and to validate data integrity through cache reads, writes, and updates.

When a problem is found, possibly elsewhere in the system, the VIP provides detailed logging, from both device to host and from host to device. This logs all information on the CXL interconnect by timestamp, which simplifies tracking problems back to transactions. It is also possible to enable detailed debug messages. Once you know roughly where you want to look you can trigger detailed transaction information in both directions.

Finally, for coverage, the testplan supplied with the VIP is designed to guide you to high coverage over your CXL compliance testing. Table entries define the main test objective, and each objective comes with predefined coverpoints. You can tweak weights for these as appropriate to your verification goals. So, it’s an all-in-one package: VIP, testplan, debug support, and coverage. You just have to dial in your menu choices.

CXL looks likely to be the multi-chip/chiplet solution of choice for coherent memory sharing. This means that you should expect to see this play a larger role in verification planning. If you want to learn more about the Questa Verification IP solution, click HERE.


What Quantum Means for Electronic Design Automation

by Kelly Damalou and Kostas Nikellis on 07-06-2022 at 10:00 am


In 1982, Richard Feynman, a theoretical physicist and Nobel Prize winner, proposed the initial quantum computer; Feynman’s quantum computer would have the capacity to facilitate traditional algorithms and quantum circuits with the goal of simulating quantum behavior as it would have occurred in nature. The systems Feynman wanted to simulate could not be modeled by even a massively parallel classical computer. To use Feynman’s words, “Nature isn’t classical, dammit, and if you want to make a simulation of nature, you’d better make it quantum mechanical, and by golly it’s a wonderful problem, because it doesn’t look so easy.”

Today, companies like Google, Amazon, Microsoft, IBM, and D-Wave are working to bring Feynman’s ambitious theories to life by designing quantum hardware processing units to address some of the world’s most complicated problems—problems it would take a traditional computer months or even years to solve (if ever). They’re tackling cryptography, blockchain, chemistry, biology, financial modeling, and beyond.

The scalability of their solutions relies on a growing number of qubits. Qubits are the building blocks of quantum processing; they’re similar to bits, the building blocks of traditional processing units. IBM’s roadmap for scaling quantum technology shows the 27-qubit IBM Q System One release in 2019, and less than 5 years later, they expect to release the next family of IBM Quantum systems at 1,121 qubits.

Achieving a sufficient level of qubit quality is the main challenge in making large-scale quantum computers possible. Today, error correction is a critical operation in quantum systems, and it preoccupies the vast majority of qubits in each quantum processor. Improving fault tolerance in quantum computing requires error correction that’s faster than error occurrence. Beyond error correction, there are plenty of challenges on the road to designing a truly fault-tolerant quantum computer with exact, mathematically accurate results. Qubit fidelity, qubit connectivity, granularity of phase, probability of amplitude, and circuit depth are all important considerations in this pursuit.

While quantum computing poses a major technological leap forward, there are similarities between quantum designs and traditional IC designs. Those similarities allow the electronic design automation (EDA) industry to build on existing knowledge and experience from IC workflows to tackle quantum processing unit design.

Logic Synthesis in Quantum and RFIC Designs

In quantum designs on superconductive silicon, the basic building block is the Josephson Junction. In radio-frequency integrated circuit (RFIC) chips, that role is played by transistors. In both situations, these fundamental building blocks are used to build gates that ultimately form qubits in quantum and bits in RFIC.

Image source: “An Introduction to the Transmon Qubit for Electromagnetic Engineers”, T. E. Roth, R. Ma, W. C. Chew, 2021, arXiv:2106.11352 [quant-ph]

Caption: From the Josephson junction to the quantum processor

In RFICs, the state of a bit can be read with certainty—it’s either 0 or 1. Determining the state of a qubit is much more complicated. Yet, it’s a critical step for accurate calculations. Due to the peculiar laws of quantum mechanics, qubits can exist in more than one state at the same time—a phenomenon called superposition. Superposition allows a qubit to assume a value of 0, 1, or a linear combination of 0 and 1. It’s instrumental to the operations of a quantum computer because it provides exponential speedups in memory and processing. The quantum state is represented inside the quantum hardware, but when qubits are measured, the quantum computer reports out a 0 or a 1 for each.
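
As a small illustration of the superposition and measurement behavior described above, the sketch below represents a single qubit by two complex amplitudes and derives the probability of reading out 0 or 1. This is the standard textbook model, not anything specific to a particular quantum processor or to Ansys tools; the function names are made up for the example.

```python
import math
import random


def normalize(alpha: complex, beta: complex) -> tuple[complex, complex]:
    """Scale the amplitudes of |0> and |1> so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    if norm == 0:
        raise ValueError("amplitudes cannot both be zero")
    return alpha / norm, beta / norm


def measurement_probabilities(alpha: complex, beta: complex) -> tuple[float, float]:
    """Probability of reading 0 and of reading 1 for the state alpha|0> + beta|1>."""
    a, b = normalize(alpha, beta)
    return abs(a) ** 2, abs(b) ** 2


def measure(alpha: complex, beta: complex, rng=random) -> int:
    """Simulate one measurement: the result is always a plain 0 or 1."""
    p0, _ = measurement_probabilities(alpha, beta)
    return 0 if rng.random() < p0 else 1
```

For an equal superposition, `measurement_probabilities(1, 1)` gives a 50/50 split, yet every individual call to `measure` still returns a definite 0 or 1, which mirrors the readout behavior described in the text.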

Entanglement is another key quantum mechanical property that describes how the state of one qubit can depend on the state of another. In other words, observing one qubit can reveal the state of its unobserved pair. Unfortunately, observation (i.e., measurement) of the state of a qubit comes at a cost. When measuring, the quantum system is no longer isolated, and its coherence—a definite phase relation between different states—collapses. This phenomenon, quantum decoherence, is roughly described as information loss. The decoherence mechanism is heavily influenced by self and mutual inductance among qubits, which must be modeled with very high accuracy to avoid chip malfunctions.

Quantum processors are frequently implemented using superconductive silicon because it’s lower in cost and easy to scale. Further, it offers longer coherence times compared to other quantum hardware designs. In this implementation, integrated circuits (ICs) are designed using traditional silicon processes and cooled down to temperatures very close to zero Kelvin. Traditional electromagnetic solvers struggle with the complexity and size of quantum systems, so simulation providers need to step up their capacity to meet the moment.

Image credits: IBM

Caption: An IBM quantum computer

Modeling Inductance in Quantum and RFIC Designs

It’s worth noting that superconductors are not new, exotic materials. Common metals like niobium or aluminum are found in superconducting applications. Once these metals are cooled down to a few millikelvin, using a dilution refrigerator, a portion of their electrons do not flow as they normally would. Instead, they form Cooper pairs. This superconductive current flow results in new electromagnetic effects that need to be accurately modeled. For example, inductance is no longer simply the sum of self and mutual inductance. It includes an additional term, called kinetic inductance:

L_total = L_self + L_mutual + L_kinetic

This summation is not as straightforward as it looks. Kinetic inductance has drastically different properties than self and mutual inductance, which are frequency independent and temperature dependent. In a similar fashion, the minimal resistance in a superconductor has different properties than the ohmic resistance of conductors (i.e., proportional to the square of frequency). Electromagnetic modeling tools must account for these physical phenomena both accurately and efficiently.

Scale also poses challenges for electromagnetic solvers. Josephson Junctions, the basic building block of the physical qubit, combine with superconductive loops to form qubit circuits. The metal paths form junctions and loops with dimensions of just a few nanometers. While qubits only need a tiny piece of layout area, they must be combined with much larger circuits for various operations (e.g., control, coupling, measurement). The ideal electromagnetic modeling tool for superconductive hardware design will need to maintain both accuracy and efficiency for layouts ranging from several millimeters down to a few nanometers to be beneficial in all stages of superconductive quantum hardware design.

Image source: “Tunable Topological Beam Splitter in Superconducting Circuit Lattice”, L. Qi, et.al., Quantum Rep. 2021, 3(1), 1-12

Caption: An indicative quantum circuit

 

Looking Forward (or backward – It’s hard to tell with Quantum)

Designers in the quantum computing space need highly accurate electromagnetic models for prototyping and innovation. Simulation providers need to rise to the challenge of scaling to accommodate large, complex designs that push the boundaries of electromagnetic solvers with more and more qubits.

Ansys, the leader in multiphysics simulation, recently launched a new high-capacity, high-speed electromagnetic solver for superconductive silicon. The new solver, RaptorQu, is designed to interface seamlessly with existing silicon design flows and processes. Thus far, our partners are particularly pleased with their ability to accurately predict the performance of their quantum computing circuits.

Caption: Correlation of RaptorQu with HFSS on inductance (left) and resistance (right) for a superconductive circuit

Interested? For updates, keep an eye on our blog.

Dr. Kostas Nikellis, R&D Director at Ansys, Inc., is responsible for the evolution of the electromagnetic modeling engine for high speed and RF SoC silicon designs. He has a broad background in electromagnetic modeling, RF and high-speed silicon design, with several patents and publications in these areas. He joined Helic, Inc. in 2002, and served as R&D Director from 2016 to 2019, when the company was acquired by Ansys, Inc. Dr. Nikellis received his diploma and PhD in Electrical and Computer Engineering in 2000 and 2006 respectively, both from the National Technical University of Athens and his M.B.A. from University of Piraeus in 2014.

Kelly Damalou is Product Manager for the Ansys on-chip electromagnetic simulation portfolio. For the past 20 years she has worked closely with leading semiconductor companies, helping them address their electromagnetic challenges. She joined Ansys in 2019 through the acquisition of Helic, where, since 2004 she held several positions both in Product Development and Field Operations. Kelly holds a diploma in Electrical Engineering from the University of Patras, Greece, and an MBA from the University of Piraeus, Greece.

Also Read:

The Lines Are Blurring Between System and Silicon. You’re Not Ready.

Multiphysics, Multivariate Analysis: An Imperative for Today’s 3D-IC Designs

A Different Perspective: Ansys’ View on the Central Issues Driving EDA Today


Multi-FPGA Prototyping Software – Never Enough of a Good Thing

by Daniel Nenni on 07-06-2022 at 8:00 am


Building a multi-FPGA prototype for SoC verification is complex, with many interdependent parts, and is “always on a clock”. The best multi-FPGA prototype implementation is worthless if it’s not up and running early in the SoC design cycle, where it offers the highest verification ROI in terms of minimizing the cost of bug fixes and accelerating SoC time-to-market. So, any automation software that enables a more accurate, higher-performing prototype implementation in less time should be warmly welcomed by verification teams prototyping large SoCs.

There are at least three pertinent challenges to the implementation of multi-FPGA prototypes:

  1. Cutting large SoC designs into blocks that will “fit” into each FPGA of a multi-FPGA prototyping platform,
  2. Assuring the overall timing integrity of the multi-FPGA prototype when all the FPGAs are connected together, and
  3. Managing the trade-off between the scarcity of FPGA I/O pins that limits the amount of logic in a partition “cut” when the design is spread across several FPGAs, and the prototype performance.

Adding to these prototype implementation challenges are second-order challenges, like connecting thousands of debug probes, which consumes FPGA connectivity and impacts utilization, and connecting to real-world target systems, which consumes FPGA connectivity and FPGA I/O. Both affect how easy, or difficult, it is to compile all the FPGAs into a multi-FPGA prototype in an acceptable amount of time with manageable effort. The tighter you pack the FPGAs (higher utilization), the harder it is for the FPGA compiler tools to find a place-and-route solution that meets timing targets, and the longer they take to complete. But we’ll defer discussion of these challenges to a future blog.

Automation tools for partitioning large SoCs for multi-FPGA prototyping should offer a spectrum of “level-of-automation”, from heavily-assisted partitioning, where the user chooses to “guide” the partitioning process with specific design knowledge that will enable a specific partitioning result, to fully automatic partitioning, where the user kicks off a partition run and goes for coffee while the partitioner does its thing. The basis for choosing the level of automation may be as simple as project schedule, where the designer wants to get to a working multi-FPGA prototype in a hurry and is willing to sacrifice prototype performance for fast compile times. Some SoC designs lend themselves to intuitive partitioning across multiple FPGAs, and the partition “cut lines” are easily imagined by the designer, while other designers choose higher automation due to the complexity of the critical timing paths, the prototype target performance, or an aggressive project schedule. Partitioning at the RTL level is great for early estimates of performance and of prototype fit into a multi-FPGA hardware platform, while designers who are heavily involved in partitioning may go straight to the gate level and skip RTL partitioning altogether.

As unimaginable as it may be today, early commercial multi-FPGA prototyping products did not include integrated timing analysis. Correct prototype timing in the early days was achieved by applying input stimulus to the prototype and observing the prototype output waveforms with debug probes, and then manually adjusting the relative edge timing of failing paths by inserting additional FPGA logic gates into those paths to fix hold-time violations. That approach quickly drew the wrath of early users and led FPGA prototype product providers to integrate timing analysis into the FPGA prototyping flow. Today’s complex multi-FPGA prototypes would be unmanageably difficult without system-level timing analysis that considers the prototype timing of multiplexed FPGA I/O pins and of interconnect cables between FPGAs.

The scarcity of FPGA I/O pins continues to be the bane of multi-FPGA prototyping, even with the new massively large prototyping FPGAs from Intel and Xilinx (up to 80M usable gates per FPGA), because the number of “natural partition cut” interconnections between SoC design partitions often far exceeds the available I/O pins on the FPGAs. The number of partition interconnections can run into the tens of thousands, whereas the number of available I/O pins on the latest prototyping FPGAs is only a few thousand (a maximum of 1,976 single-ended HP I/Os for the Xilinx VU19P, and a maximum of 2,304 user I/O pins for the Intel Stratix 10 GX 10M). Consequently, multi-FPGA prototyping must often resort to pin-multiplexing the FPGA I/O pins to implement a multi-FPGA prototype. The pin-multiplexing is usually accomplished with TDM soft IP that is implemented with FPGA logic gates, with the embedded multiplexors running at the upper limit of the FPGA’s switching speeds. Higher levels of pin-multiplexing (2:1, 4:1, etc.) expand the effective FPGA I/O but sacrifice prototype performance.
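
The trade-off between cut size, pin count and prototype speed can be put into rough numbers. The sketch below is a back-of-the-envelope model only: it assumes every pin is available for inter-FPGA signals, that supported TDM ratios are powers of two, and that each multiplexed signal gets one time slot per TDM frame. None of these assumptions come from S2C's tools, and the function names are made up for the example.

```python
import math


def required_tdm_ratio(cut_signals: int, available_pins: int) -> int:
    """Smallest N such that N:1 multiplexing fits the cut onto the pins."""
    if cut_signals <= 0 or available_pins <= 0:
        raise ValueError("signal and pin counts must be positive")
    return max(1, math.ceil(cut_signals / available_pins))


def round_up_to_power_of_two(ratio: int) -> int:
    """Assumed hardware constraint: only 1:1, 2:1, 4:1, 8:1, ... are supported."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return 1 << (ratio - 1).bit_length()


def estimated_signal_rate_mhz(tdm_clock_mhz: float, ratio: int) -> float:
    """Idealized rate at which each multiplexed signal is updated.

    Ignores synchronization and framing overhead, so real designs run slower.
    """
    return tdm_clock_mhz / ratio


if __name__ == "__main__":
    # Example: a 20,000-signal cut on an FPGA with 1,976 usable I/O pins.
    ratio = round_up_to_power_of_two(required_tdm_ratio(20_000, 1_976))
    print(f"TDM ratio {ratio}:1, "
          f"about {estimated_signal_rate_mhz(1000.0, ratio):.1f} MHz per signal "
          f"with a hypothetical 1 GHz TDM clock")
```

Even in this idealized model, every doubling of the multiplexing ratio halves the rate at which a cut signal can toggle, which is why partitioners work to keep the number of cut signals low.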

So, it goes without saying that more automation for multi-FPGA prototype implementation is a good thing, and it comes as no surprise that S2C would offer more of a good thing to its customers by continuing to advance its multi-FPGA prototyping software. S2C has recently announced a new release of its Prodigy Player Pro-7™ prototyping software for use with its Logic System and Logic Matrix families of multi-FPGA prototyping hardware platforms. These platforms, which have been in production for a while, incorporate the largest available prototyping FPGAs, like the Xilinx VU19P and the Intel Stratix 10 GX 10M.

According to S2C, the salient features of the new Player Pro-7 software include:

  • RTL Partitioning and Module Replication to support Parallel Design Compilation and reduce Time-to-Implementation
  • Pre/Post-Partition System-Level Timing Analysis for Increased Productivity
  • SerDes TDM Mode for Optimal Multi-FPGA Partition Interconnect and Higher Prototype Performance

The new Player Pro-7 software suite is organized into three separate tools: Player Pro-CompileTime™, Player Pro-DebugTime™, and Player Pro-RunTime™. The new releases of DebugTime and RunTime include upgrades for multi-FPGA debug probing and trace viewing, and for prototype hardware platform control and test, respectively, but the most significant multi-FPGA prototyping feature improvements are in the new CompileTime software.

Previous releases of the Player Pro software supported design partitioning at the gate-level, so RTL partitioning is a big step forward for S2C, simplifying the management of multi-core design implementations, and enabling an early assessment of the number of prototype FPGAs required.

For more information about S2C’s multi-FPGA prototyping hardware and software, please visit S2C’s web site at www.s2cinc.com.  Or, stop by S2C’s booth at the 2022 Design Automation Conference from July 11th to July 13th at the Moscone Center in San Francisco.

Also read:

Flexible prototyping for validation and firmware workflows

White Paper: Advanced SoC Debug with Multi-FPGA Prototyping

Prototype enables new synergy – how Artosyn helps their customers succeed


Accellera Update: CDC, Safety and AMS

by Bernard Murphy on 07-06-2022 at 6:00 am


I recently had an update from Lu Dai, Chairman of Accellera, also Sr. Director of Engineering at Qualcomm. He’s always a pleasure to talk to, in this instance giving me a capsule summary of status in 3 areas that interested me: CDC, Functional Safety and AMS. I will start with CDC, a new proposed working group in Accellera. To manage hierarchical CDC analysis back in my Atrenta days, you would first analyze a block, then use that analysis to define pseudo constraints on ports of the block, and so on up through the hierarchy. These pseudo constraints might capture things like internal input or output synchronization with related clock info. Sort of a CDC-centric abstraction of the block.

We should have guessed that other tool providers would do something similar, with their own constraint extensions. Which creates a problem when using IP from multiple vendors, each of whom use their own tools for CDC. Maybe you would have to re-do the analysis from scratch for a block? Which may not be possible for encrypted RTL. This is an obvious candidate for standardization – defining abstractions in a common language. SDC-based, no doubt, since these constraints must intermingle with the usual input, output and clock constraints. A worthy effort in support of CDC verification teams.

Functional Safety

It might seem that ISO 26262 is the final word in defining functional safety (FuSa) requirements for electronic design for vehicles. In fact, like most ISO standards, ISO 26262 is more about process than detailed guidelines. As tools, IP and system development have advanced to comply with FuSa needs, it has become obvious that we need more rigor in those expectations. Take a simple example. What columns should appear in an FMEDA table, in what order and with what headings? Or could this information be scripted instead? None of this is nailed down by ISO 26262. Formats/scripting approaches are completely unconstrained, creating a potential nightmare for integrators.

More generally, there is a need to ensure standardized interoperability in creating and exchanging FuSa information between suppliers and integrators. Which should in turn encourage more automation. So when I claim my IP meets some safety goal, you don’t just have to take my word for it. You can run your own independent checks. On a related note, the methodology should support traceability (a favorite topic of mine). Allowing for validation across the development lifecycle, from IPs to cars. Incidentally there is a nice intro to Accellera work in this area from DAC 2021.

Lu mentioned a related effort in IEEE. I believe this is IEEE P2851, looking at some fairly closely related topics. Lu tells me the Accellera and IEEE groups have had a number of discussions to ensure they won’t trip over each other. His quick and dirty summary is that Accellera is handling the low-level tool and format details while IEEE is aiming somewhat higher. I’m sure that eventually the two efforts will be merged in some manner.

UVM-AMS

The stated objective of this working group is to standardize a method to drive and monitor analog/mixed-signal nets within UVM. Also to define a framework for the creation of analog/mixed-signal verification components by introducing extensions to digital-centric verification IP.

In talking with Lu, the initial objective is to align with existing AMS efforts, in Verilog, SystemVerilog and SystemC. There’s a nice background to the complexities of AMS modeling in simulation HERE for those of us who might have thought this should be easy to solve. Even the basics of real number modeling are still not frozen. Analog signals are not just continuous variants of digital signals; think of the complex number representations common in RF. So there’s history and learning which the standard should leverage yet not disrupt unnecessarily.

AMS teams want the benefits of UVM methodologies, but they don’t want to start from scratch. Aligning those benefits with existing AMS requirements is the current focus. Lu says that many of these requirements aren’t language specific. The working group is figuring out the semantics of the methodology first, then will look more closely at syntax issues.

Accellera will be presenting more on this topic at DAC 2022 so you’ll have an opportunity to learn more there.


Jade Design Automation’s Register Management Tool

by Kalar Rajendiran on 07-05-2022 at 10:00 am


When more than one person is working on any project, coordination is imperative. When the team size grows, being in sync becomes essential. When it comes to SoC design management, registers and bit fields are used to communicate status of results and execute conditional controls. The Register Management function plays an essential role during the course of any modern day SoC product development. Earlier this year, SemiWiki introduced Jade Design Automation (JadeDA) to its readership, through an interview with its CEO and Founder, Tamas Olaszi.  JadeDA is focused on register management of a chip design starting from the system architecture stage all the way to software bring-up.

This post will discuss register management and a feature to configure RISC-V processor registers. JadeDA will be showcasing their Register Manager tool at the upcoming DAC 2022 in San Francisco. I had an opportunity to chat with Tamas and this blog is based on that conversation.

Register Management Benefits

Where

While register management has always been important on any chip design project, it takes on more importance in today’s world of hardware/software co-design and co-development. Even a chip of average complexity could include 100,000 or even millions of registers. During the design phase, bit fields in those registers could change frequently, even many times in a day. This necessitates validation, regeneration of RTL, and updating of UVM register models and the relevant documentation as close to real time as possible. Without register management, different teams could be out of sync. For example, a change made by the design team may not be noticed by the verification team right away. The software team may be working from outdated information, wasting cycles on developing code that would need to be changed.

The following is a real-life example that Tamas narrated during our chat, from an embedded software development project. The documentation the team was working from said to set certain bits, wait for a certain event to happen, and then perform some actions. The hardware team knew when that event happened because they had access to an internal register. But the software team did not have access to this register, and no status bits or interrupts were triggered. Without that visibility, the software could wait forever to take action. This is the kind of thing that can happen when there is no centralized information that all teams can review, verify and work from.

While the above example is from an embedded real-time device application, the same goes for any device, including HPC-oriented high-performance applications. The only difference is that we can expect even more frequent updates the larger and more complex a design gets. And the speed at which the centralized information gets updated and all relevant code and documentation get regenerated becomes critical.

Who

A good register management capability brings the following benefits to each functional role. It also allows automated broadcast of updated information to all the different teams working on a project.

  • System architects can capture and maintain all the high level system information in a centralized way.
  • System Integrators can pull together IPs from various sources to a centralized platform for enhanced quality.
  • IP teams can auto-generate production ready RTL and UVM register descriptions throughout the development process, which is a great productivity booster.
  • Engineering Managers can monitor the consistent and high quality release deliverables offered to their internal or external customers.
  • Software Engineers can have register information loaded into their debuggers so they can instantly see what register they are working with on a particular offset; they can do this without having to wade through pages of documentation.

JadeDA’s Register Management Tool

JadeDA has kept it simple and straightforward by naming their register management tool the Register Manager. The tool efficiently manages all tasks around the HW/SW interface of an SoC. Users can capture register and bitfield information at the IP level as well as the memory maps at the IP, subsystem and SoC level. From these descriptions, the Register Manager generates RTL, verification, software and documentation collateral, as well as data interoperability formats like IP-XACT (IEEE 1685-2009 and 1685-2014).
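The single-source idea can be sketched in a few lines of Python: one register description goes in, and a software-facing C header comes out. This is only an illustration under stated assumptions. The `Register` and `BitField` classes, the `c_header` function and the macro naming scheme are all hypothetical and do not reflect the Register Manager's actual data model or output.

```python
from dataclasses import dataclass, field


@dataclass
class BitField:
    name: str
    lsb: int    # position of the least significant bit
    width: int  # number of bits


@dataclass
class Register:
    name: str
    offset: int  # byte offset from the block's base address
    fields: list = field(default_factory=list)


def c_header(block_name, registers):
    """Render offset, shift and mask macros for every register and bit field."""
    lines = [f"/* Generated from the {block_name} register description. Do not edit. */"]
    for reg in registers:
        prefix = f"{block_name}_{reg.name}".upper()
        lines.append(f"#define {prefix}_OFFSET 0x{reg.offset:04X}u")
        for bf in reg.fields:
            mask = ((1 << bf.width) - 1) << bf.lsb
            lines.append(f"#define {prefix}_{bf.name.upper()}_SHIFT {bf.lsb}u")
            lines.append(f"#define {prefix}_{bf.name.upper()}_MASK 0x{mask:08X}u")
    return "\n".join(lines)


if __name__ == "__main__":
    uart = [
        Register("ctrl", 0x0, [BitField("enable", 0, 1), BitField("baud_div", 8, 8)]),
        Register("status", 0x4, [BitField("tx_done", 0, 1)]),
    ]
    print(c_header("uart", uart))
```

The same description could feed other back ends (RTL, a UVM register model, documentation), which is what keeps all the generated views consistent with one another.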

Data Model and Flexibility for Customization

While data models can be based on standards such as IP-XACT and SystemRDL, standards evolve very slowly. A proprietary data model from a supplier with strong support for customization serves customers well. JadeDA’s importing tools can migrate IP-XACT and SystemRDL based data models. Data models and tools based on IP-XACT usually have vendor extensions. The JadeDA tool’s data model is richer than what IP-XACT offers. Legacy data in custom formats can be imported via the tool’s API. The API is very efficient and well documented. JadeDA can also easily import register information stored in Excel sheets.

GUI

The Register Manager has a rich and intuitive GUI to visualize and edit the HW/SW interface, including the register and bitfield information. The GUI is much more than just entry fields for various attributes. Attributes like offsets, widths, access types and reset values can be changed with the mouse alone. There is also full keyboard support with intuitive focus traversal that allows quick and efficient data capture without lifting a hand from the keyboard. Pre-existing information from a PDF document can be typed in without having to reach for the mouse between keyboard entries. This is a productivity enhancement.

Note: The tool also has a fully functional shell mode for power users as well as fully scriptable command files for automated flows.

Performance

As changes happen in a design, the tool can capture the data, validate it and generate RTL, UVM, documentation, software and IP-XACT collateral in a few seconds. JadeDA has noticed that its tool runs in an order of magnitude less time than what is available in the marketplace today. And the performance of the JadeDA tool scales linearly.

Processor Registers Configurability Feature

JadeDA will be showcasing this new feature of the Register Manager tool at DAC 2022.

JadeDA can deliver to its customers the superset of control and status registers (CSRs) through the tool’s GUI. As the customers configure their designs, they can get rid of the CSRs they don’t need for a particular design. A RISC-V based design serves as a good case study. The RISC-V specification defines a large number of CSRs, not all of which are used by all customers. And different customers, or different projects at the same customer, may use a different selection of CSRs. The tool captures all of the registers in all the details contained in the RISC-V specification. With the configurability feature, users can configure the particular subset they need. Some of the configuration options are available as RTL configurable parameters. If the customer turns them off, users won’t be able to configure the corresponding registers.
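The idea of carving a project-specific subset out of a superset can be illustrated with a short Python sketch. The CSR names and addresses below follow the RISC-V specifications, but the list is only a small excerpt. The `Csr` class, the feature tags ("F" for a floating-point unit, "S" for supervisor mode) and the `configure` function are assumptions made for this example, not the Register Manager's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Csr:
    name: str
    address: int
    requires: frozenset = frozenset()  # features that must be enabled


# Small excerpt of the superset. The feature tags are this sketch's own convention.
SUPERSET = [
    Csr("mstatus", 0x300),
    Csr("mtvec", 0x305),
    Csr("mepc", 0x341),
    Csr("mcause", 0x342),
    Csr("fflags", 0x001, frozenset({"F"})),
    Csr("frm", 0x002, frozenset({"F"})),
    Csr("fcsr", 0x003, frozenset({"F"})),
    Csr("sstatus", 0x100, frozenset({"S"})),
    Csr("stvec", 0x105, frozenset({"S"})),
    Csr("sepc", 0x141, frozenset({"S"})),
    Csr("scause", 0x142, frozenset({"S"})),
    Csr("satp", 0x180, frozenset({"S"})),
]


def configure(superset, enabled_features):
    """Keep only the CSRs whose required features are all enabled."""
    enabled = frozenset(enabled_features)
    return [csr for csr in superset if csr.requires <= enabled]


if __name__ == "__main__":
    # A small embedded core: no FPU, no supervisor mode.
    for csr in configure(SUPERSET, set()):
        print(f"0x{csr.address:03X} {csr.name}")
    # An application processor: FPU and supervisor mode both present.
    print(len(configure(SUPERSET, {"F", "S"})), "CSRs in the full configuration")
```

Turning a feature off removes its registers from every generated view at once, which mirrors the conditional presence described above.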

This configurability feature is something that JadeDA can implement in its tool for any processor architecture/ISA. Contact JadeDA to explore.

You have to see the live demo to fully appreciate the power of the tool, the user interface and ease of use. See the Register Manager Tool Demo @Booth Number 2430 at DAC 2022 in San Francisco.

Meanwhile, here are some screenshots from the tool. The following two Figures show the scenarios of when a FPU and corresponding registers are included in the configuration and when they are not.

 

The following Table shows the supervisor related CSRs found in the RISC-V specification.

The following relates to a case of an application processor where supervisor related CSRs are needed. The screenshot below shows their conditional presence being enabled.

See the Register Manager Tool Demo @Booth Number 2430 at DAC 2022 in San Francisco.


5G for IoT Gets Closer

by Bernard Murphy on 07-05-2022 at 6:00 am

Very recently, 3GPP announced that 5G Release 17 was finalized. One important consequence is that 5G RedCap (reduced capability) is now real, and that means 5G becomes accessible to IoT devices. Think smart wearables (e.g. watches), industrial sensors and surveillance devices. “So what?”, you protest. “I don’t need 5G on my watch. It can link to my phone over Bluetooth and let the phone handle 5G communication.” Yes it can, but have you ever wondered why you always need your phone to use your watch?

That seems like a half-step to convenience, a nice light device on your wrist tethered to an increasingly bulky device in your pocket. When you’re jogging, hiking, working out, wouldn’t it be nice to only need the watch? Industrial sensors and surveillance devices rely more on Wi-Fi for communication but what if the Wi-Fi isn’t very good, or non-existent? Is it time to cut the cord and let these devices talk directly to the cellular network?

The real growth in 5G

The smartphone market is already slowing according to multiple surveys. 5G may generate a boost in support of mobile gaming and high-quality streaming, but the heady growth of the early years still seems unlikely to re-emerge. That’s why IoT applications have become so interesting. The total available market is not bounded by human users, only by applications: millions of smart parking meters, moisture sensors in fields, bridge stress sensors, power grid sensors, and so on. Analysts estimate 1.24 billion M2M non-handset devices shipping in 2027. Smart watch volume estimates show up to 230 million units by 2026, making them an encouraging consumer option to pick up from declining volumes in smartphones. There doesn’t seem to be a killer app here. Volumes are projected to be roughly divided between public sector infrastructure, smart metering, consumer electronics, intelligent buildings, security, retail and commerce, healthcare, and transport and logistics.

What underlines the strength of this opportunity is that 5G infrastructure build-out is already underway. It is not as fast as we’d like, and it may be a financial challenge for the mobile network providers, but it is coming. There has been talk of expanding the reach of Bluetooth (mesh networks) and Wi-Fi (Wi-Fi 6). Technologically these are possible, but someone must pay for building wide coverage infrastructure, which seems unlikely given existing investment in 5G infrastructure. Moreover, it’s difficult to beat cellular reach for remote applications such as agriculture, highways and power grids. 5G RedCap is increasingly looking like the best fit for IoT communication.

PentaG2-Lite Well Positioned to Help

As the only 5G NR IP platform on the market, CEVA’s PentaG2 is a compelling solution for those needing an embedded solution to meet cost and power goals. This will particularly be true for IoT builders, who are likely to see a good fit in the PentaG2-Lite version. This IP offers a wide range of accelerators for modem and other functions. First product shipments will probably appear in 2025, but that date requires builders to start planning now. CEVA offers an integrated SystemC simulation environment for architects in support of that early design.

You can learn more by watching this webinar.


Using AI in EDA for Multidisciplinary Design Analysis and Optimization

by Daniel Payne on 07-04-2022 at 10:00 am

Most IC and system engineers follow a familiar process when designing a new product: create a model, use parameters for the model, simulate the model, observe the results, compare results versus requirements, change the parameters or model and repeat until satisfied or it’s time to tape out. On the EDA side, most tools perform some narrow function in a single domain, and it’s up to the EDA user to control the tool, read the results, and then iterate while manually optimizing.

In the late 1980s we saw the birth of smarter EDA tools like logic synthesis, which at first only optimized a gate-level netlist into a reduced form, then later accepted RTL and produced an optimized, process-specific, gate-level netlist. By the mid 2000s there was an application of Machine Learning (ML) to Monte Carlo simulations for SPICE simulators, saving circuit designers time and effort. Recently, even Google has applied ML to produce better placement results for large SoC designs than what a human can produce. The trend has been clear: EDA tool developers have created smarter tools, but mostly limited to single domains like logic design, SPICE and floor planning.

On June 7 some big news in EDA came from Cadence, as they announced something called Optimality Intelligent System Explorer, an AI-based approach for Multidisciplinary Design Analysis and Optimization (MDAO). The days of separated silos of EDA tools operating in only one domain are giving way to more complex, multi-domain tools. Cadence has gone so far as to organize a Multi-Physics System Analysis Group, where Ben Gu is the Vice President. The new product name is Optimality Explorer, and it works across three system-level EDA tools:

  • Clarity – 3D Electromagnetic (EM) field solver
  • Sigrity X – Distributed simulation for signal and power integrity (SI/PI)
  • Celsius – Thermal solver (Optimality integration coming soon)

Optimality Explorer

The diagram above shows a system design where a communication channel consists of an IC driver, package, PCB layout, package, and finally a receiver inside the final IC. The criterion for success is optimizing the physical layouts to ensure an acceptable return and insertion loss, while managing cross-talk issues and maintaining signal isolation. Optimality Explorer is used to automatically guide optimization, using both the Clarity and Sigrity X tools. It decides what to change for each tool run and figures out when an optimal solution has been found.

For example, the system designer specifies that return loss has to be lower than some threshold, and then Optimality Explorer reads from Allegro, creates design variables, controls the optimization process, and finds the optimum solution. Here’s a plot from an optimization run where the criterion was a return loss under -35 dB:

Optimization Results: Return Loss

The blue dots each represent an iteration during optimization, and the red line is the progress towards reaching the design goal. This automated method for optimization happens much faster than the manual approaches used for the past decades. Cadence is claiming a 10X faster time to optimization by using Optimality.
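The shape of such an automated loop can be sketched in Python. This is not Cadence's algorithm: the sketch uses a plain random local search, and the "solver" is a toy function invented for the example so that the loop is runnable. The names `toy_return_loss_db` and `optimize`, the two design variables and all numeric values other than the -35 dB target are assumptions.

```python
import random


def toy_return_loss_db(pad, antipad):
    """Stand-in for a field-solver run: a smooth bowl with a -40 dB minimum at (0.6, 0.3)."""
    return -40.0 + 50.0 * ((pad - 0.6) ** 2 + (antipad - 0.3) ** 2)


def optimize(evaluate, start, target_db=-35.0, step=0.1, max_runs=200, seed=0):
    """Perturb the best design so far until the target is met or the run budget is spent."""
    rng = random.Random(seed)
    best_point = start
    best_value = evaluate(*start)
    history = [(best_value, best_value)]  # (result of this run, best so far)
    while best_value >= target_db and len(history) < max_runs:
        candidate = tuple(v + rng.uniform(-step, step) for v in best_point)
        value = evaluate(*candidate)
        if value < best_value:
            best_point, best_value = candidate, value
        history.append((value, best_value))
    return best_point, best_value, history


if __name__ == "__main__":
    point, value, history = optimize(toy_return_loss_db, start=(0.0, 0.0))
    print(f"solver runs: {len(history)}, best return loss: {value:.1f} dB at {point}")
```

In the `history` list, the first element of each pair plays the role of a blue dot (one solver run) and the second element plays the role of the red line (best result so far). A smarter search strategy spends fewer runs to reach the target, which matters when each run is an expensive 3D field simulation.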

The theory of applying ML to optimization sounds good, but what about real world results? Great question. At DesignCon there was a presentation by Kyle Chen of Microsoft, where they used Optimality to optimize micro-stacked vias in a rigid-flex PCB. Kyle wrote, “As an early adopter of the Cadence Optimality Intelligent System Explorer, we stressed its performance on a rigid-flex PCB with multiple via structures and transmission lines. The Optimality Explorer’s AI-driven optimization allowed us to uncover novel designs and methodologies that we would not have achieved otherwise. Optimality Explorer adds intelligence to the powerful Clarity 3D Solver, letting us meet our performance target with accelerated efficiency.”

Micro-Stacked Vias

This approach may sound familiar to Cadence IC design users in the RTL to GDS flow, because last year the company announced Cerebrus, an AI approach using ML to explore the design space for Power, Performance and Area (PPA) through placement, routing and timing closure. The same kind of reinforcement learning used in Cerebrus has also been used in Optimality Explorer.

Summary

EDA tools have been used to create every AI chip ever designed, and now AI, and specifically ML, is being applied to EDA tools like Optimality Explorer to explore the design space of systems by optimizing more quickly than manual methods. The first two tools from Cadence that work with Optimality Explorer are Sigrity X and Clarity, and Celsius is expected to be the next tool added. Multi-physics EDA, or multidisciplinary design analysis and optimization (MDAO), has begun in earnest.

Verifying Inter-Chiplet Communication

by Daniel Nenni on 07-04-2022 at 6:00 am

Chiplets are hot now as a way to extend Moore’s Law, dividing functionality across multiple die within a single package. It’s no longer practical to jam all functionality onto a single die in the very latest processes, exceeding reticle limits in some cases and in others straining cost/yield. This is not an academic concern. Already server processors, FPGAs and large AI training platforms run to multiple chiplets in a package. The breakthrough in expanding functional design to chiplets serves not only growing gate counts in large systems. It also allows many functions to be parceled out to individual die at less aggressive technologies for lower cost and potentially broader availability, reserving the most aggressive processes only for functions/die needing that advantage.

This seems like the best of all possible worlds, but the idea only works if you have a very fast (and low power) interconnect between those chiplets. That’s the goal of the Universal Chiplet Interconnect Express (UCIe). How do you verify compliance with a standard that is new in town? You must work with a company that has a track record in tight relationships with standard developers, in delivering VIP and compliance checking. Avery has that track record.

The foundations of UCIe

UCIe builds on well-proven standards, particularly PCIe as a host extension interface, already long established in the PC and server world. Add to this CXL for coherent memory connectivity (memory, IO and cache) between chiplets. PCIe and CXL are mapped natively in UCIe in acknowledgement of the reality that they are already widely used. The fact that they plug-and-play with existing software is another not inconsiderable detail. Add to this support for a raw streaming protocol as a way to extend to further protocols. Together, this combination seems like a no-brainer for chiplet-to-chiplet communication. I’ve heard some grumbling from the AI training world about the PCIe overhead impeding coherent communication performance with the core. Perhaps the streaming protocol might mitigate this issue. But anyway, for everyone else the benefits outweigh that bleeding edge limitation.

Thanks to short signal paths on substrate or interposer (for example), IO performance is expected to be 20x better than conventional PCIe SERDES, also at significantly lower power. The standard is also designed to support off-package connectivity, at board, rack or pod level, supported by retimers as needed.  Scaling out is clearly a longer term goal.

High performance at low power, and building on established standards. It is easy to see why UCIe has garnered wide support, from Intel (of course) and also from AMD, Google Cloud, Meta, Microsoft, Arm, Samsung, Qualcomm, TSMC and others.

Verification

A standard depends on tooling to verify compliance with the standard. I can’t speak to aspects of physical compliance checking but I do know that Avery is a contributing member and has built a VIP to validate functional compliance at the protocol and logical PHY layers. As an established provider of VIPs across multiple domains – high speed IO, storage, embedded storage, mobile, memory and others – Avery already has the chops to deliver for UCIe. Their PCIe and CXL VIPs are proven and their QEMU co-simulation platform simplifies software co-design and validation with RTL design.

Avery offers a complete functional verification platform based on its robustly tested verification IP (VIP) portfolio that enables pre-silicon validation of design elements. Its UCIe offering supports standalone UCIe die-to-die adapter and LogPHY verification, along with integrated PCIe and CXL VIP to run over the UCIe stack. In addition to UCIe models, it provides comprehensive protocol checkers, coverage, reference testbenches, and compliance test suites utilizing a flexible and open architecture.

You can learn more HERE.

Also read:

Data Processing Unit (DPU) uses Verification IP (VIP) for PCI Express

PCIe 6.0, LPDDR5, HBM2E and HBM3 Speed Adapters to FPGA Prototyping Solutions

Controlling the Automotive Network – CAN and TSN Update

 


A Crisis in Engineering Education – Where are the Microelectronics Engineers?

by Tom Dillinger on 07-03-2022 at 10:00 am

At the recent VLSI Symposium on Technology and Circuits, a panel discussion presented a jarring forecast.  The theme of the panel was “Building the 2030 Workforce:  How to Attract Great Students and What to Teach Them?”, with participants from academia and industry, as well as a packed (and vocal) audience.

On the one hand, the forecasts for economic growth in the microelectronics industry are uniformly robust – “a $1T industry by 2030” (notwithstanding a short-term more muted outlook).

Yet, the clear message from all the panel participants was “Where will the microelectronics engineers necessary to support this growth come from?” 

The figure below says it all.  The disparity in college enrollment for EE versus CS majors continues to grow.  (from Raja Koduri, Executive Vice President and general manager of the Accelerated Computing Systems and Graphics Group at Intel)

The goal of the panel session was to solicit ideas to address the issue.  As you might imagine, there were conflicting opinions on the merits of some of the proposals put forth.

The goal of this article is the same, to solicit recommendations from SemiWiki readers on how to get more students interested in microelectronics.

“Show me the money”

One topic of discussion was the salaries offered to graduating software developers versus microelectronics engineers.

    • “Students hear about software grads getting tremendous starting salaries. Why should they choose hardware engineering?”
    • “It is simply not viable for us to pay entry-level engineers on large hardware teams that kind of money.”
    • “When interviewing candidates, I look for a sense of passion about microelectronics. If their sole focus is money, it’s not a fit.” 

Question:  How could industry professionals and academics help generate that passion in students?

Academic + Government + Industry partnerships

“Other countries have recognized this issue, and have established special university programs for microelectronics students – from tuition incentives to assistance finding employment when they finish the program.”

Here’s a site with some examples – link.

“The American Semiconductor Academy Initiative is working on this issue in the U.S., a partnership between universities and SEMI.” link.

Questions:  How can academic/industry collaborations be more effective?  What should be the role of government in addressing the microelectronics engineering shortage – should the U.S. follow the examples of other countries?

The Microelectronics EE Curriculum

The audience did not have clear opinions when posed with the question whether the current undergraduate EE curriculum was appropriate or needed revision to encourage more microelectronics students.

A passionate faculty member said, “I am one of a group of faculty that teach a tapeout course.  It’s demanding, both on the students and the faculty.  The cost per student to the university is high.  Yet, the students say they benefit greatly from the experience.  They learn about engineering projects, schedules, teamwork, and how tradeoffs need to be addressed.”  (link, link, link)

I intend to follow up further with the faculty, to see how this experience might scale to attract more students.

Questions:  Is the microelectronics curriculum optimum?  How do we educate students about the breadth of skills that are part of the microelectronics industry, to see what might ignite their passion (perhaps like a tapeout course)?  Would high school be appropriate to introduce (STEM) students to a microelectronics curriculum?

Internships

“Offer more internships to EE students early in their studies, to get them industry exposure and excited about microelectronics.”

“Internships are hit-and-miss.  Too often, there is just not a good fit with a student’s early background and our project opportunities.  It’s a mismatch for both the student and the mentor.  Instead of a positive experience, it turns into a negative.”

Questions:  Is early industry internship experience worth the investment, to attract more students?  How can the experience be more beneficial to both the student and the company?

The First Job Experience

“We often direct new hires into verification tasks to start their careers.  And, we have let verification – one of the most exciting and vital roles on the team – come to be regarded as unappealing.  We need to change perceptions about the importance of all the different facets of microelectronics, and make the first job a more valuable experience.”   

Much of the panel discussion centered on providing (circuits and/or system) design coursework to students, and how that often differs from their initial job assignments.  There was not much focus on how to expose students to other aspects that might appeal to them, areas like: product testing and bring-up; product qualification; sustaining product engineering (e.g., cost and performance improvements for product revisions, field support);  and, project management.

Industry on Campus

One anecdote from an academic on the panel received universal acclaim from the audience.

“We had an executive visit campus from a high tech company.  He met with students, and spent considerable time with them describing the kinds of microelectronics opportunities available and the skills the company was seeking.  He talked about potential career paths, and the company’s focus on employee development.  That made a huge impression on the students.” 

Perhaps more industry professionals could reach out to universities.  Contact the IEEE student chapter and offer to meet with students.  Buy pizza.  Share your own passion for microelectronics.  Indicate to them that they would be working on the most complex systems ever conceived – “one trillion transistors” – using the most advanced manufacturing techniques – “atomic layer deposition”.  And, their efforts could help the planet address critical issues we all face, from improving healthcare to enhancing transportation to enabling faster communications technology, all with a focus on power efficiency.

Follow-up

I would welcome your insights into ways to address the engineering shortage issue.

If you are involved in the American Semiconductor Academy initiative, either from SEMI or academia, please reach out with more info – I would like to better understand (and promote) the activities underway.

If you are a microelectronics student, why did you choose to pursue this field of study?

I am intrigued by the “tapeout experience” course offering, and how that could attract more microelectronics students – look for another article in the future.

Thanks in advance for your feedback.

-chipguy

Also read:

TSMC 2022 Technology Symposium Review – Advanced Packaging Development

TSMC 2022 Technology Symposium Review – Process Technology Development

Inverse Lithography Technology – A Status Update from TSMC