eFPGAs handling crypto-agility for SoCs with PQC
by Don Dingee on 12-13-2022 at 6:00 am

Improving crypto-agility using hybrid PQC with ECC

With NIST performing its down-select to four post-quantum cryptography (PQC) algorithms for standardization in July 2022, some uncertainty remains. Starting an SoC with fixed PQC IP right now may be nerve-wracking, with possible PQC algorithm changes before standardization and another round of competition for even more advanced algorithms coming. Yet, PQC mandates loom, such as an NSA requirement starting in 2025. A low-risk path proposed in a short white paper by Xiphera and Flex Logix sees eFPGAs handling crypto-agility for SoCs with PQC.

Now the PQC algorithm journey gets serious

NIST selected four algorithms – CRYSTALS-Kyber, CRYSTALS-Dilithium, Falcon, and SPHINCS+ – that withstood the best attempts to expose vulnerabilities during its competition phase. In doing so, NIST now focuses resources on these four algorithms for standardization, marking the start of the PQC journey in earnest. Using teams of researchers armed with supercomputers, it can take years to thoroughly study a proposed crypto algorithm for potential vulnerabilities. A prime example: two PQC algorithms in the NIST competition broke under the weight of intense scrutiny very late in the contest, eliminating them from consideration.

While the odds of a significant break in these four selected PQC algorithms are low, minor changes are a distinct possibility. Uncertainty keeps many in the crypto community up at night, and changes that could disrupt hardware acceleration IP are always a concern for SoC developers. Hardware acceleration for these complex PQC algorithms is a must, especially in edge devices with size, power, and real-time determinism constraints.

Unfortunately, staying put isn’t an option, either. Existing crypto algorithms are vulnerable to quantum computer threats, if not immediately, then very soon. SoCs designed for lifecycles of more than a couple of years using only classical algorithms will be in dire peril when quantum threats materialize. The challenge becomes how to start a long life cycle SoC design now that can accelerate new PQC algorithms without falling victim to changes in those algorithms during design or, even worse, after it is complete.

Redefining crypto-agility practices for PQC in hardware

Crypto-agility sounds simple. Essentially, the idea is to run more than one crypto algorithm in parallel, with the objective that if one is compromised, the others remain intact, keeping the application secure. Researchers are already floating the idea of hybrid mechanisms as a safety net for PQC implementations. It’s possible to combine a traditional crypto algorithm, likely an ECC-based one, with a new PQC algorithm for the key derivation function (KDF).
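
To make the idea concrete, here is a minimal software sketch of a hybrid key derivation of the sort described above: two shared secrets, one from a classical exchange (e.g., ECDH) and one from a PQC KEM (e.g., CRYSTALS-Kyber), are concatenated and fed through HKDF-SHA256. The secrets are random placeholders and the salt/info labels are invented for illustration; this is not the Xiphera/Flex Logix implementation, just a sketch of the hybrid KDF pattern using only the Python standard library.

```python
import hashlib
import hmac
import os

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """Minimal HKDF (RFC 5869) using only the standard library."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Placeholder shared secrets: in a real design these would come from an ECDH
# exchange and a PQC KEM decapsulation, both accelerated in hardware.
ecdh_shared_secret = os.urandom(32)  # hypothetical classical secret
pqc_shared_secret = os.urandom(32)   # hypothetical post-quantum secret

# Concatenating both secrets means the derived key stays safe as long as
# at least one of the two algorithms remains unbroken.
session_key = hkdf_sha256(
    ikm=ecdh_shared_secret + pqc_shared_secret,
    salt=b"hybrid-kdf-salt",
    info=b"session-key-v1",
    length=32,
)
print(session_key.hex())
```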

But in SoC form, hybrid mechanisms have a cost, which gets higher as complexity increases. Instead of replacing the existing crypto hardware IP, a hybrid approach adds more circuitry for PQC and for coordination between the algorithms. Size, power consumption, and latency increase, and another risk emerges. Designers would have to guess correctly about which PQC algorithm to implement; guess wrong, and the implementation is essentially classical-only. The PQC hardware would lie unused, wasting the area and power devoted to it and leaving the design as vulnerable as it was without PQC.

A better approach to crypto-agility is reconfigurable computing. If hardware is reconfigurable, patching, upgrading, or replacing algorithms is straightforward. A creative design could even implement a hybrid mechanism on the fly, running one algorithm for a classical key, then reconfiguring to run PQC for its key, then reconfiguring again for operation on a data stream once keys are derived.

eFPGA technology provides a robust, proven reconfigurable computing solution for SoCs now. It’s efficient from a power and area standpoint, rightsized to the SoC design and the logic needed for algorithms. And in a PQC context, it provides the ultimate protection while designs are in progress and algorithms may be in flux.

Xiphera, a hardware-based security solution provider, is teaming up with Flex Logix to bring crypto-agility to SoCs using eFPGAs. Below are links to a page describing the effort, which includes a short white paper with more background, and to the Flex Logix eFPGA page.

Xiphera: Solving the Quantum Threat with Post-Quantum Cryptography on eFPGAs

Flex Logix: What is eFPGA?


TSMC OIP – Analog Cell Migration
by Daniel Payne on 12-12-2022 at 10:00 am

The world of analog cell design and migration is quite different from digital, because the inputs and outputs to an analog cell often have a continuously variable voltage level over time, instead of just switching between 1 and 0. Kenny Hsieh of TSMC presented on the topic of analog cell migration at the recent North American OIP event, and I watched his presentation to learn more about their approach to these challenges.

Analog Cell Challenges

Moving from N7 to N5 to N3, the number of analog design rules has dramatically increased, along with the number of layout effects to take into account. Analog cell heights tend to be irregular, so there’s no abutment like with standard cells. Nearby transistor layout impacts adjacent transistor performance, requiring more time spent in validation.

The TSMC approach for analog cells, starting at the N5 node, is to use layouts with fixed cell heights, support abutment of cells to form arrays, and re-use pre-drawn, silicon-validated layouts of Metal 0 and below. The PDK for analog cells includes active cells, plus cells for CMOS, guard ring, CMOS tap, decap and varactor.

Analog cells now use fixed heights and are placed in tracks, where you can use abutment and even customize the transition, tap and guard ring areas. All possible combinations of analog cells are exhaustively pre-verified.

Analog Cell

With this analog cell approach there is uniform Oxide Diffusion (OD) and Polysilicon (PO), which improves silicon yield.

Analog Cell Layout

Automating Analog Cell Layout

By restricting the analog transistors inside analog cells to more regular patterns, layout automation can be more readily used: automatic placement using templates, automatic routing with electrically-aware widths and spaces, and insertion of spare transistors to support any ECOs that arrive later in the design process.

Regular layout for Analog Cells

When migrating between nodes, the schematic topology is re-used, while the widths and lengths per device do change. The APR settings are tuned for each analog component of a cell. APR constraints for analog metrics like currents and parasitic matching make this process smarter. To support an ECO flow, there’s an automatic spare transistor insertion feature. Both Cadence and Synopsys have worked with TSMC since 2021 to enable this improved analog automation methodology.

Migrating analog circuits to new process nodes requires a flow of device mapping, circuit optimization, layout re-use, analog APR, EM and IR fixes and post-layout simulations. During mapping an Id saturation method is used, where devices are automatically identified by their context.

Pseudo post-layout simulation can use estimates and some fully extracted values to shorten the analysis loop. Enhancements to IC layout tools from both Cadence and Synopsys now support schematic migration, circuit optimization and layout migration steps.

A VCO layout from N4 was migrated to the N3E node using automation steps and a template approach, reusing the placement and orientation of differential pair and current mirror devices. The new automated migration approach was compared to a manual approach: manual migration required 50 days, while automation took only 20 days, a 2.5X productivity improvement. Early EM, IR and parasitic RC checks were fundamental to reaching the productivity gains.

N4 to N3E VCO layout migration

A ring-based VCO was also migrated both manually and automatically from the N40 to N22 node, using Pcells. The productivity gain was 2X by using the automated flow. Pcells had more limitations, so the productivity gain was a bit less.

Summary

TSMC has faced the challenges of analog cell migration by: collaborating with EDA vendors like Cadence and Synopsys to modify their tools, using analog cells with fixed heights to allow more layout automation, and adopting similar strategies to digital flows. Two migration examples show that the productivity improvements can reach 2.5X when using smaller nodes, like N5 to N3. Even with mature nodes like N40, you can expect a 2X productivity improvement using Pcells.

If you registered for the TSMC OIP, then you can watch the full 31 minute video online.


Bizarre results for P2P resistance and current density (100x off) in on-chip ESD network simulations – why?
by Maxim Ershov on 12-12-2022 at 6:00 am

Resistance checks between ESD diode cells and pads or power clamps, and current density analysis for such current flows, are commonly used for ESD network verification [1]. When such simulations use standard post-layout netlists generated by parasitic extraction tools, the calculated resistances may be dramatically higher or lower than real values, by a factor of up to 100x. Current densities can also be significantly off. Relying on such simulations leads either to missed ESD problems or to wasted time trying to fix artificial errors on a good layout. The root causes of these errors are artifacts of parasitic extraction, including the treatment of a distributed ESD diode as a cell with a single instance pin, or the connection of the ESD diode to a port through a small (1 mOhm) resistor. This paper discusses how to detect, identify, and get around these artifacts.

Problem statement

Resistance checks and current density checks are often performed on post-layout netlists to verify ESD protection networks [1] – see Fig. 1. Point-to-point (P2P) resistance is used as a proxy, or figure of merit, for the quality of the metallization, and as a proxy for ESD stress voltage. High P2P resistance values (e.g., higher than 1 Ohm) indicate problems with the metallization and should be debugged and improved.

Figure 1. (a) ESD current paths, and (b) P2P resistances (red arrows) in ESD protection network. Resistances between pads and ESD diodes, diodes to power clamps, and other resistances are calculated to verify robustness and quality of ESD protection.

In recent years, many fabless semiconductor design companies have reported puzzling problems with ESD resistance and current density simulations when post-layout netlists generated by standard parasitic extraction tools are used. These problems include unreasonably high or low (by ~100x) resistances between ESD diodes and pads or power clamps, and unphysical current densities in the interconnects. These problems have become especially severe in the latest sub-10nm technology nodes, with their high interconnect resistances.

These problems usually happen when fabless companies use ESD diode p-cells provided by the foundries. The cells are designed, verified, and qualified by the foundries, and should be good. However, the quality of the connections of these ESD cells to the power nets and to IO nets can be poor. Such poor connections can lead to high resistances and current densities, and to big ESD problems. That’s why, even when ESD cells themselves are high quality, the resistance and current density checks on the complete ESD network are required.

Artificially high resistance case

In foundry-provided PDKs, ESD diodes are often represented as p-cells (parameterized cells) with a single instance pin for each of the terminals, anode and cathode. This is different from how power clamp MOSFETs are usually treated in the PDK, where each individual finger of a multi-finger device is represented as a separate device instance, with its own instance pins for terminals.

These instance pins are usually used as a start point or a destination point for P2P resistance simulations. As a result, in the case of ESD diode p-cell simulations, current flows into the discrete point, creating artificial current crowding, high-current density values, and a high spreading resistance – see Fig.2.

Figure 2. Vertical cross-section of ESD diode, showing current flow pattern for simulation using (a) single instance pin, (b) distributing current in a realistic manner over the diode area.

This is a simulation artifact, induced by representing a large, distributed device with a single, discrete instance pin. In real operation, ESD diodes conduct current through all fingers, and the total ESD current is distributed more or less uniformly over a large area. In advanced technology nodes with many layers, the lower metal layers have high sheet resistivity; they are used for vertical current routing and contribute little to the total resistance. The contacts and vias above the active device all conduct vertical current in parallel, ideally uniformly. The current is shared by many contacts and vias, which leads to a low total resistance.
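
A toy calculation (made-up numbers, not data from this paper) shows why sharing the current across many contacts and vias produces such a low resistance compared with forcing it through the few vias near a single instance pin:

```python
# Toy illustration with hypothetical values: distributed vs. crowded current flow.
r_via = 20.0          # assumed resistance of one contact/via stack, Ohms
n_total = 3000        # vias over the whole diode area (distributed case)
n_near_pin = 30       # vias that actually conduct when current crowds at one pin
r_lateral = 0.5       # assumed lateral spreading resistance in thin M0/M1/M2

r_distributed = r_via / n_total              # all vias conduct in parallel
r_crowded = r_via / n_near_pin + r_lateral   # few vias, plus the lateral detour

print(f"distributed: {r_distributed * 1e3:.1f} mOhm")
print(f"single pin : {r_crowded * 1e3:.1f} mOhm")
print(f"ratio      : {r_crowded / r_distributed:.0f}x")
```

With these assumed numbers the single-pin case comes out more than 100x higher, the same order of magnitude as the artifact described below.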

On the contrary, in simulations using a single instance pin as a start point or as a destination point – the current is getting concentrated and crowded near that instance pin. It creates artificial, unrealistic current flow patterns – such as lateral current in lower metal layers (M0, M1, M2, …), highly non-uniform current in vias, with high current density in vias close to the instance pin, and so on.

This leads to an artificially high spreading resistance. Fig. 3 compares simulation results for a standard ESD diode in a 5nm technology. The resistance calculated using a single instance pin is ~7.65 Ohm. The resistance simulated using conditions that provide a realistic (distributed) current distribution over the device area is 0.069 Ohm – more than 100x lower!

Furthermore, the layers show very different ranking in their contributions to the total P2P resistance, for these two simulation conditions. Simulations with discrete instance pins may lead to a completely wrong layer optimization strategy, focusing on the wrong layers.

Figure 3. P2P resistance from ESD diode to ground net port, and resistance contribution by layer, for (a) single instance pin case, and (b) distributed simulation conditions.

Current density distribution in lower layers shows a strong current crowding near a single instance pin – see Fig. 4. In the case of distributed current flow, current density is more or less uniform, and its peak value is ~63x lower than in single instance pin case.

Figure 4. Current density distributions in (a) single instance pin, and (b) distributed simulation conditions. Peak current density for case (a) is 63x higher than for the case (b).

Artificially low resistance case

In some situations, the ESD diode instance pin is connected not to the low-level layers (such as diffusion or contacts), but directly to a port (pin) of a power net located at the top metal layer, through a connector resistor with a very low value, such as 1 mOhm. Why does that happen? A likely explanation is that the terminal of the ESD diode is mapped to a well or substrate layer that is not extracted for resistance. As a result, the parasitic extraction tool connects it by a connector resistor to the net’s R network at a rather arbitrary point, which turns out to be a port. This is similar to how MOSFET bulk terminals are typically connected to the port (because wells and substrates are not extracted for resistance).

Visualization of parasitics and their probing allows engineers to identify such extraction details, and to understand what’s going on in parasitic extraction and electrical analysis, as illustrated in Fig. 5.

Figure 5. Visualization of parasitics over layout, helping identify connectivity, non-physical connector resistors, and probe parasitics.

Thus, the connectivity of the ESD diode to the power net is incorrect. The resistance from the ESD diode to the port of the power net is very low (1 mOhm), because this connector resistor bypasses the real current path through the interconnects.

Figure 6. (a) Schematic illustration of a connector resistor connecting ESD diode instance pin with power net port, and (b) Top-view illustration of real ESD current path from ESD diode to power clamp (shown in green) versus artificial simulated current path.

Similarly, the simulated current path from the ESD diode to the power clamp differs from the real current path – see Fig. 6. The simulated current follows the path of minimum resistance (minimum dissipated power): from the ESD diode to the power net port, then along the (low-resistance) top metal, and then down to the power clamp. The simulated resistance and current densities are artificial and different from the real resistance and current density.

To properly simulate the resistance and current for this case, the connector resistor has to be removed, and the diode’s instance pin should be connected to the lowest layer in a distributed manner. It would be ideal if this were done by the parasitic extraction tool.

Connector resistors

Connector resistors are a semi-hidden feature of parasitic extraction tools. They are non-physical resistors, i.e., they are not generated from layout shapes and their resistivity, and they are not controllable by users. Extraction tool vendors do not educate semiconductor companies about this feature, probably because it is considered an internal implementation detail.

Connector resistors are used for various connectivity purposes – for example, to connect instance pins of devices to the resistive network or to other instance pins, to connect disconnected (“open”) parts of a net, to “short” ports, and so on. Their value is usually very low – such as 0.1, 1, 10, or 100 mOhm. Most of the time, they do not have any harmful effect on electrical simulation results. However, sometimes, as discussed in the previous section, they can have a strange or very damaging effect – such as shorting out a finite interconnect resistance, or adding 0.1 Ohm to a system that has a much lower resistance (e.g., power FETs have interconnect resistance values in the range of mOhms).
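
A simple, hedged way to at least become aware of these resistors is to scan the post-layout netlist for suspiciously small R elements. The sketch below assumes a generic SPICE-style "Rname node1 node2 value" line format and a 10 mOhm threshold; it is not tied to any particular extraction tool's output syntax.

```python
import re

THRESHOLD_OHMS = 0.010  # flag anything below 10 mOhm (assumed threshold)

def parse_res_value(token: str) -> float:
    """Convert a SPICE-style value such as '1m', '0.5' or '100u' to Ohms."""
    scale = {"t": 1e12, "g": 1e9, "meg": 1e6, "k": 1e3,
             "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12}
    m = re.fullmatch(r"([\d.eE+\-]+)(meg|[tgkmunp])?", token.strip().lower())
    if not m:
        raise ValueError(f"cannot parse resistor value: {token}")
    return float(m.group(1)) * scale.get(m.group(2) or "", 1.0)

def find_tiny_resistors(netlist_path: str):
    """Yield (name, node1, node2, ohms) for resistors below the threshold."""
    with open(netlist_path) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 4 and fields[0].lower().startswith("r"):
                try:
                    value = parse_res_value(fields[3])
                except ValueError:
                    continue  # skip lines whose 4th field is not a plain value
                if value < THRESHOLD_OHMS:
                    yield fields[0], fields[1], fields[2], value

# Hypothetical usage:
# for r in find_tiny_resistors("esd_top.spf"):
#     print(r)
```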

Being able to identify and visualize connector resistors on the layout (as shown in Fig. 5), and simply being aware of their presence and potential impact, is very important for a good understanding of the structure, connectivity, and potential pitfalls in a post-layout netlist.

Conclusions

Resistance and current density checks are useful and necessary steps for ESD verification, but proper care must be taken when setting up the simulations. Simulation conditions should mimic and reproduce the realistic current flow over and near the devices, to avoid parasitic extraction and simulation artifacts.

All simulations and visualizations presented in this paper were done using ParagonX [2].

References

  1. “ESD Electronic Design Automation Checks”, Technical report, ESD Association, 2014. Free download: https://www.esda.org/zh_CN/store/standards/product/4/esd-tr18-0-01-14
  2. ParagonX Users Guide, Diakopto Inc., 2022.

Also Read:

Your Symmetric Layouts show Mismatches in SPICE Simulations. What’s going on?

Fast EM/IR Analysis, a new EDA Category

CEO Interview: Maxim Ershov of Diakopto


Don’t Lie to Me
by Roger C. Lanctot on 12-11-2022 at 6:00 pm

Some things just really rile me up. Mark Zuckerberg testifying before Congress. Bernie Madoff explaining his investment strategy. Elon Musk inveighing against restrictions on free speech. But there is a new candidate for boiling my blood – word of a potential merger of vehicle data aggregators Wejo and Otonomo.

These two SPAC specters – now mere shadows of their original and farcically touted billion-dollar valuations – are rumored to be considering a merger, according to a report from Globes.co.il. The publication reports: “If such a merger were to be completed, Otonomo would be valued at $150M, significantly higher than its current market cap of just $52M but 96% lower than its valuation of $1.26B, when its SPAC merger was completed in the summer of 2021.”

This potential outcome is aggravating on multiple levels. Where do I start?

  1. There is no doubt that all sorts of investment advisors and, of course, the founders of these companies were richly rewarded for their efforts in spite of the dismal performance of their companies.
  2. Promises were made to investors and absurd claims were made to the press, industry analysts, and industry partners and customers.
  3. Good people were hired and indoctrinated to spread these same claims and promises – soon to be proven vastly misguided.
  4. At least two auto makers – General Motors and Volkswagen – were cynical enough to participate in the dissembling by sharing some of their anonymized vehicle data – giving Wejo, in particular, the thin patina of OEM-endorsed credibility.
  5. Some OEM partners – Ford Motor Company – have provided the further endorsement of adopting customer-facing solutions based on these data platforms.
  6. Both companies are reporting tens of millions of dollars in losses on a couple million in revenue.
  7. We’ve seen this movie before – SiriusXM bought OBDII/vehicle data company Automatic for more than $100M before ultimately shutting it down; Lear acquired Xevo for $320M before shutting it down; and so it goes.
  8. The hyperbolic claims regarding the value of vehicle data are either complete fiction or completely misunderstood. Clearly, Wejo and Otonomo have not found the golden key. Neither company clearly disclosed in its originating documents that its focus would be anonymized vehicle data – which actually possesses little or no value.
  9. The spectacular failure of these twin SPACs not only soured financial markets on SPACs generally; the dual debacles continuing to unfold in the market also present a major stumbling block for future startups with potentially more legitimate propositions.

I don’t begrudge the money pocketed by some otherwise nice people leading and working for both companies. I have no doubt they are all well meaning and, at some level, actually believe every word they’ve been saying.

Merging these two companies in the interest of saving their combined operations is nothing more than setting up a double drowning. It’s clear the plan was flawed for both firms. There is no fix – quick or long-term – that can convert failure to success. Both companies are simply sailing toward oblivion.

Worst of all is the utter lack of an exit strategy. There is no way out. Not even a merger can save a bad idea. There is no foundation, no monetizable intellectual property, no industry being disrupted. Otonomo and Wejo are redundant and irrelevant. This is not a time to merge. It’s a time to purge.

Also Read:

Hyundai’s Hybrid Radio a First

Mobility is Dead; Long Live Mobility

Requiem for a Self-Driving Prophet


Podcast EP130: Alphawave IP’s Rise in the Semiconductor Ecosystem with Tony Pialis
by Daniel Nenni on 12-09-2022 at 10:00 am

Dan is joined by Tony Pialis, co-founder and CEO of Alphawave, a global leader in high-speed connectivity IP enabling industries such as AI, autonomous vehicles, 5G, hyperscale data centers, and more. He is the former VP of Analog and Mixed-Signal IP at Intel and has co-founded three semiconductor IP companies, including Snowbush Microelectronics Inc (sold to Gennum/Semtech and now part of Rambus) and V Semiconductor Inc (acquired by Intel).

Tony discusses Alphawave IP’s strategy for growth in the semiconductor market, including its IPO last year and the recent acquisitions of OpenFive and Banias Labs. Strategies to support multi-die design and high-speed communication, among others, are touched on.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Webinar: Flexible, Scalable Interconnect for AI HW System Architectures
by Mike Gianfagna on 12-09-2022 at 6:00 am

Building next generation systems is a real balancing act. The high-performance computing demands presented by increasing AI and ML content in systems mean growing challenges for power consumption, thermal load, and the never-ending appetite for faster data communications. Power, performance, and cooling are all top of mind for system designers. These topics were discussed at a lively webinar held in association with the AI Hardware Summit this year. Several points of view were presented, and the impact of better data channels was explored in some detail. If you’re interested in adding AI/ML to your next design (and who isn’t), don’t be dismayed if you missed this webinar. A replay link is included below, so you can see how much impact flexible, scalable interconnect for AI HW system architectures can have.

Introduction by Jean Bozman, President Cloud Architects LLC

Jean moderated the webinar and provided some introductory remarks.  She discussed the need to balance power, thermal and performance in systems that transfer Terabytes of data and deliver Zettaflops of computing power across hundreds to thousands of AI computing units. 112G PAM4 transceiver technology was discussed. She touched on the importance of system interconnects to achieve the balance needed for these new systems. Jean then introduced Matt Burns, who set the stage for what is available and what is possible for system interconnects.

Matt Burns, Technical Marketing Manager at Samtec

Matt observed that system and semiconductor products all need faster data communication channels to achieve their goals. Whether the problem is electrical, mechanical, optical or RF, the challenges are ever-present. Matt discussed Samtec’s Silicon-to-Silicon™ connectivity portfolio. He discussed the wide variety of interconnect solutions offered by Samtec. You really need to hear them all, but the graphic below gives you a sense of the breadth of Samtec’s capabilities.

Samtec solutions

Next, the system integrator perspective was discussed.

Cédric Bourrasset, Head of High-Performance AI Computing at Atos

Atos develops large-scale computing systems. Cedric focuses on large AI model training, which brings many challenges. He integrates thousands of computing units with collaboration from a broad ecosystem to address the challenges of model training. Scalability is a main challenge. The ability to implement efficient, fast data communication is a big part of that.

Dawei Huang, Director of Engineering at SambaNova Systems

SambaNova Systems builds an AI platform to help quickly deploy state-of-the-art AI and deep learning capabilities. Their focus is to bring enterprise-scale AI system benefits to all sizes of businesses. Get the benefits of AI without massive investment. They are a provider of the technology used by system integrators.

Panel Discussion

What followed these introductory remarks was a very spirited and informative panel discussion moderated by Jean Bozman. You need to hear the detailed responses, but I’ll provide a sample of the questions that were discussed:

  • What is driving the dramatic increase in size and power of compute systems? Is it the workloads, the size of the data or something else?
  • What are foundation models and what are their unique requirements?
  • What are the new requirements to support AI being seen by ODMs and OEMs – what do they need?
  • Energy, power, compute – what is changing here with the new pressures seen? Where does liquid cooling fit?
  • What are the new bandwidth requirements for different parts of the technology stack?
  • How do the communication and power requirements change between enterprise, edge, cloud and multi-cloud environments?

To Learn More

This is just a sample of the very detailed and insightful discussions that were captured during this webinar. If you are considering either adding or introducing AI/ML for your next project, I highly recommend you check out this webinar. You’ll learn a lot from folks who are in the middle of enabling these transitions.

You can access the webinar replay here. You can learn how much impact flexible, scalable interconnect for AI HW system architectures can have.


Rethinking the System Design Process
by Daniel Nenni on 12-08-2022 at 10:00 am

The system design process can incorporate linear thinking, parallel thinking, or both, depending on the nature of the anticipated system, subsystem, or element of a subsystem. The structure, composition, scale, or focal point of a new/incremental system design incorporates the talents and gifts of the designer in either a top-down or bottom-up design style. Is a centralized or distributed approach to processing the best method? Is a symmetrical or asymmetrical topology warranted? Is power or speed the driving criterion? The answers to these questions can lead to a conceptual block diagram that starts the design process, leading to a design specification.

Conceptual Block Diagrams
Everyone is familiar with a conceptual block diagram, where differences between block diagrams might reflect the level of abstraction, or conversely, how much detail is presented. A two-dimensional block diagram may implement a three-dimensional nodal topology, or a linear communication block diagram may implement a multi-modulation scheme (Figure 1). After creating a conceptual block diagram, what methodologies are available to evaluate the system performance in terms of system throughput, system power, system latency, and resource utilization, as related to cost? In many cases, the system throughput, power, latency, utilization, or cost are established directly by customers, or indirectly by product marketing working with customers. The result might be termed a marketing requirements document, a design specification, a product specification, or simply a spec sheet. There are many designations for a “design” level specification, which contains one or more conceptual block diagrams.

Figure 1: An example of a block diagram engineers use to describe a new system specification.

Design Level Specification
A design level specification captures a new or incremental approach to improving system throughput, power, latency, utilization, or cost, typically referred to as price-performance tradeoffs in product marketing. In medium to large system design organizations, a design specification may be coordinated and/or approved by the executive staff, marketing, R&D, manufacturing, field support, or a specific customer. The degree of coordination, or approval, for system design specifications between intra-company groups varies from company to company, depending on the type of system market, time to market, consumer or industrial market segment, and market maturity. At each step in the evolution of a design specification, well-intentioned modifications, or improvements, may occur. What happens to the system design process if a well-intentioned design specification change impacts the original conceptual block diagrams such that the design margin for system throughput drops from 20% to 5%? While the R&D group who created the conceptual block diagram may realize that the system throughput will be impacted by a marketing modification, there may not be an easy way to determine that the worst-case design margin has shrunk to 5%. In other words, the time required to evaluate a design modification before, or after, the system design process has started can vary dramatically, depending on the system evaluation methodology selected.

Figure 2. An example of BONeS Designer from the ’90s.

System Evaluation Methodologies
Applying Moore’s law in reverse reveals that systems created in the 1970s or early 1980s could be designed and modified on the proverbial napkin, sent to a development team, and still allow man to explore the moon on schedule, cost notwithstanding. Intel’s development of the co-processor in the mid-80s marked the increasing sophistication of system design, given the transition from medium-scale integration to large-scale integration in chip design.

EXCEL spreadsheets became popular for estimating average throughput, power, latency, utilization, and cost at the system level when some napkin designs began to have problems accurately estimating overall system performance as system complexity increased. The problems encountered were mathematical discontinuities related to system operation (especially digital), difficulty estimating peak system performance, and simple spreadsheet mistakes that were not readily apparent.

C and C++ golden reference models became popular in the late 1980s and early 1990s, since they could resolve some of the EXCEL spreadsheet issues with a modest programming effort. To resolve digital system modeling issues, C/C++ provided internal synchronization in the form of software-generated clocks or events, common resource objects, and user-defined classes. The problems encountered were related to the envisioned “modest” programming effort. Software bugs were more difficult to find in the increasingly complex software that resolved some of the EXCEL spreadsheet issues. Nonetheless, better performance modeling results were obtained with substantial programming effort. Different companies, or even different groups within the same company, typically made different assumptions in their internal golden reference models, such that it was difficult to exchange models from one company or group to another. The golden reference models lacked a common frame of reference, sometimes referred to as interoperability. In the early 1990s, the combination of low-cost workstations and modeling tools offering a common frame of reference started to appear in the marketplace.

Several system level tools, such as BONeS Designer (Block-Oriented Network System Designer) (Figure 2 ), Signal Processing Workstation (SPW), OPNET Modeler, SES Workbench, CACI COMNeT and Virtual Component Co-Design (VCC) appeared to provide the notion of time-ordered, concurrent system processes, embedded software algorithms, and data types. C/C++ programming languages do not explicitly provide for time sequenced operations, parallel time sequenced operations, or design related data types. Some companies shifted from C/C++ golden reference models to these standard modeling tool methodologies. In addition, many of these tools were graphically oriented, which reduced the need for extensive C/C++ coding efforts, replacing standard modeling functionality with graphical representations of common functions. If specific functionality was required, the user could create a custom-coded element, or block, depending on the modeling libraries supported by the tool.

Graphical modeling provided additional system-level modeling capabilities:

  • Ability to create hierarchical models
  • Ability to handle different levels of abstraction
  • Ability to speed model creation and partitioning
  • Ability to spatially refine an abstract model to a more detailed model
  • Ability to reuse system level modeling modules

The aforementioned tools focused on improving modeling capabilities in terms of performance modeling, ease of use, model creation time, and post-processing of modeling results. Some of the issues with these early system level modeling tools were that they were suited to specific classes of systems, added their own syntax to graphical modeling, and sometimes lacked sufficient libraries to solve certain modeling problems.

System Level Modeling
The system level modeling space consists of both methodology-specific and application-specific modeling domains that overlap to some extent. Methodology-specific domains consist of discrete-event, cycle-based, synchronous data flow, and continuous time models of computation, which provide modeling methodologies for general classes of modeling problems. The discrete-event model of computation is used for digital portions of a design that may have a strong control component. It is very efficient for higher levels of abstraction, as the current simulation time advances based on both deterministic synchronous and asynchronous events. Discrete-event models provide a user with both time-ordered (asynchronous) and concurrent (synchronous) event modeling capabilities.
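
As a minimal illustration of the discrete-event model of computation (a generic sketch, not how BONeS, VisualSim or any specific tool is implemented), the kernel below keeps a priority queue of timestamped events and jumps directly to the next scheduled event instead of stepping every clock cycle:

```python
import heapq

class DiscreteEventSim:
    """Tiny discrete-event kernel: events are (time, seq, action) tuples."""
    def __init__(self):
        self._queue = []
        self._seq = 0          # tie-breaker keeps equal-time events ordered
        self.now = 0.0

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self, until=float("inf")):
        while self._queue and self._queue[0][0] <= until:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)

# Hypothetical example: a packet source feeding a processing block.
def packet_source(sim, period=2.0, remaining=[3]):
    if remaining[0] > 0:
        remaining[0] -= 1
        print(f"t={sim.now:4.1f}  packet generated")
        sim.schedule(1.5, processing_block)   # assumed transit latency
        sim.schedule(period, lambda s: packet_source(s, period, remaining))

def processing_block(sim):
    print(f"t={sim.now:4.1f}  packet processed")

sim = DiscreteEventSim()
sim.schedule(0.0, lambda s: packet_source(s))
sim.run(until=20.0)
```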

A cycle-based model of computation is similar to a discrete-event model of computation, with the proviso that the model is clock-driven, executing the simulation engine on each clock cycle. Cycle-based simulators provide more modeling fidelity, meaning they are usually used for more detailed modeling of digital systems, including verification of the final design. A synchronous data flow model of computation is more DSP-algorithm oriented, meaning the mathematical processing of baseband digital signals, whether vectors, matrices, or complex data types. The internal processing of synchronous data flow models can be simpler than a discrete-event modeling engine, requiring only the presence of tokens at each modeling block to start processing, and the generation of new tokens for subsequent modeling blocks.

Application-specific modeling domains might include:

  1. Multimedia (MPEG, AAC, QuickTime, etc.)
  2. Wireless (3G, GSM, 802.11a/b/g, Bluetooth, UWB, etc.)
  3. Wired (Routers, switches, etc.)
  4. Processors (uP, DSP, bus, cache, SDRAM, ASIC, etc.)

New Thinking
System level modeling is evolving to solve the original problem cited: how to determine quickly and efficiently how a change to a design specification might impact the performance of a proposed system. Is the throughput margin now 15% or 5%? One aspect of the system design process, the design specification itself, typically remains a Word document with text, block diagrams, etc. It is difficult to exchange models with executive staff, marketing, manufacturing, or field support, simply because they lack the modeling expertise certain tools require. If the system level, or golden, model could be exchanged among the disparate groups, or within design groups located around the globe, as part of the design specification, then the evaluation of a proposed change might be performed by the marketing organization directly.

One approach is to use the power of the web browser and Internet-based collaboration to embed a system level model into a design specification document as a Java applet. Companies such as Mirabilis Design of Sunnyvale, California have created a methodology around this to enable design team collaboration and better communication between suppliers and system companies (Figure 3). Any Internet browser can run a system level model embedded as a Java applet within an HTML document. In other words, the design specification can now contain an “executable” system level model that other people within the organization can run with a web browser; no additional software or license is required. The model contained within the Java applet is identical to the tool level model, containing all the parameters and block attributes of the original model. One can modify parameters such as bus speed and input data rate, but cannot modify the base architecture and connectivity.

Similarly, models must be stitched together with Intellectual Property (IP) that is produced and maintained in disparate regions of the world. The system-level models associated with this IP can be located at the source, thus ensuring the latest version is always available to engineers. Using a graphical model construction environment and a sophisticated search engine, the most suitable set of technology can be easily evaluated and selected based on performance/power characteristics. Updates to any technology will be immediately and automatically available at the designer’s desktop. This would be a vast improvement over current IP communication techniques, which depend on expensive custom modeling effort and a transition period for the user to upgrade their database setup. Moreover, the designer has vast degrees of freedom to create a quality product.

Another part of the puzzle is seamless integration of the disparate models of computation. Just as hardware implementation tools support mixed-language mode, system tools now support a common modeling platform. For example, Synopsys integrated Cossap (dynamic data flow) and SystemC (digital) into System Studio, while VisualSim from Mirabilis Design combines SystemC (digital), synchronous dataflow (DSP), finite state machine (FSM), and continuous time (analog) domains. Previous system level tools typically supported a single, specific modeling domain. Additionally, making the modeling software platform operating-system agnostic reduces support cost and enables easy data transition across the company.

Finally, a new approach to standard library development is required. Prior generations of graphical modeling tools might advertise 3,000-plus libraries as a sign of modeling tool maturity and robustness. However, if a new Ultra-Wideband model was required, most of the prior libraries may not be reusable for UWB, since many were prior-generation, application-specific libraries or bottom-up component libraries. A new system level modeling approach will not measure the tool by the quantity of libraries, but by the quality and integration of the system level libraries. Such libraries will have a high likelihood of reuse in a new technology or system level model. Relative to prior generations of graphical modeling tools, VisualSim integrates as many as thirty bottom-up component functions into a single, system level, easy-to-use, reusable block or module. Four VisualSim queue blocks replace 24 prior-generation queue blocks through polymorphic port support and block-level menu attributes, while improving simulation performance.

For more information: https://www.mirabilisdesign.com/contact/

About the Authors:
Darryl Koivisto is the CTO at Mirabilis Design. Dr. Koivisto has a Ph.D. from Golden Gate University, an MS from Santa Clara University and a BS from California Polytechnic University, San Luis Obispo.
Deepak Shankar is Founder at Mirabilis Design. Mr. Shankar has an MBA from UC Berkeley, MS from Clemson University and BS from Coimbatore Institute of Technology.

Also read:

System-Level Modeling using your Web Browser

Architecture Exploration with Mirabilis Design

CEO Interview: Deepak Shankar of Mirabilis Design


Solutions for Defense Electronics Supply Chain Challenges
by Rahul Razdan on 12-08-2022 at 6:00 am

“The amateurs discuss tactics: the professionals discuss logistics.”

— Napoleon

Logistics is even more important today than it was in the early 1800’s. Further, the effectiveness of Defense systems is increasingly driven by sophisticated electronics. As the recent Ukraine conflict reveals, weapons such as precision munitions, autonomous drones, and other similar systems generate asymmetrical advantages on the battlefield. However, all these systems also generate a massive and complex electronics logistical tail which must be managed carefully. For the electronics supply ecosystem, Defense falls into the broader category of Long Lifecycle (LLC) system products.

“Two Speed” Challenge

LLC (long lifecycle) products are products which need to be supported for many years – typically anything over five years. Over this time, these products need legacy part support, and they rarely drive very high chip volumes. However, the economics of semiconductor design imply that custom semiconductors only make sense for high-volume markets. Today, this consists largely of the consumer (cell phone, laptop, tablet, cloud, etc.) marketplace, whose short lifecycle products have a typical lifecycle of less than five years. This hard economic fact has optimized the semiconductor industry towards consumer-driven, short lifecycle products. That is at odds with the requirements of long lifecycle products, both in terms of frequent end-of-life obsolescence and long-term product reliability. Reliability is further impacted in Defense by the fact that these components must perform in strenuous external environments with challenging thermal, vibration and even radiation conditions. This is the “Two Speed” challenge, and it results in very frequent failure and obsolescence of electronic components.

Move towards “Availability” Contracts in Aerospace & Defense

Traditionally in the aerospace and defense industry, an initial contract for development and manufacture is followed by a separate contract for spares and repairs. More recently, there has been a trend towards “availability” contracts, where industry delivers a complete Product Service System (PSS). The key challenge in such contracts is to estimate the “Whole Life Cost” (WLC) of the product, which may span 30 or even 40 years. As one might imagine, this PSS paradigm skyrockets the cost of systems and is still not foolproof, because of its need to predict the future. This has led to some embarrassing costs for defense part procurement as compared to the commercial equivalent.

US Secretary of Defense William Perry’s 1994 memorandum resulted in a move towards performance-based specifications, which led to the virtual abandonment of the MIL-STD and MIL-SPEC system that had been the mainstay of all US military procurement for several decades. Coupled with “Diminishing Manufacturing Sources” (DMS) for military-grade components, the imperative was to move towards COTS (Commercial Off-The-Shelf) components while innovating at the system level to cater to the stringent operating environment requirements. The initial reasoning and belief in using COTS to circumvent high system costs proved effective. However, it did expose defense systems to some key issues.

Key Issues for Defense Industry today

  • Component Obsolescence

Primarily as a result of the “Two-Speed” challenge described above, components become harder to source over time and eventually go obsolete, and the rate of discontinuance of part availability is increasing steadily. Many programs, such as the F22 stealth fighter, AWACS, Tornado, and Eurofighter, are suffering from such component obsolescence. As a result, OEMs are forced to design replacements for those obsolete components and face nonrecurring engineering costs. As per McKinsey’s recent estimates [“How industrial and aerospace and defense OEMs can win the obsolescence challenge,” April 2022, McKinsey Insights], the aggregate obsolescence-related nonrecurring costs for the military aircraft segment alone are in the range of US $50 billion to US $70 billion.

  • Whole Life Cost (WLC)

As mentioned above, with the increasing move towards “availability” contracts in defense and aerospace, one of the huge challenges has been to compute a realistic “Whole Life Cost” (WLC) of the products over the product lifecycle. This leads to massive held-inventory costs, with associated waste when the held inventory is no longer useful. Moreover, any good estimate of WLC requires an accurate prediction of the future.

  • Reliability

Semiconductors for the consumer market are optimized for consumer lifetimes – typically 5 years or so. For LLC markets like Defense, the longer product life in non-traditional environmental situations often leads to product reliability and maintenance issues, especially with the increased use of COTS components.

  • Logistics chain in forward deployed areas

One of the unique issues in Defense, further accentuated by the increased move towards “availability” contracts, is the logistics nightmare of supporting equipment deployed in remote forward areas. A very desirable characteristic would be to have “in theatre” maintenance and update capability for electronic systems. The last mile is the hardest mile in logistics.

  • Future Function

Given the timeframes of interest, upgrades in functionality are virtually guaranteed to happen. Since Defense products often have the characteristic of being embedded in the environment, upgrade costs are typically very high. A classic example is a satellite, where the upgrade cost is prohibitively high. Similarly, for weapon systems deployed in forward areas, upgrade costs are prohibitive. Another example is the obsolescence of industry standards and protocols and the need to adhere to newer ones. In fact, field-embedded electronics (more so in defense) require flexibility to manage derivative design function WITHOUT hardware updates. How does one design for this capability, and how does a program manager understand the band of flexibility in defining new products, derivatives, and upgrades?

Figure 1:  Design for Supply Chain

What is the solution to these issues? The answer is to build a Design for Supply Chain methodology and the associated Electronic Design Automation (EDA) capability.

Just as manufacturing test was optimized by “Design for Test,” power by “Design for Power,” and performance by “Design for Performance,” one should be designing for “Supply Chain and Reliability”! What are the critical aspects of a Design for Supply Chain capability?

  1. Programmable Semiconductor Parts: Programmable parts (CPU, GPU, FPGA, etc) have the distinct advantages of:
    • Parts Obsolescence: A smaller number of programmable parts minimize inventory skews, can be forward deployed, and can be repurposed around a large number of defense electronic systems. Further, the aggregation of function around a small number of programmable parts raises the volume of these parts and thus minimizes the chances for parts obsolescence.
    • Redundancy for Reliability: Reliability can be greatly enhanced by the use of redundancy within and of multiple programmable devices. Similar to RAID storage, one can leave large parts of an FPGA unprogrammed and dynamically move functionality based on detected failures.
    • Future Function: Programmability enables the use of “over the air” updates which update functionality dynamically.
  2. Electronic Design Automation (EDA): To facilitate a design for supply chain approach, a critical piece is the EDA support. Critical functionality required is:
    • Total Cost of Ownership Model: With LLCs, it is very important to consider lifetime costs based on downstream maintenance and function updates. An EDA system should help calculate lifetime cost metrics based on these factors to avoid mistakes which simply optimize near-term costs (a toy calculation of this kind is sketched after this list). This model has to be sophisticated enough to understand that derivative programmable devices can often provide performance/power increases which are favorable compared to older-technology custom devices.
    • Programming Abstractions: Programmable devices are based on abstractions (computer ISA, Verilog,  Analog models, etc)  from which function is mapped onto physical devices.  EDA functionality is critical to maintain these abstractions and automate the process of mapping which optimizes for power, performance, reliability, and other factors. Can these abstractions & optimizations further be extended to obsolescence?
    • Static and Dynamic Fabrics: When the hardware configuration does not have to be changed, EDA functionality is only required for programming the electronic system. However, if hardware devices require changes, there is a need for a flexible fabric to accept the updates in a graceful manner. The nature of the flexible fabric may be mechanical (e.g., rack-mountable boards) or chemical (quick respins of PCBs, which may be done in-field). All of these methods have to be managed by a sophisticated EDA system. These methods are the key to the ease of integration of weapons systems.
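
As a toy illustration of the kind of lifetime-cost comparison mentioned in the Total Cost of Ownership bullet above, the sketch below compares a fixed-function board that needs repeated obsolescence-driven redesigns against a programmable alternative that mostly absorbs changes in firmware or bitstreams. Every dollar figure, volume and lifetime here is a hypothetical placeholder, not program data.

```python
def lifetime_cost(nre, unit_cost, units_per_year, years,
                  redesigns, redesign_nre, discount_rate=0.05):
    """Discounted whole-life cost: initial NRE + yearly production +
    obsolescence-driven redesigns spread evenly over the lifetime."""
    cost = nre
    for year in range(1, years + 1):
        cost += unit_cost * units_per_year / (1 + discount_rate) ** year
    for k in range(1, redesigns + 1):                 # assumed even spacing
        year = round(k * years / (redesigns + 1))
        cost += redesign_nre / (1 + discount_rate) ** year
    return cost

# Fixed-function board: cheaper per unit, but frequent part obsolescence
# forces repeated redesign and requalification cycles (all values assumed).
fixed = lifetime_cost(nre=2e6, unit_cost=800, units_per_year=500,
                      years=30, redesigns=6, redesign_nre=1.5e6)

# Programmable (CPU/GPU/FPGA-based) board: higher unit cost, far fewer
# hardware redesigns because updates are delivered as firmware/bitstreams.
programmable = lifetime_cost(nre=2.5e6, unit_cost=1100, units_per_year=500,
                             years=30, redesigns=1, redesign_nre=1.5e6)

print(f"fixed-function whole-life cost : ${fixed / 1e6:.1f}M")
print(f"programmable whole-life cost   : ${programmable / 1e6:.1f}M")
```

The point is not the specific numbers: with these assumed values the decision flips once downstream redesigns and long support periods are counted, which is exactly what a near-term unit-cost comparison hides.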

With the above capability, one can perform proactive logistics management. One of the best practices that can yield rich dividends is to constitute a cross-functional team (with representation from procurement, R&D, manufacturing and quality functions) which continuously scans for potential issues. This team can be tasked with developing a set of lead indicators to assess components, identify near-term issues, and develop countermeasures. In order for these cross-functional programs to work, the EDA functionality has to be tied into the Product Lifecycle Management (PLM) systems in the enterprise.

Currently, a great deal of system intent is lost across the enterprise, or over time as the design team moves to other projects. Thus, even the OEMs who use such proactive obsolescence management best practices are significantly hampered by the lack of structured information or sophisticated tools which allow them to accurately predict these issues and plan mitigating actions, including last-time strategic buys, finding alternate suppliers, and even finding optimal FFF (Fit-Form-Function) replacements. It is imperative that such software (EDA) tool functions become available yesterday.

Summary

The semiconductor skew towards short lifecycle products (aka consumer electronics) has created a huge opportunity for the Defense industry to access low-cost electronics. However, it also generates issues when there is a need to support products for more than 30 years, as a result of fast obsolescence and shorter component reliability. What further worsens the situation for Defense are a couple of situations unique to it: first, the logistics issues in supporting equipment in forward deployed areas, and second, the increasing use of “availability” or full Product Service System contracts, where establishing the WLC (Whole Life Cost) of the equipment becomes critical. The only way this can be solved efficiently is to bring in a paradigm shift to “Design for Supply Chain.” EDA innovation is needed to support these levels of abstraction at the system/PCB level and be a catalyst for this paradigm shift. McKinsey estimated a 35% reduction in obsolescence-related nonrecurring costs from merely using a structured, albeit still reactive, obsolescence management methodology; a proactive “Design for Supply Chain” approach can be truly transformational.

Acknowledgements:

  1. Anurag Seth : Special thanks to Anurag for co-authoring this article.
  2. D. “Jake” Polumbo (Major General USAF (ret.)): Thanks for Review (https://www.twoblueaces.com/)
  3. Mark Montgomery (Executive Director): Thanks for Review (CyberSolarium.org)
Also Read:

INNOVA PDM, a New Era for Planning and Tracking Chip Design Resources is Born

IDEAS Online Technical Conference Features Intel, Qualcomm, Nvidia, IBM, Samsung, and More Discussing Chip Design Experiences

The Role of Clock Gating


Podcast EP129: Sondrel’s Unique Position in the Custom Chip Market
by Daniel Nenni on 12-07-2022 at 10:00 am

Dan is joined by Graham Curren, CEO of ASIC provider Sondrel. Graham founded Sondrel in 2002 after identifying a gap in the market for an international company specialising in complex digital IC design. Prior to establishing Sondrel, Graham worked in both ASIC design and manufacturing before joining EDA company, Avant! Corporation.

Dan explores what makes Sondrel unique with Graham. They discuss the company’s recent IPO, as well as how the company’s design skills, track record and market focus have helped them to grow.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Re-configuring RISC-V Post-Silicon
by Bernard Murphy on 12-07-2022 at 6:00 am

How do you reconfigure system characteristics? The answer to that question is well established – through software. Make the underlying hardware general enough and use platform software to update behaviors and tweak hardware configuration registers. This simple fact drove the explosion of embedded processors everywhere and works very well in most cases, but not all. Software provides flexibility at the expense of performance and power, which can be a real problem in constrained IoT applications. Consider tight-loop encryption/decryption for example. Meeting performance and power goals requires acceleration for such functions, which seems to require custom silicon. But that option only makes sense for high-volume applications. Is there a better option, allowing for a high-volume platform which can offer ISA extensibility for acceleration post-silicon? This is what Menta and Codasip are offering.

Step 1: First build a RISC-V core for your design

This step looks like any RISC-V instantiation though here you use a Codasip core, for reasons you’ll understand shortly. Codasip provides a range of directly customizable RISC-V cores which a semi supplier might choose to optimize to a specific yet broad market objective. The tool suite (Codasip Studio) offers all the features you would expect to get in support of such a core, including generating an SDK and the ability to hardwire customize the ISA. (Here, by “hardwire” I mean extensions built directly into the silicon implementation.) Codasip Studio also provides tools to explore architecture options and generate a custom compiler.

The hardware implementation of custom instructions is through an HDL block in parallel to the regular datapath, as is common in these cases. The HDL for this block is defined by the integrator to implement custom instructions, a byte-swap for example. Codasip Studio takes care of vectoring execution to the HDL rather than the ALU as needed, also connecting appropriate register accesses.
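
As a simple illustration of what such a parallel datapath block computes, here is a software reference model of the byte-swap example in Python (an assumed reference model for clarity, not Codasip CodAL code or the actual HDL):

```python
def bswap32(x: int) -> int:
    """Reference model of a byte-swap custom instruction on a 32-bit word:
    0x11223344 -> 0x44332211, as the HDL block in the datapath would compute."""
    x &= 0xFFFFFFFF
    return ((x & 0x000000FF) << 24 |
            (x & 0x0000FF00) << 8 |
            (x & 0x00FF0000) >> 8 |
            (x & 0xFF000000) >> 24)

assert bswap32(0x11223344) == 0x44332211
```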

Step 2: Add an eFPGA block in the datapath

So far, this is just regular RISC-V customization. Extending customization options to post-silicon requires reprogrammable logic, such as that offered by Menta. Their technology is standard cell based and is claimed portable to any process technology, making it readily embeddable in most SoC platforms. You can start to see how such a RISC-V core could host not only hardwired extensions but also programmable extensions.

This needs involvement from Codasip Studio (CS) at two stages. First, you as an SoC integrator must tell the system that you plan to add ISA customization after manufacture. This instructs CS to embed an unprogrammed eFPGA IP into the datapath.

Second, when silicon is available, you (or perhaps your customer?) will re-run CS to define added ISA instructions, along with RTL to implement those instructions. This will generate a revised compiler and SDK, plus a bitstream to program the eFPGA. Voilà – you have a post-silicon customized RISC-V core!

Post-silicon ISA customization

To recap, this partnership between Codasip and Menta offers the ability to customize RISC-V cores not only pre-silicon but also post-silicon, enabling an SoC vendor to deliver products which can be optimized for multiple applications, with potential for high-volume appeal. You can learn more in this white paper.

Codasip is based in Europe but has customers worldwide, including Rambus, Microsemi, Mythic, Mobileye and others. Menta is also based in Europe and has particular strengths in secure, defense and space applications. As a technologist with roots in the UK, it’s nice to see yet more successful growth in European IP 😊.

Also Read:

Scaling is Failing with Moore’s Law and Dennard

Optimizing AI/ML Operations at the Edge