The shortage of semiconductors for automotive applications is getting worse. Recent statements from major automakers:
General Motors significantly cut production at eight North American plants earlier this month due to the semiconductor shortage. GM expects its North American vehicle production in the second half of the year to be down about 100,000 vehicles compared with the first half.
Ford Motor has cut North American production of its popular F-150 pickup truck due to the shortage.
Toyota announced on September 10 it will reduce global vehicle production by 70,000 units in September and 333,000 units in October. For its full fiscal year ending March 31, 2022, Toyota expects to produce 9 million vehicles, down from its previous forecast of 9.3 million.
Volkswagen has also cut production and will build 100,000 fewer vehicles in 2021 than planned.
Hyundai Motor cut production at its U.S. Hyundai and Kia plants.
Stellantis, formed by the merger of the Fiat-Chrysler and Peugeot groups, temporarily halted vehicle production at four plants in North America and one in Italy in late August.
At the Munich Motor Show earlier this month, Daimler AG’s CEO Ola Kallenius stated its Mercedes unit will have significantly lower third-quarter sales, but expects its semiconductor supply to improve in the fourth quarter. He expects shortages to influence 2022 auto production, with the industry fully recovering in 2023. Ford Europe chairman Gunnar Herrmann said the semiconductor shortage could continue until 2024. Volkswagen CEO Herbert Diess expects shortages to ease as countries reduce COVID-19 cases, but a general shortage of semiconductors could persist for some time due to demand from other applications such as the internet of things.
The chart below shows vehicle production by the six largest automakers relative to March 2021, when production was roughly at pre-pandemic levels. Since March, production has generally been on a downtrend due to semiconductor shortages. SAIC of China fared the best, bouncing back to 92% of March production in August from 66% in July. In June, SAIC Motor Chairman Chen Hong stated his company’s semiconductor supply shortage would be alleviated in late July. In August 2021, GM Group, Hyundai Group and Toyota Group each produced vehicles at about 60% of the March level. VW Group and Stellantis in August were at about 37% of March production.
On September 16, IHS Markit updated its April forecast for global light vehicle sales. The 2021 forecast was reduced by 7.7 million units, or 9%, and the 2022 forecast was cut by 7.1 million units, or 8%. The April forecast called for 2022 sales of 89.7 million units, up from the pre-pandemic 89.0 million in 2019. The September forecast has 2022 sales at 82.6 million, 7% below 2019 levels. Thus, the automotive market is not expected to fully recover until at least 2023.
Our Semiconductor Intelligence April newsletter asserted automakers were primarily to blame for their semiconductor shortages since they drastically cut production and semiconductor orders while other semiconductor applications were either relatively stable or growing. Another contributing factor was automakers’ use of Just-In-Time (JIT) inventory management systems. JIT systems are designed to reduce automakers’ parts inventories by working with suppliers to furnish parts just as they are needed for production.
The Wall Street Journal in April 2021 stated that many automakers are modifying their JIT systems to hold safety stock of critical materials such as semiconductors. Raconteur.net proposes automakers move from JIT to just-in-case (JIC). JIC means keeping a minimum safety level of inventory for critical components. Sourcengine.com points to NXP Semiconductor as an example of a supplier signing medium-term supply contracts with some of its automotive customers. It cites a DigiTimes prediction that automotive semiconductor shortages will not be resolved until the middle of 2022. IHS Markit estimates automotive microcontroller supply will not catch up with demand until 2Q 2022.
Automakers are now realizing the importance of semiconductors. Automotive semiconductors have unique requirements compared to other applications. Many need to operate in extreme temperatures. Devices must often be supplied in high volumes over long time periods. Automakers compete with other major applications such as PCs and smartphones for the more advanced devices such as microcontrollers. Automakers generally cannot switch to another semiconductor supplier in the short term. Trends toward electric cars and self-driving cars only make semiconductors even more important. It is time for automakers to think more strategically about semiconductors and not treat them as just another component in a vehicle.
A processor ISA provides an abstraction against which to verify an implementation. We look here at a paper extending this concept to accelerators, for verification of how these interact with processors and software. Paul Cunningham (GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO) and I continue our series on research ideas. As always, feedback welcome.
The authors aim with ILA to extend the ISA (Instruction Set Architecture) concept for processors to include accelerators common in most SoCs, providing a unified abstraction for all software-visible compute units in such SoCs. Accelerators are already visible to software, for example through memory-mapped IO; ILA adds instruction abstractions to this view. Their goal is to enable scalable software validation against an abstract model of the SoC and a standard abstraction against which implementations can be verified. More generally, they aim to reduce verification complexity as architectures expand further into massively multi-core and multi-die systems.
They consider applications to accelerators for image processing, machine learning and cryptography, and to a RISC-V core. ILAs here are generated either by hand or through a template-based synthesis flow. ILA allows for hierarchical levels of abstraction, to model microcode-level complexities like looping and implementation choices like buffering for streaming IO.
The authors use formal methods based on commercial and open-source tools to verify their ILAs. They do this through equivalence checking to compare two different implementations of an ILA (e.g. a specification ILA versus a more elaborated ILA). They also run equivalence checks between ILAs and RTL implementations.
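To make the idea a little more concrete, here is a minimal sketch, my own and in Python, not the paper's formalism or tooling: each "instruction" pairs a decode predicate over software-visible state and an MMIO command with a state-update function, and two ILA models can be compared by stepping them on the same command trace. All names below are illustrative.

```python
# Minimal conceptual sketch of an instruction-level abstraction (ILA).
# Names and structure are illustrative only, not the paper's actual formalism.

from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # architectural (software-visible) state

@dataclass
class Instruction:
    name: str
    decode: Callable[[State, Dict[str, int]], bool]   # does this MMIO command fire it?
    update: Callable[[State, Dict[str, int]], State]  # next architectural state

class ILA:
    def __init__(self, instructions: List[Instruction]):
        self.instructions = instructions

    def step(self, state: State, cmd: Dict[str, int]) -> State:
        for insn in self.instructions:
            if insn.decode(state, cmd):
                return insn.update(state, cmd)
        return state  # no instruction fired

def equivalent_on(trace: List[Dict[str, int]], spec: ILA, impl: ILA, init: State) -> bool:
    """Bounded check: do two ILA models agree on visible state for a command trace?"""
    s_spec, s_impl = dict(init), dict(init)
    for cmd in trace:
        s_spec = spec.step(s_spec, cmd)
        s_impl = impl.step(s_impl, cmd)
        if s_spec != s_impl:
            return False
    return True

# Example: an accelerator whose "START" MMIO write doubles a data register.
start = Instruction(
    name="START",
    decode=lambda st, cmd: cmd.get("addr") == 0x10 and cmd.get("wdata") == 1,
    update=lambda st, cmd: {**st, "result": st["data"] * 2, "done": 1},
)
spec = ILA([start])
impl = ILA([start])  # a refined model would be checked against this spec
print(equivalent_on([{"addr": 0x10, "wdata": 1}], spec, impl,
                    {"data": 3, "result": 0, "done": 0}))
```

In the paper this kind of equivalence checking is done formally rather than by simulating a trace, but the shape of the abstraction is the same.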
Paul’s view
This is a very thought-provoking paper. The trend towards domain-specific heterogeneous compute is unstoppable, fueled in large part by the AI/ML transformation ongoing all around us.
Traditional CPU design benefits from a rich ecosystem of verification and validation techniques built around formal definitions of ISAs. These ISAs serve as a bridge between the software validation world and hardware verification world. Software is compiled into a sequence of ISA instructions. Hardware processes that ISA sequence in silicon.
In this paper the authors attempt to generalize an ISA beyond general purpose CPUs to any domain specific accelerator. They observe that while accelerators do not literally support an ISA, it is possible to map an accelerator’s MMIO or stream interface to something that looks and feels like an ISA. A key contribution of the paper is a formal definition of this “ILA” generalization such that no matter what concurrency there is in hardware between the CPU and its accelerators, the software world can see a flattened, sequential stream of unified ILA instructions across all compute engines.
The paper is very practical, taking the reader through multiple worked examples of ILAs for accelerators (Gaussian blur, recurrent neural network, AES encryption), along with proofs of correctness for RTL implementations of these accelerators to their respective ILAs – nice to see one of the tools they use being JasperGold from Cadence 🙂
I would love to see follow-on papers showcasing worked examples of the other side of ILAs: how they can be used to improve validation of software leveraging multiple accelerators and CPUs. While the way an ISA bridges software and hardware for a CPU is clear, a software program that leverages accelerators does not literally compile into ILA instructions, so the bridge between hardware and software for accelerators using ILAs is less clear to me, and I would appreciate a worked example.
Raúl’s view
The paper is a significant contribution around modeling, not really about innovation in verification, as Paul noted. Verification is based on existing verification tools; the contribution here is methodology and modeling. I think it is fair to note that the human effort involved in the setup (including setup for proofs) seems more significant than run-time and probably requires a high level of expertise. The authors hand-generated some of the ILA models, not so surprising at this stage of evolution. Some they generated through template-based synthesis, checking against reference models (C/C++, SystemC, Chisel, RTL). Though they don’t go into details, I suspect there is significant manual work behind the scenes in finalizing those models. Even the formal verification setup probably requires a lot more know-how and work than detailed in the paper.
In terms of how EDA tools could use ILA in the future, that depends obviously on a usable language / notation. We’d have to look carefully at comparable objectives and adoption rates. Chisel for example is another way to abstract, also using templates; Berkeley used Chisel to build the RISC-V Rocket and Google used it to design an edge TPU. SystemC might be a more interesting reference as a high-level standard which has faced adoption challenges.
This looks like a starting point on a long road, pointed to by future work suggestions. Extensions to concurrency, consistency of shared memory, accelerator code generation and reliable simulator generation for software development are some examples.
My view
While this paper doesn’t demonstrate an application, I did find a related paper, also from Princeton, on application to memory consistency verification. Perhaps we can review that paper in a later blog. I see this paper more as an introduction to the concept, with a demonstration applied to a range of accelerators, rather than as a self-contained innovation. Some innovations build on multiple sub-innovations. This is a sub-innovation 😀
My first IR drop analysis was back in the early 1980s at Intel, where I had to manually model the parasitics of the VDD and VSS interconnect for all of the IO cells that our team was designing in a graphics chip. I then ran that netlist in a SPICE simulator using transient analysis, measuring the bounce in VSS and the droop in VDD as all of the IOs switched simultaneously, to keep the power and ground levels within a safe operating region. On another occasion, in 1980, I was debugging a DRAM chip because a certain percentage of the chips were failing when a specific portion of the aluminum wiring heated up enough to cause the metal to bubble and form open circuits, an electromigration (EM) failure. Oh, how I wished for some EDA tools to help me pinpoint reliability issues like EM/IR before tape-out, of course.
For the past few decades there have been EM/IR tools available in the EDA market, mostly for big digital designs and limited analog designs, so reliability analysis has been performed in order to avoid field failures. Last week I had a Microsoft Teams meeting with Joseph Davis of Siemens EDA, where he outlined their brand new EM/IR tool offering, called mPower.
Past approaches to EM/IR analysis have been to use static techniques on big analog blocks, which are faster but less accurate than dynamic analysis, and then to use the more accurate dynamic analysis for smaller blocks, typically analog or AMS. What jumped out at me right away with the new mPower tool is that it has the capacity to handle the full chip using the more accurate dynamic analysis, as shown below:
mPower Capacity
The secret sauce of mPower is in how it scales across your network of CPUs, creating quick run times and enabling billion-transistor capacity while also minimizing RAM usage. Inputs to mPower are industry-standard file formats, so there’s little work for your CAD group, and it’s a quick learning curve for your design engineers using the tool: just pick your favorite extraction tool and SPICE circuit simulator. Siemens EDA does offer the popular Calibre tool for extraction and AFS for SPICE circuit simulation, but really, any vendor tool works.
For digital flows, there’s the mPower Digital tool, and then for transistor-level analog flows, the product is mPower Analog. I asked about which IC design companies are using mPower for EM/IR analysis, and was impressed with the initial list:
MaxLinear – full-chip, large analog
Efinix FPGA – full-chip, transistor-level EM/IR analysis without IP modeling
Esperanto – AI chip with 1,000+ RISC-V cores, ran mPower Digital on their own network
On Semi – pixel array designs, both mPower Digital and Analog
The AI chip from Esperanto has 24 billion transistors, so that’s a prime example showing the capacity of the mPower Digital tool. The other customer examples just couldn’t be run with competitor tools using the dynamic approach.
Support from foundries will be announced soon, but just know that you can use this new EM/IR tool at process nodes down to 5nm with confidence today, and that 3nm support is in the works. Even 2.5D or 3D ICs can be analyzed for EM/IR compliance, and you can run mPower Digital in the cloud to meet your time-to-market requirements.
Summary
During IC layout, and certainly before tape-out, your design team needs assurance that EM/IR reliability concerns have been analyzed and that the layout has been properly updated. There is now a new choice for this type of analysis, and Siemens EDA has carved out some unique properties in this segment, like full-chip dynamic analysis using mPower Analog. It’s worth a look to see how your designs could benefit from higher capacity while fitting within the compute resources already in place. EDA competition always fosters innovation, and vendor loyalties can quickly change if the new entrant delivers on its promises.
Anyone can create a testbench [TB] and verify the design, but it can’t simply be reused as a verification IP [VIP]. So I would like to address in this article: What is a VIP? How can we build a high-quality VIP? How can we verify the VIP? What else can we do to make the VIP unique and commercially more valuable?
Most module/IP-level nonstandard testbenches are used once to verify the design. Is that efficient? We always want to use the same module/IP-level testbench to verify the IP’s derivatives, or the same IP at the chip/SoC level, efficiently. Also, if any third-party vendor or client wants to use the testbench to verify their IP/chip, then the testbench should comply with coding guidelines as per a standard methodology like UVM. So, a reusable testbench that follows a standard methodology, a scalable TB architecture, and coding guidelines is called a Verification IP. Let me share some of the important guidelines for implementing a VIP.
Verification Plan: Defines the verification intent of the DUV/DUT [Design Under Verification/Test]. It captures all the design features and defines how each feature can be verified and tracked closely. It acts as a golden reference document for all the verification folks responsible for the verification sign-off. The VIP functional specification will have all the details – Vplan, TB Architecture, Coverage Model, Verification Strategy, Test Scenarios, etc.
The Verification Plan [Vplan] is different from the test plan, as it is based on the DUT features and random testcases. In SystemVerilog, we use covergroups and assertions to generate the functional coverage, primarily to track the verification progress during regression testing, which is predominantly done through random testcases. This verification tracking process can be automated by back-annotating the functional coverage to the Vplan document [Excel/Word doc] during simulation using the EDA tool. But with a test plan, we track progress manually by running every directed testcase, usually developed in HDL. Despite this huge difference (directed vs. random), traditional verification folks still refer to the Vplan as a test plan, similar to how we informally refer to the RTL as the DUT instead of the DUV.
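As a simple illustration of the coverage-driven tracking described above, here is a small conceptual sketch in Python. In practice this is done with SystemVerilog covergroups; the bin names below are made up purely for illustration.

```python
# Minimal illustration of the functional-coverage idea behind a Vplan: random
# stimulus hits coverage bins, and the percentage of bins hit tracks progress.
# SystemVerilog covergroups do this for real; this Python is only a stand-in.

import random

bins = {"len_min": False, "len_mid": False, "len_max": False}

def sample(pkt_len):
    """Record which coverage bin a randomly generated packet length falls into."""
    if pkt_len == 0:
        bins["len_min"] = True
    elif pkt_len == 255:
        bins["len_max"] = True
    else:
        bins["len_mid"] = True

for _ in range(1000):                 # a random regression run
    sample(random.randint(0, 255))

coverage = 100 * sum(bins.values()) / len(bins)
print(f"Functional coverage: {coverage:.1f}%")   # back-annotated to the Vplan
```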
TB Architecture: Currently, most companies prefer a standard testbench methodology like the IEEE-standard UVM [Universal Verification Methodology] to define their SystemVerilog TB architecture. It doesn’t mean that one can simply create UVM agents for all the DUV interfaces and connect them together at the TB top level. To create a proper working TB, one needs to understand what the working environment of the RTL would be in a real system and how it can be modelled. This is the most challenging part of the verification process. Having a proper TB architecture that can support our verification strategies means that 50% of our job is over.
Let me share my experience of how we created the TB architecture for the Bluetooth verification IP [ABLE – Aceic’s Bluetooth Low Energy]. Refer to the figure below – ABLE’s architecture.
ABLE was created mainly to verify the DUV, the Link Layer RTL of Bluetooth. So, we created the UVM TB to mimic the host and a TLM functional reference model in UVM to mimic the Bluetooth Link Layer. The TB used HCI [Host Controller Interface] as a TLM interface to configure and interact with the Link Layer functional model. The TB acted as the host, generating stimulus to configure and drive both the BLE reference model and the RTL. The scoreboard compared the DUV outputs with the expected values generated by the reference model.
More importantly, using the DUV reference model as TLM, one can add or remove any number of BLE devices dynamically during simulation by creating or deleting the TLM objects, as we do in the real-time environment.
All the Link Layer compliance and HCI test scenarios were modeled using UVM sequences. The LL-TS [Link Layer Test-suite] and HCI-TS [Host Controller Test-suite] were invoking those UVM virtual sequences.
Also, we added the necessary adapters [BFMs] to replace the functional model with the RTL BLE IP at one end, as most of the DUVs use standard bus interfaces like UART/SPI. One can configure the DUT adapter and use it with any interface, so the DUT-side sequences remain the same at the TB top level. We also made smart use of the UVM RAL model to capture the DUT status and the reference/received data through backdoor access for the scoreboard data comparison.
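To summarize the back-to-back checking structure described above, here is a small conceptual sketch, written in Python for brevity rather than the SystemVerilog/UVM used in the actual ABLE testbench; the class and field names are illustrative only. The same stimulus drives a reference model and the DUT, and a scoreboard compares their responses.

```python
# Conceptual sketch of back-to-back checking: reference model vs. DUT,
# compared by a scoreboard. Not the real UVM code; names are illustrative.

import random

class ReferenceModel:
    """Stand-in for the TLM functional reference model of the Link Layer."""
    def expected(self, hci_cmd):
        # Predict the DUV response for a given HCI command (illustrative rule).
        return {"opcode": hci_cmd["opcode"], "status": 0x00}

class Scoreboard:
    def __init__(self):
        self.mismatches = 0

    def compare(self, expected, actual):
        if expected != actual:
            self.mismatches += 1
            print(f"MISMATCH: expected {expected}, got {actual}")

def run_sequence(num_cmds, dut_respond):
    """Drive the same stimulus to the reference model and the DUT, then compare."""
    ref, sb = ReferenceModel(), Scoreboard()
    for _ in range(num_cmds):
        cmd = {"opcode": random.randint(0, 255)}      # randomized stimulus
        sb.compare(ref.expected(cmd), dut_respond(cmd))
    return sb.mismatches

# A DUT stub that happens to match the reference model exactly (0 mismatches).
print(run_sequence(10, lambda cmd: {"opcode": cmd["opcode"], "status": 0x00}))
```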
Verification Strategy: Our approach was back-to-back verification using our BLE reference model, the UVM TLM. As we were only doing VIP development, we didn’t have access to any BLE RTL IP, so we had to find a way to verify our functional model. I created two different teams, a TB team and a modeling team, and had them work independently in parallel on the functional model and the testbench. The TB folks created their own DUV reference model as predictor logic, based on their own interpretation of the DUV [Link Layer] specification. Eventually, we had to reconcile their differences in interpretation, which helped us find all the VIP bugs.
You can’t sell a VIP that has more bugs than the DUT. Eventually, your customer will end up finding bugs in VIP rather than verifying their DUT. So, the verification strategy is critical for the success of your VIP.
As shown in the above figure, the VIP provider should also provide all the necessary collateral:
Executable verification plan which maps all the coverage data
Scripts that can run the regression test-suite that includes all the compliance tests and back-annotate the coverage data to the verification plan
Assertion IP to verify the interface protocols
Reference models that can be used independently as TLMs
User guides to understand and run the VIP, examples, etc.
No one will buy the VIP just because its source code compiles and generates stimulus on any industry-standard simulator. Your customer will ask you to prove how your VIP is different from other commercially available VIPs on their DUT during a detailed evaluation phase beyond your impressive pre-sales presentation and demo. So, you really need to think about how fast you can find bugs in their design and excite your customer beyond their usual expectations like easy VIP integration, user interface, and complete automation.
I’ve simulated IC designs at the transistor level with SPICE, at the gate level and RTL with Verilog, and I’ve even used cycle-based functional simulators. Sure, they each worked well, but only for the domain and purpose they were designed for. Industry analyst Gary Smith predicted that the IC world would soon move to system-level modeling, and I’m seeing more tools in this area. One notable vendor that focuses on system-level modeling is Mirabilis Design, and their simulator is called VisualSim. Mirabilis Design recently issued a press release about a new product called VisualSim Cloud, so I contacted Deepak Shankar, Founder, to learn more about it.
VisualSim Cloud is a cloud-based simulation platform that can be used by architects, software designers and developers to quickly explore, conduct trade-offs, and optimize the specification. It has the complete feature set of VisualSim Architect. From within a Web browser, users can assemble models, run simulations by varying parameters, and view/save the results. Models constructed by the user are stored on their respective desktops. All VisualSim libraries are available in VisualSim Cloud.
VisualSim Cloud is a completely new product that has been in development for over 4 years. It is the next generation of the VisualSim Explorer that we announced a few years ago, which was used mostly as a server product within companies. The current release of VisualSim Cloud is equivalent to VisualSim Architect 2130.
Agile methodology is being used to manage the versions within the Cloud. The Cloud version is updated as soon as new features have been developed and fully tested. This includes GUI features, simulation speed improvements and new library components.
VisualSim Cloud will showcase all types of modeling – analog, data centers, supply chain, electronics, semiconductors and software.
Modeling and simulation can be carried out anywhere. The models can be loaded on a drive, and VisualSim Cloud can open them from any machine with just a Web browser. There is no need for a software download or to set up a complex licensing mechanism. The simulation engine uses the compute and memory power of the local desktop.
New updates and bug fixes can be provided instantly. The user does not need to wait for the next release or request CAD to update their install.
Q: What problems are you trying to solve?
System-level modeling takes a long time to get started: you need to be on the corporate network, get approvals before you can conduct system modeling, and you are restricted from taking your work anywhere or leveraging any server that does not have VisualSim installed. With this approach, everything is performed through the browser. All the tutorials, documentation and starting models are available online. So, the user is up and running as soon as they get the login information.
Also, new features are not always immediately communicated to all users. This problem is eliminated because the user will always be working with the latest release.
Q: Who should use this tool?
There are several main users: students; casual users; users who need the software for a single project or a single analysis; and overflow scenarios when a license is not available in the corporate environment.
Q: Which OS do you support?
VisualSim Cloud will work on any platform that has a Web browser. This includes Windows, Linux and Mac OS.
Q: What else needs to be installed on my computer to run this?
To launch and start modeling, the user needs a login, Java 14 installed on the machine, and a small download called OpenWebStart. Please note that right now support is limited to Java 14. Before launching VisualSim Cloud for the first time, the user needs to configure OpenWebStart to work with Java 14. When they click on the Launch button, a small executable is downloaded. When the user double-clicks on this executable, a series of windows will ask the user if they would like to execute it. The user must accept all the security prompts. Finally, VisualSim will open within the Web browser. There are several models available in VisualSim Cloud using File->Open Template. Also, if the user has models on their desktop, they can open them as well.
Q: Are there any modeling limitations?
There are two limitations currently in VisualSim Cloud. User-created classes cannot be used in the models, and batch-mode simulations are not supported.
Q: When should I use VisualSim Cloud versus standard VisualSim?
VisualSim Cloud ensures the user always has the latest version, and there is no need for installation and management of the software locally. This can be of great use to students working on research or class projects, professors offering assignments, researchers using it for the short term, and startups or smaller companies that do not have an IT/CAD infrastructure. It also suits anyone who wants to use it for a short time, wants flexibility in their working environment, or needs overflow capacity beyond existing company infrastructure.
Unlike other Cloud solutions, VisualSim still stores the models locally. This way the user manages the data and also does not have to pay expensive cloud provider fees.
Q: What is the cost of VisualSim Cloud?
Until the end of 2021, there is no charge. After that we will be charging between $500 and $3,000 a month depending on the libraries, type of customer, and usage. There are no cloud provider fees.
Q: If VisualSim Cloud is free, then how do you make money?
VisualSim Cloud is free for some types of teaching and student research. It is not free for commercial operations. The price is listed above.
Q: Are there any capacity or practical limits to the size of a system that I want to simulate?
There are no model capacity or simulation limitations with the cloud version. Of course, there are limits in the interface to other tools, as they also need to be cloud compatible.
Q: What are some of the largest designs simulated on VisualSim Cloud?
We have simulated three large designs that cover the most popular market segments:
An SoC with a 64-router NoC, 4 HBMs, 64 ARM N1 cores, and a host of associated interfaces, DMA and cache
TSN and CAN Network with 85 devices
Multi-blade PCIe system with ten 100Gbps interfaces
Q: Does VisualSim Cloud co-simulate with other tools or have an API?
VisualSim Cloud has an open and documented API. Currently, no other system modeling tool has a cloud facility. We welcome any and all companies to integrate their IP and simulators with VisualSim.
Q: How would you compare VisualSim Cloud to something like MathWorks, Simulink, Simscape, Stateflow tools? Are there any competing tools out there?
VisualSim is the only integrated multi-model-of-computation and system modeling solution on the Cloud. VisualSim is used for architecture exploration, performance analysis and integrated power exploration. While others are focused on correctness of algorithms and code generation, VisualSim focuses on providing Intellectual Property that enables designers to develop new SoCs, processors, network equipment, radar and communication systems, and avionics.
Q: Can I run Monte Carlo simulations, or perform design optimizations?
Yes, you can run Monte Carlo simulations by varying configurations, topologies, and parameters. Using the soon-to-be-launched requirements/diagnostic engine, much more accurate design optimization can be performed. For example, if you are designing an SoC, you create the best configuration to meet the quality-of-service, throughput and power requirements for a multimedia, AI or networking application. Similarly, if you are designing an AI machine for an automotive application, you can select the interface, AI processor or FPGA, memory size and software partitioning.
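As a generic illustration of the Monte Carlo idea (this is not VisualSim's API or model library; the performance and power expressions below are deliberately trivial stand-ins), a sketch might sweep randomized configurations and keep only those that meet throughput and power targets:

```python
# Generic Monte Carlo sweep over randomized configurations; the "model" here
# is a toy stand-in, not VisualSim. Keep configurations that meet the targets.

import random

def simulate(config):
    """Toy performance/power model of an SoC configuration."""
    throughput = config["clock_mhz"] * config["lanes"] * 0.8      # Mb/s, illustrative
    power = 0.05 * config["clock_mhz"] + 0.3 * config["lanes"]    # W, illustrative
    return throughput, power

def monte_carlo(n_runs, throughput_min, power_max):
    passing = []
    for _ in range(n_runs):
        config = {"clock_mhz": random.choice([400, 600, 800, 1000]),
                  "lanes": random.randint(1, 8)}
        throughput, power = simulate(config)
        if throughput >= throughput_min and power <= power_max:
            passing.append((config, throughput, power))
    return passing

for config, tput, pwr in monte_carlo(1000, throughput_min=2000, power_max=45)[:3]:
    print(config, f"{tput:.0f} Mb/s", f"{pwr:.1f} W")
```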
Q: Which Universities are using VisualSim Cloud in their curriculum?
A number of universities and companies are using VisualSim Cloud: Wichita State, Politehnica University – Romania, Xiaopeng Motors, eSol Trinity, ELC Labs, Shivaji University, ZTE, City University of Seattle, University of Ottawa, Draper Laboratory, TH Cologne, University of East London and a few others.
Q: What was the impetus to offer a cloud-based system simulator?
System design is not well understood and there is very little time allocated. Engineers prefer to get their analysis done quickly. We felt a Cloud version will enable engineers to get started very quickly and use existing templates to quickly explore their designs. This will provide instant benefit without having to setup a formal Engineering infrastructure. As the company evolves to make systems design more mainstream, the engineers can migrate to the Desktop or corporate version.
Q: Does Mirabilis have a 3rd party program for other vendors that want to integrate with your ecosystem?
Yes, we have a fully open API. Partners can work with us in several ways: creating value-add features for new application markets (industrial or medical); developing libraries and marketing them for their existing market segment (automotive networks); training companies building labs and interactive learning solutions; and developing multi-model-of-computation designs for sensors, mixed-signal and control systems.
Cloud Hardware
Q: Which datacenter are you using for VisualSim Cloud?
Mirabilis Design maintains a private data center at Host Gator in Houston, TX and Provo, UT.
Q: What security measures are you taking in the cloud?
VisualSim Cloud hosting site is protected by a variety of security measures. The access is via https.
Q: How do you guarantee the computing power for each cloud job?
All simulations are executed on the host machine. VisualSim takes advantage of the processor and memory capacity of the machine running the Web browser.
Q: What OS is VisualSim Cloud using?
The cloud is running on the following version of Linux: 4.19.150-76.ELK.el6.x86_64 x86_64.
Q: What is the download speed of VisualSim Cloud generated simulation data?
The download speed is 10Gbps.
User applications
Q: How many VisualSim Cloud jobs can a person launch simultaneously?
Each user can open any number of VisualSim models. At any time, only one model can be simulated.
Q: What is the Memory capacity allowed for a person?
Currently, there is no limit on the memory capacity partitioned for each user. Memory is provided on demand for each simulation run.
Q: Can the simulation results be saved to a cloud disk, and is there a limitation on the disk space?
No. All simulation results and models are saved locally on the client.
Q: What Web browsers are supported for VisualSim Cloud?
VisualSim Cloud can be accessed by any browser that supports Java 14. This includes current versions of Microsoft Edge, Firefox, Google Chrome, and Safari.
Q: Where are the models hosted?
The models are hosted on the local desktop. The software image is stored on the Mirabilis Design server. When the user logs in to VisualSim Cloud, the software is downloaded to the browser and VisualSim opens within the browser.
In June 2021, eMemory Technology hosted a webinar titled “PUFiot: A PUFrt-based Secure Coprocessor.” You can read a blog leading up to that webinar here. PUFiot is a novel high-security crypto coprocessor. You can access a recording of that entire webinar from eMemory’s Resources page. While the focus of that webinar was to present the details of PUFiot and the underlying PUF technology and PUF-based Hardware Root of Trust, it did have one slide showing the different use cases for the coprocessor. Refer to the figure below.
The webinar stated that the PUFiot Coprocessor can be used in Arm-based systems and RISC-V-based systems to secure applications but did not go into detail. Therefore, I want to pick up from there and blog about how the PUFiot Coprocessor can be utilized to secure applications. This blog is based on a whitepaper published by PUF Security Inc. The proof point is in the form of a successful demo by Andes Technology.
According to AV-TEST Institute, the number of malware programs climbed from around 65 million in 2011 to 1.1 billion by the end of 2020. The AV-TEST Institute is an independent research institute for IT security monitoring and reporting. The threat level is expected to grow significantly with the rate of adoption of IoT devices.
IoT devices are set to take over the world. They are used in applications ranging from autonomous vehicles to remote weather stations. If these applications can be breached, imagine the damage that can happen. Companies must be prepared to deal with this increased vulnerability by securing their applications.
What is Needed to Tightly Secure Applications?
Ensuring the security of applications requires the following six capabilities, as identified in the whitepaper:
Trusted Execution Environment (TEE): Isolates code, data, and memory that require a higher security level.
Root of Trust: Safeguards crucial security parameters; comprises a unique ID, certificates, secret keys, and secure storage.
Secure Boot: Blocks unauthorized OS and applications from running.
Data at Rest Security: Stores data in an encrypted/obfuscated form with solid access control to prevent leakage.
Data in Transit Security: Utilizes keys to encrypt data before transmission to prevent interception.
Secure OTA Update: Ensures that firmware or software updates in the field come as encrypted ciphertext and that no downgrading is allowed.
The main processor/CPU of the application cannot accomplish all of the above by itself. The Root of Trust is more securely implemented at the hardware level, using an inborn PUF. The key storage unit and the execution environment need to be tamper-proof. The cryptographic algorithms are also more efficiently implemented in hardware.
In essence, a Secure Coprocessor that includes a Hardware Root of Trust and anti-tampering features is needed to support the CPU in securing applications.
Benefits of Using a Secure Coprocessor for Securing Applications
A fully secure hardware-accelerated coprocessor will offload the security-related tasks away from the CPU, allowing the main processor to perform its primary functions safely and efficiently. This approach simplifies the system design and enhances the overall performance of the application. An ideal coprocessor will be a plug-and-play security solution to allow easy implementation of key security features.
PUFiot – A Drop-in Solution for RISC-V-based IoT Systems
The RISC-V architecture is gaining significant adoption as the processor of choice in IoT devices to handle the enormous amount of data and associated transactions. A major reason for this is RISC-V’s open architecture and relatively low cost. But the security guidelines for RISC-V based systems are still under development.
As security guidelines continue to evolve, choosing a compatible solution that calls for the least amount of disruption to the system and the application is a wise choice. PUFiot coprocessor is such a drop-in solution. Refer to the figure below for the design architecture of the PUFiot Coprocessor.
PUFiot Coprocessor Design Architecture
PUFiot’s secure boundary is based on physical separation of hardware, thereby establishing a sound Trusted Execution Environment (TEE). At the heart of PUFiot is a Hardware Root of Trust design. This encompasses eMemory’s patented NeoPUF, which provides each chip with a unique chip fingerprint (UID) and offers Riscure-certified anti-tampering secure OTP for key storage, preventing physical/electrical attacks on crucial security parameters. The Hardware Root of Trust also comes with a True Random Number Generator (TRNG), a source of dynamic entropy to secure cryptographic engines and communications between systems. For complete details of all of the built-in features and functionality, refer to the PUFiot product page.
Securing Applications on a RISC-V based system
With the security guidelines for RISC-V based systems still evolving and the ecosystem still maturing, system developers have to either implement the security solution themselves or adopt a trustworthy solution from a partner. Choosing the in-house development path brings its own challenges along the way. For example: does the solution provide a solid secure boundary (meaning a comprehensive Hardware Root of Trust), support all major crypto algorithms, and hold third-party security certifications? Security is a global issue, and certifications need to satisfy international requirements, rules, and regulations. The PUFiot coprocessor has all of these aspects covered. Refer to the figure below for a block diagram of a RISC-V SoC design incorporating the PUFiot coprocessor to secure applications.
A RISC-V SoC Incorporating PUFiot Coprocessor
A potential threat to an IoT device comes in the form of a malicious chip added to the system in place of the genuine chip holding the firmware. A PUFiot implementation ensures that such breaches are stopped right at boot time: any attempt to swap in different security key information is caught by verifying the chip pairing. Refer to the Andes Technology demo below.
Andes Technology’s Secure Boot Demo
Andes Technology is a leading supplier of high-efficiency, low-power 32/64-bit RISC-V processor cores and a Founding Premier member of RISC-V International. It demonstrated the effectiveness of the PUFiot coprocessor in securing applications.
A secure boot process involves checking and authenticating application firmware before executing the boot-up process. Andes set up two FPGAs (A and B) for the demo. A’s firmware was encrypted and stored in the flash memory corresponding to A. Similarly, B’s firmware was encrypted and stored in the flash memory corresponding to B. Because each PUFiot comes with its own unique inborn private key, what is stored in the respective flash memories is encrypted differently. As a result, the systems will only boot with the correct chip pairing. Any attempt to boot the systems with non-matching flash memories will fail, as the decryption will not be successful without the valid key. Andes demonstrated that the systems booted successfully with the correct chip pairing but could not boot when the flash memories were swapped.
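For readers who want a feel for the pairing check in this demo, here is a toy sketch in Python of my own; PUFiot uses real hardware crypto engines and a PUF-derived inborn key, not the XOR "cipher" below. Firmware provisioned against one chip's fingerprint only authenticates when decrypted with that same chip's derived key.

```python
# Toy illustration of PUF-based chip pairing at boot: firmware only decrypts
# (and therefore boots) with the key derived from the matching chip's PUF.
# XOR is used purely for illustration; a real design uses hardware crypto.

import hashlib

def derive_key(puf_fingerprint: bytes) -> bytes:
    return hashlib.sha256(puf_fingerprint).digest()

def xor_crypt(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def provision(firmware: bytes, puf_fingerprint: bytes):
    """Factory step: store firmware encrypted with this chip's unique key."""
    key = derive_key(puf_fingerprint)
    return xor_crypt(firmware, key), hashlib.sha256(firmware).digest()

def secure_boot(flash_image: bytes, digest: bytes, puf_fingerprint: bytes) -> bool:
    """Boot step: decrypt with the local key and authenticate before executing."""
    firmware = xor_crypt(flash_image, derive_key(puf_fingerprint))
    return hashlib.sha256(firmware).digest() == digest

fw = b"application firmware image"
chip_a, chip_b = b"PUF-A-unique-bits", b"PUF-B-unique-bits"
image_a, digest_a = provision(fw, chip_a)

print(secure_boot(image_a, digest_a, chip_a))  # True: correct pairing boots
print(secure_boot(image_a, digest_a, chip_b))  # False: swapped flash fails to boot
```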
Summary
The PUFiot secure coprocessor can easily be dropped into RISC-V-based systems to secure IoT applications. PUFiot enables Zero Touch Deployment needed in the world of IoT. With built-in hardware-accelerated security functions and access controls, PUFiot also meets the requirements of Zero Trust Security in cloud applications. You can access the whitepaper titled “An Essential Security Coprocessor for RISC-V Designs” from here. PUFiot is available for free evaluation for users who would like to try the IP. Please visit https://www.pufsecurity.com/ip-go.
The field of DRAM is fascinating as it continues to grow and innovate. For the past ten years, I have often read that DRAM is running out of steam because of the difficulty of scaling the capacitor, and yet it has continued to evolve since it was invented by Dr. R. Dennard at IBM. In 1966, he introduced the concept of a transistor memory cell consisting of one transistor and one capacitor. His invention was granted a patent (US3387286) in 1968. The overall configuration of the one-transistor memory cell has not changed over the years. Today — fifty-five years later — we have three manufacturers at 1X nodes, with memory capacities greater than 4 Gb, who still fabricate their memory cells with the same configuration of one transistor and one capacitor. Micron’s D1α, which is the most advanced DRAM and the first sub-15nm cell design, has an impressive memory capacity of 8 Gb.
Every new DRAM technology node produces chips that are smaller and more compact than their predecessors. This scaling allows more dies per wafer, which offsets the increasing manufacturing cost of introducing new technology. Every new node not only shrinks the cell size, but also introduces new materials or new architecture layouts. DRAM technology has moved from trench capacitors to stacked capacitors. The capacitor dielectric has changed from a single high-K layer to multiple dielectric layers, with the capacitor structure evolving from a crown structure to a pillar structure and the layout now modified from 10F2 to 8F2 to 6F2, where F is the minimum feature size.
I was particularly interested in the cell layout and considered it to be a strong parameter for enabling reduction of cell size. Micron was the first company to switch from the 8F2 to the 6F2 cell layout, at the 9x nm node, followed by Samsung at the 80 nm node and finally SK-Hynix, which also adopted the 6F2 cell layout at the 3x nm node. I keep wondering: when will a 4F2 cell be adopted? In a 4F2 cell, the wordline pitch and the bitline pitch are exactly 2F. The 4F2 configuration requires a surrounded vertical gate structure. This concept has still not materialized, even though there have been major advancements in patterning and lithography, as it is more cost-effective to stay with the same type of architecture and simply make modifications rather than adopting a completely new design.
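To put those layout figures in perspective, here is a quick back-of-the-envelope calculation of my own (the 15 nm feature size is just an assumed, illustrative value): cell area scales directly with the layout factor, so moving from 6F2 to 4F2 buys a one-third reduction at the same F.

```python
# Back-of-the-envelope cell-area comparison for the layouts discussed above,
# using an assumed minimum feature size F of 15 nm purely for illustration.

F = 15  # nm, assumed minimum feature size

for layout in (8, 6, 4):
    area = layout * F**2  # cell area in nm^2 for an "xF2" layout
    print(f"{layout}F2 cell: {area} nm^2")

# Moving from 6F2 to 4F2 shrinks the cell by one third at the same F,
# which is why the 4F2 vertical-gate cell remains so attractive on paper.
print(f"6F2 -> 4F2 reduction: {(1 - 4/6):.0%}")
```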
Working for an IP-centric company, I was able to quickly check the status of DRAM 4F2 patents and was surprised to find that the total number of active patents filed in the US by the three major memory makers (Samsung, Micron, and SK-Hynix) is less than a couple of hundred, and that the filing activity after 2015 is sparse. This seems to indicate that the industry is focused on something else.
The main challenges for DRAM are bandwidth and latency. Bandwidth is the quantity of data that can be written to or read from the memory, while latency is the time gap between a request to the memory and its execution. This topic is of current interest in the memory industry. The biggest relief in bandwidth came in 2013 with the introduction of High Bandwidth Memory (HBM), where stacked DRAM dies are connected to each other by through-silicon vias (TSVs). Figure 1 shows a picture of an HBM analyzed at UnitedLex in 2018 in support of IP activities. The HBM, along with its TSV structures, improved the data transfer between the logic process and the memory, but it did not solve the “memory wall” problem entirely.
Figure 1: Cross-section of HBM (UnitedLex)
I often wondered what the next steps would be. I then came across the proceedings of the Electronic Components and Technology Conference 2021 (ECTC), which is one of the premier international events related to packaging, components, and microelectronic systems. This conference had papers from major device makers like GlobalFoundries, IBM, Intel, Micron, Samsung, and TSMC. All these companies discussed hybrid bonding, direct bonding, die-to-die connections, and various TSV-less solutions. The one paper that caught my attention was authored by Micron Memory Japan, along with several other research organizations, and titled “Ultra-thinning of 20 nm Node DRAMs down to 3 µm for Wafer-on-Wafer (WOW) applications”. This paper describes how they thinned the wafers using two different methods, namely grinding and chemical mechanical polishing (CMP), and compared the retention time of the DRAM before and after the thinning. They concluded that the retention properties had not deteriorated due to the thinning process.
This was indeed a “WOW” paper. Ever since HBM was introduced, wafer thickness has plummeted from a few hundred micrometers to around 40 µm, but going to 3 µm is something extraordinary. For comparison, a human hair is around 70 ±20 µm thick. The combination of hybrid bonding and wafer thinning opens new possibilities for DRAM. In hybrid bonding, the metallic bond pads of two wafers are directly connected, as are the adjacent dielectric materials. Hybrid bonding is used in the industry and has been employed by Sony in their image sensors; however, as of today it has not yet been implemented in stacked DRAM products. One of the challenges of hybrid bonding is that it requires a clean interface at an atomic plane level.
The production of thin wafers, along with hybrid bonding, would greatly reduce the TSV impedance; it would also increase data bandwidth, reduce thermal resistance, and increase the density of interconnects. If such a technique were used, the image in Figure 1 would not have the conductive bumps seen between the dies, and the memory dies would be ten times thinner, leading to a considerable overall reduction in the height of the stack. This combination of ultra-thin wafers with hybrid bonding will extend the life of DRAM devices more easily than adopting a completely new configuration like monolithic 3D DRAM, which has lately been discussed in the scientific community. New applications could be envisaged, like a DRAM stack directly bonded onto a logic chip, or such a stack could be used as cache memory, as AMD and some others are implementing with their external SRAM memory. Of course, for DRAM devices, some architecture designs would need to be modified because, as of today, SRAM is better than DRAM in terms of latency. Mounting thin DRAM dies on logic dies could also enable new concepts like compute-in-memory, where the base die in HBM could have some computing power.
DRAM devices are far from the end of their lives and still have many miles to go. DRAM will need to be shrunk further to reduce costs. Probably, in the future, the periphery circuitry will also be scaled, or even taken out of the DRAM die, fabricated as an independent chip, and then mounted on the DRAM using the ultra-thinning process and hybrid bonding technology. The combination of advanced lithography and patterning, the possibility of disaggregating periphery circuitry into individual small chips (or “chiplets”), and the availability of the wafer thinning process and hybrid bonding technology have rejuvenated DRAM devices. Most likely, DRAM is not ceding its place to any other memory soon. I am also hoping that there will be a breakthrough in design and process technology so that monolithic DRAM with a 4F2 cell layout will be available in the market soon.
At UnitedLex, we monitor and analyze both the technology and the patents that surround the IC ecosystem. By doing this, we are well positioned to help clients track the innovations being implemented in industry and also to strategically guide them on how to optimize their patent portfolios.
I knew something special was going on in Munich last week at IAA Mobility – the first international auto show to be held outside China since the start of the pandemic – when a senior executive stepped off the stage before his talk to a modest crowd to say to me (sitting in the second row): “What are you doing here?” I don’t remember whether it was “What are YOU doing here?” or if it was “What are you doing HERE?” Or maybe it was “What are YOU doing HERE?”
It was a good question, but not for the reason you might think. It was logical for this European executive to be mildly surprised at the presence of an American in the middle of a European auto show in the midst of a pandemic characterized by widespread travel restrictions. It was even more surprising, though, to see ME at any auto show since I am no car enthusiast.
I was not alone. I ran into U.S. executives from Argo, Qualcomm, Intel, Lumotive, Luminar, Volkswagen, NNG, and a host of journalists and industry analysts. So, not so shocking for me to be in attendance.
There is a bizarre irony that auto shows are touted as showing off the latest technologies – giving consumers (for whom they are really intended) a taste of what the future of automotive technology holds. The reality is that entering the cockpit of a vehicle at an auto show is like entering a time machine transporting the individual back 3-4 years to what designers planned and implemented many years ago. (An exception to this are so-called “concept” cars.)
Car makers are hopelessly and routinely behind the times. It took years for car makers to accommodate smartphones and it has taken two decades for them to build in wireless connections.
Auto shows are probably the last place to go to catch a glimpse of what lies along the road ahead in automotive design. This is why IAA Mobility was so unique. Unlike its predecessor IAA (in Frankfurt), IAA Mobility included supplier companies – large and small – on the show floor and in the speeches and panel discussions.
IAA, like NAIAS, like the Geneva Motor Show, or Paris’ erstwhile Mondial de L’automobile typically feature moshpit-style press conferences, boilerplate press releases, and sterile in-booth displays. It has historically been nearly impossible at these events to have a substantive conversation regarding technical or regulatory challenges facing the industry or even user experience issues.
Traditional auto shows are all about glitz and glamour and heavy metal. This is why events such as the L.A. Auto Show’s AutomobilityLA – coming up this November – and the Future Networked Car event put on by the International Telecommunications Union in connection with the hopefully-soon-to-be-revived Geneva Show are so important. These “side shows” provide a forum to network and discuss the critical transportation issues of the day – while consumers kick the tires on the hypercars (and daily drivers) on the show floor.
The question remains: What was I doing in Munich at a car show? IAA Mobility was a car show that wasn’t a car show. That’s why I was there as were so many others.
In spite of chip shortages and related factory shutdowns, vehicle launches continue apace. New EVs are arriving nearly every week. The automotive industry is advancing and evolving and consumer demand for vehicles has never been stronger, driving up average retail prices as the reality of a scarce supply settles in.
Car companies are uniquely challenged in returning to the auto show stage as many car makers have yet to re-open their offices. Yet the new models keep coming and there is an industry-wide urge to get the word out to consumers and get the vehicles out and in front of the public.
New features and functions need to be demonstrated and explained. I went on an EV test drive recently and the initial orientation made clear that this new generation of electrified vehicles is about more than just regenerative braking and finding the right charging station.
Drivers are accessing their cars with their smartphones and accessing their infotainment systems with their Gmail credentials. The march of Google/Android-infused cars has only just begun and thus far includes the Polestar 2, Volvo XC40, Renault Megane E-Tech, and General Motors’ Hummer EV. And the battle for the EV pickup market is gaining steam with the arrival of Ford’s F-150 Lightning and Rivian’s R1T. (Musk’s Cybertruck is delayed until late 2022.)
There is also a need to connect and interact with colleagues at auto shows. The first opportunity to do this in the U.S. will arrive at the L.A. Auto Show in November. I’ll be there, thanks to AutomobilityLA. I hope to see you there as well.
Dan is joined by Dr. Walden Rhines for a far-reaching discussion of the history of TSMC and the foundry business model. The past, present and potential future scenarios are all explored.
The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.
Maxim is a scientist, engineer, and entrepreneur. His expertise is in physics, mathematics, semiconductor devices, and EDA. Prior to co-founding Diakopto, Maxim worked at Apple’s SEG (Silicon Engineering Group), where he was responsible for parasitic extraction. Before Apple, he was CTO of Silicon Frontline Technology, where he architected and successfully brought to market several industry-leading tools such as R3D and Rmap. Maxim also worked as a device engineer at PDF Solutions, T-RAM Semiconductor and Foveon. Prior to moving to industry, he was a professor at Georgia State University and the University of Aizu. Maxim graduated with a Ph.D. in Solid State Electronics from the Moscow Institute of Physics and Technology, and won first place in the Physics Olympiads (USSR) for both high schools and universities.
Tell us about what Diakopto does and what prompted you to start the company.
Diakopto was founded to help chip designers, layout engineers and CAD teams solve IC problems caused by interconnects and layout parasitics. I saw first-hand the increasing pain and headaches that parasitics were causing, delaying tapeouts by weeks or months. It was agonizing for me to watch some of the smartest engineers attempting to solve parasitics-related IC problems using a manual, trial-and-error approach that was extremely tedious and time-consuming since none of the existing tools or methodologies were helping them find where the problems were, and what was causing them. They were largely shooting in the dark, trying different things to see if the problems went away. Or they would brute-force overdesign their chips to overcome these problems, which led to higher power, area and cost.
We wanted to equip these engineers with a flashlight and a magnifying glass, to give them insight and visibility into the parasitic effects and help them find the proverbial needle in a haystack. It was for this reason that we started Diakopto and why our first tool, ParagonX, became so successful so quickly.
Where does the name Diakopto come from and why did you pick it for your company?
It’s a twist on the word “Diakoptics” which was introduced by Gabriel Kron as a method for breaking a problem down into sub-problems which can be solved independently before being joined back together to obtain an exact solution to the whole problem. We take a very similar approach in our software and methodology to solve a very large problem by slicing it into smaller sub-problems to solve.
Diakopto is also the name of a beach-front vacation town in Greece (although I have not personally had the chance to visit).
Why have parasitics become such an important factor now? What changed?
A few factors have made parasitics the increasingly dominant challenge for engineers:
The main one is the industry transition to advanced technology nodes. In FinFET technologies, the number, magnitude and impact of parasitics on chip performance, power and reliability have grown exponentially. This not only considerably slows down simulations using existing tools (often taking many days to more than a week to complete), but if the simulations reveal problems, trying to debug the problems and find out which few parasitic elements (out of thousands, millions or billions) are causing the problems is a nightmare. This is where our tools and methodology come into play – to quickly and easily pinpoint the few parasitics that need to be fixed.
We also have many customers using our products and methodology in older technologies, such as 28nm, 40nm, 90nm, even 180nm. These customers are continually pushing the envelope of these older nodes in terms of speed, accuracy, linearity, etc. This moves their designs closer and closer to the edge of the cliff of those process nodes, where parasitics suddenly become critical.
Who are your competitors? What are the differences between Diakopto’s tools and other tools?
We believe that the shift from transistor-dominated designs (pre-FinFET) to parasitics-dominated designs has driven the urgent need for a new class of tools and methodology, developed from the ground up to analyze, visualize, debug and optimize parasitic effects in modern ICs. We have opened up a new market that is mostly untapped at this time.
We complement (and do not compete with) the major signoff tools such as SPICE simulators or IR/EM tools.
There are a couple of tools that claim to help with parasitics analysis, but they are not as versatile, fast or easy to use as ParagonX. Those tools will tell you there are problems, but they do not quickly and intuitively point to the root causes.
One of the big advantages of ParagonX is the ease of use and out-of-the-box experience it offers: there is no need for any setup or configuration, CAD support or foundry qualification. A novice user can start using the tool after a 10-minute training session, which has been unheard of in the EDA industry until now. This is why it’s easy for our tool to proliferate to new design teams, layout engineers, and CAD groups.
Diakopto seems to have come out of nowhere, but already with over 30 companies using your debugging platform. What’s the backstory?
When we first introduced our ParagonX tool in 2018, we were pleasantly overwhelmed by the high level of interest from our early customers. And very quickly, the word spread to other engineers and other companies and we have more and more customers evaluating and signing up. We are pleased to see customers using ParagonX for different foundries, process nodes, design styles, and design applications: SerDes, image sensors, data converters, PLLs, memories, low-power IoT, AR/VR, wireless/RF and many more.
Again, having a tool that is designed for unparalleled ease-of-use and that requires virtually no training or support enables a rapid adoption of ParagonX across a broad section of the industry. Once our customers validated that this is indeed the case, we felt comfortable that we could broaden our reach to the thousands of semiconductor design teams out there that we have not yet tapped into and without compromising the user experience.
Where do you see Diakopto going from here?
We are very excited about our future. Not only have we seen the adoption of ParagonX grow exponentially over the last couple of years, we are also seeing a significant uptick in the frequency and expansion of use at our customer base. Many customers have embraced and made ParagonX part of their standard flow and methodology for IC design and debugging. We have a healthy pipeline of companies currently evaluating ParagonX and we believe they will soon join our global community of customers.
We are equally encouraged by the strong tailwinds that will continue to fuel our growth. There are several key market trends that are driving the increasing need for our solutions:
Hyperscale data centers
5G wireless
AI/ML
IoT and sensors
AR/VR
Autonomous vehicles
These market trends are in turn driving the need for (1) higher speed circuits, (2) higher precision circuits, and (3) broad industry migration to advanced process technologies – all 3 of which lead to the exponentially increasing severity of parasitic effects on chip PPA and time-to-market.
What makes me even more enthusiastic is the new products that we are bringing to market to address adjacent opportunities while staying rooted in our founding principles. We will be announcing some of the new products over the coming 12 months.