Changing RISC-V Verification Requirements, Standardization, Infrastructure
by Daniel Nenni on 11-07-2024 at 10:00 am


A lively panel discussion about RISC-V and open-source functional verification highlighted this year’s Design Automation Conference. Part One looked at selecting a RISC-V IP block from a third-party vendor and investigating its functional verification process.

In Part Two, moderator Ron Wilson, Contributing Editor to the Ojo-Yoshida Report, took the panel of verification and open-source IP experts on a journey through RISC-V’s changing verification requirements, standardization, and infrastructure. Panelists included Jean-Marie Brunet, Vice President and General Manager of Hardware-Assisted Verification at Siemens; Ty Garibay, President of Condor Computing; Darren Jones, Distinguished Engineer and Solution Architect with Andes Technology; and Josh Scheid, Head of Design Verification at Ventana Microsystems.

Wilson: When SoC designers think RISC-V, they think microcontroller or at least a tightly controlled software stack. Presumably, evolution will bring us to the point where those stacks are exposed to app developers and maybe even end users. Does that change the verification requirements? If so, how?

Scheid: On standardized stacks? Market forces will see this happen. We see Linux-capable single-board computers with a set of different CPU implementations. Designers see the reaction to non-standardization. The RISC-V community has survived a couple of opportunities for fragmentation. The problem was with the vector extension and some debate about compressed instructions. We survived both by staying together with everything. I think that will continue.

Back to these single-board computers. The ability for the software development community to run on those allows developers to say, “Yes, I work on this, and I work on these five things, therefore I consider that.” It’s a way for software to know that they’re portable. They’re working on the common standard together.

Garibay: The difference isn’t so much in the verification of a CPU. We’re spending much more time at the full-chip SoC deployment level and that’s typically ISA-agnostic. A designer is running at an industry-standard bus level with industry-standard IO and IP. That layer of the design should standardize in a relatively stable manner.

Wilson: Does it help your verification?

Jones: Sure. The more software to run on RISC-V CPUs, the better. Once designers step outside of the CPU, it’s plug-and-play. For example, companies take products to Plugfests for various interconnect technologies, including PCIe. When PCIe was brand new, they couldn’t have done that because it was too new. If a customer is building a system, they want to have more software. The more software that there is to run, the better it is for everyone, including the CPU vendor who can run and validate their design.

Brunet: An aspect as well is the need to run much more software and the need for speed. Designers are not going to run everything at full RTL. Using a virtual model and virtual platform has been helpful for conventional architectures. With RISC-V, we are starting to see some technology that is helping the virtual platform and it’s necessary. We don’t see standardization and it’s a bit of the Wild West with respect to RISC-V virtual platform simulation. It will benefit the industry to have better standardization, as proven by the traditional Arm platform. It’s a reliable virtual environment that can accelerate and run more software validation. I don’t see that happening yet with RISC-V.

Wilson: I’m hearing that the infrastructure around RISC-V has not caught up to where Arm is, but the ramp is faster than any other previous architecture. It’s getting there and RISC-V has advantages, but it’s brand new.

Scheid: I can feel the speed of the ramp.

Jones: Yes. Having seen the 64-bit Arm ramp for many years, the rate RISC-V is moving has been much more rapid. Arm broke the seal on extra architectures. RISC-V can learn all those lessons, and we’re doing well.

Wilson: Are we going to sacrifice quality or accuracy for that speed of ramp, or is RISC-V doing what other architectures have done faster and just as well?

Garibay: RISC-V is not a monolith. It is individual providers and many players in the market. The competition could cause corner cutting and that’s the great capitalist way forward. Designers must get what they pay for and there’s nothing inherently different about RISC-V in that respect. The goodness is that they have a number of potential vendors. At least one of them has something that is good and useful. It may be up to the designer to figure that out, but all the work over time will be viable.

Jones: I agree with Ty. We’re probably supposed to disagree to make the panel more interesting. But I agree that with RISC-V, the competition is actually a good thing, and that’s enabling a faster ramp. How many CPU design teams are designing Arm processors? A couple that are captive. Otherwise, it’s Arm. How many CPU design teams are designing RISC-V? Many. Andes has more than one team working on it.

Competition produces quality. Products that aren’t good won’t last. If designers have a software infrastructure with Arm and want to make a functional safety feature but don’t like Arm’s offering, they’re stuck. With RISC-V, they have a multitude of vendors that offer whatever’s needed.

Wilson: I’d like to dig into that. Historically, one RISC-V advantage is that it is open source. Designers can add their own instructions or pipeline. Is that true anymore with full support vendors beginning to organize the market? If so, what kind of verification challenge does a designer take on as an SoC developer?

Scheid: There are two sides. Certainly, there’s a verification aspect because a designer is taking on some of that design and making decisions about what’s appropriate as an architectural extension, which is powerful but there is risk and reward.

Another consideration is software ecosystem support. For any given instruction set architecture, the resources the ecosystem spends on software support are far greater than those spent on hardware.

Designers must consider the choice of going with standard customization or not. RISC-V offers a third path: proposing extensions to the RISC-V community for standardization and ratification as actual extensions. That can matter to a design team and depends on what it wants to keep proprietary. As a middle ground, it also allows a design team to customize and gain externalized ecosystem support over time by convincing the RISC-V community that this is a value-added, viable extension.

Garibay: The ability to extend the RISC-V instruction set in particular is one of the great value-added propositions driving some of the motion toward the architecture.

How does that affect verification? Obviously, if the licensor is the one making the changes, it takes on some unique accountability and responsibility. As a company that licenses its IP, it has the responsibility to make it safe to create an environment around the deliverable so that the IP cannot be broken.

It’s an added burden for a CPU designer to create the environment to validate that statement is true to the extent that’s possible. It’s a critical value-add to the offering and worth spending engineering cycles to make it happen.

Licensors must make sure they create safe sandboxes for an end user or an SoC builder that are proprietary to the licensee. If the licensee wants a change and is willing to pay for it, great. If the licensor wants to put something in their hands and let them roll, it’s a collaborative process that must be part of the design from the beginning to make sure that it is done right.

Wilson: Is it possible to create a secure sandbox?

Garibay: Yes. I think Andes has done that.

Wilson: Do you end up having to rerun your verification suite after they’ve stabilized their new RTL?

Jones: A CPU IP vendor must put a sandbox around this unless it’s a company willing to do a custom design. If a designer wants to add a couple of special instructions, the company needs to make sure that the designer’s special instructions won’t break everything else. For instance, Andes has a capability through which designers can add their own computation instructions.

Wilson: You’re only allowed to break what you put into your custom?

Garibay: Yes, absolutely.

Jones: Designers have to verify their special add, sum, subtract, multiply on their own. That’s another question for the CPU vendor: How do you enable me to verify this? How do you enable me to write software to it? Buyer beware. You have to check.

Wilson: When we start talking about security, is there a role for formal tools, either for vendors or users?

Garibay: We use formal tools in our verification. One of the great things about RISC-V is its spec, though it is not yet in a state that is easily usable by formal tools. I’d love to see the RISC-V spec go that way.

Scheid: We use a full spectrum of different formal methods within our implementation process. In terms of customization, the method that makes the most sense for a special add instruction would be the type of commercial tools that allow for C-to-RTL equivalency checking. With the right type of sandbox approach, it could be directly applicable to solving that problem for a designer’s customization.

Jones: I’ll take a little bit of the high road. You always have to ask the question about formal, and formal means different things to different people. Formal has a role; a CPU vendor may be questioned about whether they used formal, for what, and what the formal tools found. Formal is good where traditional debug is difficult, such as ensuring a combination of states can never be hit. Proving a negative is difficult for debug but is a strength of formal.

Scheid: For formal’s place in this space, I come back to the ability to do customization of instructions, an attractive feature of RISC-V. Clearly, it can be done with Arm, but it takes a different size of checkbook. It’s attractive, but it brings verification challenges beyond what comes with Arm.

RISC-V has a higher stack of verification. A custom instruction set goes completely through the roof on what needs to be verified. It’s doable. The verification bar is high, complex and focused on certain verticals. It’s also not for everybody. It’s a good attribute, but there’s a price in verification. Another interesting aspect of competition is the EDA space: Siemens is the only major EDA vendor that does not provide any processor IP, while the other two are vocal about RISC-V, creating an interesting market dynamic.

Also Read:

The RISC-V and Open-Source Functional Verification Challenge

Andes Technology is Expanding RISC-V’s Horizons in High-Performance Computing Applications

TetraMem Integrates Energy-Efficient In-Memory Computing with Andes RISC-V Vector Processor



Semidynamics: A Single-Software-Stack, Configurable and Customizable RISC-V Solution
by Kalar Rajendiran on 11-07-2024 at 6:00 am


Founded with a vision to create transformative, customizable IP solutions, Semidynamics has emerged as a significant player in the AI hardware industry. Initially operating as a design engineering company, Semidynamics spent its early years exploring various pathways before pivoting to develop proprietary intellectual property (IP) around 2019. With financial support from the European Union, they began by creating highly efficient Core and Vector Units, receiving recognition from the tech ecosystem.

Over the past year, Semidynamics has made several announcements highlighting their technology advancements and partnership engagements to support many fast-growing market segments. Their signature value proposition is a versatile “All-in-One IP” solution equipped to meet the demands of modern AI applications.

During the RISC-V Summit 2024, I sat down with Roger Espasa, Founder and CEO of Semidynamics to receive a holistic update. The following is a synthesis of that discussion.

A Unified, Programmable Solution

The heart of Semidynamics’ innovation lies in its commitment to a single software stack approach. In an industry where heterogeneous SoC (System on a Chip) architectures often combine CPUs, GPUs, and NPUs from multiple vendors, each with its own software stack, Semidynamics offers a streamlined alternative. By uniting Core, Vector, and Tensor processing units under a single software stack, they eliminate the inefficiencies commonly associated with multiple software stacks that rely heavily on data orchestration through Direct Memory Access (DMA) operations.

This unified solution is built on the RISC-V open-source architecture, ensuring adaptability and control. Semidynamics’ RISC-V-based architecture enables seamless communication between the Core and specialized units, allowing everything to run smoothly as a cohesive program. This differs from traditional designs where data is sent, processed, and returned in a fragmented sequence, leading to latency issues. Customers have responded positively to this innovation, appreciating the streamlined programming experience it provides.

Key Components of Semidynamics’ IP Solution

Core, Vector, and Tensor Units

Semidynamics’ “All-in-One IP” integrates three essential processing units—the Core, Vector, and Tensor units—working in harmony. While the Core handles general-purpose processing, the Vector unit manages 32-bit precision activations, and the Tensor unit is optimized for smaller data types, crucial for AI tasks like matrix multiplications in transformer models. The system dynamically balances workloads across these units to maximize performance.

Gazillion Misses™ IP

The Gazillion Misses IP is a specialized feature within their CPU Core that ensures high data availability for AI applications. With AI models requiring vast amounts of data, caches alone cannot keep up. Gazillion Misses IP addresses this challenge by continuously requesting data from the main memory, ensuring that both the Vector and Tensor units remain active and data-ready, a capability essential for managing complex models like transformers.
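The arithmetic behind that claim is essentially Little’s Law: sustaining a given bandwidth over a given memory latency requires a proportional amount of data, and therefore a proportional number of requests, in flight. The numbers below are assumptions for illustration only, not Semidynamics specifications.

# Little's Law sketch: outstanding requests needed to sustain bandwidth.
# All numbers below are assumptions for illustration only.
target_bw_gbs = 100.0      # bandwidth the Vector/Tensor units want (GB/s)
mem_latency_ns = 100.0     # round-trip latency to main memory (ns)
line_bytes = 64            # bytes returned per request (cache line)

bytes_in_flight = target_bw_gbs * mem_latency_ns   # GB/s * ns = bytes
outstanding = bytes_in_flight / line_bytes
print(f"~{outstanding:.0f} outstanding misses needed")   # ~156

A conventional core that tolerates only a handful of outstanding misses cannot get close to that, which is the gap this kind of IP is meant to close.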

Out-of-Order Processing with Atrevido

In response to the growing demands of transformer-based models, Semidynamics offers an Out-of-Order processing architecture, dubbed “Atrevido.” This architecture ensures that, even as data demand surges, the processing units do not suffer from data starvation, maintaining smooth and efficient operation.

Configurability and Customization

Recognizing that each customer’s requirements vary, Semidynamics offers both configurable and customizable IP. Configuration involves selecting from standard options like cryptography and hypervisor support, while customization entails crafting special instructions based on customer requirements. This flexibility allows Semidynamics to serve a broad range of applications, from high-performance computing (HPC) to low-power security cameras.

RISC-V: The Backbone of Semidynamics’ Approach to Open Standards

Semidynamics’ choice of RISC-V as the foundation of their technology aligns with a broader industry shift towards open-source architectures. Similar to the freedom Linux brought to software, RISC-V liberates hardware developers from proprietary constraints. However, with the high costs associated with hardware tapeouts, choosing a solution partner becomes critical. Semidynamics not only brings flexibility and control to hardware but also future-proofing by grounding their technology in a general-purpose core that can adapt to new algorithms as they emerge.

Practical Engagement with Customers

Beyond IP delivery, Semidynamics ensures their clients have hands-on access to the technology. Once the RTL (Register Transfer Level) is delivered, customers can begin working immediately, with the option of testing on their multi-FPGA emulation platform. This engagement model accelerates integration and allows clients to adapt the IP to their needs in real-time.

Business Model

Semidynamics employs a straightforward business model that includes a licensing fee, a maintenance fee, and royalty options. This flexible structure ensures that customers pay for what they need, aligning the financial model with the technical customization Semidynamics provides.

A Blended Talent Pool

Based in Barcelona, Semidynamics boasts a team that combines industry veterans from companies like Intel, DEC, and Broadcom with young talent trained through rigorous immersion. This blend of experience and fresh perspectives ensures that the company remains innovative while drawing on deep industry knowledge.

Future-Proofing AI Hardware

AI hardware is undergoing rapid evolution, driven by emerging algorithms and models that challenge traditional computing frameworks. Semidynamics’ approach—anchoring Vector and Tensor units under CPU control—ensures that their IP can adapt to future AI trends. This scalability, combined with their focus on programmability, positions Semidynamics as a forward-thinking solution provider in the AI hardware space.

Summary

Semidynamics’ “All-in-One IP” solution strategically combines CPU, GPU, and NPU processing capabilities into a unified RISC-V architecture to meet the increasing demands of AI, machine learning, and edge computing. By implementing a single software stack, Semidynamics enables seamless control over Core, Vector, and Tensor units, minimizing the need for fallback to the CPU. This approach ensures efficient task distribution across specialized units and directly addresses the performance limitations highlighted by Amdahl’s Law, which focuses on bottlenecks from tasks that cannot be parallelized.
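To make the Amdahl’s Law point concrete, here is a minimal sketch of the speedup formula; the 90% accelerated fraction and 50x accelerator speedup are illustrative assumptions, not Semidynamics figures.

def amdahl_speedup(parallel_fraction: float, accel_speedup: float) -> float:
    """Overall speedup when only a fraction of the work is accelerated."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / accel_speedup)

# Assumed numbers for illustration: if 90% of an AI workload maps to the
# Vector/Tensor units at 50x, the remaining 10% on the Core caps the gain.
print(amdahl_speedup(0.90, 50.0))   # ~8.5x, not 50x

Shrinking that serial remainder, by keeping as much work as possible on the specialized units, is what a unified software stack is aiming at.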

To prevent memory access issues that can slow down AI applications, Semidynamics developed Gazillion Misses™ technology. This technology continuously feeds data to the Vector and Tensor units from main memory, reducing idle time and supporting high-throughput processing, even for large, complex AI models. By combining a unified software stack, advanced memory management, and a customizable architecture, Semidynamics delivers an adaptable solution for various AI and HPC workloads, providing efficient, scalable, and future-ready performance.

To learn more, visit https://semidynamics.com/en

Also Read:

Gazzillion Misses – Making the Memory Wall Irrelevant

CEO Interview: Roger Espasa of Semidynamics

Semidynamics Shakes Up Embedded World 2024 with All-In-One AI IP to Power Nextgen AI Chips



Synopsys-Ansys 2.5D/3D Multi-Die Design Update: Learning from the Early Adopters
by Daniel Nenni on 11-06-2024 at 10:00 am


The demand for high-performance computing (HPC), data centers, and AI-driven applications has fueled the rise of 2.5D and 3D multi-die designs, offering superior performance, power efficiency, and packaging density. However, these benefits come with a myriad of challenges, such as multi-physics effects, which need to be addressed. Ansys and Synopsys, as part of their long-standing partnership, are addressing these multi-die design challenges, bringing together cutting-edge technology and solutions to enhance the multi-die design and verification process from early architecture to manufacturing and reliability.

Multi-Die Design Challenges: Architecture and Early Prototyping

Multi-die designs are far more complex than traditional monolithic chip designs. The integration of multiple heterogeneous and homogeneous dies within a single package leads to significant challenges, particularly in thermal management, mechanical stress, and early architecture decisions. The initial architecture and die placement are major steps in the multi-die design process, requiring specialized tools. Synopsys 3DIC Compiler™ is an industry-leading solution that helps define the architecture of 2.5D/3D multi-die designs in a unified exploration-to-signoff platform. It enables chip designers to address early architectural challenges effectively, facilitating smoother transitions into early prototyping and ultimately to signoff.

Thermal awareness and mechanical reliability are major challenges that should be addressed as early as possible in the design cycle. Thermal challenges in multi-die designs can arise from temperature and thermal property differences between individual dies, die interconnects, and materials used in multi-die designs. Designers must thoroughly analyze each element to avoid costly redesigns later. Mechanical issues like stress and warpage can lead to failures if not addressed early in the design process. Ansys offers a comprehensive platform for tackling these physical challenges at an early stage. With software tools like Ansys RedHawk-SC Electrothermal™ and Ansys Icepak™, designers can efficiently address these issues to facilitate rapid prototyping and architectural exploration. Early-stage thermal and mechanical analysis is critical to prevent problems like hotspots, warping, and system failures due to poor heat dissipation or physical stress.

Importance of Early Verification

Verification at an early stage of multi-die design is pivotal. As multiple dies are stacked together in a small form factor, verifying the overall system becomes increasingly difficult, yet even more essential. Failure to catch potential issues early, such as thermal bottlenecks or power integrity problems, could lead to costly delays and suboptimal performance.

One of the key challenges in multi-die design is managing voltage drop and electromigration (EM/IR), which can lead to power integrity failures. Especially difficult is ensuring reliable power distribution in the vertical direction from interposer to chip, and between stacked chips. Supply currents for up to 200W need to be delivered through tiny microbumps, hybrid bonds, and through-silicon vias (TSVs). This requires very careful power integrity analysis down to the level of each individual bump.
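As a rough back-of-envelope illustration of why per-bump analysis matters (the 0.75 V core supply and the bump count below are assumed values, not figures from Ansys or Synopsys):

# Back-of-envelope: total supply current and per-bump share.
power_w = 200.0          # figure quoted in the article
vdd_v = 0.75             # assumed core supply voltage
n_power_bumps = 40_000   # assumed number of power/ground microbumps

total_current_a = power_w / vdd_v                  # ~267 A
per_bump_ma = total_current_a / n_power_bumps * 1e3
print(f"{total_current_a:.0f} A total, ~{per_bump_ma:.1f} mA per bump")

Even small resistive differences between bumps shift milliamps of current around, which is why signoff needs to resolve each bump rather than an averaged supply.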

Ansys RedHawk-SC Electrothermal offers advanced simulation capabilities for robust power integrity analysis while Synopsys 3DIC Compiler ensures that the design architecture meets the desired design goals by enabling feasibility and prototyping, and implementation and analysis, all in a single environment using a common data model. Under our existing partnership, Ansys and Synopsys provide designers with the necessary solutions to create resilient 2.5D/3D multi-die designs that can withstand the demands of modern high-performance computing environments.

The Role of AI in Multi-Die Designs

Artificial Intelligence (AI) is revolutionizing how designers approach 3DIC designs. AI-driven tools can automate many time-consuming processes, from early prototyping to layout optimization, significantly reducing the design cycle. As the complexity of multi-die design continues to grow, AI technology will become an essential component in handling massive design datasets, enabling smarter decisions and faster results.

The use of AI in design exploration can help optimize key parameters such as power efficiency, thermal distribution, and interconnect layout. This is not just a matter of saving time; AI’s ability to predict and automate design solutions can lead to more innovative and efficient architectures, allowing designers to focus on higher-level innovations.

The Golden Sign-off Tools

Ansys RedHawk-SC and Synopsys PrimeTime stand as first-class tools for signoff verification. Together, these tools provide designers with a robust verification framework, ensuring that the multi-die designs not only meet performance and power targets but also maintain reliability and longevity.

As multi-die design continues to evolve, the long-standing partnership between Ansys and Synopsys is leading the way in helping designers overcome the inherent complexities of multi-die design. To learn more about the latest advances in this area, attend the joint Ansys and Synopsys webinar by registering at Technology Update: The Latest Advances in Multi-Die Design. The webinar explores multi-die designs, key challenges, and how Synopsys and Ansys software solutions can help you overcome these obstacles and streamline the 2.5D/3D multi-die design process, enabling more efficient and effective designs.

Also Read:

Ansys and eShard Sign Agreement to Deliver Comprehensive Hardware Security Solution for Semiconductor Products

Ansys and NVIDIA Collaboration Will Be On Display at DAC 2024

Don’t Settle for Less Than Optimal – Get the Perfect Inductor Every Time



Podcast EP259: A View of the History and Future of Semiconductor Manufacturing From PDF Solution’s John Kibarian
by Daniel Nenni on 11-06-2024 at 8:00 am

Dan is joined by John Kibarian, president, chief executive officer and co-founder of PDF Solutions. He has served as president since 1991 and CEO since 2000.

John explains the evolution of PDF Solutions from its beginnings in 1992 to the present day. John describes moving from TCAD tools for design teams to a yield optimization focus working with fabs and equipment vendors. Today, PDF Solutions has customers in all three areas and a central focus is finding and cultivating the right data.

John comments on the expanding use of 3D processing and packaging as a driver for future innovation and the challenges that must be met. He comments on the growing development of electric cars and its impact on the semiconductor industry.

Looking forward, he sees a more nimble, software-based approach to manufacturing. John also comments on the impact chiplet-based design will have on the semiconductor supply chain and how AI will be used to improve the entire process.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.



Arteris Empowering Advances in Inference Accelerators
by Bernard Murphy on 11-06-2024 at 6:00 am


Systolic arrays, with their ability to highly parallelize matrix operations, are at the heart of many modern AI accelerators. Their regular structure is ideally suited to matrix/matrix multiplication, a repetitive sequence of row-by-column multiply-accumulate operations. But that regular structure is less than ideal for inference, where the real money will be made in AI. Here users expect real-time response and very low per-inference cost. In inference not all steps in acceleration can run efficiently on systolic arrays, even less so after aggressive model compression. Addressing this challenge is driving a lot of inference-centric innovation and Arteris NoCs are a key component in enabling that innovation. I talked with Andy Nightingale (Vice President of Product Management & Marketing, Arteris) and did some of my own research to learn more.

Challenges in inference

Take vision applications as an example. We think of vision AI as based on convolutional networks. Convolution continues to be an important component, and models are now adding transformer networks, notably Vision Transformers (ViT). Reducing these more complex trained networks effectively for edge applications is proving to be a challenge.

Take first the transformer component in the model. Systolic arrays are ideally suited to accelerate the big matrix calculations central to transformers. Even more speedup is possible through pruning, zeroing weights which have low impact on accuracy. Calculation for zero weights can be skipped, so in principle enough pruning can offer even more acceleration. Unfortunately, this only works in scalar ALUs (or in vector ALUs to a lesser extent). But skipping individual matrix element calculations simply isn’t possible in a fixed systolic array.
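A minimal sketch of that contrast, in generic Python rather than any vendor’s hardware: a control-driven scalar loop can test and skip pruned weights, whereas a fixed systolic array clocks every multiply-accumulate regardless.

def sparse_dot(weights, activations):
    """Scalar MAC loop: a zero weight costs nothing but a test-and-skip."""
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 0.0:        # pruned weight -> skip the multiply entirely
            continue
        acc += w * x
    return acc

# In a fixed systolic array, every PE performs its MAC each cycle, so the
# same zeros still occupy hardware time unless the sparsity is restructured
# into whole blocks that the scheduler can route around.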

The second problem is accelerating convolution calculations, which map to nested loops with at least four levels of nesting. The way to accelerate loops is to unroll the loops, but a two-dimensional systolic array can only support two levels of unrolling. That happens to be perfect for matrix multiplication but is two levels short of the need for convolution.
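To see where the four loop levels come from, here is a naive single-channel convolution written out; adding input and output channels only deepens the nest further. This is a generic illustration, not a specific accelerator mapping.

def conv2d(image, kernel):
    """Naive 2D convolution: four nested loops even for one channel."""
    H, W = len(image), len(image[0])
    K = len(kernel)
    out = [[0.0] * (W - K + 1) for _ in range(H - K + 1)]
    for oy in range(H - K + 1):          # output rows      (loop 1)
        for ox in range(W - K + 1):      # output columns   (loop 2)
            for ky in range(K):          # kernel rows      (loop 3)
                for kx in range(K):      # kernel columns   (loop 4)
                    out[oy][ox] += image[oy + ky][ox + kx] * kernel[ky][kx]
    return out

# A 2D systolic array can unroll two of these loops in hardware (as it does
# for the row/column loops of matrix multiply); the remaining loops must be
# sequenced, restructured (e.g. an im2col or 1D mapping), or handled elsewhere.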

There are other areas where systolic arrays are an imperfect fit with AI model needs. Matrix/vector multiplications for example can run on a systolic array but leave most of the array idle since only one row (or column) is needed for the vector. And operations requiring more advanced math, such as softmax, can’t run on array nodes which only support multiply-accumulate.
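A quick illustrative calculation of the matrix/vector under-utilization mentioned above, with the array size assumed:

# Assumed 128x128 systolic array doing a matrix/vector multiply:
# only one row (or column) of PEs has useful work each step.
n = 128
utilization = n / (n * n)
print(f"{utilization:.2%} of PEs busy")   # ~0.78%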

All these issues raise an important question: “Are systolic array accelerators only useful for one part of the AI task, or can they accelerate more, or even all, needs of a model?” Architects and designers are working hard to meet that larger goal. Some methods include restructuring sparsity to enable skipping contiguous blocks of zeroes. For convolutions, one approach restructures convolutions into one dimension which can be mapped onto individual rows in the array. This obviously requires specialized routing support beyond the systolic mesh. Other methods propose folding more general SIMD computation support into the systolic array. Conversely, some papers have proposed embedding a systolic array engine within a CPU.

Whatever path accelerator teams take, they now need much more flexible compute and connectivity options in their arrays.

Arteris NoC Soft Tiling for the New Generation of Accelerators

Based on what Arteris is seeing, a very popular direction is to increase the flexibility of accelerator arrays, to the point that some are now calling these new designs coarse-grained reconfigurable arrays (CGRA). One component of this change is in replacing simple MAC processing elements with more complex processing elements (PEs) or even subsystems, as mentioned above. Another component extends current mesh network architectures to allow for various levels of reconfigurability, so an accelerator could look like a systolic array for an attention calculation, or a 1D array for a 1D convolution or vector calculation.

You could argue that architects and designers could do this themselves today without additional support, but they are seeing real problems in their own ability to manage the implied complexity, leading them to want to build these more complex accelerators bottom-up. First build a basic tile, say a 4×4 array of PEs, then thoroughly validate, debug and profile that basic tile. Within a tile, CPUs and other (non-NoC) IPs must connect to the NoC through appropriate network interface units, for AXI as an example. A tile becomes in effect rather like an IP, requiring all the care and attention needed to fully validate an IP.

Once a base tile is ready, it can be replicated. The Arteris framework for tiling allows for direct NoC-to-NoC connections between tiles, without the need for translation to and from a standard protocol (which would add undesirable latency), allowing you to step and repeat that proven 4×4 tile into an 8×8 structure. Arteris will also take care of updating node IDs for traffic routing.

More room for innovation

I also talked to Andy about what at first seemed crazy – selective dynamic frequency scaling in an accelerator array. Turns out this is not crazy and is backed by a published paper. Note that this is frequency scaling only, not frequency and voltage scaling since voltage scaling would add too much latency. The authors propose switching frequency between layers of a neural net and claim improved frames per second and reduced power.

Equally intriguing, some work has been done on multi-tasking in accelerator arrays for handling multiple different networks simultaneously, splitting layers in each into separate threads which can run concurrently in the array. I would guess that maybe this could also take advantage of partitioned frequency scaling?

All of this is good for Arteris because they have had support in place for frequency (and voltage) scaling within their networks for many years.

Fascinating stuff. Hardware evolution for AI is clearly not done yet. Incidentally, Arteris already supports tiling in FlexNoC for non-coherent NoC generation and plans to support the same capability for Ncore coherent NoCs later in the year.

You can learn more HERE.

Also Read:

Arteris at the 2024 Design Automation Conference

Arteris is Solving SoC Integration Challenges

Arteris Frames Network-On-Chip Topologies in the Car



The Convergence of Functional with Safety, Security and PPA Verification
by Nicky Khodadad on 11-05-2024 at 10:00 am


Formal For All!

“Do I need a PhD to use formal verification?”
“Can formal methods really scale?”
“Is it too difficult to write formal properties that actually prove something?”
“If I can’t get a proof, should I just hope for the best?”
“Do formal methods even offer useful coverage metrics?”

Discouraging words to say the least, but we don’t have to live in the shadows they create anymore!

AI technologies are being adopted faster than ever, and the implementation of AI algorithms is no longer limited to software alone. In fact, breakthroughs in AI performance and the need to reduce the energy consumption footprint are driving crazy innovations in hardware design. So, for the first time, power, performance and area (PPA) along with functional verification have become a mainstream challenge; not to mention that with the adoption of AI hardware in embedded, IoT and edge applications, safety and security are now an even bigger challenge!

In a recent keynote at DVCon India, Dr. Ashish Darbari, founder & CEO of Axiomise, described how 10^30 simulation cycles are still not finding the bugs that cause expensive respins. Respins are estimated at 76% for ASICs, with 84% of FPGA designs going through non-trivial bug escapes – data from the well-known 2022 Wilson Research survey.

Axiomise: Making formal normal through consulting, training and automated IP

At Axiomise, we have been driving a vision of making formal normal and predictable for all kinds of semiconductor design verification. With over 60 person-years of combined experience in applying formal methods for silicon verification, the Axiomise team has verified over 150 designs covering GPU blocks, networking switches, programmable routers, NoCs, coherent fabrics, video IP components and hyper-scalers implemented with the three major CPU architectures: RISC-V, Arm, and x86. We have been able to achieve this through a combination of consulting & services engagements powered by automation derived from custom IP such as the formalISA app for RISC-V and bespoke formal assertion IP for standard bus protocols such as AXI and ACE.

We love bringing this experience to the masses, and we have been delivering training through a unique combination of on-demand and instructor-led courses for the last 7 years at Axiomise. We have now trained hundreds of engineers across some of the Who’s Who of the industry.

Back to the drawing board

While we were training smart engineers in the industry, we learnt what is working for them and what is not, and based on this experience we have come out with our most condensed and practical course yet! Learn the essentials of model checking from an industry expert who has pioneered many of the methods and abstractions used by experts in the industry today.

With 65 patents, and nearly 3 decades of experience, Axiomise founder and CEO Dr. Ashish Darbari brings this course, Essential Introduction to Practical Formal Verification, to anyone who has the desire to set foot on this journey.

With formal, we can find bugs early in the design cycle, accomplishing the shift-left vision, as well as prove bug absence through mathematical proofs.

Essential Introduction to Practical Formal Verification

Our goal is to make formal normal, and that’s how we’ve approached building this course: not just starting from scratch and building up knowledge as we go along, but also ensuring that the complex subject of formal is presented in a way that makes it easy for everyone to get started. The course takes the complex topics of abstraction, problem reduction, design validation, verification and coverage, and presents them with examples and case studies to make it easier to understand the scope of formal for large design verification.

The course is delivered as an on-demand, video based online course, with lots of quizzes to test your understanding, live demos, example code to play with, a certificate at the end, and of course some food for thought to encourage you to go further and not just stop there.

Reaching out to a non-English speaking audience

We made a genuine effort to reach non-English speaking audiences around the world by providing subtitles for every video in 5 languages other than English. Subtitles are available in French, Spanish, Portuguese, Japanese, and Chinese.

Priced with accessibility in mind, our goal is that with your help we can break free of just-good-enough designs and create a new standard for what we expect the future to look like. When it comes to depending more and more on our electronic devices and all the safety-critical aspects they can be a part of, one bug is enough for a catastrophe!

Every journey begins with a first step

One bug is one bug too many! We can never know when or where the next escaped catastrophic bug will appear, but we do live in a world where it could have been prevented, if the standard was to actually prove bug absence. And who knows, somewhere, maybe even close by, the most obscure and dangerous bug could be in the process of being written within a piece of code right now as you read this text.

Let’s get started in deploying formal for semiconductor design and work collectively to make formal normal!

Authors

Nicky Khodadad, Formal Verification Engineer, Axiomise

Ashish Darbari, Founder and CEO, Axiomise

Also Read:

An Enduring Growth Challenge for Formal Verification

2024 Outlook with Laura Long of Axiomise

RISC-V Summit Buzz – Axiomise Accelerates RISC-V Designs with Next Generation formalISA®



New Product for In-System Test
by Daniel Payne on 11-05-2024 at 8:00 am


The annual ITC event is happening this week in San Diego as semiconductor test professionals gather from around the world to discuss their emerging challenges and new approaches, so last week I had the opportunity to get an advance look at something new from Siemens named Tessent In-System Test software. Jeff Mayer, Product Manager, Tessent Logic Test Products brought me up to speed on this new product announcement.

Jeff explained how customers in two markets are converging on the same goals to detect premature device failures, monitor health of aging devices, plus guard against Silent Data Errors (SDE) and Silent Data Corruption (SDC). The two markets and their test approaches are:

Safety & Security – Automotive

  • Logic BIST for in-system test
  • Embedded deterministic test for manufacturing quality
  • Beginning to adopt advanced technology nodes

Quality – Networking, Data Center

  • Beginning to adopt in-system test
  • Embedded deterministic test for manufacturing quality
  • Already using advanced technology nodes

Data centers desire to extend the lifetime of their investments, because it’s just too costly to continuously upgrade the compute infrastructure. An emerging challenge with HPC and data centers is Silent Data Errors, as reported by Facebook, Google and Intel; these errors are related to PVT variations and the workloads being run, which makes them difficult to reproduce. HPC vendors don’t want to take their servers offline for testing, so they opt to do testing during scheduled maintenance times.

In-system testing is required to ensure reliable electronic operation over the lifetime of a product, by considering semiconductor device issues like:

  • Incomplete test coverage
  • Small delay faults
  • Subtle defects
  • Test escapes caused by test marginalities
  • Early-life failures
  • Random failures
  • Silicon aging

Failure rates over time

Tessent In-System Test

What Siemens has created is a new In-System Test Controller placed as an IP block inside your SoC, enabling in-system deterministic test with the Tessent Streaming Scan Network (SSN) software.

This new ISTC block supports all Tessent MissionMode features for IJTAG and BIST instruments. Your test data is delivered through the AXI or APB system bus, which connects to functional interfaces like PCIe or USB. This new approach can target specific cell-internal and aging defects using high-quality, deterministic patterns. You can even change your test content in the field, targeting learned defectivity rates through the silicon lifecycle. The Tessent In-System Test (IST) is a superset of Tessent MissionMode, so here’s a more detailed view of how that all connects.

Tessent In-System Test (IST)

Summary

Safety-minded customers are adopting advanced technology nodes, and quality-minded customers want to leverage existing on-chip IP for in-system test, so both markets benefit from in-system test methodologies. Semiconductor device failures can be detected using in-system and in-field monitoring for errors. Combining Tessent In-System Test with Tessent Streaming Scan Network and Tessent TestKompress is a proven way to detect test escapes in-system and in-field.

There’s a live webinar on this topic of applying manufacturing-quality test patterns directly to a design, leveraging the benefits of deterministic test over traditional in-system test methods. The webinar is on November 19th.

At ITC there are customers of this new technology presenting their initial results in a paper and a poster session, and silicon is already out, with more tape-outs underway. Like other Tessent products, there is no royalty for using this test IP. If your team already uses SSN, then you can quickly evaluate adding IST by talking to your local AE and AM team.




An Illuminating Real Number Modeling Example in Functional Verification
by Bernard Murphy on 11-05-2024 at 6:00 am


I just read an interesting white paper on functional verification of analog blocks using SV-RNM (SystemVerilog real number modeling). The content is worth the effort to read closely as it elaborates a functional verification flow for RNM matching expectations for digital logic verification, from randomization to functional coverage, assertions and checkers, and integration into UVM. The white paper illustrates for an ADC and a DAC.

The importance of mixed-signal verification

AI may dominate the headlines but outside of cloud and AI PC deployments, real applications must interact with real-world analog inputs, from cameras, radar, lidar, to audio, and drive analog outputs for lighting, speakers, and actuators. (In fact even in the cloud, systems must monitor temperature, humidity, and supply voltage levels. But that’s another story.)

Verifying correct interactions between digital and analog circuits has until recently depended on co-simulation between SPICE (or accelerated SPICE) modeling of analog transistor circuit behavior and logic simulator modeling of digital logic behavior in SystemVerilog. Since circuit simulation runs many orders of magnitude slower than logic simulation, practical testing has been limited to running only simple handoff sequences across the analog/digital interface.

Today analog and digital designs are much more tightly coupled, to control and monitor analog parametrics. Modern DDR interfaces provide a good example of this coupling in their training cycles. Verifying correctness in such cases requires much more extensive sequence testing between analog and digital circuits, often interacting with software-driven control on the digital side. Simulation then needs to run closer to digital simulation speeds to have hope of achieving reasonable coverage in testing.

Real number modeling (RNM)

In digital simulators a signal can be a 0 or a 1; their speed depends on this simplification. Analog simulators model signal values (and times) as real numbers, a signal voltage might be 0.314 at time 0.125 for example. RNM allows for a compromise in analog modeling in which analog signals can be quantized (amplitude and time), allowing for discrete modeling. (Full disclosure, RNM modeling also considers currents and impedances, not essential to this particular set of examples.)
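A minimal sketch of the quantization idea in Python terms (the step sizes and the 10 MHz test tone are arbitrary choices): the “analog” value becomes a discrete real sample on a fixed time grid instead of a continuously solved waveform.

import math

def rnm_samples(amplitude_step=0.01, time_step=1e-9, duration=1e-7):
    """Quantize a 10 MHz sine in amplitude and time, RNM-style."""
    samples = []
    t = 0.0
    while t <= duration:
        v = math.sin(2 * math.pi * 10e6 * t)              # 'analog' value
        v_q = round(v / amplitude_step) * amplitude_step  # amplitude grid
        samples.append((t, v_q))                          # discrete event
        t += time_step                                    # time grid
    return samples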

Digital simulators have been adapted to deal with such quantized values and can still run much faster than real number-based SPICE, while also coupling to the regular digital logic simulator. More complex test plans become possible, and with suitable support for RNM compatible with digital verification expectations (randomization, constraints, assertions, coverage properties, and UVM support), RNM-based verification can integrate very easily with mainstream verification flows.

Highlighted functional verification features

The first point the paper covers is constrained randomization for a flash ADC. They consider the resistive divider chain providing reference voltages from AVDD down to AGND, in say 8 steps, with comparators at each step. These resistors won’t be perfectly matched, so some randomized error (within constrained bounds) can be attached to each. Equally, testing should allow for (constrained) variation in AVDD – AGND. Finally, the input to the ADC can be defined either through deterministic sequences or as randomized sequences within the allowed range.
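A rough Python analogue of that constrained-random setup (the paper’s SystemVerilog details will differ; the 1% mismatch bound and the ±5% supply tolerance here are arbitrary assumptions):

import random

def flash_adc_thresholds(avdd, agnd, steps=8, mismatch=0.01):
    """Reference ladder with bounded random mismatch on each unit resistor."""
    r = [1.0 + random.uniform(-mismatch, mismatch) for _ in range(steps)]
    total = sum(r)
    thresholds, acc = [], 0.0
    for seg in r[:-1]:
        acc += seg
        thresholds.append(agnd + (avdd - agnd) * acc / total)
    return thresholds          # steps-1 comparator references

def flash_adc(vin, thresholds):
    """Thermometer-style conversion: count comparators below vin."""
    return sum(1 for th in thresholds if vin > th)

# Randomize the supply within bounds and drive a random input, as in the flow.
avdd = random.uniform(1.71, 1.89)   # assumed +/-5% on a 1.8 V supply
code = flash_adc(random.uniform(0.0, avdd), flash_adc_thresholds(avdd, 0.0))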

Coverage is straightforward. The paper suggests looking for analog signal samples in bins from minimum signal amplitude to maximum signal amplitude. Any uncovered bin indicates the need for more comprehensive testing, described in a SemiWiki article written by this paper’s author.

The section on assertions provides nice examples for how analog/digital assertions are constructed. Nothing mysterious here. For an ADC, the check quantizes the input voltage to an expected digital value and compares that value with the output of the ADC. For the DAC, simply invert this check, comparing the expected output voltage with the DAC output voltage.
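In pseudocode terms, the ADC check reduces to quantizing the input and comparing codes; this Python sketch mirrors that idea, with the one-LSB tolerance being an assumption rather than something taken from the paper:

def expected_adc_code(vin, vref_lo, vref_hi, n_bits):
    """Ideal quantization of the analog input to a digital code."""
    lsb = (vref_hi - vref_lo) / (2 ** n_bits)
    code = int((vin - vref_lo) / lsb)
    return max(0, min(code, 2 ** n_bits - 1))   # clamp to the valid range

def check_adc(vin, dut_code, vref_lo, vref_hi, n_bits, tol_lsb=1):
    """Assertion-style check: DUT output within a tolerance of ideal."""
    return abs(dut_code - expected_adc_code(vin, vref_lo, vref_hi, n_bits)) <= tol_lsb

The DAC check is the mirror image: compute the expected output voltage from the input code and compare it against the observed analog output within a tolerance.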

UVM integration details will make more sense to UVM experts than to me (a UVM illiterate). I know it’s important and appears to be quite detailed in the examples.

The paper wraps up with a nice discussion on measuring linearity in these devices, a topic you wouldn’t find in logic verification, and a discussion on results of the analysis. My takeaway from the second point is that here is an opportunity to consider the randomization constraints set in the beginning. Overestimating these constraints could lead to more errors than observed in silicon, and, of course, underestimating could be disastrous. I expect getting this right probably requires some level of calibration against silicon devices.

Thought-provoking paper for me. You can read it HERE.

Also Read:

How to Update Your FPGA Devices with Questa

The RISC-V and Open-Source Functional Verification Challenge

Prioritize Short Isolation for Faster SoC Verification

 

 



MIPI solutions for driving dual-display foldable devices
by Don Dingee on 11-04-2024 at 10:00 am


Flexible LCD technology has spurred a wave of creativity in device design, including a new class of foldable phones and an update to the venerable flip phone. Besides the primary display inside the fold – sometimes taking the entire inside area – a smaller secondary display is often found outside the fold. Introducing the secondary display adds more challenges than meet the eye. One of Mixel’s MIPI IP customers, Hercules Microelectronics, is creating innovative MIPI solutions for driving dual-display foldable devices.

Offloading foldable display tasks from an application processor

Adding a feature to a mobile device often means adding an IP block to the application processor (AP). Driving dual displays reveals subtle integration issues, including always-on requirements, that make a strong case for a different implementation.

The first is the physical interfaces. MIPI interfaces are standard in mobile devices, offering speed and power efficiency for driving displays. MIPI is the unquestioned choice to drive the primary foldable display. Still, power requirements increase if the AP manages the MIPI interface as primary displays move to higher resolutions and increased frame rates. Conventional solutions to save power are well-known: dimming the display, reducing its resolution or frame rate, or powering it down entirely.

Add a secondary display on the outside of the device case, usually smaller and with a lower resolution. Reducing the size and costs of these displays has pushed vendors to an alternative interface – QSPI, a feature often found on pin-count-sensitive microcontrollers but not on most APs. Physically adding QSPI is simple, but keeping portions of the AP on just to drive a small always-on display gets expensive in battery drain. The secondary display power requirements increase with rotation, chewing up computational resources and power to keep the display upright as the device turns.
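To see why rotation consumes resources, consider what even a simple 90-degree rotation of a framebuffer involves: every pixel is read and rewritten to a new location each time the orientation changes. A generic sketch, not HME’s implementation:

def rotate90_cw(frame):
    """Rotate a row-major framebuffer 90 degrees clockwise."""
    rows, cols = len(frame), len(frame[0])
    # New pixel (r, c) comes from old pixel (rows-1-c, r): one read and
    # one write per pixel, every time the orientation changes.
    return [[frame[rows - 1 - c][r] for c in range(rows)] for r in range(cols)]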

All this points to what might be a counterintuitive conclusion for mobile device designers: an external dual display controller could achieve overall power savings by offloading foldable display tasks and keeping the AP at minimal power when folded. Using an external controller also reduces the time and risk of modifying an AP for dual-display foldable devices and their always-on nature.

MIPI interfaces integrated into low-power FPGA manages dual displays

This external controller would have to be low power yet have enough performance to take a MIPI stream from the AP host and convert it into a MIPI interface for the primary display with resolution up-scaling and a QSPI interface with the required rotational processing and always-on control. Ideally, it would also quickly adapt to various display configurations to keep costs down while supporting various foldable devices.

Hercules Microelectronics (HME) has a family of low-power SRAM-based FPGA devices that fit this profile. Their HME-H3 combines an Arm Cortex-M3 core with six hard MIPI IP blocks: a C-PHY/D-PHY combo Rx with DSI peripheral and CSI-2 Rx controllers, and a D-PHY Tx with DSI host and CSI-2 Tx controller cores. Adding the QSPI interface and control logic in the H3 FPGA fabric is straightforward, completing the dual-display solution.

Mixel provided HME with its MIPI C-PHY/D-PHY Combo IP and MIPI D-PHY IP. Mixel’s MIPI IP supports MIPI C-PHY v2.0 with 3 trios at speeds of up to 2.5G symbols per second per trio and MIPI D-PHY v2.5 with 4 lanes at speeds of up to 2.5Gbps per lane. These lanes provide a total aggregate bandwidth of 17.1Gbps in C-PHY mode and 10.0Gbps in D-PHY mode. HME achieved first-time silicon success with Mixel’s integrated MIPI solutions.
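The aggregate figures quoted above follow from straightforward arithmetic, using the standard C-PHY mapping of 16 bits onto 7 symbols (roughly 2.28 bits per symbol):

# MIPI aggregate bandwidth arithmetic from the figures quoted above.
cphy_bits_per_symbol = 16 / 7                 # C-PHY maps 16 bits onto 7 symbols
cphy_gbps = 3 * 2.5 * cphy_bits_per_symbol    # 3 trios x 2.5 Gsym/s
dphy_gbps = 4 * 2.5                           # 4 lanes  x 2.5 Gbps
print(f"C-PHY: {cphy_gbps:.1f} Gbps, D-PHY: {dphy_gbps:.1f} Gbps")
# -> C-PHY: 17.1 Gbps, D-PHY: 10.0 Gbps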

With Mixel’s MIPI IP products deployed in many applications, this was a low-risk, rapid-return solution for HME and its customers. The HME-H3 was already in use in video bridging and embedded vision applications, and extending it to dual-display foldable device applications opens more possibilities for designers.

HME-H3 is Hercules Microelectronics’ second-generation product to leverage Mixel’s MIPI IP with first-time silicon success. The first product, HME-H1D03 FPGA, integrated Mixel’s MIPI D-PHY IP and won Best FPGA of the Year Award at the 2020 China IC Design Achievement Award Ceremony.

Flexible LCDs free designers from many physical design constraints, leading to a wide range of foldable and flip devices entering the market. The success of any mobile device in the market continues to rely on a blend of display performance and power efficiency for the experience users expect. MIPI solutions for driving dual-display foldable devices are crucial to achieving the experience.

 

See what Hercules Microelectronics and Mixel say about this application:

Mixel MIPI C-PHY/D-PHY Combo IP Integrated into Hercules Microelectronics HME-H3 FPGA

 

To learn more about Mixel’s MIPI IP product solutions in this application, please visit:

Mixel MIPI IP Cores

Also Read:

Ultra-low-power MIPI use case for streaming sensors

2024 Outlook with Justin Endo of Mixel

Automotive-grade MIPI PHY IP drives multi-sensor solutions



Notes from DVCon Europe 2024
by Jakob Engblom on 11-04-2024 at 6:00 am


The 2024 DVCon (Design and Verification) Europe conference took place on October 15 and 16, in its traditional location at the Holiday Inn Munich City Centre. Artificial intelligence and software were prominent topics, along with the traditional DVCon topics like virtual platforms, RTL verification, and validation.

The decorated Lebkuchenherzen are given as gifts to all speakers at the conference.

Keynotes: Infineon and Zyphra

The keynotes provide high-level insights into broader technology industry trends and future directions, complementing the more detailed tutorials and paper presentations. A DVCon Europe keynote is not necessarily so much about how to build IP, chips, and systems, but about what they are being used for and the products they are part of. Not surprisingly, artificial intelligence (AI) has come up in most keynotes from the past few years…

This year started with Thomas Böhm from Infineon talking about “Dependable microcontroller architectures”. Microcontrollers are getting significantly more complex and are adding core clusters and accelerators to handle increases in compute requirements as well as low-latency handling of secured communication.

AI and machine learning (ML) techniques are being used in microcontrollers to implement fundamental control. This requires specialized hardware acceleration at a much smaller scale than what you find in datacenters and even client chips. Traffic on in-vehicle networks is encrypted, requiring security hardware acceleration to maintain low latency.

The second keynote came from Erik Norden from Zyphra, a startup that just went out of stealth in time for Erik to tell us the name of the company at DVCon! His talk was about “Next 10x in AI – System, Silicon, Algorithms, Data” – i.e., AI at datacenter scale. It was particularly interesting to hear Erik’s take on this as he started out on the hardware side and has moved towards the software/algorithm side.

Building an efficient and scalable AI system requires tweaking all aspects. For example, using a better training data set can improve the performance of a same-size model on same-size hardware. Zyphra has also developed new LLM architectures that get more performance out of existing hardware by using it more efficiently.

Software

The panel discussion, “Digital Transformation in Automotive – Expectations versus Reality”, spent a lot of time on software.

Software is becoming increasingly important to “traditional” automotive companies. It used to be specified as part of the functionality of physical components, but with the advent of software-defined vehicles (SDV) it is necessary to transition to a software-first model. Companies like Tesla have totally changed how software is treated and proven the model of delivering incremental value to the same hardware over time through software changes. I really liked the point being made that it has to be made fun to work with software in automotive.

Software is also showing up in classic design and verification papers – both as part of the device-under-test and as part of the stimuli. More than half of all the papers at the conference addressed software in some way.

Open Source

Open-source software and open-source EDA software have grown in importance over time. There were papers and tutorials about open-source technologies like Qemu and cocotb, and open-source software like Linux is very commonly used in domains like automotive. Having access to hardware design flows based on open-source tools lowers the barrier to entry and brings more enthusiasts into the hardware design field.

Another open-source technology that is seeing major adoption is obviously RISC-V. Thomas Böhm’s keynote mentioned it as the potential future for automotive designs, and it was present in papers and tutorials.

Locally Global

DVCon Europe encompassed two keynotes, one panel, a day of tutorials, and 55 peer-reviewed papers split across an engineering track and a research track. The conference takes place in the heart of Europe, but it draws participants from all over the world.

For a deeper dive into what was discussed at the conference, check out https://jakob.engbloms.se/archives/4362 and the DVCon Europe group on LinkedIn.

Also Read:

Accellera and PSS 3.0 at #61DAC

An Accellera Functional Safety Update

DVCon Europe is Coming Soon. Sign Up Now