My Conversation with Infinisim – Why Good Enough Isn’t Enough
by Mike Gianfagna on 11-12-2024 at 6:00 am

My recent post on a high-profile chip performance issue got me thinking. The root cause of the problem discussed there was a clock tree circuit that was particularly vulnerable to reliability aging under elevated voltage and temperature. Chip aging effects have always gotten my attention. I’ve lived through a few of them in my career and they are, in a word, exciting. Perhaps frightening.

This kind of failure represents a ticking time bomb in the design. There are many such potential problems embedded in lots of chip designs. Most don’t ignite, but when one does, things can get heated quickly. I made a comment at the end of the last post about Infinisim and how the company’s technology might have helped avoid the issue in question. I decided to dig into that topic a bit further to better understand the dynamics at play with clock performance. So, I reached out to the company’s co-founder and CTO. What I got was a master class in good design practices and good company strategy. I want to share my conversation with Infinisim and why good enough isn’t enough.

Who Is Infinisim?

You can learn more about Infinisim on SemiWiki here. The company provides a range of solutions that focus on accurate, robust full-chip clock analysis.

Several tools are available to achieve this result. One is SoC Clock Analysis, which helps to accurately verify timing, detect failures, and optimize performance of the clock in advanced designs. Another is Clock Jitter Analysis, which helps to accurately compute power supply induced jitter of clock domains – a hard-to-trace effect that can cause many downstream problems. And finally, Clock Aging Analysis helps to accurately determine the operational lifetime of power-sensitive clocks. It is this last tool that I believe could have helped with the chip issue discussed in my prior blog.

The tools offered by Infinisim use highly accurate and very efficient analysis techniques. The approach goes much deeper than traditional static timing analysis.

My Conversation With the CTO

Dr. Zakir H. Syed

I was able to spend some time speaking with Dr. Zakir H. Syed, co-founder and chief technology officer at Infinisim. Zakir has almost 30 years of experience in EDA. He was at Simplex Solutions (acquired by Cadence) from its inception in 1995 through the end of 2000. He has published numerous papers on verification and simulation and has presented at many industry conferences. Zakir holds an MS in Mechanical Engineering and a PhD in Electrical Engineering, both from Duke University.

Here are the questions I posed to Zakir and his responses.

It seems like Infinisim’s capabilities can provide the margin of victory for many designs. How are you received when you brief potential customers?

 Their response really depends on past experiences. If they’ve previously encountered issues—like anomalous clock performance, timing challenges, or yield problems—they tend to quickly see the value Infinisim brings and are eager to learn more. In my experience, these folks are few and far between, however.

This is a bit surprising. Why do you think this is the case?

It’s an interesting point. The issue isn’t that better performance isn’t desirable; it’s that there’s a general trend to accept less-than-optimal performance as the norm. Over time, parameters like timing, aging, jitter, yield, and voltage have been treated as “known quantities” and design teams rely on established margins to work within these expectations.

I’m beginning to see the challenge. If design teams are meeting the generally accepted parameters, why rock the boat?

Exactly. If the design conforms to the required margins, all is well. Designers are rewarded for meeting schedules. CAD teams are recognized for delivering an effective flow. And this continues until there is some kind of catastrophic failure. When that “ticking time bomb” goes off, suddenly every assumption is questioned, and a deep analysis begins.

I get your point. I wrote a blog recently that looked at a high-profile issue that was traced back to clock aging.

Yes, that issue could likely have been discovered with our tools, before the chip was shipped to customers. In that case, aging effects came into play under certain operating conditions. Since N-channel and P-channel devices age differently, the result was a clock duty cycle that began to drift from the expected 50/50 ratio. Once the asymmetry became large enough, circuit performance began to fail.
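
To make the mechanism concrete, here is a toy numerical sketch of my own (not Infinisim’s methodology, and with made-up degradation rates) showing how asymmetric P-channel/N-channel slowdown pulls the duty cycle away from 50/50 as a pulse propagates through a buffer stage:

# Toy illustration only: delays and aging rates below are invented.
period_ps = 500.0           # a 2 GHz clock
rise_delay_ps = 25.0        # buffer rise delay (set by the P-channel pull-up)
fall_delay_ps = 25.0        # buffer fall delay (set by the N-channel pull-down)
pmos_aging_per_year = 0.06  # hypothetical 6% slowdown per year
nmos_aging_per_year = 0.02  # hypothetical 2% slowdown per year

for year in range(6):
    rise = rise_delay_ps * (1 + pmos_aging_per_year) ** year
    fall = fall_delay_ps * (1 + nmos_aging_per_year) ** year
    # Per stage, the high phase of the pulse shrinks by (rise - fall);
    # a real clock tree compounds this across many stages.
    high_ps = period_ps / 2 - (rise - fall)
    print(f"year {year}: duty cycle ~{100 * high_ps / period_ps:.1f}%")

Once that asymmetry eats into the timing margin of the downstream logic, the failures Zakir describes begin to appear.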

So, what you don’t know can hurt you.

You’re right. But there’s also a bigger opportunity here. It’s not just about preventing catastrophic failures. Advanced nodes are costly, and we pay for that performance. By thoroughly examining circuit behavior across all process corners, we can leverage that investment to its fullest potential instead of leaving performance on the table with excessive margins. The same goes for yield, which directly impacts profitability. In today’s competitive chip design landscape, accepting less performance often means losing out on market share.

OK, the light bulb is going off. Now I see the bigger picture. Using tools like Infinisim’s doesn’t just prevent failures; it’s a strategic move toward maximizing profitability and competitiveness.

I think you’ve got it. When more people within a company—from engineers to executives—embrace this mindset, it leads to a stronger, more competitive organization. By challenging the status quo, companies can achieve more and realize their full potential.

To Learn More

You can learn more about the integrated flow offered by Infinisim here.  My conversation with Infinisim made it clear why good enough isn’t enough.


Build a 100% Python-based Design environment for Large SoC Designs
by Daniel Nenni on 11-11-2024 at 10:00 am

In the fast-evolving world of semiconductor design, chip designers are constantly on the lookout for EDA tools that can enhance their productivity, streamline workflows, and push the boundaries of innovation. Although Tcl is currently the most widely used language, it seems to be reaching its limits in the face of the growing complexity of chip designs. In these conditions, Python appears to be the wisest choice among the programming languages and APIs available.

Today, Python is used more and more frequently, especially by young design engineers. Python offers a wide range of advantages. In terms of usability, its ease of debugging and execution speed open more possibilities compared to Tcl. What’s more, Python benefits from a very active community and a wide choice of open-source libraries. It therefore has a rightful place in EDA tool flows, and continuing to juggle a dual Python/Tcl setup is counterproductive for design workflows.

1. One Unified Design Environment

Using Python for semiconductor design means working in a single, unified design environment. Indeed, Python gives engineers access to a wide range of libraries, design tools, and frameworks within a single ecosystem. This integration simplifies the design process significantly. Engineers can achieve their goals without having to switch from one language or platform to another. With all tools available in one place, the workflow becomes more cohesive and efficient, allowing for seamless transitions between design, reporting, simulation, and analysis.

2.  Ease of Learning and Use

Python’s simplicity and readability make it an excellent choice. Its straightforward syntax is easy to learn and allows designers to focus on key concepts. Python also offers far more scripting possibilities than Tcl. This ease of use accelerates the learning curve, enabling engineers to quickly prototype and iterate on their designs.

3. Rich Ecosystem of Libraries and Tools

Python boasts a robust ecosystem filled with libraries specifically tailored for scientific computing, data analysis, and machine learning. Libraries such as NumPy, SciPy, and Pandas provide powerful tools for numerical computations, while TensorFlow and PyTorch can be leveraged for machine learning applications in chip design. This wide range of resources enables engineers to implement sophisticated algorithms and analyses without the hassle of integrating disparate tools.

Semiconductor chip design also involves analyzing large datasets and visualizing complex processes. Python provides robust libraries like Matplotlib and Seaborn for data visualization, helping engineers to better understand their designs and make data-driven decisions. This capability is crucial for optimizing chip performance and functionality.
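
As a simple illustration of what that ecosystem enables (the report file and column names here are hypothetical), a few lines of Pandas and Matplotlib are enough to summarize a timing report and visualize its slack distribution:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical CSV exported from a timing tool: one row per timing path.
df = pd.read_csv("timing_paths.csv")   # columns: clock_group, slack_ns, depth

# Worst and average slack, plus path count, per clock group.
summary = df.groupby("clock_group")["slack_ns"].agg(["min", "mean", "count"])
print(summary.sort_values("min"))

# Quick visual check of how close the design is to failing timing.
df["slack_ns"].hist(bins=50)
plt.axvline(0.0, color="red", linestyle="--", label="slack = 0")
plt.xlabel("slack (ns)")
plt.ylabel("path count")
plt.legend()
plt.savefig("slack_histogram.png")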

4. Adoption in Academia and Industry

Python is now commonly taught in higher-education institutions. As a result, young engineers who are proficient in Python are better prepared for the job market. Industry has likewise integrated Python into its design flows. Many companies are now specifically looking for candidates with Python skills, making it a valuable asset for career advancement.

5. Defacto’s SoC Compiler is 100% Python compliant

Defacto’s SoC Compiler provides full support for all of its capabilities through an object-oriented Python API. Defacto made the choice two decades ago to build its software with Python as a built-in API, and today this allows many users to benefit from the power of the language. Defacto estimates that more than 60% of its users have switched to the Python API. This switch has enabled top semiconductor companies to better integrate new Defacto SoC Compiler-based applications into their SoC design environments, to develop additional applications, to fit into corporate-wide decisions to use Python in EDA, and more.
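
Defacto does not publish its API in this article, so the short script below is illustrative pseudocode only; every class and method name is invented, simply to show what an object-oriented, Python-driven SoC build step of this kind typically looks like:

# Illustrative pseudocode: names are invented, not the actual Defacto API.
from soc_compiler import Design      # hypothetical import

design = Design("top_soc")
design.read_rtl(["cpu_subsys.v", "ddr_ctrl.v", "noc.v"])

# Instantiate and connect IP at the top level from a script
# instead of hand-editing RTL.
cpu = design.instantiate("cpu_subsys", name="u_cpu0")
ddr = design.instantiate("ddr_ctrl", name="u_ddr0")
design.connect(cpu.port("axi_m0"), ddr.port("axi_s0"))

design.check()                       # structural and connectivity checks
design.write_rtl("top_soc.v")        # regenerate the assembled top level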

Defacto engineers also provide close support to customers, helping them migrate from the Tcl API to the Python API and build custom Python-based applications.

Figure 1 – Defacto’s SoC Compiler flow

A typical case study using the Defacto Python API is the generation of RTL code with open-source libraries, helping to generate and build a complete SoC at the RTL level.

Figure 2 below illustrates an example of a design environment for RTL code generation using open-source libraries (like Chisel) together with Defacto’s SoC Compiler, which provides the capabilities to edit and build top-level subsystems and SoCs. This one-stop, 100% Python-based design and debug environment increases design efficiency for SoC architects and RTL designers.

Figure 2: Python-based Integrated Design Environment

Python is the future of the EDA industry. With two decades of maturity in providing a Python API, Defacto’s SoC Compiler is a strong foundation for building next-generation SoC build flows.

For more information about the Defacto products, reach out to their website: https://defactotech.com/


Keysight EDA 2025 launches AI-enhanced design workflows
by Don Dingee on 11-11-2024 at 6:00 am

The upcoming Keysight EDA 2025 launch has three familiar tracks: RF circuit design, high-speed digital circuit design, and device modeling and characterization. However, this update features a common thread between the tracks – AI-enhanced design workflows. AI speeds modeling and simulation, opening co-optimization for complex designs. It also gives design teams more freedom to incorporate Keysight EDA tools into their workflows with Python customization. Here is a preview of what designers can expect, including some short videos on each of the three tracks, with more details to come in a multi-region, multi-track live and archived webinar event.

RF circuit designers move into a 3DHI co-design cockpit

Keysight Advanced Design System (ADS) is unmatched as the state-of-the-art platform for RF design and multi-domain co-simulation. Python scripting features already in ADS provide the capability for automating tasks and customizing the user interface. However, RF design complexity continues to grow, typified by the emergence of 3D heterogeneous integration (3DHI) techniques with dense multi-technology packaging.

Rising complexity creates a pressing need to insert RF designs into appropriate system contexts for simulation. However, workflows cannot tolerate the potential of spiraling simulation run times for comprehensive, realistic evaluations with more data points and swept parameters, which could force users to limit how frequently crucial RF simulations execute. Leaving unpredictable real-world effects undetected until physical prototypes is a poor choice.

Fortunately, it’s a choice ADS users won’t face. The previous phase of Keysight EDA research concentrated on broadening the analysis types in ADS, unifying measurement science with Keysight’s test and measurement instrumentation, and speeding simulations with innovative algorithms such as compact test signals, fast envelope techniques, and distortion EVM.

This new phase in Keysight EDA 2025 re-engineers the core simulation platform in ADS to provide external programmatic simulation control through an application programming interface (API), including Jupyter Notebook support. The API also enables new levels of Python customization, including user interfaces, importing layout or modeling data for simulation, creating visualizations for simulation results, and training artificial neural network (ANN) models. The newly re-engineered core delivers as much as 6x improvement in simulation times.

The result transforms ADS into a co-design cockpit where teams can efficiently manage multi-domain RF design and simulation in one open environment. This cockpit minimizes design manipulation while enabling comprehensive, accurate simulation as often as desired earlier in workflows. It also prepares ADS for future growth in RF design complexity and AI-driven command invocation. Floating license packs can set up multiple users for parallel basic analyses, a power user for high-performance specialized analysis, or any combination that makes sense for a design workflow.
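
Keysight has not detailed the new API in this preview, so the fragment below is purely illustrative pseudocode of what external programmatic simulation control from a Jupyter notebook might look like; every module, class, and method name in it is an assumption, not the actual ADS interface:

# Illustrative pseudocode: names are invented, not the ADS 2025 API.
from ads_api import Workspace        # hypothetical import

ws = Workspace.open("pa_design.wsp")
sim = ws.simulation("harmonic_balance")

results = []
for bias_v in (0.8, 0.9, 1.0, 1.1):              # swept parameter
    sim.set_parameter("Vbias", bias_v)
    run = sim.run()                               # launched from the notebook
    results.append((bias_v, run.measure("output_power_dbm")))

for bias_v, pout in results:
    print(f"Vbias={bias_v:.2f} V -> Pout={pout:.2f} dBm")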

High-speed digital circuit design gets enhanced crosstalk analysis

One of the most prominent 3DHI techniques is chiplets, with many teams interested in or pursuing designs based on the Universal Chiplet Interconnect Express (UCIe) specification. UCIe seeks to create an ecosystem where chiplets from different technology nodes can interoperate within a single package, and ongoing enhancements to the specification target optimized die-to-die signaling, improving performance.

Signal integrity is the biggest issue in achieving reliable UCIe designs. As interconnect speeds increase, signal integrity concerns are growing. Teams must carefully analyze UCIe designs, examining all metrics simultaneously to avoid the pitfall of optimizing one metric at the expense of degrading others. To make a comprehensive die-to-die interconnect layout and analysis possible, Keysight created Chiplet PHY Designer, an extension to ADS that provides UCIe simulation and enhanced analysis of the voltage transfer function (VTF) and forward clocking.

In the EDA 2025 update, one Keysight focus for chiplets is enhanced support for quarter data rate (QDR) clocking. Approved as an addition to the UCIe 2.0 specification in August 2024, QDR provides a path to lower UCIe clock rates, reducing design risk while still offering high-performance data transfer rates. Simulating QDR in ADS essentially repeats PHY analysis four times, once for each clock phase. AI enters the equation to help Chiplet PHY Designer visualize VTF crosstalk and VTF loss masks for different data rates and automatically model and optimize link design parameters for best results.

Device model re-centering improves speed by an order of magnitude

Creating process design kits (PDKs) for advanced semiconductors such as III-V technology can be tedious. Engineers try to fit a basic set of measurements into an existing model for a previous device in what is known as model re-centering. However, the fit is often less than ideal and may only work for a tightly bounded set of operating conditions, such as bias voltages or frequency ranges. If the application context changes, a new set of exhaustive measurements could take months. Without more measurement data, partially re-centered models can lack fidelity, leading to inaccurate simulation results.

Model re-centering fidelity is imperative with devices applied in more complex designs for wireless standards featuring higher-order modulation and broader bandwidths. Too much of a difference between simulations and measurements under parameter sweeps manifests as a significant risk of prototype failure.

The EDA 2025 update includes a refresh of Keysight IC-CAP with its ANN Toolkit leveraging AI to quickly re-center models spanning more parameters without exhaustive measurements, reducing the model re-centering process to hours instead of weeks and lowering the expertise required to obtain accurate modeling results.
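
The general idea behind ANN-based re-centering can be sketched with generic tools (this is not the IC-CAP ANN Toolkit; the device data below is invented and a scikit-learn network stands in for the real modeling flow): fit a small network to a sparse set of measurements, then query it at bias points that were never measured.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Invented "measurements": drain current (mA) on a sparse (Vgs, Vds) grid.
rng = np.random.default_rng(0)
vgs, vds = np.meshgrid(np.linspace(0.2, 1.0, 6), np.linspace(0.1, 1.0, 6))
X = np.column_stack([vgs.ravel(), vds.ravel()])
ids_ma = np.maximum(X[:, 0] - 0.3, 0) ** 2 * np.tanh(3 * X[:, 1])
ids_ma += rng.normal(scale=1e-3, size=ids_ma.shape)   # measurement noise

# Fit a small neural network as a behavioral device model.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
model.fit(X, ids_ma)

# Query the surrogate at bias points that were never measured.
print(model.predict(np.array([[0.65, 0.55], [0.85, 0.75]])))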

Learn more at the Keysight EDA 2025 launch event

These are just some of the capabilities in Keysight EDA 2025. It’s also important to note that many EDA 2025 RF circuit design discussions apply to Cadence, Siemens, and Synopsys design platform users considering Keysight RFPro Circuit (with its similar next-generation core simulation technology) or ADS, depending on their workflow.

To help current and future users understand the latest enhancements in Keysight EDA 2025, including AI-enhanced design workflows for RFICs, chiplets, and PDKs, Keysight is hosting two live online launch events on December 3rd in European and American time zones. Designers can register for a track at either live event and view other tracks on demand.

See the Keysight EDA 2025 event page for more information and registration:

Keysight EDA 2025 Product Launch Event


Podcast EP260: How Ceva Enables a Broad Range of Smart Edge Applications with Chad Lucien
by Daniel Nenni on 11-08-2024 at 10:00 am

Dan is joined by Chad Lucien, vice president and general manager of Ceva’s Sensing and Audio Business Unit. Previously he was president of Hillcrest Labs, a sensor fusion software and systems company, which was acquired by Ceva in July 2019. He brings nearly 25 years of experience having held a wide range of roles with software, hardware, and investment banking.

Dan explores the special requirements for smart edge applications with Chad. Both small, low power embedded AI as well as more demanding edge applications are discussed. Chad describes the three pillars of Ceva’s smart edge support – Connect, Sense and Infer.

Dan explores the capabilities of the new Ceva-NeuPro™- Nano NPU with Chad. This is the smallest addition to the product line that focuses on hearable, wearable and smart home applications, among others. Chad explains the benefits of Ceva’s NPU line of IP for compact, efficient implementation of AI at the edge.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview: Bijan Kiani of Mach42
by Daniel Nenni on 11-08-2024 at 6:00 am

Bijan’s role includes overseeing all product groups, field engineering, customer support, strategy, sales, marketing, and business development. His previous appointments include VP of Marketing at Synopsys Design Group and CEO of InCA. He holds a PhD in Electrical Engineering.

Tell us about your company 
Mach42 is a verification acceleration company. The company is a spinout from Oxford University, delivering the next step-change in simulation acceleration. We are an early-stage software company developing state-of-the-art machine learning and artificial intelligence technology to simplify, automate, and accelerate simulation tasks. We leverage proprietary neural network technology to accelerate expensive calculations, and we do it with minimal data and high accuracy, providing orders-of-magnitude speedups in verification. Our platform is already delivering a substantial competitive advantage to our early customers.

The company’s innovative technology has been covered in scientific journals such as Nature Physics and Science Magazine. In May 2023, the company announced that First Light Fusion, the University of Oxford, the University of York, Imperial College London, and Mach42 would collaborate under a $16 million grant award from UK Research and Innovation’s Prosperity Partnership program (more details here). Our solution was selected to support this consortium.

We have offices in the UK and California. Our core R&D team is based in the UK, and our US office provides business and technical support. We closed our pre-Series A funding in September 2023, and the link below provides more details about our vision, investors, and co-founders: Machine Discovery Secures $6 Million to Deliver AI Tools For Semiconductor Design (prnewswire.com).

Mach42 was previously known as Machine Discovery.

What problems are you solving? 
Our flagship product, the Discovery Platform, allows you to exhaustively explore the design space in minutes, enabling you to identify potential out-of-spec conditions. As a companion to SPICE engines, the Discovery Platform leverages our breakthrough AI technology for faster and exhaustive design verification.

What application areas are your strongest? 
The Discovery Platform has demonstrated its shift-left ROI benefits in multiple complex applications, including PMIC, SerDes, and RF designs. It delivers accurate and secure design representations to explore the entire space in minutes.

Applications:
– Quickly and efficiently explore the design space in minutes
– Generate an AI-powered model of your design
– Analyze chip, package, and board-level LRC effects
– Generate a secure model of your design to share with third parties

What does the competitive landscape look like and how do you differentiate?
Mach42 is the first to market with its AI-powered platform to accelerate complex verification tasks.

In the coming years, trillions of dollars of revenue will be generated from new product developments in the engineering market. With this growth, establishing early design insights using multi-physics simulation solutions will be vital to getting new products to market. We are uniquely positioned as a pure-play AI company serving the semiconductor industry.

What new features/technology are you working on? 
Our vision is to cut the semiconductor design development cycle in half, leveraging our proprietary artificial intelligence technology. Thanks to the team’s experience and expertise, we are in an ideal position to drive the development of new technology to accelerate and improve all levels of product development, from design to verification, test development, and IP security.

By combining advanced simulation technology, cloud computing, and neural network technology, we make it possible to predict analog circuit design performance at the click of a button.

How do customers normally engage with your company?
Via our website Mach42 or email info@mach42.ai

Also Read:

An Important Advance in Analog Verification


Changing RISC-V Verification Requirements, Standardization, Infrastructure
by Daniel Nenni on 11-07-2024 at 10:00 am

A lively panel discussion about RISC-V and open-source functional verification highlighted this year’s Design Automation Conference. Part One looked at selecting a RISC-V IP block from a third-party vendor and investigating its functional verification process.

In Part Two, moderator Ron Wilson, Contributing Editor to the Ojo-Yoshida Report, took the panel of verification and open-source IP experts on a journey through RISC-V’s changing verification requirements, standardization and infrastructure. Panelists included Jean-Marie Brunet, Vice President and General Manager of Hardware-Assisted Verification at Siemens; Ty Garibay, President of Condor Computing; Darren Jones, Distinguished Engineer and Solution Architect with Andes Technology; and Josh Scheid, Head of Design Verification at Ventana Microsystems.

Wilson: When SoC designers think RISC-V, they think microcontroller or at least a tightly controlled software stack. Presumably, evolution will bring us to the point where those stacks are exposed to app developers and maybe even end users. Does that change the verification requirements? If so, how?

Scheid: On standardized stacks? Market forces will see this happen. We see Linux-capable single-board computers with a set of different CPU implementations. Designers see the reaction to non-standardization. The RISC-V community has survived a couple of opportunities for fragmentation. The problem was with the vector extension and some debate about compressed instructions. We survived both by staying together with everything. I think that will continue.

Back to these single-board computers. The ability for the software development community to run on those allows developers to say, “Yes, I work on this, and I work on these five things, therefore I consider that.” It’s a way for software to know that they’re portable. They’re working on the common standard together.

Garibay: The difference isn’t in so much the verification of a CPU. We’re spending much more time at the full-chip SoC deployment level and that’s typically ISA-agnostic. A designer is running at an industry-standard bus level with industry-standard IO and IP. That layer of the design should standardize in a relatively stable manner.

Wilson: Does it help your verification?

Jones: Sure. The more software to run on RISC-V CPUs, the better. Once designers step outside of the CPU, it’s plug-and-play. For example, companies take products to Plugfests for various interconnect technologies, including PCIe. When PCIe was brand new, they couldn’t have done that because it was too new. If a customer is building a system, they want to have more software. The more software that there is to run, the better it is for everyone, including the CPU vendor who can run and validate their design.

Brunet: An aspect as well is the need to run much more software and the need for speed. Designers are not going to run everything at full RTL. Using a virtual model and virtual platform has been helpful for conventional architectures. With RISC-V, we are starting to see some technology that is helping the virtual platform and it’s necessary. We don’t see standardization and it’s a bit of the Wild West with respect to RISC-V virtual platform simulation. It will benefit the industry to have better standardization, as proven by the traditional Arm platform. It’s a reliable virtual environment that can accelerate and run more software validation. I don’t see that happening yet with RISC-V.

Wilson: I’m hearing that the infrastructure around RISC-V has not caught up to where Arm is, but the ramp is faster than any other previous architecture. It’s getting there and RISC-V has advantages, but it’s brand new.

Scheid: I can feel the speed of the ramp.

Jones: Yes. Having seen the 64-bit Arm ramp for many years, the rate RISC-V is moving has been much more rapid. Arm broke the seal on extra architectures. RISC-V can learn all those lessons, and we’re doing well.

Wilson: Are we going to sacrifice quality or accuracy for that speed of ramp, or is RISC-V doing what other architectures have done faster and just as well?

Garibay: RISC-V is not a monolith. It is individual providers and many players in the market. The competition could cause corner cutting and that’s the great capitalist way forward. Designers must get what they pay for and there’s nothing inherently different about RISC-V in that respect. The goodness is that they have a number of potential vendors. At least one of them has something that is good and useful. It may be up to the designer to figure that out, but all the work over time will be viable.

Jones: I agree with Ty. We’re probably supposed to disagree to make the panel more interesting. But I agree that with RISC-V, the competition is actually a good thing, and that’s enabling a faster ramp. How many CPU design teams are designing Arm processors? A couple that are captive. Otherwise, it’s Arm. How many CPU design teams are designing RISC-V? Many. Andes has more than one team working on it.

Competition produces quality. Products that aren’t good won’t last. If designers have a software infrastructure with Arm and want to make a functional safety feature but don’t like Arm’s offering, they’re stuck. With RISC-V, they have a multitude of vendors that offer whatever’s needed.

Wilson: I’d like to dig into that. Historically, one RISC-V advantage is that it is open source. Designers can add their own instructions or pipeline. Is that true anymore with full support vendors beginning to organize the market? If so, what kind of verification challenge does a designer take on as an SoC developer?

Scheid: There are two sides. Certainly, there’s a verification aspect because a designer is taking on some of that design and making decisions about what’s appropriate as an architectural extension, which is powerful but there is risk and reward.

Another consideration is software ecosystem support. The amount of resources for any given instruction set architecture and the software system spent on software support is far greater than hardware.

Designers must consider the choice of going with standard customization or not. There is also a third path: proposing extensions into the RISC-V community for standardization and ratification as actual extensions. That can matter to a design team, depending on what it wants to keep proprietary. As a middle ground, it also allows a design team to customize and gain external ecosystem support over time by convincing the RISC-V community this is a value-added, viable extension.

Garibay: The ability to extend, especially the RISC-V instruction set is one of the great value-added propositions driving some of the motion toward the architecture.

How does that affect verification? Obviously, if the licensor is the one making the changes, it takes on some unique accountability and responsibility. As a company that licenses its IP, it has the responsibility to make it safe to create an environment around the deliverable so that the IP cannot be broken.

It’s an added burden for a CPU designer to create the environment to validate that statement is true to the extent that’s possible. It’s a critical value-add to the offering and worth spending engineering cycles to make it happen.

Licensors must make sure they create safe sandboxes for an end user or an SoC builder that are proprietary to the licensee. If the licensee wants a change and is willing to pay for it, great. If the licensor wants to put something in their hands and let them roll, it’s a collaborative process that must be part of the design from the beginning to make sure that it is done right.

Wilson: Is it possible to create a secure sandbox?

Garibay: Yes. I think Andes has done that.

Wilson: Do you end up having to rerun your verification suite after they’ve stabilized their new RTL?

Jones: A CPU IP vendor must put a sandbox around this unless it’s a company willing to do a custom design. If a designer wants to add a couple of special instructions, the company needs to make sure that the designer’s special instructions won’t break everything else. For instance, Andes has a capability through which designers can add their own computation instructions.

Wilson: You’re only allowed to break what you put into your custom?

Garibay: Yes, absolutely.

Jones: Designers have to verify their special add, sum, subtract, multiply on their own. That’s another question for the CPU vendor: How do you enable me to verify this? How do you enable me to write software to it? Buyer beware. You have to check.

Wilson: When we start talking about security, is there a role for formal tools, either for vendors or users?

Garibay: We use formal tools in our verification. One of the challenges with RISC-V is that its spec is not in a state that is easily usable by formal tools. I’d love to see the RISC-V spec go that way.

Scheid: We use a full spectrum of different formal methods within our implementation process. In terms of customization, the method that makes the most sense for special add and would be the type of commercial tools that allow for C-to-RTL equivalency checking. With the right type of sandbox approach, it could be directly applicable to solving that problem for a designer’s customization.

Jones: I’ll take a little bit of the high road. You always have to ask the question about formal and formal means different things to different people. Formal has a role; a CPU vendor may be questioned about whether they used formal and for what and what did the formal tools find? Formal is good where traditional debug is difficult such as ensuring a combination of states can never be hit. Proving a negative is difficult for debug but is a strength of formal.

Scheid: For formal’s place in this space, I come back to the ability to do customization of instructions, an attractive feature of RISC-V. Clearly, customization can be done with Arm, but a much larger check needs to be written there. It’s attractive even when weighed against the verification challenges it brings.

RISC-V carries a higher verification burden. A custom instruction set sends what needs to be verified completely through the roof. It’s doable. The verification bar is high, complex and focused on certain verticals. It’s also not for everybody. It’s a good attribute, but there’s a price in verification. Another interesting aspect of competition is the EDA space. Ventana is EDA and the only EDA vendor that does not provide any processor IP. The other two are vocal about RISC-V, creating an interesting situation with the market dynamic.

Also Read:

The RISC-V and Open-Source Functional Verification Challenge

Andes Technology is Expanding RISC-V’s Horizons in High-Performance Computing Applications

TetraMem Integrates Energy-Efficient In-Memory Computing with Andes RISC-V Vector Processor


Semidynamics: A Single-Software-Stack, Configurable and Customizable RISC-V Solution
by Kalar Rajendiran on 11-07-2024 at 6:00 am

Founded with a vision to create transformative, customizable IP solutions, Semidynamics has emerged as a significant player in the AI hardware industry. Initially operating as a design engineering company, Semidynamics spent its early years exploring various pathways before pivoting to develop proprietary intellectual property (IP) around 2019. With financial support from the European Union, they began by creating highly efficient Core and Vector Units, receiving recognition from the tech ecosystem.

Over the past year, Semidynamics has made several announcements highlighting their technology advancements and partnership engagements to support many fast-growing market segments. Their signature value proposition is a versatile “All-in-One IP” solution equipped to meet the demands of modern AI applications.

During the RISC-V Summit 2024, I sat down with Roger Espasa, Founder and CEO of Semidynamics, to get a holistic update. The following is a synthesis of that discussion.

A Unified, Programmable Solution

The heart of Semidynamics’ innovation lies in its commitment to a single software stack approach. In an industry where heterogeneous SoC (System on a Chip) architectures often combine CPUs, GPUs, and NPUs from multiple vendors, each with its own software stack, Semidynamics offers a streamlined alternative. By uniting Core, Vector, and Tensor processing units under a single software stack, they eliminate the inefficiencies commonly associated with multiple software stacks that rely heavily on data orchestration through Direct Memory Access (DMA) operations.

This unified solution is built on the RISC-V open-source architecture, ensuring adaptability and control. Semidynamics’ RISC-V-based architecture enables seamless communication between the Core and specialized units, allowing everything to run smoothly as a cohesive program. This differs from traditional designs where data is sent, processed, and returned in a fragmented sequence, leading to latency issues. Customers have responded positively to this innovation, appreciating the streamlined programming experience it provides.

Key Components of Semidynamics’ IP Solution

Core, Vector, and Tensor Units

Semidynamics’ “All-in-One IP” integrates three essential processing units—the Core, Vector, and Tensor units—working in harmony. While the Core handles general-purpose processing, the Vector unit manages 32-bit precision activations, and the Tensor unit is optimized for smaller data types, crucial for AI tasks like matrix multiplications in transformer models. The system dynamically balances workloads across these units to maximize performance.

Gazillion Misses™ IP

The Gazillion Misses IP is a specialized feature within their CPU Core that ensures high data availability for AI applications. With AI models requiring vast amounts of data, caches alone cannot keep up. Gazillion Misses IP addresses this challenge by continuously requesting data from the main memory, ensuring that both the Vector and Tensor units remain active and data-ready, a capability essential for managing complex models like transformers.

Out-of-Order Processing with Atrevido

In response to the growing demands of transformer-based models, Semidynamics offers an Out-of-Order processing architecture, dubbed “Atrevido.” This architecture ensures that, even as data demand surges, the processing units do not suffer from data starvation, maintaining smooth and efficient operation.

Configurability and Customization

Recognizing that each customer’s requirements vary, Semidynamics offers both configurable and customizable IP. Configuration involves selecting from standard options like cryptography and hypervisor support, while customization entails crafting special instructions based on customer requirements. This flexibility allows Semidynamics to serve a broad range of applications, from high-performance computing (HPC) to low-power security cameras.

RISC-V: The Backbone of Semidynamics’ Approach to Open Standards

Semidynamics’ choice of RISC-V as the foundation of their technology aligns with a broader industry shift towards open-source architectures. Similar to the freedom Linux brought to software, RISC-V liberates hardware developers from proprietary constraints. However, with the high costs associated with hardware tapeouts, choosing a solution partner becomes critical. Semidynamics not only brings flexibility and control to hardware but also future-proofing by grounding their technology in a general-purpose core that can adapt to new algorithms as they emerge.

Practical Engagement with Customers

Beyond IP delivery, Semidynamics ensures their clients have hands-on access to the technology. Once the RTL (Register Transfer Level) is delivered, customers can begin working immediately, with the option of testing on their multi-FPGA emulation platform. This engagement model accelerates integration and allows clients to adapt the IP to their needs in real-time.

Business Model

Semidynamics employs a straightforward business model that includes a licensing fee, a maintenance fee, and royalty options. This flexible structure ensures that customers pay for what they need, aligning the financial model with the technical customization Semidynamics provides.

A Blended Talent Pool

Based in Barcelona, Semidynamics boasts a team that combines industry veterans from companies like Intel, DEC, and Broadcom with young talent trained through rigorous immersion. This blend of experience and fresh perspectives ensures that the company remains innovative while drawing on deep industry knowledge.

Future-Proofing AI Hardware

AI hardware is undergoing rapid evolution, driven by emerging algorithms and models that challenge traditional computing frameworks. Semidynamics’ approach—anchoring Vector and Tensor units under CPU control—ensures that their IP can adapt to future AI trends. This scalability, combined with their focus on programmability, positions Semidynamics as a forward-thinking solution provider in the AI hardware space.

Summary

Semidynamics’ “All-in-One IP” solution strategically combines CPU, GPU, and NPU processing capabilities into a unified RISC-V architecture to meet the increasing demands of AI, machine learning, and edge computing. By implementing a single software stack, Semidynamics enables seamless control over Core, Vector, and Tensor units, minimizing the need for fallback to the CPU. This approach ensures efficient task distribution across specialized units and directly addresses the performance limitations highlighted by Amdahl’s Law, which focuses on bottlenecks from tasks that cannot be parallelized.
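
A quick worked example of the Amdahl’s Law argument (the fractions here are arbitrary, not Semidynamics data): if 90% of a workload can be offloaded to the Vector and Tensor units, the serial 10% left on the CPU caps the overall speedup no matter how fast the accelerators get.

def amdahl_speedup(parallel_fraction: float, accel_speedup: float) -> float:
    """Overall speedup when only part of the work can be accelerated."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / accel_speedup)

# Even an arbitrarily fast accelerator cannot beat 1 / serial_fraction (10x here).
for s in (10, 100, 1000):
    print(f"accelerator {s:>4}x faster -> overall {amdahl_speedup(0.9, s):.2f}x")
# 10x -> 5.26x, 100x -> 9.17x, 1000x -> 9.91x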

To prevent memory access issues that can slow down AI applications, Semidynamics developed Gazillion Misses™ technology. This technology continuously feeds data to the Vector and Tensor units from main memory, reducing idle time and supporting high-throughput processing, even for large, complex AI models. By combining a unified software stack, advanced memory management, and a customizable architecture, Semidynamics delivers an adaptable solution for various AI and HPC workloads, providing efficient, scalable, and future-ready performance.

To learn more, visit https://semidynamics.com/en

Also Read:

Gazzillion Misses – Making the Memory Wall Irrelevant

CEO Interview: Roger Espasa of Semidynamics

Semidynamics Shakes Up Embedded World 2024 with All-In-One AI IP to Power Nextgen AI Chips


Synopsys-Ansys 2.5D/3D Multi-Die Design Update: Learning from the Early Adopters
by Daniel Nenni on 11-06-2024 at 10:00 am

The demand for high-performance computing (HPC), data centers, and AI-driven applications has fueled the rise of 2.5D and 3D multi-die designs, offering superior performance, power efficiency, and packaging density. However, these benefits come with myriad challenges, such as multi-physics effects, that need to be addressed. As part of their long-standing partnership, Ansys and Synopsys are addressing these multi-die design challenges, bringing together cutting-edge technology and solutions to enhance the multi-die design and verification process from early architecture to manufacturing and reliability.

Multi-Die Design Challenges: Architecture and Early Prototyping

Multi-die designs are far more complex than traditional monolithic chip designs. The integration of multiple heterogeneous and homogeneous dies within a single package leads to significant challenges, particularly in thermal management, mechanical stress, and early architecture decisions. The initial architecture and die placement are major steps in the multi-die design process, requiring specialized tools. Synopsys 3DIC Compiler™ is an industry-leading solution that helps define the architecture of 2.5D/3D multi-die designs in a unified exploration-to-signoff platform. It enables chip designers to address early architectural challenges effectively, facilitating smoother transitions into early prototyping and ultimately to signoff.

Thermal awareness and mechanical reliability are major challenges that should be addressed as early as possible in the design cycle. Thermal challenges in multi-die designs can arise from temperature and thermal property differences between individual dies, die interconnects, and materials used in multi-die designs. Designers must thoroughly analyze each element to avoid costly redesigns later. Mechanical issues like stress and warpage can lead to failures if not addressed early in the design process. Ansys offers a comprehensive platform for tackling these physical challenges at an early stage. With software tools like Ansys RedHawk-SC Electrothermal™ and Ansys Icepak™, designers can efficiently address these issues to facilitate rapid prototyping and architectural exploration. Early-stage thermal and mechanical analysis is critical to prevent problems like hotspots, warping, and system failures due to poor heat dissipation or physical stress.

Importance of Early Verification

Verification at an early stage of multi-die design is pivotal. As multiple dies are stacked together in a small form factor, verifying the overall system becomes increasingly difficult, yet even more essential. Failure to catch potential issues early, such as thermal bottlenecks or power integrity problems, could lead to costly delays and suboptimal performance.

One of the key challenges in multi-die design is managing voltage drop and electromigration (EM/IR), which can lead to power integrity failures. Especially difficult is ensuring reliable power distribution in the vertical direction from interposer to chip, and between stacked chips. Supply currents for up to 200W need to be delivered through tiny microbumps, hybrid bonds, and through-silicon vias (TSVs). This requires very careful power integrity analysis down to the level of each individual bump.
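
A rough back-of-the-envelope calculation shows why the analysis has to resolve individual bumps; the supply voltage and bump count below are assumptions for illustration, only the 200W figure comes from the text:

# Illustrative numbers: vdd_v and n_power_bumps are assumptions.
power_w = 200.0           # delivered through the 3D stack
vdd_v = 0.75              # assumed core supply voltage
n_power_bumps = 20_000    # assumed number of power/ground microbumps

total_current_a = power_w / vdd_v
per_bump_ma = 1e3 * total_current_a / n_power_bumps
print(f"total supply current ~{total_current_a:.0f} A")    # ~267 A
print(f"average per-bump current ~{per_bump_ma:.1f} mA")   # ~13 mA
# Local hotspots can draw several times the average, which is why EM/IR
# signoff has to go down to each bump, TSV, and hybrid bond.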

Ansys RedHawk-SC Electrothermal offers advanced simulation capabilities for robust power integrity analysis while Synopsys 3DIC Compiler ensures that the design architecture meets the desired design goals by enabling feasibility and prototyping, and implementation and analysis, all in a single environment using a common data model. Under our existing partnership, Ansys and Synopsys provide designers with the necessary solutions to create resilient 2.5D/3D multi-die designs that can withstand the demands of modern high-performance computing environments.

The Role of AI in Multi-Die Designs

Artificial Intelligence (AI) is revolutionizing how designers approach 3DIC designs. AI-driven tools can automate many time-consuming processes, from early prototyping to layout optimization, significantly reducing the design cycle. As the complexity of multi-die design continues to grow, AI technology will become an essential component in handling massive design datasets, enabling smarter decisions and faster results.

The use of AI in design exploration can help optimize key parameters such as power efficiency, thermal distribution, and interconnect layout. This is not just a matter of saving time; AI’s ability to predict and automate design solutions can lead to more innovative and efficient architectures, allowing designers to focus on higher-level innovations.

The Golden Sign-off Tools

Ansys RedHawk-SC and Synopsys PrimeTime stand as first-class tools for signoff verification. Together, these tools provide designers with a robust verification framework, ensuring that the multi-die designs not only meet performance and power targets but also maintain reliability and longevity.

As multi-die design continues to evolve, the long-standing partnership between Ansys and Synopsys is leading the way in helping designers overcome the inherent complexities of multi-die  design. To learn more about the latest advances in this area,  attend the joint Ansys and Synopsys webinar by registering at Technology Update: The Latest Advances in Multi-Die Design to explore multi-die designs, key challenges, and how Synopsys and Ansys software solutions can help you overcome these obstacles. Learn how these tools can streamline the 2.5D/3D multi-die design process, enabling more efficient and effective designs.

Also Read:

Ansys and eShard Sign Agreement to Deliver Comprehensive Hardware Security Solution for Semiconductor Products

Ansys and NVIDIA Collaboration Will Be On Display at DAC 2024

Don’t Settle for Less Than Optimal – Get the Perfect Inductor Every Time


Podcast EP259: A View of the History and Future of Semiconductor Manufacturing From PDF Solution’s John Kibarian
by Daniel Nenni on 11-06-2024 at 8:00 am

Dan is joined by John Kibarian, president, chief executive officer and co-founder of PDF Solutions. He has served as president since 1991 and CEO since 2000.

John explains the evolution of PDF Solutions from its beginnings in 1992 to the present day. John describes moving from TCAD tools for design teams to a yield optimization focus working with fabs and equipment vendors. Today, PDF Solutions has customers in all three areas and a central focus is finding and cultivating the right data.

John comments on the expanding use of 3D processing and packaging as a driver for future innovation and the challenges that must be met. He comments on the growing development of electric cars and its impact on the semiconductor industry.

Looking forward, he sees a more nimble, software-based approach to manufacturing. John also comments on the impact chiplet-based design will have on the semiconductor supply chain and how AI will be used to improve the entire process.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Arteris Empowering Advances in Inference Accelerators
by Bernard Murphy on 11-06-2024 at 6:00 am

Systolic arrays, with their ability to highly parallelize matrix operations, are at the heart of many modern AI accelerators. Their regular structure is ideally suited to matrix/matrix multiplication, a repetitive sequence of row-by-column multiply-accumulate operations. But that regular structure is less than ideal for inference, where the real money will be made in AI. Here users expect real-time response and very low per-inference cost. In inference not all steps in acceleration can run efficiently on systolic arrays, even less so after aggressive model compression. Addressing this challenge is driving a lot of inference-centric innovation and Arteris NoCs are a key component in enabling that innovation. I talked with Andy Nightingale (Vice President of Product Management & Marketing, Arteris) and did some of my own research to learn more.

Challenges in inference

Take vision applications as an example. We think of vision AI as based on convolutional networks. Convolution continues to be an important component, and now models are adding transformer networks, notably in Vision Transformers (ViT). Reducing these more complex trained networks effectively for edge applications is proving to be a challenge.

Take first the transformer component in the model. Systolic arrays are ideally suited to accelerate the big matrix calculations central to transformers. Even more speedup is possible through pruning, zeroing weights which have low impact on accuracy. Calculation for zero weights can be skipped, so in principle enough pruning can offer even more acceleration. Unfortunately, this only works in scalar ALUs (or in vector ALUs to a lesser extent). But skipping individual matrix element calculations simply isn’t possible in a fixed systolic array.

The second problem is accelerating convolution calculations, which map to nested loops with at least four levels of nesting. The way to accelerate loops is to unroll the loops, but a two-dimensional systolic array can only support two levels of unrolling. That happens to be perfect for matrix multiplication but is two levels short of the need for convolution.
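
To make the loop-nesting point concrete, here is a minimal (deliberately naive) sketch of a 2D convolution for a single image; even with batch and stride ignored and the kernel window folded into one numpy call, it is already four loops deep, while a two-dimensional systolic array can only unroll two of those levels into hardware:

import numpy as np

def conv2d_naive(activations, weights):
    """activations: (C_in, H, W); weights: (C_out, C_in, K, K); no padding/stride."""
    c_in, h, w = activations.shape
    c_out, _, k, _ = weights.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for oc in range(c_out):              # loop 1: output channels
        for y in range(h - k + 1):       # loop 2: output rows
            for x in range(w - k + 1):   # loop 3: output columns
                for ic in range(c_in):   # loop 4: input channels
                    # two further implicit loops over the k x k kernel window
                    out[oc, y, x] += np.sum(
                        activations[ic, y:y + k, x:x + k] * weights[oc, ic]
                    )
    return out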

There are other areas where systolic arrays are an imperfect fit with AI model needs. Matrix/vector multiplications for example can run on a systolic array but leave most of the array idle since only one row (or column) is needed for the vector. And operations requiring more advanced math, such as softmax, can’t run on array nodes which only support multiply-accumulate.

All these issues raise an important question: “Are systolic array accelerators only useful for one part of the AI task, or can they accelerate more, or even all, needs of a model?” Architects and designers are working hard to meet that larger goal. Some methods include restructuring sparsity to enable skipping contiguous blocks of zeroes. For convolutions, one approach restructures convolutions into one dimension, which can be mapped onto individual rows in the array. This obviously requires specialized routing support beyond the systolic mesh. Other methods propose folding more general SIMD computation support into the systolic array. Conversely, some papers have proposed embedding a systolic array engine within a CPU.
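
One widely used restructuring along these lines (a generic im2col lowering, not necessarily the exact scheme the approaches above use) flattens each receptive field into a column of a matrix, so the whole convolution becomes a single matrix multiply that a systolic array handles natively:

import numpy as np

def conv2d_im2col(activations, weights):
    """Same shapes as conv2d_naive: (C_in, H, W) and (C_out, C_in, K, K)."""
    c_in, h, w = activations.shape
    c_out, _, k, _ = weights.shape
    oh, ow = h - k + 1, w - k + 1

    # Flatten each k x k x C_in receptive field into one column.
    cols = np.empty((c_in * k * k, oh * ow))
    for y in range(oh):
        for x in range(ow):
            cols[:, y * ow + x] = activations[:, y:y + k, x:x + k].ravel()

    # The convolution is now one matrix multiply: (C_out, C_in*K*K) @ cols.
    w_mat = weights.reshape(c_out, -1)
    return (w_mat @ cols).reshape(c_out, oh, ow)

Running conv2d_im2col and conv2d_naive on the same random tensors and checking that the results match is a convenient sanity test when experimenting with these mappings.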

Whatever path accelerator teams take, they now need much more flexible compute and connectivity options in their arrays.

Arteris NoC Soft Tiling for the New Generation of Accelerators

Based on what Arteris is seeing, a very popular direction is to increase the flexibility of accelerator arrays, to the point that some are now calling these new designs coarse-grained reconfigurable arrays (CGRA). One component of this change is in replacing simple MAC processing elements with more complex processing elements (PEs) or even subsystems, as mentioned above. Another component extends current mesh network architectures to allow for various levels of reconfigurability, so an accelerator could look like a systolic array for an attention calculation, or a 1D array for a 1D convolution or vector calculation.

You could argue that architects and designers could do this themselves today without additional support, but they are seeing real problems in their own ability to manage the implied complexity, leading them to want to build these more complex accelerators bottom-up. First build a basic tile, say a 4×4 array of PEs, then thoroughly validate, debug and profile that basic tile. Within a tile, CPUs and other (non-NoC) IPs must connect to the NoC through appropriate network interface units, for AXI as an example. A tile becomes in effect rather like an IP, requiring all the care and attention needed to fully validate an IP.

Once a base tile is ready, it can be replicated. The Arteris framework for tiling allows for direct NoC-to-NoC connections between tiles, without need for translation to and from a standard protocol (which would add undesirable latency), allowing you to step and repeat that proven 4×4 tile into an 8×8 structure. Arteris will also take care of updating node IDs for traffic routing.

More room for innovation

I also talked to Andy about what at first seemed crazy – selective dynamic frequency scaling in an accelerator array. Turns out this is not crazy and is backed by a published paper. Note that this is frequency scaling only, not frequency and voltage scaling since voltage scaling would add too much latency. The authors propose switching frequency between layers of a neural net and claim improved frames per second and reduced power.

Equally intriguing, some work has been done on multi-tasking in accelerator arrays for handling multiple different networks simultaneously, splitting layers in each into separate threads which can run concurrently in the array. I would guess that maybe this could also take advantage of partitioned frequency scaling?

All of this is good for Arteris because they have had support in place for frequency (and voltage) scaling within their networks for many years.

Fascinating stuff. Hardware evolution for AI is clearly not done yet. Incidentally, Arteris already supports tiling in FlexNoC for non-coherent NoC generation and plans to support the same capability for Ncore coherent NoCs later in the year.

You can learn more HERE.

Also Read:

Arteris at the 2024 Design Automation Conference

Arteris is Solving SoC Integration Challenges

Arteris Frames Network-On-Chip Topologies in the Car