Synopsys SNUG Silicon Valley Conference 2024: Powering Innovation in the Era of Pervasive Intelligence
by Kalar Rajendiran on 03-29-2024 at 6:00 am

AI Powered Hyperconvergence Tools Offerings

After the leadership transition at the top, Synopsys had just a little more than two months before the company’s flagship event, the Synopsys User Group (SNUG) conference. The Synopsys user community and entire ecosystem were waiting to hear new CEO Sassine Ghazi’s keynote to learn where the company is going and its strategic vectors. Sassine, his executive team and the entire company delivered an amazing SNUG 2024.

Right after Sassine took over as CEO, SemiWiki had posted its take on why and how Synopsys is geared for the next era's opportunities and growth. SNUG 2024, with the tagline "Our Technology, Your Innovation," provided an excellent avenue for Synopsys to share insights on how the company is enhancing value for all stakeholders in the ecosystem. This was corroborated by numerous testimonials heard throughout the event from many different companies, large, medium, and emerging alike.

The event kicked off with a keynote talk that framed the two-day program and included several news announcements. An added bonus during the keynote was a special in-person appearance by Jensen Huang, founder and CEO of Nvidia, for an interactive Q&A session with Sassine. The keynote covered the three main trends of increasing silicon complexity, the productivity bottleneck, and the intersection of silicon and systems, as well as how Synopsys is addressing these with its technology solutions. The following is a synthesis of the keynote session.

Enabling Innovation with IP and EDA Solutions

At the core of Synopsys’ strategy lies its extensive portfolio of Intellectual Property (IP), cultivated over 25 years of industry experience. Synopsys’ IP offerings include a wide range of pre-designed functional blocks and subsystems, covering everything from processor cores and memory controllers to interface IP and analog IP. This IP serves as foundational building blocks for silicon design, enabling customers to differentiate their products while streamlining the design process. Synopsys remains at the forefront of emerging technologies, ensuring customers have access to the latest IP standards and market trends, thus empowering them to stay ahead of the curve. One example is the company’s announcement at SNUG that it has acquired Intrinsic ID, a leading provider of physical unclonable function (PUF) IP used in SoC design. This addition to Synopsys’ semiconductor IP portfolio provides an additional level of hardware security that is critical for today’s embedded applications and IoT devices.

Synopsys provides a comprehensive suite of Electronic Design Automation (EDA) tools, powered by artificial intelligence (AI). From RTL synthesis and simulation to place-and-route and sign-off, Synopsys’ EDA solutions cover every aspect of the design flow, enabling designers to optimize their designs for performance, power, and area. By infusing AI into every facet of the design process, Synopsys enables customers to achieve breakthroughs in efficiency and productivity, thereby redefining the boundaries of silicon design. At SNUG, Synopsys announced the development of 3DSO.ai, a new AI-driven capability for 3D design space optimization and architectural exploration using native thermal analysis. The new capability is built into Synopsys 3DIC Compiler to deliver significant productivity gains while also maximizing performance and quality of results. Moreover, with the evolution to heterogeneous SoCs, Synopsys’ EDA tools are tightly integrated with its IP portfolio, allowing for seamless interoperability and faster time-to-market.

Convergence of Silicon and Systems Design

With the rise of heterogeneous computing architectures and the proliferation of AI and machine learning workloads, designers must increasingly consider both silicon-level and system-level optimizations when designing their products. As hyperscaler companies invest heavily in silicon development to optimize workloads for specific applications, the traditional boundaries between chip design and system architecture are blurring. Synopsys recognizes the importance of this trend and offers solutions that bridge the gap between silicon and systems design.

Bridging the Gap between Silicon and Systems Design

Synopsys offers a range of solutions that span the entire design continuum, from silicon to systems. At SNUG, Synopsys unveiled two new hardware-assisted verification (HAV) solutions: Synopsys ZeBu® EP2, the latest version in the ZeBu EP family of unified emulation and prototyping systems, and Synopsys HAPS®-100 12, Synopsys’ highest-capacity, highest-density FPGA-based prototyping system. By providing designers with the tools and methodologies needed to optimize both the silicon and system aspects of their designs, Synopsys enables them to deliver products that meet the demanding performance and efficiency requirements of today’s markets.

Synopsys’ Holistic Approach

As discussed above, Synopsys’ response to the three main trends is characterized by its holistic approach. Rather than focusing on individual components or stages of the design process, Synopsys offers a comprehensive suite of solutions, or stack, that spans the entire design flow, from concept to production. This integrated approach enables designers to seamlessly transition between different stages of the design process, ensuring continuity, efficiency, and accuracy at every step. By working closely with industry partners, customers, and academic institutions, Synopsys is able to stay at the forefront of emerging technologies and trends. This collaborative ecosystem approach not only fosters knowledge sharing and best practices but also drives innovation and accelerates time-to-market for new products and technologies.

Summary

From tackling silicon complexity to embracing the convergence of silicon and systems design, Synopsys is at the forefront of shaping the future of technology. With its extensive portfolio of IP and EDA solutions powered by AI, coupled with a commitment to innovation and collaboration, Synopsys empowers the designer community to think and operate holistically. Designers can easily navigate the complexities of silicon design and deliver breakthrough products that drive the industry forward. From software-driven architecture exploration to hardware-assisted verification, Synopsys provides customers with the tools needed to navigate the convergence of silicon and systems design.

As the semiconductor landscape continues to evolve, Synopsys remains steadfast in its mission to drive technological advancement and enable innovation for years to come. Below are some recent announcements relating to the topic of this keynote.

Synopsys Announces New AI-Driven EDA, IP and Systems Design Solutions At SNUG Silicon Valley

Synopsys Expands Semiconductor IP Portfolio With Acquisition of Intrinsic ID

Jensen Huang’s special appearance for an interactive Q&A during Sassine’s keynote talk at SNUG 2024 centered on the following announcement and the two companies’ decades-long working relationship.

Synopsys Showcases EDA Performance and Next-Gen Capabilities with NVIDIA Accelerated Computing, Generative AI and Omniverse

Also Read:

2024 DVCon US Panel: Overcoming the challenges of multi-die systems verification

Synopsys Enhances PPA with Backside Routing

Complete 1.6T Ethernet IP Solution to Drive AI and Hyperscale Data Center Chips


Ultra-low-power MIPI use case for streaming sensors
by Don Dingee on 03-28-2024 at 10:00 am

Mixel D PHY TX+ for ultra-low-power MIPI streaming sensors

MIPI built its reputation on the efficient streaming of data from camera sensors in mobile devices. It combines high-speed transfers with balanced power consumption, helping extend battery life while providing the responsiveness users expect. However, high speed is not the only mode of operation for a MIPI interface – specifications also enable low power modes for slower data transfers, going to an ultra-low-power shutdown state when data communication is inactive. These low-power modes are gaining more attention as MIPI-based cameras see adoption in automotive, IoT, augmented and virtual reality (AR and VR), industrial, and medical applications. A new white paper from Mixel jointly authored with ams-OSRAM outlines an ultra-low-power MIPI use case for streaming sensors.

Borrowing a proven power management concept

The idea behind various power modes in power-sensitive applications isn’t new. Commercial microcontrollers specify modes like full-on, doze, nap, sleep, and deep-sleep, intending that staying in a lower-power mode as much as possible conserves power. Low-duty-cycle operation often matches sensor applications with lower sample rates and periodic data bursts. Work like computations and data transmission happens around each burst, returning to sleep between bursts. The result is a much lower average power consumption.

Streaming sensors, like digital image sensors, pose a different problem. Delivering video for human consumption requires more pixels and faster frame rates; otherwise, the experience becomes uncomfortable to watch. Higher-resolution, higher-frame-rate video is costly, requiring a more powerful SoC to sample, process, and transmit the stream. SoC designers can use clock and power gating techniques to shut down IP blocks when they are unneeded, but when the video stream is on, there seems to be no choice except to use more power.

In many sensor applications, the point is for humans not to watch the streaming video 24/7. The sensor should be smart enough to monitor a scene with nothing of interest until a moment when something starts happening. A dual-context sensor can lower frame rate, resolution, and MIPI transfer rates, shifting into full-power mode only when a change occurs, such as motion. Managing event detection creates a massive power-saving opportunity, maybe 20x or more.

Changing lanes of the MIPI interface

MIPI architects anticipated scenarios like these when they wrote the MIPI specifications. Compliant IP blocks support both a high-speed and a low-power transmit mode, with the latter running at a fraction of the data rate. Control logic external to the IP block determines which mode to use.

Mixel MIPI IP achieves remarkable efficiency in any operating mode. For ams-OSRAM, Mixel customized its D-PHY TX+ solution, incorporating D-PHY v2.1 and CSI-2 v1.3 functions in a single IP block. In high-speed mode (HS-TX in the diagram), the lane runs at 1.5 Gbps, while in low-power mode (LP-TX in the diagram), the lane shifts down to 10 Mbps. If more throughput is needed, a 4-lane version is available. Built-in self-test (BIST) logic in the hard macro and CIL RTL exercises both modes, providing 100% test coverage for the block. Mixel indicates the customized D-PHY TX+ uses 30% less area than the comparable D-PHY Universal configuration and reduces leakage power by 40%.

ams-OSRAM took power savings further in their single-chip Mira050 sensor with an active pixel array on-chip by coordinating the sensor resolution sampling, frame rate, MIPI modes, and clock rates. Their fast-switching controller helps their image sensor go from full-on streaming using 75mW to total standby using only 60uW, and they indicate their reliable motion detection with a novel tiling algorithm (described in-depth in the white paper) is possible using around 3mW.
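
The power saving is simple duty-cycle arithmetic. The sketch below plugs in the Mira050 figures quoted above; the split of time between streaming, motion detection, and standby is an assumed scenario, purely for illustration.

```python
# Duty-cycle average power using the Mira050 figures quoted above:
# ~75 mW streaming, ~3 mW motion detection, ~60 uW standby.
# The time split (stream 1%, detect 50%, standby 49%) is assumed.
P_STREAM_MW, P_DETECT_MW, P_STANDBY_MW = 75.0, 3.0, 0.060

def avg_power_mw(stream_frac, detect_frac):
    standby_frac = 1.0 - stream_frac - detect_frac
    return (stream_frac * P_STREAM_MW
            + detect_frac * P_DETECT_MW
            + standby_frac * P_STANDBY_MW)

always_on = P_STREAM_MW
duty_cycled = avg_power_mw(stream_frac=0.01, detect_frac=0.50)
print(f"always-on: {always_on:.1f} mW, duty-cycled: {duty_cycled:.2f} mW "
      f"(~{always_on / duty_cycled:.0f}x lower)")
```

With those assumptions the average drops from 75 mW to roughly 2.3 mW, consistent with the 20x-or-more saving estimate mentioned earlier.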

More use cases for ultra-low-power MIPI ahead

Ultra-low-power MIPI streaming sensors open use cases like home security cameras, consumer robotics, e-door locks, and AR/VR wearables. As Mixel puts it, “Any use case combining the elements of a small physical footprint and reduced power consumption yet requiring quality image processing for object classification and event detection can benefit from MIPI integration.” Mixel combines high-speed transfer capability with low-power modes, smaller footprints, reduced power, and increased testability in its MIPI IP solutions.

Learn more about the ams-OSRAM approach to ultra-low-power MIPI in the Mira050 with the Mixel MIPI D-PHY IP in this white paper:

MIPI Deployment in Ultra-low-power Streaming Sensors


Arm Automotive Update Stresses Prototyping for Software Development
by Bernard Murphy on 03-28-2024 at 6:00 am


If you were at all uncertain about auto OEM development priorities, the answer is becoming clear: to accelerate/shift left automotive software development and debug. At 100M lines of code and accelerating, this task is overshadowing all others. A recent Arm update from Dipti Vachani (SVP and GM for the Automotive Line of Business) led with their new emphasis on support for virtual prototyping for software development. Very interesting given Accellera’s recent update on Federated Simulation as an intended standard for whole-car software simulation (among other objectives). I have also written recently about increasing hardware complexity in zonal controllers and elsewhere, each requiring complex software services, further amplifying the software challenge. All our new tech goodies will amount to little if the software to coordinate the whole system cannot be developed in parallel.

Virtual prototyping solutions

Breaking with the standard Arm announcement flow, Dipti started her update here rather than on new cores (of which they have an abundance). I find this significant: not a token move to supporting software-dev shift-left, but the leading theme. That is not surprising, because to really shift left, the whole automotive stack (firmware, middleware, and applications across all subsystems in the car) must come together in parallel with the hardware.

Also of interest, Arm sees this digital twin running in the cloud. I sense a trend. I would imagine that partners in the stack can more easily collaborate in this way against an evolving digital twin. But also, per Dipti, software developed in Arm-based cloud instances (Graviton or Ampere for now) will be automatically portable to the Arm-based hardware platforms in the car. Sneaky. Arm is leveraging its established strength in the cloud to push an Arm preference into cars. Even more so if the hardware architecture leverages CSS instances (more on that later) for high-performance compute applications. As evidence that this isn’t just talk, she cites the AUTOWARE open autonomous stack (on the right of the above figure), containerized in Amazon AWS instances. Further, she adds that this capability can shorten the OEM system development cycle by up to two years.

Pretty compelling, though we should remember compute in a car is not just based on Arm technology. In-vehicle infotainment may run on a Qualcomm chip. Sensing, object detection and sensor fusion for vision, lidar and radar will run on complex AI pipelines using a variety of DSP, AI accelerator, and GPU functions along with communications. A complete solution to prototype software for the full car system will likely still need something like the Accellera Federated Simulation standard connecting virtual models from multiple sources in addition to Arm’s initiatives.

New IP options for automotive

Plenty of new info here around extensions to the Arm Automotive Enhanced (AE) family. Neoverse from the infrastructure product line has now been added to the AE portfolio as Neoverse V3AE, based on the high-performance V series already adopted in cloud datacenters. Applications are expected to be significant in central controllers, especially for software-defined vehicles. Arm has now announced that Nvidia Thor (aimed at the central controller) is based on this platform.

Cortex A720AE and A520AE add new features in support of ASIL B and D certification and provide cluster configurability between lock modes for safety and split modes for performance. Cortex R82AE extends real-time capability with 64-bit operation and 8-core clusters in support of safety islands, while the Mali C720AE ISP adds more support and configurability for human vision and computer vision pipelines. All support ASIL B and D requirements and features, of course (ASIL B and D seem to be the only ASIL levels mentioned these days; whatever happened to A and C?).

The final important piece of news in this announcement is that the Arm automotive LOB is now working on CSS (compute subsystem) cores for the AE product line. If you don’t know anything about CSS, these are preconfigured subsystems of Arm cores developed as a customizable compute subsystem, verified, validated, and PPA-optimized by Arm. CSS was first introduced for Neoverse. Arm finds these pre-designed and optimized subsystems are attractive to system designers on a deadline who don’t feel a need to keep re-inventing compute subsystems. I would bet auto system designers feel the same way. Automotive CSS is expected to become available in 2025.

Takeaways

My first takeaway is the growing support for automotive digital twins running in the cloud. Whether that means a single container for Arm-centric platforms or multiple containers orchestrated by Kubernetes or similar will depend on how soon the Accellera standard appears.

My second takeaway is that Arm has an interesting opportunity to extend its hegemony in cloud-based platforms to automotive platforms as well, simply by virtue of running on the same instruction set architecture in both domains.

You can read the press release HERE.

 


2024 Outlook with Srinivasa Kakumanu of MosChip
by Daniel Nenni on 03-27-2024 at 10:00 am

KS MD&CEO MosChip 2024

MosChip is a publicly traded company founded in 1999 that offers semiconductor design services, turnkey ASICs, software services, and end-to-end product engineering solutions. The company is headquartered in Hyderabad, India, with five design centers and over 1,300 engineers located in Silicon Valley (USA), Hyderabad, Bengaluru, Ahmedabad, and Pune. MosChip has more than two decades of track record designing semiconductor products and SoCs for computing, networking, and consumer applications, and has developed and shipped millions of connectivity ICs.

Tell us a little bit about yourself.
Hello, I’m Srinivasa Kakumanu, commonly known as KS. I’ve been in the semiconductor industry for over 28 years now. One of my notable accomplishments was co-founding First Pass Semiconductors Pvt Ltd, a prominent VLSI design services organization established in December 2010. Throughout my illustrious career, I have played a key role in leading numerous ASIC tape-outs across the Communication, Networking, Consumer, and Computing sectors.

Under my leadership, First Pass experienced significant growth, evolving into a thriving organization of more than 210 employees by FY18, all the while maintaining profitability since inception. This journey culminated in the acquisition of First Pass by MosChip in July 2018. Following the acquisition, I took on the role of heading the Semiconductor Business Unit at MosChip, steering it to remarkable heights.

Before my tenure at First Pass, I held the position of General Manager for the VLSI group at Cyient (formerly known as Infotech Enterprises) in India. My career also includes stints with notable organizations such as TTM Inc. in San Jose, US, and TTM India Pvt. Ltd. in Hyderabad, India (both acquired by Infotech in September 2008); Ikanos Communications in Fremont, US; QualCore Logic Ltd in India; and HAL in Hyderabad, among others.

I also maintain my professional education commitment by actively teaching Digital Design and Physical Design at MosChip Institute of Silicon Systems Pvt. Ltd, a training institute that I co-founded, which was subsequently acquired by MosChip in July 2018. My international experience includes a seven-year tenure in the United States between 2000 and 2007, where I contributed to TTM Inc. and Ikanos Communications.

What was the most exciting high point of 2023 for your company?
MosChip reached new heights in 2023, with some remarkable achievements. Firstly, we were honored to be recognized among India’s Top 150 Growth Champions and Asia-Pacific’s Top 500 high-growth companies by the Economic Times, Financial Times, and Statista. This recognition reflects our ongoing dedication to excellence and innovation in the semiconductor industry. Adding to this, on March 31, 2023, MosChip Technologies acquired Softnautics, a semiconductor and AI software solutions company based in California. This acquisition strengthened our position in the software sector, broadened our portfolio and capabilities, and set us up for worldwide success. We also welcomed Dr. Naveed Sherwani, a veteran of the semiconductor industry, to our Board of Directors with great pleasure. His knowledge will surely help us make better strategic decisions and drive our company forward.

On top of that, being recognized by Qualcomm as the most valuable supplier in the software category for 2022 confirmed our commitment to providing high-quality solutions and forming solid partnerships. Also, receiving the EE Times Asia Award 2023 for Most Influential Corporate in Asia for the second consecutive time was a humbling affirmation of our semiconductor industry excellence.

These 2023 milestones strengthen our determination to continue pushing boundaries, driving growth, and making a positive impact in the semiconductor and software sectors.

What was the biggest challenge your company faced in 2023?
The biggest challenge we faced in 2023 was a shortage of qualified chip design engineers in India’s semiconductor industry. The industry’s slow pace and hiring challenges triggered the situation. Despite increasing growth, finding and hiring skilled professionals, especially senior technical leaders, was tough. This challenge restricted our capacity to meet industry demands, but with my team and support from the other leaders, we made it through.

How is your company’s work addressing this biggest challenge?
To address this challenge, MosChip has taken significant initiatives to develop new talent in the semiconductor and software fields through our own finishing school, the MosChip Institute of Silicon Systems (M-ISS), which I co-founded and MosChip later acquired. There we educate and develop aspiring chip design and software engineers, giving them hands-on training and experience on the tools that industry professionals use so they are ready for the market. By cultivating this talent through our institute, we can close the skill gap and contribute to the growth and sustainability of India’s semiconductor ecosystem.

What do you think the biggest growth area for 2024 will be, and why?
From my perspective, the semiconductor and software (both Digital Engineering and Device Engineering) markets are expected to expand significantly this year. On the semiconductor front, next-generation memory technologies such as MRAM, ReRAM, HMC, and HBM have moved from studies to industrialization, with leading foundries and integrated device manufacturers (IDMs) qualifying STT-MRAM technology for a wide range of applications including power-efficient MCU/SoC chips, ASIC products, IoT devices, wearables, and CMOS image sensors. On top of that, the system design market is predicted to expand significantly in 2024, led by increasing consumer demand for electric vehicles (EVs). Significant growth is also expected across sectors such as telecommunications, healthcare, industrial IoT, consumer electronics, military, and aerospace. Emerging trends like chiplets, RISC-V, and AI/ML present exciting opportunities for innovation, which will help MosChip maintain its position as a leader in the industry. This will contribute to the overall growth of the semiconductor, software, and systems industries.

Reference: https://www.marketsandmarkets.com/Market-Reports/global-semiconductor-industry-outlook-201471467.html

https://www.linkedin.com/pulse/embedded-systems-market-growth-trends-forecast-2024-l0cxf/

How is your company’s work addressing this growth?
We are actively preparing for the significant growth expected in the semiconductor, software, and systems markets in 2024. We are investing in technological advancement around next-generation memory technologies, collaborating with industry leaders to ensure our products exceed strict requirements. With the recent acquisition of Softnautics, we are deepening our expertise in Digital Engineering and Device Engineering and positioning ourselves to take advantage of opportunities in both areas. Overall, our strategic activities are aimed at capitalizing on growth prospects and strengthening our position as a leader across the semiconductor, software, and systems industries.

Will you attend conferences in 2024? Same or more?
Yes, we plan to attend more conferences than we did before, covering our major geographies to meet customers in the USA, India, and Europe. Unlike our previous focus on semiconductor-specific events, we are now looking at a broader set of events covering semiconductors, product engineering, AI/ML, and more. While we value the networking and the chance to stay up to date with industry developments at these events, our decision to attend will be based on how relevant a conference is to our company’s goals and priorities for the year.

Additional questions or final comments?
As we look ahead, we want to highlight our unwavering dedication to our customers and stakeholders. We focus on offering high-quality solutions and maintaining strong relationships that create mutual success. Our commitment to customer satisfaction and exceeding expectations is at the heart of everything we do. As we develop ever-evolving solutions for the semiconductor, software, and systems industries, our customer-centric approach will stay constant, ensuring that we remain a trusted partner and industry leader for many years to come. We firmly believe that our employees are our biggest asset and, as such, we continuously prioritize their development and welfare.

Also Read:

CEO Interview: Larry Zu of Sarcina Technology

CEO Interview: Michael Sanie of Endura Technologies

Outlook 2024 with Dr. Laura Matz CEO of Athinia


Fault Simulation for AI Safety. Innovation in Verification
by Bernard Murphy on 03-27-2024 at 6:00 am


More automotive content 😀

In modern cars, safety is governed as much by AI-based functions as by traditional logic and software. How can these functions be fault-graded for FMEDA analysis? Paul Cunningham (GM, Verification at Cadence), Raúl Camposano (Silicon Catalyst, entrepreneur, former Synopsys CTO and now Silvaco CTO) and I continue our series on research ideas. As always, feedback welcome.

The Innovation

This month’s pick is SiFI-AI: A Fast and Flexible RTL Fault Simulation Framework Tailored for AI Models and Accelerators. This article was published in the 2023 Great Lakes Symposium on VLSI. The authors are from the Karlsruhe Institute of Technology, Germany.

ISO 26262 requires safety analysis based on FMEDA methods using fault simulation to assess sensitivity of critical functions to transient and systematic faults, and the effectiveness of mitigation logic to guard against errors. Analysis starts with design expert understanding of what high-level behaviors must be guaranteed together with what realistic failures might propagate errors in those behaviors.

This expert know-how is already understood for conventional logic and software but not yet for AI models (neural nets) and the accelerators on which they run. Safety engineers need help exploring failure modes and effects in AI components to know where and how to inject faults into models and hardware. Further, that analysis must run at practical speeds on the large models common for DNNs. The authors propose a new technique which they say runs much faster than current methods.

Paul’s view

A thought-provoking and intriguing paper: how do you assess the risk of random hardware faults in an AI accelerator used for driver assist or autonomous drive? AI inference is itself a statistical method, so determining the relationship between a random bit flip somewhere in the accelerator and an incorrect inference is non-trivial.

This paper proposes building a system that can “swap in” a real RTL simulation of a single layer of a neural network into an otherwise pure software-based inference of that network in PyTorch. A fault can be injected into the layer being RTL-simulated to assess the impact of that fault on the overall inference operation.
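
To make the idea concrete, here is a minimal software-only sketch in the spirit of the experiment: flip one random bit in the weights of a single layer of a PyTorch model and check whether the top-1 classification changes. This is not the SiFI-AI framework itself (which instead swaps in cycle-accurate RTL simulation of the faulted layer on the Gemmini accelerator); the model, layer, and input below are placeholders purely for illustration.

```python
# Minimal bit-flip fault-injection sketch in PyTorch (illustration only; the
# paper's SiFI-AI framework RTL-simulates the faulted layer on Gemmini instead).
import copy
import numpy as np
import torch
import torchvision.models as models

def flip_random_weight_bit(layer, rng):
    """Flip one random bit in a layer's float32 weights (a crude SEU model)."""
    w = layer.weight.detach().numpy()        # shares memory with the CPU tensor
    bits = w.view(np.uint32).reshape(-1)     # reinterpret the float bit patterns
    idx, bit = rng.integers(bits.size), rng.integers(32)
    bits[idx] ^= np.uint32(1 << int(bit))
    return idx, bit

rng = np.random.default_rng(0)
golden = models.resnet18(weights=None).eval()   # untrained weights, placeholder
faulty = copy.deepcopy(golden)
flip_random_weight_bit(faulty.layer1[0].conv1, rng)

x = torch.randn(1, 3, 224, 224)                 # stand-in for a real image
with torch.no_grad():
    changed = golden(x).argmax(dim=1) != faulty(x).argmax(dim=1)
print("top-1 classification changed by the fault:", bool(changed.item()))
```

Repeating this over many random faults and inputs is what yields error-probability statistics like those the authors report.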

The authors demonstrate their method on the Gemmini open-source AI accelerator running the ResNet-18 and GoogLeNet image classification networks. They observe that each element of the Gemmini accelerator array has 3 registers (input activation, weight, and partial sum) and a weight select signal, together giving 4 possible fault-injection targets. They run 1.5M inference experiments, each with a random fault injected, checking whether the top-1 classification out of the network is incorrect. Their runtime is an impressive 7x faster than prior work, and their charts validate the intuitive expectation that faults in earlier layers of the network are more impactful than those in deeper layers.

Also, it’s clear from their data that some form of hardware safety mechanism (e.g. triple-voting) is warranted since the absolute probability of a top-1 classification error is 2-8% for faults in the first 10 layers of the network. That’s way too high for a safe driving experience!

Raúl’s view

The main contribution of SiFI-AI is simulating transient faults in DNN accelerators combining fast AI inference with cycle-accurate RTL simulation and condition-based fault injection. This is 7x faster than the state of the art (reference 2, Condia et al, Combining Architectural Simulation and Software Fault Injection for a Fast and Accurate CNNs Reliability Evaluation on GPUs). The trick is to simulate only what is necessary in slow cycle-accurate RTL. The faults modeled are single-event upset (SEU), i.e., transient bit-flips induced by external effects such as radiation and charged particles, which persist until the next write operation. To find out whether a single fault will cause an error is especially difficult in this case; the high degree of data reuse could lead to significant fault propagation, and fault simulation needs to take both the hardware architecture and the DNN model topology into account.

SiFI-AI integrates the hardware simulation into the ML framework (PyTorch). For HW simulation it uses Verilator, a free and open-source Verilog simulator, to generate cycle-accurate RTL models. A fault controller manages fault injection as directed by the user, using a condition-based approach, i.e., a list of conditions that prevent a fault from being masked. To select what part is simulated in RTL, it decomposes layers into smaller tiles based on “the layer properties, loop tiling strategy, accelerator layout, and the respective fault” and selects a tile.

The device tested in the experimental part is Gemmini, a systolic array DNN accelerator created at UC Berkeley in the Chipyard project, in a configuration of 16×16 processing elements (PE). SiFI-AI performs a resilience study with 1.5 M fault injection experiments on two typical DNN workloads, ResNet-18 and GoogLeNet. Faults are injected into three PE data registers and one control signal, as specified by the user. Results show a low error probability, confirming the resilience of DNNs. They also show that control signal faults have much more impact than data signal faults, and that wide and shallow layers are more susceptible than narrow and deep layers.

This is a good paper which advances the field of DNN reliability evaluation. The paper is well-written and clear and provides sufficient details and references to support the claims and results. Even though the core idea of combining simulation at different levels is old, the authors use it very effectively. Frameworks like SiFI-AI can help designers and researchers optimize their architectures and make them more resilient. I also like the analysis of the fault impact on different layers and signals, which reveals some interesting insights. The paper could be improved by providing more information on the fault injection strategy and the selection of the tiles. Despite the topic being quite specific, overall, a very enjoyable paper!


A Modeling, Simulation, Exploration and Collaborative Platform to Develop Electronics and SoCs
by Daniel Payne on 03-26-2024 at 10:00 am

Demo Chiplet System with CPU, DSP, GPU, IO, AI

During the GOMACTech conference held in South Carolina last week, I had a Zoom call with Deepak Shankar, Founder and VP of Technology at Mirabilis Design Inc., to ask questions and view a live demo of VisualSim, a modeling, simulation, exploration, and collaborative platform for developing electronics and SoCs. What makes VisualSim so distinctive is that it comes bundled with about 500 high-level IP blocks ready to use, including 35 ARM processors, around 100 processors in total, and over 30 different interconnects. Users of VisualSim quickly connect these IP blocks together visually to create their systems, complete with networks. An automotive designer can model the entire network, including 5G communications, Ethernet, SDA and OTA updates, with VisualSim.

A high-level model allows for the quickest architectural exploration and trade-offs, well before implementation begins with RTL code. You can model complex components like buses, memories, and even caches, measuring things like end-to-end delays and latency. Engineers can measure their cache hit/miss ratio and what happens with requests to L2 caches. All the popular network protocols are modeled: AXI, CHI, CMN600, Arteris NoC, UCIe, etc.
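
As a flavor of the kind of back-of-the-envelope question such a model answers, here is a tiny average-memory-access-time (AMAT) comparison. The cycle counts and hit rates are hypothetical, purely for illustration; a tool like VisualSim would derive such metrics from a full system model rather than a formula.

```python
# Average memory access time (AMAT) for a two-level cache hierarchy:
# AMAT = L1 hit time + L1 miss rate * (L2 hit time + L2 miss rate * DRAM penalty)
# All cycle counts and hit rates below are hypothetical, for illustration only.
def amat(l1_hit, l1_hit_rate, l2_hit, l2_hit_rate, dram_penalty):
    return l1_hit + (1 - l1_hit_rate) * (l2_hit + (1 - l2_hit_rate) * dram_penalty)

baseline  = amat(l1_hit=4, l1_hit_rate=0.95, l2_hit=12, l2_hit_rate=0.80, dram_penalty=200)
larger_l2 = amat(l1_hit=4, l1_hit_rate=0.95, l2_hit=14, l2_hit_rate=0.90, dram_penalty=200)
print(f"baseline AMAT: {baseline:.2f} cycles, larger (slower) L2: {larger_l2:.2f} cycles")
```

Here a larger but slightly slower L2 still wins, exactly the sort of trade-off worth settling before any RTL exists.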

With this modeling approach an architect can model an SoC, a complete aircraft, or an automotive system, and then begin to measure its performance to see if it meets the requirements. VisualSim is a multi-domain simulator that can integrate analog, software, power systems, digital, and networking into a single model.

For the live demo Deepak showed me a chiplet-based design that had separate chiplets for the DSP, GPU, AI processor, and CPU, all connected together using UCIe, and each IP block was parameterized to allow for customization and exploration.

Demo Chiplet System with CPU, DSP, GPU, IO, AI

Pushing into the UCIe block revealed an IP called a UCIe switch, which a user can customize with five parameters, all at a high level.

UCIe Switch parameters

A router IP block had 10 parameters for customization.

Router parameters

To find each IP block there was a scrollable list on the left-hand side of the GUI, with each family of IP in the library. In a matter of seconds you can browse, select and start customizing an IP.

IP block list

In VisualSim you are connecting each IP in the dataflow, staying at a high level. The next live demo was a multimedia system design; simulating 20 ms took about 15 seconds of wall time, running on a laptop. While the simulation is running you can view instantaneous power, pipeline utilization, cache utilization, and memory usage, and even view a timing diagram. This simulation run triggered 7.5 million events, and the customer built the model, covering the entire SoC, in under two weeks.

Multimedia system, timing diagram

Another customer example that Deepak mentioned included 45 masters and was completed in about four weeks, fully tested.

You can look inside any of the IP blocks and analyze metrics like pass/fail, then understand why it failed. There’s even an AI engine to help analyze data more efficiently, like finding a buffer overflow which caused a failure. While your model is running there are analytics captured to help measure system performance and identify architectural bottlenecks.

VisualSim is updated twice per year, and then there are patch updates for when new IP blocks are added. An architect defines requirements in an Excel file, with metrics like latency limits and buffer occupancy.

Requirements file

Users of VisualSim can define the range of payload size in terms of bytes, speed ranges and preferred values. Your system model can be swept across the combinations to find the best set of parameters. The simulator even understands how to explore the min, max, and preferred values. You get to define which system parameters will be explored. A multimedia system demo was shown next and then simulated live.

Multimedia System
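
The sweep described above can be illustrated with a generic brute-force search over parameter combinations against a requirement. To be clear, this is not VisualSim's interface; the cost model, header size, and latency limit below are entirely hypothetical.

```python
# Generic design-space sweep: pick the highest-goodput (payload, link speed)
# combination that still meets a latency requirement. Not VisualSim's API;
# the cost model, header size, and latency limit are assumed for illustration.
from itertools import product

payload_bytes    = [64, 256, 1024]   # min / preferred / max, assumed values
link_speed_gbps  = [8, 16, 32]
LATENCY_LIMIT_US = 0.6               # requirement, as might come from the Excel file

def one_run(payload, speed):
    """Stand-in for one simulation run: returns (latency_us, goodput_gbps)."""
    latency = payload * 8 / (speed * 1e3) + 0.5   # serialization + fixed overhead
    goodput = speed * payload / (payload + 32)    # assume a 32-byte header
    return latency, goodput

feasible = [(p, s) for p, s in product(payload_bytes, link_speed_gbps)
            if one_run(p, s)[0] <= LATENCY_LIMIT_US]
best = max(feasible, key=lambda cfg: one_run(*cfg)[1])
print("best (payload, speed) meeting the latency limit:", best)
```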

For an FPGA block you choose the vendor and part number, and then after a simulation has run you can see the latency for each task and the channel statistics of the NoC. A power plot was shown for 1 second of operation using Xilinx Versal parts.

Power Plot

All of the live demos were run on a Windows laptop; Unix and Mac are also supported. Running VisualSim requires minimal hardware infrastructure because the models are high level.

VisualSim users receive over 500 examples that are pre-built to help get you started quickly, like a complete communication system with an Antenna, Transceiver, FPGA with baseband, and Ethernet interface. System architects using VisualSim can collaborate with all the low-level specialists, like RTL designers.

System-level trade-offs can be modeled and evaluated, like:

  • Changing from 64-QAM to QPSK modulation (see the rate sketch below)
  • Faster to slower processor
  • Changing Ethernet specs
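
The first trade-off in the list above comes down to bits per symbol. A quick calculation (the symbol rate is assumed, purely for illustration) shows the raw-rate gap that a system model would then weigh against SNR and error-rate requirements.

```python
# Raw bit rate at a fixed symbol rate for the modulation trade-off above.
# The 100 Msym/s symbol rate is assumed, purely for illustration.
import math

SYMBOL_RATE_MSPS = 100
for name, points in [("64-QAM", 64), ("QPSK", 4)]:
    bits = int(math.log2(points))   # bits carried per symbol
    print(f"{name}: {bits} bits/symbol -> {SYMBOL_RATE_MSPS * bits} Mb/s raw")
```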

If you start with VisualSim to model, implement, then measure, expect to see 95% accuracy compared to RTL implementation results. The promise of using high level models is to eliminate performance issues prior to implementation or integration. There really is no coding required for an entire system model.

Mirabilis has 65 customers worldwide so far and some 250 projects completed. Some of the well-known clients include NASA, Samsung, Qualcomm, Broadcom, GM, Boeing, HP, Imagination, Raytheon, AMD, and Northrop Grumman.

Summary

In the old days a systems designer may have sketched ideas on a napkin at a restaurant, then gone back to work and cobbled together some Excel spreadsheets with arcane equations to create a model. Today there’s a new choice, and that’s giving VisualSim from Mirabilis a try. You can now model an entire system in just a few weeks, making architectural trade-offs while running actual simulations, all before getting into implementation details.

Related Blogs


Weebit Nano Brings ReRAM Benefits to the Automotive Market
by Mike Gianfagna on 03-26-2024 at 6:00 am


Non-volatile memory (NVM) is a critical building block for most electronic systems. The most popular NVM technology has traditionally been flash. As a discrete part, the technology can be delivered in various form factors. For embedded applications, however, flash presents scaling challenges. A new NVM technology developed by Weebit Nano is called ReRAM. Sometimes called RRAM, this approach stores bits as resistance rather than as charge, the approach prevalent in other memory technologies. NVM is used in many parts of automotive systems, as shown in the diagram at the top of this post. The problem is that automotive systems present many challenges around operating temperature, safety, and reliability. Using ReRAM for embedded applications has been hampered by these hurdles, until recently. Read on to see how Weebit Nano brings ReRAM benefits to the automotive market.

Weebit Nano Opens Access to Automotive Applications

Back in November of last year, Weebit Nano announced that its ReRAM IP achieved high-temperature qualification in SkyWater Technology’s 130nm CMOS (S130) process. The announcement detailed qualification up to 125 degrees Celsius, the temperature specified for Grade-1 automotive applications. This temperature range also opens up industrial, aerospace, and other high-temperature applications. You can read the details of the announcement here.

Last month, the company raised the bar on automotive access by detailing high reliability and endurance at extreme temperatures and after extensive cycling. Specifically, high endurance was demonstrated at 100K flash-equivalent cycles and high-temperature stability was demonstrated at 150 degrees Celsius lifetime operation, including cycling and retention. The details are shown in the image, below. This clearly moves ReRAM much closer to mainstream use in automotive applications.

Image: Resistance distribution after 100K cycles at 150C. The Weebit performance demonstrates good BER throughout the entire 100K cycles at hot temperatures.

Coby Hanoch, Weebit Nano’s CEO commented, “The performance levels we’re achieving align with requirements specified by automotive companies. Demonstrating the resilience of Weebit ReRAM under these conditions will continue to enhance our position in this domain. Our latest results reaffirm the viability of Weebit ReRAM for use in microcontrollers and other automotive components, as well as numerous other applications requiring high-temperature reliability and extended endurance. Weebit ReRAM is ideal for these applications, offering advantages including ease of integration, cost effectiveness, power efficiency and tolerance to radiation and electromagnetic fields.”

You can read the full text of the announcement here.

A Closer Look at the Technology and the Challenges

According to the International Roadmap for Devices and Systems, 2022 Edition:

One challenge is the need of a new memory technology that combines the best features of current memories in a fabrication technology compatible with CMOS process flow and that can be scaled beyond the present limits of SRAM and FLASH.

Weebit Nano’s ReRAM technology offers a very cost-effective solution to this NVM need. Some specifics of the technology include:

  • Two-mask adder
    • Very few added steps compared to other NVM technologies
    • Lower wafer cost than competing NVM technologies
  • Fab-friendly materials
    • No contamination risk, No special handling, etc.
  • Using existing deposition techniques and tools
    • Easy to integrate into any CMOS fab
  • BEOL technology
    • Stack between any two metal layers
    • No interference with FEOL – Easier to embed with existing analog and RF circuits
    • Easy to scale from one process variation to another

Some of the growing needs for emerging automotive NVM application include code storage, trimming and data logging. Weebit ReRAM delivers high-temperature reliability, immunity to EMI, endurance, fast switching speed, longevity, and secure operation. And the technology can scale to the most advanced process nodes.

Automotive chips have unique requirements, such as design for safety, security and longevity. Devices must be reliable against extreme temperatures, EMI, vibration, and humidity. Fast boot, instant response, frequent over-the-air updates must also be supported. All these requirements mean advanced process nodes are adopted quickly, and this is where Weebit Nano’s technology shows great promise.

General ICs are qualified according to JEDEC standards – this is the baseline for consumer application markets. The automotive industry follows AEC-Q100 standards (Stress Test Qualification for Integrated Circuits). For automotive qualified ICs, tests are much stricter than those of an industrial or commercial IC. These stringent qualification tests assure reliable operation and long lifetimes in harsh automotive environments.

This is why Weebit Nano’s advanced testing work is so significant for automotive applications. The technology is also relevant for a wider range of applications, as shown in the figure below.

ReRAM Addresses a Broad Range of Application Requirements

To Learn More

You can learn more about the benefits of ReRAM technology here. You can also learn about the application of Weebit Nano’s ReRAM to power management here. Weebit Nano recently presented at the IEEE Electron Devices Technology and Manufacturing (IEEE EDTM) Conference. You can view this presentation here. And that’s how Weebit Nano brings ReRAM benefits to the automotive market.


2024 DVCon US Panel: Overcoming the challenges of multi-die systems verification
by Daniel Nenni on 03-25-2024 at 10:00 am

Dvcon 2024

DVCon US 2024 was very busy. Bernard Murphy and I were in attendance for SemiWiki, and he has already written about it. Multi-die and chiplets were again a popular topic. Lauro Rizzatti, a consultant specializing in hardware-assisted verification, moderated an engaging panel, sponsored by Synopsys, focusing on the intricacies of verifying multi-die systems. The panel, which attracted a significant audience, included esteemed experts such as Alex Starr, a Corporate Fellow at AMD; Bharat Vinta, Director of Hardware Engineering at Nvidia; Divyang Agrawal, Senior Director of RISC-V Cores at Tenstorrent; and Dr. Arturo Salz, a distinguished Fellow at Synopsys.

Presented below is a condensed transcript of the panel discussion edited for clarity and coherence.

Rizzatti: How is multi-die evolving and growing? More specifically, what advantages have you experienced? Any drawback you can share?

Starr: By now, I think everybody has probably realized that AMD’s strategy is multi-die. Many years ago, ahead of the industry, we made a big bet on multi-die solutions, including scalability and practicality. Today, I would reflect and say our bet paid off to the extent that we’re living that dream right now. And that dream looks something like many dies per package, being able to use different process geometries for each of those dies as well to exploit the best out of each technology in terms of I/O versus compute, power/performance trade-offs.

Vinta: Pushed by the demand for increasing performance from generation to generation, modern chip sizes are growing so huge that a single die cannot accommodate the capacity we need any longer. Multi-die, as Alex put it, is here right now, it’s becoming a necessity not only today, but into the future. A multitude of upcoming products are going to reuse chiplets.

Agrawal: Coming from a startup, I have a slightly different view. Multi-die gets you the flexibility of mixing and matching different technologies. This is a significant help for small companies since we can focus on what our core competency is rather than worrying about the entire ecosystem.

Salz: I agree with all three of you because that’s largely what it is. Monolithic SoCs are hitting a reticle limit; we cannot grow them any bigger, and they are low-yield, high-cost designs. We had to switch to multi-die, and the benefits include the ability to mix and match different technologies. Now that you can mount and stack chiplets, the interposer has no reticle limit, hence there is no foreseeable limit for each of these SoCs. Size and capacity become the big challenge.

Rizzatti: Let’s talk about adoption of multi-die design. What are the challenges to adopt the technology and what changes have you experienced?

Starr: We have different teams for different chiplets. All of them work independently but have to deliver on a common schedule to go into the package. While the teams are inherently tied, they are slightly decoupled in their schedules. Making sure that the different die work together as you’re co-developing them is a challenge.

You can verify each individual die, but, unfortunately, the real functionality of the device requires all of those dies to be there, forcing you to do what we used to call SoC simulation – I don’t even know what a SoC is anymore – you now have all of those components assembled together in such multitude that RTL simulators are not fast enough to perform any real testing at this system level. That’s why there has been a large growth in emulation/prototyping deployment because they’re the only engines that can perform this task.

Vinta: Multi-die introduces a major challenge when they all share the same delivery schedules. To meet the tapeout schedule, you not only have to perform die-level verification but also full chip verification. You need to verify the full SoC under all use cases scenarios.

Agrawal: I tend to think of everything backwards from a silicon standpoint. If your compute is coming in a little early, you may have a platform on which to do silicon bring-up and not wait for everything else to come in. What if my DDR is busted? What if my HBM is busted? How do you compare, combine, and mix and match those things?

Salz: When you get into system level, you’re not dealing with just a system but a collection of systems communicating through interconnect fabrics. That’s a big difference that RTL designers are not used to thinking about. You have jitter or coherency issues, errors, guaranteed delivery, all things engineers commonly deal with in networking. It really is a bunch of networks on the chip but we’re not thinking about it that way. You need to plan this out all the way at the architectural level. You need to think about floor planning before you write any RTL code. You need to think about how you are going to test these chiplets. Are you going to test them each time we integrate one? What happens to different DPM or yields for different dies? Semiconductor makers are opportunistic. If you build a 16 core engine and two of them don’t work, you label it as an eight core piece and sell it. When you have 10 chiplets, you can get a factorial number in the millions of products. It can’t work that way.

Rizzatti: What are the specific challenges in verification and validation? Obviously, you need emulation and prototyping, can you possibly quantify these issues?

Starr: In terms of emulation capacity, we’ve grown 225X over the last 10 years and a large part of that is because of the increased complexity of chiplet-based designs. That’s a reference point for quantification.

I would like to add that, as Arturo mentioned, the focus on making sure you’re performing correct-by-construction design is more important now than ever before. In a monolithic chip-die environment you could get away with SoC level verification and just catch bugs that you may have missed in your IP. That is just really hard to do in a multi-die design.

Vinta: With the chiplet approach, there is no end in sight for how big the chip could grow to. System-level verification of full chip calls for huge emulation capacity requirements, particularly for the use cases that require full system emulation. It’s a challenge not only for emulation but also for prototyping. The capacity could easily increase an order of magnitude from chip to chip. That is one of my primary concerns, in the sense of “how do we configure emulation and prototyping systems that could handle these full system level sizes?”

Agrawal: With so many interfaces connected together, how do you even guarantee system-level performance? This was a much cleaner problem to address when you had a monolithic die, but when you have chiplets the performance is the least common denominator of all the interfaces, the hoops that a transaction has to go through.

Salz: That’s a very good point. By the way, the whole industry hinges on having standard interfaces. The future when you can buy a chiplet from a supplier and integrate it into your chip is only going to be possible if you have standard interfaces. We need more and better interfaces, such as UCIe.

By the way you don’t need to go to emulation right away. You do need emulation when you’re going to run software cycles, at the application-level, but for basic configuration testing you can use a mix of hybrid models and simulation. If you throw the entire system at it, you’ve got a big issue because emulation capacity is not growing as fast as these systems are growing, so that’s going to be a big challenge too.

Rizzatti: Are the tools available today adequate for the job? Do you need different tools? Have you developed anything inhouse that you couldn’t find on the market?

Starr: PSS portable stimulus is an important tool for chiplet verification. It’s because a lot of functionality of these designs is not just in RTL anymore, you’ve got tons of firmware components, and you need to be able to test out the systemic nature of these chiplet-based designs. Portable stimulus is going to give us a path to have a highly efficient, close to the metal stimulus that can exercise things at the system-level.

Vinta: From the tools and methodologies point of view, given that there is a need to do verification at the chiplet level as well as at the system level, you would want to simulate the chiplets individually and then, if possible, simulate at full system level. The same goes for emulation and prototyping. Emulate and prototype at the chiplet level as well as at the system level if you can afford to do it. From the tools perspective, chiplet-level simulation is pretty much like monolithic chip simulation. Verification engineers are knowledgeable about and experienced with that methodology.

Agrawal: No good debug tools are out there where you could combine multiple chiplets and debug something.

From a user standpoint, if you have a CPU-based chiplet and you’re running a SPEC benchmark or 100 million instructions per workload on your multi-die package and then something fails, maybe it’s functional, maybe it’s performance, where do you start? What do you look at? If I bring that design up in Verdi it would take forever.

When you verify a large language model, run a data flow graph, and place different pieces or snippets of the model across different cores, whether Tenstorrent cores or CPU cores, you have to know at that point whether your placement is correct. How can you answer that question? There’s an absolute lack of good visibility tools that can help verification engineers moving to multi-die design right now.

Salz: I do agree with Alex that portable stimulus is a good place to start because you want to do scenario testing, and that’s well suited for doing scenario testing with consumer-producer schemes that pick snippets of code needed for the test.

There are things to do for debug. Divyang, I think you’re thinking of old style waveform dumping for the whole SoC, and that is never going to work. You need to think about transaction level debug. There are features in Verdi to enable transaction level debug, but you need to create the transactions. I’ve seen people grab like a CPU transaction which typically is just the instructions and look at it and say, there’s a bug right there, or no, the problem is not in the CPU. Most of the time, north of 90%, the problem sits in the firmware or in the software, so that’s a good place to start as well.

Rizzatti: If there is such a thing as a wish-list for multi-die system verification, what would that wish-list include?

Starr: We probably need something like a thousand times faster verification, but typically we see perhaps a 2X improvement per generation in these technologies today. The EDA solutions are not keeping up with the demands of this scaling.

Some of that’s just inherent in the nature of things in that you can’t create technologies that are going to outpace the new technology you’re actually building. But we still need to come up with some novel ways of doing things and we can do all the things we discussed such as divide and conquer, hybrid modeling, and surrogate models.

Vinta: I 100% agree. Capacity and throughput need to be addressed. Current platforms are not going to scale, at least not in the near future. We would need to figure out how to divide and conquer, as Alex noted, making sure that within a given footprint you get more testing done and more verification up front. And then, on top of it, address the debug questions that Divyang and Arturo have brought up.

Agrawal: Not exactly tool specific, but it would be nice to have a standard for some of these methodologies to talk to each other. Right now, it's vendor specific. It would be nice to have a plug-and-play way of combining different vendors' solutions so things just work, and people can focus on their core competencies rather than having to deal with what they don't know.

Salz: It’s interesting that nobody’s brought up the “when do know you’re done?”

It’s an infinite process. You can keep simulating/verifying and that brings to mind the question of coverage. We understand some coverage at the block level, but at the system level is scenario driven. You can dream up more and more scenarios, each application brings something else. That’s an interesting problem that we have not yet addressed.

Rizzatti: This concludes our panel discussion for today. I want to thank all the panelists for offering their time and for sharing their insights into the multi-die verification challenges and solutions.

Also Read:

Complete 1.6T Ethernet IP Solution to Drive AI and Hyperscale Data Center Chips

2024 Signal & Power Integrity SIG Event Summary

Navigating the 1.6Tbps Era: Electro-Optical Interconnects and 224G Links


Andes Technology: Pioneering the Future of RISC-V CPU IP

Andes Technology: Pioneering the Future of RISC-V CPU IP
by Frankwell Lin on 03-25-2024 at 6:00 am


On September 13, 2021, Andes Technology Corporation successfully issued its GDR (Global Depositary Receipt) public offering on the Luxembourg Stock Exchange, making Andes at the time the only internationally listed RISC-V instruction set architecture (ISA) CPU IP supplier. This allowed investors around the world to participate in the growth Andes envisioned for RISC-V, and the capital infusion would fuel Andes' ambition to become a leader in the rapidly evolving, high-growth, open-standard RISC-V market. Back in 2015, recognizing the vast potential of the RISC-V ISA, Andes had become a Founding and Premier Member of RISC-V International.

Table 1. Composition of Andes Technology Corp. Shareholders (as of April 2, 2023; unit: shares, %)

The investment has paid off significantly, particularly because it coincided with the ratification of the RISC-V Vector Extension in November 2021, an event that marked a turning point in the evolution of the RISC-V instruction set architecture. The vector extension arrived just as data center computing was shifting from general-purpose processing to AI processing, which handles extremely large data sets. Vector processing excels at the efficient handling of large arrays and structured data, and it has the potential to make RISC-V the next major worldwide ISA.

A vector processor’s highly parallel architecture reduces latency and overhead. It achieves better energy efficiency by maximizing CPU resource utilization and minimizing idle cycles, thus realizing higher performance per watt. Moreover, the hardware to implement RISC-V Vector processing units (VPUs) and vector registers is simpler than highly parallel architectures used for graphics processing. And VPUs provide a far less complex programming model.

The Andes R&D teams, in both the North American operation and the expanded Taiwan organization, have been focused on developing cutting-edge architectures for high-end RISC-V processors. Notably, together they achieved a significant milestone by developing the first RISC-V vector (RVV) engine, the AndesCore™ NX27V, based on the RISC-V International RVV specification. Showcasing the agility and innovation of the Andes engineering team, the design was completed within a year against version 0.8 of the vector extension specification and later updated to version 1.0 when RVV was ratified. This accomplishment led to a few major OEM design wins.

Last year at the International Symposium on Computer Architecture (ISCA) 2023 in Orlando, Florida, Meta presented its paper, “MTIA: First Generation Silicon Targeting Meta’s Recommendation Systems,” describing the company’s data center AI accelerator project. The design contains 64 processing elements (PEs) that support Meta’s custom-built proprietary accelerator. Each PE contains two processors, one scalar and one vector, both Andes IP that Meta engineers heavily customized using Andes Custom Extensions (ACE) to produce a completely unique solution targeted at Meta’s specific AI computing requirements.

The design validated the efficacy of RISC-V with the vector extension as a powerful solution to AI data center computing needs at a time when demand for data center processing hardware is exploding. According to Future Market Insights‘ report “Data Center CPU Market Outlook (2023 to 2033),” the data center CPU market is expected to grow significantly over the next few years, driven by increasing demand for cloud computing, big data analytics, and artificial intelligence (AI). Key drivers of this growth include the need for faster data processing, increased efficiency, and reduced costs.

In 2021, in addition to the vector extension, RISC-V International ratified 11 more extensions. Figure 1 illustrates the Andes product roadmap supporting these extensions. By the end of 2022 the N25F-SE, 27-series, and 45-series cores had been delivered, and in 2023 Andes brought six new RISC-V cores to market: the D25F-SE, D23, N225, NX45V, AX45MPV, and AX65. The roadmap spans from the low-power, highly secure entry-level AndesCore™ D23 to the AX65, the first in the 60 series, which was released in Q4 2023 and is now shipping in customer designs.

Figure 1. Andes Technology Corp. Product Roadmap

The AX65 is a 64-bit out-of-order processor supporting the RVA22 profile (RVA22U64 specifies the ISA features available to user-mode execution environments in 64-bit application processors). Equipped with a 13-stage pipeline, 4-wide decode, and 8-wide out-of-order execution, the series targets Linux application processor sockets in computing, networking, and high-end controllers.

The AX65 supports multicore clusters of one to eight cores. Performance is world class: operating at a 2.4 GHz clock frequency in a 7 nm TSMC process, it delivers 8.25 SPECint2006 per GHz and 10.2 SPECfp2006 per GHz, the best-known SPEC CPU® 2006 results for a two-level cache design. The AX66, AX63, and AX67 will be delivered thereafter.
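As a rough back-of-envelope check, assuming the quoted per-GHz figures scale linearly at the 2.4 GHz operating point, the absolute scores work out to approximately:

8.25 SPECint2006/GHz × 2.4 GHz ≈ 19.8 SPECint2006
10.2 SPECfp2006/GHz × 2.4 GHz ≈ 24.5 SPECfp2006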

Another area in which Andes has made significant investment is high-performance automotive-grade RISC-V CPU IP. The penetration of RISC-V SoCs in automotive designs is projected to reach 21.4% by 2030, according to The SHD Group’s “RISC-V Market Report: Application Forecasts in a Heterogeneous World.” Andes has developed functional-safety-compliant products, including the N25F-SE, the world’s first fully ISO 26262 compliant RISC-V CPU IP; the D25F-SE, which supports DSP extension instructions; and the 45-SE series processors, which meet the highest ASIL level, ASIL D. ACE functionality will be enhanced to add support for the 45-series processors.

On the strength of the demand for its RISC-V products, Andes remains profitable and continues to enjoy rapid growth. From 2021 to 2023, Andes revenue grew nearly 30%, fueled by over 300 commercial licensees and over 600 signed license agreements with geographically distributed customers in Taiwan, China, Korea, Japan, Europe, and the USA. The company’s worldwide headcount grew nearly 70% over the same period.

Conclusion

In an era defined by rapid technological evolution, Andes Technology Corp. stands at the forefront of innovation in the RISC-V CPU IP market. From its pioneering GDR issuance to its groundbreaking advancements in RISC-V architecture, Andes Technology continues to redefine industry standards and shape the future of computing. As the demand for efficient, high-performance computing solutions continues to rise, Andes Technology remains committed to delivering unparalleled RISC-V solutions that drive transformative change across the global technology landscape.

Also Read:

LIVE WEBINAR: RISC-V Instruction Set Architecture: Enhancing Computing Power

WEBINAR: Leverage Certified RISC-V IP to Craft ASIL ISO 26262 Grade Automotive Chips

LIVE WEBINAR: Accelerating Compute-Bound Algorithms with Andes Custom Extensions (ACE) and Flex Logix Embedded FPGA Array


Podcast EP213: The Impact of Arteris on Automotive and Beyond with Frank Schirrmeister

Podcast EP213: The Impact of Arteris on Automotive and Beyond with Frank Schirrmeister
by Daniel Nenni on 03-22-2024 at 10:00 am

Dan is joined by Frank Schirrmeister, vice president of solutions and business development at Arteris. He leads activities in industry verticals, including automotive, and in technology horizontals such as artificial intelligence, machine learning, and safety. Before Arteris, Frank held senior leadership positions at Cadence Design Systems, Synopsys, and Imperas, focusing on product marketing and management, solutions, strategic ecosystem partner initiatives, and customer engagement.

In this far-reaching discussion, Frank explains the impact Arteris NoC technology has on system design. He dives into its impact on automotive design, discussing many aspects of that market, including how Arteris simplifies safety. Arteris’ support for cache-coherent design is also discussed. Frank then goes beyond automotive and explains the impact of this technology across many markets.

Looking to the future, Frank discusses recent Arteris acquisitions that expand the company’s footprint beyond its traditional markets. Engagements with tier-1 customers are discussed, along with an explanation of the engagement process Arteris uses with new customers.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.