The Git Dilemma: Is Version Control Enough for Semiconductor Success?
by admin on 05-22-2024 at 10:00 am

Data in Software Vs Semiconductor Design Flow

Git is a version control system that saves every change made to files. It offers powerful tools for managing these changes. This makes Git ideal for software development, as it lets you keep all project files in one place.

Software has grown in complexity, necessitating more collaboration among engineers across various time zones. Software tools advanced to handle the increased number of iterations, files, and data. Initially, tools like RCS and SCCS could only manage file revisions, not entire software projects. CVS then allowed multiple developers to work on a project simultaneously while keeping the project consistent. As development spread across various geographies and functional teams, systems like ClearCase, Perforce, and SVN became essential for managing the process. Eventually, Git and Mercurial emerged to support the distributed development of extensive open-source projects effectively.

Git has become the go-to solution for tracking changes and managing software development flows. However, it is worth exploring whether Git is the best choice for projects beyond software development, such as semiconductor chip design. This blog post examines whether Git can effectively manage workflows in different fields.

How is semiconductor design different?

Like software development flows, IC design involves an extensive collection of files. A team of engineers goes through numerous revisions during the product’s development, debugging, and enhancement stages. Naturally, they must distribute their updates efficiently and precisely among the team. As the project progresses, managing the data becomes more complex as teams seek to find the optimal configurations of IC schematics, layouts, and semiconductor Intellectual Property (IP Cores). Also, managing associated simulation and analysis data and meta-data is an additional challenge in hardware design flows.

In addition to the fundamental similarities, there are also several notable differences.

  • Digital designs are commonly crafted using Verilog text files and edited with text editors. However, analog and mixed-signal (AMS) designs and packaging designs are created as binary files or groups of binary and text files to represent design objects such as schematics and layouts using specialized graphical editors.
  • A software workflow is characterized by a cyclical process of editing, compiling, and debugging. In contrast, the workflow for semiconductor design is significantly more nuanced, involving various editors in creating the components.
  • Various steps are required to complete the design, such as synthesis, place and route, different types of simulations, formal verification, timing analysis, etc. These tools and flows necessitate the collaboration of engineers with diverse specializations to generate binary files, which may require version management.
  • Specific components, often called Intellectual Property (IP) blocks, might be repurposed entirely or partially. These IPs are frequently acquired from external suppliers and might come with limitations regarding the editing permissions within the more extensive system.

Git serves as a robust platform for overseeing software development processes. Individual engineers focus on creating new features or resolving problems. The platform’s proficiency in integrating modifications into text files facilitates highly effective collaborative development, particularly within open-source initiatives.

Nonetheless, the question is whether Git can address the specific needs of semiconductor development, especially within a company. We'll examine these needs and the areas where Git might not measure up.

In IC design, several large files, ranging from a few MBs to several GBs, are expected. The operational framework of Git, which involves users cloning the repository, results in each user duplicating all the repository files and their numerous revisions. This practice can incur significant expenses in a corporate setting where the bulk of design data resides on costly, high-reliability network storage systems.
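To make the storage concern concrete, here is a minimal back-of-the-envelope sketch with purely hypothetical numbers; actual repositories vary widely.

```python
# Hypothetical numbers: why full Git clones of a design repository get expensive.
# Every user's clone replicates every revision of every file, and large binary
# design files do not delta-compress well.
binary_files = 2_000        # schematics, layouts, extracted views, results, ...
avg_file_size_gb = 0.1
revisions_per_file = 20
users = 50

per_clone_tb = binary_files * avg_file_size_gb * revisions_per_file / 1024
total_tb = per_clone_tb * users
print(f"~{per_clone_tb:.1f} TB per clone, ~{total_tb:.0f} TB across the team")
# A centralized, checkout-based system keeps the history once on the server and
# gives each engineer only the working copies they actually need.
```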

In IC design, particularly in the analog and custom domains, binary files are standard. Because schematics and layouts cannot be merged automatically, a centralized repository with editing restrictions is the best strategy to avoid the complexity and potential errors of manual merging.

Designing involves teamwork among various engineers, including design, verification, and layout engineers. They need to work together and share frequent updates. A central repository model allows frequent updates and better collaboration than a distributed repository model, as engineers can track each other’s work and stay updated on changes.

Design teams work from different locations. For example, design engineers might be in one country, while layout engineers might be in another. They need to coordinate their work regularly. Technology such as cache servers helps them do this effectively, considering the large volume of design data that needs to be shared.

Design objects are typically grouped sets of files treated as a single entity rather than just a set of files. Access restrictions are essential because engineers have specific roles, like preventing a layout engineer from changing a schematic by mistake. Also, it’s crucial to restrict contractors from sensitive materials. Centralized management of project data is necessary to maintain these access controls.

Although data might be organized in a simple flat directory system, IC design usually follows a structured hierarchy of blocks, where each tier incorporates blocks from the level below. An IC designer requires a configuration management system to retrieve and manipulate this design hierarchy.
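As a minimal illustration of what such a hierarchical configuration looks like (the block names and versions below are hypothetical), consider:

```python
# Illustrative sketch: a design configuration where each block pins specific
# versions of the sub-blocks it instantiates -- the kind of structure an IC
# configuration management system must track and reproduce.
from dataclasses import dataclass, field

@dataclass
class BlockConfig:
    name: str
    version: str
    sub_blocks: list["BlockConfig"] = field(default_factory=list)

    def flatten(self, indent=0):
        """Yield the full hierarchy with the pinned version of every block."""
        yield f"{'  ' * indent}{self.name}@{self.version}"
        for sub in self.sub_blocks:
            yield from sub.flatten(indent + 1)

adc = BlockConfig("adc_ip", "2.1")       # e.g. purchased IP, read-only
pll = BlockConfig("pll", "1.4")
top = BlockConfig("chip_top", "0.9", [adc, pll])
print("\n".join(top.flatten()))
```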

Consider how difficult it would be for software developers if they couldn’t compare different versions of files to see the changes. It would be a huge task to check for updates or track down the source of a new error. Similarly, circuit designers and layout engineers should have access to tools to spot differences in schematics or layouts between versions.

Indeed, the engineers' design tools must incorporate revision control and configuration management functionalities. This is not only a matter of convenience; the design tools must also be aware of the configuration management system so that adds, deletes, and changes are correctly recorded and represented in the tools.

The temptation is to look for existing tools and shoehorn them to meet similar needs in a different domain. Git and other software configuration management (SCM) tools, as the name suggests, were developed by software engineers to meet the needs of software engineers. Each domain may have some unique requirements that differ from those of software development. It makes sense to explore your development tools and methodology requirements before adopting a configuration management system you will work with for many years.

Keysight Data Management System (SoS)

Also Read:

Self-heating and trapping enhancements in GaN HEMT models

QuantumPro unifies superconducting qubit design workflow

2024 Outlook with Niels Faché of Keysight EDA


Mastering Copper TSV Fill Part 1 of 3
by John Ghekiere on 05-22-2024 at 8:00 am

Establishing void-free fill of high aspect ratio TSVs, capped by a thin and uniform bulk layer optimized for removal by CMP, means fully optimizing each of a series of critical phases. As we will see in this 3-part series, the conditions governing outcomes for each phase vary greatly, and the complexity of interacting factors means that starting from scratch poses an empirical pursuit that is expensive and of long duration.

Robust and void-free filling of TSVs with copper progresses through six phases as laid out below:

  1. Feature wetting and wafer entry
  2. Feature polarization
  3. Nucleation
  4. Fill propagation
  5. Accelerator ejection
  6. Bulk layer plating
  7. (Rinsing and drying, which we won’t cover in this series)

Feature wetting

The primary purpose of the feature wetting step is, well, not very mysterious. It is to fully wet the wafer surface and most especially the features themselves. Inadequate feature wetting leads to trapping of air inside the TSV. That trapped air impedes plating, causing voids.

Wetting is by no means trivial, and the difficulty of fully wetting features increases dramatically with aspect ratio. There is a process window based on time of exposure of the features to the water. Too little time allowed for wetting will lead to incomplete wetting and bubbles (thus, voids). However, too much time for wetting will result in corrosion of the copper seed (thus other voids).
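Purely as an illustration of that process-window idea (all numbers below are hypothetical and stand in for values you would determine empirically for your chemistry, seed thickness, and via geometry):

```python
# Hypothetical wetting-time window: bounded below by the time to fully wet the
# feature (grows with aspect ratio) and above by the onset of seed corrosion.
def wetting_window(aspect_ratio, base_wet_s=5.0, corrosion_onset_s=120.0):
    min_time_s = base_wet_s * aspect_ratio   # deeper features wet more slowly
    max_time_s = corrosion_onset_s           # assumed fixed corrosion limit
    return (min_time_s, max_time_s) if min_time_s < max_time_s else None

for ar in (2, 5, 10, 20, 30):
    window = wetting_window(ar)
    print(f"AR {ar}:1 ->", window or "no workable window; change wetting method")
```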

Side note: One of the biggest challenges in copper TSV formation actually comes before the copper fill step. I’m talking about the contiguous coverage of the feature in PVD copper seed. You probably guessed that the degree of difficulty in proper seed deposition increases dramatically with aspect ratio. Oh, how right you are. Getting good seed coverage in higher aspect ratio vias tends to require both sophisticated PVD systems and sophisticated PVD process engineers (neither of which comes cheap). The toughest spot to cover is the lower portion of the sidewall immediately adjacent to the via floor. If feature wetting is not optimized, seed corrosion can occur, exposing the underlying barrier and preventing plating altogether in that area, resulting in…voids. If you have non-optimized wetting that results in corrosion, the bottom of the via wall is where you are sure to see it.

Let’s talk methods of wetting. I’m sure there are exceptions but, generally speaking, wetting of TSV features is done in deionized water (DI). The wetting step can be accomplished using one of several different methods. I briefly summarize the most common methods below, along with the pros and cons of each:

Immersion wetting:

  • What is it: Simply immersing a dry wafer into a prewet bath.
  • Pros: Hardware requirements are minimal, amounting only to immersing the wafer into a pool of water.
  • Cons: This is the least aggressive method and is unlikely to work on vias of even modest aspect ratio. Even though the seed copper is highly hydrophilic, at this scale, wetting can proceed very slowly due to simple air entrapment or minor disadvantage in surface energy.

Spray wetting:

  • What is it: Spraying of water against the surface of the spinning wafer.
  • Pros: Spray is a quite effective way to completely wet features due to the more aggressive exchange of water at the surface which more effectively disrupts air entrapment.
  • Cons: Above an aspect ratio of around 5:1, spray wetting may be too slow to be effective, depending on the sensitivity of the TSV sidewall copper seed. Hardware complexity is higher than immersion and may not be available on the plating system you use. Wet transfer from a spray wetting system to a plating system is possible.

Vacuum Prewet:

  • Wetting of the wafer in a vacuum, then venting to atmosphere to drive water into the features.
  • Pros: Vacuum prewet is a highly effective and fast way to fully wet even TSVs of aspect ratio 10:1 and greater.
  • Cons: Most systems do not offer a vacuum prewet chamber. Even with access to vacuum prewet, recipe structure and settings must be optimized to avoid process issues; for example, excessive vacuum can lead to ambient-pressure boiling of the water (see the sketch below).
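That boiling caveat can be sanity-checked with the Antoine vapor-pressure relation for water; the sketch below, using the commonly tabulated constants for water between 1 and 100 °C, estimates the chamber pressure below which room-temperature DI water starts to boil.

```python
# Estimate water's vapor pressure: if the prewet chamber is pumped below this
# pressure, the water boils at ambient temperature.
def water_vapor_pressure_kpa(temp_c):
    # Antoine equation, commonly tabulated constants for water (1-100 C),
    # giving pressure in mmHg; converted to kPa for convenience.
    a, b, c = 8.07131, 1730.63, 233.426
    p_mmhg = 10 ** (a - b / (c + temp_c))
    return p_mmhg * 0.133322

for t in (20, 25, 30):
    print(f"{t} C: water boils below ~{water_vapor_pressure_kpa(t):.1f} kPa")
# roughly 2.3 kPa at 20 C -- well within reach of a vacuum prewet chamber
```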

One last point on wetting: treatment of the water stream can significantly reduce (or nearly eliminate) the corrosion rate, drastically opening the process window. This improvement could be critical if the seed copper is particularly thin, which is common at the base of the via sidewall in higher aspect ratio features.

Also, you have noticed I keep saying, “voids.” You want to master TSV? Master elimination of voids.

Once the wafer is wetted, it must be transferred to the plating reactor. The transfer should be accomplished quickly to avoid evaporation of water on the wafer. Generally, the water inside the features will not evaporate quickly; however, water can evaporate quickly from the wafer surface, causing surface corrosion.

Wafer Entry

Immersion of the wafer into the plating fluid is the next consideration. Most state-of-the-art plating systems introduce the wafer into the fluid with the feature side facing down. Platers of this type are commonly called "fountain platers" because the chemistry flows continually upward and over a weir, replicating the behavior of a fountain. The other common plating chamber type uses a vertical orientation of the wafer. Systems of this type are often called "rack platers."

Immersion of a wafer in a vertical orientation is typically straightforward, the primary risk being the trapping of air in and around the hardware features of the wafer holder. Immersion of a wafer into a fountain plater is comparatively more complex due to the need to evacuate air from under the wafer during immersion. This step is commonly called "wafer entry" and involves carefully tilting the wafer to a specific angle, at an optimal rotational rate and vertical downward velocity, in order to prevent trapping of air against the wafer surface that leads to major…wait for it…voids.

Given the relatively higher complexity of wafer immersion in fountain plater systems, one may be tempted to think that vertical platers represent a preferable architecture, but they do not. For reasons I’ll share in a future post, the magic of rotating submerged disc physics makes fountain platers the industry’s best plating reactors.

Also Read:

Mastering Copper TSV Fill Part 2 of 3

 


From System Design to Drug Design. The Rationale
by Bernard Murphy on 05-22-2024 at 6:00 am

Drug development cycle

I’m guessing that more than a few people were mystified (maybe still are) when Cadence acquired OpenEye Scientific, a company known for computational molecular design aimed at medical drug/therapeutics discovery. What could EDA, even SDA (system design automation), and drug discovery possibly have in common? More than you might imagine, but to understand you first need to understand Anirudh’s longstanding claim that Cadence is at heart a computational software company, which I read as expert in big scientific/engineering applications. EDA is one such application, computational fluid dynamics (e.g., for aerodynamics) is another, and computational molecular design (for drug design) is yet another. I sat in on a presentation by Geoff Skillman (VP R&D of Cadence Molecular Sciences, previously OpenEye) at Cadence Live 2024, and what I heard was an eye-opener (pun intended) for this hard-core EDA guy.

The dynamics and challenges in drug design

Developing a new drug has echoes of the path to developing a new semiconductor design – only much worse: a 12-year cycle from start to delivery; a very high fallout rate (90% of trials end in failure); an average $2.5B NRE per successful drug when averaged together with failed trials. I can’t imagine any semiconductor enterprise even contemplating this level of risk.

At least half of that time is consumed in clinical trials and FDA approval, stages we might not want to accelerate (I certainly don’t want to be a guinea pig for a poorly tested drug). The first half starts with discovery among a huge set of possibilities (10^60), screening for basic problems, optimizing, and running experiments in the lab. Unfortunately, biology is not nearly as cooperative as the physics underlying electronic systems. First, it is complex and dynamic, changing for its own reasons and in unforeseen responses to experiments. It has also evolved defenses over millions of years, seeing foreign substances as toxins to be captured and eliminated no matter how well intentioned. Not only must we aim to correct a problem, but we must also trick our way around those defenses to apply the fix.

A further challenge is that calculations in biology are approximate thanks to the sheer complexity and evolving understanding of bio systems. Worse yet, there is always a possibility that far from being helpful, a proposed drug may actually prove to be toxic. Adding experimental analyses helps refine calculations but these too have limited accuracy. Geoff admits that with all this complexity, approximation, and still artisanal development processes, it must seem like molecular science is stuck in the dark ages. But practitioners haven’t been sitting still.

From artisanal design to principled and scalable design

As high-performance computing options opened up in cloud and some supercomputer projects, some teams were able to sample these huge configuration spaces more effectively, refining their virtual modeling processes to a more principled and accelerated flow.

Building on these advances, Cadence Molecular Sciences now maps their approach to the Cadence 3-layer cake structure: their Orion scalable elastic HPC SaaS platform providing the hardware acceleration (and scaling) layer; a wide range of chemistry, physics, and biology tools and toolkits offering principled simulation and optimization over large configuration spaces; and AI/GenAI reinforced with experimental data in support of drug discovery.

Molecular sciences 3 layer cake

At the hardware acceleration layer, they can scale to an arbitrary number of CPUs or GPUs. In the principled simulation/optimization layer they offer options to virtually screen candidate molecules (looking for similarity with experimentally known good options), to study molecular dynamics, quantum-mechanics-modeled behaviors, and free energy predictions (think objective functions in optimization). At the AI layer they can connect to the NVIDIA BioNeMo GenAI platforms, to AI-driven docking (to find the best favored ligand-to-receptor docking options) and to AI-driven optimization toolkits. You won’t be surprised to hear that all these tools/processes are massively compute intensive, hence the need for the hardware acceleration layer of the cake.

Molecular similarity screening is a real strength for Cadence Molecular Sciences, according to Geoff. Screening is the first stage in discovery, based on a widely accepted principle in medicinal chemistry that similar molecules will interact with biology in similar ways. This approach quickly eliminates random guess molecules, of course, and also molecules with possible strange behaviors or toxicities. Here they are comparing 3D shape and electrostatics for similarity; comparing 32 3rd-generation Xeon cores against 8 H100 GPUs, they measured over 1100X higher performance and 15X better cost efficiency. When you’re comparing against billions of candidate molecules, that matters.

Geoff also added that through Cadence connections to cloud providers (not so common among biotech companies) they have been able to expand availability to more cloud options. They have also delivered a proof of concept on Cadence datacenter GPUs. For biotech startups this is a big positive since they don’t have the capital to invest in their own large GPU clusters. For bigger enterprises, he hinted at interest in adding GPU capacity to their own in-house datacenters to get around GPU capacity limitations (and, I assume, cloud overhead concerns).

Looking forward

What Geoff described in this presentation centered mostly on the Design stage of the common biotech Design-Make-Test-Analyze loop. What can be done to help with the other stages? BioNeMo includes a ChemInformatics package which could be used to help develop a recipe for the Make stage. The actual making (candidate synthesis), followed by test and analyze would still be in a lab. Yet following this approach, I can understand higher confidence among biotechs/pharmas that a candidate drug is more likely to survive into trials and maybe beyond.

Very cool. There’s lots more interesting information about the kinds of experiments researchers are running for osteoporosis therapeutics, for oncology drugs and for other targets previously thought to be “undruggable”. But I didn’t want to overload this blog. If you are interested in learning more, click HERE.


New Tool that Synthesizes Python to RTL for AI Neural Network Code
by Daniel Payne on 05-21-2024 at 10:00 am

AI and ML techniques are popular topics, yet there are considerable challenges for those who want to design and build an AI accelerator for inferencing: you need a team that understands how to model a neural network in a language like Python, turn that model into RTL, and then verify that the RTL matches the Python. Researchers from CERN, Fermilab and UC San Diego have made progress in this area by developing the open-source hls4ml, a Python package for machine learning inference in FPGAs. The promise of this approach is to translate machine learning package models into HLS to speed development time.

I spoke with David Burnette, Director of Engineering, Catapult HLS with Siemens last week to learn how they have been working with Fermilab and contributors over the past two years on extending hls4ml to support both ASIC and FPGA implementations. The new Siemens tool is called Catapult AI NN, and it takes in the neural network description as Python code, converts that to C++ and then synthesizes the results as RTL code using Verilog or VHDL.

Data scientists working in AI and ML are apt to use Python for their neural network models, yet they are not experts at C++, RTL or hardware concepts. Manually translating from Python into RTL simply takes too much time, is error prone and is not easily updated or changed. Catapult AI NN allows an architect to stay with Python for modeling neural networks, then use automation to create C++ and RTL code quickly. This approach allows a team to do tradeoffs of power, area and performance rapidly in hours or days, not months or years.

One tradeoff that Catapult AI NN provides is how much parallelism to use in hardware: you could start by asking for the fastest network, which likely results in a larger chip area, or for the smallest design, which would impact speed. Having quick iterations enables a project to reach a more optimal AI accelerator.

A common database for handwritten digits is called MNIST, with 60,000 training images and 10,000 testing images. A Python neural network model can be written to process and classify these images, then run in Catapult AI NN to produce RTL code in just minutes. Design teams that need hardware that performs object classification and object detection will benefit from using this new Python-to-RTL automation.
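Since Catapult AI NN builds on the open-source hls4ml package, a rough feel for the Python side of the flow can be sketched with the public hls4ml API and a toy Keras MNIST model. This is a hedged illustration only: the backend name and output settings shown are assumptions, and the actual Catapult AI NN flow and options may differ.

```python
# Minimal sketch: a small Keras MNIST classifier handed to hls4ml for HLS C++
# generation. Backend/output settings are illustrative assumptions.
import hls4ml
from tensorflow import keras

# Tiny dense network for 28x28 MNIST digits, flattened to 784 inputs.
model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Derive an hls4ml configuration (precision, reuse factors) from the model.
config = hls4ml.utils.config_from_keras_model(model, granularity="model")

# Convert the Keras model into an HLS C++ project directory.
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir="hls_mnist_project",
    backend="Vivado",  # assumption; Catapult AI NN targets a Catapult backend
)
hls_model.compile()  # builds the generated C++ for bit-accurate simulation
```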

Catapult AI NN tool flow – Python to RTL

Machine learning professionals who are used to tools like TensorFlow, PyTorch or Keras can continue to stay in their favored language domain, while automating the hardware implementation using a new tool. When using Catapult AI NN, users can see how their Python neural network parameters correlate to RTL code, read reports on implementation area, measure the performance throughput per layer, and know where their neural network is spending time. To improve the speed of high-level synthesis, a user can choose to distribute jobs for hundreds of layers at once instead of running them sequentially.

Summary

There is now a quicker path to designing AI accelerators than manually translating neural network Python code into RTL and then reaching an FPGA or ASIC implementation. With Catapult AI NN there’s the promise of quickly moving from neural network models written in Python to C++ and RTL, for both FPGA and ASIC domains. Rapid tradeoffs can be made with this new methodology, resulting in an optimization of power, performance and area for AI accelerators.

Inferencing at the Edge is a popular goal for many product design groups, and this announcement should attract their attention as a way to help meet their stringent goals with less effort, and less time for design and verification.  Fermilab has used this approach for particle detector applications, so that their AI experts can create efficient hardware without becoming ASIC designers.

Read the Siemens press release.

Related Blogs


S2C and Sirius Wireless Collaborate on Wi-Fi 7 RF IP Verification System
by Daniel Nenni on 05-21-2024 at 6:00 am

Sirius Wireless partnered with FPGA prototyping expert S2C to develop the Wi-Fi 7 RF IP Verification System, enhancing working efficiency and accelerating time-to-market for clients.

Wi-Fi 7 is the latest Wi-Fi technology, with speeds of up to 30Gbps, approximately three times the peak performance of Wi-Fi 6. This enhanced performance will position Wi-Fi 7 to lead the market quickly, delivering users a more stable and faster wireless experience. However, Wi-Fi 7 sets rigorous standards for chipset designers and RF IP vendors, demanding excellent capabilities to handle 320 MHz bandwidth and 4096-QAM, including faster, lower-noise ADCs/DACs, sophisticated RF designs, and complex baseband processing. Enhanced Error Vector Magnitude (EVM) and noise control requirements in RF front-end modules exceed those of Wi-Fi 6/6E. Features like MRU and MLO increase complexity in baseband and MAC layer processing. Overcoming these challenges requires innovative system architectures, algorithm designs, and advanced semiconductor processes for optimized performance and power management. Chip designers must also ensure flexible software support for interoperability among expanding wireless protocols, enhancing user experience while catering to diverse application demands.
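For context on where headline figures like these come from, the sketch below estimates the theoretical 802.11be peak PHY rate from nominal standard parameters (subcarrier count, modulation order, coding rate, symbol timing, and spatial streams); treat the values as approximate, and note that practical throughput is considerably lower.

```python
# Back-of-the-envelope 802.11be (Wi-Fi 7) peak PHY rate from nominal parameters.
data_subcarriers = 3920      # 320 MHz channel (4 x 980 data tones)
bits_per_symbol = 12         # 4096-QAM
coding_rate = 5 / 6          # highest MCS coding rate
symbol_time_us = 12.8 + 0.8  # OFDM symbol plus shortest guard interval
spatial_streams = 16         # maximum defined by the standard

per_stream_mbps = data_subcarriers * bits_per_symbol * coding_rate / symbol_time_us
peak_gbps = per_stream_mbps * spatial_streams / 1000
print(f"~{peak_gbps:.1f} Gbps peak PHY rate")  # roughly 46 Gbps
```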

Leveraging the S2C Prodigy S7-9P Logic System, Sirius Wireless conducted comprehensive verification and testing of RF performance indicators such as throughput, reception sensitivity, and EVM. It then used Prodigy Prototype Ready IP, ready-to-use daughter cards and accessories from S2C, to interface with the digital MAC and offer an end-to-end verification solution from RF to MAC, overcoming the RF design challenges and accelerating time-to-market by shortening the entire chip verification cycle.

Sirius Wireless Validates Wi-Fi 7 RF with S2C Prodigy S7-9P Logic System

S2C’s extensive range of prototype tools, including a productivity software suite, debugging solutions, and daughter boards, empowers designers to accelerate functional verification by quickly building a target prototyping environment. In addition, the Prodigy S7-9P Logic System serves as a demonstration platform prior to tape-out, helping customers showcase their designs and kickstart software development early. An example of these benefits is Sirius’s development of its Wi-Fi 6 IP verification system. With this system, one of Sirius’s customers working on short-range wireless chip designs needed only three months to complete the pre-silicon hardware performance analysis and performance comparison test. The company thus shortened its production verification time and its customers’ product introduction cycle, significantly improving efficiency by over 40%.

Sam Chu, VP of Marketing at Sirius Wireless, states, “We have had a longstanding, deep collaboration with S2C, jointly providing end-to-end verification solutions from RF to MAC for our clients. After our successful partnership on Wi-Fi 6, we’re confident in S2C’s Prodigy System for Wi-Fi 7 development. Its mature performance, user-friendly operation, and abundant validation experience reinforce our high expectations for our Wi-Fi 7 products.”

“S2C aims to boost partners’ market competitiveness,” said Ying Chen, VP of Sales & Marketing at S2C. “Sirius Wireless stands out in RF IP, being the sole company with TSMC’s advanced processes and Wi-Fi 7 RF design expertise. S2C is glad to work together with them to breathe new life into the whole industry.”

About Sirius Wireless

Headquartered in Singapore, Sirius Wireless was registered and established in 2018. The company has professional and outstanding R&D staff with more than 15 years of working experience in Wi-Fi, Bluetooth RF/ASIC/SW/HW.

About S2C

S2C is a leading global supplier of FPGA prototyping solutions for today’s innovative SoC and ASIC designs, now with the second largest share of the global prototyping market. S2C has been successfully delivering rapid SoC prototyping solutions since 2003. With over 600 customers, including 6 of the world’s top 10 semiconductor companies, our world-class engineering team and customer-centric sales team are experts at addressing our customers’ SoC and ASIC verification needs. S2C has offices and sales representatives in the US, Europe, mainland China, Hong Kong, Korea and Japan. For more information, please visit: https://www.s2cinc.com/

Also Read:

Accelerate SoC Design: DIY, FPGA Boards & Commercial Prototyping Solutions (I)

Enhancing the RISC-V Ecosystem with S2C Prototyping Solution

2024 Outlook with Toshio Nakama of S2C


An open letter regarding Cyber Resilience of the UK’s Critical National Infrastructure
by admin on 05-20-2024 at 10:00 am

Codasip announced a commercially available RISC-V processor with CHERI for license in October of 2023 and is demonstrating technology for IP provenance. 

Dear Members of the Science, Innovation and Technology Committee,

Let me start by applauding your hearing on 24 April 2024, and in particular the evidence of Professor John Goodacre, Challenge Director of Digital Security by Design at Innovate UK, and Mr Richard Grisenthwaite, Executive Vice President and Chief Architect at Arm. During this hearing, the witnesses discussed two extremely important cybersecurity issues: memory safety and IP provenance. In this letter, I would like to provide additional information about these topics that the committee should find relevant.

WEBINAR: Fine-grained Memory Protection to Prevent Cyber Attacks

Memory Safety and CHERI

As discussed in the hearing, memory safety issues represent roughly 70-80% of the cyber issues being tracked by the industry. These issues are referred to as Common Vulnerabilities and Exposures, or CVEs. The number of CVEs has grown exponentially over the last twenty years while the percentage of memory safety CVEs has been roughly constant.

Figure 1: Published CVE records

The reason is primarily related to the fact that most software is written in languages like C and C++, which do not provide inherent memory protection. What complicates the problem even more is that software is not normally developed monolithically, but by integrating pre-developed software from third parties, including open-source, where absolutely anyone can contribute potentially malicious changes.

Figure 2: Percentage of CVEs caused by memory safety issues. Source: Trends, challenges and strategic shifts in the software vulnerability mitigation landscape. Matt Miller, Microsoft Security Response Center (MSRC), Blue Hat IL, 2019

A rough estimate is that over one trillion lines of code are in use today, an enormous amount! The software industry has improved over the last decades, especially regarding “verification”, which is the part of the development process that checks for bugs and corrects them. However, as verification will never be perfect, nor will any developer, there will always be bugs for hackers to exploit in cyberattacks.

The UK is not alone in noticing the enormous memory safety issue: the United States White House issued a press release on 26 February 2024 entitled Future Software Should Be Memory Safe.

As Professor Goodacre and Mr Grisenthwaite noted in the hearing, there are economic challenges for companies to take action to address memory safety issues, so they have been slow to do so, even where solutions are readily available. You may think of this situation as similar to the automotive industry’s challenge in adopting safety features that are standard today: seat belts, airbags, and crumple zones. It took decades to have such basic features in all automobiles, and it was only after regulations required them that every manufacturer did so.

Whilst cyberattacks are increasingly devastating, causing roughly $10 trillion of economic loss worldwide each year, the direct impact on each company is small enough that all too many choose not to protect their customers.

In the US, the White House has realised this fact and indicates in its press release that it will be taking action “…shifting the responsibility of cybersecurity away from individuals and small businesses and onto large organizations like technology companies and the Federal Government that are more capable of managing the ever-evolving threat.”

Over the last decade, Professor Goodacre has led outstanding work on CHERI at Digital Security by Design (DSbD), partnering with universities such as the University of Cambridge and semiconductor companies such as Arm. Indeed, Arm has produced a valuable CHERI research platform called Morello. During the hearing, Professor Goodacre noted that despite this exceptional work, the problem in general and specifically with Morello is that it is not a commercial offering, and consequently the industry has not been able to deploy CHERI. Whilst this has been true, I am pleased to update the committee that Codasip has recently launched a commercially available CHERI processor for license and has committed to making its entire portfolio of processors available with CHERI variants. We are also working very closely with the University of Cambridge and other companies to ensure CHERI is standardised and available to everyone.

Design Provenance and Traceability

The second topic discussed by Professor Goodacre and Mr Grisenthwaite was design provenance, which we believe must also include traceability. By provenance, we mean the origin of the design, including knowing the specific designers. By traceability, we mean changes to the design over time, including knowing the specific designers that made the changes. Additional information regarding the design, such as when, where, and with what tools changes were made should also be collected.

As Professor Goodacre and Mr Grisenthwaite explained, most semiconductor chips today are complete systems in themselves containing billions of transistors. Given the incredible complexity, chips are not designed monolithically, transistor by transistor, but assembled from pre-designed “IP blocks”, such as processors, memory, on-chip interconnects between IP blocks, and chip-to-chip interconnects such as USB. Companies like Arm and Codasip make processor IP blocks, while companies like Synopsys and Cadence make memory and chip-to-chip interconnects. Indeed, there is an entire IP industry for semiconductors. As Professor Goodacre and Mr Grisenthwaite discussed during the hearing, some IP are more prone to cyber issues than others, with processors being the most important and problematic.

For the previously discussed topic of memory security, CHERI involves invasive changes to the processor and lesser changes to memory. The additional cybersecurity challenge regarding provenance and traceability is that when one licenses IP blocks, one does not know who actually designed the IP, nor its possible history of modification. Consequently, when the inevitable bugs are found, it is not possible to irrefutably determine who made the errors. Most bugs will be accidental, but it is also possible that nefarious actors could have inserted malicious circuitry to appear as an accidental bug. We believe that provenance and traceability will increase in importance as cyberattacks increase in frequency and are increasingly used in military conflicts – indeed the Economist recently noted The cyberwar in Ukraine is as crucial as the battle in the trenches.

Fortunately, Codasip is also addressing the problem of provenance and traceability with a new software tool using blockchain technology to irrefutably log the processor design process and create a record of provenance with traceability. This new software tool is currently being demonstrated to customers in a pre-release version.
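To make the logging idea concrete, here is a generic, minimal Python sketch of a hash-chained (blockchain-style) design log. It illustrates only the tamper-evidence principle: altering any historical record breaks every later link. It is not a description of Codasip's actual tool, and every name in it is hypothetical.

```python
# Generic illustration of a tamper-evident design log: each entry hashes the
# previous entry, so altering any historical record invalidates later hashes.
import hashlib, json, time

def append_entry(log, designer, action, tool):
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry = {
        "designer": designer,
        "action": action,
        "tool": tool,
        "timestamp": time.time(),
        "prev_hash": prev_hash,
    }
    # Hash the entry body (everything except the hash itself).
    entry["hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)
    return log

def verify(log):
    """Recompute every hash and check that the chain links back correctly."""
    prev = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, "alice", "edited decoder RTL", "editor-x")
append_entry(log, "bob", "ran synthesis", "synth-y")
assert verify(log)
```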

So, in summary, Codasip today has solutions to the two major problems that the committee identified in their hearing: (i) commercial availability of CHERI-based processors; and (ii) methods for provenance and traceability of semiconductor IP blocks. Much of this work was and is being done in the UK, and the rest is done solely in Europe as we do not have R&D in other geographies.

If the committee has further interest in the technology we are making available, it would be a pleasure to arrange a meeting at your convenience.

Sincerely yours,

Dr Ron Black

CEO, Codasip GmbH

ron.black@codasip.com

WEBINAR: Fine-grained Memory Protection to Prevent RISC-V Cyber Attacks

About Dr. Ron Black

Dr Black, CEO at Codasip since 2021, has over 30 years of industry experience. Before joining Codasip, he was President and CEO at Imagination Technologies and previously CEO at Rambus, MobiWire, UPEK, and Wavecom. He holds a BS and MS in Engineering and a Ph.D. in Materials science from Cornell University. A consistent thread of his career has been processors including PowerPC at IBM, network processors at Freescale, security processors at Rambus, and GPUs and CPUs at Imagination.

About Codasip 

Codasip is a processor technology company enabling system-on-chip developers to differentiate their products for competitive advantage. Codasip is based in Munich and has development centres throughout Europe, including in Bristol and Cambridge in the UK. The company specializes in processors based on RISC-V (Reduced Instruction Set Computing, Generation Five), which is an open Instruction Set Architecture (ISA) alternative to proprietary architectures such as Arm and Intel x86. Codasip also has extensive experience in cybersecurity, with a team in Bristol that has spent the last two years architecting and designing the recently announced CHERI processor.

Also Read:

Webinar: Fine-grained Memory Protection to Prevent RISC-V Cyber Attacks

How Codasip Unleashed CHERI and Created a Paradigm Shift for Secured Innovation

RISC-V Summit Buzz – Ron Black Unveils Codasip’s Paradigm Shift for Secured Innovation


How to Find and Fix Soft Reset Metastability
by Mike Gianfagna on 05-20-2024 at 6:00 am

Most of us are familiar with the metastability problems that can be caused by clock domain crossings (CDC). Early static analysis techniques can flag these kinds of issues to ensure there are no surprises later. I spent quite a bit of time at Atrenta, the SpyGlass company, so I am very familiar with these challenges. Due to the demands of high-speed interfaces, the need to reduce power, and the growing focus on functional safety, soft resets are often used in advanced designs to clear potential errors. This practice can create hard-to-find metastability issues which remind me of CDC challenges. Siemens Digital Industries recently published a comprehensive white paper on this class of problem. If you use soft resets in your design, it’s a must read. A link is coming, but first let’s look at what Siemens has to say about how to find and fix soft reset metastability.

The Problem

As design complexity increases, systems contain many components such as processors, power management blocks, and DSP cores. To address low-power, high-performance and functional safety requirements, these designs are now equipped with several asynchronous and soft reset signals. These signals help safeguard software and hardware functional safety – they can quickly recover the system to an initial state and clear any pending errors or events. Using soft resets vs. a complete system re-start saves time and power.

The multiple asynchronous reset sources found in today’s complex designs result in multiple reset domain crossings (RDCs). This can lead to systematic faults that create data corruption, glitches, metastability or functional failures. This class of problem is not covered by standard, static verification methods such as the previously mentioned CDC analysis. And so, a proper reset domain crossing verification methodology is required to prevent errors in reset design during the RTL verification stage.

Let’s look at an example circuit that can cause soft reset metastability. A reset domain crossing (RDC) occurs when a path’s transmitting flop has an asynchronous reset, and the receiving flop has either a different asynchronous reset than the transmitting flop or has no reset at all. These two examples are summarized in the figure below.

Circuits with potential soft reset metastability issues

The circuit on the left shows a simple RDC problem between two flops having different asynchronous reset domains. The asynchronous assertion of the rst1 signal immediately changes the output of the Tx flop to its assertion value. Since the assertion is asynchronous to clock clk, the output of the Tx flop can change near the active clock edge of the Rx flop, which can violate the setup and hold timing constraints for flop Rx. So, the Rx flop can go into a metastable state.

To review, metastability is a state in which the output of a register is unpredictable or is in a quasi-stable state. The circuit on the right shows an RDC problem from a flop with an asynchronous reset domain to a non-resettable register (NRR), which does not have a reset pin.

Note that an RDC path with different reset domains on the transmitter and receiver does not guarantee that the path is unsafe.

Also, an RDC path having the same asynchronous reset domains on the transmitter and receiver does not guarantee that the path is safe, as issues may occur due to soft resets. Different soft resets in a design can induce metastability and cause unpredictable reset operations or, in the worst case, overheating of the device during reset assertion.

There are many additional examples in the white paper along with a detailed discussion of what type of analysis is required to determine if a potential real problem exists. A link is coming so you can learn more.

The Solution

The white paper then proposes a methodology to detect RDC issues. It is pointed out that RDC bugs, if ignored, can have severe consequences on system functionality, timing, and reliability. To ensure proper operation and avoid the associated risks, it is essential to detect unsafe RDCs systematically and apply appropriate synchronization techniques to tackle any issues that may arise due to reset path delays caused by soft resets.

The white paper explains that, by handling RDCs effectively, designers can mitigate potential issues and enhance the overall robustness and performance of a design. A systematic flow to assist in RDC verification closure using standard RDC verification tools is detailed in the white paper. The overall flow for this methodology is shown in the figure below.

Flowchart for proposed methodology for RDC verification

To Learn More

If some of the design challenges discussed here resonate with you, the Siemens Digital Industries white paper is a must read. Beyond a detailed explanation of the approach to address these design issues, data from real designs is also presented. The results are impressive.

You can get your copy of the white paper here. You will also find several additional resources on that page that present more details on RDC analysis. You will learn a lot about how to find and fix soft reset metastability.


Podcast EP224: An Overview of the Upcoming 2024 Electronic Components and Technology Conference with Dr. Michael Mayer
by Daniel Nenni on 05-17-2024 at 10:00 am

Dan is joined by Dr. Michael Mayer, the 2024 Electronic Components and Technology Conference (ECTC) Program Chair. Michael is an Associate Professor in the department of Mechanical and Mechatronics Engineering at the University of Waterloo in Ontario, Canada. Michael has co-authored technical publications and patents about wire bonding methods and various microsensor tools for diagnostics of bonding processes, as well as reliability of micro joints. More recently he has been working on direct bonding of optical glasses and laser joining of biological materials.

Michael discusses the upcoming ECTC conference with Dan. The event will take place May 28 – 31, 2024 in Denver, Colorado. Michael discusses some of the innovation trends such as hybrid bonding presented at ECTC and how these technologies are paving the way for 2.5/3D heterogeneous integration. Michael provides an overview of the broad research in design, packaging and manufacturing that is presented at the conference.

Michael also discusses the trends in university research for advanced materials and packaging and highlights the more than 10 professional development courses available at the upcoming ECTC.


A Webinar with Silicon Catalyst, ST Microelectronics and an Exciting MEMS Development Contest
by Mike Gianfagna on 05-17-2024 at 8:00 am

Most MEMS and sensor companies struggle to find an industrialization partner that can support early-stage research and help develop and transition unique concepts to high-volume production. The wrong partner means delays and increased development costs as the design moves between various facilities. Recently, Silicon Catalyst joined forces with ST Microelectronics and a few other partners in a webinar to discuss these challenges. Silicon Catalyst also announced an exciting contest that helps new entrants to the MEMS market to get off the ground. If your product plans include MEMS devices, you will want to watch this webinar. A link is coming, but most importantly you’ll also want to check out the contest – this could be your big break. Read on to learn about a webinar with Silicon Catalyst, ST Microelectronics and an exciting MEMS development contest.

Webinar Background – the MEMS Development Dilemma

The event contained several parts:

  • A webinar introduction by Paul Pickering, Managing Partner, Silicon Catalyst
  • A useful MEMS industry highlights presentation from Pierre Delbos, Market and Technology Analyst, Yole Group
  • An overview of the unique ST Lab-in-Fab fabrication concept from Dr. Andreja Erbes, Director, STMicroelectronics
  • Details of the contest
  • An informative Q&A session
Webinar Presenters

I highly recommend you watch this webinar; there is a lot of very useful information. A link is coming, but first let’s take a quick look at the key points.

Silicon Catalyst Introduction

Paul began the event by reviewing the incredible ecosystem Silicon Catalyst has built to foster semiconductor-related innovation. The organization does have a focus on advanced materials, so the MEMS topic fits quite well. The organization has a list of high-profile strategic partners, as shown in the graphic below.

Silicon Catalyst’s Strategic Partners

You can learn more about Silicon Catalyst’s impact and the impact of its incubator companies on SemiWiki here.

MEMS Overview from The Yole Group

Pierre provided an eye-opening tour of the MEMS industry. Yole has been tracking this market for 20 years and the diagram below shows the steady growth over that time. Pierre reported that there were 30 billion units in the MEMS market in 2023.

MEMS Industry History

Pierre went on to review the players, markets and growth areas. You will likely learn a few things about this market by listening to Pierre’s overview.

Addressing MEMS Development With “Lab-in-Fab” Approach

Next, Dr. Andreja Erbes discussed the challenges of MEMS development and presented a unique approach being pioneered by ST Microelectronics. The challenge Andreja described is one of too many hand-offs in the MEMS development process as a concept moves from idea to production. Each handoff (e.g., research, low-volume production, high-volume production) introduces delays, new learning curves and opportunities for errors. This flow is depicted in the figure below.

Typical MEMS Development Cycle

ST Microelectronics, in cooperation with the Institute of Microelectronics (IME) has built a unique MEMS development facility in Singapore. By bringing all phases of MEMS product development into one location, ST is delivering leading-edge competence and access to a global ecosystem. The figure below summarizes the elements of this rapid product development strategy.

Rapid Product Development

Andreja went on to describe the substantial physical campus layout, including a virtual connection to a fab in Italy. The development capabilities of the unique Lab-in-Fab are reviewed in detail, along with example applications and third-party collaborations. It’s a very impressive overview. The figure below summarizes the engagement model and accelerated timeline that is enabled.

Lab in Fab Engagement Model

To Enter the Contest and Learn More

And now for the key item – entering the contest. Silicon Catalyst and STMicroelectronics announced the 2024 Lab-in-Fab development contest during the webinar.

The contest affords companies of all sizes an opportunity to align with one of the premier MEMS manufacturing companies and benefit from its development expertise and world-class fabrication. Silicon Catalyst will conduct the screening and selection process, which will include various MEMS experts as judges.

The contest offers the opportunity to engage with the Lab-in-Fab team for a free project evaluation. This includes:

  • Expense-paid visit to meet the teams in Singapore (reimbursable expenses up to $10K USD)
  • Work with world-class teams to scope out the manufacturing plan
  • Participate in various PR activities with ST, IME and Silicon Catalyst
  • Receive introductions to investors and VCs to help fund your project

The deadline for submission to the contest is Monday, June 3, 2024. The winner will be notified by Wednesday, June 12th.

Click here for a page with a link to watch the webinar replay and to access the short contest entry form. Click on it today! And those are the details on a webinar with Silicon Catalyst, ST Microelectronics and an exciting MEMS development contest.


CEO Interview: Roger Espasa of Semidynamics
by Daniel Nenni on 05-17-2024 at 6:00 am

Roger Espasa

Roger Espasa is the CEO and founder of Semidynamics, an IP supplier of two RISC-V cores, Avispado (in-order) and Atrevido (out-of-order), supporting the RISC-V vector extension and Gazzillion™ misses, both targeted at HPC and Artificial Intelligence. Prior to founding the company, Roger was Technical Director/Distinguished Engineer at Broadcom, leading a team designing a custom ARMv8/v7 processor on 28nm for the set-top box market. Before his time at Broadcom, from 2002 to 2014, Roger led various x86 projects at Intel as Principal Engineer: the SIMD/vector unit and texture sampler on Knights Ferry (45nm), the L2 cache and texture sampler on Knights Corner (22nm), the out-of-order core on Knights Landing (14nm), and the Knights Hill core (10nm). From 1999 to 2001 he worked for the Alpha Microprocessor Group on a vector extension to the Alpha architecture.

Roger got his PhD in Computer Science from Universitat Politècnica de Catalunya in 1997 and has published over 40 peer-reviewed papers on vector architectures, graphics/3D architecture, binary translation and optimization, branch prediction, and media ISA extensions. Roger holds 9 patents with 41 international filings.

Tell us about your company?
Processors are my passion. I’ve worked on major processor architectures such as Alpha, x86, ARM and now RISC-V. When I became aware of the new RISC-V architecture, I realised that it was going to be the future of processors. Rather than being locked into a choice of either Arm or Intel, companies would have a choice of which IP processor vendor they wanted to use. In addition to vendor choice, the fact that RISC-V is an open standard means that both customers and vendors can extend the ISA with whatever features they need. This flexibility and this freedom-to-change is something you simply can’t have if you are using Arm or Intel.

So, in 2016, I founded the company and we did a multi-core, RISC-V chip design for Esperanto Technologies. This financed the company as it started up. We had some other design projects that provided the cash flow while we developed our own range of 64-bit RISC-V IP cores such as Atrevido that we announced last year.  I am proud to say that we are entirely self-funded through sales and a few European grants which has enabled us to build a dynamic, highly knowledgeable team of over 70 and growing. This means that we are totally in control of our destiny and the pace at which we build the business.

What problems are you solving?
The key problem is that customers have a limited choice when it comes to IP cores, even if you include ARM as a supplier. Furthermore, those IP cores tend to come in a “fixed menu” format, i.e., you can’t add custom features to them. Granted, they all come with some configuration options (cache size, for example), but they can hardly ever be expanded with the customer’s special features needed for their application. We made the decision to accept any change request made by the customer, even if it implied deep “surgery” inside the core. Hence came our motto, “Open Core Surgery”. With us, the customer has total control over the specification, be it new instructions, separate address spaces, new memory accessing capabilities, etc.

This means that Semidynamics can precisely tailor a core to meet each project’s needs so there are no unnecessary overheads or compromises. Even more importantly, Semidynamics can implement a customer’s ‘secret sauce’ instructions and features into the core in a matter of weeks, which is something that no-one else offers.

Semidynamics also enables customers to achieve a fast time to market for their customised core as a first drop can be delivered that will run on an FPGA. This enables the customer to check functionality and run software on it while Semidynamics does the core verification. By doing these actions in parallel, the product can be brought to market faster and with reduced risk.

What application areas are your strongest?
We target any application that needs to move massive amounts of data around very fast such as AI and ML. Semidynamics has the fastest cores on the market for moving large amounts of data even when the data does not fit in the cache. Thanks to our “Gazzillion™ technology”, we can sustain a bandwidth of a “cache-line per clock cycle”, i.e., 64 Bytes every clock. And this can be done at frequencies up to 2.4 GHz on the right node. The rest of the market averages about a cache line every many, many cycles; that is nowhere near Semidynamics’ one every cycle. This makes the core perfect for applications that stream a lot of data and/or the application touches very large data that does not fit in cache. This unique capability is thanks to the fact that our cores can support up to 128 simultaneous requests for data and track them back to the correct place in whatever order they are returned. This is nearly 20 times more requests than competitors.
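The quick arithmetic behind that sustained-bandwidth claim, using only the figures quoted above:

```python
# One cache line per clock at the quoted peak frequency.
bytes_per_cycle = 64   # one 64-byte cache line per clock
clock_ghz = 2.4        # "up to 2.4 GHz on the right node"
print(f"{bytes_per_cycle * clock_ghz:.1f} GB/s sustained")  # 153.6 GB/s
```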

This ability to move large amounts of data is required by Semidynamics’ Vector Unit, which is the largest fully customisable Vector Unit in the RISC-V market, delivering up to 2048b of computation per cycle for unprecedented data handling. The Vector Unit is composed of several ‘vector cores’, roughly equivalent to a GPU core, that perform multiple calculations in parallel. Each vector core has arithmetic units capable of performing addition, subtraction, fused multiply-add, division, square root, and logic operations. Semidynamics’ vector core can be tailored to support different data types: FP64, FP32, FP16, BF16, INT64, INT32, INT16, INT8, or INT4, depending on the customer’s target application domain. The largest data type size in bits defines the vector core width, or ELEN. Customers then select the number of vector cores to be implemented within the Vector Unit, either 4, 8, 16 or 32 cores, catering for a very wide range of power-performance-area trade-off options. Once these choices are made, the total Vector Unit data path width, or DLEN, is ELEN x number of vector cores. Semidynamics supports DLEN configurations from 128b to 2048b.
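A minimal enumeration of that DLEN relationship, using only the options described above:

```python
# DLEN = ELEN x number of vector cores.
for elen in (32, 64):                  # widest element: e.g. FP32/INT32 vs FP64/INT64
    for cores in (4, 8, 16, 32):
        print(f"ELEN={elen}b x {cores} vector cores -> DLEN={elen * cores}b")
# spans the 128b .. 2048b range quoted above
```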

Last but not least, our Tensor Unit is built on top of the Semidynamics RVV1.0 Vector Processing Unit and leverages the existing vector registers to store matrices. This enables the Tensor Unit to be used for layers that require matrix multiply capabilities, such as Fully Connected and Convolution, and use the Vector Unit for the activation function layers (ReLU, Sigmoid, Softmax, etc), which is a big improvement over stand-alone NPUs that usually have trouble dealing with activation layers.

The Tensor Unit leverages both the Vector Unit capabilities as well as the Atrevido-423 Gazzillion™ capabilities to fetch the data it needs from memory. Tensor Units consume data at an astounding rate and, without Gazzillion, a normal core would not keep up with the Tensor Unit’s demands. Other solutions rely on difficult-to-program DMAs to solve this problem. Instead, Semidynamics seamlessly integrates the Tensor Unit into its cache-coherent subsystem, opening a new era of programming simplicity for AI software.

Every designer using RISC-V wants to have the perfect set of Power, Performance and Area along with unique differentiating features and now, for the first time, they can have just that. This makes it ideal for the next generation applications of AI, Machine Learning (ML) and High-Performance Computing especially where big data, such as ChatGPT’s 14GB, just won’t fit into L1, L2 or L3 cache.

What keeps your customers up at night?
Finding that their data is too big to be handled with standard core offerings that also struggle to cope with the flow of data. There is a huge demand for AI hardware where this is a major problem. Our solution is the new All-In-One AI IP. This brings together all our innovations to create a unified IP solution that combines RISC-V, Vector, Tensor and Gazzillion technology so that AI chips are now easy to program and scale to whatever processing power is required.

The problem that we address is that the data volume and processing demand of AI is constantly increasing, and the current solution is, essentially, to integrate more individual functional blocks. The CPU distributes dedicated partial workloads to gpGPUs (general purpose Graphics Processing Units) and NPUs (Neural Processing Units), and manages the communication between these units. But this has a major issue, as moving the data between the blocks creates high latency. The current AI chip configuration is inelegant, typically involving three different IP vendors and three software tool chains, with poor PPA (Power, Performance, Area), and it is increasingly hard to adapt to new algorithms. For example, such configurations have difficulty handling an AI algorithm called a transformer.

We have created a completely new approach that is easy to program as there is just the RISC-V instruction set and a single software development environment. Integrating the various blocks into one RISC-V AI processing element means that new AI algorithms can easily be deployed without worrying about where to distribute which workload. The data is in the vector registers and can be used by the Vector Unit or the Tensor Unit with each part simply waiting in turn to access the same location as needed. Thus, there is zero communication latency and minimized caches that lead to optimized PPA but, most importantly, it easily scales to meet greater processing and data handling requirements.

In our solution there is just one IP supplier, one RISC-V instruction set and one tool chain making implementation significantly easier and faster with reduced risk. As many of these new processing elements as required to meet the application’s needs can be put together on a single chip to create a next generation, ultra-powerful AI chip.

The RISC-V core inside our All-In-One AI IP provides the ‘intelligence’ to adapt to today’s most complex AI algorithms and even to algorithms that have not been invented yet. The Tensor Unit provides the sheer matrix multiply capability for convolutions, while the Vector Unit, with its fully general programmability, can tackle any of today’s activation layers as well as anything the AI software community can dream of in the future. Having an All-In-One processing element that is simple and yet repeatable solves the scalability problem, so our customers can scale from one TOPS to hundreds of TOPS by using as many processing elements as needed on the chip. In addition, our IP remains fully customisable to enable companies to create unique solutions rather than using standard off-the-shelf chips.

What does the competitive landscape look like and how do you differentiate?
There are a lot of competitors and a small handful of big ones but, essentially, they fall in two camps: either they offer a core and, maybe a Vector Unit, or they offer a not-so-flexible NPU. We are unique in providing a fully customisable all-in-one solution comprising a core with our Open Core Surgery, Tensor Unit, Vector Unit and Gazzillion that provide further differentiation to create the high performance, custom core that they need.

What new features/technology are you working on?
One of the many delights of the RISC-V community is that there are always new great ideas being brought into RISC-V. For example, we will be announcing Crypto and Hypervisor in the near future. Plus, of course, a steady stream of new, even more powerful cores.

How do customers normally engage with your company? 
For a number of years, it was word of mouth, as the processor community is relatively small. I have been in it for years, so customers sought us out as RISC-V processor experts who could think outside the box and create exactly the core that they wanted. More recently, we have moved from stealth mode to actively promoting our cores, and now we have a growing number of customers from around the world.

Also Read:

Semidynamics Shakes Up Embedded World 2024 with All-In-One AI IP to Power Nextgen AI Chips

RISC-V Summit Buzz – Semidynamics Founder and CEO Roger Espasa Introduces Extreme Customization

Deeper RISC-V pipeline plows through vector-scalar loops