
CEO Interview: Dr. J Provine of Aligned Carbon

by Daniel Nenni on 11-17-2023 at 6:00 am


J Provine, PhD, is the CEO and co-founder of Aligned Carbon, a startup that has developed a breakthrough synthesis and purification technology for aligned, semiconducting carbon nanotubes (CNTs) at wafer scale. His team believes that Aligned Carbon is the crucial missing piece in an industry that has long been hopeful of – yet historically disappointed by – the promise of CNTs.

Dr. Provine has over 20 years of experience in nanofabrication, beginning with his PhD studies at Cornell University, with deep expertise in MEMS, atomic layer deposition, and a variety of semiconductor technologies. Continuing through the decade-plus he spent as a senior research scientist and Adjunct Professor at Stanford University, Dr. Provine consulted widely with established companies (Veeco, Bosch) and early-stage startups (QuantumScape, Spark Thermionics). He continues to be deeply involved in activities with the NSF and DOE. Dr. Provine also mentors other deep tech startup founders.

Tell us a little about your journey. How’d you end up in the driver’s seat of a semiconductor company?
I have worked in nanotechnology, particularly semiconductor-related research, since the late ’90s. Before that, I was fixated on particle physics. I never planned on starting a company. I felt at home in the university setting. Over my career, I started spending more time consulting for companies. I had great collaborators and PIs at Stanford, but I was searching for an opportunity to stretch myself in more of a leadership role, building a team with a focused technology that could greatly impact computing. I thought that would happen in the academic setting, but I saw so many revolutionary technologies get stuck in academia, so I looked outside the ivory tower. The technology behind what would become Aligned Carbon was the most interesting, high-value opportunity I had ever seen, and so I decided to take the leap and start the company. I saw the potential for my particular skills and background to improve material manufacturing, and that continues to drive me forward today.

Tell us about Aligned Carbon. What have you been working on?
Aligned Carbon manufactures carbon nanotubes (CNTs) for use in integrated circuits. There has been a tremendous amount of research showcasing the promise of CNTs for use in nanoelectronics. Particularly, my Stanford colleagues Prof. Philip Wong and Prof. Subhasish Mitra led research programs in this area for decades. This research never made a commercial impact: issues with the CNT starting material played a big role in inhibiting broader applicability. We’re changing that at Aligned Carbon.

The semiconductor ecosystem is a huge, global endeavor. Aligned Carbon’s north star has been to revolutionize high performance computing, where gains in speed and energy efficiency can make an enormous impact. The impact CNTs can have for integrated circuits is even broader, including RF data transmission, power circuitry, satellite/space based circuits, and sensors. At Aligned Carbon, we are continuing to push the industry’s understanding of and capabilities in these areas.

What problems are you trying to address and why are CNTs uniquely suited to address them?
For 60 years, silicon has been able to solve most of our semiconductor needs for integrated circuits. Unique materials have emerged over the last few decades to add special capability where silicon was not able to provide optimal performance: GaAs and other III-V semiconductors for photonics, and now GaN and SiC for power electronics. The unique material properties of CNTs, including high mobility, small body dimensions, and enormous current-density capacity, give them tremendous potential to drive circuitry forward. Additionally, CNTs can deliver high-performance transistors for logic with a low thermal budget (<300°C), which enables monolithic integration on top of existing circuitry or temperature-limited substrates. This match of performance and integration is unique and marks CNTs as the best candidate for monolithically integrated 3D logic.

Carbon nanotubes went through a hype cycle before but did not pan out. Why, and what have you solved uniquely?
So much great work happened at IBM in the 2000s, but there were many issues with the manufacturability and reliability of the material. At the time, there was an emphasis on directed synthesis/growth of CNTs – growing the specific kind of CNT you wanted (semiconducting vs. metallic, or even single chirality*) right where you wanted it. That proved to be a very challenging – perhaps impossible – problem, particularly when targeting a semiconducting-to-metallic CNT ratio of greater than 100,000:1. Sorting CNTs by polymer conjugation, commonly referred to as solution purification, has made great headway on the semiconducting-to-metallic ratio, but at the cost of CNT order and of maintaining pristine CNT material.

Aligned Carbon has been focused on a route to having high semiconducting purity while maintaining arrays of CNTs with perfect alignment (no crossings or bundling), all without polymer contamination. This approach means you can have a “thin film” of aligned, purely-semiconducting CNTs that you can pattern as needed with traditional photolithography and etch to make devices.

*Chirality defines the structural and electronic properties of a single-walled carbon nanotube. The vector connecting the centers of two graphitic hexagons, called the chiral vector, determines the structure of a single-walled carbon nanotube. A chiral vector C can be specified by a chiral index (n,m) using basis vectors a1 and a2 of a graphene sheet.
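The chiral-vector relation in the footnote can be written out explicitly; the following is the standard textbook statement (not taken verbatim from the interview):

```latex
% Chiral vector of a single-walled CNT in terms of the graphene basis vectors
\mathbf{C} = n\,\mathbf{a}_1 + m\,\mathbf{a}_2, \qquad (n, m) \in \mathbb{Z}^2,\; 0 \le m \le n
```

A widely used rule of thumb: tubes with \(n - m\) divisible by 3 are metallic or near-metallic, while the remaining roughly two-thirds of chiralities are semiconducting – which is why as-grown CNT material mixes both types and needs the purification step the interview describes.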

You mention a breakthrough. Tell us more about that.
Earlier this year, Aligned Carbon validated its purification process to greater than 99% semiconducting purity while maintaining fully-aligned material and no discernible polymer contamination on the CNTs. Further, there is a clear line of sight to scaling this to >99.9999% semiconducting purity, with our throughput capable of purifying a 300mm wafer in a few minutes. There is more work to be done, but we are very excited to put this material to work and find its limitations as we continue to refine and improve our unparalleled CNT synthesis and purification.

With this breakthrough, what application areas are you putting your focus on?
While we have always had digital logic as our target application, there are many other opportunities that we are actively exploring. The industry has just scratched the surface on the incredibly powerful new properties of CNTs.

We are digging into the use of CNFETs in RF data transmission, due to sub-THz signal linearity, and power electronics, due to high current-carrying potential. We have also previously looked at a few different types of CNT sensors. Another interesting opportunity is circuitry for use in space, as CNFETs are ~16 times less radiation-sensitive than silicon devices because of their reduced interaction cross-section. All of these are focused around making transistors out of CNTs – which is where our material has been focused – and that is very fertile ground.

We are getting some interest across each of these spaces, which is really encouraging, and we do see our expertise in the material lending itself nicely to working closely with partners that like working on major innovations at the device layer.

What does the competitive landscape look like, both within carbon nanotubes and when it comes to other technologies? Why do you win?
Carbon nanotubes are a remarkable material that have the potential to far outperform all bulk semiconductors out there. But they need to meet the stringent requirements of a mature and highly-optimized industry. The great promise of CNTs and the hard limits that silicon has reached make that effort well worth it. One type of competition is other emergent materials, such as 2D materials, which possess intriguing material properties. But they represent a less mature technology/material set that needs even more substantial manufacturing and reproducibility work than CNTs. So for me, CNTs really represent the most promising transistor channel material out there. The key to having reliable CNTs for the IC industry lies in the purity of the electronic properties and the order of the material. Aligned Carbon has a unique approach to delivering this material that maintains the quality of pristine CNTs and integrates cleanly and at volume with established IC manufacturing.

Solutions for other materials, integration schemes, and architectures are another form of competition for us. That said, the needs of the IC industry are complex, multi-faceted, and large. We at Aligned Carbon will be pushing as hard as possible to bring CNTs into the mix, knowing others are doing the same with their technologies.

What sort of partnerships or companies are you looking for right now?
We are excited to put our material to the test and iterate with collaborators/partners on a variety of transistor applications. We are confident in our ability to supply consistent material for R&D validation. Further, we have a lot of options for tuning our material, including pushing to much higher purity. Qualifying a new material for use in a foundry takes significant time, resources, and effort, but we have a unique material offering that innovative partners can utilize to meet the various challenges in the IC industry.

We as a society have a nearly insatiable desire for greater computing capability. The IC industry is rightfully concerned about that hunger causing significant issues in energy consumption and environmental impact. CNTs are uniquely poised to play a larger role in threading that needle, but we need to be vigilant about validating and verifying the technology to make it ready for mass market prime-time.

How can people get in touch?
We are really excited to talk with folks about our advanced technology. With the major material-level renaissance currently happening in the semiconductor space (as shown in power electronics), we are excited to get our products in front of people.

If you are interested in getting in touch, please feel free to reach out to info@alignedcarbon.com, as our team would be happy to chat. Also feel free to follow our company’s LinkedIn for regular updates or to connect with our core team.

Also Read:

Executive Interview: Tony Casassa, General Manager of METTLER TOLEDO THORNTON

CEO Interview: Venkata Simhadri of MosChip

CEO Interview: Pat Brockett of Celera


RISC-V Summit Buzz – Ron Black Unveils Codasip’s Paradigm Shift for Secured Innovation

by Mike Gianfagna on 11-16-2023 at 10:00 am

Ron Black

Codasip is a processor solutions company with an expanding footprint. It is Europe’s leading RISC-V organization with a global presence. Codasip reports billions of chips already use its technology. You can learn more about Codasip here. The company has made some announcements recently that expand its offerings in terms of application areas and processor options. Dr. Ron Black has been the CEO at Codasip for almost three years. Previously, he was president and CEO at Imagination Technologies, and CEO at Rambus, MobiWire, UPEK, and Wavecom. At his RISC-V Summit keynote, Ron discussed some major challenges for data security and an innovative approach to address those challenges. His presentation definitely got some attention. Read on to understand how Ron Black unveils Codasip’s paradigm shift for secured innovation.

Leading Up to the Summit – Expanding Footprint

In mid-October, Codasip announced the 700 RISC-V processor family, expanding its offering beyond embedded processor IP to stand-alone application processors. In its own words, “bringing the world of Custom Compute to everyone.” The 700 family is a configurable and customizable set of RISC-V baseline processors. It is intended to complement Codasip’s embedded cores by offering a different starting point to accommodate the need for higher performance.

Codasip Studio delivers a streamlined design process that unleashes the potential of the 700 family. Examples of the power of Codasip Studio include:

Enabling different levels of processor optimization for each use case:

  • Through profiling
  • Giving predictable results
  • Improving time-to-market with a proven, highly automated approach

Automating processor design and custom SDK & HDK generation:

  • Better results achieved in a more efficient way
  • Accelerated exploration, investigation, and configuration
  • Allowing maximal optimization

Fostering collaboration between hardware and software teams:

  • To improve system efficiency
  • Allowing teams to analyze and suggest changes to the hardware team
  • Delivering differentiation and time-to-market in a highly competitive environment

The first core in the 700 family was also introduced. Codasip A730 is a 64-bit RISC-V application core that is being provided to early-access customers. It supports single-core or multi-core configurations and is Linux capable (MMU) with up to 2x the performance of the previous core. It employs a faster, tightly coupled cache coherent memory system as well. The core has broad market application from edge IoT to AI and sensor fusion. As you will see, the 700 family is also relevant to Codasip’s paradigm shift for secured innovation.

Ron’s Keynote at the Summit – Addressing the Elephant in the Room

Data security is a huge and growing problem. The specter of a data breach and the subsequent havoc associated with identity and monetary theft is on everyone’s mind. The headlines below illustrate the scope of the problem.

The problem at times can seem insurmountable. Huge amounts of money are lost each year from data breaches. There are so many “attack surfaces” and loopholes to worry about. Ron provided a rather startling insight about the problem at his keynote. It turns out a very large part of the problem is due to illegal memory access – 70 percent or more of the problem has this root cause. And more interesting, it’s been that way for a very long time. The graphic below drives home the point.

Ron then explored the dimensions of the problem – how big is it and what, if anything can be done to reduce the risk and financial loss. It turns out there is a substantial amount of legacy C/C++ code that is vulnerable. In-house, third-party, and open-source streams all contribute to the problem. And the incidence of data breaches is growing exponentially, as shown below.

So, what can be done to plug this rather large security hole? Ron outlined the options.

One could deploy tools to identify and fix all leaks and potential leaks in existing code. This would entail re-writing trillions of lines of code, perhaps with a memory safe approach like Rust. A direct, but intractable solution.

But what if you could use a specialized security processor to catch vulnerabilities natively? Something that could integrate with existing applications. This is a much better way, but does such an option actually exist?

Ron announced the news – both good and bad. The good news is that such a system does indeed exist. Capability Hardware Enhanced RISC Instructions (CHERI) is a specification developed by the University of Cambridge. It’s an architecture extension introducing hardware technology that mitigates software security vulnerabilities. It extends conventional hardware ISAs with memory protection and scalable compartmentalization. Something like this holds great promise since it can be integrated into existing systems with minor work vs. a major re-write.
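To make the CHERI idea concrete, here is a toy software model of a capability: a pointer that carries its own bounds and permissions, checked on every access. This is an illustrative sketch only; the class and method names are invented for the example, and real CHERI enforces these checks in hardware at the ISA level, not in application code.

```python
class CapabilityFault(Exception):
    """Raised when an access violates the capability's bounds or permissions."""

class Capability:
    """Toy model of a CHERI-style capability over a byte buffer."""
    def __init__(self, memory, base, length, perms=("load", "store")):
        self.memory = memory      # backing store (a bytearray here)
        self.base = base          # lowest address this capability may touch
        self.length = length      # number of bytes it may touch
        self.perms = frozenset(perms)

    def _check(self, offset, perm):
        if perm not in self.perms:
            raise CapabilityFault(f"missing permission: {perm}")
        if not (0 <= offset < self.length):
            raise CapabilityFault(
                f"out-of-bounds access at offset {offset} (length {self.length})")

    def load(self, offset):
        self._check(offset, "load")
        return self.memory[self.base + offset]

    def store(self, offset, value):
        self._check(offset, "store")
        self.memory[self.base + offset] = value

memory = bytearray(64)
buf = Capability(memory, base=16, length=8)   # an 8-byte buffer

buf.store(0, 0xAB)                 # in bounds: allowed
try:
    buf.store(8, 0xCD)             # classic one-past-the-end overflow
except CapabilityFault as fault:
    print("trapped:", fault)       # hardware would trap here instead
```

The point of the sketch is that the overflow is caught at the moment of access, with no changes to the surrounding program logic – which is why retrofitting CHERI is so much cheaper than rewriting the code base.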

There are several research projects underway for CHERI from the University of Cambridge and Microsoft, among others. And there is the bad news. It’s a research effort, not a production implementation. Until now.

Ron announced that Codasip is taking CHERI from a research topic to a commercial product thanks to RISC-V, CHERI, and the Codasip Custom Compute methodology. He explained that RISC-V is modular and allows customization. The company has taken its new Codasip 700 baseline processor design and enhanced it for security with CHERI technology using Codasip Studio.

This approach includes built-in, fine-grained memory protection by extending the RISC-V ISA with CHERI-based custom instructions. To enable the use of these instructions, Codasip is also delivering the software environment to take advantage of CHERI technology, bringing a full software development flow to add memory protection. Thanks to these developments, the world can now be a safer place with regard to data protection. Ron announced that Codasip was demonstrating this new core at the RISC-V Summit and lead customers would get early access to the core with CHERI in the second half of 2024.

Comments From Ron

I got a chance to speak with Ron after his keynote. He likened data security to automobile insurance. No one really wants to pay for it, but you definitely want it just before you’re involved in an accident. Data breaches are like those automobile accidents. You need insurance in place at all times to truly be protected. With the introduction of a commercialized version of CHERI, that insurance is now available at a more reasonable cost, when compared with re-writing a massive code base.

Ron pointed out that CHERI is a cross-platform technology. Its benefits are not specific to RISC-V, but the open architecture of RISC-V makes the implementation of CHERI more accessible. Will this new capability for improved data security find widespread use in the RISC-V community, and will other architectures follow with commercial CHERI capabilities? Time will tell, but this innovation from Codasip could be the start of something big. And that’s how Ron Black unveils Codasip’s paradigm shift for secured innovation.


NoCs give architects flexibility in system-in RISC-V design

by Don Dingee on 11-16-2023 at 6:00 am


RISC-V tends to generate excitement over the possibilities for the processor core, any custom instruction extensions, and its attached memory subsystem. Those are all necessary steps to obtaining system-level performance. But is that attention sufficient? Architects who have ventured into larger system-on-chip (SoC) designs know how complicated interconnecting many IP blocks vying for data paths all at once can get. Arteris suggests a ‘system-in’ RISC-V design approach for solving system-level challenges instead of a ‘RISC-V out’ perspective.

De-facto specs can cross up designers quickly

“The typical cut point for starting an SoC design is a standard protocol like AXI,” says Frank Schirrmeister, VP of Solutions and Business Development for Arteris. “In that respect, RISC-V design is fundamentally the same as designing with other processor cores supporting AXI for their system interface.” Many peripheral IP blocks also support AXI. Broad support usually makes for a clear choice, but flexibility and optimization soon become factors at scale.

AXI carries about 300 wires, which need routing to every IP block it connects across a chip. It doesn’t take many IP blocks in the mix to create raw wiring congestion. Architects frequently turn to crossbars and bridges as IP block counts scale. Another problem develops quickly – how does one arbitrate between high-speed and high-priority data paths when several must be active simultaneously? AXI crossbars typically offer only one level of arbitration per switch. Protocol restrictions add to the performance throttling problem.


A simplified view of AXI crossbar routing, courtesy of Arteris

Establishing AXI routes between the RISC-V core and various IP blocks may work logically but not physically without extra care and feeding. “Every IP block could be on a different power or clock domain, leading to bigger problems,” continues Schirrmeister. Clock domain crossings lead to instabilities and timing closure problems. Power domain crossings add level-shifting complexity and make managing power consumption through shutting down unused blocks and sequencing their reawakening more challenging.

Adding a NoC embraces AXI in a co-optimized solution

IP blocks may support different protocols, with AXI being widely adopted. However, there are others like OCP, CHI, and more, each with their own versions and specific supported features. In simpler designs, users may like to start with an AXI crossbar, but there is a more efficient alternative for the SoC-wide interconnect. “Without a network-on-chip (NoC), any attempts at co-optimizing an SoC are undone by having to deal with crossbars and clock and power domain complexity,” observes Schirrmeister. “With a NoC, clock and power domain issues are localized and dealt with at the IP block boundary instead of in the network.”

A NoC simplifies transport and handling domain crossings, courtesy of Arteris

The transport protocol inside an Arteris NoC uses about half the wiring of an AXI interface, cutting wiring congestion. Separating clock and power domain crossings from the network helps guarantee NoC timing closure and improves the ability to manage power on a block-by-block basis. Both AXI and non-AXI blocks integrate more easily with localized protocol conversion.
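A rough back-of-envelope comparison makes the congestion argument tangible. The only numbers taken from the article are ~300 wires for an AXI interface and roughly half that for the NoC transport; the scaling assumptions (worst-case all-to-all crossbar links vs. one transport link per block) are simplifications for illustration, not Arteris figures.

```python
# Illustrative wiring estimate: AXI crossbar vs. NoC transport.
AXI_WIRES = 300              # ~300 wires per AXI link, per the article
NOC_WIRES = AXI_WIRES // 2   # NoC transport uses about half, per the article

def crossbar_wires(n_blocks, wires_per_link=AXI_WIRES):
    # Worst case for an all-to-all crossbar: a point-to-point link between
    # every ordered pair of blocks, i.e. n * (n - 1) links.
    return n_blocks * (n_blocks - 1) * wires_per_link

def noc_wires(n_blocks, wires_per_link=NOC_WIRES):
    # A simple NoC keeps wiring roughly linear in block count:
    # one narrower transport link per block into the network.
    return n_blocks * wires_per_link

for n in (4, 16, 64):
    print(f"{n:3d} blocks: crossbar ~{crossbar_wires(n):>9,} wires, "
          f"NoC ~{noc_wires(n):>7,} wires")
```

Even with generous assumptions in the crossbar's favor, the quadratic-vs-linear gap is what architects feel as routing congestion once IP block counts scale.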

A closer look at how power domain crossings work illustrates the advantages. Fewer wires in the transport protocol lead directly to area savings. Isolating power and clock crossings to the IP block boundary simplifies the configuration of handshaking, which is essential when using a power management unit (PMU) external to the NoC to coordinate supervision.

Power domains isolated to the IP block boundary where required, courtesy of Arteris

System-in RISC-V design approach fundamentals

With a system-in RISC-V design approach, the architectural context becomes the NoC, not the processor. IP blocks – including the RISC-V core – become more interchangeable, with reduced concerns over network routing and ripple effects from clock and power domain crossings.

“Architects can concentrate on their performance criteria around a RISC-V core,” says Schirrmeister. “I need this bandwidth on this lane to my selected IP block, so this has to flow, and that has to stall – and the NoC sorts that out once the block is connected.” Knowing how many video sources or data streams enter the SoC helps size the RISC-V and memory subsystem accordingly, preventing snarled traffic from limiting the RISC-V core performance. As an added benefit, potentially varying protocols depending on the IP used (OCP, CHI, AXI, AHB, APB, and others) can be translated at the border of a NoC for unified handling inside the NoC topology.

A system-in RISC-V design approach may yield even more flexibility in the second design project, where a team enhances its first SoC design with faster RISC-V processing and memory and swaps out targeted IP blocks for more throughput. With video frame rates, resolution, and wired and wireless networking speeds constantly rising, a solid architecture built around a NoC can respond to changes without ripping up the entire design.

Visit the Arteris website for more information on NoCs and other products.


The Coming China Chipocalypse – Trade Sanctions Backfire – Chips versus Equipment

by Robert Maire on 11-15-2023 at 10:00 am

  • China chip sanctions have had the opposite of their intended effect
  • Helps AMAT, LRCX, KLAC & TEL – hurts Micron and GloFo
  • Tsunami of Chinese capacity will hurt memory & 2nd-tier chip makers
  • The probability of a much longer chip downcycle is increasing
China is cornering the market on Semiconductor equipment

Quarterly reports from the industry show that China was 48% of Lam’s business, 46% of ASML’s business, and 43% of KLA’s business. We fully expect China to be a similarly large percentage of AMAT’s business when they report this week.

With China buying nearly half of all chip production equipment on the planet, you have to question what happens when all that capacity comes on line?

It would be both naive and stupid to not assume it will add to China’s growing chip capacity.

If China were buying up half the planet’s steel-milling equipment, you would bet the steel industry would be freaking out about the prospect of a flood of cheap Chinese steel coming onto the market as soon as the equipment was installed. So too should the semiconductor industry be concerned.

Look at all the other industries that have been decimated by Chinese capacity coming on line: solar cells, LEDs, pharmaceuticals, and so on.

However, we blithely and happily continue to sell chip-making equipment – everything and anything not nailed down.

In fact, semiconductor equipment companies appear quite happy about the rush of business from China because it is saving them from an otherwise even uglier down cycle. Without China, the equipment industry would be cut in half.

The untold truth is that China probably accounts for more than half of equipment companies’ profits, as the prices charged in China are well above the lower prices charged to TSMC, Samsung, Intel, and the rest of the world.

Every piece of equipment sold to China reduces chip demand elsewhere

The semiconductor industry, like many other industries, is somewhat of a zero-sum game, meaning that demand satisfied in one country offsets what would otherwise have been produced in another country.

There is some elasticity of demand in memory but most other applications are not as price elastic.

Automotive & consumer chips made in China will offset demand for those chips in the US, Japan, Korea and elsewhere.

Thus, all the equipment being shipped to China comes at the expense of chips made elsewhere.

Micron & GloFo are most negatively impacted by coming China Chips

Both Micron and Global Foundries are the most vulnerable US companies to the coming tsunami of Chinese chips created by the flood of equipment being bought.

Let’s face it…memory is a commodity market. A gig of DRAM or NAND is pretty much indistinguishable as to country of manufacture. Sure, there are differences, such as DDR or high-bandwidth memory, but the vast majority is a fungible commodity.

As such, all the equipment being sold at a high rate of speed to Chinese memory makers will hurt Micron and other memory makers such as Samsung (to a lesser extent).

All the memory-making equipment on boats and planes to China right now reduces the need for new Micron fabs (and jobs) in Boise and New York…it’s just that simple.

Global Foundries is also at significant risk, as they are stuck in the technology dark ages after giving up on the Moore’s Law technology race. SMIC in China is already far ahead of GloFo, having produced a 7nm chip that was a slap in the face to US sanctions. Neither GloFo nor SMIC has EUV, yet SMIC is more advanced. Without doubt, SMIC and other Chinese fabs can undercut GloFo’s pricing. Yes, GloFo is trying to “hide” in some specialized chips, but that is not adequate protection.

If you think “trusted” production for the US defense industry will save GloFo, think again, as it’s a relative rounding error in global chip-making capacity.

The bottom line is that GloFo & Micron will both be hurt as excess chip capacity comes on line from China over the next few years.

Intel is in a safer, but not totally invulnerable, space as they are focused on the leading edge and remain ahead of SMIC and other Chinese chip companies, pursuing EUV and Moore’s Law with a renewed laser focus.

California wins at New York’s & Idaho’s expense

It is interesting to note that while California-based companies such as Applied, Lam & KLA are likely reaping more than half of their profits selling chip equipment to China, New York & Idaho will suffer. New York has GloFo and maybe (in the distant future) Micron; Idaho is scheduled to get a Micron fab in Boise before New York.

With all the memory capacity coming on line in China, and middle-of-the-road foundry capacity (which GloFo specializes in) also coming on line, it could indefinitely delay the need for additional fabs. At current utilization rates, GloFo certainly will never need a second fab in NY, and it is wisely expanding overseas where it is more financially viable and competitive.

In the meantime, Applied is continuing to export manufacturing jobs to Singapore, and Lam is quickly exporting jobs to Malaysia.

It’s all about the “marginal cost”

As we have experienced many times in the cycles of the memory industry, pricing can and sometimes will collapse to the marginal cost of semiconductor manufacturing.

That is, selling prices fall to the marginal cost of manufacture – raw materials, labor, electricity, etc. – not including equipment amortization, as that is already a “sunk” cost. China has more of a propensity to “race to the bottom” to marginal cost, as we have unfortunately seen in other industries.

It’s a cost of doing business and surviving in the long term. They are certainly not focused on quarterly earnings…it’s all about market share and market dominance and killing off the competition.

Something that Korea has been very good at in the memory industry but China is even more cutthroat than Korea and will undercut them as well…..

When there is excess capacity, the “race to the bottom” accelerates….and we see excess capacity coming on strong….

The cycle will almost certainly lengthen or perhaps “double dip”

We have warned several times over the past year that the expansion of the Chinese semiconductor market in the face of and because of sanctions will likely worsen the current oversupply condition that causes the industry to be in its current downcycle.

Increasing demand, which would otherwise offset the current excess supply, may not be able to absorb all the new capacity that will come on line. Or if it does, an upcycle may be short-lived until all that Chinese capacity arrives.

The Stocks

As we have pointed out here, we think there is significant longer-term risk to both Global Foundries and Micron. They are quite vulnerable to the capacity being bought at a feverish pace by China and will likely be the victims as that capacity comes on line in their core markets.

We can only hope that demand outpaces increasing global chip supply, but we wouldn’t bet on it or invest our money based upon that false hope…it’s too risky a bet.

Meanwhile chip equipment companies have been saved from disaster by China but will get hurt longer term as China looks to fill equipment needs internally.

Unfortunately, the chip sanctions placed on China have had exactly the opposite effect of their intentions……that is, China will be a much bigger, stronger player in chips, with more of a threat to US players, because, in large part, of the fear-inspired buying of tons of equipment caused by those very sanctions.

We have effectively “scared” China into buying much more equipment than they needed or would have otherwise bought, and it will now come back to haunt the industry…….

Can you say “BOO”……?

Trick or treat?

About Semiconductor Advisors LLC

Semiconductor Advisors is an RIA (a Registered Investment Advisor), specializing in technology companies with particular emphasis on semiconductor and semiconductor equipment companies. We have been covering the space longer and been involved with more transactions than any other financial professional in the space. We provide research, consulting, and advisory services on strategic and financial matters to both industry participants and investors. We offer expert, intelligent, balanced research and advice. Our opinions are very direct and honest and offer an unbiased view as compared to other sources.

Also Read:

KLAC- OK quarter in ugly environment- Big China $ – Little Process $ – Legacy good

LRCX- QTR OK but outlook mixed- At half of revs, China is huge risk- Memory poor

ASML- Longer Deeper Downcycle finally hits lithography – Flat 2024 – Weak Memory – Bottom?


Accelerating Development for Audio and Vision AI Pipelines

Accelerating Development for Audio and Vision AI Pipelines
by Bernard Murphy on 11-15-2023 at 6:00 am

AI pipeline min

I wrote previously that the debate over which CPU rules the world (Arm versus RISC-V) somewhat misses the forest for the trees in modern systems. This is nowhere more obvious than in intelligent audio and vision: smart doorbells, speakers, voice-activated remotes, intelligent earbuds, automotive collision avoidance, self-parking, and a million other applications. None of these would be possible simply by connecting a CPU or even a bank of CPUs directly to an AI engine. CPUs play an important administrative role, but audio- and vision-based intelligent systems depend on complex pipelines of signal processing surrounding the AI core. However, building these pipelines can add significant complexity and schedule risk to product development in very competitive markets.

What is an AI pipeline?

Audio samples are one-dimensional, continuously varying signals; image signals are similar but two-dimensional. After noise reduction, each must at minimum be scaled into a form compatible with the samples on which the AI model was trained. For example, an image may be reduced to grayscale, since color adds significantly to training and inference cost. An audio signal may be filtered to eliminate low- and high-frequency bands. For similar reasons, the signal range will be resized and averaged. All these steps are easily handled by software running on a DSP.
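The scaling and filtering steps above can be sketched in a few lines of NumPy. This is an illustrative sketch of the data flow only; the target resolution, band edges, and nearest-neighbor resize are assumptions for demonstration, not parameters from any real product pipeline.

```python
import numpy as np

def preprocess_image(rgb, target=(96, 96)):
    """Reduce an HxWx3 RGB frame to a grayscale, target-size array in [0, 1]."""
    # Standard luma weights for grayscale conversion.
    gray = rgb.astype(np.float32) @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    h, w = gray.shape
    # Nearest-neighbor resize: cheap, adequate for a sketch.
    rows = np.arange(target[0]) * h // target[0]
    cols = np.arange(target[1]) * w // target[1]
    return gray[np.ix_(rows, cols)] / 255.0

def preprocess_audio(samples, rate, low_hz=100.0, high_hz=4000.0):
    """Band-limit a 1-D signal by zeroing FFT bins outside [low_hz, high_hz]."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
audio = np.random.randn(16000)  # one second at 16 kHz
print(preprocess_image(frame).shape)         # (96, 96)
print(preprocess_audio(audio, 16000).shape)  # (16000,)
```

In a real product these steps would run as optimized fixed-point kernels on the DSP; the sketch only shows the shape of the computation.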

This level of pre-processing is basic. Pulin Desai (Product Marketing and Management for Tensilica vision, radar, lidar and communication DSPs at Cadence) introduced me to a few recent advances pushing higher accuracy. For example, voice recognition uses a technique called MFCC to extract features from speech which are considered more directly determinative than a simple audio wave. As another example, noise suppression now adds intelligence to support voice commands overriding background music or speech, or to make sure you don't miss emergency sirens. Both these cases are controlled by signal processing, sometimes with a sprinkle of AI.
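For readers unfamiliar with MFCC, the standard recipe is: frame the signal, apply a window, take the power spectrum, apply a triangular mel filterbank, take logs, then a DCT to decorrelate. A heavily simplified NumPy sketch follows; all parameters are illustrative defaults, and this is in no way Cadence's implementation.

```python
import numpy as np

def mfcc(signal, rate, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Toy MFCC: frame -> Hann window -> power spectrum -> mel filterbank
    -> log -> DCT-II. All parameters are illustrative defaults."""
    # Frame the signal with 50% overlap and apply a Hann window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft//2+1)

    # Triangular mel filterbank between 0 Hz and Nyquist.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(rate / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)          # (n_frames, n_mels)

    # DCT-II to decorrelate, keeping the first n_ceps coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_mels)
    return log_mel @ dct.T                             # (n_frames, n_ceps)

feats = mfcc(np.random.randn(16000), 16000)
print(feats.shape)  # (61, 13) for one second at 16 kHz
```

Note the exponentials and logarithms in the mel/log stages; this is exactly the kind of math that motivates the precision discussion later in the article.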

Emerging vision transformers with global recognition strengths are now hot and are moving to multiscale image recognition, where an image is broken down into a pyramid of progressively reduced images at lower resolution. There are more pre-inference possibilities, and then (eventually) the actual inference step (CNN, transformer, or other options) will run. Following inference, post-processing steps must run; these may include non-maximal suppression (NMS), image sharpening or speech enhancement. Possibly the pipeline may even run an additional inference step for classification. It is obvious why such a process is viewed as a pipeline: many steps from an original image or sound, mostly signal processing accomplished through software running on DSPs.
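As a concrete example of one post-processing step named above, greedy non-maximal suppression keeps the highest-scoring detection and discards boxes that overlap it too much. The box format and IoU threshold below are illustrative assumptions:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    it beyond iou_thresh, repeat. Boxes are (x1, y1, x2, y2) rows."""
    order = np.argsort(scores)[::-1]   # indices sorted by descending score
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # Intersection of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thresh]   # survivors for the next round
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]: the two overlapping boxes collapse to one
```

Steps like this are branchy and data-dependent, which is part of why pipelines mix DSP control code with vector math.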

Just as AI models continue to evolve, pipeline algorithms also continue to evolve. For a product OEM it must be possible to develop and maintain these complex pipelines for release within say 18-24 months and to have products remain competitive for a product life of 5 years or more. Bottom line, OEMs need simplification and acceleration in developing and maintaining DSP software to meet these goals.

Streamlining the development flow

The Tensilica portfolio is already supported by an extensive partnership ecosystem in Audio, Voice and Vision, which should provide much of the help needed in filling out a pipeline implementation. Naturally OEMs will want to selectively differentiate through their own algorithms where appropriate and here recent advances become especially important.

The first and most attention-grabbing is auto-vectorization. You write software for a DSP in much the same way you would write software for a CPU, except where you can take full advantage of the DSP to accelerate wide vector calculations. Previously, writing software to fully exploit this capability required special expertise in manual coding for vectorization, creating a bottleneck in development and upgrades.

Auto-vectorization aims to automate this task. Prakash Madhvapathy (Director of Product Marketing and Management for Tensilica audio/voice DSPs at Cadence) tells me that there are some guidelines the developer must follow to make this pushbutton, but code written to these guidelines works equally well (though slower) on a CPU. They have run trials on a variety of industry standard applications and have found the compiler performs as well as hand-coded software. Auto-vectorization thus opens up DSP algorithm development to a wider audience to help reduce development bottlenecks.
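The idea is easy to illustrate by analogy in an array language: the scalar loop below is the form a vectorizing compiler must recognize, and the one-line vectorized form is what the compiler effectively turns it into. This is a NumPy analogy for intuition only, not Tensilica code or actual compiler output.

```python
import numpy as np

# Scalar-style loop: one multiply-accumulate per iteration. This simple,
# dependency-free form is what a vectorizing compiler looks for before
# packing the work into wide SIMD lanes.
def dot_scalar(a, b):
    acc = 0.0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc

# Vectorized form: the whole multiply-accumulate expressed as one array
# operation, analogous to what the auto-vectorizer emits.
def dot_vector(a, b):
    return float(np.dot(a, b))

a = np.arange(4.0)  # [0, 1, 2, 3]
b = np.ones(4)
print(dot_scalar(a, b), dot_vector(a, b))  # 6.0 6.0
```

The point of the Cadence guidelines is that code written in the first style, following a few rules, can be transformed to the second automatically, and still runs (just slower) on a plain CPU.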

The next important improvement supports double-precision floating point where needed. Double precision might seem like overkill for edge applications; however, newer algorithms such as MFCC and softmax now use functions such as exponentials and logarithms which will quickly overflow or underflow single-precision floats. Double precision helps maintain accuracy, ensuring a developer can avoid special handling when porting from datacenter-based software development platforms.
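The overflow risk is easy to reproduce: a naive softmax over float32 logits saturates exp() and produces NaN, while the same computation in double precision is fine. This is a toy sketch, not the Tensilica library code; production softmax implementations also subtract the maximum logit first.

```python
import numpy as np

def naive_softmax(x):
    e = np.exp(x)   # overflows float32 for inputs above roughly 88.7
    return e / e.sum()

logits32 = np.array([10.0, 100.0], dtype=np.float32)
logits64 = logits32.astype(np.float64)

with np.errstate(over='ignore', invalid='ignore'):
    s32 = naive_softmax(logits32)  # exp(100) -> inf in float32, inf/inf -> nan
s64 = naive_softmax(logits64)      # fine in double precision

print(s32)               # [ 0. nan]
print(np.round(s64, 6))  # [0. 1.]
```

Carrying double precision through such hotspots avoids exactly this class of silent porting bug.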

Another important improvement is support for 40-bit addresses in the iDMA, allowing the system to directly address up to 1TB of memory without requiring the developer to manage address offsets through special code. That might sound like a lot of memory for an edge device, but you only need to consider automotive applications to realize such sizes are becoming common. The wider address range also simplifies porting from datacenter platforms.
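The 1TB figure follows directly from the address width:

```python
# A 40-bit address selects one of 2**40 distinct byte addresses.
addressable = 2 ** 40
print(addressable)             # 1099511627776 bytes
print(addressable // 2 ** 30)  # 1024 GiB, i.e. 1 TiB
```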

Product enhancements and availability

All existing DSPs are ported to the Tensilica Xtensa LX8 platform which supports all of the above advances together with an L2 cache option, enhanced branch prediction recognizing that intermixed control and vector code is becoming more common, direct AXI4 connection to each core, and expanded interrupt support.

Cadence has recently released two new Tensilica audio DSPs (HiFi 1s and HiFi 5s) and two new vision DSPs (Vision 110 and Vision 130), all building on the LX8 platform. These include several improvements for performance and power, but I want here to focus particularly on improvements to simplify and accelerate software development.

The Vision 130/110 DSPs support on-the-fly decompression, as mentioned in an earlier blog. Add to that higher fmax, higher floating-point performance, faster FFTs and support for floating point complex types delivering overall up to 5X performance improvements on standard DSP benchmarks and up to 3.5X improvement on AI benchmarks. These new Vision processors are available now.

The audio HiFi 1s and 5s DSPs deliver up to 35X improvement on double-precision XCLIB functions, 5-15% better codec performance through improved branch prediction, and up to 50% improved performance thanks to the L2 cache. These new DSP processors are expected to be available by December of 2023. (I should add, by the way, that HiFi 1s, though an audio DSP, has enough capability to also handle ultra-low-power video wake-up. Pretty cool!)

You can learn more about HiFi DSPs HERE and Vision DSPs HERE. You should also check out a white paper on the importance of DSPs in our brave new AI world.

Also Read:

New STA Features from Cadence

Developing Effective Mixed Signal Models. Innovation in Verification

Assertion Synthesis Through LLM. Innovation in Verification


Generative AI for Silicon Design – Article 3 (Simulate My Design)

Generative AI for Silicon Design – Article 3 (Simulate My Design)
by Anshul Jain on 11-14-2023 at 10:00 am

Generative AI for Silicon Design

Generative AI has time and again showcased its power to understand, predict, and explain a myriad of phenomena. Beyond its famed applications in art and text, it’s making ripples in the niche realm of hardware engineering. In this article, our exploration focuses on the potential of Generative AI to comprehend and predict intricate behaviors integral to hardware design simulation. Using a contemporary Generative AI model, we examined how it interprets intricacies of typical hardware modules in simulated environments.

Experiment 3: Simulate My Design

Our latest experiment centers on the capability of Generative AI to understand and predict the behavior of hardware arbiters, which are quintessential in managing access to shared resources in digital designs. Using OpenAI’s GPT4, we assessed the intricate behavior of an arbiter module (sourced from https://github.com/lowRISC/opentitan/blob/master/hw/ip/prim/rtl/prim_arbiter_tree.sv), a predictable round-robin arbiter tailored for multiple requestors. Make no mistake, the second test result shown below proves that Generative AI really understood the “tree” implementation and did not suggest the behavior of a general round-robin arbiter.

Generated Simulations (Validated Against Actual Simulations)
Prompt 1 (test 1):
For 8 number of requestors, and ready_i always asserted, what will be the sequence of grants if all the requests are constantly asserted?

Response 1:
For a constant request vector of [1 1 1 1 1 1 1 1] and with ready_i always asserted, the prim_arbiter_tree module will rotate the grant through the requestors in a fair manner. The behavior of the arbiter is designed to mask out the recently granted requestor and prioritize other pending requests. This way, the arbiter ensures that every requestor gets an equal opportunity in a round-robin fashion.

Given the design of this arbiter, if all requestors are constantly asserting their request, the sequence of grants will be:

Requestor 0 (assuming the arbiter starts with this one by default)
Requestor 1
Requestor 2
Requestor 3
Requestor 4
Requestor 5
Requestor 6
Requestor 7
And then back to Requestor 0, and so on in a loop.
In terms of gnt_o output:

[1 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0]
[0 0 1 0 0 0 0 0]
[0 0 0 1 0 0 0 0]
[0 0 0 0 1 0 0 0]
[0 0 0 0 0 1 0 0]
[0 0 0 0 0 0 1 0]
[0 0 0 0 0 0 0 1]
Then back to [1 0 0 0 0 0 0 0], and so on.
The grants will continue in this manner as long as all requests are constantly asserted and ready_i remains asserted.
Prompt 2 (test 2):
For 8 number of requestors, and ready_i always asserted, what will be the sequence of grants if all the requests are toggling between 1 and 0?

Response 2:
If all the requests are toggling between 1 and 0, the sequence in which they toggle will matter. Let's assume all requestors toggle simultaneously, meaning in one cycle all are 1 and in the next cycle all are 0, and then back to all 1, and so on.

Given the behavior of the prim_arbiter_tree module and based on the design, if ready_i is always asserted:

Cycle 1: req_i = [1 1 1 1 1 1 1 1] - Requestor 0 gets granted as it has the highest priority.
Cycle 2: req_i = [0 0 0 0 0 0 0 0] - No grant because no request is asserted.
Cycle 3: req_i = [1 1 1 1 1 1 1 1] - Requestor 1 gets granted. Requestor 0's request is masked out because it was granted in the most recent cycle when there was a request.
Cycle 4: req_i = [0 0 0 0 0 0 0 0] - No grant.
Cycle 5: req_i = [1 1 1 1 1 1 1 1] - Requestor 2 gets granted.
...and so on.
The sequence of granted requestors will be: 0, 1, 2, 3, 4, 5, 6, 7, and then it will loop back to 0.

This pattern is a result of two combined behaviors:

The arbiter's built-in round-robin prioritization.
The synchronous toggling of all request signals.
Since requests are off every alternate cycle, each requestor gets a turn when its respective request signal is on.
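Both predicted sequences can be cross-checked against a minimal behavioral model of a masked round-robin arbiter. The sketch below approximates the fairness policy described in the responses (rotate priority past the most recent winner); it is a simplification for illustration, not the prim_arbiter_tree RTL.

```python
def round_robin_arbiter(req_seq, n=8):
    """Behavioral sketch of a round-robin arbiter: each cycle, grant the
    first asserted request starting from a rotating priority pointer, then
    rotate the pointer past the winner so it is considered last next time."""
    ptr = 0       # requestor index with highest priority next cycle
    grants = []   # granted index per cycle, or None when no request
    for req in req_seq:
        gnt = None
        for k in range(n):
            i = (ptr + k) % n
            if req[i]:
                gnt = i
                ptr = (i + 1) % n   # rotate priority past the winner
                break
        grants.append(gnt)
    return grants

# Test 1: all requests constantly asserted -> simple rotation 0..7, 0..7, ...
all_on = [[1] * 8] * 16
print(round_robin_arbiter(all_on))
# [0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7]

# Test 2: all requests toggling together -> a grant every other cycle,
# still advancing through the requestors in order.
toggling = [[1] * 8 if t % 2 == 0 else [0] * 8 for t in range(10)]
print(round_robin_arbiter(toggling))
# [0, None, 1, None, 2, None, 3, None, 4, None]
```

Both outputs match the sequences GPT-4 predicted above, which is exactly the kind of empirical cross-check the article later argues is indispensable.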
Imagine AI-driven Hardware Simulations

Picture a scenario where AI helps you query your hardware design without needing to set up a full-fledged simulation testbench, allowing you to sanity-test your DUT and conduct design exercises (understanding the high-level behavior of a design) without ever firing up your EDA simulator. Preliminary results are promising, and it's evident that with further advancements, AI will solidify its position in the hardware simulation toolkit.

Challenges and Forward Momentum

Navigating this pioneering terrain isn’t devoid of challenges. While AI offers impressive predictions, these must be meticulously validated against actual hardware simulations to ensure accuracy and reliability. Depending solely on AI without empirical verification can be perilous. Additionally, while AI is adept at recognizing standard patterns, nuanced corner cases might demand traditional evaluation methods.

In Closing

The integration of AI into hardware design simulation is an exciting frontier, promising a blend of innovation and efficiency. As we continue to traverse this path, the symbiotic relationship between AI and hardware design simulation hints at a future filled with unprecedented possibilities. The next chapter in hardware design is unfolding, and it’s an exhilarating era for all stakeholders involved!

Also Read:

Generative AI for Silicon Design – Article 2 (Debug My Waveform)

Generative AI for Silicon Design – Article 1 (Code My FSM)


Uniquely Understanding Challenges of Chip Design and Verification

Uniquely Understanding Challenges of Chip Design and Verification
by Daniel Nenni on 11-14-2023 at 6:00 am

Jean Marie Brunet (1)

Jean-Marie Brunet is Vice President and General Manager of Siemens Hardware-Assisted Verification. He and I spoke recently about how different his hardware group is from the rest of the software-centric EDA product space and why a hardware-oriented EDA vendor like Siemens fully understands the challenges of the chip design and verification community.

Siemens EDA is a hardware-assisted verification provider with a unique vantage point for an EDA vendor. Explain why.
Yes, we have a unique vantage point because we happen to practice the same thing as our customers. We design and verify our own chips just like they do and obviously practice what we are offering. We are exposed to the same problems that our customers are exposed to in terms of challenges and how big chips get verified.

Every new generation of our chip gets verified by the current emulator generation, giving us insight and understanding into our challenges and those of our customers. I believe it gives us far more credibility when we speak to customers, as well as credibility in understanding industry trends. Since we design complex devices, we understand the verification trends because we’re going through them ourselves. It’s an important consideration for companies evaluating their hardware-assisted verification (HAV) providers.

Some other HAV vendors share this perspective because they design and verify their own chips, but not all of them do.

Hardware-assisted verification is a different market segment than typical EDA and a difficult one for startups.
The difference between doing a startup in the hardware domain versus an EDA software startup is fundamental. The cost structure on the software side is mainly headcount, and that can scale. A hardware startup means a CapEx investment to create the hardware, and it needs to raise far more money to get started. The entry point of investment is much higher than software.

It's a difficult market for startups, which must be able to serve customers efficiently and completely, and where heavy investment and scaling are needed. I know how much it costs to design hardware; it's difficult and expensive, and the investment is not peanuts. Clearly, it's a market for large companies, because customers' challenges require a complete set of offerings, not one small piece.

How do you define the hardware-assisted verification market?
In general, anyone looking at the combined hardware-assisted verification market that is FPGA based and custom ASIC based doesn’t make the distinction and assumes it’s a lump sum market.

One segment of the market is driven by FPGA devices. The recent AMD launch, in which we actively participated, defines an envelope of what can be done with a hardware architecture that integrates FPGAs. The FPGA space is well defined and serves a market in need of performance and the ability to run faster, but with less debug capability than the platforms in the other segment.

The other market is more about debuggability and how the hardware scales. If an engineering group is at the stage of what we call emulation, it has a well-defined envelope and market space serving the SoC debug stage. It is far more efficient at SoC debug and tailored for fast compilation, debug and scaling.

By separating the tools and the markets, it’s easy to identify the use models and match the type of hardware and architecture. The overall market is growing tremendously because of trends that are not slowing down at all. They are accelerating.

All the market analysts are predicting continued growth. We see demand for HAV continuing for at least the next five years. This is an important segment, and it now spans two domains: hardware based on an FPGA chip from a third party, or hardware based on a custom chip designed in house.

What trends are you seeing?
The HAV market is undergoing dynamic changes, and it's important to understand those trends to predict the next five years and keep up with what's happening with Moore's Law and More than Moore. It's also important to keep up with market demand and the cost increases of producing devices. One of the key trends we see is ML and AI content in many devices. This drives a need for software shift-left and scalable capacity.

Jean-Marie Brunet
Jean-Marie Brunet is the Vice President and General Manager of the Hardware-Assisted Verification at Siemens EDA. He has served for over 25 years in management roles in marketing, application engineering, product management and product engineering roles in the EDA industry and has held IC design and design management positions at STMicroelectronics, Cadence, and Micron among others. Jean-Marie holds a master’s degree in electrical engineering from I.S.E.N Electronic Engineering School in Lille, France.

Also Read:

Make Your RISC-V Product a Fruitful Endeavor

Ensuring 3D IC Semiconductor Reliability: Challenges and Solutions for Successful Integration

The Path to Chiplet Architecture


New STA Features from Cadence

New STA Features from Cadence
by Daniel Payne on 11-13-2023 at 10:00 am

Tempus DRA Suite

Static Timing Analysis (STA) has been an EDA tool category for many years now, yet with each new generation of smaller foundry process nodes come new physical effects that impact timing, requiring new analysis features to be added. For advanced process nodes, there are five different types of analysis that must be included when using a STA tool. Brandon Bautz, Senior Product Management Group Director, Digital & Signoff Group at Cadence walked me through some of these new analysis challenges, and their new Tempus DRA (Design Robustness Analysis) Suite:

Tempus DRA Suite consists of advanced analysis capabilities delivering best-in-class PPA

Aging Robustness

Each of these five issues causes variation in the chip timing results, which can lead to intermittent failures in ICs, so a timing signoff tool must bound the distribution of variation. The first challenge listed is aging robustness, which deals with the transistor Vt and IV curve changing over time based on how often each transistor is toggled, how great the voltage swing is, and at what temperature the transistor operates. With older process nodes you could make the simplifying assumption that all transistors aged at the same rate; however, with new process nodes transistor aging is non-uniform across the chip, so it now has to be taken into account.
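Aging effects of this kind are often approximated with power-law Vt-shift models that grow with activity, voltage and temperature. The sketch below is purely illustrative; every coefficient and exponent is made up for demonstration and has nothing to do with the Liberate/Tempus models or any foundry data.

```python
import math

def vt_shift_mv(activity, vdd, temp_k, years,
                a=2.0, gamma=2.5, ea_over_k=0.05, n=0.2):
    """Toy NBTI-style aging model: delta-Vt grows with stress (toggle
    activity, supply voltage, temperature) and follows a power law in
    time. All constants here are invented for illustration."""
    stress = activity * (vdd ** gamma) * math.exp(-ea_over_k * (300.0 / temp_k))
    return a * stress * (years ** n)

# Two transistors on the same die age very differently: a hot, busy
# clock buffer versus a cool, rarely toggled datapath cell.
busy = vt_shift_mv(activity=0.9, vdd=0.8, temp_k=370, years=5)
idle = vt_shift_mv(activity=0.05, vdd=0.8, temp_k=320, years=5)
print(busy, idle)  # the busy device degrades an order of magnitude more
```

The point of the example is only the non-uniformity: a single chip-wide aging derate cannot capture a spread this large, which is why per-instance stimulus from logic simulation matters.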

The Tempus tool for STA gets the aging information it needs from the Liberate tool used during standard cell characterization runs. So, Tempus knows how to handle the non-uniform aging challenge, providing the most accurate timing numbers based on stimulus from logic simulation plus a calendar age.

Tempus and Liberate – Transistor Aging

Voltage Robustness

As currents flow through VDD and VSS interconnect, the parasitic resistances create a voltage drop, and that in turn impacts timing through the cells. The Tempus PI (Power Integrity) tool uses the IR drop values to produce a more accurate STA number. Using Voltus (EM/IR drop) and Tempus together helps designers locate timing-critical layout areas and reduce the impact of voltage variation.
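The sensitivity of delay to supply droop can be illustrated with the classic alpha-power-law delay model. This is a textbook approximation with invented constants, not the Voltus/Tempus computation:

```python
def gate_delay(vdd, vt=0.3, alpha=1.3, k=1.0):
    """Alpha-power-law delay model: delay ~ Vdd / (Vdd - Vt)**alpha.
    vt, alpha and k are illustrative constants, not foundry values."""
    return k * vdd / (vdd - vt) ** alpha

nominal = gate_delay(0.80)
drooped = gate_delay(0.74)  # 60 mV of IR drop on the local rail
print(f"delay increase: {100 * (drooped / nominal - 1):.1f}%")
```

Even this crude model shows why a few tens of millivolts of local IR drop can eat a meaningful fraction of a path's timing margin, and why STA needs per-instance voltage data rather than one global supply value.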

Timing Robustness

Process variations impact timing values, so the idea is to analyze the full chip to find areas of high process variation and then determine how that variation affects the timing of cells along a critical path. Knowing how much timing variation is caused by process variation is another metric, like signoff slack, that indicates how robust the timing is.

Silicon Prediction

Foundries release Process Design Kits (PDK) and update the versions as they tune the silicon for that node over its useful lifetime. IC designers want to know how much the performance of their chip will change as PDK matures from version 0.5 to 0.7 to 1.0. The Liberate tool captures variation between the old and new PDKs without requiring a full library re-characterization. The designer may then use Tempus to analyze the timing impact of the new PDK on the design.

VT Skew Robustness

Current STA approaches can be pessimistic in applying VT skew, so a new approach uses skew to perform analysis at one corner for each Vt class, applying derates to move delays toward the other corners. For example, the TT corner can be analyzed with slow and fast derates to move delays toward the SSG and FFG corners.

Vt skew robustness

Summary

As process nodes become smaller, more accuracy in modeling is required, so Cadence has updated its Tempus STA tool to account for aging robustness, voltage robustness, timing robustness, silicon prediction and VT skew robustness. This improved modeling and analysis means more accurate timing compared to silicon, resulting in fewer silicon spins.

Read more about the Tempus DRA Suite.

Related Blogs


Successful 3DIC design requires an integrated approach

Successful 3DIC design requires an integrated approach
by Kalar Rajendiran on 11-13-2023 at 6:00 am

Siemens EDA 3DIC Graphics

While the leap from traditional SoC/IC designs to Three-Dimensional Integrated Circuits (3DICs) designs brings new benefits and opportunities, it also introduces new challenges. The benefits include performance, power efficiency, footprint reduction and cost savings. The challenges span design, verification, thermal management, mechanical stress, electrical compliance of the entire 3DIC assembly and reliability. An earlier post on SemiWiki discussed the challenges relating to semiconductor reliability and solutions to address those challenges.

Siemens EDA has recently published a whitepaper that covers the challenges and emerging solutions relating to design, integration, verification, thermal, mechanical and electrical aspects of 3DICs.

The Challenges

Chiplets introduce complexities in the integration process, particularly when utilizing silicon interposers for vertical chiplet stacking. This approach poses hurdles for automated routing tools which are designed with the traditional two-dimensional (2D) perspective in mind. Overlapping geometries on the same layer in different chiplets cannot be treated as a single shape in 3DIC design. Layer mapping, a crucial aspect of the process design kit (PDK), becomes tricky when incorporating chiplets with diverse process technologies into a single 3DIC design. This creates challenges for foundries and outsourced assembly and test (OSAT) companies in understanding the specific layers and associated rules for each chiplet.

The vertical and horizontal stacking of chiplets in 3DICs can lead to increased heat generation. Heat dissipation strategies must be carefully considered to ensure that the heat generated by the components is efficiently distributed and dissipated to prevent hotspots and thermal stress. Additionally, as 3DICs comprise various heterogeneous chiplets, each with its own thermal characteristics, designing a thermal management solution that accommodates these variations becomes complex. Ensuring that the entire 3DIC assembly remains within safe temperature limits while maintaining optimal performance is a significant challenge that designers must address.

Issues related to mechanical stress from components such as ball-grid arrays (BGAs), through-silicon vias (TSVs), and stacked dies need to be captured at the 3DIC assembly level. All of these can negatively impact the electrical behavior of active devices and metal interconnects. Designers must address these challenges by ensuring mechanical stability and minimizing stress-induced performance issues. The combination of thermal and mechanical stress analyses is essential for producing accurate results and enabling effective circuit simulation.

The integration of chiplets with different process technologies and functions necessitates a meticulous consideration of electrical compatibility. Achieving full electrical compliance at both the chip-specific level and down to the device level is essential to ensure proper functionality and reliability. Moreover, the unique vertical stacking of components can introduce challenges related to electrical interference and signal integrity, demanding the implementation of effective shielding and routing strategies. Additionally, as 3DICs involve a complex interplay of diverse chiplets, each with distinct electrical characteristics, it is crucial to manage cross-chiplet coupling and parasitic extraction for protection against electrostatic discharge and latch-up issues.

In essence, early thermal and mechanical stress analysis as well as EM/IR and signal integrity analysis need to be performed. Ensuring that the entire 3DIC assembly operates within specified electrical parameters while avoiding interference and maintaining reliable performance is a central challenge in the electrical domain of 3DIC design.

Solutions from Siemens EDA

All of the above challenges underscore the need for innovative solutions and specialized tools to effectively manage the complexities and ensure successful implementation. Siemens EDA is meeting the challenges of true 3DIC design with solutions such as its Calibre® 3DSTACK tool, Xpedition™ Substrate Integrator (xSI), mPower™ toolsuite, SimCenter™ Flotherm™, and Calibre nmLVS™ Query Server technology.

The 3DSTACK tool addresses design rule checking (DRC) and layout vs. schematic (LVS) verification in the context of heterogeneous 3DIC assemblies. It starts by capturing design-specific information, including layer mapping and chiplet placement. This data can be entered manually or automated through Siemens’ Xpedition™ Substrate Integrator (xSI) tool. xSI enables designers to plan and prototype integration approaches. The Calibre 3DSTACK tool automates horizontal and vertical design integration, ensuring layer uniqueness and addressing connectivity challenges. Furthermore, it offers support for cross-die DRC checks and parasitic extraction.

With the mPower toolsuite, designers can assess the 3DIC power domain across all components. SimCenter Flotherm helps generate accurate chiplet-level thermal models for static and dynamic extraction of the full 3DIC assembly. The 3DIC assembly-level temperature details can be applied back to the chiplet, sub-chip IP or even transistor levels.

All in all, this level of automation and integration enables iterative analysis, reducing the risk of performance and reliability issues in the final design.

Summary

3DICs offer improved performance and power efficiency but also introduce various challenges discussed above. Siemens EDA provides a comprehensive solution that bridges planning, layout automation, 3D physical verification, power analysis, thermal analysis, mechanical analysis, and EM/IR analysis. This solution enables early issue identification and thorough verification of both physical and electrical aspects, ensuring 3DICs deliver all the promised benefits.

You can download Siemens EDA’s entire whitepaper here.

Also Read:

Make Your RISC-V Product a Fruitful Endeavor

Ensuring 3D IC Semiconductor Reliability: Challenges and Solutions for Successful Integration

The Path to Chiplet Architecture


Podcast EP193: A Look at the Engineering Tracks for DAC 2024 with Frank Schirrmeister

Podcast EP193: A Look at the Engineering Tracks for DAC 2024 with Frank Schirrmeister
by Daniel Nenni on 11-10-2023 at 10:00 am

Dan is joined by Frank Schirrmeister. Frank is vice president of solutions and business development at Arteris. He leads activities for industry verticals, including automotive and enterprise computing and technology horizontals like artificial intelligence, machine learning, and safety. For DAC 2024, Frank is the vice chair for the engineering tracks.

Frank explores the topics and focus for the engineering tracks at DAC 2024 with Dan. In contrast to the R&D tracks, which present research that will impact product development in the future, the engineering tracks focus on work that can impact current product development: essentially, a practitioner-to-practitioner discussion. Frank explains there are four areas of focus for the engineering tracks: front-end design, back-end design, IP, and embedded systems and software.

Frank reviews the high-impact results from last year’s engineering tracks, many of which focused on generative AI and chiplets. He expects these topics to continue to be center stage at DAC 2024.

Submissions for the engineering track are open now, with a due date of January 16, 2024. You can learn more about submitting work to the engineering tracks here. The 61st DAC will be held at Moscone Center in San Francisco from June 23 – 27, 2024.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.