
Podcast EP247: How Model N Helps to Navigate the Complexities of the Worldwide Semiconductor Supply Chain
by Daniel Nenni on 09-13-2024 at 10:00 am

Dan is joined by Gloria Kee, Vice President of Product Management at Model N. During her 15 years at Model N, she has focused on product management, building an in-depth understanding of designing and implementing innovative software across a variety of business challenges. She is committed to product innovation and development within the high-tech industry.

In this far-reaching discussion, Dan explores the geopolitical forces at play in the semiconductor industry with Gloria. Global supply chain dynamics, including the evolving relationship with China, are discussed. Gloria comments on national security considerations, workforce development, the role of international collaboration, and the importance of sustainability. How technologies such as cloud and AI fit in is also reviewed.

Gloria explains the broad charter for Model N to support the complex business needs of the world’s leading brands in pharmaceutical, medical device, high tech, manufacturing and semiconductors across more than 120 countries.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Executive Interview: Michael Wu, GM and President of Phison US
by Daniel Nenni on 09-13-2024 at 6:00 am

Michael Wu is the GM and President of Phison US. Michael is an acclaimed technology expert in the NAND storage sector with over 17 years of industry experience. Over the course of his 14+ years at Phison, he has held positions such as GM, Director of Global Customer Relations, and Project Manager, helping position Phison as the world’s leading NAND storage solution company. Michael manages and oversees the company’s USA operation to drive sales growth and profitability. During his tenure at Phison, he previously coordinated customer activities and support for the North America, South America, and European regions, provided technical support for design-in projects, and managed product life cycles. Previously, Michael was a verification engineer at RF Micro Devices. He holds a Master of Science and a Bachelor of Science in Electrical Engineering from Virginia Tech.

Tell us about your company?

Phison is a global leader in NAND controllers and storage solutions that delivers industry-first solutions built for the modern era of AI workloads and enterprise computing.

Today, Phison is one of the best-kept secrets in the global tech sector: it has been the largest SSD supplier for over 20 years, yet many are unaware of it because of its white-label business model origins.

For over two decades, Phison’s focus has been on offering customized design solutions and delivering turnkey SSDs on behalf of major brands to address today’s enterprise storage challenges, particularly with the onslaught of AI applications. In May, with the launch of its PASCARI brand, Phison came out of stealth mode and showed its commitment to foster rapid advancements in enterprise innovation and keep up with the evolving needs of the enterprise storage market. Then, by combining the power of PASCARI with Phison’s proprietary software, Phison launched the aiDAPTIV+ Pro Suite, designed for optimal memory and storage usage to meet the demands of SMBs looking to leverage GenAI. The debut of these products demonstrates that Phison is on a mission to democratize AI so that enterprises of ALL sizes can participate.

What problems are you solving?

With aiDAPTIV+, Phison is removing the barriers to entry that keep small and medium-sized businesses from accessing the power of GenAI. While the hype cycle for GenAI has been robust, it is at an inflection point, with financial market wobbles, because many companies are not yet investing due to concerns about ROI. CIOs and decision makers will not buy in until it becomes accessible, affordable, and profitable. Realizing GenAI’s true potential and ushering in the next industrial revolution also requires that companies of all sizes can participate. This remains a hurdle, as generative AI requires massive computing power that is mostly inaccessible to small and mid-size companies. If these companies want to run their own AI projects, they typically must do so through a cloud service provider, meaning the company suddenly loses a portion of control over its data.

At Phison, we have developed the aiDAPTIV+ Pro Suite as a solution to enable any organization to run generative AI projects and large language models (LLMs) from regular on-premises workstations. Our solution is an AI-specialized aiDAPTIV+ SSD with proprietary software and tools, along with an ecosystem of support through our trusted partners.
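To put numbers on why GenAI has been out of reach for smaller organizations, a rough back-of-envelope estimate helps. The sketch below is purely illustrative (it is not Phison’s sizing methodology) and assumes full fine-tuning with FP16 weights and gradients plus FP32 Adam optimizer state:

```python
# Rough, illustrative fine-tuning memory estimate (not Phison's sizing method).
# Assumes FP16 weights and gradients (2 bytes each) plus FP32 Adam optimizer
# state (4-byte master weights and two 4-byte moments), i.e. 16 bytes/parameter.
def finetune_memory_gb(params_billion):
    bytes_per_param = 2 + 2 + 4 + 4 + 4
    return params_billion * bytes_per_param  # 1e9 params x bytes = gigabytes

print(finetune_memory_gb(7))    # ~112 GB: already beyond a single GPU's memory
print(finetune_memory_gb(70))   # ~1120 GB: roughly 1.1 TB for a 70B-class model
```

Numbers of that magnitude are why treating flash storage as an extension of memory changes the economics of on-premises fine-tuning.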

With this aiDAPTIV+ SSD solution, companies can develop their own AI projects and retain complete control over their data on-premises. The “entry fee” to generative AI is no longer exclusive to large enterprises.

What application areas are your strongest?

Prior to the release of the aiDAPTIV+ Pro Suite, small and medium-sized businesses had limited technology options that offered only small, imprecise training runs and could not scale beyond 7B-parameter model training. Phison’s aiDAPTIV+ solution enables significantly larger model fine-tuning, allowing users to run workloads previously reserved for data centers. For the first time in the U.S., the aiDAPTIV+ Pro Suite was demonstrated as an end-to-end on-premises generative AI solution at FMS 2024 and earned “Best of Show, Most Innovative AI Application”. Now even users with limited resources are empowered to train large language models. This AI appliance solution allows system integrators to build turnkey large language model training systems from start to successful finish. With a simple user experience that turns raw data into tokenized data, it runs the fine-tuning process with Llama-3 70B precision and offers the ability to ask questions about the data, all from a local domain on premises. Data can be fine-tuned in about four hours, which is sufficient for most business applications.

At a game-changing price point of about $40K, aiDAPTIV+ removes the barriers to entry for small and medium-sized businesses: organizations fully own their data and can fine-tune it with a turnkey solution, no additional IT or engineering staff is required to run it, and security risks and expenses are minimized by keeping data on premises. Most importantly, trained data delivers immediate business value.

In less than a year, the market response has been remarkable, with over 100 enterprises using it in a variety of use cases. The solution has addressed universal pain points such as onboarding new employees, meeting professional development needs, keeping up with coding demands, and automating tasks to cope with huge data volumes.

What keeps your customers up at night?

IT managers who have to keep up with evolving technology demands with less budget and staff worry about security, loss of data, failures of legacy systems, unpredictable cloud service bills, vendor lock-in from cloud providers, and the fear of missing out on promising technologies like GenAI.

What does the competitive landscape look like and how do you differentiate?

Prior to the launch of Phison’s aiDAPTIV+, only large enterprises with deep pockets could access and afford GenAI. We were the first to use storage as memory to lower the price point from $1 million-plus for equipment, plus associated fees for cloud services, to $40K on premises. It takes about four hours to train LLMs, which is acceptable for most businesses.

Now even users with limited resources are empowered to train large language models. Our end-to-end AI appliance solution has been recognized for allowing system integrators to build turnkey large language model training systems from start to successful finish. With a simple user experience that turns raw data into tokenized data, aiDAPTIV+ runs the fine-tuning process with Llama-3 70B precision and offers the ability to ask questions about the data, all from a local domain on premises.

What new features/technology are you working on?

Phison continues to invest in R&D and engage with customers to understand their business challenges and best align its roadmap with modern customer requirements. As the aiDAPTIV+ and Pascari brands grow, so will the robust set of features to meet current and future industry needs.

How do customers normally engage with your company?

Previously, customers relied solely on channel partners and system integrators to purchase Phison products. One in four SSDs in use today is built on our white-label SSD products and NAND controllers. With Pascari and aiDAPTIV+ we bring the technology to the end user ourselves and with select partners.

Also Read:

CEO Interview: Wendy Chen of MSquare Technology

CEO Interview: BRAM DE MUER of ICsense

CEO Interview: Anders Storm of Sivers Semiconductors


Ansys and eShard Sign Agreement to Deliver Comprehensive Hardware Security Solution for Semiconductor Products
by Marc Swinnen on 09-12-2024 at 10:00 am

Integrated circuits, or chips, lie at the heart of today’s electronic systems that are mission critical for almost every sector of the economy – from healthcare to banking, military equipment, cars, planes, telecommunications, and the internet itself. The data flowing through these systems is the lifeblood of modern life and we go to great lengths to protect it from unauthorized access.

We are all familiar with security passwords, PIN codes, and two-factor authentication that aim to secure electronic systems against viruses and software vulnerabilities. But there is also a completely different category of hacking vulnerabilities rooted in hardware, not software. This area of hardware security seeks to prevent unauthorized leakage of critical information through so-called “side-channel attacks”. A side-channel can be any physical phenomenon that can be exploited to reveal the internal workings of a chip. Some of the most common side-channels are power noise, electromagnetic radiation, and heat.

Power noise refers to the changes in power supply current drawn by a chip as it executes different instructions. By monitoring how much the supply current goes up and down it is possible to reveal a cryptographic key embedded in the chip. Similarly, an electromagnetic probe hovering a few millimeters above the surface of a chip can detect the switching activity of internal signals. Thermal side-channel attacks monitor the amount and location of heat produced by the chip. These are all examples of hardware security vulnerabilities that can be used to reveal secure data and cryptographic keys. A side-channel attack leaves no trace that the data has been compromised and may not even require any physical contact with the chip. It also cannot be defended against with traditional software security techniques.
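To make the power-noise idea concrete, here is a deliberately simplified sketch of the statistics behind such an attack. It is a toy illustration only, not eShard’s or Ansys’s methodology, and it assumes a textbook first-order leakage model in which the measured power tracks the Hamming weight of the key-mixed data:

```python
import numpy as np

rng = np.random.default_rng(0)
SECRET_KEY_BYTE = 0x3C
hw = np.array([bin(x).count("1") for x in range(256)])  # Hamming-weight lookup table

# "Measured" power traces: leakage of the key-mixing step plus Gaussian noise.
plaintexts = rng.integers(0, 256, size=2000)
traces = hw[plaintexts ^ SECRET_KEY_BYTE] + rng.normal(0.0, 1.0, size=2000)

# The attack: correlate the traces against the predicted leakage of every guess.
def correlation(key_guess):
    predicted = hw[plaintexts ^ key_guess]
    return abs(np.corrcoef(predicted, traces)[0, 1])

best_guess = max(range(256), key=correlation)
print(hex(best_guess))  # should print 0x3c: the key leaks through power alone
```

Even with noise added, the correct guess stands out because only it predicts the measurements, which is exactly the kind of weakness that side-channel analysis tries to quantify and eliminate.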

Examples of probing integrated circuits for electromagnetic side-channel emissions

Much of our modern infrastructure relies on cryptography to secure data storage and communication. The internet has moved to secure “https://” website addresses; credit card and online banking transactions are secured with encryption in silicon; healthcare data is kept confidential with encryption; and military units rely on encryption to communicate on the battlefield. Hardware security vulnerabilities have already been exploited in the real world with examples ranging from the cloning of car fobs to identified microprocessor security vulnerabilities called ‘Meltdown’, ‘Spectre’, and ‘Platypus’. Data security has become a pervasive concern for many leading companies and semiconductor designers are eager to strengthen their chips against hardware vulnerabilities.

Traditionally, side-channel vulnerabilities have been measured and evaluated in a lab by examining the actual device. While accurate and reliable, lab testing is expensive and slow and – most importantly – it is closing the barn door after the horse has bolted. There is no way to fix any detected vulnerability! Often millions of dollars have already been spent on creating photomasks and the chip has been manufactured. Any fixes will require an expensive redesign, more money for new photomasks, and months of extra delay. It would be better to be able to simulate and predict side-channel vulnerabilities at the design stage before manufacturing. Simulation is faster, cheaper, and more flexible than actual physical measurements. It requires less operator expertise, and it can point the way to improving and safeguarding the design before any money is spent on manufacturing.

Ansys and eShard Technologies are both leading experts in hardware security and have announced an agreement to collaborate on delivering a comprehensive solution that includes both pre-silicon and post-silicon security verification. eShard is a leading provider of chip security testing with the esDynamic™ testing platform for side-channel testing, fault injection, and failure analysis. With dozens of customers, including major semiconductor companies, eShard is able to deploy proven analyses for many advanced cryptographic algorithms, including AES, RSA, ECC, and HMAC. esDynamic has algorithms to efficiently evaluate the security of these standards in physical hardware and generate relevant metrics for the strength of the security implementation.

eShard’s agreement with Ansys allows Ansys RedHawk-SC Security™ to perform the same extensive suite of cryptographic analyses at the pre-silicon design stage and flag potential areas of weakness. RedHawk-SC Security is built on the foundry-certified Ansys RedHawk-SC™ golden signoff tool for power integrity analysis of digital chips. RedHawk-SC Security simulates pre-silicon layouts with Ansys’s industry-leading physics simulation engines for electromagnetic and power noise analysis. These pre-silicon predictions cover all anticipated usage modes by combining user-generated activity vectors, automatic ‘vectorless’ activity, and real-world activity based on actual software execution. RedHawk-SC Security ties into esDynamic’s platform for management of the entire security verification workflow.

Building a best-in-class pre-silicon to post-silicon security testing solution

The collaboration brings together eShard’s expertise in security with Ansys’s foundry-certified expertise in physical simulation to deliver a uniquely broad and capable hardware security solution. It offers the joint advantages of pre-silicon simulation for fast, cost-effective problem avoidance, and post-silicon verification for the most reliable accuracy. Customers can now deploy an integrated solution platform that gives regular chip designers at all stages in the design flow – from RTL to layout – the expertise to verify a comprehensive suite of security protocols. This easy-to-use workflow delivers proven levels of hardware security at every stage of semiconductor product development.

Marc Swinnen, Director of Product Marketing – Semiconductors, Ansys

Also Read:

Ansys and NVIDIA Collaboration Will Be On Display at DAC 2024

Don’t Settle for Less Than Optimal – Get the Perfect Inductor Every Time

Simulation World 2024 Virtual Event


Gazzillion Misses – Making the Memory Wall Irrelevant
by admin on 09-12-2024 at 6:00 am

Memory Hierarchy and the Memory Wall

Computer programs mainly move data around. Along the way, they do some computations on the data, but the bulk of execution time and energy is spent moving data around. In computer jargon we say that applications tend to be memory bound: this means that memory is the main performance-limiting factor. A plethora of popular applications are memory bound, in fields such as artificial intelligence, machine learning, and scientific computing.

By memory we mean any physical system able to store and retrieve data. In a digital computer, memories are built out of electrical parts, such as transistors or capacitors. Ideally, programmers would like the memory to be fast and large, i.e. they demand quick access to a huge amount of data. Unfortunately, these are conflicting goals. For physical reasons, larger memories are slower and, hence, we cannot provide a single memory device that is both fast and large. The solution that computer architects found to this problem is the memory hierarchy, illustrated in the next figure.

The memory hierarchy is based on the principle of locality, which states that data accessed recently are very likely to be accessed again in the near future. Modern processors leverage this principle of locality by storing recently accessed data in a small and fast cache. Memory requests that find the data in the cache can be served at the fastest speed; these accesses are called cache hits. However, if the data is not found in the cache we have to access the next level of the memory hierarchy, the Main Memory, largely increasing the latency for serving the request. These accesses are called cache misses. By combining different memory devices in a hierarchical manner, the system gives the impression of a memory that is as fast as the fastest level (Cache) and as large as the largest level (Hard Drive).
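The benefit of the hierarchy is captured by the standard average memory access time (AMAT) formula. The latencies below are illustrative assumptions, not figures for any particular processor:

```python
# Average memory access time (AMAT) with illustrative latencies.
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate * miss penalty (all in CPU cycles)."""
    return hit_time + miss_rate * miss_penalty

print(amat(4, 0.02, 300))  #  2% misses -> 10 cycles on average
print(amat(4, 0.20, 300))  # 20% misses -> 64 cycles on average
```

Even a modest miss rate quickly drags the average access time toward the main memory latency, which is why what happens on a miss matters so much.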

Cache misses are one of the key performance limiting factors for memory bound applications. In the last decades, processor speed has increased at a much faster pace than memory speed, creating the problem known as the memory wall. Due to this disparity between processor speed and memory speed, serving a cache miss may take tens or even hundreds of CPU cycles, and this gap keeps increasing.

In a classical cache, whenever a cache miss occurs, the processor will stall until the miss is serviced by the memory. This type of cache is called a blocking cache, as the processor execution is blocked until the cache miss is resolved, i.e. the cache cannot continue processing requests in the presence of a cache miss. In order to improve performance, more sophisticated caches have been developed.

Non-Blocking Caches

In case of a cache miss, there may be subsequent (younger) requests whose data are available in the cache. If we could allow the cache to serve these hits while the miss is being serviced, the processor could continue doing useful work instead of sitting idle. This is the idea of non-blocking caches [1][2], a.k.a. lockup-free caches. Non-blocking caches allow the processor to continue doing useful work even in the presence of a cache miss.

Modern processors use non-blocking caches that can tolerate a relatively small number of cache misses, typically around 16-20. This means that the processor can continue working until it reaches 20 cache misses and then it will stop, waiting for the misses to be serviced. Although this is a significant improvement over blocking caches, it can still result in large idle times for memory intensive applications.

Gazzillion Misses

Our Gazzillion Misses™ technology takes the idea of non-blocking caches to the extreme by allowing up to 128 in-flight cache misses per core. By supporting such a large number of outstanding misses, our Avispado and Atrevido cores can avoid idle time waiting for main memory to deliver the data. Furthermore, we can tailor the aggressiveness of Gazzillion to fulfill customers’ design targets, providing an efficient area-performance trade-off for each memory system.

There are multiple reasons why Gazzillion Misses results in significant performance improvements:

Solving the Memory Wall

Serving a cache miss is expensive. Main memories are located off chip, on dedicated memory circuits based on DDR [3] or HBM [4] technology and, hence, just doing a round-trip to memory takes a non-negligible amount of time. This is especially concerning with the advent of CXL.mem [5], which locates main memory even further away from the CPU. In addition, accessing a memory chip also takes a significant amount of time. Due to the memory wall problem, accessing main memory takes a large number of CPU cycles and, therefore, a CPU can quickly become idle if it stops processing requests after a few cache misses. Gazzillion Misses has been designed to solve this issue, largely improving the capability of Avispado and Atrevido cores to tolerate main memory latency.

Effectively Using Memory Bandwidth

Main memory technologies provide high bandwidth, but they require a large number of outstanding requests to maximize bandwidth usage. Main memory is split into multiple channels, ranks, and banks, and it requires a large number of parallel accesses to effectively exploit its bandwidth. Gazzillion Misses is able to generate a large number of parallel accesses from a small number of cores, effectively exploiting main memory bandwidth.
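Little’s law makes the link between outstanding misses, latency, and bandwidth concrete. The numbers below are illustrative assumptions, not Semidynamics specifications:

```python
# Little's law: requests in flight = throughput x latency.
def outstanding_misses_needed(bandwidth_gb_s, latency_ns, line_bytes=64):
    lines_per_ns = bandwidth_gb_s / line_bytes  # 1 GB/s = 1 byte/ns, so lines per ns
    return lines_per_ns * latency_ns

# 100 GB/s of DRAM bandwidth, 120 ns average miss latency, 64-byte cache lines:
print(round(outstanding_misses_needed(100, 120)))  # ~188 misses must be in flight
# A core limited to ~20 outstanding misses can only use a fraction of that
# bandwidth; supporting up to 128 misses per core closes most of the gap.
```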

A Perfect Fit for Vectorized Applications

Vectorized codes put high pressure on the memory system. Scatter/gather operations, such as indexed vector load/store instructions, can generate a large number of cache misses from just a few vector instructions. Hence, tolerating a large number of misses is key to delivering high performance in vectorized applications. A paradigmatic example of such applications is sparse (i.e., pruned) deep neural networks [7], which are well known for exhibiting irregular memory access patterns that result in a large number of cache misses. Gazzillion Misses is a perfect solution for such applications.

What does this have to do with Leonardo Da Vinci?

To better illustrate Gazzillion Misses, we would like to borrow an analogy from the classical textbook “Computer Organization and Design” [8]. Suppose you want to write an essay about Leonardo Da Vinci and, for some reason, you do not want to use the Internet, Wikipedia or just tell ChatGPT to write the essay for you. You want to do your research the old-fashioned way, by going to a library, either because you feel nostalgic or because you enjoy the touch and smell of books. You arrive at the library and pull out some books about Leonardo Da Vinci, then you sit at a desk with your selected books. The desk is your cache: it gives you quick access to a few books. It cannot store all the books in the library, but since you are focusing on Da Vinci, there is a good chance that you will find the information that you need in the books in front of you. This capability to store several books on the desk close at hand saves you a lot of time, as you do not have to constantly go back and forth to the shelves to return a book and take another one. This is similar to having a cache inside the processor that contains a subset of the data.

After spending some time reading the books in front of you and writing your essay, you decide to include a few words on Da Vinci’s study of the human body. However, none of the books on your desk mention Da Vinci’s contributions to our understanding of human anatomy. In other words, the data you are looking for is not on your desk, so you just had a cache miss. Now you have to go back to the shelves and start looking for a book that contains the information that you want. During this time, you are not making any progress on your essay, you are just wandering around. This is what we call idle time, you stop working on your essay until you locate the book that you need.

You can be more efficient by leveraging the idea of non-blocking caches. Let’s assume that you have a friend that can locate and bring the book while you continue working on your essay. Of course, you cannot write about anatomy because you do not have the required book, but you have other books on your desk that describe Da Vinci’s paintings and inventions, so you can continue writing. By doing this you avoid stopping your work on a cache miss, reducing idle time. However, if your friend takes a large amount of time to locate the book, at some point you will be again idle waiting for your friend to bring the book.

This is when Gazzillion Misses comes in pretty handy. Our Gazzillion technology gives you 128 friends who will be running up and down the library, looking for the books that you will need to write your essay and making sure that, whenever you require a given book, it is available on your desk.

To sum up, our Gazzillion Misses technology has been designed to effectively tolerate main memory latency and to maximize memory bandwidth usage. Due to its unprecedented number of simultaneous cache misses, our Avispado and Atrevido cores are the fastest RISC-V processors for moving data around.

Further information at www.semidynamics.com

References

[1] David Kroft. “Lockup-Free Instruction Fetch/Prefetch Cache Organization”. Proceedings of the 8th Int. Symp. on Computer Architecture, May 1981, pp. 81-87.

[2] Belayneh, Samson, and David R. Kaeli. “A discussion on non-blocking/lockup-free caches”. ACM SIGARCH Computer Architecture News 24.3 (1996): 18-25.

[3] DDR Memory: https://en.wikipedia.org/wiki/DDR_SDRAM

[4] High Bandwidth Memory: https://en.wikipedia.org/wiki/High_Bandwidth_Memory

[5] Compute Express Link: https://en.wikipedia.org/wiki/Compute_Express_Link

[6] Memory Wall: https://en.wikipedia.org/wiki/Random-access_memory#Memory_wall

[7] Han, S., Mao, H., & Dally, W. J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149.

[8] Patterson, D. A., & Hennessy, J. L. (2013). Computer organization and design: the hardware/software interface (the Morgan Kaufmann series in computer architecture and design). Paperback, Morgan Kaufmann Publishers.

Gazzillion Misses is a trademark of Semidynamics

Also Read:

CEO Interview: Roger Espasa of Semidynamics

Semidynamics Shakes Up Embedded World 2024 with All-In-One AI IP to Power Nextgen AI Chips

RISC-V Summit Buzz – Semidynamics Founder and CEO Roger Espasa Introduces Extreme Customization


Bluetooth 6.0 Channel Sounding is Here
by Bernard Murphy on 09-11-2024 at 6:00 am

I posted a blog on this topic a year ago. Now that the Bluetooth SIG has (just) ratified the standard, it is timely to provide a reminder of what this new capability can offer. Channel Sounding, introduced in Bluetooth Core Specification version 6.0, is a method to significantly increase the accuracy of Bluetooth-based distance measurements, from around 3-5 meters down to around 30-50 centimeters. Channel Sounding opens new, more efficient, and more secure options to support keyless entry, Find My Device, and other applications.

Why do we need a new standard for ranging?

Bluetooth is everywhere and already provides ranging support through RSSI (received signal strength indication). Does that suggest there is no need for a new standard and new devices? Can’t we just use what we already have? RSSI is as simple as it gets, measuring the attenuation in signal strength from the transmitter. However, attenuation depends on more than distance; obstacles and diffraction also contribute. Multiple beacons can help improve accuracy in, say, retail or hospital settings, but such solutions are obviously not helpful for personal applications like keyless car entry.
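A short sketch shows why RSSI ranging is so coarse. It uses the common log-distance path-loss model with illustrative parameters; the point is how strongly the answer depends on an exponent that obstacles and multipath change in practice:

```python
# Log-distance path-loss model: RSSI = P_1m - 10 * n * log10(d). Illustrative values.
def distance_from_rssi(rssi_dbm, p_1m_dbm=-60.0, path_loss_exponent=2.0):
    return 10 ** ((p_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))

reading = -80.0  # dBm
print(distance_from_rssi(reading, path_loss_exponent=2.0))  # ~10.0 m in free space
print(distance_from_rssi(reading, path_loss_exponent=3.0))  # ~4.6 m with obstructions
# The same reading maps to very different distances, hence the multi-meter error band.
```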

Angle of arrival (AoA) and angle of departure (AoD) measurements, introduced in Bluetooth 5.1, can help improve accuracy through trigonometric refinements, though they require multiple antennas on the receiver or transmitter. And these methods are equally compromised by multipath propagation resulting from reflections off surfaces around the path between receiver and transmitter.

Bluetooth 6.0 Channel Sounding instead uses phase-based ranging. A device sends a sine wave to a peer device which then sends the same signal back to the original device. The phase difference between the initial signal and the received signal gives a quite precise measure of distance. Better yet, a reflection off an obstacle will travel a longer path than the direct return, exhibiting a bigger phase shift and making it easy to ignore in distance estimation.
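The sketch below illustrates the principle in a highly simplified, noise-free form. It is not the actual Bluetooth 6.0 Channel Sounding procedure, which uses many tones and further processing, but it shows how a phase-versus-frequency slope converts into distance:

```python
import math

C = 3.0e8  # speed of light, m/s

def round_trip_phase(freq_hz, distance_m):
    """Phase accumulated over the two-way trip, wrapped to [0, 2*pi)."""
    return (2 * math.pi * freq_hz * 2 * distance_m / C) % (2 * math.pi)

def estimate_distance(f1_hz, f2_hz, true_distance_m):
    # Phase difference between two tones; d = c * dphi / (4*pi*(f2 - f1)).
    dphi = (round_trip_phase(f2_hz, true_distance_m)
            - round_trip_phase(f1_hz, true_distance_m)) % (2 * math.pi)
    return C * dphi / (4 * math.pi * (f2_hz - f1_hz))

# Two tones 1 MHz apart resolve distance unambiguously up to c/(2 * 1 MHz) = 150 m.
print(estimate_distance(2.402e9, 2.403e9, true_distance_m=12.3))  # ~12.3 m
```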

On the other hand, UWB (ultra-wideband) is very accurate, able to deliver positions with accuracy down to a centimeter or below. The tradeoff is that UWB requires an additional and specific MAC, modem, and radio (either integrated or in a separate chipset), adding to a device’s bill of materials (given that the device will also need to support Bluetooth for other reasons). And UWB is more energy hungry than Bluetooth, draining a battery faster unless used only when absolutely needed.

Is there a best solution?

One idea might be to combine RSSI methods for approximate ranging with UWB for accuracy close in. There are two problems here. First, earlier Bluetooth generations are vulnerable to relay attacks: in a raw ranging attempt, an attacker can intercept the BT communication and relay it to another device, allowing them to open your car door if UWB is not used for added security.

The second problem is with power consumption under some conditions. Suppose your car is parked near your front door and your car keys are on a table just inside the door, within range of the car. It’s not hard to imagine your key fob constantly trying to communicate with the car, triggering UWB ranging sessions and quickly draining the key fob battery.

There is significant support for the view that practical applications must depend on a combination of methods. The FiRa Consortium advancing UWB supports this direction, as does the Car Connectivity Consortium (CCC). One suggestion is to use RSSI for initial rough estimates within perhaps 30 meters, then switch to Channel Sounding as you get closer. Here the devices can exchange credentials to first establish a secure connection and can further refine the RSSI estimates as the driver approaches the car. At some point within closer range, UWB can further refine the estimate.
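One way to picture this hand-off is as a policy that always picks the cheapest method that is adequate for the current range. The sketch below is a hypothetical policy with made-up thresholds, not a CCC or FiRa specification:

```python
# Hypothetical policy with made-up thresholds: pick the cheapest adequate method.
def choose_ranging_method(estimated_distance_m, secure_session_established):
    if estimated_distance_m > 30:
        return "RSSI"              # always-on, coarse, lowest power
    if estimated_distance_m > 3 or not secure_session_established:
        return "CHANNEL_SOUNDING"  # decimeter-level accuracy after securing the link
    return "UWB"                   # centimeter-level accuracy for the final decision

print(choose_ranging_method(50, False))   # RSSI
print(choose_ranging_method(10, True))    # CHANNEL_SOUNDING
print(choose_ranging_method(1.5, True))   # UWB
```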

Switching between these methods should of course be transparent to a user and should be tuned to avoid the problem of unnecessarily turning on UWB when not required. Optimizing these choices is an area where product builders can differentiate if they work with a supplier offering all three options.

Talk to an expert to optimize your product options

Now for a no apologies company plug. If you really want the best solution in your product, you should probably talk to a company that has been at the forefront of embedded Bluetooth since the very early releases. Ceva is well established here, offering backward compatible Bluetooth Classic, Bluetooth LE, and dual mode over many generations with countless chips in production. They are just as active in the newer UWB standard, with supporting IP proven across several customers.

On every generation Ceva tracks emerging standards, ready to release once the standard has been ratified, and ready to announce certification once the test spec has been announced. They already have compliant Bluetooth 6.0 IP for early testing and expect to be certified shortly after the test spec is released.

With this depth of expertise, they encourage you first to simply talk with them about your goals. They can advise how you can best meet your objectives. Once you feel comfortable, they have all the IP, software stacks and profiles to help you deliver the very best of what you have to offer.

Also check out Ceva’s Bluetooth solutions and UWB solutions web pages.


TetraMem Integrates Energy-Efficient In-Memory Computing with Andes RISC-V Vector Processor
by Wenbo Yin on 09-10-2024 at 10:00 am

Figure: TetraMem MX100 evaluation SoC

The rapid proliferation of artificial intelligence (AI) across a growing number of hardware applications has driven an unprecedented demand for specialized compute acceleration that is not met by conventional von Neumann architectures. Among the competing alternatives, one showing the greatest promise is analog in-memory computing (IMC). Unleashing the potential of multi-level resistive RAM (RRAM) is making that promise more real today than in the past. Leading this development, TetraMem, Inc., a Silicon Valley based startup, is addressing the fundamental challenges holding this solution back. The company’s unique IMC approach, which employs multi-level RRAM technology, provides more efficient, low-latency AI processing that meets the growing needs of modern applications in AR/VR, mobile, IoT, and beyond.

Background on the Semiconductor Industry

The semiconductor industry has seen significant advancements over the past few decades, particularly in response to the burgeoning needs of AI and machine learning (ML). Innovations in chip design have pushed the boundaries of performance and efficiency. However, several intrinsic, persistent challenges remain, such as the von Neumann bottleneck and the memory wall, which limit data transfer rates between the CPU and memory, and the escalating power consumption and thermal management issues associated with advanced node technologies.

In-memory computing (IMC) represents a ground-breaking paradigm shift in how data processing is accomplished. Traditional computing architectures separate memory and processing units, resulting in significant data transfer overheads, especially for data-centric AI applications. IMC, on the other hand, integrates memory and processing within the same physical location, enabling faster and more efficient computation, and its crossbar array architecture further eliminates the large quantity of intermediate data produced by matrix operations. This approach is particularly beneficial for AI and ML applications, where large-scale data processing and real-time analytics are critical.

Selecting a suitable memory device for IMC is crucial. Traditional memory technologies like SRAM and DRAM are not optimized for in-memory operations due to their device and cell constraints and their volatility idiosyncrasies. RRAM, with its high density, multilevel capability and non-volatility with superior retention, overcomes these challenges with no refresh needed. The working principle of RRAM involves adjusting the resistance level of the memory cell through controlled voltage or current, mimicking the behavior of synapses in the human brain. This capability makes RRAM particularly suited for analog in-memory computing.

TetraMem has focused its efforts on multi-level RRAM (memristor) technology, which offers several advantages over traditional single level cell memory technologies. RRAM’s ability to store multiple bits per cell and perform efficient matrix multiplications in situ makes it an ideal candidate for IMC. This technology addresses many of the limitations of conventional digital computing, such as bandwidth constraints and power inefficiency.

The RRAM programmable circuit element remembers its last stable resistance level. This resistance level can be adjusted by applying voltage or current. Changes in the magnitude and direction of the voltage or current applied to the element alter its conductance, thus changing its resistivity. Akin to how a human neuron functions, this mechanism has diverse applications: memory, analog neurons, and, at TetraMem, in-memory computing. The operation of an RRAM cell is driven by ions. By controlling the conductive filament’s size, ion concentration, and height, different multi-level cell resistances can be achieved precisely.

Processing data in the same physical location where it is stored, with minimal intermediate data movement and storage, results in low power consumption. Massively parallel computing in a crossbar array architecture with device-level-grain cores yields high throughput. And computing by physical laws in this way (Ohm’s law and Kirchhoff’s current law) produces low latency. TetraMem’s nonvolatile compute-in-memory cell reduces power consumption by orders of magnitude compared to a conventional digital von Neumann architecture.

Notable Achievements

TetraMem has achieved significant milestones in the development of RRAM technology. Notably, the company has demonstrated an unprecedented device with 11 bits per cell, achieving over 2,000 levels in a single element. This level of precision represents a major breakthrough in memory compute technology.

Recent publications in prestigious journals such as Nature [1] and Science [2] highlight TetraMem’s innovative approaches. Techniques to improve cell noise performance and to enhance multi-level IMC have been key areas of advancement. For example, TetraMem has developed proprietary algorithms to suppress random telegraph noise, resulting in superior memory retention and endurance characteristics for RRAM cells.

Operation of IMC

TetraMem’s IMC technology utilizes a crossbar architecture, where each cross-point in the array corresponds to a programmable RRAM memory cell. This configuration allows for highly parallel operations, which are essential for neural network computations. During a Vector-Matrix Multiplication (VMM) operation, input activations are applied to the crossbar array, and the resulting computations are collected on the bit lines. This method significantly reduces the need to transfer data between memory and processing units, thereby enhancing computational efficiency.
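Numerically, the crossbar is computing a vector-matrix multiply through Ohm’s law and Kirchhoff’s current law. The sketch below mimics that behavior in software, with arbitrary illustrative conductances rather than TetraMem device data:

```python
import numpy as np

# Weights stored as conductances (siemens); one column per bit line / output.
G = np.array([[1e-6, 4e-6],
              [2e-6, 1e-6],
              [3e-6, 2e-6]])

# Input activations applied as word-line voltages (volts).
v = np.array([0.2, 0.1, 0.3])

# Ohm's law gives each cell current G*V; Kirchhoff's law sums them per bit line.
bit_line_currents = v @ G
print(bit_line_currents)  # [1.3e-06 1.5e-06] amps: the whole VMM in one analog step
```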

Real-World Applications

TetraMem’s first evaluation SoC through the commercial fab process, the MX100 chip (see figure) exemplifies the practical applications of its IMC technology. The chip has been demonstrated in various on-chip demos, showcasing its capabilities in real-world scenarios. One notable demo, the Pupil Center Net (PCN), illustrates the chip’s application in AR/VR for face tracking and authentication monitoring in autonomous vehicles.

To facilitate the adoption of its technology, TetraMem provides a comprehensive Software Development Kit (SDK). This SDK enables developers to define edge AI models seamlessly. Furthermore, the integration with Andes Technology Inc.’s NX27V RISC-V CPU with Vector extensions streamlines operations, making it easier for customers to deploy TetraMem’s solutions in their products.

The TetraMem IMC design is great for matrix multiplication but not as efficient for other functions such as vector or scalar operations, which are used frequently in neural networks. For these functions, Andes provides the flexibility of a CPU plus a vector engine, as well as an existing SoC reference design and a mature compiler and library, to accelerate our time to market.

TetraMem collaborated with Andes Technology to integrate its IMC technology with Andes’ RISC-V CPU with Vector Extensions. This partnership enhances the overall system performance, providing a robust platform for a variety of AI tasks. The combined solution leverages the strengths of both companies, offering a flexible and high-performance architecture.

Looking ahead, TetraMem is poised to introduce the MX200 chip, built on a 22nm process, which promises even greater performance and efficiency. This chip is designed for edge inference applications, offering low-power, low-latency AI processing. The MX200 is expected to open new market opportunities, particularly in battery-powered AI devices where energy efficiency is paramount.

Conclusion

TetraMem’s advancements in in-memory computing represent a significant leap forward in the field of AI hardware. By addressing the fundamental challenges of conventional computing, TetraMem is paving the way for more efficient and scalable AI solutions. As the company continues to innovate and collaborate with industry leaders like Andes Technology, the future of AI processing looks promising. TetraMem’s solution not only enhances performance but also lowers the barriers to entry for adopting cutting-edge AI technologies.

By Wenbo Yin, Vice President of IC Design, TetraMem Inc.

  1. “Thousands of conductance levels in memristors monolithically integrated on CMOS”, Nature, Mar 2023 https://rdcu.be/c8GWo

  2. “Programming memristor arrays with arbitrarily high precision for analog computing”, Science, Feb 2024 https://www.science.org/doi/10.1126/science.adi9405

Also Read:

Unlocking the Future: Join Us at RISC-V Con 2024 Panel Discussion!

LIVE WEBINAR: RISC-V Instruction Set Architecture: Enhancing Computing Power

Andes Technology: Pioneering the Future of RISC-V CPU IP


Samtec Demystifies Signal Integrity for Everyone
by Mike Gianfagna on 09-10-2024 at 6:00 am

As clock speeds go up, voltages go down, and data volumes explode, the need for fast, reliable, and low-latency data channels becomes critical in all kinds of applications. Balancing the requirements of low power and high performance requires the mastery of many skills. At the top of many lists is the need for superior signal integrity, or SI. In its most basic form, SI ensures that a signal is transmitted with sufficient quality (or integrity) to allow effective communication. This definition is deceptively simple. Implementing effective SI requires mastery of many disciplines. There are signal integrity experts among us who have spent an entire career learning how to create reliable, efficient signal channels. But what about the rest of us? It is often difficult to even know where to start. The good news is that Samtec has an extensive set of resources to help. Let’s look at how Samtec demystifies signal integrity for everyone.

The Tools of the Trade

Understanding signal integrity and how to optimize it demands a working knowledge of a vast array of concepts, technologies, and metrics. While most of us know what a baud rate and a differential signal are, concepts such as Nyquist frequency, bit error rate, insertion loss, and what exactly an eye diagram illustrates may be less obvious.
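As a quick worked example of two of those terms, the short sketch below computes the Nyquist frequency of a serial link for NRZ and PAM4 signaling; the 56 Gbps rate is an illustrative assumption, not a Samtec figure:

```python
# Nyquist frequency = half the symbol (baud) rate; PAM4 carries 2 bits per symbol.
def nyquist_ghz(bit_rate_gbps, bits_per_symbol):
    baud_rate = bit_rate_gbps / bits_per_symbol  # Gbaud
    return baud_rate / 2

print(nyquist_ghz(56, bits_per_symbol=1))  # 56G NRZ  -> 28.0 GHz
print(nyquist_ghz(56, bits_per_symbol=2))  # 56G PAM4 -> 14.0 GHz
# Channel insertion loss is typically quoted in dB at this Nyquist frequency.
```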

The world of frequency domain analysis, S-parameters and electromagnetic compatibility opens a lot of new concepts and analysis regimes as well. The terminology is easy to get lost in, such as evaluating crosstalk through a NEXT or FEXT lens.

The good news is that Samtec, a company that has mastered signal integrity to deliver high performance data channels of all types, also provides a wide range of tools to help us all understand signal integrity – why it’s important, how to measure it and how to optimize it.

If you are unfamiliar with Samtec, you can get a good overview of what the company has to offer on SemiWiki here.

Samtec to the Rescue

A core resource to help navigate the subtleties and semantics of signal integrity is the document referenced at the top of this post. Samtec’s Signal Integrity Handbook provides everything you need to decode what good signal integrity practices look like and why they are so important. A link is coming so you can get your copy. You’ll be glad to have it. Let’s first look at what’s covered in this handbook and touch on some of the other ways Samtec helps its customers achieve the best system performance possible.

The handbook begins by covering the basics – the definition of signal integrity and the differences in dealing with single-ended and differential signals. Key signaling terms such as NRZ and PAM4 are then explained.  We then explore the entire world of frequency domain analysis and the role of S-parameters. Topics such as insertion loss, return loss and crosstalk are covered here.

The basics of time domain analysis are then explored, with a discussion of topics such as impedance, apparent impedance, propagation delay and skew. The various metrics used to characterize channel performance are then covered. This is followed by a discussion of modeling considerations, where topics such as simulation tools and validation are explored.
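For a feel of the time-domain numbers, the short sketch below estimates propagation delay and skew assuming a typical effective dielectric constant of about 4 for FR-4; these are generic assumptions, not values from the handbook:

```python
import math

C_MM_PER_NS = 299.79   # speed of light in mm/ns
ER_EFF = 4.0           # assumed effective dielectric constant (FR-4-like)

velocity_mm_per_ns = C_MM_PER_NS / math.sqrt(ER_EFF)   # ~150 mm/ns
delay_ps_per_mm = 1000.0 / velocity_mm_per_ns           # ~6.7 ps per mm of trace

print(round(velocity_mm_per_ns, 1), "mm/ns propagation velocity")
print(round(5 * delay_ps_per_mm, 1), "ps of skew from a 5 mm length mismatch")
# At 28+ Gbaud a unit interval is ~36 ps, so a few mm of mismatch matters.
```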

System analysis is then covered. This is where we learn about eye diagrams. Methods to select appropriate connectors and how to use them are then detailed. The handbook concludes with a discussion of signal integrity for RF systems. 

You will not likely read this handbook cover to cover, although it is quite well organized and easily consumed. Rather, it will become a reference for you as you navigate the challenges of your next system design.

Beyond the Signal Integrity Handbook, Samtec offers a broad set of resources to dig into the many topics associated with signal integrity. These take the form of short videos and informative blogs. There are links to some of these resources embedded in the handbook, which is quite convenient. If you want a quick overview of the Samtec Signal Integrity Group, you can get that in under two minutes with this informative video. If you want broader access to experts and materials, you can find that on Samtec’s Signal Integrity Center of Excellence. If you’ve moved beyond the basics, Samtec also offers its monthly gEEk spEEK webinar series.

To Learn More

At last, here is where you can download your copy of the Samtec Signal Integrity Handbook. Keep it handy when you start your next design. And that’s how Samtec demystifies signal integrity for everyone.


Hot Chips 2024: AI Hype Booms, But Can Nvidia’s Challengers Succeed?
by Joseph Byrne on 09-09-2024 at 10:00 am

You don’t know you’re at a peak until you start to descend, and Hot Chips 2024 is proof that AI hype is still climbing among semiconductor vendors. Juggernaut Nvidia, startups, hyperscalers, and major companies presented their AI accelerators (GPUs and neural-processing units—NPUs) and touched on the challenges of software, memory access, power, and networking. As always, microprocessors also made a significant contribution to the conference program.

Nvidia recapitulated details of Blackwell, the monster chip introduced earlier this year. Comprising 208 billion transistors on two reticle-limited dice, it can deliver a theoretical maximum of 20 PFLOPS on four-bit floating-point (FP4) data. The castle wall protecting its dominance is Nvidia’s software, and the company discussed its Quasar quantization stack that facilitates FP4 use and reminded the audience of its 400-plus CUDA-X libraries.

The software barrier for inference is lower. Seeking to bypass it altogether—as well as to offer AI processing in an easier-to-consume chunk than a whole system—AI challengers such as Cerebras and SambaNova provide API access to cloud-based NPUs. Cerebras is unusual in operating multiple data centers, and both companies also offer the option to buy ready-to-run systems. Tenstorrent, however, sees a developer community and software ecosystem as essential to the long-term success of a processor vendor. The company presented its open-source stack and described how developers can contribute to it at any level, facilitated by its use of hundreds of C-programmed RISC-V CPUs.

AI Networking at Hot Chips 2024

Networking is critical to building a Blackwell cluster, and Nvidia showed InfiniBand, Ethernet, and NVLink switches based on its silicon. The NVLink switch picture reveals a couple of interesting challenges to scaling out AI systems. Employing 200 Gbps serdes, signals in and out of the NVLink chips must use flyover cables, shown in blue and pink in Figure 1, because PCBs can’t handle the data rate. Moreover, the picture suggests the blue lines connect to front-panel ports, indicating customers’ data centers don’t have enough power to support 72-GPU racks and must divide this computing capacity among two racks.

Figure 1. Nvidia NVSwitch. (Source: Nvidia)

Short-hop links eventually will require optical networking. Broadcom updated the audience on its efforts developing copackaged optics (CPO). Having created two CPO generations for its Tomahawk switch IC, Broadcom completed a third CPO-development vehicle for an NPU. The company expects CPO to enable all-to-all connectivity among 512 NPUs. Intel also discussed its CPO progress. Disclosed two years ago, Intel’s key technology is an eight-laser IC its CPO design integrates, replacing the conventional external light source Broadcom requires.

Software and networking intersect at protocol processing. Hyperscalers employ proprietary protocols, adapting the standard ones to their data centers’ rigorous demands. For example, in presenting its homegrown Maia NPU, Microsoft alluded to the custom protocol it employs on the Ethernet backbone connecting a Maia cluster. Seeing standards’ inadequacies but also valuing their economies, Tesla presented its TTPoE protocol, advocating for the industry to adopt it as a standard. It has joined the Ultra Ethernet Consortium (UEC) and submitted TTPoE. Unlike other Ethernet trade groups focused on developing a new Ethernet data rate, the UEC has a broader mission to improve the whole networking stack.

Alternative Memory Hierarchies at Hot Chips 2024

Despite Nvidia’s success, a GPU-based architecture is suboptimal for AI acceleration. Organizations that started with a clean sheet have gone in different directions particularly with their memory hierarchies. Hot Chips 2024 highlighted several different approaches. The Meta MTIA accelerator has SRAM banks along its sides and 16 LPDDR5 channels, eschewing HBM. By contrast, Microsoft distributes memory among Maia’s computing tiles and employs HBM for additional capacity. In its Blackhole NPU, Tenstorrent similarly distributes SRAM among computing tiles and avoids expensive HBM, using GDDR6 memory instead. SambaNova’s SN40L takes a “yes-and” approach, integrating prodigious SRAM, including HBM in the package, and additionally supporting standard external DRAM. For on-chip memory capacity, nothing can touch the Cerebras WS-3 because no other design comes close to its wafer-scale integration.

CPUs Still Matter

AMD, Intel, and Qualcomm discussed their newest processors, mostly repeating information previously disclosed. Ampere discussed the Arm-compatible microarchitecture employed in the 192-core AmpereOne chip, revealing it to be in a similar class to the Arm Neoverse-N2 and adapted for many-core integration.

The RISC-V architecture is the standard for NPUs, being employed by Meta, Tenstorrent, and others. The architecture was also the subject of a presentation by the Chinese Academy of Sciences. The organization has two open-source projects under its XiangShan umbrella, which covers microarchitecture, chip generation, and development infrastructure. Billed as comparable to an Arm Cortex-A76, the Nanhu microarchitecture shown in Figure 2 is a RISC-V design focused more on power- and area-efficiency than maximizing performance. The Kunminghu microarchitecture is a high-performance RISC-V design the academy compares with the Neoverse-N2. Open source, and thus freely available, these CPUs present a business-model challenge to the many companies developing and hoping to sell RISC-V cores.

Figure 2. Nanhu microarchitecture (source: https://github.com/OpenXiangShan)

Bottom Line

Artificial-intelligence mania is propelling chip and networking developments. The inescapable conclusion, however, is that too many companies are chasing the opportunity. Beyond the companies highlighted above, others presented their technologies—a key takeaway about each is available at xpu.pub. The biggest customers are the hyperscalers, and they’re gaining leverage over merchant-market suppliers by developing their own NPUs, such as the Meta MTIA and Microsoft Maia presented at Hot Chips 2024.

Nvidia’s challengers, therefore, are targeting smaller customers by employing various strategies, such as standing up their own data centers (Cerebras), offering API access and selling turnkey systems (SambaNova), or fostering a software ecosystem (Tenstorrent). The semiconductor business, however, is one of scale economies, and aggregating small customers’ demand is rarely as effective as landing a few big buyers.

Although less prominent, RISC-V is another frothy technology. An open-source instruction-set architecture, it also has open-source implementations. Businesses have been built around Linux, but they involve testing, improving, packaging, and contributing to the open-source OS, not replacing it. Their business model could be a template for CPU companies, which have focused on developing better RISC-V implementations—which could be fruitless given the availability of high-end cores like Kunminghu.

At some point, both the AI and RISC-V bubbles will burst. If it happens in the next 12 months, we’ll learn that Hot Chips 2024 was the zenith of hype.

Joseph Byrne is an independent analyst. For more information, see xampata.com.

Also Read:

The Semiconductor Business will find a way!

Powering the Future: The Transformative Role of Semiconductor IP

Nvidia Pulled out of the Black Well


The Semiconductor Business will find a way!
by Claus Aasholm on 09-09-2024 at 6:00 am

Endless Loop of Sanctions rotates while China buys more

Embargoes and other fun stuff in the Semiconductor Tools Market

While this post dives into the semiconductor tool market’s Q2 data, it is also about the senseless embargo game currently in place. The post illustration shows what my notebooks look like as embargoes fail to suppress what they are supposed to suppress and are replaced by even more restrictive embargoes that… You get the point.

I don’t take sides in the Chip Wars or have an opinion about whether embargoes are needed or justified. I investigate what is going on.

I know that “Business will find a way” (inspired by Michael Crichton’s Jurassic Park quote: “Life will find a way”). The AI part of the embargoes has been covered in the post: Pulled out of the black well.

The Semiconductor Tools Market

From a relatively quiet life away from the international spotlight, the Semiconductor Tools companies were propelled onto the geopolitical stage as semiconductors became a matter of national security—first for the US, later for its economic allies, and finally everywhere.

After AI showed its face and made Nvidia a superstar, non-semiconductor insiders realised that none of it would be possible without factories in Taiwan and tools from the borderlands of the Netherlands and Belgium.

Despite decades of plentiful investments, the Chinese could not crack the macadamia nut of Semiconductors: Advanced lithography tools.

One company had orchestrated an incredibly advanced R&D effort throughout its supply chain and churned out machines capable of printing the small geometries needed for the most advanced technologies.

The US realised that, despite retaining design dominance in semiconductors, manufacturing had slipped to Korea and Taiwan and, with it, the supply chains.

The Chips Act could stimulate the growth of advanced US manufacturing capacity, but it could do little to remove the reliance on advanced European manufacturing capacity.

Even with the Chips Act, success was not guaranteed, as the Chinese had invested in semiconductor capacity for decades. The US government had to stop the flow of advanced lithography tools to China. Time for embargoes.

ASML Embargoes

The US embargoes on ASML sales to China are delicate, as the US has to rely on its relationship with the Dutch government. And because ASML’s sales to China are substantial, significant commercial interests are at play.

The embargoes started in 2019 with a ban on extreme UV tools capable of supporting better-than-7nm processes. They were later tightened in 2023 and again at the beginning of 2024.

With a mixture of bans and licenses, the embargoes slide from licenses to bans and include more products. As there has been no decline in sales to China, we are likely to see more restrictions over the next period.

The latest development has been a debacle between the US and the Netherlands over whether the embargoes are really driven by national security concerns or more by commercial interest. As a result, the Netherlands has fully aligned with the US restrictions and has taken over the duty of issuing licenses. This is applauded by ASML, which expects more lenient treatment from the local authorities than from the US government.

Before we dive deeper into the outcome of the embargoes, it is worth taking stock of the Semiconductor tools market and its latest results.

The Semiconductor Tools market

Before any chip is made, Semiconductor Tools need to be bought. Throughout the weeks-long journey from blank wafer to finished chip, a multitude of tools is required.

Incredibly simplified, the tools can be categorised as:

  1. Lithography – the application of the chip design onto the wafer
  2. Depositions – adding material layers
  3. Modification – changing material layers
  4. Removal – removal of material layers
  5. Other: Metrology, Cleans, Handlers and a multitude of specialised tools.

The tools are placed in the investment zone of our Semiconductor market model and are impacted by the capital expenditures of the three categories of chip manufacturers: IDMs, foundries, and the fab/foundry semiconductor companies.

This is not a business for the faint of heart. A new leading-edge semiconductor fab now costs more than $50B to build, and roughly 75% of that cost is tools. The most expensive tools are larger than a double-decker bus and cost more than $350M apiece.
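As a quick sanity check on those figures, here is a back-of-the-envelope sketch (using only the rough numbers quoted above, not audited data) of what the tool bill for a single leading-edge fab implies:

    # Back-of-the-envelope sketch using only the rough figures quoted above
    fab_cost = 50e9            # > $50B for a new leading-edge fab
    tool_share = 0.75          # ~75% of the build cost is tools
    top_tool_cost = 350e6      # the most expensive (EUV) tools, roughly $350M each

    tool_budget = fab_cost * tool_share
    print(f"Tool budget per fab: ${tool_budget / 1e9:.1f}B")               # ~$37.5B
    print(f"Equivalent top-end tools: {tool_budget / top_tool_cost:.0f}")  # ~107

A real fab buys a mix of lithography, deposition, etch, and metrology tools, so the last line is only meant to give a sense of scale.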

As the geometries become smaller, the chips become cheaper to produce, faster, and consume less power. However, the other side of the coin is that the design costs are skyrocketing, and the tool and factory costs are also increasing.

Comparing the revenue of semiconductor companies with that of tool companies generates the first insights into this development.

The starting point is the bottom of the 2019 semiconductor cycle, and as can be seen, the revenue of the tool companies is outgrowing the semiconductor revenue. The Foundry revenue growth mirrors this. In other words, the collective investment in tools and the manufacturing costs of fabless companies are increasing significantly.
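For anyone who wants to reproduce this kind of comparison from public filings, a minimal sketch of the indexing approach (the figures below are placeholders, not the actual chart data) could look like this:

    # Index each revenue series to the 2019 trough so growth rates are comparable
    # (placeholder figures only - not the actual chart data)
    tool_revenue = [10.0, 14.0, 18.0, 22.0]      # $B per period since the 2019 bottom
    semi_revenue = [100.0, 120.0, 135.0, 150.0]

    def index_to_base(series):
        base = series[0]
        return [round(100 * value / base, 1) for value in series]

    print(index_to_base(tool_revenue))   # [100.0, 140.0, 180.0, 220.0] - tools grow faster
    print(index_to_base(semi_revenue))   # [100.0, 120.0, 135.0, 150.0]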

It is not only from a revenue perspective that the tools business is interesting.

From an operating profit perspective, the tools business is more attractive than the foundry and Semiconductor businesses. Notably, most of the semiconductor companies’ Q2-24 profitability stems from Nvidia; without Nvidia, the growth of Semiconductor companies would be a measly 24%.

This is worth noting as an investor, as this trend will not change any time soon.

The state of the Semiconductor Tool Companies

The following charts are based on the revenue of semiconductor tool companies, including service and other revenue. The pure tool analysis will follow.

After a growth streak following the latest upcycle, the combined revenue peaked around the introduction of the US Chips Act. Growth was replaced by a relatively steady state, with revenues flatlining at around $22B and stable gross and operating profits. Nothing to see here.

The declining investments of Semiconductor companies were a direct result of the Chips Act, as projects were redirected to US soil.

This either surprised the semiconductor tool companies or came too late for them to respond to. The result has been a rising level of inventory, which is now double what it was in 2020.

Tool companies used to be concentrated in three countries: the US, Japan, and the Netherlands. At the beginning of the 2020s, only 3% of the revenue was outside these three countries, most of which was in China.

Despite a decade of massive semiconductor investments, these tools proved to be the most difficult to conquer. Success in semiconductor tools requires more than money.

Even after another half-decade, the same countries dominate, although China has worked hard to gain market share. From 3% at the beginning of the decade, China has now managed to get to 8.5% in market share, which is not cause for any celebration in Beijing.
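As a rough measure of how quickly that share has grown, the implied compound annual growth rate can be computed from the two endpoints. A minimal sketch, assuming roughly four and a half years between “the beginning of the decade” and mid-2024 (the time span is my assumption; the endpoints are the figures above):

    # Implied CAGR of China's tool market share, assuming the endpoints above
    start_share = 0.03    # ~3% at the beginning of the decade
    end_share = 0.085     # ~8.5% now
    years = 4.5           # assumed: start of 2020 to mid-2024

    cagr = (end_share / start_share) ** (1 / years) - 1
    print(f"Implied CAGR of market share: {cagr:.1%}")   # roughly 26% per year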

The rapid growth paused, as expected, around the Chinese New Year of the Dragon, but it did not bounce back to the levels expected afterwards. The inventory position of the Chinese tool companies suggests that Chinese tool customers are constipated and not ready to consume at the same levels.

The Chinese tool manufacturers almost exclusively sell to China, but these are not the only tools the Chinese tool customers buy.

The Semiconductor Tool market

After reviewing the Tool companies’ revenue, it is worth investigating the tool revenue of the top companies for more insights. The tool revenue excludes the service and other revenue of Semiconductor tool companies and is perfectly aligned with the CapEx of Semiconductor manufacturing companies.

The dominance of certain countries is due to the domination of a few companies in the Semiconductor tool market.

The top companies account for almost all of the tool revenue outside China, and each of the large ones has its own area of specialisation, making most of them leaders in their respective subfields, most prominently ASML in lithography and AMAT in materials engineering.

Since introducing the Chips Act and sanctions on AI and advanced lithography tools, sales of Western semiconductor tools to China have exploded.

The revenue of Western tools sold to China grew in Q2, but the Chinese share of revenue declined slightly, from 45% in Q1 to 44.4% in Q2. I am sure this is no cause for a victory lap in Washington.

The dominance is even more apparent when compared to revenue from other countries.

The most apparent effect of the Chips Act was that TSMC stopped investing in tools as projects were redirected to the US.

Excluding Chinese revenue from the revenue of Western Tool Companies shows the dilemma. China is not only a pain but also a saviour.

This is a crucial problem for the US policy of embargoes. It impacts other countries’ economies, especially when a large company like ASML resides in a small country.

In the last few weeks, there has been a lot of back and forth between the Netherlands and the US government. ASML has accused the US authorities (in a position I can only assume is aligned with the Dutch government) of leaning more towards commercial interests than national security interests. At the same time, the Dutch government has aligned its restrictions with the US restrictions, so it is now the Dutch authorities that will approve export licenses rather than the US government, something that I am sure ASML is very happy about (I don’t know why…).

China has also threatened Japan with retaliation should the country further sanction tool sales to China.

Conclusion

The US embargoes are focused on preventing the Chinese authorities from accessing leading-edge AI technology. Half of the embargoes are aimed at the GPUs, and the other half are focused on the tools. Memory might be next.

During the last few days, there have been many examples of how easy it is for Chinese companies to buy Nvidia’s H100 or rent it online cheaper than in the West. At the same time, Nvidia’s sales to China are once again increasing despite the embargoes.

The Semiconductor Tools sanctions are in place to prevent the Chinese from gaining access to leading-edge manufacturing technology, which would enable China to make its own GPUs.

The US keeps tightening the sanctions on semiconductor tools, but as with the GPUs, every new sanction seems to reinforce the very behaviour it was put in place to inhibit.

I am sure the US authorities are considering even more draconian embargoes against China, such as invoking the Foreign Direct Product Rule, which would let the US dictate all tool sales to China (as the tools all contain US technology), or trying to prevent ASML and other companies from servicing the equipment already installed in China.

This would have a significant negative impact on ASML and the Dutch economy and would be difficult for the US to get agreement on.

In this post, I have tried to lay out the result (or lack thereof) of the sanctions in the semiconductor tools market. Tool sales to China remain strong despite increasing embargoes. In my experience, embargoes don’t stop the flow of products; they might change the flow or the cost, but eventually, business will find a way. I accept embargoes might play a political role, but that is not my area of expertise.

I leave you with a chart of Western tool companies’ cumulative sales to China and the US. Since 2019, China has bought nearly $100B worth of Semiconductor tools. More than 2.5 times what the US has bought in the same period. My apologies if that disturbs anybody’s sleep.
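The arithmetic behind that comparison is simple; a small sketch assuming the roughly $100B figure and the 2.5x ratio quoted above:

    # Implied US cumulative tool purchases, assuming the figures above
    china_cumulative = 100e9   # ~$100B of Western tools bought by China since 2019
    ratio = 2.5                # China has bought ~2.5x what the US has
    us_cumulative = china_cumulative / ratio
    print(f"Implied US cumulative purchases since 2019: ${us_cumulative / 1e9:.0f}B")   # ~$40B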

KLA Metrology and Inspection

Strong pull from N3 and N2

Advanced packaging revenue expected to grow from $300M to $500M in 2024

Packaging will outgrow WFE

No memory growth yet – will come in 2025, led by DRAM

KLA share of WFE will grow

DRAM will have higher process control intensity

But it’s not contributing in the sense that they are going to build capacity because of it; they are going to build capacity based on demand. The demand is just going to be based on their market, the overall market, and then they’ll buy equipment accordingly. We are not counting on our customers getting CHIPS Act money to make our plans.

ASML

We also see continued improvement in lithography tool utilization level at both Logic and Memory customers, all in line with the industry’s continued recovery from the downturn.

Lower logic revenue as customers are trying to digest last year’s additions

An increase of EUV use on every node. I think this is a trend that will continue, at least in the foreseeable future

I think you have also seen in DRAM that, at this point in time, all customers are using EUV in production

On High NA, we also see opportunity for DRAM at the horizon of ’25, ’26.

LAM

Lam customer investment profile generally unchanged from prior view

+ Slightly stronger domestic China spending

+ Additional demand related to HBM capacity ramp

+ Foundry/logic, DRAM, and NAND investments all expected to be higher on a year-on-year basis

+ Global spending on mature nodes expected to be roughly flat year-on-year



Podcast EP246: How Axiomise Provides the Missing Piece of the Verification Puzzle

Podcast EP246: How Axiomise Provides the Missing Piece of the Verification Puzzle
by Daniel Nenni on 09-06-2024 at 10:00 am

Dan is joined by Adeel Liaquat, a formal verification manager at Axiomise. The company delivers cutting-edge, scalable, and predictable formal verification solutions, shortening time-to-market and left-shifting the verification curve.

Dan explores the substantial verification challenges for advanced designs with Adeel, including very subtle bugs that are hard to find with vector-based approaches. Adeel explains that a holistic approach to verification is needed, from IP to the final system. Both standard and customized architectures must be addressed with this approach.

The conversation shows how formal methods are the missing piece of the verification puzzle. The technology, training, and support provided by Axiomise aim to bring this technology into mainstream use across the entire design spectrum, substantially reducing the time it takes to achieve a high level of confidence in system capability.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.