
Cadence® Janus™ Network-on-Chip (NoC)

by Kalar Rajendiran on 07-23-2024 at 10:00 am

Design Flow when using Janus NoC

A Network-on-Chip (NoC) IP addresses the challenges of interconnect complexity in SoCs by significantly reducing wiring congestion and providing a scalable architecture. It allows for efficient communication among numerous initiators and targets with minimal latency and high speed. A NoC facilitates design changes, enabling quick iterations to meet specific design goals regarding bandwidth, latency, area, and power. Cadence recently expanded its system IP portfolio with the addition of the Janus NoC IP. On the surface, this may prompt the question: what is the big deal? NoC IP is not a new concept, and this type of IP is common in the industry. I got deeper insights by chatting with Cadence’s George Wall, group director of product marketing, and Ronen Perets, senior product marketing manager, both in the Cadence Silicon Solutions Group.

Integral Subsystem Component

The Cadence Janus NoC IP is in response to requests from the company’s customer base for expanded system-level solutions. This IP is an integral part of Cadence’s silicon solutions strategy, aimed at providing significant value to its licensee partners. It leverages Cadence’s extensive design expertise and best-in-class verification tools and methodologies, ensuring that the NoC meets the highest standards of quality and performance. This strategic addition enhances Cadence’s portfolio, making it a crucial component for advanced SoC designs. The IP is designed to handle inter-chiplet communication efficiently, using programmable routing and supporting dynamic configurations. The NoC is designed to support the evolving multi-chip module and chiplet-based design architectures. This adaptability ensures future-proofing for increasingly complex SoC designs.

Leverages Cadence’s Extensive Portfolio of Software and Hardware Offerings

Cadence offers a comprehensive system solution that includes processors with a full set of Software Development Tools (SDT) and Software Development Kits (SDK); Digital Signal Processors (DSP), libraries, and frameworks; I/O controllers to facilitate various interface requirements; and PHYs for physical-layer implementations that ensure reliable data transmission. The Cadence Janus NoC enhances performance, power, and area (PPA) by efficiently managing high-speed communications within and between silicon components with minimal latency. By optimizing RTL for PPA and utilizing packetized messages, the NoC reduces wire count and mitigates timing closure challenges, thereby accelerating time to market.

Architectural Exploration and Verification

Cadence offers extensive simulation and emulation options to support architectural exploration and verification. The Palladium Accelerator provides full visibility and increases simulation speed, making it ideal for extensive performance benchmarking. The Protium Platform maps the full SoC onto FPGAs for extremely fast emulation, which is particularly useful for debugging at the SoC level. SystemC modeling allows for fast debugging and firmware bring-up using a functional SystemC model generated alongside the RTL. Additionally, the Cadence Helium Virtual and Hybrid Studio enables the mixing of different model types and running each module on different platforms, facilitating performance monitoring and rapid iteration.

Designed for Ease of Use

The Cadence Janus NoC is designed with ease of use in mind, offering a highly configurable and flexible architecture. It features a GUI configuration tool that allows users to easily configure and generate NoC RTL, and comes with a comprehensive package that includes synthesis scripts, a testbench, and a functional model, streamlining the design process. Early optimization of NoC design is facilitated through iterative design exploration and performance validation using Cadence simulation and emulation technologies, along with the Cadence System Performance Analysis (SPA) tool, ensuring that the architecture meets performance needs.

Cadence Janus NoC Architecture

The Cadence Janus NoC architecture consists of three main components: the Initiator Endpoint Adapter (IEA), which connects initiator endpoints to the NoC; the Target Endpoint Adapter (TEA), which connects target endpoints to the NoC; and the Routing Node, which routes packets between IEAs and TEAs to their respective destinations. A typical NoC comprises multiple IEAs, TEAs, and routing nodes. These nodes are interconnected, allowing messages to traverse from origin to destination efficiently. Routing nodes can be configured to optimize bandwidth and latency, with pipeline stages added to maintain the desired speed despite physical distance challenges.
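To make the roles of the three components concrete, here is a minimal sketch of a toy NoC topology in Python. It is an illustration only, not Cadence's tool or API: the node names (`iea_cpu`, `r0`, `tea_ddr`), the fewest-hop routing policy, and the one-cycle-per-hop latency model are all assumptions made for the example. Extra pipeline stages on a long link simply add cycles to the routes that cross it.

```python
from collections import deque


def shortest_route(links, source, destination):
    """Breadth-first search for a fewest-hop path from source to destination."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk back through the predecessor map to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # destination not reachable


def route_latency(path, pipeline_stages):
    """Assumed model: one cycle per hop plus any extra pipeline stages on that link."""
    total = 0
    for a, b in zip(path, path[1:]):
        total += 1 + pipeline_stages.get((a, b), 0)
    return total


# Hypothetical topology: two initiator adapters, two routing nodes, two target adapters.
links = {
    "iea_cpu": ["r0"],
    "iea_dma": ["r1"],
    "r0": ["r1", "tea_sram"],
    "r1": ["r0", "tea_ddr"],
}
# Two extra pipeline stages on the physically long r0 -> r1 link (assumed).
pipeline_stages = {("r0", "r1"): 2}

path = shortest_route(links, "iea_cpu", "tea_ddr")
latency = route_latency(path, pipeline_stages)
print(path, latency)
```

In this toy model the CPU-to-DDR route crosses the pipelined link, so it pays for the added stages, while a CPU-to-SRAM route through `r0` alone does not. That is the kind of bandwidth/latency trade-off the configuration step is meant to explore.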

Summary

The Cadence Janus NoC architecture offers a scalable, efficient, and adaptable approach to addressing the complex interconnect requirements of modern SoCs. With advanced configuration tools, robust simulation and emulation options, and comprehensive power management and verification strategies, Cadence’s NoC technology empowers designers to create optimized, high-performance SoCs efficiently and effectively. By managing high-speed communications efficiently, the Janus NoC helps design teams achieve their PPA targets faster and with lower risk, freeing up valuable engineering resources for SoC differentiation. As the industry continues to evolve, Cadence Janus NoC stands as a future-proof platform, enabling designers to meet current and future demands with confidence.

You can learn more about the Janus NoC System IP here.

Also Read:

Accelerating Analog Signoff with Parasitics

Novelty-Based Methods for Random Test Selection. Innovation in Verification

Using LLMs for Fault Localization. Innovation in Verification


A Joint Solution Toward SoC Design “Exploration and Integration” released by Defacto #61DAC

by Daniel Nenni on 07-23-2024 at 6:00 am


When I was at DAC last month, I had the chance to talk with Chouki Aktouf and Bastien Gratréaux from Defacto and they told me about a new innovative solution to generate Arm-based System-on-Chips. I heard that this solution has now been released.

Defacto and Arm developed a joint SoC design flow to help Arm users cover all needed automation—from SoC design architecture and exploration to top-level generation of all needed files for implementation and verification flows.

The intuitive graphical interface of the Arm design platform, Arm IP Explorer, makes specifying the SoC easy and user friendly. Once SoC exploration is complete, RTL and IP-XACT design files are automatically generated using Defacto’s SoC Compiler design solution.

The jointly developed solution is built around a strong link between Arm IP Explorer and Defacto’s SoC Compiler, enabling users to quickly generate several SoC design configurations. The speed of the Defacto SoC Compiler enables the generation of a multitude of SoC configurations based on different user specifications. With this solution, the overall design time from specification to an SoC ready for synthesis can be significantly reduced.

Why was this solution needed?

With the complexity of current SoC designs and the design space possibilities, designers and architects face significant challenges when exploring SoC architectures. They traditionally access an IP database, where they select, configure, and download IP. The following step is to connect the IPs to build the complete SoC design database which is ready for logic synthesis. Iterative work is usually needed for each of the configurations created, which impacts overall turn-around time (TAT).

Providing a comprehensive and automated design solution from specification to implementation with all necessary exploration metrics, such as chip size, power consumption, and so on, is needed more than ever.

How does it work?

The joint Arm IP Explorer/SoC Compiler solution is the shortest path from the definition of Arm-based system architecture to implementation and design verification.

In the first step, users access Arm IP Explorer and start selecting IP cores from the catalog. IP parameters can be set at this level, and IP configuration in general is made easy. With the selected IPs, users can architect the complete system. The platform also gives users the flexibility to add custom IPs to reflect the desired system. At this stage, an estimate of the overall size of the SoC is provided.

Integration checks are performed on-the-fly to ensure that the built SoC is correct including all needed and complex connectivity. The completed and validated system is then exported into the Defacto SoC Compiler, which automatically generates the top-level IP-XACT / RTL / UPF files, along with different reports. These reports provide detailed connectivity density, chip size, and power consumption.

The generated files are fully compatible with standard RTL2GDS SoC design flows and can be provided directly to both logic synthesis tools and design verification tools. With the simplicity, speed, and flexibility of this solution, users can quickly and automatically explore and generate several SoC design configurations.

Who is this solution for?

This solution has been developed for Arm users who need to quickly build new Arm-based SoC configurations. With this solution, users increase efficiency and productivity because it is easy to find and compare Arm IPs from a single source. With the simplified IP configuration, coupled with the automatic generation of the top-level SoC, users can drastically reduce costs and time to market.

This flow has already been validated for a large number of systems and is ready to be used for several applications such as IoT, automotive, mobile, 5G, cloud computing, HPC, AI, etc.

More information can be found on the Defacto page on the Arm partner website: https://www.arm.com/partners/catalog/defacto-technologies

For a dedicated demo and presentation of the flow, feel free to reach out to Defacto by email (info_req@defactotech.com).

Also Read:

Defacto at the 2024 Design Automation Conference

WEBINAR: Joint Pre synthesis RTL & Power Intent Assembly flow for Large System on Chips and Subsystems

Lowering the DFT Cost for Large SoCs with a Novel Test Point Exploration & Implementation Methodology

Defacto Celebrates 20th Anniversary @ DAC 2023!


TSMC Foundry 2.0 and Intel IDM 2.0

by Daniel Nenni on 07-22-2024 at 10:00 am

TSMC 2Q2024 Investor Call

When Intel entered the foundry business with IDM 2.0 I was impressed. Yes, Intel had tried the foundry business before but this time they changed the face of the company with IDM 2.0 and went “all-in” so to speak. The progress has been impressive and today I think Intel is well positioned to capture the NOT TSMC business by providing a trusted alternative to the TSMC leading edge business. The one trillion dollar question is: Will Intel take business away from TSMC on a competitive basis? I certainly hope so, for the greater good of the semiconductor industry.

On the most recent TSMC investor call, which is the first call with C.C. Wei as Chairman and CEO, TSMC branded their foundry strategy as Foundry 2.0. It is not a change of strategy; it is new branding based on what TSMC has been successfully doing for years now, adding additional products and services to keep customers engaged. 3D IC packaging is a clear example but certainly not the only one. The Foundry 2.0 brand is well earned and is clearly targeted at Intel IDM 2.0, which I think is funny and a great example of CC Wei’s sharp wit.

I thought for sure that Intel 18A would be the breakout foundry node for Intel but according to the TSMC investor call, that is not the case. TSMC N3 was a runaway hit with 100% of the major design wins. Even Intel used TSMC N3. I hadn’t seen anything like this since TSMC 28nm which was on allocation as a result of being the only viable 28nm HKMG node out of the gate. History repeated itself with N3 due to the delay of 3nm alternatives. This made the TSMC ecosystem the strongest I have ever witnessed with both the domination of N3 and TSMC’s rapidly expanding packaging success. I had originally thought that some customers would stick with N3 until the second generation of N2 appeared but I was wrong. On yesterday’s investor call:

CC Wei: We expect the number of the new tape-outs for 2-nanometer technologies in its first two years to be higher than both 3-nanometer and 5-nanometer in their first two years. N2 will deliver full-node performance and power benefits, with 10% to 15% speed improvement at the same power, or 25% to 30% power improvement at the same speed, and more than 15% chip density increase as compared with the N3E.

CC had mentioned this before but I can now confirm this based on my hallway discussions inside the ecosystem at recent conferences: N2 designs are in progress and will start taping out towards the end of this year.

I really don’t think the TSMC ecosystem gets enough credit, especially after the overwhelming success of N3, but the N2 node is a force in itself:

CC Wei: N2 technology development is progressing well, with device performance and yield on track or ahead of plan. N2 is on track for volume production in 2025 with a ramp profile similar to N3. With our strategy of continuous enhancement, we also introduce N2P as an extension of our N2 family. N2P features a further 5% performance at the same power or 5% to 10% power benefit at the same speed on top of N2. N2P will support both smartphone and HPC applications, and volume production is scheduled for the second half of 2026. We also introduce A16 as our next nanosheet-based technology, featuring Super Power Rail, or SPR, as a separate offering.

And, of course, the TSMC freight train continues:

CC Wei: TSMC’s SPR is an innovative, best-in-class backside power delivery solution that is first in the industry to incorporate a novel backside contact scheme to preserve gate density and device width flexibility. Compared with N2P, A16 provides a further 8% to 10% speed improvement at the same power, or 15% to 20% power improvement at the same speed, and additional 7% to 10% chip density gain. A16 is best suited for specific HPC products with complex signal routes and dense power delivery network. Volume production is scheduled for the second half of 2026. We believe N2, N2P, A16, and its derivative will further extend our technology leadership position and enable TSMC to capture the growth opportunities way into the future.
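As a rough way to read the quoted numbers together, the sketch below stacks the N2P gain over N2 and the A16 gain over N2P at the same power. It assumes the gains compound multiplicatively, which the call does not state, so treat the result as back-of-the-envelope arithmetic rather than a TSMC figure. The function name is made up for the example.

```python
def compound_gain_percent(*step_gains_percent):
    """Combine successive percentage gains, assuming they multiply."""
    factor = 1.0
    for gain in step_gains_percent:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0


# Quoted in the call: N2P adds 5% over N2; A16 adds 8% to 10% over N2P.
a16_vs_n2_low = round(compound_gain_percent(5, 8), 1)
a16_vs_n2_high = round(compound_gain_percent(5, 10), 1)
print(a16_vs_n2_low, a16_vs_n2_high)  # 13.4 15.5
```

Under that assumption, A16 would land roughly 13% to 16% ahead of N2 in speed at the same power.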

Congratulations to TSMC on their continued success; it is well deserved. I also congratulate the Intel Foundry team for making a difference and I hope the 14A foundry node will give the industry a trusted alternative to TSMC out of the starting gate. In my opinion, had it not been for Intel and of course CC Wei’s leadership and response to Intel’s challenge, we as an industry would not be quickly approaching the one trillion dollar revenue mark. Say what you want about Nvidia, but as Jensen Huang openly admits, TSMC and the foundry business is the real hero of the semiconductor industry, absolutely.

Also Read:

Has ASML Reached the Great Wall of China

The China Syndrome- The Meltdown Starts- Trump Trounces Taiwan- Chips Clipped

SEMICON West- Jubilant huge crowds- HBM & AI everywhere – CHIPS Act & IMEC


A New Class of Accelerator Debuts

by Bernard Murphy on 07-22-2024 at 6:00 am

Chimera GPNPU Block diagram

I generally like to start my blogs with an application-centric viewpoint; what end-application is going to become faster, lower power or whatever because of this innovation? But sometimes an announcement defies such an easy classification because it is broadly useful. That’s the case for a recent release from Quadric, based on an architecture which seems to carve out a new approach to acceleration. This is able to serve a wide range of applications, from signal processing to GenAI with depth in performance, up to 864 TOPs per their announcement.

The core technology

Quadric’s roots are in AI acceleration, so let’s start there. By now we are all familiar with the basic needs for AI processing: a scalar engine to handle regular calculations, a vector engine to handle things like dot-products, and a tensor engine to handle linear algebra. And that’s how most accelerators work – 3 dedicated engines coupled in various creative ways. The Quadric Chimera approach is a little different. The core processing element is built around a common pipeline for all instruction types. Only at the compute step does it branch to an ALU for scalar operations or a vector/matrix unit for vector/tensor operations.

Both signal processing and AI demand heavy parallelism to meet acceptable throughput rates, handled through wide-word processing, lots of MACs and multi-core implementations. The same is true for the latest Quadric architecture, but again in a slightly different way. Their new cores are built around systolic arrays of processing elements, each supporting the same common pipeline, each with its own scalar ALU, bank of MACs and local register memory.

This structure, rather than a separate accelerator for each operator class, has two implications for product developers. First it simplifies software development, still highly parallel to be sure, but abstracting out a level of complexity in multi-engine accelerator architectures where operations must be steered to the appropriate engines.

Second, the nature of parallelism in transformer-based AI models (LLMs or ViT for example) is much more complex than for earlier generation ResNet-class accelerators which process through a sequence of layers. In contrast, transformer graphs flip back and forth between matrix, vector and scalar operations. In disaggregated hardware architectures traffic flows similarly must alternate between engines with inevitable performance overhead. In the Quadric approach, any engine can handle a stream of scalar, vector and tensor operations locally. Of course there will be overhead in traffic between PE cores, but this applies to all parallel systems.
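The hand-off argument can be illustrated with a toy count in Python. This is a sketch under stated assumptions, not a model of Quadric's hardware or of any competitor: the two operation sequences are invented for illustration, and the only thing counted is how often consecutive operations would need a different dedicated engine in a disaggregated design. In a design where each processing element runs scalar, vector and tensor operations through one pipeline, that particular count is zero by construction.

```python
def engine_handoffs(op_kinds):
    """Count how often consecutive operations need a different dedicated engine."""
    return sum(1 for a, b in zip(op_kinds, op_kinds[1:]) if a != b)


# Hypothetical operation streams, not taken from real model graphs.
resnet_like = ["tensor"] * 6 + ["vector"] * 2
transformer_like = [
    "tensor", "vector", "scalar", "tensor",
    "vector", "tensor", "scalar", "vector",
]

resnet_handoffs = engine_handoffs(resnet_like)
transformer_handoffs = engine_handoffs(transformer_like)
print(resnet_handoffs, transformer_handoffs)  # 1 7
```

Both streams contain eight operations, but the transformer-like one changes operator class at every step, so a disaggregated design pays the engine-to-engine transfer cost seven times instead of once.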

Steve Roddy (VP Marketing for Quadric) tells me that in a virtual benchmark against a mainstream competitor, Quadric’s QC-Ultra IP delivered 2X more inferences/second/TOPs for a lower off-chip DDR bandwidth and at less than half the cycles/second of the competing solution. Quadric are now offering 3 platforms for the mainstream NPU market segment: QC Nano at 1-7 TOPs, QC Perform at 4-28 TOPs, and QC Ultra at 16-128 TOPs. That high end is already good enough to meet AI PC needs. Automotive users want more, especially for SAE-3 to SAE-5 applications. For this segment Quadric is targeting their QC-Multicore solution at up to 864 TOPs.

All these platforms are supported by the proven Chimera SDK. Steve had an interesting point here also. AI accelerator ventures will commonly mention their “model zoos”. These are standard AI models adapted through tuning to run on their architectures, much like function libraries in the conventional processor space. As with those libraries, model zoos must be optimized to take full advantage of their architectures. By implication a new model requires the same level of tuning, a concern for new customers who must depend on the AI developer to handle that porting for them, each time they add or refine a model.

In contrast, Steve says Quadric already hosts hundreds of models on their site which simply compile without changes onto their platforms (you can still tune quantization to meet your specific needs). It’s not a model zoo, but simply a demonstration that their SDK is already mature enough to directly map a wide class of models without modification. And he notes that if your model needs an operator outside the ONNX set they already support, you can simply define that operator in C++, just as you would for say an NVIDIA accelerator.

Applications and growth

Quadric is a young company, shipping their first IP just over a year ago. Since then, they can already boast a handful of wins, especially in automotive. Customer names of course are secret, but DENSO is an investor of record. Other customer wins are in domains that reinforce the general-purpose value of the platform, in traditional camera functions, perhaps also in femtocell basebands (for MIMO processing). These two cases may or may not need AI support, but they do heavily lean on the DSP value of the platform.

This DSP capability is itself pretty interesting. Each PE can handle a mix of scalar and vector operations – up to 32b integer or 16b float – and these can be parallelized across up to 1024 PEs in a QC Ultra. So you can serve your immediate signal processing needs with high-end DSP word widths and add transformer-grade functionality to your engine later.

Sounds like a new breed of accelerator engine to me. You can learn more HERE.

Also Read:

2024 Outlook with Steve Roddy of Quadric

Fast Path to Baby Llama BringUp at the Edge

Vision Transformers Challenge Accelerator Architectures


Podcast EP236: Why Comprehensive Development Support for AI/ML is Important with Clay Johnson

by Daniel Nenni on 07-19-2024 at 10:00 am

Dan is joined by Clay Johnson, CEO of CacheQ. Clay has decades of executive experience in computing, FPGAs and development flows, including serving as Vice President of the Spartan Business Unit at Xilinx, which was later acquired by AMD.

Clay discusses the changes occurring in system design to leverage AI/ML and technologies such as large language models. Clay points out that enabling these changes doesn’t end with the development of a new chip that performs AI algorithms faster.

Rather, the availability of a comprehensive development environment to integrate new technologies into existing systems becomes the key enabler to progress. Clay describes several examples of this trend.

CacheQ’s heterogeneous development platform enables easy development, deployment and orchestration of applications across multiple cores and heterogeneous distributed compute architectures. This results in significant increases in application performance and a dramatic reduction in development time.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview: Orr Danon of Hailo

by Daniel Nenni on 07-19-2024 at 6:00 am

Orr Danon CEO Hailo

Orr Danon is the CEO and Co-Founder of Hailo. Prior to founding Hailo, Orr spent over a decade working at a leading IDF Technological Unit. During this time he led some of the largest and most complex interdisciplinary projects in the Israeli intelligence community. For the projects he developed and managed, Danon received the Israel Defense Award from the president of Israel, and the Creative Thinking Award from the Head of the Military Intelligence. Danon holds a B.Sc. in Physics & Mathematics from the Hebrew University as part of the “Talpiot” program and an M.Sc. in Electrical Engineering (cum laude) from the Tel Aviv University.

Tell us about your company?
Hailo is an edge AI-focused chipmaker. We develop specialized AI processors that enable high performance machine learning applications on edge devices such as NVRs, cameras, personal computers, vehicles, robots and more.

Hailo’s current key offerings include the Hailo-8 AI accelerator, which allows edge devices to run deep learning applications at full scale more efficiently, effectively, and sustainably; the Hailo-15 vision processor, which can be placed directly into the next generation of intelligent cameras; and the Hailo-10 GenAI accelerator, which empowers users to operate Generative AI locally and minimize reliance on cloud-based platforms.

What problems are you solving?
The Hailo AI processors bring data-center class performance to edge devices, enabling processing of advanced deep learning models in real time and with high accuracy, at very low power consumption and an attractive cost. Users can now run sophisticated AI tasks such as object detection, image enhancement, and content creation on edge devices without compromising on cost – solving previous issues with AI at the edge.

What application areas are your strongest?
We see a number of key application areas, including security, automotive, personal computers, and industrial automation.

Hailo is already serving more than 300 customers in these market segments.

Earlier in the year we announced that our Hailo-8 AI accelerator has been chosen alongside the Renesas R-Car V4H SoC to power the iMotion iDC High domain controller, advancing the future of autonomous driving. A Chinese automaker is expected to begin mass production with the domain controller in the second half of this year.

Additionally, we announced in June that Raspberry Pi had selected Hailo to provide AI accelerators for the Raspberry Pi AI Kit, the computing company’s AI-enabled add-on for Raspberry Pi 5. The partnership will empower both professional and enthusiast creators to elevate their projects and solutions in home automation, security, robotics and beyond, with advanced AI capabilities.

What keeps your customers up at night?
Our customers are concerned with ensuring high quality machine learning and AI services independently of network connectivity, and they’re concerned with their AI empowerment offering a strong performance-to-cost ratio and performance-to-power consumption ratio.

Another aspect which customers are always concerned about is the software tools which we as a silicon company provide. AI is a rapidly developing field, and the ability to respond fast to the dynamic market environment in which our customers operate depends heavily on the quality of the software toolchain, its documentation and of course the support we provide to them.

What does the competitive landscape look like and how do you differentiate?
Hailo is the only chipmaker that has designed a processor specifically for running AI applications on edge devices, taking into consideration factors like cost, size, power consumption and memory access. Other AI processors, such as GPUs, were not designed to run edge AI applications, and are therefore more costly and power consuming.

Additionally, Hailo is the only chipmaker who is offering a full range of AI processors at the single-digit Watt range – from accelerators that operate as co-processors that handle the AI models only, to full blown camera SOCs that handle both vision processing and AI video enhancement and analytics, all with a single, robust software suite that allows developers to use the same applications on different platforms.

What new features/technology are you working on?
We recently announced a $120M extended Series C fundraising round, which will be used for continued research and development, and the Hailo-10 generative AI accelerators, which unlock the power of GenAI on edge devices such as personal computers, smart vehicles, and commercial robots. Hailo-10 allows users to completely own their GenAI experiences, making them an integral part of their daily routine.

How do customers normally engage with your company?
To support the thousands of AI developers using Hailo devices, and to accommodate the growing Hailo community, we recently introduced an online developer community featuring tutorials, FAQs, and other resources to foster innovation among creators and developers. Registered members will have the opportunity to engage with a team of Hailo experts and connect with each other to share code, experiences, resources, knowledge, and more.

Visit https://hailo.ai/ for more information about our products, solutions and latest case studies or contact us here.

Also Read:

CEO Interview: David Heard of Infinera

CEO Interview: Dr. Matthew Putman of Nanotronics

CEO Interview: Dieter Therssen of Sigasi


Has ASML Reached the Great Wall of China

by Claus Aasholm on 07-19-2024 at 6:00 am

ASML Holdings 2024

Is it time to abandon the ASML stock?

The first tool company to report Q2-24 results is ASML, and the lithography leader delivered a result above the guidance of EUR5.95B. Revenue of EUR6.242B is 4.9% above guidance and 18% above last quarter’s result of EUR5.29B.
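The two percentages follow directly from the three revenue figures quoted above. A quick check in Python (the helper name is made up for the example; the inputs are the EUR billions from the text):

```python
def percent_above(value, baseline):
    """Return how far value sits above baseline, in percent."""
    return (value / baseline - 1.0) * 100.0


revenue_q2 = 6.242      # EUR billions, Q2-24 revenue
guidance = 5.95         # EUR billions, guidance figure quoted in the text
revenue_q1 = 5.29       # EUR billions, previous quarter

vs_guidance = round(percent_above(revenue_q2, guidance), 1)
vs_last_quarter = round(percent_above(revenue_q2, revenue_q1), 1)
print(vs_guidance, vs_last_quarter)  # 4.9 18.0
```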

Both operating profit and gross profit grew but not to the level of the end of last year. ASML management calls 2024 a transition year in investor communications, indicating a stronger 2025.

Tool revenue increased after a significant dip. Service Revenue is much more resilient than tool revenue, as it is dependent on the installed base of tools.

Almost all of the tool revenue growth came from memory tool sales, indicating that the memory companies are finally ready to make substantial investments in new capacity, which is much needed after the shift to HBM production.

From a product perspective, the short-term trend of EUV revenue decline continued while the immersion product sales were solid.

Immersion is a technique that passes the light through water, whose higher refractive index increases the effective numerical aperture of the optics, allowing better resolution at the same light wavelength.

Given the Chips Act and other subsidies, the ASML result is somewhat counter-intuitive as EUV is used for 3-7nm leading-edge manufacturing nodes, and immersion is used for 7-14nm. Given the US attempt to become a leading-edge manufacturing location, it could be expected that leading-edge tools would dominate revenue. This indicates that the new factories are not yet in the tooling phase.

The other significant consumer of leading-edge tools is TSMC, which reported its Q2-24 results right after ASML.

 

Although Capex spending was up, it was still just slightly above the maintenance investment level (the investment needed to offset the deterioration of the existing manufacturing assets). TSMC is likely waiting for ASML’s High-NA tool to be available. ASML has confirmed they shipped one of these babies last quarter and installed another in Veldhoven on the joint IMEC/ASML manufacturing line. The tool is priced north of $350M, and ASML is trying to reach a production capacity of 20 systems annually during the 24/25 timeline.

Despite beating the guidance and reasonable growth, the ASML share price plunged in the stock market. Are the markets losing confidence in the Lithography leader?

What about China?

The key reason for the decline is that the ASML result coincided with news that further export limitations are in the works.

Since the signing of the Chips Act, tool sales to China have exploded. While this could be expected, it seems like the US administration’s patience has run out.

The Chinese companies have not had access to the EUV systems since 2019, and the latest embargo, which began on September 23, banned sales of the immersion systems. This makes 80% of ASML’s products (from a revenue perspective) unavailable for Chinese customers.

As ASML has been allowed to ship the backlog, the effect has been delayed, and China still accounted for 49% of all tool sales in Q2-24.

This, however, is about to end abruptly as the Chinese backlog has been depleted.

The ASML backlog now reflects the embargo revenue view, and from now on, the Chinese revenue will fall to 20% of the total from the current level of 49%.

The potential new embargo would impact ASML’s service revenue, which is currently 24% of total revenue. Under a potential new embargo, ASML could lose the ability to service its Chinese customers, which is incredibly important for keeping the tools alive and productive. As the Chinese manufacturing base could deteriorate fast, this could create new opportunities for ASML as mature node capacity would grow outside China.

The longer-term view

With the likely dip in China business and a potential embargo impacting service revenue, investors are starting to panic and run away from ASML. It is worth noting that this is an amazing company founded on a philosophy of long-term cooperation with its suppliers and other stakeholders. Constant innovation drives higher productivity, and tool prices are rising at a rate that is alarming for customers.

While each tool increases productivity, it is still a hefty price if you want to be at the bleeding edge of semiconductor manufacturing.

The current ASML manufacturing plan will enable the company to deliver a $20B+ quarter (at current pricing) at the end of 2026. This is neither a given nor a forecast, and it can change with industry developments. However, it is a very strong indication that the company has faith in the long-term future of the current strategy.

Our research is focused on the business results and not on investment advice. However, if you have faith in the long-term plan of ASML, it might be too early to dump ASML shares.

Also Read:

Will Semiconductor earnings live up to the Investor hype?

What if China doesn’t want TSMC’s factories but wants to take them out?

Blank Wafer Suppliers are not Totally Blank


Blue Cheetah Advancing Chiplet Interconnectivity #61DAC

by Daniel Payne on 07-18-2024 at 10:00 am


At #61DAC, I love it when an exhibitor booth uses a descriptive tagline to explain what they do, like when the Blue Cheetah booth displayed Advancing Chiplet Interconnectivity. Immediately, I knew that they were an IP provider focusing on chiplets. I learned that several things set them apart: their IP is customizable to support specific physical and system bandwidth requirements, the interconnect IP can be configured for cost-sensitive or high-performance cases, energy and performance are optimized from 32 Gb/s down to 8 Gb/s and lower, the IP is process-ready at nodes from 16nm to 3nm, and it has been silicon-proven with reference board designs. I sat down with John Lupienski, VP Product Engineering at Blue Cheetah, to better understand what they were all about. John’s background covers roles at Cadence, Broadcom, and Motorola.

Blue Cheetah at #61DAC

Chiplet designers can opt for an industry-standard interconnect, such as UCIe or BOW, or something custom; Blue Cheetah supports either approach. Blue Cheetah follows the emerging chiplet standards closely and is an active participant in both organizations. Smaller IO core area, lower energy per bit, and tailor-fit designs are compelling reasons to talk with this IP vendor. The company can customize its IP links for each application and deliver solutions using advanced process technologies across multiple foundries, supporting both standard and advanced packaging technologies. Its IP has been used in tape-outs for chiplet interconnects ranging from 16nm down to the 4nm node.

During DAC, Baya Systems and Blue Cheetah announced their combined chiplet-optimized Network on Chip (NoC) and Physical Layer (PHY) interconnect IP offerings, making it easier and less risky to design with chiplets. Tenstorrent announced in February that it uses the Blue Cheetah die-to-die interconnect IP for its AI and RISC-V products. Tenstorrent recently announced that it also uses Baya Systems’ NoC fabric IP.

The demonstration at the booth showed test packages integrating 12nm chiplets (availability announced in May 2023) with channel lengths spanning 2mm up to 25mm. Blue Cheetah’s customers develop products for a wide variety of end markets; in addition to Tenstorrent, publicly known examples of Blue Cheetah’s customers and partners include DreamBig Semiconductor, FLC, and Ventana Microsystems.

Blue Cheetah test chip, various channel lengths

The architecture of the interconnect IP is modular, making it quicker to port to newer process nodes. John mentioned that packaging for chiplets requires an engineer to perform SI/PI analysis, as customers often use an OSAT to assemble, and each chiplet can be fabricated at different nodes, so you really want interconnect IP that has been silicon-proven. To help get you started with chiplets, they offer reference boards and software to speed up the learning curve.

Summary

SoCs have been around for decades, while the trend of using chiplets has just started in the last several years. Blue Cheetah is a trailblazer in the industry and has solidified its position with high-speed, low-latency, power-efficient D2D BlueLynx™ interface products. The company’s standards-based and customizable IP solutions are available now in 16nm, 12nm, 7nm, 6nm, 5nm, 4nm, 3nm, and below across multiple semiconductor foundries.

You can follow up with John directly or contact the company on its website for more info. The company appears at many events throughout the year, including DAC, Chiplet Summit, ISSCC, OCP Global Summit, SemIsrael Expo, and foundry events.



The China Syndrome- The Meltdown Starts- Trump Trounces Taiwan- Chips Clipped

by Robert Maire on 07-18-2024 at 8:00 am

China Syndrome
  • The chip industry got a double tap of both China & Taiwan concerns
  • Bloomberg reported the potential for draconian China chip restrictions
  • Trump threw Taiwan under the bus demanding “protection money”
  • Over-inflated chip stocks had a “rapid unscheduled disassembly”

US looking to further restrict ASML & Tokyo Electron

It has been reported by Bloomberg that the US is going to crack down further on chip equipment sales.

Unfortunately, the main targets appear to be non-US semiconductor equipment companies such as ASML & Tokyo Electron rather than US equipment companies, which sell a similar percentage of their wares to China.

Link to article on China restrictions

The US government is obviously punishing foreign firms more than US firms such as AMAT, LRCX & KLAC that are doing the same thing. Perhaps it does not want to hurt US companies…..or perhaps the government is finally realizing its efforts haven’t worked and will finally crack down on US-based sales to China.

We mentioned in our note last week the tens of millions of dollars being spent lobbying the government on behalf of US equipment companies…..maybe it’s not enough, or the government is finally realizing it needs to do more.

Foreign Direct Product rule

….says that the US can restrict foreign companies, like ASML & Tokyo Electron, from selling and servicing equipment that contains US technology.

Foreign Direct Product rule link

ASML famously bought Cymer, a US company in San Diego, for its DUV & EUV sources.

Most investors don’t know that Cymer had a lot of “star wars” defense industry technology involving high power lasers and that ASML had to get permission from US defense related officials in order to acquire Cymer. Any agreements ASML made in order to obtain permission were never publicly released, but we would imagine the US government retained some sort of influence.

The government is likely as concerned about high power laser technology as it is about chip technology.

Tokyo Electron does a lot of R&D in the US (as does ASML), so we are sure their products contain US technology in many places….it’s impossible to avoid.

That giant “sucking sound”

We had mentioned in our note last week that US equipment companies would be “sucking major wind” if they lost the 40% plus of their sales which go to China.

But it’s much worse than it appears on the surface. US chip equipment companies charge Chinese companies a whole lot more than TSMC or Samsung, so the margins are much higher on that 40% plus than on the 50%+ of non-China sales.

We would not be surprised if closer to 60% or more of profitability comes from China sales. Thus losing China sales has an oversized impact on the bottom line.
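To see how a 40% revenue share could turn into roughly 60% of profit, here is a rough sketch of the arithmetic. The 45% and 20% margins are purely hypothetical numbers picked for illustration, not reported figures, and the 40/60 revenue split simply rounds the "40% plus" mentioned above.

```python
def profit_share(rev_share_a, margin_a, margin_b):
    """Fraction of total profit from segment A, given A's revenue share and both margins."""
    rev_share_b = 1.0 - rev_share_a
    profit_a = rev_share_a * margin_a   # profit per unit of total revenue from A
    profit_b = rev_share_b * margin_b   # profit per unit of total revenue from B
    return profit_a / (profit_a + profit_b)


# Hypothetical inputs (assumptions, not company data)
china_rev_share = 0.40   # China share of equipment revenue
china_margin = 0.45      # assumed margin on China sales
other_margin = 0.20      # assumed margin on non-China sales

china_profit_share = profit_share(china_rev_share, china_margin, other_margin)
print(f"China share of profit: {china_profit_share:.0%}")
```

With these assumed margins the China share of profit works out to about 60%. If the two margins were equal, the profit share would simply match the 40% revenue share.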

US semiconductor equipment companies could actually lose money for the first time in many years if China sales were curtailed enough…..it could get very ugly very fast….

The Mafia “Don” wants “protection money” from Taiwan

Having been born and raised in New York we were very familiar with local establishments paying “protection money” to organized crime types to prevent something bad from happening…….

You can imagine the phone call from the US to Taiwan….” nice little island you got there, you wouldn’t want anything bad to happen to it, would you?”, “cut us in for 20 percent of the action on those chip things you make….”

This scenario is not as far-fetched as it sounds, as Donald Trump today suggested that the US might not defend Taiwan if it didn’t pay the US for that “protection”….so much for helping out friends and allies….obviously Ukraine will get a similar message.

This statement threw gasoline on an already raging China restriction issue that had the chip stocks in turmoil already.

If the US restricts China sales and China blockades Taiwan at Trump’s invitation, equipment sales at the number one and number two markets are at risk……a very bad day…..

The Stocks

…were obviously crushed today on this double whammy of news.

It’s not like the stocks were at low valuations to begin with. We have pointed out time and again that the stocks were overheated and overextended. We certainly think AI is the greatest thing in technology ever, but a lot of unrelated chip and chip equipment names got run up in the tsunami.

We will likely see a near term valuation reset across many names in the semi space.

Final valuations and impacts will not truly be known until the US actually publicly states what’s going on and how bad the damage will be. Until then it will be a guessing game, but only about how bad the impact will be, as it’s all negative.

Initially it will be ASML & TEL but we think this time US companies will likely finally feel some pain as well…..we just don’t know how much it will hurt……

About Semiconductor Advisors LLC

Semiconductor Advisors is an RIA (a Registered Investment Advisor), specializing in technology companies with particular emphasis on semiconductor and semiconductor equipment companies.
We have been covering the space longer and been involved with more transactions than any other financial professional in the space.
We provide research, consulting and advisory services on strategic and financial matters to both industry participants as well as investors.
We offer expert, intelligent, balanced research and advice. Our opinions are very direct and honest and offer an unbiased view as compared to other sources.

Also Read:

SEMICON West- Jubilant huge crowds- HBM & AI everywhere – CHIPS Act & IMEC

KLAC- Past bottom of cycle- up from here- early positive signs-packaging upside

LRCX- Mediocre, flattish, long, U shaped bottom- No recovery in sight yet-2025?


Evolution of Prototyping in EDA

by Daniel Nenni on 07-18-2024 at 6:00 am


As AI and 5G technologies burgeon, the rise of interconnected devices is reshaping everyday life and driving innovation across industries. This rapid evolution accelerates the transformation of the chip industry, placing higher demands on SoC design. Moore’s Law indicates that while chip sizes shrink, the number of transistors increases rapidly. It is hard to imagine achieving such highly integrated, large-scale designs without advanced EDA tools.

Tape-out is a critical and high-risk phase in chip design. Even a minor error can lead to significant financial losses and missed market opportunities. Logic or functional errors account for nearly 50% of tape-out failures, with design errors comprising 50%-70% of these functional defects. Therefore, verification of SoC design is crucial to successful tape-out. SoC verification is highly complex, taking up about 70% of the entire cycle. To accelerate time-to-market, system software development and pre-tape-out verification must be conducted concurrently, highlighting the significant advantages of prototyping.

For large-scale SoC designs, traditional software simulations often fall short due to the slow execution speed. Consequently, prototyping and hardware simulations have emerged as the primary verification methods, with high-performance prototyping taking the lead. Prototyping, particularly FPGA-based, can be thousands to millions of times faster than software simulations. It is more cost-effective and faster than hardware simulations, making it indispensable for verifying complex SoCs. However, manually built prototyping platforms are difficult to maintain and scale in multi-FPGA and complex design environments. This method is time-consuming and prone to errors, leading to increased risks of project delays and cost overruns. Commercial prototyping solutions have thus emerged to address these challenges.

The Birth of Commercial Prototyping

In 1992, Aptix, the pioneer in the prototyping area, launched the System Explorer system, utilizing FPGAs and custom interconnect chips to achieve commercial prototyping. In subsequent years, projects such as Transmogrifier-1 from the University of Toronto, AnyBoard from North Carolina State University, Protozone from Stanford University, and BORG from the University of California, Santa Cruz, explored ways to implement HDL chip designs on prototyping boards. Although these projects were not ready for large-scale commercialization, Aptix’s success sparked other vendors’ interest in this field. Despite later being absorbed in mergers, Aptix’s pioneering contributions to chip verification methodology remain historically significant.

In 2003, Toshio Nakama founded S2C in San Jose, California, after departing from Aptix. At DAC 2005, S2C unveiled its first prototyping product, the IP Porter, and soon launched the commercially successful Prodigy series. This marked a new era for the company, positioning S2C as a leader in rapid SoC prototyping solutions. Concurrently, the Dini Group in the US released its first commercial FPGA prototyping system, the DN250k10, based on six Xilinx XC4085 FPGAs, providing a flexible and cost-effective solution for design teams. Around the same period, Sweden’s HARDI Electronics AB launched its first FPGA-based prototyping system, HAPS, using Xilinx Virtex FPGAs.

Rapid Growth Driven by Competition

In 2008, Synopsys entered the prototyping market by acquiring Synplicity for $227 million, marking the start of a rapidly growing and competitive era for prototyping. Synopsys spent nearly four years integrating the technology, eventually releasing the HAPS-70 series, a fully automated prototyping product. This acquisition significantly grew the prototyping market, previously dominated by software and hardware simulation tools.

Cadence soon followed suit. Historically focused on designing its own FPGA boards, Cadence faced challenges until it acquired Taray in March 2010. Taray’s pioneering routing-aware pin assignment technology optimized FPGA design with the circuit board, aiding in the development of a robust prototyping platform. Cadence later collaborated with the Dini Group to develop the Protium prototyping product. However, Dini Group was acquired by Synopsys on December 5th, 2019. Today, Cadence focuses on streamlining the integration between its prototyping and hardware simulation products, ensuring seamless connectivity.

Siemens EDA (formerly Mentor Graphics, acquired in 2016) had a turbulent history in prototyping. In the late 1990s, Siemens EDA licensed emulation technology from Aptix but faced several challenges. To enhance its timing-driven and multi-FPGA partitioning capabilities, Siemens EDA acquired Auspy and Flexras Technologies, the latter known for its “Wasga” automatic partitioning software. In June 2021, Siemens EDA further strengthened its prototyping portfolio by acquiring PRO DESIGN’s proFPGA product series.

The entry of these major companies, along with providers like S2C, facilitated the shift from software and hardware simulation to automated prototyping solutions, enhancing the efficiency and accuracy of SoC designs, and paving the way for further innovations in the entire EDA industry.

Major Challenges and Solutions in Prototyping

Increasing complexity in SoC design has heightened demands for rigorous prototyping and driven the emergence of innovative prototyping solutions. These solutions require specialized expertise to manage design partitioning, mapping, interface and communications with external environments, debugging, and performance optimization. Consequently, prototyping has become a high-barrier field with only a few EDA companies maintaining a leading position. Some companies even rely on continuous mergers to strengthen their market presence.

As a leader in prototyping, S2C addresses challenges in multi-FPGA RTL logic partitioning, interconnect topology, IO allocation, and high-speed interfaces with timing-driven RTL partitioning algorithms and built-in incremental compilation algorithms. S2C continually updates hardware configurations to support more FPGAs and offer higher-performance connectors, ensuring its technology remains at the industry’s forefront.

With over 20 years of industry experience and a relentless commitment to innovation, S2C equips clients with the highly trusted tools necessary to stay ahead in the competitive market. Their comprehensive solutions accelerate time-to-market, offering unparalleled speed, accuracy, and reliability.

Also Read:

S2C Prototyping Solutions at the 2024 Design Automation Conference

Accelerate SoC Design: DIY, FPGA Boards & Commercial Prototyping Solutions (I)

Accelerate SoC Design: Addressing Modern Prototyping Challenges with S2C’s Comprehensive Solutions (II)

S2C and Sirius Wireless Collaborate on Wi-Fi 7 RF IP Verification System