Unleash the Power: NVIDIA GPUs, Ansys Simulation
by Daniel Nenni on 03-19-2024 at 10:00 am

ANSYS Perceive EM GPU solver for 5G/6G simulations integrated into NVIDIA Omniverse

In the realm of engineering simulations, the demand for faster, more accurate solutions to complex multiphysics challenges is ever-growing.

Simulation is a vital tool for engineers to design, test, and optimize complex systems and products. It helps engineers reduce costs, improve quality, and accelerate innovation. However, the associated high computational demands, large data sets, and multiple physics domains pose significant challenges.

A key technology instrumental in meeting this demand is the graphics processing unit (GPU). With their unique applicability to accelerating multiphysics simulations, GPUs are a game-changer driving innovation. Ansys and NVIDIA partner to deliver solutions that leverage the power of NVIDIA GPUs and the physics-based authority of Ansys software. Together, Ansys and NVIDIA are enabling engineers to solve the industry’s most computationally challenging problems.

NVIDIA GPUs for Multiphysics Simulations

NVIDIA GPUs are designed to accelerate parallel and compute-intensive tasks, such as simulations, leveraging thousands of cores and high-bandwidth memory. NVIDIA GPUs deliver orders-of-magnitude faster performance than CPUs for many simulation applications, enabling engineers to run more simulations in less time and with higher fidelity.

NVIDIA GPUs, now famous for driving the AI revolution, are renowned for their high-performance computing capabilities, making them an ideal choice for engineers and scientists working on multiphysics simulations in highly complex applications in industries such as aerospace, automotive, biomedical, and energy. These simulations account for the interaction of multiple physical phenomena, such as fluid dynamics, structural mechanics, and electromagnetics. By harnessing the parallel processing power of NVIDIA GPUs, engineers can significantly reduce simulation times and achieve more accurate results. More simulation iterations lead to more ideas and ultimately superior end products.

The innovation of 3D-IC designs requires the addition of multiphysics simulations to semiconductor design, such as electromagnetic, thermal, and structural analysis. Stacking of chiplets in close proximity within a single package brings system-level multiphysics challenges into IC design.

Solving for Multiphysics Challenges

Ansys has long been at the forefront of providing cutting-edge simulation software solutions for engineers and researchers worldwide. With a strong focus on multiphysics and semiconductor simulations, Ansys has established itself as the leader in enabling users to simulate a wide range of physical phenomena accurately. Ansys software is trusted by tens of thousands of engineers worldwide who rely on its accuracy, reliability, and scalability.

Ansys software is particularly renowned for its multiphysics simulations, enabling engineers to simulate the complex interactions among various physical processes and gain valuable insights into the behavior of their systems. Ansys offers a comprehensive suite of tools for multiphysics simulations, such as Ansys Discovery, Ansys Fluent™, Ansys HFSS™, Ansys LS-DYNA™, and Ansys SPEOS™. These tools enable engineers to perform interactive, real-time, and high-fidelity simulations of various multiphysics phenomena, such as fluid-structure interaction, electromagnetics, shock and impact, and optical performance.

ANSYS LS-Dyna crash test simulation result visualized with NVIDIA Omniverse
Ansys Product Support for NVIDIA Processors: Leveraging Grace and Hopper

Ansys harnesses NVIDIA H100 GPUs to boost multiple simulation solutions and prioritizes NVIDIA’s latest Grace and Hopper processors, along with the newly announced Blackwell architecture, for products across the Ansys portfolio, such as Fluent and LS-DYNA. By utilizing these products in conjunction with NVIDIA processors, engineers achieve faster simulation times, increased accuracy, and improved productivity in their work.

Ansys software can leverage the features and benefits of NVIDIA processors, such as:

  • Massive parallelism and high-bandwidth memory enable faster and more accurate simulations leading to better end products.
  • Unified memory and NVLink enable seamless data transfer and communication between CPU and GPU.
  • Tensor cores and ray tracing cores enable advanced simulations of artificial intelligence and optical effects.
  • Multi-GPU and multi-node support enable scalable simulations of large and complex models.
Gearbox CFD simulation using ANSYS Fluent
Driving Innovation: Benchmarks and Performance Gains

Ansys integrates support for NVIDIA processors into its flagship products to harness their immense potential in enhancing simulation performance. This collaboration between Ansys and NVIDIA unlocks new possibilities for engineers seeking to leverage the power of GPU acceleration in their simulations. Ansys has already announced its intent to support NVIDIA’s just-announced Blackwell architecture, presaging even greater simulation acceleration.

ANSYS Perceive EM GPU solver for 5G/6G simulations integrated into NVIDIA Omniverse

Compared to traditional computing methods, benchmarks demonstrate that Ansys simulations run on NVIDIA GPUs deliver significant performance gains. Engineers see a substantial reduction in simulation times, allowing for faster design iterations and more efficient problem-solving. For example:

  • Fluent enables high-fidelity and scalable fluid dynamics simulations on NVIDIA GPUs, allowing engineers to solve challenging problems such as turbulence and combustion phenomena. Fluent runs up to 5x faster on one NVIDIA H100 GPU than on two 64-core sockets of a recently released high-end CPU.
  • Ansys Mechanical™ enables fast and accurate structural mechanics simulations on NVIDIA GPUs, allowing engineers to model complex phenomena such as acoustics, vibration, and fracture dynamics. Mechanical’s matrix kernel running on 4 CPU cores speeds up by as much as 11x when one NVIDIA H100 GPU is added.
  • Ansys SPEOS enables realistic and high-performance optical simulations on NVIDIA GPUs, allowing engineers to design, measure and assess light propagation in any environment. SPEOS can run optical simulations up to 35x faster on an NVIDIA RTX™ 6000 Ada than on a recently released  8-core processor.
  • Ansys Lumerical enables comprehensive and efficient photonics simulations, allowing engineers to design and optimize photonic devices and circuits. Ansys Lumerical FDTD running on a single NVIDIA A100 GPU solves up to 40% faster than an HPC cluster containing 480 AMD EPYC 7V12 cores of a recently released high-end CPU. This equates to nearly a 6x improvement in price-performance ratio.
  • Other Ansys products such as RedHawk-SC™, Discovery, Ensight™, Rocky™, HFSS SBR+™, Perceive EM™, Maxwell™, AVxcelerate Sensors™, and RF Channel Modeler™ either already benefit or soon will benefit from NVIDIA GPU acceleration.
ANSYS AVxcelerate Sensors (Physics-based Radar, Lidar, Camera) connected to NVIDIA Drive Sim

Embracing the Future: Grace, Hopper and now Blackwell

The future of simulation is bright, and NVIDIA’s latest innovations, including the NVIDIA Grace CPU, NVIDIA Hopper GPU, and now the Blackwell architecture, promise even more impressive performance gains. Ansys is committed to optimizing its software for these next-generation platforms, ensuring engineers have access to the most powerful simulation tools available.

The combination of Ansys’ advanced simulation software and NVIDIA’s GPU technology is revolutionizing engineers’ approach to multiphysics simulations. The recently announced expanded partnership between the two companies promises even greater advances, enhanced by artificial intelligence. Using Ansys software with NVIDIA GPUs, engineers tackle complex multiphysics simulations with unprecedented speed and accuracy, paving the way for new innovations and breakthroughs in engineering and science.

Ansys and NVIDIA Pioneer Next Era of Computer-Aided Engineering  

*CoPilot & ChatGPT, both powered by NVIDIA GPUs, contributed to this blog post.

Also Read:

Ansys and Intel Foundry Direct 2024: A Quantum Leap in Innovation

Why Did Synopsys Really Acquire Ansys?

Will the Package Kill my High-Frequency Chip Design?



Synopsys Enhances PPA with Backside Routing
by Mike Gianfagna on 03-19-2024 at 6:00 am

Comparison of frontside and backside PDNs (Source: IMEC)

Complexity and density conspire to make power delivery very difficult for advanced SoCs. Signal integrity, power integrity, reliability and heat can seem to present unsolvable problems when it comes to efficient power management. There is just not enough room to get it all done with the routing layers available on the top side of the chip. A strategy is emerging to deal with the problem that seems to take a page out of the multi-die playbook. Rather than deal with the existing, single surface constraints, why not move power delivery to the backside of the chip, and get additional PPA benefit out of it? The entire fab and process equipment ecosystem is buzzing about this approach. But what about the design methodology? There is help on the way. A very informative white paper is now available from the leading EDA supplier. Read on to get the details about how Synopsys enhances PPA with backside routing.

Why Use Backside Routing?

In a typical SoC, dedicated power layers tend to be thicker, with wider traces than the signal layers, to reduce the amount of loss due to IR drop. The power delivery network, or PDN, is what brings power to all parts of the chip. PDN design requires extensive analysis of electromigration, noise, and cross-coupling effects as well as IR drop to ensure power integrity. Solving this problem by adding metal layers increases the cost and complexity of the fabrication process, if it’s even possible given process constraints.
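For a sense of the magnitudes involved, here is a back-of-the-envelope IR-drop estimate for a single power strap. The helper function and all the numbers are purely illustrative (real PDN signoff also covers electromigration, noise, and coupling, as noted above), but they show why dedicated power layers end up wider and thicker than signal layers.

# Back-of-the-envelope IR drop across one rectangular power strap.
# Illustrative values only; not a substitute for full PDN analysis.

RHO_CU = 1.7e-8  # ohm*m, bulk copper resistivity (real BEOL copper is somewhat higher)

def strap_ir_drop(length_m, width_m, thickness_m, current_a):
    """IR drop across a rectangular metal strap carrying a DC current."""
    resistance = RHO_CU * length_m / (width_m * thickness_m)
    return current_a * resistance

# A narrow, signal-like wire pressed into PDN duty: 1 mm long, 2 um wide, 0.5 um thick, 10 mA
v_narrow = strap_ir_drop(1e-3, 2e-6, 0.5e-6, 10e-3)
# The same route on a wider, thicker power layer: 4 um wide, 1 um thick
v_wide = strap_ir_drop(1e-3, 4e-6, 1e-6, 10e-3)

print(f"narrow strap drop: {v_narrow*1e3:.0f} mV")  # 170 mV -- painful on a sub-1 V supply
print(f"wide, thick strap: {v_wide*1e3:.1f} mV")    # 42.5 mV -- why PDN layers are wide and thick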

There is more to the story, which is explained well in the white paper (a link is coming). Now that manufacturing technology supports it, backside routing for the PDN is a great way to remove the obstacles, opening a new approach to implementation and the opportunity to enhance PPA. The graphic at the top of this post provides a comparison of frontside and backside PDNs. Thanks to IMEC for this depiction.

But, as they say, there’s no free lunch. For backside routing, the design process has to deal with many new problems, such as:

  • Signal integrity
    • Frontside PDN acted as a natural shield for signal integrity
    • Important to have close correlation between pre-route and post-route
  • Thermal impact
    • Thermal aware implementation to reduce impact of backside metal
  • Post-silicon observability
    • Methodologies to support robust observability
  • Multi-die and backside metal
    • Lots of synergy between the two
    • Leverage EDA technology pieces from each other

The white paper also explains what Synopsys is doing for backside routing. Here is a summary.

What Synopsys is Doing

Synopsys has embraced the use of backside routing for the PDN. The approach fits well with its design technology co-optimization (DTCO) methodology. The company has added support for backside PDNs in all relevant EDA products. The result is fast and efficient technology exploration, design PPA assessment, and design closure to accelerate the overall development process. As with many of its programs, the approach allows chip designers to adopt new silicon technology with predictable results.

A large number of additions are part of Synopsys Fusion Compiler, the industry-leading RTL-to-GDSII implementation system. The figure below summarizes the enhancements at a high level. The white paper goes into more detail about these enhancements and the measurable impact on chip design results.

Overview of Synopsys Fusion Compiler Enhancements

The white paper also discusses potential future additions to expand the use of backside routing even further.

To Learn More

Backside routing is here. The foundry ecosystem is delivering this capability and design teams need an enhanced flow to take advantage of the benefits, both today and tomorrow. Synopsys is at the leading edge of this trend and the new white paper provides important details. You can get a copy of the new Synopsys white paper here. And that’s how Synopsys enhances PPA with backside routing.



Afraid of mesh-based clock topologies? You should be
by Daniel Payne on 03-18-2024 at 10:00 am

mesh-based clock topology

Digital logic chips synchronize all logic operations by using a clock signal connected to flip-flops or latches, and the clock is distributed across the entire chip. The ultimate goal is to have the clock signal arrive at the exact same moment in time at every clocked element. If the clock arrives too early or too late at any flip-flop or latch along its path from the PLL output, that time difference impacts the critical path delays and the maximum achievable clock frequency. An architect or RTL designer views the clock as a perfectly defined square wave with no delays, while engineers doing timing analysis or physical design know that clock signals are starting to look more like sine waves than square waves, and that there are delays along the clock tree that depend on the topology of the clock network. At small process nodes, On-Chip Variation (OCV) makes delays in logic and clock networks differ from ideal conditions, so clock designers resort to adding timing margins.
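To see how those margins translate into lost performance, here is a back-of-the-envelope setup-timing budget. The delay values are made up for illustration, and the simple formula pessimistically adds the skew margin straight onto the required clock period:

# Toy setup-timing budget: how clock skew/OCV margin reduces achievable frequency.
# All delay values are illustrative, not tied to any particular process node.

t_clk_to_q = 0.05   # ns, launch flop clock-to-Q delay
t_comb     = 0.60   # ns, worst-case combinational (critical path) delay
t_setup    = 0.04   # ns, capture flop setup time

def fmax_ghz(skew_margin_ns):
    """Max clock frequency when skew/OCV margin is added to the required period."""
    return 1.0 / (t_clk_to_q + t_comb + t_setup + skew_margin_ns)

print(f"ideal clock, zero skew:              {fmax_ghz(0.00):.2f} GHz")
print(f"tree clock, 80 ps skew + OCV margin: {fmax_ghz(0.08):.2f} GHz")
print(f"mesh clock, 20 ps skew margin:       {fmax_ghz(0.02):.2f} GHz")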

Two popular clock topologies are tree and mesh, so a comparison reveals the differences between each.

Tree and Mesh Clock topologies. Source: GLVLSI 10, May 16-18, 2010

                     Tree                     Mesh
Shared path depth    Higher                   Lower
Timing analysis      Static Timing Analysis   SPICE
Power                Lower                    Higher
OCV                  More sensitive           More tolerant
Clock speed          Lower                    Higher
Clock skew           Higher                   Lower
Routing resources    Lower                    Higher

With a tree topology, the EDA flow is highly automated: Clock Tree Synthesis (CTS) is built into popular tool flows that tie logic synthesis to place and route, with timing analysis run on a Static Timing Analysis (STA) tool. The downsides of a tree topology are its sensitivity to OCV, lower clock speeds, and higher clock skew. In modern process nodes, the aging of P- and N-channel devices changes the duty cycle of the clock, so it may not be 50% high and 50% low, which in turn impacts critical path delays.

The mesh topology for clocks provides the highest clock speeds, higher tolerance to OCV, and the lowest clock skew. A mesh topology has the downsides of higher routing resource usage, higher power consumption, and the need for SPICE for timing analysis. An STA tool cannot be used to analyze a mesh, because in a mesh the clock signal has paths that combine. The only information an STA tool could provide for a mesh-based clock design is setup and hold analysis, not critical path analysis.

There is also a middle ground, where the best aspects of tree and mesh topologies are combined, so there are choices for clock topology that are driven by your product requirements.

SPICE circuit simulation using an extracted IC netlist with parasitics is required for timing analysis of mesh-based clock topologies. For accurate analysis of OCV effects, a Monte Carlo simulation using SPICE is required, which is a very time-consuming step, and your SPICE simulator may not have the capacity for such a large extracted netlist. If your chip design group is intimidated by using SPICE for analyzing clock timing in a mesh-based topology, there’s some good news: the EDA vendor Infinisim has an easy-to-use product called ClockEdge that provides clock timing analysis without you having to be a SPICE expert. The analysis provided by ClockEdge will help your team implement a mesh-based clocking topology quickly and with minimal training.
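To illustrate the statistical nature of the problem, here is a toy Monte Carlo sketch in plain Python of how independent per-buffer variation spreads the skew between two clock branches. It is only a statistical cartoon with invented delay numbers, not SPICE and not how ClockEdge works internally:

# Toy Monte Carlo of on-chip variation (OCV) on two clock branches.
# Invented numbers; real mesh analysis needs SPICE on the extracted netlist.
import random
import statistics

random.seed(7)

def branch_delay(n_buffers, nominal_ps=25.0, sigma_frac=0.05):
    """Sum of buffer delays, each varied independently to mimic local variation."""
    return sum(random.gauss(nominal_ps, sigma_frac * nominal_ps) for _ in range(n_buffers))

N = 10_000
skews = [abs(branch_delay(8) - branch_delay(8)) for _ in range(N)]

print(f"mean skew: {statistics.mean(skews):5.1f} ps")
print(f"p99 skew : {sorted(skews)[int(0.99 * N)]:5.1f} ps")

A mesh improves on this partly because shorting many driver outputs together averages out exactly this kind of branch-to-branch variation, which is also what makes the resulting network impossible to analyze path-by-path with STA.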

Summary

SoC designers tackle many technical issues to reach their Power, Performance and Area (PPA) goals, and choosing a clock topology is one of these issues. Most modern chip teams will be attracted to the benefits of using a combined tree and mesh topology for their clock network, as it provides high clock rates, low skew, acceptable routing resource usage, and tolerance to OCV effects. The timing analysis of mesh-based clock networks is now simplified by using the ClockEdge tool from Infinisim, as it provides analysis for metrics such as rail-to-rail failures, duty-cycle distortion, slew rate and transition distortion, and power-supply-induced jitter.

Related Blogs

Can Correlation Between Simulation and Measurement be Achieved for Advanced Designs?

Can Correlation Between Simulation and Measurement be Achieved for Advanced Designs?
by Mike Gianfagna on 03-18-2024 at 6:00 am


“What you simulate is what you get.” This is the holy grail of many forms of system design. Achieving a high level of accuracy between predicted and actual performance can cut design time way down, resulting in better cost margins, time to market and overall success rates. Achieving a high degree of confidence in predicted performance is not an easy task. Depending on the type of design being done, there are many processes and methods that must be executed flawlessly to achieve the desired result. There was a panel devoted to this topic at the recent DesignCon in Santa Clara, CA. Experts looked at the problem from several different perspectives.  Read on to learn more – can correlation between simulation and measurement be achieved for advanced designs?

About the Panel

The DesignCon panel was entitled, Extreme Confidence Simulation for 400-800G Signal Integrity Design. The event was organized by Wild River Technology, a supplier of products and services for advanced signal integrity design. Samtec also participated in the panel. Samtec and Wild River represented the two companies on the panel that focus on products and services specifically targeted to support advanced signal integrity designs. The balance of the panel included companies that focus on design methodology/tools and advanced product development, so all points of view were represented. Here is a summary of who participated – all have impressive credentials.

I will summarize the comments from Samtec and Wild River Technology on correlation for advanced designs next, since these two points of view are wholly focused on correlation accuracy rather than on product design or design methodology.

Al Neves, Founder & Chief Technology Officer, Wild River Technology

Al has over 39 years of experience in the design and application development of semiconductor products and capital equipment, focused on jitter and signal integrity analysis. He has been successfully involved with numerous business development and startup activities for the last 17 years. Al focuses on measurement-based model development, ultra-high signal integrity serial link characterization test fixtures, high-speed test fixture design, and platforms for material identification and measurement-simulation correlation to 110 GHz.

Scott McMorrow, Strategic Technologist, Samtec

Scott currently serves as a Strategic Technologist for Samtec, Inc. As a consultant for many years, Scott has helped many companies develop high performance products, while training signal integrity engineers. He is a frequent author and spokesperson for Samtec.

Gary Lytle, Product Management Director, Cadence

Gary leads product strategy, positioning, sales enablement and demand generation for Cadence electromagnetic simulation technologies. He has held many positions in the RF and simulation industry, including Technical Director with ANSYS, Inc., Lead Antenna Design Engineer with Dielectric Communications, Combat Systems Engineer with General Dynamics, and Engineering Manager with Amphenol.

Cathy Liu, Distinguished Engineer, Broadcom

Cathy Ye Liu currently heads up Broadcom’s SerDes architecture and modeling group. Since 2002, she has been working on high-speed transceiver solutions. Previously, she developed read channel and mobile digital TV receiver solutions.

 

Jim Weaver, Senior Design & Signal Integrity Engineer, Arista Networks

Jim is responsible for design and analysis of large switches for cloud computing and high bit rate serial links. Jim has over 40 years of experience in system design, including 20 years of signal integrity experience, and is heavily involved with IEEE802.3dj electrical specification work.

Todd Westerhoff, High-Speed Design Product Marketing at Siemens EDA

Todd Westerhoff moderated the panel. He has over 42 years of experience in electronic system modeling and simulation, including 25 years of signal integrity experience. Prior to joining Siemens EDA, he held senior technical and management positions at SiSoft, Cisco and Cadence. He also worked as an independent signal integrity consultant developing analysis methodologies for major systems and IC manufacturers.

The focus of the panel was defined this way:

What’s the point in running detailed simulations if the PCB test vehicle you fabricate and assemble performs differently than you had predicted? This panel will discuss issues associated with achieving tight and repeatable correlation between simulation and measurement for structures such as vias, connector launches, transmission lines, etc. and the channels that contain them.

This correlation allows us to perform what we call “Extreme Confidence Simulation”. A wide set of simulation topics will be addressed that are focused on the epic signal integrity challenges presented by 400-800G communication.

Key Takeaways – Samtec

Scott provided his views and experience on correlation for advanced designs, beginning with the observation that, in order to correlate measurements to simulation, it is necessary to understand the limits of the methods. We assume our simulations are correct given correct modeling inputs. Further, we assume our measurements are correct given the best measurement methods. But are they?

Scott pointed out that there is a statistical probability of error in both the simulations and the measurements that has nothing to do with correct modeling of materials. Therefore, we need to understand these to improve our measurement to model correlation.

Scott then dove into significant detail to discuss HFSS simulation maximum delta S criteria, HFSS simulation convergence criteria, high frequency phase accuracy, transmission uncertainty, Mcal insertion loss error, and Mcal delay error.

Scott concluded his talk with a summary of what’s needed to understand the limits of measurement. For simulation modeling, understanding the convergence controls needed to achieve the necessary level of correlation is mandatory. He pointed out that for all but metrology-grade VNA measurements, phase (delay) error is low enough to be accurate to within several hundred femtoseconds, which is fortunate for material identification problems.

But below 10 GHz, he warned of incorrect phase creeping in, altering the starting point for material identification, and creating time domain causality issues. At low frequencies, he suggested using a separate method to validate the low frequency and DC characteristics of the material, where the accuracy is higher.

A final comment from Scott: Separate correlation to individual structures so that accuracy can be preserved in both simulation and measurement.

Key Takeaways – Wild River Technology

Al took a direct approach to the topic, pointing out that EDA tools are not standards: “There is nothing ‘golden’ about them (sorry). Believing EDA tools are standards can corrupt the path to high-speed design confidence.” He went on to explain that the path to simulation-to-measurement confidence is a hard road that takes a lot of work, and it’s uncompromising.

The hard work is EDA calibration/benchmarking and building systematic approaches using advanced test fixtures (material ID, verification of models, etc.). The bottom line is that all EDA tools have issues, and it is our job to identify and work around them.

Al then spent some time on the importance of calibration and metrics. He explained that better calibration is required for simulation-measurement correlation. For example, sliding load cal performance is required for good sim-measure correspondence. He felt the industry was over-reliant on easy-to-use ECAL and has neglected good mechanical cals. Al coined the phrase “EDA Metrics Matter.” His concluding points were:

  • Mindset matters
  • You cannot ignore Maxwell
  • The world of >70GHz is not in good shape for signal integrity
  • Metrics will be very useful

Summary, and Next Steps

There were similar messages from Scott and Al at this panel: understanding how to calibrate results and factor in all sources of error, including an understanding of the materials being used, is important.

Samtec offers a vast library of information on calibration and measurement accuracy. You can explore Samtec’s technical library here. I’m a fan of the gEEk spEEk webinars. You can explore the extreme signal integrity products and services offered by Wild River Technology here. So, can correlation between simulation and measurement be achieved for advanced designs? With the right approach and the right partners, I believe it can.



Measuring Local EUV Resist Blur with Machine Learning
by Fred Chen on 03-17-2024 at 10:00 am


Resist blur remains a topic that is relatively unexplored in lithography. Blur has the effect of reducing the difference between the maximum and minimum doses in the local region containing the feature. Blur is particularly important for EUV lithography, since EUV is prone to stochastic fluctuations and is also driven by secondary electron migration, which is itself a significant source of blur [1].

While optical sources of blur, such as defocus, flare, and EUV dipole image fading [2], can be considered as independent of wafer location, non-optical sources, such as from electron migration or acid diffusion, can have a locally varying behavior. It is therefore important to have some way to characterize and/or monitor the local blur in a patterned EUV resist.

The most straightforward way is to have a resist pattern that covers the whole exposure field with adequate resolution-scale sampling. A practical choice for a 0.33 NA EUV system would be a 20 nm half-pitch hole or pillar array, which gives equal sampling in the x and y directions. It is also practically at the resolution limit for contact/via patterning due to stochastic variations [3,4]. As shown in the example of Figure 1, a large enough blur, e.g., 20 nm, is enough to make the contact go missing. Such a large blur may result from local resist inhomogeneities as well as an occasionally large electron range.

Figure 1. 20 nm half-pitch via pattern, at 20 mJ/cm2 absorbed dose (averaged over 40 nm x 40 nm cell), with different values of blur. Quadrupole illumination is used with a darkfield mask. Secondary electron quantum yield = 2. A Gaussian was fit to the half-pitch via.
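For a rough feel of why a 20 nm blur is fatal at this pitch, recall that convolving an image with a Gaussian of width σ attenuates the modulation of a sinusoidal component of pitch p by exp(-2π²σ²/p²). The quick numeric check below uses that simplified 1-D relation only for intuition; Figure 1 itself comes from a fuller stochastic imaging calculation.

# Modulation attenuation of a sinusoidal image component of pitch p by a Gaussian blur
# of width sigma: attenuation = exp(-2*pi^2*sigma^2/p^2) (Fourier transform of a Gaussian).
# Simplified 1-D intuition only.
import math

pitch_nm = 40.0  # 20 nm half-pitch via array

for sigma_nm in (5.0, 10.0, 20.0):
    atten = math.exp(-2 * math.pi**2 * sigma_nm**2 / pitch_nm**2)
    print(f"blur sigma = {sigma_nm:4.1f} nm -> remaining modulation = {atten:.3f}")
# sigma = 20 nm leaves less than 1% of the modulation, so the via effectively vanishes.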

One can envisage that machine learning methods [5] can be used to match via appearance to the most likely blur at a given location, allowing a blur map to be generated for the whole exposure field. It should also be noted that the rare large local blur scenario is consistent with the rare occurrence of stochastic defects [6]. Thus, studying local blur is important for a basic understanding not just of the resist but also of the origin of stochastic defects.
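To make the idea concrete, here is a deliberately tiny, synthetic stand-in for such an approach: build a library of simulated 1-D via profiles at known blur values, then assign a noisy “measured” profile to its nearest library entry. The grid, noise level, and nearest-neighbor matching are all assumptions chosen for illustration; a practical implementation would work on SEM images with a trained model [5].

# Toy illustration of inferring local blur from feature appearance.
# Synthetic 1-D profiles and nearest-neighbor matching stand in for the real,
# image-based machine learning approach; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(-60, 61, 1.0)                      # nm grid around one via

def blurred_via(sigma_nm, cd_nm=20.0):
    """Ideal via opening of width cd_nm convolved with a Gaussian blur of width sigma_nm."""
    ideal = (np.abs(x) <= cd_nm / 2).astype(float)
    kernel = np.exp(-x**2 / (2 * sigma_nm**2))
    return np.convolve(ideal, kernel / kernel.sum(), mode="same")

sigmas = np.arange(2.0, 22.0, 2.0)               # candidate blur values, nm
library = np.stack([blurred_via(s) for s in sigmas])

# Pretend this came from a measurement: true blur 12 nm plus some noise
measured = blurred_via(12.0) + rng.normal(0.0, 0.02, size=x.size)

errors = ((library - measured) ** 2).sum(axis=1)  # nearest neighbor in profile space
print(f"estimated local blur: {sigmas[np.argmin(errors)]:.0f} nm")  # expect ~12 nm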

References

[1] P. Theofanis et al., Proc. SPIE 11323, 113230I (2020).

[2] J-H. Franke, T. A. Brunner, and E. Hendrickx, J. Micro/Nanopattern. Mater. Metrol. 21, 030501 (2022).

[3] W. Gao et al., Proc. SPIE 11323, 113231L (2020).

[4] F. Chen, “Via Shape Stochastic Variation in EUV Lithography,” https://www.youtube.com/watch?v=Cj1gfDV7-GE

[5] C. Bishop, Pattern Recognition and Machine Learning, https://www.microsoft.com/en-us/research/publication/pattern-recognition-machine-learning/

[6] F. Chen, “EUV Stochastic Defects from Secondary Electron Blur Increasing with Dose,” https://www.youtube.com/watch?v=Q169SHHRvXE, “Modeling EUV Stochastic Defects with Secondary Electron Blur,” https://www.linkedin.com/pulse/modeling-euv-stochastic-defects-secondary-electron-blur-chen

This article first appeared in LinkedIn Pulse: Measuring Local EUV Resist Blur with Machine Learning

Also Read:

Pinning Down an EUV Resist’s Resolution vs. Throughput

Application-Specific Lithography: Avoiding

Non-EUV Exposures in EUV Lithography Systems Provide the Floor for Stochastic Defects in EUV Lithography

Stochastic Defects and Image Imbalance in 6-Track Cells



Podcast EP212: A View of the RISC-V Landscape with Synopsys’ Matt Gutierrez
by Daniel Nenni on 03-15-2024 at 10:00 am

Dan is joined by Matt Gutierrez. Matt joined Synopsys in 2000 and is currently Sr. Director of Marketing for Processor & Security IP and Tools. His current responsibilities include the worldwide marketing of ARC Processors and Subsystems, Security IP, and tools for the development of application-specific instruction set processors. Prior to joining Synopsys, Matt held various technical and management positions with companies such as Cypress Semiconductor, Fujitsu Limited, and The Silicon Group. Matt has over 25 years of experience in the semiconductor, computer systems, and EDA industries.

Matt provides an overview of what’s happening in custom processors and the impact of the RISC-V ISA. Matt also discusses what Synopsys is doing to enable application-specific processor design, including the recent announcement of its ARC-V processor IP.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.



CEO Interview: Patrick T. Bowen of Neurophos
by Daniel Nenni on 03-15-2024 at 6:00 am

Patrick T. Bowen

Patrick is an entrepreneur with a background in physics and metamaterials. Patrick sets the vision for the future of the Neurophos architecture and directs his team in research and development, particularly in metamaterials design. He has a Master’s degree in Micro-Nano systems from ETH Zurich and PhD in Electrical Engineering from Duke University, under Prof. David Smith. After graduation, Patrick cofounded Metacept with Prof. Smith; Metacept is the world’s foremost metamaterials commercialization center and consulting firm.

Tell us about Neurophos. What problems are you solving?
We say we exist to bring the computational power of the human brain to artificial intelligence. Back in 2009 it was discovered that GPUs are much better at recognizing cats on the internet than CPUs are, but GPUs are not the answer to the future of AI workloads. Just as GPUs were better than CPUs for neural networks, there could be architectures that are better than GPUs by orders of magnitude. Neurophos is what comes next for AI after GPUs.

AI large language models in general have been limited because we haven’t had enough compute power to fully realize their potential. People have focused primarily on the training side of it, just because you had to train something useful before you could even think about deploying it. Those efforts have highlighted the incredible power of large AI models, and with that proof people are starting to focus on how to deploy AI at scale. The power of those AI models means we have millions of users who will use them every day. How much energy does it cost per user? How much does the compute cost per inference? If it’s not cheap enough per inference, that can be a very limiting thing for businesses  that want to deploy AI.

Energy efficiency is also a big problem to solve. If you have a server that burns, say, 6 kilowatts, and you want to go 100 times faster but do nothing about the fundamental energy efficiency, then that 6 kilowatt server suddenly becomes a 600 kilowatt server. At some point you hit a wall; you’re simply burning too much power and you can’t suck the heat out of the chips fast enough. And of course there are climate-change issues layered on top of that. How much energy is being consumed by AI? How much additional energy are we wasting just trying to keep data centers cool? So, someone needs to first solve the energy efficiency problem, and then you can go fast enough for the demands of the applications.

People have proposed using optical compute for AI for nearly as long as AI has existed. There are a lot of ideas that we work on today that are also old ideas from the 80s. For example, the original equations for the famous “metamaterials invisibility cloak”, and other things like the negative index of refraction, can be traced back to Russian physicists in the 60s and 80s. Even though it was sort of thought of, it was really reinvented by David Smith and Sir John Pendry.

Similarly, systolic arrays, which are typically what people mean when they say “tensor processor”, are an old idea from the late 70s. Quantum computing is an old idea from the 80s that we resurrected today. Optical processing is also an old idea from the 80s, but at that time we didn’t have the technology to implement it. So with Neurophos, we went back to reinventing the optical transistor, creating from the ground up the underlying hardware that’s necessary to implement the fancy optical computing ideas from long ago.

What will make customers switch from using a GPU from Nvidia, to using your technology?
So, the number one thing that I think most customers care about really is that dollars per inference metric, because that’s the thing that really makes or breaks their business model. We are addressing that metric with a solution that truly can increase the speed of compute by 100x relative to a state of the art GPU, all within the same power envelope.

The environmental concern is also something that people care about, and we are providing a very real solution to significantly mitigate energy consumption directly at one of its most significant sources: datacenters.

If you sit back and think about how this scales… someone has to deliver a solution here, whether it’s us or someone else. Bandwidth in chip packaging is roughly proportional to the square root of the area and power consumption in chip packaging is generally proportional to the area. This has led to all sorts of contorted ways in which we’re trying to create and package systems.

Packaging is one of the things that’s really been revolutionary for AI in general. Initially it was about cost and being able to mix chiplets from different technology nodes, and most of all, about memory access speed and bandwidth because you could integrate with DRAM chips. But now you’re just putting more and more chips in there!

Using the analog compute approach restores power consumption for compute down to the square root of area instead of proportional to area. So now the way in which your compute and power consumption scales goes the same way; you are bringing them into balance.

We believe we’ve developed the only approach to date for analog in-memory compute that can actually scale to high enough compute densities to bring these scaling laws into play.

How can customers engage with Neurophos today? 
We are creating a development partner program and providing a software model of our hardware that allows people to directly load PyTorch code and compile that. That provides throughput and latency metrics and how many instances per second etc. to the customer. It also provides data back to us on any bottlenecks for throughput in the system, so we can make sure we’re architecting the overall system in a way that really matters for the workloads of customers.

What new features/technology are you working on?
Academics have for a long time sort of dreamt about what they might do if they had a metasurface like we’re building at Neurophos, and there are a lot of theoretical papers out there… but no one’s ever actually built one. We’re the first ones to do it. In my mind most of the interesting applications are really for dynamic surfaces, not for static ones, and there is other work going on at Metacept, Duke, and at sister companies like Lumotive that I, and I think the world, will be pretty excited about.

Why have you joined the SC Incubator and what are Neurophos’ goals in working with their organization over the next 24 months?

Silicon Catalyst has become a prestigious accelerator for semiconductor startups, with a high bar for admission.  We are excited to have them as a partner.  Hardware startups have a big disadvantage relative to software startups because of their higher demo/prototype cost and engineering cycle time, and this is even more true in semiconductor startups where the EDA tools and mask costs and the sheer scale of the engineering teams can be prohibitively expensive for a seed stage company.  Silicon Catalyst has formed a pretty incredible ecosystem of partners that provide significant help in reducing their development cost and accelerating their time to market.

Also Read:

A Candid Chat with Sean Redmond About ChipStart in the UK

CEO Interview: Jay Dawani of Lemurian Labs

Seven Silicon Catalyst Companies to Exhibit at CES, the Most Powerful Tech Event in the World



Checking and Fixing Antenna Effects in IC Layouts
by Daniel Payne on 03-14-2024 at 10:00 am

Planar CMOS cross-section – antenna DRC

IC layouts go through extensive design rule checking to ensure correctness before being accepted for fabrication at a foundry or IDM. There’s something called the antenna effect that happens during chip manufacturing, where plasma-induced damage (PID) can lower the reliability of MOSFET devices. Layout designers run Design Rule Checks (DRC) to find areas at risk of PID and then make edits to pass all checks.

A traditional antenna design rule measures the ratio of metal (or via) layer area to connected MOSFET gate area, and if that area ratio is too large, the layout must be fixed by adding a protection diode.

Planar CMOS cross-section – antenna DRC
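Conceptually, the traditional check boils down to bookkeeping an area ratio per gate-connected node. The sketch below is a highly simplified illustration with a hypothetical threshold and made-up node data; it is not an actual foundry rule deck and not how a signoff DRC tool is implemented.

# Highly simplified antenna-ratio check: compare collected metal area on a node to the
# connected gate area and flag nodes over a limit unless a protection diode is present.
# Threshold and node data are hypothetical; real rules are per-layer, cumulative,
# and foundry specific.

ANTENNA_RATIO_LIMIT = 400.0   # hypothetical maximum allowed metal-area / gate-area ratio

def check_node(name, metal_area_um2, gate_area_um2, has_protection_diode):
    ratio = metal_area_um2 / gate_area_um2
    violation = ratio > ANTENNA_RATIO_LIMIT and not has_protection_diode
    print(f"{name:16s} ratio = {ratio:7.1f}  {'VIOLATION' if violation else 'ok'}")
    return violation

check_node("net_short",      metal_area_um2=1200.0, gate_area_um2=5.0, has_protection_diode=False)
check_node("net_long_bus",   metal_area_um2=9000.0, gate_area_um2=2.0, has_protection_diode=False)
check_node("net_long_fixed", metal_area_um2=9000.0, gate_area_um2=2.0, has_protection_diode=True)

It is exactly this kind of simple per-node arithmetic that breaks down in the multi-well, multi-power-domain scenarios described next.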

One IC layout scenario that a traditional DRC for antenna effects cannot handle is for AMS designs that have multiple power domains, using multiple isolated P-type wells as shown below. A new approach called path-based verification is required for the following four scenarios.

  • Risk connection with a PID issue
  • Imbalanced area ratios between metal layers and well layers from two isolated wells
  • Complex connectivity
  • Unintentional protection diodes

These four layout scenarios can only be detected by an EDA tool that knows about devices, connectivity, and electrical paths during the area calculations for metal and MOSFET gate layers. This is where the Calibre PERC tool from Siemens EDA comes in, as it can perform the complex path-based checks to identify PID areas, find electrostatic discharge (ESD) issues, and locate other paths that your design group is looking for. Here’s the PID flow for using Calibre PERC:

PID flow using Calibre PERC

Using this flow on an IC layout and viewing the results in the Calibre RVE results viewer showed that a PID violation was found, because a risk connection was established at the metal1 level, but the protection connection didn’t happen until the metal2 level.

PID violation at metal2 layer

The next PID violation was identified from imbalanced area ratios of metal layer and the N-buried layer (nbl). The area highlighted in purple (rve) is the victim device.

Imbalanced area PID issue

To get complete PID coverage, your design team will have to use both the traditional DRC-based antenna checks and the path-based checks. Run DRC-type checks early in the design stages as a preventative step. As more metal connections in a layout are completed and paths across isolated P-type wells are formed, it’s time to add path-based verification, providing complete coverage.

In this early IC layout it’s time to run traditional DRC-based antenna checks to confirm the layout passes PID validation.

Prevent PID issues before all metal connections completed

As more metal paths are added to the IC layout, it’s time to use the path-based tool, because it properly understands both the risk connection and the protection connection.

Run Calibre PERC path-based checks

Summary

IC layouts must meet rigorous design rules to pass the reliability and yield requirements set by the foundry or fab process being used. Traditional DRC-based antenna design rules can still be used for early-stage layout, but as more metal layers are added to complete the interconnects, path-based checking with Calibre PERC becomes necessary.

As the paths across isolated P-wells are established, the path-based flow of Calibre PERC can be used to check the IC layouts at IP, block/module and even full-chip levels for signoff. So it’s recommended to use both flows together to meet the reliability and yield goals.

Read the Technical Paper at Siemens online.

Related Blogs



Arteris is Unleashing Innovation by Breaking Down the Memory Wall
by Mike Gianfagna on 03-14-2024 at 6:00 am

Arteris is Unleashing Innovation by Breaking Down the Memory Wall
(courtesy of Arteris)

There is a lot of discussion about removing barriers to innovation these days. Semiconductor systems are at the heart of unlocking many forms of technical innovation, if only we could address issues such as the slowing of Moore’s Law, reduction of power consumption, enhancement of security and reliability and so on. But there is another rather substantial barrier that is the topic of this post. It is the dramatic difference between processor and memory performance. While systems of CPUs and GPUs are delivering incredible levels of performance, the memories that manage critical data for these systems are lagging substantially. This is the memory wall problem, and I would like to examine how Arteris is unleashing innovation by breaking down the memory wall.

What is the Memory Wall?

The graphic at the top of this post illustrates the memory wall problem. You can see the steady increase in performance of single-threaded CPUs depicted by the blue line. The green line shows the exponential increase in performance being added by clusters of GPUs. The performance increase of GPUs vs. CPUs is estimated to be 100X in 10 years – a mind-boggling statistic. As a side note, you can see that the transistor counts for both CPUs and GPUs cluster around a similar straight line. GPU performance is delivered by doing fewer tasks much faster as opposed to throwing more transistors at the problem.

Many systems today are a combination of a number of CPUs doing broad management tasks with large numbers of GPUs doing specific tasks, often related to AI. The combination delivers the amazing throughput we see in many products. There is a dark side to this harmonious architecture that is depicted at the bottom of the chart. Here, we see the performance data for the various memory technologies that deliver all the information for these systems to process. As you can see, delivered performance is substantially lower than the CPUs and GPUs that rely on these memory systems.

This is the memory wall problem. Let’s explore the unique way Arteris is solving this problem.

The Arteris Approach – A Highly Configurable Cache Coherent NoC

 A well-accepted approach to dealing with slower memory access speed is to pre-fetch the required data and store it in a local cache. Accessing data this way is far faster – a few CPU cycles vs. over 100 CPU cycles. It’s a great approach, but it can be daunting to implement all the software and hardware required to access memory from the cache and ensure the right data is in the right place at the right time, and consistent across all caches. Systems that effectively deliver this solution are called cache coherent, and achieving this goal is not easy. A software-only coherency implementation, for example, can consume as much as ~25% of all CPU cycles in the system, and is very hard to debug. SoC designers often choose cache coherent NoC hardware solutions instead, which are transparent to the software running on the system.
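To put the “few cycles versus over 100 cycles” point in perspective, the textbook average memory access time (AMAT) arithmetic looks like this; the cycle counts below are illustrative, chosen to line up with the figures just mentioned:

# Average memory access time: AMAT = hit_time + miss_rate * miss_penalty.
# Cycle counts are illustrative, in line with "a few cycles vs. over 100 cycles" above.

def amat(hit_cycles, miss_rate, miss_penalty_cycles):
    return hit_cycles + miss_rate * miss_penalty_cycles

dram_only  = amat(hit_cycles=0, miss_rate=1.00, miss_penalty_cycles=120)  # every access goes to memory
with_cache = amat(hit_cycles=4, miss_rate=0.05, miss_penalty_cycles=120)  # 95% of accesses hit locally

print(f"no local cache : {dram_only:.0f} cycles per access")
print(f"95% cache hits : {with_cache:.0f} cycles per access")  # 10 cycles -- an order of magnitude better

The hard part, of course, is keeping those local caches consistent across all the agents that share the data, which is exactly the coherency burden that cache coherent NoC hardware takes off the software’s plate.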

Andy Nightingale

Recently, I had an opportunity to speak with Andy Nightingale, vice president product management & marketing at Arteris. Andy did a great job explaining the challenges of implementing cache coherent systems and the unique solution Arteris has developed to cope with these challenges.

It turns out development of a reliable and power efficient cache coherent architecture touches many hardware and software aspects of system design. Getting it all to work reliably, efficiently and hit the required PPA goals can be quite difficult. Andy estimated that all this work could require 50 engineering years per project. That’s a lot of time and cost.

The good news is that Arteris has substantial skills in this area and has built a complete cache coherent architecture into one of its network-on-chip (NoC) products. Andy described Ncore, a complete cache coherent NoC offered by Arteris. Management of memory access fits well into the overall network-on-chip architecture that Arteris is known for. Ncore manages the cache coherent part of the SoC transparently to software – freeing the system designer to focus on the higher-level challenges associated with getting the CPU and all those GPUs to perform the task at hand.

Andy ran down a list of Ncore capabilities that was substantial:

  • Productive: Connect multiple processing elements, including Arm and RISC-V, for maximum engineering productivity and time-to-market acceleration, saving 50+ person-years per project.
  • Configurable: Scalable from heterogeneous to mesh topologies, supporting CHI-E, CHI-B, and ACE coherent, as well as ACE-Lite IO coherent interfaces. Ncore also enables AXI non-coherent agents to act as IO coherent agents.
  • Ecosystem Integration: Pre-validated with the latest Arm v9 automotive cores, delivering on a previously announced partnership with Arm.
  • Safe: Supports ASIL B to ASIL D requirements for automotive safety applications and is ISO 26262 certified.
  • Efficient: Smaller die area, lower power, and higher performance by design, compared with other commercial alternatives.
  • Markets: Suitable for Automotive, Industrial, Enterprise Computing, Consumer and IoT SoC solutions.

Andy detailed some of the benefits achieved on a consumer SoC design. These included streamlined chip floorplanning thanks to the highly distributed architecture, promoting efficient resource utilization. The Arteris high-performance interconnect with a high-bandwidth, low-latency fabric ensured seamless data transfer and boosted overall system performance.

Digging a bit deeper, Ncore also provides real-time visibility into the interconnect fabric with transaction-level tracing, performance monitoring, and error detection and correction. All these features facilitate easy debugging and superior product quality. The comprehensive ecosystem support and compatibility with industry-standard interfaces like AMBA, also facilitate easier integration with third-party components and EDA tools.

This was a very useful discussion. It appears that Arteris has dramatically reduced the overhead for implementation of cache coherent architectures.

To Learn More

I mentioned some specifics about the work Arteris is doing with Arm. Don’t think that’s the only partner the company is working with. Arteris has been called the Switzerland of system IP. The company also has significant work with the RISC-V community as detailed in the SemiWiki post here.

Arteris recently announced expansion of its Ncore product. You can read how Arteris expands Ncore cache coherent interconnect IP to accelerate leading-edge electronics designs here. In the release, Leonid Smolyansky, Ph.D., SVP of SoC Architecture, Security & Safety at Mobileye, offered these comments:

“We have worked with Arteris network-on-chip technology since 2010, using it in our advanced autonomous driving and driver-assistance technologies. We are excited that Arteris has brought its significant engineering prowess to help solve the problems of fault tolerance and reliable SoC design.”

There is also a short (a little over one-minute) video that explains the challenges that Ncore addresses. I found the video quite informative. 

If you need improved performance for your next design, you should definitely take a close look at the cache coherent solutions offered by Arteris. You can learn more about Ncore here. And that’s how Arteris is unleashing innovation by breaking down the memory wall.



2024 Outlook with Elad Alon of Blue Cheetah Analog Design
by Daniel Nenni on 03-13-2024 at 10:00 am


We have been working with Blue Cheetah Analog Design for three years now with great success. With new process nodes coming faster than ever before and with chiplets being pushed to the forefront of technology, the die-to-die interconnect traffic on SemiWiki has never been greater and chiplets is one of our top search terms.

Tell us a little bit about yourself and your company. 
I am the CEO and co-founder of Blue Cheetah Analog Design. I am also an Adjunct Professor of Electrical Engineering and Computer Sciences at UC Berkeley, where I was previously a Professor and co-director of the Berkeley Wireless Research Center (BWRC). I’ve held founding, consulting, or visiting positions at Locix, Lion Semiconductor (acquired by Cirrus Logic), Wilocity (acquired by Qualcomm), Cadence, Xilinx, Sun Labs, Intel, AMD, Rambus, Hewlett Packard, and IBM Research, where I worked on digital, analog, and mixed-signal integrated circuits for computing, high-speed communications, and test and measurement. According to Lance Leventhal at the Chiplet Summit, I have 280 published articles and 75+ patents. I have to admit I’m not sure about those numbers, but I do have a lot of experience with integrated circuit design – and particularly in analog / mixed-signal circuits – which is proving invaluable in the era of chiplets.  This is the last I’ll say about myself directly – in the rest of this interview, I’ll be telling the story of Blue Cheetah and our vision for chiplets and the overall semiconductor market.

What was the most exciting high point of 2023 for your company?
We announced silicon success on our die-to-die interconnect IP and picked up many exciting design wins. We’ve publicly disclosed DreamBig, Ventana, and FLC as our customers, and most recently, we announced our design win with Tenstorrent. We will announce more design wins soon. To our knowledge, most of the emerging chiplet product companies are using Blue Cheetah die-to-die interconnect, as are a number of large corporations.

What was the biggest challenge your company faced in 2023?
Thanks to the amazing support (not only financially) from our investors – particularly from our founding investors Sehat Sutardja and Weili Dai, as well as NEA (which led our Series B round in 2022) – along with our unique product offering (customized die-to-die interconnect IP), I’m happy to say that funding and filling the sales funnel have not been our biggest challenges. Keeping up with demand, on the other hand, is definitely keeping us on our toes; I always like to tell the members of my team that this is a very good challenge to have the opportunity to address.  The tremendous momentum building around chiplets drives the demand for Blue Cheetah’s solutions, so in some senses, the challenge is in scaling up along with that ongoing revolution.

How is your company’s work addressing this biggest challenge?
In the bigger picture, hardware and silicon designers look to chiplets as a key enabler for ever more capable and cost-efficient systems. Chiplets are well established amongst large players that control all components/aspects of a design (i.e., single vendor), and the allure of a “plug and play” chiplet market has garnered significant attention and investment from the industry.  Although a number of technical and business hurdles need to be overcome before that vision fully comes to fruition, the large majority of the benefits of that vision can be realized immediately.  Specifically, small groups of companies with aligned product strategies and (typically) complementary expertise are forming multi-vendor ecosystems.  Within these ecosystems, the companies can coordinate on the functionality, requirements, and interfaces of each chiplet (and, of course, the die-to-die interconnects that glue them together) to meet the needs of a specific product and/or product family. Blue Cheetah’s solutions support all three of these use cases (single-vendor, multi-vendor ecosystem, and plug-and-play), and many of our customers/partners are pioneers of the multi-vendor ecosystem approach.

What do you think the biggest growth area for 2024 will be, and why?
Indeed, the semiconductor market is in the middle of a major resurgence in recognition, investment, and (averaged over the last ~3 years) growth.  AI has played an enormous role in this resurgence. Still, the basis is broader than that – consider, for example, that today, 7 out of the top 10 companies as ranked by market capitalization design, incorporate, and/or sell their own semiconductors. (If you look at the top 10 tech companies by market cap, it goes to 9 out of 10, with the 10th being a semi manufacturing equipment supplier.)  The capabilities/cost structure of a company’s chips directly drives the user experience/value of the company’s products/services, and the companies delivering those products are in the best position to know what silicon capabilities/cost structure have the highest impact.  This hopefully makes it clear why specialization and customization are major themes; they have been for ~5+ years already and will continue to be in 2024 (and beyond).

How is your company’s work addressing this growth?
Chiplets are, in principle, the ideal vehicle to achieve the goals of specialization and customization with favorable manufacturing and design cost structures.  Ideally, a company can focus on its differentiating technologies while incorporating leading solutions to the remaining components of the product via (possibly other vendors’) chiplets and IP. At the same time, each chiplet can be targeted to the specific manufacturing technology / die size with the best cost/yield characteristics for that function.  Of course, all of these chiplets need to communicate with each other, and that is where Blue Cheetah is focused.  Blue Cheetah is unique in offering die-to-die interconnect solutions with the extensive customizability and configurability needed to meet the needs of the full range of chiplet products.  We also support the most comprehensive set of process technologies – we have already implemented our IP in 7 different nodes, including 5nm and below.

What conferences did you attend in 2023, and how was the traffic?
2023 was an action-packed year for us in terms of conferences – I believe someone from the Blue Cheetah team was at a conference once every month or at most two – and in-person attendance is definitely up (approaching or exceeding pre-COVID levels).  For example, we were at the Chiplet Summit, ISSCC, DAC, OCP Global Summit, and several foundry events. Our silicon demo generated a lot of interest, and we were very happy with the engagement from the people and partners who visited our booth.

Will you attend conferences in 2024? Same or more?
2024 is looking to be even more action-packed – both in terms of conferences (we’ve already been at CES and the Chiplet Summit) and more broadly.  With the global drive to establish and rejuvenate local semiconductor capabilities, we plan to expand to additional international venues this year to further foster relationships across a broad industry base.

Also Read:

Chiplet ecosystems enable multi-vendor designs

Die-to-Die Interconnects using Bunch of Wires (BoW)

Analog Design Acceleration for Chiplet Interface IP

Blue Cheetah Technology Catalyzes Chiplet Ecosystem