
Takeaways from CadenceLIVE 2023
by Bernard Murphy on 05-11-2023 at 6:00 am


Given popular fascination, it seems impossible these days to talk about anything other than AI. At CadenceLIVE, it was refreshing to be reminded that design of any type remains, and will always remain, grounded in deep, precise, and scalable math, physics, computer science, and chemistry. AI complements design technologies, allowing engineers to explore more options and optimizations, but it will continue to stand on the shoulders of 200+ years of accumulated STEM expertise and computational methods, as a wrapper around those methods rather than a replacement for them.

Granting that observation, where is AI useful in electronic design methods, and more generally, how are AI and other technologies driving business shifts in the semiconductor and electronic systems industries? That’s the subject of the rest of this blog.

AI in Cadence Products

Cadence clearly intends to be a front-runner in AI applications. Over the last few years, they have announced several AI-powered products—Cadence Cerebrus for physical synthesis, Verisium for verification, Joint Enterprise Data and AI (JedAI) for unifying massive data sets, and Optimality for multi-physics optimization. Recently, they added Virtuoso extensions for analog design, Allegro X AI for advanced PCB, and Integrity for 3D-IC designs.

As a physical synthesis product, I expect Cadence Cerebrus to be primarily aimed at block design for the same reasons I mentioned in an earlier blog. Here, I expect that reinforcement learning around multiple full physical synthesis runs drives wider exploration of options and better ultimate PPA.

Verisium has quite a broad objective in verification, spanning debug and test suite optimization, for example, in addition to block-level coverage optimization. Aside from block-level coverage, I expect the other aspects to offer value across the design spectrum, again based on reinforcement learning over multiple runs (and perhaps even between products in the same family).

Optimality is intrinsically a system-level analysis and optimization suite. Here, also, reinforcement learning across multiple runs can help complex multi-physics analyses—electromagnetics, thermal, signal and power integrity—to converge over more samples than would be feasible to consider in traditional manual iteration.

Virtuoso Studio for analog is intrinsically a block-level design tool because no one, to my knowledge, is building full-chip analog designs at SoC scale (with the exception of memories and perhaps neuromorphic devices). Automation in analog design has been a hoped-for but unreached goal for decades. Virtuoso is now offering learning-based methods for placement and routing, which sounds intriguing.

Allegro X AI aims for similar goals in PCB design, offering automated PCB placement and routing. The website suggests they are using generative techniques here, right on the leading edge of AI today. The Integrity platform builds upon the large database capacity of the Innovus Implementation System and leverages both Virtuoso and Allegro for analog, RF, and package co-design, providing a comprehensive and unified solution for 3D-IC designs.

Three Perspectives on Adapting to Change

It’s no secret that markets are changing rapidly in response to multiple emerging technologies (including AI), fast-moving shifts in systems markets, and economic and geopolitical stresses. One very apparent change is the rapid growth of in-house chip design among systems companies. Why is that happening, and how are semiconductor and EDA companies adapting?

A Systems Perspective from Google Cloud

Thomas Kurian, CEO of Google Cloud, talked with Anirudh about trends in the cloud and chip design needs. He walked through the evolution of demand for cloud computing, starting with Software-as-a-Service (SaaS), driven by applications from Intuit and Salesforce. From there, the landscape progressed to Infrastructure-as-a-Service (IaaS), allowing us to buy elastic access to compute hardware without the need to manage that hardware.

Now Thomas sees digitalization as the principal driver: in cars, cell phones, home appliances, industrial machines. As digitalization advances, digital twins have become popular for modeling and optimizing virtualized processes, applying deep learning to explore a wider range of possibilities.

To support this objective at scale, Google wants to be able to treat worldwide networked data centers as a unified compute resource, connecting through super low latency network fabrics for predictable performance and latency no matter how workloads are distributed. Meeting that goal demands a lot of custom semiconductor design for networking, for storage, for AI engines, and for other accelerators. Thomas believes that in certain critical areas they can build differentiated solutions meeting their CAPEX and OPEX goals better than through externally sourced semiconductors.

Why? It’s not always practical for an external supplier to test at true systems scale. Who can reproduce streaming video traffic at the scale of a Google or AWS or Microsoft? Also, in building system process differentiation, optimizing components helps, but not as much as full-process optimization, say from Kubernetes, to containers, to provisioning, to a compute function. It is difficult for a mainstream semiconductor supplier to manage that scope.

A Semiconductor Perspective from Marvell

Chris Koopmans, COO at Marvell, talked about how they are adapting to evolving systems company needs. Marvell is squarely focused on data infrastructure technology in datacenters and across wireless and wired networks. AI training nodes and other nodes must be able to communicate reliably, at terabytes per second and with low latency, across data-center-scale distances. Think of ChatGPT, which is rumored to need ~10K GPUs for training.

That level of connectivity requires super-efficient data infrastructure, yet cloud service providers (CSPs) need all the differentiation they can get and want to avoid one-size-fits-all solutions. Marvell partners with CSPs to architect what they call cloud-optimized silicon. This starts with a general-purpose component, serving a superset of needs, containing some of the right ingredients for a given CSP but over-built and therefore insufficiently efficient as-is. A cloud-optimized solution is tailored from this platform to a CSP’s target workloads and applications, dropping what is not needed and optimizing for special-purpose accelerators and interfaces as necessary. This approach allows Marvell to deliver customer-specific designs from a reference design using Marvell-differentiated infrastructure components.

An EDA Perspective from Cadence

Tom Beckley, senior VP and GM for the Cadence Custom IC & PCB group at Cadence, wrapped up with an EDA perspective on adapting to change. You might think that, with customers in systems and semiconductor design, EDA has it easy. However, to serve this range of needs a comprehensive “EDA” solution must span the spectrum—from IC design (digital, analog and RF) to 3D-IC and package design, to PCB design and then up to electro-mechanical design (Dassault Systèmes collaboration).

Add analytics and optimization to the mix, to ensure electromagnetic, thermal, signal, and power integrity, allowing customers to model and optimize complete systems (not just chips) before the hardware is ready, all while recognizing that their customers are working on tight schedules with increasingly constrained staffing. Together, that’s a tall order. More collaboration, more automation, and more AI-guided design will be essential.

With the solutions outlined here, Cadence seems to be on a good path. My takeaway: CadenceLIVE 2023 provided a good update on how Cadence is addressing industry needs (with a healthy dose of AI), plus novel insights into systems, semiconductor, and design industry directions.


Alchip is Golden, Keeps Breaking Records on Multiple KPIs
by Kalar Rajendiran on 05-10-2023 at 10:00 am

Alchip Revenue Breakdown along Nodes and Applications

I don’t know the story behind the name Alchip. I’ve been asking this question ever since the company’s founding in 2003 and still haven’t found the answer. Wikipedia sometimes provides insights into the stories behind the names of companies, products, and services, but I couldn’t find anything regarding the name Alchip. One thing is for sure: after its consistently record-breaking financial results for many years in a row, no one is going to confuse the “Al” in the name with what “Al” stands for in the periodic table of chemical elements.

Alchip just announced financial results for 2022, breaking records on revenue, operating income, net income, and earnings per share (EPS). It achieved this in spite of lower-than-expected performance due to a substrate shortage that constrained inference chip shipments to North America. NRE revenue accounted for 40% to 45% of total 2022 revenue, with ASIC sales accounting for 55% to 60%. That’s upwards of $184 million in NRE revenue, significant in itself, and it bodes well for Alchip’s future production revenue. Artificial Intelligence (AI) is becoming a major driver of the projected growth of the semiconductor market. System companies are getting directly involved in SoCs and working with companies such as Alchip to ensure differentiation and profitability of their products. The number of design starts is projected to continue to grow, driven by many growth applications. This also bodes well for Alchip’s future.

Success Requires Focus

In the ASIC industry, those who are consistently successful have to judiciously overcome the many challenges thrown at them. Consistent success doesn’t arrive by happenstance or luck. It requires focused dedication to the ASIC model and ongoing strategic investments to stay on top. Alchip has always focused on delivering leading-edge services to its customer base, with high-performance computing (HPC), AI, networking, and storage as the key markets to pursue. While high-end markets and customers can offer high rewards, they also demand high investments. Without laser-like focus, players try to be everything for everybody, spreading their investments too thin. Alchip, on the other hand, has shown significant growth in design wins in its target markets through focus and business acumen.

Design Technology and Infrastructure

Alchip has kept pace with market trends, developing the design technology, infrastructure, and methodologies needed to serve its focus markets. It has consistently supported the latest process nodes from TSMC, the leading foundry. Not only has it developed the capability to support 2.5D/3D packaging in general, it has also been qualified to support TSMC’s CoWoS packaging technology. The company has developed and continues to enhance the following:

    • Robust yet flexible design methodology
    • Flexible engagement model (both commercial and technical)
    • Best-in-class IP portfolio (access to third-party IP and in-house IP/customization)
    • Heterogeneous chiplet integration capability
    • Advanced packaging and test capabilities

Results Speak for Themselves

Over the last four years (2019 revenue not in the above graphic), Alchip’s revenue derived from the two leading-edge processes has grown from 60% to 88% in 2022. Over the same period, its revenue derived from the HPC market segment has grown from 59% to 82% in 2022. When Networking and Niche markets are added in, the share reaches a whopping 94%.

You can read the entire press announcement of Alchip’s 2022 financial results here.

About Alchip

Alchip Technologies Ltd., founded in 2003 and headquartered in Taipei, Taiwan, is a leading global provider of silicon design and production services for system companies developing complex and high-volume ASICs and SoCs. Alchip provides faster time-to-market and cost-effective solutions for SoC design at mainstream and advanced nodes, including 7nm, 6nm, 5nm, and 4nm processes. Alchip has built its reputation as a high-performance ASIC leader through its advanced 2.5D/3D package services and CoWoS/chiplet design and manufacturing experience. Customers include global leaders in AI, HPC/supercomputers, mobile phones, entertainment devices, networking equipment, and other electronic product categories. Alchip is listed on the Taiwan Stock Exchange (TWSE: 3661).

Also Read:

Achieving 400W Thermal Envelope for AI Datacenter SoCs

Alchip Technologies Offers 3nm ASIC Design Services

The ASIC Business is Surging!


Silicon Catalyst and Arm announce $150,000 Silicon Startup Contest!
by Daniel Nenni on 05-10-2023 at 6:00 am


As I sift through mounds of semiconductor press releases trying to figure out their relevance (with mixed results), I consider it a learning experience even when they don’t really tell me anything. This one, however, tells me two very important things:

1) Arm is a much more competitive company under the new leadership. I saw a noticeable change in press releases when SoftBank bought Arm back in 2016, and it is great to see them back in the game. We can expect more of this, maybe even at a higher level, once the Arm IPO goes through this year, which I am highly anticipating.

2) Silicon Catalyst continues to be a positively disruptive influence in the semiconductor industry, even more so than I imagined when I first spoke to the founders back in 2015. I have been involved with dozens of start-up companies during my 40-year semiconductor career and know firsthand how important they are. Anything that helps the start-up ecosystem is greatly appreciated, but let me tell you, Silicon Catalyst has by far exceeded even my extremely high expectations, absolutely.

We first reported on the Silicon Catalyst Arm partnership in 2020, the first of many Silicon Catalyst announcements and events we have covered. For Arm to choose Silicon Catalyst for this event is very high praise indeed. Rather than summarize, here is today’s press release in its entirety:

Silicon Catalyst announces “Silicon Startups Contest” in partnership with Arm

Worldwide call for applicants to qualify and win significant commercial and technical support from Arm

 Silicon Valley, California and Cambridge, UK – May 10, 2023 – Silicon Catalyst, the world’s only incubator focused exclusively on accelerating semiconductor solutions, is pleased to announce a “Silicon Startups Contest” in partnership with Arm. The contest, launching today, is organized and administered by Silicon Catalyst and is directed towards early-stage entrepreneurial teams developing a system-on-chip (SoC) design using Arm® processor IP (intellectual property), proven in more than 250 billion chips shipped worldwide.

The contest offers an opportunity for silicon startups to win valuable commercial, technical and marketing support from Arm and Silicon Catalyst. The winner will receive Arm credit worth $150,000, which could cover IP fees for a complete embedded system, or significantly contribute to the cost of a higher performance application.  In addition, both the winner and two runners-up will receive:

  • Access to the full Arm Flexible Access for Startups program, which includes:
    • No cost, easy access to an extensive SoC design portfolio including a wide range of Cortex processors, Mali graphics, Corstone reference systems, CoreLink and CoreSight system IP.
    • Free tools, training, and support to enhance your team
    • $0 license fee to produce prototypes
  • Cost-free Arm Design Check-in Review with Arm’s experienced support team
  • Entry to an invitation-only Arm ecosystem event with a chance to be featured and connect with Arm’s broad portfolio of silicon, OEM and software partners
  • Investor pitch review and preparation support by Silicon Catalyst, with an opportunity to present to the Silicon Catalyst Angels group and their investment syndication network.

“We believe that Arm technology is for everyone, and early-stage silicon startups trust Arm to deliver proven, validated computing platforms that enable them to innovate with freedom and confidence,” said Paul Williamson, senior vice president and general manager, IoT Line of Business at Arm. “Since its launch, Arm Flexible Access for Startups has enabled around 100 startups with access to our wide portfolio of IP, extensive ecosystem and broad developer base, and we look forward to seeing what creativity this prize inspires in the exciting new startups that enter this contest.”

The contest is open to startup companies at the pre-seed, seed, and Series A funding stages that have raised a maximum of $20M, and all contest applicant organizations will be considered for acceptance into the Silicon Catalyst Incubator/Accelerator. Judges include senior executives from both Arm and Silicon Catalyst.

“Arm was the first member of our ecosystem to join as both a Strategic Partner and an In-Kind Partner. Their Flexible Access program is a game-changer for startups. Through this program, silicon startups can move fast, experiment with ease, and design with confidence – so it’s a highly valuable part of the contest prize,” stated Pete Rodriguez, Silicon Catalyst CEO. “Entrepreneurial teams entering the contest will also automatically be applying to our Incubator, with the winning company receiving credit with Arm that could give them a significant head start in the commercialization of their product, as well as the opportunity to present to the Silicon Catalyst Angel investment group and their syndication network of investment partners.”

The contest will run from May 10, 2023 through to June 23, 2023. The contest winner and two runner-up companies will be announced in early July 2023. Contest rules and application details can be found at https://siliconcatalyst.com/arm-sic-contest-2023

About Silicon Catalyst “It’s about what’s next”

Silicon Catalyst is the world’s only incubator focused exclusively on accelerating semiconductor solutions, built on a comprehensive coalition of in-kind and strategic partners to dramatically reduce the cost and complexity of development. More than 900 startup companies worldwide have engaged with Silicon Catalyst and the company has admitted 97 exciting companies. With a world-class network of mentors to advise startups, Silicon Catalyst is helping new semiconductor companies address the challenges in moving from idea to realization. The incubator/accelerator supplies startups with access to design tools, silicon devices, networking, and a path to funding, banking and marketing acumen to successfully launch and grow their companies’ novel technology solutions. Over the past seven plus years, the Silicon Catalyst model has proven to dramatically accelerate a startup’s trajectory while at the same time de-risking the equation for investors. Silicon Catalyst has been named the Semiconductor Review’s 2021 Top-10 Solutions Company award winner. More information is available at www.siliconcatalyst.com

About Silicon Catalyst Angels

The Silicon Catalyst Angels was established in July 2019 as a separate organization to provide access to seed and Series A funding for Silicon Catalyst portfolio companies. What makes Silicon Catalyst Angels unique is not only the investment group’s visibility into a semiconductor-focused deal flow pipeline, but that our membership comprises seasoned semiconductor veterans who bring with them a wealth of knowledge along with their ability to invest. Driven by passion and a desire to ‘give back’, our members understand the semiconductor market thanks to a lifetime of engagement in the industry. When you couple our members’ enthusiasm, knowledge, and broad network of connections with companies that have been vetted and admitted to Silicon Catalyst, you have a formula that is, to date, nonexistent within the investment community. More information about membership can be found at www.siliconcatalystangels.com

Contact Information

Silicon Catalyst: Richard Curtin, Managing Partner
richard@sicatalyst.com
Silicon Catalyst UK: Sean Redmond, Managing Partner
sean@siliconcatalyst.uk

Also Read:

Chiplets, is now their time?

2023: Welcome to the Danger Zone

Silicon Catalyst Angels Turns Three – The Remarkable Backstory of This Semiconductor Focused Investment Group


Curvilinear Mask Patterning for Maximizing Lithography Capability
by Fred Chen on 05-09-2023 at 10:00 am


Masks have always been an essential part of the lithography process in the semiconductor industry. With the smallest printed features already being subwavelength for both DUV and EUV cases at the bleeding edge, mask patterns play a more crucial role than ever. Moreover, in the case of EUV lithography, throughput is a concern, so the efficiency of projecting light from the mask to the wafer needs to be maximized.

Conventional Manhattan features (named for Manhattan street blocks, or the lit building windows in the evening) are known for their sharp corners, which naturally scatter light outside the numerical aperture of the optical system. To minimize such scattering, one may turn to Inverse Lithography Technology (ILT), which allows curvilinear feature edges on the mask to replace sharp corners. For the simplest example of where this may be useful, consider the target optical image (or aerial image) at the wafer in Figure 1, which is expected from a dense contact array with quadrupole or QUASAR illumination, resulting in a 4-beam interference pattern.

Figure 1. A dense contact image from quadrupole or QUASAR illumination, resulting in a four-beam interference pattern.
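
To see why such a pattern can only be rounded, here is a minimal sketch (my illustration, not from the article), assuming four symmetric plane waves with transverse wavevectors (±k, 0) and (0, ±k), where k is set by the contact pitch:

```latex
% Amplitude: sum of four plane waves
A(x,y) = e^{ikx} + e^{-ikx} + e^{iky} + e^{-iky} = 2\cos(kx) + 2\cos(ky)

% Intensity: squared magnitude, purely sinusoidal in x and y
I(x,y) = |A(x,y)|^2 = 4\left[\cos(kx) + \cos(ky)\right]^2
```

The contours of I(x,y) around each intensity peak are necessarily rounded; no sum of four low-order sinusoids can reproduce a 90-degree corner, so a sharp mask corner only adds light that scatters outside the pupil.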

Four interfering beams cannot produce sharp corners at the wafer, only somewhat rounded corners (the image is built from sinusoidal terms). A sharp feature corner on the mask would produce the same roundness, but with less light arriving at the wafer; a good portion of the light is scattered out. A more efficient transfer of light to the wafer can be achieved if the mask feature has a curvilinear edge with the same roundness, as in Figure 2.

Figure 2. Mask feature showing curvilinear edge similar to the image at the wafer shown in Figure 1. The edge roundness ideally should be the same.

The amount of light scattered out can ideally be reduced to zero with curvilinear edges. Yet despite this advantage, it has been difficult to make masks with curvilinear features, as curvilinear edges require more mask writer information to be stored than Manhattan features, reducing system throughput through the extra processing time. The data volume required to represent curvilinear shapes can be an order of magnitude more than for the corresponding Manhattan shapes. Multi-beam mask writers, which have only recently become available, compensate for the loss of throughput.

Mask synthesis (designing the features on the mask) and mask data prep (converting said features into the data directly used by the mask writer) also need to be updated to accommodate curvilinear features. Synopsys recently described the results of its curvilinear upgrade. Two highlighted features for mask synthesis are machine learning and Parametric Curve OPC. Machine learning is used to train a continuous deep learning model on selected clips. Parametric Curve OPC represents curvilinear layer output as a sequence of parametric curve shapes, in order to minimize data volume. Mask data prep comprises four parts: Mask Error Correction (MEC), Pattern Matching, Mask Rule Check (MRC), and Fracture. MEC compensates for errors from the mask writing process, such as electron scattering from the EUV multilayer. Pattern matching searches for matching shapes and becomes more complicated without restrictions to only 90-degree and 45-degree edges. Likewise, MRC needs new rules to detect violations involving curved shapes. Finally, fracture needs to not only preserve curved edges but also support multi-beam mask writers.

Synopsys includes all these features in its full-chip curvilinear data processing system, which is fully described in the white paper here: https://www.synopsys.com/silicon/resources/whitepapers/curvilinear_mask_patterning.html.

Also Read:

Chiplet Q&A with Henry Sheng of Synopsys

Synopsys Accelerates First-Pass Silicon Success for Banias Labs’ Networking SoC

Multi-Die Systems: The Biggest Disruption in Computing for Years


Is Your Interconnect Strategy Scalable?
by Bernard Murphy on 05-09-2023 at 6:00 am


“Strategy” is a word sometimes used loosely to lend an aura of visionary thinking, but in this context, it has a very concrete meaning. Without a strategy, you may be stuck with decisions made on a first-generation design when implementing follow-on designs, or face major rework to correct for issues you hadn’t foreseen. Making optimum architecture decisions for the series at the outset is key. Will it support replicating a major subsystem, allowing more channels in premium versions, for more sensors or more video streams? Can the memory subsystem scale to support increased demand? Careful planning and modeling, checking target bandwidths and latencies, is a necessary starting point. However, architectural feasibility alone may not be sufficient to ensure scalability for one critical component: the interconnect between the function blocks in the design.

Strategies and risks for interconnect

The startup strategy. Starting with no design infrastructure, part of your funding must be committed to design tools and essential IP. Some CPU cores come with low-cost access to an interconnect generator based on a crossbar technology. Or perhaps you decide to build your own generator – how hard can that be?

This strategy may work well on the first-generation design. Crossbar-based interconnect is well-established for entry-level designs but exhibits a glaring scalability weakness as systems become more complex. Area consumed by interconnect grows rapidly as the number of initiators and targets grows, creating more challenges for bandwidth, latencies and layout congestion. Problems become acute in follow-on designs as target and initiator counts increase to merge multiple market demands into a common product. Designs must also be as robust as possible to IP changes. A home-grown bus fabric may have worked well with the IP portfolio for the launch design, but what if one IP fails to measure up in the next product? A workaround may be possible but would kill your margins. A better IP is available but only with an interface you don’t yet support. Designing and fully verifying a new protocol will take more time than you have in the critical path to product release.
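
To make the scaling intuition concrete, here is a rough back-of-the-envelope model (my sketch with made-up parameters, not vendor data): a full crossbar needs a switch point for every initiator-target pair, so its resources grow quadratically, while a NoC's routers and links grow roughly linearly with endpoint count:

```python
# Hypothetical scaling sketch: full crossbar vs. NoC-style interconnect.
# A full crossbar needs a crosspoint for every initiator-target pair,
# while a NoC shares a smaller number of routers and links.

def crossbar_switch_points(initiators: int, targets: int) -> int:
    # One crosspoint per initiator-target pair
    return initiators * targets

def noc_links(endpoints: int, ports_per_router: int = 4) -> int:
    # Rough model: endpoints share routers, routers connect in a chain/mesh
    routers = -(-endpoints // ports_per_router)  # ceiling division
    return endpoints + (routers - 1)  # endpoint links + inter-router links

for n in (4, 16, 64):
    print(f"{n:>3} initiators/targets: "
          f"crossbar ~{crossbar_switch_points(n, n):>5} crosspoints, "
          f"NoC ~{noc_links(2 * n):>4} links")
# Crossbar resources grow quadratically; NoC resources grow roughly linearly.
```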

If you are going to use a crossbar interconnect in your first-pass design, set clear expectations that this will be a proof-of-concept build. It is already widely accepted that scalable interconnect must be based on NoC technology; to transition to a scalable market product, it is almost certain you will have to redesign around that technology. Commercial NoC IP generators already support the full range of AMBA and other protocol standards, limiting risk if needing to change IP. Then again, you could just start with a NoC, avoiding later risks.

The “What we have works and change adds risk” strategy.

Risk in change is an understandable concern but must be balanced against other risks. If it was tough to close timing on your last design and your next design will be more complex, you may be able to battle through and make it work, but at what cost? Pride in surviving the challenge will dissipate quickly if PPA is compromised.

This is not a hypothetical concern. One large company planned to reduce total system cost by designing out a chip they were buying externally. They already had all the tooling and expertise needed to make this happen. The plan seemed like such a no-brainer that they built the expectation into forward projections to analysts: improved margins at more competitive pricing. But they couldn’t close timing at target PPA on their in-house replacement. To continue to deliver the larger system, they were forced to extend their contract with the existing external supplier, missing projections and getting a black eye. For the next generation, they switched to a commercial NoC solution and were able to complete the design-out successfully.

The “Our interconnect is differentiating” strategy.

There are a few system architectures for which the interconnect must be quite special, commonly mesh networks or more exotic topologies like a torus. Applications demanding such topologies are typically high-premium multi-core server systems, GPUs, and AI training systems. Even here, commercial NoC generators have caught up, to the point that market-leading AI systems companies now routinely use these NoCs, suggesting that fundamentally, differentiation even in these high-end designs is not in the NoC. Just as for other IP, the trend is toward commercial solutions for all the usual reasons: perhaps initially only comparable to the in-house option, but proven across an industry-wide range of SoCs, continually enhanced to remain competitive, with lower total cost of ownership, always-on support, and resilience to expert staff turnover.

In a challenging economic climate, it has become even more important for us to pick our strategic battles carefully. People who work on NoC design are often among the best designers in the company. Where is the best place to use those designers? In further securing your lead in truly differentiating features, or in continuing to support NoC technology you can buy off-the-shelf?

If these arguments pique your interest, take a look at Arteris’ FlexNoC and Ncore cache-coherent interconnect IPs, with over 3 billion Arteris-based SoCs shipped to date across a wide range of applications.


Reconfigurable DSP and AI IP arrives in next-gen InferX
by Don Dingee on 05-08-2023 at 10:00 am

InferX 2.5 reconfigurable DSP and AI IP from Flex Logix

DSP and AI are generally considered separate disciplines with different application solutions. In their early stages (before programmable processors), DSP implementations were discrete, built around a digital multiplier-accumulator (MAC). AI inference implementations also build on a MAC as their primitive. If the interconnect were programmable, could the MAC-based hardware be the same for both and still be efficient? Flex Logix says yes with their next-generation InferX reconfigurable DSP and AI IP.

Blocking-up tensors with a less complex interconnect

If your first thought reading that intro was, “FPGAs already do that,” you’re not alone. When tearing into something like an AMD Versal, one sees AI engines, DSP engines, and a programmable network on chip. But there’s also a lot of other stuff, making it a big, expensive, power-hungry chip that can only go in a limited number of places able to support its needs.

And, particularly in DSP applications, the full reconfigurability of an FPGA isn’t needed. Having large numbers of routable MACs sounds like a good idea, but configuring them together dumps massive overhead into the interconnect structure. A traditional FPGA looks like 80% interconnect and 20% logic, a point most simplified block diagrams gloss over.

Flex Logix CEO Geoff Tate credits his co-founder and CTO Cheng Wang with taking a fresh look at the problem. On one side are these powerful but massive FPGAs. On the other side sit DSP IP blocks from competitors that can’t pivot from their optimized MAC pipelines to sporadic AI workloads, with their vastly wider and often deeper MAC fields organized in layers.

Wang’s idea: create a next-generation InferX 2.5 tile built around tensor processor units (TPUs), each with eight blocks of 64 MACs (INT8 x INT8) tied to memory and a more efficient eFPGA-based interconnect. With 512 MACs per TPU and 8192 MACs per tile, each tile delivers a peak of 16 TOPS at 1 GHz. That flips the percentages: 80% of the InferX 2.5 unit is hardwired, yet it retains 100% reconfigurability. One tile in TSMC 5nm is a bit more than 5mm2, a 3x to 5x improvement over competitive DSP cores at equivalent DSP throughput.
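
The peak figure follows directly from the MAC count (my arithmetic, counting one MAC as two operations, a multiply plus an accumulate, and using the 16 tensor units per tile mentioned below):

```latex
8 \;\tfrac{\text{blocks}}{\text{TPU}} \times 64 \;\tfrac{\text{MACs}}{\text{block}} = 512 \;\tfrac{\text{MACs}}{\text{TPU}}, \qquad
512 \times 16 \;\tfrac{\text{TPUs}}{\text{tile}} = 8192 \;\tfrac{\text{MACs}}{\text{tile}}

8192 \;\tfrac{\text{MACs}}{\text{cycle}} \times 2 \;\tfrac{\text{ops}}{\text{MAC}} \times 10^9 \;\tfrac{\text{cycles}}{\text{s}} \approx 16.4 \;\text{TOPS} \approx 16 \;\text{TOPS peak}
```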

Software makes reconfigurable DSP and AI IP work

The above tile is the same for either DSP or AI applications – configuration happens in software.

The DSP operations required for a project are usually close to locked down before committing to hardware. InferX 2.5, with its software, can handle any function: FFT, FIR, IIR, Kalman filtering, matrix math, and more, at INT16x16 or INT16x8 precision. One tile delivers 4 TOPS (INT16 x INT16), or in DSP lingo 2 TeraMACs/sec, at 1 GHz. Flex Logix provides a library that handles soft logic and function APIs, simplifying application development. Another footprint-saving feature: an InferX 2.5 tile can be reconfigured in less than 3 microseconds, enabling a quick function change for the next pipeline step.

AI configurations use the same tile with different Flex Logix software. INT8 precision is usually enough for edge AI inference, meaning a single tile and its 16 tensor units push 16 TOPS at 1 GHz. The 3-microsecond reconfigurability allows layers or even entire models to switch processing almost instantly. Flex Logix AI quantization, compilers, and soft logic handle the mapping for models in PyTorch, TensorFlow Lite, or ONNX, so application developers don’t need to know hardware details to get up and running. And, with the reconfigurability, teams don’t need to commit to an inference model until ready and can change models as often as required during a project.

Scalability comes with multiple tiles. N tiles provide N times the performance in DSP or AI applications, and tiles can run functions independently for more flexibility. Tate says so far, customers have not required more than eight tiles for their needs, and points out larger arrays are possible. Tiles can also be power managed – below, an InferX 2.5 configuration has four powered tiles and four managed tiles that can be powered down to save energy.

Ready to deliver more performance within SoC power and area limits

Stacking InferX 2.5 up against today’s NVIDIA baseline provides additional insight. Two InferX 2.5 tiles in an SoC check in at around 10mm2 and less than 5W, and deliver the same Yolo v5 performance as a much larger external 60W Orin AGX. Putting this in perspective, below is super-resolution Yolo v5L6 running on an SoC with InferX 2.5.


Tate says what he hears in customer discussions is that transformer models are coming – maybe displacing convolutional and recurrent neural networks (CNNs and RNNs). At the same time, AI inference is moving into SoCs with other integrated capabilities. Uncertainty around models is high, while area and power requirements for edge AI have finite boundaries. InferX 2.5 can run any model, including transformer models, efficiently.

Whether the need is DSP or AI, InferX is ready for the performance, power, and area challenge. For more on the InferX 2.5 reconfigurable DSP and AI IP story, please see the following:

Press release: Flex Logix Announces InferX™ High Performance IP for DSP and AI Inference

Product pages: InferX DSP and InferX AI

Also Read:

eFPGA goes back to basics for low-power programmable logic

eFPGAs handling crypto-agility for SoCs with PQC

WEBINAR: Taking eFPGA Security to the Next Level


Tessent SSN Enables Significant Test Time Savings for SoC ATPG
by Kalar Rajendiran on 05-08-2023 at 6:00 am

Pattern Generation Block Level ATPG Flow

SoC test challenges arise from the complexity and diversity of the functional blocks integrated into the chip. As SoCs become more complex, it becomes increasingly difficult to access all of the functional blocks within the chip for testing. SoCs can also contain billions of transistors, making chip testing extremely time-consuming. And as test time directly impacts test cost, minimizing test time is critical to managing the cost of a finished product. Automatic Test Pattern Generation (ATPG) is a crucial part of SoC testing, as it generates the test patterns used to detect faults in the design. However, automating ATPG is challenging, especially for complex SoCs, due to the large number of functional blocks and test points that need to be covered. Developing efficient and effective ATPG algorithms is a key challenge for SoC testing, and many ATPG tools today are not fully automated: users have to learn all the commands and options offered by the tools in order to use them effectively.

Is there a solution that brings some automation to the ATPG process, thereby enhancing engineering productivity? What if this solution also delivers significant savings in test time? Siemens EDA’s Tessent Streaming Scan Network (SSN) solution promises to deliver these benefits. This was substantiated by Intel, a Siemens EDA customer, during the recent User2User conference. Intel’s Toai Vo presented proof points based on his team’s experience with their first design using the Tessent SSN solution. His team included Kevin Li, Joe Chou, and Chienkuo (Tom) Woo.

Tessent SSN Solution

In a standard scan testing approach, test data is loaded into the circuit one bit at a time and shifted through the scan chains to observe the output responses. This process is repeated for each test pattern, which can be time-consuming and can lead to long test times. The Tessent SSN solution instead packetizes test data to dramatically reduce both DFT implementation effort and manufacturing test times. By decoupling core-level and chip-level DFT requirements, each core can be designed with the most optimal compression configuration for that core. This solution can be used to efficiently test large and complex chips that have a high number of internal nodes to be tested. It uses a dedicated network to transmit test data in a streaming manner, enabling parallel processing of the data and thereby reducing test time.

Scalability

The Streaming Scan Network supports scalable scan architectures that can handle SoCs with a large number of functional blocks. The tool provides a scalable approach to testing any number of cores concurrently while minimizing test time and scan data volume. The Tessent SSN test infrastructure is built around the IEEE 1687/IJTAG standard, delivering greater flexibility and scalability to handle more complex designs and test scenarios.

Automation

The hierarchical, object-oriented nature of the test infrastructure lends itself to easier automation. Using the Tessent infrastructure, a user can easily insert test logic into a chip. The process begins with the RTL design, where the SSN test logic is inserted using automation.

Test Time Savings

Using a traditional ATPG approach, normally only one block can be run at a time, which extends total test time. With the Tessent SSN ATPG approach, multiple blocks can be run in parallel, greatly reducing total test time. The following table shows the test time savings achieved by Toai’s team on their design.
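
As a simplified illustration of where the savings come from (hypothetical block times, not Intel's data), serial ATPG sums the per-block test times, while a streaming, packetized approach lets concurrent blocks overlap so total time approaches that of the slowest block:

```python
# Hypothetical illustration: serial vs. concurrent block testing.
# Times are made-up per-block scan test times in milliseconds.
block_test_ms = {"cpu": 120, "gpu": 95, "modem": 60, "npu": 80}

# Traditional flow: one block at a time, so times accumulate.
serial_ms = sum(block_test_ms.values())

# Packetized/concurrent flow: blocks tested in parallel, so the total
# is bounded by the slowest block (ignoring packet delivery overhead).
concurrent_ms = max(block_test_ms.values())

print(f"serial: {serial_ms} ms, concurrent: {concurrent_ms} ms, "
      f"savings: {100 * (1 - concurrent_ms / serial_ms):.0f}%")
# -> serial: 355 ms, concurrent: 120 ms, savings: 66%
```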

Summary

Toai’s team found it very easy to migrate from traditional embedded deterministic test (EDT) channel-based ATPG to packet-based ATPG with SSN. The Tessent SSN solution greatly reduced engineering effort and silicon bring-up time, and the test time reduction was significant compared to a traditional test solution. In Toai Vo’s words, it is absolutely an innovative test solution, and it really works.

For more details, visit the Tessent SSN product page.

Also Read:

Achieving Optimal PPA at Placement and Carrying it Through to Signoff

Mitigating the Effects of DFE Error Propagation on High-Speed SerDes Links

Hardware Root of Trust for Automotive Safety


Podcast EP160: How Agile Analog Makes AMS Integration Easy with Chris Morrison
by Daniel Nenni on 05-05-2023 at 10:00 am

Dan is joined by Chris Morrison. Chris has 15 years of experience delivering innovative analog, digital, power management, and audio solutions for international electronics companies, and developing strong relationships with key partners across the semiconductor industry. Currently he is the Director of Product Marketing at Agile Analog, the analog IP innovators. Previously he held engineering positions, including 10 years at Dialog Semiconductor, since acquired by Renesas.

Chris details some of the new developments at Agile Analog, including foundry ecosystem expansion and new product introductions that are coming. Chris also explains the details behind how Agile Analog puts a digital “wrapper” around analog IP subsystems. The benefits of this approach for AMS integration are detailed, along with information about the targeted, customized delivery methodology used by Agile Analog.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Chiplet Q&A with Henry Sheng of Synopsys
by Daniel Nenni on 05-05-2023 at 6:00 am

SNUG Panel

At the recent Synopsys Users Group Meeting (SNUG) I had the honor of leading a panel of experts on the topic of chiplets. One of those panelists was the very personable Dr. Henry Sheng, Group Director of R&D in the EDA Group at Synopsys. Henry currently leads engineering for 3DIC, advanced technology and visualization.

Are we seeing other markets move in this direction?

We’re seeing a broad movement toward multi-die systems for some very good reasons. Early on, some of the advantages were seen in high-performance computing (HPC), but now automotive is starting to adopt multi-die systems as well.

There are other technical motivations, such as heterogeneous integration. If you migrate a design to the most advanced process node, do you really need the entire system to be at that three-nanometer node? Or do you implement the service functions of your system in a different technology node? Memory access has been another game changer: in the past you had to go through a board to get to memory, whereas with interposers you can get much closer and with much higher bandwidth.

Stacking unleashes a lot of possibilities. It’s not necessarily just memories, but also applications such as image sensors. Instead of taking data in through a straw, eventually you’re getting to the point where data is raining down into your compute die. I think there’s a lot to like about multi-die systems, for a lot of different applications.

What other industry collaborations, IP, and methodologies are required to address the system-level complexity challenge?

There’s a lot of collaboration needed. John just mentioned the partnership that Synopsys has with ANSYS on system analysis. That kind of collaboration is really key. Back in the day, you had manufacturing, design, and tooling all under one roof. Then over time, market forces and market efficiencies pulled that apart into different enterprises. But while that’s economics, the nature of the technical problem is still very much intertwined. And if you look across this panel, you see a very tightly connected graph amongst all of us here. There’s a lot of collaboration needed, and I think that’s pretty remarkable. I don’t know how many other industries have this deep level of collaboration in order to mutually compete, but also to make progress.

You’ll see things like UCIe as a prime example. Standards are just the tip of the iceberg. Underneath that, there’s a whole lot of different collaborations needed to move the needle. More formalization, more standardization. This morning’s keynote called out a need for more standardization around chiplets.

And then with our friends at TSMC, with 3DFabric and 3DBlox, you’re starting to see what we’ve always seen in 2D: the emergence of formalization and alignment between different participants in the ecosystem. So I think it’s vital, and I think we’ve always done it. So I’m pretty confident there’s a lot of rich material for collaboration, and we will continue to come up with collaborative solutions.

How are the EDA design flows and the associated IP evolving and where do customers want to see them go?

It’s evolved a lot. It was mentioned earlier that multi-die systems are not new. We started working on them probably 12 years ago. But it’s only recently that the commercial significance and the complexity have grown, evolving from more of a hobbyist environment into a professional one. What we’re trying to do is move on from the design methods of a few years ago, which basically revolved around assembly: you have components, and you assemble the components together. Now we’re getting into more of a multi-die system type of activity, going from an assembly problem to more of a design automation problem, elevating it to where you’re designing the system together, because the chips are so co-dependent on each other. You can’t design the chiplets in isolation because there’s a host of inter-related dependencies.

Principally where we are as an industry, we’ve invested decades of work into highly complex products and flows, and we don’t want to throw that away, right? You don’t want to disrupt that. You want to ride on top of that and augment it.

Where I see the EDA space going: we will continue to see a lot of the fine-grained optimizations you would see in a traditional 2D problem space. Where I come from, in Place and Route, you have a lot of very nice, almost convex problems that traditional techniques are well suited to solve.

However, when you get to the system level, these problems get kind of lumpy, and your solution space can become highly non-convex and difficult to solve with traditional techniques. That’s where, looking into the future, AI and ML and these kinds of things can really help drive it forward.

So design has evolved from manual implementation, to computer-aided design, to electronic design automation, to AI-driven design automation. And probably in the future, instead of computer-aided design, maybe it becomes human-aided design. The AI will tell me “Hey Henry, I need that spec tightened up by next week. I need you to get that to me.” With the complexity, you really need the automation in order to reasonably build and optimize these systems.

Do you see multi-die system as a significant driver for this technology moving forward?

Yes. Take something like silicon lifecycle management, which is emerging for 2D: if it’s important for 2D, it’s even more so for 3D.

If you look at it from the standpoint of yield, normally you look at 2D dies and there’s the concept of known good die, so you can test each die before you put them all together. But if you look at a multi-die system, the system yield is the product of the individual yields. So even if you have all known good dies, you still have to put them together, and there are multiplicative factors. You can roughly translate that same type of analysis over to the overall health of the system as well, which depends on the multiplicative health of the components.
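
A quick numeric illustration of the multiplicative point (my numbers, purely hypothetical): with per-die yields and an assembly yield, the system yield is their product, so even very good dies compound into a meaningful loss:

```latex
Y_{\text{system}} = \left( \prod_{i=1}^{N} Y_i \right) \times Y_{\text{assembly}}

% e.g., four dies at 99% each, with 98% assembly yield:
0.99^4 \times 0.98 \approx 0.94 \quad (\text{about } 94\%)
```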

You have heterogeneous dies with different properties, different workloads, and different behaviors. So it becomes all the more important to keep on top of that with monitoring.

Thank you very much Henry!

Also Read:

Synopsys Accelerates First-Pass Silicon Success for Banias Labs’ Networking SoC

Multi-Die Systems: The Biggest Disruption in Computing for Years

Taking the Risk out of Developing Your Own RISC-V Processor with Fast, Architecture-Driven, PPA Optimization


Memory Solutions for Modem, EdgeAI, Smart IoT and Wearables Applications
by Kalar Rajendiran on 05-04-2023 at 10:00 am

PSRAM Interface Controller Block Diagram

Memories have always played a critical role, both in pushing the envelope of semiconductor process development and in supporting the varied requirements of different applications and use-cases. The list of the various types of memories in use today runs long. At a gross level, we can classify memories as volatile or non-volatile, read-only or read-write, static or dynamic, and so on. And when it comes to the cost, performance, power, and area/form factor of an electronic system, a lot rides on using the right memories for the application. The lion’s share of attention is paid to the effective use of Static Random Access Memories (SRAMs) and Dynamic Random Access Memories (DRAMs) according to the tradeoff benefits to be derived. While the need for higher-density memories that consume very low power and perform like SRAMs has always existed, applications were able to manage with a judicious mix of DRAMs and SRAMs.

But in recent years, fast-growing markets such as modem, edge connectivity, and EdgeAI have started demanding more from memories. Additionally, with the rise of the Smart Internet of Things (IoT) and wearable technology, there is an increasing demand for memory solutions that provide high performance and the low power consumption needed to extend battery life. These applications want memories that combine the performance and power benefits of SRAMs (over DRAMs) with the density and cost benefits of DRAMs (over SRAMs). Fortunately, such a memory type was invented quite a while ago: the Pseudo Static Random Access Memory (PSRAM). PSRAM manufacturers were waiting in the wings for adoption drivers such as the above-mentioned fast-growing applications. The list of PSRAM suppliers includes AP Memory, Infineon, Micron Technology, Winbond Technology, and others.

What is PSRAM? [Source: JEDEC.org]

(1) A combinational form of a dynamic RAM that incorporates various refresh and control circuits on-chip (e.g., refresh address counter and multiplexer, interval timer, arbiter). These circuits allow the PSRAM operating characteristics to closely resemble those of a SRAM.

(2) A random-access memory whose internal structure is a dynamic memory with refresh control signals generated internally, in the standby mode, so that it can mimic the function of a static memory.

(3) PSRAMs have nonmultiplexed address lines and pinouts similar to SRAMs.

Mobiveil

Mobiveil is a fast-growing technology company that specializes in the development of silicon intellectual property, platforms, and solutions for various fast-growing markets. Its strategy is to grow with burgeoning markets by offering its customers valuable IP that is easy to integrate into SoCs. One such IP is Mobiveil’s PSRAM Controller, which has been in mass production for more than half a decade, with customers across the US, Europe, Israel, and China. The controller is available in different system bus flavors, such as AXI and AHB, and supports a variety of PSRAM and HyperRAM devices from many suppliers. The company recently expanded that list with support for AP Memory’s latest 250MHz PSRAM devices.

AP Memory

AP Memory is a world leader in PSRAM and has shipped more than six billion PSRAM devices to date. The company has positioned itself as a market leader in PSRAM devices, providing a complete product line of high-quality memory solutions to support the IoT and wearables market segments. The company continuously launches competitive products and provides customized memory solutions based on customer requirements.

Mobiveil-AP Memory Partnership

This partnership is expected to bring significant benefits to SoCs, as PSRAM devices offer 10x higher density than eSRAM, 10x lower power than standard DRAM, and close to 3x fewer pins. These advantages translate into lower power consumption, higher performance, and cost savings for systems that leverage PSRAMs.

The result of the partnership is a controller IP that will provide cost-effective, ultra-low-power memory solutions for system designers. Mobiveil has adapted its PSRAM Controller to interface with AP Memory’s new PSRAM device that goes up to 250 MHz in speed and densities from 64Mb to 512Mb, supporting x8/x16 modes. This integration will allow SoC designers to take advantage of the high performance of the PSRAM controller at very low power, making it ideal for battery-operated applications, and extending the standby time of devices.

The PSRAM controller supports Octal Serial Peripheral Interface (Xccela standard), enabling speeds of up to 1,000 Mbytes/s for a 16-pin SPI option. Additionally, it provides support for a direct memory mapped system interface, automatic page boundary handling, linear/wrap/continuous/hybrid/burst support, and low power features like deep and half power down.
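
That throughput figure is consistent with simple bus arithmetic (my calculation, assuming double-data-rate signaling on a x16 data bus at the 250 MHz maximum clock):

```latex
16 \;\tfrac{\text{bits}}{\text{edge}} \times 2 \;\tfrac{\text{edges}}{\text{cycle}} \times 250 \times 10^6 \;\tfrac{\text{cycles}}{\text{s}} = 8000 \;\tfrac{\text{Mbit}}{\text{s}} = 1000 \;\tfrac{\text{Mbytes}}{\text{s}}
```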

The PSRAM controller IP is system validated, process technology independent and highly configurable. To learn more about this IP, download a copy of the detailed product brief here.

Summary

Mobiveil’s flexible business models, strong industry presence through strategic alliances and key partnerships, dedicated integration support, and engineering development centers located in Milpitas, CA, Chennai, Bangalore, Hyderabad and Rajkot, India, and sales offices and representatives located worldwide, have added tremendous value to customers in executing their product goals within budget and on time. To learn more, visit www.mobiveil.com.

Also Read:

CEO Interview: Ravi Thummarukudy of Mobiveil

Mobiveil’s PSRAM Controller IP Lets SoC Designers Leverage AP Memory’s Xccela x8/x16 250 MHZ PSRAM Memory

Mobiveil Flash Memory Summit Gold Sponsor to Showcase Broad Portfolio of Silicon IP Platforms and Engineering Services for SSD Development

Mobiveil and Avery Design Systems extend partnership to accelerate design and verification of NVMe 2.0-enabled SSD development