Expedera Proposes Stable Diffusion as Benchmark for Edge Hardware for AI
by Bernard Murphy on 02-05-2024 at 6:00 am

A recent TechSpot article suggests that Apple is moving cautiously towards releasing some kind of generative AI, possibly with iOS 18 and the A17 Pro. This is interesting not just for Apple users like me but also as broader validation of a real mobile opportunity for generative AI, which honestly had not seemed like a given, for multiple reasons. Finding a balance between performance and memory demand looks daunting for models baselining at a billion or more parameters. Will power drain be a problem? Then there are legal and hallucination issues, which perhaps could be managed through carefully limited use models. Despite the apparent challenges, I find it encouraging that a company which tends to be more thoughtful about product releases than most sees a possible path to success. If they can, then so can others, which makes a recent blog from Expedera enlightening for me.

A quick recap on generative image creation

Generative imaging AI is a field whose opportunities are only just starting to be explored. We’re already used to changing our backgrounds for Zoom/Google Meet calls, but generative AI takes this much further. Now we can re-image ourselves in different costumes with different features in imaginary settings – a huge market for image-conscious consumers. More practically, we should be able to virtually try on clothing before we buy or explore options when remodeling a kitchen or bathroom. This technology is already available in the cloud (for example Bing Image Creator) but with all the downsides of cloud-based services, particularly in privacy and cost. Most consumers want to interact with such services through mobile devices; a better solution would be local AI embedded in those platforms. Generative AI through the open-source Stable Diffusion model is a good proxy for hardware platforms to serve this need and more generally for LLM models based on similar core technologies.

Can on-board memory and performance be balanced at the edge?

First, we need to understand the Stable Diffusion pipeline. This starts with a text encoder to process a prompt (“I want to see a pirate ship floating upside down above a sea of green jello”). That step is followed by a de-noising neural net which handles the diffusion part of the algorithm, iteratively refining latent image data based on the trained parameters. I think of this as a kind of inverse to conventional image recognition: rather than mapping an image to a label, it maps the prompt’s requirements onto the model’s training to synthesize a matching image. Finally a decoder stage renders the image from the data constructed in the previous step. Each of these stages is a separate neural network; the text encoder is a transformer, while the de-noiser (a U-Net in Stable Diffusion) and the decoder combine convolutional and attention layers.
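
For readers who want to see those three stages concretely, here is a minimal sketch using the open-source Hugging Face diffusers library. It is only an illustration of the pipeline structure described above, not Expedera’s implementation; the model ID and step count are arbitrary choices.

```python
# Minimal Stable Diffusion run showing the three stages described above.
# Assumes the Hugging Face diffusers library and a CUDA-capable GPU;
# the model ID and step count are illustrative, not Expedera's setup.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a pirate ship floating upside down above a sea of green jello"

# Stage 1: the text encoder converts the prompt into conditioning embeddings.
# Stage 2: the de-noising network iterates num_inference_steps times on a latent.
# Stage 3: the decoder renders the final image from that latent.
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("pirate_ship.png")
```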

The Expedera blog author, Pat Donnelly (Solutions Architect), gives a detailed breakdown of the parameters, operations and data moves required throughout the algorithm, which I won’t attempt to replicate here. What stood out for me was the huge number of data moves. Yet he assumes only an 8MB working memory, sized to the requirements he is seeing from customers rather than for optimal throughput. When I asked him about this, he said that operation would clearly depend on a DDR interface to manage the bulk of this activity.

This is a switch from one school of thought I have heard – that model execution must keep everything in local memory to meet performance requirements. But that would require an unreasonably large onboard SRAM. DRAM makes sense for handling the capacity, but another school of thought suggests that no one would want to put that much DRAM in a mobile device. That would be too expensive. Also slow and power hungry.

DRAM or some other kind of off-chip memory makes more sense, but what about the cost problem? See the above reference on Apple. Apparently they may be considering flash memory, so perhaps this approach isn’t so wild. What about performance? Pat told me that for Stable Diffusion 1.5, assuming an 8K MAC engine with 7 MB internal memory, running at 750 MHz with 12 GB/s external memory bandwidth, they can process 9.24 images/second through the de-noiser and 3.29 images/second through the decoder network. That’s very respectable consumer-ready performance. Power is always tricky to pin down since it depends on so many factors, but numbers I have seen suggest this should also be fine for expected consumer use models.
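
As a sanity check on those figures, here is some back-of-envelope arithmetic. It is purely illustrative, not Expedera’s methodology, and it assumes the MAC array and the external interface can be kept fully busy.

```python
# Back-of-envelope arithmetic for the configuration quoted above.
# Illustrative only: assumes the MAC array and DDR interface are fully utilized.
macs = 8 * 1024                 # 8K MAC engine
clock_hz = 750e6                # 750 MHz
peak_macs_per_s = macs * clock_hz
peak_tops = 2 * peak_macs_per_s / 1e12          # 1 MAC = 2 ops (multiply + add)

dram_bw = 12e9                  # 12 GB/s external memory bandwidth
denoiser_fps = 9.24
max_bytes_per_image = dram_bw / denoiser_fps    # upper bound on off-chip traffic per pass

print(f"Peak compute: {peak_tops:.1f} TOPS")
print(f"Off-chip traffic budget per de-noiser pass: {max_bytes_per_image / 1e9:.2f} GB")
```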

A very useful insight. Seems like we should lay to rest the theory that big transformer AI for the edge cannot depend on off-chip memory. Again you can read the Expedera blog HERE.


What ChatGPT has to say about the Chiplet Summit
by Daniel Nenni on 02-04-2024 at 12:00 pm

Quite a few of the articles I now see about the semiconductor industry are written with AI. It is not hard to tell, especially for someone like me, a semiconductor professional with 40 years of experience who also writes. In the last 13 years (yes, SemiWiki is now a teenager) we have published 9,058 blogs. SemiWiki was the first of what are now many semiconductor blogging sites, and it is an honor to work with so many talented people on both sides of the table.

I already did a post on the Chiplet Summit but since chiplets are the next big semiconductor thing I thought it would be fun to see what ChatGPT had to say.

ChatGPT: Write an article about the Chiplet Summit:

Exploring the Future of Semiconductor Innovation: A Recap of the Chiplet Summit

Introduction: The technology landscape is constantly evolving, and one of the key drivers behind this evolution is the semiconductor industry. In recent times, a groundbreaking concept has been gaining traction in the semiconductor community – chiplets. The Chiplet Summit brings together industry leaders, researchers, and innovators to explore the role of chiplets in shaping the future of semiconductor design.

Understanding Chiplets: At its core, a chiplet is a modular component of a semiconductor design that serves a specific function. Rather than creating monolithic chips that encompass all functionalities, chiplets allow for the assembly of smaller, specialized components into a cohesive system. This modular approach has the potential to revolutionize the semiconductor industry by offering greater flexibility, scalability, and efficiency in the design and manufacturing processes.

Key Themes Explored at the Chiplet Summit:
  1. Interoperability and Standardization: One of the primary discussions at the Chiplet Summit revolves around establishing standards for chiplet design and interoperability. Creating a common framework ensures that chiplets from different manufacturers can seamlessly integrate into a single system. Standardization not only simplifies the design process but also encourages collaboration among industry players.
  2. Scalability and Customization: Chiplets pave the way for scalable semiconductor architectures, allowing designers to mix and match components based on specific application requirements. This scalability enables the creation of highly customized solutions for various industries, from consumer electronics to data centers. Attendees at the summit explore the potential applications and advantages of scalable chiplet architectures.
  3. Economic and Environmental Impacts: Chiplets have the potential to significantly reduce the economic and environmental costs associated with semiconductor manufacturing. By reusing proven chiplet designs across different products, manufacturers can streamline their production processes and minimize waste. The summit addresses how chiplets contribute to a more sustainable and cost-effective semiconductor ecosystem.
  4. Innovations in Integration Technologies: The Chiplet Summit serves as a showcase for the latest advancements in integration technologies. From advanced packaging techniques to novel interconnect solutions, participants explore the cutting-edge technologies that enable seamless integration of chiplets. These innovations play a crucial role in realizing the full potential of chiplet-based designs.
  5. Challenges and Solutions: While chiplets offer tremendous opportunities, they also present challenges, such as ensuring robust communication between chiplets, managing thermal issues, and addressing security concerns. The summit provides a platform for experts to discuss these challenges and collaboratively work towards viable solutions.

Conclusion: The Chiplet Summit stands at the forefront of semiconductor innovation, bringing together visionaries and thought leaders to explore the vast potential of chiplet technology. As the industry continues to embrace this modular approach to semiconductor design, the summit plays a pivotal role in shaping the future of electronics. With ongoing advancements and collaborative efforts, chiplets are poised to revolutionize the way we perceive and create semiconductor devices, unlocking new possibilities for technological innovation.

End:

In my opinion AI will in fact replace most of the people who write “popular” semiconductor articles because ChatGPT in effect does what they do: rehash what has already been written. SemiWiki bloggers, however, share their semiconductor experience, observations, and opinions, so we are safe, for now.

Also Read:

Chiplets Open Pandora’s Box

How Disruptive will Chiplets be for Intel and TSMC?

Will Chiplet Adoption Mimic IP Adoption?

UCIe InterOp Testchip Unleashes Growth of Open Chiplet Ecosystem


Podcast EP206: An assessment of the Chiplet Ecosystem Movement with Alphawave’s Tony Chan Carusone
by Daniel Nenni on 02-02-2024 at 10:00 am

Dan is joined by Tony Chan Carusone, Chief Technology Officer of Alphawave Semi. Tony has been a professor of Electrical and Computer Engineering at the University of Toronto since 2001 and has served as a Distinguished Lecturer for the IEEE Solid-State Circuits Society and on the Technical Program Committees of the world’s leading circuits conferences.

He co-authored the classic textbooks “Analog Integrated Circuit Design” and “Microelectronic Circuits” and he is a Fellow of the IEEE. Tony has also been a consultant to the semiconductor industry for over 20 years, working with both startups and some of the largest technology companies around the world.

Dan and Tony discuss the emerging movement toward a chiplet ecosystem. Tony describes some of the significant benefits of this approach, including ease of prototyping, power reduction and achieving density beyond the reticle limit. The flexibility offered by a chiplet ecosystem and the ability to more easily integrate optical interconnect are also discussed.

Tony describes Alphawave’s commitment to building a chiplet ecosystem, the technology and products that the company offers and an assessment of what the future holds. He offers a realistic view of how the chiplet ecosystem will mature that is based on past achievements in the industry.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview: Jay Dawani of Lemurian Labs
by Daniel Nenni on 02-02-2024 at 6:00 am

Jay Dawani is the co-founder and CEO at Lemurian Labs, a startup developing an accelerated computing platform tailored specifically for AI applications. The platform breaks through the hardware barriers to make AI development faster, cheaper, more sustainable, and accessible to more than just a few companies.

Prior to founding Lemurian, Jay founded two other companies in the AI space. He is also the author of the top-rated “Mathematics for Deep Learning.”

An expert across artificial intelligence, robotics and mathematics, Jay has served as the CTO of BlocPlay, a public company building a blockchain-based gaming platform, and served as Director of AI at GEC, where he led the development of several client projects covering areas from retail, algorithmic trading, protein folding, robots for space exploration, recommendation systems, and more. In his spare time, he has also been an advisor at NASA Frontier Development Lab, Spacebit and SiaClassic.

The last time we featured Lemurian Labs you were focused on robotics and edge AI. Now you’re focused on data center and cloud infrastructure. What happened that made you want to pivot?

Indeed, we did transition from building a high-performance, low-latency system-on-chip for autonomous robotics applications that could accelerate the entire sense-plan-act loop to building a domain-specific accelerator for AI focused on datacenter-scale applications. But it wasn’t just an ordinary pivot; it was a clarion call we felt we had the responsibility to answer.

In 2018, we were working on training a 2.1-billion-parameter model, but we abandoned the effort because the cost was so extraordinarily high that we couldn’t justify it. So imagine my surprise that GPT-3, which OpenAI released as ChatGPT in November 2022, was a 175-billion-parameter model. This model is more than 80X larger than what we were working on merely 4 years earlier, which is both exciting and frightening.

The cost of training such a model is staggering, to say the least. Based on current scaling trends, we can expect the cost of training a frontier AI model to exceed a billion dollars in the not too distant future. While the capabilities of these models will be astounding, the cost is ridiculously high. Based on this trajectory, only a handful of very well resourced companies with their own datacenters will be able to afford to train, deploy and fine-tune these models. This isn’t purely because compute is expensive and power hungry, but also because the software stacks we rely on were not built for this world.

Because of geographical and energy constraints, there are only so many places to build datacenters. To meet the compute demands of AI, we need to be able to build zettascale machines without requiring 20 nuclear reactors to power them. We need a more practical, scalable and economical solution. We looked around and didn’t see anyone on a path to solving this. And so, we went to the drawing board to look at the problem holistically as a system of systems and reason about a solution from first principles. We asked ourselves: how would we design the full stack, from software to hardware, if we had to economically serve 10 billion LLM queries a day? We’ve set our sights on a zettascale machine in under 200 MW by 2028.

The trick is to look at it from the point of view of incommensurate scaling –  different parts of a system follow different scaling rules, so at some point things just stop working, start breaking or the cost benefit tradeoff no longer makes sense. When this happens, the only option is to redesign the system. Our assessment and solution encompasses the workload, number system, programming model, compiler, runtime and hardware holistically.

Thankfully, our existing investors and the rest of the market see the vision, and we raised a $9M seed round to develop our number format, PAL; to explore the design space and converge on an architecture for our domain-specific accelerator; and to architect our compiler and runtime. In simulations, we’ve been able to achieve a 20X throughput gain in a smaller energy footprint than modern GPUs, and we project an 8X benefit in system performance for total cost of ownership on the same transistor technology.

Needless to say, we’ve got a lot of work ahead of us, but we’re pretty excited about the prospect of being able to redefine datacenter economics to ensure a future where AI is abundantly available to everyone.

That certainly sounds exciting and those numbers sound impressive. But you have mentioned number systems, hardware, compilers and runtimes as all the things you’re focused on – it sounds like a lot for any company to take on at once. It seems like a very risky proposition. Aren’t startups supposed to be more focused? 

It does sound like a lot of different efforts, but it is, in fact, one effort with a lot of interconnected parts. Solving only one of these components in isolation from the others will only hinder the potential for innovation because it results in overlooking the systemic inefficiencies and bottlenecks. Jensen Huang said it best: “In order to be an accelerated computing company, you have to be a full stack company”, and I fully agree. They are the current market leader for a reason. But I would challenge the notion that we are not focused. Our focus is on how we think about the problem holistically and how best to solve it for our customers.

Doing that requires a multidisciplinary approach like ours. Each part of our work informs and supports the others, enabling us to create a solution that is far more than the sum of its parts. Imagine if you had to build a racecar. You wouldn’t arbitrarily pick a chassis, add racing tires and drop in the most powerful engine you can find and race it, right? You would think about the aerodynamics of the car’s body to reduce drag and enhance downforce, optimize the weight distribution for good handling, custom design the engine for maximum performance, get a cooling system to prevent overheating, spec a roll cage to keep the driver safe, etc. Each one of these elements builds upon and informs the others.

That said, it is risky to try and do all of it at once for any company in any industry. To manage the risks we are taking a phased approach, allowing us to validate our technology with customers and adjust our strategy as needed. We have proven our number format works and that it has better power-performance-area than equivalent floating point types, while also having better numerical properties which make it easier to quantize neural networks to smaller bit-widths. We have designed an architecture which we feel confident in, and it is suitable for both training and inference. But more important than all of that is getting the software right, and that is the bulk of our immediate focus. We need to ensure we make the right decisions in our software stack for where we see the world being a year or two or more from today.

Building a hardware company is tough, expensive and takes a long time. The focus on software first sounds like a very viable business on its own, and potentially more appealing to investors in the current climate. Why are you also doing hardware, given that so many well-funded companies in the space are closing their doors, struggling to get adoption with customers, and larger players are building their own hardware?

You’re absolutely correct that software businesses have generally been able to raise capital much more easily than hardware companies, and that hardware is very tough. Our current focus is very much on software because that’s where we see the bigger problem. Let me be clear: the problem isn’t whether I can get kernels running on a CPU or GPU with high performance; that’s a long-solved problem. The problem today is how we make it easier for developers to productively get more performance out of several-thousand-node clusters of heterogeneous compute without asking them to overhaul their workflow.

That’s the problem we’re currently focused on solving with a software stack that gives developers superpowers and unlocks the full capability of warehouse scale computers, so we can more economically train and deploy AI models.

Now, regarding investment, yes, VCs are being more selective in the kind of companies they back, but it also means VCs are looking for companies with the potential to offer truly groundbreaking products that have a clear path to commercialization while having significant impact. We’ve learned from the challenges and mistakes of others and have actively designed our business model and roadmap to address the risks. It’s also important to take note that what’s made startups successful has rarely been how easily they can raise VC funding, but has more to do with their resourcefulness, stubbornness and customer focus.

And before you ask, we are still working on hardware, but primarily in simulation right now. We don’t intend to tape out for a while. But we can save that conversation for another time.

That is certainly compelling and your phased approach is very different compared with what we’ve seen other hardware companies do. I understand the problem you’re saying your software stack will address, but how does your software differentiate from the various efforts in the market?

Most of the companies you’re referring to are focusing on making it easier to program GPUs by introducing tile-based or task-mapping programming models to get more performance out of GPUs, or building new programming languages to get high performance kernels scheduled on different platforms with support for in-line assembly. Those are important problems that they’re addressing, but we see the problem we’re solving as almost orthogonal.

Let’s for a moment think about the cadence of hardware and software transitions. Single-core architectures gained performance from clock speed and transistor density, but eventually clock speeds hit a plateau. Parallelism using many cores circumvented this and provided sizable speedups. It took software roughly a decade to catch up, because programming models, compilers and runtimes had to be rethought to help developers extract the value in this paradigm. Then, GPUs started becoming general purpose accelerators, again with a different programming model. Again, it took almost a decade for developers to extract value here.

Again, hardware is hitting a plateau – Moore’s law, energy and thermal constraints, memory bottlenecks, and the diversity of workloads plus the need for exponentially more compute are pushing us towards building increasingly heterogeneous computer architectures for better performance, efficiency and total cost. This shift in hardware will of course create challenges for software because we don’t have the right compilers and runtimes to support the next evolution of computing. This time, though, we shouldn’t have to wait another 10 years for software to extract the value of heterogeneous architectures or large clusters, especially when they are sitting more than 80% unutilized.

What we’re focusing on is building a heterogeneity-aware programming model with task-based parallelism, addressing portable performance with cross-processor optimizations, context-aware compilation and dynamic resource allocation. And for us, it doesn’t matter whether it’s a CPU, GPU, TPU, SPU (Lemurian’s architecture) or a mesh of all of them. I know that sounds like a lot of fancy words, but what it’s really saying is that we’ve made it possible to program any kind of processor with a single approach, and we can port code from one kind of processor over to another with minimal effort without needing to sacrifice performance, and schedule work adaptively and dynamically across nodes.
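
As a purely conceptual illustration of the task-based scheduling idea described above, here is a toy sketch. It is not Lemurian’s runtime; the device names, relative speeds, and greedy placement policy are all illustrative assumptions.

```python
# Toy illustration of task-based scheduling across heterogeneous processors.
# Not Lemurian Labs' runtime: device names, relative speeds, and the greedy
# earliest-available policy are assumptions for illustration only.
import heapq

DEVICES = {"cpu0": 1.0, "gpu0": 8.0, "spu0": 16.0}   # relative throughput

def schedule(tasks):
    """Greedily place each task on the device that becomes free earliest."""
    heap = [(0.0, name) for name in DEVICES]          # (time device is next free, name)
    heapq.heapify(heap)
    placements = []
    for task_id, work in tasks:
        free_at, name = heapq.heappop(heap)
        finish = free_at + work / DEVICES[name]
        placements.append((task_id, name, finish))
        heapq.heappush(heap, (finish, name))
    return placements

if __name__ == "__main__":
    tasks = list(enumerate([32, 8, 64, 16, 4, 128]))  # (task id, work units)
    for task_id, device, finish in schedule(tasks):
        print(f"task {task_id} -> {device}, finishes at t={finish:.2f}")
```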

If what you say is true you may just completely redefine computing. Let’s talk about funding. You raised $9 million in seed funding last year which signifies strong investor support and belief in your vision. What have you done since?

Over the past year, fueled by the seed funding, we have made significant strides. With our team now at 20 members, we meticulously addressed challenges, engaged with customers and refined our approach.

We focused on enhancing PAL for training and inference, explored computer architecture for our accelerator and developed a simulator for performance metrics. Simultaneously, we reimagined our software stack for datacenter applications, emphasizing heterogeneous computing.

This effort resulted in a well-defined architecture, showcasing PAL’s efficacy for AI at scale. Beyond tech advancements, we pursued collaborations and outreach to democratize access. These efforts position Lemurian Labs to address immediate customer challenges, gearing up for the release of our production silicon.

What are Lemurian Labs’ medium-term plans regarding software stack development, collaborations, and the improvement of the accelerator’s architecture?

Our immediate goal is to create a software stack that targets CPUs, GPUs and our AI accelerators with portable performance, which will be made available to early partners at the end of the year. We’re currently in conversations with most of the leading semiconductor companies, cloud service providers, hyperscalers and AI companies to give them access to our compiler and runtime. In parallel, we continue to work on and improve our accelerator’s architecture for a truly co-designed system of hardware and software. And of course, we have just started raising our series A with very strong interest from the investor community, which will enable us to grow our team and meet our target for software product delivery at the end of the year.

In closing, how do you see Lemurian Labs contributing to changing the landscape of AI development, accessibility and equity in the coming years?

We didn’t set out to redefine computing only for commercial gain or for the fun of it. As Lemurians, our driving force is that we believe in the transformative potential of AI and that more than just a few companies should have the resources to define the future of this technology and how we use it. We also don’t find it acceptable that the datacenter infrastructure for AI is on track to consume as much as 20% of the world’s energy by 2030. We all came together because we believe there is a better path forward for society if we can make AI more accessible by dramatically lowering its associated cost, accelerate the pace of innovation in AI and broaden its impact. By addressing the challenges of current hardware infrastructure, we seek to pave the path to empowering a billion people with the capabilities of AI, ensuring equitable distribution of this advanced technology. We hope our commitment to product-focused solutions, collaboration and continuous innovation positions us as a driving force in shaping the future of AI development to be a positive one.

Also Read:

Luc Burgun: EDA CEO, Now French Startup Investor

CEO Interview: Sridhar Joshi of TenXer

CEO Interview: Suresh Sugumar of Mastiska AI


2024 Outlook with Niels Faché of Keysight EDA
by Daniel Nenni on 02-01-2024 at 10:00 am

We have been working with Keysight EDA for the past two years and it has been an honor. The technical depth we have reached with them is impressive. Niels Faché, VP & GM, Keysight EDA, is responsible for Keysight’s design and simulation portfolio. Niels has been part of the Keysight-Agilent-HP family since 1994 when HP acquired Alphabit, a company he co-founded.

Tell us a little bit about Keysight EDA.
Keysight EDA is the design and simulation business inside Keysight Technologies’ test and measurement business. Our EDA solutions are focused on high-speed, high-frequency design problems. They include RF and microwave circuit and system design for MMICs, RFICs and modules, phased arrays, wireless and satellite systems, high-speed digital design for signal integrity, power integrity of complex modules and boards, memory and interconnect design, semiconductor device modeling, power electronics, and of course process and data management (Cliosoft).

What was the most exciting high point of 2023 for your company?
The acquisition of Cliosoft in the design data and IP management space. Progress on partnerships with Synopsys, Ansys, and Altium, and with TSMC and Samsung on the foundry side. Introduction of new capabilities in our software suite that support mmWave workflows, Python automation, AI/ML in device modeling and more.

What was the biggest challenge your company faced in 2023?
Being recognized as a top-tier rather than niche EDA industry player. We made headway on this with our involvement in the ESD Alliance and showing at DAC.

How is your company’s work addressing this biggest challenge?
We are broadening our product portfolio to appeal to a larger audience. We are addressing customer pain points with design complexity and time to market requirements by focusing on connected design and test workflows and shift left from physical to virtual prototypes where issues can be resolved earlier and more cost effectively.

What do you think the biggest growth area for 2024 will be, and why?
In 2024, engineers will continue embracing shift left with their electronic product development cycles. As design moves from the physical into the virtual space, engineers are able to quickly discover and fix problems in the most efficient manner, providing greater insights and performance improvements. The next few years will see a continuing emphasis on connecting design and test workflows to handle rising complexity and more demanding time-to-market requirements for electronic products in wireless, wired, aerospace/defense, and other industries. Here are the emerging electronic design innovations we see for 2024:

3DIC and Heterogeneous Chiplets: New Standards Come into View
New standards such as UCIe are emerging for the creation of chiplets and the disaggregation of system-on-chip designs into smaller pieces of intellectual property that can be assembled into 2.5D and 3D integrated circuits using advanced packaging. For designers to accurately simulate die-to-die physical-layer interconnect, they will need high-speed, high-frequency channel simulation against UCIe and other standards.

EDA Turns to AI: From Complexity to Clarity
The application of AI and ML techniques in EDA is still in the early-adopter phase, with design engineers exploring use cases to simplify complex problems. These techniques are particularly valuable in model development and validation for simulation, where they assist in processing large volumes of data. In 2024, organizations will increasingly adopt both technologies for device modeling of silicon and III-V semiconductor process technologies, as well as system modeling for forthcoming standards such as 6G, where research is well underway.

Software Automation Empowers Engineers
As Moore’s Law reaches its limits, improving design processes through workflow automation will provide a pathway to increasing the productivity of design engineers. In 2024, software automation techniques, such as Python APIs, will take a more significant role in integrating “best-in-class” tools into open, interoperable design and test ecosystems.

Navigating the Digital Shift: Design Management Essentials
With the creation of digital enterprise workflows, many organizations are investing in design management across tool sets, data, and IP. Moving forward, design data and IP management software will play a critical role in the success of complex SoC and heterogeneous chiplet designs supporting large, geographically distributed teams. Creation of digital threads between requirements definition and compliance, and establishing tighter links with enterprise systems such as PLM also play a role in digital transformation of product development cycles.

Next-Gen Quantum Design: Optimizing System Performance
Quantum computing is advancing at a rapid pace and is transitioning from predominantly free research tools to commercial products and workflows in quantum design. Next-generation quantum design will require more integrated simulation workflows that provide developers with fast and accurate capabilities to optimize system performance.

Silicon Photonics Fuels Data Center Transformation
Data centers are evolving to provide higher compute performance to support the exponential growth in AI and ML workloads, as well as the need for more efficient power utilization and thermal management. Silicon photonics will play a critical role in accelerating the transformation of data centers to meet the appetite for compute performance. As design engineers develop high-speed data center chips that incorporate silicon photonics interconnect, they will need process design kits (PDKs) and accurate simulation models that support the advanced development work.

How is your company’s work addressing this growth?
We work closely with customers to solve their design, simulation, and workflow problems. We focus on growth sub-segments of the market and are seeing strong growth in our high-speed digital business. We are launching new simulation engines and workflows, for example focused on UCIe simulation for chiplets. We incorporate measurement science into our products to make them more accurate.

What conferences did you attend in 2023 and how was the traffic?
DesignCon, OFC, TSMC Tech Symposium and OIP Forum, DAC, IMS, European Microwave Week and others. DAC was the highlight for us last year. We had a very strong presence with many customer meetings. It was a big success leveraging Cliosoft’s past approach to DAC for Keysight EDA. Our EDA business is relationship driven and we value show participation as a way to foster those relationships and initiate new relationships. We will be participating in the same conferences this year in addition to the Chiplet Summit and SNUG.

Also Read:

Higher-order QAM and smarter workflows in VSA 2023

Advanced EM simulations target conducted EMI and transients

Seeing 1/f noise more accurately


Cadence Claims the CFD High Ground with a New GPU-Based Accelerator
by Bernard Murphy on 02-01-2024 at 6:00 am

For observers of EDA markets there is an easily overlooked opportunity for new growth. Today around 50% of EDA revenues come from systems companies rather than semiconductor companies, from datacenters to automotive, aerospace, energy, and others. In most of these industries total system design depends as much on mechanical and other multiphysics optimizations (aerodynamics, stress, thermal, electromagnetics, etc.) as on electronic design. Multiphysics analysis has already penetrated semiconductor design, for example in-package to in-system thermal analysis and management using computational fluid dynamics (CFD) for cooling analytics. In short, multiphysics bridges electronic system design and total system design, which is critical to supporting power generation, airline, and automotive markets. As in chip design, system problems in these domains keep getting harder, demanding active and continuous innovation from solution providers to address modern design needs. Exploiting synergies between EDA and multiphysics expertise, Cadence claims the Millennium platform delivers a breathtaking performance advance for multiphysics analytics, solving industrial-scale problems in hours rather than weeks and opening large new growth opportunities.

A Fast Layman’s Guide to CFD

CFD simulates the flow of a fluid (liquid or gas) around or through mechanical structures like circuit boards, aircraft, gas turbines and cars. Without CFD, these measurements must be made on prototypes, for example in wind tunnels, an expensive and time-consuming process. With CFD, engineers can shift left (a familiar concept in EDA) to study the performance of a digital twin against simulated fluid flows.

Simulations are based on the Navier-Stokes differential equations, mapped across discrete meshes to enable numerical solving. Meshes are designed with finer spacing around critical zones and coarser spacing elsewhere, and commonly run to many millions of elements. Factors considered in solving across the mesh include pressure and temperature, and also viscosity, because all fluids flow more slowly near boundaries. Compressibility can be important when considering acoustics or Mach speeds; turbulence is another factor at high speeds. These factors have enough impact on mesh and solver methods that CFD must provide a family of technology solutions.
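
To make “mapped across discrete meshes” concrete, here is a toy one-dimensional diffusion solve using an explicit finite-difference stencil. It is only meant to show the discretization idea; industrial CFD solves the full Navier-Stokes equations on 3D unstructured meshes with far more sophisticated numerics.

```python
# Toy 1D diffusion (du/dt = nu * d2u/dx2) on a uniform mesh, explicit stencil.
# Illustrates discretizing a differential equation over a mesh; real CFD is
# vastly more involved (3D, unstructured meshes, turbulence, compressibility).
import numpy as np

nx, dx, dt, nu = 101, 0.01, 2e-5, 0.1   # mesh points, spacing, time step, viscosity
u = np.zeros(nx)
u[nx // 2] = 1.0                        # initial concentration spike mid-mesh

for _ in range(5000):
    # central-difference second derivative, forward Euler in time
    u[1:-1] += nu * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])

print(f"Peak value after diffusion: {u.max():.4f}")
```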

Turbulence is the most challenging condition to simulate accurately. The most widely practiced technique in industry today develops static averages, a weak approximation for a dynamic phenomenon able to deliver accurate CFD around an airplane wing at cruising altitude but not during ascent or descent. A different technique called Large Eddy Simulation (LES) can model much more accurately and dynamically but is more expensive in computation, making extensive turbulence modeling through a digital twin impractical. Thus critical analyses have been limited to real physical modeling using prototypes in wind tunnels, effective but too cumbersome to explore thousands of scenarios for optimization.

Cadence Authority in CFD and LES

CFD is a high expertise domain with a lot of history. Tool departments and often product teams are staffed with armies of PhDs. Algorithms for meshes and solvers, together with software, have evolved significantly and of course continue to evolve. In other words this is a domain an EDA company must enter inorganically.

Cadence started here in 2021 with a series of acquisitions. These include NUMECA, with strong meshing and solver technologies and an established reputation in marine and turbomachinery applications. Shortly after, Cadence acquired Pointwise, with proven strength in CFD meshing and an established presence in aerospace and defense markets. In late 2022 they acquired Cascade Technologies, a Stanford spinout with compelling technology for LES. Through these acquisitions Cadence has built a stable of thoroughbred technology and experts in CFD, adding to their established strength in other aspects of multiphysics. But it seems they didn’t stop there.

Industries are desperate for higher LES performance for more accurate digital twin modeling. As one example, 50% of the energy consumed by a car goes to overcoming aerodynamic drag, directly affecting ICE fuel consumption or EV range. Designers need digital twins to simulate over thousands of operating conditions to find and optimize the many small improvements they can make around the car structure to reduce drag. How did Cadence step up to this need?

Cadence Millennium M1 and the Fidelity LES Solver

CFD is very parallelizable, so an obvious solution is to run a job across many server/CPU clusters. This was already possible on big CPU farms or supercomputers, but cost becomes prohibitive when running complex LES algorithms over very large meshes with experiments spanning thousands of runs. Overcoming this barrier has been one of the drivers prompting development of Millennium M1, Cadence’s first GPU-based accelerator.

Cadence has a proven track record in hardware acceleration across multiple generations of the Palladium and Protium platforms for hardware verification. They have worked out the design, operations, and supply chain kinks to build these platforms and they have established infrastructure to provide cloud-based access. (All platforms including Millennium can also be purchased for on-premises analysis.) Extending this expertise to a GPU-based platform is both obvious and brilliant. In one stroke (though I’m sure it took them time to get there 😀) they can accelerate CFD simulations. Adding new generative AI methods for design and analysis exploration, they claim, delivers up to 100X design impact in accuracy, speed, and scale at much lower power compared with massive CPU server parallelism. Hardware acceleration built on Cadence’s hardware know-how, combined with genAI expertise from both EDA and CFD teams, demonstrates the synergy required to deliver the active and continuous innovation I mentioned earlier.

CFD algorithm development has also been very active. Software is designed from the ground up to be GPU native. Problem preparation for analysis includes low-touch optimized mesh generation. And there are new numerical methods to ensure high stability in LES simulations (normally prone to unphysical behavior in turbulence modeling).

This capability is available today for CFD multiphysics modeling, in the cloud or on-premises.

Millennium is Not Just for CFD

It is obvious that a GPU-based accelerator should be able to do more than accelerate CFD. It could accelerate finite element analyses such as stress, thermal diffusion, and electromagnetics. It can also run generative AI. But why not just use one of the giant hyperscaler GPU banks for that purpose? For me one reason is simply availability and latency in competition with chatbots and creative image apps. Equally it is hard to believe that application-specific fine-tuning on top of a mass-market LLM model could serve the high complexity, high accuracy, and domain-specific needs of modern EDA and multiphysics software. Dedicated hardware is the way to go, accessible through the cloud or in on-premises installations.

It will be very interesting to see what capabilities Millennium will offer in the future both for electronic design and for multiphysics. You can learn more HERE.

Also Read:

2023 Retrospective. Innovation in Verification

Information Flow Tracking at RTL. Innovation in Verification

ML-Guided Model Abstraction. Innovation in Verification


2024 Outlook with Justin Endo of Mixel
by Daniel Nenni on 01-31-2024 at 10:00 am

Mixel is a semiconductor intellectual property (IP) company that we have been working with for 4 years with outstanding results.  Mixel’s focus is on developing mixed-signal IP, which includes analog and digital components. These IP cores are often used in applications such as mobile devices, consumer electronics, automotive systems, and other embedded systems where both analog and digital functions are required.

The company provides a wide range of IP cores for various interfaces and standards, including MIPI (Mobile Industry Processor Interface) cores, multi-standard IP cores, and LVDS serializer/deserializer cores. Their IP solutions aim to enable efficient and reliable communication between different components within electronic devices.

Tell us a little bit about yourself and your company.
Mixel is a leading provider of mixed-signal interface IPs. We offer a wide portfolio of high-performance mixed-signal connectivity solutions. Mixel’s mixed-signal portfolio includes PHYs and SerDes, such as MIPI PHYs (MIPI D-PHY, MIPI C-PHY, and MIPI M-PHY), LVDS, and Multi-standard SerDes. I lead Mixel’s marketing and sales strategy and implementation.

What was the most exciting high point of 2023 for your company?
2023 was a big year for us. We celebrated our 25th anniversary as a company. We hosted many of our partners, customers, and employees past and present at our global HQ in San Jose. It was great to reconnect with familiar faces and celebrate such a significant milestone with those who made it possible.

What was the biggest challenge your company faced in 2023?
Like many in our industry, there was a noticeable slowdown in the first half of 2023. While we are somewhat insulated as an IP provider, we still saw how dramatically things can change in a short period of time. Many of our customers pushed off their decision-making to later in the year. Thankfully, unlike many companies, small and large, that went through reduction of force, we increased our overall headcount. Last year was a good time to hire high quality talent due to the slow down, and a great time to focus on new product development.

How is your company’s work addressing this biggest challenge?
The best decision we made was to not slow down IP development during the downturn. After a slow first half of 2023, the second half was stronger than we expected. We are hoping to leverage this momentum coming into 2024 and we believe it will pay off in the coming months and years.

What do you think the biggest growth area for 2024 will be, and why?
As a leading provider of MIPI PHY IP, we have seen major growth in MIPI in automotive and AI-enabled applications. As the number of sensors, cameras, and displays in cars increases, we see a corresponding increase in demand for automotive ICs on both the sensor/display side and the processor side. MIPI is seen as the de facto standard in many of those applications, particularly MIPI CSI-2 for low-power cameras and sensors.

How is your company’s work addressing this growth?
To meet our automotive customers’ requirements, we are ISO 26262 and ISO 9001 certified. Our process is certified up to ASIL-D and we have multiple IP configurations certified up to ASIL-B. We continue to develop for the latest MIPI specifications, and having our ASIL-D-certified process already in place allows us to provide automotive-grade IP with functional safety documentation to support our customers’ safety-critical applications.

What conferences did you attend in 2023 and how was the traffic?
As an active contributor to the MIPI Alliance and the development of its specifications, we attend all three face-to-face meetings each year. In addition, we sponsor many industry events including foundry exhibitions such as TSMC Technology Symposium & TSMC OIP Ecosystem Forum, GlobalFoundries Technology Summit, Samsung Foundry Forum & SAFE Forum, and Tower Semiconductor Technical Global Symposium. At these events, we showcase customers who have integrated our IP into their products, such as Microsoft’s HoloLens 2 and NXP’s i.MX7ULP applications processor. Attendees at these events really enjoy interacting with the technology, and in terms of traffic to our booth, 2023 was the highest in recent years for several events, even surpassing pre-COVID numbers.

Will you attend conferences in 2024? Same or more?
We are aiming to attend the same events as last year and are actively looking to add others, particularly those outside the US, such as in China and Europe.

Additional questions or final comments?
Hope to see you at our next event!

Automotive-grade MIPI PHY IP drives multi-sensor solutions

MIPI D-PHY IP brings images on-chip for AI inference

MIPI bridging DSI-2 and CSI-2 Interfaces with an FPGA


Chiplets Open Pandora’s Box
by Daniel Nenni on 01-31-2024 at 6:00 am

Chiplets have simplified one area of design but opened Pandora’s box on another front. The simulation complexity of each chiplet is lower, but now the chiplet-to-chiplet interconnect has become complex. Folks are experimenting with different interconnect protocols, variations of UCIe, modified UCIe settings, interface speeds, the number of physical layers and so on. Now add legacy standards like AXI, new protocols like PCIe 6.0, and cache coherency to the mix.

All in all, this creates a completely new set of experiments, one for which traditional emulation and RTL modeling will not work. You first need to spend effort on architecture trade-offs, not just on selecting components. This means you will have to conduct traffic analysis, application partitioning and system sizing, and assess the impact of different types of physical layer. Also, depending on the application, the benchmark will be very different.

The UCIe specification is new and there are no clear benchmarks. Also, the UCIe specification only provides guidance on latency and power, and both are stringent requirements. This means that a power-performance-area study is inevitable. And with protocol-to-protocol-to-protocol conversion, such as PCIe 6.0 to UCIe to AXI, the modeling setup is complex.
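
To illustrate why the multi-hop conversion complicates the modeling, here is a trivial latency-budget sketch. All latency and payload numbers are placeholders, not values from the UCIe, PCIe, or AXI specifications, and this is not how VisualSim works; a real model must also capture queueing, retries, flow control, and traffic-dependent effects.

```python
# Trivial latency-budget sketch for a PCIe 6.0 -> UCIe -> AXI path.
# All numbers below are placeholders for illustration, NOT values from the
# UCIe/PCIe/AXI specifications or from VisualSim.
HOPS_NS = {
    "PCIe 6.0 controller": 50.0,
    "PCIe-to-UCIe bridge": 20.0,
    "UCIe die-to-die link": 10.0,
    "UCIe-to-AXI bridge": 15.0,
    "AXI fabric to target": 25.0,
}

payload_bytes = 256
link_bw_gbytes = 64            # assumed effective link bandwidth in GB/s

serialization_ns = payload_bytes / (link_bw_gbytes * 1e9) * 1e9
total_ns = sum(HOPS_NS.values()) + serialization_ns

for hop, ns in HOPS_NS.items():
    print(f"{hop:24s} {ns:6.1f} ns")
print(f"{'serialization':24s} {serialization_ns:6.1f} ns")
print(f"{'TOTAL (one way)':24s} {total_ns:6.1f} ns")
```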

One solution is to look at system-modeling using VisualSim from Mirabilis Design. They have recently launched a UCIe System-level IP model and will be demonstrating a number of use cases of the interconnect at the Chiplet Summit. To guide designers along, they have published a guide with lots of use cases, expected power-performance results and options for optimization. They have both a paper presentation and a booth at the Summit. I hope to see you there!

Also, here is the link for a paper that folks can get: Performance modeling of a heterogeneous computing system based on the UCIe Interconnect Architecture

Abstract:

Today’s complex chip designs at leading-edge nodes generally consist of multiple dies (or chiplets). The approach allows for dies from different manufacturers or processes, as well as reusable IP. Designers need a system level model to evaluate different implementations of such complex situations.

An example system consists of an I/O chiplet, low power core chiplet, high-performance core chiplet, audio-video chiplet, and analog chiplet, interconnected using the Universal Chiplet Interconnect Express (UCIe) standard.

Our team considered several scenarios and configurations including advanced and standard packages, varied traffic profiles and resources, and a retimer to extend the reach and evaluate events on timeout. Identifying the strengths and weaknesses of the UCIe interconnect for mission applications helped us obtain the optimal configuration for each subsystem to meet performance, power, and functional requirements.

About Mirabilis Design Inc.

Mirabilis Design is a Silicon Valley software company providing software and training solutions to identify and eliminate risk in the product specification, accurately predict the human and time resources required to develop the product, and improve communication between diverse engineering teams.

VisualSim Architect combines Intellectual Property, system-level modeling, simulation, environment analysis and application templates to significantly improve model construction, simulation, analysis and RTL verification. The environment enables designers to rapidly converge to a design which meets a diverse set of interdependent time and power requirements. It is used very early in the design process in parallel with (and as an aid to) the written specification and before an implementation (for example, RTL, software code, or schematic) of the product.

Also Read:

WEBINAR: How to Achieve 95%+ Accurate Power Measurement During Architecture Exploration

Mapping SysML to Hardware Architecture

Model-Based Design Courses for Students


A Rare Offer from The SHD Group – A Complimentary Look at the RISC-V Market
by Mike Gianfagna on 01-30-2024 at 10:00 am

The web is a wonderful place to find information on almost any topic. While top-level information is easy to find, a deep dive often requires the services of a market research firm. These organizations specialize in “going deep” on many technology topics, offering insights not available with a Google search. And these services aren’t typically free. Access to focused research can get pricey. So, when a top-drawer research firm offers an 80-page report on a topic as hot as RISC-V for free, that gets my attention. Read on to learn about a rare offer from the SHD Group – a complimentary look at the RISC-V market.

About The SHD Group

The SHD Group was formed about four years ago. The organization brings together skills and services that focus on go-to-market strategies, marketing, market analysis, business development, sales pipeline building and closure on profitable opportunities. It operates across a range of markets, including AI, semiconductor, smart sensors, RISC-V, consumer and automotive.

The leadership team has decades of experience spanning all of these areas and more. The company offers first-hand executive experience, know-how and expertise in AI “from Edge to Cloud & back”.  You can learn more about the breadth and depth of experience offered by this team here. I want to take a moment to focus on the author of the RISC-V report.

Rich Wawrzyniak

Rich Wawrzyniak brings over 35 years of semiconductor industry experience to the organization. Previously, Rich spent 20 years focusing on market analysis at Semico Research. His demonstrated skills span sales management to corporate planning, with an expertise in ASICs, SoCs, SIP, memory, and design starts, as well as emerging areas like AI, RISC-V, and chiplets.

I personally know Rich from his time at Semico when I was in both ASIC and EDA. I have great memories of many in-depth conversations that led to insightful analysis that helped the companies I worked for in measurable ways. Rich has an inquisitive and highly analytical nature – he gets to the facts that matter and presents results with compelling detail and accuracy. All this fueled my enthusiasm when I found out some of his work was being offered to all for free. This is a great opportunity.

About the RISC-V Report

I attended the recent RISC-V Summit and I can tell you the movement is gaining momentum fast. Open source has found its way into mainstream chip design. It’s exciting to watch. There are a lot of moving parts across many markets, so a comprehensive report that puts all this into perspective is quite valuable. The report offers a top-level, global market view, containing 80 pages, 30 tables, 16 figures and an accompanying ecosystem guide.

Some highlights covered in the report include:

RISC-V SoC Market Growth

  • RISC-V-based SoC unit shipments are forecast to surge to 16.2B units, with revenues reaching $92B by 2030, boasting CAGRs of 44% and 47%, respectively

SoC Market Growth

  • SoC architectures utilizing third-party IP exhibit substantial growth in units and revenues across industrial, automotive, networking, computer, consumer, and other categories, notably driven by the burgeoning AI market
  • Projections indicate SoC unit shipments reaching 69B units and revenues hitting $416B by 2030, showcasing CAGRs of 12.5% and 8.7%, respectively

SoC Design Starts

  • SoC design starts for SoCs using RISC-V CPU cores are forecast to reach 1,371 designs by 2030, a 15.7% CAGR
  • Design starts for consumer applications are expected to show the largest number of designs by 2030, with computer and networking applications following closely behind

Third-Party IP Market

  • In 2022, the worldwide IP market reached $7.9B, marking an 8.4% growth from 2021. Forecasts predict a 5.3% increase to $8.3B in 2023, projecting a potential $15B market by 2030, with a CAGR of 9%
  • The central processing unit (CPU) IP market soared by 22.4% in 2022 to $2.7B and is anticipated to hit $5.8B by 2030, demonstrating a robust 10.4% CAGR
  • RISC-V IP revenues surged to $156M in 2023, with an estimated CAGR of 39.5% through 2030
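
For readers who like to check the compounding, here is a quick sketch using the figures above; the CAGR formula is standard, and the inputs are the report’s summary numbers.

```python
# Quick compounding check of the growth figures quoted above.
def cagr(start, end, years):
    return (end / start) ** (1 / years) - 1

# RISC-V IP revenue: $156M in 2023 compounding at ~39.5% annually through 2030
implied_2030_revenue = 156e6 * (1 + 0.395) ** 7
print(f"Implied 2030 RISC-V IP revenue: ${implied_2030_revenue / 1e9:.1f}B")

# Worldwide IP market: $8.3B forecast for 2023 growing to ~$15B by 2030
print(f"Implied worldwide IP market CAGR: {cagr(8.3e9, 15e9, 7):.1%}")
```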

Comments from the Author

I had the opportunity to chat with the report’s author, Rich Wawrzyniak, recently. I was looking for more color about how the report was developed and some candid comments from Rich regarding what he discovered. First, Rich explained this was a large project. He surveyed and spoke with about 32 companies spanning IP vendors, software companies, EDA companies, device manufacturers, and end users. A huge amount of data was compiled as a result of this exercise.

I asked Rich if there were any surprises in the results. He mentioned one regarding classification in the IP segment. It turns out the third largest segment here is do-it-yourself. That is, those who weren’t using a specific RISC-V vendor but rather building their own design. While this type of behavior doesn’t scale well, it highlights the substantial amount of exploration that is going on in the market today around RISC-V.

And that fact illuminates some of the unique attributes of the emerging RISC-V market. We spent some time discussing the impact AI has had here. Rich explained that the ability to customize ISA extensions, which RISC-V offers SoC designers, allows fine-tuning of the ISA to more fully support the fast-paced AI algorithm developments now occurring in the market.

He went on to say that using this capability, SoC designers can more closely tailor their silicon solutions to match changing market requirements. This is especially important when they are considering development of domain-specific solutions for a wide range of applications. How such an accelerator is built will have a dramatic impact on the success of any new AI technology. It’s all about speed and power efficiency. And every new idea has its own unique set of success factors.

The extensible nature of the RISC-V architecture and the rich community of innovators fuel the impact this movement is having on product development. A perfect storm of innovation supply and demand if you will. Rich felt this market is just on the cusp of explosive growth. Where all this takes us is hard to predict, but watching the progress will be quite exciting.

RISC-V International encouraged The SHD Group to launch this analysis effort and encouraged its membership to speak with the organization. The full version is likely the most comprehensive analysis of the RISC-V market available today. Rich has a long history of developing world-class market research – this project represents some outstanding and important work.

To Learn More

You can download your complimentary copy of the 80-page report here.

The extensive full report spanning over 225 pages with 107 tables and 89 figures is also available for purchase. This report provides detailed insights into the current and future projections of the RISC-V market up to 2030. It covers most aspects of the RISC-V market, including end applications, device types, design starts, IP and global projections by region. The report is intended to provide a valuable analysis for business strategists, investors, and technology companies that require deep analysis and granular data. To purchase the full report, email info@theshdgroup.com.

And those are the details about a rare offer from the SHD Group – a complimentary look at the RISC-V market.


KLAC- OK Quarter & flat guide- Hopefully 2025 recovery- Big China % & Backlog
by Robert Maire on 01-30-2024 at 6:00 am

– KLAC reported an OK QTR & flat guide-waiting for 2025 recovery?
– China exposure remains both risk & savior & big in backlog
– Wafer inspect strong- Patterning on long slide- PCB biz for sale
– Some bright spots but memory still weak- Foundry/Logic OK

Bumping along the bottom of the cycle looking towards a hopeful 2025

KLAC reported $2.49B in revenues and $6.16 in non GAAP EPS. The street was at $2.46B in revenues and EPS of $5.91, so a little better as usual.

Guidance was more or less flattish to slightly down at $2.3B +/- $125M and EPS of $5.26 +/- $0.60.

One of the reasons for March being down is some business slipping out of the quarter into future quarters, making March lower and future quarters higher. So while technically March is a bottom, it’s NOT because it’s the bottom of the cycle but rather because customer orders are shifting around. Other than that, we heard the same story we heard from Lam: the back half of the year should be better than the front half (we have heard that before).

China is also “stable” at 41% of business but still a risk

China remains a double-edged sword, both a risk and a savior at the same time. Without the inflated China business KLA would be in a world of hurt (along with other equipment makers), but we continue to wonder how long it’s going to last.

It’s very interesting to note that China is clearly making up a huge portion of the company’s backlog, as new Chinese buyers are putting down deposits to secure a place in line, hopefully before any sanctions kick in. Perhaps the view is that by having an order in they will eventually get a tool.

We obviously see risk in this and remain concerned about the huge amount of business China represents on both a quarterly basis as well as in backlog.

Selling off the PCB business?

Management announced they were looking at strategic alternatives for the PCB business they acquired, as it represents less than 1% of the business and has not been great for KLA.

Our observation is that the acquisition of Orbotech almost 6 years ago has clearly not worked out as well as expected.

In our view, the $3.4B acquisition price would likely have been better spent inside KLA’s wheelhouse rather than venturing outside to try to diversify. While the aim to diversify was a good one, especially when comparing KLA to more diversified equipment companies like Applied Materials, the results of the effort have been less than stellar.

Unfortunately, given today’s political environment, there is not a lot in KLA’s wheelhouse that they could buy, so it is likely best to double down on existing markets and invest inside the company.

Wafer inspection strong while Patterning continues long slide

We would note that wafer inspection has become the lion’s share of KLA’s business at 47%, while patterning was about a third of that, coming in at 17% of overall revenues. Patterning was down 50% year over year and 21% quarter over quarter, while wafer inspection was down a very small 7% year over year and up 15% quarter over quarter. For the year in total, wafer inspection was only off 5% while patterning was off by four times that, at 20%.

While these businesses are lumpy from one quarter to the next and account for a lot of variation, the long-term pattern is quite clear that KLA’s dominance in patterning is in decline and growth is slowing significantly, especially as compared to wafer inspection. This is obviously a significant change from the long-term model where both markets were the two pillars of KLA’s business.

DRAM is somewhat alive with NAND still dead

DRAM represented 85% of the memory business while NAND, at just 15%, remained at near-zero spending levels, not much different than we have heard from other equipment makers.

HBM and DDR5 are bright spots that are driving DRAM, while there remains a ton of excess, unutilized capacity in the market that we expect will take at least a year or more to sop up. So we don’t expect a major broad recovery any time soon, but rather strength in specific areas of memory.

The Stocks

KLAC was down about 5% in the aftermarket. Obviously the bad guide out of Intel did not help a lukewarm report with no definitive recovery other than sometime in 2025.

We did not hear much different out of KLA than we heard from Lam and would expect similar stock performance from both as well as AMAT.

It is clear that a real recovery is a year away and memory will be slow to recover which will keep the pace of recovery in check.

China remains both a risk and a reward for all three companies at roughly 40% of business.

The stocks continue to trade at relatively high multiples for companies that are still in a downcycle. There is still a lot that can happen before we get to a real recovery and we don’t even know the slope or speed of the recovery other than a hope that it will be in 2025.

About Semiconductor Advisors LLC

Semiconductor Advisors is an RIA (a Registered Investment Advisor), specializing in technology companies with particular emphasis on semiconductor and semiconductor equipment companies. We have been covering the space longer and been involved with more transactions than any other financial professional in the space. We provide research, consulting and advisory services on strategic and financial matters to both industry participants as well as investors. We offer expert, intelligent, balanced research and advice. Our opinions are very direct and honest and offer an unbiased view as compared to other sources.

Also Read:

ASML – Strong order start on long road to 2025 recovery – 24 flat vs 23 – EUV shines

2024 Semiconductor Cycle Outlook – The Shape of Things to Come – Where we Stand

Is Intel cornering the market in ASML High NA tools? Not repeating EUV mistake