DAC 2021 – Embedded FPGA IP from Menta

by Daniel Payne on 01-17-2022 at 10:00 am

I’ve followed the enthusiastic market acceptance of FPGA chips over the decades, and semiconductor companies have taken notice: Intel acquired Altera, while AMD is trying to acquire Xilinx. The idea of field programmable logic makes a lot of sense for use in system designs today, and it was inevitable that a company like Menta would offer both soft and hard IP for embedding an FPGA into an ASIC design. At the #58thDAC I met with Yoan Dupret, the Managing Director & CTO at Menta, who has also done stints at Altis Semiconductor, Infineon Technologies, CSR, Samsung and DelfMEMS.

Menta at #58thDAC

Q&A

Q: The FPGA architecture has been around for a while, so are there any patent issues with your approach?

A: Patents aren’t much of an issue now; we have already applied for patents covering our use of pure standard cell IP.

Q: What is the embedded FPGA design flow like for your customers?

A: The design flow is a standard ASIC flow, and we provide setup and guidelines to make regular structures for the best density. With our soft IP approach to embedded FPGAs, we train and guide our customers to be most successful.

Q: How did Menta get started?

A: Our company started back in 2007 as a university spin-out. Two generations of IP have already been completed. We started out privately financed, and then in 2015 a new investor came in and we changed to a standard cell approach.

Q: Where can your embedded FPGAs be used?

A: With our soft IP product any foundry with standard cells can be used.

Q: What kind of customers are using Menta technology?

A: We have multiple Space & Defense customers in the US and Europe – indeed since 2015 – and also edge customers, including 5G.

Q: Do you sell IP generators or instances of FPGAs?

A: We don’t sell IP generators, although we do use our own internal generators, then deliver the FPGA instances. We also help our customers to use the IP in their SoC. Menta has chip architects in house to advise customers.

Q: When did you announce the eFPGA as soft IP?

A: We just announced our eFPGA as soft IP this week at DAC.

Q: What is the process to get an eFPGA as hard IP?

A: For hard IP we start out with the specification, and the time to reach a physical implementation is about 1-5 months, all dependent on the foundry and size of the FPGA.

Q: How many process nodes have you used for eFPGA IP?

A: We’ve implemented our IP on more than 10 technologies so far, ranging from 180nm to 12nm, with even smaller nodes now in progress. We have a custom IP delivery model. Our technology must be the most efficient to fit the requirements, by combining LUTs and DSPs, Multipliers, Adders, Filters, and Memory inside of an FPGA instance.

eFPGA

Q: How long does it take to get an eFPGA using soft IP?

A: With our new soft IP technology, we can go from IP spec in to verified RTL out in just a few days.

Q: What are some typical application areas?

A: Anywhere that the electronic specifications or standards are still changing, like 5G, AI, cryptography and telecom. With the RISC-V community, they always want ISA extensions. Even a micro-controller chip can be adapted with new functions by using an eFPGA.

Q: Where is Menta physically located?

A: We have three offices: France, New York and Armenia. Our company is in growth mode, so I expect our staff to double in the next 6 months.

Q: Who are you partnered with at DAC this year?

A: At this DAC we have several in booth partners (Codasip, Andes, Secure IC). At the IP Track session we presented with Secure IC. There’s a Poster session with Andes and Codasip. We also brought a demo board in our booth, which is running an algorithm and filtering images.

Q: What makes your eFPGA soft IP different?

A: The fact that it is soft IP already makes a large difference. We also completely own our software, which allows our customers to integrate and redistribute it within their SDK. Yield, reliability, test, the flexibility to deliver on any node, and delivery time are all differentiators.

Q: How experienced is the management team at Menta?

A: Our management team has an average of 22 years of experience.

Q: How long have you been attending DAC?

A: I’ve been attending DAC for about 10 years now, and I joined Menta in 2016.

Q: Where did the name Menta come from?

A: There is a book and movie called Dune, and part of the plot involves people called Mentats, who perform logic, computing and cognitive thinking.


CMOS Forever?

by Asen Asenov on 01-16-2022 at 6:00 am

Today, CMOS chip manufacturing is the pinnacle of human technology, defining the economy, society and perhaps us as modern humans. This was highlighted by the recent chip shortage, followed by the ‘shocking’ realization that more than 80% of all chips are manufactured in the Far East.

Important decisions need to be taken by Western governments regarding the future of CMOS technology. When contemplating such decisions, some of the ‘post’ or ‘beyond’ CMOS mythology from the recent past needs to be re-examined.

Things were looking good for CMOS technology development in the West at the beginning of the century, with Intel leading advanced CMOS technology by two generations and Europe making significant contributions led by ST and LETI and complemented by IMEC. It was easy in such circumstances to take the wrong strategic decisions. For example, the NSF in the US decided that the semiconductor industry did not need any more academic research support and that the time had come to move to the next ‘big thing’. This thinking suited the UK very well too, as by that time we had already lost advanced CMOS manufacturing anyway. The notion that the UK would compensate for this loss by inventing the next ‘big thing’ was politically appealing and financially manageable.

Hence, CMOS was de-prioritized by EPSRC, and calls for proposals for the next ‘big thing’ in the ‘post CMOS’ era started to appear. In no particular order, these included carbon nanotubes, graphene, 2D materials, and various incarnations of ‘quantum’, including quantum computing. Despite the fantastic intellectual challenges associated with the corresponding research, the realization that none of these has the potential to replace CMOS technology is slowly coming home.

This is nothing new for the big chip manufacturers, including Intel, Samsung and TSMC. Until 2013 the International Technology Roadmap for Semiconductors (ITRS) was ‘the bible’ of the semiconductor industry. Every two years, when a new ITRS edition appeared, everybody read the emerging technology section first. My take from reading this section was that nothing was ‘emerging’ on the horizon capable of replacing CMOS. Not surprisingly, the investments of the biggest semiconductor players in ‘post CMOS’ technologies are a minute fraction of their CMOS R&D budgets.

In my humble opinion, today there is still nothing on the horizon with the potential to replace CMOS. However, all the evidence points to the maturing and consolidation of the semiconductor industry. There is nothing new in this either; just look at the history of the aviation industry. After approximately 80 years of rapid technology development, it is now a mature industry with only Boeing and Airbus remaining as major players, one in the US and the other in Europe. Unfortunately, in the case of the semiconductor industry, most of the potential winners in the CMOS end game are in the Far East, with China emerging as a strong contender.

Asen Asenov (FIEEE, FRSE) is the James Watt Professor in Electrical Engineering and the Leader of the renowned Glasgow Device Modelling Group. He directs the development of quantum, Monte Carlo and classical models and tools and their application in the design of advanced and novel CMOS devices. He was also the founder and CEO of Gold Standard Simulations (GSS) Ltd., acquired in 2016 by Synopsys. He is currently also CEO of Semiwise, a semiconductor IP and services company, and a director of Surecore and Ngenics.


Podcast EP57: A Perspective of 2021 and 2022 with Malcolm Penn

by Daniel Nenni on 01-14-2022 at 10:00 am

Dan is joined by Malcolm Penn, long-term semiconductor industry veteran and founder of Future Horizons. Dan and Malcolm review their last discussion on 2021 forecasts, which produced aggressive numbers many said were too optimistic. Their predictions turned out to be on the mark.

They also explore the topic of 2022: what will this year look like, and what will be the drivers and the risks going forward? Malcolm also mentions his yearly forecast event, coming up on January 18. He has graciously offered a 40% discount on the event ticket for SemiWiki subscribers. You can register for the event here.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Webinar: Investing in Semiconductor Startups

by Mike Gianfagna on 01-14-2022 at 6:00 am

Investing in semiconductor startups is something Silicon Catalyst knows a lot about. During a time when venture funding for chip companies all but disappeared, this remarkable organization built a robust incubator, ecosystem, support infrastructure and funding source. Silicon Catalyst has assembled a top-notch management team and an extensive, world-class advisor network. You can learn more about this remarkable organization here. Silicon Catalyst also has a great track record for putting on compelling events with A-list participants. You can read SemiWiki coverage of their most recent event here. So, when Silicon Catalyst announces a webinar on investing in semiconductor startups, you must take notice.

Chips are Popular Again

It appears the rest of the world is now seeing what Silicon Catalyst saw all along. As stated by Silicon Catalyst:

Following a remarkable year of over 25% year on year growth, the global semiconductor industry is poised to experience strong growth in 2022. World-wide sales for this year are projected to reach in excess of $600 billion in what many are calling the golden era of semiconductors.

Chips are indeed hot again. Another interesting fact courtesy of Silicon Catalyst:

Nuvia is a great example, having taken their first round of money in April 2019 at a post-money valuation of $16 million and being acquired by Qualcomm in March of 2021 for over $1.2 billion.

Examples like this are truly remarkable. They also don’t happen every day. For every home run, there are many more failures. Understanding the trends and developing insight to spot the companies that are correctly leveraging those trends is the focus of the upcoming webinar. As usual, Silicon Catalyst has assembled an all-star cast to discuss these topics. We can all learn a lot from these folks, so I highly recommend you attend this event. More information is coming.

An All-Star Cast Weighs In

First, let’s look at the panel lineup. A stellar group from around the world.

Moderator

  • Cliff Hirsch, Semiconductor Times. Cliff has extreme depth and breadth in semiconductors and related technologies, communications, data/telecom network infrastructure, and open-source web technology. He has analyzed more than 4,000 private and public companies in the semiconductor & comm/IT space. Check out the latest news on semiconductor startups here.

Panelists

  • Rajeev Madhavan, North America, Clear Ventures. Rajeev is a founder and General Partner of Clear, where he focuses on early-stage technology investments. His notable career exits include Apigee (IPO), YuMe (IPO), Virident (acquired), Magma (IPO), Groupon (IPO), VxTel (acquired), LogicVision (IPO) and Ambit (acquired). Rajeev has the uncanny ability to deeply understand what entrepreneurs are trying to do, and to steer them onto a successful path. I know Rajeev. He truly has the golden touch.
  • Emily Meads, EU, Speedinvest. Emily passionately supports Deep Tech companies and the Deep Tech ecosystem, and always strives to give scientific credibility to the VC side of the table. Before joining Speedinvest, Emily worked for Fraunhofer IZM, as well as a software engineering startup where she first caught the startup bug. She then worked at Spin Up Science where she specifically supported innovators on their Deep Tech commercialization journeys.
  • Dov Moran, Israel, Grove Venture Capital. Dov Moran is one of Israel’s most prominent hi-tech leaders, entrepreneurs and investors. He is known as a pioneer of several flash memory technologies, most notably as the inventor of the USB flash drive. Dov was a founder and CEO of M-Systems (NSDQ: FLSH), a world leader in the flash data storage market. Under Dov’s leadership, M-Systems grew to $1B revenue, and was acquired by SanDisk Corp (NSDQ: SNDK) for $1.6B.
  • Owen Metters, UK, Foresight Williams Technology Funds. Dr. Metters is an Investment Manager at Williams Advanced Engineering (WAE). He has worked at Oxford University Innovation, the technology transfer organization for the University of Oxford, supporting academics in the commercialization of University IP, leading to the formation of several successful spin-out companies which have since raised over £20m of VC funding. He holds a PhD in Inorganic & Materials Chemistry from Bristol University.

And a special presentation: Semi Industry Trends and Market Opportunities for 2022, presented by:

  • Junko Yoshida, Editor in Chief, The Ojo-Yoshida Report. Junko has always been a “roving reporter” in the most literal sense. After logging 11 years of international experience at a Japanese consumer electronics company, Junko pursued journalism, breaking stories, securing exclusives, and filing incisive analyses from Tokyo, Silicon Valley, Paris, New York, and China. During her three decades at EE Times, Junko rose through the ranks from Tokyo correspondent to West Coast bureau chief, European bureau chief, news editor, and editor-in-chief.

I know Junko and I find this part quite exciting. She is someone who will always find the hidden truth in every story. Her insights are legendary. I can’t wait to hear her perspectives in her new role. She will be joined by Bolaji Ojo, Publisher and Managing Editor @The Ojo-Yoshida Report.

Junko has offered some comments about the upcoming event. Consider this a sample of what’s to come:

“Semiconductors are the lifeblood of today’s economy. It is pouring into every economic sector, at different speeds and vigor. This means there are huge investment opportunities yet to be tapped in semiconductors using new products and old ones that are finding new applications. Finding where to strategically put investment dollars in semiconductors should be a passion for every investor because this process will endure for a while. The Ojo-Yoshida Report identifies certain technology segments and market applications investors should be paying attention to.”

How to Attend the Webinar

The webinar will be held on Zoom and is open to the public. Attendees will be able to submit questions to the panel and they will be addressed as time permits.

January 19, 2022, 09:00 AM in Pacific Time (US and Canada)

You can register for the webinar on investing in semiconductor startups here.

Also Read:

Silicon Catalyst Hosts an All-Star Panel December 8th to Discuss What Happens Next?

Silicon Startups, Arm Yourself and Catalyze Your Success…. Spotlight: Semiconductor Conferences

WEBINAR: Maximizing Exit Valuations for Technology Companies


CES is Back – Partially

by Bill Jewell on 01-13-2022 at 2:00 pm

CES (formerly the Consumer Electronics Show) returned to Las Vegas, Nevada last week. In 2021, CES was remote due to the COVID-19 pandemic. On April 28, 2021, the Consumer Technology Association (CTA), the sponsor of CES, announced CES 2022 would be held in Las Vegas. On the date of the announcement new COVID cases in the U.S. were less than 60,000 per day. On the day CES 2022 opened, January 5, 2022, new COVID cases in the U.S. were over 700,000 per day as the new omicron variant spread rapidly. Nevertheless, the show went on with COVID protocols including proof of vaccination, wearing masks indoors, social distancing, and optional on-site testing.

CTA stated CES 2022 live attendance was over 45,000 people, about a quarter of the over 175,000 attendees at the last live event, CES 2020. Over 2300 companies exhibited at CES 2022, about half the 4500 companies at CES 2020. We at Semiconductor Intelligence elected to attend CES 2022 virtually.

In conjunction with CES 2022, CTA released its forecast for U.S. consumer electronics in 2022. Total U.S. consumer electronics are projected at $293 billion, up 1.8% from 2021. Smartphones and computing are the two largest segments at about $75 billion. Video, Smart Home and Automotive are each in the $23B to $25B range.

Most categories of consumer electronics are expected to grow in the low- to mid-single-digit range in 2022. However, three emerging categories with high growth rates are virtual reality eyewear, connected exercise equipment and electric bikes.

At CES 2022, keynote presentations were given by Samsung Electronics, General Motors, and Abbott. Interestingly, only one of the three keynotes was from an electronics company.

Samsung Electronics’ keynote was led by Jong-Hee (JH) Han, Vice Chairman & CEO. The emphasis was not on products but on demonstrating commitment to the environment through a more eco-conscious product life cycle. Samsung plans to have zero standby power usage in its TVs and smartphones by 2025. Older smartphones will be repurposed for IoT applications. Samsung TVs will have solar powered remote controls to reduce battery waste.

Samsung did introduce some new products in its keynote. The Freestyle portable projector can be controlled with voice commands or wirelessly with a smartphone. It can project images of up to 100 inches and includes a smart speaker. The Samsung Gaming Hub will provide access to video games directly from a Samsung smart TV. The Odyssey Ark is a curved 55-inch gaming monitor which can be oriented either horizontally or vertically. Samsung also created the Home Connectivity Alliance (HCA) with other appliance makers to increase interoperability between products, ensure safety & data security, and increase energy efficiency.

Samsung Freestyle Projector

Samsung Odyssey Ark Monitor

General Motors’ keynote address was led by chair and CEO Mary Barra. She stated GM is transforming from an automaker to a platform innovator through electrification, software-enabled services, and autonomous driving. GM will have 30 electric vehicle (EV) models by 2025 and all new vehicle models will be electric by 2035.

GMC Hummer EV Pickup

Abbott’s keynote was led by president and CEO Robert B. Ford. The keynote focused on electronic implants to improve health and health monitoring. Abbott’s Freestyle Libre glucose monitoring system uses a small sensor on the back of the arm and presents data on a smartphone app. Its Heartmate 3 heart pump is implanted as a blood pump for people with advanced heart failure. Abbott’s neuromodulation devices alter nerve activity through electrical impulses to treat movement disorders such as Parkinson’s disease. Abbott introduced Lingo, a line of bio-wearable devices to track glucose, ketones, lactate and alcohol in order to improve diets and athletic performance.

Abbott Lingo Biosensor

Pepcom’s Digital Experience at CES 2022 introduced many innovative products as shown below.

Labrador Systems demonstrated its Labrador Retriever personal robot which can help move large loads or deliver trays. It is controlled through voice commands or a smartphone app.

Labrador Retriever

In a sign of our times, the MaskFone includes built in earphones and a microphone to enable users to talk on their smartphones in public without removing their masks.

MaskFone

Altis introduced what it claims is the world’s first artificial intelligence (AI) personal trainer. The device consists of a soundbar-sized console which uses a computer vision neural network and an exercise science deep learning model to personally instruct the user.

Altis AI Personal Trainer

Vuzix Shield smart glasses are safety glasses which include video projectors, stereo sound, and noise-cancelling microphones. The Vuzix Shield glasses connect to smartphones and other devices using Wi-Fi and Bluetooth.

Vuzix Shield Smart Glasses

Hopefully CES 2023 can return to the scope of previous CES shows. Seeing in-person demonstrations of new consumer electronics is far preferable to watching videos. A live CES enables people to see, touch and even use many new devices and to talk directly to representatives of the companies. CES also brings worldwide media attention to the electronics industry.


It’s Now Time for Smart Clock Networks

by Tom Simon on 01-13-2022 at 10:00 am

By now most SoC designers are pretty familiar and comfortable with the use of Network on Chip (NOC) IP for interconnecting functional blocks. Looking at the underlying change that NOCs represent, we see the use of IP to supplant the use of tools for implementing a critical part of the design. The idea that ‘smart’ things are better than just structural implementation is a ubiquitous theme in our lives. Smart bulbs, smart appliances, smart thermostats and smart doorbells have all delivered better performance and functionality once the technology became available. The time has now come for on-chip clocking to take advantage of a smart approach through the use of IP and a new architecture to replace the fixed clock trees and meshes found in previous generations of designs.

Clock networks have always been a challenging area for IC design. While they are often regarded as an unseen part of any design, they consume a significant percentage of chip power and take up considerable real estate. On top of this, they are a critical factor in proper chip functionality and performance. Yet for years they have been a neglected area in design flows. Movellus, a provider of clock solutions, is taking a fundamentally new approach to clock design. Instead of building a fixed clock tree out of unintelligent buffers, wires and PLLs, they use a set of intelligent IP blocks to handle the major issues encountered in clock design: skew, gating, OCV, power integrity and more.

The capabilities of the Movellus Maestro Clock Network solution are nicely summarized in a paper authored by Linley Gwennap, Principal Analyst, and Aakash Jani, Senior Analyst, with the Linley Group. The paper, titled “Movellus Maestro: An Intelligent Clock Network”, explains the motivation for applying an IP-based solution to clock generation and covers the benefits that result.

Historically, clock networks have been either clock trees, meshes or a hybrid. Each has its own advantages and trade-offs. Clock trees tend to use less power but are subject to clock skew. Meshes reduce skew but come with an increase in power consumption. Maestro blends the two with the addition of intelligent IP that monitors skew caused by a variety of factors such as supply voltage fluctuations, OCV and temperature.

Movellus Maestro Clock Network

By virtually eliminating skew and PVT effects, higher operating frequencies can be obtained. Movellus cites some examples where usable clock periods have increased by up to 44%, allowing for much higher Fmax. Another phenomenon that Maestro manages is voltage droop when blocks are toggled on and off to conserve system power. Normally, as blocks are switched on when needed, there is a latency period while the power rails recover from the additional load. The Maestro Adaptive Workload Module (AWM) reduces this latency by managing clock speeds, allowing higher system performance.

Maestro reduces the effects of OCV and power jitter on the clock by constantly monitoring and adjusting the clock network. This is especially important at near threshold voltages found in IoT devices. With proper management of OCV and jitter, margins can be reduced to improve performance and power. Maestro also employs a clever system that distributes the operation of the clock subsystems across different phases to spread out simultaneous switching IR impact from clock operation. This reduces overall power consumption and allows for improved performance.

The Linley paper covers additional details and other features of the Movellus Maestro Clock Network. It’s about time that clocks became an area for innovation. Traditionally the major players in EDA have not devoted resources to radically rethinking this crucial component of all SoCs. In a way this is surprising, given the hugely important role that clock distribution plays. The paper is available to read as a download from the Movellus website.

Also Read:

Advantages of Large-Scale Synchronous Clocking Domains in AI Chip Designs

CEO Interview: Mo Faisal of Movellus

Performance, Power and Area (PPA) Benefits Through Intelligent Clock Networks


AI at the Edge No Longer Means Dumbed-Down AI

by Bernard Murphy on 01-13-2022 at 6:00 am

One aspect of received wisdom on AI has been that all the innovation starts in the big machine learning/training engines in the cloud. Some of that innovation might eventually migrate in a reduced/limited form to the edge. In part this reflected the newness of the field. Perhaps it also reflected the need for prepackaged one-size-fits-many solutions for IoT widgets, where designers wanted the smarts in their products but weren’t quite ready to become ML design experts. But now those designers are catching up. They read the same press releases and research we all do, as do their competitors. They want to take advantage of the same advances, while sticking to power and cost constraints.

Facial Recognition

AI differentiation at the edge

It’s all about differentiation within an acceptable cost/power envelope. That’s tough to get from pre-packaged solutions. Competitors have access to the same solutions after all. What you really want is a set of algorithm options modeled in the processor as dedicated accelerators ready to be utilized, with ability to layer on your own software-based value-add. You might think there can’t be much you can do here, outside of some admin and tuning. Times have changed. CEVA recently introduced their NeuPro-M embedded AI processor which allows optimization using some of the latest ML advances, deep into algorithm design.

OK, so more control of the algorithm, but to what end? You want to optimize performance per watt, but the standard metric – TOPS/W – is too coarse. Imaging applications should be measured against frames per second (fps) per watt. For security applications, for automotive safety, or drone collision avoidance, recognition times per frame are much more relevant than raw operations per second. So a platform like NeuPro-M which can deliver up to thousands of fps/W in principle will handle realistic fps rates of 30-60 frames per second at very low power. That’s a real advance on traditional pre-packaged AI solutions.
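To make the fps/W argument concrete, here is a minimal sketch of the arithmetic: power draw is simply the target frame rate divided by the efficiency. The efficiency value of 1,000 fps/W is an assumed round number standing in for "thousands of fps/W"; it is not a published NeuPro-M figure, and the function name is hypothetical.

```python
def power_watts(target_fps: float, efficiency_fps_per_watt: float) -> float:
    """Power needed to sustain target_fps at a given fps/W efficiency."""
    if efficiency_fps_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    return target_fps / efficiency_fps_per_watt


# Assumed illustrative efficiency, not a vendor specification.
ASSUMED_EFFICIENCY_FPS_PER_WATT = 1000.0

for fps in (30, 60):
    milliwatts = power_watts(fps, ASSUMED_EFFICIENCY_FPS_PER_WATT) * 1000
    print(f"{fps} fps needs about {milliwatts:.0f} mW")
```

Under that assumption, a 30-60 fps workload lands in the tens of milliwatts, which is the sense in which a high fps/W figure translates into very low power at realistic frame rates.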

Making it possible

Ultimate algorithms are built by dialing in the features you’ve read about, starting with a wide range of quantization options. The same applies to data type diversity in activation and weights across a range of bit-sizes. The neural multiplier unit (NMU) optimally supports multiple bit-width options for activation and weights such as 8×2 or 16×4 and will also support variants like 8×10.

The processor supports Winograd Transforms for efficient convolutions, providing up to 2X performance gain and reduced power with limited precision degradation. Add the sparsity engine to the model for up to 4X acceleration depending on the quantity of zero-values (in either data or weights). Here, the Neural Multiplier Unit also supports a range of data types, fixed from 2×2 to 16×16, and floating point (and Bfloat) from 16×16 to 32×32.

Streaming logic provides options for fixed point scaling, activation and pooling. The vector processor allows you to add your own custom layers to the model. “So what, everyone supports that”, you might think, but see below on throughput. There is also a set of next generation AI features including vision transformers, 3D convolution, RNN support, and matrix decomposition.

Lots of algorithm options, all supported by network optimization for your embedded solution through the CDNN framework to fully exploit the power of your ML algorithms. CDNN is a combination of a network inferencing graph compiler and a dedicated PyTorch add-on tool. This tool will prune the model, optionally supports model compression through matrix decomposition, and adds quantization-aware re-training.
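For readers unfamiliar with why a Winograd transform speeds up convolution, the sketch below shows the textbook one-dimensional case F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. This is a generic illustration of the technique, not CEVA's hardware implementation, and the function names are made up for the example.

```python
def direct_fir(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs using Winograd F(2,3): 4 multiplications."""
    # Filter transform; in practice this is precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform (additions only) followed by 4 element-wise products.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


data = [1.0, 2.0, 3.0, 4.0]
taps = [0.5, -1.0, 0.25]
print(direct_fir(data, taps))
print(winograd_f23(data, taps))
```

The two-dimensional analogue, F(2×2, 3×3), needs 16 multiplications where the direct method needs 36. The divisions by 2 in the filter transform hint at why the technique can cost some precision in low-bit-width arithmetic.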

Throughput optimization

In most AI systems, some of these functions might be handled in specialized engines, requiring data to be offloaded and the transform to be loaded back when completed. That’s a lot of added latency (and maybe power compromises), completely undermining performance in your otherwise strong model. NeuPro-M eliminates that issue by connecting all these accelerators directly to a shared L1 cache, sustaining much higher bandwidth than you’ll find in conventional accelerators.

As a striking example, the vector processing unit, typically used to define custom layers, sits at the same level as the other accelerators. Your algorithms implemented in the VPU benefit from the same acceleration as the rest of the model. Again, no offload and reload needed to accelerate custom layers. In addition, you can have up to 8 of these NPM engines (all the accelerators, plus the NPM L1 cache). NeuPro-M also offers a significant level of software-controlled bandwidth optimization between the L2 cache and the L1 caches, optimizing frame handling and minimizing need for DDR accesses.

Naturally NeuPro-M will also minimize data and weight traffic. For data, accelerators share the same L1 cache. A host processor can communicate data directly with the NeuPro-M L2, again reducing need for DDR transfers. NeuPro-M compresses and decompresses weights on-chip in transfer with DDR memory. It can do the same with activations.

The proof in fps/W acceleration

CEVA ran standard benchmarks using a combination of algorithms modeled in the accelerators, from native through Winograd, to Winograd+Sparsity, to Winograd+Sparsity+4×4. Both benchmarks showed performance improvements of up to 3X, with power efficiency (fps/W) improving by around 5X for an ISP NN. The NeuPro-M solution delivered smaller area, 4X the performance and 1/3 of the power compared with their earlier generation NeuPro-S.

There is a trend I am seeing more generally to get the ultimate in performance by combining multiple algorithms, which is what CEVA has now made possible with this platform. You can read more HERE.

Also Read:

RedCap Will Accelerate 5G for IoT

Ultra-Wide Band Finds New Relevance

Low Power Positioning for Logistics – Ultimate Tracking


From Now to 2025 – Changes in Store for Hardware-Assisted Verification

by Daniel Nenni on 01-12-2022 at 6:00 am

Lauro Rizzatti recently interviewed Jean-Marie Brunet, vice president of product management and product engineering in the Scalable Verification Solution division at Siemens EDA, about why hardware-assisted verification is a must-have for today’s semiconductor designs. A condensed version of their discussion is below.

LR: There were a number of hardware-assisted verification announcements in 2021. What is your take on these announcements?

JMB: Yes, 2021 was a year of major announcements in the hardware-assisted verification space.

Cadence announced a combination of emulation and prototyping focused on reducing the cost of verification by having prototyping take over tasks from the emulator when faster speed is needed.

Synopsys announced ZeBu-EP1, positioned as a fast-prototyping solution. It isn’t clear what the acronym means, but I believe it stands for enterprise prototyping. After several years of maintaining that ZeBu is the fastest emulator on the market, Synopsys launched a new hardware platform as a fast (or faster) emulator. Is it because ZeBu 4 is not fast enough? More to the point, what is the difference between ZeBu and HAPS?

In March 2021, Siemens EDA announced three new Veloce hardware platform products: Veloce Strato+, Veloce Primo and Veloce proFPGA. Each of these products addresses different verification tasks at different stages in the verification cycle. The launch answered a need for hardware-assisted verification to be a staged, progressive path toward working silicon. Customers want to verify their designs at each stage within the context of real workloads where time to results is as fast as possible without compromising the quality of testing.

In stage 1, blocks, IP and subsystems are assembled into a final SoC. At this stage, very fast compile and effective debug are needed, with less emphasis on runtime.

At stage 2, the assembled SoC is becoming a full RTL description. Now, design verification requires a hardware platform that can run faster than the traditional emulator, one that needs less compilation and less debug but more runtime.

In stage 3, verification moves progressively toward system validation. Here it’s about full performance where cabling interconnect to the hardware allows it to run as fast as possible.

LR: Let’s look at the question of tool capacity. Some SoC designs exceed 10 billion gates, making capacity a critical parameter for hardware platforms. A perplexing question has to do with capacity scalability. For example, does a complex, 10-billion gate design (one design) have the same requirements as 10, one-billion gate designs (multiple designs) in terms of usable emulation capacity?

JMB: This question always triggers intense discussions with our customers in the emulation and prototyping community. Let me try to explain why it’s so important. Depending on the customer, their total capacity needs may be 10-, 20- or 30-billion gates. In our conversation with customers, we then inquire about the largest design they plan to emulate. The answer depends on the platform they’re using. Today, the largest monolithic designs are in the range of 10- to 15-billion gates. For the sake of this conversation, let’s use 10-billion gates as a typical measure.

The question is, do they manage a single monolithic design of 10-billion gates in the same way they manage 10, one-billion gate designs? The two scenarios have equivalent capacity requirements, but not the same emulation complexity.

Emulating a 10-billion gate design is a complex task. The emulator must be architected to accommodate large designs from the ground up through the chip and subsystem to the system level including requirements at the software level.

A compiler that can map large designs across multiple chips and multiple chassis is necessary. A critical issue is the architecture that drives the emulation interconnect. If not properly designed and optimized, overall performance and capacity scaling drop considerably.

With off-the-shelf FPGAs as the functioning chip on the boards, the DUT is spread across each interconnected FPGA, lowering the capacity of each FPGA. By interconnecting multiple chassis, the overall performance drops below that of one or a few FPGAs.

Synopsys positions its FPGA-based tools as the fastest emulator for designs in the ballpark of one-billion gates. The speed of the system clock is high because FPGAs are fast. When enough hardware is assembled to run 10-billion gates, an engineer ends up interconnecting large arrays of FPGAs that were never designed for this application. And typically, the interconnection network is an afterthought conceived to accommodate those arrays. This is different from a custom chip-based platform where the interconnection is designed as an integral part of the emulator.

Cadence claims support for very high capacity in the 19-billion gate range. The reality is that no customer is emulating that size of design. The key to supporting high-capacity requirements is the interconnect network. It doesn’t appear that the Palladium Z2 interconnect network is different from the network in Palladium Z1, which is known for capacity scaling issues. As a result, customers should ask if Palladium Z2 has the ability to map a 10-billion gate design reliably.

Today, Veloce Strato+ is the only hardware platform that can execute 10-billion gate designs in a monolithic structure reliably with repeatable results without suffering speed degradation.

The challenge concerns the scaling of the interconnect network. Some emulation architectures are better than others. Based on the roadmap taken by different vendors, future scaling will get even more challenging.

By 2025, the largest design sizes will be in the range of 25-billion gates or even more. If today’s engineering groups are struggling to emulate a design at 10-billion gates, how will they emulate 25 billion+ gates?

Siemens EDA is uniquely positioned to handle very large designs, reliably and at speed, and we continue to develop next-generation hardware platforms to stay ahead of the growing complexity and size of tomorrow’s designs.

LR: Besides the necessary capacity, what other attributes are required to efficiently verify complex, 10-billion gate designs?

JMB: Virtualization of the test environment is as important as capacity and performance.

In the course of the verification cycle, the DUT representation evolves from a virtual description (high level of abstraction) to a hybrid description that mixes RTL and virtual models, such as AFMs or QEMU. Eventually, it becomes a gate-level netlist. When an engineer is not testing a DUT in ICE (in-circuit emulation) mode, the test environment is described at a high level of abstraction, typically consisting of software workloads.

It’s been understood for a while that RTL simulation cannot keep up with execution of high-level abstraction models running on the server. The larger the high-level abstraction portion of the DUT, the faster the verification. The sooner software workloads are executed, the faster the verification cycle. This is the definition of a shift-left methodology. A virtual/hybrid/RTL representation is needed to run real software workloads on an SoC as accurately as possible and as fast as possible. An efficient verification environment allows a fast, seamless move from virtual to hybrid, from hybrid to RTL, and from RTL to gate.

The hybrid environment decouples an engineer from the performance bottleneck of full RTL, which supports much faster execution. In fact, hybrid can also support software development that is not possible in an RTL environment. A full RTL DUT runs in the emulator with very limited interaction with the server, whereas in hybrid mode parts of the DUT run on the server. Here the connection between server and platform, or what we call co-model communication, becomes critical. If not architected properly, the overall performance fails to be acceptable. The bottleneck is no longer the emulator but the communication channel.

We have invested significant engineering resources to address this bottleneck. Our environment excels in virtual/hybrid mode because of our unique co-model channel technology.

Capacity, performance and virtualization are the key attributes to handle designs of 10 billion+ gates. When designs hit 25 billion+ gates in 2025, the communication channel efficiency becomes even more critical since hybrid emulation becomes prevalent in a wide range of applications.

LR: Thank you, Jean-Marie, for your perspectives and for explaining some of the little-known aspects of successful hardware emulation use.

Also Read:

DAC 2021 – Taming Process Variability in Semiconductor IP

DAC 2021 – Siemens EDA talks about using the Cloud

DAC 2021 – Joe Sawicki explains Digitalization


DAC 2021 – What’s Up with Agnisys and Spec-driven IC Development

by Daniel Payne on 01-11-2022 at 10:00 am


Walking the exhibit floors at DAC in December I spotted the familiar face of Anupam Bakshi, Founder and CEO of Agnisys, so I stopped by the booth to get an update on his EDA company. My first question for him was about the origin of the company name, Agnisys, and I found out that Agni means Fire in Sanskrit, one of the five elements.

Agnisys at #58DAC

The company vision is the same today as it was at the founding: a tool flow going from specification to implementation, across design, verification, software and device drivers. Having a single source of truth on registers for all engineering groups to know and use is a core idea. IDesignSpec is their EDA tool, launched 11 years ago, and the scope of the tool has only grown over time.

IDesignSpec

There are now resellers of Agnisys tools on all continents, the number of licenses has been going up, and the new trend is for site licensing, instead of having just a handful of licenses on one project. When one IC design team starts using IDesignSpec, other adjacent teams start to hear about the benefits and want to give it a try on their projects too.

Another EDA tool at Agnisys is called ISequenceSpec, released about three years ago, and it helps engineers capture sequences as stimulus generation used in verification, firmware and even post-silicon validation. ISequenceSpec can generate output at the UVM or C level. Here’s where ISequenceSpec fits into a design flow:

ISequenceSpec

The newest EDA tool has taken a totally different approach to introduction, because it is being crowd-sourced, and it’s called iSpec.ai. What’s unique is that this tool automatically converts English assertions into proper SystemVerilog Assertions (SVA) by using Machine Learning (ML) techniques. So the company finds out what engineers think about when learning SVA, and then the users can give Yes (Green) or No (Red) feedback, leaving any comments about the quality of the conversion. This tool was released about 2-3 months ago, then existing customers became aware of it and started testing, and so far about 200 engineers have provided feedback.

iSpec.ai

They have even offered quizzes to see if engineers can answer questions about SVA with or without using iSpec.ai, which is kind of fun and technical at the same time. In a way this tool is similar to Google Translate, as it translates in both directions, either into SVA or into English. The company plans to productize this web-based tool after a learning phase.
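To make the idea of an English-to-SVA conversion concrete, here is a toy sketch of the input and output shape. It is not how iSpec.ai works: the tool is described as ML-based, while this example uses a single hand-written regular expression. The sentence pattern, the function name `english_to_sva` and the default clock name `clk` are all hypothetical.

```python
import re

# Hypothetical sentence pattern handled by this toy converter:
#   "If <a> is high, then <b> must be high within <m> to <n> cycles."
_PATTERN = re.compile(
    r"if (\w+) is high,? then (\w+) must be high within (\d+) to (\d+) cycles",
    re.IGNORECASE,
)


def english_to_sva(sentence, clock="clk"):
    """Return an SVA string for the one supported pattern, else None."""
    match = _PATTERN.fullmatch(sentence.strip().rstrip("."))
    if match is None:
        return None  # sentence does not fit the single supported pattern
    antecedent, consequent, low, high = match.groups()
    # Overlapping implication with a bounded cycle delay on the consequent.
    return (
        f"assert property (@(posedge {clock}) "
        f"{antecedent} |-> ##[{low}:{high}] {consequent});"
    )


print(english_to_sva("If req is high, then ack must be high within 1 to 3 cycles."))
```

A template like this breaks as soon as the wording changes, which is presumably why a learned model with user feedback is attractive for this task.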

DVCon US 2022 is coming up in February, and Agnisys has a paper on the iSpec.ai tool, so consider attending that online event to see what progress has been made so far.

Co-located with DAC this year was the RISC-V conference, and Agnisys presented “A System Level Verification and Validation Environment using SweRV”. It was a 10-minute Lightning Talk that you can watch on YouTube. SweRV is an open-source RISC-V core from Western Digital.

RISC-V Lightning Talk

Your team connects all of the semiconductor IPs together in a system-level environment either by hand or with some automation. Using SweRV as the processor, you can then connect together tests at the IP or system level. With the C to UVM interface, both levels can talk to each other: the processor knows C, while the other IPs understand UVM. So you can run your C program on SweRV, and the tool then generates the corresponding UVM transactions.

Another new tool in 2021 is called IDS-FPGA, now part of the IDesignSpec family, so that FPGA design teams can reduce their development times by using automated code generation and IP generators, with a flow integrated with FPGA vendor software. They support Xilinx UltraScale+ IP-based design development, and integrate with Xilinx Vivado and Intel Quartus Prime.

Summary

Agnisys has a 15-year history in providing their IDesignSpec tool, and it just keeps getting more robust each year. This company is one of the very few EDA vendors that actually demonstrates their tool live, running on a laptop, so it wasn’t just a PowerPoint presentation at DAC. I think that engineers are really attracted to seeing an EDA tool running live, because they are curious about how the GUI looks, how quickly it operates, and how intuitive the flow is.

Also read:

AI for EDA for AI

What the Heck is Collaborative Specification?

AUGER, the First User Group Meeting for Agnisys

 


Horizontal, Vertical, and Slanted Line Shadowing Across Slit in Low-NA and High-NA EUV Lithography Systems

by Fred Chen on 01-11-2022 at 6:00 am

EUV shadowing across slit

EUV lithography systems continue to be the source of much hope for continuing the pace of increasing device density on wafers per Moore’s Law. Although EUV systems were originally supposed to help the industry avoid much multipatterning, this has not turned out to be the case [1,2]. The main surprise has been the rise of stochastic defects and variability [1,2], which challenge both dose and overlay control. This has constrained sub-20 nm features to be printed with multipatterning assistance such as SALELE [3]. It has also accelerated the development of the next-generation High-NA EUV tools [4,5] in order to bring back the opportunity for avoiding multipatterning. On the other hand, High-NA tools have their own concerns as well [4-6].

EUV technology requires a substantially different infrastructure from previous optical lithography. A fundamental reason is that it is based on reflective optics rather than transmissive optics. Even the mask needs to be built on a reflective multilayer substrate. This, in turn, has led to some distinct quirks in the EUV imaging process. Due to the reflection being an inherently off-axis process, the illumination of the mask has some inherent asymmetry, as shown in Figure 1 [7].

Figure 1. EUV illumination of the mask is essentially a rotated off-axis angle across an arc-shaped slit. Illustration is based on Figure 1 in [7].

There is an arc-shaped slit, 26 mm across and ~1-2 mm thick (depending on design), through which a central illumination ray angle of 6 degrees is rotated azimuthally. As a result, features in the center of the exposure field are actually illuminated at different angles from features at the edge of the exposure field. Each different angle produces a different effective “shadow” which accounts for the light’s propagation through and reflection by the multilayer substrate, as well as the double pass through the mask pattern [8]. Such shadowing could cause loss of image contrast (also known as fading) [9].

Figure 2. A particular illumination at the slit center is rotated at the slit edge. Illustration is based on Figure 9 in [10].

Consequently, the horizontal vs vertical line shadowing behavior varies across slit. The appropriate metric for the degree of shadowing is the larger of the two pole incident angles at the mask, measured in the direction perpendicular to the lines, for an ideal dipole illumination setup (targeting sin θ = 0.5 × wavelength/pitch at the wafer) defined at the slit center. Some results are shown in Figure 3 for horizontal and vertical lines. Low-NA (NA=0.33) and High-NA (NA=0.55) systems are plotted side by side.

Figure 3. Horizontal and vertical line shadowing vs slit position, for different pitches on both 0.33 and 0.55 NA systems.

There are several things to point out.

  1. In all cases, the smaller pitch has worse shadowing, i.e., a larger incident angle for one of the illumination poles compared to the other.
  2. The vertical line shadowing varies linearly across slit, because when the azimuthal angle flips sign going from one side of the slit to the other, light is still shining on one side of the line but casts a growing or diminishing shadow.
  3. The horizontal line shadowing is worse than the vertical line shadowing.
  4. High-NA tools do not necessarily provide relief from shadowing, particularly for vertical lines, at pitches targeted for High-NA.
  5. The doubling of demagnification in the High-NA tools from 4x to 8x causes equal shadowing at half the pitch for the latter.
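The qualitative points above can be reproduced with a simplified geometric sketch. It is one possible reading of the metric, not the author's actual calculation, and it will not reproduce Figure 3 exactly. The assumptions are: a 13.5 nm wavelength, a 6-degree chief ray whose direction rotates with the azimuth across the slit, a dipole pole offset that stays perpendicular to the lines (the pole rotation is ignored), and direction sines that add linearly. The function name and the ±20 degree azimuth values are hypothetical.

```python
import math

WAVELENGTH_NM = 13.5   # EUV wavelength (assumed)
CHIEF_RAY_DEG = 6.0    # central illumination ray angle at the mask


def larger_pole_angle_deg(pitch_nm, demag, orientation, azimuth_deg=0.0):
    """Larger of the two dipole pole angles at the mask, in degrees,
    measured perpendicular to the lines (simplified model)."""
    chief = math.sin(math.radians(CHIEF_RAY_DEG))
    azimuth = math.radians(azimuth_deg)
    # Pole offset: sine = 0.5 * wavelength / pitch at the wafer,
    # reduced by the demagnification on the mask side.
    pole = 0.5 * WAVELENGTH_NM / (pitch_nm * demag)
    if orientation == "horizontal":
        # Chief-ray tilt component perpendicular to horizontal lines.
        tilt = chief * math.cos(azimuth)
    elif orientation == "vertical":
        # Chief-ray tilt component perpendicular to vertical lines.
        tilt = chief * math.sin(azimuth)
    else:
        raise ValueError("orientation must be 'horizontal' or 'vertical'")
    return math.degrees(math.asin(tilt + pole))


# Hypothetical azimuth values at the slit edges and center, 32 nm pitch, 4x.
for azimuth_deg in (-20, 0, 20):
    h = larger_pole_angle_deg(32, 4, "horizontal", azimuth_deg)
    v = larger_pole_angle_deg(32, 4, "vertical", azimuth_deg)
    print(f"azimuth {azimuth_deg:+d} deg: horizontal {h:.2f}, vertical {v:.2f}")
```

In this model the pole offset depends only on the product of pitch and demagnification, so an 8x system at half the pitch gives the same angle as a 4x system at the full pitch, and the vertical-line value changes monotonically with azimuth while the horizontal-line value stays nearly flat.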

DRAM active areas (Figure 4) present an interesting special case, for they are neither horizontal nor vertical but slanted in between.

Figure 4. Shadowing for DRAM active area lines (angled at 14.5 degrees with respect to the horizontal).  

As may be expected, the shadowing for slanted lines has combined characteristics of horizontal and vertical lines. The High-NA tool does not necessarily provide less shadowing than the Low-NA tool, but the range of shadowing across slit is less. Low-NA tools already show significant shadowing for 16-nm half-pitch, while High-NA tools do so for 10-nm half-pitch.

References

[1] https://m.blog.naver.com/PostView.naver?blogId=jkhan012&logNo=222410469787&categoryNo=30&proxyReferer=https:%2F%2Fwww.linkedin.com%2F

[2] D. De Simone and G. Vandenberghe, “Printability study of EUV double patterning for CMOS metal layers,” Proc. SPIE 10957, 109570Q (2019).

[3] K. Sah et al., “Defect characterization of EUV Self-Aligned Litho-Etch Litho-Etch (SALELE) patterning scheme for advanced nodes,” Proc. SPIE 11611, 116112H (2021).

[4] E. van Setten et al., “High NA EUV lithography: Next step in EUV imaging,” Proc. SPIE 10957, 1095709 (2019).

[5] https://www.imec-int.com/en/articles/high-na-euvl-next-major-step-lithography

[6] A. H. Gabor et al., “Effect of high NA “half-field” printing on overlay error,” Proc. SPIE 11609, 1160907 (2021).

[7] P. C. W. Ng et al., “A Fully Model-Based Methodology for Simultaneously Correcting EUV Mask Shadowing and Optical Proximity Effects with Improved Pattern Transfer Fidelity and Process Windows,” Proc. SPIE 7520, 75200S (2009).

[8] E. van Setten et al., “Multilayer optimization for high-NA EUV mask3D suppression,” Proc. SPIE 11517, 115170Y (2020).

[9] C. van Lare, F. Timmermans, and J. Finders, “Mask-absorber optimization: the next phase,” J. Micro/Nanolith. MEMS MOEMS 19, 024401 (2020).

[10] H. Tanabe, “Classification of EUV masks based on the ratio of the complex refractive index k/(1-n),” Proc. SPIE 11854, 11581416 (2021).

This article originally appeared in LinkedIn Pulse: Horizontal, Vertical, and Slanted Line Shadowing Across Slit in Low-NA and High-NA EUV Lithography Systems