
A Tour of This Year’s DAC IP Track with Randy Fish

by Mike Gianfagna on 07-10-2020 at 10:00 am

Randy Fish

DAC is a complex event with many “moving parts”. While the conference has gone virtual this year (as all events have), the depth of the event remains the same. The technical program has always been of top quality, with peer-reviewed papers presented across many topics and from around the world. This is also the oldest part of DAC, dating back 57 years. DAC has grown to include many other events that make up the entire experience: a major trade show with topical events presented in pavilions on the show floor, workshops, tutorials, a designer track and an IP track, to name a few.

IP is a relatively new addition to DAC and the EDA segment in general. This is especially true if you consider DAC is 57 years old. I had the opportunity to chat with Randy Fish, the chair of the IP track for DAC this year. I learned some interesting things about how this track is put together and how it interacts with the rest of the conference.

First, a bit about Randy. He began his career as a design engineer at Intel. From there, he worked in applications, sales and marketing across an array of EDA and IP companies, both large and small. He is currently vice president of market development at UltraSoC, a company that has recently been acquired by Siemens.

So, how does one get involved with the DAC Executive Committee? In Randy’s words, he’s been going to DAC since the mid-1980s. Like many of us, he’s had lots of great experiences, both technical and social, over the years at DAC. If you’re in the EDA or IP business, this show punctuates your yearly existence in many ways. A couple of years ago, Randy was chatting with Mike McNamara, a past DAC general chair, and Michelle Clancy, DAC’s publicity and marketing chair. They were giving Randy the recruiting speech – join the force of DAC. Randy decided it was time to “give back”, so he joined the Executive Committee and he is heading the IP Track this year.

At the start of our discussion, Randy pointed out that there really isn’t a large, mainstream event for semiconductor IP. DAC is the best venue for such a focus and Randy believes this is as it should be. He went on to explain that the regular technical program at DAC is aimed at the “researcher”, but the IP program is aimed at the “practitioner” – those using IP to design chips. The choices of what IP to use as a practitioner are quite large – there are a lot of vendors to explore and a lot of new technologies. A virtual show environment helps this agenda quite a bit since “sampling” many presentations and vendor booths is much easier in this format.

Next, Randy explained the scope and focus of the IP track. There are six folks on the IP committee. One aspect of their job is to develop invited sessions – topics of interest and possible presenters.  This is the “proactive” part of the content development if you will. There is also the review and selection of submitted papers on IP and organizing them into topical groups. This is the “reactive” part. Working both as a proactive and reactive organization, Randy and his team have put together an excellent program this year. Here are the top-level sessions:

Randy and his team were also working on a functional safety track and decided the topic was better served as a tutorial, so the team “donated” the topic to a different track at DAC for the good of the agenda. This one also looks quite interesting. Check it out:

IP also impacts the technical agenda at DAC. Thanks to the RISC-V movement, there are internal designs and designs from companies like SiFive, Codasip and Andes, all of which are driving the need for processor verification and creating a renewed interest in this topic for the DAC technical agenda.

I think the IP track at DAC this year looks quite strong and I congratulated Randy and his team on the excellent work. Randy closed with a call to action that may resonate with some of you, at least I hope so. He said that his committee, and others as well at DAC are always looking for interested parties to get involved. So, if you’d like to help shape future DACs, just contact Randy, or anyone on the DAC Executive Committee.

The 57th DAC will be hosted virtually Monday, July 20 – Friday, July 24, with on-demand access to sessions through August 1, 2020.  Registration for DAC is now open.  There are three ways to attend DAC virtually – complimentary I LOVE DAC pass, Designer/IP/Embedded Track Special $49.00 or Full Conference pass starting at $199.00.

For more information on the Virtual DAC program and registration please visit: www.dac.com

 


Sensors, AI, Tiny Power in a Turnkey Board.

by Bernard Murphy on 07-10-2020 at 6:00 am

Eta Compute ECM3532 AI Sensor Board Top

Got a great idea for a device with AI at the extreme edge? Self-contained and can run on a coin cell battery, maybe even harvested energy? Needs to fit in a space not much larger than a quarter? Eta Compute has a board for you. This comes with 2 MEMS microphones, a pressure/temperature sensor, a 6-axis MEMS accelerometer/gyroscope, their ultra-low power neural sensor processor, extensibility through a UART port and a micro-SD slot, BLE with antenna for communication and battery cradle, all in a 1.4”x1.4” board. You can learn more at a free workshop they are hosting on July 14th (I was told that all the free promotional boards have already been taken!)

Users can develop their AI solution through partner Edge Impulse’s TinyML development pipeline, uploading the completed solution through the UART port. One enthusiast was able to develop, upload and test an alarm detection system in under one hour.

Sensors, AI on a Tiny Board with Tiny Power

I talked to Semir Haddad (Sr Dir Product Marketing) at Eta Compute to understand why they developed this board. He told me that a lot of their customers want to prove out a solution, sensors, AI and communication, but they were having to hack together their own solutions through multiple boards or adapt evaluation boards, all of which takes time and creates debug and scaling problems. Those users wanted to get quickly to a proof of concept, even a solution they could deploy quickly in the field, say in an agricultural application. They wanted to prove the solution out and pilot at a modest scale, before deciding if they want to go to volume production in a custom design.

Semir discussed some additional use cases, including vibration detection for machine monitoring, or detecting doors or windows opening or closing. He mentioned pressure detection, saying that it is common to fuse this kind of sensing with motion for more accurate motion/position detection.

Together with the Edge Impulse solution, the microphones can also be used to recognize learned sounds (a chicken squawking for example – Warning! Fox in the chicken pen!) or wake words and command phrases (unlock or lock the gate). Similarly, the 6-axis motion sensor can be used for gesture detection. Between these two you have a pretty wide range of options to control your edge device.

Tiny Power through Self-Timed Logic, CVFS

The system is built around Eta Compute’s ECM 3532 neural sensor processor on which I’ve written before. This has all the capabilities of a hybrid multi-core Cortex-M plus DSP solution, but built on self-timed logic with continuous voltage and frequency scaling (CVFS). That’s continuous, unlike conventional DVFS which can only switch between a small number of voltage and frequency options. These features allow for this processor to get under 1mW for inference operations and to get always-on operation (in support of the sensors) under 1uA.
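To make the DVFS/CVFS distinction concrete, here is a minimal Python sketch of how a discrete table and a continuous scheme might each pick an operating point for the same workload. The operating points, the piecewise-linear voltage model and the function names are all hypothetical, chosen only for illustration. They are not Eta Compute figures and this is not how the ECM3532 is programmed.

```python
import bisect

# Hypothetical operating points: (frequency in MHz, voltage in V).
# Illustrative values only, not vendor data.
OPERATING_POINTS = [(20, 0.55), (40, 0.70), (80, 0.90)]
FREQS = [f for f, _ in OPERATING_POINTS]


def dvfs_select(required_mhz):
    """Discrete scaling: pick the slowest table entry that meets the need."""
    i = bisect.bisect_left(FREQS, required_mhz)
    if i == len(OPERATING_POINTS):
        raise ValueError("requirement exceeds the fastest operating point")
    return OPERATING_POINTS[i]


def cvfs_select(required_mhz):
    """Continuous scaling: run at exactly the required frequency.

    Assumes, for illustration, that the minimum voltage varies linearly
    between neighbouring table entries.
    """
    if not FREQS[0] <= required_mhz <= FREQS[-1]:
        raise ValueError("requirement outside the supported range")
    i = bisect.bisect_left(FREQS, required_mhz)
    if FREQS[i] == required_mhz:
        return OPERATING_POINTS[i]
    (f_lo, v_lo), (f_hi, v_hi) = OPERATING_POINTS[i - 1], OPERATING_POINTS[i]
    volts = v_lo + (v_hi - v_lo) * (required_mhz - f_lo) / (f_hi - f_lo)
    return (required_mhz, volts)


def relative_dynamic_power(freq_mhz, volts):
    """Dynamic power scales roughly with f * V^2 (arbitrary units)."""
    return freq_mhz * volts ** 2


if __name__ == "__main__":
    need = 50  # MHz, hypothetical workload requirement
    for name, select in (("DVFS", dvfs_select), ("CVFS", cvfs_select)):
        f, v = select(need)
        print(name, f, round(v, 3), round(relative_dynamic_power(f, v), 2))
```

In this toy model a requirement that falls between two table entries forces the discrete scheme up to the next entry, while the continuous scheme settles at the requirement itself, which is where the power saving would come from.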

Eta Compute’s software partner (Edge Impulse) is known for their TinyML pipeline, which I’m told together with this development board provides a pretty much turnkey solution – no code needs to be written to get a proof of concept up and running very quickly.

So if you want to build with AI at the extreme edge, check them out!

Register for Workshop

Remember to register for the free workshop. You can also learn more about the board HERE. Also, you can buy the boards through DigiKey.


Achronix Blog Roundup!

by Daniel Nenni on 07-09-2020 at 10:00 am

Achronix Speedcore

Blogging is not an easy thing to do. It takes time, patience, commitment, and creativity. SemiWiki brought blogging to the semiconductor industry and many companies have followed. Very few have been successful with personal or corporate blogs but as a premier semiconductor blogger I have developed a proven recipe over the last ten years and can spot a winner when I see one.

As a corporate blog success story I will point to the Achronix blog site. Over the last three years Achronix has posted 28 blogs. My preference would be one per month without fail but 28 in 37 months is a serious commitment. There is a nice mix of authors from different aspects of the company (engineering, marketing, applications, C level, etc…):

– Kent Orthner, Systems Architect
– Alok Sanghavi, Sr. Marketing Manager
– Steve Mensor, Vice President, Marketing
– Katie Purcell, Senior Staff Applications Engineer
– Volkan Oktem, Sr. Director of Application
– Manoj Roge, VP of Strategic Planning & Business Development
– Raymond Nijssen, Vice President and Chief Technologist
– Bob Siller, Director, Product Marketing
– Huang Lun, Sr. Field Applications Engineer

While the author is important, the blog title and the first paragraph are everything for both direct and search traffic. You need to speak to a specific problem to get high quality traffic. For semiconductor sites, quality versus quantity is important, so be very careful about clickbait because it is a double-edged sword. Again, the Achronix blogs are a great example for titles and summaries:

Embedded FPGAs for Next-Generation Automotive ASICs
Bob Siller, Sr. Marketing Manager
For anyone who has looked at new cars lately, it’s hard not to notice how quickly automotive electronics are advancing. Looking at automotive safety technology from just three years ago vs. today, you see a significant increase in the number of cameras to support applications such as surround-view display, driver distraction monitors, stereo vision cameras, forward-facing and multiple rearview cameras. Speedcore

Increase Performance Using an FPGA with 2D NoC
Huang Lun, Sr. Field Applications Engineer
Achronix Speedster7t FPGAs feature a revolutionary new two-dimensional network on chip (NoC), which provides >20 Tbps ultra-high bandwidth connectivity to external high-speed interfaces and for routing data within the programmable logic fabric. The NoC is structured as a series of rows and columns spread across the Speedster7t FPGA fabric. Each row or column has two 256-bit data paths using industry standard AXI data format, which support 512 Gbps data rates.

What is an FPGA and Why the Answer is Changing?
Bob Siller, Director, Product Marketing
What is an FPGA? With the advent of new FPGA architectures, the answer has changed more in the last two years than ever before. Traditionally, an FPGA or field programmable gate array, is a reconfigurable semiconductor device comprising programmable logic gates and interconnect or routing, connected to multipurpose I/O pins.  An FPGA can be reprogrammed to perform any function, and its functionality can be changed over time. (For a great summary and history of the FPGA industry and technology.

Insights from the Next FPGA Platform Event
Manoj Roge, VP of Strategic Planning & Business Development
It was exciting to participate in Next FPGA Platform on January 22nd at the Glasshouse in San Jose. I found it was particularly exciting to have Achronix share in a panel discussion with Xilinx and Intel. The Next Platform co-editors Nicole Hemsoth and Timothy Prickett Morgan did a great job in interviewing experts from FPGA ecosystem with insightful questions. The best part of Next Platform events is their format, where they keep marketing pitches to minimum with no presentations, just discussions.

FPGAs in the 2020s – The New Old Thing
Bob Siller, Director, Product Marketing
FPGAs are the new old thing in semiconductors today. Even though FPGAs are 35 years old, the next decade represents a growth opportunity that hasn’t been seen since the early 1990s. Why is this happening now?

Mine Cryptocurrencies Sooner Part 1-3
Raymond Nijssen, Vice President and Chief Technologist
Cryptocurrency mining is the process of computing a new cryptocurrency unit based on all the previously found ones. The concept of cryptocurrency is nearly universally recognized by the publicity of the original cryptocurrency, Bitcoin. Cryptocurrencies were supposed to be a broadly democratic currency vehicle not controlled by any one entity, such as banks, governments, or small groups of companies. Much of a cryptocurrency’s acceptance and trustworthiness is based on that proposition. However, with Bitcoin, that is not how it unfolded.

Getting the first wave of blog views is actually the easy part. Keeping readership is critical and that is all about the quality of content. If I had to credit one thing for the success of SemiWiki over the last 10 years it would be excellent content.  All content on a company website is important but done correctly blogs can bring a consistent stream of high quality traffic and improve your website SEO and rankings, absolutely.

You should also check out the Achronix videos and webinars, very well done!

 

About Achronix Semiconductor Corporation
Achronix Semiconductor Corporation is a privately held, fabless semiconductor corporation based in Santa Clara, California and offers high-performance FPGA and embedded FPGA (eFPGA) solutions. Achronix’s history is one of pushing the boundaries in the high-performance FPGA market. Achronix offerings include programmable FPGA fabrics, discrete high-performance and high-density FPGAs with hardwired system-level blocks, datacenter and HPC hardware accelerator boards, and best-in-class EDA software supporting all Achronix products. The company has sales offices and representatives in the United States, Europe, and China, and has a research and design office in Bangalore, India.

Follow Achronix
Website: www.achronix.com
The Achronix Blog: /blogs/
Twitter: @AchronixInc
LinkedIn: https://www.linkedin.com/company/57668/
Facebook: https://www.facebook.com/achronix/


Interface IP Category to Overtake CPU IP by 2025?

by Eric Esteve on 07-09-2020 at 6:00 am

Top 5 Forecast 2020 2024

The Interface Design IP market is exploding: it grew by 18% in 2019 to $870 million, while the CPU IP category grew by 5% to $1,460 million. In fact, the interface IP market is forecast to sustain a high growth rate for the next five years, as calculated by IPnest in the “Interface IP Survey 2015-2019 & Forecast 2020-2024”, and to reach $1,800 million by 2025. Obviously the CPU IP category will not stay at the 2019 level and is expected to grow as well. But we think the 2020-2025 CAGR for CPU will be more modest, in the 4% range.
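For readers who want to check the compounding themselves, the short Python sketch below works from the figures quoted above. It assumes growth compounds annually from the 2019 base through 2025 (six periods), which is one reading of the forecast window and not necessarily IPnest’s exact method. The helper names are illustrative.

```python
def project(value, rate, years):
    """Compound `value` at `rate` per year for `years` years."""
    return value * (1.0 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the text, in $ million.
INTERFACE_2019 = 870.0
INTERFACE_2025_TARGET = 1800.0
CPU_2019 = 1460.0
CPU_CAGR = 0.04  # "in the 4% range"

# Assumption: compounding starts from the 2019 base year.
YEARS = 2025 - 2019

if __name__ == "__main__":
    rate = implied_cagr(INTERFACE_2019, INTERFACE_2025_TARGET, YEARS)
    cpu_2025 = project(CPU_2019, CPU_CAGR, YEARS)
    print(f"Implied interface IP CAGR: {rate:.1%}")
    print(f"CPU IP in 2025 at a 4% CAGR: ${cpu_2025:,.0f}M")
```

Under these assumptions the two categories end up close together in 2025, so whether interface IP actually passes CPU IP depends on which end of the forecast range materializes.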

Why such a modest growth rate for the CPU IP category? The first reason is that the CPU IP market is shaky because the licensing business model has been in upheaval since the arrival of RISC-V CPUs. The second reason is the uncertainty about ARM’s future revenues from IP sales in China (estimated to be in the 30% range), because of the exit of the JV built to support ARM IP sales in the country. The post “Tears in the Rain – ARM and JVs in China” from Jay Goldberg on SemiWiki gives a very detailed explanation of the complete story. I strongly suggest you read it, because it puts into clear words what we were only guessing at.

But the goal today is to explain why the interface IP category will see such a high growth rate until 2025. The picture below shows that the CPU IP market share has been declining since 2017 (from 40.8% to 37.2%) while the interface IP share grew over the same period from 18% to 22.1%. This trend has held for the last three years, and we will see why it should continue during the 2020s.

In the 2010s, the smartphone was the strong driver for the IP industry, pushing the CPU/GPU categories and some interface protocols like LPDDR, USB and MIPI. The smartphone industry is still active but has reached a peak. The new growth drivers for IP sales are data-centric applications including servers, datacenter, wired and wireless networking and emerging AI.

All of these applications share the need for higher and higher bandwidth for in-system data exchange (with memory and between chips) as well as across the global network to support faster and wider interconnects between datacenter and networking.

This translates into high speed memory controllers (DDR5, HBM or GDDR6) and faster releases of interface protocols (PCIe 5, 400G and 800G Ethernet, 112G SerDes), as well as the emergence of protocols supporting chiplets (HBI or SerDes).

If we look at the interface IP segments, this will directly impact the memory controller, PCI Express, Ethernet and SerDes segments and a new segment that we could call “Die2Die” (D2D). We have already seen significant IP revenue growth in these segments in 2019, ranging from 12% (memory controller) to 20% (PCIe) and even more for the Ethernet and SerDes segments.

The drivers have been linked with the adoption of emerging protocols as well as new technology nodes, like 7nm and 5nm. For PCIe the driver has been the adoption of PCIe 4 (16 Gbps data rate per lane). For the memory controller segment, we have seen several drivers like DDR4 adoption in datacenter, and also the adoption of High Bandwidth Memory (HBM2) and graphics memory (GDDR6) in numerous applications, some of them being new and linked with AI.

When a design project starts on the latest available technology node (7nm in 2019) and integrates the latest release of a protocol, the license ASP is impacted and is more expensive than before (n-1 release on the N-1 node). So the growth for a specific IP segment is generated by the number of design starts (higher than before because there are more developments in applications like datacenter and AI) multiplied by the license ASP increase, because the protocol is more complex and the target node is more advanced.
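The “design starts multiplied by ASP” relationship can be written down in a few lines. The Python sketch below is only an illustration of that multiplication. The function name and the example percentages are hypothetical and are not IPnest data.

```python
def segment_growth(design_start_growth, asp_growth):
    """Revenue growth when revenue = design starts x average license price.

    Both arguments and the result are fractions (0.10 means 10%).
    """
    return (1.0 + design_start_growth) * (1.0 + asp_growth) - 1.0


if __name__ == "__main__":
    # Hypothetical inputs: 8% more design starts, 10% higher license ASP.
    growth = segment_growth(0.08, 0.10)
    print(f"Segment revenue growth: {growth:.1%}")
```

Because the two effects multiply, a segment can post revenue growth higher than either the design-start growth or the ASP growth taken alone.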

What we start to see clearly is that data-centric applications (servers, datacenter, networking, AI, …) are strongly pushing the interface IP market, more specifically memory controller, PCIe, Ethernet and SerDes.

With SerDes we can consider that 2019 was the year when 112G PAM4 SerDes started to be adopted, positively impacting the SerDes IP category revenues, but also Ethernet, as 400G MAC IP (and 800G MAC) have started to sell.

In fact, we have seen growth in the high 30% range for this category, illustrated by Synopsys (thanks to the Silabtech acquisition), Cadence (thanks to the Nusemi acquisition) and three-year-old SerDes start-up Alphawave reaching $25 million in revenue in 2019!

Don’t forget other protocols like USB, as the introduction of USB 4 should boost USB IP sales in 2021 and after. USB 4 offers much higher bandwidth at 40 Gbps (compared with 10 Gbps for USB 3.2 or 20 Gbps for USB 3.2 Gen 2x2) and clarifies USB nomenclature, making it easier to understand for the end user (the consumer). It also supports DisplayPort and Thunderbolt, a new capability to make life easier for consumers who want to watch movies.

The MIPI protocol, part of the top 5 interfaces, is massively used in the smartphone. The change is coming from the automotive segment with the adoption of MIPI CSI (camera) and MIPI A-PHY defined to support long range (LR) SerDes based interconnect in a car.

Nevertheless, the IPnest forecast for USB and MIPI predicts a CAGR in the 10% range for 2020-2024 for these two protocols, slightly less than the 15% CAGR associated with the three other protocols.

IPnest has used a methodology based on design starts by protocol, forecasting new project growth with respect to the target market segment (like datacenter, networking or ADAS) and predicting the license price (as a function of the technology node for the PHY and linked with the protocol release for the controller). This approach is quite complex, but we expect it to deliver accurate results and, more importantly, a realistic forecast.

This is the 12th version of the survey, which started in 2009 when the Interface IP category market was $250 million (in 2019 $870 million), and we can affirm that the five-year forecasts stayed within a +/- 5% error margin! So, when IPnest predicts in 2020 that the interface IP category in 2025 will be in the $1,800-$2,000 million range, passing the CPU IP category, this affirmation is backed up by experience…

If you’re interested in this “Interface IP Survey” released in June 2020, just contact me:

eric.esteve@ip-nest.com .

Eric Esteve from IPnest

Also Read:

Design IP Revenue Grew 5.2% in 2019, Good News in Declining Semi Market

Chiplet: Are You Ready For Next Semiconductor Revolution?

IPnest Forecast Interface IP Category Growth to $2.5B in 2025


Arm Rings the Bell in Supercomputing

by Bernard Murphy on 07-08-2020 at 6:00 am

Fugaku

Late last year I wrote about Arm’s efforts to play a role in servers, in AWS, and particularly Arm-based supercomputing, in the Sandia Astra roadmap and in partnering with NVIDIA who are in the Oak Ridge Summit supercomputer. These steps came, at least for me, with an implicit “Good for them, playing a role on the edges of these challenging applications.”

Well, they just blew right past that theory. The Fugaku Arm-based supercomputer was just named this year’s fastest in the world. Arm isn’t helping in some peripheral role. Arm cores are the CPUs in this supercomputer. What’s more, Fugaku earlier also topped the list of the world’s most efficient supercomputers.

Some Fugaku specs

Fujitsu and RIKEN developed Fugaku jointly, around the Fujitsu A64FX processor. Fujitsu have built these processors around a many-core Arm CPU, with 48 compute cores connected through a NoC, together with either 2 or 4 helper cores. In addition, each processor connects in-package to 32GB of high-bandwidth memory (HBM) supporting streaming memory accesses, the type of access that is also common in AI applications. The processor uses the Armv8.2-A architecture, plus the Scalable Vector Extension (SVE) with a 512-bit vector implementation. One processor alone is a serious machine.

384 of these go in a full rack and there are 396 of those racks in the system, plus a number of half racks. Together these add up to a total of nearly 160k nodes in Fugaku. These interconnect through a torus-architecture network they call TofuD (a neat name for a Japanese supercomputer network).
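The node count is easy to sanity-check. In the Python sketch below, the full-rack figures come from the text, while the half-rack count and size are assumptions taken from publicly reported Fugaku configurations, since the text only says “a number of half racks”.

```python
NODES_PER_FULL_RACK = 384  # from the text
FULL_RACKS = 396           # from the text

# Assumptions (not stated in the text): 36 half racks, each holding
# half as many nodes as a full rack.
HALF_RACKS = 36
NODES_PER_HALF_RACK = NODES_PER_FULL_RACK // 2


def total_nodes(full_racks, half_racks):
    """Total node count for a mix of full and half racks."""
    return (full_racks * NODES_PER_FULL_RACK
            + half_racks * NODES_PER_HALF_RACK)


if __name__ == "__main__":
    print("Full racks only:", total_nodes(FULL_RACKS, 0))
    print("With half racks:", total_nodes(FULL_RACKS, HALF_RACKS))
```

Full racks alone account for a little over 152k nodes, and the assumed half racks bring the total to just under 159k, consistent with “nearly 160k”.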

Theoretical peak performance is eye-watering. In boost mode, the system reaches 1.07 exaflops in 32-bit single precision, 2.15 exaflops in AI training (16 bit) and 4.3 exaops in 8-bit inference. This comes with a theoretical peak memory bandwidth of 163 petabytes/second. Peak power for this monster is about 28 MW and, no surprise, the system depends on closed-circuit water cooling.
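The roughly 2x steps between these peak figures line up with how many elements fit in one 512-bit vector at each precision. The Python sketch below shows that arithmetic. It assumes peak throughput scales with the number of vector lanes, which is a simplification and ignores everything else that affects real performance.

```python
VECTOR_BITS = 512  # vector width quoted for the A64FX implementation


def lanes(element_bits):
    """Number of elements of the given width in one 512-bit vector."""
    return VECTOR_BITS // element_bits


def speedup_vs_single_precision(element_bits):
    """Lane count relative to 32-bit single precision."""
    return lanes(element_bits) / lanes(32)


if __name__ == "__main__":
    for bits in (32, 16, 8):
        print(bits, lanes(bits), speedup_vs_single_precision(bits))
```

Halving the element width doubles the lane count, which matches the approximate 1x, 2x and 4x progression from single precision to 16-bit training to 8-bit inference.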

COVID applications

RIKEN is working with the Japanese Ministry of Education, Culture, Sports, Science and Technology to use Fugaku on a number of projects targeting COVID. One is a project to search for new drug candidates using molecular dynamics modeling to find candidates with a high affinity for the spike proteins on the virus. They are applying this analysis to 2000 existing drug candidates.

A different analysis is looking at the molecular dynamics of the spike protein to find features which may not be experimentally detectable. This is to gain a better understanding of the mechanisms behind connection to ACE2 receptors on cell surfaces.

A third team plans to model infection in indoor environments through virus droplets. This is with a view to testing possible counter-measures, such as airflow control. I like this simply because it’s an incredibly complex many-body fluidics problem. How else would you model this other than on a monster supercomputer?

Cray announces their Arm supercomputer

Fugaku isn’t the only Arm-based supercomputer on record. HPE/Cray have announced the Cray CS500, based on the Fujitsu A64FX processor. This product provides a Cray programming environment on the system. Already SUNY Stony Brook, DOE Los Alamos National Laboratory and ORNL have signed up for these systems.

No more patronizing Arm in supercomputing. They’re on the leader board and one of their customers is at the top of the leader board. I’ve heard that Cray plans to reclaim that spot next year. Wow!

You can read more about Arm’s journey in supercomputing HERE and you can learn more about Fugaku HERE.


Siemens Acquires UltraSoC to Drive Design for Silicon Lifecycle Management

by Mike Gianfagna on 07-07-2020 at 10:00 am

Some Key Executives from UltraSoC

As reported recently by Dan Nenni, Siemens has signed an agreement to acquire Cambridge, UK-based UltraSoC Technologies Ltd. We’ve all seen plenty of mergers and acquisitions in EDA.  Some transactions perform better than others. The best ones enhance an existing product or service by blending non-overlapping technologies. This one is different. The combination of two non-overlapping technologies is creating a whole new category.

The Details

The acquisition is integrating UltraSoC’s embedded monitoring hardware with the Tessent product suite, a comprehensive silicon test and yield analysis solution from Mentor Graphics, now part of Siemens Digital Industries Software. I had a chance to explore the details of the deal with Brady Benware, Tessent vice president and general manager at Siemens Digital Industries Software. Brady joined Mentor almost 14 years ago and has worked on the development of the Tessent product suite the entire time. He is a great source of detail and color about this acquisition.

Brady explained that UltraSoC started off in 2009 with monitoring IP that could be embedded in an SoC. The focus was to assist with silicon debug. Around 2015, UltraSoC began to implement a change in direction. They realized that their embedded monitoring technology could be used in a much broader range of applications. Cyber security, safety, system optimization and predictive analysis were some target areas. This change in direction put UltraSoC on a path that would ultimately intersect with Siemens and their Tessent product suite.

Mentor is a leader in test and the Tessent product suite brings a lot of silicon test and yield analysis solutions together. Areas such as automotive, logic, memory and mixed signal test are covered. Silicon learning tools to address test bring-up, silicon characterization, diagnosis-driven yield analysis and failure analysis are also addressed. These tools all focus on structural verification of the design during the manufacturing phase and while it’s deployed in the field.

In some important ways, the UltraSoC product family picks up where Tessent ends. Their embedded functional monitoring and analysis technology goes beyond structural verification, which enables a wider range of capabilities for monitoring and optimization of the part in the field. Brady discussed some of these capabilities. The first thing to realize is that applications such as automotive, IoT, data center and AI are all pushing performance to the limits of the silicon. These applications also demand optimal power and performance over the lifetime of the device and safe, reliable operation in a highly secure envelope.

UltraSoC’s embedded monitoring and analysis technology addresses all these requirements in a unique, hardware-driven way. The power and performance of the device can be monitored and analyzed, allowing modifications to the operating parameters of the device to compensate for aging effects. Bus traffic can be monitored to ensure there are no out of spec packets, which can indicate a security intrusion. The data collected from this embedded technology can also be used to perform predictive analysis for maintenance and provisioning.

An embedded approach provides some real advantages for this kind of analysis. Since all monitoring is done in hardware, the process is less intrusive to system operation, freeing capacity to address mission-critical tasks. A hardware-level approach is also less prone to hacking and external interference. Brady described a hierarchical communications backbone to leverage the data and analytics. It begins with the device and extends to the local system and collections of systems. From a security and safety standpoint, this kind of structure enables the identification of a system component that isn’t behaving like other similar system components. This could be an early warning for a compromised or failing device.
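As a rough illustration of spotting a component that “isn’t behaving like other similar system components”, the Python sketch below flags outliers in a set of per-device readings using the median absolute deviation. This is a generic statistical technique chosen for the example, not a description of how UltraSoC or Siemens analytics work. The device names, the metric and the threshold are hypothetical.

```python
from statistics import median


def find_outliers(readings, threshold=3.5):
    """Return IDs of devices whose reading deviates strongly from the fleet.

    `readings` maps a device ID to one numeric metric (for example a
    normalized bus latency). Uses the modified z-score based on the
    median absolute deviation (MAD).
    """
    values = list(readings.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All typical readings are identical; flag anything that differs.
        return [dev for dev, v in readings.items() if v != med]
    return [dev for dev, v in readings.items()
            if 0.6745 * abs(v - med) / mad > threshold]


if __name__ == "__main__":
    # Hypothetical fleet of similar devices reporting the same metric.
    fleet = {"ecu0": 1.00, "ecu1": 1.02, "ecu2": 0.98,
             "ecu3": 1.01, "ecu4": 1.60}
    print(find_outliers(fleet))
```

A median-based score is used here because a single misbehaving device does not drag the baseline toward itself the way it would with a mean and standard deviation.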

The Synergy

Putting all this together creates something Siemens is calling silicon lifecycle management, and this is the new category enabled by the acquisition. Testing and validation no longer ends when the part goes into production. It is rather the beginning of a monitoring and analysis process that extends over the entire lifecycle.

Siemens is moving quickly to bring these added capabilities to customers: the deal has already closed and integration is progressing. The two companies possess a shared vision, complementary technology and a similar customer base. This acquisition should bring substantial new capabilities to the market.


Waking Up to the Requirements of Voice Activity Detection

by Tom Simon on 07-07-2020 at 6:00 am

Dolphin Design Waking Up

There is a famous scene in the 1976 movie Taxi Driver when Robert De Niro’s character Travis is pretending to have a conversation looking in the mirror and repeatedly saying “Are you talking to me?”. I think about this scene every time I use a voice-activated device – Hey, are you talking to me? Yes, I am, but are you listening?

Voice command, which was the stuff of fantasy not that many years ago, has become a staple for smart products and systems. Even though many of these systems use computational processes similar to those used in our brains for voice recognition, electronic systems must operate under a set of tight constraints to make their use feasible. Chief among these are power limitations and the need to maintain privacy, primarily when conversation is not intended for the voice operated smart device. As a result, designers must design these systems with extra care to ensure these requirements are met.

Consumers will not tolerate voice systems that send all of their conversations over the internet to the cloud for analysis and potential recording. Furthermore, it is simply too costly to transmit that much audio information. It would require too much bandwidth and power consumption. Ideally voice activated systems would largely be in sleep mode with the absolute minimum circuitry active – listening for potential voice commands.

With that in mind, Dolphin Design has developed several IPs that help systems locally detect valid voice input to start the process of interpreting voice commands. Voice activity detection (VAD) starts with the detection of a keyword that triggers overall system activation. Only once a voice and a correct keyword are detected will the entire voice recognition chain be switched on. Dolphin has a new white paper titled “Why VAD and what solution to choose?” that talks about different architectures for VAD-based systems and their relative merits.

One of the most important metrics is the detection latency for various phonemes that can come at the beginning of a command phrase. VAD systems need to reject ambient noise yet respond quickly to valid voice input. Dolphin has developed the MiWok benchmarking platform to allow designers to compare key metrics.
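To give a feel for the tradeoff between rejecting ambient noise and reacting quickly, here is a minimal energy-threshold detector in Python. It is a textbook-style sketch, not Dolphin’s WhisperTrigger algorithm and not taken from the white paper. The frame length, margin, smoothing factor and hangover count are all assumed values.

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB (relative, uncalibrated)."""
    mean_sq = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_sq + 1e-12)


def detect_voice(frames, margin_db=10.0, hangover=2, alpha=0.95):
    """Yield True/False per frame using an adaptive noise floor.

    A frame counts as voice when its energy exceeds the tracked noise
    floor by `margin_db`. `hangover` keeps the detector on for a few
    frames after the energy drops, so short pauses do not cut a word.
    """
    noise_db = None
    hold = 0
    for frame in frames:
        energy = frame_energy_db(frame)
        if noise_db is None:
            noise_db = energy  # assume the first frame is background noise
        active = energy > noise_db + margin_db
        if active:
            hold = hangover
        else:
            # Track the noise floor only while no voice is present.
            noise_db = alpha * noise_db + (1.0 - alpha) * energy
            if hold > 0:
                hold -= 1
                active = True
        yield active


if __name__ == "__main__":
    quiet = [0.01] * 160  # hypothetical 10 ms frame at 16 kHz
    loud = [0.5] * 160
    frames = [quiet, quiet, loud, quiet, quiet, quiet, quiet]
    print(list(detect_voice(frames)))
```

In a detector like this the decision is made once per frame, so the frame length sets a lower bound on detection latency, while a larger margin rejects more noise at the risk of missing quiet speech onsets.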

Some systems use analog microphones, which means that most of the system can be in sleep mode, with only a small IP, such as the Dolphin WhisperTrigger, active to detect valid voice input. Other systems use digital microphones, which necessarily require more supporting circuitry, in addition to the WhisperTrigger IP, to remain in wake mode so the microphone input can be converted to a usable signal. The Dolphin white paper describes each type of system and its tradeoffs.

Regardless, their analysis shows that adding the WhisperTrigger IP to a voice-activated system allows for significant power reductions, versus maintaining DSPs in an on state to analyze incoming audio data. The Dolphin WhisperTrigger IP offers extensive configurability to let designers fine-tune sensitivity and performance for the specific application.

The white paper offers benchmark comparisons to help illustrate the alternatives available and their overall power consumption figures. If you don’t want the users of your system to feel like they are talking to themselves in the mirror, it might be worth reading the white paper to understand the options available for power-efficient and reliable VAD system design. The white paper is available for download on the Dolphin Design website.


The Future of Chip Design with the Cadence iSpatial Flow

by Mike Gianfagna on 07-06-2020 at 10:00 am

A few months ago, I wrote about the announcement of a new digital full flow from Cadence. In that piece, I focused on the machine learning (ML) aspects of the new tool. I had covered a discussion with Cadence’s Paul Cunningham a week before that explored ML in Cadence products, so it was timely to dive into a real-world example of the strategy Paul described. Since then, I also covered a position paper from Cadence on Intelligent System Design, which provides more details on advanced technology and ML for EDA.

The new digital full flow from Cadence is called iSpatial. Beyond ML, it also features unified placement and physical optimization engines that Cadence describes as an industry first. That’s a lot of integrated functionality. Questions that come to mind include:

How does the use model for a new tool like this compare to the prior generation? 

How is the workflow different, and what are the benefits of doing things a new way? 

I had the opportunity recently to explore these questions with Vivek Mishra, corporate VP, product engineering and Kam Kittrell, senior product management group director in the Digital & Signoff Group at Cadence. I was treated to a detailed tour of the use model for iSpatial and some actual results.

Vivek started our discussion by explaining that a key benefit of a flow like this is superior forward visibility (for the front-end synthesis team). We explored this statement further.

The front-end design team needs to know the power, performance and area of a given design iteration. This information drives optimization, and before iSpatial, the front-end team needed to wait for a completed design iteration from the back-end team to know these results. That could take many days.

Instead, with the iSpatial flow, the front-end design team gets meaningful and actionable information very quickly on things like overall performance, size and power as well as details on items such as routing congestion, critical path delays and clock insertion delays. The information is also presented in a format that is familiar to the front-end design team, avoiding the need to get an interpretation of the data from the back-end team. This contributes to efficiency as well as quality of results.

So, the integrated iSpatial flow minimizes turnaround time and maximizes efficiency for design iterations. But there’s more: the flow can reduce the overall number of design iterations as well. This is one application of ML. In this case, the tool will “learn” from prior design iterations and apply that knowledge in the form of suggestions for the next design iteration. Vivek provided some examples, such as modified pin placement to avoid DRC errors or a different choice of cell library elements to improve performance. These suggestions are delivered as scripts that can be run to implement them. This technology can actually help reduce design iterations by avoiding errors, which is headline news from a schedule perspective. Cadence calls these learning and optimization techniques “ML outside”.

There’s another ML use model that applies the technology to the core algorithms to optimize the results achieved. Cadence calls this “ML inside”. I explored some examples of these techniques with Vivek as well. Delay calculation was one we discussed. This is a very iterative and time-consuming process, requiring simulation. ML can optimize this process to increase both the speed and the accuracy of the results. Synthesis mapping is another example, where the best choice for a given implementation can be “learned” to avoid additional iterations.

Kam provided some more color on “ML inside” techniques at Cadence. Consider that many EDA algorithms are iterative in nature and the starting point for those iterations can impact the time to a converged result, or even whether they converge at all. Finding the right starting point is something of a pattern-matching problem, and ML is quite good at those kinds of tasks.
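
A small, generic illustration of why the starting point matters, using textbook Newton iteration on a cubic. This is not a Cadence algorithm; the function, the starting values and the tolerance are assumptions picked to show the effect.

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Newton's method; returns (estimate, iterations_used, converged)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x, iteration, True
    return x, max_iter, False

def f(x):
    return x ** 3 - 2 * x + 2

def df(x):
    return 3 * x ** 2 - 2

# Same solver, same problem, three different starting points.
far_start = newton(f, df, x0=-2.0)    # converges after a handful of steps
near_start = newton(f, df, x0=-1.77)  # a "well-guessed" start needs fewer steps
bad_start = newton(f, df, x0=0.0)     # bounces between 0 and 1 and never converges
```

Starting at 0.0, the iterates cycle between 0 and 1 indefinitely, while a start near the root converges in very few steps. A model that predicts a good starting point from previously seen problems is aiming for the second behavior.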

As a final point, I asked about actual results on real customer designs. Kam reminded me that some detailed statistics were shared in the original press release, an unusual level of detail for a press release actually. MediaTek reported, “… we were able to automatically and quickly train a model of our CPU core, which resulted in an improved maximum frequency along with an 80% reduction in total negative slack. This enabled 2X shorter turnaround time for final signoff design closure.”

Samsung Electronics reported, “(iSpatial) enabled us to achieve 3X faster design turnaround time by quickly iterating on RTL, constraints and floorplan while improving total power by 6%. Furthermore, Cadence’s unique ML capabilities allowed us to train a model of our design on Samsung Foundry’s 4nm EUV node, which helped us further achieve a 5% performance improvement and 5% leakage power savings.”

Kam further mentioned that on several advanced customer designs, a double-digit total negative slack (TNS) improvement, often 50 percent or more, was achieved. On these same designs, power was improved by 1 to 3.5 percent. If you consider that a design team could spend months looking for a three percent power improvement, these numbers are quite impressive. Kam also explained that design groups using older technology nodes are seeing benefits from the new flow in terms of reduced design iterations and a more finely tuned methodology.
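
For readers less familiar with the metric, here is a short sketch of how TNS is conventionally computed: it is the sum of the negative slack values over all timing endpoints, while worst negative slack (WNS) is the single most negative value. The slack numbers below are made up for illustration and are not taken from any customer design mentioned here.

```python
def total_negative_slack(slacks_ps):
    """Sum of all negative endpoint slacks (0 if timing is fully met)."""
    return sum(s for s in slacks_ps if s < 0)

def worst_negative_slack(slacks_ps):
    """Most negative endpoint slack, or 0 if no endpoint violates timing."""
    return min(min(slacks_ps), 0)

def tns_reduction(before_ps, after_ps):
    """Fractional TNS improvement between two runs (0.8 means 80% less TNS)."""
    tns_before = total_negative_slack(before_ps)
    if tns_before == 0:
        raise ValueError("no negative slack in the baseline run")
    return 1 - total_negative_slack(after_ps) / tns_before

# Hypothetical endpoint slacks in picoseconds for two design iterations.
before = [-500, -300, -200, 100]
after = [-100, -100, 0, 200]
improvement = tns_reduction(before, after)
```

With these made-up values, TNS moves from -1000 ps to -200 ps, which is the kind of change an "80% reduction in total negative slack" describes.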

At this point, I felt like I had seen the future (of chip design). You can learn more about the Cadence suite of digital design and signoff products here.

 


A Compelling Application for AI in Semiconductor Manufacturing

by Tom Dillinger on 07-06-2020 at 6:00 am

There have been a multitude of announcements recently relative to the incorporation of machine learning (ML) methods into EDA tool algorithms, mostly in the physical implementation flows.  For example, deterministic ML-based decision algorithms applied to cell placement and signal interconnect routing promise to expedite and optimize physical design results, without the iterative cell-swap placement and rip-up-and-reroute algorithms.  These quality-of-results and runtime improvements are noteworthy, to be sure.

Yet, there is one facet of the semiconductor industry that is (or soon will be) critically dependent upon AI support – the metrology of semiconductor process characterization, both during initial process development/bring-up, and in-line inspection driving continuous process improvement.  (Webster’s defines metrology as “the application of measuring instruments and testing procedures to provide accurate and reliable measurements”.)  Every aspect of semiconductor processing, from lithographic design rule specifications to ongoing yield analysis, is fundamentally dependent upon accurate and reliable data for critical dimension (CD) lithographic patterning and material composition.

At the recent VLSI 2020 Symposium, Yi-hung Lin, Manager of the Advanced Metrology Engineering Group at TSMC, gave a compelling presentation on the current status of semiconductor metrology techniques, and the opportunities for AI methods to provide the necessary breakthroughs to support future process node development.  This article briefly summarizes the highlights of his talk. [1]

The figure below introduced Yi-hung’s talk, illustrating the sequence where metrology techniques are used.  There is an initial analysis of fabrication materials specifications and lithography targets during development.  Once the process transitions to manufacturing, in-line (non-destructive) inspection is implemented to ensure that variations are within the process window for high yield.  Over time, the breadth of different designs, and specifically, the introduction of the process on multiple fab lines requires focus on dimensional matching, wafer-to-wafer, lot-to-lot, and fab line-to-fab line.

The “pre-learning” opportunities suggest that initial process bring-up metrology data could be used as the training set for AI model development, subsequently applied in production.  Ideally, the models would be used to accelerate the time to reach high-volume manufacturing.  These AI opportunities are described in more detail below.

Optical Critical Dimension (OCD) Spectroscopy
I know some members of the SemiWiki audience fondly (or, perhaps not so fondly) recall the many hours spent in the clean room looking through a Zeiss microscope at wafers, to evaluate developed photoresist layers, layer-to-layer alignment verniers, and material etch results.  At the wavelength of the microscope light source, these multiple-micrometer features were visually distinguishable – those days are long, long gone.

Yi-hung highlighted that OCD spectroscopy is still a key source of process metrology data.  It is fast, inexpensive, and non-destructive – yet, the utilization of OCD has changed in deep sub-micron nodes.  The figure below illustrates the application of optical light sources in surface metrology.

The incident (visible, or increasingly, X-ray) wavelength is provided to a 3D simulation model of the surface, which solves electromagnetic equations to predict the scattering.  These predicted results are compared to the measured spectrum, and the model is adjusted – a metrology “solution” is achieved when the measured and EM simulation results converge.
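
The simulate, compare and adjust loop described above can be sketched in a few lines. The forward model here is a deliberately simple two-beam thin-film interference formula standing in for a real 3D EM solver, and the refractive index, search range and step sizes are assumptions for illustration only.

```python
import math

def simulate_spectrum(thickness_nm, wavelengths_nm, n_film=1.46):
    """Toy stand-in for the EM solver: reflectance of a single thin film."""
    return [0.5 + 0.5 * math.cos(4 * math.pi * n_film * thickness_nm / lam)
            for lam in wavelengths_nm]

def mismatch(simulated, measured):
    """Sum of squared differences between simulated and measured spectra."""
    return sum((s - m) ** 2 for s, m in zip(simulated, measured))

def fit_thickness(measured, wavelengths_nm, lo=50.0, hi=150.0, step=0.5, tol=1e-3):
    """Adjust the model parameter until simulation and measurement agree."""
    def error(d):
        return mismatch(simulate_spectrum(d, wavelengths_nm), measured)

    # Coarse scan over the allowed range to find the right basin...
    n_points = int(round((hi - lo) / step)) + 1
    best = min((lo + i * step for i in range(n_points)), key=error)
    # ...then refine with ever smaller adjustments until the step is below tol.
    while step > tol:
        step /= 2.0
        best = min((best - step, best, best + step), key=error)
    return best

wavelengths = list(range(400, 801, 10))
measured = simulate_spectrum(101.3, wavelengths)  # pretend this came from the tool
estimate = fit_thickness(measured, wavelengths)
```

A production flow would replace `simulate_spectrum` with a rigorous solver and fit many geometric parameters at once, which is where the runtime cost comes from.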

OCD illumination is most applicable when an appropriate (1D or 2D) “optical grating-like” pattern is used for reflective diffraction of the incident light.  However, the challenge is that current surface topographies are definitely three-dimensional, and the material measures of interest do not resemble a planar grating.  Optical X-ray scatterometry provides improved analysis accuracy with these 3D topographies, but is an extremely slow method of data gathering.

Yi-hung used the term ML-OCD, to describe how an AI model derived from other metrology techniques could provide an effective alternative to the converged EM simulation approach.  As illustrated below, the ML-OCD spectral data would serve as the input training dataset for model development, with the output target being the measures from (destructive) transmission electron microscopy (TEM), to be discussed next.
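
As a rough sketch of that training setup, the snippet below pairs spectra (inputs) with reference CD values (targets) and predicts the CD of a new spectrum without running a simulator. A nearest-neighbor regressor is used only because it fits in a few lines. The talk summary does not say which model type is envisioned, and the spectra here are synthetic stand-ins for measured OCD data and TEM labels.

```python
import math

WAVELENGTHS_NM = list(range(400, 801, 20))

def synthetic_spectrum(cd_nm):
    """Hypothetical stand-in for a measured OCD spectrum of a feature of this CD."""
    return [0.5 + 0.5 * math.cos(4 * math.pi * 1.46 * cd_nm / lam)
            for lam in WAVELENGTHS_NM]

def train(spectra, tem_cds):
    """'Training' a nearest-neighbor model just stores (spectrum, TEM CD) pairs."""
    return list(zip(spectra, tem_cds))

def predict_cd(model, spectrum, k=2):
    """Inverse-distance-weighted average of the k closest training targets."""
    nearest = sorted(model, key=lambda pair: math.dist(pair[0], spectrum))[:k]
    weights = [1.0 / (math.dist(s, spectrum) + 1e-9) for s, _ in nearest]
    return sum(w * cd for w, (_, cd) in zip(weights, nearest)) / sum(weights)

# Reference CDs play the role of (destructive) TEM measurements.
reference_cds = [90.0 + i for i in range(21)]
model = train([synthetic_spectrum(cd) for cd in reference_cds], reference_cds)
predicted = predict_cd(model, synthetic_spectrum(100.4))
```

The prediction is only as good as the coverage of the reference set, which is why the amount of TEM data available matters so much for this approach.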

ML for Transmission Electron Microscopy (TEM)
TEM utilizes a focused electron beam that is directed through a very thin sample – e.g., 100nm or thinner.  The resulting (black-and-white) image provides high-magnification detail of the material cross-section, due to the much smaller electron wavelength (well over 1000X smaller than that of an optical photon).
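
To put a number on the wavelength difference, the relativistic de Broglie relation can be evaluated directly. The 200 kV accelerating voltage and the 500 nm optical wavelength are assumptions chosen as representative values; the talk summary does not specify either.

```python
import math

PLANCK_J_S = 6.62607015e-34          # Planck constant
ELECTRON_MASS_KG = 9.1093837015e-31  # electron rest mass (CODATA 2018)
ELEMENTARY_CHARGE_C = 1.602176634e-19
SPEED_OF_LIGHT_M_S = 299792458.0

def electron_wavelength_m(accel_voltage_v):
    """Relativistic de Broglie wavelength of an electron accelerated through V volts."""
    kinetic_j = ELEMENTARY_CHARGE_C * accel_voltage_v
    rest_energy_j = ELECTRON_MASS_KG * SPEED_OF_LIGHT_M_S ** 2
    momentum = math.sqrt(2 * ELECTRON_MASS_KG * kinetic_j
                         * (1 + kinetic_j / (2 * rest_energy_j)))
    return PLANCK_J_S / momentum

tem_wavelength = electron_wavelength_m(200e3)  # assumed 200 kV TEM
optical_wavelength = 500e-9                    # assumed green light, 500 nm
ratio = optical_wavelength / tem_wavelength
```

Under these assumptions the electron wavelength comes out at roughly 2.5 picometers, about five orders of magnitude shorter than visible light.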

There are two areas that Yi-hung highlighted where ML techniques would be ideal for TEM images.  The first would utilize familiar image processing and classification techniques to automatically extract CD features, especially useful for “blurred” TEM images.  The second would be to serve as the training set output for ML-OCD, as mentioned above.  Yi-hung noted that one issue with the use of TEM data for ML-OCD modeling is that a large amount of TEM sample data would be required as the model output target.  (The fine resolution of the TEM image compared to the field of the incident OCD exposure exacerbates the issue.)

ML for Scanning Electron Microscopy (SEM)
The familiar SEM images measure the intensity of secondary electrons (emitted from the outer atomic electron shell) that are produced from collisions with an incident primary electron – the greater the number of SEs generated in a local area, the brighter the SEM image.  SEMs are utilized at deep submicron nodes for (top view) line/space images, and in particular, showing areas where lithographic and material patterning process defects are present.

ML methods could be applied to SEM images for defect identification and classification, and to assist with root cause determination by correlating the defects to specific process steps.

Another scanning electron technique uses a variable range of higher-energy primary electrons, which will have different landing distances from the surface, and thus, provide secondary electrons from deeper into the material.  However, an extremely large primary energy will result in the generation of both secondary electrons and X-ray photons, as illustrated below.  (Yi-hung noted that this will limit the image usability for the electron detectors used in SEM equipment, and thus limit the material depth that could be explored – either more SE sensitivity or SE plus X-ray detector resolution will be required.)   The opportunities for a (generative) machine learning network to assist with “deep SEM” image classification are great.

Summary
Yi-hung concluded his presentation with the following breakdown of metrology requirements:

  • (high-throughput) dimensional measurement:
      • OCD, X-ray spectroscopy  (poor on 3D topography)
  • (high-accuracy, destructive) reference measurement:  TEM
  • Inspection (defect identification and yield prediction):  SEM
  • In-line monitoring (high-throughput, non-destructive):
      • hybrid of OCD + X-ray, with ML-OCD in the future?

In all these cases, there are great opportunities to apply machine learning methods to the fundamental metrology requirements of advanced process development and high-volume manufacturing.   Yi-hung repeated the cautionary tone that semiconductor engineering metrology currently does not have the volume of training data associated with other ML applications.  Nevertheless, he encouraged data science engineers potentially interested in these applications to contact him.   🙂

Yi-hung also added that there is a whole other metrology field to explore for potential AI applications – namely, application of the sensor data captured by individual pieces of semiconductor processing equipment, as it relates to overall manufacturing yield and throughput.  A mighty challenge, indeed.

-chipguy

 

References

[1]  Yi-hung Lin, “Metrology with Angstrom Accuracy Required by Logic IC Manufacturing – Challenges From R&D to High Volume Manufacturing and Solutions in the AI Era”, VLSI 2020 Symposium, Workshop WS2.3.

Images supplied by the VLSI Symposium on Technology & Circuits 2020.

 


Teaching AI to be Evil with Unethical Data

by Matthew Rosenquist on 07-05-2020 at 2:00 pm

An Artificial Intelligence (AI) system is only as good as its training. For AI Machine Learning (ML) and Deep Learning (DL) frameworks, the training data sets are a crucial element that defines how the system will operate. Feed it skewed or biased information and it will create a flawed inference engine.

MIT recently removed a dataset that has been popular with AI developers. The training set, 80 Million Tiny Images, was scraped from Google in 2008 and used in training AI software to identify objects. It consists of images that are labeled with descriptions. During the learning phase, an AI system will ingest the dataset and ‘learn’ how to classify images. The problem is that many of the images are questionable and many of the labels are inappropriate. For example, women are described with derogatory terms, body parts are identified with offensive slang, and racial slurs are sometimes used to label minority people. Such training should never be allowed.

AI developers need vast amounts of training data to train their systems. Collections are often created out of convenience, without consideration for courteous content, copyright restrictions, compliance to licensing agreements, people’s privacy rights, or respect for society. Unfortunately, many of the available sets were haphazardly created by scraping the internet, social sites, copyrighted content, and human interactions without approval or notice.

Many of the most used training datasets have issues. A large number were created by unethically acquiring content, some contain derogatory or inflammatory information, and for others, the sample is not representative because it excludes certain groups that would benefit from inclusion.

The problem has become worse over time. Flawed datasets that were made openly available to the developer community early on became so popular that they are now considered a standard. These benchmarks are used to check accuracy and performance across different AI systems and configurations.

Too few are vetted for inclusion, accuracy, or socially acceptable content. Using such flawed records is simply unethical because the resulting systems can be racially charged, biased, and promote inequality.

We cannot have good AI if the commonly used datasets create unethical systems. All files should be vetted and both the creators and product developers held responsible. Just as chefs are held accountable for the ingredients they put into their prepared dishes, so should the AI community be held responsible for allowing poor data to result in harmful AI systems.