
AI, Safety and Low Power, Compounding Complexity

by Bernard Murphy on 04-28-2020 at 6:00 am

NoC in a low-power ASIL-D design

The nexus of complexity in SoC design these days has to be in automotive ADAS devices. Arteris IP highlighted this at the Linley Processor Conference recently, where they talked about an ADAS chip that Toshiba had built. This has multiple vision and AI accelerators, both DSP and DNN-based. It is clearly aiming for ISO 26262 ASIL D certification since the design separates a safety island from the processing island, pretty much the only way you can get to ASIL D in a heterogeneous mix of ASIL-level on-chip subsystems.

Equally clear, it’s aiming to run at low power – around 2.7W for the processing island (the bulk of the functionality). It’s all very well to be smart but when you have dozens of smart components scattered around the car, that adds up to a lot of power consumption. The car isn’t going to be very smart if it runs its battery flat.

These are to some extent competing objectives. I’ve talked before about AI and safety and the need for a safety island to deliver ASIL D performance around AI accelerators. I’ll come back to that. But first I want to talk about power management and safety in on-chip networks.

Low-power design is one of those messy realities that spans all levels of design, and this affects the on-chip networks, frequently NoCs in an SoC (there has to be a rap lyric in there somewhere), as much as any other aspect of the design. Down at the atomic level of a NoC, most of it is combinational, therefore as low power as you can reasonably reach in a synchronous world. Those DFFs that are needed are created by the network generator in a way that is friendly to low-power synthesis, letting EDA synthesis tools infer clock-gating on banked registers. This covers around 95% of the DFFs in a NoC according to Kurt Shuler (VP Marketing at Arteris IP).

NoCs are generated with pre-defined unit-level building blocks – network interface units (NIU), arbiters and the like. Each of these can have built-in additional gating control so that, for example when an NIU is inactive it can be completely gated. All of this is zero-latency control managed by a little logic in each function. And each of these building blocks supports ASIL D duplication where needed so you get power efficiency and safety.

Power management at the next level up in the NoC – the SoC level – gets a lot more interesting. For a NoC entirely contained within a power or voltage/frequency domain (for DVFS), expectations are no different than they are for any other logic entirely within that domain. It needs to support voltage up/down and/or frequency up/down as demanded.

But some NoCs cross between domains. Parts of the NoC may switch off or change voltage and frequency while other parts remain active or don’t change. That requires intelligent interfacing at domain boundaries within the NoC. Now you need a NoC power controller in each domain, communicating with the SoC power controller. You also need elements at interfaces to handle handshaking between domains, so for example a request from an on-domain to an off-domain will trigger wake-up and wait for the off-domain to be ready. Equally, appropriate level-shifting and data-buffering will be used between DVFS domains. The cool thing is that the NoC generation tooling automatically takes care of configuring all of this and tying it all together based on your higher level system requirements.
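The wake-up handshake described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class names (`PowerDomain`, `BoundaryAdapter`), the fixed wake-up latency and the cycle-based `tick()` are all hypothetical and do not describe how Arteris IP actually implements its interface units.

```python
from enum import Enum


class DomainState(Enum):
    OFF = "off"
    WAKING = "waking"
    ON = "on"


class PowerDomain:
    """Hypothetical power domain that needs `wake_cycles` ticks to power up."""

    def __init__(self, name, wake_cycles):
        self.name = name
        self.wake_cycles = wake_cycles
        self.state = DomainState.OFF
        self._remaining = 0

    def request_wake(self):
        # A wake request only has an effect when the domain is fully off.
        if self.state is DomainState.OFF:
            self.state = DomainState.WAKING
            self._remaining = self.wake_cycles

    def tick(self):
        if self.state is DomainState.WAKING:
            self._remaining -= 1
            if self._remaining <= 0:
                self.state = DomainState.ON


class BoundaryAdapter:
    """Hypothetical interface element sitting between an on-domain and a target domain."""

    def __init__(self, target):
        self.target = target
        self.pending = []    # requests held while the target is not ready
        self.delivered = []  # requests forwarded once the target is on

    def send(self, request):
        # Buffer the request and trigger wake-up if the target is off.
        self.pending.append(request)
        if self.target.state is DomainState.OFF:
            self.target.request_wake()

    def tick(self):
        self.target.tick()
        if self.target.state is DomainState.ON:
            self.delivered.extend(self.pending)
            self.pending.clear()


def cycles_until_delivered(wake_cycles, max_cycles=100):
    """Count the ticks between sending a request to an off-domain and its delivery."""
    domain = PowerDomain("accelerator", wake_cycles)
    adapter = BoundaryAdapter(domain)
    adapter.send("read 0x100")
    for cycle in range(1, max_cycles + 1):
        adapter.tick()
        if adapter.delivered:
            return cycle
    raise RuntimeError("target domain never became ready")
```

In this model the requester never has to know whether the target was powered down; the adapter absorbs the wake-up delay, which is the property the handshaking elements are there to provide.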

Which brings me back to system-level safety. First, to get to a high level of safety at the SoC level, you’ll want to use duplication as needed. But duplication burns more power, so there’s a balance between safety and power. In Arteris IP NoCs, this balance is managed carefully, especially through optimizing unit-level duplication. Second, the power-down and DVFS scaling support for low-power has an added benefit for safety in a safety-island-supported architecture. The safety island can initiate a power down for a full reset when, for example, an AI accelerator misbehaves.

One other interesting point Kurt told me. The Linley Processor Conference described how Toshiba uses Arteris IP FlexNoC and Ncore interconnects to implement their SoC architecture, and Toshiba used temperature monitoring to throttle processing performance. Naturally they use the NoC to manage this.
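As a rough illustration of temperature-driven throttling, here is a minimal sketch of a hysteresis controller. Everything in it is an assumption made for illustration: the thresholds, the number of throttle levels and the function name `next_level` are hypothetical, and the source does not describe Toshiba's actual policy.

```python
def next_level(temp_c, level, hot_c=95.0, cool_c=85.0, max_level=3):
    """Return the next throttle level (0 = full speed, higher = slower).

    Two thresholds give hysteresis: throttle further at or above `hot_c`,
    relax at or below `cool_c`, and hold the current level in between.
    All threshold values are hypothetical.
    """
    if temp_c >= hot_c and level < max_level:
        return level + 1
    if temp_c <= cool_c and level > 0:
        return level - 1
    return level


# Hypothetical sequence of sensor readings in degrees Celsius.
readings = [80, 90, 96, 97, 92, 84, 83]
level = 0
history = []
for temp in readings:
    level = next_level(temp, level)
    history.append(level)
# history is now [0, 0, 1, 2, 2, 1, 0]
```

The gap between the two thresholds keeps the controller from oscillating when the temperature hovers near a single trip point.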

Obviously managing AI, safety and low power is a delicate but achievable balance in a NoC-centric SoC, judging by this Toshiba ADAS design. You can learn more details if you attended the Linley Spring Processor Conference 2020 by downloading the proceedings HERE. Arteris IP will also host the presentation on their www.arteris.com/resources web page next month.

Also Read:

That Last Level Cache is Pretty Important

Trends in AI and Safety for Cars

Autonomous Driving Still Terra Incognita


Synopsys – Turbocharging the TCAM Portfolio with eSilicon

by Mike Gianfagna on 04-27-2020 at 10:00 am


About 90 days ago, Synopsys completed the acquisition of certain IP assets from eSilicon. The remaining entirety of eSilicon was acquired by Inphi Corporation. I was the VP of marketing at eSilicon during that acquisition so it’s very interesting to me to find out how things are going with those certain IP assets.  I got an opportunity to find out recently.

I spent some time speaking with Rahul Thukral, senior product marketing manager at Synopsys. Rahul has spent a lot of time in memory design at Mentor Graphics, Virage Logic and Synopsys. We had a spirited discussion about those certain IP assets from eSilicon as the main focus of the team acquired by Synopsys was memory design.

First of all, Rahul reported that 90 days in, the team is completely integrated into Synopsys, including a Google cloud-based design environment that was developed at eSilicon and subsequently became part of the Synopsys acquisition. I’m sure many of you have either been through an acquisition or watched one (or more) closely. To be fully integrated and productive in 90 days or less is quite an accomplishment. I see it as a strong endorsement for the integration skills of Synopsys and the solid methodology and design talent of eSilicon.

Rahul was very complimentary of the eSilicon team – a strong addition to the Synopsys memory design capability that was right on the mark.  I know and respect the eSilicon team and it was nice to hear this assessment from an independent point of view. The acquisition included several memory IP titles, including TCAMs and multi-port memory compilers, as well as interface IP with high bandwidth interface (HBI) support. I will cover the comments around ternary content-addressable memories (TCAMs) here. This is a key growth area for Synopsys, as it was for eSilicon and is a good proxy for the other memory products.

For the uninitiated, a TCAM essentially operates in the inverse manner of a regular memory. In a regular memory, one provides an address and the memory returns the contents of that address. In a TCAM, one provides the content of interest and the TCAM returns the address(es) where that content is stored. TCAMs find widespread use in networking applications, where it’s important to quickly keep track of source and destination addresses for network packets. Rahul explained that the addition of eSilicon’s TCAM products had an “instant impact” on the Synopsys portfolio.
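To make the lookup direction concrete, here is a minimal software model of a TCAM. It is a sketch for illustration only: the `Tcam` class and its methods are hypothetical names, and the loop in `search` stands in for what real TCAM hardware does by comparing every entry in parallel. The "ternary" part is the per-entry mask, where a 0 bit means "don't care".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TcamEntry:
    value: int  # bit pattern to match
    mask: int   # 1 = compare this bit, 0 = don't care


class Tcam:
    """Toy model: search by content, get back matching addresses."""

    def __init__(self) -> None:
        self.entries: List[TcamEntry] = []

    def write(self, value: int, mask: int) -> int:
        # Store only the bits that will be compared; return the entry's address.
        self.entries.append(TcamEntry(value & mask, mask))
        return len(self.entries) - 1

    def search(self, key: int) -> List[int]:
        # Hardware compares all entries at once; this loop only models the result.
        return [
            addr
            for addr, entry in enumerate(self.entries)
            if (key & entry.mask) == entry.value
        ]


tcam = Tcam()
tcam.write(0x0A000000, 0xFF000000)  # address 0: 10.0.0.0/8
tcam.write(0x0A010000, 0xFFFF0000)  # address 1: 10.1.0.0/16
tcam.write(0xC0A80000, 0xFFFF0000)  # address 2: 192.168.0.0/16

matches = tcam.search(0x0A010203)   # look up 10.1.2.3
# matches is [0, 1]: both the /8 and the /16 prefixes contain this address
```

A networking device would typically apply a priority rule on top of the match list, for example preferring the longest prefix; that step is left out here.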

Prior to the acquisition, Synopsys was focusing on high-density TCAMs and eSilicon was focusing on high-speed TCAMs. Merging these two differentiating capabilities makes for a strong market position. Synopsys had a design philosophy of building software compilers to create the various instances of their memory products. eSilicon had the same philosophy, making the integration task easier. Synopsys can now offer greater than 2 GHz operation in the latest technology – a strong result. Beyond performance, power is also an important consideration. Rahul explained that some networking chips can have hundreds of TCAMs. If they all start firing at once, a phenomenon known as “power ringing” can occur. This essentially creates a nightmare signal integrity problem. The eSilicon team had a strong focus on power optimized designs, as did the Synopsys team. More good synergy.

I probed a bit with Rahul about other applications that benefit from TCAMs. It turns out automotive is also a hot market for this technology. There are multiple electronic control units (ECUs) in a typical car today. The powertrain, passenger comfort, infotainment and driver assistance are just a few examples of on-board ECUs that must all be networked together to create a unified driving experience. If this is starting to sound like a networking application, it is, and TCAMs help a lot here.

Thanks to the Synopsys focus on automotive functional safety and reliability, TCAM technology can be deployed in the automotive market. There is a substantial investment in certification to address this market, such as ISO 26262. Synopsys has made that investment. The addition of built-in self-test (BIST) for TCAMs for the consumer and automotive markets is an important growth area as well and one that Synopsys is also focused on.

Overall, I felt quite good after my discussion with Rahul. When we first examined the potential transaction between Synopsys and eSilicon, the compatibility and synergy of the two teams seemed quite strong on paper.  It’s nice to see it worked out that way in real life.

Also Read:

Synopsys is Changing the Game with Next Generation 64-Bit Embedded Processor IP

Security in I/O Interconnects

IP to SoC Flow Critical for ISO 26262


SiFive’s Approach to Embedding Intelligence Everywhere

by Tom Simon on 04-27-2020 at 6:00 am


Before the advent of RISC-V, designers looking for embedded processors were effectively limited to a handful of proprietary processors using ISAs from decades ago. While the major ISAs are being updated and enhanced, they are also constrained by decisions accumulated over many years. RISC-V was conceived with a clean, well-thought-out architecture and designed for expansion that would not create inconsistencies. Because it is open source, there is a rich set of tools and products that support it.

SiFive is one of the leading exponents of RISC-V and has been producing IP based on the RISC-V ISA for years now. Their product offerings have expanded significantly, now addressing everything from edge/IoT to server applications. Their recent webinar titled “Embedding Intelligence Everywhere with SiFive 7 Series Core IP” talks about how intelligence is needed in each market.

The webinar is divided into two parts. The first, presented by Jack Kang, Senior Vice President of Customer Experience at SiFive, offers a look at how embedded intelligence is becoming prevalent everywhere from the cloud to the edge. He also offers an overview of SiFive’s embedded RISC processors and how they fit current market needs.

Jack first talks about how AI is moving from the cloud to the edge, creating the need for additional processing capabilities in a wide range of devices. At the edge, AR, VR and sensor fusion are driving the need for expanded real time processing. Jack also points out that there is a need for increased intelligence in storage. He touches on how intelligence is supporting caching schemes, cryptography, memory maps and even in-cluster application processors. Likewise, making networking smarter facilitates higher bandwidth and other activities such as the implementation of 5G stacks.

The cores offered by SiFive are grouped by their application areas. The E Cores are suitable for 32-bit embedded uses. For heavier workloads there are the S Cores, which add 64 bit capabilities. The most powerful processors are the U Cores, which are 64-bit application processors for high end computing.

Their smallest and most efficient RISC-V processor IP is the 2 Series, the E2 and S2, which are 32 and 64 bit respectively. They are core and memory configurable to customer specific needs. They also feature ultra-low latency interrupts for servicing real world events.

The 3 and 5 Series are their most widely deployed products. The E3 offers 32 bit performance for mid range embedded applications. The S5 adds 64 bit and the U5 is the top end offering higher performance. These cores can be used in multicore configurations and have Hard Real Time capabilities.

The 7 Series cores are the topic of the second section of the webinar. Jack touches on them before talking about the U8, an extremely scalable high performance out-of-order core with their highest performance per watt. The U8 is also very area efficient. The combination of area and power efficiency makes it very attractive for high end computing systems.

The second section, presented by Jahoor Vohra, Director of Field Application Engineering, is titled “SiFive 7 Series Core IP”. His presentation discusses the features of the three members of the Core IP 7 Series: E7, S7 and U7. This includes an overview of the 7 Series microarchitecture, focusing on performance, scalability and the detailed specifications of each.

The 7 Series are scalable up to 8+1 for each cluster. Like other RISC-V processors the instruction set is extensible through custom instructions. Their memory is configurable and tightly integrated for low latency. If called for, they support mixed precision arithmetic. They also feature enhanced determinism to better support demanding real time applications. There are also functional safety features provided by built-in fault tolerance mechanisms. These are just a few of the highlights that Jahoor brings up.

To gain a full appreciation and understanding of the SiFive offerings, across the board and about the 7 Series in particular, I highly suggest viewing the informative webinar. I have been following RISC-V and SiFive for a number of years now. The level of adoption and progress has been extraordinary. The entire effort is supported by some brilliant dedicated minds. The results speak for themselves.

SiFive is also giving a webinar soon on the topic of Rapid Embedded Prototyping with SiFive Software. These webinars are all part of the SiFive Connect webinar series, which aims to provide educational content in an interactive format. A full list of these webinars can be found here.


Preventing a Product Security Crisis

by Matthew Rosenquist on 04-26-2020 at 12:00 pm


The video conference company Zoom has skyrocketed to new heights and plummeted to new lows in the past few weeks. It is one of the handful of communications applications that is perfectly suited to a world beset by quarantine actions, yet has fallen far from grace because of poor security, privacy, and transparency. Governments, major companies, and throngs of users have either publicly criticized or completely abandoned the product. In a time of unimaginable potential growth, Zoom is sputtering to stay relevant, fend off competition, and emerge intact.

Avoiding Total Loss of Product Confidence
There are lessons to be learned, applicable to all product and service companies, to avoid such gruesome misfortune. Leadership of every organization should be taking an introspective look to understand how they can best prevent such missteps and determine how they might respond in times of such crisis.

Zoom is a teleconference platform that has proven to be scalable and effective at bringing groups together to collaborate remotely. It is in a competitive field where features, time-to-market, performance, and usability are crucial to success. This is true for so many products, services, and businesses. Often in such environments, management possesses a razor-sharp focus on being competitive, which means getting products and new features out to the market as fast as possible.

There are costs to such a narrow focus. Accuracy in marketing messages can be overlooked. Documentation quality is often sacrificed. More importantly, it is very common that security is also deprioritized as an acceptable tradeoff. This is where the shortsightedness begins.

Security is a foundation for trust. What engineers and executives easily see as a distraction during frantic development cycles, something that can be addressed ‘later’, will introduce fundamental weaknesses that compound over time and can be exploited.

This is where Zoom is at. The organization is feeling the pain and chaos of decisions made far earlier, during product development, that are now emerging due to the rapid growth and adoption of their solution.

A number of issues have arisen that have customers, governments, and stockholders questioning the leadership and their confidence in the product. There was a privacy issue in which user data was harvested and sent to Facebook without consent. Default designs allowed incidents of harassment, called “Zoombombing”, to the embarrassment and fury of users. Marketing claims of End-to-End (E2E) security were inaccurate, as was the privacy policy. The architecture design and code have many vulnerabilities and do not protect the privacy of sessions between parties end to end. Then there was the choice to use data center assets in China, where sensitive information was stored, without informing customers who are very uncomfortable with such configurations. Now Zoom faces grave and very public concerns regarding the trust in management’s commitment to secure products, the respect for user privacy, the honesty of its marketing, and the design decisions that preserve a positive user experience.

Learning from Failures
The lesson is straightforward. All the issues Zoom is facing could and should have been addressed earlier, well before they exploded in spectacular fashion. This is the key takeaway for everyone: a lack of investment in security and privacy in the development phases can manifest in devastating consequences. Every organization should be evaluating their DevOps security programs. They should be re-evaluating the role and value of security during product design, development, updates, and sustaining operations. Zoom is showcasing the severe consequences of ignoring proper risk management. They aren’t the first, but the world is changing and people’s tolerance and patience for such issues is becoming less forgiving. Zoom and every other product company must adapt to meet the growing expectations for security, privacy, and safety.

How can Zoom recover?
For those interested in how Zoom should be addressing the systemic issues they face during their product crisis, I recommend the Zoom in crisis: How to respond and manage product security incidents article on HelpNetSecurity, where I break down a number of issues and steps for resolution.


COVID-19 Cars as an Essential Service

by Roger C. Lanctot on 04-26-2020 at 10:00 am


The Automotive News reported Friday that updated guidance from the Department of Homeland Security’s Cybersecurity and Infrastructure Security Agency had identified cars as an essential service. AN reported that the new guidelines include “workers critical to the manufacturing, distribution, sales, rental, leasing, repair, and maintenance of vehicles and other transportation equipment, including electric vehicle charging stations, and the supply chains that enable these operations to facilitate continuity of travel-related operations for essential workers.”

Automotive News report: https://www.autonews.com/dealers/auto-sales-listed-essential-service-updated-federal-guidance?utm_source=daily&utm_medium=email&utm_campaign=20200417&utm_content=article2-headline

The announcement had a Pyrrhic quality to it as millions of Americans were rapidly coming to grips with the fact that they could, indeed, live without cars. In fact, they could live without moving around at all. Their very lives might depend upon not moving as the more moving a person might do, the more likely they were to become infected with COVID-19.

The government-ese is the problem here. Cars are “essential.” Really? By now we know that oxygen, water, food, family, and friends are essential. And maybe toilet paper. But cars?

Politicians have tried to pooh-pooh the pandemic by talking about how many Americans are killed in traffic incidents and by the “common” flu. But COVID-19 has outpaced the fatality rates from those two analogs.

COVID-19 is killing 1,800 Americans/day. That’s more than heart disease (1,774) and cancer (1,641). And cars? Cars, on a typical day in the U.S., kill about 100 people. One hundred daily fatalities is pretty horrible, but COVID-19 is slaughtering 18x that daily figure.

Restoring vehicle production and sales, though, assumes demand for vehicles will be strong and is, in fact, pent up – waiting to break free. The reality may be something quite different with more than 22M Americans already having filed for unemployment and stay-at-home orders in place across much of the country.

In fact, some cities and states still stand in the path of car sales in spite of the Federal designation. Los Angeles has yet to allow retail vehicle sales. The State of Pennsylvania doesn’t even allow online sales of cars. (Pennsylvania reversed its stance against online vehicle sales Tuesday afternoon.)

All of this has contributed to a steep plunge in used car prices further threatening the viability of Ford Motor Company and General Motors – which have billions of dollars of loans and leases on their books. The decline in used car prices was a further blow to Hertz which itself is reportedly teetering on the verge of bankruptcy.

Dealers will now be in position to test the theory of cars as “essential.” In fact, dealers themselves are facing a major test of their own viability during a pandemic of undetermined longevity.

It won’t be enough for dealers to open their doors. It won’t be enough for car makers to pump out cheery advertising messages and crazy incentives. Dealers will need to get creative.

The good news is that there are a host of marketing partners pushing out new tools to engage with customers remotely, digitally, virtually. Video tools for dealers are currently the hottest feature in the service space, according to some industry veterans.

Several companies are out in front including Xtime, MyKaarma, CITNow, Dealer-FX, UpdatePromise, and Text2Drive. Video conferencing for payment is only one part of the process and requires a strong technical infrastructure that can handle payments electronically. MyKaarma has the best solution followed by Xtime, Text2drive, and UpdatePromise, according to one observer.

COVID-19 has introduced more than the usual level of trepidation into a dealer visit. The average new or used car buyer will do a fair amount of research and will already know his or her price and financing plan. The days of test drives and handshakes are practically pointless in a post-COVID-19 environment.

Dealers wanting to truly test the “essential” quality of a car will reach out to potential customers, offer online vehicle evaluation and demonstration tools, and, most importantly, allow for an entirely online purchasing process with to-the-door delivery. Dealers are witnessing nothing less than the digitalization of the sales process. We may be only years away from the demise of the showroom and the rise of the virtual demo and close.

Doubters need look no further than virtual vehicle sales leader Carvana jumping to the fourth spot on Automotive News’ ranking of used vehicle sales leaders to appreciate the power of digital. Traditional dealers actually have more tools for creating a more interactive customer experience to go along with a full service back-end operation.

COVID-19 has forced all human beings to question what is essential. By now, we all know that cars are definitely not essential. The truly essential things today are those that the government can’t seem to give us as part of some “stimulus” package.

We need family and friends, food, empathy, and maybe a small helping of truth. Only truth will really free us of these COVID-19 bonds. Until then, we’ll have to settle for some innovative digital car retailing to rescue our economy in these dark times.


LRCX Supply constrained by Covid Crisis

by Robert Maire on 04-26-2020 at 8:00 am


Lam Q1 revenue soft by roughly 15% due to supply side
Demand remains solid for Q2 but beyond that, dubious
No guide but Q2 could be >= Q1 revenues
NAND solid, China big @ 32%, Foundry remains great

Lam reported a solid quarter but light on revenues…
It was no surprise that Lam reported revenues of $2.5B versus their original guide of $2.8B ± $200M and solid earnings of $3.98.

The clear message is that while demand was solid, and outstripped Lam’s ability to supply, Lam was hobbled by sub supplier issues that limited production.

As we have previously underscored, the supply chain for complex semiconductor equipment spans the globe, especially into Asia and is susceptible to disruptions as many parts are single sourced or have limited manufacturers.

Given that Covid19 really had its worst impact well into the quarter there was not enough time to fix or try to mitigate the supply issues prior to the end of the quarter.

While the company did not give “official” guidance for Q2, they did suggest that Q2 revenue could potentially be better than Q1 as some of the limiting issues get worked out.

The company repeated many times that it was not a demand issue and they saw no change in customer orders and the shortfall was all on Lam not being able to produce and ship 100% due to Covid19 related issues.

No change in demand (Yet)…. Demand will be down..just a question of when and how much…
While there was no near term indication of any changes in demand/orders from customers, it’s clear that there will be some sort of weakness driven by the overall economic declines, especially on the consumer side….the company all but stated that outright.

We have suggested in our prior notes that there is the short term logistics impact of Covid19  (clearly on display in both ASML and Lam’s results) and the longer term, as yet unknown, demand impact.

While short term impact stops parts from getting to Lam immediately, it will take months and quarters before laid off workers and other negative economic impact trickles down through electronics makers, then through chip makers and finally to chip tool makers.

Yes, we will keep spending on technology advances…but capacity related purchases will be vulnerable.

Taking prudent steps…
As we heard from ASML, Lam is also being financially prudent by stopping buy backs, even though they have tons of cash and a depressed stock price. They have pulled cash from their credit line and continue to put downward pressure on expenses. All correct and conservative as the future is quite uncertain right now.

China a third of revs…NAND continues comeback…
China, at 32% of revenues was roughly a third of business and bigger than any other segment. Roughly half or more of China business was for indigenous companies.

We remain concerned about the overhang of a potential blockade of exporting US semiconductor tools to China…especially in light of an administration looking to punish China or deflect attention to other matters.

While we feel very good about CPU demand impacting Intel and AMD due to “work at home” we remain more concerned about NAND demand and pricing as we could get back into an oversupply especially if consumers slow.

We remain most concerned about the fall rollout of Apple’s iPhone 12 and the associated next generation of both Apple CPUs and 5G modem devices.

Right now both we and Lam management have a hard time guessing what demand will look like going forward. Lam management demurred on the call when talking about future demand but we might be a bit more direct in assuming it’s down.

Trickle down of demand will take a while….
As we have previously mentioned, fab capacity planning is very long term by its nature and has all the maneuverability of a supertanker. Many of the fab expansion plans taking place right now are “bounce back” reactions from the previous down cycle and less of a “steady state” indication of demand.

While we have heard of some inventory build in the channel, we are not yet “stuffed” with inventory that would cause chip makers to hit the brakes hard. We probably won’t see the trickle down to equipment makers until at least Q3 and more realistically Q4 when chip makers get a better sense of the full economic impact and start to adjust 2021 build plans.

The stock roller coaster continues….
After Lam’s stock was down big yesterday, it was up twice as big today only to fall off after hours with less than strong beat results. The volatility will obviously continue with larger issues and macro events driving the overall tone of the market.

Whether chip equipment companies or chip companies report good or bad earnings seems to matter little as most bets are off due to Covid19 issues. While the stock is cheap compared to prior expectations, the uncertainty of demand will likely haunt things for several quarters. The company continues to do a great job of execution but unfortunately can’t control the larger picture that they are a small part of.


Key Applications for Chip Monitoring

by Daniel Nenni on 04-24-2020 at 2:00 pm


One of the side benefits of working with SemiWiki is that you get to meet a broad range of people and in the semiconductor industry that means a broad range of very smart people, absolutely. Recently I had the pleasure to meet Richard McPartland of Moortec. Richard and I started in the semiconductor industry at the same time but from across the pond as they say. Richard started at UK semiconductor pioneer Plessey in the early 1980s as an IC designer. The stories he can and does tell…

Richard and I are working on an upcoming webinar on optimizing power and increasing data throughput in advanced multi-core AI/ML/DL devices. Artificial intelligence, machine learning, and deep learning are touching just about every new design so this webinar will be a full one. Be sure and register to attend the event and you will get a link to the replay. Here is the webinar abstract:

If you are working on complex Artificial Intelligence (AI) or Machine Learning (ML) or Deep Learning (DL) designs using advanced node processes, you will understand the motivations for optimising CPU utilisation, device power and processing speed. Cutting-edge AI, ML & DL chips, by their very nature, are susceptible to intra-die process variability. Designers are often walking a fine line between optimal performance and failure.

This webinar from Moortec looks at how close real-time analysis of dynamic conditions, as well as identifying process corners, using embedded in-chip monitoring fabrics based on advanced node processes can greatly improve the power consumption, data throughput and computational performance of the overall system design.

Topics covered will include how tight dynamic guard-banding will enable improvement for the optimisation of multi-core utilisation, thermal load balancing and fine-grain SVS/AVS control, whilst the device is in mission-mode.

Due to their experience and dedication to In-Chip Monitoring, Moortec are able to support companies who are operating at the cutting edge of AI and Machine Learning chip design. Such companies have utilised Moortec’s highly accurate, highly featured sensors within their in-chip monitoring subsystem to ensure optimal performance and enhanced reliability.

Richard is also a fellow blogger. You can catch his musings on the Moortec blog site Talking Sense with Moortec.

He’s got two up thus far and they are very good:

Talking Sense with Moortec – Key Applications for In Chip Monitoring…In-Die Process Speed Detection
Chip designers working on advanced nodes typically include a fabric of sensors spread across the die for a number of very specific reasons. In this, the second of a three-part blog series, Richard McPartland, Moortec’s Technical Marketing Manager, continues to explore some of the key applications and benefits of these types of sensing solutions. In this instalment the focus is In-Die Process Speed Detection and why understanding in-chip process speed detection alongside thermal and supply conditions is essential if you want to maximise performance and power, improve reliability and ultimately reduce the cost of your cutting-edge design…

Talking Sense With Moortec – Key Applications For In Chip Monitoring…Thermal Sensing
The latest SoCs on advanced semiconductor nodes typically include a fabric of sensors spread across the die and for good reason. But why and what are the benefits? This first blog of a three-part series explores some of the key applications for In-chip thermal sensing and why embedding in-chip monitoring IP is an essential step to maximise performance and reliability and minimise power, or a combination of these objectives…

Moortec is one of the more collaborative companies I have worked with. They participate in most of the events I frequent and many more. It would probably be easier to list the companies that they do not work with because their customer list is extensive. And when you do chip monitoring, analytics and optimization you get first-hand experience with leading-edge design challenges. So who better to partner with?

 


CEO Interview: Jason Xing of Empyrean Software

by Daniel Payne on 04-24-2020 at 10:00 am


It’s been about seven years since Randy Smith last interviewed Jason Xing, the President/CEO of North America for Empyrean Software, so the timing felt good for a fresh update. I’ve been watching Empyrean at DAC for several years now, and have come away impressed with their growth and focus on some difficult IC design problems:

  • GPU-powered SPICE circuit simulator (ALPS-GT)
  • High capacity, parallel SPICE circuit simulator (ALPS)
  • IC Layout Analysis and chip finishing (Skipper)
  • Timing ECO (Empyrean XTop)

Q&A

How has Empyrean adapted to the pandemic, is work continuing but remotely?

Every year Empyrean has made a significant investment in R&D, which forms a big portion of its staffing. After the pandemic broke out, we quickly enabled most employees to work from home by setting up remote server access and expanding our online meeting capacity with Zoom and Webex. The pandemic limited employee mobility but saved on commute time. Our customer engagement teams use this extra time to do deeper business planning, review case studies, and catch up on product technologies.

Has Empyrean been using Video technology to keep in contact with its employees and customers, or have you adopted any new apps to keep the business running and support customers?

Empyrean has used most major video technologies, such as Zoom, Webex, and XYLink. We also use WeChat for small-scale and casual meetings.

With fewer trade shows and conferences happening in 2020, how will Empyrean connect with new customers?

Some of our targeted new customers were cultivated before the pandemic. During the pandemic we simply continue those business engagements through video calls and tool evaluations. We also try to reach new customers with online articles and webinars.

2019 just wrapped up, so what kind of progress did Empyrean make last year in the EDA industry?

In 2019, Empyrean successfully released a new product, the GPU-powered, high-performance, parallel SPICE simulator ALPS-GT, which achieved over 10X speedup over competitors on large, post-layout simulations with high accuracy. This product has been adopted by several top-tier design houses and IP vendors.

Empyrean has also developed and perfected a design flow for flat-panel designs, which has been adopted by major FPD IDMs.

What are the current semiconductor design challenges that your company is addressing in 2020?

We’re focused on the following four design challenges:

  • Analog design verification, including traditional analog design, structured memory design, and RF designs
  • Difficult-to-debug, post-layout AMS designs at advanced nodes
  • STA signoff that is too pessimistic and cannot guard-band designs for very advanced design processes and IoT designs
  • Library characterization that is too power- and time-consuming for advanced design processes

What kind of events will Empyrean be attending this year?

Empyrean will attend DAC, TSMC Symposium and OIP.

Which customers can you talk about from 2019 that were using Empyrean tools for IC design?

nVidia and Xilinx.

What would a successful 2020 look like for Empyrean?

Successfully roll out planned technology in our disruptive products and gain customer satisfaction for our products and support.

How do you compete with the solutions from big EDA companies?

Empyrean builds a competitive edge by using innovative or disruptive technologies to create products for niche or unserved markets. Empyrean also tries to provide the best available customer support.

Did you see much competition for ALPS in the circuit simulation segment and how did you address it?

Yes, we did see competition. However, we saw that major competitors used massive RC reduction in order to gain simulation speed, which is not acceptable for high-accuracy, post-layout simulation of designs at advanced nodes. Our ALPS/ALPS-GT circuit simulators excel in this type of long, accurate simulation, and our product team is still working hard on innovations that deliver fast and accurate simulation for such designs.

Summary

I like the new products coming out of Empyrean for IC designers, and their customers include tier-one semiconductor companies, all good signs. To take the next step, just contact Empyrean in Silicon Valley, China, Japan, Korea or Singapore. Empyrean also has a new webinar coming up. Even if you cannot attend on that day, register and you will get a link to the replay:

WEBINAR: IP Integration Challenges of Complex SoC Platforms

Also read:

More CEO Interviews:

Executive Interview: Howie Bernstein of HCL

CEO Interview: Adnan Hamid of Breker Systems

CEO Interview: Cristian Amitroaie of AMIQ EDA


Using ML Acceleration Hardware for Improved DSP Performance

by Tom Simon on 04-24-2020 at 6:00 am


Some amazing hardware is being designed to accelerate AI/ML, most of which features large numbers of MAC units. Given that MAC units are like the Lego blocks of digital math, they are also useful for a number of other applications. System designers are waking up to the idea of repurposing AI accelerators for DSP functions such as FIR filter implementation. Word of this demand comes from Flex Logix, whose customers have been asking if its nnMAX-based InferX X1 accelerator chips can be used effectively for DSP-based FIR. Flex Logix’s response is a resounding yes, with supporting information given recently at the Spring Linley Processor Conference in March.

Flex Logix Senior VP Cheng C. Wang gave a presentation titled “DSP Acceleration using nnMAX” that shows how effective the nnMAX architecture is when applied to DSP functions. Applications such as 5G, testers, base stations, radar and imaging all need high sample rates and large numbers of taps. Some system designers are using expensive FPGAs or high-end DSP chips to get the performance they need. In fact, expensive FPGAs are often being used just for their DSP units.
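To see why an array of MACs maps so naturally onto FIR filtering, here is a minimal sketch of a direct-form FIR filter in Python. It is an illustration only, not Flex Logix's implementation, and the function names `fir_direct` and `macs_per_second` are made up for this example. Each output sample costs one multiply-accumulate per tap, so a real-valued filter needs a sustained MAC rate of roughly taps times sample rate.

```python
def fir_direct(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]  # one multiply-accumulate (MAC)
        out.append(acc)
    return out


def macs_per_second(num_taps, sample_rate_hz):
    """MAC rate needed to keep up with the input for a real-valued direct-form FIR."""
    return num_taps * sample_rate_hz


if __name__ == "__main__":
    # 3-tap moving sum over a short ramp
    print(fir_direct([1, 2, 3, 4], [1, 1, 1]))  # [1, 3, 6, 9]
    # Hypothetical workload: 1024 taps at 1,000 MS/s
    print(macs_per_second(1024, 1_000_000_000))
```

The inner loop is exactly the operation a MAC unit performs, which is why long filters at high sample rates are limited mainly by how many MACs can run in parallel.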

The nnMAX tiles used in the InferX X1 each contain 1024 configurable MACs that run at 933MHz. They support INT8x8 and INT16x8 at full throughput. BFloat16x16 and INT16x16 run at half throughput. Mixed precision is supported as well. nnMAX also provides Winograd acceleration for INT8 that can boost performance by 2.25x. For AI/ML, nnMAX can be programmed by TensorFlow Lite/ONNX, with multiple models running simultaneously.
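For readers unfamiliar with Winograd convolution, the sketch below shows the one-dimensional F(2,3) form in Python: two outputs of a 3-tap filter computed with four multiplies instead of six. The article does not say which Winograd variant nnMAX implements, so this is an assumption for illustration. It is, however, a well-known result that nesting F(2,3) in two dimensions gives F(2×2, 3×3), which needs 16 multiplies where the direct method needs 36, a ratio of 2.25 that matches the figure quoted above.

```python
from fractions import Fraction


def conv_direct(d, g):
    """Two outputs of a 3-tap filter over four inputs: 6 multiplies."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def conv_winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 data-dependent multiplies."""
    half = Fraction(1, 2)
    # Filter transform: computed once per filter, so not counted per output.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) * half
    u2 = (g[0] - g[1] + g[2]) * half
    u3 = g[2]
    # The four multiplies.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d, g = [1, 2, 3, 4], [5, 6, 7]
    print(conv_direct(d, g), conv_winograd_f23(d, g))
    # 2-D case: 4 outputs x 9 multiplies direct vs 4 x 4 multiplies Winograd
    print((4 * 9) / (4 * 4))
```

Exact fractions are used here so the two results compare equal; a hardware implementation would handle the transformed filter values in fixed-point or with pre-scaled coefficients.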

The nnMAX tiles can be arrayed to add compute capacity. In the InferX X1 each tile has 2MB L2 SRAM. Going to a 2×2 array, or even up to 7×7, scales the available compute with the number of tiles. NMAX clusters are assembled from arrays, with each cluster performing a 32 bit tap filter. When longer filters are needed, NMAX clusters are chained to form thousands or tens of thousands of taps.

Cheng gave several examples of possible configurations. For instance, at 1,000 MegaSamples per second (1GHz clock), an nnMAX cluster gives 16 taps, an nnMAX 1K tile gives 256 taps and a 2×2 nnMAX array gives 1024 taps. So, what does all this translate to in terms of FIR operation?

In one of the slides for the Linley presentation Cheng compares nnMAX to a Virtex Ultrascale. The comparison shows that an nnMAX 2×2 array can run 1000 taps at the same rate as an Ultrascale 21 tap FIR. Considering that the Ultrascale is hundreds of mm² in area and costs hundreds of dollars, while the nnMAX 2×2 array with 8MB SRAM is just 26 mm², these are impressive results. Cheng also provides an eye-opening comparison between the CEVA XC16 and nnMAX. There is a link below to the full presentation with all the comparison numbers.
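The configuration figures quoted above can be turned into a quick sizing calculation. The Python sketch below uses only the tap counts given in the article for 1,000 MS/s; the helper name `units_needed` is hypothetical, and it assumes that tap capacity simply adds up when units are chained, which the presentation implies but the article does not state precisely.

```python
import math

# Tap counts quoted in the presentation, all at 1,000 MS/s.
TAPS_PER_UNIT = {"cluster": 16, "tile": 256, "2x2 array": 1024}


def units_needed(target_taps, unit):
    """Chained units required for target_taps (assumes capacity adds linearly)."""
    return math.ceil(target_taps / TAPS_PER_UNIT[unit])


if __name__ == "__main__":
    # Ratios implied by the quoted figures
    print(TAPS_PER_UNIT["tile"] // TAPS_PER_UNIT["cluster"])    # 16
    print(TAPS_PER_UNIT["2x2 array"] // TAPS_PER_UNIT["tile"])  # 4
    # Hypothetical 10,000-tap filter built from chained 2x2 arrays
    print(units_needed(10_000, "2x2 array"))                    # 10
    # Tap ratio in the Ultrascale comparison (1000 taps vs 21 taps)
    print(round(1000 / 21, 1))                                  # 47.6
```

The factor of four between a tile and a 2×2 array is consistent with capacity growing in proportion to the number of tiles.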

Cheng pointed out that the FIR application was the first one they tackled with the nnMAX. They already have a major customer for this and are working on improved usability by taking Matlab output to map onto the nnMAX. There will also be a technology port for the nnMAX adding GF12LLP and TSMC N7/N6 as new nodes. Their next target application will be fast FFT.

So, it seems that there is collateral benefit from the development of AI accelerators. Of course, even just for AI/ML, the nnMAX technology offers very high performance and performance per dollar. The full slide deck for the Linley presentation is available on the Flex Logix website. It offers more detail than can be provided here. I suggest taking a look at it if you are looking for AI/ML or DSP acceleration.


Tracing Technology’s Evolution with Patents

by Arabinda Das on 04-23-2020 at 10:00 am

Figure 1

We live in an age of abundant information. Ideas crisscross the world at a tremendous rate, and innovative new products pop up daily. In such an era there is a greater need for competitive intelligence. Companies today want to know what their competitors are brewing in their R&D labs and to predict which novel applications are coming to market, so that they can determine the best possible plan of action in response. Moreover, new players with radically innovative ideas are emerging rapidly, as can be partly deduced from the massive shift in patent filings in recent years. For example, in 2000 the three countries that filed the most patents were the US, Japan and Germany. Since 2019, however, China has been the largest filer of patents with the World Intellectual Property Organization (WIPO), surpassing the USA, Japan and Germany. South Korea has also emerged as one of the top five patent producers [1]. Companies around the world are looking for a synthesis of information from this data deluge. They rely not only on industry experts to provide the technological know-how but also on patent engineers or analysts to analyze the intellectual property (IP) of a particular company or of a whole industry. Their aim is to understand the activities of the main players as well as the fields in which they dominate. Creating such a detailed patent landscape is time-consuming and complex; however, the end result can provide deep insights into the technology and the market.

I have come across several thorough patent landscapes that predicted emerging technologies quite accurately. However, I have found mixed results for semiconductor road maps, especially those related to advanced logic devices. Specifically, some of the major technological breakthroughs in advanced logic devices were not predicted in time by market analysts or industry experts. The most striking example is the finFET device (a tri-gate transistor in which the gate wraps around the silicon fin for better control of the channel), which Intel introduced in 2012 in its i5-3550 processor and which came as a complete surprise to the industry.

The story gets even more interesting after the introduction of finFET devices. Very quickly there were multiple reports that finFET devices would not be extendable beyond the 10 nm node. Solutions were proposed in public forums such as IEEE papers and the IEDM and VLSI conferences. Needless to say, before each proposed solution appeared in the public literature, multiple patents related to it had been filed by all the major device manufacturers. All the patents and non-patent literature could be grouped into two categories: new materials or new device architectures. They either discussed new materials with existing technologies or suggested radical solutions in which new device architectures were fabricated with new materials. For example, some of the serious propositions with prototype data were the following device structures: ultra-thin-body (UTB) field-effect transistors (FETs) based on silicon-on-insulator (SOI), gate-all-around (GAA) devices involving nano-wires or nano-sheets stacked horizontally or vertically, tunneling FETs (TFETs), and stacked FETs. Meanwhile, the materials work mainly focused on silicon-germanium (SiGe) replacing the silicon (Si) channel for PMOS, or on using III-V compounds. However, today we are at the 7 nm node, slowly transitioning to the 5 nm node, and still moving forward with the original finFET configuration.

I wondered why these predictions were inaccurate and came to the following conclusions. Firstly, all these suggested devices, in spite of their strengths, had some serious drawbacks too. The ultra-thin-body (UTB) architecture offered the possibility of back biasing and also had low power consumption, but the initial wafer cost was high at the time. UTB is not used now, although SOI-based technology is currently widely prevalent in the market despite not being used in high-speed processors.

Similarly, the GAA concepts provided better electrostatic control of the channel but required two materials that could be deposited on top of each other, each with a very different etch selectivity for the same etching chemistry. The onus on deposition and etching was high, which made the overall process flow very expensive. Vertical GAA FET devices were especially hindered by their requirements: they needed a major integration change because the wire-shaped channel regions were perpendicular to the substrate, which meant that the source and drain regions were not in the same plane. This implied additional deposition and etching steps, which would make the manufacturing of advanced logic devices even more expensive.

Regarding TFETs, there was the promise of a sub-threshold slope of 55mV/dec, below the roughly 60mV/dec room-temperature limit of a conventional MOSFET, which could open up new applications in low-power computing. Unfortunately, TFET devices based on band-to-band tunneling lacked a robust drive current.

Next, let us consider stacked FET devices. This idea had been floating around technical forums for a long time. In this concept, transistors are stacked one on top of another. Either the transistors are made on separate wafers and bonded, or they are fabricated directly on the lower layer of transistors. This requires good bonding techniques or proper control of the thermal budget for the top devices. Additionally, controlling the implant process could be difficult on the stacked layer. Back in 2012, the solutions were not ready.

What about SiGe replacing Si? Most of the patents filed and literature submitted highlighted two possible scenarios, both of which involved integration methods after fin formation. One requires growing SiGe on the side walls, while the other involves recessing the fins between the isolation structures and growing SiGe on top of the fin (see figure 1). Both methods required additional mask sets and numerous process steps, which suggested that the end result would be expensive.
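The TFET slope figure quoted above is easier to judge against the thermionic limit of a conventional MOSFET, which is (kT/q)·ln(10) per decade of current. The short Python sketch below evaluates that textbook expression; the function name is made up for this example, and 300 K is assumed as room temperature.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # exact SI value
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value


def thermionic_slope_limit_mv_per_dec(temperature_k):
    """Ideal MOSFET sub-threshold swing, (kT/q) * ln(10), in mV per decade."""
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return thermal_voltage * math.log(10) * 1000.0


if __name__ == "__main__":
    # Assumed room temperature of 300 K; the result is a little under 60 mV/dec.
    print(round(thermionic_slope_limit_mv_per_dec(300.0), 1))
```

Because a TFET switches by tunneling rather than by thermionic emission over a barrier, it is not bound by this limit, which is why a slope in the mid-50s mV/dec was considered promising for low-power logic.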

If you look at the track record of semiconductor manufacturers, it becomes evident why none of these concepts ever made it into the mainstream. The continuous miniaturization, or scaling, of devices has kept the transistor count trend in line with Moore’s law even today [2]. Scaling is the shrinkage of all the dimensions of the metal-oxide-semiconductor field-effect transistor (MOSFET). Every time semiconductor manufacturers were faced with process challenges or design difficulties due to scaling, they asked what the smallest change to the integration scheme would be that still allowed the existing tool set and process flows to be used in the new technology node. They also had to consider whether the new processes being introduced could be extended to future nodes. The strategy is that, in every technology node where a new process-integration step is introduced, the majority of the other process steps are kept unaltered. The direct result of this strategy is that with each new generation the process flow becomes more stable and reliable.
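The link between dimensional shrink and transistor count mentioned above is simple geometry, and a small Python sketch makes it concrete. This is an idealized model that assumes every linear dimension shrinks by the same factor, which real process nodes do not do exactly; the function names are hypothetical.

```python
import math


def density_gain(linear_shrink):
    """Transistor density multiplier when every linear dimension scales by linear_shrink."""
    return 1.0 / (linear_shrink ** 2)


def shrink_for_density(gain):
    """Linear shrink factor needed for a given density multiplier."""
    return 1.0 / math.sqrt(gain)


if __name__ == "__main__":
    # Doubling density needs each dimension to shrink to about 0.707 of its size.
    print(round(shrink_for_density(2.0), 3))
    # A 0.7x linear shrink gives slightly more than 2x density.
    print(round(density_gain(0.7), 2))
```

This is why successive node names have historically stepped down by a factor of roughly 0.7, although node names today are labels rather than physical dimensions.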

This strategy of minimum change for every new generation is well exemplified in Intel’s processors. Intel’s 22 nm node had the 5th generation of strained silicon engineering, with raised source-drain having embedded graded SiGe for the PMOS channel and embedded Si for NMOS. Similarly, for channel and gate engineering, high-k dielectrics with replacement metal gates were introduced at the 45 nm node, further improved at the 32 nm node and finally implemented in the 22 nm finFET structure. Intel has maintained the same finFET architecture up to 10 nm. Yet the device performance has improved and the number of transistors per unit area has increased. The case of TSMC is equally impressive: TSMC introduced finFET devices at the 16 nm node in the iPhone 7 processor in 2016, and has since produced three new generations of finFET devices. According to its press release, it will also continue to use finFET devices at 5 nm [3].

Needless to say, the devil is in the details; detailed structural analyses are needed to understand the process evolution. Even though the finFET configuration has remained the workhorse since 2012, the evolution of the integration process flow and the design layout is impressive. In a broad sense, most of the changes and new process steps in advanced logic nodes take place near the gate structure, especially in the lowest interconnect structure closest to the gate. A glimpse of the process sophistication can be had from an old Intel presentation, along with Mr. Dick James’ comments on Intel’s 10 nm process, which include cross-sections and detailed explanations of the changes in contact formation [4]. This article highlights how, by changing the layout and the integration scheme, the standard cell could be shrunk, thus increasing the number of transistors per unit area. A detailed survey of finFET process technology from 14 nm to 10 nm is collected in a presentation from Siliconics [5]. This presentation is full of cross-sections and detailed explanations, and is quite a treasure trove. It elaborates on some of the major innovations that have been introduced in finFET devices. For example, it discusses fin geometry and pitches, the work function metal layers of NMOS and PMOS transistors, the solid-source diffusion punch stop and its role, the introduction of novel materials in the lower interconnect structure, the structure of dummy gates at the fin end, post-patterning fin removal, the arrival of super vias that connect directly from metal 1 to the gate without the need for an intermediate metal 0 layer, the implementation of multi-stage contacts to the source-drain regions, the introduction of quadruple patterning for the front end, and air gaps in the back-end-of-line. Figure 2, taken from this presentation, shows a variety of contacts, which is only one of the novelties in finFET devices. And of course each of these process steps is backed by a family of patents. This illustrates the point that massive innovations were implemented on the same finFET device configuration.

Predicting near-future technologies for semiconductor devices would require looking for patents that make incremental changes yet affect the cell area or the layout of the interconnect structure closest to the gate. Such patents enable further miniaturization without much disruption to the integration flow, thus keeping the manufacturing cost low. Modern technology will make it faster and more effective to use patents to predict the near-future technologies of semiconductor devices. Related ideas are already being tried out with the help of deep learning, as in the case of Google, which announced that it is experimenting with artificial intelligence to make more efficient chips. It is not looking for radical changes in device structures but rather optimizing what is available [6]. Semiconductor technology has never stopped innovating and will not stop surprising us, and a thorough understanding of current process steps and their corresponding patents could be key to predicting what is still to come.

The ideas expressed in this article are solely the opinion of the author and do not represent the author’s employer or any other organization with which the author may be affiliated.

References

[1] https://twitter.com/WIPO/status/1247498105135566848

[2] https://www.semiconductor-digest.com/2020/03/10/transistor-count-trends-continue-to-track-with-moores-law/

[3] https://www.tsmc.com/english/dedicatedFoundry/technology/5nm.htm

[4] https://newsroom.intel.com/newsroom/wp-content/uploads/sites/11/2017/09/10-nm-icf-fact-sheet.pdf and https://sst.semiconductor-digest.com/chipworks_real_chips_blog/2017/04/10/intel-unveils-more-10nm-details/

[5] https://nccavs-usergroups.avs.org/wp-content/uploads/JTG2018/JTG718-4-James-Siliconics.pdf

[6] https://www.zdnet.com/article/google-experiments-with-ai-to-design-its-in-house-computer-chips/