
DesignWare IP as AI Building Blocks

by Alex Tan on 09-11-2018 at 7:00 am

AI is disruptive and transformative, and its impact is increasingly visible in business transactions and many aspects of our lives. While machine learning (ML) and deep learning (DL) have acted as its catalysts on the software side, GPUs and, more recently, dedicated ML/DL accelerators are spreading across the hardware space to deliver the performance leaps and growing capacity that compute-hungry AI data processing demands.

For the past few decades, EDA has bridged the hardware and software spaces, transforming many electronic design challenges into solutions that became key enablers for their adopters to push the technology envelope further. Sitting at this HW/SW crossroad, many EDA providers have embraced AI, either by augmenting their solutions with ML-assisted analytics or by expanding their features to enable AI-centric designs and applications.

AI Enablers and Challenges
According to NVIDIA CEO Jensen Huang, the rise of AI has been enabled by two concurrent disruptive dynamics: a change in how software development is done, by means of deep learning, and a change in how computing is done, through the broader adoption of the GPU as a replacement for the single-threaded, multi-core CPU, which no longer scales to satisfy today’s increased computing needs.

Let’s consider these two factors further. First, the emergence of more complex DL algorithmic models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) has made AI-based pattern recognition applications, such as embedded vision and speech processing, more pervasive.
Training and inference, the two inherent ML steps that mimic aspects of human cognitive development, are subject to different challenges. The former requires sufficient memory capacity and bandwidth, while the latter demands latency optimization as it deals with irregular memory access patterns.


Second, on the hardware side, a CPU typically has a few to several dozen processing units, some shared caches and control units. A CPU is general-purpose but has limited throughput. In contrast, a GPU employs numerous (hundreds to thousands of) processing units with their own caches and control units, dedicated to performing specific work in parallel. This massive array of compute power and parallelism can absorb the workload presented by deep learning applications. The GPU is well suited to the training step, which requires a floating-point engine, while the inference portion may rely on a reduced-accuracy or integer-based data path. Challenges to the GPU architecture include the availability of adequate interconnect speed and bandwidth.
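The floating-point-for-training, integer-for-inference split can be illustrated with a toy sketch. The NumPy snippet below (illustrative only; production flows use calibrated quantization schemes, and the matrix sizes are made up) quantizes "trained" float32 weights to int8 and shows that the reduced-precision product stays close to the full-precision result:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Trained" float32 weights and an input activation vector
W = rng.standard_normal((4, 8)).astype(np.float32)
x = rng.standard_normal(8).astype(np.float32)

# Symmetric quantization: map [-max|W|, +max|W|] onto the int8 range [-127, 127]
scale = np.abs(W).max() / 127.0
W_int8 = np.round(W / scale).astype(np.int8)
x_int = np.round(x / scale).astype(np.int32)  # activations kept in int32 for simplicity

# Integer inference: int8 weights, int32 accumulation, rescale at the end
y_fp32 = W @ x
y_int = (W_int8.astype(np.int32) @ x_int) * (scale * scale)

# A small accuracy loss in exchange for much cheaper compute and bandwidth
print(np.max(np.abs(y_fp32 - y_int)))
```

The integer path moves a quarter of the weight bytes of the float32 path, which is precisely the bandwidth saving that inference engines exploit.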

Aside from these two factors, the advent of FinFET process technology has also played a major role in accommodating the integration of billions of devices into the active silicon die area allowed by the foundries.

SoC and DesignWare Building Blocks for AI
The Synopsys DesignWare® IP portfolio has been a foundation of chip design implementation for over two decades, containing technology-independent design IP solutions optimized for power, performance and area (PPA). As DesignWare sits at the epicenter of the ever-evolving SoC implementation space, the list of DesignWare building block IP solutions has grown over the years, from a handful of compute primitives such as adders and multipliers to increasingly complex IP blocks such as microcontrollers, interface protocols (AMBA, USB, etc.) and eventually embedded microprocessors, interconnects and memories. Its integration has also been expanded, covering not only synthesis but also virtual prototyping and verification IP.

Designing SoCs targeted at AI-related applications requires specialized processing, memory performance and real-time data connectivity. Recently, Synopsys upgraded its DesignWare IP portfolio to include DL building blocks for each of these areas.

Specialized Processing
Specialized processing solutions include embedded processors and tools for scalar, vector, and neural network processing. To handle AI algorithms efficiently, machine vision has relied on heterogeneous, pipelined processing with varying degrees of data-centric parallelism.

As illustrated in figure 3, there are four stages associated with visual data processing. The pre-processing step has the simplest data parallelism, while the precise-processing step has the most complex, requiring strong matrix multiplication capabilities. Such unique processing needs can be served by Synopsys DesignWare ARC® processors. The DesignWare ARC EV6x processors integrate a high-performance scalar RISC core, a vector DSP, and a convolutional neural network (CNN) engine optimized for deep learning in embedded vision applications. The ARC HS4xD and EMxD processor families combine RISC and DSP processing capabilities to deliver an optimal PPA balance for AI applications.
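The varying degrees of parallelism across such a pipeline can be sketched in a few lines of NumPy (the stage boundaries and operations below are illustrative assumptions, not the ARC EV6x programming model): a regular per-pixel pre-processing stage, a stencil-style feature extraction stage, and a matmul-heavy classification stage:

```python
import numpy as np

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(32, 32)).astype(np.float32)  # toy 32x32 image

# Stage 1 -- pre-processing: simple per-pixel ops (regular, easily parallel)
normalized = frame / 255.0

# Stage 2 -- feature extraction: a 3x3 Laplacian stencil (vector-DSP-style work)
kernel = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)
features = np.zeros_like(normalized)
for i in range(1, 31):
    for j in range(1, 31):
        features[i, j] = np.sum(normalized[i-1:i+2, j-1:j+2] * kernel)

# Stages 3/4 -- "precise processing": a dense matrix multiply, as in a CNN
# classifier layer, the stage with the most complex data parallelism
weights = rng.standard_normal((10, 32 * 32)).astype(np.float32)
scores = weights @ features.reshape(-1)  # ten class scores
print(scores.shape)
```

Each stage maps naturally onto a different engine: scalar control code, vector DSP lanes for the stencil, and a dedicated CNN engine for the matrix work.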

Memory Performance
AI models demand a large memory footprint, contributing to the overall silicon overhead, as training neural networks requires massive memory space. Synopsys’ memory IP solutions include efficient architectures for different AI memory constraints such as bandwidth, capacity and cache coherency. The DesignWare DDR IP addresses the capacity needed for data center AI SoCs.

Furthermore, despite the sustained accuracy promised by AI model compression through pruning and quantization techniques, their adoption introduces irregular memory accesses and compute intensity peaks, both of which degrade overall execution and system latency. This also drives the need for more heterogeneous memory architectures. DesignWare CCIX IP enables cache coherency with virtualized memory capabilities for AI heterogeneous compute and reduces latency in AI applications.

Parallel matrix multiplication and the increased size of DL models and coefficients commonly necessitate external memories and high-bandwidth access. The DesignWare HBM2 IP addresses the bandwidth bottleneck while providing optimized off-chip memory access measured in picojoules (pJ) per bit. A comprehensive list of embedded memory compilers is also available, enabling high-density, low-leakage and high-performance on-chip SRAM options as well as TCAM and multi-port flavors. Most of them are also ported to the 7nm FinFET process node.
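A rough arithmetic-intensity estimate shows why large matrix multiplications end up bandwidth-bound (the layer size and the fp16, single-sample assumptions are illustrative):

```python
# FLOPs-per-byte estimate for a matrix-vector product y = W @ x (W is m x n),
# as in single-sample DL inference with fp16 (2-byte) operands.
def arithmetic_intensity(m, n, bytes_per_elem=2):
    flops = 2 * m * n                               # one multiply-add per weight
    bytes_moved = (m * n + n + m) * bytes_per_elem  # W, x and y traffic
    return flops / bytes_moved

# A 4096x4096 fp16 layer lands at ~1 FLOP per byte: nearly every operation
# needs a fresh byte from memory, so throughput is limited by bandwidth,
# not by the compute engine.
print(round(arithmetic_intensity(4096, 4096), 2))
```

Since the weight matrix dwarfs the vectors, intensity stays near one FLOP per byte regardless of layer size, which is why bigger models demand proportionally more memory bandwidth.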

SoC Interfaces and Real-Time Data Connectivity
Synopsys also provides the reliable interface IP solutions needed between sensors and processing engines. For example, real-time data connectivity between embedded vision sensors and a deep learning accelerator engine, normally performed at the edge, is crucial as it is power sensitive. Synopsys offers a broad portfolio of high-speed interface controllers and PHYs, including HBM, HDMI, PCI Express, USB, MIPI, Ethernet and SerDes, to transfer data at high speeds with minimal power and area.

Although DL SoCs have to deal with fluctuating energy per operation due to massive data movement on and off chip, combining the power management supported by the memory IP with the adoption of advanced FinFET process nodes such as 7nm can provide effective overall power handling.
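A back-of-the-envelope calculation (with assumed, order-of-magnitude pJ/bit figures; actual values depend heavily on process node and interface) shows why off-chip data movement dominates the energy budget:

```python
# Illustrative (assumed) energy costs in picojoules per bit -- the exact
# values vary by node and interface, but off-chip DRAM access is commonly
# cited as roughly two orders of magnitude costlier than on-chip SRAM.
PJ_PER_BIT = {"on_chip_sram": 0.1, "off_chip_dram": 10.0}

def transfer_energy_mj(gigabytes, memory):
    """Energy in millijoules to move `gigabytes` of data to/from `memory`."""
    bits = gigabytes * 8e9
    return bits * PJ_PER_BIT[memory] * 1e-12 * 1e3  # pJ -> mJ

# Moving a hypothetical 100 MB block of coefficients once:
print(transfer_energy_mj(0.1, "off_chip_dram"))  # ~8 mJ from DRAM
print(transfer_energy_mj(0.1, "on_chip_sram"))   # ~0.08 mJ from SRAM
```

Repeated across millions of inferences, that hundredfold gap is what makes keeping coefficients on chip, and power-managing the memories, worthwhile.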

As a takeaway, designing for AI applications requires understanding not only the software or application intent but also the underlying data handling, in order to deliver accurate predictions within reasonable compute time and resources. Having Synopsys’ silicon-proven DesignWare IP solutions as building blocks helps accelerate design realization while giving design teams opportunities to explore an optimal fit to increasingly heterogeneous design architectures.

For more info on Synopsys DesignWare IP, please check HERE.


Affordable EDA Tools for IoT Designs, Guess which Vendor

by Daniel Payne on 09-10-2018 at 12:00 pm

I just had to drive my car 7 miles from Tualatin, Oregon to visit with an EDA veteran who has played many diverse roles in his career, including IC mask designer, layout manager, account manager, business development, director, and foundry relations director. His name is John Stabenow, with Mentor, a Siemens Business, and we met in Wilsonville, Oregon last month to talk about Tanner EDA. I’ve known John for a while, and he’s worked with all three major EDA vendors over the years, so he has a really deep perspective, especially on the IC side.

Q&A

Q: What is the sweet spot for companies using Tanner EDA tools these days?

Our IC design customers are using 28nm and above nodes, often building diverse IoT systems that may even use MEMS, RF and require AMS IP.

Q: Can you name a trend among Tanner EDA customers doing IoT designs?

Sure, in the past there were separate companies doing sensor design and chip design, but now we see these sensor component companies branching out into doing their own chip designs that connect to the sensors. The IoT is really about sensor-driven design, and IoT edge systems are using lots of sensors.

Q: How about getting started with an IoT that uses a processor core?

We’ve partnered with ARM so that you can do a core-based design concept using an ARM M0 or M3 at no cost, in order to get your project started. It’s part of the ARM DesignStart program and we started this back in 2016.

Q: I mostly think about Tanner EDA as full-custom IC tools, so how do you get digital tools like P&R or synthesis?

Mentor has a lot of digital implementation technology, so we make available a version of Nitro for P&R and Oasys-RTL for synthesis to Tanner EDA users. For smaller designs you can choose to use the Tanner EDA Place and Route tool within the Tanner L-Edit tool.

Q: Are Tanner tools only available on the Windows platform?

Historically the Tanner EDA tools were first offered on the Windows platform, then we’ve expanded that to include Linux using the Wine technology. On a side note, 2018 marks Tanner’s 30th anniversary, which I think makes Tanner a little older than Virtuoso.

Q: Where do I get a PDK when using Tanner tools?

We have a PDK team at Tanner where they create, QA and migrate all of the kits, collaborating with the foundries to create iPDKs.

Q: Any new tools coming out of Tanner recently?

You bet, there’s Tanner Designer, a tool for analog verification management where you can track all of your tests in one place and determine how complete your test plan is. The tool uses an Excel interface, so it’s intuitive to learn and set up, and it supports simulators like T-Spice, Eldo and AFS.

Q: Mentor has a lot of SPICE circuit simulators, so what do you recommend for Tanner users?

It depends a little on the customer. For some of our customers, we suggest giving T-Spice a try for day-to-day usage, then at sign-off you can switch to either the Eldo or AFS simulators. IoT designers doing RF circuits will want to work with Eldo RF. Our enterprise customers are asking for Eldo or AFS, so we have integrations to both.

Q: Can I do photonic chip designs with Tanner tools?

Yes, you use a combination of the L-Edit tool and Luceda IPKISS.eda to enable photonic IC layout design, for things like an Arrayed Waveguide Grating (AWG) with our Filter Toolbox, and you can also use our library of photonic components. We also have a strong partnership and mutual customers with Lumerical.

Q: What should I expect in future releases of Tanner tools?

You will see layout productivity improvements that will include layout generators, stay tuned.

Q: I know that Mentor acquired Tanner EDA, but now Mentor was acquired by Siemens, so how’s that all going?

The Siemens acquisition of Mentor has been one of the smoothest transitions ever; we still have our Tanner identity and are in growth mode in both product revenue and number of customers. Tanner EDA is bringing Siemens new customers that they have never seen before: about 35-40% of new Tanner customers are new to Mentor. This acquisition by Siemens has been good for Tanner EDA.

Q: SoftMEMS was a partner of Tanner EDA, so how is that relationship doing?

Tanner EDA and SoftMEMS are still active and collaborating quite well in the field. There’s an on-demand webinar that shows you more about the technology.

In the MEMS area we’re winning new customers, even beating Coventor tools.



The Rebirth of Dolphin Integration!

by Eric Esteve on 09-10-2018 at 7:00 am

You may have seen the press release (see below) announcing that Dolphin Integration (Dolphin) has been acquired by Soitec (60%) and MBDA (40%); you can find more information about these two companies at the bottom of this blog. Founded in 1985, Dolphin had recurrent cash flow issues during the last couple of years, which it bridged with bank loans. Even though the company was doing profitable business, the bottom line was negative.

As frequently happens in the US when a company goes into Chapter 11, Dolphin Integration’s management initiated an insolvency recovery process in June. Dolphin is one of my customers; I know their IP portfolio and also their new CEO since February, Christian Dupont, which is why I want to share my personal opinion about the company.

This acquisition is a great opportunity for Dolphin to deploy their new energy-efficient-IP-based strategy and satisfy their European customers in the defense/aeronautics industry by delivering top-class ASIC design and supply services. Let’s concentrate on their impressive IP portfolio: Dolphin has developed 600 IPs (mostly mixed-signal) for about 500 customers worldwide! These IPs can be ranked in three categories: Foundation, Feature and Power Management IP.

The Foundation IP, libraries and memory compilers optimized for the best trade-off between power consumption and area, supports the major silicon foundries all the way down to 22nm. Such foundation IP is even available foundry-sponsored in major nodes, and Dolphin’s development team has a very good reputation. In fact, there is no option other than targeting the highest quality level when you deliver this type of IP to a foundry, as their customers will use it to design their SoCs. No doubt this business will expand if you consider the multiple process variants built around a single process node and the diverse technology options, bulk or FD-SOI.

By Feature IP, Dolphin means an IP family, mostly built to support audio chips, where mixed-signal design expertise is key. Thanks to the company’s 30+ years of experience in this field, which requires a unique understanding of application-related audio challenges, they can do better and faster than newcomers, leading to better differentiation and faster TTM for their customers.

The Power Management IP family deserves an explanation: these are all the functions you have to use to design energy-optimized ICs efficiently. In my opinion, this family has huge growth potential, as it targets most applications and is critical for battery-powered SoCs. Decreasing power consumption is becoming the next goal, much like offering ever-higher frequencies was in the 2000s. In fact, we should say improving power efficiency; that’s more accurate.

In the golden age of Moore’s law, a chipmaker just had to re-design to the next technology node to automatically benefit from higher performance and lower power consumption at the same time! This is still true, but only for those who can afford to move from 10nm to 7nm, and that is not the majority. The others will have to be smarter; integrating Dolphin’s Power Management IP is one way to design more power-efficient SoCs. This is true for IoT edge SoCs and battery-powered chips in general, and it is also true for advanced automotive applications, where power consumption is a real concern.

By the way, Dolphin already has more than 10 customers in the automotive segment, and there is no doubt that they will quickly grow this customer base. The company is European-based, which will certainly help.

Let’s mention this quote, from E. Lozano, Sr. Director of Business Development and IP Solutions with Open Silicon: “Dolphin Integration is a leading silicon IP provider for low power IoT SoCs. In view of the growing demand for low power consumption in IoT devices, we intend to leverage the unique solutions offered by Dolphin Integration to meet the challenges of our IoT customers.”

From an organizational standpoint, Dolphin will operate independently of Soitec and MBDA, while obviously benefiting from each company’s ecosystem. But that doesn’t mean Dolphin will limit itself to FD-SOI-based SoCs! It would be a mistake to do so. They are and will stay multi-foundry, and they will follow their customers in terms of process or CPU choice.

In terms of positioning, Dolphin will have the opportunity to focus its marcom resources where they are needed the most to sell on a worldwide market: the IP products. The group in charge of ASIC design services will satisfy existing customers in Europe but does not have the charter to develop a worldwide business.

In fact, Dolphin’s strategy is simply to be the best “Energy Efficient IP Company”. If you look at their portfolio, that makes complete sense: Power Management IP allows architecting SoCs for better power efficiency. This is well complemented by their Feature IP, such as audio IP (frequently integrated into battery-powered ICs), and power-optimized libraries and memories. Power efficiency is becoming as important as pure performance, as you can see in this extract from an August 2018 Huawei press release about the Kirin 980:

“Debuting with the Kirin 980, Mali-G76 offers 46 percent greater graphics processing power at 178 percent improved power efficiency over the previous generation.”

Despite this summer’s insolvency recovery process, Dolphin’s acquisition by Soitec and MBDA is a real opportunity for the company to rebound and focus on what they do best: efficient IP development. Their positioning on power-efficiency IP targeting battery-powered SoCs, as well as automotive and probably infrastructure applications (which suffer from power dissipation limitations), is well aligned with the next decade’s strong market trend toward power-efficient SoCs (in my opinion).

To see the PR, use this Nasdaq link

By Eric Esteve from IPnest

About Soitec
Soitec (Euronext, Tech 40 Paris) is a world leader in designing and manufacturing innovative semiconductor materials. The company uses its unique technologies and semiconductor expertise to serve the electronics markets. With more than 3,000 patents worldwide, Soitec’s strategy is based on disruptive innovation to answer its customers’ needs for high performance, energy efficiency and cost competitiveness. Soitec has manufacturing facilities, R&D centers and offices in Europe, the U.S. and Asia.

Soitec and Smart Cut are registered trademarks of Soitec.
For more information, please visit www.soitec.com and follow us on Twitter: @Soitec_EN

About MBDA
MBDA is the only European group capable of designing and producing missiles and missile systems that correspond to the full range of current and future operational needs of the three armed forces (land, sea and air).

With a significant presence in five European countries and within the USA, in 2017 MBDA achieved revenue of 3.1 billion euros with an order book of 16.8 billion euros. With more than 90 armed forces customers in the world, MBDA is a world leader in missiles and missile systems. In total, the group offers a range of 45 missile systems and countermeasures products already in operational service and more than 15 others currently in development.
MBDA is jointly owned by Airbus (37.5%), BAE Systems (37.5%), and Leonardo (25%).
For more information, please visit www.mbda-systems.com


Is the Q4 Bounce Back now a 2009 Recovery?

by Robert Maire on 09-10-2018 at 7:00 am

Last week saw a unique confluence of events that continued the negative news flow in semicap following the story about GloFo. At a financial conference, Micron’s CFO said NAND prices were declining, on top of an analyst note that morning about the same issue. This should be no surprise, as memory has had an overly long, strong run.

Then KLAC’s CFO tempered expectations for the balance of the year due to the pushout of DRAM business that was already announced and well known in the industry (Samsung’s pushout of spending).

Also read: GLOBALFOUNDRIES Pivoting away from Bleeding Edge Technologies

Separately, either topic is well known and already out in the public domain, but being “announced” at the same time, in the first conference back from the summer, sets an ominous tone and acts like a catalyst.

Micron announcing declining NAND prices is like pouring gasoline on the already dry tinder of memory concerns then ignited by a cigarette flicked out the window by KLAC’s comments.

We now have a meltdown, even though it’s not a lot of new news, and these are things we have been talking about for a long time.

Dead Cat Bounce Done?
Concerns about memory seem to have faded over the summer during vacations when the market is usually quiet but the issue never really went away. We have been very clear that the memory run was too strong for too long and well overdue for a correction that we had already started to see months ago.

Falling NAND or DRAM pricing should be no surprise, but perhaps having multiple comments on the first day back is a bit too much.

Stocks had recovered a bit while investors took their eye off memory concerns, and we had a little bit of a dead cat bounce, but now it sounds like multiple doctors are declaring the cat truly dead.

Q4 “Snapback” not so “Snappy” now
While the vast majority of analysts and bulls on the stocks rushed out in support of a one-quarter downturn, we took a more conservative view, suggesting we were very “dubious”, as it would make absolutely no sense for Samsung to push out only one quarter. It now seems our more conservative view may be borne out, as we are now talking about a 1H19 recovery (maybe). CQ3 may still be the “trough”, but the length of the trough and the angle of recovery out of it are more questionable.

GloFo not that impactful..

The reality of last week’s news about GloFo dropping out of the 7nm race, while surprising, is not hugely impactful on the revenues of semicap companies. Most tool makers had long ago discounted GloFo as a major player and discounted the revenue expectations from them. The real issue is more that it is additional negative news, making the ability to make up for a memory shortfall from the foundry side all the more difficult.
So while the GloFo news itself is not that negative, it is when taken in the context of the overall negative news flow we are experiencing.

KLAC bulletproof vest weakened
KLAC’s “defensive” play, which up until now has been working, is obviously taking a bit of a hit. KLAC’s stock, which has held up far better than AMAT and LRCX due to its lower memory exposure, is now leading the group lower today. The lower-memory-exposure premium the stock has enjoyed will obviously be reduced now that the same memory issue is impacting KLAC’s financial performance.

To be clear, KLAC still has much less memory impact than AMAT or LRCX, but it does not have zero impact. The foundry side of KLAC’s business is unaffected and still very strong.

The stocks
Today is an ugly wake up call after a relatively quiet semiconductor summer. There is not a lot of new news here but putting it all together in one place and time is obviously overwhelming the stocks.

We have been very clear, for a long time, that memory was too strong for too long and had to cool off. Whether it cools slowly or takes a polar bear plunge matters little. It may actually have been better for memory to drop quickly and get it over with, rather than reliving the fear of memory pricing dropping over and over again.

Likewise with the semicap stocks: it’s time to get over the fact that the industry is still cyclical and goes up and down. Customer concentration is making that worse. It is also naive to assume only a one-quarter, or one-company, slowdown. All the cyclical downturns we have seen have had more than one negative issue.

We had said that AMAT had downside to the low $40s, and we are there. We had also said that LRCX could have downside to $150ish, but we are still a bit above that. KLAC may have lost the roughly 10% premium that it has held as a “defensive” play.

Micron is very, very cheap and getting cheaper. The negative memory discount has been applied too many times as the company is still making a lot of money.

We probably have another 6 weeks or more until companies report and we get a sigh of relief as fears turn into reality which is usually less negative than the anticipation itself.

In the meantime we find it hard to catch a falling spear and buy into this continuing flow of negative news especially given the skittishness of the overall market.


Mentor Graphics Makes a Transition

by Daniel Nenni on 09-07-2018 at 12:00 pm

This is the fourteenth in the series of “20 Questions with Wally Rhines”

I joined Mentor Graphics (now Mentor, A Siemens Business) in late 1993. Tom Engibous, one of my direct reports at TI, was promoted to replace me as head of the semiconductor business of TI, and I moved on to what I knew would be a real challenge: the rescue of an EDA company that had committed to a strategy that was likely to fail; I knew all this because I was a large customer of Mentor at TI. But my wife thought that Portland would be a good place to raise our very young children, and Jerry Junkins, the CEO of TI, made it clear to me that any succession to his role wouldn’t happen for at least ten years because of his own career plans.

I came to Mentor with an optimistic view. After all, most companies that have failed product generations can quickly shift to other innovations they have on the shelf and re-generate their momentum. Not so with Mentor’s Version 8.0 Falcon (later referred to as Version Late dot Slow). There wasn’t a lot on the shelves to build upon and almost everyone in the company had been moved to the Falcon project to try to save it. But the shelves were not totally bare.

My first interest at Mentor was emulation. After all, I knew that Mentor had the best emulation technology in the industry, having visited there to observe it before. When I arrived at Mentor and asked about it, everyone started checking his shoe polish. Unfortunately, Mentor had sold its leading emulation technology, along with the patents, to QUICKTURN, leaving only a very limited ability to compete.

I then turned to physical verification. After all, I had signed the contracts for Mentor to OEM TI’s physical verification software while I was at TI, and it had been a reasonable recovery from Mentor’s loss of Dracula (their OEM solution) when Cadence acquired ECAD and terminated Mentor’s OEM agreement. TI was not interested in extending the OEM arrangement with Mentor to the next generation so we bought out their rights and in January 1994, we had a big kickoff meeting to develop the next generation of physical verification, headed by Laurence Grodd for the physical verification and Koby Kresh for the Logic to Schematic verification. In addition to the fact that Laurence was brilliant, we had the benefit that he had maintained a database of designs that were verified using “Checkmate”, the Mentor name for the product we OEM’d from TI. Laurence could handle hundreds of variants in design style. He proceeded to innovate innumerable approaches to physical verification including selective promotion and other things that are routine today; unfortunately, Mentor didn’t file any patents. So ISS, a company in North Carolina that was ultimately acquired by AVANTI, adopted many of these approaches, including hierarchical forms of analysis.

Internal politics were also a factor, as they always are in large companies. Mentor’s custom IC layout product, IC Station, was in a battle to beat Cadence’s product, Virtuoso. Our physical verification capability in IC Station came from Laurence, was called “IC Verify”, and was clearly superior to the competition. So why would we sell it stand-alone to competitors using Virtuoso? Subsequently, a copy of “Calibre” was sneaked out to AMD and their designers became excited by it. Meanwhile, the products that had grown out of the TI OEM continued to evolve, and Intel became a major customer. The war had begun. At the next DAC, a decision was made to display the new “Calibre” capability, and that was a decisive move, undercutting the roadmap that Intel was expecting. While the Intel surprise was upsetting for some in our sales force, Calibre clearly ushered in a whole new generation of physical verification.

The critical role at this time came to Brian Derrick, GM of the Physical Verification Division. Brian did something very innovative, and probably forbidden in most large companies. Brian worked directly with Danny Perng, a salesman in Taiwan who was interested in focusing on Calibre for the foundries, TSMC and UMC. Because our sales force knew that TSMC and UMC wouldn’t pay much for tools, the sales and support resources in Taiwan were insufficient to drive a foundry campaign. So without permission, Brian hired his own sales force to complement Danny’s effort. These specialists from the product division were able to convince the TSMC engineers, and later those at UMC, GLOBALFOUNDRIES, etc., that Calibre was superior to competitive approaches.

Simultaneously, Brian’s team concluded that optical proximity correction would be the next important extension of physical verification. Precim, a startup based in Portland, Oregon, was the leader, and they had captured the Intel account. Not to be defeated, Brian found the leading experts in the technology (going to UC Berkeley to find OPC Technology Inc. and hiring Nick Cobb to head up the development). These strategic moves created the basis for Mentor’s #1 position today in both physical verification and resolution enhancement.

Of course there were many more battles to win (and lots of fun yet to be experienced). Whenever I ask successful people in technology, including CEO’s, about the most enjoyable part of their careers, they almost always point to a period when they worked with a group that overcame the impossible and developed a product or capability that changed an industry. Calibre provided just such an experience for many, as did a number of other developments that emerged on Mentor’s path to recovery.

Mentor had undergone lots of problems and had moved from #1 to #3 in a competitive EDA industry. It was clear that we had found areas where we could be the de facto standard: Calibre physical verification, Tessent Design for Test, Expedition PCB design, Calypto/Catapult High-Level Synthesis, automotive embedded electronics, and eight others by the metric provided in the official Gary Smith EDA analyses. Fortunately, Synopsys eventually decided that they didn’t have to do everything; they could pursue new areas that Mentor was not pursuing. That allowed a level of diversification that had not been common in the EDA industry.

And, with that, the EDA industry started to change. Each major EDA company developed specialties, instead of spending all their time trying to take market share from each other. And they all became more innovative. If I could claim one contribution to the EDA industry, it would be this. We are now an industry that looks for capabilities that will help our customers, and then develops (or acquires) those capabilities, rather than just trying to take market share from each other.

The 20 Questions with Wally Rhines Series


Turnkey 2.5D HBM2 Custom SoC SiP Solution for Deep Learning and Networking Applications

Turnkey 2.5D HBM2 Custom SoC SiP Solution for Deep Learning and Networking Applications
by Daniel Nenni on 09-07-2018 at 7:00 am

Before we jump into the specifics, let us understand what’s driving custom solutions in the high performance computing and networking space: the growing demand for core capacity and greater performance, driven by the level of parallelism and multitasking required to handle the enormous amount of data traffic. According to market research, core counts have grown from just a few cores to more than 60. Memory and network bandwidth requirements will, by default, increase to keep pace with core capacity and performance. The same research shows memory bandwidth rising from 10GB/s to roughly 400GB/s, and network IP traffic growing from 90 Exabytes to close to 300 Exabytes.

HIGH BANDWIDTH MEMORY (HBM2) CONTROLLER AND PHY

All these factors are pushing the need for custom processors, custom SoCs and specialized memories, like HBM, in the high performance computing and networking market segments. There are several high performance applications that demand high bandwidth memory access. Some examples are data center, networking, artificial intelligence, augmented reality and virtual reality, cloud computing, neural networks and several other high end applications. An HBM solution is ideal for these applications for three key reasons:


  • It currently supports a huge bandwidth of up to 256GB/s
  • It improves the power efficiency per pin
  • It offers a massive reduction in space, resulting in a smaller form factor for the end product

Open-Silicon’s first HBM2 IP subsystem in 16FF+ is silicon-proven at a 2Gbps data rate, achieving bandwidths up to 256GB/s, and is being deployed in many custom SoCs. However, the data-hungry, multicore processing units needed for machine learning require even greater memory bandwidth to feed the processing cores with data. Keeping pace with the ecosystem, Open-Silicon’s next-generation HBM2 IP subsystem is ahead of the curve at 2.4Gbps in 16FFC, achieving bandwidths above 300GB/s.
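The bandwidth figures quoted here follow directly from the per-pin data rate and the HBM2 interface width. A minimal sketch, assuming the standard 1024-bit HBM2 interface (the function name and numbers are illustrative, not an Open-Silicon specification):

```python
# Peak HBM2 bandwidth from per-pin data rate and interface width.
# Assumes the standard 1024-bit HBM2 interface (8 channels x 128 bits).

HBM2_BUS_WIDTH_BITS = 1024

def peak_bandwidth_gbytes(data_rate_gbps: float,
                          bus_width_bits: int = HBM2_BUS_WIDTH_BITS) -> float:
    """Peak bandwidth in GB/s for a given per-pin data rate in Gb/s."""
    return data_rate_gbps * bus_width_bits / 8

for rate in (2.0, 2.4, 3.2):
    print(f"{rate} Gb/s per pin -> {peak_bandwidth_gbytes(rate):.1f} GB/s")
```

Running this reproduces the article’s numbers: 2Gbps gives 256GB/s, 2.4Gbps gives 307.2GB/s (above 300GB/s), and 3.2Gbps gives 409.6GB/s (above 400GB/s).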

This 7nm custom SoC platform is based on a PPA-optimized HBM2 IP subsystem supporting data rates of 3.2Gbps and beyond, achieving bandwidths above 400GB/s. It supports JEDEC HBM2.x and includes a combo PHY that will support both JEDEC-standard HBM2 and non-JEDEC-standard low-latency HBM. High speed SerDes IP subsystems (112G and 56G SerDes) enable extremely high port density for switching and routing applications, and high bandwidth inter-node connections in deep learning and networking applications. The DSP subsystem is responsible for detecting and classifying camera images in real time. Video frames or images are captured in real time and stored in HBM, then processed and classified by the DSP subsystem using the pre-trained DNN network.

One application that goes hand-in-hand with high performance computing is AI. AI is revolutionizing and transforming virtually every industry in the digital world. Advances in computing power and deep learning have enabled AI to reach a tipping point toward major disruption and rapid advancement. Custom SoC platforms enable AI applications through training in deep learning and high speed inter-node connectivity, by deploying high speed SerDes, a deep neural network DSP engine, and a high speed, high bandwidth memory interface with High Bandwidth Memory (HBM) within a 2.5D system-in-package (SiP). Open-Silicon’s implementation of a silicon-proven custom SoC platform is centrally located within this ecosystem.

About Open-Silicon
Open-Silicon is a system-optimized ASIC solution provider that innovates at every stage of design to deliver fully tested IP, silicon and platforms. To learn more, please visit www.open-silicon.com


A Fresh Idea in Differential Energy Analysis

A Fresh Idea in Differential Energy Analysis
by Bernard Murphy on 09-06-2018 at 7:00 am

When I posted earlier on Qualcomm presenting with ANSYS on differential energy analysis, I assumed this was just the usual story on RTL power estimation being more accurate for relative estimation between different implementations. I sold them short. This turned out to be a much more interesting methodology for optimizing total energy using ANSYS PowerArtist.

Yadong Wang of Qualcomm presented and owns power modeling and analysis for Adreno GPUs. Before that he was a hardware power engineer at NVIDIA, so he’s pretty experienced in this domain. He started by noting that the impact of power on heating is a big challenge in mobile GPUs. As you play a game on your phone, the temperature rises. Eventually thermal mitigation kicks in and clock speed drops; the game runs slower. The longer you play, the slower the game runs (down to some limit), which doesn’t make for great customer satisfaction. This is why thermal-constrained performance is becoming one of the most important KPIs in mobile design.

This is a dynamic power problem. Assuming you’ve done all you can to minimize leakage (through process selection and power islands), and you accept you want to avoid switching to lower voltage/frequency options for the reason cited above, you really have to direct most of your attention to minimizing redundant activity, which you pretty much have to do at RTL. This is the low-cost place to make design changes: you can iterate quickly on different options, and the impact of changes is generally much higher than for any fixes that are practically possible at implementation. Yadong uses ANSYS PowerArtist in his work.

The common approach to optimizing power in these cases is to run an analysis with some workload, look at the hierarchical breakdown of dynamic power components (switching power and internal power) through the design, then look for cases where there might be redundant activity, such as a clock toggling on a register when the data input to that register isn’t changing. This process works, but it doesn’t necessarily feel optimal. Power savings may not be possible, but you might not know that until you’ve done quite a bit of searching. Wouldn’t it be better to know at the outset whether there is an opportunity to reduce power on this function, and whether the potential for reduction is significant? That’s where Qualcomm’s approach is really clever.

The core of the method looks at energy (power integrated over time) rather than power. And instead of hunting for redundant toggles, the method tweaks the workload (my view) by inserting bubbles in the path of incoming transitions or outgoing responses, to mimic starvation or stalls. This draws out the simulated time for that workload and therefore the time over which power is integrated to yield total energy.

Now they compare that energy report with the same report from a bubble-free run. The bubble-free case runs for less time with a higher average power, while the bubbled case runs for a longer time with lower average power. Ideally, total energy for these cases should be identical. But if there is power inefficiency in the design, the longer run-time in the bubbled case will amplify that inefficiency. So you know up-front whether there is opportunity to reduce total energy and you also have an idea of how much reduction may be possible.
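The comparison boils down to simple arithmetic: total energy is average power multiplied by run time, computed for both runs. A minimal sketch of that calculation; the numbers, names, and units here are hypothetical illustrations, not PowerArtist output:

```python
# Sketch of the differential energy comparison: total energy =
# average power x run time, compared between a bubble-free baseline
# run and a bubbled run of the same workload.

def total_energy(avg_power_mw: float, run_time_us: float) -> float:
    """Total energy in nJ (mW x us = nJ)."""
    return avg_power_mw * run_time_us

def energy_inefficiency(baseline: tuple, bubbled: tuple) -> float:
    """Fractional excess energy of the bubbled run over the baseline.

    Each argument is (avg_power_mw, run_time_us). Ideally zero: the
    bubbled run trades a longer run time for lower average power.
    A positive value indicates redundant activity amplified by the
    longer run time, i.e., an opportunity to reduce total energy."""
    e_base = total_energy(*baseline)
    e_bubbled = total_energy(*bubbled)
    return (e_bubbled - e_base) / e_base

# Hypothetical example: baseline 50 mW for 100 us, bubbled 30 mW for 180 us.
print(f"{energy_inefficiency((50.0, 100.0), (30.0, 180.0)):.1%}")  # prints "8.0%"
```

In this made-up example the bubbled run burns 8% more total energy than the baseline, signaling up-front that a reduction of roughly that magnitude may be achievable.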

Yadong took this further. In the experiments he described, he looked particularly at register-related dynamic power. Power estimation tools report switching and internal power separately. He noted that redundant D/Q toggles on a register will, in the bubbled case, cause an increase in both switching and internal energy, whereas redundant toggles on the clock input will increase only internal energy. Thus, in comparison with the un-bubbled analysis, there are four possibilities:

  • No change in switching or internal energy – no improvements are possible
  • Internal energy increases but switching energy is the same – there are redundant toggles on clock pins
  • Switching energy increases but internal energy is the same – there are redundant toggles on D/Q pins when the clock is disabled
  • Both switching energy and internal energy increase – there are redundant toggles on both D/Q and clock pins
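The four cases above amount to a two-bit decision on whether each energy component increased. A minimal sketch of that diagnosis; the function name, threshold, and wording are illustrative assumptions, not part of the published methodology:

```python
# Four-way diagnosis of register dynamic-power inefficiency, from the
# change in switching and internal energy (bubbled vs. un-bubbled run).

def diagnose(delta_switching: float, delta_internal: float,
             tol: float = 0.01) -> str:
    """Classify fractional energy increases vs. the un-bubbled baseline.

    `tol` is an assumed noise threshold below which a delta counts as
    'no change'."""
    sw_up = delta_switching > tol
    int_up = delta_internal > tol
    if not sw_up and not int_up:
        return "no improvement possible"
    if int_up and not sw_up:
        return "redundant toggles on clock pins"
    if sw_up and not int_up:
        return "redundant D/Q toggles while the clock is disabled"
    return "redundant toggles on both D/Q and clock pins"

# Internal energy up 12%, switching unchanged -> clock-gating opportunity.
print(diagnose(0.0, 0.12))  # prints "redundant toggles on clock pins"
```

Each non-trivial outcome points at a specific class of fix, which is what makes the drill-down in the detailed reports targeted rather than exploratory.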

They can drill down through detailed reports to find where they can make improvements to reduce redundant toggles.

What is especially startling is that Yadong said they were able to reduce dynamic power by 10% driven by this analysis. This is in a company (and a market) where reducing power is pretty close to a religion. But I’m not surprised the approach is so effective. This feels like a more scientific technique to measure power inefficiency overall and to isolate root causes. By comparison, traditional methods look rather ad-hoc.

Yadong mentioned at the end that a similar approach could be used to look at inefficiencies in memory, combinational logic and clock tree dynamic power. Analysis could pull similar data from the power estimation reports, though discriminating on differences in switching versus internal power might look different in each case. The Webinar is well worth watching. You can register to see it HERE.


Accelerating Design and Manufacturing at the 25th Annual IEEE Electronic Design Process Symposium

Accelerating Design and Manufacturing at the 25th Annual IEEE Electronic Design Process Symposium
by Camille Kokozaki on 09-05-2018 at 12:00 pm

25th annual IEEE Electronic Design Process Symposium
Accelerating Design and Manufacturing
September 13 & 14, 2018, SEMI, 673 S. Milpitas Blvd, Milpitas, CA 95035

This year marks a milestone in EDPS’s history as it turns 25. The event will be held at SEMI’s new headquarters facility and will provide a forum for the EDA, foundry and design industries to address design and manufacturing issues. The Symposium will focus on acceleration methods for the design and manufacturing processes.

Key changes in designs and design methodologies continue to be an EDPS focus. Leading industry members will be sharing their challenges and solutions in this vibrant symposium, discussing real designs in a real conversation setting. EDPS 2018 sessions are based on the following themes:

  • Cyber Systems Design with emphasis on security
  • Innovative Designs and Design Techniques including Machine Learning in System Design and EDA
  • Smart Manufacturing – Increased cooperation between design & manufacturing (2.5/3D-IC Assembly and Test, Die-Pkg-Board CO-design flow, Flexible Hybrid Electronic)
  • System reliability with a special focus on ADAS, 5G, and Photonics.

The event will conclude with a panel discussion to analyze Blockchain’s role in EDA and Design. The Thursday evening banquet is co-located with the ESDA event “Building Start-Ups to Successful Exit” moderated by Jim Hogan.

Other keynote speakers include Chris Rowen (CEO, BabbleLabs), with an address entitled ‘Deep Learning Revolution – From Theory to Impact’; Andrew Kahng (Professor of Computer Science & Engineering, UCSD), discussing ‘Evolutions of EDA, Manufacturing, and Design’; and Simon Johnson (Sr. Principal Engineer, Intel), who will outline ‘Hardware-Based Security’.

Visit http://edpsieee.ieeesiliconvalley.org/ for additional details.

This event will offer time for Q&A after every presentation and plenty of networking time among ~100 attendees and speakers.

The event is sponsored by IEEE’s Council on Electronic Design Automation (CEDA) and Silicon Valley’s IEEE Computer Society, with corporate sponsors ANSYS, Mentor Graphics, and Intel, SEMI as associate sponsor, and IEEE’s Electronics Packaging Society as technical co-sponsor.

In case you missed the early bird registration, EDPS is happy to offer a promo code “chipexpert-edps” that will provide $50 off the registration. You can register at edps2018.eventbrite.com; a complete schedule is available at ieee-edps.org and is attached here.

About EDPS:
The 2018 Electronic Design Process Symposium is the leading forum for advanced chip and systems development and CAD methodologies. As we approach the end of Moore’s law scaling, innovative packaging techniques are becoming increasingly important as package, board and other system components drive significant cost reduction. Innovative and smart manufacturing methodologies and flows are also becoming increasingly important. Because algorithmic development is changing rapidly, smart manufacturing approaches that reduce NRE and shorten time to market are critical.

Among other things, data center applications require heightened cybersecurity. 3DIC chip stacking of host processor and accelerator avoids exposing the bus between them to cyber-attacks. Implementation of machine and deep learning algorithms provides a higher level of defense against hacking. Cybersecurity is also very critical in system designs such as the ones found in automotive applications.

Reliability at the system level as well as at the package and chip level is impacted by ESD and thermal issues. Guaranteed performance needs to take aging and power into account. Newer interconnect, changing communication protocols and a wide range of operating conditions for systems require enhanced reliability for power and signal interconnects.

Heterogeneous integration of chips in high-performance processes and chips in mature process nodes allows higher performance and better yield optimization. More flexible system-level partitioning will lead the way to new product development. Architectural modularity and IP re-use will enable higher performance at lower total system cost. New FPGA methodologies, especially embedded FPGA, will see extensive use.

And last but not least, machine learning is permeating all fields of system design and design tools.

A Trip Down Memory Lane:

The picture is of SemiWiki founder Daniel Nenni at the 2015 EDPS in Monterey:

The first session was chaired by Daniel Nenni and was on FinFET vs FD-SOI. It kicked off with a keynote from Tom Dillinger of Oracle (think Sun), followed by a panel session with Tom, Kelvin Low of Samsung Foundry, Boris Murmann of Stanford University, Marco Brambilla of Synapse Design, and Jamie Shaeffer of GlobalFoundries:

The emergence of multiple transistor technology options at today’s deep submicron process nodes introduces a variety of power, performance, and area tradeoffs. This session will start with an overview of the FinFET and Fully-Depleted Silicon-on-Insulator devices (FD-SOI, also known as Ultra-Thin-Body SOI), in comparison to traditional bulk planar transistor technology. The session will then delve into a detailed discussion of the architectural and circuit implementation tradeoffs of these new offerings, to assist designers in making the right choice for their target application.

Detailed 2018 Program Info:


Unhackable Product Claims are a Fiasco Waiting to Happen

Unhackable Product Claims are a Fiasco Waiting to Happen
by Matthew Rosenquist on 09-05-2018 at 7:00 am

Those who think that technology can be made ‘unhackable’ don’t comprehend the overall challenges and likely don’t understand what ‘hacked’ means.

Trust is the currency of security. We all want our technology to be dependable, easy to use, and secure. It is important to understand both the benefits and risks as we embrace new features and capabilities. For product companies, there is a challenge to show how their wares are desirable and differentiated from others. However, marketing claims around security can potentially undermine customer confidence, enrage potential attackers, and be a source of embarrassment.


The latest ‘unhackable’ tech marketing claim, for a digital wallet no less, falls after just a week. Luckily, it was an ethical hacker who disclosed the findings, and not a cybercriminal who would have concealed the capability until they could victimize users for financial gain. There are many lessons to be learned.

Here are my top 4 tips for avoiding the pitfalls of overstating product security:

Rule #1: Never let marketing make promises (or guarantees) that products are ‘secure’, ‘unbreakable’, or ‘unhackable’. These words are absolutes in a domain where absolutes are not possible. If there is a security story to be told, be cautious, accurate, and specific. It is far too easy to self-sabotage customer trust by flagrantly throwing about promises that can’t be kept or, worse, are quickly disproven. It makes confidence look like arrogance, deceit, or ignorance. To put into rational terms how momentously bad this is: I am not aware of any digital technology ever deployed into the real world for widespread use that is ‘unbreakable’. Do you really believe yours is the first? Don’t let hubris be your downfall.

Rule #2: It is important to understand how secure and resistant to attack your product is, but it requires a comprehensive evaluation and expert opinions. Security technologists (engineers, architects, coders, etc.) have different perspectives than security risk experts (threats, intelligence, methods, likelihood & impact calculations, etc.). Both are needed to understand the whole picture.

I would venture a guess that, in the case above, the security technologists stated that all known vulnerabilities were closed, which was interpreted by marketing or management to mean the product was bullet-proof, while the security risk expert group was likely ignored or never engaged. Risk intelligence professionals would have outlined the types of threats most motivated, the objectives they would pursue, the likely methods employed, and what resources it would take to break the fundamental chains of trust.

Both disciplines, technical and intelligence, are needed when determining resistance, translating viewpoints, and comprehending risk. Don’t rely solely on a technical vulnerability scan or code audit to determine risk, as it is simply one facet of a complex model and only provides a partial overall outlook.

Rule #3: Never ignore how technology will be used, by whom, and what dependencies exist (network, supply chain, endpoint configuration, etc.). Anything is fair game, including the technology, processes, and people involved in the development, implementation, use, and sustaining support. Attackers don’t follow your product manual rules. They will be creative in finding the easiest way to exploit your products, likely in ways you didn’t consider.

Rule #4: Be kind to the white-hat hackers (i.e., security researchers), as they will work with you to make your products better. Save your venom for the cybercriminals and black-hat hackers who are happy to make your customers their victims.

Moving forward
Trust is key for adopting technology. As the saying goes: “Trust is earned in drips and lost in buckets”. Choose your path, messages, and partners carefully.

Interested in more insights, rants, industry news and experiences? Follow me on your favorite social sites for insights and what is going on in cybersecurity: LinkedIn, Twitter (@Matt_Rosenquist), YouTube, Information Security Strategy blog, Medium, and Steemit