
IBM Design Tools in the Cloud: Big News or Old News?

by Paul McLellan on 06-19-2015 at 7:00 am

One announcement that I missed coming up to the Design Automation Conference last week was that SiCAD is hosting a portfolio of IBM’s design automation tools in the cloud. Supposedly these are priced half the cost of similar capability from Cadence, Synopsys and Mentor. So should the big three be worried? Is this an earth-shattering event?

The press release opens: IBM today announced that it is launching IBM High Performance Services for Electronic Design Automation (EDA), the electronic industry’s first enterprise-class, secure cloud service, which provides on-demand access to electronic design tools, in partnership with SiCAD, Inc., a Silicon Design Platform provider, with expertise in EDA, design flows, networking, security, platform development, and cloud technologies.

So the 50,000 foot view of what is being offered is:

  • Access to IBM-patented EDA tools
  • Bundled with Platform LSF on IBM Softlayer
  • Charged by consumption based on pay-as-you-go model
  • Offered through SiCAD’s Virtual Design Center Platform

IBM has always had some of the most advanced design tools, and has relied correspondingly little on third-party tools: it has its own place and route, timing engine, synthesis engine, analysis tools and so on. It is not quite clear exactly which tools will be available in the end (they say that this initial announcement is just a start), but they will certainly include IBM’s library characterization tool, logic verification tool and SPICE simulator.

I see four big problems.

The first problem is the one that we had at Compass Design Automation, when we spun it out from VLSI Technology. Despite the fact that our tools were supposedly technology independent, the reality was that they were much more intertwined with our own libraries and with VLSI’s process technology. VLSI used external foundries too, and other customers such as LG in Korea used our whole portfolio in their own fab with a completely different process. Nonetheless, lots of things break when you see things that you have never seen before. For example, IBM does all its server design on SOI, which it manufactures in its Fishkill fab. But nobody else does high performance designs on IBM’s process. That may change once GlobalFoundries gets the fab and can make the foundry available to others.

The second problem is that companies have been very reluctant to use cloud-based solutions for semiconductor design, except in niche areas like Nimbic. There are two reasons for this. One is the perceived “putting the crown jewels on the internet” effect. If Target can lose all its customer data, and the OPM can lose all the federal employee data, including security clearance data, and Snowden can walk off with half the NSA’s secrets, then this is not seen as a trivial risk. I’m sure the Chinese military would love to have Intel’s latest processor mask data, or TSMC’s 10nm process recipes, or Qualcomm’s latest modem. That’s why you don’t find those things out in the cloud.

Another cloud issue is that IC designs are large. So moving them in and out of the cloud is time consuming, and unless all the tools required are there (they are not) then this is a necessity. For some applications, such as library characterization which can involve tens of thousands of SPICE runs, I can see the attraction of the cloud. Relatively small amounts of data, large computational load, almost unlimited parallelism, LSF to handle it all.

The third problem is that this is the second (or maybe third) time that IBM has tried to commercialize its design tools. Remember BooleDozer and Einstimer. They might even have been the best synthesis tool and the best timing engine (Ambit’s timing engine was written by the same person, and it was clearly the best engine of its era). But IBM were not successful in the market. They tried to sell through distribution, which probably contributed to the lack of success, but that was not on its own fatal.


The fourth problem is the usage-based pricing. If the price is low enough to be attractive to large companies, that means that running hundreds of copies of a tool 24 hours a day needs to be cheaper than buying permanent (or time-based) licenses. But if the price is set this low then there is almost no revenue from the small and medium sized companies that are the explicit focus of the announcement. If, on the other hand, the price is set high enough to get interesting revenue from the medium sized companies, it will not be of interest to the larger ones. There are probably ways to solve this but it has been one of the problems whenever EDA companies have toyed with usage-based pricing. The EDA company is only interested if the revenue is higher, and the customer is only interested if it is lower. At least in this case, since IBM does not have existing business to protect, that dynamic might not matter.
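The tension in usage-based pricing can be made concrete with a little arithmetic. The sketch below is purely illustrative: the license cost and hourly rate are hypothetical numbers chosen for the example, not figures from IBM, SiCAD or any EDA vendor.

```python
HOURS_PER_YEAR = 24 * 365  # a tool running around the clock: 8760 hours


def breakeven_hours(license_cost_per_year: float, price_per_hour: float) -> float:
    """Yearly usage at which pay-as-you-go costs the same as one annual license."""
    return license_cost_per_year / price_per_hour


def cheaper_option(hours_used: float, license_cost_per_year: float,
                   price_per_hour: float) -> str:
    """Return 'usage' if pay-as-you-go is cheaper for this workload, else 'license'."""
    usage_cost = hours_used * price_per_hour
    return "usage" if usage_cost < license_cost_per_year else "license"


# Hypothetical prices, assumed only for illustration.
LICENSE_COST = 100_000.0   # dollars per seat per year
HOURLY_RATE = 20.0         # dollars per tool-hour

print(breakeven_hours(LICENSE_COST, HOURLY_RATE))              # break-even usage in hours
print(cheaper_option(HOURS_PER_YEAR, LICENSE_COST, HOURLY_RATE))  # heavy user, 24 hours a day
print(cheaper_option(500, LICENSE_COST, HOURLY_RATE))             # light user, 500 hours a year
```

With these assumed numbers the heavy user is better off with a license, so the hourly rate would have to drop below `LICENSE_COST / HOURS_PER_YEAR` to win that business, at which point the light user pays very little.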

We will have to wait and see what happens as more tools become available and whether: With the cloud service, clients no longer need to purchase EDA tool licenses, new hardware, data center infrastructure or staff to manage on-premise environments. IBM High Performance Service for EDA provides high performance tools, security and overall improved price performance offering customers of all sizes more affordable access to EDA tools and decreased cost of designs.

SiCAD’s website, including the press release is here. An “end of EDA as we know it” article worth reading is here.


Solido Has Perfected the Emerging EDA Company Business Model!

by Daniel Nenni on 06-18-2015 at 7:00 pm

Last year at #51DAC we gave away more than a thousand printed versions of our book “Fabless: The Transformation of the Semiconductor industry.” This year we gave away pens with a light and stylus. My friends at Solido Design gave away 600 pens in their booth and we gave away another 400 at our DAC reception on Wednesday night. Solido was actually very clever about it. They turned on the lights and just left them in the trays on the counter. People came to them like moths to a flame… Even John Cooley stopped by to investigate! NO PEN FOR YOU! 😉

Speaking of clever, Solido is one of the more interesting companies I have worked with. They have perfected the emerging EDA company business model, absolutely. Solido CEO Amit Gupta is very approachable and I always enjoy talking to him. Here is an update from Amit and thank you again to the Solido booth staff for giving away our pens. It was greatly appreciated!

Q: What does Solido do?
We are the world-leading provider of variation-aware custom IC design software. Our customers are using our product, called Solido Variation Designer, to dramatically boost SPICE simulator performance by reducing the number of simulations and increasing design coverage for PVT, 3-sigma Monte Carlo, high-sigma Monte Carlo, hierarchical Monte Carlo and variation debug.

Q: How do your customers use your product?
We have 3 segments of users: memory, standard cell and analog/RF/custom digital designers. Our memory customers are using Variation Designer for full chip memory and cell level statistical verification. Standard Cell designers use our product for statistical verification and sizing of cell libraries. And our analog/RF and custom digital customers use Variation Designer for statistical & PVT verification and debug. Overall, users are getting improved design coverage in way fewer simulations than brute force.
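To give a feel for why brute-force Monte Carlo becomes impractical at high sigma, here is a minimal sketch of the baseline only. It assumes a single Gaussian-distributed performance metric with a one-sided specification and a target of roughly ten observed failures; those are assumptions made for illustration, and the code says nothing about how Variation Designer itself reduces the simulation count.

```python
import math


def tail_probability(sigma: float) -> float:
    """One-sided tail probability P(X > sigma) for a standard normal X."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))


def brute_force_samples(sigma: float, expected_failures: int = 10) -> int:
    """Random samples needed so that about `expected_failures` failures are expected."""
    return math.ceil(expected_failures / tail_probability(sigma))


for s in (3, 4, 5, 6):
    p = tail_probability(s)
    n = brute_force_samples(s)
    print(f"{s}-sigma: p_fail = {p:.2e}, brute-force samples = {n:.2e}")
```

Under these assumptions a 3-sigma check needs thousands of SPICE runs, while a 6-sigma check needs on the order of ten billion, which is the gap that variation-aware tools aim to close.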

Q: What industry trends are you seeing that impact your business?
We are seeing 2 big trends in the custom IC design space – continued move to smaller nodes and ultra-low power design at mature nodes. The move to smaller nodes is increasing variation. 28nm, FinFET and FD-SOI devices all have an increasing amount of variation impacting designs. Also, ultra-low power design at the more mature nodes, for applications like IoT wearables, is having larger variation impact due to lower supply voltage. Both of these trends have resulted in much more variation-aware custom IC design being done in the industry.

Q: What are the benefits of your customers using your product?
You can no longer cut corners when doing your SPICE verification. Increased variation causes designers to overdesign (poor power, performance and area) due to unnecessary over-margining, or underdesign resulting in poor yield. Our customers are using Variation Designer to see the impact of variation and eliminate unnecessary over and under design, so they get much better power, performance, area and yield.

Q: How is your business doing?
Our business is growing very quickly. We had 60% revenue growth last year, and 90% revenue growth in the first half of this year with increasing profits. We now have over 25 customers, including most of the top semiconductor companies, and over 1,000 users worldwide using our software regularly. We are also hiring – we have 10 software developer and applications engineering positions to fill immediately.

Q: What’s new at DAC this year?

We are pleased to be an invited presenter at the TSMC Open Innovation Platform Theater to showcase the Solido – TSMC integrated solution for our mutual customers. We also hosted a panel where engineers from Applied Micro Circuits, Cypress Semiconductor and Microsemi discussed their experiences using Variation Designer in their design flows and how they transitioned from legacy tools over to Solido.

In our demo suites, we are previewing our next major release – Solido Variation Designer 4.0. It includes Statistical PVT which delivers unprecedented accuracy and coverage across 3-sigma statistical variation and operating conditions, Hierarchical Monte Carlo which verifies full-chip memories with perfect statistical accuracy, and a suite of brand new features for memory, standard cell and analog/RF/custom digital designers.

Q: Where can our readers find more information about Solido?
They can visit our website at www.solidodesign.com for more product details and contact information. Or visit our careers site to see our job postings: http://www.solidodesign.com/page/jobs/


Can FD-SOI Change the Rule of Game?

by Pawan Fangaria on 06-18-2015 at 12:00 pm

It appears so. Why is there so much rush towards FD-SOI in recent days? Before talking about the game, let me reflect a bit on the FD-SOI technology first. FD-SOI at 28nm is claimed to be the most power-efficient and lowest-cost technology available at that node. There are many other advantages from a technology standpoint which we have heard about over the past year or two: for example, simplicity of process, no channel doping, excellent electrostatic control of the channel, and back biasing with an extremely thin BOX (buried oxide). These technology aspects translate into limited short channel effect, low DIBL (Drain Induced Barrier Lowering), minimum junction capacitance and diode leakage, lowest leakage current, and excellent threshold voltage variability. The result is that the device can operate at multiple voltages and multiple frequencies. It can be used for high-performance (at 28nm at this time) as well as ULP (Ultra-Low-Power) applications.


[Courtesy ST: FD-SOI transistor structure; SRAM SER comparison]

Okay, I’m not showing the power graphs here as they are widely known for FD-SOI technology. Just see the interesting Soft Error Rate (SER) comparison bar chart obtained at ST for SRAMs. SER is lowest for FD-SOI, which improves the reliability of devices built with FD-SOI technology. Reliability, low power, and low cost are the key requirements for IoT applications. Also, the technology has a lot of benefits for analog and high-speed designs because of lower gate capacitance and leakage current and latch-up immunity. The device also has lower noise and higher gain because of the absence of channel doping and pocket implants.

Now, let’s see how the rule of the game is changing. At 28nm, there is no FinFET to compare with. The FD-SOI technology stands tall against all others with all the advantages mentioned above. Is that the only reason for the rush towards FD-SOI at this juncture? There is something more to it: the semiconductor business scenario, the economy, and the trending segments. In the next couple of years IoT is supposed to be the top growing segment. Also, IoT applications do not require 16nm, 14nm, or smaller technology nodes. The 28nm process node seems to be ideal for IoT applications as they need low power at low cost. The 28nm process also looks good for analog ICs, which will be a key requirement for the IoT market.

This is the most opportune time for foundries (ST being in the leadership position for FD-SOI), chip and IP developers, EDA vendors, and service providers to take advantage of the opportunity provided by FD-SOI.

During the 52nd DAC, CEA-Leti made a big announcement about the launch of “Silicon Impulse”, a platform aimed at broadening the use of FD-SOI technology for ultra-low-power devices that are used in IoT applications and other energy-efficient equipment. The platform will offer technical expertise for developing energy-efficient solutions along with access to FD-SOI technology and manufacturing facilities. The service will include infrastructure support such as emulators and other test services along with industrial multi-project wafer (MPW) shuttles. The platform has partners from a wide spectrum of the semiconductor ecosystem including academia, foundry, EDA providers and chip designers. The list of partners with CEA-Leti includes CEA-List, STMicroelectronics, Dolphin Integration, CMP, Mentor Graphics, Cortus, and Presto Engineering.

The collaboration and partnership is not limited to ‘Silicon Impulse’ platform partners. There are also other partnerships happening around the world. Sankalp Semiconductor is an Indian-origin company, leading in SoC chip design services and specializing in end-to-end solutions for IOs, Analog and Mixed-Signal chip designs. It has multiple design centers in India and the USA. Recently, just before DAC, Sankalp announced an FD-SOI services and IP partnership with ST. Sankalp has been involved in the development of many FD-SOI analog IPs and high-speed PHYs for ST. Sankalp has developed significant expertise in FD-SOI technology and its usage in various applications. With FD-SOI technology, Sankalp feels confident of serving the emerging IoT, wearable, consumer, multi-media and automotive markets.

See the press release about Sankalp and ST partnership here.

To accelerate the global footprint for FD-SOI technology based development, CEA-Leti is also hosting a workshop on June 22-23 where an expanded representation from the semiconductor community will take place. The presenters include –

  • ST, GLOBALFOUNDRIES and Samsung on FDSOI manufacturing
  • Ciena, ST and NXP on products based on FDSOI chips
  • Cadence, Synopsys, Mentor Graphics, sureCore, eSilicon and Tiempo on their offers for FDSOI in terms of IP and EDA tools

Also, there will be prominent professors from world-class universities along with Leti who will present about their research and innovations in designs with FD-SOI.

It’s exciting to see the FD-SOI ecosystem growing so fast; there is definitely merit in this technology. One may argue for the superiority of 14nm FinFET technology in terms of performance. However, I hear that ST will soon bring 14nm FD-SOI up to speed. So, by the time the IoT market matures with 28nm FD-SOI, 14nm FD-SOI will become available for mainstream design. Does that seem like FD-SOI’s game?

By the way, if anyone is interested in attending the FD-SOI workshop, it is in Grenoble, France. Registration link is here.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


Semiconductor Equipment: the Report and the Show

by Paul McLellan on 06-18-2015 at 7:00 am

Somebody said to me recently that SEMICON West, which takes place in San Francisco July 14-16th, isn’t that big a deal since very little manufacturing goes on in the US any more. In fact 15% of manufacturing capacity is in North America (I think that actually means the US since I don’t think there are any fabs in Canada). That is more than China, and almost twice as much as Europe. Of course if you put all the Asian countries together you get to about 70% of all capacity.

Looking at just new capacity, due to the presence of leading device manufacturers, North America represents a significant portion of the new equipment market. For the last two years, North America was the second largest market for semiconductor manufacturing equipment behind only Taiwan.

Last week SEMI announced the update of its World Fab Forecast report for 2015 and 2016. The report projects that semiconductor fab equipment spending (new and used, for front-end facilities) will increase 11 percent (to US$38.7 billion) in 2015 and another 5 percent (to $40.7 billion) in 2016. Since February 2015, SEMI has made 282 updates to its detailed World Fab Forecast report, which tracks fab spending for construction and equipment, as well as capacity changes, technology node transitions and product type changes by fab. You can hear more about this on Monday July 13th, the day before SEMICON West, by attending the SEMI/Gartner Market Symposium, an update on the semiconductor supply chain market outlook. In addition to presentations from Gartner analysts, Christian Dieseldorff of SEMI will present on Trends and Outlook for Fabs and Fab Capacity and Lara Chamness will present on Semiconductor Wafer Fab Materials Market and Year-to-Date Front-End Equipment Trends.

On Tuesday there is a keynote panel at 9am, and on Wednesday a keynote on IoT, also at 9am.

  • Tuesday: Keynote Panel – Scaling the Walls of Sub‐14nm Manufacturing
    Featuring: Qualcomm, Stanford University, imec and more.

  • Wednesday: Keynote – The Internet of Things and the Next Fifty Years of Moore’s Law
    Doug Davis, Senior Vice President and General Manager, Internet of Things Group, Intel

On the exhibit floor there are two presentation areas known as TechXPOTs. There is one in the north hall of Moscone and one in the south. The presentations are included with any pass for the exhibits. Topics that will be covered include:

  • What’s Next for MEMS?
  • Automating Semiconductor Test Productivity
  • Emerging Generation Memory Technology: Update on 3DNAND, MRAM and RRAM
  • Materials Session: Contamination Control in the Sub-20nm Era
  • Subsystem and Component Suppliers at Critical Cross Roads to Deliver on Yield and Productivity
  • Equipment and Materials Opportunities for Flexible Hybrid Electronics
  • Packaging Session: Auto Utopia — Gearing up Semiconductor to Turn Dreams to Reality
  • The Evolution of the New 200mm Fab for the Internet of Everything
  • Monetizing the IoT: Opportunities and Challenges for the Semiconductor Sector
  • CMP Technical and Market Trends
  • Factory of the (Near) Future: Using Industrial IoT in Semiconductor Manufacturing Sector
  • Update on Industry Status of 450mm

There’s more too:

  • Freescale IoT Truck
  • Fuel Cell Car
  • Innovation Village
  • Pavilions: China, Europe, Malaysia, Silicon Saxony

Details on the World Fab Forecast are here.
The website for SEMICON West, including links for registration, is here.


The Best Conversations You Missed at #52DAC!

by Daniel Nenni on 06-17-2015 at 7:00 pm

The CEO Fireside Chats were my very favorite part of #52DAC. Dr. Walden Rhines, Lip-Bu Tan, and Dr. Aart de Geus are heroes of the EDA industry, absolutely. I saw all three Fireside Chats and the one word that I’m left with is INSPIRED!


DDR stands for Don’t Do (Just) RTL

by Don Dingee on 06-16-2015 at 9:00 pm

In optimizing SoC design for performance, there is so much focus on how fast a CPU core is, or a GPU core, or peripherals, or even the efficiency of the chip-level interconnect. Most designers also understand selecting high performance memory at a cost sweet spot, and optimizing physical layout to clock it as fast as possible within power consumption limits, is imperative.

One can do all of that exactly right, and still have a lousy performing, and perhaps overdesigned, SoC. But, it doesn’t have to end up that way.

Dealing with the nuances of DDR memory controllers and comprehending what actual traffic patterns are in play can make a huge swing in performance. For instance, just getting address mapping right – conversion of AXI addresses to physical memory addresses, matching what the application is really doing – can improve memory subsystem performance by 20% or more. Optimizing clock frequency allows better use of bandwidth at lower speed bins, which can reduce cost and power.
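To illustrate why address mapping matters, here is a minimal sketch that decodes a linear AXI byte address into DDR row, bank and column fields under two different bit orderings. The field widths, the 8-byte bus width and both orderings are hypothetical values assumed for the example; they are not the uMCTL2 register layout or anything shown in the webinar.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical DRAM geometry, assumed only for illustration.
FIELD_BITS = {"column": 10, "bank": 3, "row": 14}
BYTE_OFFSET_BITS = 3  # assumes an 8-byte wide data interface


def decode(addr: int, order_lsb_first: Tuple[str, ...]) -> Dict[str, int]:
    """Split a byte address into DDR fields, lowest-order field first."""
    value = addr >> BYTE_OFFSET_BITS
    fields = {}
    for name in order_lsb_first:
        width = FIELD_BITS[name]
        fields[name] = value & ((1 << width) - 1)
        value >>= width
    return fields


ROW_BANK_COL = ("column", "bank", "row")  # bank bits sit just above the column bits
BANK_ROW_COL = ("column", "row", "bank")  # bank bits sit at the top of the address


def banks_touched(addresses: Iterable[int], order: Tuple[str, ...]) -> int:
    """Count how many distinct banks a traffic stream lands in."""
    return len({decode(a, order)["bank"] for a in addresses})


# A stream that strides by one DRAM page (8 KiB with the geometry above).
PAGE_BYTES = 1 << (BYTE_OFFSET_BITS + FIELD_BITS["column"])
stream = [i * PAGE_BYTES for i in range(16)]

print(banks_touched(stream, ROW_BANK_COL))  # accesses spread over all 8 banks
print(banks_touched(stream, BANK_ROW_COL))  # every access hits a new row in 1 bank
```

For this particular stride, the first ordering spreads accesses across banks so activates can overlap, while the second forces a row change in the same bank on every access. Which ordering is better depends on what the application traffic actually looks like, which is the point of exploring it with a performance model.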

It’s the last point raised by Synopsys’ Patrick Sheridan in opening a recent webinar that got my attention: QoS. “Different [DDR memory] masters can have varying and often contradicting requirements.” There is high priority traffic, and so-called low priority traffic, and both can starve, affecting overall system performance. Optimizing a DDR controller isn’t as simple as throwing one switch; a blend of parameters needs to be explored.

Synopsys is in a unique position to provide a perspective on this topic. They provide IP, in this case a DesignWare DDR uMCTL2 memory controller block. They also provide tools for optimizing IP in SoC designs, such as Platform Architect MCO with multicore optimization technology. The environment described is a SystemC simulation with appropriate IP models to provide DDR subsystem visibility.


Combining in-depth understanding of DDR memory controller IP via models with workload simulation capability delivers what Synopsys claims is at least a 10x improvement over trying to fight it out with just RTL-level techniques. HDL co-simulation of RTL IP is fully supported. However, I think once viewers see this event, they may re-evaluate their current approach.

One thing I did not appreciate fully before viewing this webinar was just how many parameters are involved in designing around a DDR memory controller. The webinar moves on to take a very detailed look at analyzing the uMCTL2 IP in a mobile SoC application, presented by Tim Kogel.


The use case analysis Kogel presents looks at a mix of traffic from a CPU, a GPU, a camera, and a display in a mobile device. The scenario models 300 us of traffic, with a QoS goal of 200 us for the graphics processing. Illustrated is an approach to define elastic workloads across the IP blocks, synchronized as necessary, then all projected onto a deadline analysis.

Address mapping is explored and optimized using the performance model, using a graphical view of JEDEC commands per interval. “Hot bit” visualization aids exploration, and then the memory clock speeds are optimized – again, using the actual traffic load and the deadline constraints.


That’s just the start of the event. Kogel then goes into a detailed discussion of parameter configuration, including a video showing how Platform Architect MCO can optimize hundreds of parameters in the uMCTL2. A key takeaway: 300 us of real-time traffic is simulated, with all instrumentation and graphical visualization enabled, in about 10 seconds. This makes it super easy to change a parameter and re-simulate almost instantly.

To register and view the complete event:

Optimize DDR Memory Subsystem Efficiency With Synopsys Platform Architect

This is a great example of how powerful SystemC modeling can get inside IP quickly and explore complex issues in real-world scenarios. Even if you are not using Synopsys IP, Platform Architect, or SystemC modeling, this is worth your time to see the approach. What you may be overlooking, or spending huge amounts of time solving, could make the difference in your next design.


Apple Watch Design Revisit with a Wi-Fi Twist

by Majeed Ahmad on 06-16-2015 at 5:00 pm

Apple Watch is the world’s most celebrated gadget in 2015. At the same time, however, early product reviews highlight some issues about slow apps, less than impressive user experience, and short battery life.

Apple, the master of artful integration, has lived up to its reputation for elite hardware and has been able to create a sophisticated product design for a wearable device. But here is a design avenue that can help counter challenges like slow apps and battery drain. The idea could serve the Apple Watch 2 design, which is most likely on the drawing board right now, as well as countless other smart wearables in the making.


Apple S1 comprises 30 components

The design consideration is based on value points taken from the recent launch of CEVA’s RivieraWaves Wi-Fi IPs that allow system-on-chip (SoC) engineers to integrate Wi-Fi functionality onto their chips with clear and visible power and size benefits. CEVA unveiled its Wi-Fi and Bluetooth solutions for mobile, wearable and IoT devices at the Linley IoT Conference held in Santa Clara, California on June 11, 2015.

The CEVA RivieraWaves Wi-Fi platform encompasses MAC and PHY modem functions. The MAC device—available as a hardware accelerator as well as software stacks in the form of lower MAC and upper MAC—is processor and operating system (OS) agnostic. For the modem, there are two options available, hardwired modem and software-defined modem (SDM).

Apple Watch Design Revisit

Let’s revisit the Apple Watch S1 design footprint, which comprises 30 components. Apple’s revered smart watch uses Bluetooth to connect to the iPhone and Wi-Fi to speed up data transfer when required. A sneak peek at the S1 teardown from Chipworks shows that Apple used Broadcom’s BCM43342 chip for 802.11n, Bluetooth 4.0 and FM communication functions. The teardown also shows that Broadcom’s Wi-Fi plus Bluetooth combo IC is the second largest chip on the Apple Watch footprint. It occupies a die area of 18.5mm2 on S1.


Broadcom’s Wi-Fi chip is the largest after Apple’s APU

Now let’s take CEVA-plus-Catena-RF 802.11ac and Bluetooth combo solution that comes with MAC, modem, AFE, RF, CPU and memories. CEVA has joined hands with Catena, a supplier of RF IPs, to provide one-stop-shop for Wi-Fi and Bluetooth IP solutions. Catena’s radio IPs for Wi-Fi and Bluetooth are available on a number of process nodes, including 28nm at GlobalFoundries and 65nm at TSMC.

The CEVA-plus-Catena-RF solution, integrated with Apple’s APL0778 application processor manufactured at 28nm, would have taken up just 6mm2 on Apple S1, a saving of roughly 70 percent in die size (12.5mm2 of the 18.5mm2, or about 68 percent). And that’s a lot of leverage in terms of power consumption and reduced cost due to smaller die size and lower BOM. The reduction in power consumption for the Wi-Fi plus Bluetooth connectivity stack on Apple’s SoC would have also come from the smaller geometry of Apple’s app processor manufactured at 28nm. Broadcom’s Wi-Fi chip, on the other hand, has been manufactured at 40nm.
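The die-size comparison is simple enough to check directly. The snippet below uses only the two area figures quoted in this article; the function and constant names are made up for the example.

```python
def area_saving_percent(original_mm2: float, integrated_mm2: float) -> float:
    """Percentage of die area saved by replacing one block with a smaller one."""
    return 100.0 * (original_mm2 - integrated_mm2) / original_mm2


BROADCOM_COMBO_MM2 = 18.5  # Broadcom Wi-Fi plus Bluetooth combo IC, as quoted above
CEVA_CATENA_MM2 = 6.0      # integrated CEVA-plus-Catena-RF estimate, as quoted above

saving = area_saving_percent(BROADCOM_COMBO_MM2, CEVA_CATENA_MM2)
print(f"Area saved: {BROADCOM_COMBO_MM2 - CEVA_CATENA_MM2:.1f} mm2 ({saving:.1f}%)")
```

This compares area only; any power or cost benefit depends on factors the two numbers do not capture.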

The CEVA RivieraWaves IP platform allows chip designers to integrate Wi-Fi connectivity onto their SoC solutions. Wi-Fi integration is becoming imperative for wearable and Internet of Things (IoT) devices in particular because these devices are all about being smaller, cheaper and low-power. Moreover, Wi-Fi consumes less power than Bluetooth for larger data transfers, as shown in the Apple Watch use-case.


Wi-Fi integrated into APU or MCU

Another option that CEVA offers is a low-cost standalone Wi-Fi chip design that doesn’t require a host application processor. Here, CEVA’s TeakLite-4 DSP core can execute CPU functions. The CEVA-TeakLite-4 takes care of MAC and TCP/UDP protocol stacks and provides support for always-on sensing and audio processing applications.

CEVA’s MAC device is processor agnostic, so chip designers are free to pick other CPUs such as ARM Cortex-M, Andes, Cortus APS and ARC EM platforms.


A CEVA-powered Wi-Fi chip that doesn’t require a CPU license

Beken Design Win

Beken Corp., a high-volume supplier of wireless audio chips, has licensed the CEVA TeakLite-4 DSP core and RivieraWaves Sense 802.11n Wi-Fi connectivity solution for its upcoming SoC designs.

CEVA offers three connectivity platforms to cover all bases in the rapidly expanding Wi-Fi world. The RivieraWaves Sense IP for 802.11a/b/g/n/ac boasts the lowest power and smallest footprint, which makes it suitable for wearable and IoT devices. The RivieraWaves Surf IP, aimed at mobile devices like smartphones and tablets, offers 802.11ac 1×1 and 2×2 IPs.


The CEVA RivieraWaves Wi-Fi IPs come in three flavors

CEVA’s third Wi-Fi flavor, the RivieraWaves Stream, is the highest performance IP and caters to wireless infrastructure products such as small cells and access points. It serves 802.11ac for up to 4×4 MIMO applications and uses the CEVA-XC DSP core to facilitate advanced wireless communications.

Apparently, the Shanghai, China–based audio chipmaker Beken has opted for the CEVA-TeakLite-4 DSP for merging the Wi-Fi, Bluetooth and audio functionality onto a single core in its wireless SoCs. The CEVA-TeakLite-4 DSP cores are designed to handle audio, voice, sensing and wireless connectivity applications and they do it without requiring an additional CPU.

Again, resorting to the smart watch example, a single Bluetooth-enabled CEVA-TeakLite-4 can run always-on voice activation and voice commands, sensor fusion functionality, audio/voice processing, and dual-mode Low Energy Bluetooth also known as Bluetooth Smart Ready.

Visit RivieraWaves Wi-Fi product page for more information on the CEVA RivieraWaves Wi-Fi IP platform.


New Tool Suite to Accelerate SoC Integration

by Pawan Fangaria on 06-16-2015 at 12:30 pm

Today, an SoC is seen in the context of an optimized assembly of IPs; it’s no longer a single monolithic chip design. It’s very common to see an ARM processor IP along with an interconnect IP, a memory IP, and a couple of bus and interface IPs in an SoC. Although the SoC seems to be an integrated collection of IPs, it can be very complex and the number of IPs can grow without limit. From a manufacturing point of view, power, performance and area (PPA) are the parameters to worry about at the IP level. For an SoC, there can be large catalogs of optimized IPs in every category from which the best IPs can be picked and assembled in the SoC. Of course, the problem grows at the SoC level because one has to choose the right IPs and then integrate them in the most optimized manner to achieve the best PPA, latency, and minimum congestion. The overall system throughput must be at the maximum within the given power and area constraints.

The problem is even wider in an economic sense, because an SoC is a complete system that needs to be targeted to a particular market segment within specified cost parameters. The target segment, cost, and IP integration architecture are the key criteria for an SoC, and they come well before PPA in determining its success in the marketplace. There was a whitepaper written by me a couple of weeks ago (link to the whitepaper is at the end of this article) which provides details about the key criteria for SoCs in the modern context. Today, I’m extremely happy to see the automated tools that address these top criteria for SoC integration.

During the 52nd DAC last week, it was a very pleasant occasion when I met Andy Nightingale, VP of System IP Marketing at ARM; Norman Walsh, Director of IP Tooling at ARM; and Simon Rance, Senior Product Manager of System and Software Group at ARM, who demonstrated an innovative IP Tooling Suite developed at ARM for very fast and optimized SoC integration. That’s when I remembered my whitepaper, because the suite touches upon exactly some of the key criteria for SoCs mentioned there. Let’s see how this suite of tools helps in SoC integration.

ARM Socrates DE provides an advanced design environment where desired IPs can be chosen from an IP catalog, and then instantiated and configured as per the designer’s needs. The design environment is common with ARM’s existing environment and takes advantage of the protocols already built into it. It also supports integrating third party IPs into the sub-system or SoC. The IP-XACT format is used to keep IP interfaces at the industry standard level. If a third party IP does not have an IP-XACT description, then the environment has a utility to automatically generate IP-XACT from the RTL and point out mismatches, if any. The interesting part about Socrates DE is that it allows designers to customize IPs into different configurations to differentiate from others, and it instantly provides a BOM (Bill of Materials) for the overall sub-system or SoC. By using this tool, a designer can do several trials to configure IPs and optimize the overall sub-system or SoC for a target application within the given budget. The configuration and optimization of the SoC can be done in a day or even hours, as against the several months it takes to evaluate the various options without this tool.
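
The idea of running several configuration trials against a budget can be pictured with a tiny brute-force search. This is only a sketch of the general idea, not of how Socrates DE works: the catalog, the IP names, the cost figures and the additive performance score are all made up for illustration.

```python
from itertools import product

# Hypothetical catalog: every name and number below is invented for illustration.
# Each entry is (ip_name, cost, performance_score).
CATALOG = {
    "cpu": [("cpu_small", 4.0, 10), ("cpu_big", 9.0, 25)],
    "interconnect": [("noc_basic", 1.0, 2), ("noc_wide", 2.5, 6)],
    "memory_ctrl": [("ddr_x16", 1.5, 3), ("ddr_x32", 3.0, 7)],
}


def best_configuration(catalog, budget):
    """Try every combination of one IP per category and keep the
    highest-scoring one whose total cost fits within the budget.
    Returns (names, total_cost, total_score), or None if nothing fits."""
    best = None
    for combo in product(*catalog.values()):
        cost = sum(c for _, c, _ in combo)
        score = sum(s for _, _, s in combo)
        if cost <= budget and (best is None or score > best[2]):
            best = ([name for name, _, _ in combo], cost, score)
    return best


if __name__ == "__main__":
    print(best_configuration(CATALOG, budget=10))
```

A real flow would weigh PPA, latency and congestion rather than a single score, but the shape of the problem is the same: many candidate configurations, one budget, and a bill of materials as the output.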

After the initial architecture determination within Socrates DE, there is ARM CoreSight Creator, which provides an excellent Debug & Trace System seamlessly integrated with Socrates DE. CoreSight Creator is used to fine-tune the micro-architecture for better configuration efficiencies. It uses all the built-in design rules provided in the ARM environment.

Another vital component in the tool suite is ARM CoreLink Creator, which optimizes the Interconnect System for congestion-free operation. ARM’s new CoreLink NIC-450 Network Interconnect offers a tool-driven automation flow that employs algorithms for tasks such as ensuring deadlock-free operation and partitioning across multiple power/voltage domains.
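
One common way to reason about deadlock freedom in an interconnect is to model which channels can end up waiting on which others as a directed graph, and then check that the graph has no cycle. The sketch below illustrates only that general idea; it is not a description of the algorithm inside CoreLink Creator, and the channel names are hypothetical.

```python
from typing import Dict, List


def has_cycle(graph: Dict[str, List[str]]) -> bool:
    """Return True if the directed graph contains a cycle.

    graph maps each channel to the channels it may wait on.
    Uses a depth-first search with three node states.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    state = {node: UNSEEN for node in nodes}

    def visit(node: str) -> bool:
        state[node] = IN_PROGRESS
        for nxt in graph.get(node, []):
            if state[nxt] == IN_PROGRESS:
                return True  # back edge: a circular wait is possible
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in nodes)


# Hypothetical dependency graphs between interconnect channels.
ring = {"ch_a": ["ch_b"], "ch_b": ["ch_c"], "ch_c": ["ch_a"]}
tree = {"ch_a": ["ch_b", "ch_c"], "ch_b": ["ch_d"], "ch_c": ["ch_d"]}

if __name__ == "__main__":
    print("ring can deadlock:", has_cycle(ring))
    print("tree can deadlock:", has_cycle(tree))
```

The ring has a circular wait and is flagged; the tree does not and passes.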

The overall suite of tools provides the most optimized and correct-by-construction configuration for an SoC or a sub-system along with its associated testbench. This approach reduces the SoC turn-around time and the risk of re-spin to a large extent because of improved predictability and QoR.

The ARM platforms, including a large IP portfolio and a suite of tools for IP configuration and integration, provide an ideal platform for SoC integration. ARM’s partners can gain the most value out of this system in today’s SoC environment. I am told that there are already more than 50 System IP tooling partners with ARM. That is natural, because finding the right BOM with an optimized configuration and architecture for IP integration into the SoC is a burning need today.

ARM press release is here.
Here is my whitepaper “SoCs in New Context – Look beyond PPA”.
Also read “Even More Integration and Automation for ARM-based Designs”

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


High Level Synthesis. Are We There Yet?

by Paul McLellan on 06-16-2015 at 7:00 am

High level synthesis (HLS) has been part of the backdrop of design automation for so long that it seems to be one of those things that nobody notices any more. But it has also crept up on people and gone from interesting technology to keep an eye on to getting genuine adoption. The first commercial product in the space was Behavioral Compiler, introduced in 1994 by Synopsys. In that era we all thought that design would inevitably move up from RTL to C, but in fact IP came along as the dominant methodology, with IP blocks largely designed in RTL (if they were digital anyway) and then assembled into SoCs.

I attended a presentation at the Calypto theatre by Frans Sijstermans of Nvidia titled “How HLS Saved Our Skin…Twice”. Nvidia were facing the problem that designs were growing at 1.4X per generation but design capacity only at 1.2X, so the gap kept widening. In 2013 the video team faced a new chip with all sorts of new technologies on the must-have list: HEVC, VP9, deep color, 4K resolution and more. An analysis showed that using their old methodology would take 14 months but only 9 months were available. Google had been using HLS successfully for video and talking publicly about it, so Nvidia contacted them and then decided to use HLS to pull in the schedule.
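
The numbers in that talk compound quickly. Here is a small sketch of the arithmetic, using only the growth rates and schedule figures quoted above; the number of generations printed is an arbitrary choice for illustration.

```python
# Growth rates per generation, as quoted in the talk.
DESIGN_GROWTH = 1.4
CAPACITY_GROWTH = 1.2


def productivity_gap(generations: int) -> float:
    """Ratio of design size to design capacity after the given number
    of generations, assuming both start out equal."""
    return (DESIGN_GROWTH / CAPACITY_GROWTH) ** generations


def required_speedup(estimated_months: float, available_months: float) -> float:
    """How much faster the project must go to fit the available schedule."""
    return estimated_months / available_months


if __name__ == "__main__":
    for n in range(1, 5):  # four generations, chosen only for illustration
        print(f"after {n} generation(s): gap of {productivity_gap(n):.2f}x")
    print(f"schedule speedup needed: {required_speedup(14, 9):.2f}x")
```

After three generations the design is already about 1.59X larger than the team’s capacity, and squeezing 14 months of work into 9 needs a speedup of about 1.56X.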


So the first time their skin was saved was in getting the design on track for an RTL freeze in March 2014, using 8 bit color. Then in January 2014 came the Consumer Electronics Show (CES). Lots of people were showing HEVC but with 10 bit color. They realized that the project had no future; they had guessed wrong about market requirements. “We messed up,” the head of the division admitted. Without HLS there was absolutely no way to fix this since it affected every register and operator of every datapath in the entire design.

They recoded the C++ for 10 bit instead of 8. They ran it through HLS. All the pipeline stages changed due to the different performance. But would it work? How could they verify it? They worked out that to run the 100,000 tests the HEVC standard provided at the RTL level would take 1000 cores for 3 months. Instead they did all the verification running all the tests at the C++ level. It took a single CPU 3 days. Of course they still ran a bunch of tests at the RTL level, just to be safe. They taped out the two versions, a 20nm block for mobile at 510MHz and a 28HP discrete GPU at 800MHz.
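
To put the verification numbers side by side, here is a quick calculation in core-days. The only assumption beyond the figures quoted above is a 30-day month, since the talk just said “3 months”.

```python
DAYS_PER_MONTH = 30  # assumption: the talk only said "3 months"


def core_days(cores: int, days: float) -> float:
    """Total compute consumed, in core-days."""
    return cores * days


# Running the HEVC test suite at the RTL level: 1000 cores for 3 months.
rtl_core_days = core_days(1000, 3 * DAYS_PER_MONTH)

# Running the same tests at the C++ level: a single CPU for 3 days.
cpp_core_days = core_days(1, 3)

ratio = rtl_core_days / cpp_core_days

if __name__ == "__main__":
    print(f"RTL level: {rtl_core_days:,.0f} core-days")
    print(f"C++ level: {cpp_core_days:,.0f} core-days")
    print(f"ratio:     {ratio:,.0f}x")
```

That is 90,000 core-days against 3, a difference of roughly 30,000X in compute, which is why running the full suite at the C++ level was the only practical option.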

Nvidia’s conclusions:
  • We still have jobs!
  • Code is 5X smaller
  • Pipeline changes that are painful in RTL are not in HLS
  • QoR is the same
  • Complex algorithms go through wonderfully
  • Designs with strict cycle-by-cycle requirements are more challenging with no explicit clock

One company who have been using HLS for a long time is ST Microelectronics. Their experience has been that increasingly RTL is the wrong level to code IP since it locks in the microarchitecture in a way that is next to impossible to change. The same algorithm in 28nm and 16nm will require a completely different pipeline running at a completely different frequency (after all, the throughput of, say, 4K video doesn’t change depending on the silicon used for implementation, it is set by the video standards).
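
A quick calculation shows why a fixed throughput forces a different pipeline at each clock frequency. The sketch assumes 4K means 3840×2160 and a frame rate of 60 frames per second; neither figure is given above. The two clock frequencies are the ones quoted earlier for the two Nvidia tapeouts, used here only as sample values.

```python
# Assumptions: UHD "4K" resolution and 60 frames per second.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160
FRAMES_PER_SECOND = 60


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels per second the video standard demands, whatever the silicon."""
    return width * height * fps


def cycles_per_pixel(clock_hz: float, pixels_per_second: int) -> float:
    """Clock cycles available to process each pixel at a given frequency."""
    return clock_hz / pixels_per_second


if __name__ == "__main__":
    rate = pixel_rate(UHD_WIDTH, UHD_HEIGHT, FRAMES_PER_SECOND)
    print(f"required throughput: {rate:,} pixels/s")
    for clock_mhz in (510, 800):
        budget = cycles_per_pixel(clock_mhz * 1e6, rate)
        print(f"{clock_mhz} MHz: {budget:.2f} cycles per pixel")
```

Under these assumptions a 510MHz implementation has about 1.02 cycles per pixel while an 800MHz one has about 1.61, so the same algorithm has to be pipelined and parallelized differently in each case. HLS regenerates that pipeline from the same source; hand-written RTL has to be restructured.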

The world of using FPGAs in datacenters has been at the forefront of everyone’s attention recently with Intel’s acquisition of Altera (not yet completed). Nobody thinks Intel is doing this because they want a small (by their standards) FPGA business, but because if integrated FPGAs in the datacenter are important they need control of that technology at more than a partner/foundry level. Last year a high-profile paper by Microsoft showed how using FPGAs, with high-level synthesis to get the algorithms in, could double the performance of the Bing search engine. Other similar papers have shown acceleration of other algorithms that are not especially well handled on a microprocessor compared to the huge inherent parallelism of an FPGA.

The leader in HLS, at least in terms of real active users, has to be Xilinx. They acquired AutoESL a few years ago as the seed for what has become Vivado HLS. There are over 1000 users not just playing around with it but using it for production bitstreams (the FPGA equivalent of a tapeout). For sure there is some learning curve, but it seems to take about a week with an AE to get a team proficient in the new methodology. That mirrors the Nvidia experience where the team hit the ground running with zero prior HLS experience.

It is also used under the hood in several other Xilinx approaches. For example, in SDSoC it is only necessary to mark a block of software code for acceleration. Using HLS, that block will be turned into FPGA gates, along with both the gates needed to move data in and out and the software stubs. The rest of the code runs on the ARM processor on the same silicon. In a sense, HLS is the key technology for software-defined hardware.

Xilinx is not the only company in the space. Cadence developed their C-to-Silicon Compiler and then acquired Forte Design, so they are also a player in the space. That would be a bagpipe player, because by acquiring Forte they also acquired the responsibility to arrange for the traditional bagpipers to play Amazing Grace to close the Design Automation Conference exhibits, which they duly did on Wednesday evening.

It is a long way from academic research and Behavioral Compiler to where we are today, with mature technology in use for some extremely demanding SoCs and high-end FPGA systems.