
Design Collaboration, Requirements and IP Management at #52DAC
by Daniel Payne on 05-14-2015 at 12:00 pm

If you're an SoC designer attending DAC in June, you'll probably want to check out the EDA vendors that enable design collaboration among engineers and designers spread across a building, a campus or the globe. Dassault Systemes offers tools and methodologies for design collaboration, requirements and IP management.

They have a tool called DesignSync that lets you share hierarchical design data among everyone on your design, verification and implementation team. This kind of tool answers the age-old question, "Am I working on the right version of the hardware, software or firmware today?"

Related – Flexible Integration System for IPs into SoC

Another perennial issue with SoC design is getting design closure for timing and power between front-end and back-end implementation. The Dassault tool for this issue is Pinpoint, used by both RTL designers and physical designers.

If you follow EDA acquisitions then you’ll know that the Pinpoint product came from an acquisition of Tuscany Design Automation back in 2012.

Related – Pinpoint: Getting Control of Design Data

Every SoC has a list of requirements, but the big question remains, “Did my SoC implementation actually meet the requirements that were specified?” If you take a structured approach and use requirements-driven verification, then the answer to that question will be yes, otherwise you take your chances that a requirement was missed or even forgotten along the way. Requirements Central is the tool from Dassault you will want to see at DAC.
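
To see what a structured, requirements-driven flow buys you, here is a toy Python sketch of the core bookkeeping (the requirement IDs, test names and results are invented for illustration; this is not Requirements Central's actual data model):

```python
# Trace each requirement to the verification tasks that cover it,
# then flag anything uncovered or failing. All data here is hypothetical.

requirements = {
    "REQ-001": "USB 3.0 link trains at 5 Gb/s",
    "REQ-002": "Chip enters sleep mode in under 1 ms",
    "REQ-003": "AES engine sustains 1 Gb/s throughput",
}

# Requirement -> list of (verification task, passed?)
verification_results = {
    "REQ-001": [("usb_link_training_test", True)],
    "REQ-002": [("sleep_entry_latency_test", True), ("sleep_power_test", False)],
    # REQ-003 has no tests at all -- exactly the gap this flow is meant to catch.
}

for req_id, text in requirements.items():
    tests = verification_results.get(req_id, [])
    if not tests:
        print(f"{req_id} UNCOVERED: {text}")
    elif not all(passed for _, passed in tests):
        print(f"{req_id} FAILING: {text}")
    else:
        print(f"{req_id} verified")
```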

Related – Managing Semiconductor IP

SoC designs use a myriad of IP blocks from both internal and external sources, so how do you keep track of them all and know that the right versions are in use? Gone are the days when a simple Excel spreadsheet could track a dozen or two IP blocks; now an SoC can have hundreds of IP blocks, plus variations of the SoC for different markets. The Enterprise IP Management tool from Dassault will be at DAC for you to see, ask questions about and discover how it would benefit your team.
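
The same bookkeeping idea applies to IP versions; a minimal sketch (block names and version numbers invented, not the Enterprise IP Management data model) of checking an SoC configuration against an approved catalog:

```python
# Compare the IP versions an SoC configuration uses against the
# approved catalog; report anything unapproved or out of date.
approved_catalog = {"usb3_phy": "2.1", "ddr4_ctrl": "1.4", "aes_core": "3.0"}

soc_config = {"usb3_phy": "2.1", "ddr4_ctrl": "1.3", "gpu_core": "0.9"}

for block, version in soc_config.items():
    approved = approved_catalog.get(block)
    if approved is None:
        print(f"{block}: not in the approved catalog")
    elif version != approved:
        print(f"{block}: using {version}, approved version is {approved}")
    else:
        print(f"{block}: OK")
```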

Related – Enterprise IP Management – A Whole New Gamut in Semiconductor Space

On the automotive side you’ll be able to see how Dassault tools and methodology can be used to:

  • Comply with auto industry safety standards
  • Enable early vehicle performance validation
  • Leverage design re-use
  • Use mechatronics with systems engineering in a synchronized fashion

Summary
Visit my friend Michael Munsey at DAC in booth #2932 and see for yourself what Dassault Systemes has to offer to make your next SoC project flow more quickly and with fewer headaches. There's a DAC signup form here.

If you're in the automotive sector, check out booth #1303 and look for Kiosk 7. The Dassault tools work with all of the leading EDA point tools and sub-flows, which means you don't have to integrate anything to receive the benefits of their automation in collaboration, requirements and IP management.

Also Read

Managing Semiconductor IP

Filling the Gap between Design Planning & Implementation

Smart Collaborative Design Reduces Business Risk


Intel and eASIC: Marriage or Just Good Friends?
by Paul McLellan on 05-14-2015 at 7:00 am

A couple of days ago Intel announced a collaboration with eASIC. Here is the opening paragraph of the press release:

Intel Corporation today announced plans to develop integrated products with eASIC Corporation that combine processing performance and customizable hardware to meet the increasing demand for custom compute solutions for data centers and the "cloud." The new parts will enable acceleration of up to two times that of a field programmable gate array (FPGA) for workloads like security and big data analytics while also speeding the time to market for custom application specific integrated circuit (ASIC) development by as much as 50 percent.

Nowhere in the press release does it say whether Intel will be manufacturing parts for eASIC, or whether the intention is to integrate the eASIC technology onto the same die as the microprocessor. My guess would be that the plan is to use some sort of package-in-package technology but that they will be separate die. In the past eASIC has used Fujitsu and GlobalFoundries for manufacturing, and the most advanced arrays they have are at 28nm.

The first conclusion that everyone has leapt to is that this is Intel's reaction to its recent failure to acquire Altera. I don't think that is true.

eASIC's boilerplate in the press release reads:

eASIC is a semiconductor company offering a differentiated solution that enables us to rapidly and cost-effectively deliver custom integrated circuits (ICs), creating value for our customers' hardware and software systems. Our eASIC solution consists of our eASIC platform which incorporates a versatile, pre-defined and reusable base array and customizable single-mask layer, our easicopy ASICs, our standard ASICs and our proprietary design tools.

If I interpret this correctly, the array is programmed with a single metal layer. Clearly it is possible to build accelerators for datacenter applications using this technology, but they are not reprogrammable once they have been manufactured. It seems to me that what is required for datacenter accelerators is the capability to dynamically reconfigure the accelerator to do whatever is required: visual search at one moment, voice recognition the next. FPGAs such as those from Xilinx and Altera can do this. I don't know Altera's capabilities in detail, but I do know that Xilinx can even reconfigure part of an array while keeping the rest of the array running an application during the reprogramming. So part of the array could be doing voice recognition, say, while another part is being set up for facial recognition. Even if Altera is not that flexible, they can still reprogram the entire array depending on what acceleration is required. These algorithms are by no means stable, so the ability to fix bugs and to upgrade hardware performance as the algorithms improve seems important.

On the other hand, Intel also says:

This collaboration is part of Intel's strategy to integrate reprogrammable technology with Intel Xeon processors to greatly improve performance, power and cost.

But it seems a bit of a stretch to call eASIC’s technology reprogrammable unless I have completely misunderstood things. It is programmable at manufacture, and the bases can be banked to keep turnaround time low. It has an FPGA-like flow making for fast design and better power and performance than a true FPGA, but it seems to lack the most important feature: true reprogrammability.

Obviously this is some sort of shot across ARM's bows, since ARM and its partners (such as Cavium, Qualcomm and more) are trying to infiltrate the datacenter with lower cost, lower power and smaller physical form factor (and lower performance, but not by much). Xilinx's Zynq arrays contain multicore ARM processors as well as a lot of programmable fabric and might be ideally suited to some of these types of applications that require more acceleration than can be obtained from a processor alone, no matter how fast.

The Intel press release is here. The eASIC website is here.


Samsung Foundry Update!
by Daniel Nenni on 05-13-2015 at 11:00 pm

It is hard to believe that Samsung is celebrating its 10th anniversary in the foundry business this year. It certainly has not been an easy road, but as of late you cannot argue with the results. Samsung is the first foundry to put 14nm silicon into smartphones, beating the #1 semiconductor company (Intel) and the #1 foundry (TSMC). That alone is worthy of significant praise; just wait until the next iPhone refresh. If, as I have "predicted", the Apple A9 and the next generation of QCOM Snapdragons are based on Samsung 14nm, 2015 will be the year of Samsung Foundry, absolutely.

Samsung is hosting another VIP event in San Francisco next week. Paul McLellan was my date last year, this year I’m taking my beautiful wife. If you want to stay happily married for 30 years you need to schedule regular date nights and this event totally counts.

The theme last year was "The voice of the body"; this year it's "Smart, Connected Lifestyle":

In a world that is moving faster and faster, how can you get everything from your mobile device the moment you want it? Join Samsung Semiconductor executives for overviews on:

  • Mobile memory and flash technology
  • R&D driving innovation forward
  • Advanced process technology
  • Exynos application processor
  • IoT strategy

Last year Samsung talked about 14nm and this year they delivered it. I’m hoping this year they’ll talk about 10nm because from what I hear, next year Samsung will deliver it. If you want to talk about trust, trust is delivering on what you promise, absolutely.

The next big event is the 52nd Design Automation Conference in San Francisco. Given all of the parties, DAC will count as multiple date nights, so I'm good through June. Of the foundries, Samsung had the best DAC presence last year, in my opinion. They showed 14nm wafers and silicon and delivered on that promise with the Exynos 7420 and the Galaxy S6. In fact, if you look at the tear-down, the S6 used 16 Samsung chips out of 25 total. Shannon is the product family name for Samsung's mobile chipsets; you can read more about them here: "The Curious Case of Samsung's Shannon Chips".

From what I’m told, Samsung will have another DAC theater for customer and partner presentations. Last year I sat through quite a few of them and really enjoyed it. They were very interactive and well attended. This year I expect a much bigger DAC attendance based on the other conference numbers which were up double digits. My guess is that semiconductor design is getting much harder and more competitive in regards to performance, power, area, and yield so people are searching much more aggressively for design tricks and tips than before. Just my opinion of course.

You can still get free I Love DAC tickets HERE. SemiWiki will also be sponsoring the Love IP Party at DAC on Monday night, and yes, it counts as a date. However, we will not be dressing up. We grew up in the '70s and once is enough! Since SemiWiki is a sponsor we will have tickets to give away, so let me know if you need one.


Quark Adds Muscle to Intel in the IoT World
by Pawan Fangaria on 05-13-2015 at 4:00 pm

We have been hearing about Intel's Quark processor, based on its good old Pentium architecture, making waves in the IoT world. The Quark CPU core is said to be Intel's smallest. It is supposed to be inexpensive and extremely low power; a perfect combination for IoT devices. The Pentium architecture equips the processor to perform at the edge as well as in the gateways where data analysis becomes important. Without analysis, the data from an IoT device can be useless; it's the analytics that makes the data meaningful for its connection to the internet in the IoT world. Also, Intel has different versions of the Quark SoC to cater to the requirements of different IoT segments.

Intel already has a Quark SoC X1000 based IoT platform built on an open architecture, which customers and developers can use to develop customized platforms for different applications. The gateway development kit provides an option for different processors to be used, e.g. the Intel Quark SoC X1000 or Intel Quark SoC X1020D. The SoC enables the complete IoT ecosystem from the edge to the data center. In a brief video, Intel demonstrated this working for the industrial, energy and transportation segments of IoT.

In the transportation segment, you can get data from the power transmission through the cloud, and that data can be used in different applications to control and monitor transportation. The data can serve commercial applications for transportation companies as well as personal applications such as individuals monitoring their driving styles. The Quark chip can be used as a gateway or a sensor to stream data to the cloud.

In industrial applications such as building management systems, it can be used to sense data from major components such as heating, ventilation and air conditioning (HVAC) and send it to the cloud, enabling the building to be monitored and controlled in a cost-effective and efficient manner.

To get started quickly with this IoT platform, developers can buy readily available Galileo boards for about $60, put their applications on top of professional-grade Linux and then design their own custom boards.

You can view a brief demo at the Intel website here after a quick registration. Peter Dice, Software Architect at Intel, explains very concisely how the Intel Quark SoC X1000 based system works and how it fits into IoT applications.

There are many developers contributing to IoT innovation with Quark. Intel's Embedded Design Center provides hardware, software and solutions that embedded and IoT developers can use to participate effectively in IoT development. Product examples are available for new developers to study.

This year, Intel IoT Gateways will come pre-integrated with the Wind River Edge Management System (EMS), a cloud-based IoT platform that enables devices to securely connect to a centralized console. An IoT solution also needs to be interoperable, because many devices in the system will be based on existing legacy systems. Intel, McAfee and Wind River together provide an interoperable, secure and scalable IoT platform that can work seamlessly with third-party solutions as well.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com


"An art can only be learned in the workshop of those who are winning their bread by it"
by Paul McLellan on 05-13-2015 at 7:00 am

That was said by the novelist Samuel Butler, but it is not a bad description of why you should spend the Sunday at DAC in one of the workshops that are taking place that day.

One workshop is on Design Automation for Beyond CMOS Technologies. Before getting to design automation, it is good to start with which technologies are potentially on deck for after CMOS. I spoke with Rasit Topaloglu who is one of the organizers of the workshop. He is in development in the IBM Systems Group working on next generation computer systems. To give some focus to the workshop they decided to focus on what technologies might be used for high-performance computing (HPC) once we get beyond CMOS and even beyond charge-based computing (so carbon nanotube counts as CMOS in this definition!).

The workshop opens with a keynote from Philip Wong of Stanford titled Memory Leads the Way to Better Computing: the N3XT 1000X in Energy Efficiency.

There are then three sessions, each with three papers, and a panel session at the end. There is a mixture of people from academia and industry (on the industry side there are presentations from Intel, GlobalFoundries and imec). The first session is on Architecture: how emerging devices will change architecture, perhaps beyond the traditional von Neumann model. After lunch there is a session on Spin Logic; the fact that this is the session where both imec and Intel are presenting shows how serious the investment in the area is. The final session is on New Materials, Design and Modeling, which is just what it says on the label: one presentation on new materials (topological insulators), one on design (from GF) and one on modeling.

There are 8 more workshops in addition to the “to infinity and beyond” one.

  1. Low Power Image Recognition Challenge. This workshop is a programming contest. Registration is available for attendees to observe the participants. There is no workshop presentation.
  2. Design Automation for Beyond CMOS Technologies. See above.
  3. SEAK 2015: DAC Workshop on Suite of Embedded Applications and Kernels. This workshop is a venue for the DARPA SEAK project to interact with the broader embedded community.
  4. System to Silicon Performance Modeling and Analysis. The integration of heterogeneous electronic systems composed of SW and HW requires not only proper handling of system functionality, but also appropriate expression and analysis of various extra-functional properties: timing, energy consumption, thermal behavior, reliability and cost, as well as performance aspects related to caching, non-determinism and probabilistic effects.
  5. Computing in Heterogeneous, Autonomous 'n' Goal-Oriented Environments. As the push for parallelism continues to increase the number of cores on a chip, system design has become incredibly complex; optimizing for performance and power efficiency is now nearly impossible for the application programmer.
  6. Design Automation for HPC, Clouds and Server-class SoCs. Traditionally, SoC design methods have focused on low-power consumer electronics or high-performance embedded applications. But now SoC design methods are moving into high-end computing due to the emergence of embedded IP offering capable double-precision floating point, 64-bit addressing, and options for high-performance I/O and memory interfaces. System-on-chip (SoC) integration can further reduce power, increase integration density and improve reliability.
  7. Requirements-driven Verification (for Standards Compliance). Requirements-driven verification is based on ensuring that feature-level requirements are adequately verified by tracing such requirements through to verification tasks.
  8. Enterprise-level IP Management for the DAC Semiconductor Professional. This workshop addresses a burgeoning problem for semiconductor companies – managing the vast amount of IP being produced and consumed by their design teams. Over the last 20 years, we have seen the transition from almost no IP reuse to some modern SoCs owing nearly 90% of their functionality to reused or purchased IP.
  9. Interdisciplinary Academia-industry Collaboration Models and Partnerships. Design automation, as well as circuit design and implementation, did NOT get easier recently! Even in challenging economic situations, industry has to find ways to partner closely with academia to ensure continued leading-edge research and education.

Most of these workshops run from around 9am to around 4.30pm. Check the DAC website for up-to-date times and the rooms where the workshops will be held. Then at 5.30pm you can walk over to the InterContinental Hotel in time for the traditional Sunday night DAC reception.

The DAC workshop page is here.


Chip Design – Coming of Age in the Computer Age
by Mike Gianfagna on 05-13-2015 at 2:30 am

    Previously, I examined chip design in the late 1970s and early 1980s. It was a nostalgic ride – thanks to all those who shared their stories. I enjoyed reading all of them. I drew two basic conclusions in the prior post:

1. Chip design problems are the same, more or less, over time; the numbers just get bigger.
2. Raising abstraction levels, re-using IP and automating design have been the cure for chip design problems for a long time.

Let's now look at what we've managed to accomplish since the 1980s. That's over 30 years. What did we get done in all that time? A lot, actually.

    To start with, an industry segment was born. Electronic design automation (EDA) didn’t exist in the late 1970s. Those were arguably the early years of the computer age of chip design. Graphics design was king. The big three companies were Applicon, Calma and Computervision. You’ll find history on the first two companies in the comments from my prior post. In those days, it was called computer-aided design (CAD).


    It all changed in the early 1980s. The complex software needed to design and build ICs became available on a merchant basis. This software was no longer just available to big, vertically integrated companies who could afford to hire teams of software developers (like Bell Labs and RCA). Access to EDA software was now in reach for everyone, thanks to the early pioneers at Daisy Systems, Mentor Graphics and Valid Logic. The birth of EDA started a growth trend for semiconductors that lasted a long, long time. The industry evolved from an elite market to a democratized market – custom ICs were no longer just a rich person’s game. Coincidental with the birth of the EDA industry was the birth of the ASIC industry.

    As EDA grew and matured, consolidation occurred. Daisy merged with Cadnetix and Valid was bought by Cadence. Mentor continued as Mentor. Today the big three are Synopsys, Cadence and Mentor. Business innovation was prevalent as the EDA industry matured and learned how to build, market, sell and support complex software. But what of technical innovation? There are many significant innovations that have advanced the state of the art. In the interest of time, I will touch on one.

    Logic synthesis. What started as a thesis project for a talented engineer working at GE Semiconductor became a major discontinuity in IC design and a new force in EDA. The engineer is Aart de Geus and the company is Synopsys. The idea to automate the creation of logic gates from a high-level language embodies the principles of raising abstraction levels, re-using IP and automating design. Aart really nailed it with this one. It took quite a few years to catch on, but it clearly changed the course of the industry.

In more recent times, however, something is amiss. The glory days of democratized silicon access and large numbers of chip start-ups have given way to very expensive design projects and a shrinking base of start-ups. Design at advanced process nodes is once again a rich person's game. The elite market is back. A handful of foundries can build these chips and a handful of companies can afford to buy them. So the question is, what happens next? Has the cycle finished, or is there another wave of growth on the horizon? And if so, what are the catalysts that make it happen? We have to look beyond the computer age to find answers. More next time.


    From Mike Gianfagna of eSilicon

    Also Read: Chip Design Problems Remain the Same, More or Less


Saving Time and Money on Your Next SoC Project
by Daniel Payne on 05-12-2015 at 8:00 pm

Every SoC project that I know of wants to finish on time, stay under budget and maximize profit per device. When I first started out doing DRAM design, I learned that we could maximize profit by doing shrinks of existing designs, moving from ceramic to plastic packages, and reducing the amount of time spent on a tester. Today the economic reality is similar, so this blog is focused on saving tester time, which directly impacts product cost. My source for the latest info on reducing test costs is a webinar given last month (now archived) by Advantest in the ATE (Automatic Test Equipment) world and Synopsys in the DFT (Design For Test) software segment. Divide, parallelize and test is a proven cost-reduction approach, covering both multisite test of multiple devices and concurrent test of multiple IP blocks within an SoC.

We often blog about how beneficial collaboration between EDA, IP and foundry companies is for design success, but there is also collaboration between ATE and EDA vendors to ensure better test results and higher profits through lower tester times.

    Related – Two New Announcements at ITC from Synopsys

    Multisite Test
Adam Cron from Synopsys showed that every ATE system has a limited number of channels, so the greatest multisite count is (# ATE channels) / (# pins per die). The percentage time saving with N-site testing is then 1 – 1/N; for example, with N = 5 the maximum time saving is 80%.
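
Those two formulas are easy to sanity-check; here is a minimal Python sketch (the channel and pin counts are made-up examples, not figures from the webinar):

```python
def max_sites(ate_channels: int, pins_per_die: int) -> int:
    # Greatest multisite count: # ATE channels / # pins per die
    return ate_channels // pins_per_die

def time_saving(n_sites: int) -> float:
    # Percentage time saving with N-site testing: 1 - 1/N
    return 1.0 - 1.0 / n_sites

print(max_sites(ate_channels=2560, pins_per_die=512))  # 5 sites
print(f"{time_saving(5):.0%}")                         # 80%, as above
```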

With stimuli broadcast, all of the inputs to the DUTs receive the same data, so you can test more devices with fewer ATE channels.

    Factors that increase the multisite test time were discussed: edge dies, test resources, index time and branching during test flow.

    Related – Catching IC Manufacturing Defects with Slack-Based Transition Delay Testing

    The DFT software from Synopsys that supports multisite and concurrent test is called DFTMAX Ultra and it features:

    • Hardware test compression for data reduction
    • Simplified scan shift clocking
    • Uses fewer pins, down to 1 SI (Scan In) and 1 SO (Scan Out)
    • Power-aware during scan testing to stay in spec
    • Synthesis-based to minimize iterations between design and test

    Concurrent Test

    Dave Armstrong from Advantest was up next in the webinar and talked about how their company is #1 in ATE, and that their two most popular testers are the V93000 and T2000 models.


    Advantest V93000 and T2000 testers

These ATE systems have from 8K to 12K pins and can support a pattern depth of 32 Gbit. As a conceptual example of concurrent test, a four-core SoC with multiple IP blocks was considered, showing how you could start with a purely sequential test approach and then gradually add more parallelism, decreasing test time.
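
As a back-of-the-envelope illustration of that refinement (block names and test times are invented for this sketch): a sequential flow costs the sum of all block test times, while blocks grouped to run concurrently cost only the longest test in each group.

```python
block_test_ms = {"cpu": 40, "gpu": 35, "usb3": 10, "ddr": 25, "aes": 8}

# Purely sequential: every block tested one after another.
sequential = sum(block_test_ms.values())  # 118 ms

# One refinement: blocks that don't share tester resources run together.
concurrent_groups = [["cpu", "usb3", "aes"], ["gpu", "ddr"]]
concurrent = sum(max(block_test_ms[b] for b in group) for group in concurrent_groups)

print(f"sequential: {sequential} ms, concurrent: {concurrent} ms")  # 118 vs 75
```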

An approach called memory pooling allows you to combine ATE channels, adding pattern depth up to the 32 Gbit limit in the most efficient manner.

    Branching in a test flow is often used for binning based on performance, so the Advantest approach is to parallelize the multiple branch paths.

    Both multisite and concurrent approaches help reduce test times and require:

    • High pin-count ATE with flexibility
    • DFT tools with hardware compression
    • A fast-responding thermal control environment

    Summary
EDA and ATE companies are collaborating to drive down test time through hardware compression, DFT tools and methodologies. Synopsys and Advantest have shown that you can save time and money on your next SoC project by using concurrent and multisite test techniques.


Beware of Parameter Variability in Clock Domain Crossings
by Jerry Cox on 05-12-2015 at 4:00 pm

    How should we assess the risk of harmful metastability in a clock domain crossing (CDC) when the semiconductor process has significant parameter variability? One possibility is to determine the MTBF of a synchronizer at the worst-case corner of the CDC. But that approach has some conflicting complications:

    • Synchronizer failures can occur at any time before or after the MTBF.
    • Most chips in a wafer perform better than they do at the worst-case corner.
    • High-volume, safety-critical products should be held to a high standard.
    • The worst-case environment for a CDC may be rare in actual use.

An alternative that has received some recent attention is Monte Carlo simulation, randomly varying PVT conditions over the range expected for the product. Instead of an estimate of MTBF, this approach leads to an estimate of the probability of a metastability-induced synchronizer failure, assessed over the expected distribution of parameters and conditions. However, in the early stages of design, an impractical level of effort is required to carefully investigate even a few alternatives.

These thoughts led me to investigate the effects on the metastability settling time-constant of variability in the transistor threshold voltage. This approach bypasses the heavy burden of Monte Carlo simulation, but still provides insight into the effects of parameter variability.

I chose to study the variation in threshold voltage because it has a major effect on the settling time-constant and is a classic example of a Gaussian distribution. This investigation led me to wonder what happens when the cross-tied transistors' threshold voltages are at an extreme value of that distribution, in the vicinity of the metastability voltage Vm. If the simple theoretical model of a strongly inverted transistor holds, the settling time-constant would be infinite; not a good thing for a synchronizer's MTBF. But that's theory. Can reality be different?

To anticipate reality, we used a benchmark circuit, PublicSync, to simulate synchronizer behavior. As you can see in the figure above, the settling time-constant grows significantly as the transistor threshold moves toward and above the metastability voltage Vm, and it does so smoothly, without a singularity at Vm. These results were obtained with the analysis tool MetaACE, using an automated scan of Vt. By fitting a curve to the data points, it was possible to calculate the probability of a synchronizer failure, Pr(fail), given the mean and standard deviation of Vt. The details of this calculation can be found here.
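
For readers following the argument, the standard two-parameter metastability model is helpful background (these are textbook relations, not formulas taken from the MetaACE report): MTBF grows exponentially with the allowed settling time ts over the time-constant tau, and the threshold-aware measure averages the failure probability over the Vt distribution instead of evaluating a single corner:

```latex
\mathrm{MTBF}(t_s) \;=\; \frac{e^{t_s/\tau}}{T_0\, f_c\, f_d}
\qquad\qquad
p_{vt}(\mathrm{fail},\, t_s) \;=\; \int p(\mathrm{fail},\, t_s \mid V_t)\;
  \mathcal{N}(V_t;\, \mu, \sigma^2)\, dV_t
```

Here T0 is the metastability window and fc and fd are the clock and data-change frequencies; a growing tau near Vm shrinks the exponent and collapses the MTBF.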

    The table shows the ratio of two normalized calculations of the probability of failure:

• p_wc(fail, ts): calculated assuming worst-case (wc) conditions for Vt and an allowed settling time ts
• p_vt(fail, ts): calculated assuming a distribution of Vt and an allowed settling time ts

This ratio of probabilities shows how failures are overestimated by the worst-case (wc) measure compared with the varying-threshold (vt) measure. The wider the distribution of threshold voltage and the longer the allowed settling time ts, the more this discrepancy grows. For a 500 MHz clock, the right-hand column would correspond to a two-stage synchronizer and a latency of 4 ns. For example, under some unsurprising safety-critical product conditions and a standard deviation of 20 mV, the wc measure suggests an extra, but unnecessary, synchronizer stage with its accompanying added latency.

So the take-away message for me is: calculate the probability of failure, Pr(fail), not the worst-case MTBF. Such a probability-based measure of risk should include the number of units in use, the unit lifetime and the distribution of semiconductor parameters such as transistor threshold voltage. Pr(fail) also avoids the misleading tendency to associate the MTBF with a failure-free period; that mistake is common, and it masks the real possibility of failures occurring at any time during a product's lifetime.
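
A numeric sketch of that take-away, treating synchronizer failures as a Poisson process (the figures below are illustrative, not from the article):

```python
import math

mtbf_per_unit_s = 1e16              # per-unit MTBF from a metastability model (~300M years)
n_units = 1e7                       # units in the field
lifetime_s = 10 * 365 * 24 * 3600   # 10-year product life

# Expected fleet-wide failures over the product lifetime, and the
# probability that at least one unit fails.
expected_failures = n_units * lifetime_s / mtbf_per_unit_s
pr_fail = 1 - math.exp(-expected_failures)
print(f"expected failures: {expected_failures:.2f}, Pr(fail) = {pr_fail:.1%}")
```

A per-unit MTBF that sounds astronomically safe still yields a Pr(fail) of about 27% across ten million units and ten years, which is exactly the point about fleet size and lifetime.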


ARM A57 (A53) Virtualizer + IP Accelerated = ?
by Eric Esteve on 05-12-2015 at 12:00 pm

    Hybrid IP Prototyping Kit from Synopsys!
Synopsys launched its IP Accelerated initiative last year. The goal was clearly to accelerate time-to-market (TTM) by providing a complete set of "tools" to augment design productivity:

• IP Prototyping Kits with reference designs that work out of the box
• IP software development kits that enable early SW bring-up, debug and test
• IP subsystems (customized integration and testing of controller and PHY IP) to lower risk and speed time to market

How can system-on-chip design integration be further boosted? When the SoC integrates one of the high-performance ARM Cortex-A57 or Cortex-A53 64-bit processor IP cores, the customer wants to test real-world interfaces with this 64-bit system, on top of developing around the DesignWare IP (thanks to IP Accelerated). Adding a Virtualizer Development Kit (VDK) for ARMv8-based systems to the DesignWare IP Prototyping Kit creates the Hybrid IP Prototyping Kit.

Take the example of a design team developing an ARMv8-based server chip: boosting software developer productivity will accelerate TTM. Using a virtual prototype for the ARM Cortex-A57 (A53) is a known good solution for developing software faster, even more so in a multi-core environment. Hybrid IP enables connecting the software development kit with the HAPS-based DesignWare IP Prototyping Kit. In the example of a USB 3.0 IP subsystem (PHY + controller) integrated in the SoC, the design team can fix software issues by going back to the Virtualizer, or benefit from a fast iteration flow for easy modification of the USB 3.0 IP. Synopsys calls this tool-set "Hybrid IP", as the designer can optimize software changes and hardware modifications in a single flow, using the UMRBus to join the virtual and physical worlds.

The physical world is as close as possible to the target SoC and is represented by the well-known HAPS-DX based IP development kit, comprising the out-of-the-box reference design (controller + PHY + system logic) and providing:

    • Pre-optimized, pre-loaded Linux OS & reference driver
    • Fast iteration flow for easy modification of IP

The IP integrated into the HAPS-DX board is a proven target for early software development. Thanks to the PCIe-based UMRBus, on a x4 PCIe Gen2 interface offering 20 Gbit/s of bandwidth, the designer can connect immediately (and fast) with the Virtualizer Development Kit to access the multi-core processor virtual model, including the memory and peripherals.

The Hybrid IP Prototyping Kit provides a very intuitive way to run hardware and software debug at the same time with a high degree of confidence, as the interface IP (USB 3.0 in this case, with PCI Express, MIPI UFS and DDRn to come soon after) is the exact configuration of the controller and PHY to be integrated into the design, and the ARMv8 virtual prototype is the best available model a designer can use to speed up hardware and software co-development. Using hybrid prototyping will not only speed up TTM, it will allow optimizing the system solution, as the designer can intuitively modify the hardware at the same time as the software, with all the tools integrated within the same user interface.

The benefits are multiple: the Hybrid IP Prototyping Kit allows fast IP bring-up and debug with a complete reference design combining the VDK and interface IP. The reference design includes a fast iteration flow with easily modified interface IP design/build scripts. The designer can explore system drivers with real-world I/O and 64-bit processors. The reference drivers and application provide early software bring-up, debug and test with support for Linux. Last but not least, the Hybrid IP Prototyping Kit is expandable and scalable to full SoC prototypes.

    From Eric Esteve from IPNEST


Is Low Power a Challenge? ICE-Grain Answers the Challenge
by Paul McLellan on 05-12-2015 at 7:00 am

    Blogs have limited wordcount so insert your own generic opening paragraph here about the importance of low power in IC design. Mention IoT and cloud datacenters for extra credit.

It is well known that the biggest reductions in power come from changes at the architectural level. Tools and process can do some things, and since they are automatic, take them. But to get serious power savings you have to power down blocks, gate clocks, scale voltages and so forth. This stuff is hard. The most advanced design groups have teams that specialize in it, but mainstream SoC groups find it challenging. It usually requires interaction between software and hardware, and, unfortunately, the embedded software engineers do not really understand much about the chip, while the designers don't really understand the software people's issues. Often power-saving registers go unused, at least in the first version of the design, since there isn't time to get it all working together. There is complex timing associated with powering blocks up and down or varying their frequency, and this is hard to manage in software, since the software has other things to do in parallel. So the processor has to do more than just sit in a device driver for the millisecond or two while it all happens, and that sort of programming is hard. Despite working in semiconductors/EDA all my career, my PhD is in operating systems, so I have the scars. I have written my share of device drivers.

The biggest savings come from exploiting times when blocks are going to be idle. To make the design manageable, it is attractive to have relatively few blocks to worry about, but then the chance of a block being completely idle diminishes compared with a larger number of smaller blocks. An additional complexity comes from not really knowing whether it is worth powering a block down in advance, which can lead to power-thrashing, whereby a block is not even fully turned off before it is required and has to be brought up again, which wastes power.
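
A toy model of that trade-off (all numbers invented for illustration): gating a block only pays off when the predicted idle interval exceeds the break-even time at which the leakage saved equals the energy spent switching the block off and back on.

```python
def breakeven_time_us(leakage_mw: float, transition_energy_uj: float) -> float:
    # Leakage in mW equals uJ/ms, i.e. leakage_mw / 1000 uJ/us,
    # so break-even = overhead energy / leakage power saved.
    return transition_energy_uj / (leakage_mw / 1000.0)

def should_gate(predicted_idle_us: float, leakage_mw: float,
                transition_energy_uj: float) -> bool:
    return predicted_idle_us > breakeven_time_us(leakage_mw, transition_energy_uj)

print(breakeven_time_us(5.0, 2.0))       # 400 us break-even
print(should_gate(100.0, 5.0, 2.0))      # False: gating here would thrash
print(should_gate(1000.0, 5.0, 2.0))     # True: idle long enough to pay off
```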

So what would an ideal power manager look like?

    • Instantly detect idle blocks (and somehow know they would continue to be idle for long enough in the future)
    • Instantaneously reduce the power
    • Instantly notice when a block needs to be active and instantly power it up again
    • Across a huge number of fine-grained domains with no power or area overhead
    • And a pony


    Well, OK, that is a dream. You are not going to get it. But how close can we get?

Sonics' ICE-Grain Power Architecture is an approach that comes close. Note that at this stage Sonics is not announcing a full product; this is technology that they have been working on for five years, and they are starting to engage with lead partners (and are looking for more). It is also worth pointing out that the initial products will be targeted at mainstream SoC groups who struggle with this sort of power management, not at very large groups who already understand the technology and have specialized teams and methodologies in place. Plus, although there are some gains from using ICE-Grain with Sonics' NoC technology, it is not necessary; the two are independent.

So what's with the weird name? ICE-Grain comes from two things:

1. Power Grains are collections of logic that can be individually controlled using one or more power-saving methods. A grain is:
  • driven by one or more clock domains
  • connected to one or more power domains
  • defined with conditions for power transitions
  • possibly a container for other grains (grains are hierarchical)
  • frequently small (dozens to hundreds of grains per SoC)
2. ICE stands for Instant Control of Energy (and ICE is, well, cool). The architecture creates unique hardware-based control circuitry to optimally control all of the grains. A grain is typically in one of four states: normal operation, clock shut off, power shut off with retention registers still powered, or complete shutdown.

Each grain has a local grain controller, and there is also a central controller to coordinate shared resources and handle the interface to both software and the outside world. ICE-Grain does not magically find the grains; it starts by looking in the UPF (now IEEE 1801) to find areas that have an associated power policy, such as retention registers, power down, DVFS, level-shifters and so on.

The big advantage of hardware control is that it is much faster than software, and it makes it much easier to use finer grains and so find more opportunities for power saving (whereas large grains often contain a small portion that must remain active, preventing the whole grain from being power managed).
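
To make the grain states concrete, here is a small Python sketch (the state names follow the article; the idle-time thresholds are invented and are not Sonics' actual controller policy):

```python
from enum import Enum, auto

class GrainState(Enum):
    NORMAL = auto()      # fully operational
    CLOCK_OFF = auto()   # clock gated, state retained, fastest wake
    RETENTION = auto()   # power off except retention registers
    SHUTDOWN = auto()    # complete shut down, state lost

def target_state(predicted_idle_us: float) -> GrainState:
    # Deeper states save more power but cost more time and energy to
    # exit, so pick the deepest state the predicted idle time justifies.
    if predicted_idle_us < 1:
        return GrainState.NORMAL
    if predicted_idle_us < 50:
        return GrainState.CLOCK_OFF
    if predicted_idle_us < 1000:
        return GrainState.RETENTION
    return GrainState.SHUTDOWN

print(target_state(20))    # GrainState.CLOCK_OFF
print(target_state(5000))  # GrainState.SHUTDOWN
```

Fine grains plus fast hardware control of transitions like these are exactly how ICE-Grain aims to find savings that coarse, software-driven control misses.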

The diagram below shows this graphically. "Conventional" shows a software-controlled power-down, which is comparatively slow. The height of the orange area shows the power being dissipated, and the green the saving from actively managing the power. Under software control, the microprocessor (high power) needs to wake up to power down the block; then there is a saving; then the microprocessor needs to wake up again to power the block back up. With ICE-Grain, the microprocessor can remain dormant: a little bit of control logic consumes a little power for a short time, and even more power is saved in the middle. Finally, since turning the blocks on again is faster, the entire activity finishes earlier, saving power at the end too.

So why are Sonics doing this? They have years of experience in power management. Their NoC today can power down blocks if they are idle for a long time (timer-based) and turn them on again when the block is sent a transaction (buffering the transaction if the block is going to be turned on, or causing it to fail if the block is going to remain powered off).

The ICE-Grain Power Architecture is a complete power subsystem. It is correct-by-construction, eliminating deadlock, conflicts and errors, and it has powerful debug and tracing facilities. The fine-grained domain partitioning and faster on/off transitions mean more opportunities for power saving. Aka lower power. Pony not included.

    The ICE-Grain product page is here.

    Phew, made it to the end of the blog without mentioning the ice-bucket challenge. Nearly…