
SiFive Extends Portfolio with 7 Series RISC-V Cores

by Camille Kokozaki on 11-16-2018 at 7:00 am

At the recent Linley Fall Processor Conference in Santa Clara, Jack Kang, SiFive’s VP of Product Marketing, introduced SiFive’s Core IP 7 Series. Designed to power devices requiring Embedded Intelligence and Intelligence Everywhere, the cores offer scalability, efficient performance, and customization. The Core IP 7 Series is suited for use in consumer devices (AR/VR gaming, wearables), storage and networking (5G, SSD, SAN, NAS), and AI/ML/edge applications (sensor hubs, gateways, IoT, autonomous machines).

The 7 Series product family includes the E7, S7, and the U7 product series. The E7 Core IP Series comprises the 32-bit E76 and E76-MC and provides hard real-time capabilities. The SiFive Core IP S7 Series brings high-performance 64-bit architectures to the embedded markets with the S76 and S76-MC. The SiFive Core IP U7 Series is a Linux-capable applications processor with a highly configurable memory architecture for domain-specific customization. The 64-bit U74 and U74-MC, like all SiFive U cores, fully support Linux, while the E76, E76-MC, S76, S76-MC support bare metal environments and real-time operating systems.

The broad portfolio of cores enabled by the 7 Series features low power consumption, 64-bit addressability, tight accelerator coupling, and support for custom instructions. These features are new to the market and, SiFive claims, make the 7 Series the highest-performance commercial RISC-V processor IP available today. The SiFive Core IP 7 Series raises the bar with hardware-based real-time capabilities and unprecedented scalability.

The 7 Series enables the sharing of common features with in-cluster heterogeneous compute, allowing users to combine E7 and S7 cores with U7 cores in a single coherent cluster and thereby greatly easing the software team’s development effort.

More specifically the Core IP 7 Series offers:

| Efficient Performance | Scalability | Feature Set |
| --- | --- | --- |
| ~60% improvement in CoreMarks/MHz* | 8+1 coherent CPUs in a cluster | In-cluster heterogeneous compute for Application + Real-time processors |
| ~40% improvement in DMIPS/MHz* | 512 coherent on-chip CPUs via TileLink | 64-bit architectures across the portfolio |
| ~10% improvement in Fmax* | 2048 multi-socket coherent CPUs via ChipLink | Innovative L1 Memory microarchitecture |

*Compared to SiFive Core IP 5 series
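The per-clock and clock-rate gains in the table compound. A back-of-envelope sketch (assuming, hypothetically, that the ~60% CoreMarks/MHz and ~10% Fmax improvements are independent and multiplicative):

```python
def compound_speedup(per_clock_gain, fmax_gain):
    """Total throughput gain when a per-clock gain and a clock-rate gain stack multiplicatively."""
    return (1 + per_clock_gain) * (1 + fmax_gain)

# ~60% more CoreMarks/MHz plus ~10% higher Fmax would suggest roughly
# a 1.76x overall CoreMark throughput gain versus the 5 Series.
speedup = compound_speedup(0.60, 0.10)
print(round(speedup, 2))  # 1.76
```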

In storage applications, the 64-bit real-time addressability will be a key feature for big data applications to exploit. In addition, the capability for specific custom instructions will greatly supplement storage, machine learning (ML) and cryptography use cases.

Tightly integrated memories (TIM) and cache lock capabilities will benefit critical real-time workloads in 5G and networking. Configurable memory maps and coherent accelerator ports allow designers to tightly couple storage with specific accelerators. It is also possible to have coherent in-cluster combinations of application processors and real-time processors. Safety applications will be enhanced by ECC capability across the SRAMs as well as significant guarantees around deterministic performance.

AR/VR/sensor fusion applications can combine multiple SiFive Core IP series. For example, the 2, 3, 5 and 7 series can all be flexibly integrated into a single design with tight power constraints. Mixed precision arithmetic accelerates machine learning compute.

Within the 7 Series portfolio, standard cores are offered where existing configurations with known power, performance and area (PPA) may be preferred by customers. Customers will have the option of using the standard cores as silicon verified design start points with the ability to customize the 7 Series core to meet application-specific requirements.

The U7 Series offers:

  • Support for heterogeneous in-cluster combinations of application processors and real-time processors
  • Configurable Level 2 cache with cache lock capability and Tightly Integrated Memory (TIM) available
  • Functional safety, security and real-time features such as:

o SECDED ECC on all L1 and L2 memories
o PMP and MMU for memory protection
o Programmatically clear and/or disable dynamic branch prediction for deterministic execution and enhanced security
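
SECDED ECC corrects any single-bit flip in a memory word and flags (but cannot fix) double-bit flips. As background only, not SiFive's implementation, a minimal Hamming(7,4)-plus-overall-parity sketch shows the mechanism:

```python
def secded_encode(nibble):
    """Encode a 4-bit value into an 8-bit SECDED word: Hamming(7,4) plus an overall parity bit."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]                    # covers codeword positions 1,3,5,7
    p2 = d[0] ^ d[2] ^ d[3]                    # covers positions 2,3,6,7
    p4 = d[1] ^ d[2] ^ d[3]                    # covers positions 4,5,6,7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    p0 = 0
    for b in bits:
        p0 ^= b                                # overall parity over all 7 bits
    return bits + [p0]

def secded_decode(bits):
    """Return (status, nibble). status: 'ok', 'corrected', or 'double-error'."""
    s = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            s ^= pos                           # syndrome = XOR of positions holding a 1
    overall = 0
    for b in bits:
        overall ^= b
    if s and overall == 0:
        return ('double-error', None)          # two flips: detectable, not correctable
    fixed = list(bits)
    if s:
        fixed[s - 1] ^= 1                      # single flip: syndrome points at the bad bit
        status = 'corrected'
    else:
        status = 'corrected' if overall else 'ok'  # s==0 but bad parity: the parity bit itself flipped
    nibble = fixed[2] | (fixed[4] << 1) | (fixed[5] << 2) | (fixed[6] << 3)
    return (status, nibble)
```

Flipping one bit of a codeword decodes back to the original data as 'corrected'; flipping two bits decodes as 'double-error'.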

E7, S7, U7 Core Series Architectural Features

  • Dual-issue, in-order, 8-stage Harvard pipeline
  • A very flexible memory system
  • Multi-core capable with coherency and optional L2 (E7, S7)
  • Deterministic fast interrupt responses
  • Higher throughput and efficiency


The E7/S7 Level 1 memory system allows access to large system-side SRAMs and lets other masters on the SoC access that memory through the main core complex, with fast I/O ports that are ideal for attaching accelerators.

SiFive adds value by giving customers a single deliverable with all of the desired hardware design options packaged, integrated, and enabled for software development.

During a panel at the Linley Fall Processor Conference, Kang stressed the configurability of the cores, all of which allow changing branch-prediction sizes as well as L1 and L2 memory configurations. The cores can include or exclude single- or double-precision floating-point units and support the addition of custom extensions. The ability to combine cores in a heterogeneous cluster is a unique differentiator from other core architectures that are not coherent. Kang clarified that, even though internal development is in Chisel, all deliverables to customers are in human-readable Verilog.

I got a chance to have a side chat with Jack Kang, and he clarified that heterogeneous operation refers to a mix of SiFive cores from different core series connected together to form a coherent core complex.

Kang added that, even though no single architecture rules them all, customers need general purpose programmability and control. In new AI and ML application domains, vector extensions can be added to provide functionality.

I then asked what constitutes the next success metric in achieving critical mass or adoption. Kang said that the market started seeing commercial products with RISC-V chips this year. The next phase is seeing those products announced, launched, and shipped in volume. The situation today is that companies are hesitant to be first to adopt RISC-V in their industry while simultaneously worrying about being left behind. This has forced many companies to review their RISC-V strategies. Kang’s view is that companies will be late to RISC-V if they do not come in now. Next year, RISC-V products will be out, and since products are trailing indicators, that is a sure sign to move to RISC-V now. Kang sees that companies are not merely replacing designs but are seeking advanced features and choosing RISC-V (and SiFive specifically) because it allows them to tackle unsolved problems.

Kang went on to say that a good architecture serves as table stakes but that there must also be an ecosystem to back it up. RISC-V has an ecosystem that is being globally co-developed by all RISC-V member companies (including the likes of Google and Samsung), not just SiFive (founded in 2015 by the inventors of the RISC-V ISA). The rate of growth of the software ecosystem is high, with Debian and Fedora being ported to RISC-V as evidence of momentum. The ecosystem and tools are rapidly maturing.

I asked Kang if he had a special message to convey. He stated that with the 7 Series, SiFive brings new features, such as in-cluster heterogeneous core complexes, that are needed to enable embedded intelligence. The takeaway is that “RISC-V is not just a replacement architecture. It is innovation and customization with new features enabling embedded intelligence, and we are starting to see it really take off.”

Other factoids:

– With SCIE (SiFive Custom Instruction Extension), customers can add custom, Verilog-based instructions that execute in a single cycle or multiple cycles. Some customers create their own extensions and keep them confidential.
– SCIE uses intrinsics for custom instruction generation, which decouples custom instructions from specific compiler versions and allows them to be used with standard GCC and LLVM toolchains.
– SiFive’s RISC-V Core IP will always support the latest RISC-V standard extensions.
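The article does not detail SCIE's encoding, but the RISC-V spec does reserve opcode space (custom-0, custom-1, …) for exactly this kind of vendor extension. Purely as an illustration, a Python sketch packs a hypothetical R-type custom instruction (made-up operands and function codes) into the custom-0 opcode slot:

```python
def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack a 32-bit RISC-V R-type instruction word from its bit fields."""
    assert 0 <= opcode < 128 and 0 <= funct3 < 8 and 0 <= funct7 < 128
    assert all(0 <= r < 32 for r in (rd, rs1, rs2))
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

CUSTOM_0 = 0b0001011  # opcode value the RISC-V spec reserves for vendor extensions

# Hypothetical single-cycle op operating on x10 and x11, result in x10:
word = encode_r_type(CUSTOM_0, rd=10, funct3=0, rs1=10, rs2=11, funct7=0)
print(hex(word))  # 0xb5050b
```

A toolchain intrinsic, as described above, would emit exactly such a word without the compiler needing to know the instruction's semantics.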


Eliminate PCB Re Spins using an Integrated Multi Dimensional Verification Platform

by Daniel Nenni on 11-15-2018 at 12:00 pm

The rapidly increasing complexity of today’s designs, combined with schedule pressure to deliver innovative products to market as quickly as possible, strains engineering resources to the limit, often to the point of breaking. As a result, 17% of all projects get canceled, and another 28% miss their target release date (Source: Lifecycle Insights – September, 2018). Project health is suffering. A more efficient design flow is needed to better utilize available engineering resources, while keeping complex projects moving forward on schedule.

The key to a more efficient design flow is the early detection and elimination of potential design issues. These potential issues can range from simple schematic errors allowed to propagate forward into layout, to complex mechanical issues, to issues impacting product testability and manufacturability. Identifying and fixing these potential issues as early in the process as possible avoids unnecessary schedule delays and costly design re-spins. It also frees up valuable engineering talent to move on to other projects.

The Conventional Design Flow
The traditional project development flow is inefficient and fraught with pitfalls. It relies far too heavily on manual reviews and costly prototypes. Verification of each design phase occurs far too late in the process. Valuable engineering resources are spent debugging errors in the lab that should have been caught during schematic entry. Errors uncovered this late in the game result in costly re-spins that once again follow the same inefficient, error-prone, manual review process.

As a result of this conventional process flow, the typical project goes through 2.9 re-spins, with an average schedule hit of 8.5 days and a cost of $44,000 per re-spin (Source: Lifecycle Insights – September 2018). For high-performance designs, the costs are often much higher. Because of the complexities of modern designs, these delays and added costs are hard to predict, and project managers tend to bake them into their schedules and budgets. This conventional approach wastes time, talent, and materials, and puts projects at risk of cancellation.
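Taken at face value, the quoted averages imply a sizeable expected hit per project. A quick arithmetic sketch, using only the figures cited above:

```python
def expected_respin_hit(respins, days_each, dollars_each):
    """Expected schedule slip (days) and cost ($) per project from the quoted averages."""
    return respins * days_each, respins * dollars_each

# 2.9 re-spins x 8.5 days and 2.9 x $44,000 per the Lifecycle Insights figures:
days, dollars = expected_respin_hit(2.9, 8.5, 44_000)
print(round(days, 2), round(dollars))  # roughly 24.65 days and $127,600 per project
```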

The Shift-Left Approach to Integrated Design Verification
In order to eliminate the inefficiencies of the conventional design flow, a “shift-left” approach is desired that integrates verification as early as possible in the design process. This means catching errors and potential issues at the source, before they can propagate forward into subsequent phases of the project. Schematic errors should be caught during schematic entry, not in the lab after building costly prototypes and hundreds of hours of debug time. Automated schematic integrity analysis should be employed to eliminate the reliance on manual, visual schematic reviews.
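As a toy illustration of the kind of error automated schematic integrity analysis catches (a made-up netlist model, far simpler than any commercial ERC), the classic case is an input pin on a net with no driver:

```python
def check_floating_inputs(pins):
    """Flag input pins whose net has no driving output pin: a classic schematic-entry error.

    pins: list of (pin_name, direction, net_name) tuples.
    """
    driven_nets = {net for _, direction, net in pins if direction == 'out'}
    return [name for name, direction, net in pins
            if direction == 'in' and net not in driven_nets]

pins = [
    ('U1.Y', 'out', 'CLK'),
    ('U2.A', 'in',  'CLK'),    # driven by U1.Y: fine
    ('U2.B', 'in',  'RST_N'),  # nothing drives RST_N: floating input
]
print(check_floating_inputs(pins))  # ['U2.B']
```

Caught at schematic entry, this is a one-minute fix; caught in the lab on a prototype, it is a re-spin.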

Routing constraints for signal and power integrity, as well as design-for-test constraints, should be specified during schematic capture, not shoe-horned in at the layout phase. Signal and power integrity, EMI compliance, thermal analysis, and vibration analysis should all be validated during the layout process.

The goal of the shift-left approach is the same in all cases – to move as much verification as possible as early in the design cycle as permissible, while also automating the analysis to provide the highest possible degree of coverage. The conventional design flow is frustratingly unpredictable. It relies far too much on manual visual design checks that allow far too many errors to propagate forward to the next step in the process. The shift-left verification flow catches errors and identifies potential issues early in the process where they are quickly and economically corrected. It is a more efficient process that provides more predictable results, eliminates design re-spins and yields higher quality products in less time. The ultimate goal should be an all-inclusive, multi-dimensional verification process that reduces reliance on both manual reviews and manual debugging of physical prototypes.

A Multi-Dimensional Verification Solution
A multi-dimensional verification solution comprises a broad range of analysis and verification tools used during the schematic and layout phases of the project. These tools are aimed at non-specialist PCB design engineers and layout designers and allow them to work within their familiar authoring environments to identify problems early in the design.

During schematic capture, automated schematic integrity analysis is performed to eliminate common schematic errors that often escape the manual review process. Signal integrity and power integrity analyses are performed to determine a set of placement and routing constraints to be passed forward to layout. Testability analysis should also occur during schematic entry, prior to layout: the design is analyzed, and test point requirements are identified and passed to layout as constraints.

During layout, as component placement progresses, EMI validation, thermal analysis, vibration/acceleration analysis, and manufacturability analysis should all be performed to quickly identify and correct any potential issues. In the traditional design flow, these issues would not be discovered until physical testing in an EMI, thermal, or HALT test chamber. If they are not caught during layout, issues that impact the mechanical integrity of the design are usually the most expensive and time-consuming to fix; such issues often require board re-spins and tooling changes to correct. Simulation during layout greatly increases the likelihood of first-pass success.

The Tools to Implement a Shift-Left Automated Verification Design Flow
The Mentor® Xpedition® platform includes all of the shift-left enhancements described in this article. Xpedition provides a multi-dimensional integrated verification platform for single and multi-board PCB designs that includes automated schematic integrity analysis with built-in automated design checks and an extensive library of intelligent models. Xpedition also includes integrated testability analysis, automated component modeling for vibration analysis, DC voltage drop analysis for rigid-flex and multi-board designs, as well as concurrent DFM analysis during layout.

Xpedition provides powerful verification tools integrated within the authoring environment to enable easier, faster validation, which reduces costly design re-spins, improves time-to-market for new products, and results in higher quality products with fewer defects. The Xpedition integrated verification platform is a better, more modern approach to today’s complex design challenges.

For more information, read Integrated Verification: A Shift-Left Solution for a More Efficient Design Flow.


NXP Strengthens Security, Broadens ML Application at the Edge

by Bernard Murphy on 11-15-2018 at 7:00 am

Security and machine learning (ML) are among the hottest areas in tech, especially for the IoT. The need for higher security is, or should be, blindingly obvious at this point. We struggle to fend off daily attacks even in our mainstream compute and networking environment. How defenseless will we be when we have billions of devices in our homes, cars, cities, utilities and farms, open to attack by any malcontent (or worse) with an urge to create chaos? Meanwhile, ML is gaining traction at the edge simply because, for many of these devices, the classic human-interface paradigm of keyboards and monitors/cryptic displays is too cumbersome, too difficult to use and too costly.

In support of raising capabilities in both of these domains, NXP recently launched a couple of new platforms and a toolkit for intelligence at the edge. I’ll start with the platforms, the LPC5500 microcontrollers and i.MX RT600 crossover processors. NXP touts a multi-layered approach to security in these platforms, including:

  • Secure boot for hardware-based immutable root-of-trust
  • Certificate-based secure debug authentication
  • Encrypted on-chip firmware storage with real-time, latency-free decryption

They’ve added a couple more important security features. Device-unique keys can be generated on-demand through an SRAM-based physically unclonable function (PUF). They also provide support for the DICE standard, which is becoming increasingly popular in IoT identity, attestation, and encryption. Even more interesting (to me), NXP is working on a relationship with Dover Microsystems, about which I’ll say more in a later blog. NXP plans to integrate Dover’s CoreGuard technology, offering an active, rule-based security mechanism.

On the ML side, NXP recently announced their eIQ software environment for mapping cloud-trained ML models to edge devices. I found this to be one of the most compelling parts of the NXP announcement. Normally, when you think about mapping a TensorFlow, Caffe2, or other neural-net model to a resource-constrained edge device, you think about mapping to the specific NN architecture in that device. But what if you need to target a wide range of devices, all the way from CPUs up to dedicated ML cores? Will that require a different mapping solution and lots of ML expertise per platform? According to Geoff Lees, Sr. VP and GM of microcontrollers, eIQ and the platforms mentioned above should make this multi-device targeting a lot easier.
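eIQ's internals are not described in the announcement, but mapping a cloud-trained float model onto a constrained edge target generically involves quantization. A minimal int8 affine-quantization sketch (a standard technique, not NXP's actual flow) gives the flavor:

```python
def quantize_int8(weights):
    """Affine-quantize a list of float weights to int8 values plus one scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

w = [0.31, -1.27, 0.002, 0.64]
q, s = quantize_int8(w)
recovered = dequantize(q, s)
# Reconstruction error is bounded by half a quantization step (scale / 2).
assert all(abs(a - b) <= s / 2 for a, b in zip(w, recovered))
```

The edge device then stores and computes on the int8 values, at a quarter of the memory footprint of float32.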

I asked why anyone would want to implement ML on a CPU. After all, CPUs are famously the least effective platform for ML in terms of performance per watt. I asked a similar question at an ARM press briefing last year and got what I thought was a rather defensive response, so I was curious to get NXP’s take. Geoff provided a great example: an intelligent microwave. This doesn’t need a lot of ML horsepower to handle (locally) trigger-word recognition and basic natural language processing for a very limited vocabulary, or, better yet, recognizing the food when you put it into the microwave. Nor does it have to provide microsecond response times or run off a coin-cell battery (since a microwave has to be wired anyway). So a Cortex-M33, with its support for DSP processing, is amply suited to the task and likely cheaper than more elegant NN platforms, which should be important in a mass-market appliance.

For fancier applications, you’ll still want to rely on a dedicated ML engine. In the i.MX RT600 family, this is the Cadence Tensilica HiFi 4 DSP. Hopefully now you see the value of eIQ: a common ML mapping platform which can handle mapping to all NXP devices, from the high-end i.MX 8QM down through mid-range devices to the Cortex-M33-based devices.

As examples of how these technologies can be applied, NXP recently showed (at the Barcelona IoT World Congress) an industrial application in which they used various subsystems including drones for operator recognition (are you allowed to perform this function), object recognition for operator safety, voice control and anomaly detection to predict failures in drone operation. At TechCon they showed trigger word recognition and voice control and in vision they showed food recognition (for that microwave) and traffic sign recognition.

From microwaves to traffic sign recognition and factory floor automation, it looks like NXP is making a play to own an important piece of edge processing, both in security and in machine learning, across a wide range of processor solutions.


Mentor’s Symphony in Tune with AMS Designer Needs

by Tom Simon on 11-14-2018 at 12:00 pm

Mixed-signal simulation is a very hot topic these days. In modern designs, it is harder to draw a line between the analog and digital portions and work with them independently. Analog blocks are showing up everywhere. Even in what would have qualified as a digital design a few years ago, designers now need to look at things like PLLs, IOs, and SerDes from a detailed analog perspective, in context, to ensure proper design behavior and performance. The drive to reduce power and the increased use of sensors, ADCs, oscillators, and other analog blocks in SoCs have all exacerbated the need for faster, easier, and more accurate mixed-signal modeling. At the same time, requirements imposed by automotive standards such as ISO 26262 are creating the need for more comprehensive verification of mixed-signal chips.

This past week, Mentor created quite a buzz with the introduction of their Symphony Mixed Signal Platform. Mentor has never been a slouch when it comes to analog and digital simulation; indeed, their AFS (Analog FastSPICE) has been a game changer for the industry. What Symphony brings to the table is the ability to easily combine this leading analog simulator with Mentor’s or third-party digital simulators. At the same time, Symphony overcomes many of the limitations that engineers faced while trying to verify mixed-signal designs.

Historically, transistor-level analog simulation was too slow to incorporate into digital simulations. As a result, people turned to behavioral models to speed up the analog side of the simulation. However, creating these models requires specialized skills and extra development time, and, of course, any design revision requires rework. Symphony lets design teams avoid the need for behavioral modeling to achieve faster run times. AFS provides nanometer SPICE accuracy and a capacity of 20M SPICE elements.

One of the key concepts of Mentor’s Symphony is the use of Boundary Elements (BEs) that support all signal types and multiple power domains, including dynamic supplies. This approach significantly improves debug: detailed information about signals at the interfaces can now easily be examined. The approach is also flexible enough that mixed digital and analog hierarchies are easily supported, with multiple levels and no restrictions on mixing A or D at each level. One important feature that Mentor is highlighting is their Hi-Z checking capability, which lets designers detect when a mixed-signal net goes into a ‘Z’ state.
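Conceptually, an element at an analog/digital boundary must translate a node voltage into a logic state, and an undriven net must surface as 'Z'. A toy Python resolver (purely illustrative, with made-up thresholds, nothing to do with Symphony's actual boundary elements) shows the idea:

```python
def resolve_logic(voltage, vdd, driven=True):
    """Map an analog node voltage to a 4-state digital value: '0', '1', 'X', or 'Z'."""
    if not driven:
        return 'Z'                 # nothing drives the net: Hi-Z, the case Hi-Z checking flags
    if voltage <= 0.3 * vdd:
        return '0'                 # below the input-low threshold
    if voltage >= 0.7 * vdd:
        return '1'                 # above the input-high threshold
    return 'X'                     # mid-rail: unresolved logic level

resolve_logic(0.1, vdd=1.0)                 # '0'
resolve_logic(0.9, vdd=1.0)                 # '1'
resolve_logic(0.5, vdd=1.0)                 # 'X'
resolve_logic(0.0, vdd=1.0, driven=False)   # 'Z'
```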

According to Mentor, 30 customers were using Symphony prior to its release, and the announcement contains many customer quotes reporting dramatic improvements in runtime and overall results.

Stepping back, this new product from Mentor is starting to paint a picture of what the Siemens acquisition means for Mentor. Going from a public company to a privately held company can mean big changes. I know that many people in the industry were wondering if Mentor would become the private EDA group for Siemens or if they would be able to continue robust product development. Much of Mentor’s more recent reputation and success has come from the Calibre line, and of course Mentor has very competitive offerings across their product line. However, Symphony looks like a major long-term investment that aims to upset the analog mixed-signal flow status quo. There is more information about Symphony on the Mentor website.


Synopsys DDR5 LPDDR5 Memory Interface IP Targets AI, Automotive, and Mobile SoCs

by Camille Kokozaki on 11-14-2018 at 7:00 am

Synopsys announced on October 24 new DesignWare® Memory Interface IP solutions supporting the next-generation DDR5 and LPDDR5 SDRAMs. The DDR5 and LPDDR5 IP significantly increase memory interface bandwidth compared to DDR4 and LPDDR4/4X SDRAM interfaces, while reducing area and improving power efficiency. The DesignWare DDR5 IP, operating at up to 4800 Mbps data rates, can interface with multiple DIMMs per channel up to 80 bits wide, delivering the fastest DDR memory interface solution for artificial intelligence (AI) and data center system-on-chips (SoCs).

The industry’s first LPDDR5 IP, running at up to 6400 Mbps, provides significant area and power savings for mobile and automotive SoCs with its dual-channel memory interface option that shares common circuitry between independent channels. For additional power savings, the DesignWare Memory Interface IP solutions provide several low-power states with short exit latencies and offer multiple pre-trained states for dynamic frequency change capability. The DDR5 and LPDDR5 controller and PHY seamlessly interoperate via the latest DFI 5.0 interface, providing a complete memory interface IP solution for high-bandwidth, low-power SoC designs.
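To put the data rates in perspective, peak theoretical bandwidth is just the per-pin rate times the data-bus width. The article gives only the per-pin rates, so the bus widths below are illustrative assumptions: a 64-bit data payload for a DDR5 DIMM channel and a 2x16-bit configuration for LPDDR5:

```python
def peak_bandwidth_gbps(data_rate_mbps, data_bits):
    """Peak theoretical bandwidth in GB/s: per-pin rate (Mbps) x bus width (bits) / 8 bits-per-byte."""
    return data_rate_mbps * data_bits / 8 / 1000

print(peak_bandwidth_gbps(4800, 64))  # DDR5 @ 4800 Mbps, 64 data bits -> 38.4 GB/s
print(peak_bandwidth_gbps(6400, 32))  # LPDDR5 @ 6400 Mbps, 2x16 bits  -> 25.6 GB/s
```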

DesignWare DDR IP Solutions

| DesignWare DDR PHY (New) | SDRAMs Supported / Maximum Data Rate | Interface to Memory Controller | Typical Application |
| --- | --- | --- | --- |
| LPDDR5/4/4X | LPDDR5 / 6400 Mbps; LPDDR4 / 4267 Mbps; LPDDR4X / 4267 Mbps | DFI 5.0 | Designs in 16-nm and below requiring high-performance mobile SDRAM support up to 6400 Mbps |
| DDR5/4 | DDR5 / 4800 Mbps; DDR4 / 3200 Mbps | DFI 5.0 | Designs in 16-nm and below requiring high-performance DDR5/4 support up to 4800 Mbps |

Some highlights:

  • The industry’s first LPDDR5 controller, PHY, and verification IP support data rates up to 6400 Mbps with up to 40% less area than previous generations
  • The complete DDR5 IP solution supports up to 4800 Mbps with single, dual channels for discrete devices and DIMMs
  • Both solutions provide several low-power states with short exit latencies and offer multiple pre-trained states for dynamic frequency change capability

The DesignWare DDR5 and LPDDR5 IP solutions support all required features of the DDR and LPDDR specifications, enabling designers to incorporate the necessary functionality into their SoCs:

  • Firmware-based training via an embedded calibration processor in the PHY optimizes the boot-time memory training for highest data reliability and margin at the system level. It also allows fast updates to the training algorithms without requiring changes to the hardware
  • Decision feedback equalization (DFE) used in the input receivers reduces the impact of inter-symbol interference (ISI) to improve signal integrity
  • Reliability, availability, serviceability (RAS) features, including inline or sideband error correcting code (ECC), parity, and data cyclic redundancy checks (CRC), reduce system downtime
  • Synopsys PHY hardening and signal/power integrity expertise enable faster design completion and a higher degree of design confidence.
  • Synopsys VIP for DDR5 and LPDDR5 provides randomized configuration and runtime selection, as well as built-in comprehensive coverage, verification plan, and protocol checks for increased productivity.
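
The DFE mentioned above is not detailed in the announcement; as a conceptual sketch only, a one-tap decision feedback equalizer subtracts the ISI contributed by the previously decided symbol before slicing (toy channel and tap value chosen for illustration):

```python
def dfe_1tap(received, tap):
    """One-tap DFE: cancel the post-cursor ISI of the previous decision, then slice."""
    decisions, prev = [], 0
    for r in received:
        y = r - tap * prev            # subtract ISI fed back from the last decision
        d = 1 if y >= 0 else -1       # slicer decides +1 or -1
        decisions.append(d)
        prev = d
    return decisions

# Toy channel: r[n] = a[n] + 0.4 * a[n-1] (post-cursor ISI), symbols in {+1, -1}
tx = [1, -1, -1, 1, 1, -1, 1]
rx = [tx[n] + 0.4 * (tx[n - 1] if n else 0) for n in range(len(tx))]
assert dfe_1tap(rx, 0.4) == tx        # with correct decisions, the ISI is fully removed
```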

ARM, Micron, and SK Hynix provided testimonials in a Synopsys press release on October 24, 2018. In that press release, John Koeter, vice president of Marketing for IP at Synopsys, emphasized that emerging applications such as AI, automotive, and cloud require significantly higher memory bandwidth to handle massive data throughput. He added that Synopsys is offering designers the fastest DDR5 and LPDDR5 IP solutions on the most advanced FinFET processes, enabling them to deliver innovative products that are differentiated in bandwidth, power, and area.

Availability

  • The DesignWare DDR5 PHY and LPDDR5 PHY are scheduled to be available in Q1 of 2019
  • The DesignWare DDR5 Controller and LPDDR5 Controller are scheduled to be available in Q2 of 2019
  • The VC Verification IP for DDR5 and LPDDR5 is available now.

Worth Noting

  • All the DFI-compatible DDR PHYs are supported by Synopsys’ unique DesignWare DDR PHY Compiler. In addition, Synopsys’ DesignWare DDR5/4 Controller, LPDDR5/4/4X Controller, and Enhanced Universal DDR Memory and Protocol Controller IP feature a DFI-compliant interface, low latency, and low gate count while offering high bandwidth. Optional market-specific features such as AMBA AXI4 Quality of Service (QoS) and Reliability, Availability, and Serviceability (RAS) allow designers to match the area and capabilities of the controllers to their needs.
  • Synopsys also offers DesignWare HBM2 IP, which provides 12x the bandwidth of DDR4 IP and 10x better power efficiency for graphics, high-performance computing, and networking SoCs.

DesignWare® Memory Interface IP solutions


Fusion Synthesis for Advanced Process Nodes

by Alex Tan on 11-13-2018 at 12:00 pm

Synopsys recently unleashed Fusion Compiler™, a new RTL-to-GDSII product that enables data-driven design implementation by revamping the Design Compiler architecture and leveraging the successful Fusion Technology, seamlessly fusing the logical and physical realms to produce predictable QoR. It is a long-awaited move that provides a breakthrough solution as more designers migrate to deep advanced nodes, 7nm and beyond.

Let’s glance through earlier synthesis key challenges that might act as precursors to subsequent developments leading towards this vital product announcement.

Traditional synthesis challenges
As part of the RTL-to-GDSII flow, a synthesis tool such as Design Compiler transforms a design’s RTL description into an optimized gate-level representation. This includes performing architectural, logic, and gate-level optimization steps. Synthesis utilizes a standard cell library, pre-characterized for timing and power across various input slews, load conditions, and process corners (PVT: Process, Voltage, Temperature), to generate an optimal design for a given set of PPA (Performance, Power, and Area) targets. Over time, synthesis has been fitted with limited physical and placement awareness as an inroad into routing.

As wire performance fails to keep pace with device performance in advanced process scaling, inadequate interconnect modeling or estimation has translated into a disparity between synthesis QoR and the QoR produced by downstream physical implementation tools. The PPA trade-offs that were tolerable in the micrometer process node era are no longer acceptable for sign-off at advanced nodes, as designs are increasingly targeted at emerging applications that require power efficiency as well.

The interconnect shift impacts not only delay-related metrics but also power, through increased RC and degraded-slew-induced power dissipation. This gap has been exacerbated by device threshold lowering, or near-threshold operation, which shifts the total power contribution from the dynamic to the static term. All of this drives the need for a solution that delivers optimal results for both performance and power.

Moreover, increased design density has also strained synthesis tools, demanding scalability, runtime improvements, and more physical awareness. For example, the tool needs not only congestion awareness but also the capability to generate legalized placement, to ascertain accurate resource utilization and minimal perturbation during route optimization.

Common Data Model and Fusion Technology
Key to this breakthrough is the adoption of a common data-centric architecture. The Fusion Compiler single data model contains both logical and physical information, enabling the sharing of libraries, data, constraints, and design intent throughout the implementation flow. It has the scalability to support ultra-large designs with the smallest feasible memory footprint. The Fusion Data Model serves all design phases and provides faster tool–data-model interaction, interactive what-if analysis, and native multi-everything (cores, corners, etc.) with near-linear scalability across multiple CPU cores. It also supports a transparent multi-level hierarchy and runs compute-intensive algorithms efficiently, facilitating more optimization for better QoR.

Another enabler is Synopsys Fusion Technology™, announced in March 2018. It provides a new level of integration of Synopsys synthesis, place-and-route, and signoff tools by redefining conventional product boundaries with systematic sharing of algorithms, code, and data representations across multiple tasks.

Fusion provides Design Fusion, ECO Fusion, Signoff Fusion, and Test Fusion technologies. Design Fusion enables synthesis technology inside place-and-route, and place-and-route technology inside synthesis. ECO Fusion drives faster signoff closure, with signoff analysis and ECO optimization enabled directly from within implementation. Signoff Fusion eliminates design margin and over-design, using PrimeTime and StarRC for both optimization and signoff. Test Fusion is the combination of design-for-test (DFT), synthesis, and automatic test pattern generation (ATPG) technologies. Using physical design data, Test Fusion ensures optimal placement of test points while minimizing routing congestion and area impact.

Fusion Technology offers bidirectional access between synthesis and the adjoining implementation tools, including the sharing of optimization engines between the two domains. Because Fusion Compiler integrates all synthesis, place-and-route, and signoff engines on a single data model, it removes the need for data conversion and transfer – providing accurate QoR, the best predictability, and optimal throughput.

The Fusion Design Platform is also AI-enhanced to enable additional QoR and TTR gains by speeding up computation-intensive analyses, predicting outcomes to improve decision-making, and leveraging past learning to drive better results.

Fusion Compiler QoR and Customer Feedback
The unified architecture of Fusion Compiler shares technologies across the RTL-to-GDSII flow, delivering 20 percent better QoR and 2X faster time-to-results (TTR). It has been silicon-proven at several customers.

Fusion Compiler’s new solver-based global optimization engine enables path-based total negative slack (TNS) targets and analysis of critical path traces for effective design closure. Both pre- and post-route engines use the same costing and infrastructure for consistent correlation throughout the flow. Its underlying multi-corner multi-mode (MCMM) and multi-voltage (MV)-aware heuristic algorithms concurrently tackle all design metrics for best QoR. Likewise, logic remapping, rewiring, and legalization interleaved with placement minimize congestion and speed timing closure. The CTS engine follows a network-flow paradigm for optimal balancing of latency and skew.
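For reference, the WNS and TNS metrics such a global optimizer targets can be computed from endpoint arrival and required times as below. This is a generic sketch of the definitions, not Fusion Compiler’s engine:

```python
# Sketch of worst and total negative slack over a set of timing
# endpoints: slack = required_time - arrival_time. WNS is the single
# worst slack; TNS sums only the violating (negative) slacks, which is
# the quantity a TNS-driven global optimizer tries to pull toward zero.

def wns_tns(endpoints):
    """endpoints: list of (arrival, required) pairs. Returns (wns, tns)."""
    slacks = [required - arrival for arrival, required in endpoints]
    wns = min(slacks)
    tns = sum(s for s in slacks if s < 0.0)
    return wns, tns
```

Optimizing only WNS can leave many near-critical paths violating; targeting TNS forces the optimizer to improve all failing endpoints at once, which is why path-based TNS targets matter for closure.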

“The power of this technology is essential for the design of tomorrow’s FinFET-based automotive applications. With Fusion Compiler, we achieved the target design goal and completed the tapeout. Compared to conventional technology, we confirmed a 33 percent reduction in timing violations, 10 percent area reduction, and 30 percent less leakage power, while cutting the design turnaround time in half. We have completed the integration of Fusion Compiler in Toshiba’s design environment and have begun to deploy it to upcoming SoC designs,” said Seiichi Mori, senior vice president, Toshiba Electronic Devices and Storage Corporation.

As shown in Figures 2 and 3, Fusion Compiler runs produced better PPA results from improved via-pillar handling and CCD (concurrent clock and data) optimization – two snapshots of the many underlying technological enhancements behind the overall QoR gain.

“As design complexity increases across all our market segments, our key requirement is to achieve the best product performance coupled with the highest levels of predictability,” said Michael Goddard, senior vice president, Samsung SARC and ACL. “With Fusion Compiler, we are on track to achieve optimal PPA with up to 10 percent better timing, 10 percent lower leakage, two-to-five percent dynamic power savings, and typically two-to-three percent area reduction for our most challenging design blocks on our imminent tapeout. In addition, the predictable path from synthesis to signoff reduces design iterations, ensuring that we can meet our aggressive product schedules.”

“Strong semiconductor market drivers like autonomous driving and the adoption of AI continue to drive global demand for larger, faster, and more energy-efficient SoCs”, said Sassine Ghazi, co-general manager of Synopsys’ Design Group.

“Our early assessment of Fusion Compiler shows significantly better full-flow predictability, faster full-flow turnaround time, and better timing QoR compared to the previous approaches. We are collaborating with Synopsys to deploy this innovative RTL-to-GDSII solution, as it will streamline physical design of our mission-critical projects and allow us to bring new products to market much faster,” said Taichiro Sasabe, general manager, SoC Design Division at Socionext.

With this release of Fusion Compiler, Synopsys has raised the bar for a holistic synthesis solution – replacing the traditional RTL-to-GDSII design flow, built from disconnected or loosely coupled tools, for emerging applications and advanced process nodes.

For the Fusion Compiler whitepaper, please check HERE, and for the datasheet, HERE.


DAC 2019 to Host the Second System Design Contest!

DAC 2019 to Host the Second System Design Contest!
by Daniel Nenni on 11-13-2018 at 7:00 am

Interested in showing off your talent in developing deep learning algorithms on embedded hardware platforms for solving real-world problems? Join the second System Design Contest (SDC) at the 56th Design Automation Conference in 2019!
Continue reading “DAC 2019 to Host the Second System Design Contest!”


IoT Security Process Variation to the Rescue

IoT Security Process Variation to the Rescue
by Tom Simon on 11-12-2018 at 12:00 pm

Unique device identities are at the core of all computer security systems. Just as important is that each unique identity cannot be copied, because a copied identity can be used illegitimately. Unique device IDs are used to ensure that communications are directed to the correct device. They also provide the ability to encrypt communication – an essential component for the security of data in motion. Any device with a programmable ID can be cloned. The only way to limit this is to perform the programming as soon after fabrication as possible. However, programmable IDs still leave open a window of opportunity for misuse and add extra steps to the manufacturing process.


The number of IoT devices is expected to proliferate to nearly 50 billion by 2020. Each one needs security, most likely provided by an on-chip identifier. What if each device could contain a unique ID automatically, right at the point of manufacture, that could be used as the basis of a security system? This is the premise behind a Physical Unclonable Function (PUF).

As we know, there are minute variations in silicon chips due to manufacturing processes. Intrinsic ID, a software and hardware IP provider, has examined the wide range of techniques available to capture a repeatable yet unique ID from ICs. Eschewing methods that require analog circuitry, the addition of special layers, or the use of special processes, they settled on SRAM bit-cell initialization states. Practically every IoT chip has SRAM and an embedded processor. Every SRAM bit cell will initialize to a 1 or a 0 depending on the precise threshold voltages of its transistors. It’s worth noting that some bit cells will fall within a range where the initialization state is not predictable, but there are methods to avoid or correct for these specific cells.
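The idea can be sketched with a toy model (not Intrinsic ID’s implementation): each cell’s power-up value follows the sign of a fixed, fabrication-time threshold mismatch plus a small amount of fresh noise at every power-up:

```python
# Toy model of SRAM power-up as a PUF: each bit cell has a fixed,
# device-specific mismatch set at fabrication; each power-up adds fresh
# noise. Cells with mismatch near zero are the "unstable" cells the
# article mentions, which real designs mask or error-correct.

import random

def make_chip(n_bits, seed):
    """Per-cell threshold mismatch, fixed at fabrication (seed = 'the fab')."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_bits)]

def power_up(chip, noise=0.1):
    """One power-up event: the sign of (mismatch + noise) decides each bit."""
    return [1 if m + random.gauss(0.0, noise) > 0 else 0 for m in chip]
```

Repeated power-ups of the same chip reproduce (almost) the same bit pattern, while two different chips disagree on roughly half their bits – exactly the repeatable-yet-unique behavior a PUF needs.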

When the chip is powered off, no trace of the unique ID remains in the SRAM cells (volatile memory); moreover, the unique ID is generated on demand and never stored. To date, analysis by security labs and customers has not revealed any weaknesses in the system. Through a process called enrollment, a PUF key is generated, which is used to create a public and private key for data exchange with external systems.
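Enrollment can be illustrated with a toy fuzzy extractor built on a 3x repetition code. This is a standard textbook helper-data scheme, not Intrinsic ID’s actual design, and real products use far stronger error-correcting codes:

```python
# Toy helper-data scheme: enrollment XORs the PUF response with an
# encoded random key; reconstruction XORs a (noisy) response back and
# decodes by majority vote. The helper data alone reveals neither the
# key nor the PUF response, so it can be stored publicly.

def enroll(puf_bits, key_bits):
    """Helper data = PUF response XOR 3x-repetition-encoded key."""
    encoded = [b for b in key_bits for _ in range(3)]
    return [p ^ e for p, e in zip(puf_bits, encoded)]

def reconstruct(noisy_puf_bits, helper):
    """Recover the key by majority vote over each 3-bit group."""
    encoded = [p ^ h for p, h in zip(noisy_puf_bits, helper)]
    return [int(sum(encoded[i:i + 3]) >= 2)
            for i in range(0, len(encoded), 3)]
```

Reconstruction tolerates one flipped PUF bit per 3-bit group, which is how a slightly noisy power-up pattern still yields exactly the same key every time – and the key itself is never stored anywhere.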

Small blocks of SRAM can be used to create 128-bit or 256-bit keys. Intrinsic ID has performed reliability testing over a wide range of conditions and has also done aging analysis to guarantee a lifespan of 25 years. Intrinsic ID’s PUF has been qualified for automotive, industrial, and military uses through their work with customers and partners. Just as importantly, this IP’s invariant operation across technology nodes and fabs makes designers’ jobs easier.

The SRAM-based PUF from Intrinsic ID can be implemented with a small uninitialized SRAM block on chip and either an RTL IP block or embedded code that runs on chip; both approaches need a proper security perimeter implemented. Intrinsic ID’s solution has gained excellent traction through a number of their customers and partners. InvenSense created its TrustedSensor concept using this PUF. NXP offers SRAM PUF in its LPC and i.MX platforms for secure microcontrollers. Synopsys DesignWare uses SRAM PUF in its ARC EM architecture for ultra-low-power embedded processors. Intel, Microchip, Renesas, and Samsung also offer products that utilize SRAM PUF.

Intrinsic ID has written a white paper that is available on their website that goes into greater detail on the technology of their SRAM PUF. Unique unclonable keys are an absolute necessity for the profitable proliferation of the IoT. With this technology, devices used for personal or commercial applications are secure from hacking and data interception. It is easy to implement SRAM PUF without the need for special processes or dependence on analog IP. In closing I’ll say it’s nice to finally write an article about how process variation can serve a beneficial purpose.


Intel Diversity Semiconductors

Intel Diversity Semiconductors
by Daniel Nenni on 11-12-2018 at 7:00 am

Growing up in a military family, mostly in California, I would consider my cultural diversity life experience to be more than most. I remember in the 1960s some older folks were chattering about a colored family moving into our neighborhood and they had a son my age. Imagine my excitement as a child in having a multicolored friend! As it turns out he was only one color but we were fast friends anyway.

The other diversity experience I had growing up was with my mother. She wanted to be a mechanical engineer but that was a challenge for women in the 1950s and even more so after having children. She ended up being a draftsperson for a NASA contractor. I remember visiting her at work and getting some very cool Apollo NASA stickers and gifts. The other thing I remember is that all of the drafts people were women which seemed kind of odd to a young mind. Clearly I was never going to be a draftsperson because you had to be a woman so I decided to be an astronaut because those were all men.

My mother’s theme song was “Anything You Can Do I Can Do Better” from Annie Get Your Gun and that was the way she lived. She bowled in the PWBA when it first started and was a full-on pool hustler. Her final job was at Lam Research, testing semiconductor equipment in Fremont. She really was a Rosie the Riveter of her era.

For my undergraduate degree I attended a university in Northern California that had nursing and teaching programs, so the female-to-male ratio was higher than most, but the engineering classes were still male-dominated. There were some women in computer programming classes but hardware classes were again all men.

When I joined the semiconductor industry in the 1980s it was not diverse at all up until the fabless semiconductor transformation in the 1990s. Yet another thing we can thank Morris Chang and TSMC for. Today I would say the semiconductor industry is diverse (as compared to other technology based industries) and that diversity really is the core strength of the semiconductor ecosystem, absolutely.

The semiconductor diversity exception is a few old school IDMs lagging behind which brings us to Intel.

In 2015 Intel announced a Diversity in Technology initiative, committing $300M to accelerate diversity inside Intel. I guess I wasn’t shocked when I saw the diversity slides based on my personal experience with Intel, but spending $300M for a quick fix to a years-long problem seemed puzzling at the time. You can see the 2015 slides HERE. Intel released a diversity update claiming “full representation” in its workforce two years ahead of schedule. You can see the 2018 slides HERE:


And here is the updated Intel diversity blurb:

A diverse workforce and inclusive culture are key to Intel’s evolution and they are the driving forces of our growth. In addition to being the right things to do, they are also business imperatives. If we want to shape the future of technology, we must be representative of that future. In January 2015, Intel announced the Diversity in Technology initiative, setting a bold hiring and retention goal to achieve full representation of women and underrepresented minorities in Intel’s U.S. workforce by 2020. The company also committed $300 million to support this goal and accelerate diversity and inclusion – not just at Intel, but across the technology industry. The scope of Intel’s efforts spans the value chain, from spending with diverse suppliers and diversifying its venture portfolio to better serving its markets and communities through innovative programs. Intel achieved its goal of full representation in its U.S. workforce in 2018, two years ahead of schedule. This achievement was the result of a comprehensive strategy that took into account hiring, retention and progression. However, Intel’s work does not stop here. We continue to foster an inclusive culture where employees can bring their full experiences and authentic selves to work.

So, let’s congratulate Intel on their diversity achievement. Hopefully now they can hire and retain the most qualified people without bias as to race or sex. Hey, wait, what about age diversity?


For Car Makers Google Scare Means It’s Time to Share

For Car Makers Google Scare Means It’s Time to Share
by Roger C. Lanctot on 11-11-2018 at 12:00 pm

Google says it wants to charge fees to handset makers in Europe for Android apps such as Google Maps and Gmail, according to the New York Times. The move is clearly a reaction to the $5.1B fine imposed by the European Commission (and under appeal by Google) in response to Google’s perceived monopolist practices.

Is the scare of Google hegemony enough to convince auto makers they need to share data in the interest of preserving their independence?

A key motivation behind the $3.1B acquisition of map maker HERE from Nokia by Daimler, BMW and Audi was to ensure the independence of HERE and access to its maps for support of in-vehicle navigation systems, mobility services and autonomous driving development. In the ensuing three years, the venture has failed to attract any additional auto maker investors even as Audi, BMW and Daimler have proceeded to share vehicle sensor data and expand the HERE platform.

The abiding concern regarding Google is the potential for the company to disrupt consumer relationships in the industry such that Google ultimately controls key customer engagement points – service delivery, content and application management, and any related advertising or marketing opportunities. It all comes down to browsing and search, which underpin Google’s $100B advertising portfolio.

The car is arguably the ultimate browser. Google wants to own that space.

Many auto makers have their own app platforms today, just as handset makers once did. In the handset space, most independent app stores were long ago eliminated by the dominant Google and Apple offerings. For auto makers the significance of the announcement is that it is a reminder of Google’s over-arching influence. It is enough to give pause to any auto maker considering the broader adoption of Google’s automotive services (i.e. Volvo, Renault) and to give impetus to those considering a tie-up with the HERE-Audi-BMW-Daimler venture.

The confrontation calls to mind the Microsoft Consent Decree arrived at in the U.S. nearly 20 years ago, which forbade Microsoft from bundling its Internet Explorer browser with its operating system. By the time that agreement was reached, the bundling of IE was a moot point and the importance of advertising was only just emerging.

Makers of Android-based smartphones had no initial comment for the New York Times to report, but the change will mean added cost for these devices that will have to either be absorbed or passed on to consumers.

Auto makers are watching developments closely, or should be, because the cost of implementing Android along with related Google-provided services and applications is a key consideration behind adopting the operating system. And many auto makers are in the process of doing just that – putting the Android operating system into their in-vehicle infotainment systems arriving in the market next year and beyond.

Implementing Android in cars is actually a relatively harmless process as no surrender of customer or vehicle data is necessary. Google has even intimated to auto makers that they will be able to add Google Voice to Android without surrendering customer control. But it may be time for auto makers to consider taking out some insurance in the form of a stake in the HERE joint venture. For its part, HERE will do well to give its best performance as a reliable alternative to Google.