
Margin Call
by Bernard Murphy on 06-04-2017 at 7:00 am

A year ago, I wrote about Ansys' introduction of Big Data methods into the world of power integrity analysis. The motivation behind this advance was introduced in another blog, which questioned how far margin-based approaches to complex multi-dimensional analyses could go. An accurate analysis of power integrity in a complex chip should look at multiple dimensions: a realistic range of use-case simulations, timing, implementation, temperature, noise and many other factors. But that would demand an impossibly complex simulation; instead we pick a primary topic of concern, say IR drop in power rails, simulate a narrow window of activity, and represent all other factors by repeating the analysis at a necessarily limited set of margin corners on those factors.
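
To make the combinatorial problem concrete, here is a minimal sketch of corner-based margining (illustrative only, not Ansys' flow; the factor names and values are invented): each secondary factor collapses to a few discrete points, and the analysis is repeated at every combination.

```python
from itertools import product

# Each secondary factor is reduced to a handful of discrete margin points.
corners = {
    "process":     ["slow", "typical", "fast"],
    "voltage_pct": [-10, 0, +10],    # supply margin in percent
    "temp_C":      [-40, 25, 125],
}

def analyze_ir_drop(corner):
    """Placeholder for one narrow-window IR-drop simulation."""
    pass

combos = list(product(*corners.values()))
print(len(combos), "corner simulations")   # 27 here, and growing
for combo in combos:                       # exponentially per added factor
    analyze_ir_drop(dict(zip(corners, combo)))
```

Note that correlations between the axes are invisible to this scheme; each factor is margined independently, which is exactly the gap described next.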


That approach ignores potential correlations between these factors. It worked well for simpler designs built in simpler technologies, but it is seriously flawed for multi-billion-gate designs targeted at advanced processes. Ignoring correlations forces you to design to worst-case margins, increasing area and cost, blocking routing paths and delaying timing closure, while still leaving you exposed: without impossibly over-safe margins you are gambling that worse cases don't lurk in hidden correlations between the corners you analyzed.


Ansys' big data technology (called SeaScape) aims to resolve this problem by getting closer to a true multi-dimensional analysis, using distributed processing to tap existing reserves of simulation, timing, power, physical and integrity data. This breadth of analysis should provide a more realistic view across multiple domains, delivering both efficiency and safety: you don't over-design for "unknown unknowns" and you don't under-design, because you see a much broader range of data. Ansys has had a year since my first blog on the topic, so it seems reasonable to call this – did they pull it off?

It's always difficult to get direct customer quotes on EDA technology, so I must talk here in general terms, but I believe there will be some joint presentation activity at DAC, so look out for that. The technology first appears in RedHawk-SC and has been proven in production with at least two of the top-10 design companies that I know of, which are building the biggest and most advanced designs around today. I was told that 16 of those designs are already in silicon and around twice that many have taped out.

Off-the-record customer views on the value-add are pretty clear. The most immediately obvious advantage is in run-times. Since much of the processing is distributed, they can get results on a block within an hour and on a (huge) full-chip overnight. It becomes practical to re-run integrity analysis on every P&R update. They can run four corners simultaneously for static IR, EM and dynamic voltage drop (DvD) transients. They can profile huge RTL FSDBs and solve multiple modes in parallel to find the vectors with the highest activity for EM and IR stressing. That provides the confidence to be more aggressive in reducing over-design, which in turn accelerates closure (fewer blockages). One customer also commented on the elasticity of this approach to analysis. Previously, running faster was capped by the capabilities of the biggest systems they could use for analysis. Now, since the analysis is distributed, they find it much easier to scale up by simply adding access to more systems.
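
The elasticity point is easy to picture: once each (block, corner) analysis is an independent job, capacity scales with the number of workers rather than the size of any single machine. A local-machine sketch of the idea (my own illustration; RedHawk-SC's actual infrastructure is a distributed system, not Python futures):

```python
from concurrent.futures import ProcessPoolExecutor

def run_corner(job):
    block, corner = job
    # Placeholder for one static-IR / EM / DvD analysis run.
    return block, corner, "pass"

jobs = [(b, c)
        for b in ("cpu_cluster", "gpu", "noc", "ddr_phy")
        for c in ("ss_cold", "ss_hot", "ff_cold", "ff_hot")]

if __name__ == "__main__":
    # Scaling up means raising max_workers (or adding machines);
    # the job list itself never changes.
    with ProcessPoolExecutor(max_workers=8) as pool:
        for block, corner, status in pool.map(run_corner, jobs):
            print(block, corner, status)
```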

Faster is always good, but what about impact on the final design? One very compelling customer example looked at die-size reduction. In that case they removed M2 over the standard cell rows, then added it back only where this more refined analysis showed it was needed to meet power integrity margins. Freeing up those resources for signal routing let them shrink P&R block sizes by 10%, reducing the overall die size by ~5%. That's an easily understood and significant advantage, enabled by big data analytics.

All this is great for teams building multi-billion-gate chips at 16nm or 7nm, but I was interested to hear that both customers also saw significant value in analyzing and optimizing blocks of 1M to 8M gates in around 50 minutes, which helped them close physical units faster and more completely than was possible before. So the technology should also have value for less challenging designs.

Given this, my call is that Ansys delivered on the promise. But don’t take my word for it. Check out what they will be presenting at DAC. You can learn more about SeaScape HERE.


Memory drives semiconductor boom in 2017
by Bill Jewell on 06-03-2017 at 7:00 am

The semiconductor market was down 0.4% in first quarter 2017 from 4Q 2016 and up 18.1% from a year ago, according to World Semiconductor Trade Statistics (WSTS). The 0.4% decline in 1Q 2017 versus 4Q 2016 is strong compared to an average 4% decline from 4Q to 1Q over the previous five years. The relative strength in 1Q 2017 was driven by a strong memory market. The three largest memory companies – Samsung, SK Hynix and Micron Technology – grew their revenues a combined 10% in 1Q 2017 versus 4Q 2016. Excluding these three companies the semiconductor market declined 3.7%, in line with recent seasonal trends.
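
Those three numbers are mutually consistent only for a particular memory share of the market, which you can back out with a little algebra (my own back-of-envelope, not a WSTS figure):

```python
total_growth  = -0.004   # whole market, 1Q 2017 vs 4Q 2016
memory_growth =  0.10    # Samsung + SK Hynix + Micron combined
ex_mem_growth = -0.037   # market excluding those three

# Growth rates combine weighted by share m of the prior-quarter market:
#   total = m * memory + (1 - m) * ex_mem   =>   solve for m
m = (total_growth - ex_mem_growth) / (memory_growth - ex_mem_growth)
print(f"implied memory-company share of 4Q 2016 market: {m:.0%}")  # ~24%
```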

Memory will help drive solid 2Q 2017 growth over 1Q 2017. Micron Technology expects 16% growth in its fiscal quarter ending this month versus the prior quarter. Samsung and SK Hynix did not provide 2Q 2017 guidance, but both companies cited strong demand and healthy price trends for both DRAM and flash memory. With the exception of Intel – which is expecting a 2.7% decline – the top non-memory semiconductor companies have guided for healthy 2Q 2017 revenue growth. The midpoints of their guidance range from 2.3% for MediaTek to 5.0% for STMicroelectronics, and the high ends range from 4.5% for MediaTek to 11.6% for Qualcomm. Qualcomm cut $500 million (about 10 percentage points of growth) from its initial guidance due to a royalty dispute with Apple. NXP Semiconductors did not provide guidance since its acquisition by Qualcomm is pending. Toshiba's reporting has been delayed by financial problems and it is in the process of selling off its memory business. Intel's projected revenue decline and Samsung's strong memory growth are expected to result in Samsung passing Intel as the world's largest semiconductor company in 2Q 2017, according to IC Insights.

The outlook for full-year 2017 semiconductor market growth has improved following the robust start to the year. Recent forecasts range from 11% from IC Insights to 16% from us at Semiconductor Intelligence, about 5 to 6 percentage points higher than the forecasts the same companies made in the January-February time frame. Forecasts for 2018 include 3.5% growth from Mike Cowan and 7.0% growth from Semiconductor Intelligence. Our 2018 outlook is based on moderating memory demand and stable trends in the economy and electronic equipment markets.

The global economic outlook for 2017 and 2018 is solid, according to the latest forecast from the International Monetary Fund (IMF). The table below shows the IMF's April 2017 forecast for annual GDP percent change and the percentage-point change in GDP growth rate (acceleration or deceleration). The IMF expects global GDP growth to pick up from 3.1% in 2016 to 3.5% in 2017 and 3.6% in 2018. The advanced economies should see modest growth of 2.0% in 2017 and 2018. Among the key countries in this category, improvement in the U.S. is offset by flat or decelerating growth in the Euro area, the United Kingdom and Japan. The global acceleration is driven by emerging and developing economies. Within this category, lower growth rates in China are offset by accelerating growth in India and the ASEAN-5 (Indonesia, Malaysia, Philippines, Thailand and Vietnam). Russia and Latin America should also return to growth in 2017 and 2018 after GDP declines in 2016.

16% growth in the semiconductor market in 2017 does not seem like much of an upturn compared to prior peak growth years (32% in 2010, 28% in 2004 and 37% in 2000). However, it will be the first double-digit growth in seven years and follows a flat 2015 (-0.2%) and weak 2016 (1.1%). But all good things must come to an end. Memory booms are always followed by memory busts, usually dragging the overall semiconductor market negative. This could happen as early as 2019.


Is ARC HS4xD Family More a CPU or DSP IP Core?
by Eric Esteve on 06-02-2017 at 4:00 pm

When I had to define the various IP categories (processor, analog & mixed-signal, wired interfaces, etc.) to build the Design IP Report, I scratched my head for a while over the main processor category: how should the sub-categories be defined? Not that long ago it was easy to tell a CPU IP core from a DSP IP core. Today, while a DSP is still clearly dedicated to digital signal processing, a CPU IP may also support that type of task on top of the main processing/control function it was initially designed for. Synopsys' new DesignWare ARC HS4xD family is a perfect example: a RISC CPU IP core offering 5.0 CoreMark/MHz (so we should rank it in the CPU IP category) that is also capable of high-performance pure DSP processing (so can we rank it in the DSP IP category?).

Let's make it clear from the beginning: the HS44, HS46 and HS48 execute RISC-only operations, while the HS45D and HS47D execute both RISC and DSP operations (through ARCv2DSP). When combining RISC and DSP capabilities in a processor, the key is the software tools and library support, allowing seamless C/C++ programming and debug.

All the cores support dual-issue, increasing utilization of the functional units with a limited amount of extra hardware. What is dual-issue? The ability to issue up to two instructions per clock, with in-order execution and the same software view as single-issue. Dual-issue increases both RISC and DSP performance, while the area and power penalty is modest, at only a 15% increase. The instruction set has also been improved to raise instructions per clock, allowing multiple instructions to execute in parallel and take advantage of the dual-issue pipeline.
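
A toy model helps show why pairing is cheap and why the software view doesn't change (this is a sketch of in-order dual-issue in general, not of ARC's actual pairing rules): two adjacent instructions share a cycle only when the second doesn't read what the first writes.

```python
def depends(later, earlier):
    """True if `later` reads a register that `earlier` writes."""
    return earlier["dst"] in later["src"]

def cycles_dual_issue(instrs):
    i = cycles = 0
    while i < len(instrs):
        cycles += 1
        # Pair with the next instruction only if it is independent;
        # execution stays in order either way.
        if i + 1 < len(instrs) and not depends(instrs[i + 1], instrs[i]):
            i += 2
        else:
            i += 1
    return cycles

# One iteration of a dot-product kernel: load, load, multiply, accumulate.
kernel = [
    {"dst": "r1", "src": ["r8"]},        # load a[i]
    {"dst": "r2", "src": ["r9"]},        # load b[i] (pairs with load above)
    {"dst": "r3", "src": ["r1", "r2"]},  # r3 = r1 * r2
    {"dst": "r4", "src": ["r4", "r3"]},  # acc += r3 (depends on multiply)
]
prog = kernel * 8
print(cycles_dual_issue(prog), "cycles vs", len(prog), "single-issue")
```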

While all the cores support Instruction and Data Closely Coupled Memories (CCMs) from 512B to 16MB, the designer will have to select the HS46, HS47D or HS48 to benefit from instruction and data caches of up to 64KB, with support for cache coherency. The L2 cache (from 2MB to 8MB) is available as an option for the HS46 and HS47D, as is an MMU, and is supported by default in the HS48 core.

Such a core family can support a very wide range of applications, thanks to its high level of configurability. For example, all the cores support multi-core implementation, with single, dual or quad instances. Moreover, Synopsys offers various licensable options, such as an FPU, MPU, MMU, real-time trace (RTT), L2 cache, FastMath Pack, cluster DMA and a CoreSight interface.
The HS4x RISC-only family can address enterprise SSD processing, home networking, automotive control, wireless control and home automation.
The HS4xD family adds support for mobile baseband, voice/speech applications, multi-channel home audio and human-machine interfaces.

The HS4x(D) family has been tailored for embedded applications, where power budgets are fixed at best and often shrinking. Power-domain support has been extended across the cores, offering user control over power management.
Every CPU IP vendor will claim to offer the best solution, which is why it is wise to look at verified facts when comparing with the competition. Let's talk about performance efficiency rather than raw performance, as most of these applications need tight control of the power budget. Synopsys claims best-in-class performance efficiency versus the competition, with the same or better features.
Some facts:

  • 45% higher performance than the Cortex-A9 at ½ the power consumption
  • 2x higher performance than the MIPS InterAptiv or Cortex-A7 at 20% lower power consumption
  • 2.5x higher performance than Cadence Tensilica processors
  • HS4x cores can be clocked at over 2.5 GHz in 16FF (typical), faster than any core in this class
  • The HS48x2 delivers higher performance than the Cortex-A17… at lower power than the Cortex-A9
  • The HS family supports up to 8 contexts, while ARM and Cadence support only 1

So, "should we rank this DesignWare HS4xD IP core family in the CPU or the DSP category?" is probably not the most crucial question. The real point is which competitor will challenge this HS4xD family, and when!

By Eric Esteve from IPnest


AIM Photonics Catching Its Stride as They Move into 2nd Year
by Mitch Heins on 06-02-2017 at 7:00 am

AIM Photonics held its 2017 Proposers Meetings on May 24th in Rochester, NY. The meetings included a review of AIM's progress and strategic direction by their TRB (technical review board) and a session targeted at PIC (photonic integrated circuit) design for multi-project wafer (MPW) runs. While these discussions were covered under non-disclosure agreements, it's easy to see from public postings in the news and on the AIM website that significant progress has been made by the institution whose mission it is to "advance integrated photonic circuit manufacturing technology development while simultaneously providing access to state-of-the-art fabrication, packaging and testing capabilities for small-to-medium enterprises, academia and the government". I've pulled together a summary of some AIM PIC design related highlights based upon data publicly available on the AIM website.

The PIC Design for MPW session was chaired by Brett Attaway, the AIM Photonics EPDA (Electronic/Photonic Design Automation) Director. In a posted interview from this session, Brett pointed out that the goal of AIM's EPDA work is to enable the design community with MPW and eventually TAP (test and packaging) services for PIC designs. This includes the development of AIM Photonics PDKs (process design kits) as well as electronic/photonic design flows and methodologies. The first AIM PDK was released in June of 2016 (v0.3). A second release was made in September of the same year (v0.5) and a third in early January of 2017 (v1.0). Plans are to make major PDK releases twice per year, with v1.5 currently targeted for August of 2017 and v2.0 and v2.5 being released in January and July of 2018 respectively.

PDK releases include three variants: one for passive devices, one for active devices and one for a photonic interposer. The interposer enables the integration of electrical and photonic ICs, as well as lasers, into the same package. Per the AIM website, the passives portion of the PDK includes components such as silicon and silicon nitride versions of waveguides, edge couplers, vertical couplers, 3dB 4-port couplers, Y-junctions, directional couplers, crossings and an interesting device known as an escalator coupler. The escalator coupler enables designers to move light from layer to layer, sort of like a photonic via. The actives portion of the PDK includes components such as digital and analog versions of germanium photo-detectors and Mach-Zehnder modulators. Also included are thermo-optic phase shifters and switches, as well as tunable filters and micro-disk switches and modulators. AIM plans five MPW runs in 2017: 2 full-flow runs with actives and passives, 2 passives-only runs and 1 interposer run for integration work. MOSIS acts as the AIM MPW aggregator and distributor of AIM PDKs.

AIM PDKs include support for documentation and CAD views enabling schematic capture, simulation, layout and design rule checking for a variety of flows including:

  • Cadence Virtuoso + Lumerical Solutions INTERCONNECT + PhoeniX Software OptoDesigner for mixed electrical-photonic design.
  • Mentor Graphics Pyxis + Lumerical Solutions INTERCONNECT + PhoeniX Software OptoDesigner for mixed electrical-photonic design.
  • Mentor Graphics Calibre for sign-off design rule checking including design-for-manufacturing and simulation of advanced lithographic effects.
  • Synopsys OptSim Circuit + PhoeniX Software OptoDesigner for PIC design. Synopsys component level simulation tools can also be used in conjunction with the AIM processes.
  • Lumerical component level photonic simulation tools + INTERCONNECT for PIC design.
  • PhoeniX Software photonic layout and component level simulation tools + ASPIC for PIC design.
  • There is also an interface between Lumerical Solutions and PhoeniX Software for PIC design.


The AIM Proposers Meetings are meant to solicit input for next year's funded AIM projects. Per the video with Brett Attaway, one of the key items AIM is pursuing is to continue a project started in 2016 to create photonic reference designs that can be duplicated across the supported EPDA design flows. Per a presentation Brett made at the Optical Fiber Conference in March of this year, the current reference design is focused on an integrated transceiver with PIC and CMOS designs, along with some efforts to collaborate on ways to ease PDK creation. Brett mentioned that he would like to see the current project expanded in 2018 to put more focus on efficient system-level design of photonic systems, including interface modeling between the ICs (electronic and photonic) and AIM's interposer technology.

Additional projects were discussed behind closed doors, but it's a sure bet that the rest of the proposed projects will have something to do with one of the four KTMAs (Key Technology Manufacturing Areas):

  • Telecom/Datacom
  • RF Analog Applications
  • PIC Sensors
  • PIC Array Technologies

or one of the four MCEs (Manufacturing Innovation Centers of Excellence):

  • EPDA: Electronic Photonic Design Automation
  • MPWA: Multi Project Wafer / Assembly
  • ICT: Inline Control & Test
  • TAP: Test Assembly and Packaging

AIM is pushing hard to enable the ecosystem and there is much activity in the marketplace, as both members and non-members take advantage of the MPW services on offer today. It looks like AIM is hitting its stride, which is good: not only do they need to enable the ecosystem, they must also be self-funding by the time their five-year government funding expires sometime in 2020.

Time flies when you’re having fun and right now time seems to be flying at the speed of light for AIM Photonics.



Getting to IP Functional Signoff
by Bernard Murphy on 06-01-2017 at 7:00 am

In the early days of IP reuse and platform-based design there was a widely-shared vision of in-house IP development teams churning out libraries of reusable IP, which could then be leveraged in many different SoC applications. This vision was enthusiastically pursued for a while; this is what drove reusability standards and cost-metrics, among other initiatives. But shifts in markets and fierce competition disrupted the in-house ideal. IP and EDA vendors offered extensive and growing libraries for standard IP, proven over many more designs and in many more processes than most in-house design teams could match. And for chip-vendors, the cycle time and cost to make existing IP truly reusable became increasingly difficult to justify in the face of tougher competition and squeezed schedules.


This became apparent in a retrenchment toward adapting internal assets as needed, design by design, rather than investing much in forward-looking reuse objectives; when you're fighting to stay in the game, tactical priorities tend to overrule long-term strategies. Now it seems the outlook for many semiconductor suppliers has become more stable, and EDA vendors like Cadence see a return to separate IP development teams and a resurgence in demand for reusability. This is motivating a greater expectation of RTL signoff for IP; after all, reuse is meaningless if an IP must be reworked and re-verified for every design.

Pete Hardee (product management director at Cadence) told me that chip verification teams are now demanding a higher level of functional quality from IP teams than they expected in the past, because they no longer have time to debug IP problems. Naturally this requires IP development teams to make a bigger investment in dynamic verification, and also to start investing in formal verification; when you don't know in advance how an IP will be used, the more complete checking offered by formal methods becomes important. But there's a challenge – IP teams can't afford to staff formal experts; they must be self-sufficient, so investment in this area must require minimal formal expertise.

In support of this need, Cadence in their JasperGold product was probably the first to provide a range of auto-prove apps requiring little to no formal expertise, and has recently announced significant customer validation for two of these: Superlint and clock domain crossing (CDC) analysis. The Superlint app includes the standard HAL checks, along with checks requiring formal such as overflow and underflow (no, it's not just a width check), controllability and observability (for testability analysis), and FSM livelock and deadlock checks. CDC analysis includes structural checks (with support for multiple synchronization styles) along with reconvergence analysis and a range of functional checks, such as correct gray-coding on FIFO pointers.
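
To make that last check concrete: the property is that successive values of a FIFO pointer crossing a clock domain differ in exactly one bit, so a mis-sampled transition can be off by at most one position. A small Python illustration of the property itself (JasperGold proves this formally and exhaustively; this sketch just shows what is being checked):

```python
def binary_to_gray(n):
    return n ^ (n >> 1)

def is_gray_sequence(values, width):
    """Every adjacent pair differs in exactly one bit (append the first
    value at the end to check the wraparound too)."""
    mask = (1 << width) - 1
    return all(bin((a ^ b) & mask).count("1") == 1
               for a, b in zip(values, values[1:]))

good = [binary_to_gray(i) for i in range(8)] + [binary_to_gray(0)]
print(is_gray_sequence(good, 3))   # True: safe to synchronize

bad = list(range(8)) + [0]         # a plain binary counter
print(is_gray_sequence(bad, 3))    # False: e.g. 3 -> 4 flips three bits
```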

A very nice feature they have added is formal-supported waiver management. CDC analysis can be very noisy, producing many potentially false violations, not because the analysis is inaccurate but because a lot of what determines correct design for CDCs depends on design intent. A good example (and a source of a lot of false violations) comes from quasi-static signals.

These signals, often used for configuration control, in theory could switch at any time but in practice commonly (though not always) switch only during power-up or reset or other phases where synchronization concerns may be minimal. Since there can be a lot of these, avoiding synchronizers where possible can save useful area – but note the caveats in the previous sentence. Not every such case is a safe candidate to drop a synchronizer – some reconfiguration may be possible during active design use. So how do you figure out which of these violations are potential quasi-statics and which are safe to ignore?

JasperGold CDC will generate and auto-prove assertions to determine if violations result from quasi-statics. These will drill back to root-causes, often catching a lot more potential quasi-statics in the process. Of course, you’re going to have to make the final decision on whether the root-cause indicates those cases are indeed safe. But with minimal involvement, no formal expertise but still with high confidence you can waive lots of violations and get more quickly to clean CDC signoff. The app also supports dumping assertions for additional checking in Xcelium simulation.
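
As a crude picture of that classification (and only a picture: a simulation trace can suggest a signal is quasi-static, but it cannot prove the absence of later toggles, which is precisely why proving the generated assertions formally matters), imagine sorting violations by when their source signals toggle:

```python
def quasi_static_candidates(toggle_times, reset_end):
    """Signals whose only observed toggles fall inside the reset window
    are candidates for waiving their CDC violations."""
    return {sig for sig, times in toggle_times.items()
            if all(t <= reset_end for t in times)}

toggles = {
    "cfg_mode":   [5, 12],           # programmed during reset only
    "cfg_dbg_en": [7, 900],          # reprogrammed at runtime: not safe
    "irq_req":    [40, 55, 60, 71],  # genuinely asynchronous traffic
}
print(quasi_static_candidates(toggles, reset_end=20))  # {'cfg_mode'}
```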

Cadence has endorsements from ARM and ST for these technologies. ARM, being ARM, did a detailed analysis (reported at the last Jasper User Group meeting) of how using Superlint accelerated bug hunting during RTL development and pulled in RTL signoff, reducing the need for late-stage RTL changes by as much as 80%. ST commented on how the CDC app increased quality of design and chopped up to 4 weeks off design and verification time for each IP.

This is important – as much for how formal is becoming important in IP RTL development as for the apps themselves. The whole point of reuse is to reduce overall design time and increase design quality by sharing proven IP. Improving IP quality through better RTL signoff is an important way to get there. You can learn more HERE.


Embedded FPGA IP update — 2nd generation architecture, TSMC 16FFC, and a growing customer base
by Tom Dillinger on 05-31-2017 at 12:00 pm

Regular Semiwiki readers are aware that embedded FPGA (eFPGA) IP development is a rapidly growing (and evolving) technical area. The applications for customizable and upgradeable logic in the field are many and diverse — as a result, improved performance, greater configurable logic capacity/density, and comprehensive testability are customer requirements of increasing importance.

I recently had the opportunity to chat with Geoff Tate, CEO, and Cheng Wang, Senior VP of Engineering, at Flex Logix about the expansion and advancements in the eFPGA market. Flex Logix has just announced their "second-generation" array architecture, with initial silicon validation targeting TSMC's 16FFC technology offering — the discussion of the features incorporated in this new IP generation was especially insightful.

eFPGA performance

Geoff highlighted, "A large cross-section of our customers are focused on performance. We made a key change to our basic architecture — 6-input LUTs (also available as dual 5-input LUTs) replace the 5-input (dual 4-input) topology of our first-generation design."

I countered that there is a contingent of FPGA users recommending 4-input LUTs as the preferred tradeoff between logic mapping and configuration memory area. Cheng provided some interesting data to counter that assertion: "Our networking customers require high packet-processing throughput. This application leverages the high fan-in functions available with 6-input LUTs to reduce the number of logic levels in each pipeline stage. Additionally, we have optimized our unique hierarchical interconnect topology for improved performance on larger eFPGA arrays."

As for the logic mapping efficiency with higher fan-in LUTs, Cheng provided a comparative data point:

ARM Cortex-M0 microcontroller:

  • 3905 4-input LUTs
  • 3089 6-input LUTs

Cheng highlighted that, given the eFLEX2.5K core granularity when building an array, the Cortex-M0 consumed 1 DSP and 2 logic cores in their first-generation architecture, whereas the new (6-input LUT) design maps the Cortex-M0 into 1 DSP and 1 logic core. (For a review of the Flex Logix eFLEX design approach, which supports arrays of 2.5K-LUT cores and interconnect, please refer to a previous SemiWiki article — link.) Synthesis algorithms are clearly successful in leveraging higher fan-in LUT cells.
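
The arithmetic behind that data point, plus a back-of-envelope of my own on the configuration-memory side of the tradeoff (a k-input LUT stores 2^k truth-table bits):

```python
lut4 = 3905   # Cortex-M0 mapped to 4-input LUTs
lut6 = 3089   # same design mapped to 6-input LUTs

print(f"{1 - lut6 / lut4:.1%} fewer LUTs")   # ~20.9%

# Raw truth-table storage still favors the smaller LUT (16 vs 64 bits
# each), so the 6-input win comes from fewer logic levels and fewer
# cores consumed, not from the configuration bit count.
print("config bits:", lut4 * 2**4, "vs", lut6 * 2**6)
```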

Cheng referred to several RTL benchmarks indicating that the combination of the new core and hierarchical interconnect provides 25% performance gains over the previous eFPGA arrays (at the same process node).

eFPGA logic density — moving aggressively from 28nm to 16FFC

The Flex Logix architecture and compiler support a range of eFLEX2.5 core instances, in array configurations as large as 7x7, for a total capacity exceeding 100K LUTs. They recently released a full-array testsite to TSMC's 16FFC process node.

Figure 1. Image of the 7×7 array of eFLEX2.5 cores integrated on a 16FFC testsite.

Cheng continued, "Our customers are enthusiastic about the PPA characteristics of 16FFC. There is significant momentum behind this node — we are seeing consolidation behind the 1P2xa1xd3xe PDK, which we used for the testsite — we have optimized the use of six metal levels within and between cores."

Configuration Readback

An important feature of any programmable logic implementation is the ability to read back the configuration bits, as part of production test and/or during functional runtime. "Customers require the capability to verify the configuration data at any time," Cheng said. "The SRAM read operation is available with little additional hardware overhead, with the bits visible through the configuration chain."

DFT and Production eFPGA Test

"Customers are also requiring very high (stuck-at) fault coverage during production test," Geoff emphasized. "And, for sure, tester time, and thus cost, must be optimized as well."

Naively, I suggested that the eFPGA could simply adopt the "standard" embedded IP core wrap-test architecture. Cheng educated me on the unique characteristics of eFPGA test: "The overhead of loading configuration bits to a large array as a single entity in wrap-test fashion — with 1.4M configuration SRAM bits per core — would be prohibitive. We needed an aggressive approach to parallelizing the loading of configuration bits and exercising test patterns. Given the multiple eFLEX2.5 core instances that comprise the embedded IP, we re-architected the core and array compiler to enable common configuration bits to be applied to each core in parallel during production test."

Figure 2. 2nd-gen eFLEX2.5 core block diagram (left), with DFT connectivity defined for each core (right)


Figure 3. Parallel loading of common configuration bits and test patterns to each core in the array.
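
The tester-time motivation is simple arithmetic. A rough sketch using the article's 1.4M bits per core; the configuration-load rate is my own assumption, not a Flex Logix figure:

```python
bits_per_core = 1_400_000   # configuration SRAM bits per core (article)
cores         = 49          # a 7x7 array
bit_rate_hz   = 100e6       # assumed configuration-load rate

serial_s   = cores * bits_per_core / bit_rate_hz  # one core at a time
parallel_s = bits_per_core / bit_rate_hz          # same bits, broadcast
print(f"serial {serial_s:.2f}s vs parallel {parallel_s:.3f}s "
      f"({serial_s / parallel_s:.0f}x less config-load time)")
```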

Cheng continued, "We developed a core model that enables commercial DFT tools to generate patterns. We're achieving well in excess of 98% stuck-at fault coverage. We will collaborate with customers to develop additional patterns focused on primary I/O and inter-core logic to bring the coverage well above 99%, if required." It's clear to me now that an embedded FPGA is definitely not like other IP when addressing production test pattern development.

The eFPGA market is evolving rapidly — customers are requiring improved performance, greater capacity, (runtime) configuration visibility, and improved production test coverage and efficiency. The Flex Logix development team is responding to these requirements with corresponding innovations in their "second-generation" architecture.

For more information on their recent eFPGA release, please follow this link.

-chipguy


RTL Correct by Construction
by Bernard Murphy on 05-31-2017 at 7:00 am

Themes in EDA come in waves and a popular theme from time to time is RTL signoff. That's a tricky concept; you can't sign off RTL in the sense of never having to go back and change it. But the intent is still valuable – to get the top-level or subsystem-level RTL as well tested as possible, together with collateral data (SDC, UPF, etc.) clean and synchronized with the RTL, minimizing iterations and schedule slips in full-system verification and implementation – the true signoff steps.

I talked to Chouki Aktouf, CEO at DeFacto, about his ambitions in this area. You probably know this company from their work in early design-for-testability support at RTL and their very popular capabilities for scripted editing of RTL. Building on these strengths, particularly in managing and editing RTL, Chouki has pivoted the focus of the company to SoC RTL integration and design reuse (in the true sense of that term), in both cases with an aim to be correct by construction. This is represented in their STAR solution flow.

Naturally an important part of completing RTL signoff must be functional verification, but it would be a mistake to think this is the only important part. SoC design methodologies, based heavily on pre-designed IPs interwoven with communication, control, test/debug and other fabrics, lend themselves to correct-by-construction design methods. The less designers touch the assembly, the fewer mistakes will be made; the STAR focus is on assembly and on coherency between the assembled RTL and implementation-related data such as constraints and early floorplanning views. Getting this right naturally doesn't eliminate the need for verification, but it can eliminate or greatly reduce the verification wasted on finding assembly problems and constraint mismatches.


The center of STAR is a scriptable assembly tool to instantiate and connect RTL and/or IP-XACT blocks. Think of this as the promise of IP-XACT assembly, generalized to help not just IP-XACT fans but also those who are happy to stick with RTL. Scripted assembly makes automated and parametrized assembly possible, something you may have seen in spreadsheet approaches but here more flexible. There are also connectivity checking and reporting features (e.g. reporting clock connections from PLLs to IP clock inputs) which can greatly simplify hookup checking.
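
To give a feel for what scripted assembly plus connectivity reporting looks like, here is a deliberately simplified Python sketch; the class and method names are invented for illustration and are not DeFacto's actual STAR API:

```python
class Design:
    """Minimal stand-in for a scriptable assembly session (hypothetical)."""
    def __init__(self, name):
        self.name, self.instances, self.nets = name, {}, []

    def instantiate(self, module, inst):
        self.instances[inst] = module
        return inst

    def connect(self, src_pin, dst_pin):
        self.nets.append((src_pin, dst_pin))

    def report_connections(self, src_prefix):
        # e.g. report clock connections from a PLL to IP clock inputs
        return [n for n in self.nets if n[0].startswith(src_prefix)]

soc = Design("my_soc")
soc.instantiate("pll_x2", "u_pll")
soc.instantiate("noc_4x1", "u_noc")
for i in range(4):                       # parametrized assembly
    cpu = soc.instantiate("cpu_core", f"u_cpu{i}")
    soc.connect("u_pll/clk_out", f"{cpu}/clk")
    soc.connect(f"{cpu}/axi_m", f"u_noc/axi_s{i}")

print(soc.report_connections("u_pll"))   # hookup check: 4 clock connections
```

The point is not the few lines of scripting but that the generator, not a human, owns the hookup, and the same script regenerates it when the design changes.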

Why is it valuable to look at RTL together with other views? Because these views have become deeply intertwined. Think about power management through switched domains. Domains are planned at RTL, reflected in the existing RTL hierarchy. But each switchable domain consumes physical resources (the area it occupies, the switch overhead and possibly PG overhead). Combining domains with the same power-switching attributes can reduce the area consumed and simplify the power distribution network. To understand whether this makes sense, you need to look at the RTL, the UPF and early floorplan data together. To implement the change you must be able to restructure the RTL to push those blocks into a common hierarchy, and reflect the corresponding changes in the UPF.

Or think about optimizing design layout by running feedthrus through blocks. This is a well-known technique to reduce routing overhead for critical signals. Historically it was a purely physical design problem – not much to do with the RTL. But in our power-managed world feedthrus must be reflected in the RTL and the UPF, otherwise verification and/or equivalence checking breaks. Again, you need a mechanism to restructure the RTL to rip-up and re-stitch signals reflecting a feedthrough. The point here is that getting to correct by construction RTL in modern design is not just about scripting the assembly, it’s also about being able to adapt the design as power and physical strategies change. That becomes very difficult for in-house scripts to contemplate without major rework and is essentially impossible if big chunks of the hierarchy are in RTL. Unless you have a solution like STAR which can understand and modify RTL (and other files) to reflect these changes.

I want to elaborate on my earlier point about design reuse. This term is often used somewhat loosely to mean IP reuse. But reuse of a full-chip (or subsystem) design is also a very valuable and very common starting point for derivative designs. Here there can be an important difference from IP reuse. Typically, you don’t take the whole design as a black-box. You use a lot of it, but you may want to lose some IPs and add others. Power and physical constraints will often change. A DDR interface in the original design was carefully arranged to sit next to associated IO pads. But in the derivative the block must be moved to a new location which may require hierarchy and feedthrough changes – requiring the same capabilities we saw earlier.

There are more functions in the STAR solution – to help generate early top-level UPF, to check UPF for consistency with the RTL, and to provide consistency checking between RTL and SDC. These checks are all about consistency and completeness – the components important to that RTL signoff goal. I should add that STAR also provides, in addition to Tcl scripting, rich APIs accessible through Python, Java, C++ and Ruby, through which you can develop your own checks and assembly generators.

Chouki wasn't prepared to name customers (though I know of a few important cases, especially from their scripted RTL editing capabilities). He did say that users split between designers who want to script directly in support of their immediate design objectives and CAD teams who see an opportunity to develop in-house applications. I'm a believer in this approach. A lot of SoC assembly and preparation for implementation is largely mechanical, and there's not a huge amount of value in wasting design expertise on mechanical tasks. More automation and more certainty in moving quickly into the implementation phase can only be a good thing. You can learn more about DeFacto HERE.


Consolidation and Design Data Management
by Bernard Murphy on 05-30-2017 at 7:00 am

Consensia, a Dassault Systèmes channel partner, recently hosted a webinar on DesignSync, a long-standing pillar of many industry design flows (count ARM, Qualcomm, Cavium and NXP among its users). A motivation for this webinar was the impact semiconductor consolidation has had on the complexity of design data management, particularly as acquired design groups bring with them design flows based on different design data management (DDM) solutions. Acquisitions are made at least in part with the expectation that complementary functionality can be combined to create even stronger product differentiation. Bringing together design components from different flows raises challenges, perhaps manageable in most areas by carefully checking view and constraint files at interfaces, but synchronization between flows becomes a lot harder when it comes to DDM.


Dave Noble (VP Biz Dev at Consensia) told me of one example in which an SoC was being assembled from AMS components managed in DesignSync, together with the bulk of the digital circuitry managed under SVN (Subversion). See the problem? There's no linkage between these DDMs, so when you want to check out a golden release or a previous release, managing the bridge (or bridges) between the repositories is a manual task. Dave said in this example it took 25 engineers a week to pull together a release snapshot they were sure they could trust. And even then, there was no traceable genealogy across the design. I'd imagine similar challenges when you want to archive the design for post-silicon support and derivatives. Certainly you would have to archive both repositories, but you would also need to archive the snapshot along with instructions on how it was generated.

From a tool/flow point of view, life would be simple if you could mandate that everyone switch to one supplier. But that's wildly unrealistic in any production environment. Consensia takes a more pragmatic approach, recognizing that mixed DDM environments aren't going to go away anytime soon. DesignSync can not only manage individual components of a design (IP and subsystems) but can also provide enterprise-level release management across a mixed DDM hierarchy.

DesignSync recognizes other repositories, such as Perforce, SVN and Git, and can use external hierarchical references to "pull" information from them, enabling it to manage an SoC/IP release. In fact, it uses native commands to interoperate with these other repositories, so it can do more than simply pull – I am told it can also do things like retrieve status and add tag information.
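
Conceptually, the enterprise-level release is a manifest that pins every component to its repository type and revision, with checkout dispatched to each DDM's native commands. A toy sketch (names and URLs invented) of the bookkeeping that DesignSync automates:

```python
release = {
    "analog_pll": {"ddm": "designsync", "url": "sync://vault/pll", "ref": "GOLD_2017_05"},
    "cpu_subsys": {"ddm": "svn",        "url": "svn://repo/cpu",   "ref": "r48213"},
    "noc_fabric": {"ddm": "git",        "url": "git@host:noc.git", "ref": "9f3ac21"},
}

def checkout(manifest):
    """Reconstruct a snapshot by dispatching to each repo's native tool."""
    for block, ref in manifest.items():
        print(f"fetch {block}: {ref['ddm']} {ref['url']} @ {ref['ref']}")

checkout(release)   # reproducible and traceable -- no manual bridging
```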


You probably know already that DesignSync is integrated with Cadence Virtuoso, Synopsys Galaxy and Keysight ADS (formerly from Agilent), simplifying debug in integration. But most important in this context, DesignSync as the SoC-level DDM provides a trackable, traceable release capability while allowing different design teams each to continue using their preferred DDM system. Checkout is automated, genealogy retrieval is automated, and you can archive repositories confident that, whenever needed, you can reconstruct release data or roll back to earlier versions.

There's another very nice capability that I can only touch on, since Consensia will be announcing more in this area at DAC. This relates to the many complex aspects of IP management, both internal and external, including whether you are allowed to use an IP (maybe other designs have already consumed all the licenses your organization paid for), whether you can edit the IP or even view it, whether usage of this version of the IP complies with policy guidelines (perhaps it hasn't yet been proven in silicon) or more general requirements (e.g. ITAR compliance), and so on. This is comprehensive IP management, essential to avoid nasty late-stage surprises around technical, business or regulatory problems. And it's integrated with DesignSync.

DesignSync has a long track record (15 years) and may today be top of the pile in design seats (especially for management in large design teams), so it's worth checking out the webinar to understand better how it might fit into your enterprise design management strategy. Also look out for more webinars along this theme; this was one in a series in which Consensia plans to focus on IP management and the product's role in enabling customers to securely and transparently manage their internal and external IP, especially through increasing interoperability with third-party products.