
S2C eyeing 1B gate FPGA-based prototypes

by Don Dingee on 04-21-2015 at 1:00 pm

We hear a lot about FPGA-based prototyping hardware: Aldec, Dini Group, PRO DESIGN, Synopsys, and others. So, why is today’s news on a new platform from S2C important? It’s a matter of intent, beyond the act of gluing a few large FPGAs on a board for customers to dump more and more prospective RTL into.

Size differences aside, each vendor has an emphasis. At the risk of oversimplification ….

Aldec is concentrating on speed of the basic implementation, and incorporating actual target hardware on FMC daughtercards, handy in situations requiring DO-254 compliance. Dini focuses on application platforms, such as FPGA-based algorithm acceleration for financial trading. PRO DESIGN is developing a modular approach to mix and match both FPGA architecture and I/O capability. Synopsys is leveraging their FPGA synthesis knowledge to create better partitioning and smoother upward integration for IP blocks into complete systems.

Unsurprisingly, in this field of worthy contenders, S2C has chosen their own emphasis in launching the Prodigy Complete Prototyping Platform. S2C isn’t a new company; they have been building FPGA-based hardware since 2004 and have deep connections, particularly in China – Daniel Nenni has an article coming shortly with a look at the history of the firm.

In the opening of their press release, they make a rather sweeping statement: “any functional design stage, with any design size, and across multiple geographical locations.” Looking at the specifics, their strategy – at least what they’ve announced so far – looks like an amalgam of the competition with some unique additions, and a hint at what’s coming.

Where most of the vendors are concentrating on the biggest Xilinx FPGA they can find (PRO DESIGN the exception), S2C has a selection of Xilinx Virtex-7, Kintex-7, Virtex-6, and Altera Stratix IV “logic modules”, the hardware side of the solution. A Xilinx UltraScale product is coming soon.

To facilitate partitioning a design across multiple FPGAs, S2C has Prodigy Player Pro 5.1, a hardware-aware partitioning tool that also provides remote monitoring and control. One of the biggest performance boosters in partitioning is where and how to insert LVDS pin multiplexing, handled with either automatic or guided modes in the partition engine. In addition to clock and reset control, self-test, and remote management capability, the software also has virtual switches and LEDs for simple I/O functions usually found only on the physical board.

From there, Prodigy ProtoBridge takes over. It links system-level simulation to the FPGA-based prototyping platform using AXI-4 and other protocols over a 4-lane PCIe Gen2 interface. It supports up to 16 master and 16 slave instances, with configurable data width from 32-bit to 1024-bit, and transmission length limited only by the host hard disk space. A C-based API allows simulators to connect and run verification routines, such as high-performance regression tests.
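
To put the 4-lane PCIe Gen2 link in perspective, a back-of-the-envelope calculation helps. The sketch below assumes the standard PCIe Gen2 figures of 5 GT/s per lane with 8b/10b encoding, which are not stated in the announcement, and it ignores packet and protocol overhead, so real ProtoBridge throughput would be lower. The function names are illustrative only.

```python
# Rough upper bound on one-direction bandwidth for a PCIe Gen2 link.
# Assumptions (not from S2C): 5 GT/s per lane, 8b/10b line encoding,
# no allowance for packet headers or flow control.

GEN2_TRANSFERS_PER_LANE = 5.0e9   # transfers per second per lane
ENCODING_EFFICIENCY = 8 / 10      # 8 payload bits per 10 line bits


def pcie_gen2_bytes_per_second(lanes: int) -> float:
    """Raw one-direction bandwidth in bytes/s, before protocol overhead."""
    bits_per_second = lanes * GEN2_TRANSFERS_PER_LANE * ENCODING_EFFICIENCY
    return bits_per_second / 8


def beats_per_second(lanes: int, data_width_bits: int) -> float:
    """How many data beats of the given width the link could carry per second."""
    return pcie_gen2_bytes_per_second(lanes) / (data_width_bits / 8)


if __name__ == "__main__":
    print(f"x4 Gen2 raw bandwidth: {pcie_gen2_bytes_per_second(4) / 1e9:.1f} GB/s")
    for width in (32, 1024):  # the two ends of the configurable data width range
        print(f"{width:>4}-bit beats per second: {beats_per_second(4, width):.3e}")
```

Whatever data width is configured, the link rather than the AXI interface is likely to set the ceiling on sustained transfer rate.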

There is also a selection of Prodigy Prototype Ready IP, with interfaces such as USB, HDMI, MIPI, a range of memory, a Xilinx Zynq module, and many others. S2C is offering design services to create specific modules per customer requirements.

If it ended there, S2C would have a comprehensive solution. What about the “multiple geographical locations”? S2C is readying a breakthrough approach linking these platforms across a network, exposed and managed in a private cloud, targeting designs of a billion gates and perhaps more. In conjunction, deep trace debugging across multiple FPGAs is also on the S2C roadmap.

Executed properly, a private, secure cloud solution could enable interesting capabilities between third-party IP vendors, SoC designers in distributed teams, and foundry partners. So far, I haven’t heard of other FPGA-based prototyping vendors taking on the cloud in this kind of strategy. This will be intriguing to watch.


Top 10 Reasons to Use Industry-standard Data Management

by Paul McLellan on 04-21-2015 at 7:00 am

Should a semiconductor/IP company use a proprietary data-management (DM) environment? Or even develop their own? After all, every company is unique and developing a unique DM allows a perfect match of just what is required for that particular company. And, in principle, a proprietary DM system can underpin the design management solution perfectly. On the other hand…

There are several industry-standard DM environments that are widely used. Probably the most famous are Perforce, Subversion and Git. Methodics’ ProjectIC environment can use any of them. This reflects two separate decisions that Methodics made early on:

  • do not develop their own proprietary DM environment
  • make ProjectIC work with any DM environment

Here is a list of reasons why it makes sense to use an industry-standard DM rather than using something non-standard, even if it is hidden under the hood of the design management solution selected.


  • Software/Hardware compatibility: since most (hopefully all) software developers already use industry-standard configuration management tools, it makes the interface between the hardware designers and the software designers more efficient and effective if both are using the same underlying system. In the software world these DM environments are often called source-code management, but for IC design, data management seems like a better general term.
  • Industry-standard DM solutions make typical software development methodologies such as Agile available to hardware designers. This methodology has a proven history of making large teams more efficient at collaborating and working on pieces of a design that must all come together seamlessly at the end, as in a chip design or the release of a block of IP.
  • Zero cost of development: companies are not indirectly (or directly if they are insane enough to develop their own) paying for a proprietary DM environment with the cost amortized over a small user-base.
  • Lower cost of maintenance. Again, everyone can focus all their resources on doing design rather than wrestling with a proprietary DM and paying a large percentage of its ongoing maintenance costs.
  • With industry-standard DM solutions, the installed base is much larger than with proprietary tools. Most industry-standard DMs are open-source and so the number of contributors to the ongoing development is large. Consequently, new features are constantly added as new requirements emerge from this large user base and its associated cohort of developers.
  • There is no rule 6. Obligatory Monty Python reference.
  • There are integrations to a long list of existing third-party tools already in-place, including MS Word, Emacs, Eclipse, and others.
  • There is a huge online knowledge base that can be accessed for quick answers to a wide range of questions based on other users’ experiences, so users can easily and quickly search and find answers to problems.
  • Industry-standard DM solutions have better-tuned performance, since they have been exposed to a greater variety of use models and the resulting problems have already been found and addressed.
  • The needs of any company change over time, especially in IC design where designs only get larger and more second order effects become first order. There is only ever more data to be managed. An industry standard DM is much more likely to already support any new requirements and to scale to future needs. A proprietary DM may require extensive development which can put the DM itself on the critical path to tapeout.


    These reasons together make a compelling case for not reinventing the wheel in the DM area. There are plenty of excellent wheel providers already out there.


    SecurCore: Modern Hardware Security Approach
    by admin on 04-20-2015 at 7:00 pm

    The number of interconnected devices grows day by day and has slowly begun to expand into other consumer products. The need for safe, efficient, and reliable systems that meet modern user expectations has become increasingly important as a result. SoC engineers addressing these challenges must consider design tradeoffs such as available silicon area or clock speed for security purposes while still trying to maintain the desired specification. Security breaches in legacy systems were typically handled by application-layer software. However, the proven susceptibility of these systems has generated a push to design hardware with security in mind. SecurCore is a processor series from ARM that aims to meet these standards. Designed from the bottom up with security in mind, it looks to redefine how a modern, safe system should be designed.

    SecurCore is designed using two main concepts: the principle of least privilege and a partitioning of the system into protected compartments. First, hardware and software resources are split up into two different worlds called the “Secure world” and “Normal world.” The Secure world is a trusted execution environment that only handles sensitive data and has access to the entire system plus a subset of private resources that only it can access. The Normal world is where casual user activity takes place. A common OS such as Windows lives here and functions as it would on any other end device. The OS kernel still manages system calls and has access to non-secure portions of the system. These two worlds are separated using hardware logic in the bus fabric by inclusion of a non-secure (NS) bit.

    The NS bit is what the processor uses to differentiate between secure and non-secure activity, creating a mechanism that prevents the activity in the Normal world from affecting the Secure world. This mechanism also limits the direct memory access of peripheral devices that may attempt to access secure private data. It also simplifies cache memory management, as cache flushes between context switches are no longer necessary.
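
    The NS-bit check can be pictured as a filter applied to every bus transaction. The toy model below is only an illustration of that idea: the memory map, the region names, and the `is_access_allowed` helper are hypothetical, and it is not a description of ARM's actual bus fabric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    base: int
    size: int
    secure_only: bool  # True: only transactions with NS = 0 may touch it

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical memory map, for illustration only.
MEMORY_MAP = [
    Region("normal_dram", 0x0000_0000, 0x1000_0000, secure_only=False),
    Region("secure_sram", 0x2000_0000, 0x0001_0000, secure_only=True),
]


def is_access_allowed(addr: int, ns_bit: int) -> bool:
    """Return True if a transaction tagged with ns_bit may reach addr.

    ns_bit = 0 marks a Secure-world transaction, ns_bit = 1 a Normal-world
    one. Secure transactions may reach every mapped region; Normal
    transactions are blocked from secure-only regions. Unmapped addresses
    are rejected.
    """
    for region in MEMORY_MAP:
        if region.contains(addr):
            return not (region.secure_only and ns_bit == 1)
    return False


if __name__ == "__main__":
    print(is_access_allowed(0x2000_0010, ns_bit=1))  # Normal world, secure SRAM
    print(is_access_allowed(0x2000_0010, ns_bit=0))  # Secure world, secure SRAM
```

    The same check applies to a DMA-capable peripheral: its transactions carry an NS value too, so a Normal-world device cannot read secure-only memory.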

    A security issue with multicore processors is that of shared resources. To mitigate this, SecurCore only utilizes a single core to process all data and provides two virtual cores: one for managing Normal world activity, the other for handling Secure transactions. Processor time is split between the two virtual cores in a time-sliced fashion managed by a hypervisor monitor, which creates the two worlds as virtual machines and provides a mechanism for safe context switching between the worlds through monitoring of the NS bit. The monitor also provides a single point of entry, eliminating the need for extraneous security processor cores in the design.

    The reach of SecurCore is to create an environment where casual and business activity can take place separately on a single device while providing robust security. This would be helpful in areas like the music business where producers could listen to new material while on the move instead of having to travel to a studio, or stockbrokers making trades while on a business trip. If the Normal world were to become compromised on the device, it still would not be able to access sensitive resources available in the Secure world. This fact makes it a good candidate for future implementation in consumer goods such as refrigerators, thermostats, etc.

    Overall, SoC designers are provided greater flexibility when designing with a chip with built-in security such as SecurCore. Eliminating the need for an independent security component allows for greater optimization of silicon area on the chip. The ARM chip uses larger transistors to lower dynamic power consumption by reducing supply voltage. They are also used to reduce subthreshold and gate leakage, therefore increasing reliability. However, larger transistors have a longer critical path, which decreases clock speed and performance. But this is a small tradeoff for now. The issue of leakage at small feature sizes may limit the progress of faster security devices in the future for reliability reasons, but SecurCore is a step in the right direction with a bottom-up approach to secure system design.

    Source: https://www.legacy.semiwiki.com/forum/content/3953-securecore-secure-mpu-iot.html

    By Jason Ball and Terence Roby

    The University of Mississippi Electrical Engineering Department introduced a Digital CMOS/VLSI Design course this semester. As part of this course, students researched a contemporary issue and wrote a blog article about their findings for presentation on SemiWiki. Your feedback is greatly appreciated.


    TSV Modeling Key for Next Generation SOC Module Performance

    by Tom Simon on 04-20-2015 at 1:00 pm

    The use of silicon interposers is growing. Several years ago Xilinx broke new ground by employing interposers in their Virtex®-7 H580T FPGA. Last August Samsung announced what they say is the first DDR4 module to use 3D TSV’s for enterprise servers. Their 64GB double data rate-4 modules will be used for high end computing where capacity and performance are critical. Nvidia has announced that it is following its Maxwell GPU’s with the Pascal family that will use 3D memory and an interposer. Together with other improvements, this promises a large gain in performance. The 3D interconnect will allow a 3X improvement in memory bandwidth, making Pascal 10X faster overall than Maxwell.

    But interposers present challenges to design engineers. Unlike package substrates or wire bond connections, silicon interposers are made of silicon, a poor dielectric. Depending on the design configuration there are needs to pass signals vertically, which requires going through the silicon interposer or die themselves. The only way to achieve this high interconnect density is to use through silicon vias (TSV’s). These are metal vias that are relatively large and tall. They require insulation which is provided in the form of a silicon oxide sleeve separating the TSV metal from the silicon bulk.
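
    For a feel of the numbers, the oxide sleeve can be treated as a coaxial capacitor between the TSV metal and the surrounding silicon. The sketch below uses the standard coaxial-capacitor formula with hypothetical dimensions (5 µm metal radius, 0.5 µm oxide, 50 µm height) and a typical relative permittivity of 3.9 for silicon dioxide; none of these values come from the article, and the model ignores the depletion region and any frequency dependence.

```python
import math

EPSILON_0 = 8.854e-12  # vacuum permittivity, F/m
EPS_R_SIO2 = 3.9       # typical relative permittivity of silicon dioxide


def tsv_oxide_capacitance(metal_radius_m: float,
                          oxide_thickness_m: float,
                          height_m: float,
                          eps_r: float = EPS_R_SIO2) -> float:
    """Capacitance of the oxide sleeve, modeled as a coaxial capacitor.

    C = 2 * pi * eps0 * eps_r * height / ln(r_outer / r_inner)
    """
    outer_radius = metal_radius_m + oxide_thickness_m
    return (2 * math.pi * EPSILON_0 * eps_r * height_m
            / math.log(outer_radius / metal_radius_m))


if __name__ == "__main__":
    # Hypothetical geometry, for illustration only.
    c = tsv_oxide_capacitance(5e-6, 0.5e-6, 50e-6)
    print(f"Oxide sleeve capacitance: {c * 1e15:.0f} fF")
```

    A thinner sleeve or a taller via raises this capacitance, which is one reason TSV geometry matters for high speed signals.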

    Their design structure and characteristics in high speed designs mandate analysis and modeling that will be accurate enough to predict potential performance issues. In advanced designs there will be high densities of TSV’s, which means there will be interactions between adjacent TSV’s. Simple RC extraction will not be sufficient. Nearby TSV’s can couple inductively. Also, because of the properties of the interposer, there will be parasitic MOS capacitors formed that can couple between adjacent TSV’s. The models needed to accurately represent this system should have frequency dependent elements.

    I recently came across an excellent discussion of modeling and analysis of designs with multiple TSV’s, published by Mentor. This paper discusses the difficulties in producing good compact models for TSV’s. It further discusses the tradeoff between full wave and quasi-static analysis methods in light of the many frequency dependent effects that need to be considered.

    Mentor acquired Nimbic a while back and the test cases in the white paper that illustrate the results were run with nApex, which came from this acquisition. nApex is a quasi-static extractor, but is able to do some clever things to get good correlation with full wave solvers, which run much slower and also have the added difficulty of outputting S-parameters. Getting from S-parameters to compact models suitable for transient SPICE can be extremely difficult. When the target application is modeling large numbers of TSV’s there will be a high port count that almost always makes simulation difficult without an equivalent circuit.

    The Mentor paper shows results for a test case modeled from DC up to over 100GHz. It drills into the relative effects of inductive and capacitive coupling within a TSV array as a function of distance. In real world cases there will also be RDL and potentially package structures that need to be included. All of this makes an effective case for using quasi static methods that can be shown to do well in modeling skin effects, etc.
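
    Skin effect is a good example of why frequency dependent elements are needed. The sketch below evaluates the textbook skin depth formula, assuming the TSV metal is copper with a resistivity of about 1.68e-8 Ω·m; the article does not name the metal, so that choice is an assumption.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, H/m
RHO_COPPER = 1.68e-8       # resistivity of copper, ohm*m (assumed TSV metal)


def skin_depth(freq_hz: float,
               resistivity: float = RHO_COPPER,
               mu_r: float = 1.0) -> float:
    """Skin depth in metres: delta = sqrt(rho / (pi * f * mu0 * mu_r))."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))


if __name__ == "__main__":
    for f in (1e8, 1e9, 1e10, 1e11):
        print(f"{f:8.0e} Hz: skin depth {skin_depth(f) * 1e6:.3f} um")
```

    Once the skin depth falls below the via radius, current crowds toward the surface and the effective resistance rises with frequency, which a single fixed R in an RC model cannot capture.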

    The need to model 3D structures is only going to grow. It will be interesting to see how overall system performance improves as these new design approaches offer ways to reduce memory access time and expand inter-die busses. Nvidia’s successor to Pascal has already been discussed and the projected performance gains allow them to track Moore’s Law type improvements in their GPU’s. This certainly would not have been possible without the benefits of 3D memory and interposers that rely on TSV’s.


    A Comprehensive Power Optimization Solution

    by Pawan Fangaria on 04-20-2015 at 7:00 am

    In an electronic world driven by smaller devices packed with larger functions, power becomes a critical factor to manage. With power consumption leading to heat dissipation issues, reliability of the device can be affected if power is not controlled or the device is not cooled. Moreover, for mobile devices such as smartphones or tablets that run on battery, low power consumption is essential. For a holistic solution to the power problem, it is important that this is addressed at the source, i.e. the design stage. For SoCs with robust power optimized designs, comprehensive EDA tools are needed that can accurately measure and estimate power requirements, reduce power by utilizing various techniques, and verify the power at every stage of the design starting from RTL to the physical netlist. Because the RTL stage comes early enough in the design process and is detailed enough, it is the sweet spot that provides the largest avenue for power reduction and optimization.

    Atrenta’s SpyGlass Power provides a complete solution for power optimization that analyzes and reduces power, verifies the original RTL against the new modified RTL, and ensures the design is compliant with the power intent, post-synthesis and post-layout. The power reduction is done by utilizing several approaches including power exploration at the SoC architecture level, manually fixing critical blocks with tool guidance, and automatic power optimization of non-critical blocks.


    Figure 1: Power optimization flow

    The SpyGlass Power optimization flow works interactively between power estimation, reduction and verification. The power estimation can be done with user specified inputs without vectors, or with simulation/emulation data in VCD, FSDB, or SAIF files. Different modes can be used such as average, cycle-based, or hybrid, depending on the requirement; for example, in cases of memories, cycle-accurate monitors are applied by default, even in average mode. For predictable accuracy, calibration is done against a reference netlist, and a correlation toolbox is used for models of capacitance, clock tree, and so on. The beauty of the SpyGlass Power solution is that with physical-aware power estimation, it can even skip the calibration and correlation step, while still producing the same level of accuracy by leveraging timing and physical optimization engines. This is a solution that will be introduced soon in SpyGlass Power, providing enhanced local fidelity and allowing trade-off analysis between different physical prototypes.
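
    The difference between average and cycle-based estimation can be illustrated with the textbook dynamic power relation P = C·V²·f, where C is the capacitance switched per cycle. The sketch below is a generic illustration, not SpyGlass Power's algorithm; the per-cycle capacitance trace, the supply voltage, and the clock frequency are all hypothetical.

```python
from statistics import mean


def dynamic_power(switched_cap_farads: float, vdd_volts: float, freq_hz: float) -> float:
    """Textbook dynamic power: P = C_switched * Vdd^2 * f, in watts."""
    return switched_cap_farads * vdd_volts ** 2 * freq_hz


def cycle_based_profile(cap_per_cycle, vdd_volts: float, freq_hz: float):
    """Power for each cycle, given the capacitance switched in that cycle."""
    return [dynamic_power(c, vdd_volts, freq_hz) for c in cap_per_cycle]


def average_power(cap_per_cycle, vdd_volts: float, freq_hz: float) -> float:
    """Single average figure over the whole trace."""
    return dynamic_power(mean(cap_per_cycle), vdd_volts, freq_hz)


if __name__ == "__main__":
    # Hypothetical switched capacitance per cycle, in farads.
    trace = [100e-12, 400e-12, 50e-12, 250e-12]
    profile = cycle_based_profile(trace, vdd_volts=0.9, freq_hz=1e9)
    print("average power (W):", average_power(trace, 0.9, 1e9))
    print("peak cycle power (W):", max(profile))
```

    The average hides the peak, which is why a cycle-based view is useful for blocks such as memories where individual accesses dominate.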

    Along with power estimation, SpyGlass Power also performs power profiling and provides power efficiency metrics that include information such as ‘Clock Gating Ratio’, ‘Intrinsic Clock Gating Efficiency’, ‘Incremental Clock Gating Efficiency’, and ‘Register Output Activity Density’. Designers can analyze this data and make informed decisions to either accept particular clock gating and register structure, or modify them manually to increase power efficiency.


    Figure 2: Power Explorer

    Within the tool, there is a versatile Power Explorer, which is a rich GUI that works as a central cockpit for top-down power methodology and efficient power reporting, analysis, and suggested next steps for improvement. The master view with hierarchical instances contains information about design objects such as registers, combinational logic, power profile numbers, annotations, and so on. The report can be easily customized as required and presented in different matrix forms. Similarly the slave or secondary views contain information about registers, memories, clocks, micro-architectures, and opportunities for power saving.

    SpyGlass Power uses formal sequential analysis techniques to identify ‘Enables’ beyond synthesis tools for sequential power reduction. While introducing registers for Enables, it leverages SpyGlass CDC to keep the design CDC-safe. Similarly memory power reduction is achieved by using techniques such as ‘redundant access removal’ and automatic activation of ‘light sleep mode’.


    Figure 3: Activity Trigger Detection

    There is a very effective and useful technique called ‘Activity Trigger Detection’ that is used to identify events that can up-surge or down-surge activities and turn parts of the design ON or OFF. The signals that are root causes of such changes are identified with a combination of statistical, structural, and formal approaches, and the events can be combined to gate a particular block or even power-gate it.


    Figure 4: Examples of Power Guidance rules

    SpyGlass Power provides power guidance for micro-architectural improvements such as ‘FIFO optimization’, ‘counter gating’, ‘glitch detection and removal’, and so on. It works with or without vectors. ‘Power lint’ can be used to improve RTL without vectors. With vectors, the power guidance also provides information about specific modifications with their expected power gains.

    As part of power verification, SpyGlass Power has an independent Signoff Verification Solution that supports UPF 2.0/2.1 with a single tool for RTL, post-synthesis netlist, and post-layout netlist verification. The power intent browser cleanly represents power at every block or IP level without design clutter and with cross-probing features. The non-instrumented RTL is verified with power intent “lint” checks and power intent consistency checks. On instrumented RTL or netlist designs where low power elements such as isolation cells or level shifters are inserted, the checks also flag if the implementation is improper. The signoff power verification on post-synthesis netlist ensures correct implementation of power elements, while on post-layout netlist the tool also ensures all supply connections are correct via ERC (Electrical Rule Checks).

    SpyGlass Power also has a powerful debug environment with several features such as power annotation on schematics, click on port to see isolation, retention or level-shifting strategy, and cross-links between violation messages, power intent browser, schematics, design files, etc. Additionally, some functional checks can be done through formal power verification based on design functionality and power intent.

    SpyGlass Power has seamless integration of all these technologies within the SpyGlass platform and provides a powerful comprehensive solution for power estimation, analysis, optimization and verification. This solution for power optimization and verification has been adopted by many leading semiconductor companies.

    A very detailed description of the SpyGlass Power solution, with several examples, was presented in a webinar by Guillaume Boillet, Sr. Technical Marketing Manager at Atrenta. The webinar can be viewed online after a one-step free registration on the Atrenta website.


    Networking at 52nd DAC in SFO

    by Daniel Payne on 04-19-2015 at 7:00 pm

    Yes, the 52nd DAC (Design Automation Conference) is a technical conference plus exhibition with wonderful keynote speakers and agenda; however, there is a certain serendipity that occurs by just meeting people face to face at the many networking opportunities. The best way to kick off your DAC experience is by attending the Sunday night event called the Welcome Reception, held at the Intercontinental Hotel in the Grand Ballroom BC from 5:30PM to 7:00PM. You are likely to see most of the bloggers from SemiWiki present, along with Wally Rhines, the CEO of Mentor Graphics, and Tom Quan, Director of TSMC. The point is not only to see people like this from our industry, but to actually approach them, introduce yourself, start a conversation and even exchange business cards.

    I’ve always found this Welcome Reception to be an energizing start for the next 72 hours of non-stop meetings, discussions and blogging about all things EDA, semiconductor and IP. It’s a time to reconnect with former co-workers, get introduced to new people, ask about industry trends, see how their business is going, find out about what is new, and share any rumors that spread throughout DAC like lightning.


    Intercontinental Hotel, nearby Moscone Center

    Immediately following the Welcome Reception there is typically a Gary Smith presentation, held in a nearby room of the same hotel. Gary is an analyst that covers the EDA space, and he also sells research reports over at Gary Smith EDA. You cannot miss Gary, because he will be wearing his trademark white sport coat.

    Related – Gary Smith at DAC

    Each evening throughout the week you’ll have more opportunities for networking at DAC like:

    • Monday, 6PM to 7PM on the exhibit floor, Cocktails & Conversations Reception
    • Tuesday, 4:30PM to 6PM at the Designer and IP Track Poster Session on the exhibit floor
    • Tuesday, 6PM to 7PM on the exhibit floor, Cocktails & Conversations Reception
    • Wednesday, 6PM to 7PM on the Esplanade Foyer, Reception
    • Thursday, 5:30PM to 6:30PM on the Esplanade Foyer, Reception

    As a blogger I typically get invited to a dinner by Mentor on Monday night and Synopsys on Tuesday night. If you love loud music then plan on attending the Denali/Cadence party on Tuesday night, but make sure that you sign up first then pick up your ticket at the Cadence booth #3515.

    Enjoy your experience at DAC this year by attending the technical conference, exhibits, keynote presentations and most of all get connected with some more people by networking this year.



    Rockchip Bets on Arteris FlexNoC Interconnect IP to Leapfrog SoC Design

    by Majeed Ahmad on 04-19-2015 at 9:00 am

    China was uncharted territory for Arteris Inc. before July 19, 2012, when Fuzhou Rockchip Electronics announced that it had licensed the Arteris FlexNoC network-on-chip (NoC)-based interconnect IP technology for its multicore SoCs for budget Android tablets. Rockchip mostly targets the tablet and set-top box (STB) markets in China and Taiwan with high-end processors at a lower price point.

    Rockchip, one of the largest chipmakers in China, quickly set the precedent for efficient SoC development and soon a number of chipmakers in China followed suit by licensing Arteris’ FlexNoC interconnect IP technology. The chipmakers in China that have now licensed Arteris’ interconnect IP for SoC development include high-flyers such as HiSilicon, Spreadtrum, Leadcore, RDA Microelectronics, Allwinner and Nufront.

    Earlier this month, the Fuzhou, China-based Rockchip renewed ties with Arteris by licensing the FlexNoC fabric IP for its RK Series of SoC devices. The FlexNoC interconnect fabric will serve as the backbone communications IP for Rockchip applications processors. Arteris’ FlexNoC IP will provide the interconnect glue between graphics processing units (GPUs), on-chip peripherals and other subsystems of the SoC device.


    Rockchip’s RK3288 is known to have cost US$40 per chip

    The on-chip quality-of-service features of FlexNoC guarantee high bandwidth availability for initiators such as the GPU and low latency for communications such as CPU-to-memory. One of the key advantages of FlexNoC interconnect IP in large SoC development is the minimization of routing congestion, which, in turn, reduces timing convergence issues and accelerates time-to-market. It’s important to note that Rockchip, in its early going, used to develop one SoC device in a year. Now, since adopting Arteris NoC IP, Rockchip managers boast of doing as many as six SoC designs in a year.

    In 2012, Rockchip’s IC Design Manager Li Shiqin acknowledged that his company chose Arteris’ FlexNoC technology instead of older interconnect technologies like buses and crossbars because it allowed Rockchip to simultaneously meet design frequency, power, memory efficiency and QoS requirements. Fast forward to April 2015, Li is now IC Design Director at Rockchip and says that FlexNoC interconnect technology brings Rockchip SoCs both differentiation and time-to-market benefits. “Our extensive use of Arteris FlexNoC interconnect IP has enabled us to increase the number of complex SoC designs we can implement in a year with the same amount of resources.”

    Beyond faster turnaround time, IP products like the Arteris FlexNoC fabric come with another substantial benefit for China’s SoC houses like Rockchip. China’s large SoC makers like Rockchip are mostly using CPU cores from ARM and graphics cores from either ARM or Imagination Technologies. So being able to choose from the best of the cutting-edge IP products provides them with an effective venue for differentiating their SoCs from competitor products.


    Rockchip was first in China to embrace Arteris interconnect IP for SoC design

    Take Rockchip’s RK3288 processor, for instance, which Acooo has used in its OneBoard PRO+ with a backlit mechanical keyboard housed in an aluminum case. The keyboard—designed to be used both as an Android system and a normal keyboard for your PC or Mac—comes with a power adapter and cable, a PC-line connect cable with DVI connector on one side and HDMI plus USB on the other side, an HDMI cable, and a DIY accessory tool box. The RK3288 SoC uses quad ARM Cortex-A17 processors and a powerful ARM Mali-T764 GPU. It is known to have cost around $40 per chip.

    Rockchip’s RK3288 application processor has also powered two sub-$150 Chromebooks from Haier and Hisense and a Chrome stick from Asus called Chromebit that implements a set-top Chrome computer in an HDMI dongle. These design wins have established the RK3288 SoC as one of the fastest ARM processors in tablet and notebook computers.


    Rockchip’s RK3288 is powering the Asus Chromebit dongle

    In retrospect, China’s SoC underdogs came to adopt the network-on-chip technology for digitally packetizing information between IP blocks within an SoC die just a couple of years after silicon behemoths like Qualcomm, Samsung and TI. So according to the sequential steps for new technology adoption, as explained in Geoffrey Moore’s business bestseller “Crossing the Chasm”, companies like Rockchip could be considered early adopters of the crucial network-on-chip technology in SoC designs.

    That helps explain the transformation of China’s chipmakers from SoC design wannabes to mainstream silicon players. The fabless chip firms in China have been working relentlessly for over a decade to earn an identity. However, after they began to bet aggressively on new IP products like the FlexNoC interconnect fabric, they started to win the design breakthroughs they had been longing for since the early 2000s.


    RK3288 board

    Take Rockchip’s SoC journey as testament to how vital investing in semiconductor IP can be for a chipmaker’s technology evolution. The chipmaker from Fuzhou, who first licensed Arteris FlexNoC interconnect IP in summer 2012, had been largely known as a budget SoC supplier for tablets. Last year, it signed a strategic deal with Intel that will allow it access to x86 CPU core IP as well as to Intel’s 3G baseband technology that the world’s largest silicon vendor had acquired after buying Infineon’s wireless unit.

    In other words, Rockchip is going to make a leap from the tablet to the smartphone SoC market. That premise also became evident when ARM announced its next-generation Cortex-A72 processor core for smartphones in February this year. Rockchip was among the three licensees that ARM disclosed at the time of announcement of its 64-bit CPU core platform.

    According to DigiTimes, Rockchip showcased smartphones and tablets using Intel processors at the Hong Kong Electronics Fair held April 13 through 16, 2015. Rockchip seems set for exciting times as its bets on licensing world-class IP clearly distinguish it in its quest to develop world-class SoC devices.


    Moore’s Law is dead, long live Moore’s Law – part 3

    by Scotten Jones on 04-19-2015 at 4:00 am

    In the second installment of this series we reviewed the cost drivers that have enabled the semiconductor industry to keep reducing the cost per transistor year after year. In the next three installments we will discuss the product-specific issues, beginning with this installment on DRAM.


    Moore’s Law is dead, long live Moore’s Law – part 2

    by Scotten Jones on 04-19-2015 at 12:00 am

    In the first installment of this series on Moore’s law we examined what Moore’s law is and presented some data on how it has affected the industry. In this installment we will discuss the manufacturing cost reduction strategies that have made Moore’s law possible.

    Manufacturing Cost Drivers
    The manufacturing cost of a semiconductor is made up of wafer fabrication, wafer sort, packaging and class test.

    Packaging has moved from expensive ceramic packages to plastic packages and has moved offshore to low labor cost locations. New smaller and lower cost packages such as QFNs have been introduced, and recently gold bonding wire has been replaced with copper.

    For wafer sort and class test, probably the biggest cost reduction has been the move to parallel test where up to hundreds of parts are now tested at the same time.

    The single biggest driver of semiconductor cost reductions has been the cost of wafer fabrication, where four major factors have driven down cost.

    Wafer fabrication cost

    Wafer size
    The first big driver of wafer fabrication cost reductions has been wafer size transitions. In 1960 wafer sizes were split between 0.75 and 1.0 inch diameter wafers, with 0.75” ramping down and 1.0” ramping up. Over the subsequent decades wafer sizes have transitioned all the way to 300mm (~12”), with 450mm (~18”) in development. Table 1 illustrates wafer size transitions.

    | Wafer size | Year introduced to production | Wafer area (cm²) | Wafer area increase | Years since last new wafer size |
    |---|---|---|---|---|
    | 1.5” | 1963 | 11.4 | 1.44 | NA |
    | 2.0” | 1966 | 20.3 | 1.78 | 3 |
    | 3.0” | 1970 | 45.6 | 2.25 | 4 |
    | 100mm | 1974 | 78.5 | 1.72 | 4 |
    | 125mm | 1981 | 123 | 1.56 | 7 |
    | 150mm | 1984 | 177 | 1.44 | 3 |
    | 200mm | 1988 | 314 | 1.78 | 4 |
    | 300mm | 1998 | 707 | 2.25 | 10 |
    | 450mm | 2022 | 1,590 | 2.25 | 24 |

    Table 1. Wafer size transitions.
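
The area and area-increase columns in Table 1 follow from simple circle geometry. The short sketch below recomputes them from the listed diameters; it treats each wafer as a full circle (ignoring flats, notches and edge exclusion, which is an assumption), and the names used are illustrative only.

```python
import math

INCH_CM = 2.54

# (label, diameter in cm) for each wafer size listed in Table 1.
WAFERS = [
    ("1.5in", 1.5 * INCH_CM),
    ("2.0in", 2.0 * INCH_CM),
    ("3.0in", 3.0 * INCH_CM),
    ("100mm", 10.0),
    ("125mm", 12.5),
    ("150mm", 15.0),
    ("200mm", 20.0),
    ("300mm", 30.0),
    ("450mm", 45.0),
]


def wafer_area_cm2(diameter_cm):
    """Gross area of a circular wafer, ignoring flats, notches and edge exclusion."""
    return math.pi * (diameter_cm / 2.0) ** 2


areas = {label: wafer_area_cm2(d) for label, d in WAFERS}

# Area increase of each size relative to the size listed just before it.
increases = {
    WAFERS[i][0]: areas[WAFERS[i][0]] / areas[WAFERS[i - 1][0]]
    for i in range(1, len(WAFERS))
}

for label, _ in WAFERS:
    ratio = increases.get(label)
    ratio_text = f"{ratio:.2f}x" if ratio is not None else "n/a"
    print(f"{label:>6}: {areas[label]:7.1f} cm2, increase {ratio_text}")
```

The first row's increase is left out because the size it is compared against is not listed in the table.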

    Wafer size transitions require new, larger, more expensive equipment, increasing depreciation, facility and maintenance costs. Consumable usage increases and wafers get more expensive. Indirect labor per wafer typically stays relatively flat, and direct labor per wafer has actually gone down due to increasing automation. The net result is that the cost per unit area of processed wafer typically goes down. For example, table 2 presents a comparison of 300mm costs versus 200mm costs for an identical logic process.

    | Wafer size | $/wafer | $/cm² |
    |---|---|---|
    | 200mm | $1,203.17 | $3.83 |
    | 300mm | $1,936.11 | $2.74 |

    Table 2. 300mm versus 200mm logic process cost comparison. Source, IC Knowledge.

    The key assumptions are:

    • Material: $/cm² the same for both sizes (currently approximately true although not at the introduction of a new wafer size).
    • Direct Labor (DL) and Indirect Labor (IDL) productivity equal (300mm is actually better for DL).
    • Equipment cost: 1.25x (assumes no technology improvements).
    • Throughput: 0.52x expose, 0.62x implant and metrology, 1.0x others. Assumes no throughput enhancements that would increase the equipment price.
    • Footprint: actual change.
    • Maintenance factor: same for both.
    • Consumables and utilities: 2.25x (actually has generally been less than this).

    The net result is a 28% reduction in cost per unit area. The problem is that, as we can see from table 1, the time between wafer size transitions has lengthened dramatically, from 3 to 4 years in the past to 10 years for 200mm to 300mm and over 20 years to get from 300mm to 450mm. As a result, wafer size transitions are no longer a significant contributor to yearly reductions in wafer cost.
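
The 28% figure can be reproduced from the wafer costs in table 2. A minimal sketch, assuming the full circular wafer area is used as the denominator (the function name is illustrative, not from the cost model):

```python
import math


def cost_per_cm2(wafer_cost_usd, diameter_mm):
    """Processed-wafer cost divided by the gross circular wafer area."""
    radius_cm = diameter_mm / 10.0 / 2.0
    return wafer_cost_usd / (math.pi * radius_cm ** 2)


# Wafer costs taken from table 2.
cost_200 = cost_per_cm2(1203.17, 200)
cost_300 = cost_per_cm2(1936.11, 300)

# Fractional reduction in cost per unit area when moving from 200mm to 300mm.
reduction = 1.0 - cost_300 / cost_200

print(f"200mm: ${cost_200:.2f}/cm2")
print(f"300mm: ${cost_300:.2f}/cm2")
print(f"reduction: {reduction:.0%}")
```

The wafer itself costs about 1.6x more, but it carries 2.25x the area, which is where the per-area saving comes from.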

    Yield

    In the early days of the semiconductor industry yields were low and improved slowly. In recent years mature yields are typically in the ninety-plus percent range, and the time to achieve mature yields has compressed to around six months. There are of course exceptions: there were reports of yield problems for TSMC at 130nm, at 28nm yield struggles at most foundries gave TSMC a clear early lead, and Intel has had well-publicized yield issues at 14nm.

    The bottom line is that modern yields are so high and time to yield is so fast that further cost improvement from faster yield learning isn’t really possible, and yield improvement is no longer a driver of year-to-year cost improvement.

    OEE

    In 1995 Sematech introduced OEE to the industry. OEE, which stands for Overall Equipment Effectiveness, is basically the percentage of a tool’s capacity that is actually achieved. It accounts for down time (both scheduled and unscheduled), idle time due to no work or no operator, yield loss, speed loss (a tool running slower than normal), set up, qualification lots and engineering wafers. Put simply, it is the number of good sellable wafers a tool produces per hour divided by the tool capacity in wafers per hour.

    When Sematech surveyed the industry in 1995, they found that industry-wide OEE was averaging only 30%. A major program to improve OEE was then undertaken, and by 2003 a new Sematech study found OEE had improved to 40%.
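
As a rough illustration of the definition above, OEE can be written as a single ratio. The tool numbers below are hypothetical, chosen only so the results land on the 30% and 40% averages quoted from the Sematech surveys; this is a sketch of the simplified ratio, not of any formal measurement standard.

```python
def oee(good_wafers_per_hour, capacity_wafers_per_hour):
    """Good sellable wafers per hour divided by tool capacity in wafers per hour."""
    if capacity_wafers_per_hour <= 0:
        raise ValueError("capacity must be positive")
    return good_wafers_per_hour / capacity_wafers_per_hour


# Hypothetical tool rated at 100 wafers per hour.
oee_1995 = oee(30.0, 100.0)  # delivers 30 good wafers per hour
oee_2003 = oee(40.0, 100.0)  # delivers 40 good wafers per hour

# Same tool, same capacity: extra good output from the OEE improvement alone.
output_gain = oee_2003 / oee_1995 - 1.0

print(f"OEE 1995: {oee_1995:.0%}, OEE 2003: {oee_2003:.0%}")
print(f"Extra good wafers from the same tool: {output_gain:.0%}")
```

Because the capital cost of the tool is unchanged, a higher OEE spreads the same depreciation over more good wafers.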

    Generally speaking OEE follows these principles:

    • OEE is better for newer equipment designed to produce smaller linewidths because of the work the equipment manufacturers continue to put into improving equipment performance.
    • OEE is lower for high mix fabs, due to setup time for changeovers and qualification runs, and higher for low mix fabs.
    • OEE is better for larger fabs due to the ability to better match equipment capacity.

    In general, from lower to higher OEE, we can rank fabs as follows: high mix logic fabs (foundry), low mix logic, single process large DRAM fabs, single process massive NAND fabs.

    OEE today ranges from around 50% to over 70%. It should be noted here that due to cycle time reasons it is not desirable to drive OEE to 100% and some of the largest NAND fabs are approaching the “optimal OEE”. OEE is therefore becoming less of a year to year cost reduction driver.

    Linewidth shrinks
    The largest and most consistent driver of cost reductions has been linewidth shrinks. Linewidth shrinks have historically increased wafer cost but provided a larger increase in transistors per unit area and therefore reduced cost per transistor.
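
The trade-off described above can be sketched numerically. The 1.2x wafer cost increase and 2x density gain below are hypothetical values picked for illustration; they are not figures from this article.

```python
def relative_cost_per_transistor(wafer_cost_ratio, density_ratio):
    """New cost per transistor relative to the old node.

    wafer_cost_ratio: new wafer cost / old wafer cost (same wafer size).
    density_ratio: new transistors per unit area / old transistors per unit area.
    """
    if density_ratio <= 0:
        raise ValueError("density_ratio must be positive")
    return wafer_cost_ratio / density_ratio


# Hypothetical shrink: wafer cost rises 20%, density doubles.
ratio = relative_cost_per_transistor(1.2, 2.0)
print(f"Cost per transistor after the shrink: {ratio:.2f}x")
```

The cost per transistor falls as long as density grows faster than wafer cost; once wafer cost rises as fast as density, the ratio reaches 1.0 and the shrink stops paying for itself.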

    In 1974 Dennard et al. of IBM disclosed the concept of MOSFET scaling. Basically you take an existing linewidth, for example 250nm, and divide it by a scaling factor k of about 1.4, resulting in a linewidth of 180nm. By shrinking from 250nm to 180nm you increase the transistors per unit area and the transistor performance simultaneously, as long as you scale everything correctly. Table 3 summarizes constant electric field scaling from 250nm to 180nm.

    | Parameter | Scaling factor | Device before | Calculation | Device after |
    |---|---|---|---|---|
    | Gate length (Lg) | 1/k | 250nm | 250/1.4 | 180nm |
    | Operating voltage | 1/k | 1.8 volts | 1.8/1.4 | 1.3 volts |
    | Packing density | k² | 1x | 1.4² | 2x |
    | Power consumption | 1/k² | 1x | 1/1.4² | 0.5x |
    | DC power density | 1 | 1 | NA | 1 |
    | Circuit delay | 1/k | 1 | 1/1.4 | 0.7 |
    | Power delay product | 1/k³ | 1 | 1/1.4³ | 0.4 |
    | Functional throughput | k³ | 1 | 1.4³ | 2.7 |

    Table 3. Constant electric field scaling.

    From table 3 we can see that packing density (transistors per unit area) increased by 2x, power consumption was cut in half, circuit delay decreased by 30% and functional throughput went up by 2.7x.
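
The entries in table 3 all derive from the single factor k = 1.4. A small sketch that recomputes them is shown below; the dictionary keys are illustrative names, and the unrounded results differ slightly from the table, which rounds to the conventional node values (for example 178.6nm is quoted as 180nm).

```python
def constant_field_scaling(k):
    """Multipliers applied to each parameter under constant electric field scaling."""
    return {
        "gate_length": 1 / k,
        "operating_voltage": 1 / k,
        "packing_density": k ** 2,
        "power_consumption": 1 / k ** 2,
        "dc_power_density": 1.0,
        "circuit_delay": 1 / k,
        "power_delay_product": 1 / k ** 3,
        "functional_throughput": k ** 3,
    }


factors = constant_field_scaling(1.4)

# Apply the multipliers to the "before" device from table 3.
gate_after_nm = 250 * factors["gate_length"]
voltage_after = 1.8 * factors["operating_voltage"]

print(f"Gate length: {gate_after_nm:.1f}nm, voltage: {voltage_after:.2f}V")
for name, value in factors.items():
    print(f"{name:>22}: {value:.2f}x")
```

Power density stays at 1 because power per circuit falls by 1/k² while the number of circuits per unit area rises by k².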

    In order to accomplish constant electric field scaling gate oxide thickness also has to scale to maintain good electrostatic control over the gate. Unfortunately at the 90nm logic node gate oxides became so thin that further reductions in thickness resulted in too much leakage. This will be discussed further in the section on logic devices but basically mobility enhancement through strain was used until high-k gate oxide was introduced.

    Figure 1 illustrates the linewidths for the market leaders in four main segments. Plotted on figure 1 are Intel MPU and TSMC SOC minimum metal pitches, Samsung DRAM bit line pitch and Samsung NAND polysilicon pitch.

    Figure 1. Linewidth trends for Samsung DRAM, TSMC SOC, Intel MPU and Samsung NAND. Source, IC Knowledge.

    As you can see from figure 1, all four segments have continued to scale down linewidths. The problem is that as pitch drops below approximately 80 nanometers, single patterning with immersion scanners can no longer print the required patterns and multipatterning is required.

    Multipatterning
    There are two main approaches to multipatterning:

    The first is litho-etch, which splits a pattern across multiple masks: one mask is exposed and etched into the wafer, and then additional mask and etch steps are performed. Table 4 summarizes various litho-etch techniques.

    | Technique | Abbreviation | Pitch | Masks |
    |---|---|---|---|
    | Litho-etch – Litho-etch | LE2 | ~60nm | 2 |
    | Litho-etch – Litho-etch – Litho-etch | LE3 | ~50nm | 3 |
    | Litho-etch – Litho-etch – Litho-etch – Litho-etch | LE4 | ~40nm | 4 |

    Table 4. Litho-etch multipatterning options. Source, IC Knowledge Strategic Cost Model.

    The second technique is Self-Aligned Multipatterning. The self-aligned techniques can double, quadruple or octuple the line density (cutting the pitch to one half, one quarter or one eighth), but they create ovals around a mandrel and require cut masks to create lines. Depending on the required pitch, a large number of cut masks may be required. Self-aligned multipatterning is also primarily useful for line-space pairs and not as useful for 2D patterns. Table 5 summarizes self-aligned multipatterning.

    | Technique | Abbreviation | Pitch | Masks |
    |---|---|---|---|
    | Self-Aligned Double Patterning | SADP | 40nm | 3 for 40nm |
    | Self-Aligned Quadruple Patterning | SAQP | 20nm | 6 for 20nm |
    | Self-Aligned Octuple Patterning | SAOP | 10nm | 10 for 10nm |

    Table 5. Self-aligned multipatterning options. Source, IC Knowledge Strategic Cost Model.

    As we can see from tables 4 and 5, multipatterning enables much smaller pitches than the 80nm single exposure limit, but at the cost of more masks and other additional patterning. There are also edge placement concerns that may limit the ability to achieve some of the smallest pitches.

    The bottom line of all this is that as we move further into the era of multipatterning, cost per wafer is going up faster than it has historically, slowing the rate of cost reduction. As we will see in the next three installments, there are also structural issues that will ultimately stop our ability to continue to scale at all.
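
The pitches in table 5 are consistent with dividing the roughly 80nm single exposure limit by 2, 4 and 8. The sketch below makes that relationship explicit; taking 80nm as the starting (mandrel) pitch is an assumption drawn from the text, and the names are illustrative.

```python
SINGLE_EXPOSURE_PITCH_NM = 80.0  # approximate immersion single-patterning limit

# Pitch division factor for each self-aligned technique.
SELF_ALIGNED_DIVISORS = {"SADP": 2, "SAQP": 4, "SAOP": 8}


def self_aligned_pitch(technique, litho_pitch_nm=SINGLE_EXPOSURE_PITCH_NM):
    """Final pitch after self-aligned spacer patterning of a given litho pitch."""
    return litho_pitch_nm / SELF_ALIGNED_DIVISORS[technique]


for name in SELF_ALIGNED_DIVISORS:
    print(f"{name}: {self_aligned_pitch(name):.0f}nm")
```

The mask counts in table 5 do not follow from this division alone, since they also include the cut masks needed to turn the spacer loops into usable lines.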

    In the next three installments we will examine the specific issues and status of DRAM, Logic and NAND.

    Also read:
    Moore’s Law is dead, long live Moore’s Law – part 1
    Moore’s Law is dead, long live Moore’s Law – part 3

    Moore’s Law is dead, long live Moore’s Law – part 4
    Moore’s Law is dead, long live Moore’s Law – part 5


    Silvaco: TCAD to Signoff in Vertical Markets

    by admin on 04-18-2015 at 8:00 pm

    Recently, I talked about meeting with Dave Dutton, the CEO of Silvaco. Mainly we were talking about the recent acquisition of Invarian, but he also brought me up to date on Silvaco and how he is bringing their disparate product lines into a more focused strategy.

    See also Silvaco Swallows Invarian

    Silvaco would be the first to admit that they have not done a great job of marketing themselves and their product lines. In fact people are often surprised how significant they are:

    • 30 years old, no debt, no VCs, funded entirely by cash flow
    • >400 customers worldwide
    • around 200 employees with a global footprint
    • development centers in Santa Clara CA, Cambridge UK, Hsinchu TW and Yokohama JP
    • only provider delivering a complete TCAD, 3D RC extraction, SPICE modeling, SPICE simulation, custom IC design and verification flow
    • #1 supplier for flat-panel display (FPD) with almost all manufacturers, both TFT and OLED
    • #1 supplier of solutions for radiation and soft-error reliability
    • #2 TCAD supplier overall, with customers for power, optical, radiation/reliability, and CMOS markets
    • #4 supplier for full AMS/power IC flow

    As a private company they do not reveal their detailed financials. But they will say they are financially strong (cash reserves are 30% of revenue) and profitable. Given 200 employees…well, you do the math. They plan to double revenue in the next 3 years, by 2018.

    They are also in their second year of collaboration with Sematech working on 16nm FinFET TCAD and TSVs, parasitic analysis of 10nm FinFET/SRAM and impact of strain/stress on FinFET performance.


    The matrix above shows how their various product capabilities fit together to serve specific markets. In most markets they have a tool portfolio that allows a company to leverage their TCAD capabilities up to SPICE and then go up further to the design and verification level, with everything tied tightly together. This gives them a “TCAD to Signoff” capability.

    The key vertical markets that Silvaco brings technology to address are:

    • display: this market is primarily in Asia where all the displays for TVs, computers and phones are manufactured. Almost everyone in the market uses Silvaco for this
    • power (design of processes and circuits for high-voltage switching)
    • optical: this is a pure TCAD play
    • radiation and soft-error analysis: recently declassified Silvaco technology previously restricted to the military is now available for the commercial market
    • CMOS and advanced CMOS: TCAD for process and device development, analog, standard cell and memory cell design

    See also If You Plan to put Electronics in Space or Avionics You Must See This Webinar

    The market that gets all the glory in semiconductor design is, of course, digital design in leading edge processes, especially for the very high volume mobile market. It is easy to assume that this is the entire market for semiconductors and for tools to design them, but nothing could be further from the truth. Even TSMC does nearly 30% of their volume in non-standard designs such as power, flash and MEMS, and if you throw in analog design done in non-leading-edge processes (such as 90nm or 130nm) it is an even bigger percentage. Silvaco doesn’t really “do” leading edge digital design (no synthesis, place & route and so on) but for all these other designs, including leading edge analog, they have a huge range of technologies to bring to bear.

    The different verticals are actually addressed using different key technologies, sometimes special variants of them, combined in different ways. Those key technologies are:

    • TCAD: technology CAD, with a single engine for 1D, 2D and 3D with process, device and stress simulation
    • SPICE modeling and simulation: leading supplier of high-voltage and TFT models, statistical yield analysis
    • custom layout and extraction: highest accuracy capacitance extraction for TFT, analog, SRAM and FinFET. Full flow including layout, schematic, physical verification, extraction and more. Wide range of PDKs available to support many foundries, primarily for analog, mixed-signal, RF including PCells
    • EM/IR: (electro-migration, current/resistance drop analysis) including thermal analysis and reliability flow

    Now, also, with the acquisition of Invarian, users of the Silvaco flow have all the data required to enhance their analysis capabilities with Invar, and thereby get the most critical insight on reliability that is required for successful designs.

    Silvaco’s website is here.