
What Applications Implement Best with High Level Synthesis?

by Daniel Payne on 07-26-2013 at 3:12 pm

RTL coding in languages like Verilog and VHDL has been around since the 1980s, and for almost as long we’ve been hearing about High Level Synthesis (HLS), which allows an SoC designer to code above the RTL level, at the algorithm level. The most popular HLS languages today are C, C++ and SystemC. Several EDA vendors have tools in this space; one of them is Forte Design Systems, founded in 1998.

My question today is, “What applications implement best with HLS?”

Let’s take a look at three application categories that make sense to use an HLS approach.

Digital Media

I love to view or create digital media with my devices:

  • 35mm Canon DSLR
  • MacBook Pro laptop
  • iPad tablet
  • Google Nexus 7 tablet
  • Samsung Galaxy Note II smart phone
  • Amazon Kindle Paperwhite, e-book reader

Each new generation of graphics processing in a tablet increases throughput by about 4X, creating smoother experiences. With all of the increase in pixel counts and frame rates, designers must still meet increasingly competitive battery-life targets, which means controlling power throughout the design process.

Hardware acceleration of algorithms is the way to make your consumer devices stand out, instead of using software-based approaches on a general purpose CPU.

If you insist on coding at the RTL level, it will simply take you much longer to explore, refine and implement a given algorithm. For example, one IC designer coded a motion estimator block in C in just one quarter of the time it took using RTL code.

Digital media designers can code their algorithms directly in C.
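To illustrate what algorithm-level code looks like, here is a sketch of the sum-of-absolute-differences (SAD) kernel at the heart of a typical motion estimator. It is written in Python for brevity and is purely illustrative, not Forte's actual input code; an HLS flow would start from the equivalent C/C++/SystemC.

```python
# Illustrative algorithm-level description of a motion estimator's
# inner loop: the sum of absolute differences (SAD) between a
# candidate block and the current block, and a best-match search.

def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def best_match(current, candidates):
    """Return the index of the candidate block with the lowest SAD."""
    scores = [sad(current, c) for c in candidates]
    return scores.index(min(scores))
```

An HLS tool takes exactly this kind of loop-and-accumulate description and explores parallel hardware implementations of it, which is where the 4X productivity claim above comes from.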

Security

I’ve done some web programming where sensitive credit card data needed to be protected, so I used a PHP function for the MD5 hash algorithm. Likewise, an SoC can implement this same algorithm in hardware.
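For reference, the software version of that MD5 step is essentially a one-liner; the card number below is a made-up dummy value. In an HLS flow, a C/C++ model playing exactly this role would be the synthesis input:

```python
import hashlib

# Software reference model for the MD5 step described above.
# (MD5 is a hash, which is how it is actually used; the card
# number is an invented illustrative value, not real data.)
def md5_digest(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

digest = md5_digest(b"4111-1111-1111-1111")  # dummy test card number
```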
Here’s a list of security algorithms well suited for HLS:

To get a feel for what the SystemC code looks like for any of these algorithms, visit the OpenCores web site, which also shows the same code in Verilog.

Wireless

The final application area well suited to HLS is wireless, driven by consumer electronics and the IoT (Internet of Things). Instead of coding in RTL and then being surprised when the specification changes, which can add weeks to your SoC schedule, you can code at the algorithm level, update your code, and re-synthesize in days or hours.

Examples of wireless applications include:

Summary

HLS is here to stay, and there’s a growing list of applications that will benefit from coding algorithms in SystemC. Forte offers an HLS tool called Cynthesizer that is widely used in the industry for digital media, security and wireless applications.



Epitaxy: Not Just For PMOS Anymore

by Paul McLellan on 07-25-2013 at 2:25 pm

At Semicon I met with Applied Materials to learn about epitaxy. This is when a monocrystalline film is grown on the substrate, taking on a lattice structure that matches it. It forms a high-purity starting point for building a transistor and is also the basis of the strain engineering in a modern process.

Since holes have lower mobility than electrons, p-type transistors inherently perform worse than n-type transistors (which is why, before CMOS, the semiconductor industry was dominated by NMOS and its variants: n-type transistors with some sort of pull-up resistor or transistor). Since epitaxy improves performance, it was first used for the p-type transistors.


Basically, the source and drain are etched out to form a pit and then the pit is filled by depositing epitaxial silicon (with Applied Materials equipment in most cases). It is actually deposited until the source/drain is proud of the surrounding silicon. Adding small amounts of impurities that are larger than silicon, such as germanium, during deposition induces strain in the lattice which turns out to enhance mobility in the channel. It increases transistor speed but does so, unlike many other things we might do, without increasing leakage and so without increasing static power.
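To get a feel for how much strain the germanium induces, one can do a back-of-the-envelope estimate with Vegard's law (a linear interpolation of lattice constants). The lattice constants are textbook values, but the 30% germanium fraction below is my illustrative assumption, not a number from Applied:

```python
# Back-of-the-envelope strain estimate for SiGe source/drain.
A_SI = 5.431  # Si lattice constant, angstroms (textbook value)
A_GE = 5.658  # Ge lattice constant, angstroms (textbook value)

def sige_lattice(x_ge):
    """Vegard's-law lattice constant of Si(1-x)Ge(x)."""
    return (1 - x_ge) * A_SI + x_ge * A_GE

def mismatch_strain(x_ge):
    """Lattice mismatch of relaxed SiGe relative to the Si substrate."""
    return (sige_lattice(x_ge) - A_SI) / A_SI

strain = mismatch_strain(0.30)  # ~1.25% mismatch for 30% Ge
```

A mismatch on the order of a percent is enormous by crystal standards, which is why embedded SiGe can squeeze the channel hard enough to change hole mobility appreciably.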

But now, at the 22/20nm nodes, epitaxy is needed to get extra performance out of the n-type transistors too, contributing around 20% to mobility.


As usual, almost anything associated with p-type transistors is the other way around for n-type. To improve performance, the strain needs to be tensile. To induce tensile strain in n-type transistors, the impurities need to be smaller than silicon, such as carbon or phosphorus atoms. A carbon atom is 62% smaller than a silicon atom, for example. This increases electron mobility and thus n-type transistor performance.


There are several advantages of epitaxy especially when it is used for both transistor types:

  • precision channel material (since it is not used for source and drain) enhances performance
  • physically raised source and drain keep metal contacts away from channel
  • increased strain on channel increases drive current

Applied is the leader in epitaxy equipment, having shipped over 500 systems (and more every week). Their revenue in this area has increased by 80% over the last five years. Looking forward, the market is moving toward new channel materials such as III-V compounds, which have inherently higher electron mobility.

The bottom-line message: nMOS epitaxy is essential for faster transistors inside next-generation mobile processors. It boosts transistor speed by the equivalent of half a device node without increasing off-state power consumption. What’s not to like? That is why it is coming to a 20/22nm process near you.

More details here.


System Reliability Audits

by Paul McLellan on 07-25-2013 at 12:09 pm

How reliable is your cell-phone? Actually, you don’t really care. It will crash from time to time due to software bugs and you’ll throw it away after two or three years. If a few phones also crash due to stray neutrons from outer space or stray alpha particles from the solder balls used in the flip-chip bonding then nobody cares.

How about your heart pacemaker? Or the braking system in your car? Or the router at the head of a transpacific fiber-optic cable? OK, now you start to care.


iRocTech provides audit services at the system level for these sorts of situations. However, at the system level the overall reliability depends, obviously, on the reliability of the various components. One big problem is that the component suppliers are not always co-operative. In some cases they simply don’t know the reliability of their components. But they also tend to provide only the best possible data so that it cannot be used against them. It is as if we went to TSMC and asked about cell timing, were given the typical corner, and were then told that they hadn’t a clue about the worst-case corner because they didn’t want anyone to know just how slow the process might get.

The problem is actually getting worse. For all the same reasons that we want to put 28nm and 20nm silicon into cell-phones (especially low dynamic and leakage power, lots of gates, performance), engineers designing implantable medical electronics and avionics want to do so too. But the leading-edge processes and foundries are driven by the mobile industry, which is probably the least reliability-concerned of all semiconductor end-markets (well, OK, birthday cards that play a tune when you open them, $5 calculators, but those are not really markets). This means there is not as much focus on reliability, and on measuring it, as the markets outside mobile require.

The big markets that iRoC works on for system reliability are:

  • networking: not your living room wireless router but the big ones that form internet and corporate backbones; they need an accurate MTBF number
  • automotive: an especially extreme temperature environment (it gets hot under the hood in the desert) and very long lifetimes (cars need to work for 15-20 years)
  • avionics: at high altitude (never mind in space) there is 300-400 times the neutron flux that there is at sea level
  • medical: in particular implantable medical devices. These run at very low voltage since you may have to open up someone’s chest when the battery runs out, and they sometimes end up in hostile environments too, such as an MRI or CAT scan or a plane flight
  • nuclear plants: historically these have been built with mostly electro-mechanical technology due to the neutrons and gamma rays that may be released in an emergency, but they are now retrofitting and need to be able to use electronics
  • military and space: there really aren’t any rad-hard foundries left so commercial components are used more and more, but reliability has to be high in an aggressive environment

What these industries would like to do is push their system reliability requirements down to the component vendors, but compared to mobile they don’t have enough influence, at least in the short term. A second-best solution is to find out the reliability of the components and build it up into a system reliability number.
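For a series system (any component failure fails the system) under the usual constant-failure-rate assumption, that roll-up is just arithmetic: component failure rates add, and the system MTBF is the reciprocal of the sum. A quick sketch with made-up component FIT numbers (not iRoC data):

```python
# FIT = failures per 1e9 device-hours, the standard reliability unit.
# For a series system with constant failure rates, rates simply add.

def system_mtbf_hours(fit_rates):
    """MTBF in hours of a series system given component FIT rates."""
    total_fit = sum(fit_rates)
    return 1e9 / total_fit

components = [120.0, 45.0, 300.0, 35.0]  # hypothetical per-component FITs
mtbf = system_mtbf_hours(components)     # 500 FIT total -> 2e6 hours
```

The same arithmetic also shows the weakest-link effect discussed below: the 300-FIT component dominates the sum, so improving any other component barely moves the system number.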

One end-market that is not on the list is cloud computing. At the level of big data centers, events that we consider rare on our own computers (a disk drive fails, the processor melts, the power supply blows up) are everyday occurrences, so the infrastructure has to be built to accommodate them. For example, GFS (Google File System) never stores any file on fewer than three separate disks in different geographical locations (Google is actually prepared for a meteor hit that permanently destroys a datacenter without impacting service). I don’t mean to imply Google is special; I’m sure Facebook and Amazon and Apple are all the same. I just know a little more about Google since they have published more in the open literature (and I have done some consulting for them).

Since some measurable problems, especially latchup and single event functional interrupt (SEFI), are actually very rare, they are hard to measure. If only a short period of measurement is done, the numbers may look deceptively good. In reality, the mean might be good but the standard deviation is enormous. A better reliability measure than the mean alone is the mean plus one standard deviation. To make that measure look good, extensive measurement is required to get the standard deviation down to something manageable, along with a better estimate of the mean. Single event upsets (SEU), which can be accelerated with a neutron beam (as I wrote about here), are much more common and so the standard deviation is much narrower.
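The short-measurement trap can be made concrete with a little Poisson statistics: observing N failure events gives a rate estimate whose relative one-sigma uncertainty is 1/sqrt(N), so rare events need far longer tests for the same confidence. The event counts below are invented for illustration:

```python
import math

# Failures arriving as a Poisson process: N observed events in T hours
# give rate N/T with one-sigma uncertainty sqrt(N)/T.

def rate_estimate(events, hours):
    """Failure-rate estimate and its one-sigma uncertainty (per hour)."""
    rate = events / hours
    sigma = math.sqrt(events) / hours
    return rate, sigma

# Short run: 4 rare events in 1,000 hours -> 50% relative uncertainty.
r_short, s_short = rate_estimate(4, 1000.0)
# Long run: 400 events in 100,000 hours -> same mean, only 5% uncertainty.
r_long, s_long = rate_estimate(400, 100000.0)
```

Both runs report the same mean rate, but "mean plus one sigma" is wildly different between them, which is exactly why the longer measurement is needed.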


Of course, once there is a measure, the question is what to do about it. It is a well-known proverb that a chain is only as strong as its weakest link. But a corollary is that there is no point in having especially strong links; in particular, there is no point in strengthening any link other than the weakest. Identifying the lowest-reliability component and improving it is how overall system reliability is improved.

iRoc Technologies website is here.


From Layout Sign-off to RTL Sign-off

by Pawan Fangaria on 07-25-2013 at 5:00 am

This week I had a nice opportunity to meet Charu Puri, Corporate Marketing, and Sushil Gupta, V.P. & Managing Director, at Atrenta, Noida. I have known Sushil since the 1990s; in fact, he was my manager at one point, before my time at Cadence. He leads this large R&D development centre of about 200 people at Atrenta’s Noida facility. In fact, they have just moved into a new building, yet to be inaugurated. I will write more about it and various development stories when the inauguration happens.


[Sushil Gupta]

Coming back to Atrenta’s product and technology edge: it was an intriguing discussion on how Atrenta is solving today’s SoC problems. Sushil talked about Atrenta’s SpyGlass being deployed for SoC designs across the complete mobile ecosystem; rightly so, as what we had on a PC or laptop has shifted to the handheld smartphone. That has been possible with the advent of SoCs, where multiple functions are squeezed into the same chip. However, it is not so simple a road to ride: there are tremendous challenges, considering the very small window of opportunity for a design, the complexity of verifying and integrating multiple blocks and IPs from different origins, process bottlenecks and physical effects at small geometries, performance-power-area optimization, and so on. The only viable option is to replace long iterative loops in the design flow with shorter, faster loops early in the cycle to get the design right. That significantly reduces the possibility of re-spins and also provides a time-to-market edge.

So there comes Atrenta’s philosophy of pulling the sign-off process up to the earliest possible opportunity, i.e. the register transfer level (RTL). Traditionally, sign-off is done at the last stage prior to fab, i.e. layout. RTL sign-off cannot completely eliminate layout sign-off, but it can definitely and significantly reduce long iterative loops from layout back to earlier stages and enable the designer to achieve faster design convergence.

As is evident, post layout sign-off is too late and too risky.

Atrenta’s guiding methodology is to do RTL sign-off before proceeding further. And Atrenta provides a complete platform for RTL sign-off. That’s amazing!!

As we can see, the platform contains all the ingredients to realise an SoC: a complete design flow, IP flow and integration, debug and optimization. In fact, Atrenta has also collaborated with TSMC and provides an IP Kit which validates and qualifies any soft IP against TSMC process requirements before it is integrated into SoCs.

I will talk more about Atrenta’s individual products/technologies and their capabilities in future articles. But I must share a memory: when I first read the Atrenta SoC Realization whitepaper (about two years ago), I talked about it with Sushil in his earlier office. And today, to my excitement, Atrenta has really strengthened that realization further!!


Any MIPI CSI-3 Host IP Solution for SoCs Interfacing with Sensors?

by Eric Esteve on 07-25-2013 at 4:37 am

For those taking a quick look at the various MIPI interface specifications, the first reaction is to realize that they will have to look at MIPI more closely, and that it will take longer than expected to really understand the various specifications! Let’s start with the PHY. One specification defines the D-PHY, up to 1 Gbps (1.5 Gbps is also defined, but not really used); another defines the M-PHY, to support higher data bandwidth and higher speed. Looks simple? In fact, we have not yet mentioned the various “gears” supported by the M-PHY (per lane): Gear 1 is up to 1.25 Gbps, Gear 2 up to 2.5 Gbps, while Gear 3 is defined up to 5 Gbps. There are many more differences between D-PHY and M-PHY; if you take a look at the MIPI Org web site, you will find this comprehensive picture:

Now you clearly understand the various MIPI PHYs, and you know that a PHY is nothing without a controller, the digital part of the function in charge of processing the protocol layers, like the “Link Layer”, “Transport Layer” and so on. Let’s stay with the M-PHY example. If life were simple, you would attach one MIPI controller to this M-PHY. But if we are (more or less) well-paid engineers, it’s because SoC-related life is not simple… Just take a look at the picture below:

In order to ease SoC integration, the M-PHY can support up to six different protocols. This means that when a chip maker decides to integrate several MIPI protocols on the same chip, he will also instantiate the same PHY IP several times, along with the various attached controllers. All controllers are not made equal: DigRF (interfacing with the RF chip), LLI (linking the SoC and a modem chip to share a single DRAM) and SSIC (SuperSpeed USB Inter-Chip protocol, for board-level inter-chip connection) can be plugged directly into the M-PHY. But another group of controllers (CSI-3, DSI-2 and UFS) requires another piece of IP, UniPro, to be inserted between the M-PHY and, for example, the MIPI CSI-3 controller (Camera Serial Interface specification).

When a chip maker designs an application processor for a smartphone or media tablet, he is integrating over 100 IP blocks, from an ARM A9 to I2C or SRAM. Such a chip maker will certainly appreciate that Synopsys proposes a complete Camera Serial Interface 3 (CSI-3) host solution, including the new DesignWare MIPI CSI-3 Host Controller IP combined with the MIPI UniPro Controller and multi-gear MIPI M-PHY IP. With support for up to four lanes in Gear1 to HS-Gear3 operation, the CSI-3 host solution simplifies the system-on-chip (SoC) interface for a wide range of image sensor applications, giving SoC designers maximum flexibility to increase throughput while reducing pin count requirements and integration risk.
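For a rough feel of what those gears and lanes buy you, here is some back-of-the-envelope arithmetic. Treating the quoted per-lane figures as raw line rates and assuming M-PHY's 8b10b line coding (80% efficiency) is my simplification, not a Synopsys number:

```python
# Per-lane HS gear line rates from the article, in Gbps.
GEAR_GBPS = {1: 1.25, 2: 2.5, 3: 5.0}

def raw_throughput_gbps(gear, lanes, coding_efficiency=0.8):
    """Aggregate payload throughput across lanes, after 8b10b overhead."""
    return GEAR_GBPS[gear] * lanes * coding_efficiency

# Four lanes in HS-Gear3: 4 * 5 Gbps * 0.8 = 16 Gbps of payload,
# plenty for a high-resolution, high-frame-rate image sensor.
peak = raw_throughput_gbps(3, 4)
```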

I agree with Joel Huloux, MIPI Alliance Chairman, when he says: “IP supporting the MIPI CSI-3 v1.0 specification, along with a HS-Gear3 M-PHY, gives designers the ability to rapidly build host configurations into their SoCs. Synopsys’ DesignWare MIPI CSI-3 Host Controller promotes the MIPI ecosystem while furthering the realization and reach of the latest MIPI specifications.” Having worked in the IP business for about 10 years, I have realized how important it is for a chip maker who decides to outsource a function split into PHY and controller to be able to acquire the complete solution from a single supplier. This is the guarantee that the function has already been integrated by the vendor, and also validated and verified, before he integrates it. In this case of a camera solution, we are talking about three different functions! Last but not least, this new MIPI CSI-3 Host Controller, simplifying CSI-3 image sensor interface integration, is a low power solution.

By Eric Esteve from IPNEST



Semicon: Multiple Patterning vs EUV, round #2

by Paul McLellan on 07-24-2013 at 9:00 pm

Round #1 was here.

In the EUV corner were Stefan Wurm of Sematech (working mostly on mask issues) and Skip Miller of ASML, the only company making EUV steppers (and light sources, since they acquired Cymer).

You may know that the biggest issue in EUV is getting the source brightness high enough that an EUV stepper has a throughput of at least 120 wafers per hour, so that it is competitive with multiple patterning. And the source is like something out of science fiction. First, you make tiny droplets of molten tin. Then you hit each one with a laser to shape the drop. Then you hit it with a really big laser, so big that it needs its own power infrastructure in the sub-fab, and this vaporizes the tin droplet into a plasma. With a 20kW laser at a power efficiency of 10%, you need 0.2MW of input power. The plasma emits a little bit of EUV. Oh, and do that about 100M times per hour.
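Those source numbers are easy to sanity-check (a scratch calculation, not from the panel):

```python
# Quick arithmetic on the EUV source figures quoted above.
LASER_POWER_W = 20e3       # 20 kW drive laser
EFFICIENCY = 0.10          # 10% wall-plug efficiency
DROPLETS_PER_HOUR = 100e6  # ~100M tin droplets per hour

input_power_w = LASER_POWER_W / EFFICIENCY    # 200,000 W = 0.2 MW
droplet_rate_hz = DROPLETS_PER_HOUR / 3600.0  # ~28,000 droplets per second
```

So the droplet generator has to place, shape and vaporize tin drops at tens of kilohertz, continuously, which gives a sense of why the source has been so hard to engineer.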


But EUV is absorbed by everything, so you can’t use normal (transmissive) masks and the stepper has to operate in high vacuum (even air absorbs EUV). You can’t use conventional mirrors like the one in your bathroom either; they absorb EUV too. You need to build mirrors and masks out of multiple alternating layers of silicon and molybdenum, which reflect due to interlayer interference. But they still don’t reflect very well, only about 70% per surface, so after a couple of mirrors to focus the EUV light, 6 more to direct it, and a reflective mask, 96% of the light is absorbed and only 4% hits the wafer.
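The 4% figure checks out if each Mo/Si surface reflects roughly 70% of the incident EUV, which is a typical multilayer reflectivity; the reflection count below just follows the mirror tally in the paragraph:

```python
# Cascaded reflection loss through the EUV optical train.
REFLECTIVITY = 0.70          # approximate Mo/Si multilayer reflectivity
N_REFLECTIONS = 2 + 6 + 1    # 2 focusing mirrors + 6 directing + 1 mask

fraction_delivered = REFLECTIVITY ** N_REFLECTIONS  # ~0.04 of source light
```

Nine reflections at 70% each leaves about 4% of the light, which is why every watt of source power is so precious.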

So with that background, are we there yet?

Stefan started off by pointing out that he is working on the assumption that the light source issue is solved. He can’t do anything about it at Sematech, but waiting for it to be solved before looking at the other problems is clearly silly.

First the good news. EUV resists seem to be in good shape, first production-type EUV tools are being delivered, mask blanks are being made, and there is some experience with pilot runs. Line width roughness (LWR) and CD uniformity (CDU) are getting better, but so far only by accepting slower resists.

Mask blanks are still a big issue. We would like defect-free masks, but that is not really going to happen, and here is why. As I said above, the masks are mirrors built up by depositing layers of silicon and molybdenum onto a glass blank using ion beam deposition (IBD). A big problem is that defects on the glass that are too small to see get amplified by this process into real defects in the mask that you can see (but by then it is too late). The best masks have about 12 defects at 45nm. Those 12 break down into 10 pits from the substrate (too small to see before deposition started), one handling defect and one from deposition. Marathon IBD runs of 100 blanks get 20-30% yield of acceptable masks. There are some long-term issues too: IBD may not be viable long-term as process feature sizes continue to shrink.


Another issue is that EUV masks don’t have a cover (known as a pellicle), because the cover would absorb the EUV. So any defect that lands on the mask is in the focal plane. There is an assumption in developing EUV that there is no contamination in the chamber, but of course that is not completely realistic. To me this is a huge issue, and one we don’t have with optical masks, whose pellicle keeps contamination out of the focal plane. So EUV masks need to be cleaned regularly. But the patterns are starting to degrade after 50-100 cleanings.

There is some work on pellicles with materials that are transparent(ish) to EUV. The most promising material seems to be single crystal silicon.

Takeaway:

  • IBD can produce usable mask blanks but may not be viable long-term
  • Substrate quality is an issue, hampered by lack of defect metrology
  • Need to ensure adders (particles that get on the mask after it was made) do not print
  • Mask lifetime learning has just begun (backside coating damage, clean handling)
  • EUV mask supply chain is a weak link. Will not be ready at quality and volume needed for HVM ramp and so industry needs to strengthen mask supply ecosystem.


Next was Skip from ASML. Cost is a big concern in the whole industry. In 2000, 1GB would set you back $1182 in lithography costs, but by 2015 it should be about $0.17. Post-28nm, though, cost per transistor is flat. Only EUV can give full scaling for the 10nm node, due to litho/layout restrictions with multiple patterning. EUV also reduces cycle time by 30-75%. They have 11 systems in various stages of construction in their clean room.


Currently the source is generating 55W of power, which supports 43 wafers per hour (wph). They expect 80W and 60 wph by the end of the year. EUV is expected to be in volume production for 10nm logic and 1x-nm DRAM in 2015-2016.

My opinion. I understand the need for EUV to keep Moore’s law on track, especially without insanely high costs and insanely long turnaround times. But I still don’t see how everything can be made to work in time. Intel is already planning 10nm without it. The pellicle issue I have always considered a killer, but perhaps a silicon pellicle can be made to work; this meeting was the first time I’d heard a hint of the possibility of an EUV-transparent pellicle. The fact that masks will not be defect-free seems like a big issue. So much is being invested in the light source that I can believe it will be solved. But the almost laser-like focus (see what I did there) on that one issue has obscured many other issues that stand between EUV and high-volume production lithography.


Constrain all you want, we’ll solve more

by Don Dingee on 07-24-2013 at 8:30 pm

EDA tool development is always pushing the boundaries, driven in part by bigger, faster chips and more complex IP. For several years now, the trend has been developing tools that spot problems faster without waiting for the “big bang” synthesis result that takes hours and hours. Vendors, with help from customers, are tuning tools to real-world results.

Continue reading “Constrain all you want, we’ll solve more”


Metastability Starts With Standard Cells

by Daniel Nenni on 07-24-2013 at 8:05 pm

Metastability is a critical SoC failure mode that occurs at the interface between clocked and clockless systems. It’s a risk that must be carefully managed as the industry moves to increasingly dense designs at 28nm and below. Blendics is an emerging technology company that I have been working with recently; their MetaACE product can be used throughout the design flow, starting with foundation IP.

For standard cells, there are at least three groups that benefit from MetaACE:

  • The designer of the standard-cell synchronizer
  • The individual responsible for characterizing the synchronizer cell
  • The integrator of the synchronizer cell into an SoC product

MetaACE is used to refine cell design by minimizing the settling time-constant (tau) while maintaining other cell specifications. MetaACE is then used to obtain the parameters that characterize the synchronizer. The results can then be used to determine the MTBF of the synchronizer as it will be used in the SoC product.

Let’s look in detail, as it was explained to me during customer meetings, at how a standard-cell characterization team might use MetaACE as part of their flow. Assume that the design is sent to characterization; this includes the extracted cell netlist as well as device models for the process in question.


The typical characterization flow would be run to find things like setup/hold times, propagation delay, input loads, etc. Using MetaACE one could also determine the four parameters needed for metastability analysis: Tw(1), Tw(2), tau-m and tau-s. This uses the same extracted cell netlist and device models for the characterization but with a few twists:

  1. A small netlist should be created that instantiates the design.
  2. A few parameters should also be defined that MetaACE will use for its analysis:

    *include process models (SS corner, for example)
    .include '$models/processModSS.sp'
    *include cell to test
    .include '$CellLib/DFF.sp'
    *include the file MetaACE creates to drive simulation
    .include '$MetaACE/ic.sp'

    *define SUPPLY and wire it to Vdd
    Vdd vdd 0 DC 'SUPPLY'

    *Wire up the flip-flop/synchronizer cell(s) to be simulated
    xdff1 Vdd 0 D C QN Sync DFF_X1

    * bring out any internal nodes you may want to plot/analyze
    Vm3 xdff1.z9 n11 0
    Vm4 xdff1.z10 n21 0

In the above netlist example, the model file and the file that MetaACE modifies are included, as well as the flip-flop/synchronizer to be simulated. “SUPPLY” is wired up, as are “C” and “D”, which are used by MetaACE to specify the supply voltage and the clock/data inputs. Finally, any internal nodes in the circuit needed for analysis should be brought to this top level.

  3. MetaACE is now run, specifying the netlist created above as the input, along with a few other parameters. The main items needed are the location of the simulator (HSPICE) as well as:
       • The temperature of the run,
       • Vdd for the run,
       • The name of the clock and its rise/fall time and width,
       • The name of the data input and its rise/fall times,
       • The device’s setup/hold times (if known), and
       • What node(s) should be plotted and analyzed.
  4. For a master-slave type device, the first simulation run may specify the node that is the input to the slave as the node to analyze; this will give tau-m.
  5. Once tau-m is found, the same circuit is run again, but this time looking at the output of the first slave stage (if there is more than one stage). This will give the results for tau-s and TW(1).
  6. After this second run, the simulation can be rerun a third time looking at the output of the second flip-flop (for a multi-stage device), which will give TW(2).
  7. At each run, the configuration used can be saved for future use (in GUI or command-line mode). Steps 4-6 could be run from the command line as part of a script, automatically extracting all parameters for each submission of the circuit for characterization.

These general procedures can be run over various process corners by copying the configuration files (which are XML) and the top-level netlist, modifying the netlist to call different corner models, and changing the configuration file to point to the appropriate netlist. In this way, one could simulate the SS corner and the FF corner, for example. Even more complex cases can be run, such as an N-stage synchronizer cell where each flip-flop is assumed to be at a different process corner. Whatever you can specify in your netlist, MetaACE can simulate.
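That copy-and-modify corner sweep is easy to script. The sketch below is hypothetical (the file names and model-file naming convention are invented for illustration) and only generates the per-corner netlists; MetaACE itself would then be invoked on each one:

```python
from pathlib import Path

# Hypothetical corner-sweep helper: clone a template netlist once per
# process corner, redirecting its .include to the matching model file.
# The "processMod<corner>.sp" naming is an assumption, not MetaACE's.

CORNERS = ["SS", "TT", "FF"]

def make_corner_netlists(template: Path, out_dir: Path):
    """Generate one netlist per process corner from a template netlist."""
    out_dir.mkdir(parents=True, exist_ok=True)
    made = []
    for corner in CORNERS:
        text = template.read_text().replace("processModSS.sp",
                                            f"processMod{corner}.sp")
        dest = out_dir / f"sync_{corner}.sp"
        dest.write_text(text)
        made.append(dest)
    return made
```

Each generated netlist, paired with a copied XML configuration pointing at it, becomes one command-line MetaACE run in the characterization flow described above.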

After the characterization process concludes, the results for the taus and TWs are tabulated along with the other parameters of the cell and passed back to the library team for design updates, datasheets, etc. These data are all that is needed to calculate the MTBF for this cell once the input clock frequency, duty cycle and data arrival rate are determined. The data could also be used to see whether any recent changes to the cell influenced tau; for example, an increase in the cell’s drive strength may have caused tau-s to get a bit larger. This may be acceptable based on a specification for the maximum allowable tau. But for the first time, one may actually gain some insight into how cell changes affect not only things like propagation delay and loading, but also whether they make the cell’s performance as a synchronizer better or worse.
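For reference, the MTBF calculation those parameters feed is, in its simplest form, the textbook synchronizer model (my formula choice for illustration, not necessarily Blendics' exact equation): MTBF = exp(t_resolve/tau) / (Tw * f_clk * f_data). The numbers below are invented:

```python
import math

# Textbook metastability MTBF model for a synchronizer:
# the time allowed to resolve (t_resolve) is discounted exponentially
# by tau, and Tw * f_clk * f_data is the rate of metastable events.

def synchronizer_mtbf(tau_s, tw_s, f_clk_hz, f_data_hz, t_resolve_s):
    """Mean time between metastability failures, in seconds."""
    return math.exp(t_resolve_s / tau_s) / (tw_s * f_clk_hz * f_data_hz)

# Illustrative numbers: tau = 20 ps, Tw = 30 ps, 1 GHz clock,
# 100 MHz data toggling, one full clock period to resolve.
mtbf_s = synchronizer_mtbf(20e-12, 30e-12, 1e9, 100e6, 1e-9)
```

The exponential is why a small increase in tau-s, like the drive-strength example above, can shrink MTBF by orders of magnitude, and why characterizing tau accurately matters so much.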

Bottom line: MetaACE is a powerful tool that gives any engineer who has the extracted cell netlists and device models the ability to obtain the metastability parameters well before fabrication, and even during the cell design process. It also allows any product engineer who integrates synchronizer cells into his design to calculate the overall MTBF of all the synchronizers in the system.



The FPGA Blob is Coming…

by Luke Miller on 07-24-2013 at 5:00 pm

I never understood when I was a kid how ‘the Blob’ could actually catch someone, but it sure did. It caught the unsuspecting, the off guard. I mean, you’d have time for a soda and a shower if you saw it on your road. And no, your manager is not the Blob; don’t think like that, it’s always his boss. The Blob comes to consume the worker who was unaware that they could leave at 4pm on a Friday to avoid the next mini design crisis; only to learn that nobody needed the FPGA fix after all, it was the software reading the wrong register all the time.

I better write something techy… So do you know what I liken the Blob to? The FPGA… Yikes, on the surface that does not sound like a compliment, but it is, and maybe when I’m done you’ll want to be the Blob too. I better stop.

Over the last decade, have you noticed what the FPGA has consumed from your marvelous circuit board? Hmm, have you? I have personally seen the FPGA Blob eat a whole RADAR chassis into one part. Now that is blobbish. We are not only doing more math in FPGAs, but handling massive amounts of IO. The IO is how the Blob eats and spits out data. No more proprietary IO chips; implement whatever you want. The Blob loves ‘à la carte’. Shoot, even Richard Simmons can’t stop this thing. VPX FPGA COTS boards have an IO FPGA tied to the VPX high speed fabric. That means the board could use SRIO, PCIe etc… and is not locked into a particular ‘open’ (that’s funny) architecture.

Remember the bridge chips? I do… It makes me wonder what else is going to be consumed. I have some ideas but will keep them to myself for now. The major question I have is: what will be left for the microchip makers? Am I really going to buy a video encoder/decoder chip? Memory is safe, but everything else is fair game. Could it be that in the future a company like TI, instead of solely making chips, also designs IP for some FPGA house? Thus the single-chip IP solutions will find themselves as hardened IP, or even soft IP for that matter, in an FPGA. I do not know many engineers at this point using SHARC processors. Why? Well, the FPGAs are doing much of that DSP now. Blobbed.

It really is a new reality for FPGAs, and I like what NVIDIA is starting to think about, which is ‘Network on a Chip’. Yes, we do need a System on a Chip (my kids actually think that is a dip), but that system needs to talk to other systems over a medium called a network. Who is the biggest FPGA Blob victim, you ask? ASICs! I remember the days of weighing the trade space of ASICs vs. FPGAs. No longer is the ASIC a real competitor in the way we used to think of them. In fact, IBM laid off many employees from its ASIC division last month, no doubt due in part to the FPGA Blob. You see, the Blob can change shape; unlike the ASIC it is very flexible, just like our friend the FPGA. The Blob is coming and there is no stopping the momentum, just don’t get eaten.



• TSMC Q2 Results: Up 17%; 20nm and 16nm on track

    TSMC Q2 Results: Up 17%; 20nm and 16nm on track
    by Paul McLellan on 07-24-2013 at 10:47 am

    TSMC announced their Q2 financial results yesterday. Revenue was $5.2B (at the high end of guidance) with net income of $1.6B. This is up 17.4% on Q1 and up 21.6% year-to-year. Gross margin is up too, at 49% which is up 3.2 points on Q1 and 0.3 points year-to-year. As usual the financial results are not directly that interesting since I don’t much care whether TSMC is a buy next quarter. What is more interesting is trying to read the tea-leaves for the big strategic picture on a multi-year timescale.


Their business breaks down as 57% communication, 16% computer, 20% industrial and 7% consumer. Pretty much all the growth since last quarter is in the communication area, which isn’t really a big surprise: up 22% from the largest base, although the other areas are all up in the 10-20% range too, from smaller bases.

It is interesting to see the shift taking place between process generations. 29% is 28nm, 21% is 40/45nm, 16% is 65nm and everything else is older. Surprisingly, 15% is in 0.15/0.18um (I’m guessing mostly analog and other specialist stuff, since there is almost nothing in 0.13um or 90nm).

ARM also announced their results yesterday, and these are significant for TSMC for one reason. If ARM starts to lose share to Intel in mobile (or Intel starts to lose share to ARM in servers) this will impact TSMC negatively (or positively, in the server case). Simon Segars, in his first quarterly presentation since becoming CEO, was very bullish on both areas. Perhaps the most interesting little factoid from the ARM presentation is that royalties are up 24% year-on-year, which is much bigger than the growth in overall semiconductor (2%). And perhaps even more interesting is that a large number of cores that ARM has licensed are not yet shipping (and so not yet producing royalties). For instance, the Cortex-M (which is a microcontroller) has 180 licensees but only 50 are yet shipping. Not all of these ARM-based chips will be manufactured by TSMC, of course, but certainly TSMC will get their unfair share as the biggest foundry. That’s an attractive pipeline. ARM-based servers are now starting to ship, and AMD (admittedly a biased observer) is predicting double-digit market share by 2016/17, which is huge if it turns out to be true. And while AMD themselves do a fair bit with GF, other server licensees work with TSMC. And those are big chips (mostly 64-bit) which will need a lot of wafers.


    What is TSMC’s total capacity? Their forecast for the end of the year is for 16.5M 8″ equivalent wafers per year. Fab 14 alone is 2.2M 12″ wafers (5M 8″ equivalents). That’s a lot of silicon, up 11% from last year with 12″ capacity up 17% (new fabs are all 12″ of course). Their capex spending remains on-track for $9.5B to $10B for this year (of which 55% has already been spent in the first half).
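The 8″-equivalent figures above come from converting 12″ (300mm) wafer counts by silicon area, a ratio of (300/200)² = 2.25. A quick sketch to check the article's own numbers (Fab 14's 2.2M 12″ wafers coming out at roughly 5M 8″ equivalents):

```python
def eight_inch_equivalents(n_12in_wafers):
    """Convert a count of 12" (300mm) wafers to 8" (200mm) equivalents
    by wafer area: (300/200)**2 = 2.25x the silicon per wafer."""
    return n_12in_wafers * (300 / 200) ** 2

# Fab 14: 2.2M 12" wafers per year.
fab14_8in_equiv = eight_inch_equivalents(2.2e6)
print(f"Fab 14: {fab14_8in_equiv / 1e6:.2f}M 8-inch equivalents")
```

This gives 4.95M, consistent with the ~5M 8″ equivalents quoted for Fab 14.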

    When Morris Chang spoke he was bullish too. For overall semiconductor they are cutting their forecast from 4% to 3%. But for fabless they predict 9% growth. And for the foundry industry (not just TSMC) they are raising the forecast to 11% from 10%. And for TSMC bigger than that.

As for 28nm: “Our 28-nanometers is on track to triple in wafer sales this year and our 28-nanometer high-K metal gate is ramping fast, and will exceed the Oxynitride solution starting this quarter. For the Oxynitride solution in which we do have competitors, we believe that we have a substantial lead in yield. For the high-K metal gate solution, we do not have any serious competitors yet. We believe we have a substantial lead in performance. If you recall, ours is a gate-last version and our competitors are mainly in the gate-first version.”

    20nm: Risk production has started and volume production starts Q1 2014. Doesn’t see any real competition.

16nm: Volume production starts a year after 20nm, in early 2015.

Morris again: “On the 16, if we put it on a foundry to foundry or foundry to IDM basis, we are competitive. If you put it on a grand alliance to IDM basis, we are more than competitive.”
    (BTW the transcript for this part keeps saying IBM but that makes no sense and it must really mean IDM, integrated device manufacturer. Or, to be precise, Intel. What Morris is saying is that they will be competitive with Intel at 14nm).

    Presentation is here. Transcript of call is here. Transcript of ARM’s call is here.