MIPS Warrior Goes Into Battle
by Paul McLellan on 06-26-2013 at 7:00 am

You are probably aware that Imagination Technologies, perhaps best known for creating the GPUs in the iPhone and iPad, acquired MIPS, which was originally a spinout from Silicon Graphics and licenses a line of general-purpose microprocessors.

MIPS considers that it has a purer implementation of the RISC philosophy, built around the idea that simple instructions deliver higher performance. An interesting experiment from way back: IBM took the PL/I compiler for the IBM 801 (the first RISC processor; why on earth did they not put it in the first PC?) and retargeted it to the IBM 360 architecture. Since the compiler had no concept of complex instructions, and accessed memory only through load and store instructions, it used just a small subset of the IBM 360 instruction set. Nonetheless, it turned out to run code faster than the industrial-strength PL/I compiler designed for the IBM 360, which took advantage of the entire instruction set. RISC won even on a very CISC architecture.

The result of this pure implementation is that MIPS has a higher CoreMark/MHz than the competition while using only 60% of the area, so superior performance per unit area. CoreMark is a much better benchmark than the old Dhrystone benchmark, with about 700,000 instructions versus roughly 350. It stresses a CPU's branch prediction and L1 cache performance in a way that Dhrystone never did, and is much closer to a modern workload. By the way, there is an interesting little story behind the Dhrystone name: there was a floating-point benchmark called Whetstone, named after the town in England where it was developed, so when a fixed-point benchmark was created the obvious name was Dhrystone, a wet/dry play on words plus the odd extra inserted “h”.

This means that a MIPS proAptiv design can fit a quad-core configuration into the same area in which competitors fit a dual core, achieving a CoreMark score over twice as high in the same silicon area.
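
As a rough illustration of that performance-per-area arithmetic, here is a minimal sketch; the absolute figures are hypothetical placeholders, and only the relative claim (twice the cores in the same silicon area) echoes the article.

```python
# Rough sketch of the performance-per-area comparison described above.
# The absolute numbers are hypothetical placeholders; only the relative claim
# (twice the cores in the same silicon area) comes from the article.

def coremark_per_mm2(coremark_per_mhz_per_core, cores, area_mm2):
    """Aggregate CoreMark/MHz delivered per mm^2 of core area."""
    return coremark_per_mhz_per_core * cores / area_mm2

competitor_dual = coremark_per_mm2(coremark_per_mhz_per_core=3.5, cores=2, area_mm2=2.0)
proaptiv_quad   = coremark_per_mm2(coremark_per_mhz_per_core=3.5, cores=4, area_mm2=2.0)

print(f"Competitor dual core: {competitor_dual:.2f} CoreMark/MHz per mm^2")
print(f"proAptiv quad core  : {proaptiv_quad:.2f} CoreMark/MHz per mm^2")
print(f"Ratio               : {proaptiv_quad / competitor_dual:.1f}x")  # ~2x in the same area
```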

interAptiv is a smaller, less power-hungry core with in-order execution. It supports multi-core configurations with coherent caches, and it is also multi-threaded. Its focus is supporting Linux and other embedded operating systems. Then, for really minimal area, there is microAptiv, with a 5-stage pipeline that optimizes PPA. Its performance/area is over 45% better than its main competitor's, and it has much higher DSP performance.

Today MIPS announced the interAptiv single-core CPU IP, designed for small-footprint, parallel-processing-intensive applications such as networking, baseband and protocol processing. It has a smaller area and lower dynamic power consumption: it removes the extra logic associated with multi-core coherency, along with the L2 cache controller, providing a very efficient multi-threaded single-core processor.

Features include:

  • MIPS multi-threading
  • Extended Virtual Addressing (EVA)
  • ECC on L1 cache and scratchpad RAM
  • DSP Application Specific Extensions rev 2, eliminating the need for a separate DSP for voice and audio
  • Un-Cached Accelerated (UCA) writes


An example is building an LTE stack (true 4G wireless) using a multi-threaded core with an optimized baseband stack and a multi-thread-aware RTOS. This delivers multicore performance in the area/power envelope of a single core; the multithreading gives a performance boost of up to 2X versus a single thread.
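
To see why hardware multi-threading can approach a 2X gain, here is a minimal sketch of a simplified utilization model (my own illustration, not the MIPS analysis): when one thread stalls on memory, another thread's instructions keep the pipeline busy.

```python
# Simplified hardware multi-threading model (illustrative only, not MIPS data):
# each thread, run alone, keeps the pipeline busy only part of the time; extra
# hardware threads fill the stall cycles, up to the limit of shared resources.

def relative_throughput(threads, busy_fraction):
    """Throughput relative to a single thread, ignoring contention effects."""
    all_stalled = (1.0 - busy_fraction) ** threads  # probability every thread is waiting
    return (1.0 - all_stalled) / busy_fraction

for n in (1, 2, 4):
    print(f"{n} thread(s): {relative_throughput(n, busy_fraction=0.5):.2f}x")
# 1 thread: 1.00x, 2 threads: 1.50x, 4 threads: 1.88x -- approaching the 2X upper bound
```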


MIPS also gave a sneak preview of the future. First there were the classic MIPS cores, then the Aptiv cores, and next will be the Series5 Warrior series of cores. These will have:

  • hardware virtualization across the entire range of cores, providing compelling benefits for applications from compute-intense enterprise environments to energy efficient mobile platforms
  • MIPS hardware multi-threading technology, enabling better overall throughput, quality of service (QoS), and power/performance efficiency in select ‘Warrior’ cores
  • Imagination’s unique, extensible and highly scalable security framework for applications including content protection on mobile devices, secure networking protocols and payment services
  • MIPS SIMD architecture (MSA), built on instructions designed to be easily supported within high-level languages such as C or OpenCL for fast and simple development of new code, as well as leverage of existing code
  • a consistent and comprehensive toolchain across the ‘Warrior’ series for fast, easy development and debugging
  • full compatibility with existing legacy 32-bit and 64-bit code


Can China alone drive semiconductor growth?
by Bill Jewell on 06-25-2013 at 9:04 pm

Production of electronic equipment (including computers, communications, consumer, etc.) has been sluggish over the last year in all key regions of the world except China. The chart below shows the three-month-average change versus a year ago for electronics production in local currency through April 2013 (May 2013 for China). Total industrial production is shown for Europe and South Korea, where electronics production data is not available. China has maintained growth of over 10% since the beginning of 2011. The U.S. showed moderate growth until turning negative in October 2012. Industrial production in Europe and South Korea was healthy in early 2011; Europe turned negative in February 2012, while South Korea has been negative since February 2013. Japan's electronics production has not recovered since the March 2011 earthquake and tsunami, with mostly double-digit declines. Other key electronics manufacturing countries not shown on the chart include Taiwan and Singapore, each of which turned from growth to decline in early 2012.

The black line on the chart above shows three-month-average worldwide semiconductor shipments versus a year ago from WSTS. The semiconductor market went negative in July 2011, turned positive in November 2012, then slipped to a 1.8% decline in April 2013. It is worth noting that semiconductors turned positive in November 2012 with only China showing significant growth in electronics production.

Can the semiconductor industry return to growth with only China driving increases in electronics? The answer is probably yes. China has emerged as the dominant power in electronics manufacturing, especially in the last 12 years. The chart below shows electronics production in local currency based on government data for China, the United States and Japan, with each country indexed to 100 in 2000. Electronics production in China in 2012 was over eight times its 2000 level. U.S. production in 2012 was two-thirds of its 2000 level. Japan's 2012 production was only 36% of its 2000 level. Comparing the value of electronics production across countries is difficult due to differences in definitions and exchange rates. Based on the closest available data, the U.S. dollar value of electronics production in 2012 was $1,100 billion in China, $342 billion in the U.S. and $65 billion in Japan. In 2000, production was $100 billion in China, $510 billion in the U.S. and $134 billion in Japan. Europe is not shown on the chart, but euro-denominated electronics production of the 15 European Union (EU) member countries was at basically the same level in 2011 as in 2000.
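
As a quick sketch of the indexing arithmetic behind the chart, the snippet below recomputes a 2000 = 100 index from the U.S. dollar figures quoted above; the chart itself is in local currency, so the dollar-based ratios differ somewhat, mainly because of exchange-rate moves.

```python
# Recomputing a simple 2000 = 100 production index from the US-dollar figures
# quoted in the text. The article's chart uses local currency, so these ratios
# differ from the quoted local-currency results (China 8x, U.S. ~2/3, Japan 36%),
# mostly due to exchange-rate movements.

production_usd_bn = {       # (2000, 2012), US$ billions
    "China": (100, 1100),
    "U.S.":  (510, 342),
    "Japan": (134, 65),
}

for country, (y2000, y2012) in production_usd_bn.items():
    index_2012 = 100 * y2012 / y2000
    print(f"{country:6s} 2012 index (2000 = 100): {index_2012:6.0f}")
```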

Semiconductor shipments by region from WSTS do not completely reflect this shift in electronics production. The table below shows the percentage of semiconductor shipments by region from WSTS. The Americas is primarily the U.S., but also includes Canada, Mexico and South America. EMEA is Europe, the Middle East, and Africa. Asia Pacific includes China, Taiwan, South Korea, Singapore, India, the Philippines and other Asian countries.

The growth of Asia Pacific from 25% of the world in 2000 to 56% in 2012 is impressive, but does not fully show the relative growth of China, which is becoming a larger portion of the Asia Pacific region. While China's electronics production grew more than eight-fold from 2000 to 2012, Taiwan's electronics production was roughly the same in 2012 as in 2000.

Also, the region a semiconductor is shipped to is not necessarily the region in which it is used in electronics manufacturing. For example, a Japan-headquartered electronics company may purchase semiconductors for its warehouse in Japan, counting as a Japan shipment. The company may then transfer the semiconductors to its manufacturing subsidiary in China. The final use of the semiconductors is in China, but the shipment region is Japan in WSTS data.

Accurate worldwide electronics production data is not available. We at Semiconductor Intelligence estimate China accounts for over half of worldwide electronics production. China should continue to be the key driver of electronics production growth for at least the next decade. The mature regions of the U.S., Japan and Europe are unlikely to show any meaningful growth in electronics production in the next few years, if ever.



Kathryn Kranen: The Problem with EDA is…
by Paul McLellan on 06-25-2013 at 5:35 pm

Kathryn Kranen, CEO of Jasper Design Automation, got to give her view of the future of EDA on the Thursday of DAC. For many years she has been on the EDAC board and is currently its chair. When she first joined the board she talked to many of the stakeholders in the EDA ecosystem: EDA companies, IP companies, semiconductor companies, academics, consultants, everyone who is involved in getting semiconductors designed and manufactured.

Many of them would say, “The problem with EDA is dot dot dot.” Of course, the problem was always someone other than the speaker:

  • the problem with EDA is big companies blocking small companies with all-you-can-eat deals
  • the problem with EDA is customer CAD is now just about procurement and not methodology
  • the problem with EDA is that it doesn’t invest in academia any more
  • the problem with EDA is that academics don’t understand EDA’s need to keep things proprietary
  • the problem with EDA is startups are hurt by all those expensive tools
  • the problem with EDA is that it is an aging industry out of fresh ideas
  • the problem with EDA is all the innovation comes from small companies

In fact, big EDA companies have to upgrade virtually every tool in their chain, which is sustaining innovation in Clayton Christensen's terminology. This is even more true from about 28nm onwards: HKMG transistors, double patterning, FinFETs; every node brings a new discontinuous change. And then there are disruptive innovations, which can occur anywhere.

Kathryn picked the subscription model as an innovation that came from big companies, which I thought a curious choice. Firstly, the true subscription model was pioneered by Gerry Hsu at Arcsys when it was still small, though it is true that big companies perfected it. If you want to offer an all-you-can-eat model it helps to have a lot of food on the buffet; for a startup, not so much. But to be fair to the complainers, when people say all the innovation comes from small companies I think they mean technical innovation, not business model innovation.

But EDA is a great industry. We have to work together to solve problems. We get to see results on a very rapid timescale, not having to wait 10 or 20 years for a full learning cycle. We have the highest IQ of any industry. So the future continues to be bright.

There is a video of Kathryn’s vision on the DAC website here.


Remember FPGA Memory
by Luke Miller on 06-25-2013 at 3:10 pm

We must admit the excitement of the FinFETs and all that, coupled with the enormous number of DSPs and BRAMs in the FPGA world, is very cool. They even have ARMs, and I highly recommend that they get legs, then they can run around and everything and fit in with the rest of us. Perhaps the Feds can grant them immigration status and they could also vote, who knows. And once again, no emails please, I don't read them.

Now all you G’PU’ folks (as I hold my nose to you, ‘pu’) probably don't care about terms like DSP48s or BRAMs, but I do know that we all care about our very special DDR3/4 memory. Memory technology as of today has addressed and solved many bandwidth issues but is just about past its prime. What is the problem, you ask? Good question. The problem is IO. The term of the decade is Giga, you know, a billion. Silicon cannot keep the same IO footprint of DDR signaling to/from the memories. If you start to do the math and extrapolate out a bit, you will see that the data the FPGA is consuming on the front end could not be buffered very deep in future designs, even if you used a Virtex-8. What if, for instance, you were designing a RADAR front end and needed to buffer a whole CPI of data across 10,000 elements? Remember, today's FPGAs have 28Gb/s GTs per lane!
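
To make that math concrete, here is a back-of-envelope sketch; all of the parameters (sample rate, sample width, CPI length) are assumed round numbers chosen for illustration, not figures from any real system.

```python
# A rough back-of-envelope sketch of the buffering problem described above.
# All parameters (sample rate, bit width, CPI length, element count) are
# hypothetical round numbers chosen for illustration.

elements        = 10_000        # antenna elements / receive channels
sample_rate_hz  = 100e6         # 100 MS/s per element (assumed)
bits_per_sample = 32            # 16-bit I + 16-bit Q (assumed)
cpi_seconds     = 1e-3          # 1 ms coherent processing interval (assumed)

bits_per_cpi   = elements * sample_rate_hz * bits_per_sample * cpi_seconds
bytes_per_cpi  = bits_per_cpi / 8
aggregate_gbps = elements * sample_rate_hz * bits_per_sample / 1e9

print(f"Aggregate front-end rate: {aggregate_gbps:,.0f} Gb/s")
print(f"Buffer needed per CPI   : {bytes_per_cpi / 2**30:,.1f} GiB")
# Even these modest assumptions give terabit-class ingest rates and gigabytes
# per CPI -- far beyond on-chip BRAM, and a stretch for conventional DDR interfaces.
```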

Now, the solution is not going to be 32 banks of DDR4 using 1200 LVDS pins. I wonder if the future of memory will be gigabit serial IO instead of LVDS DDR. I think so, and when this happens it will open up not only the FPGA data-processing pipes but unleash CPUs and GPUs as well. Now don't say that is hard; everything about silicon is hard, but the challenge needs to be overcome. Secondary to this memory design is the FPGA's ability to download an image from PROM or flash so the FPGA can be configured. Xilinx FPGAs have the ability to do partial reconfiguration, which works well for designs such as software-defined radio, but for RADAR applications it comes up short.

The issue is that reprogramming times in the RADAR realm need to be under a millisecond. At today's programming rates that is not happening, but if you get pencil and paper out you could easily design a gigabit configuration PROM. That is the easy part; the hard part is getting the FPGA bitstream inside the FPGA to go through the numerous states to configure quickly, including AES schemes. The payoff is creating, for instance, a surveillance RADAR and a Missile and Fire Control RADAR (MFCR) using the same COTS board. When a threat is detected, reprogram the RADAR to become an MFCR and finish the track and engagement. Don't think that's possible? Like everything, give it time and money and we'll see it soon enough.
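
For a sense of the configuration-time arithmetic, here is a rough sketch; the bitstream sizes and interface speeds below are illustrative assumptions, not vendor specifications.

```python
# A quick sketch (illustrative numbers, not vendor specifications) of why
# configuration speed is the bottleneck for sub-millisecond RADAR mode switches.

def load_time_ms(bitstream_bits, interface_bits_per_s):
    """Raw transfer time, ignoring startup, CRC and AES decryption overhead."""
    return 1e3 * bitstream_bits / interface_bits_per_s

full_bitstream    = 150e6   # ~150 Mb for a large FPGA (assumed)
partial_bitstream = 20e6    # ~20 Mb for one reconfigurable region (assumed)

print(f"Full, Quad SPI @ 400 Mb/s     : {load_time_ms(full_bitstream, 400e6):7.1f} ms")
print(f"Full, SelectMAP @ 3.2 Gb/s    : {load_time_ms(full_bitstream, 3.2e9):7.1f} ms")
print(f"Full, 4 x 28 Gb/s serial links: {load_time_ms(full_bitstream, 4 * 28e9):7.2f} ms")
print(f"Partial, 28 Gb/s serial link  : {load_time_ms(partial_bitstream, 28e9):7.2f} ms")
# Only gigabit-class loading (and keeping the image small) gets near the 1 ms target,
# and that still leaves the internal configuration state machine and AES as the hard part.
```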




Visual AMS Debug, an update at DAC
by Daniel Payne on 06-24-2013 at 4:07 pm

If you’re involved with AMS or transistor-level IC design then having visual tools will help you design and debug quicker. At DAC I met with Gerhard Angst, President and Founder of Concept Engineering to get an update.


Gerhard Angst (center), Concept Engineering



Fujitsu, Mediatek, Richtek and Synopsys on Custom IC Design
by Daniel Payne on 06-24-2013 at 3:01 pm

Synopsys has been acquiring EDA and IP companies at a fast clip over the past few years, and it's often made me wonder how they are going to craft a coherent tool flow for custom IC design. At DAC this year I learned that for schematic capture the winning tool is Custom Designer SE, a relatively new tool, while the IC layout tool will be Laker from SpringSoft. I'm not sure if that makes the installed base of SpringSoft users happy or not, because they had been using Laker for both schematics and IC layout, but that's material for another blog.


Is Your Synchronizer Doing its Job (Part 2)?
by Jerry Cox on 06-23-2013 at 8:10 pm

In Part 1 of this topic I discussed what it takes to estimate the mean time between failures (MTBF) of a single stage synchronizer. Because supply voltages are decreasing and transistor thresholds have been pushed up to minimize leakage, the shortened MTBF of many synchronizer circuits at nanoscale process nodes is presenting an increased risk of failure. Moreover, new SoC designs are expected to have hundreds of clock domains and at least as many Clock Domain Crossings (CDCs). To decrease this risk of failure in these multi-synchronous designs, many designers routinely use multistage synchronizers in each CDC.

The following figure shows a typical multistage synchronizer. The N synchronizer stages are all clocked from the right-hand clock (fc), but data transitions produced in the left-hand clock domain (fd) may violate the setup and hold constraints of the first synchronizer flip-flop. These violations can result in metastable behavior of each of the N stages, but each added flip-flop reduces the width of the synchronizer’s window of vulnerability. Data transitions within that narrow window can cause serious mischief in the right-hand domain since the outputs from logic blocks L1 and L2 can then be inconsistent and lead to an unknown state. N must be chosen so that the chance of a data transition falling within the narrow window of vulnerability is extremely rare and the resulting MTBF is exceedingly long.
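
As a reminder of the arithmetic involved, here is a minimal sketch built on the classic single-stage MTBF formula, extended with the common simplification that each added stage contributes roughly one more clock period of settling time; the parameter values are illustrative placeholders, not the calibrated model from the paper discussed below.

```python
import math

# Classic synchronizer failure-rate arithmetic (simplified, illustrative values):
#   MTBF = exp(t_settle / tau) / (T_w * f_c * f_d)
# where tau is the flip-flop's metastability time constant, T_w its vulnerability
# window, f_c the synchronizer clock rate and f_d the data transition rate.
# Here each added stage is assumed to add about one clock period of settling time.

def mtbf_seconds(stages, f_clock, f_data, tau, t_window):
    t_settle = stages / f_clock          # total time available to resolve metastability
    return math.exp(t_settle / tau) / (t_window * f_clock * f_data)

YEAR = 3600 * 24 * 365
for n in (1, 2, 3):
    m = mtbf_seconds(stages=n, f_clock=1e9, f_data=200e6, tau=100e-12, t_window=20e-12)
    print(f"{n}-stage MTBF: {m:10.3g} s  ({m / YEAR:.3g} years)")
# Each extra stage multiplies MTBF by roughly exp(T_clock / tau) -- orders of magnitude.
```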


Today, two-stage (N=2) synchronizers are routine and three and four stages are becoming more common. However, the calculation of MTBF for these multistage devices is not straightforward. In fact, our recent paper, “MTBF bounds for multistage synchronizers” makes it clear that published MTBF models give widely varying results: the MTBF calculated at a single process corner and at a single operating condition gave results that varied over five orders of magnitude among the existing models. When compared with complete circuit simulation, our model gave consistently accurate results as shown in the following figure.


As one can see, calculated results can accurately predict simulated MTBF values. Some noteworthy comments about this result:

  • Only four parameters are required to predict MTBF over a wide range of clock periods and numbers of stages.
  • These four parameters can be obtained from a few circuit simulations at a single clock frequency.
  • As discussed in Part 1, the clock duty cycle must be known to calculate an effective settling time constant.
  • Also as discussed in Part 1, parameters from both Sam and Ian must be known to determine the multistage MTBF.

From the above figure it is clear that, at a 1 GHz clock rate, a 200 MHz data transition rate and the SS corner, this 90 nm single-stage, master-slave circuit was unreliable for synchronizer service. A two-stage circuit had an MTBF of less than a year, and even a three-stage circuit had an MTBF of less than 1000 years (considering you may have hundreds or even thousands of synchronizers in an ASIC, that is still unreliable).

Risk of failure increases substantially at 45 nm and below, at lower supply voltages and at lower operating temperatures. Clearly, multistage synchronizers will find increasing use, particularly in multi-synchronous, custom silicon that goes into mission-critical applications. Such applications include, for example, automotive engine control modules, lithium battery charger circuits, implantable medical devices, certain avionics products and industrial control systems. These designs should all have a critical sign-off covering all CDC MTBF specifications. The fact that this is not happening today is troubling.



TSMC and Xilinx on the FinFAST Track!
by Daniel Nenni on 06-23-2013 at 2:00 am

The power of the fabless semiconductor ecosystem never ceases to amaze me. On one hand you have the Intel-backed press crowing about Intel stealing Altera from TSMC. On the other hand you have Xilinx and TSMC crowing about a new ‘one-team’ approach. If you are interested in the real story you’ve come to the right place.

“Altera’s FPGAs using Intel 14 nm technology will enable customers to design with the most advanced, highest-performing FPGAs in the industry,” said John Daane, president, CEO and chairman of Altera. “In addition, Altera gains a tremendous competitive advantage at the high end in that we are the only major FPGA company with access to this technology.”

“I am extremely confident that our ‘FinFast’ collaboration with TSMC on 16-nanometer will bring the same leadership results that we enjoyed at previous advanced technologies,” said Moshe Gavrielov, President and CEO of Xilinx. “We are committed to TSMC as the clear foundry leader in every dimension, from process technology to design enablement, service, support, quality, and delivery.”

The one disadvantage of the fabless semiconductor ecosystem and crowd sourcing in general is that you are working with companies that also work with your competitors. That is certainly the case with TSMC since just about every fabless semiconductor company manufactures at TSMC and TSMC is bound by honor (The Trusted Technology and Capacity Provider) to provide a level playing field for all customers. The only thing worse would be if the company that manufactures your product competes directly with you, just ask Apple!

“We look forward to collaborating with Altera on manufacturing leading-edge FPGAs, leveraging Intel’s leadership in process technology,” said Brian Krzanich, chief operating officer, Intel. “Next-generation products from Altera require the highest performance and most power-efficient technology available, and Intel is well positioned to provide the most advanced offerings.”

“We are committed to working with Xilinx to bring the industry’s highest performance and highest integration programmable devices quickly to market,” said Morris Chang, TSMC Chairman and CEO. “Together we will deliver world-class products on TSMC’s 20SoC technology in 2013 and on 16FinFET technology in 2014.”

This was certainly the case for Altera and Xilinx at TSMC: the flow of information and collaboration was definitely guarded, knowing full well that any process improvement would benefit both companies. Altera moving to Intel changed that of course, a change for the better with regard to the greater good of the fabless semiconductor ecosystem. Putting the number one foundry (TSMC) in close collaboration with the number one provider of programmable technologies and devices (Xilinx) could be a serious game changer, absolutely. Look for a Xilinx-flavored version of the 16nm process for higher performance applications like FPGAs, CPUs, and GPUs. Just my opinion of course.

Let’s look at the FUD side of this:

  • Intel as a foundry is an unknown
  • How fast will Altera be able to build a competitive Intel based ecosystem?
  • Intel as an FPGA manufacturer is an unknown
  • Will Intel eat crow and sign an ARM Manufacturing deal? (ARM cores are big in the FPGA world)
  • Or will Intel force Atom on Altera?
  • What happens to the other Intel FPGA partners Tabula and Achronix?

I’m not questioning Altera’s decision to partner with Intel. It was definitely the right thing to do given that Xilinx seriously challenged them at 28nm and will again at 20nm. Competition fuels our industry, and Intel/Altera are a competitive threat, so it is for the greater good.

I do, however, question the Intel-biased spin on the situation and the constant bashing of the fabless semiconductor ecosystem. My opinion: Intel will rue the day they openly attacked QCOM, ARM, TSMC, and the rest of the fabless crowd, believe it. Hey Mr. Intel, this is not the microprocessor world you have controlled since the beginning of time. You are not in Kansas anymore, Dorothy.
