
The EDAC Wally Rhines Roast (video)

by Daniel Nenni on 11-17-2015 at 7:00 am

Last week was the EDAC Phil Kaufman award dinner. It was much more like a roast, probably because Wally has a great sense of humor and as Aart de Geus said, “Wally is a cool cat to have a beer with…” Aart is right of course, hanging with Wally is one of my favorite work things to do.

The place was lousy with media people so I will try and add some additional color here. But first check out the tribute video below featuring former Intel CEO Craig Barrett, TSMC CEO Morris Chang, NXP CEO Rick Clemmer, Sir Peter Bonfield, TI CEO Rich Templeton, Synopsys CEO Aart de Geus, Cadence CEO Lip-bu Tan, and even a clip from the Phil Donahue Show:

The video is fast moving (like Wally’s hairline) so it is definitely worth the ten minutes.

This whole thing started at the last Design Automation Conference in San Francisco. The Kaufman Award honors an individual who has had a demonstrable impact on EDA. EDAC had a DAC booth with pictures of all the Kaufman Award winners, and Wally's picture was not there. I had just done a fireside chat with Wally and asked him why.


He said he believed it was because he was viewed as a semiconductor guy versus an EDA guy. Considering Wally has been a member of the EDAC Board of Directors since 1994 and served as EDAC's chairman from 1996 to 2012, I did not agree. I also noticed that there were no female award winners, so maybe that will be addressed as well.

One of the more interesting topics of the evening was Wally’s contribution to blue light-emitting diodes which was later “perfected” and awarded The Nobel Prize in Physics in 2014. You can read more about that here:

EDA and the Nobel Prize in Physics!
by Daniel Nenni Published on 10-15-2014 05:00 AM
What does EDA and the Nobel Prize for Physics have in common? Our very own Dr. Walden Rhines (CEO of Mentor Graphics)…

It was quite an elegant affair with several hundred people, some of whom are recognized EDA heroes. There were actually two videos shown, but I doubt the second one will be posted. It was a spoof on Wally that fell flat. Craig Barrett's keynote was very good. Craig was a professor at Stanford and was on Wally's PhD advisory panel, so he had some interesting stories.

The most interesting comment Craig made was about Wally's father, Dr. Frederick Rhines, who authored “Phase Diagrams in Metallurgy: Their Development and Application” circa 1956 and coined the term “microstructology” to describe the study of microstructures of metals and alloys. The apple did not fall far from the tree, it seems. According to Wally's Wiki page: During his career Dr. Rhines Senior was the Alcoa professor of light metals at the Carnegie Institute of Technology from 1946 to 1959 and founder of the department of materials science and engineering at the University of Florida, from which he retired in 1978; today the department is housed in Frederick N. Rhines Hall.

The other interesting information of the evening came from the guy sitting next to me, a long time EDA lawyer, who had some incredibly funny stories that he made me promise not to print. But let me tell you, between the two of us we know where most of the EDA bodies are buried, absolutely!

The video of the event will be posted HERE at some point in time.

Don’t forget to follow SemiWiki on LinkedIn HERE…Thank you for your support!


Maybe Clockless Chip Design’s Time has Come

by Tom Simon on 11-16-2015 at 4:00 pm

There have always been novel technologies vying to compete with conventional design practices. The success of these ideas is hit or miss. In the 90s I recall speaking to someone who was convinced that they could effectively build computers based on multilevel logic. This, as we know, did not pan out. But there have been many more ideas that have been partially or fully successful.

Years ago Intrinsity was formed by the former Motorola PPC design team to commercialize their dynamic-logic-based design approach. They could achieve impressive performance/mW numbers, but it was only suitable for full custom designs, and working in their design specification language was a challenge. They did some work with Microsoft on one of the early Xboxes. They had a tough time selling their solution to a broader market. However, they scored a major win when the design team was acquired by Apple. These are probably the same folks who are building your A9 chips.

One area of consistent effort is clocking schemes. One such effort that seems to have stalled out is resonant clocking. The startup Cyclos produced some chips with AMD, but making it commercially viable proved difficult. They boasted 4 GHz clock speeds, but this is achievable with conventional clock trees, so the benefit may not have justified the added effort.

Azuro is well known for shaking up the CTS market. Before they came along, the big guys probably had one or two part-time developers working on improving CTS. With their dramatic improvements in area and power, they signed some big deals with major chip companies and were eventually acquired by Cadence.

What a lot of people do not know is that their CEO Paul Cunningham originally wanted to develop a commercial solution for clockless design. However, after significant effort they abandoned that and focused on improving the existing CTS methodology. Clockless design still remains the holy grail of power-performance improvement. Of course it would also be a win in PVT in general as it would handle variation much better.

Wave Semiconductor has developed an approach for implementing digital designs without a clock, called Wave Threshold Logic (WTL). To make it work they needed to change some fundamental ways circuits are designed, but they claim to deliver equivalent functionality running much faster with a minimal area penalty.

They needed to use two wires per signal, to convey 0, 1, NOT_DATA, and an unused/illegal state. An increase in signal lines is not an excessive penalty for eliminating an entire clock tree along with all its signal wires, buffers, and registers. They also boast a power consumption win from the elimination of glitching transitions. The only transitions in their logic are true data transitions; when the logic changes, it is because of necessary switching. They also had to invent a new type of gate: one that switches based on the number of its inputs that are at logic 1.

Using this new gate and backward-branching lines to coordinate logic readiness, they can implement the equivalent of any Boolean logic gate and pipelines of arbitrary depth. Arguably this approach still suffers from the problem Intrinsity had: the lack of a direct translation from RTL to their gates. However, Wave Semiconductor can easily reach data speeds of 8 to 12 GHz. I am sure the lack of a gate-level compiler has held back their business progression. Now it seems they have found a path that will provide them a bigger TAM.
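To make the dual-rail encoding and threshold-gate idea concrete, here is a minimal behavioral sketch in Python. The encodings and the AND-gate construction are illustrative assumptions in the spirit of the description above, not Wave's actual WTL circuits:

```python
# Hypothetical sketch of dual-rail signaling with threshold gates.
# Two wires per signal give four states, as described in the text.
NOT_DATA = (0, 0)   # neither rail asserted: "no data yet"
ZERO     = (1, 0)   # rail 0 asserted -> logic 0
ONE      = (0, 1)   # rail 1 asserted -> logic 1
ILLEGAL  = (1, 1)   # both rails asserted: unused/illegal state

def threshold_gate(threshold, *inputs):
    """Fires once at least `threshold` of its inputs are at logic 1."""
    return 1 if sum(inputs) >= threshold else 0

def dual_rail_and(a, b):
    """A dual-rail AND built from two threshold gates:
    the '1' rail fires when both inputs are ONE (threshold 2);
    the '0' rail fires when either input is ZERO (threshold 1)."""
    out1 = threshold_gate(2, a[1], b[1])
    out0 = threshold_gate(1, a[0], b[0])
    return (out0, out1)

assert dual_rail_and(ONE, ONE) == ONE
assert dual_rail_and(ONE, ZERO) == ZERO
# No clock needed: with incomplete inputs the output stays NOT_DATA,
# so downstream logic only switches on true data transitions.
assert dual_rail_and(NOT_DATA, ONE) == NOT_DATA
```

Note how the NOT_DATA state is what replaces the clock: an output only asserts when its inputs have genuinely arrived, which is the glitch-free property the article describes.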

Given how much bandwidth they have available, they appear to have decided to move up the abstraction chain and build a fabric of 8-bit processors, along with a way of converting high-level design specifications into configurations of the fabric. Their website even talks about dynamic reconfiguration during chip operation. While this sounds similar to what an FPGA might offer, FPGAs still have the limitations of a clocked architecture and cannot compete with ASICs and SoCs on performance/area/power.

Wave’s fabric running at ~10 GHz could implement a programmable approach that is fast enough to be the finished chip. In their current offering they have licensed NoC technology from Sonics, and are including a CPU and other IP blocks to round out the functionality. By going with a commercial product that could go to volume, they have solved one of the big problems other novel clocking schemes have struggled with. If they can make the supporting software and the design specification process straightforward enough, they could succeed in an area where many others have struggled.

I think there might be an interesting back story on how their ‘brilliant’ idea was molded into something that can be marketed effectively to compete with the traditional solutions. After all, there have been a lot of clever ideas that did not quite pan out because there was not a good enough fit with the design needs of the market. One need not look any further than companies like Cyclos, Tabula or Intrinsity for good examples.


More than just mobile phones for Mali

by Don Dingee on 11-16-2015 at 12:00 pm

ARM TechCon 2015 was another tour de force for ARM and its ecosystem. Besides some of the developments in mobile, IoT, and security (more coming soon in the Epilogue of “Mobile Unleashed”), there were two topics that I found very educational and will cover in blogs this week. One was how the Mali family is powering more than just mobile phones.


A (R)evolution in Hardware-based Simulation Acceleration

by Tom Dillinger on 11-16-2015 at 9:45 am

The most exciting products in our industry are those that are both evolutionary and revolutionary. Cadence has just announced an update to their hardware simulation acceleration platform – Palladium Z1 – which continues the evolution of the unique capabilities of processor-based acceleration, plus a revolutionary approach to managing this resource across an increasingly diverse set of users and verification environments.

I recently spoke with Frank Schirrmeister, Senior Group Director, Product Management, for the Cadence System and Verification Group, who shared his excitement about the capabilities of this new platform.

Simulation Hardware Acceleration as a General-Purpose Resource

Simulation acceleration platforms fall into two categories – processor-based and FPGA-based architectures.

Cadence offers both types of systems – Palladium and Protium – and has been developing flows to enable verification teams to move workloads between the two as seamlessly as possible, to leverage the best of both offerings. (For a discussion of Palladium and Protium migration, please refer to this earlier Semiwiki article: “What’s the Difference Between Emulation and Prototyping?”)

Typically, these platforms are reserved for large, system-level verification workloads, where the throughput of software simulation tools is inadequate for the task. Model compilation and job execution are usually managed by a smaller verification team, who are experts in the nuances of:

  • partitioning the model across platform domains (e.g., FPGAs or acceleration hardware clusters)
  • managing multiple, concurrent project workloads running on the platform
  • integrating attached in-circuit hardware emulation interface modules
  • debugging methods specific to these platforms

The increasing complexity of SoCs and the IP integrated into these chip designs requires that simulation acceleration no longer be focused primarily on system verification, often used in a narrow interval of the overall project development schedule. Rather, IP and SoC verification plans also need to incorporate the benefits of accelerated simulation in various potential scenarios; more on “usage models” shortly.

Recognizing this need, Cadence approached the development of the Palladium Z1 platform to be more of a general-purpose resource, readily available and familiar to a broader cross-section of the verification team, across the full gamut of IP, Core, SoC, and system environments, as illustrated in the figure below.

First, some of the evolutionary improvements in the Palladium Z1 offering…

Palladium evolution
Leveraging technology scaling has enabled Palladium Z1 to improve specifications significantly over the previous Palladium XP-II platform:

  • up to 4X maximum model capacity
  • up to 2X performance in model build and resource allocation
  • up to 1.5X runtime execution throughput
  • 2X power density improvement (watts per million gates)

Extending multiple boxes to accommodate larger models has been enhanced to utilize optical fiber and Infiniband interfaces for the inter-system connectivity. (Existing Palladium-XP users will no doubt acknowledge that multi-system model domain management and cabling has been a pain.)

Cadence has continued to emphasize model portability between platforms, including the Cadence Incisive Enterprise Simulation software toolset. Utilizing a model compilation front-end that is aware of semantic differences between software simulation and acceleration (e.g., multi-state vs. two-state evaluation), verification environments and intermediate runtime results can be moved from IES to Palladium Z1 and back again, using a methodology that Cadence refers to as a “hotswap”.

Debug databases are readily off-loaded from the Palladium Z1, for off-line post-processing.

And, the ability to integrate in-circuit emulation with Palladium is supported. The Emulation Development Kits (EDK) are adapted to reflect the change in overall Palladium product strategy, to be discussed next.

Palladium “revolution”
As mentioned above, the need for accelerated simulation is reaching lower levels of IP, core, and SoC verification. To enable (and scale) for this growing requirement, Palladium Z1 has been re-designed to be a “data center” resource.

The unique product form factor of previous Palladium models has been replaced by a rack, with footprint, power, cooling, and cabling all consistent with data center “standards”.

Verification job queuing and dispatch on Palladium Z1 integrates readily with existing compute infrastructure (such as the LSF resource management tool).

The allocation of Palladium Z1 capacity to a verification task no longer requires an expert in the Palladium architecture to assign domains to each project. The job allocation to specific Palladium domain(s) is managed by the Cadence software. Indeed, the Z1 resources are dynamically allocated. Re-targeting of job resources is supported — i.e., re-location and re-shaping, without re-compilation — to enable a subsequent large dispatched job to have the (contiguous) resources necessary to execute with optimal performance. Frank indicated that the new platform supports up to 2304 concurrent jobs, with a 4M gate granularity.
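As a quick sanity check on those figures, the concurrency and granularity numbers together imply a total capacity in the billions of gates. This is an inference from the quoted numbers, not a figure from the interview:

```python
# Implied total capacity from the quoted allocation figures:
# up to 2304 concurrent jobs at a 4M-gate granularity.
jobs, gates_per_domain = 2304, 4_000_000
capacity = jobs * gates_per_domain
print(capacity)  # 9216000000, i.e. roughly 9.2 billion gates
```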


As illustrated above, the EDK hardware emulation attach support has also been adapted to the data center strategy, with modules physically accessible to the Palladium Z1 in a rack located within 30 meters. These attached resources are available to verification users across the corporate data center network.

The strategy of expanding simulation acceleration to a much larger set of users and verification tasks also requires addressing the multitude of “usage models” that a verification team encompasses. The figure below highlights the specific areas of focus that Cadence has maintained for the Palladium Z1.


Throughout successive generations of microelectronics technology, the complexity of building-block designs has grown, as have their functional verification requirements.

Yet, the transition from software simulation to an accelerated emulation or FPGA prototyping platform has typically required specific expertise, which has hindered the utilization of these platforms across the range of verification tasks. The data center focus of the new Palladium Z1 platform addresses this issue, and significantly reduces the “expertise gap”.

Verification project plans can now incorporate greater diversity in target tools and platforms for each level of design decomposition, to optimize throughput and testbench focus. The impact of adopting accelerated throughput across a wider set of models – especially, IP and subsystem designs – could indeed “revolutionize” how large design projects are verified.

More information on Palladium Z1 is available HERE.

-chipguy


Strengthening That Serving ARM

by Bernard Murphy on 11-15-2015 at 4:00 pm

Everyone is aware of ARM’s dominance in mobile devices and their likely dominance in IoT, but what about servers? ARM has been making a play for this area but conventional wisdom is that fortress Intel will protect its server market at all costs. You’ll hear that servers are not so much about compute power, they’re more about I/O and no-one knows I/O with all its backward compatibility requirements better than Intel. Is that really the case and is ARM charging a hill it cannot take? Or are changing market needs shifting to favor new entrants?

There was a very revealing presentation at ARM TechCon this year from the Linaro group which goes to the heart (or at least an important part) of this topic. Linaro is an open-source collaboration to drive compatibility for the Linux kernel and other software on ARM-based platforms; it formed the Linaro Enterprise Group (LEG) in 2012 to focus on compatibility for server platforms. That in turn enables server SoC development from AMD (Seattle platform), Cavium (ThunderX platform), Applied Micro (X-Gene platform) and HiSilicon among others, aside from internal server development at Amazon (for AWS), Facebook and Google.

LEG is steadily moving ARM up the ranks in server support. The first step was enablement – the stuff you have to do to even play in the server/cloud space. This is UEFI (the modern replacement for BIOS), ACPICA for configuration and power management, KVM for kernel virtualization infrastructure, XEN for hypervisor and OpenJDK for Java support. LEG have been busy developing patches, having these approved and getting them upstreamed to releases.

They then expanded focus to workload optimization, and this is where it gets really interesting. There are a lot of capabilities you need to run well on a Linux platform: LAMP, OpenStack, Docker, Ceph and more, but one area really points to a strategic focus: establishing ARM as a best-in-class citizen and officially supported platform in big data, specifically around Hadoop. Hadoop itself is a complex ecosystem, including Spark, Pig, Hive, Calcite, HBase and many other pieces. (For anyone who thinks software is easy compared to hardware, your head should be spinning by now.)

A central tool in this ecosystem is H2O, which provides an interactive interface to an underlying database view of the data. A lot of the statistical analysis and model-building starts here. LEG ran benchmarking to look at scaling on a cluster of 6 Seattle (AMD) 8-core nodes, using a 14 GB dataset (airport landing and departure times). They found that both file-parsing and model-building scale linearly with memory and that, surprise surprise, the speed of the external network is the limiting factor for performance. With 1Gb Ethernet, performance flattens out after 2 nodes, but with 10Gb Ethernet it remains linear with the number of nodes (at least up through 8 nodes).

This has an obvious consequence for on-chip server node integrations: they are valuable only insofar as you can achieve ~10Gb Ethernet-class communication between nodes. And that is for up to 8 nodes; it will flatten at some point beyond that, which will drive you to even higher speeds. It is unlikely you can do any of this with traditional fabrics. The people who are building large node-count SoCs (the Calxeda team now in Amazon Web Services, for example) almost certainly see their own proprietary fabrics as their primary technology advantage.
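A toy throughput model captures the scaling behavior LEG reported. The per-node bandwidth demand below is an assumed number chosen only to reproduce the qualitative shape (1GbE flattening after 2 nodes, 10GbE staying linear through 8), not LEG's measured data:

```python
# Toy model: cluster throughput scales with node count until the
# shared external network link saturates.
def cluster_throughput(nodes, per_node_gbps_demand, network_gbps):
    """Aggregate useful throughput, capped by the shared network link."""
    return min(nodes * per_node_gbps_demand, network_gbps)

# Suppose each node can usefully consume ~0.5 Gb/s of shuffled data:
for net in (1, 10):
    scaling = [cluster_throughput(n, 0.5, net) for n in range(1, 9)]
    print(f"{net} GbE:", scaling)
# 1 GbE flattens after 2 nodes; 10 GbE stays linear through 8 nodes.
```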

Given all this, does ARM have a shot? Big Data support is a new and potentially large market driving server growth (and therefore worth chasing), the playing field is leveled through need for innovation in fast, coherent on-chip fabrics (where traditional IO interface expertise doesn’t help) and the biggest customers don’t seem to have the patience to wait for a commercial solution and are building their own server chips and servers (which they pretty much have to do using ARM cores).

In short, as long as the ARM ecosystem can keep pace with big data needs, yes – they have a very real shot.

More articles by Bernard…


ARM Announces Cortex-A35, mbed OS 3.0 at ARM TechCon 2015

by Majeed Ahmad on 11-15-2015 at 12:00 pm

Mike Muller, Chief Technology Officer of ARM, announced the availability of the Cortex-A35 processor core for low- to mid-range smartphones at the opening keynote of ARM TechCon on November 10, 2015 in Santa Clara, California. According to Muller, 2.8 billion smartphones will ship this year, and more than one billion of them are heading to developing economies, mostly as entry-level devices.

It is the latest member of ARM's ultra-high-efficiency 64-bit processor family based on the ARMv8-A architecture. Cortex-A35 chips are targeted at mobile and embedded applications, and production silicon from ARM partners is expected to be available in late 2016. MediaTek is one of the first chipmakers adopting the Cortex-A35, which consumes nearly 33 percent less power per core and takes 25 percent less silicon area than the Cortex-A53.

Next, Muller announced the availability of mbed OS 3.0 and mbed Device Connector for the Internet of Things (IoT) service provisioning and authentication. He also explained how ARM is bringing TrustZone security hardware infrastructure to low-cost microcontrollers. ARM’s TrustZone technology, which has been part of the Cortex-A processors, will now be part of the Cortex-M processors targeted at embedded and IoT applications.


Mike Muller: ‘ARM server chips have arrived’

Muller also talked about ARM-based server chips. He said that ARM has been talking about server chips for two to three years, and the firm has been making inroads in the server market during this time. “We are at the cusp of transformation while shipping server chips for a number of manufacturers.”

Muller told the audience that arm.com is now running on ARM-based server chips. When asked what ARM brings to the table for server chips, Muller's answer was cost and diversity. He said that it is not going to be a single server chip; instead, there will be a variety of chips, for instance for high-performance computing (HPC), C-RAN, networking and storage.

ARM is celebrating the 25th anniversary of its founding at ARM TechCon 2015.


The Reason ARM Will Win IoT!

by Daniel Nenni on 11-15-2015 at 7:00 am

After spending the week in Silicon Valley at ARM TechCon and related meetings, there was one common thread among the presentations and conversations, and that was security. No matter the topic (mobile, consumer or industrial IoT, wearables, automotive), security always came up. The question I had was how companies will approach security organizationally.

Take ARM, for example: do they have one security group, or security specialists in each product group? Do they approach security horizontally, vertically, or diagonally with regard to the different market segments? The answer of course is “all of the above,” because security will be one of the most critical aspects of system-level design, and that starts with silicon, absolutely.

“We have the opportunity to get this right. Let’s take that opportunity to get the IoT right. As the IoT evolves, and as it gets more complex, it will be difficult to address security after the fact.” ARM CEO Simon Segars, ARM TechCon 2015 Keynote.

A funny side note: right in the middle of Simon's keynote, alarms went off and we were all evacuated from the building. As it turns out, not-so-smart sensors detected an over-toasted bagel during an IoT conference. Just a little bit of irony there… The sensors were probably connected to a PC, but I digress…

The most impressive slide I saw at ARM TechCon, and I saw plenty, is the ARM partner slide above. Let me tell you something about collaboration, it is a language in itself. The fabless semiconductor ecosystem was born and raised collaborating so it is our native tongue, one that we have spoken for more than 25 years. TSMC is a great collaboration example of course and the result is the largest and most successful foundry ecosystem the world will ever see. TSMC and partners start at process development and continue through to the finished chips that hide inside the electronic devices we rely on every day.

ARM, on the other hand, has a different challenge. ARM and partners also start at process development but they continue through to the system level (both silicon and software) that enable the electronic devices we use every day. There are hundreds of silicon, design support, software, training, and consortia partners in the ARM ecosystem. This is literally 25 years of collaboration experience synthesized down to one slide and no one in this industry speaks collaboration better than ARM.

When I attend other conferences and hear companies that are new to the fabless semiconductor ecosystem talk about collaboration it really is hard to keep a straight face. It’s like me trying to speak my kid’s social language. I do not now nor will I ever really know what it is like to be a millennial but sometimes I pretend to, just to amuse my adult children. Try this one, tell your kids that you and their Mother are going to “Netflix and chill tonight” and see what happens.

Back to security, they say it takes a village to raise a child? Well, it will take an ENTIRE ecosystem of EXPERIENCED collaborators to make a SECURE mobile, consumer or industrial IoT, wearable, automotive, etc… device, ABSOLUTELY!

Don’t forget to follow SemiWiki on LinkedIn HERE…Thank you for your support!


When Talking About IoT, Don’t Forget Memory

by Tom Simon on 11-13-2015 at 7:00 am

Memory is a big enough topic that it has its own conference, MemCon, which took place in October. While I was there covering the event for SemiWiki.com, I went to the TSMC talk on memory technologies for the IoT market, given by Tom Quan, Director of the Open Innovation Platform (OIP) at TSMC. IoT definitely has special needs for memory because of its requirements for low power, data persistence and security.

Tom Quan started the talk with an informative view of the IoT market. I have heard a lot of IoT overviews, but I listened with a keen ear to learn how TSMC views this market. Since 1991 there have been three big growth drivers for the semiconductor industry. IoT promises to be the fourth.

The first was personal computing, which saw 7X growth from 1991 to 2000. Next came mobile handsets, demonstrating 9X growth from 1997 to 2007. Last we have smart mobile computing, which lumps together mobile computing, internet, mobile communications, and sensing; this segment grew by 12X from 2007 to 2014. Clearly IoT is the next big thing and will likely continue this accelerating trend as the industry's next big growth driver. The sum of PCs, smartphones, tablets and IoT devices is expected to approach 20 billion units by 2018, compared to roughly 8 billion total today. The last year for which there is hard data is 2013, with ~5 billion units.


IoT is really an extension of mobile computing. Tom’s talk broke it down into the four umbrella categories of smart wearables, smart cars, smart home and smart city. It will consist of smart devices on smart things. Think of health sensors, gesture and proximity, chemical sensors, positional sensors and more. So where do today’s technologies stand as far as meeting the requirements of the IoT?

Probably every design metric will be stressed by IoT. Unit volumes will go from the single-digit billions of the PC era to hundreds of billions in the IoT era ahead. Operating times for devices will need to go from hours to years. This in turn demands that the hundreds of watts that PCs used transform into nanowatts for edge sensors and the like. Additionally, new technologies, materials and architectures will need to be developed.

To make these transitions, everything will need to be moved forward technologically. The slide below shows how this might work for the wearables segment.


A huge part of this will involve embedded memory. Right now SRAM is used primarily for volatile memory. There are a number of solutions for non-volatile memory (NVM), including several future technologies that are very promising. The two major axes for NVM are density (size) and endurance (re-writability). Small stores that do not need high endurance hold things like configuration bits, analog trim info, and calibration data. These work well with one-time programmable (OTP) approaches.

Even some boot code can be stored in OTP when it is configured to simulate re-writable memory. However, the number of re-writes will be limited as the non-reusable bit cells are consumed.
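A small sketch shows the idea of emulating re-writable storage on OTP cells; the class and its slot scheme are hypothetical, purely to illustrate why the number of re-writes is bounded:

```python
# Conceptual model of OTP emulating re-writable storage: each "update"
# burns fresh one-time-programmable cells, so the number of updates is
# bounded by the spare cells provisioned. Not a specific vendor scheme.
class OTPEmulatedStorage:
    def __init__(self, slots):
        self.slots = [None] * slots  # each slot programmable exactly once
        self.next_free = 0

    def write(self, value):
        if self.next_free >= len(self.slots):
            raise RuntimeError("OTP exhausted: no fresh cells left")
        self.slots[self.next_free] = value  # burn a fresh slot
        self.next_free += 1

    def read(self):
        # The most recently burned slot holds the current value.
        return self.slots[self.next_free - 1] if self.next_free else None

store = OTPEmulatedStorage(slots=3)
store.write("boot-v1")
store.write("boot-v2")
print(store.read())  # boot-v2; one re-write remains before exhaustion
```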

Flash and EEPROM are good for applications that require larger sizes and more re-write endurance, but they come with the penalty of requiring additional layers or special processes.

Tom suggested that magneto-resistive RAM (MRAM) is one technology that shows some promise. It harkens back to the old core memories, but is scaled to nanometer size. Of the two original approaches for MRAM only the spin-transfer torque (STT) technology has been proven to scale well. There are two competing approaches for this: In-Plane and Perpendicular. MRAM using STT is very fast and uses low power. So it looks very promising as a NVM replacement that can also be used for SRAM replacement.

Another promising area for future NVM solutions is resistive RAM, or RRAM, which uses the memristor effect in solid-state dielectrics. There are several flavors of this technology being researched, but RRAM is not as far along commercially as MRAM. However, this is an active area of research, and the frequently used high-k dielectric HfO2 has been found to work as RRAM.

Advances in NVM will have a huge impact on IoT. Power savings from having readily available persistent storage will open up new application areas. Think of not having to save system RAM when entering sleep. To further save power, TSMC continues to add ultra-low-power processes to its existing process nodes. IoT will be driven in large part by technologies that allow edge-node devices to be power-sipping.

For more info on TSMC’s Open Innovation Platform look here.


A Novel Microprocessor Fighting Dark Silicon, Energy Efficiency, Code density and Silicon area

by Roger Sundman on 11-12-2015 at 4:00 pm

Processor cores used in computers and smartphones have become impaired by their own complexity and can't fully utilize future CMOS generations to increase their efficiency. Due to the continued increase in the density and speed of transistors, these big cores produce too much heat per mm² if they try to follow Moore's law for both transistor count and frequency.

Every transistor switching event dissipates energy as heat. Both the size and the delay of transistors shrink with every generation, and if these gains are exploited, i.e. density or clock frequency is increased, then the heat density increases. This has become a problem with the latest generations. If both maximum density and maximum frequency were utilized over the entire chip, the heat would destroy it. For a processor core it is of course desirable to utilize the maximum performance the technology allows, but this can be done on only a few percent of the chip area, and that percentage shrinks with every generation. The remainder has to be “dark silicon”, meaning that it has to have much lower activity.
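A back-of-the-envelope model makes the dark-silicon arithmetic concrete. All of the densities, per-gate powers, and the cooling budget below are assumed values for illustration only:

```python
# Dark-silicon sketch: dynamic power per gate ~ C * V^2 * f, so if a
# new node packs more gates per mm^2 without proportionally lowering
# per-gate switching power, the fraction of the chip that can run at
# full speed shrinks. All numbers here are illustrative assumptions.
def active_fraction(gates_per_mm2, power_per_gate_w, budget_w_per_mm2):
    """Fraction of gates that may be fully active within the heat budget."""
    full_power = gates_per_mm2 * power_per_gate_w
    return min(1.0, budget_w_per_mm2 / full_power)

budget = 0.5  # W/mm^2 an air-cooled package can sustain (assumed)
# One node: 1e6 gates/mm^2 at 1e-6 W each when switching at full rate.
print(active_fraction(1e6, 1e-6, budget))         # 0.5: half can be "lit"
# Next node: 2x density, per-gate power drops only 1.4x (post-Dennard):
print(active_fraction(2e6, 1e-6 / 1.4, budget))   # 0.35: more goes dark
```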

A radically new processor architecture, one that reduces the overhead of high-frequency switching, is needed to fully utilize the potential of future CMOS technology. Optimizing for energy efficiency, throughput, cost, code density, adaptability and scalability is a big challenge for the computer architect.

Imsys’ processor has a different, yet well-proven, fundamental design that does not have the above-mentioned limitation and is therefore suited to the new situation in semiconductor technology. The core itself consists mainly of memory and has rich functionality, which enables it to save energy through efficient use of its small, flexible arithmetic logic.

Almost the entire chip area, 97%, is memory. Energy efficiency has, for the first time and for the foreseeable future, become the most important characteristic of processors, big and small, for reasons beyond the problem described above as well.

A proof-of-concept chip, prototyping a tile of a many-core system, has been produced and verified. 97% of its transistors are used in memory blocks. It includes Imsys’ patented dual-core solution, in which the pair of cores occupies 40% less space and consumes 25% less power than two single cores while doubling performance. The chip is manufactured by UMC in the 65 nm LL process and draws 18 mA at 1.2 V and 350 MHz with both cores active. The cores share memories and a five-port network-on-chip (NoC) grid router. Each core has local memory capacity sufficient for its immediate needs, reducing the load on the on-chip grid network. Memory management is handled by microcode, memory is interwoven with the processor, and no cache or memory controller is needed.

Simply placing 128 copies of this verified tile next to each other yields 256 cores, 42 MByte ROM and 25 MByte RAM on 320 mm[SUP]2[/SUP] of silicon, consuming 2.8 W with all cores running at 350 MHz. The design also scales down: in 14 nm technology, an area of 238 mm[SUP]2[/SUP] could hold 4096 cores, 672 MByte ROM and 400 MByte RAM and consume 31 W at 1.6 GHz.
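The quoted figures can be cross-checked from the single-tile measurement: a dual-core tile drawing 18 mA at 1.2 V dissipates about 21.6 mW, and 128 replicated tiles come out at roughly the stated 2.8 W. A quick sanity check (using only the numbers quoted in the article):

```python
# Sanity-check of the tile-replication arithmetic quoted above.
TILES = 128

tile_power_w = 0.018 * 1.2     # P = I * V = 21.6 mW per dual-core tile
total_w = TILES * tile_power_w # ~2.76 W, matching the quoted ~2.8 W

cores    = TILES * 2           # dual-core tile -> 256 cores
area_mm2 = TILES * 2.5         # 2.5 mm^2 per tile -> 320 mm^2

print(f"{cores} cores, {area_mm2:.0f} mm^2, {total_w:.2f} W")
```

The per-tile area of 2.5 mm[SUP]2[/SUP] is simply the quoted 320 mm[SUP]2[/SUP] divided by 128; the power figure is the independent cross-check.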

The second core needs only half the power of the first. Each core has almost constant power consumption when active, and the heat it generates spreads across the adjacent memory areas. This allows a higher total power dissipation and simplifies cooling-system design and power budgeting.

Microcode, as opposed to logic gates, is compact and energy efficient. Imsys uses extensive microprogramming to implement a rich set of instructions, thereby reducing the number of cycles needed without energy-inefficient speculative activity and duplicated hardware logic. Each core has two instruction sets, including native Java bytecode execution.

Microcode is also used for computationally intensive standard routines, such as crypto algorithms, which would otherwise be assembly-coded library routines or even dedicated hardware blocks. Optimizing CPU-intensive tasks in microcode can reduce the execution time and energy consumption of hot spots by more than an order of magnitude compared to C code.

The rich instruction set optimized for the compiler reduces the memory needed for software and, just like the microcoded special algorithms, it reduces the number of clock cycles needed for execution. The reduced requirements for memory bandwidth and the flexible microprogram control allow the compact arithmetic unit to do useful work all the time.

This platform has a certified JVM and uses an RTOS kernel certified to the ISO 26262 functional-safety standard for automotive applications. The development tools will be enhanced with the support enabled by the LLVM infrastructure. A new instruction set optimized for an LLVM backend has been developed and is being implemented in the coming hardware generation.

More information HERE.

Don’t forget to follow SemiWiki on LinkedIn HERE…Thank you for your support!


Merger Mania: The Future of the Semiconductor Industry

Merger Mania: The Future of the Semiconductor Industry
by Pawan Fangaria on 11-12-2015 at 12:00 pm

In a semiconductor industry that appears to be maturing, we are also seeing technologies unveiling new transistor structures, memories, processors, and new ways of designing ICs and electronic systems. The present decade appears to be at the cusp of a new transformation in the semiconductor industry. Amid a slew of mergers and acquisitions this year, multiple questions arise about how these will affect or shape the industry's future.

Dr. Walden C. Rhines, Chairman and CEO of Mentor Graphics Corporation, presented a keynote at the company's 11[SUP]th[/SUP] User2User India Conference. In his keynote, titled “Merger Mania”, Dr. Rhines emphasized the major change in the structure of the semiconductor industry that will affect everything from product definition to design and manufacturing. I had a great opportunity to talk with Dr. Rhines and get his perspective on some of the key issues I expect to arise as we transition, along with these mergers, into the new phase of the semiconductor industry. Here is the conversation –

Q: Dr. Rhines, the large number of mergers this year reminds me of 2008, when the global financial crisis took place. Although the main cause of that crisis was subprime housing mortgages, there was also significant consolidation in the semiconductor and other industries just before 2008. How do you interpret the current consolidation? Should we expect some sort of financial crisis again in the coming years?

A: I hope there will not be a crisis like 2008; at that time, housing loans taken out by a large number of people could not be paid back. The situation today is different. There is liquidity in the market, credit is available at cheap interest rates, and we have a free-market economy. This favours debt, and hence acquisitions and mergers are practical because they provide better economies of scale. At the same time, the government tries to bring equilibrium to the economy whenever there is distortion in the market. So when interest rates start increasing, merger activity will slow down.

Q: In the current consolidation we are seeing multiple elements. Cost and capital conservation are there to an extent, but there is also innovation and expansion in particular areas such as IoT and its related segments. Do you see this consolidation bringing a bigger business opportunity rather than just a contraction in the number of companies?

A: I see opportunities in several ways. Corporate overhead will be reduced and more money will be available for R&D and innovation. This is a periodic cycle; when an acquisition happens, there is a reduction in spending, but new businesses expand in due course. On average, 14% of semiconductor revenue goes into R&D. So there may be more expansion in the coming years; of course, the laws of economics cannot be violated.

Q: How do you see the fabless and IP model of business being impacted amid this consolidation?

A: The percentage of IC revenue from foundry wafers continues to increase; it is approaching 40%. The silicon foundries have enabled competitive costs for fabless companies, thus providing economies of scale for IC production. So there may be a few IDMs acquiring fabless companies, but overall the fabless business will continue to grow. Among IP, standard interfaces like USB or PCIe are commoditized; however, the value-added IP business will continue to grow.

Q: In the near future we can foresee major changes in transistor technologies, memories, processing power, IC manufacturing, etc., all at nominal cost to the end user. How is the semiconductor industry preparing to deliver high value at lower cost?

A: The cost of a transistor has decreased over the last 50 years through reductions in feature size. Wafer cost reduction is now reaching its limits, but transistor cost can still go down through increased volume and other new cost-reducing technologies. There can be multiple dies in the same package. There is a learning curve associated with every new technology, so in due course we will see new technologies evolve and further growth in the semiconductor industry.

Q: Technology is the driver for EDA tools, and EDA tools are enablers for designs. Technology also brings new design paradigms. How do you see EDA tools evolving amid new demands from the technology and design arenas?

A: EDA is an exciting field. Today it caters to multiple prominent process nodes: 28nm, 20nm, 14nm, 10nm, and now 7nm is close to stabilizing. Modern EDA tools factor multiple aspects into the design phase of ICs, e.g. manufacturing, cost, yield, and so on. The correlation of test failures with layout is evolving into a significant business. Cost reduction has to be achieved through multiple means, and systematic defects have to be identified automatically through data mining and analysis.

About five years ago, emulation was applied only to graphics chips, but today it is used to verify any big digital chip. The emulation-driven verification business has doubled in the last three years. Verification incurs the largest expense in a design, and its cost has to be reduced by multiple means. In the future, there will be many changes in EDA tools to enable more efficient design and verification.

Q: China is investing huge amounts of money to develop a semiconductor industry in that region; M&A initiated by Chinese institutions is also a part of that. How do you see the worldwide semiconductor industry from a regional perspective?

A: Yes, in China the government, along with private equity (PE) corporations, has announced a five-year plan to invest in a big way in semiconductors (foundry as well as fabless), which will involve development in China as well as acquisitions in foreign countries. The government will invest ~$20 billion and PE firms will invest ~$100 billion. We have already seen the acquisitions of Spreadtrum, STATS ChipPAC, ISSI, OmniVision, and Montage Technology. An acquisition of Micron has been proposed for $23 billion. So there will be significant development in the semiconductor industry from China, stimulated by PE investment there.

Q: After roughly 35 prime years of semiconductors, we are seeing the second generation just entering this fascinating industry, which infuses semiconductors into most other industries. What is your message to younger-generation engineers, who may see a much-transformed semiconductor industry compared to the one we saw?

A: The best is yet to come. So far we have seen many styles of chips in the form of ICs, and now SoCs. Going forward, system design will be mainstream. Design and verification are still in their infancy; although large designs are being developed, verification is still done the same way we did it 30-40 years ago. We have to think in terms of system design from the beginning: define, implement and verify at that level.

Q: This is Mentor’s 11[SUP]th[/SUP] U2U conference, and similar events are held by other EDA companies. Considering these initial years of this century as pivotal for the next-generation semiconductor industry, what is the most important practice in the design community that you think should be continued or changed?

A: The semiconductor industry is a learning industry; it needs a lot of collaboration and interaction. Today there are many ways of sharing information across the world: the internet, webinars, tele-conferences, and so on. However, design engineers still require direct physical interaction. So focused User2User and other technical conferences are the best forums for sharing and learning. There is no end to these conferences; they will continue. Attendance at U2U conferences keeps increasing, and we hold them in all user-dominated regions across the world.

It was a very inspiring discussion with Dr. Rhines. We eagerly look forward to the best yet to come in the semiconductor industry!

Pawan Kumar Fangaria
Founder & President at www.fangarias.com