Dan Ganousis posted in our SemiWiki forums about a newer technique to lower the power consumed by GHz clocks on SOC designs and asked if I was interested in learning more, so we met today via WebEx. Dan is with a company called Cyclos Semiconductor, co-founded in 2006 by Marios Papaefthymiou (President) and Alexander Ishii (VP of Engineering).
The Power Challenge
A modern SOC can consume up to 30% of its power just on the clock buffers, which makes them a big contributor to overall power. Other EDA vendors are focused on reducing power for the areas marked below with red arrows:
The promise of Cyclos technology is to reduce the power consumption on the clock buffers:
Many chips today use the familiar clock tree approach to distribute a clock signal across the chip.
Another clock distribution approach is called the Mesh where a metal layer ties all the distributed clock signals together to form a low resistance Mesh after the initial clock driver cells:
The clock mesh gives you very low skew; however, its capacitance requires more energy to drive, which increases power consumption. We like the low skew but we don’t like the higher power.
In EE theory classes we all learned about oscillators built out of LC circuits:
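As a quick refresher, an ideal LC tank resonates at f0 = 1 / (2π√(LC)). The sketch below applies that textbook formula to a large capacitive load such as a clock mesh. The 1 GHz target and the 100 pF load are illustrative assumptions, not Cyclos figures, and the function names are made up for this example.

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Natural frequency of an ideal (lossless) LC tank: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def inductance_for_target_h(target_hz, capacitance_f):
    """Inductance that makes the given capacitance resonate at target_hz."""
    return 1.0 / ((2.0 * math.pi * target_hz) ** 2 * capacitance_f)


# Assumed, illustrative inputs (not from Cyclos).
clock_hz = 1e9               # 1 GHz target clock
load_capacitance_f = 100e-12  # 100 pF of capacitive load

total_inductance_h = inductance_for_target_h(clock_hz, load_capacitance_f)
check_hz = resonant_frequency_hz(total_inductance_h, load_capacitance_f)

print(f"Total inductance needed: {total_inductance_h * 1e9:.3f} nH")
print(f"Resonance check: {check_hz / 1e9:.3f} GHz")
```

With these assumed inputs the required total inductance comes out to roughly a quarter of a nanohenry. A larger load or a slower clock would call for a different value, since the inductance scales with 1/(f²C).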
What if we could combine the benefits of the clock mesh topology with the resonance of an oscillator to reduce the energy required to drive a clock network?
Hmm, that idea could work in theory:
Benefits of such an approach:
- Low clock skews because of the low-resistance mesh
- Metal mesh less impacted by On Chip Variation (OCV) and Process/Voltage/Temperature (PVT) variations
- Timing of the post-gater trees is isolated, so ECOs are easier in the design cycle
- Lower power consumed by the clock distribution network
Challenges of this approach:
- EDA tools not commercially developed yet
- Design flow not well understood or built
The LC circuit created by the inductors basically helps to recycle clock power thus lowering consumption:
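A first-order way to see the recycling: a conventional buffer spends C·V² of energy per cycle charging and discharging the clock load, while an LC tank only has to replace what its series resistance burns. For a sinusoidal full-swing clock, that works out to about π/(4Q) of the conventional energy, where Q is the tank's quality factor. This is a textbook idealization rather than Cyclos' own model, and the capacitance, voltage and Q values below are assumptions picked for illustration.

```python
import math


def conventional_clock_power_w(capacitance_f, vdd_v, freq_hz):
    """Power to swing a capacitance rail to rail once per cycle: C * V^2 * f."""
    return capacitance_f * vdd_v ** 2 * freq_hz


def resonant_power_ratio(q_factor):
    """Idealized resonant / conventional power ratio for a sinusoidal clock.

    With the clock swinging as V/2 + (V/2)*sin(w*t), the current amplitude is
    C*(V/2)*w, so the loss in series resistance R is R*(C*V*w)^2 / 8.
    Dividing by C*V^2*f gives pi*w*R*C/4 = pi / (4*Q), using Q = 1/(w*R*C).
    """
    return math.pi / (4.0 * q_factor)


# Assumed, illustrative inputs (not from Cyclos).
capacitance_f = 100e-12
vdd_v = 1.0
freq_hz = 1e9

baseline_w = conventional_clock_power_w(capacitance_f, vdd_v, freq_hz)
print(f"Conventional drive: {baseline_w * 1e3:.1f} mW")

for q_factor in (2, 3, 5, 10):
    ratio = resonant_power_ratio(q_factor)
    resonant_mw = baseline_w * ratio * 1e3
    print(f"Q = {q_factor:>2}: resonant ~ {resonant_mw:.1f} mW "
          f"({(1.0 - ratio) * 100:.0f}% lower)")
```

The model ignores driver losses and any non-sinusoidal drive, so treat the percentages as intuition for why a modest Q already helps, not as a prediction for a real chip.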
OK, lower power consumption for my clock network is always a good thing but are there more benefits? Yes, you even have reduced jitter on your clock edges:
The theory of using a resonant mesh for clocks is appealing, but who is really using this in production chips?
I did a quick Google search and found hundreds of articles and patents on the subject, so it looks like the leap from theory to practice has been made. A few more side benefits of resonant mesh clock designs are lower RF noise than clock trees, and electromigration reduction from bidirectional current flow in the clock net.
The Cyclos Semi approach has been used with an ARM926 chip where first silicon showed a 25% to 35% reduction in total power. Early customers have used this approach on several other chips, and one DSP chip showed 75% lower clock power while running at GHz speeds.
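To translate a clock-power reduction into a whole-chip number, multiply it by the clock network's share of total power. The sketch below uses the "up to 30%" share quoted earlier and the 75% DSP figure purely as example inputs. The actual clock share of either chip was not given, and the function name is made up for this example.

```python
def total_power_saving(clock_share, clock_power_reduction):
    """Fraction of total chip power saved when only the clock network improves.

    Both arguments are fractions between 0 and 1.
    """
    if not (0.0 <= clock_share <= 1.0 and 0.0 <= clock_power_reduction <= 1.0):
        raise ValueError("both arguments must be fractions between 0 and 1")
    return clock_share * clock_power_reduction


# Example inputs only: the real clock share of the DSP chip is an assumption.
clock_share = 0.30            # clock network's share of total chip power
clock_power_reduction = 0.75  # reduction in clock power

saving = total_power_saving(clock_share, clock_power_reduction)
print(f"Total power saved: {saving * 100:.1f}%")
```

The same function shows how sensitive the chip-level result is to the clock share: the larger the slice of power the clock network takes, the more a resonant mesh can save overall.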
The theory matches the silicon results, so how do we get inductors onto an IC design? Here’s a mesh with distributed inductors built in a top-level of metal using standard processing steps. You don’t want circuits underneath these on-chip inductors, so that will typically increase your silicon area by up to 5%:
There are at least three clock distribution choices: clock tree, clock mesh and resonant mesh.
The engineers at Cyclos are promoting the resonant mesh for GHz designs as a way to reduce power and tighten up the clock specs.
EDA Tool Flow for Design
Initially, getting this resonant clock mesh (RCM) onto your chip requires some manual work, so you could hire Cyclos as a consulting company or wait until their tool flow is released in 2012. The idea is to create a compiler that automates the layout implementation.
Q: Do I need special libraries or process?
A: No, standard libraries and process.
Q: What is the area impact?
A: Up to 5% using top-level metal layer for the inductors.
Q: Can I have dynamic frequency scaling?
A: Yes, either incremental or deep scaling will work.
Q: Do I need new tools?
A: Use your existing tools plus RCM Compiler.
Q: Can I use voltage islands and voltage scaling?
A: Yes, both are supported.
Q: Does my testing process change?
A: No, same wafer and package testing as before.
Comparing Clock Methods
Here’s how the three clock distribution approaches compare:
To be fair, I would add an extra row called Area: the RCM approach does add extra area, which counts as Bad, so RCM does cost you something.
The Money Part
All new services, IP and EDA tools cost you something, so here’s the investment with Cyclos:
You can get an RCM implementation either as a $500K design service or by waiting until the RCM Compiler tool is ready around the DAC time frame. An IP license will run you $1M per process node, and finally there’s a usage fee.
By comparison, the digital libraries from Artisan/ARM carried only usage fees, called royalties.
People at Cyclos
This startup has 8 people now and could grow to 20 by the end of next year. Marios has a PhD from MIT and is a tenured full professor of EECS at the U of Michigan. Alexander Ishii also has a PhD from MIT and has experience in EDA flows, ASIC and custom VLSI design. Dan Ganousis handles business development and has been in EDA and IP for years: Mentor, Viewlogic, Arithmatica, AccelChip, Innoveda and VeriBest.
Starting out with consulting and then moving into an EDA product and IP model could just work for Cyclos. Stay tuned for the ISSCC conference in February for an announcement from a big company using the RCM technology. I would guess that Synopsys (Magma), Cadence, Mentor, an IP company, an FPGA vendor or a foundry could acquire Cyclos and keep this new technology all to themselves if the price is right.