Is it worth digging deeper into asynchronous design?

Torq

New member
Hi! I have been working as a hardware developer (both front-end and back-end digital flow) for many years. My hobby is so-called self-timed (asynchronous, QDI) logic. I have spent the last three years learning how to use Cadence and Synopsys synthesis and back-end tools for async implementations, and I have made some progress in this field. But I doubt whether it is worth continuing my research, because of some disadvantages of asynchronous logic and because I cannot make test chips myself (the cost is too high even for 100+ nm ASICs). It has become too expensive for just a hobby. So I have reached a crossroads: either abandon this hobby entirely, or make it my primary profession.

So, here is my question: what do you think, is there any future for async? Do any big companies do R&D in the async field?
 
Also check out SyNAPSE from IBM. It seems unlikely that whole industries built around synchronous logic design will be replaced, but in emerging areas like neuromorphic computing it is possible that non-traditional design methods may have special advantages to offer (low power especially).
 
Asynchronous IC design is used quite a bit in SRAM and DRAM block designs, but not much at all in digital logic designs because of the challenges of running Static Timing Analysis tools which expect at least one clock and synchronous behavior.

Sure, there's a future in asynchronous IC design, especially for security and ultra low-power designs.
 
Thanks for the answers!

Bernard, even $7k is too expensive for just a hobby. The idea of neuro computing is very interesting, but the last decades have shown no proof of the usefulness of brain-like algorithms and their implementations. I have heard about SyNAPSE, SpiNNaker, and other neural networks in silicon. In my opinion they are nothing more than pure scientific research and have had no impact on industry. But thank you anyway.

Daniel, I fully agree that STA is a cornerstone of digital IC design because of the tools. But I found a solution for applying STA to async design: in short, I found a way to represent asynchronous circuits in a synchronous view, so STA is not a problem for async design at all.

The only company I know of doing async R&D nowadays is TiempoSecure, but it seems they have stalled, since there has been no news on their project for about two years.
 
Intel has acquired or invested in several companies whose products were designed with asynchronous technology: Achronix, Fulcrum, Timeless Design Automation, etc.
Using "synchronous" EDA tools to design asynchronous circuits was proven more than a decade ago, with several startups trying to sell the concept: Nanochronous Logic, Elastix, etc.
 
pmat,
I agree, there were many startups that tried to use async logic. But where are they now?
Big companies like Intel sometimes take over small startups just to obtain new technologies, not to use them. I have heard about Intel's "elastic" circuits, an asynchronous-like spin-off from classical synchronous design, but I have never heard of elastic circuits being used in practice in Intel's processors. The same goes for the Philips spin-off Handshake Solutions (HS). They made the first commercially successful asynchronous microcontrollers, based on the ARM996 and i8051 architectures. And where is HS now? I can only conclude that all async startups disappear for unknown reasons, and I am very curious about that reason.
 
Check out eSilicon. They are advertising ~$7k/mm2 in (I think) a 45nm process. Not cheap, but not outrageous.

And if you go to older nodes (0.18um is still quite common), you can get 20-50 chips for a few k$. But for hobby projects I think you will in the end need crowdfunding (like Kickstarter), and thus have to convince other people you're so great that they should help pay for your chips.
The strange thing is that he seems to have money for the EDA tools, though...
Of course, my fully subjective opinion is that europractice/imec will give you better service :); there you can find the schedule and price list.
 
Torq, I participated in one of the several mid-00s asynchronous startups. Asynchronous technology has been proven in the industry with test chips and extensive benchmarking. The problem has always been full automation of the solution (my opinion). Unfortunately, asynchronous design has always been a kind of black magic. Even the main asynchronous design conference is like a closed club. I challenge you to go through the papers from that conference: the same few labs around the world have been doing the research for the last few decades.
Regarding HS, at some point it seemed they were going to take off, and what they did was already remarkable. Personally, I always found their methodology hard to use (for the average digital designer): you had to learn a new HDL and think in a totally different way.
 
Asynchronous design works better for high-variability silicon: below 45nm the device-to-device spread is higher than the wafer-to-wafer spread, and it gets worse as things shrink.

It helps if you have an asynchronous design methodology (instead of RTL), but the EDA companies haven't delivered one. I made an attempt - http://parallel.cc

You also need simulation languages like Verilog-AMS that handle analog behavior better than plain old Verilog, so that you can verify the design works, but that hasn't been delivered by the EDA companies either - they might have the simulator, but they don't supply useful cell models. However, they haven't delivered a working verification flow for FD-SOI/DVFS either (which has similar requirements), so it might turn up if big IC design companies apply some pressure to get that working.

You can probably use Xyce for simulation, but I'm not sure where you will get the modeling support.

You might find support in the AI/deep-learning community since asynchronous techniques are ideal for implementing complex neural networks for the mobile market.
 
Staf,
My boss allows me to use our tools for my experiments. But I am afraid you are right: making a chip at a fab is a bit different from just using the tools for experiments. It will be complicated to convince our CEO to let me produce a test chip under our licenses, but not impossible.

pmat,
Thank you for the advice. I knew about ASYNC, and I have read many of its papers.
Concerning my approach to async design: I studied all the popular approaches. In my opinion, the BD approach (as in the HS or MIT chips) is truly a bit complicated for a regular designer, and the NCL approach is proprietary, heavyweight, and poorly synthesizable. So I chose the classical self-timed approach (also known as NCL_X) with completion detectors. And finally, I succeeded in adapting these schemes to the standard synchronous flow (RTL to GDS). The flow consists of the traditional steps, synthesis then place and route, with one exception: it requires an intermediate translation of the synchronous netlist into dual-rail after synthesis. The netlist translation is done by a Perl script using the Verilog-Perl library. The clock network is replaced by a GALA sub-circuit based on Muller C-elements (similar to Sutherland's control circuit). The disadvantages of the dual-rail four-phase approach are well known: low speed (due to the completion-detection sub-schemes) and high power consumption (due to the high switching activity of the signals). But it is fully delay-insensitive (with some restrictions on P&R). The question is where such schemes can be applied: slow and power-hungry, but very robust to delay variations. IoT, probably. I am not sure.
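For readers who have not met the building blocks mentioned above, here is a minimal behavioral Verilog sketch of a two-input Muller C-element. It is a generic textbook-style model for simulation, not the poster's actual library cell; real implementations usually use a dedicated standard cell or an explicit gate-level structure to avoid hazards.

// Two-input Muller C-element, behavioral model (illustrative only).
// The output follows the inputs when they agree and holds its value
// when they disagree.
module c_element (
    input  wire a,
    input  wire b,
    input  wire rst_n,   // assumed active-low asynchronous reset
    output reg  q
);
    always @(a or b or rst_n) begin
        if (!rst_n)
            q <= 1'b0;   // force a known initial state
        else if (a == b)
            q <= a;      // both inputs agree: follow them
        // otherwise: hold the previous value (latch-like behavior)
    end
endmodule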

simguru,
Thank you for the ideas.
I also use AMS, but it is too hard to model circuits larger than ~10k transistors and nearly impossible to model circuits with more than 100k cells. So I prefer Verilog netlist simulation with SDF. I had no problems with models, because my approach uses only RS latches and C-elements, which have quite simple Liberty descriptions.
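To make the SDF-based simulation concrete, here is a hypothetical testbench fragment showing how a back-annotated gate-level run is typically set up; the module, instance, and SDF file names are invented for illustration (they are not from the poster's flow), and it reuses the c_element module sketched earlier.

// Hypothetical SDF back-annotation example (names are made up).
`timescale 1ns/1ps
module tb_c_element;
    reg  a, b, rst_n;
    wire q;

    c_element dut (.a(a), .b(b), .rst_n(rst_n), .q(q));

    initial begin
        // Annotate post-layout delays only when +sdf is passed on the
        // simulator command line and the SDF file exists.
        if ($test$plusargs("sdf"))
            $sdf_annotate("c_element_postroute.sdf", dut);
        rst_n = 0; a = 0; b = 0;
        #5  rst_n = 1;
        #5  a = 1;           // q should hold 0 (inputs disagree)
        #5  b = 1;           // q should rise (inputs agree at 1)
        #5  a = 0;           // q should hold 1
        #5  b = 0;           // q should fall (inputs agree at 0)
        #10 $finish;
    end
endmodule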
 
Staf,
...

simguru,
Thank you for the ideas.
I also use AMS, but it is too hard to model circuits larger than ~10k transistors and nearly impossible to model circuits with more than 100k cells. So I prefer Verilog netlist simulation with SDF. I had no problems with models, because my approach uses only RS latches and C-elements, which have quite simple Liberty descriptions.

I have a book on asynchronous design, so I know what a C-element looks like :D

I mentioned Xyce because it's a parallel-processing simulator designed for higher capacity. You don't really want to simulate at the transistor level, which is why it's good to be able to use behavioral analog models (which don't need solver support). Digging back into the neural-network (NN) stuff, there are ways to auto-generate block-level behavioral models using NNs that will probably do the job nicely (for FDSOI/DVFS too). IoT/AMS is also a driver for this.

You could talk to eFabless about Xyce and making experimental chips.
 
Staf,
My boss allows me to use our tools for my experiments. But I am afraid you are right: making a chip at a fab is a bit different from just using the tools for experiments. It will be complicated to convince our CEO to let me produce a test chip under our licenses, but not impossible.

Good to know you are aware. The microelectronics industry is a relatively small one, and from experience we know EDA vendors will ask questions about products whose origin or flow is not clear.

The disadvantages of the dual-rail four-phase approach are well known: low speed (due to the completion-detection sub-schemes) and high power consumption (due to the high switching activity of the signals). But it is fully delay-insensitive (with some restrictions on P&R). The question is where such schemes can be applied: slow and power-hungry, but very robust to delay variations. IoT, probably. I am not sure.

I think you answered your own question about why asynchronous is not used more: the power in PPA is getting more and more important.
Even for IoT sensors, I think most of them want to process the data coming from the sensor before sending it to the Internet. That involves some more intensive datapath logic, and you want it with as low a power consumption as possible. Sometimes asynchronous is sold as avoiding the clock-tree power consumption, but with your technique that gain seems to be undone by the higher datapath power consumption. Additionally, area is directly related to cost per chip if the chip is not pad-limited. Cost, again, is very important in the low-margin IoT world; earning 30 versus 40 cents on a chip is a difference of 25%.
Also, I think the cases where delay insensitivity is the real requirement are rare; most of the time one wants to know how fast something will perform.
 
simguru, why are you saying that EDA companies have not introduced asynchronous design methodologies instead of RTL? Please refer to "Communicating Process Architectures 2005: Handshake Technology". HS had a complete asynchronous methodology for creating asynchronous circuits.
 
torq, NCL_X is not fully delay-insensitive. How do you cope with signal orphans between the reset and set phases? Achieving full delay insensitivity is far more complicated.

I agree with the rest of your comments about the cost of dual-rail circuits.
 
Staf,
This is a tricky question about consumption, because asynchronous circuits operate easily below the threshold voltage, where synchronous circuits are inoperable. But you are right about the cost: dual-rail encoding takes at least twice as much area (3-5x actually, depending on the completion-detection sub-scheme). I had never thought about that, thank you!

pmat,
There are two major approaches to completion detection in dual-rail: you may put detection on the element outputs (QDI), or you may put it on every input (DI). The second approach is a lot harder, but it eliminates the problem of wire orphans for good. The first approach may be considered DI too, with some restrictions on P&R: you have to take care of the so-called isochronous forks.
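To illustrate the output-side (QDI) style just mentioned, here is a generic Verilog sketch of a completion detector for a dual-rail, four-phase datapath; the width and the reduction structure are illustrative assumptions, not the poster's actual circuit.

// Generic output-side completion detector for dual-rail data
// (illustrative only).
module completion_detector #(
    parameter N = 8                 // number of dual-rail bits
) (
    input  wire [N-1:0] d_t,        // "true" rails
    input  wire [N-1:0] d_f,        // "false" rails
    output wire         done
);
    // Per-bit validity: one rail high means a valid codeword,
    // both rails low means the spacer (both high is illegal).
    wire [N-1:0] bit_valid = d_t | d_f;

    // For a strictly delay-insensitive circuit this reduction would be
    // a tree of C-elements, so that 'done' falls only after *every* bit
    // has returned to the spacer; the plain AND below models only the
    // set phase correctly and is kept for brevity.
    assign done = &bit_valid;
endmodule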

Ok, thank you guys! It seems it is better for me to abandon this.
 
simguru, why are you saying that EDA companies have not introduced asynchronous design methodologies instead of RTL? Please refer to "Communicating Process Architectures 2005: Handshake Technology". HS had a complete asynchronous methodology for creating asynchronous circuits.

Simulating asynchronous circuits requires continuous time simulation (and a PWL modeling style) that Verilog doesn't support.

Verilog-AMS can do it, but the available implementations are not designed for digital work.

Silistix tried to do async NoC synthesis, but the routing tools are tuned for synchronous design and do the wrong thing.
 
Generation of the timing delays that allow nets to resolve has been a major problem, but Intel/Altera Stratix 10 has put "registers everywhere" in the fabric, and the one to use is chosen based on timing during P&R. The next step would be to generate a wiring delay equal to the net delay to clock the registers, rather than placing the register based on path delay.

Conceptually it would be a tapped delay line with timing determined by the data path pipeline.
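As a rough illustration of that idea, here is a small behavioral Verilog sketch of a tapped delay line used to derive a local "clock" from a request signal; the per-stage delay and tap selection are made-up numbers, and in silicon the chain would be built from real delay cells protected from optimization.

`timescale 1ns/1ps
// Tapped delay line sketch: a request is delayed through a buffer chain
// and one tap (chosen during P&R to cover the worst-case datapath delay)
// clocks the pipeline registers. Simulation-only model.
module tapped_delay_line #(
    parameter integer TAPS = 8,
    parameter integer SEL  = 5        // hypothetical tap choice
) (
    input  wire req_in,               // request accompanying the data
    output wire req_out               // delayed request used as the local clock
);
    wire [TAPS:0] tap;
    assign tap[0] = req_in;

    genvar i;
    generate
        for (i = 0; i < TAPS; i = i + 1) begin : g_buf
            assign #0.5 tap[i+1] = tap[i];   // ~0.5 ns per stage, illustrative
        end
    endgenerate

    assign req_out = tap[SEL];
endmodule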

Some years ago, one of IBM's mainframes used wire jumpers to generate delays at about 1.2 ns per foot (or something like that).

How to get their attention???
 
Last edited:
Q:
1A) Is there any future for async?
Yes! The future is now.
The Landauer limit (too hot, too much power, too small) has caught up with synchronous design.
ASYNC is the answer to the end of Moore's law (lore).
Pivot or perish. Businesses must capture the benefits of ASYNC. Existing semiconductor technology using ASYNC can realize great value,
and avoid disastrous, very expensive, business-threatening flaming-hot device events.

The SemiWiki post "About That Landauer Limit…" by Bernard Murphy is a good read.

See also: "U.S. Convenes Chip Study Group - White House explores China, Moore's Law".

Erik Demaine: "So we need to develop a new way to think about computation." He and his colleagues at the MIT Computer Science and Artificial Intelligence Laboratory have been doing just that.



1B) Do any big companies do R&D in the async field?
Not as many as should. See the "Chip Study Group" list. Some have a hammer, so the whole world looks like a nail to them.
Also check out the sponsors of the 23rd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC 2017), May 21-24, 2017, San Diego, California, US.
The 2016 edition had Intel and NVIDIA among its sponsors.

Please consider getting involved and contributing; see the ASYNC 2017 Call for Papers.
They are open to fresh ideas: ASYNC 2017 will accommodate a special workshop to present "fresh ideas" in asynchronous design that are not yet ready for publication. 1-to-2-page submissions are solicited for the workshop and will go through a separate lightweight review process. Accepted submissions will be handed out at the workshop.

In summary ... ASYNC now!
 