There’s been a lot of discussion of late about deep learning technology and its impact on many markets and products. A lot of the technology under discussion is basically hardware implementations of neural networks, a concept that’s been around for a while.
What’s new is the compute power that advanced semiconductor technology brings to the problem. Applications that function in real time, in real products, are now possible. But what exactly does a deep learning chip look like? What technology drives these designs?
I caught up with Mike Gianfagna recently to discuss deep learning and pose some of these questions. Besides buying lunch, Mike told me some interesting things about deep learning chips based on what’s happening at eSilicon.
First of all, chips targeted at deep learning applications are often not “chips” at all in the traditional sense of a monolithic piece of silicon in a package. Rather, they are combinations of monolithic chips and massive external memories, all integrated in a sophisticated 2.5D package. The use of 2.5D makes the whole process a good bit more complex, but it enables the delivery of significant new capabilities.
If you poke around inside one of these 2.5D deep learning packages, you will typically find HBM2 memory stacks, along with the associated HBM PHY and controller. High-speed SerDes is also usually needed for off-chip communication. The deep learning die itself contains optimized multiply-accumulate functions – many of them – plus specialized on-chip memories for efficiency and power reasons. So a deep learning “chip” is really a small system: a compute die, memory stacks and high-speed interfaces, all in one package.
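To see why multiply-accumulate units dominate these dies, it helps to look at the arithmetic they exist to serve. The sketch below is illustrative only – it is not based on any eSilicon design – and simply shows in Python how a fully connected neural-network layer reduces to many independent multiply-accumulate (MAC) operations, which is exactly what the hardware replicates in parallel:

```python
def mac(acc, a, b):
    """One multiply-accumulate step: acc + a * b.
    This is the primitive a deep learning chip implements in silicon,
    many thousands of times over."""
    return acc + a * b

def fc_layer(weights, inputs):
    """Compute one fully connected layer (no activation function).
    Each output is a chain of MACs over one weight row; in hardware,
    these chains run in parallel across a MAC array."""
    outputs = []
    for row in weights:              # one row per output neuron
        acc = 0
        for w, x in zip(row, inputs):
            acc = mac(acc, w, x)     # the replicated operation
        outputs.append(acc)
    return outputs

# Tiny example: 2 output neurons, 3 inputs
print(fc_layer([[1, 2, 3], [0, 1, 0]], [4, 5, 6]))  # [32, 5]
```

In software this is just a nested loop; the point of a deep learning ASIC is to unroll that loop into parallel silicon, with the specialized on-chip memories mentioned above keeping weights and activations close to the MAC array.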
To really take advantage of advanced silicon technology, customization to optimize deep learning algorithms is a very good strategy. That means building ASICs, and that’s where eSilicon comes in. There aren’t many places you can go to implement a deep learning ASIC. There are lots of technical challenges involved.
Performance demands the use of FinFET technology, and that raises the stakes quite a bit. FinFET-class ASICs are substantially more challenging to design than chips built on older planar technology. Customizing memory for the multiply-accumulate design is tricky to get right. Interfacing HBM memory stacks to the ASIC also requires very high-performance circuits – something not everyone is good at. And then there’s the 2.5D package. Integrating multiple components on a silicon interposer requires a lot of skill as well. Assembly yield is impacted by thermal and mechanical stress. Testing these devices also requires some new approaches, as does the design of the interposer itself.
And on top of all this, Mike explained that it takes a team of ecosystem partners to get the job done. Critical IP is typically sourced from more than one vendor. Fabrication of the chip is done by the foundry, but HBM memory stacks, interposers and 2.5D packages all come from other vendors. It takes a well-coordinated team to get all this done reliably.
As the lattes were being served at the end of our lunch, Mike told me about an event at the Computer History Museum in Mountain View on March 14. eSilicon is teaming up with Samsung Memory, Amkor and Northwest Logic to explain how that group of ecosystem partners works together to build deep learning ASICs. There will also be a keynote address from Ty Garibay, the CTO of Arteris IP. I’ve been to eSilicon events in the past and they are typically very informative. The wine and food are pretty good, too. If you want to dig into deep learning, I would absolutely attend this event. Check out more about the seminar, or register to attend. SemiWiki will be there.