Facebook open-sources AI Hardware

This is likely a response to Google recently open-sourcing its AI software, aimed at driving more development and attracting more AI engineers. The hardware is a box with PCBs loaded with GPUs; as far as I know there are no custom chips in the design. There's lots of interesting material here for speculation:


  • Will this help accelerate the open source hardware trend?
  • Will it accelerate development and competition in hardware AI solutions?
  • The GPUs are used for neural nets (see my earlier blog on Cadence's use of ARC for a similar purpose, with a very lightweight explanation of how this works). Does this point to a new growth market for NVIDIA and other GPU makers?

More content HERE
 
A lot of the AI software is fairly well documented and open-source anyway, and a chassis for a bunch of standard GP-GPU cards probably doesn't rank high on the open-source hardware scale, so I don't think that's going to move the needle much. I liked this TED talk -

https://www.youtube.com/watch?v=n-YbJi4EPxc

I think this is where the FPGA guys have dropped the ball: GP-GPUs are very power hungry, and something more task-specific would be a lot more efficient and scalable, but there's no easy way to program a box full of cheap FPGAs (e.g. Zynq). GP-GPUs are good for data-parallel applications, but it's not clear to me that that makes them good at general neural-network work, which is a bit like sparse matrix processing (and they aren't great at that).

Put another way: GP-GPUs are like the visual cortex of your brain, and that isn't the smart part.
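
To make the sparse-matrix point concrete, here's a rough Python sketch (it assumes NumPy and SciPy, which aren't part of the thread) running the same matrix-vector product in dense and sparse form. The dense version is the regular, data-parallel pattern GP-GPUs are built for; the CSR version chases per-row index lists, the kind of irregular access they handle less gracefully.

import numpy as np
from scipy import sparse

n = 4096
rng = np.random.default_rng(0)
# Matrix with roughly 1% non-zeros, to stand in for sparse NN connectivity
dense = rng.random((n, n)) * (rng.random((n, n)) < 0.01)
x = rng.random(n)

y_dense = dense @ x              # regular strided access: data-parallel friendly
csr = sparse.csr_matrix(dense)   # compressed sparse row storage
y_sparse = csr @ x               # index-driven gathers: irregular memory access

assert np.allclose(y_dense, y_sparse)   # same result, very different access pattern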
 
With FinFET inside, CUDA GPUs are not as power hungry as before, and Torch may be a real advance. However, I don't know how they are going to scale laterally, and what about redundancy?
FPGAs may benefit more from large-scale AI - a big brain.
Google's TensorFlow is in C++ and can be used with both CPU and GPU. On top of that, we can code with Python or anything with SWIG. I feel that is cool!
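
For what it's worth, here is a minimal sketch of what that looks like with the TensorFlow 1.x Python API that was current at the time (the device string and the values are just illustrative):

import tensorflow as tf

# Pin the op to CPU; swap in '/gpu:0' if a CUDA device is available
with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0], [1.0]])
    y = tf.matmul(a, b)

with tf.Session() as sess:
    print(sess.run(y))   # -> [[3.] [7.]]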
 
Optimum engines for neural-net processing are probably DSPs, because most of what you are doing is a lot of multiply/accumulates. So TI, CEVA and Cadence ARC are candidates; there may be more. You could definitely program a NN into an FPGA, but I feel it would be slower at these core operations than a dedicated DSP, which might make it less effective for real-time recognition. But I admit that's a gut feel - I have no proof :)
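
Just to show the shape of that inner loop (pure Python, purely illustrative):

def neuron(inputs, weights, bias):
    # A single neuron is one long multiply/accumulate chain plus an activation -
    # exactly the operation a DSP's MAC units are built to stream.
    acc = bias
    for x, w in zip(inputs, weights):
        acc += x * w          # one multiply/accumulate per input
    return max(acc, 0.0)      # ReLU, one common choice of activation

print(neuron([0.5, -1.0, 2.0], [0.2, 0.4, 0.1], 0.05))   # -> 0.0 (accumulates to about -0.05, clipped by ReLU)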
 
One other thought on open-source hardware. Point taken that the chassis is probably not super exciting from a purely hardware perspective, but that probably isn't the most important factor - the Facebook backing is. Builders may be drawn to innovate around this platform purely because of the business scale of the opportunity with that one customer. A bunch of people could retire rich on something that would be a blip on the FB business plan.

But thanks for the point about this not moving the needle much for the GPU guys. Good for bragging rights, but agreed not much of an impact on revenues.
 
You could definitely program a NN into an FPGA, but I feel it would be slower at these core operations than a dedicated DSP, which might make it less effective for real-time recognition.

Given that current FPGAs have quite a lot of hard macros with DSP-like functionality, I would expect FPGAs to parallelize the computation more than DSPs do.
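
Back-of-envelope, with numbers assumed purely for illustration (not datasheet values): if every hard MAC slice fires every cycle, the raw parallelism is on the FPGA's side.

# All figures below are assumptions for illustration, not vendor specs
fpga_mac_slices = 2000          # hard DSP/MAC slices in a mid-range FPGA
fpga_clock_mhz = 400            # achievable fabric clock

dsp_macs_per_cycle = 16         # MACs per cycle in a dedicated DSP core
dsp_clock_mhz = 1200            # DSP core clock

fpga_gmacs = fpga_mac_slices * fpga_clock_mhz / 1e3      # GMAC/s
dsp_gmacs = dsp_macs_per_cycle * dsp_clock_mhz / 1e3     # GMAC/s
print("FPGA ~%.0f GMAC/s vs DSP ~%.0f GMAC/s" % (fpga_gmacs, dsp_gmacs))   # 800 vs 19 with these assumed figures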
 
OK - you got me, Staf ;) But I'm still wondering whether tight-loop programming in a dedicated DSP wouldn't be faster than programmed logic in a more general-purpose FPGA. I have no evidence, I admit, but it feels like this should be the case.
 
CPUs are efficient in code density (you can optimize how memory is used), which minimizes silicon use, but they are less power efficient for a given performance level. FPGAs would give the highest performance (bar ASICs), but you are dedicating a lot of silicon area to wiring (versus using common data buses). Architecturally, in an AI system you would probably use DSPs etc. for the learning process, then drop the results into an FPGA to get performance when you need it.
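
A rough sketch of the hand-off that implies, assuming (my assumption, not something stated in the thread) the offline-trained weights get quantized to 16-bit fixed point before being dropped into the FPGA's MAC blocks:

import numpy as np

def to_q15(weights):
    # Scale floats in [-1, 1) to signed 16-bit Q1.15 integers for fixed-point MACs
    w = np.clip(np.asarray(weights, dtype=np.float64), -1.0, 1.0 - 2**-15)
    return np.round(w * 2**15).astype(np.int16)

trained = np.array([0.42, -0.17, 0.93, -0.61])   # stand-in for weights trained offline
print(to_q15(trained))                           # [ 13763  -5571  30474 -19988]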
 
Interesting point, especially if the training could be done centrally and then distributed to field-programmed FPGAs. But wouldn't unit price also be a factor? Of course, if these things went to high volume, FPGAs could become much more price-competitive.
 
Interesting point, especially if the training could be done centrally and then distributed to field-programmed FPGAs. But wouldn't unit price also be a factor? Of course, if these things went to high volume, FPGAs could become much more price-competitive.

There's an argument that FPGAs make more sense than other structures on the bleeding edge of scaling, since they should tolerate not being fully functional. That might be where Xilinx beats Altera, since they have tools that will attempt to fit your design into broken (cheaper) FPGAs, whereas Altera/Intel have always focused on making stuff that (looks like it) works 100%.

It should be cheaper in silicon if the system can just adapt to being broken as it goes along - you can skip a lot of testing, for starters.
 