[FONT="]I often hear the assumption that even though progress in CPUs seems to be plateauing, we will still have exponential progress in GPUs and ASICs for a long time. Even Elon Musk came out and said progress in AI hardware is exponential.[/FONT]
[FONT="]
[/FONT]
[FONT="]On the other hand, it seems that the exponential progress in parallel chips will stop once the transistor sizes catch up with CPUs. For a given transistor density you can't increase the clock speed beyond a certain point, and we probably won't go much beyond 5nm processes - this is why CPU speeds have stagnated. After that, every extra parallel core or unit of computation increases both the cost and power of the chip linearly.[/FONT]
[FONT="]
[/FONT]
[FONT="]ASICs have already been pretty well-optimized for AI applications. Google's TPU specifically optimizes matrix multiplication, which is where almost all of the computation in training a neural network goes. There may be even more clever tricks to implement, but it's tough to say that these would result in more than a 5x improvement or so.[/FONT]
[FONT="]
[/FONT]
[FONT="]There may be more work to be done in the design of the FETs or 3d layering, but these will not have radical or exponential effects on performance.[/FONT]
[FONT="]
[/FONT]
[FONT="]Quantum computing won't be better for matrix multiplication. Optical computing is limited to clock speeds of 10s of GHz due to dispersion of spectral light pulses, which is not much better than what can be done with silicon.[/FONT]
[FONT="]
[/FONT]
[FONT="]It also seems that many of the remaining improvements won't multiply together. If two new architectures come out that are 5x faster than previous architectures, they probably can't just be combined for a 25x improvement. A new FET that's more power efficient might not be suitable for 3d layering. In general it seems that further progress in computations per second per dollar will be slow and difficult rather than exponential.[/FONT]
[FONT="]
[/FONT]
[FONT="]On the other hand, it seems that the exponential progress in parallel chips will stop once the transistor sizes catch up with CPUs. For a given transistor density you can't increase the clock speed beyond a certain point, and we probably won't go much beyond 5nm processes - this is why CPU speeds have stagnated. After that, every extra parallel core or unit of computation increases both the cost and power of the chip linearly.[/FONT]
[FONT="]
[/FONT]
[FONT="]ASICs have already been pretty well-optimized for AI applications. Google's TPU specifically optimizes matrix multiplication, which is where almost all of the computation in training a neural network goes. There may be even more clever tricks to implement, but it's tough to say that these would result in more than a 5x improvement or so.[/FONT]
[FONT="]
[/FONT]
[FONT="]There may be more work to be done in the design of the FETs or 3d layering, but these will not have radical or exponential effects on performance.[/FONT]
[FONT="]
[/FONT]
[FONT="]Quantum computing won't be better for matrix multiplication. Optical computing is limited to clock speeds of 10s of GHz due to dispersion of spectral light pulses, which is not much better than what can be done with silicon.[/FONT]
[FONT="]
[/FONT]
[FONT="]It also seems that many of the remaining improvements won't multiply together. If two new architectures come out that are 5x faster than previous architectures, they probably can't just be combined for a 25x improvement. A new FET that's more power efficient might not be suitable for 3d layering. In general it seems that further progress in computations per second per dollar will be slow and difficult rather than exponential.[/FONT]