Any thoughts on the AMD/Microsoft alliance in AI/ML?

A 16nm ASIC vs GAA FPGA. You think an FPGA is faster?

It seems like a great time to tinker and handle more calculations on the edge. Either way, I'm all in. We will see what happens.
 
So custom ASICs at 16nm are out on local devices. That is what you are saying.
Not sure what you mean. The discussion was about Microsoft ML possibilities, and whether it makes sense for them to use 16nm ASICs.

It may make sense for other companies, for sure. 14/16 continues to be a popular node. A big Versal FPGA is a non-starter for many local devices due to power and cost, and the device's needs may be much smaller than what those chips deliver. But Microsoft figured out how to drive huge loads through FPGAs a decade ago, so their threshold for alternatives is much higher.
 
I think we are just out of sync then. I was questioning 2 years for an enhancement to an ASIC. Whether it was an ML chip, a data center chip, or something out on the edge, I questioned 2 years for an algorithmic change to an ASIC. No need to continue the debate.
 
Back to the original topic...

So Tanj, you seem to be quite the expert. For ML on the edge/local/IoT/inference/whatever you call it, it seems to me that an ASIC with a CPU, fast math multipliers, and a small quantity of CLBs provides the ideal situation for customers. Assuming this is true, what size LUTs would be the most practical: 4, 5, or 5/6 (7 series style, whose patents expire in 2024)?
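
To make the size question concrete, here is a rough back-of-envelope sketch in Python (my own, not from any vendor tool): an associative n-input function such as parity, built as a tree of k-input LUTs, needs about ceil((n-1)/(k-1)) LUTs, since each LUT folds k signals down to one.

import math

# Back-of-envelope only, not a synthesis result: each k-input LUT
# folds k signals down to 1, so a reduce tree over n inputs needs
# roughly ceil((n-1)/(k-1)) LUTs.
def luts_for_reduce_tree(n_inputs: int, k: int) -> int:
    return math.ceil((n_inputs - 1) / (k - 1))

for k in (4, 5, 6):
    print(f"8-input parity with {k}-input LUTs:", luts_for_reduce_tree(8, k))
# prints 3, 2, 2: wider LUTs flatten the tree, at the cost of a
# bigger, slower multiplexer structure inside each cell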
 
I think we are just out of sync then. I was questioning 2 years for an enhancement to an ASIC. Whether it was an ML chip, a data center chip, or something out on the edge, I questioned 2 years for an algorithmic change to an ASIC. No need to continue the debate.
It is simply a matter of scale. A small ASIC can be updated for minor things in a short time. But a large ASIC, or a major shift in the design of an ASIC, takes longer. Leading-edge AI means both large chips and rapid change.
 
it seems to me that an ASIC with a CPU, fast math multipliers, and a small quantity of CLBs provides the ideal situation for customers. Assuming this is true, what size LUTs would be the most practical: 4, 5, or 5/6 (7 series style, whose patents expire in 2024)?
Which customers, with what size and kind of models? Do they need the least power or the most speed? Throughput, or latency? There are many solutions because there are many answers to those questions.

I was not aware the patents were expiring; I would expect they are backed by newer patents. I have not noticed a clear advantage between the Xilinx and Altera approaches to LUTs, which is not surprising, because each invested in compilers that take the level of RTL I write and just deliver working chips. You would need to find someone who works on those compilers, and who has worked on both, to give you the tradeoffs.

There seems to be a slight trend over time toward larger LUTs, which probably reflects clever choices of internal structure (the cells are no longer just LUTs; they have options for special configurations that can make things like carry chains more efficient) that the vendors refine based on feedback about special cases that are difficult to implement.

But the bigger story has been the proliferation of hard blocks mixed in with the FPGA fabric: embedded cores, multiple memory interfaces, PHY plus link layers for PCIe and Ethernet, on-chip fabrics, various sizes of SRAM, vector and matrix MACs, and so on. The leading edge of FPGA design has diverse SKUs with mixes of hard blocks chosen to satisfy different markets, from telephony to avionics to AI. The fabrics are increasingly crucial for allowing those resources to be interconnected, with the logic side of the FPGA adding smarts to that flow.
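
For anyone who has not looked inside these cells: a 6-input LUT is at heart a 64-bit truth table, and the "5/6" style cell the question mentions can also serve as two 5-input functions over shared inputs. A minimal Python toy model of that idea (my own sketch, not datasheet-accurate; the real cell adds muxes, carry logic, and register options):

class FracturableLUT6:
    """Toy model of a 6-input LUT that can also behave as two
    5-input functions over shared inputs (roughly the 7 series
    style; a sketch, not a datasheet-accurate model)."""

    def __init__(self, init_bits: int):
        # 64-bit truth table; bit i is the output for input pattern i.
        self.table = init_bits & (2**64 - 1)

    def o6(self, inputs: list) -> int:
        # Full 6-input function: index the table with inputs[0..5].
        addr = sum(bit << n for n, bit in enumerate(inputs))
        return (self.table >> addr) & 1

    def o5(self, inputs: list) -> int:
        # 5-input function taken from the low half of the table
        # (the slice where the sixth input is 0).
        addr = sum(bit << n for n, bit in enumerate(inputs[:5]))
        return (self.table >> addr) & 1

# Example: program a 6-input AND; only pattern 0b111111 outputs 1.
lut = FracturableLUT6(1 << 63)
print(lut.o6([1, 1, 1, 1, 1, 1]))  # 1
print(lut.o6([1, 1, 1, 1, 1, 0]))  # 0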
 