The DSP48E2 (I do not come up with these names… could have named it a multiplier thingy) in the Xilinx 20nm UltraScale family (I do not come up with these names… could have named it Virtex-8, or Luke-8) is simply amazing. Today was good, as I began playing with the UltraScale tools and seeing how the DSP checks out. I also encourage you to check out the Altera 20nm node, and you will once again see quite a difference between the families. In fact, Altera says they are ‘redefining the midrange’ at the 20nm node… and for the Stratix-10, ‘Delivering unimaginable performance’. I am holding back on that one; in fact, dear reader, you could add in the punch line here. I was humming ‘It’s a Small World After All’, picturing Tattoo from Fantasy Island yelling, ‘De FPGA, De FPGA, Boss!’
Xilinx claims that the complex multiplier in the 20nm UltraScale will need ½ the DSP resources it needed in the 28nm node. That is a big deal, because in my field, RADAR/EW, complex multiplication is important. For example, the wife asks me to cut her hair. That is input A. Input B is that I keep saying no but decide to do it anyway. The output C is a very complex situation that involves sobbing, and that is me sobbing. That is a true story, by the way (no, I didn’t cry, I embellished a bit), and it turned out exactly like I said it would. My dear, dear manly reader, may I suggest you never cut your wife’s hair, ever. My dear, dear feminine reader, do not ask your man to style your hair; you will really regret it and it will take you 3 months to forgive your guy. I know, pray for my family.
In the UltraScale 20nm Kintex 115, we have 5520 DSP slices. There are 3600 DSP slices in the largest 28nm Virtex-7 FPGA, so you say, ‘Well, that is only about a 1900 DSP increase.’ I say, you spoiled designer, you! Remember the Virtex-2 Pro days, with a whopping 232 18×18 multipliers? See how you are spoiled? Ok, the 1900 DSP increase is actually much more than that when you factor in clock frequency, bit widths, and efficiency. For most configurations, the complex multiplier that used 8 DSP slices in 28nm will only need 4 DSP slices in UltraScale 20nm. That is 1900 more DSP slices, plus a 2X gain per slice when performing complex multiplies.
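For anyone wondering where the 8 and the 4 come from, here is a minimal Python sketch. The textbook complex multiply needs four real multiplies. The slices-per-real-multiply values (2 at 28nm, 1 at 20nm) are my assumption, picked only to reproduce the 8 and 4 quoted above; they are not data sheet numbers, and the function names are made up for illustration.

```python
# Sketch only: the slices-per-real-multiply values used below are assumptions
# chosen to reproduce the 8 -> 4 figures in the post, not data sheet numbers.

REAL_MULTIPLIES_PER_COMPLEX = 4  # textbook form: ac, bd, ad, bc


def complex_multiply(a_re, a_im, b_re, b_im):
    """Return (re, im) of (a_re + j*a_im) * (b_re + j*b_im)."""
    p_re = a_re * b_re - a_im * b_im  # two real multiplies
    p_im = a_re * b_im + a_im * b_re  # two more real multiplies
    return p_re, p_im


def dsp_slices_per_complex(slices_per_real_multiply):
    """DSP slices for one complex multiply, given the cost of one real multiply."""
    return REAL_MULTIPLIES_PER_COMPLEX * slices_per_real_multiply


# Assumed: 2 slices per real multiply at 28nm, 1 slice at 20nm.
slices_28nm = dsp_slices_per_complex(2)
slices_20nm = dsp_slices_per_complex(1)
print("28nm:", slices_28nm, "slices; 20nm:", slices_20nm, "slices")
```

The point of the sketch is just the bookkeeping: halve the cost of each real multiply and the whole complex multiplier halves with it.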
For example, if you wanted a system that needed 1375 18×27 complex multiplies, you could do this in one UltraScale Kintex-115. With Altera’s Arria-10, you would need 1.63 Arrias. And I know it is very hard to purchase 0.63 of an FPGA, so I think you need 2 Altera FPGAs. Altera’s data sheet says you need 2 Variable DSPs for one complex multiply, and there are 1678 Variable DSPs, which means you can only have 839 complex multiplies per device. 1375/839 = 1.63.
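The device count above can be checked with a few lines of Python. This is a rough sketch using only the figures quoted in this post (1678 Variable DSPs at 2 per complex multiply for the Arria-10, 5520 DSP slices at 4 per complex multiply for the KU115); the helper names are hypothetical, and it ignores routing, timing, and everything else a real design would care about.

```python
import math


def complex_multipliers_per_device(dsp_blocks, blocks_per_complex_multiply):
    """Whole complex multipliers that fit in one device."""
    return dsp_blocks // blocks_per_complex_multiply


def devices_needed(required_multipliers, multipliers_per_device):
    """You cannot buy a fraction of an FPGA, so round up."""
    return math.ceil(required_multipliers / multipliers_per_device)


REQUIRED = 1375  # complex multiplies the example system needs

# Figures as quoted in the post, not re-checked against the data sheets.
arria_per_device = complex_multipliers_per_device(1678, 2)
kintex_per_device = complex_multipliers_per_device(5520, 4)

print("Arria-10 devices:", devices_needed(REQUIRED, arria_per_device))
print("KU115 devices:", devices_needed(REQUIRED, kintex_per_device))
```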
Wait… Now wait! Hold, hold… hold on here… (in a George Bailey voice). The 1375 figure is for an 18×27 complex multiply, and the 1.63 Arrias was for 18×19… Well, I was being conservative, and I did not even discuss gigabit transceivers. But the Xilinx KU115 can move 2 Terabits per second over 64 x 16 Gb/s transceivers, counting both directions. Arria-10, not so much, with 48 x 17.4 Gb/s transceivers = 1.67 Terabits per second.
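The transceiver totals are the same kind of back-of-the-envelope math. A small sketch, assuming the totals count transmit and receive together (that is the only way 64 x 16 Gb/s comes out at 2 Terabits); the function name and the `directions` parameter are mine, for illustration only.

```python
def aggregate_tbps(lanes, gbps_per_lane, directions=2):
    """Aggregate serial bandwidth in Tb/s.

    directions=2 is an assumption: it counts TX and RX together, which is
    how the totals quoted in the post appear to be computed.
    """
    return lanes * gbps_per_lane * directions / 1000.0


ku115_tbps = aggregate_tbps(64, 16.0)
arria10_tbps = aggregate_tbps(48, 17.4)

print(f"KU115:    {ku115_tbps:.2f} Tb/s")
print(f"Arria-10: {arria10_tbps:.2f} Tb/s")
```

Set `directions=1` if you prefer to count one direction only; the ratio between the two devices does not change.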
Xilinx KU115 = 5520 DSP x 741 MHz x 2 = 8180 GMACs
Arria10 GX660 = 3356 DSP x 500 MHz x 2 = 3356 GMACs
Arria’s base multipliers are 18×19 and Xilinx’s are 18×27, which means that in most systems you will need more Arria multipliers to do the same job. How many systems do you know that use 18×19 for filter applications? Most systems have digital processing gain that makes effective use of the full width of the Xilinx DSP slice. Xilinx’s DSP is close to 3X better than Altera’s.
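Here is the GMAC arithmetic from the two lines above as a small Python sketch, so you can plug in your own clock rates. The factor of 2 is taken from those formulas as-is; I am not asserting what it stands for. The function name is hypothetical. On these raw numbers alone the ratio comes out a bit over 2.4; the rest of the gap in the 3X claim comes from the multiplier width argument, which this sketch does not try to model.

```python
def peak_gmacs(dsp_count, clock_mhz, factor=2):
    """Peak throughput in GMACs: count x clock x factor, as in the post.

    The factor of 2 is copied from the post's formulas; what it represents
    is not stated there, so treat it as an assumption.
    """
    return dsp_count * clock_mhz * factor / 1000.0


ku115_gmacs = peak_gmacs(5520, 741)
arria10_gmacs = peak_gmacs(3356, 500)
raw_ratio = ku115_gmacs / arria10_gmacs

print(f"KU115:    {ku115_gmacs:.0f} GMACs")
print(f"Arria-10: {arria10_gmacs:.0f} GMACs")
print(f"Raw ratio: {raw_ratio:.2f}X")
```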
By the way, a history lesson from Wikipedia (reader beware) on the name Arria:
“Arria (also Arria Major) was a woman in ancient Rome. Her husband Caecina Paetus was ordered by the emperor Claudius to commit suicide for his part in a rebellion but was not capable of forcing himself to do so. Arria wrenched the dagger from him and stabbed herself, then returned it to her husband, telling him that it didn’t hurt (‘Non dolet, Paete!’). Her story was recorded in the letters of Pliny the Younger, who obtained his information from Arria’s granddaughter, Fannia.”
Ok… Right… Let’s name an FPGA after a lady who stabbed herself to prove to her husband that it didn’t hurt. Sounds like me cutting my wife’s hair! Non dolet, Paete.
Anyways, Xilinx of course has all your FPGA ranges covered; they call it the ‘low end series’ (I assure you I do not come up with these names. Low End? Arria? How about Super Duper Value FPGA?). Looking at Xilinx’s 28nm and 20nm devices, there is no other device that comes close. So you are using Altera why? Can someone explain to me where the measurable advantage is? How about TI’s DSP chips, why? You can program Xilinx’s FPGAs using C/C++ very easily, with excellent QoR, and floating point on Xilinx FPGAs is trivial. Considering GPUs? Think again: with FPGAs, SEU behavior, power, and reliability are many times better than with GPUs. Xilinx clearly is the leader in the FPGA realm, and it did not happen by accident. Check them out today; you will not be disappointed.