Will AMD, Nvidia, or Intel use RISC-V in the future?

Because of the Flash wear-leveling and DRAM buffering, you get no advantage from using Optane; that's why Intel lost so much money on it. You need to switch to an architecture where you get the benefits of using it directly (ditching the DRAM and the excess wear-leveling hardware), but Intel doesn't have the imagination to do that. Mostly you want to shift to many-core (with a lot more L2) so that cache misses are rare enough that slow writes don't impinge.
 
I have no idea what you're talking about, and I don't think you do either.
 
(Explaining...)

The DRAM is essentially an extra level of cache for slow devices; if your devices are fast enough, it isn't needed. If you have stalling issues because of write latency at the L2/L3 level, you can expand the cache at that level so that you don't miss at a rate where you'll stall. However, there's a working limit of about 2 MB per core for L2, so if you need more L2, you need more cores to go with it. I know how to make that work; Intel doesn't.
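As a rough sketch of that argument, here is a back-of-envelope average-memory-access-time calculation; it is a minimal sketch, and all latencies are assumed, illustrative values rather than measurements:

```python
# Back-of-envelope average memory access time (AMAT) model for the claim above:
# if the L2 miss rate is low enough, even a slow backing store barely matters.
# All latencies are illustrative assumptions, not measurements.

def amat(l2_hit_ns, backing_ns, miss_rate):
    """Average access time when an L2 miss goes straight to the backing store."""
    return l2_hit_ns + miss_rate * backing_ns

L2_HIT_NS = 4.0        # assumed L2 hit latency
DRAM_NS = 80.0         # assumed DRAM access latency
PMEM_WRITE_NS = 700.0  # Optane Gen2 DIMM write latency quoted later in the thread

for miss_rate in (0.05, 0.01, 0.001):
    dram = amat(L2_HIT_NS, DRAM_NS, miss_rate)
    pmem = amat(L2_HIT_NS, PMEM_WRITE_NS, miss_rate)
    print(f"L2 miss rate {miss_rate:6.3f}: DRAM-backed {dram:6.2f} ns, Optane-backed {pmem:6.2f} ns")
```

With a 5% miss rate the slow backing store dominates; at 0.1% the two columns nearly converge, which is the "make misses rare enough" point above.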

A machine tuned for cache-line transfers, with processors that can handle a complete cache line in one go, will beat serial-code processors like x86/RISC, so VLIW plus Optane is the way to go for maximum speed.
 
Optane isn't fast enough to replace DRAM. Optane Gen2 DIMM writes are in the 700ns range. Write bandwidth is only about 1/3 of read bandwidth, and even read bandwidth is significantly less than DDR4 capabilities, much less than DDR5.


If you're thinking that Optane plus a larger L2 cache can replace DRAM (and L2 cache is always SRAM), you're mistaken.
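To put rough numbers on that gap, here is an illustrative effective-bandwidth calculation for a mixed read/write stream; the Optane figures are assumptions consistent with the ratios quoted above, and the DDR4 figure is a nominal per-channel peak, not a measurement:

```python
# Illustrative effective-bandwidth comparison for a mixed read/write stream.
# The Optane numbers are assumptions consistent with the "write bandwidth is
# about 1/3 of read bandwidth" ratio above; the DDR4 number is a nominal
# per-channel peak. None of these are measured values.

DDR4_GBPS = 25.6                      # assumed DDR4-3200, one channel
OPTANE_READ_GBPS = 8.0                # assumed Optane DIMM read bandwidth
OPTANE_WRITE_GBPS = OPTANE_READ_GBPS / 3.0

def effective_bw(read_bw, write_bw, write_fraction):
    # Time per byte is the weighted sum of the per-byte times (harmonic mean).
    time_per_byte = (1 - write_fraction) / read_bw + write_fraction / write_bw
    return 1.0 / time_per_byte

for wf in (0.0, 0.3, 0.5):
    opt = effective_bw(OPTANE_READ_GBPS, OPTANE_WRITE_GBPS, wf)
    print(f"write fraction {wf:.1f}: Optane ~{opt:4.1f} GB/s vs DDR4 {DDR4_GBPS} GB/s")
```

The more writes in the mix, the further the effective Optane bandwidth falls below even a single DDR4 channel under these assumptions.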
I recommend reading more about Optane; you don't seem to know how it functions. Start with this blog entry by Frank Hady (he's a really smart guy):


And this UCSD performance measurement paper on Optane Gen 1. Gen 2 is faster, but not incredibly so.


Edit: this too:


IMO, your proposal is poorly formed.
 
Optane is a phase-change memory technology: it's fast and low power, it holds data indefinitely, it's rad-hard, and it survives orders of magnitude more write cycles than Flash. However, Flash is significantly cheaper, so if you stick a cache in front of both and give yourself enough Flash to survive the wear, Flash wins because of its per-bit cost and the fact that performance mostly comes down to the DRAM. But the DRAM caching is not a low-power solution.
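For a feel of that economics argument, here is some illustrative arithmetic; every price, over-provisioning factor, and cache ratio in it is an assumption made up for the example, not a quoted figure:

```python
# Illustrative $/capacity arithmetic behind the "Flash wins on per-bit cost"
# point. Every price, over-provisioning factor, and cache ratio below is an
# assumption made up for the sake of the example, not a quoted figure.

FLASH_USD_PER_GB = 0.08     # assumed TLC NAND media cost
OPTANE_USD_PER_GB = 2.00    # assumed persistent-memory-class media cost
FLASH_OVERPROVISION = 1.3   # extra NAND reserved for wear levelling / spare blocks
DRAM_USD_PER_GB = 3.00      # assumed cost of the DRAM cache in front of the Flash
DRAM_CACHE_RATIO = 0.01     # assume 1 GB of DRAM cache per 100 GB of Flash

capacity_gb = 1024
flash_system = (capacity_gb * FLASH_USD_PER_GB * FLASH_OVERPROVISION
                + capacity_gb * DRAM_CACHE_RATIO * DRAM_USD_PER_GB)
optane_system = capacity_gb * OPTANE_USD_PER_GB

print(f"{capacity_gb} GB Flash + DRAM cache: ~${flash_system:,.0f}")
print(f"{capacity_gb} GB Optane:             ~${optane_system:,.0f}")
```

Even with generous over-provisioning and a DRAM cache thrown in, the Flash-based system comes out far cheaper per gigabyte under these assumed prices.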

While not as fast as DRAM, Optane is not orders of magnitude slower, and you can buffer writes to it without using DRAM. That is, flushing L2 cache to NVM does not require DRAM; it's more of a FIFO job that can be done on the same silicon as the L2 cache and processor.

Optane's advantage would be the ability to build denser, more power-efficient systems without losing performance, but those would be NUMA machines where you would probably still need temporary (volatile) storage like DRAM as well as the NVM. It depends on what kind of machine you are building. Intel doesn't make NUMA machines well.

AI machines that are training neural networks do a lot of work in multiply/accumulate but relatively little re-writing of the weights. For simulation tasks you want to be in cache as much as possible, but you usually need to log results as you go; a fast NVM like Optane works better than Flash there, because under continuous writing you'll still get stuck at the speed of the underlying memory. The (DRAM) caches only help if the writing comes in bursts that average out to less than the NVM supports. Likewise, putting a DRAM cache in your read path doesn't help if you aren't revisiting the data before it gets displaced.
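Here is a tiny discrete-time sketch of that burst-averaging point, using made-up rates and buffer sizes (arbitrary units, purely illustrative):

```python
# Tiny discrete-time sketch of the burst-averaging point above: a write buffer
# in front of the NVM absorbs bursts, but if the long-run average write rate
# exceeds what the NVM can drain, the buffer fills and the producer stalls at
# NVM speed anyway. Rates and sizes are arbitrary illustrative units per tick.

def stalled_writes(bursts, drain_per_tick, buffer_capacity):
    buffered, stalled = 0.0, 0.0
    for produced in bursts:
        buffered += produced
        if buffered > buffer_capacity:       # buffer full: producer must stall
            stalled += buffered - buffer_capacity
            buffered = buffer_capacity
        buffered = max(0.0, buffered - drain_per_tick)
    return stalled

bursty = [10, 0, 0, 0, 0] * 20      # average 2 units/tick, arriving in bursts
sustained = [4] * 100               # average 4 units/tick, continuous

print("bursty, average below drain rate ->", stalled_writes(bursty, 3, 16), "units stalled")
print("sustained, average above drain   ->", stalled_writes(sustained, 3, 16), "units stalled")
```

The bursty stream never stalls because its average rate is under the drain rate; the sustained stream fills the buffer and then runs at the backing store's speed, regardless of how big the cache in front is.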

Horses for courses.
 
Well, you do have a lot of trouble staying on topic. This thread is about whether or not the leading CPU vendors will adopt RISC-V. You proceeded to assert that ISAs will soon be obsolete, which there's no evidence for, then that VLIW processors will succeed Harvard-architecture superscalar processors, and finally that VLIW plus Optane is the ultimate winning combination. You also asserted that Optane would naturally fit architecturally below expanded L2 caches in the memory hierarchy, but presented no evidence at all except that you believe you know how to do it and Intel doesn't, which is probably nonsense. And now you're off on a tangent, while we're discussing processor architecture and memory hierarchies, about flash memory characteristics versus Optane; flash has no place in the memory hierarchy other than perhaps to store VM paging files.

Oh yeah, and you're still harping on how Optane DIMMs would work better than flash for AI workloads, when the processor industry and its major customers have already chosen a direction: SIMD, streaming cores, and specialized matrix-math capabilities for AI/ML, which are very bandwidth-hungry, rather than general-purpose processors, which are more latency-sensitive. And they are transitioning to HBM (Nvidia, Intel) or on-wafer SRAM (Cerebras), just to name a few examples. There's no evidence of even one of the AI/ML vendors partnering with Intel on Optane, probably because it has lower bandwidth and higher latency than DRAM (let alone HBM). Intel did, back in 2017/2018, talk about AI applications for Optane DIMMs:


But nothing came of it, because it turned out bandwidth was more important than capacity for AI/ML processors, so the best Optane DIMMs could do was support memory-resident random-access and indexed databases, like Oracle, SAP HANA, etc., and that hasn't been especially successful, or Intel wouldn't have cancelled the Optane program. On Intel's latest solution page, no AI at all:


So, I suggest you stop diverting this interesting thread about RISC-V, and tempting me into counter-discussion, with your unsupported pronouncements about future processor architectures and memory hierarchies.
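As a rough illustration of the bandwidth-versus-capacity point above, here is a quick roofline-style calculation; the peak-compute, arithmetic-intensity, and bandwidth figures are all illustrative assumptions, not vendor specs:

```python
# Quick roofline-style arithmetic behind the bandwidth-versus-capacity point:
# an accelerator only approaches its peak FLOP/s if memory can supply
# peak_flops / arithmetic_intensity bytes per second. All figures below are
# illustrative assumptions, not vendor specifications.

PEAK_FLOPS = 100e12        # assumed accelerator peak, 100 TFLOP/s
ARITH_INTENSITY = 100.0    # assumed FLOPs per byte moved (large matrix multiplies)

required_bw = PEAK_FLOPS / ARITH_INTENSITY   # bytes/s needed to keep the units busy

for name, bw in [("HBM (stacked)", 2.0e12), ("DDR4 channel", 25.6e9), ("Optane DIMM", 8.0e9)]:
    fraction_of_peak = min(1.0, bw / required_bw)
    print(f"{name:14s} {bw/1e9:7.1f} GB/s -> at most {fraction_of_peak:6.1%} of peak compute")
```

Under these assumptions only HBM-class bandwidth keeps the compute units busy; DDR-class or Optane-class bandwidth strands almost all of the peak FLOP/s, no matter how much capacity sits behind it.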
 

The RISC-V point I was making is that you can subsume RISC-V (along with x86 & ARM) into a VLIW approach. That's likely to happen at the compiler/runtime level: because there is no standard RISC-V, you want to move to an approach of interpreting (source or machine) code and (re-)JIT'ing it for the processor you have. Once you have the flexible runtime, the processor architecture can be anything you like, including heterogeneous; VLIW still offers the best speed, if you can work out how to do it, and there's more chance of that with AI-driven tools. Humans are not good at parallel things; that's why the compiler writers like RISC over VLIW. Humans have also failed to create FPGA compilers for standard languages. Given the failings of human compiler writers, it's better to leave them the simple task and have the machines do the hard stuff.
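For what that "flexible runtime" idea might look like in miniature, here is a hypothetical sketch; the backend table and ISA detection are placeholders, not a real JIT or VLIW code generator:

```python
# Very small sketch of the "flexible runtime" idea: keep one portable form of a
# kernel and select (or, in a real system, re-generate) a backend for whatever
# processor is present at run time. The backend table and detection logic here
# are hypothetical placeholders, not a real JIT or VLIW code generator.

import platform

def kernel_portable(xs):
    """Reference implementation; a real runtime would re-JIT this per target."""
    return [2 * x + 1 for x in xs]

# Stand-in backends: a real runtime would emit code tuned for each ISA or VLIW target.
BACKENDS = {
    "x86_64": kernel_portable,
    "aarch64": kernel_portable,
    "riscv64": kernel_portable,
}

def run(xs):
    isa = platform.machine().lower()
    backend = BACKENDS.get(isa, kernel_portable)  # unknown ISA: fall back to portable path
    return backend(xs)

print(run([1, 2, 3]))   # -> [3, 5, 7] on any host
```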

Shared cache-coherent DRAM has a lot of overhead and is a performance bottleneck; a NUMA architecture with co-packaged cores and Optane would perform better, and Optane's longevity makes that more viable than doing it with Flash. Sticking Optane on a DIMM and treating it like DRAM was a mistake, but Intel's processors generally run too hot for die-stacking.

NB: I didn't f**k up rolling out Optane; Intel did, as they did with Itanium and various GPUs and accelerators, plus failing to capitalize on their Altera acquisition and not using their own AI products in IC design (despite flogging them as wonderful for everything to everybody else).

If you have some coherent plan for displacing ARM & x86 with RISC-V, I'd love to hear it; so far you are just criticizing mine.
 
Subsuming x86, Arm, and RISC-V into a VLIW approach? Unlikely in the next decade.
SYCL, OneAPI, and CUDA are a good start at creating FPGA and GPU compilers for standard languages. Obviously, for FPGAs nothing will be as efficient as RTL generation tools, but the existing compilers look like a good start. They are certainly not failures.
As for NUMA with co-packaged cores and Optane: more pseudo-technical nonsense.
Intel has capitalized on Altera; they still produce FPGAs of various capabilities. They even produced hybrid CPUs with in-package FPGAs. Personally, it did not strike me as a great idea for integration with hot Xeons, but they did it. Intel's capabilities are why AMD paid a fortune for Xilinx.
I'm not looking for a plan to displace Arm or x86 with RISC-V. Your plan still has a weak foundation.
 
I'm sure lots of interesting stuff will be happening this decade; we can review at the end of it.

SYCL, OneAPI, and CUDA are all work-arounds for not knowing how to write compilers for regular C/C++ for heterogeneous platforms. Altera & Xilinx had decades to work out how to do that and failed; they also failed to leverage FPGAs as a general IC design platform themselves, and were sold to Intel/AMD before the open-source tools caught up with them. It's a problem I've studied since working at Inmos in ~1990, where it became obvious that the bulk of programmers aren't interested in non-standard programming methodologies, and if your processor needs that, you aren't going to sell many.

Intel didn't do that much with Altera, and certainly didn't deliver better performance for everyone with an X86/FPGA hybrid. AMD might have some better ideas on what to do, but Xilinx isn't any better than Altera on the tools front. All the key FPGA patents have expired, so some folks wasted their money.

I'd say my foundation is more solid than Intel's at this point; AMD can always fall back on its GPUs. NVIDIA had plenty of opportunity to build RISC(-V) processors to work with their GPUs but failed to deliver anything interesting, and also did not chase the B$ mining market (which was working for them), because they can't do low power.

So I'll bet Intel/AMD don't do much RISC-V, NVIDIA probably buys some company that does, and they all become extinct in a decade because the folks that know how to use AI to make processors will take it all away from them.
 

Not sure about the Optane caching debate above, but you definitely got it right that Intel + RISC-V = 💔. Good catch!


I'm wondering what everyone's thoughts are on this. Mine are along these lines:

Intel is focused on custom, super-optimized, high-performance CPUs for servers & desktop PCs, but RISC-V is a tool for building SoCs. Intel has tried to build a successful high-end SoC and failed every time. Successful as in "succeed in the market + perform architectural/tech-node iterations". Maybe I'm missing something (automotive, maybe?), but since Intel missed the mobile SoC train 10-15 years ago, I can't think of a high-end SoC with "Intel Inside" written on the package. AMD never even tried to build one, and isn't planning to; they also killed their ARM initiative.

Interesting to look at what was written and commented back a year ago and how that initiative was tightly coupled to IFS: https://semiwiki.com/semiconductor-...osystem-implications-as-the-other-shoe-drops/
 
The Intel 800 series networking adapters have a moderately complex SoC in them, about on par with what Nvidia does in competing adapters.


I think you're underselling RISC-V's potential as the basis for performance CPUs. See the offerings from Ventana Microsystems:


I'm not sure why Intel got into RISC-V technology, so I also have no idea, other than cost-cutting in tough market conditions, why they got out.
 
Intel sold nearly all of its Optane business in July last year. Several comments have illustrated why Intel didn't do so well with it... but that's hardly a factor in whether they get into RISC-V or not.

As semiconductor markets and capabilities grow, it is the more specialized products that provide the highest benefits. As such, there is a drive to create more high-value specialized products, which provide the biggest profit margins and innovation incentives.

RISC-V processors are a result of the relatively recent open-source effort to re-invigorate a general-purpose computing platform after innovation has steered away from it.

Why would Intel circle back around to build lower-margin, highly competitive general-purpose platforms when the real money (and excitement) is in specialized, higher-margin platforms? Why would anyone expect or want them to do so?

Anyone interested in understanding why Intel is not chasing RISC-V might want to read the following: The Decline of Computers as a General-Purpose Technology: cacm.acm.org/magazines/2021/3/250710-the-decline-of-computers-as-a-general-purpose-technology/fulltext
 
Many of us strongly agree with the article in spirit (including Dan, who calls the custom stuff "bespoke silicon"), but why is an article published in 2021 using example chips from 2016? Articles as silly as this one are one reason I dropped my ACM membership.
 
Assessments of technology success always trail the technology itself; that's a limiting factor if you read for the specific examples rather than the general trend. Economists and academics always trail the engineer. I agree that the example isn't up to date, but I see we agree with the overall conclusion.

The conclusion is also about more than 'general processor' versus 'bespoke silicon'; it's about the fragmentation of processor designs: AI, server, low power, IoT, etc. Intel's product portfolio has become much more fragmented in developing focused solutions for an increasing number of market segments. The value, then, is in specialized improvements for market segments. RISC-V is not going to support this type of segmented performance, and if it did, where is the advantage to Intel?

Intel's business value is in leading performance in each niche and extracting the additional benefits as profit. Unless its fabs sit underutilized, there is no real business incentive to build RISC-V solutions; and even then, they would make better margins with a fabrication business, without diluting their business model.
 
While doing a bit of investigation unrelated to this topic for some friends, I stumbled across the reason for Intel's interest in RISC-V. It was clearly for IFS. Intel's primary representative to RISC-V International is a VP from IFS, and they hoped to be a premier source for RISC-V IP in ASICs. I don't know why they really got out, whether it was cost-cutting or feeling threatened by the emergence of datacenter-class RISC-V cores, but the reason they got in is pretty obvious from this announcement:

 
Further evidence of Intel not taking their foundry services seriously. IFS is intended as a CHIPS Act money grab.

Taxpayer funds need to go toward pure-play foundries, IC packaging companies, and perhaps US suppliers to the foundries.
 
Why does Intel need IFS to get CHIPS Act funding? No reason I can think of.

Intel did define their Nios-V RISC-V 32-bit processor, which I assume was intended to replace Arm cores in their FPGA and networking SoCs. Most (all?) of these SoCs are currently fab'd at TSMC, and I suspect these SoC products are also targeted for a transition to IFS as the right process, IP, and capacity become available. If I were on one of these internal teams, I would barf all over the notion of switching away from Arm (too much work and cost for too little benefit). Perhaps Qualcomm wants to transition away from Arm for obvious reasons (Arm has sued them over their Nuvia CPU effort), though I suspect Qualcomm will just do their own RISC-V cores. But it is also possible that Arm's hold on the embedded space is greater than Intel anticipated, and they just decided to stop throwing good money after bad. There are many factors that could have influenced the investment cancellation, and we may never know exactly why.
 
Because the CHIPS Act funding should go toward the DoD or toward enabling smaller US companies to design chips (growing our country's chip-making infrastructure to more than a few monopolies), not to Intel at the expense of AMD. IFS getting funded is a bait-and-switch tactic by Intel. They care about their CPUs and FPGAs, not IFS, IMO. RISC-V is for the small design houses.

Note: I edited the paragraph above
 
"Should" has nothing to do with the legislation. It is what it is, and while I think we need to reduce the cost differential between building fabs in the US and Asia, the CHIPS Act is one ridiculous and inefficient way to do it. Nonetheless, your small company utopia is not what was signed into law. Intel has not announced they're applying for funding yet.
 