Intel is said to have made a $2B takeover offer for chipmaker SiFive?!?!?

Daniel Nenni

Admin
Staff member
Intel is said to have offered to purchase SiFive for more than $2B. SiFive, a designer of semiconductors, has been talking to its advisors to see how to proceed, according to a Bloomberg report, which cited people familiar with the matter. SiFive has received multiple bids from other interested parties and has also received offers for an investment. SiFive last raised more than $60M in a Series E financing round last year and was valued at about $500M, according to PitchBook. In June 2019, Qualcomm (NASDAQ:QCOM) participated in a $65.4M Series D round for SiFive, a fabless semiconductor company building customized silicon based on the open RISC-V instruction set architecture.

Wow, great move if it is true. If Intel wants to get into the foundry business, doing turnkey ASICs is definitely the way to go. Intel already acquired eASIC. That way Intel can closely control and protect IP and make sure designs/chips are done the Intel way, absolutely.

The ASIC business has changed quite a bit over the last couple of years as fabless chip companies take control (Marvell, Broadcom, and MediaTek). Exciting times in the semiconductor ecosystem, absolutely!

 

kvas

New member
Interesting move by Intel, I like it! I wonder if they are thinking that with the whole industry drifting towards ARM, RISC-V will be the next stop on the road to CPU unification and commoditization and they are trying to jump directly into the future.
 

Daniel Nenni

Admin
Staff member
The majority of SiFive’s revenue is OpenFive, which was formerly Open-Silicon, the ASIC company. That is probably the jewel in the crown.
 

soAsian

Member
It's probably in Intel's best interest to bet on RISC-V since Apple is taking up ARM, with AWS, Google and Microsoft helping Qualcomm with Windows 10 for ARM.
 

Karl S

Member
And it is a graceful way to move away from the multi-core, superscalar, out-of-order execution, overly complex x86 into heterogeneous computing, just as Apple is doing. BUT Microsoft Research "Where's the Beef?" found that FPGAs were the way to go because there is no instruction fetch from memory. That means that RISC-V, which is a load/store architecture, is not the answer.

The remaining problem is that FPGA design is very hard to do. (That is if it is done with traditional HDL tool flow)

There is a simpler way to design FPGAs using the Roslyn Compiler API to personalize a simple FPGA design. (actually making the FPGA "programmable")

The problem now becomes how to convince designers that there actually is a simpler way, and that RISC-V is not the magic bullet.
 

count

Active member
I think RISC-V vs ARM is going to be an important battle in the 2030s. I think it's a smart move on Intel's part, but it's a long game they are playing and it's not something that is likely to move the needle for a decade. But it is good to see the company thinking ahead for a change.
 

Karl S

Member
So what can we expect for the next 10 years?

I think that RISC-V is doing the same thing again (load/store) and expecting different results (which is a symptom of insanity) just because it is "free". It is based on assembler level programming, but practically no one programs in assembler. After all, Intel started with the 8080, which was about as RISC-y as it could be.

ARM doesn't have all the answers, either. So what is the next step for ARM?

In fact, most programming is done in languages that are compiled at a more abstract level than C, and it seems that RISC-V has only an assembler and a C compiler (maybe in the works).

Intel should develop easy to use design tools for heterogeneous FPGA applications. Seems that Apple dumped Intel and the M1 chip looks to be heterogeneous -- BUT not ARM.

C# has everything to build the blocks (classes) that have identical functional behavior as Verilog modules. And the VS IDE has the build and debug tools for free.

Intel should focus on heterogeneous FPGA design. Longer pipelines, multicore, out of order, etc. are not the answers. Neither is Load/Store because the performance is limited by memory, primarily access time, since day one.

Maybe ARM will take on heterogeneous design.

But did Apple get there first?
 
There are many opinions on this. I think the future is "multi-core super scalar out of order execution overly complex x86". X86_64 still provides the most algorithm computing power per power usage. X86_64 power saving from turning off units is only behind because there hasn't been X86_64 competition recently to drive fast switching semiconductor technology. I see future process quality as not being about low power. Every (there must be some exceptions) interesting computing application has steps that do not parallelize. I think you posters are assuming computation is illiterates watching videos and social networking on their cell phones. One example is molecular applications that are needed for medicine development.
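
The remark about steps that do not parallelize is usually quantified with Amdahl's law: if a fraction p of the work parallelizes across n cores, the overall speedup is 1 / ((1 - p) + p / n). A minimal sketch follows; the function name and the workload fractions are illustrative assumptions, not measurements from anyone in this thread.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: overall speedup when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads: these fractions are illustrative, not measured.
    for fraction in (0.5, 0.9, 0.99):
        for cores in (2, 8, 64):
            speedup = amdahl_speedup(fraction, cores)
            print(f"parallel={fraction:.2f} cores={cores:3d} speedup={speedup:.2f}")
```

Under this model, doubling the cores only doubles performance when the whole program parallelizes; with half the work serial, no number of cores gets past a 2x speedup.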
 

count

Active member
Maybe me saying it's a 2030s battle isn't really correct.

I expect ARM to dominate the next 10 years at the very least. ARM is increasingly moving up the value chain from mobile and embedded to PCs and servers in a way that mirrors how Intel moved from PCs to servers in the 1990s.

As far as heterogeneous goes, that's a fabless vs IDM battle almost by definition. ARM can help better enable heterogeneous designs, to get itself designed into heterogeneous architectures, but it's not ARM that's building the chips; it's licensing the IP for one part of the chip that fabless companies are designing for their varied use cases.

Only with a lot of investment can RISC-V even hope to compete with ARM's ecosystem, and even then it'll take 10 years, but those investments need to be made now for that to happen - just like investments into the ARM ecosystem in the early 2000s are what paved the way for ARM to become the powerhouse it is today.
 

Karl S

Member
I wish there was a way to get past "opinions" with some realistic analysis. Multi-core was sold as "pie in the sky" -- like double the performance for free. But no one could do the necessary parallel programming. Out of order execution and cache were invented for matrix inversion, but mainly justified on intuitive appeal rather than realistic test cases. Where are the practical evaluations that can be used to measure out of order execution? While we are at it let's find the test cases that measure the impact of pipeline latency on overall performance.

This thought/assumption "I think you posters are assuming computation is illiterates watching videos and social networking on their cell phones." is offensive.

Computation is evaluating mathematical and logical expressions. The logical evaluation determines if/which arithmetic expression to evaluate.

There is operator precedence that determines the sequence of evaluation for both expressions.

Both cache and out-of-order execution were conceived before or when compilers were just being developed. Sadly no one looks back and questions how effective they are.

Compilers allocate memory on the stack or heap. So computers must load operands and instructions from memory/cache. First fetch the instruction and fetch at least one operand after calculating the memory address. Does X86_64 do something different?

The procedural flow that fetches both operands and instructions is the primary performance limiter because of pipeline latency and memory access time. For an if statement it loads 2 operands, a compare instruction, a branch instruction, and finally the next 2 operands and the next instruction. This agrees with the "Where's the Beef" observation that FPGAs win because of instruction fetches.
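
To put rough numbers on that sequence, here is a small tally of the accesses listed for one if statement. It is only a sketch under loud assumptions: every access costs the same hypothetical time, nothing overlaps, and there is no cache or prefetch, which is a worst case rather than how any particular CPU behaves.

```python
# Memory accesses for one `if`, in the order listed in the post above.
IF_STATEMENT_ACCESSES = [
    ("operand", "first compare operand"),
    ("operand", "second compare operand"),
    ("instruction", "compare"),
    ("instruction", "branch"),
    ("operand", "next first operand"),
    ("operand", "next second operand"),
    ("instruction", "next instruction"),
]


def tally(accesses, access_time_ns):
    """Count accesses by kind and total the time, assuming none overlap."""
    counts = {"operand": 0, "instruction": 0}
    for kind, _description in accesses:
        counts[kind] += 1
    total_ns = len(accesses) * access_time_ns
    return counts, total_ns


if __name__ == "__main__":
    # access_time_ns is a hypothetical figure chosen only for illustration.
    counts, total_ns = tally(IF_STATEMENT_ACCESSES, access_time_ns=2.0)
    print(counts, f"total={total_ns} ns")
```

In this simplified count, 3 of the 7 accesses are instruction fetches, which is the share a design with no instruction fetch from memory would not have to pay.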
 
You people need to learn some history. In the 1950s, probably the smartest person of the 20th century, John von Neumann, rejected his 1930s quantum logic and developed the von Neumann architecture. The model he used is called MRAM (random memory access with multiply and indexing). His papers showed that neural network type parallelism is inefficient. The theoretical basis for this was analyzed by Hartmanis and Simon. The analysis shows Turing Machines with the need for parallel tapes are slow because of the lack of indexing, registers and RAM. For von Neumann's MRAM model P is equal to NP so there is no need for guessing. Here is my historical preprint on von Neumann's 1950s thinking: "John von Neumann's 1950s Change to Philosopher of Computation". URL: https://arxiv.org/abs/2009.14022

Von Neumann showed complex instruction set CPUs such as X86_64 are much faster than RISCs and especially ARMs.
 

Karl S

Member
So he could also "see" into the future because the CISC/RISC concepts did not exist at that time.

It so happens that I began working in computer systems in 1957, installing, troubleshooting, modifying, and getting into the inside of computers to see exactly how things worked or didn't work. The first system/computer was the AN/FSQ-7, a descendant of MIT's Project Whirlwind.

Also, I was working in IBM systems development when cache was invented for the System/360 Model 85, and also when John Cocke and George Radin were touting the 801 (RISC) architecture.

If you Google MIT project whirlwind, you will find some interesting history about computers that I saw first hand. Any more suggestions about what I should do? That is except go to Hell?
 
In the late 1940s and early 1950s von Neumann was creating
his architecture. He understood the importance of random
access memory accessed into registers by instructions with
as many addressing modes as could be encoded in instructions.
The story is told in William Aspray's excellent book,
"John von Neumann and the Origins of Modern Computing."
 

kvas

New member
So von Neumann preferred random access memory (like in VN architecture) to Turing machines on efficiency grounds -- fair enough. Still, I don't see how this maps to RISC vs CISC debate: both of those work with random access memory. What am I missing?
 

Karl S

Member
Part of it is that RISC and CISC are just buzz words. The 801 architecture was an attempt to use the micro-programmed control of System/360 and System/370 as an ISA.
Well, that didn't fly so well, mainly because the micro program control array had to run at clock speed and did not fetch "instructions" from memory. But a computer ISA accesses memory for both instructions and data, so speed was limited by memory cycle time rather than clock frequency.

But some computer science guy (not a computer hardware guy) coined the term RISC and it had so much curb appeal...

Superscalar designs with pipelines as long/deep as possible and out-of-order execution are justified intuitively because there is no practical way to define CPU "speed": the time to execute programs depends more on the program than the CPU. To make things worse, it is not possible to run the same program on a "RISC" and a "CISC" to measure time to execute.

By the way, pipeline latency is ignored because it is assumed that the expressions are long enough that the latency is a small factor of evaluation time. I completely disagree except for very special cases.

There was a lot of analysis done including different computer architectures, and eventually MS project Catapult came to use FPGAs instead of a computer based on performance. There is still the question of "How does an FPGA outperform a CPU that runs at ten times the clock speed?"

Of course neither the RISC nor the CISC camp will accept that it is true. Yes, project Catapult uses FPGAs in the data centers.
 

kvas

New member
Yeah, that's kind of how I was thinking about this: performance depends on the clock frequency, size of caches, size of pipelines, memory latency, out of order smart-assery that the CPU is doing, and god knows what else. To make matters worse, different programs will run at different "speeds" and people don't necessarily agree on what represents a "realistic workload", so even with a bunch of benchmarks and fancy statistics it's still not always clear who wins, unless it's a hands-down victory, sort of like "I can run your binary code in emulation faster than you run it natively". Because of this I find it hard to be convinced by philosophical arguments for a certain hardware architecture.

I can see why FPGAs are fast though. They are playing a different game, so yeah. As a software developer, I hope the industry is ready to go there some day, that would be fun.
 

Karl S

Member
Hi! I am working on teaching an FPGA to play this game. It is a bit like using a JIT to compile CIL/intermediate code.

The C#/Roslyn Compiler API will compile C expressions and produce the operators and operands in sequence to evaluate using a simple stack. Also it will emit the statements and loop conditional expressions.

The FPGA then does if/else, for, while, do and evaluates the assignments.

FPGAs have embedded memory blocks that are very fast, dual ported.

One memory block holds the variables and constants. Reads 2 operands for the binary expressions.

Another holds operators and address of the next operand.

A third one holds stack pointer and keyword address.

First cycle: read addresses of first 2 operands

Next/Second cycle: read first 2 operands and first operator

Next/Third cycle: read next operand and next operator

Repeat until last operator

Last cycle: Put away/write result

Expressions are Boolean for loop control and the other keywords

Both arithmetic and Boolean expressions for assignments
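
The cycle sequence above can be mimicked in software to check the cycle count. This is only a rough sketch, not the actual FPGA design or the Roslyn-generated output: `run_chain`, the memory layouts, and the `OPS` table are hypothetical names, the third memory block (stack pointer and keyword address) is not modeled, and operators are assumed to arrive already ordered so that a plain left-to-right chain is enough.

```python
import operator

# Operators the sketch understands (an illustrative subset).
OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "<": operator.lt, "==": operator.eq,
}


def run_chain(data_mem, first_two, op_mem, result_addr):
    """Evaluate a left-to-right operator chain and return the cycle count.

    data_mem    -- variables and constants (first memory block)
    first_two   -- addresses of the first two operands
    op_mem      -- (operator, address of next operand) entries (second block)
    result_addr -- address where the result is written
    """
    cycles = 1                              # first cycle: read the two addresses
    addr_a, addr_b = first_two
    op, next_addr = op_mem[0]               # second cycle: two operands + operator
    acc = OPS[op](data_mem[addr_a], data_mem[addr_b])
    cycles += 1
    for op, following_addr in op_mem[1:]:   # one cycle per further operand/operator
        acc = OPS[op](acc, data_mem[next_addr])
        next_addr = following_addr
        cycles += 1
    data_mem[result_addr] = acc             # last cycle: write the result
    cycles += 1
    return cycles


if __name__ == "__main__":
    # (a + b) * c - d with a=2, b=3, c=4, d=5; the result goes to address 4.
    memory = [2, 3, 4, 5, 0]
    used = run_chain(memory, (0, 1), [("+", 2), ("*", 3), ("-", None)], 4)
    print(memory[4], used)
```

In this model an expression with k operators takes k + 2 cycles, with no instruction fetched from a program memory along the way.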

The demo FPGA uses a couple of hundred LUTs and 3 embedded memory blocks.

FPGAs have thousands of LUTs and hundreds of embedded memory blocks

SO WHAT: Dedicated functions running in parallel.

I once had a demo on GIT, still there? Dunno, nobody cared.

I think there's a demo on DropBox for download. Maybe I can post the link here.

I am using Visual Studio and C# for development/debug but don't know anything about team projects.
 

kvas

New member
I'd be interested in checking it out if you have it published somewhere. Don't have any FPGAs to try it on, but would be interesting to look at the code.
 

Karl S

Member

I am a novice, but think this is a link to something that may be useful.

Meanwhile I will try to attach my .cs code that uses the SyntaxWalker
 
I was trying to make a simpler point about computer
instruction set architectures in general. von Neumann
believed that complicated instruction set architectures
that facilitated table look up and indexing algorithms were
superior to simple instruction set architectures including
automatons and neural network type architectures.

Doesn't CISC mean complex instruction set? My understanding
of RISC architecture is a small number of simple instructions
with limited addressing modes. My claim is that von Neumann
was right and that in some theoretical efficiency sense,
independent of semiconductor fab technology, CISC
complexity is better.
 