I tend to try things myself instead of relying on randos on YouTube or TikTok. Here's a simple output from OpenAI with the reasoning path:
I think there is a misunderstanding about the "reasoning process". In a reasoning LLM, it is referred to as the "chain of thought".
What you have posted from ChatGPT is merely the final answer to the question, without showing the user the chain of thought.
Here is the output for the exact same question from DeepSeek; you can see the detailed chain of thought before the final answer.
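For anyone who wants to see the distinction programmatically rather than from screenshots, here is a minimal sketch of pulling the chain of thought and the final answer as separate fields. It assumes DeepSeek's OpenAI-compatible endpoint and the reasoning_content field described in their API docs; the model name, base URL, prompt, and key are placeholders and may change.

```python
# Minimal sketch: retrieving the chain of thought separately from the final
# answer via DeepSeek's OpenAI-compatible API. Model name, base URL, and the
# reasoning_content field follow DeepSeek's published docs at the time of
# writing; treat them as assumptions and check the current API reference.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder credentials
    base_url="https://api.deepseek.com",   # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",             # R1-style reasoning model
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],  # placeholder question
)

msg = response.choices[0].message
print("--- chain of thought ---")
print(getattr(msg, "reasoning_content", None))  # intermediate reasoning tokens, if exposed
print("--- final answer ---")
print(msg.content)                              # what a non-reasoning UI would show
```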
I think you're trying to be pedantic about something you're wrong about, and something that is also meaningless to the original discussion. The base model, DeepSeek V3, used data distilled through the OpenAI API - it's becoming more and more apparent. DeepSeek R1 was built on top of DeepSeek V3, adding reasoning via reinforcement learning.
TSMC was $222 before DeepSeek. It dropped to $188 on Monday the 27th, when DeepSeek "shocked the world". It closed today at $209, still $13 off from $222 but $21 recovered from $188.
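For reference, a quick check of the arithmetic behind those figures (share prices as quoted above, in USD):

```python
# Quick check of the TSMC share-price arithmetic quoted above.
pre_selloff = 222    # close before the DeepSeek selloff
monday_close = 188   # close on Monday the 27th
today_close = 209    # today's close referenced in the post

print(pre_selloff - today_close)    # 13 -> still below the pre-selloff level
print(today_close - monday_close)   # 21 -> recovered from Monday's close
```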
Looks like the Deepfake glorious one-day "breakthrough" tour is over?
Thought this was an interesting view from some experts. Good chance that DeepSeek did distill from OpenAI, but that's not the important thing. The biggest reason to train/create the biggest (most compute-intensive) models is to be able to distill them into smaller, targeted solution models.

AI experts have already debunked OpenAI's claim that DeepSeek R1 has anything to do with OpenAI's product.
Then you can run those models on low-cost devices. I believe IBM is a customer of Gaudi 3.
Buried in the discussion at the end is that the focus moves from just the models to the entire AI app solution framework - the models are just building blocks.

Also, distillation requires much less GPU training time.
I don't think there is a strong argument for the solution stack. I think the most important aspects are the capability of a model and then cost. Once you have those, there are frameworks such as LangChain that people can leverage. For serious development, a unique data set and evaluation approaches unique to an application are important, but they are not shared, and hence I don't think they are part of any stack.
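To make the distillation point from the posts above concrete, here is a minimal sketch of the workflow: a large "teacher" model answers a batch of prompts over an API, and the resulting (prompt, answer) pairs become supervised fine-tuning data for a much smaller "student" model. The model name, prompts, and file name are illustrative placeholders, not anyone's actual training recipe.

```python
# Minimal distillation-style data collection sketch (illustrative only).
# A large "teacher" model answers prompts via an API; the pairs are saved
# as JSONL for supervised fine-tuning of a much smaller "student" model.
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")       # placeholder credentials

prompts = [
    "Explain what model distillation is in two sentences.",
    "Why do smaller models run well on low-cost devices?",
]

records = []
for prompt in prompts:
    answer = client.chat.completions.create(
        model="gpt-4o",                       # placeholder teacher model
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    records.append({"prompt": prompt, "completion": answer})

# Save the teacher outputs; any supervised fine-tuning pipeline targeting a
# small open-weights model can train on this file. That fine-tuning step
# needs far less GPU time than pretraining a large model from scratch.
with open("distill_data.jsonl", "w") as f:
    for row in records:
        f.write(json.dumps(row) + "\n")
```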
Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1.
— Mark Chen (@markchen90) January 28, 2025
Revaluing Nvidia, I leave market share and margin unchanged, but shrink the end market for AI chips. My value drops to $78, but with the price at $123, it is over valued by 59%, driving my decision to sell half of my remaining Nvidia shares, but it should not drive your… pic.twitter.com/wWzBISgEJ5
— Aswath Damodaran (@AswathDamodaran) January 31, 2025
— Marc Andreessen 🇺🇸 (@pmarca) February 1, 2025
These meme stock dynamics will chase a lot of funds away despite the attractive fundamentals. I have already heard from many large buy-side funds they are considering moving off Nvidia to other more stable names. https://t.co/UhBQjHaOoV
— Ben Bajarin (@BenBajarin) February 1, 2025
He also funded WeWork. At the moment, even with the current rates charged by OpenAI, they are not profitable. I believe that due to DeepSeek, OpenAI is rushing to launch new services. How can they make money with increasing competition (and significantly lower costs)?

And yet SoftBank will pay $3B / year to OpenAI so they can offer AI solutions in Japan... Real money flow is probably more important than benchmarks.
SoftBank joins with OpenAI in yearly $3B venture to expand AI in Japan
SoftBank inked a $3 billion deal with OpenAI in a joint venture to market OpenAI tech in Japan with its newly-minted "Cristal Intelligence" suite of tools.
ca.finance.yahoo.com
Rumor: A U.S. securities firm has adjusted the shipment forecasts for GB200 and GB300.
— Jukanlosreve (@Jukanlosreve) February 4, 2025
- 2382 Quanta: U.S. Investment Bank Lowers Target Price
A major U.S. investment bank has lowered its EPS forecasts for 2025-2026 to $17.1/$19.4 and adjusted its target price downward, based… https://t.co/bV8tzkdqSG