Will the Chinese DeepSeek AI upset the AI/ML race?

I tend to try things myself instead of relying on randos on YouTube or TikTok. Here's a simple output from OpenAI with its reasoning path:
 

Attachments

  • Screenshot 2025-01-29 at 9.57.11 PM.png

I think there is a misunderstanding about the "reasoning process". In reasoning LLMs, it is referred to as "chain of thought".

What you have posted from ChatGPT is merely the final answer to the question, without showing the user the chain of thought.

Here is the output for the exact same question from DeepSeek; you can see the detailed chain of thought before the final answer.

蜂蜜浏览器_2025-01-30_162323.jpg
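
To make the distinction concrete, here is a minimal sketch of pulling both pieces from a reasoning model, assuming the OpenAI-compatible DeepSeek endpoint and its deepseek-reasoner (R1) model; the reasoning_content field name follows DeepSeek's API docs at the time of writing, and the key and prompt are placeholders.

Code:
# Minimal sketch: a reasoning model returns its chain of thought separately
# from the final answer. Assumes DeepSeek's OpenAI-compatible API; field
# names per its docs, key and prompt are placeholders.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",          # placeholder
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",                # the R1 reasoning model
    messages=[{"role": "user", "content": "How many 'r's are in 'strawberry'?"}],
)

msg = resp.choices[0].message
print("--- chain of thought ---")
print(msg.reasoning_content)   # step-by-step reasoning, shown before the answer
print("--- final answer ---")
print(msg.content)             # the part a non-reasoning UI would display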
 
I think you're trying to be pedantic about something you're wrong about, and something that is also meaningless to the original discussion. The base model, DeepSeek V3, used data distilled via the OpenAI API - it's becoming more and more apparent. DeepSeek R1 was built on top of DeepSeek V3, adding reasoning via reinforcement learning.
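
For anyone unfamiliar with the term, "distillation" here means using a stronger teacher model's outputs as training data for another model. A rough, hypothetical sketch of that pipeline follows; the teacher model name, prompt list, and output file are illustrative assumptions, not DeepSeek's actual process.

Code:
# Hypothetical sketch of API-based distillation: collect a teacher model's
# answers to a prompt set and save them as supervised fine-tuning data for a
# student/base model. Illustrative only, not DeepSeek's actual pipeline.
import json
from openai import OpenAI

teacher = OpenAI(api_key="YOUR_API_KEY")      # placeholder teacher endpoint

prompts = [
    "Explain chain-of-thought prompting in two sentences.",
    "What is reinforcement learning from human feedback?",
]

records = []
for p in prompts:
    resp = teacher.chat.completions.create(
        model="gpt-4o-mini",                  # assumed teacher model name
        messages=[{"role": "user", "content": p}],
    )
    records.append({"prompt": p, "response": resp.choices[0].message.content})

# The (prompt, response) pairs become the SFT corpus for the student model;
# a reasoning stage (e.g. reinforcement learning, as with R1) would come later.
with open("distilled_sft.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")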
 

AI experts have already debunked OpenAI's claim that DeepSeek R1 has anything to do with OpenAI's products.
 
TSMC was $222 before DeepSeek. It dropped to $188 on Monday the 27th when DeepSeek "shocked the world". It closed today at $209, still $13 off from $222 but $21 recovered from $188.

Looks like the Deepfake's glorious one-day "breakthrough" tour is over?
 
 
Thought this was an interesting view from some experts. There's a good chance that DeepSeek did distill from OpenAI, but that's not the important thing. The biggest reason to train/create the biggest (most compute-intensive) models is to be able to distill them into smaller, targeted solution models.

Then you can run those models on low-cost devices. I believe IBM is a customer of Gaudi 3.


Also, distillation requires much less GPU training time.
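
As a rough back-of-the-envelope illustration of why, using the common C ≈ 6·N·D training-FLOPs rule of thumb; the parameter and token counts below are made-up assumptions, not anyone's actual numbers.

Code:
# Back-of-the-envelope comparison of pre-training a large model vs fine-tuning
# a distilled student, using the common C ~= 6 * N * D FLOPs rule of thumb.
# All numbers are illustrative assumptions.

def train_flops(params: float, tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * params * tokens

teacher = train_flops(600e9, 10e12)   # hypothetical teacher: 600B params, 10T tokens
student = train_flops(8e9, 1e9)       # hypothetical student: 8B params, 1B distilled tokens

print(f"teacher pre-training : {teacher:.2e} FLOPs")
print(f"student distillation : {student:.2e} FLOPs")
print(f"ratio                : {teacher / student:,.0f}x less compute for the student")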

 
Buried in the discussion at the end is that the focus moves from just the models to the entire AI app solution framework - the models are just building blocks.
I don't think there is a strong argument for a solution stack. I think the most important aspects are the capability of a model and then cost. Once you have those, there are frameworks such as LangChain that people can leverage. For serious development, a unique data set and evaluation approaches specific to an application are important, but they are not shared, and hence I don't think they are part of any stack.
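
To the point about frameworks: a minimal sketch of the kind of leverage meant here, assuming the langchain-core and langchain-openai packages and an OpenAI-style model; the model name and prompt are placeholders. The framework only wires prompt to model, so model capability and cost remain the deciding factors.

Code:
# Minimal sketch of wiring a prompt to a chat model with LangChain (LCEL).
# Assumes langchain-core and langchain-openai are installed and an API key
# is configured; model name and prompt are placeholders.
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant for semiconductor market questions."),
    ("user", "{question}"),
])
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)   # swap for any capable/cheap model

chain = prompt | llm                                    # LCEL composition

reply = chain.invoke({"question": "Summarize the DeepSeek R1 training approach."})
print(reply.content)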
 
And yet SoftBank will pay $3B / year to OpenAI so they can offer AI solutions in Japan... Real money flow is probably more important than benchmarks.

He also funded WeWork. At the moment, even with the current rates charged by OpenAI, they are not profitable. I believe that due to DeepSeek, OpenAI is rushing to launch new services. How can they make money with increasing competition at significantly lower costs?
 