
Microsoft Abandons More Data Center Projects, TD Cowen Says - Bloomberg

XYang2023

Well-known member
Microsoft Corp. has walked away from new data center projects in the US and Europe that would have amounted to a capacity of about 2 gigawatts of electricity, according to TD Cowen analysts, who attributed the pullback to an oversupply of the clusters of computers that power artificial intelligence.

The analysts, who rattled investors with a February note highlighting leases Microsoft had abandoned in the US, said the latest move also reflected the company’s choice to forgo some new business from ChatGPT maker OpenAI, which it has backed with some $13 billion. Microsoft and the startup earlier this year said they had altered their multiyear agreement, letting OpenAI use cloud-computing services from other companies, provided Microsoft didn’t want the business itself.

Microsoft’s retrenchment in the last six months included lease cancellations and deferrals, the TD Cowen analysts said in their latest research note, dated Wednesday. Alphabet Inc.’s Google had stepped in to grab some leases Microsoft abandoned in Europe, the analysts wrote, while Meta Platforms Inc. had scooped up some of the freed capacity in Europe.

Microsoft has said it will spend about $80 billion building out AI data centers in its fiscal year that ends in June, but that the pace of growth should begin to slow after that. Executives have said that, after a frantic expansion to support OpenAI and other artificial intelligence projects, spending would shift from new construction to fitting out data centers with servers and other equipment.

Spokespeople for Microsoft, Meta and Google didn’t immediately comment on the research note on Wednesday.

Earlier this week, Alibaba Group Holding Ltd. Chairman Joe Tsai warned of a potential bubble in data center construction, saying new projects may exceed demand for AI services.

“We continue to believe the lease cancellations and deferrals of capacity points to data center oversupply relative to its current demand forecast,” TD Cowen analysts Michael Elias, Cooper Belanger and Gregory Williams wrote.

I’m also wondering whether hyperscalers are scaling as efficiently as they could for GenAI and inference. It seems like disruptive chip companies such as Groq and Cerebras are building out and operating their own data centers and winning contracts directly with GenAI model/app companies and enterprises. Maybe that’s why NVIDIA is on the path to acquire the next-gen cloud AI provider Lepton.ai and is stepping up to deliver whole AI data centers.


It seems like AI inference data centers are fundamentally different from the old-guard (Amazon, Microsoft) data centers, and much more efficient? I see claims of Amazon undercutting NVIDIA chips in Amazon instances, but that’s all measured in Amazon-land, not in tokens/sec and cost against optimized chip vendors’ data centers.
AWS has its own inferencing chips, and I think the latest one is more efficient than NVDA GPUs in performance per watt (PPW) but not as powerful. AWS is always trying hard to use its own chips. Azure has many CPUs and GPUs from AMD and never got the same efficiency as the NVDA solution. AMD got punched hard because of this news.
 
AWS has its own inferencing chips, and I think the latest one is more efficient than NVDA GPUs in performance per watt (PPW) but not as powerful
Yes, but what the industry seems to be seeing with DeepSeek and others is that smart AI data-center-level design, plus model management and optimization, trumps individual chip performance/power/cost. That also means rack-level and data-center-level interconnectivity and resource allocation will look very different, driven by the model-serving strategies of the underlying chips (or, in the case of Cerebras, wafers).

I’m not sure hyperscalers like Amazon can implement the data-center level lessons from DeepSeek for their inference chips and other outside GPUs.
  • Prefill and Decode Disaggregation: By separating the prefill and decode phases of the inference process and assigning them to different sets of GPUs, DeepSeek can optimize each phase for its specific demands.
  • Dynamic Resource Allocation: This approach allows for tailored resource allocation (e.g., different degrees of parallelism) during prefill and decode, leading to efficient use of hardware and improved throughput.
  • Disaggregated Storage: DeepSeek uses a disaggregated approach in its Fire-Flyer File System (3FS), enabling flexible access to storage resources and avoiding traditional data locality limitations.
  • Smart KV Cache Management and Routing: DeepSeek incorporates a cost-effective and high-throughput KV Cache, which optimizes performance by storing and reusing previously computed data like key and value vectors, crucial for language model inference.
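To make the first and last bullets concrete, here is a toy sketch (not DeepSeek's actual code) of why prefill and decode are different workloads and what a KV cache buys during decode. The embedding function, dimension, and attention math are all illustrative placeholders, not any real model's internals.

```python
import numpy as np

dim = 8  # hypothetical embedding size

def embed(token_id):
    # Deterministic fake embedding for one token (stand-in for a real model).
    rng = np.random.default_rng(token_id)
    return rng.standard_normal(dim)

class KVCache:
    """Stores key/value vectors so decode steps never recompute them."""
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def __len__(self):
        return len(self.keys)

def prefill(prompt_ids, cache):
    # Prefill: process the whole prompt in one batch-friendly pass.
    # Compute-bound, so it suits hardware tuned for large matrix throughput.
    for t in prompt_ids:
        x = embed(t)
        cache.append(k=x * 0.5, v=x * 2.0)  # stand-in for K/V projections

def decode_step(token_id, cache):
    # Decode: one new token attends over ALL cached K/V entries.
    # Memory-bandwidth-bound; the cache avoids recomputing the prompt.
    q = embed(token_id)
    scores = np.array([q @ k for k in cache.keys])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    out = sum(w * v for w, v in zip(weights, cache.values))
    cache.append(k=q * 0.5, v=q * 2.0)  # new token joins the cache
    return out

cache = KVCache()
prefill([101, 102, 103], cache)   # phase 1: batched prompt pass
decode_step(104, cache)           # phase 2: incremental, cache-reusing decode
```

Because the two phases stress hardware so differently, disaggregating them onto separate GPU pools (with different parallelism settings for each) is what lets a serving stack tune each side independently, rather than compromising on one configuration for both.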
 