Nvidia’s data center customers are postponing Blackwell chip orders due to overheating and other issues

What just happened? Some of Nvidia’s top enterprise customers are reportedly delaying orders of the latest Blackwell chip racks due to overheating issues and glitches in chip connectivity. The news has sent ripples through the tech industry and financial markets, with Nvidia’s shares experiencing a sharp four-percent decline in early trading.

The Information reports that Blackwell GB200 racks, crucial components in data centers, have exhibited problems during initial deployments. The reported cause is the unprecedented power consumption of these cutting-edge GPUs: each rack draws 120-132kW, and that extreme power density has pushed traditional cooling systems to their limits.

Additionally, initial shipments of Blackwell racks revealed interconnect glitches, hampering efficient heat distribution and creating problematic hotspots. The complex multi-chip module design, which integrates two large GPU dies on a single package, further exacerbates the heat management challenges.

As deployments scale, with configurations featuring up to 72 Blackwell chips per rack, these thermal inefficiencies compound dramatically. The current server rack designs have proven insufficient to handle the extreme thermal output, prompting Nvidia to request multiple design modifications from its suppliers. Resolving these issues will likely require a combination of chip-level optimizations, the development of more advanced cooling solutions, and a complete overhaul of server rack infrastructure.
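To put the rack-level figures in perspective, the short Python sketch below divides the reported 120-132kW per rack by the 72-chip configuration. The result is only an average share of rack power per chip slot; it also covers CPUs, switches, and other rack overhead, so it should not be read as a GPU power rating. The function name and constants are illustrative, not taken from Nvidia or The Information.

```python
# Rough average power share per chip slot in a GB200 rack.
# Inputs come from the figures reported in the article; the even split
# across chips is a simplifying assumption, not a measured value.

RACK_POWER_KW = (120.0, 132.0)  # reported per-rack draw, low and high end
CHIPS_PER_RACK = 72             # largest configuration cited in the article


def per_chip_share_kw(rack_kw: float, chips: int = CHIPS_PER_RACK) -> float:
    """Return average rack power per chip slot in kW (includes non-GPU overhead)."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return rack_kw / chips


low, high = (per_chip_share_kw(kw) for kw in RACK_POWER_KW)
print(f"{low:.2f} to {high:.2f} kW per chip slot")  # 1.67 to 1.83 kW per chip slot
```

Under that even-split assumption, each chip slot accounts for roughly 1.7 to 1.8kW of rack power.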

Some of Nvidia’s biggest buyers, including Microsoft, Amazon Web Services, Google, and Meta Platforms, have reduced their orders for the Blackwell GB200 racks. These hyperscalers had placed orders worth $10 billion or more for the new technology. The impact of these order reductions could be significant.

For instance, Microsoft had initially planned to install GB200 racks with at least 50,000 Blackwell chips in one of its Phoenix facilities. However, as delays emerged, Microsoft’s key partner, OpenAI, requested Nvidia’s older generation ‘Hopper’ chips instead.

It remains unclear how these order reductions will ultimately affect Nvidia’s sales. Other potential buyers for the GB200 server racks may exist, even with the reported issues.

Nvidia CEO Jensen Huang has denied media reports that a flagship liquid-cooled server containing 72 of the new chips overheated during initial testing. In November, Huang also stated that the company was on track to exceed its earlier target of recording several billion dollars in revenue from Blackwell chips in its fourth fiscal quarter.

Nvidia and Amazon have declined to comment on the situation, while Microsoft, Google, and Meta have not yet responded to requests for comment.

Written by Mr Viral
