High Performance Computing: Liquid Cooling Essential for Future Efficiency

Increasing Power Consumption in Confined Spaces

The market for server cooling solutions is thriving, especially with Nvidia’s recent GPU advancements. As these high-performance systems evolve, they demand not only more advanced cooling but also more of it. Without proper cooling, the latest technology is severely restricted, and the problem is expected to get worse.

GPU accelerators are the main drivers behind this trend, with CPUs close behind. A gap between the two will remain, but both show steep upward trends in power consumption. CPUs and APUs, which currently consume around 350 to 400 watts, are expected to reach 500 to 600 watts. GPUs are projected to rise from around 700 watts to 1,200 or even 1,500 watts. Nvidia’s Blackwell GPU will require 1,200 watts with liquid cooling, and Intel’s Falcon Shores is anticipated to hit 1,500 watts. Discussions at the ISC High Performance event in Hamburg indicated that Intel would not offer an air-cooled version of Falcon Shores, committing entirely to liquid cooling with a 1,500-watt specification.

15 kW in a Server, 300 to 500 kW per Rack

To understand what this means at the system level, consider a fully loaded server. Two CPUs and eight GPUs alone draw 10 to 12 kW. Additional system components, such as the network infrastructure (with switches consuming up to 200 watts), quickly bring the total to around 15 kW.
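
As a rough cross-check, the server figure can be reproduced from the projected per-device numbers quoted earlier (500 to 600 watts per CPU, 1,200 to 1,500 watts per GPU). The following Python sketch is only a back-of-the-envelope illustration: the function name and the `other_watts` parameter are hypothetical, and the share of the remaining components is not broken down in the article, so it is left at zero by default.

```python
# Projected per-device power draw in watts (low, high), as quoted in the article.
CPU_WATTS = (500, 600)
GPU_WATTS = (1200, 1500)


def server_power_kw(cpus=2, gpus=8, other_watts=0):
    """Return the (low, high) power draw of one server in kW.

    other_watts is a hypothetical placeholder for everything that is not
    a CPU or GPU (networking, memory, storage); the article gives no
    breakdown, so it defaults to 0.
    """
    low = cpus * CPU_WATTS[0] + gpus * GPU_WATTS[0] + other_watts
    high = cpus * CPU_WATTS[1] + gpus * GPU_WATTS[1] + other_watts
    return low / 1000, high / 1000


low, high = server_power_kw()
print(f"CPUs and GPUs only: {low:.1f} to {high:.1f} kW")
```

With these inputs, the processors and accelerators alone land in the same range as the figure cited above, slightly higher at the top end if every GPU runs at the full 1,500 watts.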

Liquid cooling offers several advantages. Despite high initial costs, it becomes cost-effective when used for more than two years. It also supports faster hardware in a smaller space, improving overall efficiency.

Up to 1 kW for Air Cooling

Fewer racks or cabinets are needed with liquid cooling, and the power supplied is used for performance rather than driving fans. Lenovo, for example, emphasizes that “Fans don’t calculate” – they don’t contribute to computing power. Using nearly 1 kW per blade to push cooling air is inefficient. Although the infrastructure for a water cooling loop in a data center is expensive, it amortizes quickly, as highlighted by Lenovo at ISC 2024.

Supermicro has also significantly increased its focus on this area compared to a few years ago and is now developing its own solutions. For now, the company still relies on third-party expertise, but it plans to bring everything in-house soon. Customers can already handle all support needs directly through Supermicro.