Addressing data centre challenges in the era of surging AI workloads

The surge in data means that data centres are more critical than ever, as are their power and cooling requirements, making a holistic approach imperative, according to Martin Ryder, channel sales director Northern Europe at Vertiv.

Understanding the future of business operations requires a thorough look at the upcoming surge in data, a trend that has caught the attention of stakeholders across industries. In this rapidly changing landscape, hyperscalers play a key role as pioneers, setting the specifications that will drive the next significant transformation. The ongoing collaboration between hyperscalers and the supply chain represents a close partnership aimed at aligning expectations and optimising available resources.

That this transformation is coming is not up for debate; the question now is when it will fully take place. Customers, realising the importance of this imminent change, are actively expressing interest, with some forward-thinking organisations already making strategic adjustments to integrate smoothly with the upcoming artificial intelligence (AI) revolution.

As we navigate through this evolving terrain, the combination of increasing bandwidth requirements and the continuous influx of data is fundamentally reshaping the business landscape. Beyond the traditional considerations of power and cooling, attention is now squarely on connectivity, emerging as a critical element that significantly influences the design and functionality of data centres. The relationship between hyperscalers, the supply chain, and businesses adapting to the demands of the AI era highlights the complexity and interdependence of these crucial components in shaping the future of data management and technological advancement.

Exploring power requirements

One of the primary challenges looming on the horizon is the significant surge in power requirements, a direct consequence of the deployment of high-performance CPUs and GPUs essential for handling intricate AI workloads. As businesses gear up for the AI revolution, this surge presents a pressing issue that demands strategic solutions from data centres.

Critical infrastructure solution providers are taking a leading role in providing innovative solutions for power management and optimisation. Effectively navigating the challenge of rising power demands entails a proactive exploration and implementation of such cutting-edge solutions. There is an emphasis on the integration of energy-efficient hardware within data centres, which involves not only adopting state-of-the-art hardware designs but also staying abreast of advancements in processor technology.

The strategic emphasis on power efficiency goes beyond immediate operational needs. It aligns with the broader imperative of promoting sustainability in the face of escalating energy consumption. By taking a forward-thinking stance on power efficiency, data centres can not only meet the challenges posed by burgeoning AI workloads but also contribute to a more environmentally conscious and sustainable future.

Cooling

Turning to cooling, thermal management within data centres deserves a closer look, not least the ongoing trend toward liquid cooling. Over the years, data centre designs have progressed from chilled water systems to indirect adiabatic systems, with a recent resurgence of interest in chilled water systems and three distinct options for liquid cooling at the rack level.

The first option involves directing liquid to the server itself and using a room-based heat exchanger to reject heat back into the air. This modular system allows seamless integration without substantial changes to existing infrastructure. The second option introduces a Cooling Distribution Unit (CDU), which circulates liquid directly to and from the server or GPU and connects to a chilled water system. The third option is an interchangeable liquid-to-gas system. This approach incorporates a remote condenser on the roof or elsewhere on the building and uses gas-to-liquid heat exchangers for deployment flexibility.

Ultimately, despite progress in liquid cooling, it is most likely that air-cooled and liquid-cooled solutions will co-exist. Even within liquid-cooled servers, elements necessitating air cooling persist, highlighting the nuanced nature of the evolving thermal management landscape.

Ensuring a holistic approach

In the face of the ever-evolving and rapidly changing landscape of data centre infrastructure, a holistic design approach emerges as the cornerstone for operators aiming to future-proof their operations and enable compatibility with new technologies and emerging waves of demand.

The key to success lies in involving all stakeholders and recognising the importance of collaboration and communication across diverse disciplines. Engaging not only power and cooling specialists but also those responsible for facility management, storage and technology deployment fosters a comprehensive understanding of the data centre’s intricate requirements.

As data centres embrace denser configurations and rapidly evolving technology, the holistic approach extends to decision-making timelines. While operators may be inclined to defer decisions to the final stages of design, a balance must be struck to avoid risks associated with delayed investments and potential loss of market share. Holistic design, therefore, involves streamlining decision-making processes while considering lead times and involving stakeholders at every stage.

In discussions with industry experts, technology interchangeability emerges as a critical consideration for clients. In some areas we have seen a slowdown in direct deployments by hyperscalers, which may reflect a strategic pause to understand what technology changes and specifications are required. Challenges arise in finding the optimal operating conditions for CPUs and GPUs, with manufacturers defining specifications and clients striving to plan for a diverse technology landscape over the next five to 10 years.

In this pursuit of future-ready design principles, clients encounter design pitfalls and challenges. The balance between CPU and GPU environments, coupled with defining optimal operating conditions, requires a meticulous approach to allow adaptability over an extended operational lifespan. As the industry grapples with these complexities, a holistic design ethos remains the compass guiding operators through the dynamic terrain of data centre evolution.

Holistic design

In this era dominated by AI, mobile and cloud technologies, with hybrid computing as the new norm, the importance of holistic design in data centres has never been more apparent. The growing fungibility of AI workloads signals a paradigm shift: workloads are no longer static, but dynamic, ever-changing entities.

Navigating this dynamic landscape requires a holistic approach. Data centre architects, faced with the challenges of climate change, surging power requirements and heightened heat generation, are at the forefront of this transformative journey.

By embracing a holistic design philosophy, data centres can position themselves not only to meet the burgeoning demands of the AI-driven era but to thrive in the face of them. Sustainability and efficiency become the bedrock of operations, ensuring that data centres lead the charge in an era defined by growth and technological innovation.