
CN-122028388-A - Data center liquid cooling flow dynamic distribution method and system based on load prediction

CN 122028388 A

Abstract

The invention relates to the technical field of data processing, and in particular to a load-prediction-based method and system for dynamically distributing liquid cooling flow in a data center. The method comprises: collecting and preprocessing time-series data of server power consumption and coolant outlet temperature in the data center; constructing sliding-window samples to train a stacked LSTM model that predicts a future heat-load sequence; fusing temperature and load information to calculate a thermal compression factor; obtaining, through mutual-information lag scanning, a thermal inertia factor that characterizes heat-dissipation delay; deriving a preventive scheduling priority from the thermal compression factor and the thermal inertia factor; constructing a constrained optimization model and solving it with a greedy algorithm to obtain an optimal coolant flow distribution scheme; and finally realizing closed-loop control through continuous real-time data acquisition. By combining load prediction, multidimensional factor fusion, and optimized flow distribution, the invention overcomes the shortcomings of traditional liquid cooling flow distribution, reduces the risk of server overheating, and improves both the energy efficiency of the liquid cooling system and the reliability of the data center.
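The abstract describes a sense-predict-prioritize-allocate loop. The following sketch shows that control-loop shape only; every function body, name, and numeric value here is an illustrative stand-in (the patent specifies the actual stacked-LSTM predictor, factor fusion, and greedy solver), not the patented implementation.

```python
# Illustrative closed-loop skeleton of the claimed method.  All function
# bodies, names, and values are assumptions for demonstration only.

def collect_telemetry(servers):
    # Stand-in for reading per-server power draw (W) and coolant outlet temp (degC).
    return {s: {"power": 250.0, "temp_out": 38.0} for s in servers}

def predict_heat_load(telemetry):
    # Stand-in for the per-server stacked LSTM forecast (here: 3-step persistence).
    return {s: [t["power"]] * 3 for s, t in telemetry.items()}

def scheduling_priority(telemetry, forecast):
    # Stand-in for the thermal compression / thermal inertia priority.
    return {s: telemetry[s]["temp_out"] for s in telemetry}

def allocate_flow(priority, total_flow=100.0):
    # Stand-in for the greedy LP solve: split flow in proportion to priority.
    z = sum(priority.values())
    return {s: total_flow * p / z for s, p in priority.items()}

def control_cycle(servers):
    telemetry = collect_telemetry(servers)
    forecast = predict_heat_load(telemetry)
    priority = scheduling_priority(telemetry, forecast)
    return allocate_flow(priority)   # would be sent to the flow regulating valves

flows = control_cycle(["srv-1", "srv-2"])
```

In a deployment, `control_cycle` would run on every sampling interval, closing the loop between valve commands and the next round of telemetry.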

Inventors

  • WU YANG
  • LIU JIE
  • CHEN XIN
  • LAN WEIJIE

Assignees

  • Jiajie Technology Co., Ltd. (嘉杰科技有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-04-10

Claims (9)

  1. A load-prediction-based method for dynamically distributing liquid cooling flow in a data center, characterized by comprising the following steps: collecting time-series data of the power consumption and coolant outlet temperature of each server in a data center cabinet, and preprocessing the data to obtain a standardized multidimensional time-series data set for each server; constructing sliding-window sample pairs for each server from the standardized multidimensional time-series data set, independently training a stacked LSTM model, and predicting each server's future heat-load sequence in real time with the stacked LSTM model; combining each server's current coolant outlet temperature with its future heat-load sequence, calculating a temperature stress factor and a standardized load fluctuation factor, and fusing them by weighting to obtain each server's thermal compression factor; extracting, from the preprocessed power-consumption and coolant-outlet-temperature time-series data, the instantaneous-change sequences of server power consumption and coolant outlet temperature, identifying by mutual-information lag scanning the characteristic lag time with which the server's heat-dissipation subsystem responds to heat-load changes, and normalizing it to obtain each server's thermal inertia factor, which quantifies the server's inherent heat-dissipation response delay; combining each server's thermal compression factor and thermal inertia factor to obtain a preventive scheduling priority, constructing a linear programming model that maximizes the weighted cooling gain of all servers subject to total-flow and per-server upper- and lower-bound flow constraints, solving it with a greedy algorithm to obtain an optimal coolant flow distribution scheme, converting the scheme into control instructions transmitted to intelligent flow regulating valves to complete the flow distribution, and repeating the above steps with continuously collected real-time data to form closed-loop control.
  2. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein constructing the sliding-window sample pairs comprises: defining the length of the history window input to the model and the length of the prediction window output by the model; sliding over the server's standardized multidimensional time-series data set with a fixed step; taking the two-dimensional features (power consumption and coolant outlet temperature) over the history window as the model's input features, and the server's total power consumption sequence over the immediately following prediction window as the prediction target, thereby forming a sample pair; and dividing all samples chronologically into a training set, a validation set, and a test set.
  3. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein the temperature stress factor is calculated as follows: obtaining the server's current coolant outlet temperature, computing its difference from a preset ideal base coolant outlet temperature, clipping the difference to be non-negative, and dividing the result by the difference between a preset upper safe coolant outlet temperature limit and the ideal base temperature to obtain the temperature stress factor.
  4. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein the standardized load fluctuation factor is calculated as follows: computing the ratio of the standard deviation to the mean of the server's future heat-load sequence to obtain a coefficient of variation, mapping the coefficient of variation through an exponential decay function, and taking the mapped result as the standardized load fluctuation factor.
  5. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein each server's thermal compression factor is calculated as follows: multiplying a preset weighting coefficient by the server's temperature stress factor, multiplying the difference between 1 and the weighting coefficient by the server's standardized load fluctuation factor, and summing the two products to obtain the server's thermal compression factor, which balances static temperature risk against dynamic load-fluctuation risk.
  6. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein the thermal inertia factor is calculated as follows: extracting the instantaneous-change sequences of the server's power consumption and coolant outlet temperature, identifying by mutual-information lag scanning the characteristic lag time with which the server's heat-dissipation subsystem responds to heat-load changes, and dividing that lag time by the maximum characteristic lag time among all servers in the cabinet to obtain the thermal inertia factor.
  7. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein the linear programming model is constructed as follows: the optimization objective is to maximize the sum, over all servers, of the product of each server's preventive scheduling priority and its allocated coolant flow, subject to three types of constraints: first, a total-flow constraint, whereby the sum of the coolant flows allocated to all servers in the cabinet does not exceed the total coolant flow available to the cabinet; second, per-server upper and lower bounds, whereby each server's allocated coolant flow lies between a preset minimum guaranteed flow and a maximum flow; and third, non-negativity, whereby each server's allocated coolant flow is not less than zero.
  8. The method for dynamically distributing liquid cooling flow in a data center based on load prediction according to claim 1, wherein the optimal coolant flow distribution scheme is obtained as follows: initially allocating the preset minimum guaranteed flow to every server and computing the remaining coolant flow in the cabinet; sorting the servers by preventive scheduling priority from high to low; allocating the remaining flow to the sorted servers in turn, each single allocation being the smaller of the remaining coolant flow and the difference between the server's maximum flow and its minimum guaranteed flow; and updating the remaining flow in real time until it reaches zero or all servers have been processed, yielding the optimal coolant flow distribution scheme for each server.
  9. A load-prediction-based data center liquid cooling flow dynamic distribution system, comprising a processor and a memory, the memory storing computer program instructions that, when executed by the processor, implement the load-prediction-based data center liquid cooling flow dynamic distribution method according to any one of claims 1 to 8.
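The sliding-window construction of claim 2 can be sketched as follows. The window lengths, the step of 1, and the 70/15/15 chronological split are assumed values, not figures from the patent.

```python
import numpy as np

# Sketch of the sliding-window sample-pair construction (claim 2).
# hist_len, pred_len, step, and the split ratios are assumed values.

def make_samples(series, hist_len=8, pred_len=3, step=1):
    """series: (T, 2) array of [power_consumption, coolant_outlet_temp] per step."""
    X, y = [], []
    for start in range(0, len(series) - hist_len - pred_len + 1, step):
        X.append(series[start:start + hist_len])             # 2-D input features
        y.append(series[start + hist_len:
                        start + hist_len + pred_len, 0])     # future power only
    return np.array(X), np.array(y)

def chrono_split(X, y, train=0.7, val=0.15):
    # Chronological (not shuffled) split into train / validation / test.
    n = len(X)
    a, b = int(n * train), int(n * (train + val))
    return (X[:a], y[:a]), (X[a:b], y[a:b]), (X[b:], y[b:])

# Toy series: power ramps linearly, temperature tracks it.
series = np.column_stack([np.arange(100.0), 30.0 + 0.1 * np.arange(100.0)])
X, y = make_samples(series)
```

Each `X[i]` is then fed to the per-server stacked LSTM, with `y[i]` as its regression target.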
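The factor computations of claims 3-5 reduce to a few arithmetic steps. In this sketch the ideal base temperature, safety limit, and weighting coefficient are assumed values, and the sign convention of the exponential decay mapping (here `1 - exp(-cv)`, so that higher volatility yields a higher factor) is an assumption the claims do not fix.

```python
import math

# Sketch of the temperature stress factor (claim 3), standardized load
# fluctuation factor (claim 4), and thermal compression factor (claim 5).
# t_base, t_safe, and alpha are assumed illustrative values.

def temperature_stress(temp_out, t_base=32.0, t_safe=45.0):
    # Claim 3: non-negative excursion above the ideal base temperature,
    # normalized by the margin to the upper safe temperature limit.
    return max(temp_out - t_base, 0.0) / (t_safe - t_base)

def load_fluctuation(forecast):
    # Claim 4: coefficient of variation of the future heat-load sequence,
    # mapped through an exponential decay function.  The direction of the
    # mapping (volatile load -> larger factor) is an assumption.
    mean = sum(forecast) / len(forecast)
    var = sum((x - mean) ** 2 for x in forecast) / len(forecast)
    cv = math.sqrt(var) / mean if mean > 0 else 0.0
    return 1.0 - math.exp(-cv)

def thermal_compression(temp_out, forecast, alpha=0.6):
    # Claim 5: weighted fusion balancing static temperature risk
    # against dynamic load-fluctuation risk.
    return alpha * temperature_stress(temp_out) + (1 - alpha) * load_fluctuation(forecast)
```

A server sitting at the safety limit with a perfectly flat forecast scores `alpha`; a cool server with a volatile forecast scores up to `1 - alpha`.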
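The mutual-information lag scan of claim 6 can be sketched with a simple histogram-based mutual information estimator. The bin count, the scan range of 1 to 10 lags, and the synthetic 3-step delay in the demo data are all assumed choices.

```python
import numpy as np

# Sketch of mutual-information lag scanning (claim 6).  The 8-bin
# histogram estimator and 1..max_lag scan range are assumed choices.

def mutual_info(x, y, bins=8):
    # Plug-in MI estimate from a joint 2-D histogram.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def characteristic_lag(d_power, d_temp, max_lag=10):
    # Lag at which the power-change sequence best explains the
    # later temperature-change sequence.
    scores = [mutual_info(d_power[:-k], d_temp[k:]) for k in range(1, max_lag + 1)]
    return 1 + int(np.argmax(scores))

def thermal_inertia(lags):
    # Claim 6: each server's lag normalized by the cabinet-wide maximum.
    m = max(lags.values())
    return {s: lag / m for s, lag in lags.items()}

# Synthetic instantaneous-change sequences: temperature reacts 3 steps late.
rng = np.random.default_rng(0)
power_delta = rng.normal(size=500)
temp_delta = np.concatenate([np.zeros(3), power_delta[:-3]])
lag = characteristic_lag(power_delta, temp_delta)
```

The scan recovers the injected delay, and normalization places every server's thermal inertia factor in (0, 1].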
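The greedy solve of claim 8 for the linear program of claim 7 can be sketched directly: guarantee every server its minimum flow, then top up servers in priority order until the total flow budget or the per-server caps are hit. The flow units and bounds below are illustrative assumptions.

```python
# Sketch of the greedy allocation (claim 8) for the LP of claim 7.
# f_min, f_max, and the total flow budget are assumed illustrative values.

def allocate(priority, total, f_min=1.0, f_max=6.0):
    """priority: {server: preventive scheduling priority}; total: available flow."""
    flows = {s: f_min for s in priority}           # guarantee the minimum first
    remaining = total - f_min * len(priority)
    # Serve servers from highest to lowest preventive scheduling priority.
    for s in sorted(priority, key=priority.get, reverse=True):
        if remaining <= 0:
            break
        grant = min(f_max - flows[s], remaining)   # top up to the per-server cap
        flows[s] += grant
        remaining -= grant
    return flows

flows = allocate({"a": 3, "b": 2, "c": 1}, total=10.0)
```

Because the objective of claim 7 is linear with a single coupling constraint, this priority-ordered greedy fill attains the LP optimum while respecting the total-flow, per-server bound, and non-negativity constraints.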

Description

Data center liquid cooling flow dynamic distribution method and system based on load prediction

Technical Field

The invention relates to the technical field of data processing, and in particular to a load-prediction-based method and system for dynamically distributing liquid cooling flow in a data center.

Background

The dynamic distribution of liquid cooling flow is a key link in data center heat-dissipation management; its efficiency and accuracy directly determine server operating safety and overall system energy efficiency. As server power density keeps rising and load dynamics intensify, the need for fine-grained scheduling of liquid cooling resources becomes urgent. Scientific, well-reasoned dynamic flow distribution avoids reliability risks such as local hot spots caused by insufficient cooling, reduces the waste of cooling resources on lightly loaded servers, and enables effective control of cooling energy consumption; it is thus the core technical support for efficient, stable, and energy-saving data center operation.
Load prediction is the core forward-looking technique in the dynamic distribution of liquid cooling flow. In particular, a prediction model based on a long short-term memory (LSTM) network is trained independently for each server in a data center cabinet: taking the server's historical power consumption and coolant outlet temperature as input, it outputs in real time an accurate heat-load (total power consumption) sequence for a specified future period. With total server power consumption as the core heat-load indicator, the server's future heat-dissipation demand can be anticipated in advance, providing key data support for active and precise allocation of cooling resources and forming the foundation for moving liquid cooling resources from passive response to active preventive scheduling. Existing data center liquid cooling flow distribution methods have evident technical defects in matching servers' current heat-dissipation demands. Uniform flow distribution ignores individual differences in real-time load and thermal state among servers, so that excess and shortage of cooling resources coexist; feedback control based on the current outlet temperature is delayed by heat-transfer inertia relative to the servers' actual heat-load changes, and can compensate only passively after overheating has occurred.
Moreover, neither method incorporates forward-looking prediction of future server load trends, nor does either quantitatively exploit the differing inherent heat-dissipation characteristics of individual servers; the result is a mismatch between cooling resources and servers' true heat-dissipation demands, operating risks that cannot be fully eliminated, and substantial ineffective cooling energy consumption.

Disclosure of Invention

To solve the cooling supply-demand mismatch, operating risk, and energy waste caused by uneven resource allocation and delayed response in existing data center liquid cooling flow distribution methods, which combine neither load prediction nor servers' inherent heat-dissipation characteristics, the invention provides a scheme in the following aspects. The method comprises: collecting time-series data of the power consumption and coolant outlet temperature of the servers in a data center cabinet, and preprocessing it to obtain a standardized multidimensional time-series data set for each server; constructing sliding-window sample pairs from the standardized data set for each server, independently training a stacked LSTM model, and predicting each server's future heat-load sequence in real time with the stacked LSTM model; combining each server's current coolant outlet temperature with its future heat-load sequence, calculating a temperature stress factor and a standardized load fluctuation factor, and fusing them by weighting to obtain each server's thermal compression factor; extracting, from the preprocessed power-consumption and coolant-outlet-temperature time-series data, the instantaneous-change sequences of server power consumption and coolant outlet temperature; identifying by mutual-information lag scanning the characteristic lag time with which each server's heat-dissipation subsystem responds to heat-load changes, and obtaining the thermal inertia factor quantifying the inherent response delay d