CN-121985521-A - Energy consumption control method and system for double-coil cooling system of data center


Abstract

The application relates to an energy consumption control method for a double-coil cooling system deployed in a data center. The method comprises: collecting operation state data of the double-coil cooling system and constructing a state vector from the operation state data; using a hierarchical deep reinforcement learning network built on an Actor-Critic network architecture to make hierarchical decisions based on the state vector and to output energy consumption control actions comprising a system-level operation mode and component-level mechanism adjustment parameters; and executing cascade cooling control on the double-coil cascade cooling system according to the energy consumption control actions under a natural-cooling-priority principle. A reward value is calculated from the state vector and the response state observed after the energy consumption control actions are executed, using a multi-objective reward function with natural cooling priority, and the hierarchical deep reinforcement learning policy network is trained and its parameters updated using the state vector, the control actions, the response state and the reward value. The method maximizes the preferential use of natural cooling while ensuring safe operation of the IT equipment of the data center.
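
A minimal Python sketch of two steps summarized above: mapping raw operation data to a state vector, and turning a system-level mode plus component-level valve outputs into valve commands with natural cooling given priority. The signal names, their ranges, the [-1, 1] target interval and the 0.3 precooling opening are assumptions chosen for illustration; only the three modes and their valve behaviour follow claims 4 and 5.

```python
# Assumed (min, max) engineering ranges per signal; illustrative values only.
SIGNAL_RANGES = {
    "coil1_inlet_c": (5.0, 35.0),
    "coil2_inlet_c": (5.0, 20.0),
    "chilled_flow_m3h": (0.0, 200.0),
    "outdoor_wet_bulb_c": (-20.0, 35.0),
    "it_load_kw": (0.0, 2000.0),
}


def build_state_vector(raw, ranges=SIGNAL_RANGES, low=-1.0, high=1.0):
    """Min-max scale each signal into [low, high], clipping out-of-range values."""
    state = []
    for name, (lo, hi) in ranges.items():
        frac = (raw[name] - lo) / (hi - lo)
        frac = min(1.0, max(0.0, frac))  # clip to the assumed range
        state.append(low + frac * (high - low))
    return state


def valve_commands(mode, coil1_opening, coil2_opening, precool_opening=0.3):
    """Return (first coil valve, second coil valve) openings in [0, 1].

    mode comes from the upper-layer policy; the two openings come from the
    lower-layer policy. precool_opening is a hypothetical base opening.
    """
    if mode == "natural":
        # Natural cooling suffices: use the first coil only.
        return coil1_opening, 0.0
    if mode == "hybrid":
        # First coil at maximum opening; second coil covers the cooling gap.
        return 1.0, coil2_opening
    if mode == "mechanical":
        # Mechanical cooling carries the load; first coil only precools.
        return precool_opening, coil2_opening
    raise ValueError(f"unknown mode: {mode!r}")
```

In this sketch the upper-layer policy would choose `mode` on a long time scale, while the lower-layer policy would refresh the two openings on a short time scale.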

Inventors

PENG GUIPING
Pan Yueshuai
LI HONGTAO
Rao Zhiqin
SU YICHENG

Assignees

华信咨询设计研究院有限公司

Dates

Publication Date: 2026-05-05
Application Date: 2026-04-03

Claims (10)

1. A method for controlling energy consumption of a double-coil cooling system of a data center, the method comprising: collecting operation state data of the double-coil cooling system, and constructing a state vector from the operation state data; using a hierarchical deep reinforcement learning network built on an Actor-Critic network architecture to perform hierarchical decision-making based on the state vector and to output energy consumption control actions comprising a system-level operation mode and component-level mechanism adjustment parameters, and performing cascade cooling control on the double-coil cascade cooling system according to the energy consumption control actions under a natural-cooling-priority principle; and calculating a reward value from the state vector and the response state after execution of the energy consumption control actions, using a multi-objective reward function with natural cooling priority, and training and updating parameters of the hierarchical deep reinforcement learning policy network using the state vector, the control actions, the response state and the reward value.
2. The method of claim 1, wherein collecting operation state data of the double-coil cascade cooling system and constructing a state vector from the operation state data comprises: obtaining the operation state data of the double-coil cascade cooling system, wherein the operation state data comprises first coil water inlet and outlet temperatures, second coil water inlet and outlet temperatures, a chilled water flow, return air inlet and outlet temperatures, an outdoor temperature and humidity, an outdoor wet bulb temperature, an IT load power and a natural cooling system running time; and arranging the operation state data in vector form and mapping the vector to a preset interval to obtain the state vector.
3. The method of claim 2, wherein calculating the reward value using the multi-objective reward function with natural cooling priority comprises: determining reward function items, wherein the reward function items comprise a supply air temperature deviation penalty item, an energy consumption penalty item, a natural cooling utilization reward item, a mechanical refrigeration start penalty item and a mode switching penalty item; calculating a natural cooling energy efficiency ratio as the ratio of the cooling capacity provided by the natural cooling coil to the energy consumption of the natural cooling components, and introducing the natural cooling energy efficiency ratio into the natural cooling utilization reward item as an independent variable, so that the natural cooling utilization reward item increases as the natural cooling energy efficiency ratio increases; and computing a weighted sum of the reward function items to obtain the reward value, wherein the weight coefficient of the natural cooling utilization reward item is set to the largest of all the weights.
4. The method of claim 1, wherein outputting energy consumption control actions based on the state vector using a hierarchical deep reinforcement learning network comprises: outputting, through an upper-layer policy network, a system-level operation mode based on a long-time-scale state, wherein the system-level operation mode comprises a natural cooling mode, a hybrid cooling mode and a full mechanical cooling mode; and outputting, through a lower-layer policy network, water valve opening actions based on a short-time-scale state and the system-level operation mode output by the upper-layer policy network, the opening actions being used to control the opening of the first coil water valve, the opening of the second coil water valve and the chilled water flow; wherein the hierarchical deep reinforcement learning network is constructed using a deep deterministic policy gradient (DDPG) algorithm or a Soft Actor-Critic (SAC) algorithm, the upper-layer policy network and the lower-layer policy network share a unified experience replay buffer while sampling at different frequencies, an Actor network is used to output actions, and a Critic network is used to evaluate state-action values.
5. The method of claim 4, wherein performing cascade cooling control on the double-coil cooling system with natural cooling priority based on the energy consumption control actions comprises: when the natural cooling capacity is identified as sufficient, operating in the natural cooling mode, applying the first coil water valve opening output by the lower-layer policy network, and closing the second coil water valve; when the natural cooling capacity is within a preset interval, operating in the hybrid cooling mode, adjusting the first coil water valve opening to a maximum opening interval, and having the lower-layer policy network dynamically adjust the second coil water valve opening according to the real-time cooling capacity gap; and when the natural cooling capacity is insufficient or the outdoor wet bulb temperature is too high, operating in the full mechanical cooling mode, starting mechanical cooling gradually, and keeping a base opening of the first coil for precooling.
6. The method of claim 1, wherein using the state vector, the control actions, the response state and the reward value to update parameters of the hierarchical deep reinforcement learning policy network comprises: assembling the state vector, the control action, the response state and the reward value into a state transition experience, and storing the state transition experience in an experience replay buffer; and sampling, using a prioritized experience replay mechanism, natural cooling experiences whose reward exceeds a preset threshold from the experience replay buffer, optimizing and updating the parameters of the hierarchical deep reinforcement learning policy network, and using a target network soft update mechanism to improve training stability.
7. The method of claim 1, wherein during parameter updating of the hierarchical deep reinforcement learning policy network, the method further comprises: adopting a constrained reinforcement learning algorithm based on the Lagrange multiplier method, and setting the fluctuation range of the supply air temperature as a constraint function; introducing a Lagrange multiplier into the optimization objective of the hierarchical deep reinforcement learning policy network as the weight of the constraint function, forming a product term of the multiplier and the constraint function; dynamically updating the Lagrange multiplier according to whether the supply air temperature violates the constraint function, so that, by adjusting the weight of the product term in the multi-objective reward function, the policy optimizes energy consumption on the premise that the supply air temperature safety constraint is met; and dynamically adjusting the standard deviation of the exploration noise according to a prediction of natural cooling potential, increasing the exploration noise to encourage exploration of better policies when the natural cooling potential is higher than a first preset threshold, and reducing the exploration noise to ensure control stability when the natural cooling potential is lower than a second preset threshold.
8. The method of claim 1, further comprising, prior to constructing the policy network, a transfer learning step for quickly adapting to a new environment: pre-training the policy network in a simulation environment using a model-agnostic meta-learning (MAML) algorithm, and, when the policy network is deployed to a new data center, collecting a small amount of online experience data for fine-tuning iterations so as to quickly adapt to an optimal policy and form a closed control loop.
9. An energy consumption control system for a double-coil cooling system of a data center, for implementing the energy consumption control method of any one of claims 1-8, the system comprising: a double-coil air handling unit comprising a first coil and a second coil, wherein the first coil is connected to a natural cold source system, the second coil is connected to a mechanical cold source system, and air flows through the first coil and the second coil in sequence for cascade cooling; the natural cold source system, comprising a cooling tower or a dry cooler, a natural cooling circulating water pump, a natural cooling pipeline system, and a first coil water inlet valve and water outlet valve, for cooling circulating water with low-temperature outdoor air; the mechanical cold source system, comprising a water chilling unit, a chilled water pump, a mechanical cooling pipeline system, and a second coil water inlet valve and water outlet valve, for producing chilled water by mechanical refrigeration; a sensor system comprising temperature sensors, flow sensors, power sensors and humidity sensors, for collecting the coil water inlet and outlet temperatures, the chilled water flow, the equipment power and the ambient humidity, respectively; and a hierarchical reinforcement learning controller, electrically connected to the sensor system and the actuators, and incorporating an upper-layer policy network module, a lower-layer policy network module, a state processing module, an action output module, an experience storage module, a policy update module, a natural cooling priority decision module, a safety constraint module and a transfer learning module, for implementing natural-cooling-priority energy consumption optimization control based on hierarchical deep reinforcement learning.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 8 when executing the computer program.
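
The reward calculation of claim 3 and the multiplier update of claim 7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's implementation: the weight values, the `tanh` shaping of the natural cooling reward, the scaling constants and the learning rate are all hypothetical. The sketch keeps only what the claims state: five weighted items, a natural cooling term that grows with the natural cooling energy efficiency ratio and carries the largest weight, and a Lagrange multiplier that rises when the supply air temperature fluctuation exceeds its limit.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical values; the claim only requires free_cooling to be largest.
    supply_temp: float = 1.0
    energy: float = 0.5
    free_cooling: float = 2.0
    mech_start: float = 0.5
    mode_switch: float = 0.2


def free_cooling_eer(q_free_kw, p_free_kw, eps=1e-6):
    """Cooling capacity of the natural cooling coil per unit of power drawn."""
    return q_free_kw / max(p_free_kw, eps)


def reward(supply_temp_c, setpoint_c, total_power_kw, q_free_kw, p_free_kw,
           mech_started, mode_switched, w=RewardWeights(),
           power_scale_kw=100.0, eer_scale=20.0):
    """Weighted sum of the five reward items (penalties enter negatively)."""
    others = (w.supply_temp, w.energy, w.mech_start, w.mode_switch)
    if w.free_cooling < max(others):
        raise ValueError("natural cooling weight must be the largest")
    temp_penalty = abs(supply_temp_c - setpoint_c)
    energy_penalty = total_power_kw / power_scale_kw
    # tanh is an assumed shaping: bounded and strictly increasing in the EER.
    free_bonus = math.tanh(free_cooling_eer(q_free_kw, p_free_kw) / eer_scale)
    return (-w.supply_temp * temp_penalty
            - w.energy * energy_penalty
            + w.free_cooling * free_bonus
            - w.mech_start * float(mech_started)
            - w.mode_switch * float(mode_switched))


def update_multiplier(lmbda, fluctuation_c, limit_c, lr=0.05):
    """Dual ascent step: raise lmbda on violation, lower it otherwise, floor 0."""
    return max(0.0, lmbda + lr * (fluctuation_c - limit_c))


def constrained_reward(r, lmbda, fluctuation_c, limit_c):
    """Reward minus the product term of the multiplier and the constraint."""
    return r - lmbda * max(0.0, fluctuation_c - limit_c)
```

With these assumed weights, a step that meets the setpoint using natural cooling alone scores higher than an otherwise identical step that starts mechanical refrigeration, which is the ordering the natural-cooling-priority principle calls for.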

Description

Energy consumption control method and system for double-coil cooling system of data center

Technical Field

The present application relates to the field of energy consumption optimization of cooling systems, and in particular to a method, a system, a computer device and a computer-readable storage medium for controlling energy consumption of a double-coil cooling system of a data center.

Background

Data centers are an important infrastructure of the modern information society, and their energy consumption is receiving increasing attention. The energy consumption of a data center mainly comprises the energy consumption of IT equipment and of the cooling system, with the cooling system generally accounting for 30-40% of the total. Reducing cooling system energy consumption is a key way to improve data center energy efficiency and lower the power usage effectiveness (PUE) value.

The double-coil cascade cooling system comprises two cooling coils connected in series. The first coil (natural cooling coil) is connected to a natural cold source such as a cooling tower or a dry cooler, and the second coil (mechanical cooling coil) is connected to mechanical refrigeration equipment such as a water chilling unit. Air flows through the first coil and the second coil in sequence to achieve cascade cooling. When the outdoor temperature is low, the system can cool with the natural cold source, reducing the use of mechanical refrigeration and thus the energy consumption.

However, the cooperative operation of the natural cold source and the mechanical cold source in a conventional double-coil air cascade cooling unit suffers from an energy efficiency bottleneck, and the prior art has the following technical defects:

First, existing control methods switch modes mainly on the basis of fixed thresholds or simple rules. A single outdoor temperature or load threshold is generally used to decide whether natural cooling is started, so the combined influence of multiple factors cannot be taken into account and globally optimal energy consumption control cannot be achieved.

Second, adaptive learning capability is lacking. Control parameters remain unchanged once determined during commissioning, and the strategy cannot be adjusted dynamically in response to load changes, seasonal changes, equipment aging and similar factors, so long-term operating performance is poor. Moreover, mechanical refrigeration is started too early in transitional seasons, and natural cooling potential is not fully exploited.

Third, water valve opening adjustment is coarse and mode switching is not smooth. The prior art uses on-off control or stepped control, which cannot adjust the water valve opening finely and continuously, so the cooling capacity does not match actual demand. In addition, switching between the full natural cooling, hybrid cooling and full mechanical cooling modes easily causes abrupt changes or fluctuations in supply air temperature, affecting the safe operation of IT equipment.

Fourth, existing reinforcement learning control methods adopt a single-layer decision architecture and cannot simultaneously optimize the long-time-scale mode switching strategy and the short-time-scale water valve opening adjustment, so decision efficiency is low. Existing reinforcement learning methods also lack a safety constraint mechanism, and the supply air temperature may exceed its limits during exploration.

Disclosure of Invention

Embodiments of the application provide an energy consumption control method and system, a computer device and a computer-readable storage medium for a double-coil cooling system of a data center, which at least solve the problem in the related art that natural cooling potential is not fully exploited, resulting in unnecessary energy waste.

In a first aspect, an embodiment of the present application provides a method for controlling energy consumption of a double-coil cooling system of a data center, deployed in the data center, the method comprising: collecting operation state data of the double-coil cooling system, and constructing a state vector from the operation state data; using a hierarchical deep reinforcement learning network built on an Actor-Critic network architecture to perform hierarchical decision-making based on the state vector and to output energy consumption control actions comprising a system-level operation mode and component-level mechanism adjustment parameters, and performing cascade cooling control on the double-coil cascade cooling system according to the energy consumption control actions under a natural-cooling-priority principle; and calculating a reward value according to a state vector and a response state aft