
CN-122028366-A - Data center server running state regulation and control system and method based on thermal imaging


Abstract

The invention discloses a thermal imaging-based system and method for regulating and controlling the running state of data center servers. The system comprises a thermal imaging monitoring module, a sliding rail fan system, a data processing and analysis module, and an intelligent control module. The thermal imaging monitoring module collects infrared thermal imaging data of the servers and the machine room environment; the sliding rail fan system performs directional heat dissipation on a designated area; the data processing and analysis module identifies the running state of the servers and locates thermal anomaly areas. The intelligent control module is communicatively connected to the data processing and analysis module, the sliding rail fan system, and the machine room cooling system, and generates control instructions from the analysis results of the data processing and analysis module to drive the sliding rail fan system and the machine room cooling system to execute an adaptive regulation and control strategy. The invention thereby meets the data center's development requirements for high efficiency and low carbon.

Inventors

  • LI PIN
  • WANG QI
  • Ma haide
  • Cai Rangcuo

Assignees

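  • Qinghai University (青海大学)
  • Hainan Prefecture State-owned Assets Investment and Operation Group Co., Ltd. (海南州国有资产投资运营集团有限公司)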

Dates

Publication Date
20260512
Application Date
20260206

Claims (10)

  1. A thermal imaging-based data center server running state regulation and control system, comprising a thermal imaging monitoring module, a sliding rail fan system, a data processing and analysis module, and an intelligent control module, wherein: the thermal imaging monitoring module is used for acquiring infrared thermal imaging data of the servers and the machine room environment and realizing full-coverage monitoring of the machine room thermal field; the sliding rail fan system is used for directional heat dissipation on a designated area; the data processing and analysis module is used for identifying the running state of the servers and locating thermal anomaly areas; the intelligent control module is used for keeping the servers running within a safe temperature range; the data processing and analysis module is communicatively connected to the thermal imaging monitoring module and is used for receiving the infrared thermal imaging data and predicting the temperature trend of the thermal anomaly areas based on historical data; and the intelligent control module is communicatively connected to the data processing and analysis module, the sliding rail fan system, and the machine room cooling system, and is used for generating control instructions according to the analysis results of the data processing and analysis module and driving the sliding rail fan system and the machine room cooling system to execute an adaptive regulation and control strategy.
  2. The thermal imaging-based data center server running state regulation and control system according to claim 1, wherein the sliding rail fan system is deployed above a hot aisle and comprises a guide rail, a sliding block that moves along the guide rail, a servo (steering engine) whose top is fixedly connected to the bottom of the sliding block and which adjusts the air outlet direction of the fan, and a fan mounted at the bottom of the servo, wherein the servo can control at least 2 degrees of freedom.
  3. The thermal imaging-based data center server running state regulation and control system according to claim 2, wherein the initial positions of all fans in the sliding rail fan system are at one end of the sliding rail; when a hot spot is identified, the system moves the fan to the area corresponding to the hot spot so that the fan's X-axis coordinate $X_f$ is aligned with the hot spot's X-axis coordinate $X_h$, and the system program determines the fan rotation direction by the following logic: when $\Delta Y < 0$, the decision result is "rotate in the Y-axis negative direction"; when $\Delta Y > 0$, the decision result is "rotate in the Y-axis positive direction"; where $Y_h$ is the Y-axis coordinate of the hot spot, $Y_f$ is the Y-axis coordinate of the fan, and $\Delta Y = Y_h - Y_f$.
  4. The thermal imaging-based data center server running state regulation and control system according to claim 2, wherein the fan pitch angle is adjusted through the servo (steering engine) in the sliding rail fan system by the following steps: obtaining the mounting height of the fan $H$, the three-dimensional coordinates of the local hot spot $(x_h, y_h, z_h)$, and the fixed horizontal distance between the fan and the hot spot $D$; calculating the target pitch angle $\theta$ by the formula $\theta = \arctan\left(\frac{H - z_h}{D}\right)$; and driving the actuating mechanism to complete the angle adjustment according to the calculation result; where $\arctan$ is the arctangent function, which yields the pitch angle required for directional air supply from the vertical height difference between the fan and the hot spot, $H - z_h$, and the fixed horizontal distance $D$.
  5. The thermal imaging-based data center server running state regulation and control system according to claim 1, wherein the temperature trend prediction function of the data processing and analysis module is based on a time series analysis method and models historical thermal imaging data and server load data to predict the temperature change of each area in a specific future time period, the specific process being as follows:
     selecting the historical temperature sequence of each area of the machine room within a preset historical period, indexed by area number, thermal imaging monitoring point number, and historical time, each entry being the temperature value of a given monitoring point in a given area at a given time;
     synchronously selecting the server load sequence at the corresponding times, indexed by server number and time, wherein each load value is obtained by normalizing real-time monitoring data of CPU utilization, memory occupancy, and total power consumption and corresponds directly to the instantaneous heat generation of the server; and simultaneously extracting the machine room environmental parameter sequence for the same historical period, including ambient air density and ambient temperature, to suit plateau environmental characteristics;
     aligning the timestamps of the temperature sequence, the load sequence, and the environmental parameter sequence to construct a multidimensional associated data set;
     constructing a prediction model based on the time series analysis method and introducing a heat transfer time-sequence correction term, with the data of the associated data set in consecutive time windows of preset length as the model input vector, wherein the time window length is determined by the thermal inertia characteristics of the plateau machine room so as to cover the complete hysteresis period of heat transfer, and with the temperature values of the monitoring points of each area at a preset prediction duration after the time window as the model output vector;
     establishing, from the associated data set and by combining time-series autocorrelation analysis with the physical laws of heat transfer, a coupling mapping relation between the input vector and the output vector, whose core mathematical expression involves a time-series mapping function obtained through data-driven training, the temperature change rate of each monitoring point in each area at the current time, and the standard sea-level air density, the latter being used to correct for the effect of low plateau air density on heat transfer efficiency;
     applying the coupling mapping relation to the real-time associated data input at the current time to obtain the predicted temperature values of the monitoring points of each area for the future time period; and
     calculating the deviation between the predicted temperature and the safe operating temperature threshold of the server and, in combination with the temperature change rate, pre-judging the occurrence time and intensity of thermal anomaly risk, thereby providing core data support for the intelligent control module to formulate a predictive intervention strategy.
  6. The thermal imaging-based data center server running state regulation and control system according to claim 1, wherein the control logic executed by the intelligent control module comprises: when the temperature of an identified local hot spot exceeds a first real-time threshold, triggering a real-time intervention mode, controlling the sliding rail fan system to move above the hot spot and supply air directionally at the highest intensity, and simultaneously starting nearby equipment to dissipate heat; and when the predicted local hot spot temperature in a future time period exceeds a second early-warning threshold, triggering a predictive intervention mode and controlling the sliding rail fan system to move to the predicted position in advance and supply preventive medium-intensity air.
  7. The thermal imaging-based data center server running state regulation and control system according to claim 1, wherein the regulation and control strategy of the intelligent control module comprises: a basic adaptive heat dissipation strategy of dynamically adjusting the operating parameters of the basic heat dissipation equipment according to the overall heat load and average temperature of the machine room, while compensating for low plateau air density; when a local hot spot is identified or an imminent hot spot is predicted, controlling the sliding rail fan system to move directly above the hot spot area for directional blowing and synchronously starting fixed cooling equipment near the hot spot; and an energy-saving operation strategy of reducing the power and air volume of the basic heat dissipation equipment and preferentially using the natural cold source when it is detected that night has fallen and the outdoor ambient temperature is below a set temperature threshold.
  8. The thermal imaging-based data center server running state regulation and control system according to claim 7, wherein, in the energy-saving operation strategy, the target set value of the overall machine room temperature and the alarm threshold for local hot spots are dynamically raised while the power and air volume of the basic heat dissipation equipment are reduced.
  9. The thermal imaging-based data center server running state regulation and control system according to claim 8, wherein the compensation for low plateau air density in the basic adaptive heat dissipation strategy is achieved by multiplying the calculated basic air volume command by an altitude compensation coefficient greater than 1.
  10. A thermal imaging-based data center server running state regulation and control method, applied to the thermal imaging-based data center server running state regulation and control system according to any one of claims 1 to 9, characterized by comprising the steps of:
     S1, the thermal imaging monitoring module continuously acquires infrared thermal imaging data of the servers and the machine room environment, realizes full-coverage monitoring of the machine room thermal field, and transmits the acquired infrared thermal imaging data to the data processing and analysis module in real time;
     S2, after receiving the infrared thermal imaging data, the data processing and analysis module analyzes it, identifies the running state of the servers and locates thermal anomaly areas, retrieves historical thermal imaging data and server load data, predicts the temperature trend of the thermal anomaly areas based on a preset prediction model, and synchronously transmits the identification result, location information, and temperature prediction result to the intelligent control module;
     S3, the intelligent control module receives the output of the data processing and analysis module and generates adaptive control instructions by combining it with the real-time operating parameters of the machine room and external environment data;
     S4, the intelligent control module transmits the control instructions to the sliding rail fan system and the machine room cooling system respectively, drives the sliding rail fan system to move to the designated position and adjust the air supply direction and intensity to achieve directional cooling, regulates the operating parameters of the machine room cooling system, and has the sliding rail fan system and the machine room cooling system cooperatively execute the regulation and control strategy to keep the servers running within a safe temperature range;
     S5, the thermal imaging monitoring module continuously feeds back the machine room thermal field data after regulation, the data processing and analysis module tracks the temperature change of the thermal anomaly areas in real time, and the intelligent control module dynamically adjusts the control instructions according to the feedback, forming closed-loop regulation.
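
As an illustration only, the aiming and threshold logic described in claims 3, 4, 6, and 9 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: every function and parameter name is hypothetical, the fan is assumed to turn toward the side on which the hot spot lies, the pitch angle is assumed to be measured from the horizontal, and the altitude compensation coefficient is assumed to be the ratio of standard sea-level air density (1.225 kg/m³) to local air density, which the claims do not specify.

```python
import math


def rotation_direction(y_hot: float, y_fan: float) -> str:
    """Claim 3 style decision (assumed sign convention: turn toward the hot spot)."""
    delta_y = y_hot - y_fan
    if delta_y < 0:
        return "rotate in Y-axis negative direction"
    if delta_y > 0:
        return "rotate in Y-axis positive direction"
    return "no rotation"  # assumption: the claim does not cover delta_y == 0


def pitch_angle_deg(fan_height: float, hot_z: float, horizontal_dist: float) -> float:
    """Claim 4 style pitch: arctan(vertical height difference / horizontal distance)."""
    # atan2 equals atan(dh / d) for d > 0 and avoids division by zero when d == 0.
    return math.degrees(math.atan2(fan_height - hot_z, horizontal_dist))


def intervention_mode(temp_now: float, temp_predicted: float,
                      realtime_threshold: float, warning_threshold: float) -> str:
    """Claim 6 style mode selection; real-time intervention takes priority (assumed)."""
    if temp_now > realtime_threshold:
        return "real-time"   # move above the hot spot, highest air supply intensity
    if temp_predicted > warning_threshold:
        return "predictive"  # move in advance, preventive medium-intensity air supply
    return "none"


def compensated_air_volume(base_volume: float, rho_local: float,
                           rho_sea_level: float = 1.225) -> float:
    """Claim 9 style compensation: scale the base command by a coefficient >= 1.

    The density-ratio form of the coefficient is an assumption of this sketch.
    """
    coefficient = max(rho_sea_level / rho_local, 1.0)
    return base_volume * coefficient


if __name__ == "__main__":
    print(rotation_direction(y_hot=1.0, y_fan=2.0))
    print(round(pitch_angle_deg(fan_height=3.0, hot_z=1.0, horizontal_dist=2.0), 1))
    print(intervention_mode(70.0, 82.0, realtime_threshold=75.0, warning_threshold=80.0))
    print(round(compensated_air_volume(100.0, rho_local=0.9), 1))
```

Using the density ratio keeps the compensation coefficient at 1 at sea level and above 1 at altitude, which is consistent with the "greater than 1" requirement of claim 9 for a plateau site; a deployed system would presumably calibrate this coefficient against measured cooling performance.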

Description

Data center server running state regulation and control system and method based on thermal imaging

Technical Field

The invention belongs to the technical field of data center thermal management, and particularly relates to a thermal imaging-based system and method for regulating and controlling the running state of data center servers.

Background

During data center operation, a server cluster running continuously under high load generates a large amount of heat. If this heat is not dissipated in time, server temperatures rise abnormally, which degrades operational stability and can even cause serious problems such as hardware faults and data loss. Temperature regulation is therefore a key part of data center operation and maintenance. With the rapid development of the digital economy, data centers keep growing in scale and server density keeps increasing, so the heat load rises markedly and places higher demands on the accuracy, timeliness, and efficiency of temperature regulation.

At present, data centers generally use a traditional fixed heat dissipation approach, cooling the whole room with fixed fans and air conditioning systems distributed around the machine room. This approach has obvious limitations. Traditional monitoring mostly relies on point sensors, which capture temperature data only at local points; it is difficult to achieve full-coverage monitoring of the machine room thermal field or to grasp the overall temperature distribution, so scattered or sudden local thermal anomaly areas are hard to detect in time. Meanwhile, the air supply direction and intensity of fixed heat dissipation equipment cannot be adjusted flexibly; only uniform overall heat dissipation is possible, so local thermal anomalies are hard to address in a targeted way, heat dissipation resources are wasted, and the heat dissipation effect cannot be guaranteed.

In addition, both the external environment and the internal load of a data center change dynamically. Factors such as server load fluctuation and ambient temperature change cause the machine room thermal field to change in real time. Traditional regulation lacks an effective dynamic response mechanism: its strategy is fixed and rigid and cannot be adjusted in time as the thermal field changes, so regulation lag or over-regulation readily occurs. These problems not only affect the safe and stable operation of servers but also cause excessive energy consumption and raise operation and maintenance costs, which runs counter to the green, low-carbon development trend.

Therefore, solving the practical problems of incomplete temperature monitoring, insufficient heat dissipation accuracy, and lagging regulation response in data centers, and raising the intelligence level of temperature regulation, is of great significance for ensuring stable data center operation, reducing energy consumption, and saving operation and maintenance costs, and has become a technical direction in urgent need of a breakthrough in the current field of data center thermal management.
Disclosure of Invention

The invention aims to solve the problems of incomplete temperature monitoring, insufficient heat dissipation accuracy, and lagging regulation response in current data centers. Through thermal imaging full-coverage monitoring, thermal anomaly location and trend prediction, directional heat dissipation, cooperative regulation, and a closed-loop feedback mechanism, it achieves accurate, timely, and dynamic temperature regulation, thereby ensuring safe and stable server operation, improving heat dissipation efficiency, reducing energy consumption and operation and maintenance costs, and meeting the data center's development requirements for high efficiency and low carbon.

In view of the above deficiencies or improvement needs of the prior art, as a first aspect of the present invention, there is provided a thermal imaging-based data center server running state regulation and control system, comprising a thermal imaging monitoring module, a sliding rail fan system, a data processing and analysis module, and an intelligent control module, wherein the thermal imaging monitoring module is used for acquiring infrared thermal imaging data of the servers and the machine room environment and realizing full-coverage monitoring of the machine room thermal field, the sliding rail fan system is used for carrying out directional heat dissipation on a