CN-122015358-A - Intelligent defrosting control method, system and computer program product of low-temperature variable-frequency air source heat pump

Abstract

The invention relates to the technical field of air source heat pumps, in particular to an intelligent defrosting control method, system and computer program product of a low-temperature variable-frequency air source heat pump. The method fuses features such as coil temperature, frost layer thickness and ambient humidity acquired by multi-modal sensors, predicts the optimal control parameters for the next control period with a convolution-LSTM deep network, performs a safety redundancy judgment that combines the prediction confidence with the hardware boundary, and triggers a rollback strategy based on the temperature slope and the frost thickness when the confidence is insufficient. In addition, a distributed reinforcement learning framework optimizes the prediction model and the rollback control table online, so that defrosting energy consumption and defrosting completeness are balanced dynamically. Experiments show that the invention can significantly reduce defrosting energy consumption and downtime and improve system stability and heating performance.

Inventors

  • ZHONG JIAYU
  • JIANG JIANJUN
  • JIANG ZHIHAO

Assignees

  • Xinlei Compressor Co., Ltd. (鑫磊压缩机股份有限公司)

Dates

Publication Date
20260512
Application Date
20260327

Claims (10)

  1. An intelligent defrosting control method of a low-temperature variable-frequency air source heat pump, characterized by comprising the following steps:
     S1, data fusion and acquisition: 1.1, acquiring the defrosting coil temperature T_def of the fin heat exchanger; 1.3, collecting the current compressor frequency F, the outdoor fan air volume Q, the electronic expansion valve opening θ, the ambient wet-bulb temperature T_wb and the relative humidity RH;
     S2, nonlinear prediction decision: inputting the multi-modal feature sequence {T_def, L_ice, F, Q, θ, T_wb, RH} of step S1 into a pre-trained convolution-LSTM (C-LSTM) deep network model, outputting the optimal control vector {F_pred, Q_pred, θ_pred} for the next control period, and giving a confidence P_conf;
     S3, safety redundancy judgment: when P_conf ≥ P_thr and the output vector satisfies the hardware safety boundary, predictive control is executed, where P_thr is the confidence threshold; otherwise a rollback strategy is triggered, namely a) if T_def ≥ T_defT − T_set and dT_def/dt ≥ K_slope, where T_defT is the defrosting exit temperature threshold, T_set is the early frequency-reduction temperature offset and K_slope is the temperature slope threshold, the compressor frequency is reduced along a preset curve to the first-level frequency-reduction target F_set1, and the target air volume Q_set and the target opening θ_set are adjusted cooperatively; and b) if T_def ≥ T_defT or L_ice ≤ L_thr, where L_thr is the minimum allowed residual frost thickness threshold, the compressor frequency is reduced to the second-level frequency-reduction target F_set2 and the four-way valve is switched to heating;
     S4, online reinforcement learning update: using a distributed reinforcement learning agent, with the time integral of defrosting energy consumption and the residual frost thickness as the reward function, to fine-tune online the parameters of the C-LSTM and the rollback control table, the update interval being no more than N defrosting cycles;
     S5, end judgment: ending the current defrosting control cycle when the compressor, the fan and the expansion valve have all operated stably for ΔT_end and both T_def and L_ice meet the exit thresholds.
  2. The method of claim 1, wherein the convolution-LSTM network comprises: a) two serially connected two-dimensional convolution layers, each with a 3 × 3 convolution kernel, the first layer having at least 32 output channels and the second layer at least 64 output channels, with batch normalization and ReLU activation between the two layers; b) a bidirectional LSTM layer with a sequence length of T_seq arranged after the convolution layers, having 128 hidden units and a dropout rate of 0.3 applied on the output side; c) the convolutional features and one-dimensional features such as temperature and humidity are concatenated along the channel dimension before the LSTM.
  3. The method of claim 2, wherein the convolution-LSTM network further includes a multi-head self-attention fusion layer at the LSTM output, with h = 4 heads, so as to assign different weights to the frost thickness sequence and the temperature sequence and improve the response speed to sudden frost layer growth.
  4. The method of claim 1, wherein the reinforcement learning unit adopts a distributed PPO architecture comprising: a) at least two Actor nodes, deployed on a plurality of heat pump edge control boards, which collect interaction trajectories in parallel and execute the current policy; b) a Learner node, deployed on a gateway server, which computes the global policy gradient and updates the parameters; c) a parameter server, which asynchronously synchronizes the policy network weights between the Actor nodes and the Learner node, with an update frequency of not less than f_sync = 1 Hz. Preferably, the experience replay of the distributed PPO uses a prioritized experience replay buffer, in which the priority is weighted by the temporal-difference error |δ|, the buffer size is at least 10000, and the temperature distribution and frost thickness distribution obtained by random sampling must follow the same distribution as the real operating conditions. Still preferably, the prioritized experience replay sampling probability is as follows:
  5. The method of claim 4, wherein the reinforcement learning reward function R satisfies R = −α·E − β·L_res − γ·N_switch, where E is the energy consumption of the current defrosting cycle, L_res is the residual frost thickness when the cycle exits, N_switch is the number of four-way valve switchings, and α, β and γ are adaptively adjusted by the Learner node according to the energy consumption-performance Pareto front of the latest M defrosting cycles, so as to balance energy consumption and defrosting completeness dynamically. Preferably, the Pareto adaptive weight update satisfies:
  6. The method of claim 1, wherein F_set1 and F_set2 in the rollback strategy are obtained from the energy consumption curve of the previous defrosting cycle by exponentially weighted averaging, to ensure the adaptivity of the rollback control.
  7. The method of claim 1, wherein the visual imaging device is a ToF depth camera or a millimeter-wave radar with a resolution of at least 0.1 mm, and a thermal compensation algorithm is configured to reduce interference from condensation and fog.
  8. An intelligent defrosting control system of a low-temperature variable-frequency air source heat pump, characterized by comprising: a multi-modal sensor group for acquiring data such as fin temperature, frost thickness, and ambient temperature and humidity; an actuating mechanism comprising a variable-frequency compressor, an outdoor fan and an electronic expansion valve; a deep prediction unit with a built-in convolution-LSTM network for predicting the optimal control vector from the sensor data; a reinforcement learning unit for updating the parameters of the deep prediction unit in real time based on the energy consumption and the residual frost thickness; a safety redundancy control unit for executing the rollback strategy when the prediction confidence is insufficient or the prediction exceeds the limits; and a central controller for implementing all the steps of the method according to any one of claims 1-8 and for coordinating the units.
  9. A computer readable storage medium having stored thereon a computer program or instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 5.
  10. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 5.
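
The safety redundancy judgment of step S3 in claim 1 can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Limits`, `Thresholds`, `within` and `decide`, the choice to test rule b) before rule a), and the "hold" result returned when no rule fires are all hypothetical, since the claims do not specify them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Limits:
    """Hardware safety boundary as (min, max) ranges; values are hypothetical."""
    f: Tuple[float, float]      # compressor frequency F
    q: Tuple[float, float]      # outdoor fan air volume Q
    theta: Tuple[float, float]  # expansion valve opening theta


@dataclass
class Thresholds:
    p_thr: float    # confidence threshold P_thr
    t_deft: float   # defrosting exit temperature threshold T_defT
    t_set: float    # early frequency-reduction temperature offset T_set
    k_slope: float  # temperature slope threshold K_slope
    l_thr: float    # minimum allowed residual frost thickness L_thr


def within(limits: Limits, f: float, q: float, theta: float) -> bool:
    """True if every component of the control vector lies inside its range."""
    pairs = ((f, limits.f), (q, limits.q), (theta, limits.theta))
    return all(lo <= value <= hi for value, (lo, hi) in pairs)


def decide(pred, p_conf, t_def, dt_def_dt, l_ice, limits, thr) -> str:
    """Pick the control path for the next period.

    pred is the predicted vector (F_pred, Q_pred, theta_pred).
    """
    # Predictive control: confident prediction that respects the hardware boundary.
    if p_conf >= thr.p_thr and within(limits, *pred):
        return "predict"
    # Rule b) is tested first (assumption): exit temperature reached or frost gone.
    if t_def >= thr.t_deft or l_ice <= thr.l_thr:
        return "rollback_level2"  # reduce to F_set2 and switch the valve to heating
    # Rule a): coil is close to the exit temperature and warming quickly.
    if t_def >= thr.t_deft - thr.t_set and dt_def_dt >= thr.k_slope:
        return "rollback_level1"  # reduce to F_set1, adjust Q_set and theta_set
    # Neither rule applies; keeping the current setpoints is an assumption.
    return "hold"
```

Testing rule b) first is a design choice: whenever T_def ≥ T_defT holds and T_set is non-negative, the temperature condition of rule a) holds as well, so the stricter second-level action could otherwise be masked by the first-level one.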

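The reward of claim 5 and the exponentially weighted averaging mentioned in claim 6 are simple enough to write out directly. In the sketch below, the function names, the smoothing factor, and the idea of feeding one per-cycle frequency estimate into the average are assumptions; the claims do not state how the energy consumption curve is converted into a frequency, and the Pareto-based adaptation of α, β and γ is left out.

```python
def defrost_reward(energy, l_res, n_switch, alpha, beta, gamma):
    """R = -alpha*E - beta*L_res - gamma*N_switch, as written in claim 5."""
    return -alpha * energy - beta * l_res - gamma * n_switch


def ewma(previous, observation, smoothing=0.3):
    """Exponentially weighted average of a running value.

    smoothing is an assumed factor in (0, 1]; larger values track the
    newest observation more closely.
    """
    if not 0.0 < smoothing <= 1.0:
        raise ValueError("smoothing must be in (0, 1]")
    return smoothing * observation + (1.0 - smoothing) * previous


def update_rollback_targets(f_set1, f_set2, est1, est2, smoothing=0.3):
    """Blend the stored F_set1 / F_set2 with hypothetical per-cycle estimates."""
    return ewma(f_set1, est1, smoothing), ewma(f_set2, est2, smoothing)
```

A larger energy consumption, a thicker residual frost layer, or more four-way valve switchings each make the reward more negative, so a policy that maximizes R is pushed toward short, complete defrosts with few valve reversals.
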
Description

Intelligent defrosting control method, system and computer program product of low-temperature variable-frequency air source heat pump

Technical Field

The invention relates to the technical field of air source heat pumps, in particular to an intelligent defrosting control method, system and computer program product of a low-temperature variable-frequency air source heat pump.

Background

Air source heat pumps offer both heating and cooling, save energy and are flexible to install, and are widely used in heating, ventilation and air conditioning (HVAC) for residential and commercial buildings. However, when the outdoor ambient temperature is below 5 ℃ and the humidity is high, the surface of the outdoor fin heat exchanger (evaporator) frosts very easily. The frost layer significantly reduces the heat transfer coefficient and the air volume, so that heating capacity decays, the compressor discharge overheats and system energy efficiency drops rapidly; in severe cases a low-pressure protection shutdown is triggered. An efficient and reliable defrost control strategy is therefore one of the core bottlenecks in commercializing low-temperature variable-frequency air source heat pumps. The earliest reverse-cycle defrosting generally used a fixed timing method: the four-way valve is forcibly switched to defrosting after the compressor has run for a cumulative 30-90 min, and defrosting is forcibly exited after 4-10 min.

The method does not need an additional sensor, but it has two defects: (1) under-defrosting, where defrosting exits before the frost layer has fully melted and the residual frost quickly regrows; (2) over-defrosting, where defrosting still runs for the fixed duration after the outdoor conditions have improved, causing an unnecessarily long heating interruption and wasted electricity and reducing the overall energy efficiency (HSPF) by 10% or more. An improved scheme uses the coil temperature difference as the trigger/termination criterion; for example, U.S. Patent No. 4373349 proposes an adaptive defrost control system for a heat pump system. However, because it monitors only a single-point temperature, it is strongly affected by wind speed, frost layer distribution and sensor drift, and because it uses a fixed-frequency compressor, it cannot optimize frequency and energy consumption in real time under variable-frequency conditions.

Self-learning timing strategies based on a single-chip microcomputer appeared in the 1980s, represented by US4573326A and US4751825A. The controller records the frosting-defrosting-refrosting time t of the previous cycle and calculates its proportionality coefficient relative to the time t_ref corresponding to the target frost thickness, so as to update the timing threshold of the next cycle. This scheme can adapt to seasons and unit aging, but because it relies on time features it still makes no use of dynamic information such as the ambient wet-bulb temperature and the coil temperature rise slope, and because it uses a single sensor, uneven frosting is difficult to detect.

The paper "Deep-learning-based prediction on performance change of ASHP" uses a CNN-LSTM to predict the COP of a heat pump with an RMS error of about 2.4%; other LSTM studies take air-side image fields and operating conditions as input to predict the air volume. These works demonstrate the effectiveness of deep temporal networks for multi-source nonlinear regression, but their subject is still limited to energy consumption or capacity prediction and is not coupled with the defrost control loop.

In recent years, deep reinforcement learning (DRL) has been applied to optimizing air conditioner set points with DQN and DDPG and to the economic operation of heating systems with PPO. No application of distributed PPO to the combined scenario of low-temperature variable-frequency heat pumps, defrosting and real-time safety constraints has been found in the open-source or patent literature.

Taken together, existing methods suffer from large energy losses or excessive conservatism, rely only on empirical thresholds that transfer poorly across seasons, or lack a unified framework for cooperative control of multiple actuators and online self-learning. In scenarios where ultra-low temperature (below −25 ℃) and high-humidity rime coexist, an intelligent defrosting scheme that integrates deep prediction, reinforcement learning and safety redundancy is needed to meet four targets: defrosting timeliness, heating continuity, energy economy and control reliability.

Disclosure of Invention

In order to solve the technical problems, the technical aim of the invention is to provide an intelligent defrosting control method for a low-temperature variable-f