CN-121977375-A - Heat storage system regulation and control method and system based on expert demonstration and deep reinforcement learning

Abstract

The invention relates to a heat storage system regulation and control method and system based on expert demonstration and deep reinforcement learning. The method comprises: constructing a multidimensional state space that describes the operating state of the heat storage system; training a control agent on that state space with a deep reinforcement learning algorithm that incorporates expert demonstration data, so that the agent learns an optimized control strategy, the training being guided by maximizing a reward function that comprehensively reflects the multi-objective performance of the system; and deploying the trained control agent in the heat storage system, where it outputs control instructions by feedback according to the real-time state of the system so as to actively regulate it. The invention effectively addresses the lag in power response caused by the inherent thermal inertia of the heat storage system and removes the dependence of conventional control strategies on an accurate mathematical model, thereby enabling multi-objective adaptive optimized operation in response to power grid dispatch instructions.
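As an illustration only, the multidimensional state space described above might be assembled as in the following Python sketch. It follows the composition listed in claim 2 (current values, historical values, and rates of change of the core physical parameters, plus external signals and demands). The names `Measurement` and `StateBuilder`, the units, the history length, and the finite-difference rate of change are assumptions introduced for this example and are not taken from the patent.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Measurement:
    """One sample of the core physical parameters (field names and units are illustrative)."""
    medium_temp_c: float      # heat storage medium temperature
    pressure_mpa: float       # system pressure
    flow_kg_s: float          # heat storage medium flow
    heating_power_mw: float   # heating power

    def as_list(self):
        return [self.medium_temp_c, self.pressure_mpa,
                self.flow_kg_s, self.heating_power_mw]


class StateBuilder:
    """Builds a flat state vector: current values, history, rates, external signals."""

    def __init__(self, history_len=3, dt_s=1.0):
        if history_len < 1:
            raise ValueError("history_len must be at least 1")
        self.dt_s = dt_s
        # Holds the current sample plus history_len past samples.
        self.history = deque(maxlen=history_len + 1)

    def update(self, m, grid_power_cmd_mw, price, heat_load_forecast_mw):
        self.history.append(m.as_list())
        samples = list(self.history)
        # Pad with the oldest sample until the buffer is full,
        # so the state vector always has the same length.
        while len(samples) < self.history.maxlen:
            samples.insert(0, samples[0])
        current = samples[-1]
        past = [v for s in samples[:-1] for v in s]
        # Assumed rate of change: finite difference of the two latest samples.
        rates = [(c - p) / self.dt_s for c, p in zip(samples[-1], samples[-2])]
        external = [grid_power_cmd_mw, price, *heat_load_forecast_mw]
        return current + past + rates + external
```

Each call to `update` takes the latest measurement together with the grid power instruction, the real-time electricity price, and a short-term heat load forecast, and returns a fixed-length vector that a control agent could consume.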

Inventors

  • LI GUANGKUO
  • DAI QIUXIA
  • XU JUNHUI
  • WANG GUOHUA
  • NIU YONGTAO

Assignees

  • 中盐金坛盐化有限责任公司
  • 江苏加怡热电有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-14

Claims (10)

  1. A heat storage system regulation and control method based on expert demonstration and deep reinforcement learning, characterized by comprising the following steps: constructing a multidimensional state space for describing the operating state of the heat storage system; based on the multidimensional state space, training a control agent using a deep reinforcement learning algorithm that incorporates expert demonstration data, so as to learn an optimized control strategy, wherein the training process is guided by maximizing a reward function that comprehensively reflects the multi-objective performance of the system; and deploying the trained control agent in the heat storage system, and outputting control instructions by feedback according to the real-time state of the heat storage system, so as to actively regulate the heat storage system.
  2. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 1, wherein the multidimensional state space comprises heat storage system state variables and external signals and demands; the heat storage system state variables comprise current values of core physical parameters, historical values of at least some of the core physical parameters, and rates of change of the historical values of at least some of the core physical parameters; and the external signals and demands comprise grid power instructions, real-time electricity prices, and predicted short-term future heat loads.
  3. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 2, wherein the core physical parameters include heat storage medium temperature, system pressure, heat storage medium flow, and heating power.
  4. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 1, wherein the deep reinforcement learning algorithm incorporating expert demonstration data is the demonstration-based deep deterministic policy gradient (DDPGfD) algorithm, and the training process for the control agent comprises: a pre-training stage, in which expert demonstration data from a traditional controller or from historical operation data is used for supervised learning of the actor network of the control agent, so as to initialize the network parameters; and an offline reinforcement learning stage, in which the pre-trained control agent is placed in a simulation environment and learns autonomously by interacting with the environment on the basis of the reward function, so as to optimize the control strategy.
  5. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 1, wherein the reward function is a multi-objective composite function whose output value is obtained by weighting a tracking-performance error term, an economic operating cost term, an equipment safety state term, and an energy efficiency optimization term.
  6. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 5, wherein the tracking-performance error term, the economic operating cost term, the equipment safety state term, and the energy efficiency optimization term are each given by a calculation formula [the symbols and formulas are not reproduced in this text], in which the symbols denote, in order: the grid power command; the actual output power of the system; the real-time electricity price; the heating power of the heating device; the scheduling time step; the temperature of the heat storage medium; the safe temperature, which is a constant; the rotational speed command of the pump in the heat storage system; and the energy consumption coefficient of the auxiliary equipment.
  7. The heat storage system regulation and control method based on expert demonstration and deep reinforcement learning according to claim 1, wherein the action space of the control agent is a continuous space, and the output of the control agent comprises control signals for a pump, a valve, and a heater in the heat storage system.
  8. A heat storage system regulation and control system based on expert demonstration and deep reinforcement learning, characterized by comprising: a data sensing unit (110), communicatively connected to key measuring points of the heat storage system, for collecting in real time the physical parameters that form a multidimensional state space describing the operating state of the heat storage system; an intelligent decision unit (120), communicatively connected to the data sensing unit (110), comprising a control agent deployed on a computing device, wherein the intelligent decision unit (120) is configured to receive state data from the data sensing unit (110) and run the trained control agent to generate corresponding control instructions; and a control execution unit (130), communicatively connected to the intelligent decision unit (120), for receiving the control instructions and driving the heat storage system to perform the corresponding adjustment actions.
  9. The heat storage system regulation and control system based on expert demonstration and deep reinforcement learning according to claim 8, wherein the system further comprises a simulation training platform for running a digital model of the heat storage system before deployment and providing the control agent with the simulation environment for the pre-training stage and the offline training stage.
  10. The heat storage system regulation and control system based on expert demonstration and deep reinforcement learning according to claim 9, wherein the simulation training platform and the heat storage system together form a digital twin system.
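
The weighted reward of claim 5 can be sketched in Python as follows. The term formulas of claim 6 are not reproduced in this text, so the term shapes used here (squared tracking error, electricity price times heating energy, over-temperature penalty, quadratic pump-speed penalty) and all weight values are assumptions for illustration only. Only the four-term weighted structure and the input quantities come from the claims; `RewardWeights` and `reward` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the source gives no numeric values."""
    tracking: float = 1.0
    cost: float = 0.1
    safety: float = 10.0
    efficiency: float = 0.05


def reward(p_cmd, p_out, price, p_heat, dt_h, temp, temp_safe,
           pump_speed, k_aux, w=RewardWeights()):
    """Weighted sum of four penalty-style terms (assumed shapes, not the patent's formulas)."""
    # Tracking term: squared gap between grid power command and actual output.
    r_track = -(p_cmd - p_out) ** 2
    # Economic term: cost of the heating energy drawn during one scheduling step.
    r_cost = -price * p_heat * dt_h
    # Safety term: penalize only the excess of medium temperature over the safe limit.
    r_safe = -max(0.0, temp - temp_safe)
    # Efficiency term: auxiliary consumption assumed to grow with pump speed squared.
    r_eff = -k_aux * pump_speed ** 2
    return (w.tracking * r_track + w.cost * r_cost
            + w.safety * r_safe + w.efficiency * r_eff)
```

With these assumed shapes, a larger tracking error or a medium temperature above the safe limit lowers the reward, which is the direction a maximizing agent needs. The relative weights would have to be tuned for a specific plant.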

Description

Heat storage system regulation and control method and system based on expert demonstration and deep reinforcement learning

Technical Field

The invention relates to the intersection of energy technology and automatic control, and in particular to a heat storage system regulation and control method and system based on expert demonstration and deep reinforcement learning.

Background

Building a new type of power system is a core path toward the "dual carbon" strategic goal. As a high proportion of renewable energy is connected to the grid, its randomness and intermittency pose a serious challenge to the flexible regulation capability of the power system. In this context, high-capacity heat storage systems are gradually becoming key infrastructure for supporting stable grid operation and the absorption of new energy, owing to their large-scale, long-duration energy storage capacity and favorable full-life-cycle economics.

However, although heat storage systems provide large-scale energy storage, the large thermal inertia inherent in their heat storage and release processes causes a significant dynamic delay in the system's power response. This physical characteristic makes it difficult for such systems to track high-frequency, high-precision grid dispatch instructions such as Automatic Generation Control (AGC) quickly and accurately, and severely restricts their effective deployment in scenarios that demand high dynamic performance, such as frequency regulation and load tracking. To overcome this dynamic response bottleneck, adopting advanced control strategies to improve system performance has become a key approach.

However, existing control methods still have several limitations in practical engineering applications. First, strategies that rely on an accurate mathematical model (such as model predictive control) are prone to degraded control performance from model mismatch when faced with the strong nonlinearity, large inertia, and complex thermo-electric coupling of a heat storage system. Second, system operation requires real-time trade-offs among multiple objectives such as fast response, economy, and equipment lifetime loss, and traditional PID or rule-based strategies struggle to make dynamically optimized decisions under such complex constraints. Third, existing control architectures generally lack adaptive regulation capability, have difficulty coping with uncertainties such as system parameter changes and external environmental disturbances, and cannot guarantee that the control strategy remains effective over time. Therefore, developing an intelligent control method that does not depend on an accurate system model and that offers adaptive, multi-objective dynamic optimization has become an urgent technical need for fully tapping the flexible regulation potential of large-capacity heat storage systems and supporting the safe and stable operation of power systems with a high proportion of new energy.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the shortcomings of the prior art by providing a heat storage system regulation and control method based on expert demonstration and deep reinforcement learning that can effectively address the power response delay caused by the inherent thermal inertia of the heat storage system and remove the dependence of conventional control strategies on an accurate mathematical model, thereby enabling multi-objective adaptive optimized operation in response to power grid dispatch instructions.

To solve the above technical problem, the technical solution of the invention is a heat storage system regulation and control method based on expert demonstration and deep reinforcement learning, comprising the following steps: constructing a multidimensional state space for describing the operating state of the heat storage system; based on the multidimensional state space, training a control agent using a deep reinforcement learning algorithm that incorporates expert demonstration data, so as to learn an optimized control strategy, wherein the training process is guided by maximizing a reward function that comprehensively reflects the multi-objective performance of the system; and deploying the trained control agent in the heat storage system, and outputting control instructions by feedback according to the real-time state of the heat storage system, so as to actively regulate the heat storage system.

Further, the multidimensional state space comprises heat storage system state variables and external signals and demands, wherein the heat storage system state variables comprise current values of c