CN-121485036-B - Multi-point energy storage aggregation cooperative operation method, device, equipment and storage medium

Abstract

The invention relates to the technical field of comprehensive energy management and discloses a multi-point energy storage aggregation cooperative operation method, device, equipment and storage medium. The method comprises: constructing separate Markov decision process models for lithium battery energy storage and flywheel energy storage, based on their differing physical characteristics in cycle life, energy density and operation constraints; and training an algorithm framework with a centralized-training, distributed-execution mechanism, so that the trained model performs multi-agent cooperative optimization under the target constraints and solves for the optimal cooperative operation scheme, which takes power allocation as its core and balances dynamic response performance against full-life-cycle economy. Dynamic performance is ensured by the fast response of the flywheel, and full-life-cycle economy is achieved through rational scheduling and life-loss control of the lithium battery, which significantly improves the stability and economic benefit of multi-point energy storage aggregation operation.

Inventors

  • YAN XINRONG
  • ZHAO JIANFU
  • LIU LILI
  • Jiao Junhao
  • XU FAN
  • CHEN QIAO
  • DING KE
  • DENG RUIFENG
  • XIE YURONG
  • LI YINSHI
  • WANG YUANCHEN
  • MOU MIN
  • AN DOU
  • WANG RUI
  • YANG HAOJIE

Assignees

  • Huadian Electric Power Research Institute Co., Ltd. (华电电力科学研究院有限公司)
  • Xi'an Jiaotong University (西安交通大学)

Dates

Publication Date
2026-05-08
Application Date
2025-12-26

Claims (10)

  1. A multi-point energy storage aggregation cooperative operation method, characterized by comprising the following steps: constructing a Markov decision process model of lithium battery energy storage and a Markov decision process model of flywheel energy storage, respectively, based on the differentiated physical characteristics of lithium battery energy storage and flywheel energy storage in terms of cycle life, energy density and operation constraints; wherein the components of the Markov decision process model of lithium battery energy storage comprise a reward function of lithium battery energy storage, which quantifies the comprehensive benefit of lithium battery operation by taking into account a power fluctuation penalty and the system cost, and which is expressed in terms of: the instant reward of lithium battery energy storage at time t, where B denotes the lithium battery; weight coefficients, determined through a grid search strategy, that balance the priorities of the different optimization objectives; the power exchanged at time t between the lithium battery and flywheel multi-point energy storage station and the main grid; the average exchanged power over a preset time period; the transaction cost of the system; and the degradation cost of the lithium battery; and wherein the components of the Markov decision process model of flywheel energy storage comprise a reward function of flywheel energy storage, which quantifies the comprehensive benefit of flywheel operation by taking into account a power fluctuation penalty, the system cost and a rotational speed deviation penalty, and which is expressed in terms of: the instant reward of flywheel energy storage at time t, where F denotes the flywheel; FC_t, the cost of the flywheel energy storage system; the accumulated fatigue coefficient of the flywheel at time t; the actual rotational speed of the flywheel at time t; the preset optimal rotational speed; and weight coefficients determined through a grid search strategy; constructing a heterogeneous multi-agent deep deterministic policy gradient algorithm framework, and modeling a lithium battery energy storage agent and a flywheel energy storage agent based on the two heterogeneous Markov decision process models, so that each agent is adapted to the physical characteristics of its corresponding energy storage; and training the algorithm framework with a centralized-training, distributed-execution mechanism, the trained algorithm model solving, through a heterogeneous multi-agent cooperative optimization strategy under target constraints, for the optimal cooperative operation scheme of lithium battery and flywheel multi-point energy storage aggregation, which takes power allocation as its core and balances dynamic response performance and full-life-cycle economy.
  2. The method of claim 1, wherein the components of the Markov decision process model of lithium battery energy storage further comprise: a state space of lithium battery energy storage, comprising the electricity purchase price, photovoltaic generation power, wind generation power, user demand power, proton exchange membrane electrolyzer consumption power and lithium battery state of charge; an action space of lithium battery energy storage, being the lithium battery charging and discharging power, each subject to the maximum charging and discharging power constraint of the lithium battery; and a state transfer function of lithium battery energy storage, which describes how the state of charge evolves with the charging and discharging power and is expressed in terms of: the state of charge of the lithium battery at time t; the state of charge of the lithium battery at time t+1; the charging efficiency of the lithium battery; the discharging efficiency of the lithium battery; the lithium battery capacity; the time interval between two adjacent decision moments; the charging power of the lithium battery; and the discharging power of the lithium battery.
  3. The method according to claim 1 or 2, wherein the components of the Markov decision process model of flywheel energy storage further comprise: a state space of flywheel energy storage, comprising the electricity purchase price, photovoltaic generation power, wind generation power, user power consumption, proton exchange membrane electrolyzer consumption power, the current rotational speed of the flywheel and the accumulated fatigue coefficient of the flywheel; an action space of flywheel energy storage, being the flywheel charging and discharging power; and a state transfer function of flywheel energy storage, which computes the change in rotational speed and the accumulation of fatigue from the rotational inertia and the efficiency, and is expressed in terms of: the rotational speed of the flywheel at time t; the rotational speed of the flywheel at time t+1; the time interval between two adjacent decision moments; the rotational inertia of the flywheel; the charge-discharge efficiency of the flywheel; the charging power of the flywheel at time t; the discharging power of the flywheel at time t; the action value output by the flywheel energy storage agent at time t, from which the charging and discharging powers are obtained through a max function so that each has the correct sign; the loss coefficient; the power variation of the flywheel at time t; the accumulated fatigue coefficient of the flywheel at time t+1; and the accumulated fatigue coefficient of the flywheel at time t.
  4. The method of claim 1, wherein training the algorithm framework with a centralized-training, distributed-execution mechanism comprises: constructing an evaluation (critic) network and performing centralized training with global state information, wherein the global state information is the union of the state space of lithium battery energy storage and the state space of flywheel energy storage and is used to comprehensively evaluate the cooperative decisions of the heterogeneous agents; constructing an action (actor) network and performing distributed execution with local state information, wherein the local state information of the lithium battery energy storage agent is the state space of lithium battery energy storage and the local state information of the flywheel energy storage agent is the state space of flywheel energy storage, thereby reducing communication overhead between the agents; and training the model by iteratively updating the evaluation network and the action network, and outputting lithium battery charging and discharging power and flywheel charging and discharging power that satisfy the action space constraints, so as to realize the aggregated cooperative operation of the lithium battery and flywheel multi-point energy storage.
  5. The method of claim 4, wherein, when training the model, the evaluation network is updated by minimizing a loss function, the loss function being defined in terms of: the sum of the instant reward of lithium battery energy storage and the instant reward of flywheel energy storage; a discount factor that weighs the importance of the current reward against future rewards; the cooperative policy output by the action network in a given state; and the expected return of taking a given action in a given state; and wherein the action network is updated by a policy gradient, the policy gradient being defined in terms of: the parameters of the action network; the performance index of the policy; the batch size; S_B, the state of lithium battery energy storage; and S_F, the state of flywheel energy storage.
  6. The method of claim 3, wherein the target constraints include a system power balance constraint, a lithium battery energy storage operation constraint and a flywheel energy storage operation constraint, wherein: the system power balance constraint requires that the sum of the photovoltaic generation power, wind generation power, lithium battery charging and discharging power and flywheel charging and discharging power match the sum of the user demand power and the proton exchange membrane electrolyzer consumption power; the lithium battery energy storage operation constraint comprises a state of charge constraint and a charging and discharging power constraint; and the flywheel energy storage operation constraint comprises a rotational speed constraint and an accumulated fatigue coefficient constraint.
  7. A multi-point energy storage aggregation cooperative operation device, the device comprising: a heterogeneous model building module, configured to construct a Markov decision process model of lithium battery energy storage and a Markov decision process model of flywheel energy storage, respectively, based on the differentiated physical characteristics of lithium battery energy storage and flywheel energy storage in terms of cycle life, energy density and operation constraints; wherein the components of the Markov decision process model of lithium battery energy storage comprise a reward function of lithium battery energy storage, which quantifies the comprehensive benefit of lithium battery operation by taking into account a power fluctuation penalty and the system cost, and which is expressed in terms of: the instant reward of lithium battery energy storage at time t, where B denotes the lithium battery; weight coefficients, determined through a grid search strategy, that balance the priorities of the different optimization objectives; the power exchanged at time t between the lithium battery and flywheel multi-point energy storage station and the main grid; the average exchanged power over a preset time period; the transaction cost of the system; and the degradation cost of the lithium battery; and wherein the components of the Markov decision process model of flywheel energy storage comprise a reward function of flywheel energy storage, which quantifies the comprehensive benefit of flywheel operation by taking into account a power fluctuation penalty, the system cost and a rotational speed deviation penalty, and which is expressed in terms of: the instant reward of flywheel energy storage at time t, where F denotes the flywheel; FC_t, the cost of the flywheel energy storage system; the accumulated fatigue coefficient of the flywheel at time t; the actual rotational speed of the flywheel at time t; the preset optimal rotational speed; and weight coefficients determined through a grid search strategy; an agent modeling module, configured to construct a heterogeneous multi-agent deep deterministic policy gradient algorithm framework and to model a lithium battery energy storage agent and a flywheel energy storage agent based on the two heterogeneous Markov decision process models, so that each agent is adapted to the physical characteristics of its corresponding energy storage; and a multi-agent cooperative optimization strategy output module, configured to train the algorithm framework with a centralized-training, distributed-execution mechanism, the trained algorithm model solving, through a heterogeneous multi-agent cooperative optimization strategy under target constraints, for the optimal cooperative operation scheme of lithium battery and flywheel multi-point energy storage aggregation, which takes power allocation as its core and balances dynamic response performance and full-life-cycle economy.
  8. An electronic device, comprising a memory and a processor communicatively connected to each other, the memory storing computer instructions and the processor executing the computer instructions to perform the multi-point energy storage aggregation cooperative operation method according to any one of claims 1 to 6.
  9. A computer-readable storage medium storing computer instructions for causing a computer to perform the multi-point energy storage aggregation cooperative operation method according to any one of claims 1 to 6.
  10. A computer program product comprising computer instructions for causing a computer to perform the multi-point energy storage aggregation cooperative operation method according to any one of claims 1 to 6.
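
The quantities recited in claims 1 to 5 can be illustrated with a minimal Python sketch. The formulas of the claims are not reproduced in this text, so every functional form below is an assumption chosen to be consistent with the listed quantities: a conventional state-of-charge update for the lithium battery, a kinetic-energy balance for the flywheel speed, fatigue that grows linearly with the change in power, negative weighted-sum rewards, and a mean squared temporal-difference loss for the evaluation network. All function and parameter names are hypothetical and are not taken from the patent.

```python
import math


def soc_step(soc, p_ch, p_dis, eta_ch, eta_dis, capacity, dt):
    """Assumed lithium battery transition: returns the next state of charge.

    p_ch, p_dis: charging / discharging power (kW, both >= 0)
    capacity: battery capacity (kWh); dt: decision interval (h)
    """
    return soc + (eta_ch * p_ch - p_dis / eta_dis) * dt / capacity


def flywheel_step(omega, fatigue, action, prev_power, inertia, eta, loss_coeff, dt):
    """Assumed flywheel transition: returns (next speed, next fatigue).

    omega: rad/s; action, prev_power: W (action > 0 charges, < 0 discharges)
    inertia: kg*m^2; dt: s; loss_coeff: fatigue added per watt of power change
    """
    p_ch = max(action, 0.0)    # max() keeps the charging power non-negative
    p_dis = max(-action, 0.0)  # max() keeps the discharging power non-negative
    energy = 0.5 * inertia * omega ** 2  # stored kinetic energy (J)
    energy = max(energy + (eta * p_ch - p_dis / eta) * dt, 0.0)
    next_omega = math.sqrt(2.0 * energy / inertia)
    next_fatigue = fatigue + loss_coeff * abs(action - prev_power)
    return next_omega, next_fatigue


def battery_reward(p_grid, p_avg, trade_cost, degradation_cost, w_fluct, w_cost):
    """Assumed reward: penalize grid-power fluctuation and total cost."""
    fluctuation = abs(p_grid - p_avg)
    return -(w_fluct * fluctuation + w_cost * (trade_cost + degradation_cost))


def flywheel_reward(p_grid, p_avg, flywheel_cost, omega, omega_opt,
                    w_fluct, w_cost, w_speed):
    """Assumed reward: fluctuation, cost and squared speed-deviation penalties."""
    fluctuation = abs(p_grid - p_avg)
    speed_dev = (omega - omega_opt) ** 2
    return -(w_fluct * fluctuation + w_cost * flywheel_cost + w_speed * speed_dev)


def critic_loss(batch, gamma):
    """Mean squared TD error over a batch of (r_b, r_f, q_sa, q_next) tuples.

    The TD target uses the sum of both agents' instant rewards plus the
    discounted value of the next state-action pair.
    """
    errors = [(r_b + r_f + gamma * q_next - q_sa) ** 2
              for r_b, r_f, q_sa, q_next in batch]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # One illustrative step for each storage type (all values are made up).
    print(soc_step(0.5, 10.0, 0.0, 0.95, 0.95, 100.0, 1.0))
    print(flywheel_step(100.0, 0.0, 1000.0, 0.0, 2.0, 0.95, 1e-6, 1.0))
```

In this sketch the flywheel fatigue term stays separate from the flywheel cost term; how the patent's FC_t relates to the accumulated fatigue coefficient cannot be determined from the text, so the caller supplies the cost directly.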

Description

Multi-point energy storage aggregation cooperative operation method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of comprehensive energy management, and in particular to a multi-point energy storage aggregation cooperative operation method, device, equipment and storage medium.

Background

Driven by the transformation of the energy structure, the installed capacity of renewable energy sources (such as photovoltaics and wind power) continues to expand. However, renewable energy sources are strongly intermittent and volatile, and their large-scale connection to the grid poses serious challenges to the stable operation, power quality and supply-demand balancing of the power system. Energy storage has therefore become a key supporting technology for smoothing renewable energy fluctuations and improving the flexibility of the power system. Lithium battery energy storage offers high energy density and good cycling characteristics and is widely used in the energy storage field, while flywheel energy storage offers fast response, high power density, long cycle life and insensitivity to depth of charge and discharge, and can quickly smooth short-term power fluctuations. In practical applications, a single type of energy storage often cannot simultaneously meet the system's multidimensional requirements for energy capacity, power response speed and full-life-cycle economy. Aggregating heterogeneous energy storage such as lithium batteries and flywheels and operating it cooperatively, so that the complementary advantages of the different technologies are fully exploited, has therefore become a research hotspot in the energy storage field.

Currently, research on the aggregated cooperative operation of multi-point heterogeneous energy storage focuses on traditional optimization algorithms (such as linear programming and dynamic programming). However, traditional algorithms depend on an accurate mathematical model, adapt poorly to complex nonlinear systems, and have difficulty handling real-time interaction among multiple agents. With the development of reinforcement learning, the deep deterministic policy gradient (DDPG) algorithm has shown good potential for control problems with continuous action spaces and provides a new technical path for heterogeneous energy storage aggregation cooperative operation. However, most existing reinforcement-learning-based methods target a single type of energy storage, or do not fully consider the heterogeneous characteristics of, and the cooperation mechanisms among, multiple agents; in a multi-point heterogeneous energy storage aggregation scenario, they struggle to achieve optimal cooperative operation that balances dynamic response performance and full-life-cycle economy.

Disclosure of Invention

To solve the technical problems in the prior art that, in the aggregated cooperative operation of multi-point heterogeneous energy storage (lithium battery and flywheel), traditional algorithms adapt poorly, reinforcement learning methods do not fully consider heterogeneous characteristics and multi-agent cooperation, and dynamic response performance and full-life-cycle economy are difficult to balance, the invention provides a multi-point energy storage aggregation cooperative operation method, device, equipment and storage medium, in which heterogeneous agents represent the physical characteristics of the different storage media of lithium battery energy storage and flywheel energy storage, so that dynamic cooperative optimization of the multi-point energy storage system is realized while balancing dynamic response performance and full-life-cycle economy.

In a first aspect, the present invention provides a multi-point energy storage aggregation cooperative operation method, including: constructing a Markov decision process model of lithium battery energy storage and a Markov decision process model of flywheel energy storage, respectively, based on the differentiated physical characteristics of lithium battery energy storage and flywheel energy storage in terms of cycle life, energy density and operation constraints; constructing a heterogeneous multi-agent deep deterministic policy gradient algorithm framework, and modeling a lithium battery energy storage agent and a flywheel energy storage agent based on the two heterogeneous Markov decision process models, so that each agent is adapted to the physical characteristics of its corresponding energy storage; and training the algorithm framework by adopting a centralized training and dis