CN-121238554-B - Strategy generation method and device for comprehensive energy system and electronic equipment
Abstract
The invention provides a strategy generation method and device for an integrated energy system, and electronic equipment, applicable to the technical field of energy scheduling. The method comprises: processing historical load information of a user side by calling a dynamic transaction adjustment model to generate initial resource transaction information of a power distribution network; processing the initial resource transaction information, initial electricity load transfer information and a preset translatable load proportion by calling a dynamic load transfer model to generate an electricity utilization strategy of the user side; calling an objective function to process initial environment information and photovoltaic power generation equipment parameters to generate the expected output power of the photovoltaic power generation equipment; determining a power distribution strategy of the power distribution network based on the electricity utilization strategy, the expected output power and an energy storage state of an energy storage system; determining an energy storage strategy of the energy storage system based on the initial resource transaction information; and generating a target strategy by iteratively updating the initial resource transaction information, the electricity utilization strategy and the energy storage strategy using a reinforcement learning algorithm.
Inventors
- LV SHILEI
- XIN XIAOYUE
- ZHANG PENGBO
- JIA YANBING
- WU ZHIFU
- WANG CHAOLIANG
Assignees
- Tianjin University (天津大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20251204
Claims (7)
- 1. A strategy generation method for an integrated energy system, comprising: acquiring historical load information, initial electricity load transfer information and initial environment information of a user side, and an energy storage state of an energy storage system, wherein the energy storage system comprises a chilled water storage device and a battery device; processing the historical load information of the user side by calling a dynamic transaction adjustment model to generate initial resource transaction information of a power distribution network; determining expected electricity consumption behavior of the user side according to the initial resource transaction information, and generating an electricity utilization strategy of the user side by calling a dynamic load transfer model to process the initial electricity load transfer information, a preset translatable load proportion and the expected electricity consumption behavior of the user side, wherein the electricity utilization strategy indicates expected time period information for the load that the user side can transfer to the energy storage system, so that the energy storage system performs an energy storage operation during the expected time period; calling an objective function to process the initial environment information and photovoltaic power generation equipment parameters to generate the expected output power of the photovoltaic power generation equipment; determining, according to priority, a power distribution strategy of the power distribution network based on the user demand of the user side, the expected output power of the photovoltaic power generation equipment and the energy storage state of the energy storage system; determining an energy storage strategy of the energy storage system based on the initial resource transaction information; and constructing a state space according to the historical load information, the initial electricity load transfer information and the initial environment information of the user side and the energy storage state of the energy storage system, constructing an action space according to the electricity utilization strategy of the user side, the power distribution strategy of the power distribution network and the energy storage strategy, and generating a target strategy by iteratively updating the initial resource transaction information, the electricity utilization strategy and the energy storage strategy using a reinforcement learning algorithm based on a target reward function; wherein determining the energy storage strategy of the energy storage system based on the initial resource transaction information comprises: in response to determining that the initial resource transaction information is less than a predetermined transaction threshold, determining that the chilled water storage device starts a cold storage operation and the battery device starts a charging operation; and in response to determining that the initial resource transaction information is greater than or equal to the predetermined transaction threshold, determining that the chilled water storage device starts a cold release operation and the battery device starts a discharging operation.
- 2. The strategy generation method according to claim 1, wherein the historical load information comprises a dynamic electricity utilization strategy of a historical period and resource transaction information of the historical period; and wherein processing the historical load information of the user side by calling the dynamic transaction adjustment model to generate the initial resource transaction information of the power distribution network comprises: averaging the dynamic electricity utilization strategies of the historical period to obtain an electricity utilization strategy average value of the historical period; performing difference processing, based on a preset period, on the dynamic electricity utilization strategy of the historical period and the electricity utilization strategy average value of the historical period to obtain an electricity utilization strategy variance of the historical period; and processing, based on a preset adjustment coefficient, the electricity utilization strategy variance of the historical period, the dynamic electricity utilization strategy of the historical period, the electricity utilization strategy average value of the historical period and the resource transaction information of the historical period to obtain initial resource transaction information of the historical period.
- 3. The strategy generation method according to claim 1, further comprising: determining, by the user side, a cooling demand according to the initial environment information and the initial resource transaction information; and determining, according to the cooling demand, a cold release rate of the chilled water storage device during the period in which the cold release operation is started.
- 4. The strategy generation method according to claim 1, wherein the target reward function comprises a reward function of the user side, a reward function of the energy storage system and a reward function of the power distribution network, and the strategy generation method further comprises: calling the reward function of the power distribution network to process the initial resource transaction information and the load demand of the power distribution network to generate a reward value of the power distribution network; calling the reward function of the user side to process the initial resource transaction information, the electricity utilization strategy of the user side and a preset electricity utilization preference parameter to generate a reward value of the user side; and calling the reward function of the energy storage system to process the initial resource transaction information and the energy storage state of the energy storage system to generate a reward value of the energy storage side; wherein the target strategy comprises the electricity utilization strategy of the user side, the energy storage strategy of the energy storage side and the target resource transaction information of the power distribution network obtained when the respective reward values of the user side, the energy storage side and the power distribution network satisfy a convergence condition.
- 5. A strategy generation device for an integrated energy system, comprising: an acquisition module, configured to acquire historical load information, initial electricity load transfer information and initial environment information of a user side, and an energy storage state of an energy storage system; a first generation module, configured to process the historical load information of the user side by calling a dynamic transaction adjustment model to generate initial resource transaction information of a power distribution network; a second generation module, configured to determine expected electricity consumption behavior of the user side according to the initial resource transaction information, and to generate an electricity utilization strategy of the user side by calling a dynamic load transfer model to process the initial electricity load transfer information, a preset translatable load proportion and the expected electricity consumption behavior of the user side; a third generation module, configured to call an objective function to process the initial environment information and photovoltaic power generation equipment parameters to generate the expected output power of the photovoltaic power generation equipment; a first determining module, configured to determine the priority with which the photovoltaic power generation equipment, the energy storage system and the power distribution network supply energy to the user side; a second determining module, configured to determine an energy storage strategy of the energy storage system based on the initial resource transaction information; and a fourth generation module, configured to construct a state space according to the historical load information, the initial electricity load transfer information and the initial environment information of the user side and the energy storage state of the energy storage system, construct an action space according to the electricity utilization strategy of the user side, a power distribution strategy of the power distribution network and the energy storage strategy, and generate a target strategy by iteratively updating the initial resource transaction information, the electricity utilization strategy and the energy storage strategy using a reinforcement learning algorithm based on a target reward function; wherein the second determining module is configured to: in response to determining that the initial resource transaction information is less than a predetermined transaction threshold, determine that the chilled water storage device starts a cold storage operation and the battery device starts a charging operation; and in response to determining that the initial resource transaction information is greater than or equal to the predetermined transaction threshold, determine that the chilled water storage device starts a cold release operation and the battery device starts a discharging operation.
- 6. An electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the strategy generation method of any one of claims 1 to 4.
- 7. A computer-readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to implement the strategy generation method of any one of claims 1 to 4.
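The threshold rule at the end of claim 1 and the priority-based supply mentioned in claims 1 and 5 can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `StorageAction`, `storage_policy` and `dispatch` are hypothetical, the resource transaction information is assumed to be a single price value, and the supply order (photovoltaic first, then storage, then the distribution network) is an assumption, since the claims only state that a priority exists.

```python
from dataclasses import dataclass


@dataclass
class StorageAction:
    chilled_water: str  # "store_cold" or "release_cold"
    battery: str        # "charge" or "discharge"


def storage_policy(price: float, threshold: float) -> StorageAction:
    """Threshold rule of claim 1, with the transaction information assumed to be a price."""
    if price < threshold:
        # Below the threshold: store cold and charge the battery.
        return StorageAction("store_cold", "charge")
    # At or above the threshold: release cold and discharge the battery.
    return StorageAction("release_cold", "discharge")


def dispatch(demand_kw: float, pv_kw: float, storage_kw: float) -> dict:
    """Serve demand from PV, then storage, then the grid (assumed priority order)."""
    from_pv = min(demand_kw, pv_kw)
    remaining = demand_kw - from_pv
    from_storage = min(remaining, storage_kw)
    from_grid = remaining - from_storage
    return {"pv": from_pv, "storage": from_storage, "grid": from_grid}
```

For example, `storage_policy(0.3, 0.5)` selects cold storage and charging, while `dispatch(10.0, 4.0, 3.0)` draws the remaining 3.0 kW from the distribution network after photovoltaic output and storage are exhausted.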
Description
Strategy generation method and device for comprehensive energy system and electronic equipment
Technical Field
The present invention relates to the field of energy scheduling technologies, and in particular to a strategy generation method and device for an integrated energy system, and an electronic device.
Background
As renewable energy sources are increasingly connected to the power distribution network and the electricity market develops, time-of-use resource transaction information struggles to reflect the fluctuation characteristics of distribution network load and renewable generation, and therefore gives users little incentive to adjust their loads to match distribution network demand. In regional integrated energy systems, the interaction and optimization of multiple energy sources is more complex still. In the related art, resource scheduling of a power system rarely takes full account of the resource utilization states of the power distribution network and the other energy systems, or of the game relationship among the multiple participants, so the resource utilization rate of the integrated energy system is low.
Disclosure of Invention
In view of the above, the invention provides a strategy generation method and device for an integrated energy system, and electronic equipment.
One aspect of the present invention provides a strategy generation method for an integrated energy system, including the following steps. Historical load information, initial electricity load transfer information and initial environment information of a user side, and an energy storage state of an energy storage system, are acquired. The historical load information of the user side is processed by calling a dynamic transaction adjustment model to generate initial resource transaction information of the power distribution network.
An electricity utilization strategy of the user side is generated by calling a dynamic load transfer model to process the initial resource transaction information, the initial electricity load transfer information and the preset translatable load proportion. The electricity utilization strategy indicates expected time period information for the load that the user side can transfer to the energy storage system, so that the energy storage system performs an energy storage operation during the expected time period. An objective function is called to process the initial environment information and the photovoltaic power generation equipment parameters to generate the expected output power of the photovoltaic power generation equipment. A power distribution strategy of the power distribution network is determined based on the electricity utilization strategy, the expected output power and the energy storage state of the energy storage system. An energy storage strategy of the energy storage system is determined based on the initial resource transaction information. A state space is constructed according to the historical load information, the initial electricity load transfer information and the initial environment information of the user side and the energy storage state of the energy storage system; an action space is constructed according to the electricity utilization strategy of the user side, the power distribution strategy of the power distribution network and the energy storage strategy; and a target strategy is generated by iteratively updating the initial resource transaction information, the electricity utilization strategy and the energy storage strategy using a reinforcement learning algorithm based on a target reward function.
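The load transfer step described above can be made concrete with a small Python sketch. The source does not give the form of the dynamic load transfer model, so everything here is an assumption for illustration: the function name `shift_load`, the use of per-period prices as the resource transaction information, and the simple rule that moves a fixed share of load from the most expensive periods to the cheapest ones.

```python
def shift_load(baseline, prices, shiftable_ratio, n_periods=1):
    """Move a share of load from the costliest periods to the cheapest ones.

    baseline        -- load per period (hypothetical stand-in for the
                       initial electricity load transfer information)
    prices          -- price per period (assumed form of the resource
                       transaction information)
    shiftable_ratio -- preset translatable load proportion, between 0 and 1
    n_periods       -- how many periods to shift from and to (assumption)
    Returns the shifted load profile and the indices of the target periods.
    """
    if len(baseline) != len(prices):
        raise ValueError("baseline and prices must have the same length")
    if not 0.0 <= shiftable_ratio <= 1.0:
        raise ValueError("shiftable_ratio must lie in [0, 1]")
    if n_periods < 1 or 2 * n_periods > len(prices):
        raise ValueError("n_periods is out of range for this horizon")

    # Rank the periods from cheapest to most expensive.
    order = sorted(range(len(prices)), key=lambda i: prices[i])
    cheap, costly = order[:n_periods], order[-n_periods:]

    shifted = list(baseline)
    moved = 0.0
    for i in costly:
        delta = baseline[i] * shiftable_ratio
        shifted[i] -= delta
        moved += delta
    # Spread the moved load evenly over the cheapest periods.
    for i in cheap:
        shifted[i] += moved / len(cheap)
    return shifted, cheap
```

In this sketch the returned period indices play the role of the expected time period information: they mark when transferred load arrives and when the energy storage system would store energy. Total load is conserved, only its timing changes.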
According to an embodiment of the invention, the historical load information comprises a dynamic electricity utilization strategy of a historical period and resource transaction information of the historical period. Processing the historical load information of the user side by calling the dynamic transaction adjustment model to generate the initial resource transaction information of the power distribution network comprises: averaging the dynamic electricity utilization strategies of the historical period to obtain an electricity utilization strategy average value of the historical period; performing difference processing, based on a preset period, on the dynamic electricity utilization strategy of the historical period and the electricity utilization strategy average value of the historical period to obtain an electricity utilization strategy variance of the historical period; and processing, based on a preset adjustment coefficient, the electricity utilization strategy variance of the historical period, the dynamic electricity utilization strategy of the historical period, the electricity utilization strategy average value of the historical period and the resource transaction information of the historical period to obtain initi