CN-121984223-A - Distributed control-based power energy storage optimal configuration system and method thereof


Abstract

The invention relates to the technical field of power system automatic control and distributed energy networks, and in particular to a power energy storage optimal configuration system and method based on decentralized control. The method comprises: establishing communication connections between decentralized control agents; initializing a Lyapunov stability constraint boundary and a game vulnerability threshold; and, in response to a trigger signal of the current control period, executing a decentralized optimization flow. The decentralized optimization flow comprises: collecting running state data of a local energy storage unit; receiving interaction state data broadcast by adjacent nodes; calculating a cooperative game vulnerability index of the current network node; removing abnormal interaction data that exhibit false data injection characteristics to generate a cleaned trusted data set; and generating an energy storage power scheduling instruction using a multi-agent reinforcement learning model driven by a target control strategy.

Inventors

CHENG SIZHE

Assignees

China Three Gorges University (三峡大学)

Dates

Publication Date: 20260505
Application Date: 20260120

Claims (8)

1. A power energy storage optimal configuration method based on decentralized control, characterized by comprising the following steps: deploying decentralized control agents on a plurality of distributed energy storage nodes in a power network, wherein communication connections are established among the decentralized control agents; initializing a Lyapunov stability constraint boundary and a game vulnerability threshold; and, in response to a trigger signal of the current control period, performing the following decentralized optimization procedure: Step 1, collecting running state data of a local energy storage unit and receiving interaction state data broadcast by adjacent nodes; Step 2, calculating, based on the running state data and the interaction state data, a cooperative game vulnerability index of the current network node through a non-cooperative game equilibrium analysis algorithm, wherein the cooperative game vulnerability index represents the probabilistic risk that adjacent nodes exit cooperative regulation; Step 3, inputting the interaction state data into a preset adversarial noise identification model, removing abnormal interaction data that exhibit false data injection characteristics, and generating a cleaned trusted data set; Step 4, comparing the cooperative game vulnerability index with the game vulnerability threshold, and selecting a target control strategy from a preset strategy library according to the comparison result; and Step 5, generating, based on the trusted data set and the running state data, an energy storage power scheduling instruction using a multi-agent reinforcement learning model driven by the target control strategy, subject to the Lyapunov stability constraint boundary being satisfied.
2. The power energy storage optimal configuration method based on decentralized control according to claim 1, wherein selecting a target control strategy from a preset strategy library according to the comparison result comprises: in response to the cooperative game vulnerability index being greater than the game vulnerability threshold, selecting a survival priority strategy as the target control strategy, wherein the survival priority strategy is configured to increase a virtual incentive weight in a reward function of the multi-agent reinforcement learning model so as to maintain the node online rate, and to reduce the weight of frequency adjustment accuracy; and, in response to the cooperative game vulnerability index being less than or equal to the game vulnerability threshold, selecting a performance priority strategy as the target control strategy, wherein the performance priority strategy is configured to maximize a physical adjustment precision weight in the reward function of the multi-agent reinforcement learning model so as to approach a globally optimal solution, and to reduce the virtual incentive weight.
3. The power energy storage optimal configuration method based on decentralized control according to claim 1, wherein calculating the cooperative game vulnerability index of the current network node through a non-cooperative game equilibrium analysis algorithm based on the running state data and the interaction state data comprises: calculating an expected benefit value of the local energy storage unit at the current moment based on the running state data; estimating a neighbor expected benefit value of the adjacent node at the current moment based on the interaction state data; calculating a benefit asymmetry between the expected benefit value and the neighbor expected benefit value; and weighting the benefit asymmetry with a preset historical cooperation probability attenuation factor to obtain the cooperative game vulnerability index.
4. The power energy storage optimal configuration method based on decentralized control according to claim 1, wherein inputting the interaction state data into a preset adversarial noise identification model and removing abnormal interaction data that exhibit false data injection characteristics comprises: extracting time-series fluctuation features and amplitude distribution features from the interaction state data; mapping the time-series fluctuation features and the amplitude distribution features to a high-dimensional feature space to obtain a feature vector, and calculating the Euclidean distance between the feature vector and a preset centroid of the normal physical signal manifold; in response to the Euclidean distance being greater than a preset abnormality judgment distance, judging the corresponding interaction state data to be abnormal interaction data and blocking it; and, in response to the Euclidean distance being less than or equal to the abnormality judgment distance, judging the corresponding interaction state data to be normal interaction data and retaining it in the trusted data set.
5. The power energy storage optimal configuration method based on decentralized control according to claim 1, wherein using the multi-agent reinforcement learning model driven by the target control strategy, based on the trusted data set and the running state data, comprises: constructing a multi-agent deep reinforcement learning network comprising a state space, an action space and a reward function; mapping the trusted data set and the running state data into state vectors in the state space; loading weight parameters defined by the target control strategy into the reward function to form a dynamic reward function; and performing feature extraction and policy gradient descent calculation on the state vector through the multi-agent deep reinforcement learning network, and outputting an action vector in the action space, wherein the action vector corresponds to the energy storage power scheduling instruction.
6. The power energy storage optimal configuration method based on decentralized control according to claim 5, wherein generating the energy storage power scheduling instruction subject to the Lyapunov stability constraint boundary being satisfied comprises: constructing, based on Lyapunov stability theory, a Lyapunov function representing a frequency deviation energy function and a voltage deviation energy function of the power network; calculating the derivative of the Lyapunov function with respect to time, and determining a stable control region in which the derivative is less than zero; judging whether the action vector falls within the stable control region; in response to the action vector falling within the stable control region, directly converting the action vector into the energy storage power scheduling instruction; and, in response to the action vector not falling within the stable control region, projecting the action vector onto the boundary of the stable control region to obtain a projected vector, and converting the projected vector into the energy storage power scheduling instruction.
7. The power energy storage optimal configuration method based on decentralized control according to claim 2, wherein the running state data comprises the state of charge of the local energy storage unit, the current charge and discharge power, a grid frequency deviation value at the local connection point, and a voltage deviation value at the local connection point; and the interaction state data comprises the state of charge of the adjacent node, the marginal adjustment cost quote of the adjacent node, and the predicted adjustment power of the adjacent node.
8. A power energy storage optimal configuration system based on decentralized control, for implementing the power energy storage optimal configuration method based on decentralized control according to any one of claims 1 to 7, characterized by comprising: a distributed communication module configured to establish communication connections between the distributed energy storage nodes and to broadcast interaction state data; a data acquisition module configured to acquire the running state data of the local energy storage unit; a vulnerability assessment module configured to calculate, based on the running state data and the interaction state data, a cooperative game vulnerability index of the current network node through a non-cooperative game equilibrium analysis algorithm, wherein the cooperative game vulnerability index represents the probabilistic risk that adjacent nodes exit cooperative regulation; a noise filtering module configured to input the interaction state data into a preset adversarial noise identification model, remove abnormal interaction data that exhibit false data injection characteristics, and generate a cleaned trusted data set; a strategy switching module configured to compare the cooperative game vulnerability index with the game vulnerability threshold and to select a target control strategy from a preset strategy library according to the comparison result; and an optimization control module configured to generate, based on the trusted data set and the running state data, an energy storage power scheduling instruction using a multi-agent reinforcement learning model driven by the target control strategy, subject to the Lyapunov stability constraint boundary being satisfied.
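
The decision logic recited in claims 2, 3, 4 and 6 can be illustrated with a minimal Python sketch. The claims give no formulas, so everything concrete below is an assumption made for illustration only: the function names, the normalized-gap form of the benefit asymmetry, the reward weight values in `STRATEGY_LIBRARY`, and the linearized Lyapunov derivative `V_dot(u) = c.u - d` (with a hypothetical `margin` parameter) used for the projection step. It is not the applicant's implementation.

```python
import math
from typing import List, Sequence

# Hypothetical reward weights; the claims only state which weight rises or falls.
STRATEGY_LIBRARY = {
    "survival_priority": {"virtual_incentive": 0.7, "adjustment_accuracy": 0.3},
    "performance_priority": {"virtual_incentive": 0.2, "adjustment_accuracy": 0.8},
}


def vulnerability_index(local_benefit: float,
                        neighbor_benefits: Sequence[float],
                        decay_factor: float) -> float:
    """Claim 3 sketch: benefit asymmetry weighted by an attenuation factor.

    Assumed formula: normalized gap between the local expected benefit and
    the mean neighbor expected benefit, scaled by decay_factor, capped at 1.
    """
    mean_neighbor = sum(neighbor_benefits) / len(neighbor_benefits)
    scale = max(abs(local_benefit), abs(mean_neighbor), 1e-9)
    asymmetry = abs(local_benefit - mean_neighbor) / scale
    return min(1.0, decay_factor * asymmetry)


def select_strategy(index: float, threshold: float) -> str:
    """Claim 2: survival priority above the threshold, else performance priority."""
    return "survival_priority" if index > threshold else "performance_priority"


def filter_trusted(features: Sequence[Sequence[float]],
                   centroid: Sequence[float],
                   max_distance: float) -> List[Sequence[float]]:
    """Claim 4: keep feature vectors within max_distance of the centroid."""
    return [f for f in features if math.dist(f, centroid) <= max_distance]


def project_to_stable_region(action: Sequence[float],
                             c: Sequence[float],
                             d: float,
                             margin: float = 0.0) -> List[float]:
    """Claim 6 sketch, assuming a linearized derivative V_dot(u) = c.u - d.

    Actions with V_dot < 0 pass through unchanged; any other action is
    moved by Euclidean projection onto the hyperplane c.u = d - margin.
    If c is the zero vector no action can change V_dot, so the action
    is returned unchanged.
    """
    v_dot = sum(ci * ui for ci, ui in zip(c, action)) - d
    norm_sq = sum(ci * ci for ci in c)
    if v_dot < 0.0 or norm_sq == 0.0:
        return list(action)
    step = (v_dot + margin) / norm_sq
    return [ui - step * ci for ui, ci in zip(action, c)]


# Example control-period decision with made-up numbers.
index = vulnerability_index(10.0, [4.0, 6.0], decay_factor=0.8)
strategy = select_strategy(index, threshold=0.3)
reward_weights = STRATEGY_LIBRARY[strategy]
```

In this sketch the selected strategy only changes the reward weights handed to the learning model, which matches the claims' description of a dynamic reward function; the reinforcement learning network itself is omitted.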

Description

Distributed control-based power energy storage optimal configuration system and method thereof

Technical Field

The invention relates to the technical field of power system automatic control and distributed energy networks, and in particular to a power energy storage optimal configuration system and method based on decentralized control.

Background

In today's distribution networks with a high penetration of distributed resources, a large number of distributed energy storage nodes participate in system regulation through an open communication network. These nodes belong to different stakeholders and behave in a distinctly self-interested manner. To optimally configure energy storage resources, existing schemes generally adopt centralized control, or a consensus-based cooperative architecture built on the assumption of full cooperation: all nodes are assumed to obey scheduling instructions unconditionally, and the acquired state data is used directly to drive the optimization algorithm. Such schemes offer some regulation capability in an ideal environment, but they have three shortcomings. First, because they ignore non-cooperative game behavior among nodes, uneven benefit distribution can easily trigger a trust crisis in which nodes collectively withdraw from regulation. Second, the prior art mostly handles data anomalies with a simple threshold method, which struggles to distinguish malicious false data injection attacks from normal physical fluctuations of the grid. Third, purely data-driven artificial intelligence algorithms lack explicit physical safety boundary constraints, so that in extreme game scenarios or during algorithm exploration the system is prone to frequency and voltage limit violations and instability. As a result, it is difficult to balance node survival rate against overall regulation efficiency in a complex network environment. The problem to be solved is therefore how to quantify and cope with the system vulnerability caused by non-cooperative games among nodes, and how to guarantee the physical safety of intelligent decisions while removing adversarial noise.

Disclosure of Invention

The invention aims to provide a power energy storage optimal configuration system and method based on decentralized control that can quantify and cope with the cooperative vulnerability caused by non-cooperative games among nodes, avoid cascading grid collapse caused by individual profit-seeking behavior as well as interference from adversarial false data injection, and achieve a dynamically balanced optimization of system survival rate and regulation efficiency in a complex game environment while strictly guaranteeing the Lyapunov physical stability constraint. Specifically, the technical scheme of the invention is as follows.

A power energy storage optimal configuration method based on decentralized control comprises the following steps: deploying decentralized control agents on a plurality of distributed energy storage nodes in a power network, wherein communication connections are established among the decentralized control agents; initializing a Lyapunov stability constraint boundary and a game vulnerability threshold; and, in response to a trigger signal of the current control period, performing the following decentralized optimization procedure:

Step 1, collecting running state data of a local energy storage unit and receiving interaction state data broadcast by adjacent nodes;

Step 2, calculating, based on the running state data and the interaction state data, a cooperative game vulnerability index of the current network node through a non-cooperative game equilibrium analysis algorithm, wherein the cooperative game vulnerability index represents the probabilistic risk that adjacent nodes exit cooperative regulation;

Step 3, inputting the interaction state data into a preset adversarial noise identification model, removing abnormal interaction data that exhibit false data injection characteristics, and generating a cleaned trusted data set;

Step 4, comparing the cooperative game vulnerability index with the game vulnerability threshold, and selecting a target control strategy from a preset strategy library according to the comparison result; and

Step 5, generating, based on the trusted data set and the running state data, an energy storage power scheduling instruction using a multi-agent reinforcement learning model driven by the target control strategy, subject to the Lyapunov stability constraint boundary being satisfied.

Preferably, selecting a target control strategy from a preset strategy library according to the comparison result comprises: selecting a survival priority strategy as the target control strategy in response to the cooperative game vulnerability index being greater than