
CN-122026540-A - Multi-machine wide-area damping cooperative control system and method based on multi-agent reinforcement learning


Abstract

The invention relates to the field of power system dynamic control and discloses a multi-machine wide-area damping cooperative control system and method based on multi-agent reinforcement learning. It addresses the increased risk of low-frequency oscillation in power grids with a high share of renewable generation and the insufficient cooperative optimization of traditional controllers. The system constructs a controller from wide-area measurement signals, determines its installation location and feedback signal by combining participation factor analysis with observability analysis, and optimizes the parameters of the doubly-fed wind turbine controller and the synchronous machine controller using a multi-agent deep deterministic policy gradient (MADDPG) algorithm. Through initialization, interaction, judgment, decision, and learning steps, the agents can quickly generate an optimal control strategy. The invention can effectively improve system damping, quickly suppress power fluctuations, and ensure stable operation of the power grid.

Inventors

  • SUN ZHENGLONG
  • LI SHENHAO
  • ZHANG RUI
  • WANG LIXIN
  • PAN CHAO
  • CAI GUOWEI

Assignees

  • Northeast Electric Power University (东北电力大学)

Dates

Publication Date
20260512
Application Date
20251230

Claims (8)

  1. A multi-machine wide-area damping cooperative control system based on multi-agent reinforcement learning, characterized by comprising the following steps: step S1, constructing a wide-area damping controller based on a wide-area measurement signal; step S2, determining the installation location of the wide-area damping controller and selecting its feedback signal; and step S3, based on a multi-agent deep deterministic policy gradient (MADDPG) algorithm, determining the observations of the agents and the controller parameters to be optimized, and training the agents, wherein the trained agents are capable of generating a multi-machine wide-area damping cooperative control strategy when low-frequency oscillation occurs in the system.
  2. The multi-agent reinforcement learning-based multi-machine wide-area damping cooperative control system according to claim 1, wherein step S1 specifically includes: step S11, selecting a wide-area damping control loop based on a comprehensive geometric index of the observability and controllability of the system, and installing a wide-area damping controller based on a wide-area measurement signal on the reactive power control loop of the rotor-side converter of a doubly-fed wind turbine and on the excitation loop of a synchronous machine, respectively; and step S12, adopting a wide-area damping controller that takes a wide-area measurement signal as its input.
  3. The multi-agent reinforcement learning-based multi-machine wide-area damping cooperative control system according to claim 1, wherein step S2 specifically includes: step S21, selecting the installation locations of the wide-area damping controllers by participation factor analysis of a power system simulation case, wherein, taking a two-area four-machine system as an example, the installation locations are determined to be the doubly-fed wind turbine DFIG and the synchronous machine G3; and step S22, selecting the feedback signal of the wide-area damping controllers by observability analysis of the power system simulation case, wherein, taking the two-area four-machine system as an example, the feedback signal is determined to be the speed difference of the synchronous machines.
  4. The multi-agent reinforcement learning-based multi-machine wide-area damping cooperative control system according to claim 1, wherein step S3 specifically includes: step S31, determining that the parameters to be optimized for each wide-area damping controller are the gain K and the lead-lag parameters T1 and T3, the two groups of controller parameters being K-G3, T1-G3 and T3-G3 for the controller installed on the synchronous machine G3, and K-DFIG, T1-DFIG and T3-DFIG for the controller installed on the doubly-fed wind turbine DFIG, wherein the value range of K is (0-100), the value range of T1 and T3 is (0-1), and the remaining controller parameters are fixed at Tw = 10 and T2 = T4 = 0.5; step S32, determining the observed variables for the controller parameter optimization process by selecting, through modal analysis of the power system simulation, the damping ratio and frequency of the dominant mode, which best represent system stability; and step S33, according to the content of steps S11 and S31, selecting two agents to optimize the two controllers respectively under the framework of the multi-agent deep deterministic policy gradient algorithm, wherein agent 1 optimizes the parameters of the wide-area damping controller attached to the synchronous machine G3 and agent 2 optimizes the parameters of the wide-area damping controller attached to the doubly-fed wind turbine DFIG, and wherein, under this framework, the two agents learn jointly and continuously optimize the parameters within the value ranges given in step S31 so that the different controllers cooperate better, the trained agents being capable of rapidly generating an optimal multi-machine wide-area damping cooperative control strategy when low-frequency oscillation occurs in the power system.
  5. The multi-agent reinforcement learning-based multi-machine wide-area damping cooperative control system according to claim 1, wherein the constructed reinforcement learning network comprises: an initialization module, configured to set the reinforcement learning network parameters based on the multi-agent deep deterministic policy gradient algorithm, set the maximum number of interactions per training cycle (episode) and the number of cycles to be trained, set the initial values of the controller parameters, and read the damping ratio and frequency of the dominant mode under those initial values; an interaction module, configured to screen the dominant mode by damping ratio and frequency after the power system simulation case is run, reading the dominant-mode data once after each simulation run, wherein before each simulation the agents generate controller parameters and pass them to the corresponding wide-area damping controllers in the power system simulation case, after which the simulation is run; a judgment module, configured to obtain a reward value from a reward function according to the damping ratio of the screened dominant mode; a decision module, configured to feed the damping ratio and frequency of the dominant mode to the reinforcement learning network as its input and to take the combination of the two groups of controller parameters as its output, so that the optimal combination of the two groups of wide-area damping controller parameters is generated quickly when the system undergoes low-frequency oscillation; and a learning module, configured to evaluate the damping effect of the controller parameters according to the damping ratio of the dominant mode and, on that basis, update the network parameters of the reinforcement learning network using the reward value obtained by the judgment module.
  6. The multi-machine wide-area damping cooperative control system based on multi-agent reinforcement learning according to claim 1, wherein the reinforcement learning network based on the multi-agent deep deterministic policy gradient algorithm comprises two neural networks, namely a policy neural network and a value neural network, wherein the input of the policy neural network is the damping ratio and frequency of the dominant oscillation mode screened out after each simulation and its output is the combination of control parameters, and wherein the output of the value neural network is used to update the neural network weights of the policy neural network and the value neural network.
  7. The multi-agent reinforcement learning-based multi-machine wide-area damping cooperative control system of claim 5, wherein the reward function is set as follows: if the damping ratio after simulation is between 5% and 10%, the reward value is 1000; if the damping ratio after simulation is between 10% and 20%, the reward value is 10; if the damping ratio after simulation is greater than 20%, the reward value is -50; if the damping ratio after simulation is between 0% and 5%, the reward value is the corresponding damping ratio multiplied by -10; and if the damping ratio after simulation is less than 0%, negative damping has occurred and the reward value is -10000.
  8. A method of using the multi-agent reinforcement learning-based wide-area damping cooperative control system according to any one of claims 1 to 7, comprising: step one, constructing a wide-area damping controller based on a wide-area measurement system, determining the controller installation location by participation factor analysis, determining the controller feedback signal according to the observability of the system, and determining the controller parameters to be optimized subsequently; step two, constructing a reinforcement learning network based on the multi-agent deep deterministic policy gradient algorithm; step three, determining that the input of the reinforcement learning network is the damping ratio and frequency of the screened dominant mode, and determining the number of interactions with the power system simulation case during training; step four, using the constructed reinforcement learning network based on the multi-agent deep deterministic policy gradient algorithm to perform optimization training on the electrical quantities input to the network, thereby generating a policy model; and step five, feeding the damping ratio and frequency of the power system into the policy model as input, outputting the optimal combination of the two groups of controller parameters, and passing the output controller parameter combination to the power system for execution, so that the optimal two groups of wide-area damping controller parameters are generated quickly when the system undergoes low-frequency oscillation, thereby improving the damping of the system.
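
The reward rule of claim 7 and the parameter ranges of claim 4 can be illustrated with a minimal Python sketch. It is not the patent's implementation, and several points are assumptions: the damping ratio is taken in percent (so 0% to 5% yields a reward from 0 to -50), the boundary values 5%, 10% and 20% are assigned to the intervals as commented because the claims do not say which side they belong to, the actor output is assumed to lie in [-1, 1], and the names damping_reward, scale_action, PARAM_BOUNDS and FIXED_PARAMS are hypothetical.

```python
# Ranges for the optimized parameters and the fixed values, as stated in claim 4.
PARAM_BOUNDS = {"K": (0.0, 100.0), "T1": (0.0, 1.0), "T3": (0.0, 1.0)}
FIXED_PARAMS = {"Tw": 10.0, "T2": 0.5, "T4": 0.5}


def damping_reward(zeta_pct: float) -> float:
    """Piecewise reward of claim 7; zeta_pct is the damping ratio in percent (assumed)."""
    if zeta_pct < 0.0:        # negative damping
        return -10000.0
    if zeta_pct < 5.0:        # 0% to 5%: damping ratio multiplied by -10
        return -10.0 * zeta_pct
    if zeta_pct <= 10.0:      # 5% to 10% (both ends included here by assumption)
        return 1000.0
    if zeta_pct <= 20.0:      # 10% to 20% (20% included here by assumption)
        return 10.0
    return -50.0              # above 20%


def scale_action(action):
    """Map three actor outputs, assumed to lie in [-1, 1], onto K, T1 and T3."""
    params = dict(FIXED_PARAMS)
    for a, (name, (lo, hi)) in zip(action, PARAM_BOUNDS.items()):
        a = max(-1.0, min(1.0, a))                      # clip to the assumed actor range
        params[name] = lo + (a + 1.0) * 0.5 * (hi - lo)
    return params


if __name__ == "__main__":
    print(damping_reward(7.5))
    print(scale_action([0.0, 0.0, 0.0]))
```

In a two-agent setup of the kind described in claim 4, each agent would call scale_action on its own actor output, once for the G3 controller and once for the DFIG controller, and both would be scored with damping_reward on the dominant mode returned by the simulation. The linear mapping can reach the endpoints of each range; if the ranges in claim 4 are meant as open intervals, the clipping would need a small margin.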

Description

Multi-machine wide-area damping cooperative control system and method based on multi-agent reinforcement learning

Technical Field

The invention relates to the technical field of power system dynamic control and stability, and in particular to a multi-machine wide-area damping cooperative control system and method based on multi-agent reinforcement learning.

Background

As interconnected grid regions gradually take shape, the complex network structure of the power system can give rise to low-frequency oscillation at any time, which greatly limits transmission efficiency on the interconnecting tie lines. The strongly random fluctuation of wind power output and its grid connection through power electronic equipment increase the oscillation risk of power systems containing wind power. As large-capacity wind farms are planned and built, they displace the generating capacity of conventional power sources, which weakens the damping effect of the power system stabilizers on conventional synchronous machines. Because the parameters of a traditional wide-area damping controller are fixed, its control performance cannot be guaranteed across different operating states of the system. The interaction between a wind farm or wind farm cluster and the power system it connects to also grows, so the low-frequency oscillation and stability problems that a grid may face under large-scale wind power integration are a major concern. In particular, low-frequency oscillation in a power system containing a large-capacity wind farm can make the power system unstable and affect the safe operation of the wind turbines in that wind farm.
To address the insufficient damping of inter-area oscillation modes in interconnected power systems, the fast power modulation capability of wind turbines can be exploited by adding supplementary wide-area damping control. Given the randomness and intermittency of wind turbine output, wide-area damping control is also added to the synchronous machine. To apply wide-area damping control effectively to wind turbines, in particular the widely used doubly-fed wind turbines, together with synchronous machines, the coordinated tuning of the parameters of the two types of damping controllers remains a difficulty to be overcome. Traditional wide-area signal selection and installation siting target only a single mode and do not perform well across multiple operating modes. The controller takes a single input, so a system fault may cause the damping controller to fail. Traditional mathematical programming methods have limited optimization capability on nonlinear problems with multiple extrema, and traditional optimization algorithms can tune only a single controller, so they cannot cooperatively optimize multiple controllers in the same system. The multi-agent deep deterministic policy gradient (MADDPG) algorithm is therefore introduced to solve the cooperative optimization of multiple controllers.
Wide-area damping control helps considerably in improving the dynamic stability and power transmission capacity of the system. As the grid-connected capacity of wind turbines continues to grow, realizing wide-area damping cooperative control of the wind turbine generators and synchronous generators in the power system is of great significance for the safe and stable operation of real large-scale interconnected power systems.

Summary of the Invention

To address the above problems, the invention provides a multi-machine wide-area damping cooperative control system and method based on multi-agent reinforcement learning, in which the trained agents are capable of rapidly generating an optimal multi-machine wide-area damping cooperative control strategy when the system undergoes low-frequency oscillation. System damping during low-frequency oscillation is improved, which reduces the damage the oscillation does to the system and improves the stability of the power system.

To solve the above problems, the present invention provides a multi-machine wide-area damping cooperative control system based on multi-agent reinforcement learning, comprising: Constructing a wide-area damping controller based on the wide-area measurement signal. Determining the installation location of the wide-area damping controller and selecting its feedback signal. Based on a multi-agent deep deterministic policy gradient algorithm, t