CN-122026406-A - Voltage self-adaptive control method, equipment and medium for power distribution network under access of large-scale small hydropower station clusters


Abstract

The invention relates to the technical field of intelligent reinforcement learning and novel power systems, and discloses a method, device and medium for adaptive voltage control of a distribution network with large-scale small hydropower cluster access. The method comprises: constructing a mathematical model of the distribution network that accounts for reactive power regulation by small hydropower units and active power regulation by energy storage, and specifying its control variables and boundary conditions; converting the voltage control task into a constrained Markov decision process (CMDP) and defining the state space, action space and reward function; designing a primal-dual deep deterministic policy gradient (DDPG) algorithm and training an agent to solve the CMDP while satisfying the physical constraints; and constructing a simulation environment from historical data, then training and deploying the control strategy to realize adaptive voltage control of the small hydropower and energy storage devices. The method offers real-time response, self-learning and multi-source coordination, and significantly improves the voltage stability and operating efficiency of the distribution network.

Inventors

Ren Tinghao
Ma Jianwei
Zhou Zhongqiang
Wu Rongkun
Zhou Xiaoyuan
Zhang Guangyu
Luo Xuelian
Wang Kaibo
Ma Li
Yang Yi
Qian Zhengchao
Lu Ying
Li Yao
Dai Qican
Luo Xingyu
Yuan Meiqi
Mao Jie
Zhang Xinyi
Bao Yizhao
Yang Hua

Assignees

Guizhou Power Grid Co., Ltd. (贵州电网有限责任公司)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-25

Claims (10)

1. An adaptive voltage control method for a distribution network with large-scale small hydropower cluster access, characterized by comprising the following steps: constructing a mathematical model, control variables and regulation boundaries for voltage control of the distribution network under small hydropower cluster access, and performing reactive power regulation of the hydropower units and active power regulation of the energy storage devices based on the DistFlow power flow model; formulating the voltage control task as a constrained Markov decision process (CMDP), in which an agent adjusts the reactive power of the hydropower units and the active power of the energy storage devices so as to optimize the grid voltage and minimize the network loss; outputting deterministic actions through a policy network, evaluating state-action values through a Q network, handling constraints by Lagrangian relaxation, and alternately updating the policy network parameters and the dual variables to solve the optimization problem; and training the agent on real historical operation data of the distribution network, verifying and evaluating the performance of the trained agent, and deploying it online in an actual distribution network control system.
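2. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access as claimed in claim 1, wherein the DistFlow power flow model is: $\sum_{k \in \delta(j)} P_{jk} = \sum_{i \in \pi(j)} \left(P_{ij} - r_{ij} l_{ij}\right) + P^{G}_{j} + P^{ES}_{j} - P^{L}_{j}$; $\sum_{k \in \delta(j)} Q_{jk} = \sum_{i \in \pi(j)} \left(Q_{ij} - x_{ij} l_{ij}\right) + Q^{G}_{j} - Q^{L}_{j}$; $v_{j} = v_{i} - 2\left(r_{ij} P_{ij} + x_{ij} Q_{ij}\right) + \left(r_{ij}^{2} + x_{ij}^{2}\right) l_{ij}$ and $l_{ij} = \left(P_{ij}^{2} + Q_{ij}^{2}\right) / v_{i}$ for all $(i,j) \in B$; wherein $P_{ij}$ and $Q_{ij}$ are the active power and reactive power transmitted from node i to node j, respectively; $P_{jk}$ and $Q_{jk}$ represent the active and reactive power transmitted from node j to each downstream node k; $\delta(j)$ represents the set of end nodes of the branches starting at node j; $\pi(j)$ represents the set of head-end nodes of the branches ending at node j; $P^{G}_{j}$ and $Q^{G}_{j}$ represent the active power and reactive power output of the small hydropower unit at node j; $P^{ES}_{j}$ represents the active power of the energy storage (taken as positive when discharging); $P^{L}_{j}$ and $Q^{L}_{j}$ represent the active and reactive power of the electrical load at node j; $l_{ij}$ represents the square of the current on the line between node i and node j; $v_{i}$ represents the square of the voltage magnitude at node i; $r_{ij}$ and $x_{ij}$ represent the resistance and reactance of the line between node i and node j; and B is the set of all branches in the system.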
3. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access of claim 2, wherein performing reactive power regulation of the hydropower units and active power regulation of the energy storage devices comprises: the reactive power output of a small hydropower unit is regulated through its excitation current, and changing the excitation current adjusts the reactive power output within a set range; the reactive power output $Q^{G}_{j,t}$ of the small hydropower unit must satisfy the constraint $Q^{G,\min}_{j} \le Q^{G}_{j,t} \le Q^{G,\max}_{j}$, wherein $Q^{G}_{j,t}$ represents the reactive power output of small hydropower unit j at time t, and $Q^{G,\min}_{j}$ and $Q^{G,\max}_{j}$ represent the minimum and maximum reactive power output of the unit, respectively; the energy storage device has maximum and minimum limits on its charge and discharge power, so its active output $P^{ES}_{j,t}$ must vary within the range $P^{ES,\min}_{j,t} \le P^{ES}_{j,t} \le P^{ES,\max}_{j,t}$, wherein $P^{ES,\min}_{j,t}$ and $P^{ES,\max}_{j,t}$ represent the minimum and maximum charge and discharge power of the energy storage device at node j at time t, respectively; at the same time, the stored energy $E_{j,t}$ is subject to a capacity limit and varies with the charge and discharge process: $E_{j,t} = E_{j,t-1} - P^{ES}_{j,t}\,\Delta t$, wherein $E_{j,t}$ and $E_{j,t-1}$ are the energy stored in the device at time t and time t-1, $\Delta t$ is the time step, and $P^{ES}_{j,t}$ is taken as positive when discharging; and after the constraints of the hydropower units and the energy storage devices are substituted into the power flow model, a combined voltage regulation and power optimization problem is formed.
4. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access as set forth in claim 3, wherein the constrained Markov decision process comprises a state space, an action space and a reward function; the state space comprises the voltage, power and stored energy information of each node in the distribution network at each time t, and specifically the state space S is defined by the states $s_{t} = \{V_{i,t}, P_{i,t}, Q_{i,t}, E_{i,t}\}$, wherein $V_{i,t}$, $P_{i,t}$, $Q_{i,t}$ and $E_{i,t}$ represent the voltage magnitude, active power, reactive power and stored energy state of node i at time t, respectively; the action space comprises all control actions the system can take, namely adjusting the reactive power output of the hydropower units and the active power output of the energy storage devices, and the action $a_{t}$ is defined as $a_{t} = \{Q^{G}_{j,t}, P^{ES}_{j,t}\}$, wherein $Q^{G}_{j,t}$ represents the reactive power output of hydropower unit j at time t and $P^{ES}_{j,t}$ represents the active power adjustment of energy storage device j at time t.
5. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access as set forth in claim 4, wherein the reward function is defined so that, at each time t, the agent receives a reward based on the current state $s_{t}$ and the action taken $a_{t}$: $R(s_{t}, a_{t}) = -\alpha \cdot \mathrm{Loss} - \beta \cdot V_{pen}$, wherein R is the reward value, $\alpha$ and $\beta$ are weight coefficients that control the relative importance of the active loss and the voltage penalty, respectively, and Loss represents the active power loss of the distribution network during transmission, calculated as $\mathrm{Loss} = \sum_{(i,j) \in B} r_{ij} l_{ij}$, wherein $l_{ij}$ represents the square of the current on the line between node i and node j and $r_{ij}$ is the line resistance; $V_{pen}$ is the penalty term for voltage limit violation, given by $V_{pen} = \sum_{i} \max\left(0, \left|V_{i,t} - V_{ref}\right| - \Delta V\right)$, wherein $V_{ref}$ is the reference voltage and $\Delta V$ is the allowable voltage deviation, so that a penalty is applied whenever a node voltage leaves the allowable range.
6. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access of claim 5, wherein the primal-dual deep deterministic policy gradient algorithm comprises determining actions and evaluating the value of state-action pairs through a policy network and a Q network; the output of the policy network $\pi_{\theta}(s)$ is a deterministic action $a = \pi_{\theta}(s)$ in a given state s, that is, the optimal action selected for the current state; the policy network is represented by a deep neural network with parameters $\theta$, and its goal is to maximize the long-term return by the policy gradient method; the output of the Q network $Q_{\phi}(s, a)$ is the Q value of a given state s and action a, representing the cumulative return obtainable after action a is performed in state s; the Q network is represented by another deep neural network with parameters $\phi$, and the Q value is learned by minimizing the difference between the actual return and the estimated Q value: $\mathcal{L}_{Q}(\phi) = \mathbb{E}\left[\left(r + \gamma Q_{\phi'}\left(s', \pi_{\theta'}(s')\right) - Q_{\phi}(s, a)\right)^{2}\right]$, wherein $\gamma$ is the discount factor, $Q_{\phi'}$ is the target Q network, $Q_{\phi}(s, a)$ is the estimate output by the Q network, r is the immediate reward, s' is the next state to which the environment transitions after the action is performed in state s, and $\pi_{\theta'}(s')$ is the next action selected by the target policy network based on the next state s'.
7. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access as set forth in claim 6, wherein the primal-dual deep deterministic policy gradient algorithm further comprises handling the constrained optimization problem by Lagrangian relaxation and defining the Lagrangian function $\mathcal{L}(\pi_{\theta}, \lambda)$ as $\mathcal{L}(\pi_{\theta}, \lambda) = J(\pi_{\theta}) - \lambda \left(C(\pi_{\theta}) - d\right)$, wherein $J(\pi_{\theta})$ is the long-term return of the policy, $\lambda$ is the Lagrange multiplier, which reflects how tightly the constraint binds, $C$ is the constraint function, which represents the degree of violation of the voltage limit constraint, and d is the upper limit of the constraint; after the optimization problem is converted into an unconstrained problem, primal-dual optimization is carried out and the optimal solution is obtained by alternately updating the policy and the dual variable, the primal-dual optimization problem being $(\pi^{*}, \lambda^{*}) = \arg\min_{\lambda \ge 0} \max_{\pi_{\theta}} \mathcal{L}(\pi_{\theta}, \lambda)$, wherein $\pi^{*}$ is the optimal policy, $\lambda^{*}$ is the optimal dual variable, and argmin and argmax denote the arguments attaining the minimum and the maximum; in this process the policy network $\pi_{\theta}$ is the primal variable of the optimization problem, and the dual variable is the Lagrange multiplier $\lambda$, which enforces the constraint through gradient updates; when the policy is updated in the primal-dual DDPG algorithm, in each training step the current action value is evaluated under the current policy $\pi_{\theta}$ and the policy network is optimized by maximizing the Q value, the update rule of the policy network being $\theta_{k+1} = \theta_{k} + \eta_{\theta} \nabla_{\theta} \mathcal{L}(\pi_{\theta}, \lambda_{k})$, wherein $\theta_{k+1}$ is the new parameter at iteration k+1, $\theta_{k}$ is the parameter at the k-th training iteration, $\eta_{\theta}$ is the learning rate, and $\nabla_{\theta} \mathcal{L}$ is the policy network gradient, so that repeatedly updating the policy network $\pi_{\theta}$ maximizes the objective function $\mathcal{L}(\pi_{\theta}, \lambda)$; the update rule of the dual variable is $\lambda_{k+1} = \lambda_{k} + \eta_{\lambda} \left(C(s_{t}, a_{t}) - d\right)$, wherein $\lambda_{k+1}$ is the new value of the dual variable at iteration k+1, $\lambda_{k}$ is its current value at the k-th training iteration, $\eta_{\lambda}$ is the update step size of the dual variable, and $C(s_{t}, a_{t}) - d$ is the constraint violation at the current time; if the current constraint is not satisfied, that is, $C(s_{t}, a_{t}) - d > 0$, $\lambda$ is increased to strengthen the penalty on the policy so that the policy comes to satisfy the constraint; the Q network is used to evaluate the value of state-action pairs and is updated by minimizing its loss function, the update rule of the Q network being $\phi_{k+1} = \phi_{k} - \eta_{\phi} \nabla_{\phi} \left(Q_{\phi}(s_{t}, a_{t}) - \left(r_{t} + \gamma Q_{\phi'}\left(s_{t+1}, \pi_{\theta'}(s_{t+1})\right)\right)\right)^{2}$, wherein $\phi_{k}$ is the current parameter of the Q network at the k-th training iteration, $\phi_{k+1}$ is the parameter of the Q network at training iteration k+1, $\eta_{\phi}$ is the learning rate, $\nabla_{\phi}$ denotes the gradient with respect to the Q network parameter $\phi$, $Q_{\phi'}\left(s_{t+1}, \pi_{\theta'}(s_{t+1})\right)$ is the future cumulative return estimated by the target Q network, and $s_{t+1}$ is the state at time t+1; the primal-dual DDPG algorithm also uses target networks to help smooth the updates, setting the parameters of the target networks to the values of the current networks at each update: $\theta' \leftarrow \theta$, $\phi' \leftarrow \phi$, wherein $\pi_{\theta'}$ is the target policy network and $Q_{\phi'}$ is the target Q network.
8. The adaptive voltage control method for a distribution network with large-scale small hydropower cluster access of claim 7, wherein training the agent comprises: collecting historical operation data of the distribution network and the small hydropower cluster; building a simulation environment that matches the characteristics of the actual grid, wherein the simulation environment comprises a power flow model of the distribution network and control models of the small hydropower units and energy storage devices, and can interact with the reinforcement learning algorithm; the training process is divided into three phases, exploration, training and execution, with the following tasks: in the exploration phase, the agent interacts with the simulation environment by selecting actions at random, accumulates experience data, learns the dynamic characteristics of the system, and identifies actions that maximize voltage stability and minimize network loss; in the training phase, the agent samples data from the replay buffer and updates the policy network and the Q network using the primal-dual DDPG algorithm, while updating the dual variable $\lambda$ so that the system constraints are satisfied, and continuously optimizes the control strategy to minimize the voltage violation penalty and the network loss; in the execution phase, the trained agent selects control actions according to the real-time state of the distribution network to regulate the voltage and reduce the network loss; after training, the agent is verified and evaluated, the evaluation indexes comprising voltage stability, network loss minimization and constraint satisfaction; and after training and evaluation, the agent is deployed in the controllers of the small hydropower and energy storage devices of the actual distribution network for online control.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the adaptive voltage control method for a distribution network with large-scale small hydropower cluster access according to any one of claims 1 to 8.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the adaptive voltage control method for a distribution network with large-scale small hydropower cluster access according to any one of claims 1 to 8.
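
The environment-side quantities in claims 2, 3 and 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function and class names, the default weights and the voltage band are hypothetical, the storage sign convention (positive power means discharging) is an assumption, and the penalty is assumed to grow linearly with the excess voltage deviation.

```python
from dataclasses import dataclass


def clip(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def branch_flow(v_i, p_ij, q_ij, r_ij, x_ij):
    """Evaluate the DistFlow relations on one branch (i, j).

    v_i and v_j are squared voltage magnitudes, l_ij is the squared current.
    Returns (l_ij, v_j, active power received at j, reactive power received at j).
    """
    l_ij = (p_ij ** 2 + q_ij ** 2) / v_i
    v_j = v_i - 2.0 * (r_ij * p_ij + x_ij * q_ij) + (r_ij ** 2 + x_ij ** 2) * l_ij
    return l_ij, v_j, p_ij - r_ij * l_ij, q_ij - x_ij * l_ij


@dataclass
class Storage:
    energy: float  # stored energy at time t-1
    e_min: float   # lower capacity limit
    e_max: float   # upper capacity limit
    p_min: float   # most negative power = maximum charging (assumed sign)
    p_max: float   # maximum discharging power


def apply_storage_action(st, p_request, dt):
    """Clip a requested power to the power limits and to the energy capacity.

    Assumed update: E_t = E_{t-1} - P * dt, with P > 0 meaning discharge.
    Returns (applied power, new stored energy).
    """
    p = clip(p_request, st.p_min, st.p_max)
    # Keep e_min <= energy - p * dt <= e_max.
    p = clip(p, (st.energy - st.e_max) / dt, (st.energy - st.e_min) / dt)
    return p, st.energy - p * dt


def network_loss(branches):
    """Active loss: sum of r_ij * l_ij over (r_ij, l_ij) pairs."""
    return sum(r * l for r, l in branches)


def voltage_penalty(voltages, v_ref=1.0, dv=0.05):
    """Sum of voltage-magnitude deviations beyond the allowed band dv."""
    return sum(max(0.0, abs(v - v_ref) - dv) for v in voltages)


def reward(branches, voltages, alpha=1.0, beta=10.0, v_ref=1.0, dv=0.05):
    """Reward = -alpha * Loss - beta * voltage penalty (weights are illustrative)."""
    return -alpha * network_loss(branches) - beta * voltage_penalty(voltages, v_ref, dv)
```

Clipping the action before it reaches the power flow keeps the device limits of claim 3 as hard constraints, so that only the voltage limit needs to be handled through the reward penalty and the Lagrange multiplier.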
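
The alternating policy and dual-variable updates of claims 6 and 7 can be sketched on a toy problem. This is a hedged illustration only: the scalar "policy" parameter, the quadratic return, the linear constraint cost and all step sizes are stand-ins chosen for this example, and the projection of the multiplier onto non-negative values is an added assumption rather than something stated in the claims. Neural networks and replay sampling are omitted.

```python
def primal_dual(theta=0.0, lam=0.0, d=1.0, lr_theta=0.05, lr_lam=0.05, steps=2000):
    """Alternate a primal (policy) ascent step and a dual (multiplier) step.

    Toy stand-ins (assumptions, not from the patent):
      J(theta) = -(theta - 2)^2   return to maximize
      C(theta) = theta            constraint cost, required to satisfy C <= d
    Lagrangian: L(theta, lam) = J(theta) - lam * (C(theta) - d)
    """
    for _ in range(steps):
        # Primal step: gradient ascent on L with respect to theta.
        grad_theta = -2.0 * (theta - 2.0) - lam
        theta += lr_theta * grad_theta
        # Dual step: increase lam while the constraint is violated (C - d > 0),
        # decrease it otherwise; max(0, .) keeps lam non-negative (assumption).
        lam = max(0.0, lam + lr_lam * (theta - d))
    return theta, lam


def td_target(reward, gamma, q_next):
    """Critic regression target r + gamma * Q_target(s', pi_target(s'))."""
    return reward + gamma * q_next


def hard_update(target_params, online_params):
    """Copy online parameters into the target parameters in place."""
    target_params[:] = online_params


if __name__ == "__main__":
    theta, lam = primal_dual()
    print(f"theta = {theta:.4f}, lambda = {lam:.4f}")
```

In this toy case the unconstrained optimum (theta = 2) violates the limit d = 1, so the multiplier grows until the policy parameter settles on the constraint boundary. The same mechanism is what lets the dual variable in claim 7 trade return against voltage limit violations.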

Description

Voltage self-adaptive control method, equipment and medium for power distribution network under access of large-scale small hydropower station clusters

Technical Field

The invention relates to the technical field of intelligent reinforcement learning and novel power systems, and in particular to an adaptive voltage control method, device and medium for a distribution network with large-scale small hydropower cluster access.

Background

With the large-scale development of renewable energy, distributed generation is rapidly penetrating all levels of the power grid. In mountainous and hilly areas in particular, large numbers of small hydropower stations are connected to medium- and low-voltage distribution networks, forming small hydropower clusters. As a green, renewable energy resource, small hydropower has low operating cost and wide geographical distribution, and it strongly supports clean, local regional power supply. However, because the output of small hydropower is strongly affected by fluctuating hydrological conditions and the stations lack unified scheduling and coordinated control, small hydropower clusters exhibit strong randomness and uncertainty, which in turn severely affects the voltage stability, power balancing capability and operating economy of the distribution network. Voltage regulation in conventional distribution networks usually relies on main transformer tap adjustment, on-load tap changers (OLTC) or reactive compensation equipment (for example capacitor banks and static var generators, SVG). These devices typically operate under centralized control, respond slowly, and struggle to meet increasingly complex voltage control requirements.

With small hydropower clusters connected, and especially where the grid structure is weak, the load changes sharply or the hydropower output fluctuates severely, the distribution network voltage fluctuates frequently and may even exceed its limits. Existing control strategies are mostly static and rule-driven, cannot adjust dynamically to actual operating conditions, and lack flexibility and adaptability. In addition, although some small hydropower stations have excitation regulation capability and can provide reactive power support by adjusting the excitation current, no unified cooperative control mechanism exists at present, so their regulation potential cannot be used effectively.

Meanwhile, with the development of energy storage technology, more and more battery energy storage systems are being deployed in distribution networks, providing fast-response active power regulation. Energy storage can be regulated in coordination with small hydropower and plays an important role in peak shaving and valley filling, voltage support and smoothing of power fluctuations. However, existing energy storage control often operates in isolation, is not linked with small hydropower, and does not fully exploit the role of storage in voltage regulation. Furthermore, at the control level there is currently no intelligent control framework that addresses the complex operating states of the distribution network, balances voltage stabilization against loss optimization, learns dynamically and enforces safety constraints.

Therefore, an adaptive voltage control method and system is needed that integrates the regulation capability of small hydropower and energy storage devices based on the dynamic operating characteristics of the distribution network, and that supports online learning and safety-constrained control, so as to achieve efficient, intelligent and safe voltage regulation of a distribution network with large-scale small hydropower cluster access. This is the core technical problem to be solved by the present invention.

Disclosure of Invention

The present invention has been made in view of the above problems in the prior art. Therefore, the invention provides an adaptive voltage control method for a distribution network with large-scale small hydropower cluster access, which addresses the frequent voltage fluctuation, insufficient reactive power coordination, isolated energy storage regulation and static, rigid control strategies caused by small hydropower cluster access in the prior art. It provides an adaptive voltage control method and system for the distribution network based on intelligent reinforcement learning and a primal-dual optimization mechanism, so that the cooperative optimization adjustment of reactive powe