CN-121689302-B - Photovoltaic output smooth regulation and control method, system, equipment and medium


Abstract

The invention belongs to the technical field of photovoltaic power generation and discloses a photovoltaic output smoothing regulation and control method, system, equipment and medium. First, low-pass filtering is applied to the original photovoltaic output signal to generate a reference action; a state vector is input to a pre-trained PPO reinforcement learning model, which outputs a residual action; the two are superimposed to form a synthetic action instruction, which is then corrected by a power-limit and SOC-boundary constraint mechanism and output by the teacher model. Next, based on a distillation loss function, a lightweight student model for real-time regulation is trained with the teacher output as a soft label. In this method, the low-pass filter provides a stable reference, the reinforcement learning residual compensates for high-frequency fluctuation and non-stationary characteristics, the constraint mechanism protects battery life, and the distillation process transfers the knowledge of the complex model to a lightweight model, reducing the computational load. The method is stated to significantly improve smoothing performance, avoid control failure, protect battery health and enable efficient edge deployment, thereby balancing control performance, battery life management and engineering feasibility.
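
The teacher-side steps described in the abstract (filter, superimpose, constrain) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `BatteryLimits`, `low_pass`, `constrain` and `teacher_step` are hypothetical, the filter form and the sign convention (positive discharges, negative charges) are assumed, the residual is passed in as a plain number rather than produced by a PPO agent, and the numeric values are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class BatteryLimits:
    """Hypothetical container for the limits named in the abstract."""
    p_rated: float    # rated battery power limit, kW (assumed symmetric)
    capacity: float   # rated battery capacity, kWh
    soc_min: float    # lowest allowed state of charge, 0..1
    soc_max: float    # highest allowed state of charge, 0..1
    dt_hours: float   # control period, hours


def low_pass(prev_ref: float, signal: float, alpha: float) -> float:
    """First-order discrete low-pass filter (one common form, assumed)."""
    return alpha * signal + (1.0 - alpha) * prev_ref


def constrain(action: float, soc: float, lim: BatteryLimits) -> float:
    """Clip a power command to the power limit and the SOC boundary.

    Assumed sign convention: positive discharges, negative charges.
    """
    # Energy headroom converted to the power it allows over one step.
    p_dis_max = min(lim.p_rated, (soc - lim.soc_min) * lim.capacity / lim.dt_hours)
    p_chg_max = min(lim.p_rated, (lim.soc_max - soc) * lim.capacity / lim.dt_hours)
    # Guard against a negative bound if SOC is already outside its window.
    p_dis_max = max(p_dis_max, 0.0)
    p_chg_max = max(p_chg_max, 0.0)
    return max(-p_chg_max, min(action, p_dis_max))


def teacher_step(signal, prev_ref, residual, soc, lim, alpha=0.2):
    """One teacher control step: filter, superimpose, then constrain."""
    reference = low_pass(prev_ref, signal, alpha)
    synthetic = reference + residual   # residual would come from the PPO agent
    final = constrain(synthetic, soc, lim)
    return reference, synthetic, final


limits = BatteryLimits(p_rated=10.0, capacity=100.0, soc_min=0.2, soc_max=0.9, dt_hours=0.25)
reference, synthetic, final = teacher_step(
    signal=20.0, prev_ref=10.0, residual=3.0, soc=0.5, lim=limits
)
print(reference, synthetic, final)
```

In this example the synthetic command exceeds the assumed rated power, so the constraint step caps it at the rated limit; with the state of charge close to `soc_min`, the same command would instead be capped by the remaining energy headroom.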

Inventors

CAI YUZE
ZHENG QIANG
ZHANG DONGXIAO

Assignees

宁波东方理工大学
宁波东理数智能源科技有限责任公司

Dates

Publication Date: 2026-05-08
Application Date: 2026-02-09

Claims (10)

1. A photovoltaic output smoothing regulation and control method, characterized by comprising the following steps: performing low-pass filtering on an original photovoltaic output signal to generate a reference action; inputting an original state space vector into a reinforcement learning agent of a pre-built teacher model and outputting a residual action, wherein the reinforcement learning agent adopts a PPO-based algorithm; superimposing the reference action and the residual action to generate a synthetic action instruction; correcting the synthetic action instruction using a constraint mechanism based on a power limit and an SOC boundary to generate a final action instruction, wherein the final action instruction is output by the teacher model; training a student model, based on a distillation loss function, with the final action instruction output by the teacher model as a soft label, to obtain a lightweight regulation model; and regulating the real-time photovoltaic output signal with the lightweight regulation model and outputting a real-time regulation instruction.
2. The method of claim 1, wherein performing low-pass filtering on the original photovoltaic output signal to generate the reference action comprises: acquiring the original photovoltaic output signal; and filtering the original photovoltaic output signal with a low-pass filter to generate the reference action, using a first-order discrete filtering formula [formula not reproduced in this text] whose symbols represent: the reference action; the filter coefficient; the original photovoltaic output signal; the current time; and the previous time.
3. The method of claim 1, wherein inputting the original state space vector into the reinforcement learning agent of the pre-built teacher model and outputting the residual action comprises: acquiring the original state space vector, wherein the original state space vector comprises an environment state and a battery state of charge; and inputting the original state space vector into the reinforcement learning agent of the pre-built teacher model and outputting the residual action, wherein: the optimization objective function of the reinforcement learning agent is given by a formula [formula not reproduced in this text] whose symbols represent: the optimization objective function; the probability ratio between the new policy and the old policy; the advantage function estimate; the clipping range; the expectation; the numerical clipping function; the minimum function; the policy parameters; and the current time; and the state space vector is given by an expression [expression not reproduced in this text] whose symbols represent: the original photovoltaic output signal; the variation of the original photovoltaic output signal; the battery state of charge; the reference action; the final action instruction; a historical state quantity of the photovoltaic power; a historical state quantity of the photovoltaic power variation; the current time; and the previous time.
4. The method of claim 1, wherein the step of superimposing the reference action and the residual action to generate the synthetic action instruction uses a calculation formula [formula not reproduced in this text] whose symbols represent: the synthetic action instruction; the reference action; the residual action; and the current time.
5. The method of claim 1, wherein, in the step of correcting the synthetic action instruction using the constraint mechanism based on the power limit and the SOC boundary to generate the final action instruction, the correction uses a formula [formula not reproduced in this text] whose symbols represent: the final action instruction; the synthetic action instruction; the maximum allowable discharge power; the maximum allowable charge power; the rated limit of battery power; the minimum function; the maximum function; the battery state of charge; the allowable minimum and maximum states of charge of the battery, respectively; the rated capacity of the battery; the control period time step; and the current time.
6. The method of claim 1, wherein, in the step of correcting the synthetic action instruction using the constraint mechanism based on the power limit and the SOC boundary to generate the final action instruction, an SOC control factor is also introduced into a reward function to correct the residual action, wherein the cumulative loss of the SOC is calculated by the rainflow counting method according to a formula [formula not reproduced in this text] whose symbols represent: the cumulative loss; the total number of cycles identified by rainflow counting; the depth of the i-th charge-discharge cycle; the number of occurrences of the i-th charge-discharge cycle; and the rated number of battery life cycles at that cycle depth.
7. The method of claim 1, wherein training the student model, based on the distillation loss function, with the final action instruction output by the teacher model as a soft label to obtain the lightweight regulation model comprises: inputting the original photovoltaic output signal into a pre-constructed student model to obtain an output result of the student model; using the final action instruction output by the teacher model as the soft label for training the student model, wherein the final action instruction is the output result of the teacher model; and training the student model using the distillation loss function together with the output result of the teacher model and the output result of the student model to obtain the lightweight regulation model, wherein the distillation loss function is given by an expression [expression not reproduced in this text] whose symbols represent: the sample index, which runs from 1 to the total number of training samples; the output result of the student model for that sample; the output result of the teacher model for that sample; and the student model parameters.
8. A photovoltaic output smoothing regulation system, comprising: a filtering module for performing low-pass filtering on an original photovoltaic output signal to generate a reference action; a reinforcement learning module for inputting an original state space vector into a reinforcement learning agent of a pre-built teacher model and outputting a residual action, wherein the reinforcement learning agent adopts a PPO-based algorithm; an action superposition module for superimposing the reference action and the residual action to generate a synthetic action instruction; a correction module for correcting the synthetic action instruction using a constraint mechanism based on a power limit and an SOC boundary to generate a final action instruction, wherein the final action instruction is output by the teacher model; a model training module for training a student model, based on a distillation loss function, with the corrected final action instruction output by the teacher model as a soft label, to obtain a lightweight regulation model; and a regulation module for regulating the real-time photovoltaic output signal with the lightweight regulation model and outputting a real-time regulation instruction.
9. A photovoltaic output smoothing regulation device, comprising: a memory for storing a computer program; and a processor for implementing the steps of the photovoltaic output smoothing regulation method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the photovoltaic output smoothing regulation method of any one of claims 1 to 7.
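
The formulas referenced in claims 2 to 7 are not present in this text. Read against the symbol lists that remain, the standard forms below appear consistent with them. They are offered only as a hedged reading aid: every symbol name is an assumption chosen here, the sign convention (positive discharges, negative charges) is assumed, the distillation loss is assumed to be a mean squared error, and the granted claims may differ in notation or detail (for example, in which term the filter coefficient multiplies).

```latex
\begin{align*}
% Claim 2: first-order discrete low-pass filter (assumed form)
a^{\mathrm{ref}}_t &= \alpha P_t + (1-\alpha)\, a^{\mathrm{ref}}_{t-1} \\
% Claim 3: standard PPO clipped surrogate objective
L^{\mathrm{CLIP}}(\theta) &= \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right] \\
% Claim 4: superposition of reference and residual actions
a^{\mathrm{syn}}_t &= a^{\mathrm{ref}}_t + a^{\mathrm{res}}_t \\
% Claim 5: power-limit and SOC-boundary correction (assumed sign convention)
P^{\mathrm{dis}}_{\max,t} &= \min\!\left(P_{\mathrm{rated}},\;
  \frac{(\mathrm{SOC}_t-\mathrm{SOC}_{\min})\,E_{\mathrm{rated}}}{\Delta t}\right) \\
P^{\mathrm{chg}}_{\max,t} &= \min\!\left(P_{\mathrm{rated}},\;
  \frac{(\mathrm{SOC}_{\max}-\mathrm{SOC}_t)\,E_{\mathrm{rated}}}{\Delta t}\right) \\
a^{\mathrm{final}}_t &= \max\!\left(-P^{\mathrm{chg}}_{\max,t},\;
  \min\!\left(a^{\mathrm{syn}}_t,\; P^{\mathrm{dis}}_{\max,t}\right)\right) \\
% Claim 6: linear damage accumulation over rainflow-counted cycles
D &= \sum_{i=1}^{N} \frac{n_i}{N_{\mathrm{life}}(d_i)} \\
% Claim 7: distillation loss (mean squared error assumed)
L_{\mathrm{distill}}(\theta_S) &= \frac{1}{N}\sum_{i=1}^{N}
  \left(y^{S}_i(\theta_S) - y^{T}_i\right)^{2}
\end{align*}
```

Here $r_t(\theta)$ is the probability ratio between the new and old policies, $\hat{A}_t$ the advantage estimate, $\epsilon$ the clipping range, $d_i$ and $n_i$ the depth and count of the $i$-th rainflow cycle, $N_{\mathrm{life}}(d_i)$ the rated cycle life at that depth, and $y^{S}_i$, $y^{T}_i$ the student and teacher outputs for sample $i$.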

Description

Photovoltaic output smooth regulation and control method, system, equipment and medium

Technical Field

The invention belongs to the technical field of photovoltaic power generation, and particularly relates to a photovoltaic output smoothing regulation and control method, system, equipment and medium.

Background

As the global energy structure shifts toward clean, low-carbon sources, photovoltaic power generation has become an important component of renewable energy, and its installed capacity is expanding rapidly. However, photovoltaic output power is strongly affected by natural factors such as illumination intensity, cloud shading and weather changes, and is therefore markedly time-varying, volatile and uncertain; this has become a major challenge to the safe and stable operation of the power grid. Short-term fluctuation of photovoltaic output can cause power quality problems such as grid voltage fluctuation, frequency deviation and harmonics, can significantly increase the operating cost of grid frequency regulation, reserve and peak regulation, and in extreme cases can even lead to curtailment of photovoltaic generation, wasting valuable energy resources. To smooth photovoltaic output fluctuation, the battery energy storage system (BESS) has become a key, industry-recognized technical means by virtue of its flexible charge-discharge regulation capability. By buffering instantaneous changes in photovoltaic output through peak shaving and valley filling, it can greatly improve the grid-connection stability of photovoltaic generation and reduce the impact on downstream grid facilities such as transformers and transmission lines.
Current technical schemes for smoothing control of photovoltaic output have various shortcomings and struggle to meet the combined requirements of engineering practice. Traditional filtering and rule- or strategy-based energy storage control methods generate a smooth reference by moving average, low-pass filtering or similar means, and control the battery in combination with rules such as upper and lower state of charge (SOC) limits. These methods are simple to implement and highly interpretable, but they are restricted by constraints such as the battery's power limit and energy capacity: the filtered reference power often cannot be executed by the battery, so the smoothing effect is insufficient or the control action fails. They also do not fully account for life-influencing factors such as battery cycle depth and charge-discharge frequency, so they tend to accelerate battery aging. Control methods oriented to the energy storage system introduce SOC and charge-discharge depth constraints to mitigate battery loss, but they lack an effective characterization of the non-stationary characteristics of photovoltaic output, which limits control performance in extreme weather or high-frequency fluctuation scenarios. Among the intelligent control methods that have emerged in recent years, model predictive control depends on photovoltaic output and weather forecasts, places very high demands on forecast accuracy and model fit, and suffers from high solving cost and difficult tuning in strongly nonlinear, complex-constraint scenarios; reinforcement learning-based methods generally face slow training convergence, poor policy interpretability and high computing power requirements, and are difficult to deploy directly on edge devices at photovoltaic power stations.
Therefore, existing photovoltaic output smoothing and energy storage control technology cannot effectively balance control performance, battery life and edge deployment feasibility in engineering practice.

Disclosure of Invention

The invention provides a photovoltaic output smoothing regulation and control method, system, equipment and medium, which can effectively solve the problem that existing photovoltaic output smoothing and energy storage control technology cannot effectively balance control performance, battery life and edge deployment feasibility in engineering practice. To achieve the above purpose, the invention adopts the following technical scheme: a photovoltaic output smoothing regulation method comprising the following steps: performing low-pass filtering on the original photovoltaic output signal to generate a reference action; inputting the original state space vector into a reinforcement learning agent of a pre-built teacher model and outputting a residual action, wherein th