CN-122014498-A - Wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL


Abstract

The invention provides a wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution deep reinforcement learning (DRL), and relates to the technical field of wind power generation control. The method comprises the following steps: S1, collecting operating-state and environmental wind-condition data of a wind turbine generator set through a multi-modal sensor array, preprocessing the collected data, and outputting verified multi-source data; S2, inputting the verified multi-source data into a physics-knowledge-enhanced convolutional encoder, embedding fluid-dynamics constraints for feature extraction, generating a working-condition type, dynamic feature parameters and a confidence level, and synthesizing extreme-working-condition samples through a generative adversarial network. Through multi-source data fusion, lightweight deep reinforcement learning and a stability-constraint mechanism, online co-evolution of the model and the control strategy can be achieved.

Inventors

LI XIAOKUN
WANG HAORAN
LUO ZHAN
HUANG YANQIN
LIU RUIBO
GUO DONGLUN

Assignees

三峡智控科技有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-11

Claims (10)

1. A wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL, characterized by comprising the following steps: S1, acquiring operating-state and environmental wind-condition data of a wind turbine generator set through a multi-modal sensor array, preprocessing the acquired data, and outputting verified multi-source data; S2, inputting the verified multi-source data into a physics-knowledge-enhanced convolutional encoder, embedding fluid-dynamics constraints for feature extraction, generating a working-condition type, dynamic feature parameters and a confidence level, and synthesizing extreme-working-condition samples through a generative adversarial network; S3, inputting the dynamic feature parameters and the confidence level into a pre-trained deep reinforcement learning (DRL) model deployed in an AI cooperative controller, wherein the DRL model integrates dual critics and a regularization mechanism driven by the temporal-difference (TD) error, embeds a Lyapunov stability constraint, adaptively switches its parameter configuration according to the working-condition type, and outputs pitch PID (proportional-integral-derivative) coefficients and a yaw-angle control instruction; S4, correcting the pitch PID coefficients and the yaw-angle control instruction through a self-penalty mechanism; S5, transmitting the corrected pitch PID coefficients and yaw-angle control instruction to a master-control PLC for execution, by priority, using an event-driven communication framework, and triggering the wind turbine generator set to perform pitch and yaw-angle adjustment; S6, acquiring actual operating data of the actuators and feeding it back to the DRL model of step S3 to achieve online iterative optimization of the strategy.
2. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 1, wherein in step S1 the multi-modal sensor array comprises a high-frequency lidar, an inertial measurement unit, an acoustic sensor, a vibration sensor, an accelerometer, an inclinometer, a gyroscope and an ultrasonic probe; the operating-state and environmental wind-condition data comprise SCADA system data, vibration sensor data, meteorological sensor data, wind speed profile, blade vibration, tower displacement and aerodynamic load data; and the multi-modal sensor array is triggered to start data acquisition when the wind turbine generator set completes its start-up self-check or when a preset working-condition monitoring period arrives.
3. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 1 or 2, wherein in step S1 the specific operating logic of the multi-modal sensor array is as follows: S11, when the wind turbine generator set starts its self-check, sensor fault diagnosis is triggered: a test signal is sent to each sensor, and a sensor whose feedback response time is less than the maximum tolerated response time of the sensor feedback is judged normal; S12, after the self-check passes, the accelerometer starts sampling at the accelerometer sampling frequency when the gearbox rotational speed is greater than the rotational-speed threshold for accelerometer-initiated sampling, the inclinometer starts sampling at the inclinometer sampling frequency when the change in blade pitch angle meets the angle threshold for inclinometer-initiated sampling, the gyroscope starts sampling at the gyroscope sampling frequency when the yaw system starts, and the ultrasonic probe starts sampling at the ultrasonic-probe sampling frequency when the tower vibration meets the vibration threshold for ultrasonic-probe-initiated sampling; S13, when any sensor detects that its own data are abnormal, the redundant sensor is triggered to start, and a sensor-abnormality early warning is sent to the AI cooperative controller at the same time.
4. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 1, wherein the specific steps of step S2 are as follows: S21, when data heterogeneity is identified in the verified multi-source data, triggering the physics-knowledge-enhanced convolutional encoder to start the feature-extraction process, obtaining a data alignment module, inputting the acquired data into the data alignment module, and unifying low-resolution data to a preset high resolution by linear interpolation; S22, obtaining an attention mechanism module, embedding fluid-dynamics constraints to extract physically constrained features, triggering the attention mechanism module, computing weight coefficients from the importance scores of the data output by each sensor in S1, and dynamically assigning data weights; S23, when the working-condition classification module has counted the extreme-working-condition samples and the generator of the generative adversarial network is in a ready state, triggering the generator to start sample synthesis and adding the synthesized samples to the sample library.
5. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to any one of claims 1 to 4, wherein the specific steps of step S3 are as follows: S31, triggering the AI cooperative controller to load the pre-trained DRL model according to the dynamic feature parameters and the confidence level; S32, reading the working-condition type with the loaded DRL model, and triggering a switch of the model's parameter configuration if the working-condition type has changed; S33, the model integrates dual critics and a TD-error-driven regularization mechanism and embeds a Lyapunov stability constraint, wherein, when the DRL model computes the Q value, the dual-critic architecture is triggered to run in parallel: critic 1 computes a Q value from the dynamic feature parameters using the Q function and parameter matrix of critic 1, and critic 2 computes a Q value from the same parameters using the Q function and parameter matrix of critic 2; a value taken from the two critics' Q values is used as the target Q value, and the TD error is computed; a TD-error threshold is set, and when the TD error meets the threshold condition, the regularization mechanism is triggered to apply an L2 regularization term to the policy network parameters, and the learning rate is reduced, according to the regularization term, from the initial learning rate to the regularized learning rate; a threshold on the number of iterations with stable TD error is set, and when the TD error returns within the threshold condition and remains there for that number of iterations, the regularization mechanism is switched off and the initial learning rate is restored; S34, inputting the dynamic feature parameters for inference, and outputting the pitch PID coefficients and the yaw-angle control instruction.
6. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 1, wherein the specific steps of step S4 are as follows: S41, triggering the safety evaluation of the self-penalty mechanism on the pitch PID coefficients and the yaw-angle control instruction; S42, calling a preset safety-threshold library, in which the safety thresholds comprise an upper limit on the pitch angle, an upper limit on the yaw rate and an upper limit on the load, and computing the deviation of the instruction from the thresholds; S43, if the deviation meets the penalty condition, generating a penalty signal and feeding the penalty signal back to the DRL model of S3 through a back-propagation channel; S44, when the DRL model receives the penalty signal, triggering an update of the policy network parameters and re-outputting corrected pitch PID coefficients and a corrected yaw-angle control instruction; S45, if the penalty condition persists over consecutive evaluations, triggering the model to load the historical optimal strategy until the output pitch PID coefficients and yaw-angle control instruction fall within the safe domain, and generating the final pitch PID coefficients and yaw-angle control instruction carrying a "safety verification passed" mark.
7. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 6, wherein the specific steps of step S5 are as follows: S51, when the final pitch PID coefficients and yaw-angle control instruction carrying the "safety verification passed" mark are received and the communication module detects that the OPC UA Pub/Sub channel state is normal, triggering instruction priority determination; S52, triggering the communication module to transmit the instruction to the master-control PLC by priority and starting an instruction-transmission confirmation mechanism, the master-control PLC receiving the instruction; and when the master-control PLC has received the instruction and its self-check reports no fault, triggering the actuators to start the pitch and yaw-angle adjustment.
8. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 1, wherein the specific steps of step S6 are as follows: when the data acquisition module receives actual operating data fed back by the actuators through the master-control PLC, the actual operating data comprising the actual pitch angle, the actual yaw angle and the generator power, and the deviation between the data timestamp and the execution time of the S5 instruction is less than the time-deviation threshold for data association, the deviation calculation module is triggered to start; the relative deviation between the actual data and the instruction value output by S3 is computed, a deviation threshold is set, and if the relative deviation meets the deviation-threshold condition, the deviation data are marked as valid training samples and input into the experience pool of the DRL model of S3; when the number of samples in the experience pool is greater than or equal to the experience-pool sample-count threshold, or the model iteration counter reaches the iteration-count threshold, the DRL model is triggered to start online iterative optimization and to update its strategy parameters for the next cycle.
9. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 5, wherein in step S32, when the training iteration counter of the DRL model reaches the iteration-count threshold and the number of samples stored in the experience pool is not less than the stored-sample threshold, the experience replay module is triggered to randomly draw a set number of samples from the experience pool, each sample comprising the working-condition parameters output by S2 at time t, the control instruction output by the model at time t, the reward value fed back by S6 at time t, and the working-condition parameters output by S2 at time t+1; when sample extraction is complete and the target network detects that the number of times its own parameters have gone without updating meets the no-update count condition, an update of the target network parameters is triggered and the parameter matrix of the current network is copied to the parameter matrix of the target network; the target Q value is then computed, the loss is computed from the target Q value and the Q value given by the Q function of the current network, and the current network parameters are updated by gradient descent; a threshold is set for the reward weight coefficient, and when extreme wind conditions are detected, an adjustment of the reward weight coefficient is triggered so that rotational-speed stability is given priority.
10. The wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL according to claim 5, wherein in step S33 the following are set: a threshold on the number of times the TD error is stable, a TD-error threshold that triggers decay, a threshold on the number of TD-error fluctuations, and a TD-error threshold that triggers an increase of the exploration rate; when the absolute value of the TD error is less than or equal to the decay trigger threshold for the set number of consecutive times, the exploration-rate decay mechanism of the DRL model is triggered, and the exploration rate decays linearly from its current value to the post-decay exploration rate, the decay step size being applied per iteration; when the absolute value of the TD error meets the increase trigger threshold for the set number of consecutive fluctuations, the exploration-rate increase mechanism is triggered, whereby the exploration rate rises linearly to its raised value, the increase step size being applied per iteration, and a diversity check of the experience-pool samples is triggered at the same time: if the sample coverage is below the sample-coverage threshold, with working-condition types left uncovered, downloading of supplementary samples from the cloud is triggered; wherein the supplementary samples comprise extreme-wind-condition samples and fault-condition samples.
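The TD-error-driven regularization switch of claim 5 (step S33) can be illustrated with a minimal Python sketch. The formulas and numeric values are not recoverable from the text above, so everything concrete here is an assumption: the class name `TDRegularizationGate`, the function `td_error`, all default numbers, the use of the smaller of the two critic estimates in a one-step target of the form r + gamma * min(Q1, Q2), and the reading that regularization switches on when the absolute TD error is above the threshold.

```python
from dataclasses import dataclass


@dataclass
class TDRegularizationGate:
    """Switches L2 regularization on and off from the TD-error magnitude.

    Hypothetical helper; every default below is an illustrative assumption,
    not a value taken from the claims.
    """
    td_threshold: float = 0.5   # assumed TD-error threshold
    stable_iters: int = 3       # assumed number of calm iterations before switch-off
    base_lr: float = 1e-3       # assumed initial learning rate
    reg_lr: float = 1e-4        # assumed reduced (regularized) learning rate
    l2_coeff: float = 1e-2      # assumed weight of the L2 term
    active: bool = False
    calm_count: int = 0

    def update(self, td_error: float) -> None:
        """Advance the gate by one training iteration."""
        if abs(td_error) > self.td_threshold:
            # Large TD error: switch regularization on and restart the calm counter.
            self.active = True
            self.calm_count = 0
        elif self.active:
            # Small TD error while regularizing: count consecutive calm iterations.
            self.calm_count += 1
            if self.calm_count >= self.stable_iters:
                self.active = False
                self.calm_count = 0

    @property
    def learning_rate(self) -> float:
        """Reduced rate while regularizing, initial rate otherwise."""
        return self.reg_lr if self.active else self.base_lr

    def l2_penalty(self, params) -> float:
        """L2 term added to the policy loss; zero while the gate is off."""
        if not self.active:
            return 0.0
        return self.l2_coeff * sum(p * p for p in params)


def td_error(reward, gamma, q1_next, q2_next, q_current):
    """One-step TD error using the smaller of the two critics' estimates (assumed)."""
    target_q = reward + gamma * min(q1_next, q2_next)
    return target_q - q_current
```

In a training loop, `gate.update(td_error(...))` would be called once per iteration, with `gate.learning_rate` passed to the optimizer and `gate.l2_penalty(policy_params)` added to the policy loss. The Lyapunov stability constraint of S33 is not modeled in this sketch.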
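The self-penalty loop of claim 6 (steps S41 to S45) can likewise be sketched in Python under stated assumptions. The names `SafetyLimits`, `threshold_deviation` and `safe_command` are hypothetical, the deviation is assumed to be the largest relative overshoot past any limit, the penalty is assumed to be proportional to that deviation, and the command is simplified to a tuple of pitch angle, yaw rate and load rather than PID coefficients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyLimits:
    """Upper limits from a hypothetical safety-threshold library."""
    max_pitch_deg: float
    max_yaw_rate_deg_s: float
    max_load_kn: float


def threshold_deviation(pitch_deg, yaw_rate_deg_s, load_kn, limits):
    """Return the largest relative overshoot past any limit (0.0 if all are met)."""
    overshoots = (
        (abs(pitch_deg) - limits.max_pitch_deg) / limits.max_pitch_deg,
        (abs(yaw_rate_deg_s) - limits.max_yaw_rate_deg_s) / limits.max_yaw_rate_deg_s,
        (abs(load_kn) - limits.max_load_kn) / limits.max_load_kn,
    )
    return max(0.0, *overshoots)


def safe_command(propose, fallback, limits, max_retries=3, penalty_gain=1.0):
    """Ask the policy for commands until one is safe, else use the fallback.

    propose(penalty) stands in for the DRL model: it receives the latest
    penalty signal (0.0 on the first call) and returns a command tuple
    (pitch_deg, yaw_rate_deg_s, load_kn). fallback stands in for the command
    of the stored historical optimal strategy. Returns (command, verified),
    where verified plays the role of the "safety verification passed" mark.
    """
    penalty = 0.0
    for _ in range(max_retries):
        command = propose(penalty)
        deviation = threshold_deviation(*command, limits)
        if deviation == 0.0:
            return command, True
        # Assumed penalty form: negative and proportional to the overshoot.
        penalty = -penalty_gain * deviation
    # Repeated violations: fall back, and still verify the fallback itself.
    return fallback, threshold_deviation(*fallback, limits) == 0.0
```

The retry count corresponds to the consecutive-violation condition of S45. Verifying the fallback as well means an unsafe stored strategy is never marked as passed.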

Description

Wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL

Technical Field

The invention relates to the technical field of wind power generation control, and in particular to a wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL.

Background

With the continuous growth of global demand for renewable energy, wind power has been widely adopted as a clean and efficient energy source. However, natural wind conditions are strongly random and dynamic: working conditions such as steady wind, turbulent wind and extreme wind switch rapidly, and dynamic factors such as wind-speed turbulence, wind shear and the tower shadow effect directly affect the response of the pitch and yaw links of the wind turbine generator set. In the prior art, patent CN112483334B discloses an intelligent control method for a wind turbine generator set based on edge computing. In that scheme, a processor on the wind-turbine side collects real-time operating data; after preprocessing and screening, FFT analysis, fault diagnosis and expert-strategy-based adjustment of control parameters are performed; finally, data are exchanged with the SCADA system and a cloud platform.
This technical scheme still has obvious shortcomings. Control-parameter adjustment depends on preset preprocessing conditions and expert strategies, which is essentially a static optimization mode. It does not consider rapid switching of working conditions from steady wind to turbulent wind, and cannot match in real time the dynamic response that wind-speed turbulence, wind shear, the tower shadow effect and similar factors demand of the pitch and yaw links, so control commands tend to lag or overshoot. Meanwhile, the priority rules of the adaptive strategy are preset and fixed, with no mechanism for dynamically adjusting control-command priority under dynamic disturbance. In disturbance scenarios such as extreme wind conditions and transient equipment faults, the scheme easily falls into local convergence or strategy instability, lacks global optimization capability, and cannot balance strategy exploration against stable control in real time. It may become stuck at a local optimum, or core commands may have their transmission resources preempted, causing mechanical load impacts.

Disclosure of Invention

In view of the shortcomings of the prior art, the technical problem to be solved by the invention is to provide a wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL, which achieves online co-evolution of the model and the control strategy through multi-source data fusion, lightweight deep reinforcement learning and a stability-constraint mechanism, and which solves the technical problems that, in engineering deployment, static models describe dynamic working conditions inaccurately and adaptive algorithms lack stability under random disturbance.
To solve the above technical problems, the invention provides a wind turbine generator set full-link self-adaptive control method based on dynamic sensing and co-evolution DRL, which comprises the following steps: S1, acquiring operating-state and environmental wind-condition data of a wind turbine generator set through a multi-modal sensor array, preprocessing the acquired data, and outputting verified multi-source data; S2, inputting the verified multi-source data into a physics-knowledge-enhanced convolutional encoder, embedding fluid-dynamics constraints for feature extraction, generating a working-condition type, dynamic feature parameters and a confidence level, and synthesizing extreme-working-condition samples through a generative adversarial network; S3, inputting the dynamic feature parameters and the confidence level into a pre-trained deep reinforcement learning (DRL) model deployed in an AI cooperative controller, wherein the DRL model integrates dual critics and a regularization mechanism driven by the temporal-difference (TD) error, embeds a Lyapunov stability constraint, adaptively switches its parameter configuration according to the working-condition type, and outputs pitch PID (proportional-integral-derivative) coefficients and a yaw-angle control instruction; S4, correcting the pitch PID coefficients and the yaw-angle control instruction through a self-penalty mechanism; S5, transmitting the corrected pitch PID coefficients and yaw-angle control instruction to a master-control PLC for execution using an event-d