CN-121676295-B - Wind turbine generator running state monitoring method based on multi-source heterogeneous data
Abstract
The invention discloses a method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data. The method comprises: collecting multi-source heterogeneous SCADA data, removing abnormal samples, and standardizing the data; converting the collected non-normal data into an approximately multivariate normal distribution based on spatial ranks; constructing an elastic-net penalized likelihood function and solving it for a mean vector estimate that characterizes the sparse fault signals in the SCADA data; calculating a basic monitoring statistic from the mean vector estimate and applying robust optimization to obtain a robust monitoring statistic and an upper control limit; calculating the robust monitoring statistic in real time and issuing a fault alarm if it exceeds the upper control limit; and, after a fault alarm is triggered, deriving the mean vector estimate from the optimal penalty coefficient and identifying and outputting fault component information. The invention addresses the following problems in the prior art: complex SCADA data distributions, interference from group correlation among variables, difficulty in identifying sparse faults, insufficient in-control samples, and the high cost of monitoring multiple units.
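As a rough illustration of the standardization and spatial-rank steps named in the abstract, the following Python sketch standardizes reference samples with Z-scores and computes the spatial rank of a new sample relative to them. It is a simplified example under stated assumptions, not the patented implementation: the function names `zscore_columns` and `spatial_rank` and the sample values are hypothetical, and the whitening by a Cholesky-derived transformation matrix described in claim 3 is omitted.

```python
import math
import statistics


def zscore_columns(rows):
    """Standardize each column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def spatial_rank(x, reference):
    """Average of the unit vectors pointing from each reference sample to x.

    Reference samples that coincide with x contribute a zero vector.
    The result always has Euclidean norm at most 1.
    """
    total = [0.0] * len(x)
    for ref in reference:
        diff = [a - b for a, b in zip(x, ref)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            total = [t + d / norm for t, d in zip(total, diff)]
    return [t / len(reference) for t in total]


# Hypothetical in-control reference samples: (temperature, vibration).
reference_raw = [[60.0, 1.2], [62.0, 1.0], [58.0, 1.1], [61.0, 1.3]]
reference, means, stds = zscore_columns(reference_raw)

# A new sample is scaled with the reference means and standard deviations.
new_raw = [75.0, 1.9]
new_scaled = [(v - m) / s for v, m, s in zip(new_raw, means, stds)]
rank = spatial_rank(new_scaled, reference)
print(rank)
```

Because the spatial rank keeps only directions relative to the reference samples, a sample far outside the reference cloud yields a rank vector with norm close to 1, while a sample near its center yields a norm close to 0, regardless of the original distribution of the variables.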
Inventors
- WU ZHENYU
- HE JIALE
- MU CHAOXU
- HAN CE
- WANG HOUSHENG
- XUE LIANGWEI
- TIAN XIANSHUO
Assignees
- Anhui University (安徽大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-02-05
Claims (7)
- 1. A method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data, characterized by comprising an offline training stage and an online monitoring stage, specifically as follows.
  The offline training stage comprises:
  acquiring multi-source heterogeneous SCADA data of the wind turbine generator, and sequentially performing abnormal sample rejection and standardization on the data;
  performing spatial rank conversion on the standardized SCADA multi-source heterogeneous data to unify the data distribution and obtain samples that approximately follow a multivariate normal distribution;
  smoothing the approximately multivariate normal samples, constructing an elastic-net penalized likelihood function based on the smoothed samples, and solving it to obtain a mean vector estimate characterizing the sparse fault signals in the SCADA multi-source heterogeneous data;
  performing robust optimization on the basic monitoring statistic of the wind turbine generator based on the mean vector estimate to obtain a robust monitoring statistic, setting an in-control average run length, and determining the upper control limit of the wind turbine generator in the in-control state through offline simulation, comprising:
  calculating the basic monitoring statistic at the current time step t from the mean vector estimate [formula not preserved in the source text], wherein the symbols of the formula denote, in order: the EWMA weight correction term; the smoothing parameter; the transpose of the standardized spatial rank vector after EWMA smoothing at the current time; and the transpose of the mean shift vector estimated under the elastic-net penalty;
  performing robust optimization on the basic monitoring statistic: constructing a penalty coefficient sequence that decreases from large to small down to 0 [sequence not preserved in the source text]; defining the index set of the non-zero elements of the optimal regression coefficient estimate under a given penalty coefficient, and recording the number of elements in that set; taking values one by one from the penalty coefficient sequence and, whenever a value makes the number of elements in the index set increase by one, recording that value as a transition point; traversing all values in the sequence to construct the set of transition points over which the number of elements increases from 1 to an upper bound, wherein the upper bound refers to the number of non-zero regression coefficients and each transition point is the penalty value at which the number of elements increases by one; using a default setting [not preserved in the source text] when no prior knowledge is available; and obtaining the robust monitoring statistic [formula not preserved in the source text], in which an expectation and a variance appear;
  setting the in-control average run length IC-ARL to a target value [value not preserved in the source text], and finding, through multiple offline simulations, the value of the robust monitoring statistic at which IC-ARL equals that target, this value being the upper control limit.
  The online monitoring stage comprises:
  collecting the SCADA multi-source heterogeneous data of the wind turbine generator in real time, calculating the robust monitoring statistic in real time following the processing flow of the offline training stage, comparing the robust monitoring statistic obtained in real time with the upper control limit determined through offline simulation, and issuing a fault alarm if the robust monitoring statistic exceeds the upper control limit;
  after the fault alarm is triggered, screening the optimal penalty coefficient, determining the mean vector estimate according to the optimal penalty coefficient, and identifying and outputting fault component information according to the mean vector estimate.
- 2. The method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to claim 1, characterized in that sequentially performing abnormal sample rejection and standardization on the SCADA multi-source heterogeneous data specifically comprises: removing, from the SCADA multi-source heterogeneous data of the wind turbine generator, data samples and zero-value samples caused by sensor faults, transmission delays, and shutdown maintenance, as well as abnormal fluctuation samples that do not match the recorded fault times; and applying Z-score standardization to the SCADA multi-source heterogeneous data from which the abnormal samples have been removed, so that the different running state variables of the wind turbine generator are converted to a uniform scale and the influence of differing measurement units is eliminated.
- 3. The method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to claim 1, characterized in that performing spatial rank conversion on the standardized SCADA multi-source heterogeneous data to unify the data distribution and obtain samples that approximately follow a multivariate normal distribution specifically comprises: calculating, at the current time step t, the covariance matrix of the in-control reference samples, the covariance matrix being computed from the standardized multidimensional running state variable vectors of the wind turbine generator in each in-control reference sample and the mean vector of the in-control reference samples; performing Cholesky decomposition on the covariance matrix and taking the inverse root to obtain a transformation matrix; calculating, based on the transformation matrix and the standardized multidimensional running state variable vector of the sample taken at the current time step t, the spatial rank of that vector, which characterizes the relative magnitude and direction of the sample; and standardizing the spatial rank to obtain the approximately multivariate normal sample for the current time step t.
- 4. The method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to claim 1, characterized in that smoothing the approximately multivariate normal samples, constructing an elastic-net penalized likelihood function based on the smoothed samples, and solving it to obtain a mean vector estimate characterizing the sparse fault signals in the SCADA multi-source heterogeneous data comprises: smoothing the approximately multivariate normal samples with an exponentially weighted moving average (EWMA) to obtain the smoothed samples [formula not preserved in the source text], wherein the formula involves the smoothing parameter and the approximately multivariate normal samples at time steps t and t-1, together with a setting whose value is not preserved in the source text; taking the smoothed samples as input, constructing the elastic-net penalized likelihood function [formula not preserved in the source text], wherein the formula involves an adaptive weight matrix, a balance coefficient, a penalty coefficient, and the regression coefficients, is expressed in terms of individual operating variables at individual sampling times and the corresponding vector components, and uses the dimension of the operating variables of the SCADA system; and minimizing the elastic-net penalized likelihood function with the least angle regression (LARS) algorithm to solve for the optimal estimate of the regression coefficients, from which the mean vector estimate characterizing the sparse fault signal is obtained, the non-zero elements of the mean vector estimate corresponding to potential running state fault variables of the wind turbine generator.
- 5. The method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to claim 1, characterized in that, after the fault alarm is triggered, screening the optimal penalty coefficient, determining the mean vector estimate according to the optimal penalty coefficient, and identifying and outputting fault component information according to the mean vector estimate specifically comprises: when the fault alarm is triggered, screening the optimal penalty coefficient based on the risk inflation criterion (RIC); solving for the optimal estimate of the regression coefficients under the optimal penalty coefficient through the elastic-net penalized likelihood function and the least angle regression (LARS) algorithm; mapping this estimate through the adaptive weight matrix to obtain the mean vector estimate; and determining the position of the faulty component according to the mapping between SCADA data and components, and finally outputting complete fault component information, including fault category, fault number, and faulty component position.
- 6. An electronic device comprising a memory and a processor, wherein: the memory is configured to store a computer program capable of running on the processor; and the processor is configured to execute, when running said computer program, the method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to any one of claims 1-5.
- 7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when executed, cause a processor to implement the method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data according to any one of claims 1-5.
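The transition-point construction described in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the claimed implementation: instead of the LARS solver named in claim 4, it uses the closed-form elastic-net solution that holds when the design matrix is the identity (soft-thresholding followed by shrinkage). The parametrization of the objective, the function names, the parameters `alpha`, `gammas`, and `q`, and the example vector `u` are all hypothetical. When a coarse penalty grid makes the non-zero count jump by more than one, the sketch assigns the same penalty value to every skipped count.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_identity(u, gamma, alpha):
    """Minimizer of ||u - b||^2 + gamma * (alpha * ||b||_1 + (1 - alpha) * ||b||_2^2).

    Assumption: identity design, so the problem separates per coordinate.
    """
    shrink = 1.0 + gamma * (1.0 - alpha)
    return [soft_threshold(v, gamma * alpha / 2.0) / shrink for v in u]


def active_set(beta):
    """Indices of the non-zero entries of an estimate."""
    return {k for k, b in enumerate(beta) if b != 0.0}


def transition_points(u, gammas, alpha, q):
    """Scan a decreasing penalty sequence and record, for k = 1..q, the first
    penalty value at which the estimate has at least k non-zero entries."""
    points = {}
    for gamma in gammas:
        size = len(active_set(elastic_net_identity(u, gamma, alpha)))
        for k in range(len(points) + 1, min(size, q) + 1):
            points[k] = gamma
        if len(points) == q:
            break
    return points


# Hypothetical smoothed sample with two clearly shifted variables (indices 1 and 3).
u = [0.1, -2.0, 0.3, 1.2]
gammas = [float(g) for g in range(8, -1, -1)]  # decreasing sequence ending at 0
points = transition_points(u, gammas, alpha=0.5, q=3)
print(points)  # {1: 7.0, 2: 4.0, 3: 1.0}
print(sorted(active_set(elastic_net_identity(u, points[3], 0.5))))  # [1, 2, 3]
```

Each recorded penalty value gives one candidate sparsity level. Evaluating the monitoring statistic at every transition point, rather than at a single fixed penalty, is what lets the statistic respond to faults affecting different numbers of variables.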
Description
Wind turbine generator running state monitoring method based on multi-source heterogeneous data
Technical Field
The invention relates to the technical field of wind turbine operation and maintenance, and in particular to a method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data.
Background
With the large-scale development of the wind power industry, wind turbine generators are mostly deployed in harsh environments such as remote mountain areas and at sea, where strong storms, dust, and salt-alkali corrosion cause frequent faults. According to statistics, fault-related downtime accounts for about 20% of the total operation and maintenance cost of a wind farm, and failures of key components (such as the gearbox and the generator) not only incur high maintenance costs but also significantly reduce the power generation performance of the unit and the revenue of the wind farm. Running state monitoring is a core technology for optimizing the maintenance strategy of wind turbine generators and reducing operation and maintenance costs, but current monitoring methods face four technical bottlenecks caused by multi-source heterogeneous SCADA data:
1. Complex data distribution: the multivariate data collected by the SCADA system (such as temperature, vibration, current, and power) come from different sources and follow unknown distributions (some variables are periodic, others fluctuate randomly). They do not satisfy the multivariate normality assumption of traditional multivariate statistical process control methods (such as the MEWMA control chart), which easily leads to an excessively high false alarm rate or missed alarm rate.
2. Interference from group correlation among variables: because components are related in structure and function, the different operating parameters collected are correlated, and the strength of the correlation depends on the structural position and functional relationship of the components. Components that are structurally closer and functionally more similar are more closely related, and so are their running states. As the number of sensors increases, the correlation structure becomes more complex. The correlations in a wind turbine generator form a group structure, also known as a group effect. When a fault occurs, a set of highly correlated variables typically changes simultaneously, and group effects typically prevent the model from identifying multiple fault variables.
3. Sparse faults are difficult to identify: the SCADA system records many operating variables, but when the unit runs abnormally, not all components fail at the same time. Only a few components are abnormal, and only the operating variables related to those components are affected by the fault and show abnormal fluctuations. In high-dimensional SCADA data, if all variables are modeled and analyzed together, the sparse fault features are masked by the majority of normal signals, and the high dimensionality makes the faults difficult to identify.
4. Insufficient in-control samples: the complexity of the monitoring model increases with the dimensionality of the SCADA data, which means that more in-control samples are required to train the model. However, some newly built wind farms do not have enough fault-free historical data to train the model. For example, some units fail early in a wind farm's operation, when no adequate in-control samples exist for those units.
In addition, in large newly built wind farms at the current stage, the number of units exceeds 50, and the number of SCADA variables collected by each unit exceeds 100. Starting the monitoring model quickly is therefore of practical significance for achieving accurate turbine-level running state monitoring in newly built wind farms.
In the prior art, the MEWMA control chart depends on the normal distribution assumption and adapts poorly to non-normal SCADA data; the LASSO method can screen sparse variables but cannot handle group correlation; and methods that can cope with non-normal data achieve a fault identification accuracy of only about 80%. A method for monitoring the running state of a wind turbine generator is therefore needed that simultaneously addresses multi-source heterogeneous distributions, group correlation, sparse faults, and insufficient in-control samples.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method for monitoring the running state of a wind turbine generator based on multi-source heterogeneous data, which overcomes the data distribution limitations of traditional methods by converting the SCADA multi-source heterogeneous da