CN-122020176-A - Power data unbalance fault prediction method based on improved SMOTE algorithm
Abstract
The invention discloses a power data imbalance fault prediction method based on an improved SMOTE algorithm. Neighbors are selected using a composite metric that fuses Euclidean distance, sample local density, and physical state information defined by an autoencoder reconstruction error, and candidate synthetic samples are generated by interpolation. For any candidate sample that does not satisfy physical reality according to a preset power system physical constraint model, an optimization problem that minimizes the amount of correction is solved, so that the candidate is corrected into a physically valid final synthetic sample. Finally, a fault prediction model is trained on a data set containing the high-fidelity final synthetic samples. The invention ensures the physical authenticity of the synthetic samples, avoids data pollution, and improves the accuracy, generalization capability, and reliability of data-driven fault prediction models in actual power system applications.
Inventors
- WU QUAN
- TANG XIAOLAN
Assignees
- 西安卓俊建设工程有限公司
Dates
- Publication Date: 20260512
- Application Date: 20260205
Claims (10)
- 1. A power data imbalance fault prediction method based on an improved SMOTE algorithm, characterized by comprising the following steps: S1, acquiring power equipment feature data containing minority-class fault samples, and generating candidate synthetic samples by interpolating between the minority-class fault samples; S2, judging whether each candidate synthetic sample satisfies physical reality according to a preset power system physical constraint model, and, when a candidate synthetic sample does not satisfy physical reality, correcting it so that it satisfies the physical constraint model, thereby obtaining a final synthetic sample; and S3, training a fault prediction model using a data set containing the final synthetic samples.
- 2. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 1, wherein the interpolation in S1 comprises: performing linear interpolation between a selected minority-class sample and its neighbor samples.
- 3. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 2, wherein the neighbor samples are determined by a composite metric fusing Euclidean distance, sample local density information, and sample physical state information.
- 4. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 1, wherein the correcting in S2 comprises solving an optimization problem, the objective of which is to minimize the distance between the candidate synthetic sample and the corrected sample, and the constraint of which is that the corrected sample satisfies the physical constraint model.
- 5. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 1, wherein the physical constraint model includes an expression of at least one of the following constraints: a three-phase system component constraint, a power balance constraint, and a chemical or physical relationship constraint among equipment characteristic parameters.
- 6. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 1, further comprising, before S3 is executed: performing feature selection on the data set and screening out the feature subset that contributes most to fault prediction.
- 7. The method for predicting a power data imbalance fault based on an improved SMOTE algorithm of claim 1, wherein said fault prediction model is a neural network model based on a self-attention mechanism.
- 8. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 1, further comprising, before training the fault prediction model: defining a health index for each sample by calculating the reconstruction error of the feature data using an autoencoder model trained on healthy-state data.
- 9. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 7, wherein S2 is applied to generate device degradation trajectory samples, the process comprising: performing the interpolation and correction operations between sample subsets that represent different health levels and are divided according to the health index, to generate synthetic samples simulating continuous degradation of device performance.
- 10. The power data imbalance fault prediction method based on the improved SMOTE algorithm of claim 8, wherein a health management model is trained using the degradation trajectory samples, with a historical health index sequence of the device as input and the remaining life of the device as output.
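For illustration only, the correction recited in claims 1 and 4 (minimize the distance between the candidate synthetic sample and the corrected sample, subject to the physical constraint model) can be sketched for the simplest possible case. The sketch below assumes a single linear equality constraint a·x = b and Euclidean distance, for which the minimizer is the orthogonal projection onto the constraint. The feature layout [P_in, P_out, P_loss], the balance equation, and the function names are hypothetical and are not taken from the claims; a real constraint model may be nonlinear and require a numerical solver.

```python
from typing import List, Sequence


def constraint_residual(x: Sequence[float], a: Sequence[float], b: float) -> float:
    """Residual a.x - b of a single linear equality constraint a.x = b."""
    return sum(ai * xi for ai, xi in zip(a, x)) - b


def correct_candidate(candidate: Sequence[float], a: Sequence[float], b: float,
                      tol: float = 1e-9) -> List[float]:
    """Return the candidate unchanged if it satisfies a.x = b within tol.

    Otherwise return its Euclidean projection onto the constraint, which is
    the solution of: minimize ||x - candidate||_2 subject to a.x = b.
    """
    residual = constraint_residual(candidate, a, b)
    if abs(residual) <= tol:
        return list(candidate)
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint vector a must be non-zero")
    step = residual / norm_sq
    return [xi - step * ai for xi, ai in zip(candidate, a)]


# Hypothetical features [P_in, P_out, P_loss] with an assumed power balance
# P_in - P_out - P_loss = 0 (illustrative, not the patent's constraint model).
a = [1.0, -1.0, -1.0]
b = 0.0
candidate = [100.0, 90.0, 7.0]  # residual is 3.0, so the balance is violated

final_sample = correct_candidate(candidate, a, b)
print(final_sample)                             # [99.0, 91.0, 8.0]
print(constraint_residual(final_sample, a, b))  # 0.0
```

The projection spreads the violation evenly across the three features, which is what a minimum-distance correction does when all features are weighted equally; a weighted distance would shift more of the correction onto less trusted features.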
Description
Power data unbalance fault prediction method based on improved SMOTE algorithm
Technical Field
The invention relates to the technical field of fault prediction, and in particular to a power data imbalance fault prediction method based on an improved SMOTE algorithm.
Background
With the continued development of smart grids, the operation and management of power systems is rapidly evolving from traditional scheduled overhauls to condition-based predictive maintenance (PdM). The core driver of this evolution is the use of massive sensing data, together with data-driven analysis and prediction models, to achieve fine-grained management of the whole life cycle of power equipment, optimize the allocation of maintenance resources, and reduce the risk of unplanned outages. Against this background, fault prediction methods based on machine learning and deep learning have become a research hotspot. By learning the complex mapping between historical operation data and fault events, these methods can build intelligent models that give early warning of potential faults.
However, the performance of these methods depends heavily on the quality of the training data, especially in the presence of the data imbalance inherent to power systems: normal operation samples far outnumber fault samples, so model training is biased toward the majority class and the ability to identify minority-class fault samples is severely insufficient. To alleviate this problem, data augmentation algorithms represented by the synthetic minority oversampling technique (SMOTE) have been developed, which balance the data set by artificially synthesizing new samples through linear interpolation between minority-class samples. However, existing SMOTE-based data augmentation methods have limitations when applied to engineering systems with strong physical constraints, such as power systems.
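As a minimal sketch of the linear interpolation described above, the following example generates one synthetic sample between two minority-class samples and then evaluates a physical relationship on it. The feature layout [P, Q, S] and the relationship S² = P² + Q² are assumptions chosen purely for illustration and do not come from this document, as are the function names and sample values.

```python
from typing import List, Sequence


def smote_interpolate(sample: Sequence[float], neighbor: Sequence[float],
                      lam: float) -> List[float]:
    """Standard SMOTE step: x_new = x + lam * (x_neighbor - x), lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [x + lam * (n - x) for x, n in zip(sample, neighbor)]


def apparent_power_residual(features: Sequence[float]) -> float:
    """Residual of the illustrative constraint S^2 = P^2 + Q^2 for [P, Q, S]."""
    p, q, s = features
    return s * s - (p * p + q * q)


# Two hypothetical minority-class samples [P, Q, S]; both satisfy the
# assumed constraint exactly.
fault_a = [6.0, 8.0, 10.0]
fault_b = [8.0, 6.0, 10.0]

# SMOTE draws lam uniformly at random from [0, 1]; it is fixed here so the
# example is repeatable.
candidate = smote_interpolate(fault_a, fault_b, lam=0.5)

print(candidate)                           # [7.0, 7.0, 10.0]
print(apparent_power_residual(fault_a))    # 0.0
print(apparent_power_residual(candidate))  # 2.0
```

Both endpoints satisfy the assumed relationship, yet their midpoint does not, because a straight line between two points on a curved constraint surface leaves that surface.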
Standard SMOTE algorithms perform "blind" linear interpolation in feature space, completely ignoring the underlying physical laws that must hold between variables, such as Kirchhoff's laws and power balance relationships. They are therefore highly prone to generating "pseudo samples" that are mathematically valid but cannot exist in the physical world. Injecting pseudo samples that violate physical constraints into the training set not only fails to improve the generalization capability of the model but also pollutes the data: the model is misled into learning wrong or distorted feature associations, its decision boundary deviates from the real physical manifold, and its prediction accuracy and reliability under actual working conditions are ultimately reduced. Therefore, how to keep synthetic samples statistically similar to real samples while ensuring their physical authenticity has become a technical bottleneck that prevents data-driven power fault prediction from playing a greater role in critical management decisions.
Disclosure of Invention
This section is intended to outline some aspects of embodiments of the application and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section, as well as in the description and the title of the application, and such simplifications or omissions may not be used to limit the scope of the application.
The present invention has been made in view of the above-described problems in the prior art. Accordingly, the invention provides a power data imbalance fault prediction method based on an improved SMOTE algorithm to solve the problems described in the background.
In order to solve the above technical problems, the invention provides the following technical solution: a power data imbalance fault prediction method based on an improved SMOTE algorithm, comprising the following steps: S1, acquiring power equipment feature data containing minority-class fault samples, and generating candidate synthetic samples by interpolating between the minority-class fault samples; S2, judging whether each candidate synthetic sample satisfies physical reality according to a preset power system physical constraint model, and, when a candidate synthetic sample does not satisfy physical reality, correcting it so that it satisfies the physical constraint model, thereby obtaining a final synthetic sample; and S3, training a fault prediction model using a data set containing the final synthetic samples.
As a preferred scheme of the power data imbalance fault prediction method based on the improved SMOTE algorithm, the interpolation in S1 includes: performing linear interpolation between a selected minority-class sample and its neighbor samples.
As a preferred scheme of the power data imbalance fault prediction method based on the improved SMOTE algorithm, the neighbor samples are de