CN-121980428-A - Method and device for generating small mechanical failure sample data based on differential evolution
Abstract
The application discloses a method and a device for generating small-sample mechanical fault data based on differential evolution. The method augments the minority-class samples used in mechanical fault diagnosis, alleviates sample imbalance, and thereby improves the training of a mechanical fault diagnosis model. The method comprises: acquiring running state data of mechanical equipment and constructing a fault diagnosis sample set; dividing the fault diagnosis sample set into a minority sample set and a majority sample set; constructing a corresponding local neighborhood for each original sample in the minority sample set; performing adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority samples, to obtain candidate synthesized samples; performing validity determination on the candidate synthesized samples and adding those that meet the validity determination conditions to a synthesized sample set; merging the synthesized sample set with the original minority sample set to obtain an enhanced minority sample set; and merging the enhanced minority sample set with the majority sample set to obtain an enhanced mechanical fault diagnosis training data set.
Inventors
- WEI JIANAN
- Huang Gaoxia
- WU CHANGCHENG
- YANG ZUODE
Assignees
- 贵州大学
- 贵阳险峰机床股份有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260407
Claims (10)
- 1. A method for generating small-sample mechanical fault data based on differential evolution, characterized by comprising the following steps: acquiring running state data of mechanical equipment, constructing a fault diagnosis sample set, and dividing the fault diagnosis sample set into a minority sample set and a majority sample set; constructing a corresponding local neighborhood for each original sample in the minority sample set; performing adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority class samples, to obtain candidate synthesized samples; performing validity determination on the candidate synthesized samples, and adding the candidate synthesized samples meeting validity determination conditions to a synthesized sample set; merging the synthesized sample set with the original minority sample set to obtain an enhanced minority sample set; and merging the enhanced minority sample set with the majority sample set to obtain an enhanced mechanical fault diagnosis training data set.
- 2. The method of claim 1, wherein the constructing a corresponding local neighborhood for each original sample in the minority sample set comprises: for a target sample in the minority sample set, calculating the sample distances between the target sample and the remaining samples in the minority sample set; sorting the remaining samples by sample distance from near to far; and selecting a preset number of neighbor samples in the sorted order to form the local neighborhood corresponding to the target sample.
- 3. The method of claim 1, wherein performing adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority class samples, to obtain candidate synthesized samples comprises: selecting, from the local neighborhood, neighbor samples that participate in generating the candidate synthesized sample; determining a generation weight for each minority class sample according to the normalized remaining service life information, and determining the number of candidate synthesized samples to generate for each minority class sample according to the generation weight; adjusting a scaling factor and a crossover rate according to the generation history; selecting one of a set of preset mutation strategies to generate a donor vector based on the adjusted scaling factor, wherein the preset mutation strategies comprise a differential vector, random sample disturbance and neighbor mean guidance; adjusting the mutation amplitude corresponding to the donor vector according to the remaining service life information; and crossing the original sample with the amplitude-adjusted donor vector to obtain the candidate synthesized sample.
- 4. The method according to claim 3, wherein determining the generation weight of each minority class sample according to the normalized remaining service life information, and determining the number of candidate synthesized samples to generate for each minority class sample according to the generation weight, comprises: performing a nonlinear mapping on the normalized remaining service life information to obtain a basic weight; performing enhancement processing on the basic weight to obtain an initial weight; identifying boundary samples according to the degree to which the minority class samples are outliers in the sample space; applying a boundary enhancement coefficient to the initial weight corresponding to each boundary sample to obtain the generation weight; and determining the number of candidate synthesized samples to generate for each minority class sample according to the proportion of its generation weight in the total weight.
- 5. The method according to claim 3, wherein the adjusting the mutation amplitude corresponding to the donor vector according to the remaining service life information comprises: for samples with low remaining service life, reducing the mutation amplitude corresponding to the donor vector; and for samples with high remaining service life, increasing the mutation amplitude corresponding to the donor vector; and wherein the crossing the original sample with the amplitude-adjusted donor vector to obtain the candidate synthesized sample comprises: performing binomial crossover on the original sample and the amplitude-adjusted donor vector; randomly selecting at least one feature dimension during the crossover, so that the value of at least one feature dimension is taken from the amplitude-adjusted donor vector; and outputting the candidate synthesized sample obtained by the crossover.
- 6. The method of claim 1, wherein performing validity determination on the candidate synthesized samples and adding the candidate synthesized samples that satisfy a validity determination condition to a synthesized sample set comprises: calculating a minimum distance between the candidate synthesized sample and the existing synthesized samples; when the minimum distance is greater than a preset distance threshold, determining that the candidate synthesized sample meets a diversity constraint; when the minimum distance is less than or equal to the preset distance threshold, discarding the candidate synthesized sample and regenerating it; performing boundary constraint processing on the candidate synthesized samples that meet the diversity constraint, limiting the value of each feature dimension to between the minimum and maximum values of that feature dimension in the minority sample set; and adding the candidate synthesized samples that have completed the boundary constraint processing to the synthesized sample set.
- 7. The method according to any one of claims 1 to 6, wherein, in the process of performing validity determination on the candidate synthesized samples and adding the candidate synthesized samples satisfying validity determination conditions to a synthesized sample set, the method further comprises: performing an interpretability evaluation on the candidate synthesized samples; and optimizing the sample generation process according to the interpretability evaluation result.
- 8. The method of claim 7, wherein the performing an interpretability evaluation on the candidate synthesized samples comprises: calculating the distribution consistency between the candidate synthesized samples and the real samples in the feature space to obtain a feature space consistency evaluation result; calculating the degree of correlation between the remaining service life label and each feature in the candidate synthesized samples to obtain a remaining service life correlation evaluation result; calculating the difference between the feature contributions of the candidate synthesized samples and of the real samples to the model output to obtain a decision contribution consistency evaluation result; and determining, according to a preset fault evolution rule, whether the candidate synthesized samples conform to the fault evolution rule to obtain a fault evolution rationality evaluation result.
- 9. The method of claim 8, wherein optimizing the sample generation process according to the interpretability evaluation result comprises: screening the candidate synthesized samples according to the feature space consistency evaluation result, the remaining service life correlation evaluation result, the decision contribution consistency evaluation result and the fault evolution rationality evaluation result; rejecting candidate synthesized samples whose feature space distribution is abnormal, whose remaining service life label contradicts the sample features, whose feature contributions deviate from the real sample distribution, or which do not conform to the fault evolution rule; adjusting the scaling factor, crossover rate, generation weight or mutation amplitude used in sample generation according to the screened candidate synthesized samples; and continuing to generate candidate synthesized samples based on the adjusted sample generation parameters.
- 10. A device for generating small-sample mechanical fault data based on differential evolution, characterized by being adapted to implement the method of any one of claims 1 to 9, the device comprising: an acquisition unit, configured to acquire running state data of mechanical equipment, construct a fault diagnosis sample set, and divide the fault diagnosis sample set into a minority sample set and a majority sample set; a construction unit, configured to construct a corresponding local neighborhood for each original sample in the minority sample set; a generation unit, configured to perform adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority class samples, to obtain candidate synthesized samples; an execution unit, configured to perform validity determination on the candidate synthesized samples and add the candidate synthesized samples satisfying validity determination conditions to a synthesized sample set; a first merging unit, configured to merge the synthesized sample set with the original minority sample set to obtain an enhanced minority sample set; and a second merging unit, configured to merge the enhanced minority sample set with the majority sample set to obtain an enhanced mechanical fault diagnosis training data set.
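The generation and validity steps recited in claims 3, 5 and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the claims name the three mutation strategies but give no formulas, so the function names, the concrete strategy formulas, the linear amplitude map with bounds `lo` and `hi`, and the Euclidean distance are all hypothetical. Adaptation of the scaling factor `F` and crossover rate `CR` from the generation history is not shown.

```python
import numpy as np


def make_donor(x, neighbors, minority, F, strategy, rng):
    """Build a donor vector with one of the three named strategies (formulas assumed)."""
    if strategy == "differential":
        # Scaled difference of two distinct neighbors added to the original sample.
        a, b = neighbors[rng.choice(len(neighbors), size=2, replace=False)]
        return x + F * (a - b)
    if strategy == "random":
        # Perturb toward a randomly drawn minority sample.
        r = minority[rng.integers(len(minority))]
        return x + F * (r - x)
    if strategy == "neighbor_mean":
        # Move toward the mean of the local neighborhood.
        return x + F * (neighbors.mean(axis=0) - x)
    raise ValueError(f"unknown strategy: {strategy}")


def scale_amplitude(x, donor, rul_norm, lo=0.5, hi=1.5):
    """Shrink the step for low remaining life, enlarge it for high remaining life."""
    amp = lo + (hi - lo) * rul_norm  # assumed linear map; rul_norm lies in [0, 1]
    return x + amp * (donor - x)


def binomial_crossover(x, donor, CR, rng):
    """Take each dimension from the donor with probability CR; force at least one."""
    mask = rng.random(x.shape[0]) < CR
    mask[rng.integers(x.shape[0])] = True  # guarantees one donor dimension
    return np.where(mask, donor, x)


def accept(candidate, synthetic, minority, min_dist):
    """Diversity check against a list of accepted samples, then boundary clipping."""
    if synthetic:
        d = np.linalg.norm(np.asarray(synthetic) - candidate, axis=1).min()
        if d <= min_dist:
            return None  # too close to an existing synthetic sample: discard
    return np.clip(candidate, minority.min(axis=0), minority.max(axis=0))


# Usage on made-up data: one candidate from one original sample.
rng = np.random.default_rng(0)
minority = rng.normal(size=(12, 4))
x, neighbors = minority[0], minority[1:6]
donor = make_donor(x, neighbors, minority, F=0.5, strategy="differential", rng=rng)
donor = scale_amplitude(x, donor, rul_norm=0.2)
candidate = binomial_crossover(x, donor, CR=0.7, rng=rng)
kept = accept(candidate, [], minority, min_dist=0.05)
```

A caller would loop until `accept` returns a sample rather than `None`, which corresponds to the discard-and-regenerate branch of claim 6.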
Description
Method and device for generating small mechanical failure sample data based on differential evolution
Technical Field
The application relates to the technical field of data enhancement, and in particular to a method and a device for generating small-sample mechanical fault data based on differential evolution.
Background
In industrial production, mechanical equipment operates in a complex environment for long periods, and its operating state is easily affected by load changes, component wear, environmental factors and the like, so that bearing faults, gear faults or other structural anomalies may occur. To monitor the operating state of mechanical equipment and identify faults in time, a fault diagnosis model is usually trained on historical operating data, so that the model can judge from the operating characteristics of the equipment whether a potential fault exists. Building a high-quality fault diagnosis training data set is therefore an important foundation of mechanical fault diagnosis technology.
In practical application scenarios, however, different types of faults occur with very different frequencies. Common running states or common fault types accumulate a large amount of sample data, whereas some abnormal or early fault states occur rarely and yield only a small amount of sample data, so the numbers of samples of the different classes in the training data set differ markedly.
In this situation, the fault diagnosis model is easily affected by the imbalanced sample distribution during training: it tends to learn the features of the classes with more samples and identifies the fault types with fewer samples poorly, which degrades the overall fault diagnosis of the mechanical equipment.
Disclosure of Invention
The application provides a method and a device for generating small-sample mechanical fault data based on differential evolution, which augment the minority-class samples used in mechanical fault diagnosis, alleviate sample imbalance, and thereby improve the training of a mechanical fault diagnosis model.
A first aspect of the application provides a method for generating small-sample mechanical fault data based on differential evolution, comprising: acquiring running state data of mechanical equipment, constructing a fault diagnosis sample set, and dividing the fault diagnosis sample set into a minority sample set and a majority sample set; constructing a corresponding local neighborhood for each original sample in the minority sample set; performing adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority class samples, to obtain candidate synthesized samples; performing validity determination on the candidate synthesized samples, and adding the candidate synthesized samples meeting validity determination conditions to a synthesized sample set; merging the synthesized sample set with the original minority sample set to obtain an enhanced minority sample set; and merging the enhanced minority sample set with the majority sample set to obtain an enhanced mechanical fault diagnosis training data set.
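The data-handling steps of the first aspect (dividing the sample set, building local neighborhoods, and merging) can be sketched as below. This is a hedged illustration only: the source does not specify a distance metric or a data layout, so the Euclidean distance, the NumPy array representation, and the function names `split_by_class`, `local_neighborhoods` and `augment` are assumptions. Appending the synthetic samples to the full set is equivalent to first merging them with the minority set and then with the majority set.

```python
import numpy as np


def split_by_class(X, y, minority_label):
    """Divide the fault diagnosis sample set into minority and majority sets."""
    is_minority = y == minority_label
    return X[is_minority], X[~is_minority]


def local_neighborhoods(minority, k):
    """For each minority sample, indices of its k nearest other minority samples."""
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.linalg.norm(diff, axis=2)      # assumed Euclidean distance
    np.fill_diagonal(dist, np.inf)           # a sample is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]   # sorted near to far, first k kept


def augment(X, y, synthetic, minority_label):
    """Append accepted synthetic samples, labeled as the minority class."""
    X_aug = np.vstack([X, synthetic])
    y_aug = np.concatenate([y, np.full(len(synthetic), minority_label)])
    return X_aug, y_aug


# Usage on made-up one-dimensional data: four healthy samples, four fault samples.
X = np.array([[10.0], [11.0], [12.0], [13.0], [0.0], [1.0], [3.0], [7.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
minority, majority = split_by_class(X, y, minority_label=1)
hoods = local_neighborhoods(minority, k=2)
X_aug, y_aug = augment(X, y, synthetic=np.array([[0.5], [2.0]]), minority_label=1)
```

The full pairwise distance matrix costs memory quadratic in the minority set size, which is usually acceptable because the minority set is small by definition.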
Optionally, the constructing a corresponding local neighborhood for each original sample in the minority sample set includes: for a target sample in the minority sample set, calculating the sample distances between the target sample and the remaining samples in the minority sample set; sorting the remaining samples by sample distance from near to far; and selecting a preset number of neighbor samples in the sorted order to form the local neighborhood corresponding to the target sample.
Optionally, the performing adaptive differential evolution generation on each original sample, based on the local neighborhood and the remaining service life information corresponding to the minority class samples, to obtain candidate synthesized samples includes: selecting, from the local neighborhood, neighbor samples that participate in generating the candidate synthesized sample; determining a generation weight for each minority class sample according to the normalized remaining service life information, and determining the number of candidate synthesized samples to generate for each minority class sample according to the generation weight; adjusting a scaling factor and a crossover rate according to the generation history; selecting one of a set of preset mutation strategies to generate a donor vector based on the adjusted scaling factor, wherein the preset mutation strategies comprise a differential vector, random sample disturb