CN-121996954-A - Defect detection method and device based on multi-source data
Abstract
The invention discloses a defect detection method and device based on multi-source data. The method comprises: obtaining data to be detected and source-domain sample data with defect labels; performing alignment mapping on the data to be detected and the source-domain sample data based on a pre-built mapping model to obtain defect detection features and source-domain sample features; performing distance analysis between each defect detection feature and all source-domain sample features to determine a first pseudo label; performing label updating on the first pseudo label and the labeled source-domain sample data based on a pre-built label classification model to obtain a second pseudo label; updating the mapping model and the label classification model based on the first pseudo label and the second pseudo label; and repeating the alignment mapping and the label updating until a preset condition is met, then giving the defect label corresponding to the data to be detected. By combining the two pseudo-labeling mechanisms in a highly robust defect detection model, the invention improves the precision of the generated defect labels.
Inventors
- LI ZHENYU
- JU LING
- ZHOU XIAN
- ZHU YING
- WU RONG
- WANG YIXUAN
- ZHANG WEIQI
- DAI YONGDONG
- TAN XIAO
- LI CHENYING
- HUANG XINYU
- XU ZHENGHONG
- LI LEI
- ZHANG XIANGYU
- WANG SHUHENG
Assignees
- 国网江苏省电力有限公司泰州供电分公司
- 国网江苏省电力有限公司
- 北京继祥科技发展有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260112
Claims (10)
- 1. A defect detection method based on multi-source data, comprising: acquiring data to be detected for defects and source-domain sample data with defect labels; performing alignment mapping on the to-be-detected data and the source-domain sample data based on a pre-constructed mapping model to obtain defect detection features and source-domain sample features, respectively; performing distance analysis between each defect detection feature and all source-domain sample features to determine each target defect detection feature, the target to-be-detected data, and the corresponding first pseudo labels; performing, based on a pre-constructed label classification model, label updating on the target to-be-detected data, the corresponding first pseudo labels, and the source-domain sample data with defect labels to obtain second pseudo labels corresponding to the target to-be-detected data; and updating the mapping model and the label classification model based on the first pseudo labels and the second pseudo labels, and repeating the alignment mapping and the label updating until a preset condition is met, thereby giving the defect label corresponding to the to-be-detected data.
- 2. The defect detection method of claim 1, wherein performing alignment mapping on the to-be-detected data and the source-domain sample data based on a pre-constructed mapping model to obtain defect detection features and source-domain sample features, respectively, comprises: aligning and mapping the to-be-detected data and the source-domain sample data into the same data space; constructing an optimization objective function with the aim of minimizing intra-class distribution differences in the data space and maximizing intra-class compactness and inter-class separability; solving the optimization objective function based on a Lagrangian algorithm to determine a shared feature matrix of the data space; and performing alignment mapping on the to-be-detected data and the source-domain sample data, respectively, through the shared feature matrix to give the defect detection features and the source-domain sample features.
- 3. The defect detection method of claim 2, wherein the optimization objective function is specifically expressed as: (formula not reproduced in this text); wherein min is the minimization operator and the objective is subject to a constraint; ‖W‖² represents the squared Euclidean norm of the shared feature matrix W; W^T is the transpose of the shared feature matrix W; α_1 and α_2 represent the weights of intra-class compactness and inter-class separability, respectively; λ represents a regularization parameter; U_s is the center matrix of all class clusters in the source-domain sample data; H is a diagonal matrix; I_k represents the k-dimensional identity matrix; MCCD represents the maximum cluster center difference; DSC represents intra-class cluster compactness; DCMC represents multi-cluster compactness; DMCM represents inter-class cluster separability; DCNC represents neighboring-class main-cluster separability; (DSC + DCMC) represents intra-class compactness; and (DMCM + DCNC) represents inter-class separability.
- 4. The defect detection method of claim 3, wherein solving the optimization objective function based on a Lagrangian algorithm to determine the shared feature matrix of the data space comprises: expressing the optimization objective function in matrix form and then constructing a Lagrangian function; setting the partial derivative to 0 and solving for all eigenvalues and eigenvectors that satisfy the resulting condition; and sorting the solved eigenvalues and taking the k smallest eigenvalues to form the Lagrange multiplier, with the k corresponding eigenvectors forming the shared feature matrix, wherein k is a natural number.
- 5. The defect detection method of claim 1, wherein performing distance analysis between each defect detection feature and all source-domain sample features to determine each target defect detection feature, the target to-be-detected data, and the corresponding first pseudo labels comprises: calculating the Euclidean distances between each defect detection feature and all source-domain sample features; sorting the Euclidean distances between each defect detection feature and all source-domain sample features in ascending order, and selecting a preset number of source-domain sample features closest to the defect detection feature; determining an initial pseudo-label matrix of the defect detection features based on the defect labels of the preset number of source-domain sample features; and performing a consistency judgment on the initial pseudo-label matrix to determine the target defect detection features, the target to-be-detected data, and the corresponding first pseudo labels.
- 6. The defect detection method of claim 1, wherein performing, based on a pre-constructed label classification model, label updating on the target to-be-detected data, the corresponding first pseudo labels, and the source-domain sample data with defect labels to obtain second pseudo labels corresponding to the target to-be-detected data specifically comprises: inputting the source-domain sample data, the source-domain sample features, the defect labels, the target defect detection features, and the first pseudo labels into the label classification model; in the label classification model, constructing an optimization objective of the classifier with the aims of minimizing the difference in classification results among samples of the same class and maximizing the difference in classification results among samples of different classes, and calculating an explicit solution of the optimization objective of the classifier; and obtaining the second pseudo labels corresponding to the target to-be-detected data based on the explicit solution.
- 7. The defect detection method of claim 6, wherein constructing the optimization objective of the classifier comprises: constructing the optimization objective of the classifier based on the difference in classification results between the class-cluster centers of the source domain and of the to-be-detected domain, the difference in classification results between same-class cluster samples and their class-cluster centers, the difference in classification results between the centers of same-class clusters, and the difference in classification results between the centers of the main clusters of neighboring classes.
- 8. The defect detection method of claim 6, wherein obtaining the second pseudo labels corresponding to the target to-be-detected data based on the explicit solution comprises: calculating the product of the explicit solution and a kernel function matrix of the target defect detection features; and obtaining the second pseudo labels from the calculation result, wherein the label value of each second pseudo label is the index of the row in which the maximum value of the corresponding column vector is located.
- 9. The defect detection method of claim 1, wherein updating the mapping model and the label classification model based on the first pseudo labels and the second pseudo labels, and repeating the alignment mapping and the label updating until a preset condition is met to give the defect label corresponding to the to-be-detected data, comprises: comparing the first pseudo label with the second pseudo label of each item of target to-be-detected data, and if the two are consistent, determining the target to-be-detected data and its pseudo label as updated defect-detection-domain sample data; inputting the updated defect-detection-domain sample data into the mapping model and the label classification model and updating the model parameters, wherein updating the model parameters comprises updating the shared feature matrix of the mapping model and the classifier coefficient matrix of the label classification model; and repeatedly executing the alignment mapping and the label updating in an iterative optimization loop until a preset number of iterations is reached, and outputting the defect label corresponding to the to-be-detected data.
- 10. A defect detection apparatus based on multi-source data, which adopts the defect detection method based on multi-source data according to any one of claims 1 to 9, the apparatus comprising: a data acquisition module for acquiring data to be detected for defects and source-domain sample data with defect labels; a feature mapping module for performing alignment mapping on the to-be-detected data and the source-domain sample data based on a pre-constructed mapping model to obtain defect detection features and source-domain sample features, respectively; a label preliminary determination module for performing distance analysis between each defect detection feature and all source-domain sample features to determine each target defect detection feature, the target to-be-detected data, and the corresponding first pseudo labels; a label updating module for performing, based on a pre-constructed label classification model, label updating on the target to-be-detected data, the corresponding first pseudo labels, and the source-domain sample data with defect labels to obtain second pseudo labels corresponding to the target to-be-detected data; and a label final determination module for updating the mapping model and the label classification model based on the first pseudo labels and the second pseudo labels, and repeating the alignment mapping and the label updating until a preset condition is met, so as to give the defect label corresponding to the to-be-detected data.
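The distance analysis of claim 5 and the pseudo-label comparison of claim 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `first_pseudo_labels` and `agreed_samples`, the toy two-dimensional features, and the labels "crack" and "corrosion" are all hypothetical, and the "consistency judgment" is assumed here to mean that all k nearest source-domain samples carry the same defect label, which the claims do not spell out.

```python
import math


def first_pseudo_labels(target_feats, source_feats, source_labels, k=3):
    """Return {target_index: label} for targets whose k nearest source samples agree."""
    selected = {}
    for i, t in enumerate(target_feats):
        # Euclidean distance from this target feature to every source feature.
        dists = [(math.dist(t, s), j) for j, s in enumerate(source_feats)]
        dists.sort()  # ascending order of distance
        neighbour_labels = [source_labels[j] for _, j in dists[:k]]
        # Assumed consistency rule: keep the sample only if all k labels match.
        if len(set(neighbour_labels)) == 1:
            selected[i] = neighbour_labels[0]
    return selected


def agreed_samples(first, second):
    """Keep targets whose first and second pseudo labels are identical."""
    return {i: lab for i, lab in first.items() if second.get(i) == lab}


# Hypothetical aligned features (already mapped into the shared space).
source_feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
                (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
source_labels = ["crack", "crack", "crack",
                 "corrosion", "corrosion", "corrosion"]
target_feats = [(0.2, 0.3), (4.3, 4.2), (2.2, 2.2)]

first = first_pseudo_labels(target_feats, source_feats, source_labels, k=3)
# The third target sits between the two classes, so its neighbours disagree
# and it receives no first pseudo label in this round.
```

In this sketch an ambiguous sample is simply deferred rather than forced into a class, and `agreed_samples` would then keep only the samples on which the distance-based label and the classifier-based label coincide before the models are updated.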
Description
Defect detection method and device based on multi-source data
Technical Field
The invention belongs to the technical field of defect detection, and particularly relates to a defect detection method and device based on multi-source data.
Background
In fields such as model-based industrial defect detection, transfer learning is widely used to address the scarcity of labeled samples in the target application scenario. The approach generates pseudo labels for the unlabeled samples of the target application scenario and uses those pseudo labels for iterative model training. However, the accuracy of the pseudo labels directly determines the final detection performance of the model, so improving the quality of pseudo labels and using them effectively is a key challenge in the field.
In response to this problem, the prior art has proposed transfer learning methods based on selective pseudo-label learning (TLPLS), a representative example being selective pseudo-labeling (SPL) based on structured prediction. Such methods typically rely on a single information source for pseudo-label learning. For example, they calculate the distance between a target-domain sample and the class centers of the source domain or target domain to determine its pseudo label and confidence; then, according to a preset ratio, they select the higher-confidence samples of each class and their pseudo labels for iterative training of the model, gradually expanding the selection ratio as the number of iterations increases until all target samples are included. However, the above prior art has significant drawbacks.
First, the pseudo-label learning process relies on only a single kind of information, such as class centers or source-domain samples, and ignores the joint effect of multi-source information. When sample features of different classes are intermixed, judging a pseudo label solely by the distance between a sample feature and the class centers is highly error-prone; the information representation is too coarse, and key class information may be lost. Second, the prior art selects high-confidence samples according to a fixed proportion. This strategy is inherently flawed: it cannot prevent pseudo labels that have high confidence but are actually wrong from entering the training process, and it also discards pseudo labels that have low confidence but are actually correct. Both the introduced errors and the lost information interfere with the iterative optimization of the model, leading to poor performance or even unstable training.
Disclosure of Invention
In view of the defects in the prior art, the invention provides a defect detection method and device based on multi-source data, which can generate defect labels with high precision.
In a first aspect, the present invention provides a defect detection method based on multi-source data, including: acquiring data to be detected for defects and source-domain sample data with defect labels; performing alignment mapping on the to-be-detected data and the source-domain sample data based on a pre-constructed mapping model to obtain defect detection features and source-domain sample features, respectively; performing distance analysis between each defect detection feature and all source-domain sample features to determine each target defect detection feature, the target to-be-detected data, and the corresponding first pseudo labels; performing, based on a pre-constructed label classification model, label updating on the target to-be-detected data, the corresponding first pseudo labels, and the source-domain sample data with defect labels to obtain second pseudo labels corresponding to the target to-be-detected data; and updating the mapping model and the label classification model based on the first pseudo labels and the second pseudo labels, and repeating the alignment mapping and the label updating until a preset condition is met, thereby giving the defect label corresponding to the to-be-detected data.
Further, performing alignment mapping on the to-be-detected data and the source-domain sample data based on a pre-constructed mapping model to obtain defect detection features and source-domain sample features, respectively, includes: aligning and mapping the to-be-detected data and the source-domain sample data into the same data space; constructing an optimization objective function with the aim of minimizing intra-class distribution differences in the data space and maximizing intra-class compactness and inter-class separability; solving the optimization objective function based on a Lagrangian algorithm to determine a shared feature matrix of the data space; and respectively carrying out alignment mapping on th