CN-121258972-B - Industrial weld defect detection method and system with feature decoupling and task alignment


Abstract

The invention discloses an industrial weld defect detection method and system with feature decoupling and task alignment. The method comprises: inputting an industrial weld defect detection image and, using a multi-scale feature focusing and diffusing network, enhancing the multi-scale features and their information complementarity by aggregating multi-scale context cues and adaptively diffusing them to a plurality of detection layers; introducing a multi-level context fusion module to explicitly model local and global context dependencies within each enhanced scale; adopting a task alignment dynamic interaction head in which the classification and localization tasks are dynamically aligned in space and semantics at the feature level through task decoupling, deformable convolution and a dynamic feature selection mechanism; and inputting the enhanced and refined features into a prediction network to detect industrial weld defects. Through multi-scale feature enhancement, task alignment and context fusion, the invention improves the accuracy and robustness of weld defect detection.

Inventors

SU WENJIE
FAN EN
YU NINGYAO
LI QI
WANG RUOHAN
WEI PENGFEI

Assignees

Shaoxing University (绍兴文理学院)

Dates

Publication Date: 2026-05-08
Application Date: 2025-11-28

Claims (6)

1. An industrial weld defect detection method with feature decoupling and task alignment, characterized by comprising the following steps:
Step S1, inputting an industrial weld defect detection image and, using a multi-scale feature focusing and diffusing network (1), enhancing the multi-scale features and their information complementarity by aggregating multi-scale context cues and adaptively diffusing them to a plurality of detection layers;
Step S2, introducing a multi-level context fusion module (2) to explicitly model local and global context dependencies within each enhanced scale;
Step S3, adopting a task alignment dynamic interaction head (3) in which the classification and localization tasks are dynamically aligned in space and semantics at the feature level through task decoupling, deformable convolution and a dynamic feature selection mechanism;
Step S4, inputting the enhanced and refined features into a prediction network to detect industrial weld defects;
wherein step S1 comprises:
Step S11, using a feature focusing mechanism, extracting consecutive multi-scale feature maps from a YOLOv11 backbone network (4) and aggregating multi-scale context cues with a multi-branch parallel convolution structure;
Step S12, using a feature diffusion mechanism, feeding the fused global context cues back to each original scale to improve the representation capacity of each scale;
Step S13, outputting the multi-scale enhanced features to the multi-level context fusion module (2) and the task alignment dynamic interaction head (3);
step S2 comprises:
Step S21, taking the multi-scale enhanced features output by the multi-scale feature focusing and diffusing mechanism, applying a convolution to the enhanced feature at each scale for channel alignment and dimension compression, and extracting local and global context from the dimension-reduced enhanced feature in parallel;
Step S22, concatenating the output features of the local branch and the global branch along the channel dimension, and fusing and refining them with a pointwise convolution;
Step S23, applying a projection to the fused context feature and adding it residually to the original enhanced feature, so as to output, for that scale, a stronger feature injected with multi-level context, wherein the operations involved are: the convolution used for alignment projection; the local context modeling operation and the global context modeling operation; the fusion and semantic refinement function; and the residual projection mapping;
step S3 comprises:
Step S31, sharing parameters across the multi-scale stronger features output by the multi-level context fusion module (2) by means of a shared convolution, and introducing a learnable scale factor for layer-by-layer scale adjustment, wherein: the stronger features at the different scales form a set; each stronger feature is a real-valued tensor whose dimensions are the number of channels, the height and the width of the stronger feature map; the feature processed by the shared convolution has the number of output channels of the shared convolution; the scale adjustment factor of each feature is a positive real number; and the result is the shared interaction feature, unified in scale and semantics;
Step S32, applying global average pooling to the shared interaction feature to obtain a global context vector, and combining the shared interaction feature with the global context vector to generate an initial classification feature and an initial regression feature, wherein one convolution operation function combines the shared feature and the global context to generate the initial classification feature, and another convolution operation function combines the shared feature and the global context to generate the initial regression feature;
Step S33, projecting the refined regression feature and the refined classification feature into the final prediction space through separate convolutions, and generating a high-confidence detection result in combination with a DFL module;
wherein, in step S33, the classification feature and the regression feature are refined as follows:
Step S331, introducing a deformable convolution in the regression branch: sampling point offsets and a modulation mask are generated dynamically from the shared interaction feature by a dedicated function, a Sigmoid activation function being used to normalize the output values, and the predicted sampling point offsets and modulation mask are applied to the initial regression feature by the deformable convolution to generate the refined regression feature;
Step S332, introducing a dynamic feature selection mechanism in the classification branch: a foreground confidence map is predicted dynamically from the shared interaction feature by a dedicated function and multiplied element-wise with the initial classification feature to generate the refined classification feature, and a convolution operation function processes the refined classification feature to generate the final classification prediction result.
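The formulas of claim 1 are not reproduced in this text; only the verbal definitions of their terms survive. One notation consistent with those definitions is sketched below. Every symbol name, the multiplicative form of the scale factor and the placement of the Sigmoid on the modulation mask are assumptions, not the patent's own notation.

```latex
% S21-S23: context fusion at scale i
% F_i: enhanced feature, \hat F_i: stronger feature (names assumed)
\[ Z_i = \mathrm{Conv}(F_i), \qquad
   \hat F_i = F_i + \mathcal{P}\bigl(\Phi\bigl(\mathcal{L}(Z_i),\, \mathcal{G}(Z_i)\bigr)\bigr) \]

% S31: shared convolution followed by a learnable per-scale factor
\[ \hat F_i \in \mathbb{R}^{C \times H_i \times W_i}, \qquad
   Z'_i = \mathrm{Conv}_{\mathrm{sh}}(\hat F_i) \in \mathbb{R}^{C' \times H_i \times W_i}, \qquad
   X_i = s_i \cdot Z'_i, \quad s_i \in \mathbb{R}^{+} \]

% S32: global context and initial task features
\[ g_i = \mathrm{GAP}(X_i), \qquad
   F^{\mathrm{cls}}_i = f_{\mathrm{cls}}(X_i, g_i), \qquad
   F^{\mathrm{reg}}_i = f_{\mathrm{reg}}(X_i, g_i) \]

% S331: regression branch, deformable convolution with offsets and mask
\[ (\Delta p_i,\, m_i) = f_{\mathrm{om}}(X_i), \qquad
   \tilde F^{\mathrm{reg}}_i = \mathrm{DCN}\bigl(F^{\mathrm{reg}}_i;\ \Delta p_i,\ \sigma(m_i)\bigr) \]

% S332: classification branch, dynamic feature selection
\[ M_i = f_{\mathrm{fg}}(X_i), \qquad
   \tilde F^{\mathrm{cls}}_i = M_i \odot F^{\mathrm{cls}}_i, \qquad
   P^{\mathrm{cls}}_i = \mathrm{Conv}_{\mathrm{cls}}\bigl(\tilde F^{\mathrm{cls}}_i\bigr) \]
```

Here \(\mathcal{L}\) and \(\mathcal{G}\) stand for the local and global context operations, \(\Phi\) for fusion and semantic refinement, \(\mathcal{P}\) for the residual projection, \(\mathrm{GAP}\) for global average pooling, \(\sigma\) for the Sigmoid and \(\odot\) for element-wise multiplication.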
2. The industrial weld defect detection method with feature decoupling and task alignment according to claim 1, wherein in step S11 the process of aggregating multi-scale context cues using the feature focusing mechanism is as follows:
Step S111, extracting three features of different scales and making them consistent in spatial size and channels through scale-adaptive transformations, wherein up-sampling, identity mapping and down-sampling are applied to the three features respectively, the down-sampling being realized by an adaptive down-sampling operator;
Step S112, concatenating the three aligned features along the channel dimension to generate a focused feature by a multi-branch aggregation and fusion operation.
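Steps S111 and S112 can be illustrated with a minimal NumPy sketch. It assumes three feature maps that already share a channel count and whose spatial sizes differ by factors of two, aligns them to the middle resolution, uses nearest-neighbour repetition for up-sampling and takes 2×2 average pooling as a stand-in for the adaptive down-sampling operator. The function names `upsample2x`, `downsample2x` and `focus_concat` are illustrative and do not come from the patent.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 average pooling of a (C, H, W) map; an assumed stand-in for
    the adaptive down-sampling operator named in the claim."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def focus_concat(fine, mid, coarse):
    """Align three scales to the middle resolution (S111) and
    concatenate them along the channel axis (S112)."""
    aligned = [downsample2x(fine), mid, upsample2x(coarse)]
    return np.concatenate(aligned, axis=0)

rng = np.random.default_rng(0)
fine = rng.standard_normal((8, 16, 16))
mid = rng.standard_normal((8, 8, 8))
coarse = rng.standard_normal((8, 4, 4))
focused = focus_concat(fine, mid, coarse)
print(focused.shape)  # (24, 8, 8)
```

The middle map passes through unchanged, which corresponds to the identity mapping in the claim; a real implementation would also need a channel projection when the three inputs have different channel counts.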
3. The industrial weld defect detection method with feature decoupling and task alignment according to claim 2, wherein in step S11 the process of aggregating multi-scale context cues using the feature focusing mechanism further comprises:
Step S113, applying a set of parallel depthwise convolutions to the concatenated feature, integrating the outputs of the parallel branches through a pointwise convolution, and introducing a residual connection that adds the feature before fusion to the feature after fusion, so that the information in the focused feature is complementary.
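Step S113 can be sketched in NumPy as follows. The kernel sizes (3 and 5), the choice to concatenate the branch outputs before the pointwise convolution, and the names `depthwise_conv`, `pointwise_conv` and `focus_refine` are assumptions made for illustration; the claim fixes only the pattern of parallel depthwise branches, pointwise integration and a residual connection.

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Depthwise 'same' convolution (cross-correlation, zero padding).
    x: (C, H, W); kernels: (C, k, k) with odd k, one kernel per channel."""
    c, h, w = x.shape
    k = kernels.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            out += kernels[:, i, j][:, None, None] * xp[:, i:i + h, j:j + w]
    return out

def pointwise_conv(x, weight):
    """1x1 convolution: weight (C_out, C_in) mixes channels at each pixel."""
    return np.einsum("oc,chw->ohw", weight, x)

def focus_refine(x, branch_kernels, mix_weight):
    """Parallel depthwise branches, pointwise integration, residual add."""
    branches = [depthwise_conv(x, k) for k in branch_kernels]
    stacked = np.concatenate(branches, axis=0)       # (B*C, H, W)
    return x + pointwise_conv(stacked, mix_weight)   # residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 6))
kernels = [rng.standard_normal((4, 3, 3)),           # assumed 3x3 branch
           rng.standard_normal((4, 5, 5))]           # assumed 5x5 branch
mix = rng.standard_normal((4, 8)) * 0.1              # (C, B*C)
y = focus_refine(x, kernels, mix)
print(y.shape)  # (4, 6, 6)
```

Because of the residual connection, the block reduces to the identity when the pointwise weights are zero, so the pre-fusion feature is always preserved in the output.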
4. The industrial weld defect detection method with feature decoupling and task alignment according to claim 3, wherein in step S12 the focused feature is subjected to a scale-adaptive transformation so that its spatial size and channel number are consistent with those of the original feature at each scale, and is then fused with the original feature by weighted residual addition to generate the enhanced feature, wherein the quantities involved are: the original feature and the enhanced feature at the given scale; the focused feature; the scale-adaptive transformation; and a learnable weight coefficient.
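The diffusion step of claim 4 can be sketched as a weighted residual addition at each scale. The sketch below assumes the form "enhanced = original + weight × transformed focus feature", realizes the scale-adaptive transformation as a nearest-neighbour resize followed by a 1×1 channel projection, and fixes the learnable weights to constants; the names `resize_nearest` and `diffuse` are illustrative.

```python
import numpy as np

def resize_nearest(x, out_hw):
    """Nearest-neighbour resize of a (C, H, W) map to out_hw = (H2, W2)."""
    _, h, w = x.shape
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return x[:, rows[:, None], cols[None, :]]

def diffuse(originals, focus, proj_weights, alphas):
    """Feed the focused feature back to every original scale."""
    enhanced = []
    for f, w, a in zip(originals, proj_weights, alphas):
        t = resize_nearest(focus, f.shape[1:])     # match spatial size
        t = np.einsum("oc,chw->ohw", w, t)         # match channel count
        enhanced.append(f + a * t)                 # weighted residual add
    return enhanced

rng = np.random.default_rng(0)
focus = rng.standard_normal((24, 8, 8))
originals = [rng.standard_normal((8, s, s)) for s in (16, 8, 4)]
projs = [rng.standard_normal((8, 24)) * 0.1 for _ in originals]
alphas = [0.5, 0.5, 0.5]  # learnable in the patent; fixed here for illustration
enhanced = diffuse(originals, focus, projs, alphas)
print([e.shape for e in enhanced])  # [(8, 16, 16), (8, 8, 8), (8, 4, 4)]
```

With a weight of zero a scale keeps its original feature unchanged, which is the sense in which the fusion is residual.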
5. An industrial weld defect detection system with feature decoupling and task alignment for implementing the method of claim 1, the system comprising: a multi-scale feature focusing and diffusing network (1) for enhancing the multi-scale feature representation and focusing on fine-grained defect cues; a multi-level context fusion module (2) for capturing local and global context dependencies so as to strengthen recognition robustness against complex backgrounds; a task alignment dynamic interaction head (3) for dynamically optimizing the alignment of the classification and localization tasks and ensuring prediction consistency; and a YOLOv11 backbone network (4) for extracting multi-level, multi-scale features from an input industrial weld defect detection image.
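The role of module (2), capturing local and global context at each scale as set out in steps S21 to S23 of claim 1, can be illustrated with a minimal NumPy sketch. The concrete operators are assumptions: a 1×1 convolution for channel reduction and projection, a 3×3 mean filter as the local branch, and a broadcast global average as the global branch. None of the function names come from the patent.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: weight (C_out, C_in) applied at every pixel."""
    return np.einsum("oc,chw->ohw", weight, x)

def local_context(x):
    """3x3 mean filter with zero padding (assumed local operator)."""
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    return sum(xp[:, i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def global_context(x):
    """Per-channel global average broadcast to every position
    (assumed global operator)."""
    return np.broadcast_to(x.mean(axis=(1, 2), keepdims=True), x.shape)

def context_fusion(x, w_reduce, w_fuse, w_proj):
    z = conv1x1(x, w_reduce)                                   # S21: reduce
    both = np.concatenate([local_context(z), global_context(z)], axis=0)
    fused = conv1x1(both, w_fuse)                              # S22: fuse
    return x + conv1x1(fused, w_proj)                          # S23: residual

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 6))
w_reduce = rng.standard_normal((4, 8)) * 0.1   # C=8 -> r=4
w_fuse = rng.standard_normal((4, 8)) * 0.1     # 2r=8 -> r=4
w_proj = rng.standard_normal((8, 4)) * 0.1     # r=4 -> C=8
y = context_fusion(x, w_reduce, w_fuse, w_proj)
print(y.shape)  # (8, 6, 6)
```

The output keeps the shape of the input, so the module can be applied independently to each scale before the features reach the task alignment dynamic interaction head.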
6. An industrial weld defect detection device with feature decoupling and task alignment, comprising: a memory (5) for storing a computer program and data; and a processor (6) for implementing, when executing the computer program, the steps of the industrial weld defect detection method with feature decoupling and task alignment according to any one of claims 1-4.

Description

Industrial weld defect detection method and system with feature decoupling and task alignment

Technical Field

The invention relates to the technical field of industrial weld defect detection, and in particular to an industrial weld defect detection method and system with feature decoupling and task alignment.

Background

Welding is a core process in modern manufacturing and is critical to ensuring the structural integrity and safety of industrial components. However, welds are affected by variations in materials, processes and environment, and defects such as cracks, pores and spatter often occur. These defects not only weaken the mechanical strength of the welded joint but also pose a significant risk to the safety and reliability of the engineering structure. Accurate weld defect detection has therefore become a critical task in quality assurance and intelligent manufacturing systems.

Traditionally, weld defect detection has relied on non-destructive testing (NDT) techniques such as ultrasonic testing, radiographic testing and manual visual inspection. Although these methods can achieve high precision in a controlled environment, their limitations are prominent: high cost, low efficiency, heavy reliance on operator expertise, and difficulty in handling complex surfaces or real-time monitoring requirements. These drawbacks limit the scalability and applicability of NDT in large-scale automated production environments.

With the rapid development of computer vision and deep learning, data-driven detection methods are increasingly being applied to weld defect analysis. Convolutional neural networks (CNNs) and mainstream object detection frameworks (such as Faster R-CNN, YOLO and SSD) have shown good results in identifying defects in industrial images.
Recent studies have also explored multi-scale feature fusion and Transformer-based architectures to further improve detection robustness. However, in complex welding scenarios, existing methods still face challenges such as detecting small-scale defects efficiently, distinguishing fine defect patterns from the weld itself, and balancing the optimization of the localization and classification tasks. There is therefore a need for a new solution to these problems.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing an industrial weld defect detection method and system with feature decoupling and task alignment, which improve the accuracy and robustness of weld defect detection through multi-scale feature enhancement, task alignment and context fusion.

To achieve this purpose, the invention adopts the following technical scheme: an industrial weld defect detection method with feature decoupling and task alignment, comprising the following steps:
Step S1, inputting an industrial weld defect detection image and, using a multi-scale feature focusing and diffusing network, enhancing the multi-scale features and their information complementarity by aggregating multi-scale context cues and adaptively diffusing them to a plurality of detection layers;
Step S2, introducing a multi-level context fusion module to explicitly model local and global context dependencies within each enhanced scale;
Step S3, adopting a task alignment dynamic interaction head in which the classification and localization tasks are dynamically aligned in space and semantics at the feature level through task decoupling, deformable convolution and a dynamic feature selection mechanism;
Step S4, inputting the enhanced and refined features into a prediction network to detect industrial weld defects.
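The order of steps S1 to S4 can be pictured as a chain of stages. The sketch below is only a wiring diagram in Python: the class name `DetectionPipeline` and its field names are hypothetical, each stage is a placeholder callable, and the strictly sequential flow is a simplification of the data paths described in the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable

Stage = Callable[[Any], Any]

@dataclass
class DetectionPipeline:
    backbone: Stage        # multi-scale feature extraction
    focus_diffuse: Stage   # step S1: feature focusing and diffusion
    context_fusion: Stage  # step S2: local and global context fusion
    aligned_head: Stage    # step S3: task alignment dynamic interaction head
    predict: Stage         # step S4: prediction network

    def __call__(self, image: Any) -> Any:
        feats = self.backbone(image)
        feats = self.focus_diffuse(feats)
        feats = self.context_fusion(feats)
        feats = self.aligned_head(feats)
        return self.predict(feats)

# Toy stages that only record the order in which they run.
trace = []

def stage(name: str) -> Stage:
    def run(x: Any) -> Any:
        trace.append(name)
        return x
    return run

pipeline = DetectionPipeline(
    backbone=stage("backbone"),
    focus_diffuse=stage("focus_diffuse"),
    context_fusion=stage("context_fusion"),
    aligned_head=stage("aligned_head"),
    predict=stage("predict"),
)
result = pipeline("image")
print(trace)
# ['backbone', 'focus_diffuse', 'context_fusion', 'aligned_head', 'predict']
```

Keeping each stage behind a plain callable makes it possible to swap in a real backbone or head without changing the wiring.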
Further, step S1 includes the following steps:
Step S11, using a feature focusing mechanism, extracting consecutive multi-scale feature maps from the YOLOv11 backbone network and aggregating multi-scale context cues with a multi-branch parallel convolution structure;
Step S12, using a feature diffusion mechanism, feeding the fused global context cues back to each original scale to improve the representation capacity of each scale;
Step S13, outputting the multi-scale enhanced features to the multi-level context fusion module and the task alignment dynamic interaction head.
Further, in step S11, the process of aggregating multi-scale context cues using the feature focusing mechanism is:
Step S111, extracting three features of different scales and making them consistent in spatial size and channels through scale-adaptive transformations, wherein up-sampling, identity mapping and down-sampling are applied to the three features respectively, the down-sampling being realized by an adaptive down-sampling operator;
Step S112, sp