CN-121999300-A - Welding defect detection method based on multi-mode cross attention fusion module

Abstract

A welding defect detection method based on a multi-modal cross-attention fusion module, relating to the technical field combining intelligent laser welding manufacturing and deep learning. The method comprises: acquiring a top-view image sequence and an overlook image sequence of the molten pool during welding; normalizing, cropping and sampling each of the two image sequences to obtain a top-view input sequence and an overlook input sequence; concatenating the top-view image and the overlook image captured at the same moment along the channel dimension to form a cross-view fusion frame; constructing three time-series input sequences; constructing a multi-modal cross-attention fusion module based on a Transformer architecture; constructing a training database and training to obtain a welding defect detection model; and detecting and evaluating welding quality with the trained model. The invention is mainly used for detecting laser welding defects.
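
The sequence construction summarized above (cropping, normalization, key-frame sampling, channel-wise concatenation of the two views, and pairing of fusion frames t milliseconds apart) can be illustrated with a minimal NumPy sketch. It is not the applicants' implementation: the function names, the `fps` parameter, the [0, 1] normalization range, the synthetic frame sizes, and the choice to pair key frame i with key frame i+1 are all assumptions made for illustration.

```python
import numpy as np


def center_crop(frame, size):
    """Crop an (H, W, C) frame to (size, size, C) around its center."""
    h, w = frame.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return frame[top:top + size, left:left + size]


def preprocess(frames, size):
    """Center-crop uint8 frames and scale pixel values to [0, 1] (assumed range)."""
    cropped = np.stack([center_crop(f, size) for f in frames])
    return cropped.astype(np.float32) / 255.0


def build_input_sequences(top_frames, overlook_frames, fps, t_ms, n, size):
    """Return the three time-series inputs: top view, overlook, fused pairs."""
    step = max(1, round(fps * t_ms / 1000.0))   # raw frames between key frames
    idx = np.arange(n) * step                   # indices of the n key frames
    top = preprocess(top_frames[idx], size)            # (n, size, size, 3)
    overlook = preprocess(overlook_frames[idx], size)  # (n, size, size, 3)
    # Cross-view fusion frame: both views at the same moment, stacked on channels.
    fused = np.concatenate([top, overlook], axis=-1)   # (n, size, size, 6)
    # Pair each fusion frame with the one t ms later: n key frames give n-1 pairs.
    paired = np.concatenate([fused[:-1], fused[1:]], axis=-1)  # (n-1, size, size, 12)
    return top[:n - 1], overlook[:n - 1], paired


# Synthetic stand-ins for two synchronized 1000 fps recordings (hypothetical data).
rng = np.random.default_rng(0)
top_raw = rng.integers(0, 256, size=(50, 40, 48, 3), dtype=np.uint8)
overlook_raw = rng.integers(0, 256, size=(50, 40, 48, 3), dtype=np.uint8)
top_seq, overlook_seq, fused_seq = build_input_sequences(
    top_raw, overlook_raw, fps=1000, t_ms=5, n=8, size=32)
print(top_seq.shape, overlook_seq.shape, fused_seq.shape)
```

With n = 8 key frames, each of the three returned sequences has n-1 = 7 time steps, so they can be fed to parallel encoders of equal sequence length. The geometric projection between views described in the claims is deliberately left out of this sketch.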

Inventors

TAN CAIWANG
WANG WEI
LIU FUYUN
LIU YUHANG
CHANG SHUAI
SONG XIAOGUO

Assignees

Harbin Institute of Technology, Weihai (哈尔滨工业大学（威海）)

Dates

Publication Date: 20260508
Application Date: 20260331

Claims (7)

1. A welding defect detection method based on a multi-modal cross-attention fusion module, characterized by comprising the following steps: S1, synchronously acquiring a top-view image sequence and an overlook image sequence of the molten pool during welding through a coaxial top-view high-speed camera and a paraxial overlook high-speed camera mounted on the laser head; S2, normalizing and cropping each of the two image sequences, and sampling at a fixed time interval of t milliseconds to obtain n key frames, thereby obtaining a top-view input sequence and an overlook input sequence; S3, concatenating the top-view image and the overlook image captured at the same moment along the channel dimension to form a cross-view fusion frame, and concatenating the cross-view fusion frame of the current moment with the cross-view fusion frame separated from it by the fixed interval of t milliseconds to form a time-series image group of n-1 key frames; S4, constructing three time-series input sequences, wherein the top-view input sequence and the overlook input sequence each take the first n-1 key frames, and the cross-view fusion sequence takes all of its frames; S5, constructing a multi-modal cross-attention fusion module based on a Transformer architecture, wherein the top-view, overlook and fusion time-series features undergo three-way information interaction and feature enhancement in the encoder through a cross-view cross-attention mechanism, and the enhanced features are concatenated along the channel dimension to form a fused sequence feature; S6, processing the fused sequence feature with a Transformer decoder and outputting a welding state category; S7, constructing a training database, training the multi-modal cross-attention fusion module end to end to obtain a trained welding defect detection model, and detecting and evaluating welding quality with the trained model.
2. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S1, the coaxial top-view high-speed camera is mounted directly above the laser head and captures the contour of the top of the molten pool and the dynamics of the keyhole, and the paraxial overlook high-speed camera is mounted obliquely in front of the laser head and captures the morphology of the molten pool edge and undercut features.
3. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S2, all input images are cropped to a preset pixel size before being fed into the convolutional encoder, the top-view image sequence and the overlook image sequence each having three channels; the time-synchronized top-view and overlook images are passed through a view-angle geometric mapping transformation module, which projects the overlook image into the top-view image coordinate system based on pre-calibrated camera parameters, thereby constructing a unified view-angle reference frame; an elliptic paraboloid surface model of the molten pool surface is constructed based on the physical characteristics of the laser welding molten pool; cross-view fusion is performed on the two images captured at the same moment, followed by multi-scale temporal fusion, to form the cross-view fusion feature sequence, in which each frame contains the cross-view fusion information of the current moment and the previous moment, thereby explicitly encoding the dynamic evolution of the molten pool.
4. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S2, the sampling of the key frames specifically comprises: S21, center-cropping the acquired top-view and overlook raw image sequences, each frame being cropped about its center to a preset pixel size using an image preprocessing function; S22, normalizing the cropped images by scaling the pixel values to a preset interval; S23, extracting key frames from each of the two processed image sequences at a fixed time interval of t milliseconds to obtain a top-view frame group and an overlook frame group; S24, taking the first n-1 items of the top-view frame group and of the overlook frame group as the top-view input sequence and the overlook input sequence, respectively.
5. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S3, the cross-view fusion frame concatenation specifically comprises: S31, obtaining the intrinsic and extrinsic parameter matrices of the top-view camera and the overlook camera using Zhang Zhengyou's calibration method; S32, based on the pre-calibrated camera parameters, projecting the overlook image into the top-view image coordinate system to construct a unified view-angle reference frame, and constructing an elliptic paraboloid surface model of the molten pool surface based on the physical characteristics of the laser welding molten pool.
6. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S5, the three-way information interaction and feature enhancement specifically comprise: S51, feeding the top-view feature sequence, the overlook feature sequence and the cross-view fusion feature sequence respectively into three lightweight convolutional encoders with identical structure and independent parameters, and generating query, key and value matrices through linear projection for cross-attention calculation; S52, inputting the top-view feature sequence, the overlook feature sequence and the cross-view fusion feature sequence into the multi-modal cross-attention fusion module, and calculating, for each sequence, its cross-attention weights with respect to the other two sequences; S53, grouping and summing the six groups of attention outputs according to the source of the query matrix to obtain the enhanced top-view feature, the enhanced overlook feature and the enhanced cross-view fusion feature; S54, concatenating the three enhanced features along the feature dimension and reducing the dimensionality through a linear layer to obtain the enhanced fusion feature sequence; S55, inputting the enhanced fusion feature sequence into a Transformer decoder, followed by global average pooling and classification through two fully connected layers, and outputting the classification result.
7. The welding defect detection method based on the multi-modal cross-attention fusion module according to claim 1, characterized in that in S7, the training database is constructed by: S71, presetting ranges of laser power, welding speed and defocus amount, and carrying out welding experiments; S72, performing metallographic sectioning on each group of samples and labeling the state of each sample as good, over-penetration, under-penetration or undercut; S73, dividing the welding experiment database into a training set, a validation set and a test set, and training with a cross-entropy loss function and an Adam optimizer until the test accuracy meets the preset requirement.
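
The three-way interaction of steps S52 to S54 in claim 6 (each sequence queries the other two, the six attention outputs are summed per query source, and the three results are concatenated and linearly reduced) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the claimed implementation: single-head scaled dot-product attention, the feature width `d`, the random weights, and all function and variable names are hypothetical, and the convolutional encoders, Transformer decoder and classifier are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_seq, context_seq, w_q, w_k, w_v):
    """Queries come from one sequence, keys and values from another."""
    q = query_seq @ w_q
    k = context_seq @ w_k
    v = context_seq @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (T_query, T_context)
    return weights @ v


def three_way_fusion(feats, params, w_out):
    """feats: name -> (T, d) sequence; params[(src, dst)] = (w_q, w_k, w_v)."""
    enhanced = []
    for src in feats:
        # Two attention outputs share this query source; sum them (S53).
        outputs = [cross_attention(feats[src], feats[dst], *params[(src, dst)])
                   for dst in feats if dst != src]
        enhanced.append(outputs[0] + outputs[1])
    concat = np.concatenate(enhanced, axis=-1)  # (T, 3d), feature-wise concat (S54)
    return concat @ w_out                       # linear reduction back to (T, d)


# Random stand-ins for the three encoded feature sequences (hypothetical sizes).
rng = np.random.default_rng(0)
T, d = 7, 16
names = ("top", "overlook", "fused")
feats = {name: rng.standard_normal((T, d)) for name in names}
params = {(src, dst): tuple(rng.standard_normal((d, d)) * 0.1 for _ in range(3))
          for src in names for dst in names if src != dst}
w_out = rng.standard_normal((3 * d, d)) * 0.1
fusion = three_way_fusion(feats, params, w_out)
print(fusion.shape)
```

The `params` dictionary holds one projection triple per ordered pair of sequences, which gives the six attention groups mentioned in S53. In a trainable version these matrices would be learned end to end together with the encoders and the classifier.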

Description

Welding defect detection method based on multi-mode cross attention fusion module

Technical Field

The invention relates to the technical field combining intelligent laser welding manufacturing and deep learning, and in particular to a welding defect detection method based on a multi-modal cross-attention fusion module.

Background

Laser welding is widely used in high-end equipment fields such as rail transit, aerospace and shipbuilding owing to its high energy density, small heat-affected zone and high welding speed. However, the welding process is highly susceptible to factors such as laser power fluctuation, assembly gap variation and material surface condition, which lead to penetration-related defects such as over-penetration, under-penetration and undercut. If not detected in time, these defects seriously compromise structural strength and service safety.

Traditional welding quality inspection relies mainly on offline destructive testing or single-modality online monitoring. The former cannot support online closed-loop control. The latter, limited to a single information dimension, struggles to reflect the three-dimensional dynamic behavior of the molten pool and is easily disturbed by strong arc light and spatter when welding highly reflective metals at high speed, so its recognition accuracy is insufficient.

In recent years, visual inspection methods based on deep learning have emerged. However, most existing methods combine single-view images with a CNN structure. On the one hand, a single view cannot simultaneously observe the keyhole morphology at the top of the molten pool (which reflects penetration) and the morphology of the molten pool edge (which reflects undercut), so features are lost. On the other hand, a CNN has difficulty modeling the dynamic association between views and the temporal evolution of the process, and generalizes poorly with small samples, so high detection accuracy cannot be achieved. Although some studies have introduced dual-view or multi-sensor fusion, most use simple concatenation or weighted averaging without deeply exploring cross-view semantic alignment and temporal dynamic cooperation mechanisms, and the resulting models are complex and difficult to deploy in real-time industrial systems.

In existing welding defect detection technology, most methods rely on an image sequence from a single view and classify the weld state using a conventional convolutional neural network (CNN) or a simple two-stream network. They cannot effectively model the semantic association and dynamic complementarity between the top view and the overlook view, and they lack the ability to jointly model the cross-view spatio-temporal evolution of the molten pool. As a result, recognition accuracy for typical welding defects is low and latency is high under strong light, spatter interference or small-sample conditions.

Therefore, there is a need for a welding defect detection method based on a multi-modal cross-attention fusion module that fuses the top view and the side view, effectively suppresses stray-light interference during welding, and achieves high defect recognition accuracy.
Disclosure of Invention

To overcome the shortcomings of existing welding defect detection, namely insufficient single-view perception, shallow multi-modal fusion and weak temporal modeling capability, the invention provides a welding defect detection method based on a multi-modal cross-attention fusion module that fuses the top view and the side view, effectively suppresses stray-light interference during welding, and achieves high defect recognition accuracy.

The welding defect detection method based on the multi-modal cross-attention fusion module of the invention comprises the following steps: S1, synchronously acquiring a top-view image sequence and an overlook image sequence of the molten pool during welding through a coaxial top-view high-speed camera and a paraxial overlook high-speed camera mounted on the laser head; S2, normalizing and cropping each of the two image sequences, and sampling at a fixed time interval of t milliseconds to obtain n key frames, thereby obtaining a top-view input sequence and an overlook input sequence; S3, concatenating the top-view image and the overlook image captured at the same moment along the channel dimension to form a cross-view fusion frame, and concatenating the cross-view fusion frame of the current moment with the cross-view fusion frame separated from it by the fixed interval of t milliseconds to form a time-series image group of n-1 key frames; S4, constructing three time-series input sequences, wherein the top-view input sequence and the overlook input sequence each take the first n-1 key fra