CN-121982469-A - Multi-mode medical image fusion method based on multi-scale self-adaption and explicit channel interaction
Abstract
The application discloses a multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction, belonging to the field of medical image processing. The method extracts multi-scale feature maps through a multi-scale self-adaptive feature extraction module, suppresses noise and strengthens salient regions through a global-local dual-attention feature enhancement module, dynamically adjusts feature distributions and aligns cross-modal features through a context-gated self-adaptive feature modulation module, establishes multi-modal information flow interaction in the channel dimension through an explicit channel interaction fusion module, integrates complementary information through channel shuffle and grouped convolution, and finally restores the fused image through a joint optimization image reconstruction module while constraining reconstruction errors through a self-adaptive segmented norm loss function. The method enhances detail expression while maintaining structural fidelity, providing a more reliable image basis for computer-aided diagnosis.
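The abstract's five-stage pipeline (extract, enhance, modulate, fuse, reconstruct) can be sketched as a composition of functions. All function names and bodies below are hypothetical placeholders for illustration only; the actual modules are learned networks defined in the claims.

```python
import numpy as np

# Hypothetical stand-ins for the five modules named in the abstract.
# None of these placeholder operations are the patent's trained networks.

def extract_multiscale(img):
    # placeholder: three "scales" via repeated 2x average pooling
    scales = [img]
    for _ in range(2):
        h, w = scales[-1].shape
        scales.append(scales[-1][: h // 2 * 2, : w // 2 * 2]
                      .reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return scales

def enhance(feats):
    # placeholder for global-local dual attention: boost high-magnitude responses
    return [f * (1.0 + np.tanh(np.abs(f))) for f in feats]

def modulate(feats):
    # placeholder for context-gated modulation: per-map standardization
    return [(f - f.mean()) / (f.std() + 1e-6) for f in feats]

def fuse(a, b):
    # placeholder for explicit channel interaction: element-wise mean per scale
    return [(fa + fb) / 2.0 for fa, fb in zip(a, b)]

def reconstruct(fused):
    # placeholder reconstruction: return the finest scale
    return fused[0]

def fusion_pipeline(img_a, img_b):
    fa = modulate(enhance(extract_multiscale(img_a)))
    fb = modulate(enhance(extract_multiscale(img_b)))
    return reconstruct(fuse(fa, fb))
```

The sketch only shows the data flow between stages: both modalities pass through the same extraction, enhancement, and modulation stages before a single fusion and reconstruction step.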
Inventors
- KONG ZIJIE
- DENG LIZHEN
- XU GUOXIA
- ZHU HU
Assignees
- Nanjing University of Posts and Telecommunications (南京邮电大学)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-04-03
Claims (7)
- 1. A multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction, characterized by comprising the following steps: Step 1, extracting features of two input medical images of different modalities through a multi-scale self-adaptive feature extraction module, obtaining an A-path multi-scale feature map and a B-path multi-scale feature map; Step 2, performing saliency enhancement on the A-path multi-scale feature map and the B-path multi-scale feature map through a global-local dual-attention feature enhancement module, obtaining an A-path enhanced feature map and a B-path enhanced feature map; Step 3, adjusting the feature distribution of the A-path enhanced feature map and the B-path enhanced feature map through a context-gated self-adaptive feature modulation module, obtaining an A-path modulated feature map and a B-path modulated feature map; Step 4, fusing the A-path modulated feature map and the B-path modulated feature map through an explicit channel interaction fusion module, obtaining a fused feature map; Step 5, reconstructing the fused feature map through a joint optimization image reconstruction module, obtaining the final fused image.
- 2. The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to claim 1, wherein step 1 specifically comprises: Step 11, computing the gradient information of the two input multi-modal medical images with the Sobel operator to strengthen their edge features, obtaining an A-path edge-enhanced multi-modal medical image and a B-path edge-enhanced multi-modal medical image; Step 12, inputting the A-path edge-enhanced multi-modal medical image and the B-path edge-enhanced multi-modal medical image respectively into cascaded adaptive parallel feature units (APFU), wherein each APFU is formed by an adaptive convolution and an adaptive Transformer in parallel; the adaptive convolution dynamically generates convolution parameters from globally pooled features through a context gating mechanism, and the adaptive Transformer adopts a shifted-window self-attention mechanism to capture long-range dependencies; each path yields a feature map after passing through the adaptive convolution layer; Step 13, inputting the A-path feature map and the B-path feature map output in step 12 respectively into three cascaded APFUs for processing, each unit taking the output of the previous one as input, obtaining the A-path multi-scale feature map and the B-path multi-scale feature map.
- 3. The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to claim 2, wherein step 2 specifically comprises: Step 21, dividing the A-path multi-scale feature map and the B-path multi-scale feature map obtained in step 1 each into several subspaces along the channel dimension, each subspace containing G channels and identified by a subspace index; Step 22, sequentially applying max pooling, depthwise convolution and pointwise convolution to each A-path subspace feature to obtain the A-path processed feature map, and likewise to each B-path subspace feature to obtain the B-path processed feature map, wherein the max pooling layer has a kernel size of 3×3 and a padding of 1; Step 23, applying layer normalization to the A-path processed feature map and flattening its spatial dimensions into a sequence length to obtain the normalized A-path feature, and applying the same operation to the B-path processed feature map to obtain the normalized B-path feature; Step 24, projecting the normalized A-path feature into query, key and value vectors through learnable query, key and value projection weight matrices, obtaining two groups of query vectors, two groups of key vectors and a value vector, and projecting the normalized B-path feature likewise; Step 25, computing the A-path differential attention map as the difference of two Softmax attention maps, wherein a learnable scaling factor controls the balance of the differential weights and the dot products are scaled by the projection dimension, and computing the B-path differential attention map in the same way; rearranging and expanding the dimensions of the A-path differential attention map to obtain the A-path attention feature, and rearranging and expanding the dimensions of the B-path differential attention map to obtain the B-path attention feature; Step 26, multiplying the A-path attention feature element-wise with the A-path subspace feature and adding a residual connection to realize feature enhancement and information complementation, outputting the A-path enhanced subspace feature, and doing the same for the B path to output the B-path enhanced subspace feature; Step 27, concatenating the A-path enhanced subspace features along the channel dimension to obtain the A-path enhanced feature map, and concatenating the B-path enhanced subspace features along the channel dimension to obtain the B-path enhanced feature map.
- 4. The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to claim 3, wherein step 3 specifically comprises: Step 31, fusing the A-path enhanced feature map and the B-path enhanced feature map along the channel dimension through a concatenation operation, then performing nonlinear mapping with a pointwise convolution layer combined with a context gating mechanism, computing in real time two groups of learnable modulation parameters whose spatial dimensions are consistent with the features: a scaling factor and an offset factor; Step 32, applying point-wise instance normalization to the A-path enhanced feature map using its mean and standard deviation, multiplying the normalized A-path feature element-wise by the scaling factor and adding the offset factor to obtain the A-path modulated feature map; applying the same instance normalization, scaling and offset to the B-path enhanced feature map to obtain the B-path modulated feature map.
- 5. The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to claim 4, wherein step 4 specifically comprises: Step 41, adding the A-path modulated feature map and the B-path modulated feature map element-wise to obtain a combined feature map; Step 42, reshaping the combined feature map into channel tokens, extracting a key matrix and a value matrix from the A-path modulated feature map and from the B-path modulated feature map respectively, and generating a query matrix from the combined feature map through global average pooling and linear projection; computing an explicit correlation matrix from the query matrix and the transposed key matrices with a scaling factor, applying a Sigmoid activation function and a Softmax normalization operation; using the explicit correlation matrix to perform explicit association enhancement on the combined feature map, obtaining an association-enhanced feature map; Step 43, generating a coarse-grained weight from the association-enhanced feature map through parallel spatial attention weighting and channel attention weighting, wherein the channel branch uses global max pooling and average pooling over the channel dimension; Step 44, multiplying the coarse-grained weight element-wise with the association-enhanced feature map, then generating a fine-grained weight through channel shuffle and grouped convolution, the channel shuffle operation enhancing cross-group information interaction; Step 45, outputting the fused feature map through a 1×1 convolution.
- 6. The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to claim 5, wherein step 5 specifically comprises: Step 51, mapping the fused feature map into the fused image with three convolution layers, wherein the first layer is a 1×1 convolution with a Leaky ReLU activation function and the latter two layers are 3×3 convolutions with Leaky ReLU activation functions; Step 52, in the training phase, constraining the reconstruction error with an adaptive segmented norm loss function, wherein the loss is accumulated over all row and column coordinate indices within the height and width of the image, compares the gray values of the input source image and the final fused image at each pixel coordinate, and switches its norm according to a preset threshold.
- 7. A multi-modal medical image fusion system based on multi-scale self-adaption and explicit channel interaction for implementing the multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction according to any one of claims 1 to 6, comprising: a multi-scale self-adaptive feature extraction module for extracting multi-scale feature maps of the input multi-modal medical images; a global-local dual-attention feature enhancement module for enhancing the saliency of each subspace feature, the module dividing the features into subspaces, computing the difference of two Softmax attention maps in each subspace as a differential attention score, and enhancing the subspace features based on the differential attention score; a context-gated self-adaptive feature modulation module for adjusting the spatial distribution of image features, the module generating modulation parameters from the global information of context gating, applying an affine transformation after instance normalization of the feature maps, and aligning the cross-modal feature distributions; an explicit channel interaction fusion module for performing multi-modal information interaction in the channel dimension, the module adopting multi-head channel attention to compute a correlation matrix of cross-modal features, combining channel shuffle and grouped convolution to generate fine-grained weights, and adaptively integrating multi-modal complementary information; and a joint optimization image reconstruction module for restoring the fused features into a fused image, with the training phase adopting a self-adaptive segmented norm loss function to constrain reconstruction errors.
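Step 25 of claim 3 computes a differential attention map as the difference of two Softmax attention maps balanced by a learnable scaling factor. The single-subspace sketch below is illustrative only: the token shapes, the λ value, and the random projection matrices are assumptions, not the patent's trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable Softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(x, d=16, lam=0.5, seed=0):
    """x: (L, C) sequence of L tokens with C channels.
    Two query/key groups produce two Softmax attention maps; their
    lambda-weighted difference is the differential attention score
    (claim 3, step 25). Projections here are random stand-ins for
    the learnable weight matrices of step 24."""
    rng = np.random.default_rng(seed)
    L, C = x.shape
    wq1, wq2, wk1, wk2 = (rng.standard_normal((C, d)) / np.sqrt(C)
                          for _ in range(4))
    wv = rng.standard_normal((C, C)) / np.sqrt(C)
    q1, q2, k1, k2, v = x @ wq1, x @ wq2, x @ wk1, x @ wk2, x @ wv
    a1 = softmax(q1 @ k1.T / np.sqrt(d))   # first attention map
    a2 = softmax(q2 @ k2.T / np.sqrt(d))   # second attention map
    attn = a1 - lam * a2                   # differential attention map
    return attn @ v                        # attended features, (L, C)
```

Subtracting a second attention map acts as a learned noise-cancellation term: responses that both maps agree on as spurious are attenuated, which matches the module's stated goal of suppressing noise while strengthening salient regions.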
Description
Multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction

Technical Field

The application relates to the technical field of medical image processing and artificial intelligence, in particular to a multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction.

Background

With the development of computer technology and the widespread use of medical imaging devices, medical images play an increasingly important role in disease diagnosis, therapy planning and medical research. Depending on the imaging mechanism, medical images are generally classified into anatomical images (e.g., CT, MRI), which clearly show tissue morphology, and functional images (e.g., PET, SPECT), which reflect metabolic or blood-flow information. A single modality can rarely provide complete anatomical and functional information simultaneously, so fusing multi-modal images has significant clinical value. Existing medical image fusion methods are mainly divided into traditional methods and deep learning methods. Traditional methods (such as pyramid transforms, wavelet transforms and sparse representation) rely on hand-designed transforms and fusion rules, are sensitive to noise, struggle to adapt to the significant gray-level differences among modalities, and easily produce artifacts or structural fractures. Among deep learning methods, convolutional neural networks are limited by their local receptive field and have difficulty modeling global dependencies; generative adversarial network training is unstable and prone to mode collapse; and although a Transformer can capture long-range dependencies, its computational complexity is high and its adaptability to cross-modal feature distribution differences and noise interference remains insufficient.
Although the recently proposed Mamba architecture offers linear complexity and global modeling capability, how to balance local details and global structure in the fusion task remains an open problem. In addition, most existing fusion strategies adopt implicit interaction such as concatenation or addition, which ignores the semantic correlation among different channels and makes explicit cross-modal information coordination difficult. The usual L1 or L2 loss functions also struggle to balance denoising and detail preservation under severe noise interference.

Disclosure of Invention

To solve these problems, the application discloses a multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction, which performs feature extraction and information fusion on medical images of different modalities through key modules including global-local dual-attention feature enhancement, context-gated self-adaptive feature modulation and explicit channel interaction fusion, obtaining a fused image with more complete information and richer details.
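The adaptive segmented norm loss of claim 6 addresses the L1/L2 trade-off noted above by switching its norm at a preset threshold. The exact piecewise formula is not reproduced in this text; the sketch below assumes a Huber-style rule (quadratic below the threshold, linear above), which is one common way to smooth small noisy residuals while preserving strong edges. The function name and the normalization by `tau` are assumptions for illustration.

```python
import numpy as np

def segmented_norm_loss(src, fused, tau=1.0):
    """Pixel-wise piecewise norm between source and fused gray values.
    Assumed Huber-style form: quadratic for |error| <= tau (smooths
    noise), linear for |error| > tau (preserves strong edges).
    The patent's exact segmentation rule may differ."""
    err = np.abs(src.astype(float) - fused.astype(float))
    quad = 0.5 * err ** 2 / tau   # small residuals: L2-like penalty
    lin = err - 0.5 * tau         # large residuals: L1-like penalty
    # the two branches meet continuously at err == tau (both equal tau/2)
    return np.where(err <= tau, quad, lin).mean()
```

Because the linear branch grows slowly for large residuals, genuine structural differences between modalities are not over-penalized, while the quadratic branch still drives small noisy residuals toward zero.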
The multi-modal medical image fusion method based on multi-scale self-adaption and explicit channel interaction specifically comprises the following steps: Step 1, extracting features of two input medical images of different modalities through a multi-scale self-adaptive feature extraction module, obtaining an A-path multi-scale feature map and a B-path multi-scale feature map; Step 2, performing saliency enhancement on the A-path multi-scale feature map and the B-path multi-scale feature map through a global-local dual-attention feature enhancement module, obtaining an A-path enhanced feature map and a B-path enhanced feature map; Step 3, adjusting the feature distribution of the A-path enhanced feature map and the B-path enhanced feature map through a context-gated self-adaptive feature modulation module, obtaining an A-path modulated feature map and a B-path modulated feature map; Step 4, fusing the A-path modulated feature map and the B-path modulated feature map through an explicit channel interaction fusion module, obtaining a fused feature map; Step 5, reconstructing the fused feature map through a joint optimization image reconstruction module, obtaining the final fused image. Further, the specific content of step 1 is as follows: Step 11, computing the gradient information of the two input multi-modal medical images with the Sobel operator to strengthen their edge features, obtaining an A-path edge-enhanced multi-modal medical image and a B-path edge-enhanced multi-modal medical image; Step 12, inputting the A-path edge-enhanced multi-modal medical image and the B-path edge-enhanced multi-modal medical image respectively into cascaded self-adapt