CN-122023282-A - Medical image processing method based on bimodal magnetic resonance image feature fusion
Abstract
The invention discloses a medical image processing method based on bimodal magnetic resonance image feature fusion, relating to the fields of computer vision and medical image processing. The method comprises the following steps: acquiring two magnetic resonance image volumes of a target area as inputs of different modalities, and performing standardization and unification of spatial scale on the image data; performing feature extraction on the magnetic resonance imaging sequences of the different modalities with a three-dimensional feature extraction network to obtain three-dimensional feature representations; mapping the three-dimensional features of the different modalities into feature sequences of uniform dimension, introducing cross-attention interactive correlation modeling, and modeling the association between the features of the different modalities at the voxel level to obtain a fused multi-modal feature representation; and performing feature aggregation and classification analysis based on the multi-modal feature representation, outputting a probability distribution of the target area over the different categories.
Inventors
- YANG Jun
- CAI Wei
- GAN Chenquan
- LIANG Chengchao
- ZENG Yanjia
Assignees
- 华中科技大学同济医学院附属同济医院 (Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology)
Dates
- Publication Date
- 20260512
- Application Date
- 20260109
Claims (8)
- 1. A medical image processing method based on bimodal magnetic resonance image feature fusion, characterized by comprising the following steps: S1, acquiring image data of two magnetic resonance imaging sequences of a target area as inputs of two modalities, and preprocessing the image data to obtain standardized three-dimensional volume data; S2, extracting features from the three-dimensional volume data corresponding to the magnetic resonance imaging sequences of the different modalities respectively to obtain corresponding sequence features; S3, mapping the three-dimensional features of the different modalities into feature sequences of uniform dimension, introducing cross-attention interactive correlation modeling, and modeling the association between the features of the different modalities at the voxel level to obtain a fused multi-modal feature representation; and S4, performing feature aggregation and classification analysis based on the multi-modal feature representation, and outputting a probability distribution of the target region over the different categories.
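Step S4 of claim 1 (feature aggregation followed by a probability distribution over categories) can be sketched in numpy. This is a minimal illustration, not the patent's implementation: it assumes global average pooling as the aggregation and a hypothetical linear classifier `W`, `b` followed by softmax.

```python
import numpy as np

def classify(fused, W, b):
    """S4 sketch: global average pooling over the spatial axes of a fused
    feature map of shape (C, D, H, W), then a linear layer and softmax to
    obtain a probability distribution over classes."""
    pooled = fused.mean(axis=(1, 2, 3))          # (C,) global average pooling
    logits = W @ pooled + b                      # (num_classes,)
    z = logits - logits.max()                    # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    return probs

# Toy demonstration with random data (shapes are illustrative only).
rng = np.random.default_rng(0)
fused = rng.standard_normal((64, 4, 8, 8))        # hypothetical fused feature map
W, b = rng.standard_normal((3, 64)), np.zeros(3)  # 3 illustrative classes
p = classify(fused, W, b)
```

The output `p` is a valid categorical distribution: non-negative entries summing to one.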
- 2. The medical image processing method based on bimodal magnetic resonance image feature fusion according to claim 1, wherein step S1 specifically comprises: S11, unifying the depth dimension D of the image sequence to a preset value S: when D is greater than S, center-cropping along the depth dimension, and when D is less than S, symmetrically zero-padding, so as to unify the spatial dimensions; S12, selecting the non-zero voxels in the image sequence, computing their global mean μ and standard deviation σ, and applying a standardizing transformation to each volume V: V' = (V − μ) / σ, wherein V' denotes the volume data after the standardizing transformation.
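Steps S11 and S12 above map directly onto a few lines of numpy. The following is a minimal sketch under the stated definitions (center crop / symmetric zero-pad on the depth axis, z-score over non-zero voxels only); the `eps` guard against a zero standard deviation is an added assumption.

```python
import numpy as np

def unify_depth(vol, S):
    """S11 sketch: for a volume of shape (D, H, W), center-crop the depth
    axis when D > S, symmetrically zero-pad when D < S (S is the preset
    target depth)."""
    D = vol.shape[0]
    if D > S:
        start = (D - S) // 2
        return vol[start:start + S]
    pad = S - D
    return np.pad(vol, ((pad // 2, pad - pad // 2), (0, 0), (0, 0)))

def normalize_nonzero(vol, eps=1e-8):
    """S12 sketch: z-score the volume using the global mean and standard
    deviation of its non-zero voxels only (background stays near zero)."""
    nz = vol[vol != 0]
    mu, sigma = nz.mean(), nz.std()
    return (vol - mu) / (sigma + eps)
```

After `normalize_nonzero`, the non-zero voxels have approximately zero mean and unit standard deviation, which is the standardization V' = (V − μ)/σ of step S12.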
- 3. The medical image processing method based on bimodal magnetic resonance image feature fusion according to claim 2, wherein the feature extraction network in step S2 is a three-dimensional convolutional neural network built on an R3D-18 backbone, its input layer and channel configuration are structurally adapted to single-channel medical image volume data, its network parameters are initialized with pre-trained model weights, and during forward inference it outputs the feature map of a preset intermediate layer as the sequence features of the corresponding modality.
- 4. The medical image processing method based on bimodal magnetic resonance image feature fusion according to claim 3, wherein step S3 specifically comprises: S31, aligning the sequence features of the first modality with those of the second modality in spatial resolution so that they have the same size in the spatial dimensions; S32, flattening the aligned first-modality and second-modality sequence features along the spatial dimensions to form feature sequences; S33, feeding the feature sequences into a bidirectional cross-attention unit and computing the enhanced first-modality and second-modality feature sequences respectively; S34, reconstructing the enhanced first-modality and second-modality feature sequences into their corresponding spatial feature forms and concatenating them along the channel dimension to obtain a concatenated feature; S35, applying channel compression to the concatenated feature through a 1×1×1 three-dimensional convolution to obtain the fused bimodal feature vector.
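Steps S31–S35 can be sketched in numpy as follows. This is a deliberately simplified single-head version with several assumptions not stated in the claim: no learned Q/K/V projections, a residual connection around each attention direction, and the 1×1×1 convolution realized as a matrix multiply over the channel dimension (which is exactly the computation a 1×1×1 convolution performs per voxel). The weight matrix `w_fuse` is a hypothetical learned parameter.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_seq, kv_seq, scale):
    """Single-head cross attention: tokens of one modality (queries) attend
    to tokens of the other modality (keys/values)."""
    attn = softmax(q_seq @ kv_seq.T * scale, axis=-1)   # (Nq, Nkv) weights
    return attn @ kv_seq                                # attended values

def bimodal_fusion(feat_a, feat_b, w_fuse):
    """S31-S35 sketch: flatten two spatially aligned (C, D, H, W) feature
    maps into (N, C) token sequences (S32), run bidirectional cross
    attention (S33), reshape and concatenate on the channel axis (S34),
    then compress 2C -> C channels, the 1x1x1-conv step (S35)."""
    C, D, H, W = feat_a.shape
    a = feat_a.reshape(C, -1).T                 # (N, C) voxel tokens, modality A
    b = feat_b.reshape(C, -1).T                 # (N, C) voxel tokens, modality B
    scale = 1.0 / np.sqrt(C)
    a_enh = a + cross_attention(a, b, scale)    # A attends to B (residual)
    b_enh = b + cross_attention(b, a, scale)    # B attends to A (residual)
    cat = np.concatenate([a_enh, b_enh], axis=1)  # (N, 2C) channel concat
    fused = cat @ w_fuse                          # (N, C) channel compression
    return fused.T.reshape(C, D, H, W)
```

A usage example: with two aligned `(8, 2, 4, 4)` feature maps and `w_fuse` of shape `(16, 8)`, the output is again an `(8, 2, 4, 4)` fused feature map.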
- 5. The medical image processing method based on bimodal magnetic resonance image feature fusion according to claim 4, wherein the three-dimensional convolutional neural network is initialized with pre-trained model weights during model construction, and data augmentation is applied to the standardized three-dimensional volume data during model training so as to expand the training sample set, the data augmentation comprising spatial-transformation augmentation, including random flipping and random rotation, and intensity augmentation, including affine transformation and noise perturbation.
- 6. The medical image processing method based on bimodal magnetic resonance image feature fusion according to claim 5, wherein during model training the model is optimized with a loss function targeting the class-imbalance problem, the loss function comprising a term that strengthens learning of minority-class samples and a term that adjusts the discrimination boundaries between classes. The mixed loss function is defined as: L = λ1·L_FL + λ2·L_LDAM, wherein L_FL is the Focal Loss, L_LDAM is the LDAM Loss, and λ1 and λ2 are preset weight coefficients. The Focal Loss is defined as: L_FL = −α_y·(1 − p_y)^γ·log(p_y), wherein α_y is the weight of class y, γ is the focusing parameter, and p_y is the model's predicted probability for the true class y. The LDAM Loss processes the logits vector z by introducing a class-adaptive margin Δ_y: z_y ← z_y − Δ_y, wherein Δ_y is adjusted according to the number of class samples: Δ_y = C / n_y^(1/4), where n_y is the number of samples of class y in the training set, C is a hyperparameter controlling the overall margin range, and j denotes the index of any class other than y. The LDAM Loss is finally defined as a temperature-scaled cross-entropy loss: L_LDAM = −w_y·log( e^{s(z_y − Δ_y)} / ( e^{s(z_y − Δ_y)} + Σ_{j≠y} e^{s·z_j} ) ), wherein w_y is the class weight, the outer form is the cross-entropy loss, and s is the temperature scaling factor.
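The mixed loss of claim 6 can be sketched for a single sample in numpy. This is an illustration under the definitions above, not the patent's training code; the concrete values of `C`, `s`, `λ1`, `λ2` shown are assumptions, and the class weight w_y of the LDAM term is taken as 1 for brevity.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def focal_loss(logits, y, alpha, gamma):
    """Focal Loss: -alpha_y * (1 - p_y)**gamma * log(p_y), where p_y is the
    softmax probability of the true class y."""
    p_y = softmax(logits)[y]
    return -alpha[y] * (1.0 - p_y) ** gamma * np.log(p_y)

def ldam_loss(logits, y, n_per_class, C=0.5, s=30.0):
    """LDAM Loss: subtract a class-adaptive margin Delta_y = C / n_y**(1/4)
    from the true-class logit, then apply temperature-scaled (factor s)
    cross entropy. Minority classes (small n_y) get a larger margin."""
    delta = C / np.asarray(n_per_class, dtype=float) ** 0.25
    z = np.array(logits, dtype=float)
    z[y] -= delta[y]                      # z_y <- z_y - Delta_y
    return -np.log(softmax(s * z)[y])     # cross entropy on scaled logits

def mixed_loss(logits, y, alpha, gamma, n_per_class, lam1=0.5, lam2=0.5):
    """Mixed loss L = lam1 * L_FL + lam2 * L_LDAM with preset weights."""
    return (lam1 * focal_loss(logits, y, alpha, gamma)
            + lam2 * ldam_loss(logits, y, n_per_class))
```

Note the intended behavior of the margin: with sample counts such as `[100, 10, 5]`, the rarest class receives the largest margin Δ, pushing its decision boundary further from the rare samples.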
- 7. A storage device, wherein program instructions and data are stored in the storage device, the program instructions, when executed by a processor, implementing the medical image processing method based on bimodal magnetic resonance image feature fusion according to any one of claims 1 to 6.
- 8. The medical image processing device based on the bimodal magnetic resonance image feature fusion is characterized by comprising a processor and a storage device, wherein program instructions and data are stored in the storage device, and the processor is used for loading and executing the program instructions so as to realize the medical image processing method based on the bimodal magnetic resonance image feature fusion according to any one of claims 1-6.
Description
Medical image processing method based on bimodal magnetic resonance image feature fusion

Technical Field

The invention belongs to the technical field of computer vision and medical image processing, and particularly relates to medical image processing based on bimodal magnetic resonance image feature fusion.

Background

With the development of medical imaging technology, Magnetic Resonance Imaging (MRI) is widely used in clinical image analysis owing to its high resolution on soft tissues. For some hereditary or development-related diseases, automatic segmentation, classification and phenotype quantification of a target area based on MRI images improve reading efficiency and provide a reference for subsequent clinical analysis. Taking images related to Klinefelter syndrome as an example, their image characteristics often show large individual differences and subtle features, and it is difficult to fully characterize the structure and function of a target area from a single imaging sequence alone; in actual acquisition, different sequences/modalities (such as T2-weighted imaging and diffusion-weighted imaging) are therefore usually combined for comprehensive observation. However, bimodal/multi-sequence MRI data differ in spatial resolution, imaging scale, signal-intensity distribution, etc., making direct alignment of cross-modal features difficult and posing a challenge for joint modeling of three-dimensional volume data.
Existing medical image processing methods mainly realize multi-modal fusion through strategies such as feature concatenation, weighted summation or late decision fusion. Information is often superimposed only at the channel level, without an explicit description of the fine-grained spatial correspondence and correlation between modalities; moreover, some methods use two-dimensional slices as the processing unit and can hardly exploit the spatial context of three-dimensional volume data, which limits the discriminability and stability of the fused features. In addition, real medical image samples often suffer from limited quantity, imbalanced class distribution and acquisition noise, so models easily exhibit insufficient generalization or reduced sensitivity to minority-class samples during training. Therefore, a medical image processing method for bimodal three-dimensional MRI data is needed that, while guaranteeing cross-modal spatial consistency, performs effective interactive correlation modeling between the features of different modalities and generates a more discriminative fused representation, so as to improve the robustness and accuracy of three-dimensional medical image analysis tasks.

Disclosure of Invention

The invention aims to solve the problem that existing multi-sequence magnetic resonance image fusion classification methods can hardly fully model the complex associations between different sequences in medical image processing, which limits target detection or classification accuracy.
The invention is realized by the following technical scheme. A medical image processing method based on bimodal magnetic resonance image feature fusion comprises the following steps: S1, acquiring image data of at least two magnetic resonance imaging sequences of a target area as inputs of two modalities, and preprocessing the image data to obtain standardized three-dimensional data; S2, extracting features from the three-dimensional data corresponding to the different magnetic resonance imaging sequences respectively to obtain corresponding sequence features; S3, mapping the three-dimensional features of the different modalities into feature sequences of uniform dimension, introducing cross-attention interactive correlation modeling, and modeling the association between the features of the different modalities at the voxel level to obtain a fused multi-modal feature representation; and S4, performing feature aggregation and classification analysis based on the multi-modal feature representation, and outputting a probability distribution of the target region over the different categories. A storage device stores instructions and data for implementing the medical image processing method based on bimodal magnetic resonance image feature fusion. A medical image processing device based on bimodal magnetic resonance image feature fusion comprises a processor and a storage device, the processor loading and executing the instructions and data in the storage device to implement the method. The invention has the following beneficial effects: A multi-modal depth learning framework is presented that directly utilizes complete DWI an