CN-115984257-B - Multi-mode medical image fusion method based on multi-scale transformer
Abstract
The invention discloses a multi-modal medical image fusion method based on a multi-scale transformer, belonging to the technical field of medical image fusion. The invention provides a novel, efficient fusion model: a multi-scale transformer module is introduced into the feature-extraction network so that it can effectively extract multi-scale depth features and retain more meaningful information for the fusion task; an adaptive receptive field and patch size are adopted during network training; the quality of the generated image is constrained by an objective function based on structural-similarity optimization; and by combining convolution with transformer computation, better visual effects and quantitative results are obtained for medical image fusion.
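The abstract's key mechanism, an adaptive receptive field and patch size followed by per-patch self-attention, can be sketched as follows. This is an illustrative assumption, not the patented computation: the size-halving rule in `adaptive_sizes` and the single-head attention in `patch_attention` are stand-ins for the block formulas, which are not reproduced in the text.

```python
import numpy as np

def adaptive_sizes(h, w):
    """Pick a receptive-field (kernel) size and a patch size from the input
    spatial dims. The patent only states that both adapt to the input size;
    this particular rule is an illustrative assumption."""
    k = 7 if min(h, w) >= 128 else (5 if min(h, w) >= 64 else 3)
    p = max(2, min(h, w) // 16)   # patch edge length (assumed heuristic)
    return k, p

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_attention(feat, p):
    """Scaled dot-product self-attention over non-overlapping p x p patches,
    one token per patch (a minimal stand-in for the transformer step)."""
    c, h, w = feat.shape
    n = (h // p) * (w // p)
    patches = (feat[:, :h - h % p, :w - w % p]
               .reshape(c, h // p, p, w // p, p)
               .transpose(1, 3, 0, 2, 4)
               .reshape(n, -1))                      # (n, c*p*p) tokens
    attn = softmax(patches @ patches.T / np.sqrt(patches.shape[1]))
    return attn @ patches                            # attended tokens
```

For a 32x32 input the sketch picks a 3x3 kernel and 2x2 patches, i.e. 256 tokens of length `c*4`.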
Inventors
- FANG XIANJIN
- CHENG YING
- YANG GAOMING
- ZHANG HAIYONG
- ZHAO WANWAN
- HUA KAIWEN
- LI XIANG
- XUE MINGJUN
Assignees
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center (Anhui Provincial Artificial Intelligence Laboratory) (合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室))
- Anhui University of Science and Technology (安徽理工大学)
Dates
- Publication Date: 20260508
- Application Date: 20230221
Claims (8)
- 1. A multi-modal medical image fusion method based on a multi-scale transformer, characterized by comprising the following steps: Step S1, slicing a data set of brain glioma medical images and removing regions without a lesion; Step S2, preprocessing the brain glioma slice data; Step S3, constructing a multi-scale transformer module; Step S4, constructing a fusion network mechanism to obtain a multi-scale transformer fusion network, and training it with the data in the preprocessed training set; Step S5, constructing a loss function based on a structural similarity metric to constrain the direction of image generation; Step S6, after the processing of steps S1-S5, obtaining a medical image fusion model based on the multi-scale transformer fusion network, and inputting the brain glioma medical images to be fused into the model for fusion processing to obtain the fusion result. In step S3, the multi-scale transformer module comprises 4 sequentially connected blocks, each block implemented as a hierarchical connection of convolutions with different receptive-field sizes and transformers with different patch sizes; the receptive-field and patch sizes of each block are determined by the block input and are adaptively adjusted to the input size, so that the input of each block first passes through a convolution of the corresponding receptive-field size, the resulting features are divided into patches, and an attention mechanism is applied to each patch; after the 4 blocks are computed, the feature map is extracted and passed to the next step, the calculation of each block combining the convolution, the weights and biases generated during the iteration, a softmax activation function, and a maximum-value operation.
- 2. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 1, wherein in step S1 the data set of brain glioma medical images comprises 4 image sequences, namely a FLAIR sequence, a T1-weighted sequence, a contrast-enhanced T1-weighted sequence and a T2-weighted sequence; the 4 sequences are processed synchronously, the data set is randomly shuffled, 30% of it is extracted as the verification set and the remaining 70% serves as the training set, and the 70% training portion is further randomly divided, according to a set proportion, into a training set X_train and a verification set X_test.
- 3. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 2, wherein in step S2 the preprocessing applies a sampling function to the slice data and normalizes the sampled data, the normalized result serving as the input to the multi-scale transformer fusion network constructed in step S4.
- 4. The multi-modal medical image fusion method based on the multi-scale transformer, characterized in that the multi-scale transformer fusion network comprises a feature-extraction network, a fusion module and a feature-reconstruction network; the feature-extraction network comprises 3 convolution modules and 3 multi-scale transformer modules connected alternately in sequence, each convolution module preceding its transformer module; the feature-reconstruction network comprises 4 sequentially connected up-sampling modules; and the last multi-scale transformer module is connected to the first up-sampling module through the fusion module.
- 5. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 4, wherein each time the input image is processed by a convolution module it is fed into a multi-scale transformer module; each time it passes through a block of the multi-scale transformer module the feature map changes and the number of channels is reduced to 1/2 of the original; the change is expressed as a convolution with a stride of 1 and a kernel size of 3 applied to the input of the multi-scale hierarchical transformer module.
- 6. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 5, wherein the fusion module concatenates two tensors, where c, p, N, H and W denote the channel number, patch size, patch number, height and width of the source image, respectively.
- 7. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 6, wherein in step S5 the structural similarity metric SSIM and its loss function take the standard form SSIM(X, F) = (2·μX·μF + C1)·(2·σXF + C2) / ((μX² + μF² + C1)·(σX² + σF² + C2)) and L_SSIM = 1 − SSIM(X, F), where μ denotes the computed mean, σ² the computed variance, σXF the covariance, C1 and C2 are small stabilizing constants, and F is the result generated by the network.
- 8. The multi-modal medical image fusion method based on the multi-scale transformer according to claim 2, wherein in step S6, after the medical image fusion model is obtained, the verification set X_test is used to test the medical image fusion model.
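Claim 7's structural-similarity objective can be sketched as follows, using the standard global SSIM definition; averaging the loss over the source modalities in `ssim_loss` is an assumption, since the claim's own formulas are not reproduced in the text.

```python
import numpy as np

def ssim(x, y, c1=1e-4, c2=9e-4):
    """Global SSIM between two images in [0, 1]. C1 and C2 are the usual
    small stabilizing constants ((0.01*L)^2 and (0.03*L)^2 for range L=1)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def ssim_loss(fused, sources):
    """Constrain the fused image toward each source modality; averaging
    over the sources is an assumed reading of the patented objective."""
    return float(np.mean([1.0 - ssim(fused, s) for s in sources]))
```

An image compared with itself gives SSIM = 1 and hence zero loss, which is the "generation direction" the objective pushes toward.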
Description
Multi-mode medical image fusion method based on multi-scale transformer

Technical Field

The invention relates to the technical field of medical image fusion, and in particular to a multi-modal medical image fusion method based on a multi-scale transformer.

Background

Image fusion combines the complementary information of several source images into a single new image that gathers the information of all of them. It is used in many fields: fusing infrared and visible-light images, for example, can be applied in military contexts to improve a system's detection and reconnaissance capability. In the medical imaging field, image fusion merges the details of medical images of different modalities; fusing an MRI image with a SPECT image, for instance, yields an image that simultaneously preserves the functional metabolic information of the SPECT image and the structural soft-tissue information of the MRI. Brain tumor segmentation in multi-modal Magnetic Resonance Imaging (MRI) scans is the basis for acquiring key quantitative indices such as tumor two-dimensional diameter and tumor volume, and has important clinical significance for disease diagnosis and treatment evaluation. Since gliomas are the most common primary malignancy of the brain, most brain tumor segmentation studies focus on gliomas. The typical goal of glioma segmentation is to localize multiple types of pathological regions in a multi-modal MRI volume, including edema (ED), necrotic and non-enhancing tumor (NCR/NET) and enhancing tumor (ET), using sequences that typically include T1-weighted (T1), contrast-enhanced T1-weighted (T1c) and T2-weighted (T2) images. Fusing MRI images of different sequences, each carrying different modality features, facilitates the physician's diagnosis and other subsequent processing.
Medical image fusion based on deep learning achieves good fusion results, but several problems remain: (1) fusion frameworks are built for specific tasks and generalize poorly; a framework for fusing PET and SPECT, for example, must handle a large resolution difference between the two images, places requirements on the resolution of the input picture, and is difficult to transfer to other fusion tasks; (2) existing deep-learning-based multi-modal medical image fusion methods fuse images of two modalities and do not support fusing more than two; (3) medical image fusion is still little applied in practice, and fusing only two images contributes little to medical diagnosis and cannot be used to augment a data set; and (4) existing methods are generally CNN-based, which capture local information well but struggle to capture global information. To achieve accurate fusion of the characteristics and commonalities of the lesion regions of brain glioma medical images, a multi-modal medical image fusion method based on a multi-scale transformer is provided.

Disclosure of Invention

The invention aims to achieve accurate fusion of the lesion-region characteristics and commonalities of brain glioma medical images, overcoming the information-fusion deficiencies of existing multi-modal medical image fusion processing, and provides a multi-modal medical image fusion method based on a multi-scale transformer.
The invention solves the technical problem through the following technical scheme, comprising the following steps: Step S1, slicing a data set of brain glioma medical images and removing regions without a lesion; Step S2, preprocessing the brain glioma slice data; Step S3, constructing a multi-scale transformer module; Step S4, constructing a fusion network mechanism to obtain a multi-scale transformer fusion network, and training it with the data in the preprocessed training set; Step S5, constructing a loss function based on a structural similarity metric to constrain the direction of image generation; and Step S6, after the processing of steps S1-S5, obtaining a medical image fusion model based on the multi-scale transformer fusion network, and inputting the brain glioma medical images to be fused into the model for fusion processing to obtain the fusion result. In step S1, the data set of brain glioma medical images comprises 4 image sequences, namely a FLAIR sequence, a T1-weighted sequence, a contrast-enhanced T1-weighted sequence and a T2-weighted sequence; the 4 sequences are processed synchronously, the data set is randomly shuffled, 30% of it is extracted as the verification set and the remaining 70% serves as the training set, and the 70% training portion is further randomly divided, according to a set proportion, into a training set X_train and a verification set X_test.
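The shuffle/split and normalization of steps S1-S2 might look like the sketch below. The exact sampling function and normalization are not given in the text, so min-max scaling to [0, 1], the fixed 30% hold-out fraction, and the helper name `preprocess_and_split` are assumptions.

```python
import numpy as np

def preprocess_and_split(slices, val_frac=0.3, seed=0):
    """Shuffle glioma slices, hold out val_frac for validation, and min-max
    normalize each slice to [0, 1]. The patent's sampling function is not
    specified, so this sketch normalizes the slices as given."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(slices))
    n_val = int(len(slices) * val_frac)
    norm = [(s - s.min()) / (s.max() - s.min() + 1e-8) for s in slices]
    val = [norm[i] for i in idx[:n_val]]
    train = [norm[i] for i in idx[n_val:]]
    return train, val
```

In the patent's scheme the four MRI sequences would be shuffled with the same permutation so that corresponding slices stay aligned across modalities.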