CN-122025078-A - Multi-modal brain disease diagnosis method based on double contrast learning

Abstract

The invention discloses a multi-modal brain disease diagnosis method based on dual contrast learning. The method comprises the following steps: obtaining multi-modal brain image data, comprising functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI), together with corresponding diagnostic text descriptions, and constructing a multi-modal brain disease data set; establishing a multi-modal brain disease diagnosis model framework based on dual contrast learning, the framework comprising a multi-modal contrast learning stage and a cross-modal recovery learning stage; adopting a two-stage training strategy in which multi-modal contrast learning training is first carried out on complete samples and cross-modal recovery learning training is then carried out on incomplete samples, the training process being optimized through joint optimization of a contrast loss, a reconstruction loss and a classification loss; and finally, for a sample to be diagnosed that has a missing modality, performing modality recovery and feature fusion with the trained model and outputting a brain disease diagnosis result. The invention significantly improves the accuracy and robustness of brain disease diagnosis, enhances adaptability to multi-modal data under different missing rates, and solves technical problems of the prior art such as degraded diagnostic performance caused by missing modalities, recovered features lacking semantic consistency, and insufficient model generalization capability at high missing rates.

Inventors

CUI JINRONG
YE WEIHAO
LIU FUBANG
XU JINHUI

Assignees

South China Agricultural University (华南农业大学)

Dates

Publication Date: 20260512
Application Date: 20260211

Claims (7)

1. A multi-modal brain disease diagnosis method based on dual contrast learning, characterized by comprising the following steps:
S1, acquiring multi-modal brain image data comprising functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) together with corresponding diagnostic text descriptions, preprocessing the fMRI data and the DTI data, and constructing a multi-modal brain disease data set comprising a complete sample set, an fMRI-missing sample set and a DTI-missing sample set;
S2, constructing a multi-modal brain disease diagnosis model based on dual contrast learning, the model comprising a multi-modal contrast learning stage and a cross-modal recovery learning stage, wherein the multi-modal contrast learning stage maps the different modalities to a shared embedding space through a text encoder, an fMRI encoder and a DTI encoder and constructs similarity matrices to realize cross-modal semantic alignment;
S3, for each sample in the complete sample set, obtaining a text embedding, an fMRI embedding and a DTI embedding through the three encoders respectively, constructing a text-image similarity matrix and an image-image similarity matrix, and constraining the consistency of the two similarity matrices through a contrast learning loss function to realize cross-modal semantic alignment;
S4, for an fMRI-missing sample, recovering the missing fMRI embedding from the DTI embedding through an fMRI generator, and for a DTI-missing sample, recovering the missing DTI embedding from the fMRI embedding through a DTI generator, the generators being constrained by a reconstruction loss to ensure consistency between the recovered modality features and the real modality features;
S5, passing the recovered or original fMRI embedding and DTI embedding through a fusion module to obtain a task-aware fusion embedding, inputting the fusion embedding into a classification layer to obtain a predicted probability distribution, and optimizing classification performance with a classification loss;
S6, inputting the preprocessed multi-modal brain disease data set into the model for training with a two-stage training strategy, wherein the first stage uses the complete sample set for multi-modal contrast learning training and optimizes the contrast learning loss;
S7, inputting preprocessed multi-modal brain image data to be diagnosed into the trained model, wherein, for a sample with a missing modality, the missing modality is recovered through the corresponding modality generator, embedded representations of all modalities are obtained through the encoders, and a brain disease diagnosis result is output through the fusion module and the classification layer.
2. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that, in the fMRI data preprocessing of S1, motion correction and detrending are carried out with the SPM8 tool through the DPARSF toolbox, the fMRI data are parcellated into 90 regions of interest (ROIs) based on the AAL atlas, and time series are extracted for the regions; and the DTI data preprocessing is carried out with the PANDA tool, in which distortion correction is performed with the FSL tool, fiber bundle images are generated with TrackVis, anatomical regions are identified based on the AAL atlas, and structural connectivity between the regions is calculated.
3. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that the three modality-specific encoders in S2 each adopt a two-layer multi-layer perceptron (MLP) structure, the dimension of the text encoder is 1024, the dimension of the fMRI image encoder and of the DTI image encoder is 2048, and the three encoders map their inputs to a shared embedding space of dimension 512.
4. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that the matrix elements of the text-image similarity matrix and of the image-image similarity matrix in S3 are calculated by cosine similarity, and the consistency of the two similarity matrices is constrained by a contrast learning loss function defined over the complete samples, the number of complete samples being a term of that function.
5. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that the fMRI generator and the DTI generator in S4 each adopt a two-layer multi-layer perceptron (MLP) structure with a hidden-layer dimension of 2048 and input and output dimensions of 512, and the reconstruction loss is a norm-based loss defined in terms of the number of fMRI-missing samples, the number of DTI-missing samples and the number of complete samples.
6. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that the fusion module in S5 adopts a two-layer multi-layer perceptron (MLP) structure, its input is the concatenation of the fMRI embedding and the DTI embedding, its output is the fusion embedding, and the predicted probability distribution is obtained from the fusion embedding through a function.
7. The multi-modal brain disease diagnosis method based on dual contrast learning according to claim 1, characterized in that the first-stage training in S6 uses the complete sample set to optimize the contrast learning loss, and the second-stage training uses the incomplete sample sets to optimize a total loss function composed of the contrast learning loss, the reconstruction loss and the classification loss, wherein two coefficients of the total loss function are hyperparameters.
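
As a non-limiting illustration of the encoders and similarity matrices recited in claims 3 and 4, the following Python sketch maps three modalities into a 512-dimensional shared space and builds two cosine-similarity matrices. Several points are assumptions rather than part of the claims: the input sizes (300 and 4005), the reading of 1024 and 2048 as hidden-layer widths, the ReLU activation, the pairing of text with fMRI for the text-image matrix, and the mean-squared-difference consistency term, which merely stands in for the claimed loss function whose formula is not reproduced in this text. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 512  # shared embedding dimension stated in claim 3


def init_mlp(in_dim, hidden_dim, out_dim):
    """Return randomly initialised weights for a two-layer MLP."""
    return {
        "W1": rng.normal(0.0, 0.02, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.02, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Two-layer MLP forward pass; the ReLU activation is an assumption."""
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return hidden @ params["W2"] + params["b2"]


def cosine_similarity_matrix(a, b, eps=1e-12):
    """Entry (i, j) is the cosine similarity of row i of a and row j of b."""
    a_unit = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_unit = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_unit @ b_unit.T


def consistency_loss(sim_text_image, sim_image_image):
    """ASSUMPTION: mean squared difference, standing in for the claimed loss."""
    return float(np.mean((sim_text_image - sim_image_image) ** 2))


# Hypothetical sizes: 4 complete samples, 300-d text features, and 4005-d
# image features (e.g. the upper triangle of a 90 x 90 connectivity matrix).
N, TEXT_IN, IMAGE_IN = 4, 300, 4005
text_encoder = init_mlp(TEXT_IN, 1024, EMBED_DIM)   # 1024 read as hidden width
fmri_encoder = init_mlp(IMAGE_IN, 2048, EMBED_DIM)  # 2048 read as hidden width
dti_encoder = init_mlp(IMAGE_IN, 2048, EMBED_DIM)

text_emb = mlp(text_encoder, rng.normal(size=(N, TEXT_IN)))
fmri_emb = mlp(fmri_encoder, rng.normal(size=(N, IMAGE_IN)))
dti_emb = mlp(dti_encoder, rng.normal(size=(N, IMAGE_IN)))

# Text-image matrix (text vs. fMRI is an assumption) and image-image matrix.
sim_text_image = cosine_similarity_matrix(text_emb, fmri_emb)
sim_image_image = cosine_similarity_matrix(fmri_emb, dti_emb)
loss = consistency_loss(sim_text_image, sim_image_image)
```

Both matrices have one row and one column per complete sample, so a consistency term can compare them element by element; any training loop and optimizer are outside the scope of this sketch.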

Description

Multi-modal brain disease diagnosis method based on dual contrast learning

Technical Field

The invention belongs to the technical field of intelligent diagnosis from medical images, and particularly relates to a multi-modal brain disease diagnosis method based on dual contrast learning.

Background

Brain disease diagnosis is a core task of neuroimaging analysis, and its key lies in the joint use of multi-modal image features and semantic information. Traditional brain disease diagnosis methods rely mainly on single-modality image analysis or manual feature extraction and suffer from limited diagnostic precision and sensitivity to missing data. Methods based on multi-modal fusion can improve diagnostic accuracy, but in scenarios with missing modalities they are limited by insufficient feature expression and semantic alignment capability. In the prior art, methods based on low-rank representation can recover missing modalities through matrix completion, but they have difficulty capturing cross-modal semantic correspondence, and noise interference leaves the recovered features lacking discriminative power. Traditional machine learning models (such as support vector machines) can improve precision through manually designed multi-modal features, but their generalization capability drops markedly at high missing rates.

In recent years, multi-modal learning methods represented by deep neural networks have made remarkable progress in brain disease diagnosis, but they still have the following technical limitations. First, conventional modality recovery methods focus mainly on pixel-level or feature-level reconstruction and model semantic alignment between samples and between modalities insufficiently, so that consistency between the recovered features and the original modality in semantic space is difficult to ensure and the performance of downstream classification tasks is limited. Second, conventional methods often optimize modality recovery and classification separately and lack a task-aware feature learning mechanism, so that the recovered modality features are numerically close to the true values but only weakly related to the diagnosis task in semantic terms; in particular, diagnostic accuracy drops markedly under extreme missing conditions (such as a 90% missing rate). Aiming at the limitations of traditional methods and deep learning methods in incomplete multi-modal brain disease diagnosis, the invention provides a multi-modal brain disease diagnosis framework based on dual contrast learning.

Inspired by CLIP, the invention achieves its technical advance through a two-stage architecture. In the multi-modal contrast learning stage, a text-image similarity matrix and an image-image similarity matrix are constructed, the three modalities of text, fMRI and DTI are aligned to a shared embedding space, and the cross-modal semantic correspondence is captured to provide semantic guidance for subsequent modality recovery. In the cross-modal recovery learning stage, a missing modality is recovered through a modality generator, and joint optimization of the reconstruction loss and the classification loss ensures that the recovered modality is both numerically accurate and relevant to the diagnosis task. Based on the dual contrast learning strategy, semantic alignment is realized at both the modality level and the sample level, so that the accuracy and robustness of brain disease diagnosis are significantly improved. On the epilepsy and ADNI data sets, excellent diagnostic performance is maintained even at a 90% missing rate, which addresses the core technical problems of incomplete multi-modal data: degraded diagnostic performance caused by missing modalities, recovered features lacking semantic consistency, and insufficient modeling capability at high missing rates.

Disclosure of Invention

Aiming at the technical problems of existing incomplete multi-modal brain disease diagnosis methods, such as markedly degraded diagnostic performance caused by missing modalities, recovered features lacking semantic consistency, and insufficient model generalization capability at high missing rates, the invention provides a multi-modal brain disease diagnosis method based on dual contrast learning.
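
As a non-limiting illustration of the cross-modal recovery learning stage described above, the following Python sketch recovers a missing embedding with a generator, fuses the two embeddings, and produces class probabilities. The 512 and 2048 dimensions follow the claims. Everything else is an assumption made for illustration: the ReLU activation, the fusion hidden width, the number of classes, the softmax function, the squared L2 form of the reconstruction loss, and the weighted-sum form of the total loss. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 512    # shared embedding dimension stated in the claims
GEN_HIDDEN = 2048  # generator hidden-layer dimension stated in the claims
FUSE_HIDDEN = 256  # hypothetical fusion hidden width
NUM_CLASSES = 2    # hypothetical: patient vs. control


def init_mlp(in_dim, hidden_dim, out_dim):
    """Return randomly initialised weights for a two-layer MLP."""
    return {
        "W1": rng.normal(0.0, 0.02, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.02, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Two-layer MLP forward pass; the ReLU activation is an assumption."""
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return hidden @ params["W2"] + params["b2"]


def softmax(logits):
    """Row-wise softmax (assumed choice of output function)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def reconstruction_loss(recovered, real):
    """ASSUMPTION: mean squared L2 distance between recovered and real rows."""
    return float(np.mean(np.sum((recovered - real) ** 2, axis=1)))


def total_loss(l_con, l_rec, l_cls, lam1, lam2):
    """ASSUMPTION: weighted sum; the source does not give the exact form."""
    return l_cls + lam1 * l_con + lam2 * l_rec


def diagnose(fmri_emb, dti_emb, fmri_gen, dti_gen, fusion, classifier):
    """Recover a missing embedding (passed as None), fuse, and classify."""
    if fmri_emb is None and dti_emb is None:
        raise ValueError("at least one modality embedding is required")
    if fmri_emb is None:
        fmri_emb = mlp(fmri_gen, dti_emb)   # fMRI recovered from DTI
    if dti_emb is None:
        dti_emb = mlp(dti_gen, fmri_emb)    # DTI recovered from fMRI
    fused = mlp(fusion, np.concatenate([fmri_emb, dti_emb], axis=1))
    return softmax(fused @ classifier["W"] + classifier["b"])


fmri_gen = init_mlp(EMBED_DIM, GEN_HIDDEN, EMBED_DIM)
dti_gen = init_mlp(EMBED_DIM, GEN_HIDDEN, EMBED_DIM)
fusion = init_mlp(2 * EMBED_DIM, FUSE_HIDDEN, EMBED_DIM)
classifier = {"W": rng.normal(0.0, 0.02, (EMBED_DIM, NUM_CLASSES)),
              "b": np.zeros(NUM_CLASSES)}

# Random stand-ins for encoder outputs of three samples.
fmri = rng.normal(size=(3, EMBED_DIM))
dti = rng.normal(size=(3, EMBED_DIM))

probs_complete = diagnose(fmri, dti, fmri_gen, dti_gen, fusion, classifier)
probs_fmri_missing = diagnose(None, dti, fmri_gen, dti_gen, fusion, classifier)
rec = reconstruction_loss(mlp(fmri_gen, dti), fmri)
```

Passing `None` for one embedding triggers the corresponding generator, so complete and incomplete samples share the same fusion and classification path. The weights here are random; the sketch only shows the data flow.
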
By combining the multi-modal contrast learning and cross-modal recovery learning mechanisms with the dual contrast learning strategy and task-aware feature fusion, the accuracy and robustness of brain disease diagnosis are significantly improved, and excellent diagnostic performance is maintained even under extreme missing conditions (such as a 90% missing rate). To achieve the aim of the invention, the following technical scheme is adopted: a multi-modal brain disease diagnosis method based on dual contrast learning is constructed,