CN-121687526-B - Brain disease risk prediction method, device, equipment and storage medium based on self-adaptive subgraph contrast distillation
Abstract
The invention discloses a brain disease risk prediction method, device, equipment and storage medium based on self-adaptive subgraph contrast distillation. The method comprises: collecting and preprocessing multimodal brain imaging data; dividing the data into a plurality of regions of interest and extracting their blood oxygen level dependent (BOLD) signal time series; analyzing connection weights to generate a weighted brain network graph; inputting the weighted brain network graph into a target graph neural network, which outputs key lesion subgraph features; constructing a multi-objective loss term based on the key lesion subgraph features and training the model on it; and predicting brain disease risk with the trained model. The method mines deep pathological features from limited and imbalanced neuroimaging data and improves the accuracy of brain disease risk prediction.
Inventors
- ZENG YANGYAN
- Deng Gaoyi
- ZENG CHUNCHAO
- LIANG WEI
Assignees
- Hunan University of Technology and Business (湖南工商大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-12
Claims (6)
- 1. A brain disease risk prediction method based on adaptive subgraph contrast distillation, characterized in that the method comprises:
  collecting and preprocessing multimodal brain imaging data, dividing the preprocessed brain imaging data into a plurality of regions of interest, and extracting a blood oxygen level dependent (BOLD) signal time series for each region of interest, wherein the preprocessing comprises time correction, head motion correction and spatial normalization;
  performing connection weight analysis on the regions of interest based on the BOLD signal time series to generate a weighted brain network graph;
  constructing a target graph neural network model, inputting the weighted brain network graph into the target graph neural network model, and outputting key lesion subgraph features; and
  constructing a multi-objective loss term based on the key lesion subgraph features, training the target graph neural network model based on the multi-objective loss term, and predicting brain disease risk through the trained target graph neural network model, wherein the multi-objective loss term comprises a classification cross-entropy loss term, a knowledge distillation loss term and a contrastive learning loss term;
  wherein performing connection weight analysis on the regions of interest based on the BOLD signal time series to generate a weighted brain network graph comprises:
  applying the Pearson correlation coefficient to the BOLD signal time series of the regions of interest to obtain original functional connection weights between the regions of interest, with reference to the following formula:
  $$r_{ij} = \frac{\sum_{t=1}^{T}\bigl(x_i(t)-\bar{x}_i\bigr)\bigl(x_j(t)-\bar{x}_j\bigr)}{\sqrt{\sum_{t=1}^{T}\bigl(x_i(t)-\bar{x}_i\bigr)^2}\,\sqrt{\sum_{t=1}^{T}\bigl(x_j(t)-\bar{x}_j\bigr)^2}}$$
  wherein $r_{ij}$ represents the original functional connection weight between region of interest $i$ and region of interest $j$, $x_i(t)$ represents the signal amplitude of the BOLD signal of region of interest $i$ at the $t$-th time point, $\bar{x}_i$ and $\bar{x}_j$ respectively represent the average signal amplitudes of region of interest $i$ and region of interest $j$ over the BOLD signal time series, and $T$ represents the total number of sampling time points;
  constructing a connection distribution feature vector for each region of interest based on the original functional connection weights between the regions of interest;
  calculating the intersection and the union between regions of interest based on the connection distribution feature vectors, with reference to the following formula:
  $$I_{ij} = \sum_{k \neq i,j} \min(r_{ik}, r_{jk}), \qquad U_{ij} = \sum_{k \neq i,j} \max(r_{ik}, r_{jk})$$
  wherein $I_{ij}$ represents the intersection between region of interest $i$ and region of interest $j$, which measures the strength of the connection pattern common to the two brain regions, $U_{ij}$ represents the union between region of interest $i$ and region of interest $j$, which measures the strength range of the total connection pattern of the two brain regions, and $r_{ik}$ and $r_{jk}$ are the connection weights of region of interest $i$ and region of interest $j$, respectively, to a third-party region of interest $k$;
  constructing a multi-semantic soft Jaccard similarity connection weight based on the intersection and the union, with reference to the following formula:
  $$w_{ij} = \frac{I_{ij}}{U_{ij} + \epsilon}$$
  wherein $w_{ij}$ represents the connection weight between region of interest $i$ and region of interest $j$, $I_{ij}$ represents the intersection between region of interest $i$ and region of interest $j$, $U_{ij}$ represents the union between region of interest $i$ and region of interest $j$, and $\epsilon$ represents a positive number that prevents the denominator from being zero; and
  constructing a reconstruction weight matrix based on the multi-semantic soft Jaccard similarity connection weights, and generating the weighted brain network graph based on the reconstruction weight matrix;
  wherein the target graph neural network model comprises a teacher model and a student model;
  the teacher model is configured to perform feature extraction on the weighted brain network graph using a multi-layer graph convolutional network to obtain global topological features of the whole brain;
  the student model is configured to calculate importance scores of brain region nodes relative to the global topological features of the whole brain based on an adaptive attention mechanism, wherein the brain region nodes are the mapping units of the regions of interest in the graph structure;
  the student model is further configured to prune the brain region nodes based on the importance scores to obtain key brain region nodes, perform a graph convolution operation on the subgraph structure formed by the key brain region nodes to obtain subgraph node features, and aggregate the subgraph node features into the key lesion subgraph features;
  wherein constructing the multi-objective loss term based on the key lesion subgraph features, training the target graph neural network model based on the multi-objective loss term, and predicting brain disease risk through the trained target graph neural network model comprises:
  performing data augmentation on the key lesion subgraph features of a sample in a sample batch output by the student model to construct a positive sample pair, and taking the other samples in the sample batch that have not undergone data augmentation as negative samples, so as to construct multi-view sample pairs;
  processing the multi-view sample pairs with a contrastive learning loss function to calculate the contrastive learning loss term;
  calculating the classification cross-entropy loss term, the knowledge distillation loss term and a feature layer distillation loss term, respectively;
  invoking a preset hyperparameter set to compute a weighted sum of the loss terms, obtaining a total loss function;
  training the target graph neural network model based on the total loss function; and
  performing brain disease risk prediction through the trained target graph neural network model, outputting a brain disease risk prediction result, and mapping the attention weights onto a brain map based on the prediction result to generate a heat map of key pathogenic brain regions;
  wherein the total loss function refers to the following formula:
  $$\mathcal{L}_{total} = \lambda_1\,\mathcal{L}_{CE}(y, \hat{y}) + \lambda_2\,\mathcal{L}_{KD} + \lambda_3\,\mathcal{L}_{feat} + \lambda_4\,\mathcal{L}_{CL}$$
  wherein $\mathcal{L}_{total}$ represents the total loss term, $\mathcal{L}_{CE}$ represents the classification cross-entropy loss term, $y$ represents the true label of the sample, $\hat{y}$ represents the predicted label output by the target graph neural network model based on the key lesion subgraph features, $\mathcal{L}_{KD}$ represents the knowledge distillation loss term, $\mathcal{L}_{feat}$ represents the feature layer distillation loss term, $\mathcal{L}_{CL}$ represents the contrastive learning loss term, and $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ respectively represent preset hyperparameters for balancing the importance of the loss terms;
  the knowledge distillation loss term is calculated with reference to the following formula:
  $$\mathcal{L}_{KD} = \tau_d^{2}\,\mathrm{KL}\bigl(p^{T}\,\|\,p^{S}\bigr) = \tau_d^{2}\sum_{c=1}^{C} p_c^{T}\log\frac{p_c^{T}}{p_c^{S}}, \qquad p_c^{T} = \frac{\exp(z_c^{T}/\tau_d)}{\sum_{j=1}^{C}\exp(z_j^{T}/\tau_d)}$$
  wherein $\tau_d$ represents the distillation temperature, $\mathrm{KL}$ represents the KL divergence, which measures the difference between two probability distributions, $C$ represents the total number of disease categories, $p^{T}$ represents the softened probability distribution of the teacher model, $p^{S}$ represents the softened probability distribution of the student model, $p_c^{T}$ represents the softened probability output of the teacher model for category $c$, $z_c^{T}$ represents the logits output of the teacher model for category $c$, $p_c^{S}$ represents the softened probability output of the student model for category $c$, and $z_j^{T}$ represents the logits output of the teacher model for category $j$;
  the feature layer distillation loss term is calculated with reference to a formula in which $\|\cdot\|_F$ represents the Frobenius norm of a matrix and $\cos(\cdot,\cdot)$ represents a cosine similarity measure function;
  the contrastive learning loss term is calculated with reference to the following formula:
  $$\mathcal{L}_{CL} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp\bigl(\mathrm{sim}(z_i, \tilde{z}_i)/\tau_c\bigr)}{\exp\bigl(\mathrm{sim}(z_i, \tilde{z}_i)/\tau_c\bigr) + \sum_{k \neq i}\exp\bigl(\mathrm{sim}(z_i, z_k)/\tau_c\bigr)}$$
  wherein $\mathcal{L}_{CL}$ represents the contrastive learning loss term, $N$ represents the sample batch size, $z_i$ represents the key lesion subgraph feature of the $i$-th sample output by the student model, $\tilde{z}_i$ represents the augmented view of the $i$-th sample, $(z_i, \tilde{z}_i)$ represents a positive sample pair, $\mathrm{sim}(\cdot,\cdot)$ represents the cosine similarity between vectors, $\tau_c$ represents the contrastive learning temperature coefficient, which adjusts the scale of the similarity scores, and $z_k$ represents a negative sample.
- 2. The brain disease risk prediction method based on adaptive subgraph contrast distillation according to claim 1, wherein the hidden states of the brain region nodes in the multi-layer graph convolutional network of the teacher model are updated with reference to the following formula:
  $$h_i^{(l+1)} = \sigma\Bigl(\sum_{j \in \mathcal{N}(i)} \frac{A_{ij}}{\sqrt{d_i d_j}}\, W^{(l)} h_j^{(l)}\Bigr)$$
  wherein $h_i^{(l+1)}$ and $h_i^{(l)} \in \mathbb{R}^{d_l}$ respectively represent the feature vectors of the $i$-th brain region node at layer $l+1$ and layer $l$, $d_l$ represents the feature dimension of layer $l$, $\mathbb{R}$ represents the set of real numbers, $\mathcal{N}(i)$ represents the set of neighbor brain region nodes directly connected to brain region node $i$ in the graph adjacency matrix, the graph adjacency matrix being constructed based on the reconstruction weight matrix of the weighted brain network graph, $A_{ij}$ represents the element in row $i$ and column $j$ of the graph adjacency matrix, $d_i$ and $d_j$ respectively represent the degrees of brain region node $i$ and brain region node $j$, $1/\sqrt{d_i d_j}$ represents a symmetric normalization term that stabilizes the learning process, $W^{(l)}$ represents the learnable parameter matrix of layer $l$, and $\sigma$ represents a nonlinear activation function;
  the teacher model is further configured to obtain the node feature vector output by each brain region node through iterative propagation in the multi-layer graph convolutional network, and to perform a mixed pooling operation on these node feature vectors through a multi-layer perceptron to obtain the global topological features of the whole brain, wherein the mixed pooling operation refers to the following formula:
  $$h_G = \mathrm{MLP}\Bigl(\frac{1}{N}\sum_{i=1}^{N} h_i^{(L)} \,\Big\|\, \max_{1 \le i \le N} h_i^{(L)}\Bigr)$$
  wherein $h_G$ represents the global topological feature of the whole brain, $\mathrm{MLP}$ represents the multi-layer perceptron, $\|$ concatenates the average-pooled and max-pooled result vectors along the feature dimension, $\max_{i} h_i^{(L)}$ represents the dimension-wise max pooling of the node feature vectors output by all brain region nodes, $\frac{1}{N}\sum_{i} h_i^{(L)}$ represents the average pooling of the node feature vectors output by all brain region nodes, $N$ represents the total number of brain region nodes, which equals the total number of regions of interest, and $L$ represents the total number of layers of the graph convolutional network.
- 3. The brain disease risk prediction method based on adaptive subgraph contrast distillation according to claim 2, wherein the student model is further configured to calculate the importance scores of the brain region nodes based on the node feature vectors output by the brain region nodes and the global topological features of the whole brain, with reference to the following formula:
  $$s_i = a^{\top}\sigma\bigl(W_1 h_i + W_2 h_G\bigr)$$
  wherein $s_i$ represents the importance score of brain region node $i$, $W_1$ and $W_2$ represent learnable weight matrices used to linearly transform the node feature vectors of the brain region nodes and the global topological features of the whole brain, respectively, $a^{\top}$ is the transpose of a learnable attention vector, $h_i$ represents the node feature vector of brain region node $i$, $h_G$ represents the global topological feature of the whole brain, and $\sigma$ represents a nonlinear activation function;
  the student model is further configured to normalize the importance scores of all brain region nodes to obtain attention weights, with reference to the following formula:
  $$\alpha_i = \frac{\exp(s_i/\tau)}{\sum_{j}\exp(s_j/\tau)}$$
  wherein $\alpha_i$ represents the attention weight of the $i$-th brain region node, $s_j$ represents the importance score of brain region node $j$, $\tau$ represents a temperature coefficient that adjusts the smoothness of the attention distribution, and $\exp$ represents the exponential function;
  the student model is further configured to prune the brain region nodes based on the attention weights to obtain key brain region nodes, construct a subgraph structure based on the key brain region nodes, perform a graph convolution operation on the subgraph structure to obtain the subgraph node features of each key brain region node, and aggregate the subgraph node features into the key lesion subgraph features, with reference to a formula in which $h_{sub}$ represents the key lesion subgraph features, $h_i^{sub}$ represents the subgraph node features of node $i$ learned by the student model on the subgraph structure, and $\mathcal{V}_{sub}$ represents the node set consisting of the key brain region nodes.
- 4. A brain disease risk prediction device based on adaptive subgraph contrast distillation, characterized in that the device is configured to implement the brain disease risk prediction method based on adaptive subgraph contrast distillation as claimed in any one of claims 1 to 3, the device comprising: a data processing module, configured to collect and preprocess multimodal brain imaging data, divide the preprocessed brain imaging data into a plurality of regions of interest, and extract a blood oxygen level dependent (BOLD) signal time series for each region of interest, wherein the preprocessing comprises time correction, head motion correction and spatial normalization; a relation analysis module, configured to perform connection weight analysis on the regions of interest based on the BOLD signal time series to generate a weighted brain network graph; a model construction module, configured to construct a target graph neural network model, input the weighted brain network graph into the target graph neural network model, and output key lesion subgraph features; and a risk prediction module, configured to construct a multi-objective loss term based on the key lesion subgraph features, train the target graph neural network model based on the multi-objective loss term, and predict brain disease risk through the trained target graph neural network model, wherein the multi-objective loss term comprises a classification cross-entropy loss term, a knowledge distillation loss term and a contrastive learning loss term.
- 5. Brain disease risk prediction equipment based on adaptive subgraph contrast distillation, comprising a memory, a processor, and a brain disease risk prediction program based on adaptive subgraph contrast distillation stored on the memory, wherein the processor is configured to run the program, and the program is configured to implement the brain disease risk prediction method based on adaptive subgraph contrast distillation of any one of claims 1 to 3.
- 6. A computer-readable storage medium, characterized in that it has stored thereon a brain disease risk prediction program based on adaptive subgraph contrast distillation which, when executed by a processor, implements the brain disease risk prediction method based on adaptive subgraph contrast distillation as claimed in any one of claims 1 to 3.
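The connection-weight construction recited in claim 1 (Pearson weights, then intersection and union of connection profiles, then a soft Jaccard ratio) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `pearson_weights` and `soft_jaccard_weights` are hypothetical, the intersection and union are taken as sums of element-wise minima and maxima over third-party regions (the usual weighted Jaccard form), and absolute Pearson weights are used so that the ratio stays in [0, 1].

```python
import numpy as np


def pearson_weights(bold: np.ndarray) -> np.ndarray:
    """Pearson correlation between ROI time series.

    bold has shape (n_rois, n_timepoints); the result is (n_rois, n_rois).
    """
    return np.corrcoef(bold)


def soft_jaccard_weights(r: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Soft Jaccard similarity between the connection profiles of each ROI pair."""
    # Assumption: profiles use absolute correlation so every term is non-negative.
    p = np.abs(r)
    n = p.shape[0]
    idx = np.arange(n)
    # third[i, j, k] is True when k is neither i nor j (a "third-party" region).
    third = (idx[None, None, :] != idx[:, None, None]) & (
        idx[None, None, :] != idx[None, :, None]
    )
    a = p[:, None, :]  # profile of region i, broadcast over j
    b = p[None, :, :]  # profile of region j, broadcast over i
    inter = np.where(third, np.minimum(a, b), 0.0).sum(axis=-1)
    union = np.where(third, np.maximum(a, b), 0.0).sum(axis=-1)
    # eps keeps the denominator away from zero, as in the claim.
    return inter / (union + eps)


rng = np.random.default_rng(0)
bold = rng.standard_normal((6, 50))  # synthetic data: 6 ROIs, 50 time points
weights = soft_jaccard_weights(pearson_weights(bold))
print(weights.shape)
```

The resulting matrix is symmetric and can serve as the reconstruction weight matrix of the weighted brain network graph. The broadcasted form uses memory cubic in the number of regions, which is acceptable for atlases with a few hundred regions.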
Description
Brain disease risk prediction method, device, equipment and storage medium based on self-adaptive subgraph contrast distillation
Technical Field
The invention relates to the technical field of medical information, and in particular to a brain disease risk prediction method, device, equipment and storage medium based on self-adaptive subgraph contrast distillation.
Background
With the acceleration of global population aging, chronic neurodegenerative diseases represented by Parkinson's disease (PD) and Alzheimer's disease (AD) have become public health problems that threaten human health. These diseases have a long course, insidious early symptoms and complex pathological mechanisms. Modern neuroscience research shows that the brain is a complex dynamic network system, and that neurodegenerative diseases are often accompanied by abnormal topology and functional reorganization of the brain functional connectivity network (FCN). Functional magnetic resonance imaging (fMRI), a noninvasive brain imaging technique, indirectly reflects neuronal activity through blood oxygen level dependent (BOLD) signals and has become a core means of exploring the pathological mechanisms of brain diseases. In recent years, deep learning techniques represented by graph neural networks (GNNs) have been widely used for brain network classification tasks because of their ability to process non-Euclidean graph-structured data.
However, existing GNN-based brain network analysis methods still face three technical bottlenecks in practical clinical application.
First, brain network modeling loses semantic information. Conventional methods generally use Pearson correlation coefficients to construct a functional connectivity matrix and then apply a fixed hard threshold to binarize it. Although this simplifies computation, it often erases fine-grained semantic information about the connection strength between brain regions and ignores the overall topological similarity of the connection distributions, so that deep pathological signals are difficult to capture.
Second, key pathogenic subgraphs are difficult to locate. Most existing models use global pooling (such as mean pooling or sum pooling) to aggregate the node features of all brain regions into a graph-level representation. This averaging operation easily masks small local lesion abnormalities, so the model cannot adaptively extract the key brain regions (subgraphs) that are most discriminative for disease diagnosis. As a result the model lacks interpretability, and a clinician cannot identify the specific diseased brain regions from the model output.
Finally, small samples and high noise create a risk of overfitting. Medical imaging data are expensive to acquire, and labeled data are extremely scarce (typically only a few hundred samples). Deep graph neural networks with a large number of parameters are therefore highly prone to overfitting during training. In addition, fMRI signals have a low signal-to-noise ratio and are strongly affected by physiological noise such as head motion and respiration, which further increases the difficulty of training the model.
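The conventional pipeline criticized above (Pearson correlation followed by a fixed hard threshold) can be sketched as follows. This is an illustrative sketch only: the function name `binarized_connectivity`, the threshold value 0.3, the use of absolute correlation and the synthetic signals are assumptions, not values taken from the source.

```python
import numpy as np


def binarized_connectivity(bold: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    """Pearson functional connectivity binarized with a fixed hard threshold."""
    corr = np.corrcoef(bold)      # (n_rois, n_rois) Pearson matrix
    np.fill_diagonal(corr, 0.0)   # ignore self-connections
    # Assumption: threshold the absolute correlation; every surviving edge becomes 1.
    return (np.abs(corr) >= threshold).astype(np.int8)


rng = np.random.default_rng(0)
x = rng.standard_normal(200)
bold = np.stack([
    x,                                   # ROI 0
    x + 0.1 * rng.standard_normal(200),  # ROI 1: almost identical to ROI 0
    x + 1.0 * rng.standard_normal(200),  # ROI 2: moderately coupled to ROI 0
])
adj = binarized_connectivity(bold)
print(adj)
```

In this synthetic example the nearly identical pair (ROI 0, ROI 1) and the moderately coupled pair (ROI 0, ROI 2) receive the same edge value of 1, which illustrates how binarization discards the fine-grained connection strength.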
Disclosure of Invention
The main purpose of the invention is to provide a brain disease risk prediction method, device, equipment and storage medium based on self-adaptive subgraph contrast distillation, aiming to solve the technical problems in the prior art that brain network modeling loses semantic information, key pathogenic subgraphs are difficult to locate, and small samples and high noise cause overfitting, so that deep pathological features are difficult to mine from limited and imbalanced neuroimaging data and brain disease risk cannot be predicted in a timely and accurate manner.
To achieve the above object, the present invention provides a brain disease risk prediction method based on adaptive subgraph contrast distillation, the method comprising the following steps: collecting and preprocessing multimodal brain imaging data, dividing the preprocessed brain imaging data into a plurality of regions of interest, and extracting a blood oxygen level dependent (BOLD) signal time series for each region of interest, wherein the preprocessing comprises time correction, head motion correction and spatial normalization; performing connection weight analysis on the regions of interest based on the BOLD signal time series to generate a weighted brain network graph; constructing a target graph neural network model, inputting the weighted brain network graph into the target graph neural network model, and outputting key lesion subgraph features; constructing a multi-objective loss term based on the key lesion subgraph features, training the target graph neural network model based on the mult