CN-121393896-B - Multi-modal feature fusion prediction method and system for early screening of Alzheimer disease
Abstract
The invention discloses a multi-modal feature fusion prediction method and system for early screening of Alzheimer's disease, belonging to the technical field of brain disease prediction. The method comprises: acquiring multidimensional data, including cognitive test scores, brain image scan results and biomarker concentration levels, from a patient record database, and standardizing the data to obtain a multidimensional data set in a uniform format; extracting key feature vectors by a dimension reduction analysis method to capture the covariation relationship between cognitive test scores and brain image changes; when the decline in the cognitive test score exceeds a preset threshold and the brain images show signs of atrophy, constructing a classification model by an ensemble learning method to preliminarily classify abnormal signals; and fusing the biomarker concentration levels to obtain abnormal signal vectors. By accurately evaluating key regions of the human brain such as the hippocampus, the invention remarkably improves the accuracy and efficiency of early screening for brain diseases such as Alzheimer's disease.
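As a rough illustration of the standardization and dimension reduction steps named in the abstract, the following Python sketch z-scores each data column and extracts the first principal component by power iteration. The abstract does not name a specific dimension reduction technique, so principal component analysis is an assumption here. The function names and the sample records (a cognitive test score, a hippocampal volume and a biomarker level per subject) are hypothetical.

```python
import math

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = []
    for c, m in zip(cols, means):
        s = math.sqrt(sum((x - m) ** 2 for x in c) / n)
        stds.append(s if s > 0 else 1.0)  # constant column: avoid division by zero
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def first_principal_component(z, iters=500):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    # Asymmetric start vector (an arbitrary choice) to lower the chance of
    # starting orthogonal to the leading eigenvector.
    v = [1.0 / (i + 1) for i in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(z, v):
    """Score of each standardized record along direction v."""
    return [sum(a * b for a, b in zip(r, v)) for r in z]

# Hypothetical records: [cognitive test score, hippocampal volume (cm^3), biomarker level]
records = [
    [29, 3.9, 180],
    [27, 3.6, 240],
    [24, 3.2, 310],
    [21, 2.9, 400],
    [18, 2.6, 480],
]
z = zscore_columns(records)
pc1 = first_principal_component(z)
print("loadings:", [round(x, 3) for x in pc1])
print("scores:", [round(s, 3) for s in project(z, pc1)])
```

In this sample the loadings of the cognitive score and the volume column share the same sign, which is one possible way to read the covariation relationship between the two.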
Inventors
- WANG HUAN
- TIAN PENG
- ZHOU BO
- YAO HONGXIANG
- Guo Yane
- LU XIANGHUI
- HAN WEIQIAO
- Han Ruoxin
- LI DONGMEI
- CUI JINJIN
Assignees
- Second Medical Center of the Chinese PLA General Hospital (中国人民解放军总医院第二医学中心)
Dates
- Publication Date
- 20260512
- Application Date
- 20251107
Claims (10)
- 1. A multi-modal feature fusion prediction method for early screening of Alzheimer's disease, characterized by comprising the following steps: step 1, acquiring multidimensional data from a patient record database, wherein the multidimensional data comprises cognitive test scores, brain image scan results and biomarker concentration levels, and normalizing the multidimensional data by standardized processing to obtain a multidimensional data set in a uniform format; step 2, extracting key feature vectors from the uniformly formatted multidimensional data set by a dimension reduction analysis method, wherein the key feature vectors are used for capturing the covariation relationship between the cognitive test scores and the brain image changes, and determining the key feature vectors; step 3, if the decline of the cognitive test score in the key feature vectors exceeds a preset threshold and the brain image shows signs of atrophy, constructing a classification model by an ensemble learning method to preliminarily classify abnormal signals, and determining the preliminary category of the abnormal signals, wherein the abnormal signals refer to deviation indexes in the cognitive and image data, and a deviation index is defined as a decline of more than 10% in the cognitive test score relative to a baseline, or a deviation of more than 5% in the volume of a brain image region relative to a standard value; step 4, for the preliminary category of the abnormal signals, acquiring the biomarker concentration level as supplementary input, and fusing the preliminary category and the biomarker concentration level by a classification fusion method to obtain a fused abnormal signal vector; step 5, calculating the distance between the fused abnormal signal vectors to quantify the association strength, judging an abnormal signal to be a high-risk signal if the association strength is greater than a preset threshold, and determining a high-risk signal set; step 6, according to the high-risk signal set, acquiring a multi-modal examination combination recommended on the basis of subject background information as supplementary input, and fusing the high-risk signal set with the multi-modal examination combination to obtain a fused high-risk screening set; step 7, grouping similar signals in the fused high-risk screening set by a clustering method, and calculating the comprehensive risk score of each group by a weighted average method; and step 8, for the comprehensive risk score of each group, marking the group as a potential disease risk if its comprehensive risk score is higher than a preset warning threshold, and predicting the risk progression trend by a regression prediction method to obtain a final risk identification result.
- 2. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 1 comprises: acquiring multidimensional data comprising cognitive test scores, brain image scan results and biomarker concentration levels from a patient database as initial data, to obtain a structured multidimensional data set; normalizing the data from different sources in the collected multidimensional data set by a standardized processing method to eliminate differences in measurement scale and obtain a multidimensional data set in a uniform format; performing a data integrity check on the uniformly formatted multidimensional data set and, if missing or abnormal data are found, filling them by a preset interpolation method, to determine an integrity-repaired multidimensional data matrix; for the cognitive test scores and biomarker concentration levels in the integrity-repaired multidimensional data matrix, determining the association strength between the dimensions by a correlation analysis method to obtain an inter-dimensional correlation result; according to the inter-dimensional correlation result, extracting the biomarker concentration levels and brain image scan features that are highly correlated with the cognitive test scores, and determining a combination of key influencing factors; classifying the multidimensional data matrix with a support vector machine algorithm on the basis of the combination of key influencing factors to distinguish patient types in different health states, and obtaining classified data groups; and generating a corresponding feature weight distribution for the classified data groups, and determining the contribution of each key influencing factor to the classification result, to obtain a final feature importance ranking.
- 3. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 2 comprises: acquiring the uniformly formatted multidimensional data set, and preliminarily organizing the cognitive test scores and brain image change data in it by a preset data screening rule to obtain an organized data subset; processing the high-dimensional data in the organized data subset by a dimension reduction analysis method, extracting principal feature vectors capable of representing the association pattern between cognitive test scores and brain image changes, and determining an extracted feature vector set; dividing the multidimensional data set by a data grouping method according to the extracted feature vector set, so that the data are grouped by association pattern, to obtain classified data groups; for the classified data groups, if the amount of data in a group is below a preset threshold, supplementing that group by a data augmentation technique to obtain supplemented data groups; comparing the distributions of the principal feature vectors in the supplemented data groups with a statistical analysis tool, and determining the differences in feature vectors between groups, to obtain a difference analysis result; and, for the difference analysis result, graphically displaying the feature vector distribution of each group by a data visualization technique, and determining a final distribution feature map.
- 4. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 3 comprises: extracting data relating to the cognitive tests and brain images from the key features to construct an initial data set, and making a preliminary record of score declines and signs of atrophy, to obtain a basic data set; screening the basic data set for score declines and signs of atrophy and, if a score decline exceeding the preset threshold is detected and the image data show signs of atrophy, marking the image data as a potential abnormal signal, and determining a data subset to be classified; processing the deviation indexes in the data subset to be classified by an ensemble learning method, constructing a classification model, and preliminarily classifying the abnormal signals to obtain a classification result; analyzing the correspondence between the abnormal signals and the initial categories on the basis of the classification result and, if the classification result shows that an abnormal signal matches an initial category, assigning the abnormal signal to that category, to obtain classified signal sets; extracting the deviation indexes relating to the cognitive tests from the classified signal sets and comparing them with historical data and, if the comparison shows that a deviation index differs significantly from the historical data, marking the signal as a high-priority signal, and determining key objects of attention; generating detailed feature description records for the key objects of attention in combination with the signs of atrophy in the brain images, and completing further archiving of the abnormal signals, to obtain a final classification file; and organizing the mapping between the key features and the initial categories according to the final classification file, thereby completing the full classification process for the abnormal signals and obtaining structured classification data storage.
- 5. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 4 comprises: acquiring the preliminary classification result of the abnormal signals from a monitoring system, and performing preliminary stratification of the signals with a pre-established classification model to obtain initial classification labels; acquiring the corresponding biomarker concentration data according to the initial classification labels, matching them against the related records stored in a database, and determining the concentration information associated with each classification label; performing feature integration on the matched concentration information and the initial classification labels with a support vector machine algorithm to obtain an integrated feature vector; if part of the data in the integrated feature vector falls outside a preset threshold range, standardizing the deviating data to obtain a corrected feature vector; obtaining latent pattern information relating to the abnormal signals from the corrected feature vector, and determining whether an abnormal pattern distribution exists; performing pattern comparison against a preset rule base according to the abnormal pattern distribution information so determined, and determining a final abnormal signal vector; and, for the final abnormal signal vector, acquiring its distribution characteristics in the multidimensional space to obtain a comprehensive classification result for the abnormal signals.
- 6. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 5 comprises: acquiring abnormal signal data from a system, and obtaining fused vector data by performing feature extraction and vectorization on the abnormal signals; calculating the distance between each pair of vectors in the fused vector data by the Euclidean distance method to generate a distance matrix; analyzing the association strength between the vectors according to the distance matrix, comparing it with a preset threshold, and determining a signal to be a potential high-risk signal if the association strength is greater than the preset threshold; aggregating the potential high-risk signals so determined to generate a high-risk signal set, and marking the source information of the related signals; performing secondary analysis on the signals in the high-risk signal set to obtain the characteristics they have in common, and determining the core association pattern of the high-risk signals; classifying and storing the signal set according to the core association pattern to generate classified signal subsets for subsequent processing; and acquiring the classified signal subsets and prioritizing them in combination with the source information, to obtain a final high-risk signal processing sequence.
- 7. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 6 comprises: acquiring the high-risk signal set from a system, and extracting the corresponding subject background information for each signal in the set, to obtain a background information data set associated with the signals; screening out the multi-modal examination combinations that match the background information with a pre-established recommendation model according to the background information data set, and determining a list of suitable examination combinations; obtaining the availability status of each examination combination in the list and, if the availability status meets a preset condition, binding the examination combination to the corresponding high-risk signal, to obtain bound signal-examination pairs; integrating the data of the bound signal-examination pairs and standardizing the signal data and examination data in a unified data format, to obtain a standardized comprehensive data set; extracting the information shared between signals and examinations from the standardized comprehensive data set and, if the shared information reaches a preset threshold, determining the pair to be a high-correlation data pair, to obtain a high-correlation data subset; for the high-correlation data subset, acquiring the corresponding subject background information and classifying and labeling it, to determine classified screening data groups; and sorting the classified screening data groups in combination with the priority information of the signal set, to obtain a final comprehensive high-risk screening set.
- 8. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 7 comprises: performing preliminary data organization on the high-risk screening set to obtain structured fused-set data, and determining an initial set processing result; grouping similar signals by a clustering method according to the structured fused-set data to obtain grouping result data; extracting the signal features in each group from the grouping result data, summarizing the features of each group, and determining a summarized signal feature set; calculating a weighted average of the signal features in each group from the summarized signal feature set to obtain preliminary evaluation data for the comprehensive risk; if the preliminary evaluation data for the comprehensive risk exceed a preset threshold range, regrouping the signal feature set of that group to obtain adjusted grouping result data; recalculating the weighted average according to the adjusted grouping result data, and determining a final comprehensive risk score; and ranking the high-risk screening results of the groups by the final comprehensive risk score to obtain a risk priority sequence.
- 9. The multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized in that step 8 comprises: acquiring the comprehensive risk score data, extracting the user's health-related indexes from a system, generating a comprehensive risk score by calculation, and determining a preliminary risk assessment basis; if the value of the comprehensive risk score is higher than a preset warning threshold, marking the comprehensive risk score data as a potential disease risk, to obtain a marked risk data set; analyzing the potential disease risk by a regression prediction method according to the marked risk data set, and calculating the risk progression trend to obtain a predicted trend value; analyzing the likely direction of risk progression from the predicted trend value in combination with historical health data, and determining the dynamic change in risk progression; if the dynamic change in risk progression exceeds a preset safety range, triggering a risk identification process to obtain a specific risk category; and integrating the related health indexes and predictive analysis results according to the risk category, generating a final risk identification conclusion, and determining a priority ranking of the users' health states.
- 10. A multi-modal feature fusion prediction system for early screening of Alzheimer's disease, used for executing the multi-modal feature fusion prediction method for early screening of Alzheimer's disease according to claim 1, characterized by comprising: a data acquisition and preprocessing module, used for acquiring multidimensional data from a patient record database, wherein the multidimensional data comprises cognitive test scores, brain image scan results and biomarker concentration levels, and for normalizing the multidimensional data by standardized processing to obtain a multidimensional data set in a uniform format; a feature extraction and dimension reduction module, used for extracting key feature vectors from the uniformly formatted multidimensional data set by a dimension reduction analysis method, wherein the key feature vectors are used for capturing the covariation relationship between the cognitive test scores and the brain image changes, and for determining the key feature vectors; an abnormality detection and classification module, used for constructing a classification model by an ensemble learning method to preliminarily classify abnormal signals when the decline of the cognitive test score in the key feature vectors exceeds a preset threshold and the brain image shows signs of atrophy, and for determining the preliminary category of the abnormal signals, wherein the abnormal signals refer to deviation indexes in the cognitive and image data; a multi-source data fusion module, used for acquiring the biomarker concentration level as supplementary input for the preliminary category of the abnormal signals, and for fusing the preliminary category and the biomarker concentration level by a classification fusion method to obtain a fused abnormal signal vector; a risk assessment and judgment module, used for calculating the distance between the fused abnormal signal vectors to quantify the association strength, judging an abnormal signal to be a high-risk signal when the association strength is greater than a preset threshold, and determining a high-risk signal set; an examination recommendation and integration module, used for acquiring, according to the high-risk signal set, a multi-modal examination combination recommended on the basis of subject background information as supplementary input, and for fusing the high-risk signal set with the multi-modal examination combination to obtain a fused high-risk screening set; a risk calculation and grouping module, used for grouping similar signals in the fused high-risk screening set by a clustering method, and for calculating the comprehensive risk score of each group by a weighted average method; a trend prediction and output module, used for marking a group as a potential disease risk when its comprehensive risk score is higher than a preset warning threshold, and for predicting the risk progression trend by a regression prediction method to obtain a final risk identification result; and a system control center, electrically connected with the data acquisition and preprocessing module, the feature extraction and dimension reduction module, the abnormality detection and classification module, the multi-source data fusion module, the risk assessment and judgment module, the examination recommendation and integration module, the risk calculation and grouping module and the trend prediction and output module, and used for coordinating the data flow and control signals among the modules.
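The numeric rules stated in the claims (the deviation index of step 3, the Euclidean distance matrix of step 5, and the weighted average and warning threshold of steps 7 and 8) can be sketched in a few lines of Python. The 10% and 5% limits come from claim 1; everything else is an assumption made for illustration. In particular, the claim does not state the direction of the volume deviation, so the sketch uses the absolute relative deviation, and the function names, weights and sample values are hypothetical.

```python
import math

COGNITIVE_DROP_LIMIT = 0.10    # claim 1: decline of more than 10% relative to baseline
VOLUME_DEVIATION_LIMIT = 0.05  # claim 1: more than 5% relative to a standard value

def is_abnormal_signal(cog_baseline, cog_current, vol_standard, vol_current):
    """Deviation index of step 3: exceeding either limit marks an abnormal signal."""
    cog_drop = (cog_baseline - cog_current) / cog_baseline
    # Assumption: the volume deviation is taken as an absolute relative difference.
    vol_dev = abs(vol_current - vol_standard) / vol_standard
    return cog_drop > COGNITIVE_DROP_LIMIT or vol_dev > VOLUME_DEVIATION_LIMIT

def distance_matrix(vectors):
    """Pairwise Euclidean distances between fused signal vectors (step 5)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[dist(a, b) for b in vectors] for a in vectors]

def weighted_average(values, weights):
    """Comprehensive risk score of one group (step 7)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total

def flag_groups(group_scores, warning_threshold):
    """Step 8: indices of groups whose score is above the warning threshold."""
    return [i for i, s in enumerate(group_scores) if s > warning_threshold]

# Hypothetical usage with made-up values.
print(is_abnormal_signal(30, 26, 3.5, 3.5))
print(distance_matrix([[0.0, 0.0], [3.0, 4.0]]))
group_scores = [
    weighted_average([0.2, 0.8], [1, 3]),
    weighted_average([0.1, 0.3], [1, 1]),
]
print(flag_groups(group_scores, warning_threshold=0.5))
```

The claims do not say how a distance is converted into an association strength, so that mapping is deliberately left out of the sketch.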
Description
Multi-modal feature fusion prediction method and system for early screening of Alzheimer's disease
Technical Field
The invention relates to the technical field of brain disease prediction, and in particular to a multi-modal feature fusion prediction method and system for early screening of Alzheimer's disease.
Background
Alzheimer's disease (AD), also known as senile dementia, is the most common and most significant neurodegenerative disease of unknown etiology in the elderly. Its core pathological changes are extracellular senile plaques formed by β-amyloid (Aβ) deposition, intraneuronal neurofibrillary tangles formed by hyperphosphorylated tau, and neuronal loss accompanied by gliosis. With the continued growth of the global aging population, dementia has become the "fourth killer" of the elderly after cardiovascular disease, cerebrovascular disease and cancer, placing tremendous pressure on families, society and the country. The onset of Alzheimer's disease is insidious and its etiology and pathogenesis are not clear; by the time a diagnosis is established, the disease has usually reached a stage that is difficult to treat. If AD patients can be identified early and given early intervention, their clinical symptoms can be improved, disability and mortality rates can be reduced, and medical expenses can be saved.
In clinical practice, diagnosis of cognitive disorders relies on comprehensive analysis of multidimensional data, including cognitive test scores, brain image scan results and biomarker concentration levels. These heterogeneous data sources differ significantly in format, dimension and collection criteria, which makes data fusion difficult.
The association mechanism between cognitive decline and brain structural change is complex, and traditional analysis methods struggle to identify early pathological signals accurately, with obvious shortcomings in capturing subtle covariation relationships.
Invention patent CN120413030A discloses a method, medium and device for predicting Alzheimer's disease risk based on feature fusion. sMRI images, FDG-PET images, biomarker data and intelligence score data of a target subject are first obtained. The sMRI and FDG-PET image data are then input separately into a three-dimensional convolutional neural network module to extract the corresponding first and second image feature maps. The two feature maps are input into a cross-attention fusion module to obtain image fusion features, and a U-shaped up-sampling network is used to segment the hippocampal region to obtain hippocampal features. Finally, the image fusion features, hippocampal features, biomarker data and intelligence score data are merged and input into an extreme learning machine to obtain a prediction of the risk of developing Alzheimer's disease. By extracting complementary information from multi-modal data through cross-modal data fusion, that method can effectively improve the accuracy of disease risk prediction. In practice, however, it cannot meet the multi-level, multi-dimensional analysis requirements of risk assessment for cognitive impairment diseases, and it achieves neither fine-grained classification of abnormal signals nor accurate quantification of risk levels.
How to effectively integrate multi-source heterogeneous medical data, accurately identify the association pattern between cognitive tests and brain images, and establish an accurate multi-level risk assessment system has become a key problem to be solved in the development of early screening technology for cognitive impairment diseases.
Disclosure of Invention
The invention aims to provide a multi-modal feature fusion prediction method and system for early screening of Alzheimer's disease, which can effectively solve the problem of logically linking multidimensional data fusion with risk prediction.
In order to solve the above technical problem, the invention adopts the following technical solution: a multi-modal feature fusion prediction method for early screening of Alzheimer's disease, comprising the following steps: step 1, acquiring multidimensional data from a patient record database, wherein the multidimensional data comprises cognitive test scores, brain image scan results and biomarker concentration levels, and normalizing the multidimensional data by standardized processing to obtain a multidimensional data set in a uniform format; step 2, extracting key feature vectors from the uniformly formatted multidimensional data set by a dimension reduction analysis method, wherein the key feature vectors are used for capturing the covariation relationship between the cognitive test scores and the brain image changes, and det