CN-121795930-B - Intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost
Abstract
The application provides an intelligent auxiliary diagnosis system for autism based on electroencephalogram (EEG) microstates and XGBoost, relating to the technical field of auxiliary diagnosis. The system comprises four modules. The acquisition module acquires the EEG data corresponding to a training sample set and the EEG data of the object to be tested. The processing module extracts potential topographies from the training sample set through a representative selection mechanism and adaptively determines the number of microstates to generate a microstate template set. The selection module back-fits the microstate template set to the EEG data of the object to be tested in temporal order, extracts temporal features, and performs sparse feature selection to obtain a target feature index set. The output module constructs a classifier with a regularization constraint, trains a classification model on the training sample set, and generates autism risk assessment information. The application automatically extracts the spatiotemporal features of the EEG signal and performs optimized classification, thereby improving the accuracy and efficiency of autism diagnosis.
Inventors
- WEI RAN
- KE XIAOYAN
- JIN HUA
- WANG YONGLU
- GONG YAN
- GAO JIANXING
- XU XINYUE
- ZHAI CHENYUAN
- WEI CONGHUI
Assignees
- Suzhou Municipal Hospital (苏州市立医院)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-03-06
Claims (9)
- 1. An intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost, characterized by comprising: an acquisition module for acquiring EEG information, wherein the EEG information comprises EEG data corresponding to a training sample set and EEG data of an object to be tested; a processing module for selecting representative moments from the EEG data corresponding to the training sample set through a representative selection mechanism, extracting the potential topographies at the representative moments, clustering the potential topographies, determining the number of microstates through stability evaluation, and generating a microstate template set, wherein determining the number of microstates through stability evaluation comprises: for each candidate number of microstates, performing stratified resampling of the training sample set with the sample as the unit to form multiple groups of potential topography sets, performing constrained clustering on each group of potential topography sets, back-projecting the resulting cluster centers onto the corresponding potential topography set, calculating the reprojection error and the matching consistency of the cluster centers, and combining these by weighting with the spatial spectrum energy ratio and the mirror symmetry of the corresponding cluster centers to determine a stability score; calculating, from the sequence of stability scores on the training sample set, the stability score gain between adjacent candidate numbers of microstates, and defining a preset score gain threshold as a second threshold; and selecting, as the target number of microstates, the number of microstates at which the stability score reaches its maximum and the score gain between adjacent candidate numbers of microstates is smaller than the second threshold, and generating the microstate template set from the clustering result corresponding to the target number of microstates; a selection module for performing time-sequence matching on the EEG data of the object to be tested using the microstate template set to obtain a temporal feature vector of the object to be tested, and for obtaining a feature subset of the object to be tested from the temporal feature vector according to a target feature index set determined on the training sample set; and an output module for constructing a classifier with a regularization constraint, training it with the training sample set to obtain a classification model, receiving the feature subset of the object to be tested to generate a discrimination score, and outputting, according to the discrimination score, auxiliary diagnosis information representing autism risk.
- 2. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 1, wherein extracting the potential topographies at the representative moments comprises: dividing the measurement channels of the EEG data corresponding to the training sample set into a first channel group and a second channel group; for the channels of the first channel group, calculating a first artifact coefficient based on the absolute deviation of the instantaneous amplitude from the channel median; for the channels of the second channel group, calculating a second artifact coefficient based on the degree of inconsistency of the inter-channel spatial correlation; combining the first artifact coefficient and the second artifact coefficient to calculate a low-artifact factor; calculating a representative score for each time point based on the low-artifact factor, the global field power (GFP) quantile of that time point, and the mean spatial correlation of the potential topography at that time point within a window of a first length; and performing non-maximum suppression on the representative scores to determine a plurality of representative moments, and extracting the potential topographies at the representative moments to form a topography sample set.
- 3. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 2, wherein dividing the measurement channels of the EEG data corresponding to the training sample set into a first channel group and a second channel group comprises: calculating an artifact sensitivity index for each measurement channel based on the EEG data corresponding to the training sample set, and assigning each measurement channel to the first channel group or the second channel group according to a quantile threshold of the artifact sensitivity index; determining contribution degrees as weights based on the relative change in the potential topography clustering stability evaluation index associated with the first artifact coefficient and the second artifact coefficient in their corresponding channel groups, and computing a weighted geometric mean of the first artifact coefficient and the second artifact coefficient according to the weights to generate the low-artifact factor; and determining the quantile threshold used for channel division according to a quantile and median-absolute-deviation rule within the channel group, based on the training sample set.
- 4. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 2, wherein clustering the potential topographies comprises: establishing a channel adjacency graph based on the spatial layout of the measurement channels of the EEG data corresponding to the training sample set; performing polarity unification and mirror mapping on the potential topographies; calculating the spatial spectrum energy ratio of each potential topography on the graph Laplacian eigenvector basis of the channel adjacency graph; calculating the correlation coefficient between each potential topography and its mirrored potential topography as the mirror symmetry; and performing constrained clustering of the potential topographies for each number in a preset set of candidate numbers of microstates, with the objective of maximizing the within-cluster spatial correlation, subject to the constraints that the spatial spectrum energy ratio of each cluster center is not more than a first upper limit and its mirror symmetry is not less than a first lower limit.
- 5. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 4, wherein calculating the spatial spectrum energy ratio of each potential topography on the graph Laplacian eigenvector basis of the channel adjacency graph comprises: obtaining an eigenvalue sequence and an eigenvector sequence from the normalized graph Laplacian matrix of the channel adjacency graph, and determining a first spectral threshold according to the largest spectral gap in the eigenvalue sequence; and determining, as the spatial spectrum energy ratio, the ratio of the projection energy of the polarity-unified potential topography on the set of basis vectors whose eigenvalues are not greater than the first spectral threshold to its projection energy on all basis vectors.
- 6. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 4, wherein calculating the correlation coefficient between each potential topography and its mirrored potential topography as the mirror symmetry comprises: determining the cranial midline according to the spatial layout of the measurement channels of the EEG data corresponding to the training sample set, and establishing channel correspondences by mirroring the coordinates of each measurement channel about the cranial midline, so as to determine a mirror mapping matrix; determining the weight of each measurement channel by normalizing its distance to the cranial midline; and calculating, in the space with the constant component removed, the weighted correlation coefficient between the polarity-unified potential topography vector and the potential topography vector transformed by the mirror mapping matrix, and taking this weighted correlation coefficient as the mirror symmetry.
- 7. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 1, wherein obtaining the feature subset of the object to be tested comprises: dividing the temporal feature matrix into a plurality of feature groups by microstate to obtain a grouped temporal feature matrix, wherein each feature group comprises at least the mean duration, the time coverage, and the occurrence frequency of the microstate and the vector of transition probabilities from that microstate to the other microstates; generating multiple training submatrices from the grouped temporal feature matrix by stratified bootstrap sampling of the sample rows; for each training submatrix, calculating regularization weights based on the coefficients of variation of the features in that submatrix, and calculating a candidate feature index set by weighted group-sparse selection; performing stability statistics on the candidate feature index sets by calculating the selection frequency of each feature and comparing it with a third threshold to determine the target feature index set; and applying the target feature index set to the temporal feature vector of the object to be tested, extracting the corresponding components in index order to obtain the feature subset of the object to be tested.
- 8. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 4, wherein performing time-sequence matching on the EEG data of the object to be tested using the microstate template set comprises: determining a channel gain normalization matrix based on the time-domain standard deviations, on each measurement channel, of the EEG data corresponding to the training sample set and of the EEG data of the object to be tested, and applying channel gain normalization to the EEG data of the object to be tested using the channel gain normalization matrix; on the normalized graph Laplacian eigenvector basis of the channel adjacency graph, performing orthogonal registration between the potential topography set of the object to be tested after channel gain normalization and the microstate template set to determine a first alignment matrix, and transforming the microstate template set with the first alignment matrix to obtain an aligned template set; and assigning a label to each time point of the EEG data of the object to be tested based on the aligned template set according to the criterion of maximizing spatial correlation, thereby generating the microstate sequence of the object to be tested.
- 9. The intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost according to claim 8, wherein constructing the classifier with a regularization constraint comprises: defining the lower limit and the upper limit of a preset risk threshold interval as a fourth threshold and a fifth threshold, respectively; for the risk threshold interval defined by the fourth threshold and the fifth threshold, training preliminary classification models with different class weight ratios and generating cross-validated predictions on the training sample set, and calculating, from the cross-validated predictions, the average net benefit of decision curve analysis over the risk threshold interval as the evaluation value of each class weight ratio; and selecting the class weight ratio that maximizes the average net benefit as the target weight ratio, and training the classifier with a regularization constraint under the target weight ratio using the training sample set and the target feature index set to obtain the classification model.
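As an illustration of the class weight selection in claim 9, the following minimal sketch computes the decision-curve net benefit, NB(pt) = TP/n - (FP/n) * pt / (1 - pt), averages it over a risk threshold interval, and picks the class weight ratio with the largest average. It is not the patented implementation: the function names, the evenly spaced threshold grid, and the toy cross-validated probabilities are assumptions made for the example, and the training of the preliminary models is left out. In XGBoost the class weight ratio could plausibly be passed as `scale_pos_weight`, but the source does not name the parameter.

```python
from typing import Dict, Sequence, Tuple


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Decision-curve net benefit at risk threshold pt (0 < pt < 1)."""
    if not 0.0 < pt < 1.0:
        raise ValueError("risk threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # A case is flagged as high risk when its predicted probability reaches pt.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - (fp / n) * pt / (1.0 - pt)


def mean_net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                     lower: float, upper: float, steps: int = 21) -> float:
    """Average net benefit on an evenly spaced grid over [lower, upper].

    `lower` and `upper` play the role of the fourth and fifth thresholds;
    the grid resolution `steps` is an assumption of this sketch.
    """
    grid = [lower + (upper - lower) * i / (steps - 1) for i in range(steps)]
    return sum(net_benefit(y_true, y_prob, pt) for pt in grid) / steps


def select_weight_ratio(y_true: Sequence[int],
                        cv_probs_by_ratio: Dict[float, Sequence[float]],
                        lower: float, upper: float) -> Tuple[float, Dict[float, float]]:
    """Return the class weight ratio with the largest average net benefit."""
    scores = {ratio: mean_net_benefit(y_true, probs, lower, upper)
              for ratio, probs in cv_probs_by_ratio.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): cross-validated probabilities for two weight ratios.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
cv_probs = {
    1.0: [0.90, 0.45, 0.15, 0.10, 0.05, 0.10, 0.15, 0.05],
    2.0: [0.95, 0.70, 0.55, 0.60, 0.10, 0.15, 0.10, 0.05],
}
target_ratio, evaluation = select_weight_ratio(labels, cv_probs, 0.2, 0.4)
print(target_ratio, evaluation)
```

In this toy case the heavier positive-class weight recovers one more true case at the cost of one false alarm, and the average net benefit over the interval favors it. The final classifier would then be trained under the selected ratio.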
Description
Intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost
Technical Field
The application relates to the technical field of auxiliary diagnosis, and in particular to an intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost.
Background
Autism spectrum disorder (ASD) is a common neurodevelopmental disorder. Clinical diagnosis relies mainly on behavioral observation and scale-based assessment, is strongly influenced by subjective factors, and is often delayed, which in turn delays intervention. In the prior art, diagnostic systems based on electroencephalogram (EEG) signals have attempted to assist ASD recognition with machine learning methods, for example by combining functional connectivity analysis or time-series feature extraction of the EEG with a classifier. Such systems have made some progress in capturing abnormal brain activity, but they have limitations. Feature extraction mostly depends on manual intervention or simple statistics, so the spatiotemporal dynamics of the EEG signal are not processed automatically to a sufficient degree. In addition, the classification optimization mechanism is imperfect, leaving the systems susceptible to noise interference and overfitting. As a result, diagnostic accuracy and efficiency struggle to meet clinical requirements. How to automatically extract and optimally classify the spatiotemporal features of the signal in an EEG-based autism diagnosis system, so as to improve the accuracy and efficiency of diagnosis, is therefore a technical problem that urgently needs to be solved.
Disclosure of Invention
To address the shortcomings of the prior art, the application provides an intelligent auxiliary diagnosis system for autism based on EEG microstates and XGBoost, comprising: an acquisition module for acquiring EEG information, wherein the EEG information comprises EEG data corresponding to a training sample set and EEG data of an object to be tested; a processing module for selecting representative moments from the EEG data corresponding to the training sample set through a representative selection mechanism, extracting the potential topographies at the representative moments, clustering the potential topographies, determining the number of microstates through stability evaluation, and generating a microstate template set; a selection module for performing time-sequence matching on the EEG data of the object to be tested using the microstate template set to obtain a temporal feature vector of the object to be tested, and for obtaining a feature subset of the object to be tested from the temporal feature vector according to a target feature index set determined on the training sample set; and an output module for constructing a classifier with a regularization constraint, training it with the training sample set to obtain a classification model, receiving the feature subset of the object to be tested to generate a discrimination score, and outputting, according to the discrimination score, auxiliary diagnosis information representing autism risk.
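The time-sequence matching step and the resulting temporal features can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: each time point is labeled with the template of highest absolute spatial correlation (the absolute value, which ignores polarity, is an assumption of this sketch), and the per-microstate features are the ones named in claim 7 (mean duration, time coverage, occurrence frequency, transition probabilities). The gain normalization and orthogonal registration of claim 8 are omitted, and all function names, the two templates, and the sampling rate are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def spatial_corr(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation across channels between two topographies."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / den if den > 0 else 0.0


def backfit(eeg: List[Sequence[float]], templates: List[Sequence[float]]) -> List[int]:
    """Assign each time point the index of the best-matching template."""
    labels = []
    for sample in eeg:
        corrs = [abs(spatial_corr(sample, t)) for t in templates]
        labels.append(max(range(len(templates)), key=corrs.__getitem__))
    return labels


def temporal_features(labels: List[int], n_states: int, sfreq: float) -> Dict[int, dict]:
    """Per-microstate mean duration (s), coverage, occurrence (1/s), transitions."""
    # Run-length encode the label sequence into [state, length] segments.
    runs: List[List[int]] = []
    for lab in labels:
        if runs and runs[-1][0] == lab:
            runs[-1][1] += 1
        else:
            runs.append([lab, 1])
    total_seconds = len(labels) / sfreq
    feats = {}
    for k in range(n_states):
        lengths = [n for lab, n in runs if lab == k]
        # States entered immediately after each segment of state k.
        outgoing = [nxt[0] for cur, nxt in zip(runs, runs[1:]) if cur[0] == k]
        feats[k] = {
            "mean_duration": sum(lengths) / len(lengths) / sfreq if lengths else 0.0,
            "coverage": sum(lengths) / len(labels),
            "occurrence": len(lengths) / total_seconds,
            "transitions": {j: (outgoing.count(j) / len(outgoing) if outgoing else 0.0)
                            for j in range(n_states) if j != k},
        }
    return feats


# Toy example: 4 channels, 2 hypothetical templates, 6 time points at 250 Hz.
templates = [[1, 1, -1, -1], [1, -1, 1, -1]]
eeg = [[2, 2, -2, -2], [-1, -1, 1, 1], [1, -1, 1, -1],
       [3, -2, 3, -4], [1, 1, -1, -1], [2, 1, -1, -2]]
sequence = backfit(eeg, templates)
features = temporal_features(sequence, n_states=2, sfreq=250.0)
print(sequence)
print(features)
```

Flattening the per-microstate dictionaries in a fixed order would give one temporal feature vector per recording, from which the target feature index set selects the components passed to the classifier.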
Optionally, extracting the potential topographies at the representative moments comprises: dividing the measurement channels of the EEG data corresponding to the training sample set into a first channel group and a second channel group; for the channels of the first channel group, calculating a first artifact coefficient based on the absolute deviation of the instantaneous amplitude from the channel median; for the channels of the second channel group, calculating a second artifact coefficient based on the degree of inconsistency of the inter-channel spatial correlation; combining the first artifact coefficient and the second artifact coefficient to calculate a low-artifact factor; calculating a representative score for each time point based on the low-artifact factor, the global field power (GFP) quantile of that time point, and the mean spatial correlation of the potential topography at that time point within a window of a first length; and performing non-maximum suppression on the representative scores to determine a plurality of representative moments, and extracting the potential topographies at the representative moments to form a topography sample set.
Optionally, dividing the measurement channels of the EEG data corresponding to the training sample set into the first channel group and the second channel group comprises: calculating an artifact sensitivity index for each measurement channel based on the EEG data corresponding to the training sample set, and dividing each measurement channel into a first channel group and a second channel group acco