CN-122023270-A - Child face multi-disease screening method and system for class imbalance
Abstract
The application discloses a class-imbalance-oriented child facial multi-disease screening method and system. The method synchronously obtains global and local feature representations of the face through multi-scale feature extraction, deeply fuses the two representations to comprehensively characterize complex phenotypes, and finally generates a screening result through a decision rule adapted to long-tail distributions. This solves the problem of low recognition accuracy when identifying childhood genetic diseases in the related art. High-precision identification of multiple similar genetic syndromes is achieved through multi-scale feature extraction under a unified framework, and the ability to recognize rare, sample-scarce disease categories is markedly improved through the decision rule adapted to long-tail data distributions, thereby improving the accuracy of screening multiple diseases from children's faces.
Inventors
- ZHOU MENGYING
- XI WENHUI
- PAN YI
- WEI YANJIE
- WAN DE
Assignees
- 深圳先进技术研究院
Dates
- Publication Date
- 20260512
- Application Date
- 20251231
Claims (10)
- 1. A class-imbalance-oriented method for screening multiple diseases from a child's face, the method comprising: step S1, acquiring a facial image to be examined, and performing multi-scale feature extraction on the facial image to obtain a global feature representation and a local feature representation; step S2, fusing the global feature representation and the local feature representation to generate a fused feature representation; and step S3, based on the fused feature representation, generating a screening result of the facial image for at least one target disease category through a decision rule adapted to long-tail data distributions.
- 2. The method according to claim 1, wherein the step of obtaining the global feature representation in step S1 comprises: dividing the facial image into a plurality of image blocks and encoding each image block as a feature vector; calculating an association score between the feature vector of each image block and the feature vectors of all other image blocks; and performing weighted aggregation of the feature vectors of all image blocks according to the association scores to generate the global feature representation.
- 3. The method according to claim 2, wherein the step of obtaining the local feature representation in step S1 comprises: taking the global feature representation as input and processing it through at least one convolutional neural network layer to obtain an intermediate feature representation; and extracting discriminative detail information from the intermediate feature representation to form the local feature representation.
- 4. The method according to any one of claims 1-3, wherein step S2 comprises: taking the global feature representation as semantic guidance, and calculating, through a cross-attention mechanism, the association between the global feature representation and the feature at each spatial position of the local feature representation, to obtain region-association weights reflecting how well each local region matches the global semantics; for the local feature representation, calculating the mutual dependencies among all positions in its feature map through a lightweight convolution operation, to obtain local self-attention weights reflecting the importance of the internal structure of each local region; and fusing the global feature representation and the local feature representation based on the region-association weights and the local self-attention weights to generate the fused feature representation.
- 5. The method of claim 4, wherein the step of generating the fused feature representation further comprises: calculating interaction weights among different feature dimensions of the fused feature representation through a self-attention mechanism; and recalibrating the fused feature representation according to the interaction weights to obtain an enhanced fused feature representation, which serves as the final fused feature representation.
- 6. The method of claim 5, wherein step S3 comprises: converting the final fused feature representation into an initial logit value for each disease category; combining the initial logit values with balance bias terms corresponding to each disease category to generate adjusted logit values; and normalizing the adjusted logit values to generate a screening result of the facial image for each target disease category.
- 7. The method of claim 1 or 6, further comprising, after the step of generating the screening result: during acquisition of the global feature representation, obtaining the attention weight data generated by the last attention layer of a pre-trained Vision Transformer model; calculating, from the attention weight data, importance scores of the spatial positions of the facial image with respect to the screening result; and generating a salient region map over the facial image space according to the importance scores.
- 8. The method of claim 7, further comprising, after the step of generating the screening result: acquiring a plurality of reference facial images associated with the target disease category of the screening result; extracting corresponding attention weight data from each reference facial image and generating a reference salient region map; and performing spatially standardized alignment of all reference salient region maps, followed by pixel-level statistical averaging, to generate a category-level salient region map reflecting the attention pattern shared across the target disease category group.
- 9. A class-imbalance-oriented child facial multi-disease screening system, the system comprising: a multi-scale feature extraction module for receiving a facial image and extracting a global feature representation and a local feature representation; a feature fusion module for fusing the global feature representation with the local feature representation to obtain a fused feature representation; and a long-tail optimization classification module for generating, based on the fused feature representation, a screening result of the facial image for at least one target disease category through a decision rule adapted to long-tail data distributions.
- 10. The system of claim 9, further comprising an interpretability output module, interacting with the feature fusion module and the long-tail optimization classification module, for generating and outputting a salient region map.
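The global feature extraction of claim 2 (patch splitting, pairwise association scoring, weighted aggregation) matches the self-attention pattern of a Vision Transformer patch encoder. A minimal numpy sketch, assuming an identity patch embedding and dot-product association scores (the patent does not fix either choice):

```python
import numpy as np

def global_feature(image: np.ndarray, patch: int = 4) -> np.ndarray:
    """Claim 2 sketch: split the image into blocks, encode each block as
    a vector, score pairwise association, and aggregate by softmax weights."""
    h, w = image.shape
    # Divide the facial image into non-overlapping blocks; flatten each to a vector.
    blocks = [image[i:i + patch, j:j + patch].ravel()
              for i in range(0, h, patch) for j in range(0, w, patch)]
    x = np.stack(blocks)                            # (n_blocks, patch*patch)
    # Association score between every pair of block vectors (scaled dot product).
    scores = x @ x.T / np.sqrt(x.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    # Weighted aggregation of all block vectors -> global representation.
    return weights @ x

img = np.random.default_rng(0).random((8, 8))
g = global_feature(img)
print(g.shape)   # (4, 16): one aggregated vector per image block
```

A real implementation would use a learned linear patch embedding and multi-head attention; this sketch only illustrates the score-then-aggregate structure the claim describes.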
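The fusion of claim 4 combines two weight sets: cross-attention weights of each local position against the global feature, and self-attention weights from a lightweight convolution over the local map. A hedged numpy sketch, in which the vector shapes, the 1-D smoothing kernel, and the additive gating are all illustrative assumptions:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fuse(global_feat, local_feat):
    """Claim 4 sketch: global_feat (d,) queries each local position of
    local_feat (n_positions, d); a small convolution over positions yields
    local self-attention weights; both weight sets gate the fused sum."""
    # Region-association weights: match of each local region to global semantics.
    assoc = softmax(local_feat @ global_feat / np.sqrt(len(global_feat)))
    # Lightweight convolution over positions -> local self-attention weights.
    kernel = np.array([0.25, 0.5, 0.25])
    energy = local_feat.sum(axis=1)
    local_w = softmax(np.convolve(energy, kernel, mode="same"))
    # Fuse: global vector plus the doubly weighted local features.
    return global_feat + ((assoc + local_w)[:, None] * local_feat).sum(axis=0)

rng = np.random.default_rng(1)
fused = fuse(rng.random(16), rng.random((5, 16)))
print(fused.shape)   # (16,)
```

The point of the two weight sets is complementary: the cross-attention term keeps local regions that agree with the global semantics, while the convolutional term keeps regions that are structurally salient on their own.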
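Claim 6's "balance bias term" is not specified in the claims, but a standard realization of a long-tail decision rule is logit adjustment, where the bias is proportional to the log class prior. A sketch under that assumption (the log-prior form and the temperature `tau` are not from the patent):

```python
import numpy as np

def screen(logits, class_counts, tau=1.0):
    """Claim 6 sketch: combine initial logits with a per-category balance
    bias, then normalize. The bias here is the (assumed) logit-adjustment
    term tau*log(prior), which removes the head-class advantage at
    decision time and so boosts rare categories."""
    prior = np.asarray(class_counts, dtype=float)
    prior /= prior.sum()
    adjusted = logits - tau * np.log(prior)   # rarer class -> larger boost
    e = np.exp(adjusted - adjusted.max())
    return e / e.sum()                        # normalized screening scores

# Common disease (1000 samples) vs. rare syndrome (10 samples), equal raw logits.
probs = screen(np.array([2.0, 2.0]), [1000, 10])
print(probs)   # the rare class receives the higher screening score
```

With equal raw logits, an unadjusted softmax would score both classes 0.5; the adjustment tilts the decision toward the rare syndrome, which is exactly the behavior the long-tail decision rule is meant to produce.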
Description
Child face multi-disease screening method and system for class imbalance
Technical Field
The application relates to the technical field of image processing, and in particular to a class-imbalance-oriented child facial multi-disease screening method and system.
Background
With the deep integration of artificial intelligence into the medical field, computer-aided screening of genetic syndromes and developmental disorders based on facial image analysis has become an important research direction. Childhood genetic syndromes and developmental disorders are often accompanied by specific facial morphologies, which offer the potential for noninvasive, low-cost early screening. Computer-aided diagnosis techniques for the facial phenotypes of genetic syndromes mainly fall into four classes. The first class is based on traditional geometric features: structural parameters such as inter-eye distance, nose bridge height, and mouth width are obtained through facial keypoint detection and judged by shallow classifiers such as support vector machines and random forests. The second class follows a transfer learning strategy: a deep model pre-trained on a large-scale face dataset is used as a feature extractor and fine-tuned on limited disease samples to automatically learn more discriminative texture and morphological features. The third class adopts advanced architectures such as Vision Transformers or multi-branch networks to improve phenotype modeling capability. The fourth class comprises integrated platforms such as Face2Gene, which combine the DeepGestalt model with a clinical database to realize multi-disease auxiliary diagnosis. In addition, some approaches introduce visualization techniques such as Grad-CAM to enhance model interpretability.
However, these methods have obvious shortcomings. First, clinical data are extremely imbalanced: rare-disease samples are scarce, so models are biased toward common categories and identify small-sample diseases poorly. Second, different disease phenotypes overlap heavily, making fine-grained differences hard to capture and limiting generalization. Third, most interpretability methods generate explanations from a single image and lack stable disease-level phenotype interpretation, failing to meet clinical requirements for decision transparency and credibility. Therefore, a new multi-disease intelligent screening framework combining accuracy, robustness, and interpretability is needed.
Disclosure of Invention
The embodiments of the application solve the problem of low recognition accuracy when the related art identifies childhood genetic diseases by providing a class-imbalance-oriented child facial multi-disease screening method and system, improving the accuracy of child facial multi-disease screening. An embodiment of the application provides a class-imbalance-oriented method for screening multiple diseases from a child's face, comprising: step S1, acquiring a facial image to be examined, and performing multi-scale feature extraction on the facial image to obtain a global feature representation and a local feature representation; step S2, fusing the global feature representation and the local feature representation to generate a fused feature representation; and step S3, based on the fused feature representation, generating a screening result of the facial image for at least one target disease category through a decision rule adapted to long-tail data distributions.
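The third shortcoming above, single-image explanations lacking a stable disease-level interpretation, is what the category-level salient region map of claims 7-8 addresses: per-image attention maps are spatially aligned and averaged into one group-level map. A minimal numpy sketch, where nearest-neighbor resizing stands in for the real face-alignment step (an assumption; the patent only requires "spatially standardized alignment"):

```python
import numpy as np

def class_saliency(attn_maps, out_hw=(8, 8)):
    """Claims 7-8 sketch: align each per-image attention map to a common
    grid, average pixel-wise, and normalize to [0, 1] for display as a
    category-level salient region map."""
    H, W = out_hw
    aligned = []
    for m in attn_maps:
        h, w = m.shape
        # Nearest-neighbor resize as a minimal spatial standardization.
        rows = np.arange(H) * h // H
        cols = np.arange(W) * w // W
        aligned.append(m[np.ix_(rows, cols)])
    mean_map = np.mean(aligned, axis=0)      # pixel-level statistical average
    mean_map -= mean_map.min()
    peak = mean_map.max()
    return mean_map / peak if peak > 0 else mean_map

rng = np.random.default_rng(2)
maps = [rng.random((6, 6)), rng.random((10, 10)), rng.random((8, 8))]
sal = class_saliency(maps)
print(sal.shape)   # (8, 8), values in [0, 1]
```

Averaging over many reference faces suppresses image-specific attention noise, so the surviving hot regions reflect the phenotype pattern shared by the disease group rather than one patient's idiosyncrasies.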
Optionally, the step of obtaining the global feature representation in step S1 includes: dividing the facial image into a plurality of image blocks and encoding each image block as a feature vector; calculating an association score between the feature vector of each image block and the feature vectors of all other image blocks; and performing weighted aggregation of the feature vectors of all image blocks according to the association scores to generate the global feature representation. Optionally, the step of obtaining the local feature representation in step S1 includes: taking the global feature representation as input and processing it through at least one convolutional neural network layer to obtain an intermediate feature representation; and extracting discriminative detail information from the intermediate feature representation to form the local feature representation. Optionally, step S2 includes: calculating, through a cross-attention mechanism, the association between the global feature representation and the feature at each spatial position of the local feature representation, to obtain region-association weights reflecting how well each local region matches the global semantics; for the local feature representation, calculating the mutual dependencies among all positions in the feature map of the