CN-122023264-A - Neurodevelopmental disorder screening method and system based on facial images

Abstract

The application discloses a facial-image-based method and system for screening neurodevelopmental disorders. The method comprises: acquiring a facial image to be examined and extracting image texture features and facial key point structural features from it; fusing the two features through an attention mechanism that uses the facial key point structural features as the query and the image texture features as the keys and values, to obtain a fused feature representation; and classifying the fused feature representation with a classification strategy adapted to long-tailed data distribution, to obtain the neurodevelopmental disorder category corresponding to the facial image. This addresses the poor recognition capability of computer vision techniques when identifying neurodevelopmental disorder types, and enables accurate automatic screening of multiple neurodevelopmental disorders, including multiple rare diseases, within a unified framework.

Inventors

ZHOU MENGYING
XI WENHUI
TANG JIJUN
WEI YANJIE
WAN DE

Assignees

Shenzhen Institute of Advanced Technology (深圳先进技术研究院)

Dates

Publication Date: 20260512
Application Date: 20251231

Claims (10)

1. A facial-image-based method for screening neurodevelopmental disorders, the method comprising: step S1, acquiring a facial image to be examined, and extracting image texture features and facial key point structural features from the facial image; step S2, fusing the two features through an attention mechanism that takes the facial key point structural features as the query and the image texture features as the keys and values, to obtain a fused feature representation; and step S3, classifying the fused feature representation through a classification strategy adapted to long-tailed data distribution, to obtain the neurodevelopmental disorder category corresponding to the facial image.
2. The method according to claim 1, wherein extracting the image texture features in step S1 comprises: inputting the facial image in parallel into at least two pretrained convolutional neural networks with different architectures, each of which performs forward propagation and outputs an initial deep feature map; performing global average pooling and global max pooling on each initial deep feature map to obtain two spatial descriptor feature maps; concatenating the two spatial descriptor feature maps and generating a spatial attention weight map through a convolution operation; weighting the corresponding initial deep feature map by the spatial attention weight map to obtain an enhanced feature map; and scaling or projecting each enhanced feature map to the same number of channels and the same spatial size to obtain a plurality of standard feature maps, and concatenating all the standard feature maps along the channel dimension to obtain the final image texture features.
3. The method according to claim 1 or 2, wherein extracting the facial key point structural features in step S1 comprises: extracting key point features from the facial image through a key point localization algorithm with fixed parameters, and generating coordinate data of facial anatomical key points; and encoding the coordinate data into a geometric structure feature map, wherein the spatial dimensions of the geometric structure feature map match the spatial dimensions of any feature map involved in extracting the image texture features.
4. The method according to any one of claims 1 to 3, wherein step S2 comprises: linearly projecting the facial key point structural features into query vectors; linearly projecting the image texture features into key vectors and value vectors, respectively; and performing attention calculation and fusion based on the query vectors, the key vectors and the value vectors to obtain the fused feature representation.
5. The method according to claim 4, wherein performing attention calculation and fusion based on the query vectors, the key vectors and the value vectors to obtain the fused feature representation comprises: performing a calculation based on the query vectors and the complete set of key vectors and value vectors to obtain a global attention fusion feature; generating a local attention mask according to the spatial positions of the facial key points, restricting the key vectors and value vectors used in the calculation to the local spatial neighborhoods corresponding to the key points, and performing a calculation based on the query vectors and the restricted key vectors and value vectors to obtain a local attention fusion feature; and fusing the global attention fusion feature with the local attention fusion feature to obtain the fused feature representation.
6. The method according to claim 5, further comprising, after the step of fusing the global attention fusion feature with the local attention fusion feature to obtain the fused feature representation: performing self-attention calculation on the fused feature representation to obtain correlation weights among its elements; and re-weighting the elements of the fused feature representation according to the correlation weights, thereby updating the fused feature representation.
7. The method according to claim 6, wherein step S3 comprises: applying a linear transformation to the updated fused feature representation to obtain raw discriminant values (logits) for each neurodevelopmental disorder category; applying a normalized exponential (softmax) calculation to the raw discriminant values, and adjusting the result in combination with the prior distribution of the neurodevelopmental disorder categories to obtain category prediction probabilities; and determining the neurodevelopmental disorder category corresponding to the facial image according to the category prediction probabilities.
8. The method according to claim 7, further comprising, after the step of determining the neurodevelopmental disorder category corresponding to the facial image according to the category prediction probabilities: acquiring the attention weights calculated by the attention mechanism during fusion; and mapping the attention weights back to the corresponding spatial locations of the facial image to generate a visual heat map, wherein the regions of interest shown by the heat map are associated with the spatial distribution of the facial anatomy defined by the facial key point structural features.
9. A system for screening neurodevelopmental disorders, the system comprising: a feature extraction module configured to receive a facial image and output image texture features and facial key point structural features; a global-local cross-fusion encoder connected to the feature extraction module and configured to take the facial key point structural features as queries and the image texture features as keys and values, fuse the facial key point structural features and the image texture features through a cross-attention mechanism, and output a fused feature representation; and a long-tail robust classification module connected to the global-local cross-fusion encoder and configured to classify the fused feature representation based on a classification strategy adapted to long-tailed data distribution and output a neurodevelopmental disorder category prediction.
10. The system according to claim 9, further comprising an interpretable output module that generates, based on the attention weights produced by the global-local cross-fusion encoder, a visual heat map associated with the facial key point structural features.
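The global and local attention recited in claims 4 and 5 can be illustrated with a minimal NumPy sketch. It is not the applicant's implementation: the function name `global_local_cross_attention`, the projection matrices `Wq`, `Wk`, `Wv`, the Euclidean `radius` that defines each key point's local neighborhood, and the simple averaging of the global and local outputs are all assumptions made for illustration, since the claims do not fix these details.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def global_local_cross_attention(kp_feats, tex_feats, kp_xy, grid_xy,
                                 Wq, Wk, Wv, radius):
    """Hypothetical sketch of claims 4-5.

    kp_feats : (K, Dk) key point structural features (source of queries)
    tex_feats: (N, Dt) texture features at N spatial positions (keys/values)
    kp_xy    : (K, 2)  key point coordinates in feature-map grid units
    grid_xy  : (N, 2)  coordinates of each texture position
    radius   : assumed size of the local neighborhood around a key point
    """
    Q = kp_feats @ Wq                      # (K, D) query vectors
    K_ = tex_feats @ Wk                    # (N, D) key vectors
    V = tex_feats @ Wv                     # (N, D) value vectors
    scores = Q @ K_.T / np.sqrt(Q.shape[-1])   # (K, N) scaled dot products

    # Global branch: every key point attends to all spatial positions.
    global_w = softmax(scores)
    global_out = global_w @ V

    # Local branch: mask out positions farther than `radius` from the key point.
    dist = np.linalg.norm(kp_xy[:, None, :] - grid_xy[None, :, :], axis=-1)
    mask = dist <= radius
    local_w = softmax(np.where(mask, scores, -1e9))
    local_out = local_w @ V

    # Assumed fusion rule: plain average of the two branches.
    fused = 0.5 * (global_out + local_out)
    return fused, global_w, local_w

# Toy example: a 4x4 feature map and three key points (all values hypothetical).
rng = np.random.default_rng(0)
ys, xs = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
grid_xy = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)  # (16, 2)
kp_xy = np.array([[0.5, 0.5], [2.0, 1.0], [3.0, 3.0]])
kp_feats = rng.normal(size=(3, 8))
tex_feats = rng.normal(size=(16, 32))
Wq = rng.normal(size=(8, 16))
Wk = rng.normal(size=(32, 16))
Wv = rng.normal(size=(32, 16))

fused, gw, lw = global_local_cross_attention(
    kp_feats, tex_feats, kp_xy, grid_xy, Wq, Wk, Wv, radius=1.5)
print(fused.shape)  # (3, 16)
```

Each returned weight matrix has one row per key point and one column per spatial position, so a row could be reshaped to the feature-map grid when building the kind of heat map described in claim 8.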

Description

Neurodevelopmental disorder screening method and system based on facial images

Technical Field

The application relates to the technical field of image processing, and in particular to a neurodevelopmental disorder screening method and system based on facial images.

Background

Neurodevelopmental disorders are a group of conditions that arise during early brain development. Their clinical manifestations are complex, and diagnosis depends on developmental assessment, behavioral observation, parental interviews, and the experienced judgment of specialist physicians, which makes it highly subjective, inefficient, and difficult to deploy at scale. With the development of computer vision, early screening of neurodevelopmental disorders from facial images has become a new research direction. Current mainstream facial recognition schemes for neurodevelopmental disorders rely mainly on convolutional neural networks to extract image texture information, for example by encoding the facial image with a single convolutional neural network such as ResNet, MobileNet, Xception, or Inception. Some studies also incorporate facial key point information: coordinates are obtained with a key point detector, geometric features such as interocular distance ratios are computed by hand, and these geometric features are then simply concatenated with the texture features from the convolutional neural network. However, the phenotypic differences among neurodevelopmental disorders are often very subtle, and a single convolutional neural network struggles to capture local geometric structure and global morphological features at the same time. In addition, because samples for certain diseases are severely scarce, models are easily affected by the long-tailed distribution and recognize minority classes poorly.
Furthermore, the interpretability of these schemes relies on generic heat map tools whose explanations are not sufficiently stable, so the model's decision basis is difficult to interpret clinically, which limits its usability in real-world settings.

Disclosure of Invention

By providing a facial-image-based neurodevelopmental disorder screening method and system, the embodiments of the application solve the problem of poor recognition capability when neurodevelopmental disorders are identified through computer vision techniques, achieve high-precision automatic screening of multiple neurodevelopmental disorders, and enable accurate screening of rare diseases for which few samples are available.

The embodiments of the application provide a facial-image-based neurodevelopmental disorder screening method and system, wherein the method comprises: step S1, acquiring a facial image to be examined, and extracting image texture features and facial key point structural features from the facial image; step S2, fusing the two features through an attention mechanism that takes the facial key point structural features as the query and the image texture features as the keys and values, to obtain a fused feature representation; and step S3, classifying the fused feature representation through a classification strategy adapted to long-tailed data distribution, to obtain the neurodevelopmental disorder category corresponding to the facial image.
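Step S3 can be sketched as follows under one common reading of a long-tail classification strategy: the softmax output is divided by the class prior raised to a strength `tau` and then renormalized, which is equivalent to subtracting `tau * log(prior)` from the logits. The source does not give the exact adjustment formula, so this is an assumption; the function name `prior_adjusted_probs`, the parameter `tau`, and the class counts in the example are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prior_adjusted_probs(fused, W, b, class_prior, tau=1.0):
    """Hypothetical long-tail adjustment of class probabilities.

    fused      : (B, D) fused feature representation
    W, b       : linear classifier producing one raw score per class
    class_prior: (C,) empirical class frequencies, summing to 1
    tau        : assumed adjustment strength; tau=0 gives plain softmax
    """
    logits = fused @ W + b                 # raw discriminant values
    probs = softmax(logits)                # normalized exponential
    adjusted = probs / np.power(class_prior, tau)   # down-weight frequent classes
    return adjusted / adjusted.sum(axis=-1, keepdims=True)

# Toy example: three classes with a strongly long-tailed training set.
counts = np.array([900.0, 90.0, 10.0])     # hypothetical sample counts
prior = counts / counts.sum()

# A degenerate classifier that has only learned the class frequencies.
fused = np.ones((1, 4))
W = np.zeros((4, 3))
b = np.log(prior)

plain = prior_adjusted_probs(fused, W, b, prior, tau=0.0)
adjusted = prior_adjusted_probs(fused, W, b, prior, tau=1.0)
print(plain.round(3))     # equals the class prior: the head class dominates
print(adjusted.round(3))  # uniform: the frequency bias has been removed
```

In this toy case the unadjusted model would always predict the most frequent class, while the adjusted probabilities no longer favor any class on frequency alone, so evidence in the features decides the prediction.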
Optionally, extracting the image texture features in step S1 comprises: inputting the facial image in parallel into at least two pretrained convolutional neural networks with different architectures, each of which performs forward propagation and outputs an initial deep feature map; performing global average pooling and global max pooling on each initial deep feature map to obtain two spatial descriptor feature maps; concatenating the two spatial descriptor feature maps and generating a spatial attention weight map through a convolution operation; weighting the corresponding initial deep feature map by the spatial attention weight map to obtain an enhanced feature map; and scaling or projecting each enhanced feature map to the same number of channels and the same spatial size to obtain a plurality of standard feature maps, and concatenating all the standard feature maps along the channel dimension to obtain the final image texture features.

Optionally, extracting the facial key point structural features in step S1 comprises: extracting key point features from the facial image through a key point localization algorithm with fixed parameters, and generating coordinate data of facial anatomical key points; and encoding the coordinate data into a geometric structure feature map, wherein the spatial dimension of the geometric structure featur