CN-122024334-A - Premature infant action recognition method based on local area discrimination and AUC collaborative optimization
Abstract
The invention discloses a premature infant action recognition method based on local area discrimination and AUC collaborative optimization. First, the acquired premature infant monitoring video data are preprocessed to obtain a premature infant multi-label limb movement dataset. Next, a premature infant multi-label limb action classification model is constructed. Finally, a fusion loss function is constructed for the model, the model is trained on the dataset with this loss, and premature infant actions are recognized with the trained model. The invention uses multi-label class-specific residual attention to address the high similarity of spatiotemporal patterns between different classes, and introduces an AUC loss function to alleviate the class imbalance caused by the unconscious, spontaneous nature of premature infants' limb movements, thereby jointly improving the model's recognition performance on premature infant limb movements.
Inventors
- GUO XIAOFENG
- WANG TIANLEI
Assignees
- Hangzhou Dianzi University (杭州电子科技大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260414
Claims (10)
- 1. A premature infant action recognition method based on the collaborative optimization of local area discrimination and AUC, characterized by comprising the following steps: Step 1, preprocessing acquired premature infant monitoring video data and converting it into a premature infant multi-label limb movement dataset covering five body parts, namely the head, the left hand, the right hand, the left foot and the right foot, wherein the preprocessing comprises video category sorting, clipping and multi-label annotation; Step 2, constructing a premature infant multi-label limb action classification model comprising a video feature extraction module, a feature mapping module, a category perception feature enhancement module and a classifier module, wherein the category perception feature enhancement module comprises a dimension transformation operation, L2 normalization and a multi-head class-specific residual attention mechanism; Step 3, introducing an AUC loss based on the constructed premature infant multi-label limb action classification model, and constructing a fusion loss function combining binary cross-entropy and the AUC loss; and Step 4, feeding the preprocessed premature infant multi-label limb movement data into the constructed premature infant multi-label limb action classification model for recognition training, and saving the trained model.
- 2. The premature infant action recognition method based on the collaborative optimization of local area discrimination and AUC according to claim 1, wherein step 1 is specifically implemented as follows: 1-1, accurately cutting the acquired premature infant monitoring video data by limb action category to obtain video clips for each limb action category, and sorting the video clips by category; 1-2, extracting consecutive image frames, frame by frame, from the video data of the divided training set, validation set and test set, and resizing the images to a fixed size; 1-3, defining the fixed-length run of consecutive image frames covered by a sliding window as a single data sample, and applying one-hot-style encoding to each sample to convert its limb action category labels into binary multi-label form, wherein the encoded action categories are the head, the left hand, the right hand, the left leg and the right leg; specifically, the 5 limb parts are encoded as a one-dimensional array of 0s and 1s of size 1×5, in the order head, left hand, right hand, left leg, right leg, where the entry for a limb part is 1 if that part moves and 0 otherwise.
- 3. The premature infant action recognition method based on the collaborative optimization of local area discrimination and AUC according to claim 1, wherein step 2 is specifically implemented as follows: 2-1, the video feature extraction module uses a video feature extraction backbone network as the feature extractor, takes premature infant multi-label sample data as input, and extracts the space-time features of each sample; 2-2, the space-time features extracted by the feature extractor are fed into the feature mapping module, which applies a fully connected transformation and a Dropout operation to them to generate projection vectors; 2-3, the space-time features extracted by the feature extractor are input into the category perception feature enhancement module to obtain class-specific residual features; 2-4, the classifier module computes the classification result from the input projection vectors and the class-specific residual features.
- 4. The premature infant action recognition method based on local area discrimination and AUC collaborative optimization according to claim 3, wherein the category perception feature enhancement module specifically operates as follows: First, the video space-time features are mapped by a 3D convolution that sets the number of feature channels to the number of classes, yielding a class response tensor; an L2 norm is computed for each class weight vector of the 3D convolution, and the class response tensor is divided element by element by the norm of the corresponding channel to obtain a normalized class response; the space-time dimensions of the class response are then flattened into one dimension to obtain flattened space-time feature vectors. The flattened space-time feature vectors are input into a multi-head class-specific residual attention mechanism, which performs weighted aggregation over key space-time regions. The mechanism consists of several mutually independent class-specific residual attention units arranged in parallel, each with a two-branch parallel structure. In one branch of each unit, the flattened space-time features are average-pooled to obtain class-independent average-pooled features; in the other branch, the flattened space-time features are passed through a softmax function controlled by the unit's temperature parameter to obtain class-specific attention scores, and the attention scores are used to form a weighted sum of the flattened feature vectors, giving class-specific features. Finally, the class-independent average-pooled features and the class-specific features are combined by weighted summation to obtain the class-specific discrimination vector of that unit.
- 5. The premature infant action recognition method based on the collaborative optimization of local area discrimination and AUC according to claim 4, wherein the classifier module specifically operates as follows: the class-specific discrimination vectors output by the units of the multi-head class-specific residual attention mechanism are accumulated element by element per class to obtain a discrimination score for each class, and the discrimination scores of all classes together form a discrimination score vector; the discrimination score vector is fused, by class-wise weighting, with the features mapped by the fully connected layer to obtain a fused multi-label prediction score distribution; a Sigmoid function is applied element by element to the multi-label prediction score distribution, independently mapping each class's prediction score into the interval (0, 1) to obtain the final prediction probability distribution over all classes; binary decisions are then made on the probability distribution with a preset decision threshold of 0.5: if the output prediction probability is greater than 0.5, the sample is judged positive, meaning the corresponding limb movement occurs; otherwise it is treated as negative, meaning the corresponding limb movement does not occur.
- 6. The premature infant action recognition method based on the collaborative optimization of local area discrimination and AUC according to claim 1, wherein step 4 is specifically implemented as follows: 4-1, resizing the images of the preprocessed premature infant multi-label limb movement data to the input size of the video feature extraction module; 4-2, feeding the resized data into the constructed premature infant multi-label limb action classification model and performing end-to-end collaborative training based on the constructed fusion loss function; the training process is optimized with stochastic gradient descent, using a preset initial learning rate and batch size, and the learning rate is adjusted dynamically by a StepLR scheduler with a specified step size; in each training iteration, the model performs forward propagation on input image frame sequence samples of the specified length to generate a probability distribution, the fusion loss is computed against the ground-truth labels, and the model parameters are updated by backpropagation; meanwhile, the validation set is used for monitoring during training to avoid overfitting.
- 7. A premature infant action recognition device based on the collaborative optimization of local area discrimination and AUC, characterized by comprising the following modules: a preprocessing module, used for preprocessing acquired premature infant monitoring video data and converting it into a premature infant multi-label limb movement dataset covering five body parts, namely the head, the left hand, the right hand, the left foot and the right foot, wherein the preprocessing comprises video category sorting, clipping and multi-label annotation; a model construction module, used for constructing a premature infant multi-label limb action classification model comprising a video feature extraction module, a feature mapping module, a category perception feature enhancement module and a classifier module, wherein the category perception feature enhancement module comprises a dimension transformation operation, L2 normalization and a multi-head class-specific residual attention mechanism; a loss function construction module, used for introducing an AUC loss based on the constructed premature infant multi-label limb action classification model and constructing a fusion loss function combining binary cross-entropy and the AUC loss; and a training module, used for feeding the preprocessed premature infant multi-label limb movement data into the constructed premature infant multi-label limb action classification model for recognition training and saving the trained model.
- 8. An electronic device, comprising a processor and a memory; the memory is used for storing a computer program; The processor, when executing the program stored in the memory, is configured to implement the premature infant action recognition method according to any one of claims 1 to 6.
- 9. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the premature infant action recognition method according to any one of claims 1-6.
- 10. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of premature infant action recognition according to any of claims 1-6.
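As an illustration of the sample construction and label encoding described in claim 2, the following minimal Python sketch builds fixed-length clips with a sliding window and encodes the moving body parts as a 1×5 binary array. The function names, the string keys for the body parts, and the `window_length` and `stride` parameters are assumptions made for this example; the claims give neither a window length nor a stride.

```python
# Fixed order from claim 2: head, left hand, right hand, left leg, right leg.
# The string keys are hypothetical names chosen for this sketch.
BODY_PARTS = ("head", "left_hand", "right_hand", "left_leg", "right_leg")


def encode_label(moving_parts):
    """Return a 1x5 list of 0/1 flags, one per body part, in the fixed order."""
    unknown = set(moving_parts) - set(BODY_PARTS)
    if unknown:
        raise ValueError(f"unknown body parts: {sorted(unknown)}")
    return [1 if part in moving_parts else 0 for part in BODY_PARTS]


def sliding_windows(frames, window_length, stride):
    """Split a frame sequence into fixed-length clips.

    A trailing run of frames shorter than window_length is dropped
    (an assumption; the claims do not say how leftovers are handled).
    """
    last_start = len(frames) - window_length
    return [frames[s:s + window_length] for s in range(0, last_start + 1, stride)]
```

For example, `encode_label({"head", "right_hand"})` gives `[1, 0, 1, 0, 0]`, and a sample with no moving limb is encoded as all zeros.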
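The two-branch attention unit of claim 4 and the Sigmoid decision of claim 5 can be sketched in plain Python as follows. This is a simplified illustration, not the patented implementation: it starts from class responses that are assumed to be already L2-normalized and flattened, it uses a single hypothetical weight `lam` to combine the two branches, and it omits the class-wise fusion with the fully connected features because the claims do not give the fusion weights.

```python
import math


def softmax(values, temperature):
    """Temperature-scaled softmax over a list of floats (numerically stabilised)."""
    scaled = [temperature * v for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def csra_unit(class_responses, temperature, lam):
    """One class-specific residual attention unit.

    class_responses[c] holds the flattened, normalized response of class c
    over all space-time positions. lam is a hypothetical branch weight.
    """
    scores = []
    for response in class_responses:
        pooled = sum(response) / len(response)       # average-pooling branch
        weights = softmax(response, temperature)     # attention branch
        attended = sum(w * r for w, r in zip(weights, response))
        scores.append(pooled + lam * attended)       # weighted combination
    return scores


def multi_head_scores(class_responses, temperatures, lam):
    """Sum the per-class outputs of several units, one per temperature."""
    per_head = [csra_unit(class_responses, t, lam) for t in temperatures]
    return [sum(head[c] for head in per_head) for c in range(len(class_responses))]


def predict(scores, threshold=0.5):
    """Map scores to probabilities with a Sigmoid and apply the 0.5 threshold."""
    probs = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    return probs, [1 if p > threshold else 0 for p in probs]
```

With a high temperature the attention branch concentrates on the strongest space-time position, so a brief, localized limb movement contributes more to its class score than it would under average pooling alone.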
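Step 3 of claim 1 combines binary cross-entropy with an AUC loss, but the claims do not state the form of the AUC term or the fusion weight. The sketch below is therefore only one plausible reading: it uses a pairwise logistic surrogate as a stand-in for the AUC loss and a hypothetical weight `alpha`; both are assumptions and not taken from the patent.

```python
import math


def bce_loss(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all samples and classes."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def pairwise_auc_loss(probs, labels):
    """Logistic ranking surrogate, averaged over positive/negative pairs per class.

    Stand-in for the AUC loss: it penalises every pair in which a negative
    sample is scored close to or above a positive sample of the same class.
    """
    total, pairs = 0.0, 0
    for c in range(len(labels[0])):
        pos = [p[c] for p, y in zip(probs, labels) if y[c] == 1]
        neg = [p[c] for p, y in zip(probs, labels) if y[c] == 0]
        for sp in pos:
            for sn in neg:
                total += math.log1p(math.exp(-(sp - sn)))
                pairs += 1
    return total / pairs if pairs else 0.0


def fused_loss(probs, labels, alpha=0.5):
    """Binary cross-entropy plus alpha times the AUC surrogate (alpha is hypothetical)."""
    return bce_loss(probs, labels) + alpha * pairwise_auc_loss(probs, labels)
```

Because the ranking term depends only on how positives are ordered relative to negatives within a class, it is less sensitive to a skewed positive/negative ratio than cross-entropy alone, which is the motivation the abstract gives for adding an AUC loss.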
Description
Premature infant action recognition method based on local area discrimination and AUC collaborative optimization
Technical Field
The invention belongs to the field of non-contact video monitoring and premature infant action recognition, and relates to a multi-label learning method, designed for the premature infant action recognition task, based on the collaborative optimization of local area discrimination and AUC (Area Under the ROC Curve) loss.
Background
Premature infants are infants born before 37 weeks of gestation. Because of the shortened gestation, the body systems of a premature infant are not yet fully developed. Premature infants need more medical attention than full-term newborns so that potential physical or mental problems can be discovered as early as possible and medical intervention can be given in time. Observing and judging the physiological condition of premature infants relies mainly on their physiological data and their spontaneous limb movements. Traditional premature infant action monitoring relies mainly on sensors or on intermittent observation by medical staff. Because the skin of premature infants is tender and fragile and their immune system is immature, sensors in direct contact with the skin may cause irritation, damage or allergic reactions, and in severe cases pressure necrosis. A sensor may also prevent the limbs from stretching fully, interfering with the infant's spontaneous movement. Although observation by medical staff is professional and allows an infant's condition to be judged and analyzed immediately, hospital medical staff carry a heavy workload and cannot observe premature infants' limb movements 24 hours a day.
Therefore, real-time monitoring of premature infants with a non-contact camera is a more suitable and reliable approach. Existing non-contact action recognition methods are mainly designed for adults, but the limb movements of premature infants differ from the consciously controlled limb movements of adults: they are unconscious, spontaneous movements with no obvious semantic relationship. The different limb movements of premature infants are also highly similar in their spatiotemporal patterns. Moreover, premature infants differ from adults in body proportions: they have a longer trunk and shorter limbs, whereas adults have longer limbs. As a result, mainstream adult-oriented action recognition methods perform poorly when applied directly to non-contact monitoring of premature infants. Designing a non-contact intelligent monitoring system for premature infant action recognition is therefore a key problem to be solved.
Disclosure of Invention
To address the shortcomings of existing premature infant action recognition technology, the invention provides a premature infant multi-label limb action classification method based on the collaborative optimization of local area discrimination and a logarithmic AUC loss function. The technical scheme of the invention mainly comprises the following steps:
In a first aspect, an embodiment of the present application provides a premature infant action recognition method based on collaborative optimization of local area discrimination and AUC, including the following steps:
Step 1, preprocessing the acquired premature infant monitoring video data and converting it into a premature infant multi-label limb movement dataset covering five body parts: the head, the left hand, the right hand, the left foot and the right foot.
The preprocessing comprises video category sorting, clipping and multi-label annotation.
Step 2, constructing a premature infant multi-label limb action classification model comprising a video feature extraction module, a feature mapping module, a category perception feature enhancement module and a classifier module. The category perception feature enhancement module comprises a dimension transformation operation, L2 normalization and a multi-head class-specific residual attention mechanism.
Step 3, introducing an AUC loss based on the constructed premature infant multi-label limb action classification model, and constructing a fusion loss function combining binary cross-entropy and the AUC loss.
Step 4, feeding the preprocessed premature infant multi-label limb movement data into the constructed premature infant multi-label limb action classification model for recognition training, and saving the trained model.
In one possible implementation, step 1 is specifically implemented as follows: 1-1, accurately cutting the acquired premature infant monitoring video data acco