CN-116092123-B - Pedestrian image quality evaluation method, device, equipment and readable storage medium


Abstract

The application discloses a method, device, equipment and readable storage medium for evaluating pedestrian image quality. The method comprises: obtaining a batch of original pedestrian images, adding quality factor labels to the original pedestrian images, and generating a training set, wherein the quality factors comprise pedestrian presence, occlusion (shielding), illumination, blur and pose; inputting the training set into an initial deep learning model for training, and stopping training when the loss value of the model tends to converge, to obtain a pedestrian image quality evaluation model; obtaining a pedestrian image to be detected; and performing recognition on the pedestrian image to be detected based on the pedestrian image quality evaluation model to obtain a quality evaluation result of the pedestrian image to be detected. The application addresses the technical problem in the prior art that pedestrian images cannot be accurately recognized using template matching or single-dimension deep learning.

Inventors

WANG YANG

Assignees

Glodon Company Limited (广联达科技股份有限公司)

Dates

Publication Date: 20260512
Application Date: 20230207

Claims (7)

1. A method of evaluating pedestrian image quality, the method comprising:
acquiring a batch of original pedestrian images, adding quality factor labels to the original pedestrian images, and generating a training set, wherein the quality factors comprise pedestrian presence, occlusion, illumination, blur and pose;
inputting the training set into an initial deep learning model for training, and stopping training when the loss value of the model tends to converge, to obtain a pedestrian image quality evaluation model, wherein the pedestrian image quality evaluation model can evaluate a pedestrian image along five dimensions: pedestrian presence, occlusion, illumination, blur and pose;
acquiring a pedestrian image to be detected; and
performing recognition on the pedestrian image to be detected based on the pedestrian image quality evaluation model to obtain a quality evaluation result of the pedestrian image to be detected;
wherein adding the quality factor labels to the original pedestrian images comprises:
classifying the original pedestrian images to obtain a pedestrian image set and a background image set, adding a "pedestrian" label to the images in the pedestrian image set, and adding a "no pedestrian" label to the images in the background image set;
randomly cropping the pedestrian image set, and adding quality factor labels of different occlusion levels to the original images in the pedestrian image set and to the cropped images; and
determining characteristic values of illumination, blur and pose for the pedestrian image set based on a preset feature recognition algorithm, and adding, according to the determined characteristic values, quality factor labels of the corresponding illumination, blur and pose levels to the images in the pedestrian image set;
and wherein adding the quality factor labels of different occlusion levels to the original images and the cropped images comprises:
adding a no-occlusion label to the original images in the pedestrian image set;
randomly cropping the pedestrian image set, extracting local images containing pedestrians, and adding a primary occlusion label to the local images; and
acquiring the cropped local images containing pedestrians and the pedestrian position information in the original images of the pedestrian image set, performing coordinate fusion of the pedestrian position information with the local images and with the original images respectively, determining the number of persons contained in each fused image, selecting the pedestrian images that contain multiple persons according to that number, and adding a secondary occlusion label to the pedestrian images containing multiple persons.
2. The method according to claim 1, wherein performing recognition on the pedestrian image to be detected based on the pedestrian image quality evaluation model to obtain the quality evaluation result of the pedestrian image to be detected comprises: inputting the pedestrian image to be detected into a convolutional neural network of the pedestrian image quality evaluation model for feature recognition to obtain a multidimensional feature vector; performing prediction on the multidimensional feature vector using a plurality of subtask neural networks to obtain a prediction probability value of the pedestrian image to be detected in each subtask neural network; and performing integrated classification and output on the prediction probability values based on a softmax layer to obtain the corresponding quality evaluation result of the pedestrian image to be detected.
3. The method of claim 1, wherein inputting the training set into the initial deep learning model for training, and stopping training when the loss value of the model tends to converge to obtain the pedestrian image quality evaluation model, comprises:
initializing the weights of the forward calculation formula with random values;
inputting training-set pedestrian images carrying quality factor labels into a convolutional neural network layer of the initial deep learning network, and performing supervised learning to obtain multi-layer feature vectors;
passing the multi-layer feature vectors forward into initial subtask neural network layers for quality feature calculation, to obtain a predicted value of each training-set pedestrian image in each initial subtask neural network layer, wherein the initial subtask neural network layers comprise five subtasks: pedestrian presence, occlusion, illumination, blur and pose;
acquiring a weight value of each subtask neural network layer, calculating the error between the predicted value and the true label value through a first adjustment function of a softmax layer, performing a weighted summation of the subtask errors using the normalized subtask weight values, and updating each subtask weight value if the result is larger than a first preset threshold, wherein the first adjustment function is focal loss, and the weight value of the pedestrian presence subtask is higher than the weight values of the other subtasks;
performing a new round of forward propagation using the updated weight values, and repeating forward propagation and weight updating until the error value is smaller than or equal to the first preset threshold, then stopping training of the initial deep learning model to obtain an intermediate model; and
to improve the generalization performance of the deep learning model so that it suits more scenes, determining a final error value of the intermediate model and, if the final error value exceeds a second preset threshold, replacing the first adjustment function of the intermediate model with a second adjustment function, correcting each subtask weight value of the intermediate model through the second adjustment function, and repeatedly training the intermediate model based on the corrected weight values until the error value is smaller than or equal to the second preset threshold, then stopping training to obtain the pedestrian image quality evaluation model, wherein the second adjustment function is cosloss.
4. A method according to any one of claims 1-3, wherein the initial deep learning model is any one of the network structures ResNet, MobileNet or ShuffleNet.
5. An apparatus for evaluating pedestrian image quality, the apparatus comprising:
a labeling module for acquiring a batch of original pedestrian images, adding quality factor labels to the original pedestrian images, and generating a training set, wherein the quality factors comprise pedestrian presence, occlusion, illumination, blur and pose;
a training module for inputting the training set into an initial deep learning model for training, and stopping training when the loss value of the model tends to converge, to obtain a pedestrian image quality evaluation model, wherein the pedestrian image quality evaluation model can evaluate a pedestrian image along five dimensions: pedestrian presence, occlusion, illumination, blur and pose;
an acquisition module for acquiring a pedestrian image to be detected; and
an identification module for performing recognition on the pedestrian image to be detected based on the pedestrian image quality evaluation model to obtain a quality evaluation result of the pedestrian image to be detected;
wherein the labeling module comprises:
a first labeling unit for classifying the original pedestrian images to obtain a pedestrian image set and a background image set, adding a "pedestrian" label to the images in the pedestrian image set, and adding a "no pedestrian" label to the images in the background image set;
a second labeling unit for randomly cropping the pedestrian image set and adding quality factor labels of different occlusion levels to the original images in the pedestrian image set and to the cropped images; and
a third labeling unit for determining characteristic values of illumination, blur and pose for the pedestrian image set based on a preset feature recognition algorithm, and adding, according to the determined characteristic values, quality factor labels of the corresponding illumination, blur and pose levels to the images in the pedestrian image set;
and wherein the second labeling unit is further used for adding a no-occlusion label to the original images in the pedestrian image set; randomly cropping the pedestrian image set, extracting local images containing pedestrians, and adding a primary occlusion label to the local images; and acquiring the cropped local images containing pedestrians and the pedestrian position information in the original images of the pedestrian image set, performing coordinate fusion of the pedestrian position information with the local images and with the original images respectively, determining the number of persons contained in each fused image, selecting the pedestrian images that contain multiple persons according to that number, and adding a secondary occlusion label to the pedestrian images containing multiple persons.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 4 when executing the computer program.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any of claims 1 to 4.
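
The shielding (occlusion) labeling steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not specify how "coordinate fusion" is performed, so the sketch assumes it means comparing a window's coordinates against the known pedestrian boxes and counting how many pedestrians overlap it. The box format, the level codes, the overlap threshold, the rule that a multi-person window overrides the other labels, and all function names are hypothetical.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in original-image pixels (assumed format)

# Hypothetical integer codes for the three occlusion levels named in the claims.
NO_OCCLUSION, PRIMARY_OCCLUSION, SECONDARY_OCCLUSION = 0, 1, 2


def random_crop(box: Box, rng: random.Random, min_frac: float = 0.5) -> Box:
    """Return a random sub-window of `box` keeping at least `min_frac` of each side."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cw = rng.randint(max(1, int(w * min_frac)), w)
    ch = rng.randint(max(1, int(h * min_frac)), h)
    cx = rng.randint(x1, x2 - cw)
    cy = rng.randint(y1, y2 - ch)
    return (cx, cy, cx + cw, cy + ch)


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two boxes (0 if they do not intersect)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def count_people(window: Box, pedestrians: List[Box], min_overlap: float = 0.2) -> int:
    """Count pedestrians with at least `min_overlap` of their own box inside `window`."""
    n = 0
    for p in pedestrians:
        area = (p[2] - p[0]) * (p[3] - p[1])
        if area > 0 and overlap_area(window, p) / area >= min_overlap:
            n += 1
    return n


def occlusion_label(window: Box, is_crop: bool, pedestrians: List[Box]) -> int:
    """Assumed rule: multi-person windows are secondary; crops primary; originals none."""
    if count_people(window, pedestrians) > 1:
        return SECONDARY_OCCLUSION
    return PRIMARY_OCCLUSION if is_crop else NO_OCCLUSION
```

For example, with a single pedestrian box `(0, 0, 100, 200)`, the original window receives the no-occlusion code and any crop produced by `random_crop` receives the primary code; adding a second, overlapping pedestrian box moves the original window to the secondary code.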

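The error calculation recited in claims 2 and 3 (per-subtask softmax outputs, a focal-loss error per subtask, and a weighted sum using normalized subtask weights) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claims name focal loss without giving its parameters, so the sketch uses the commonly cited form FL = -(1 - p_t)^gamma * log(p_t) with gamma = 2; the subtask names, raw weight values and helper names are hypothetical, and the backbone that would produce the logits is omitted.

```python
import math
from typing import Dict, List

# The five subtasks named in the claims (English names are assumptions).
SUBTASKS = ["presence", "occlusion", "illumination", "blur", "pose"]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one subtask head's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def focal_loss(probs: List[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one sample: down-weights examples that are already well classified."""
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale the raw subtask weights so that they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


def total_loss(logits: Dict[str, List[float]],
               targets: Dict[str, int],
               weights: Dict[str, float],
               gamma: float = 2.0) -> float:
    """Weighted sum of the per-subtask focal losses using normalized weights."""
    w = normalize(weights)
    return sum(
        w[t] * focal_loss(softmax(logits[t]), targets[t], gamma)
        for t in SUBTASKS
    )
```

A caller would give the presence subtask a larger raw weight than the others, for instance `{"presence": 2.0, "occlusion": 1.0, "illumination": 1.0, "blur": 1.0, "pose": 1.0}`, compare `total_loss(...)` with the first preset threshold, and keep updating the weights while the result exceeds it.
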
Description

Pedestrian image quality evaluation method, device, equipment and readable storage medium

Technical Field

The present invention relates to the field of image recognition, and in particular to a pedestrian image quality evaluation method, device, equipment and readable storage medium.

Background

Pedestrian re-identification is a technology that identifies pedestrians by extracting feature information from pedestrian images. It uses computer vision to judge whether a specific pedestrian is present in an image or video sequence: given a pedestrian image from one monitoring device, it retrieves images of the same pedestrian captured by other devices, which compensates for the limited field of view of a fixed camera.

The quality of a pedestrian image directly determines the usability and recognition accuracy of the target detection algorithm. Although pedestrian detection algorithms have improved greatly, in practice some false detections (such as a detection frame containing no pedestrian) and many detections with incomplete information are unavoidable, for example a detection frame that contains only the upper, lower, left or right half of a body, or that contains more than one person. In addition, factors such as illumination and blur cause overexposure, darkness and excessive blurring in pedestrian images, which seriously affects the reliability and recognition accuracy of pedestrian re-identification algorithms.

In the prior art, false and incomplete detections from a target detection algorithm are generally handled by template matching or by deep learning.

Template matching is limited mainly because it can only handle translation: if the target in the original image is rotated or changes in size, the algorithm fails, so it is essentially unusable for relatively complex tasks. Deep learning methods handle the problem relatively simply: a classification method cleans false or incomplete detections in a second pass, filtering out unreasonable images. Patent CN 113076917A is a deep-learning-based method that handles pedestrian quality evaluation as two parallel tasks: (1) classifying data into two categories, occluded and non-occluded; and (2) classifying data into five categories: upper half body, lower half body, left half body, right half body and whole body. The two tasks share a backbone network, and a weighted loss is used to optimize the network. Because many factors affect pedestrian image quality, such a deep learning recognition algorithm covers too few dimensions and cannot accurately recognize pedestrian images.

For the technical problem that pedestrian images cannot be accurately recognized using template matching or single-dimension deep learning in the prior art, no effective solution currently exists.

Disclosure of Invention

The invention aims to provide a pedestrian image quality evaluation method, device, equipment and storage medium that can solve the technical problem that pedestrian images cannot be accurately recognized using template matching or single-dimension deep learning in the prior art.
The invention provides a pedestrian image quality evaluation method comprising: obtaining a batch of original pedestrian images and adding quality factor labels to the original pedestrian images to generate a training set; inputting the training set into an initial deep learning model for training, and stopping training when the loss value of the model tends to converge, to obtain a pedestrian image quality evaluation model; obtaining a pedestrian image to be detected; and performing recognition on the pedestrian image to be detected based on the pedestrian image quality evaluation model to obtain a quality evaluation result of the pedestrian image to be detected. The recognition comprises: inputting the pedestrian image to be detected into a convolutional neural network of the pedestrian image quality evaluation model for feature recognition to obtain a multi-dimensional feature vector; performing prediction on the multi-dimensional feature vector using a plurality of subtask neural networks to obtain a prediction probability value of the pedestrian image to be detected in each subtask neural network; and performing integrated classification and output on the prediction probability values based on a softmax layer to obtain a quality eva