CN-113705294-B - Image recognition method and device based on artificial intelligence


Abstract

The application provides an image recognition method and device based on artificial intelligence. The method comprises: performing target detection processing on an image to be recognized to obtain a target object image from the image to be recognized; performing feature extraction processing on the target object image to obtain corresponding image features; performing key point recognition processing on the image features to obtain key points of the target object and their corresponding positions; and determining the integrity degree of the target object in the image to be recognized based on the key points and the corresponding positions of the target object. The application can accurately identify the target object in an image and flexibly judge how incomplete the target object is in the image.
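
The last step of the abstract can be illustrated with a minimal sketch. It assumes the ratio-based definition given in the claims (number of identified key points divided by the number of key points in the preset set). The key point names, the `PRESET_KEYPOINTS` tuple, and the function `integrity_degree` are hypothetical and chosen only for illustration; the source does not enumerate a key point set.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical preset key point set; the source does not enumerate one.
PRESET_KEYPOINTS: Sequence[str] = (
    "head", "left_shoulder", "right_shoulder", "left_hip", "right_hip",
)


def integrity_degree(
    identified: Dict[str, Tuple[int, int]],
    preset: Sequence[str] = PRESET_KEYPOINTS,
) -> float:
    """Return the share of preset key points that were identified (0.0 to 1.0)."""
    if not preset:
        raise ValueError("preset key point set must not be empty")
    # Only key points that belong to the preset set are counted.
    found = sum(1 for name in preset if name in identified)
    return found / len(preset)


# Usage: three of the five preset key points were located in the crop.
identified = {
    "head": (40, 12),
    "left_shoulder": (25, 48),
    "right_shoulder": (57, 47),
}
score = integrity_degree(identified)
print(f"integrity degree: {score:.2f}")  # prints "integrity degree: 0.60"
```

A score of this kind can then be compared against an integrity degree threshold, which is how the claims use it when filtering candidate cover images.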

Inventors

Xie Yibin
Hou Haodi

Assignees

Tencent Technology (Shenzhen) Co., Ltd.

Dates

Publication Date: 2026-05-05
Application Date: 2021-03-04

Claims (19)

1. An image recognition method based on artificial intelligence, comprising: performing target detection processing on an image to be identified so as to acquire a target object image from the image to be identified; filling the surrounding area of the target object image, and performing feature extraction processing on the filled target object image of a preset size to obtain corresponding image features; invoking a first key point detection model to perform the following processing: mapping the image features into probability maps of a plurality of channels, wherein the probability map of each channel corresponds to the probability distribution of one key point in a preset key point set, the probability distribution represents the probability that each pixel point in the target object image belongs to the key point corresponding to the probability map, the first key point detection model is trained with a first training sample set, the first training sample set comprises occlusion image samples and non-occlusion image samples, the occlusion image samples and the non-occlusion image samples carry key point labels, and some key points of the target object in the occlusion image samples are occluded by the background; identifying the pixel point with the highest probability in each probability map as the key point corresponding to that probability map, and identifying the position of the pixel point with the highest probability as the position of the key point corresponding to that probability map; combining the key points identified from each probability map with the corresponding positions to form a key point identification result of the target object, wherein the key point identification result comprises a plurality of key points and the corresponding positions, and the key points are in one-to-one correspondence with all key points in the preset key point set; and determining the completeness of the target object in the image to be identified based on the key points and the corresponding positions of the target object.
2. The method according to claim 1, wherein performing the target detection processing on the image to be identified so as to acquire the target object image from the image to be identified comprises: performing target detection processing on the image to be identified to obtain a detection frame comprising the target object; and cutting out the target object image from the image to be identified based on the position of the detection frame; and wherein the method further comprises: directly performing feature extraction processing on the cut-out target object image to obtain corresponding image features.
3. The method according to claim 2, wherein performing the target detection processing on the image to be identified to obtain the detection frame comprising the target object comprises: performing feature extraction on the image to be identified to obtain a corresponding feature map; determining a plurality of candidate frames in the image to be identified; mapping the plurality of candidate frames into the feature map to obtain a plurality of corresponding candidate feature maps; performing max pooling on the plurality of candidate feature maps to obtain a plurality of candidate region maps of the same size; and performing classification processing and candidate frame position regression processing on the plurality of candidate region maps to obtain the detection frame comprising the target object.
4. The method according to claim 1, wherein the method further comprises: invoking a second key point detection model to perform the following processing: mapping the image features into a probability map, wherein the probability map comprises the probability that each pixel point in the target object image corresponds to each key point in the preset key point set; performing the following processing for each pixel point in the target object image: determining the maximum of the probabilities that the pixel point corresponds to each key point in the preset key point set; and when the maximum probability exceeds a probability threshold, identifying the pixel point as the key point corresponding to the maximum probability, and identifying the position of the pixel point as the position of the key point corresponding to the maximum probability; and combining the key points identified from the probability map with the corresponding positions to form a key point identification result of the target object, wherein the key point identification result comprises at least one key point in the preset key point set and the position corresponding to that key point.
5. The method of claim 1, wherein determining the integrity degree of the target object in the image to be identified based on the key points and the corresponding positions of the target object comprises: performing the following processing for each key point of the target object: when the position of the key point is located in the surrounding area that was filled, determining that the key point is a missing key point in the image to be identified; removing the missing key points from the identified key points of the target object so as to update the identified key points of the target object; and taking the ratio of the number of identified key points of the target object after the update to a preset key point number as the integrity degree of the target object in the image to be identified; wherein the preset key point number is the number of key points in the preset key point set of the target object.
6. The method according to claim 1, wherein the method further comprises: taking the ratio of the number of key points of the target object obtained by the key point identification processing to a preset key point number as the integrity degree of the target object in the image to be identified; wherein the preset key point number is the number of key points in the preset key point set of the target object.
7. The method according to claim 5 or 6, wherein prior to determining the ratio, the method further comprises: performing occlusion identification processing on each key point of the target object to determine the occluded key points among the key points of the target object image; and removing the occluded key points from the key points of the target object so as to update the key points of the target object.
8. The method according to any one of claims 1 to 6, further comprising: when the image to be identified is a candidate cover image of a media account and the integrity degree of the image to be identified is lower than an integrity degree threshold, deleting the image to be identified from a candidate cover image set; and when the image to be identified is carried in information to be recommended and the integrity degree of the image to be identified is lower than the integrity degree threshold, blocking recommendation of the image to be identified or reducing the recommendation weight of the image to be identified.
9. An artificial intelligence based image recognition device, comprising: a target detection module, configured to perform target detection processing on an image to be identified so as to acquire a target object image from the image to be identified; a feature extraction module, configured to fill the surrounding area of the target object image, and to perform feature extraction processing on the filled target object image of a preset size to obtain corresponding image features; a key point identification module, configured to invoke a first key point detection model to perform the following processing: mapping the image features into probability maps of a plurality of channels, wherein the probability map of each channel corresponds to the probability distribution of one key point in a preset key point set, the probability distribution represents the probability that each pixel point in the target object image belongs to the key point corresponding to the probability map, the first key point detection model is trained with a first training sample set, the first training sample set comprises occlusion image samples and non-occlusion image samples, the occlusion image samples and the non-occlusion image samples carry key point labels, and some key points of the target object in the occlusion image samples are occluded by the background; identifying the pixel point with the highest probability in each probability map as the key point corresponding to that probability map, and identifying the position of the pixel point with the highest probability as the position of the key point corresponding to that probability map; and combining the key points identified from each probability map with the corresponding positions to form a key point identification result of the target object, wherein the key point identification result comprises a plurality of key points and the corresponding positions, and the key points are in one-to-one correspondence with all key points in the preset key point set; and an integrity judging module, configured to determine the integrity degree of the target object in the image to be identified based on the key points and the corresponding positions of the target object.
10. The apparatus of claim 9, wherein the target detection module is further configured to: perform target detection processing on the image to be identified to obtain a detection frame comprising the target object, and cut out the target object image from the image to be identified based on the position of the detection frame; and the feature extraction module is further configured to directly perform feature extraction processing on the cut-out target object image to obtain corresponding image features.
11. The apparatus of claim 10, wherein the target detection module is further configured to: perform feature extraction on the image to be identified to obtain a corresponding feature map; determine a plurality of candidate frames in the image to be identified; map the plurality of candidate frames into the feature map to obtain a plurality of corresponding candidate feature maps; perform max pooling on the plurality of candidate feature maps to obtain a plurality of candidate region maps of the same size; and perform classification processing and candidate frame position regression processing on the plurality of candidate region maps to obtain the detection frame comprising the target object.
12. The apparatus of claim 9, wherein the key point identification module is further configured to invoke a second key point detection model to perform the following processing: mapping the image features into a probability map, wherein the probability map comprises the probability that each pixel point in the target object image corresponds to each key point in the preset key point set; for each pixel point in the target object image, determining the maximum of the probabilities that the pixel point corresponds to each key point in the preset key point set, and when the maximum probability exceeds a probability threshold, identifying the pixel point as the key point corresponding to the maximum probability and identifying the position of the pixel point as the position of the key point corresponding to the maximum probability; and combining the key points identified from the probability map with the corresponding positions to form a key point identification result of the target object, wherein the key point identification result comprises at least one key point in the preset key point set and the corresponding position.
13. The apparatus of claim 9, wherein the integrity judging module is further configured to: perform the following processing for each key point of the target object: when the position of the key point is located in the surrounding area that was filled, determining that the key point is a missing key point in the image to be identified; remove the missing key points from the identified key points of the target object so as to update the identified key points of the target object; and take the ratio of the number of identified key points of the target object after the update to a preset key point number as the integrity degree of the target object in the image to be identified, wherein the preset key point number is the number of key points in the preset key point set of the target object.
14. The apparatus of claim 9, wherein the integrity judging module is further configured to: take the ratio of the number of key points of the target object obtained by the key point identification processing to a preset key point number as the integrity degree of the target object in the image to be identified, wherein the preset key point number is the number of key points in the preset key point set of the target object.
15. The apparatus of claim 13 or 14, wherein the integrity judging module, prior to determining the ratio, is further configured to: perform occlusion identification processing on each key point of the target object to determine the occluded key points among the key points of the target object image; and remove the occluded key points from the key points of the target object so as to update the key points of the target object.
16. The apparatus of any one of claims 9 to 14, further comprising a processing module configured to: when the image to be identified is carried in information to be recommended and the integrity degree of the image to be identified is lower than the integrity degree threshold, block recommendation of the image to be identified or reduce the recommendation weight of the image to be identified.
17. An electronic device, comprising: a memory for storing executable instructions; and a processor for implementing the method of any one of claims 1 to 8 when executing the executable instructions stored in the memory.
18. A computer readable storage medium storing executable instructions for implementing the method of any one of claims 1 to 8 when executed by a processor.
19. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any one of claims 1 to 8.
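
The padding-based integrity check described in claims 1 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the crop is assumed to be centred on a square canvas, the probability maps are assumed to have the same resolution as the padded image, and all names (`content_box`, `decode_argmax`, `integrity_after_padding`, `peak_map`, the key point names) are hypothetical.

```python
from typing import Dict, List, Tuple

Heatmap = List[List[float]]       # rows of per-pixel probabilities
Box = Tuple[int, int, int, int]   # left, top, right, bottom (right/bottom exclusive)


def content_box(crop_w: int, crop_h: int, canvas: int) -> Box:
    """Region occupied by the original crop on a canvas x canvas square.

    Centring is an assumption; the claims only say the surrounding area is filled.
    """
    left, top = (canvas - crop_w) // 2, (canvas - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def decode_argmax(heatmaps: Dict[str, Heatmap]) -> Dict[str, Tuple[int, int]]:
    """Per channel, take the highest-probability pixel as the key point position (x, y)."""
    positions = {}
    for name, hm in heatmaps.items():
        cells = ((x, y) for y, row in enumerate(hm) for x in range(len(row)))
        positions[name] = max(cells, key=lambda p: hm[p[1]][p[0]])
    return positions


def integrity_after_padding(
    positions: Dict[str, Tuple[int, int]], box: Box, preset_count: int
) -> Tuple[float, Dict[str, Tuple[int, int]]]:
    """Drop key points that fall in the filled area, then return (ratio, kept key points)."""
    left, top, right, bottom = box
    kept = {
        name: (x, y)
        for name, (x, y) in positions.items()
        if left <= x < right and top <= y < bottom
    }
    return len(kept) / preset_count, kept


def peak_map(size: int, x: int, y: int) -> Heatmap:
    """Toy probability map with a single peak at (x, y), for demonstration only."""
    grid = [[0.01] * size for _ in range(size)]
    grid[y][x] = 0.9
    return grid


# Usage: a 4x2 crop centred on a 6x6 canvas occupies x in [1, 5) and y in [2, 4).
box = content_box(crop_w=4, crop_h=2, canvas=6)
heatmaps = {"head": peak_map(6, 2, 2), "foot": peak_map(6, 3, 5)}
positions = decode_argmax(heatmaps)
score, kept = integrity_after_padding(positions, box, preset_count=len(heatmaps))
print(score, sorted(kept))  # prints "0.5 ['head']"
```

Because the first model always returns one position per channel, a key point that is cut off in the original image tends to be predicted in the filled border, which is why filtering by the content box yields a usable count for the ratio.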

Description

Image recognition method and device based on artificial intelligence

Technical Field

The present application relates to artificial intelligence technology, and in particular to an image recognition method, device, electronic apparatus and computer readable storage medium based on artificial intelligence.

Background

Artificial intelligence (AI) is the theory, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain optimal results.

Image recognition (Image Classification) refers to a technique of processing, analyzing, and understanding images with a computer in order to recognize targets and objects of various patterns. With the development of artificial intelligence technology, image recognition has advanced continuously in recent years, and technologies such as face recognition and human body recognition are widely applied in many fields. Accurately evaluating the integrity of objects in an image to be recognized has therefore become an important challenge for image recognition processing.

In the related art, judging whether an object in an image is complete is generally done with a simple classification model that decides whether the image as a whole is complete; the object is rarely identified first and its degree of completeness judged afterwards. When such recognition is applied to a specific target, images that do not contain the target object are misjudged, the degree of completeness of the target object cannot be evaluated accurately, and the accuracy and precision of the completeness judgment of the target object suffer.
Disclosure of Invention

The embodiments of the application provide an image recognition method, an image recognition device, electronic equipment and a computer readable storage medium based on artificial intelligence, which can accurately recognize a target object and flexibly judge how incomplete the target object is in an image.

The technical solution of the embodiments of the application is implemented as follows.

An embodiment of the application provides an image recognition method based on artificial intelligence, comprising:
performing target detection processing on an image to be identified so as to acquire a target object image from the image to be identified;
performing feature extraction processing based on the target object image to obtain corresponding image features;
performing key point identification processing based on the image features to obtain key points and corresponding positions of the target object; and
determining the completeness of the target object in the image to be identified based on the key points and the corresponding positions of the target object.

An embodiment of the application provides an image recognition device based on artificial intelligence, comprising:
a target detection module, configured to perform target detection processing on the image to be identified so as to acquire a target object image from the image to be identified;
a feature extraction module, configured to perform feature extraction processing based on the target object image to obtain corresponding image features;
a key point identification module, configured to perform key point identification processing based on the image features to obtain key points and corresponding positions of the target object; and
an integrity judging module, configured to determine the integrity degree of the target object in the image to be identified based on the key points and the corresponding positions of the target object.

In the above solution, the target detection module is further configured to: perform target detection processing on the image to be identified to obtain a detection frame comprising the target object; and cut out the target object image from the image to be identified based on the position of the detection frame.

In the above solution, the target detection module is further configured to: perform feature extraction on the image to be identified to obtain a corresponding feature map; determine a plurality of candidate frames in the image to be identified; map the plurality of candidate frames into the feature map to obtain a plurality of corresponding candidate feature maps; perform max pooling on the plurality of candidate feature maps to obtain a plurality of candidate region maps of the same size; and perform classification processing and candidate frame position regression processing on the plurality of candidate region maps to obtain a detection frame comprising the target object.

In the above solution, the feature extraction module is further conf