US-12625904-B2 - Information processing apparatus, information processing method, and non-transitory computer readable medium for image retrieval based on pose information

Abstract

An information processing apparatus includes a pose acquisition unit, a retrieval unit, and a display control unit. The pose acquisition unit acquires first pose information indicating a pose of a person shown in each of a plurality of reference images associated with a predetermined pose, and second pose information indicating a pose of a capture target person shown in a query image. The retrieval unit retrieves a reference image showing a person whose pose or action is similar to that of the capture target person shown in the query image, from among the plurality of reference images, based on a similarity degree between the first pose information and the second pose information. The display control unit causes a display unit to display at least one of the first pose information and the second pose information in a display mode according to an index used for retrieving the reference image.
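As an illustrative sketch only, not taken from the patent text, the retrieval step summarized above can be pictured as ranking reference poses by a similarity degree against the query pose. The cosine metric, the flattened keypoint vectors, and the names `pose_similarity` and `retrieve` are assumptions made for this example; the abstract does not specify how the similarity degree is computed.

```python
import math


def pose_similarity(first_pose, second_pose):
    """Cosine similarity between two flattened keypoint vectors.

    The metric is an assumption for illustration; the source only says
    that a "similarity degree" between pose information is used.
    """
    dot = sum(x * y for x, y in zip(first_pose, second_pose))
    norm = math.sqrt(sum(x * x for x in first_pose)) * math.sqrt(
        sum(y * y for y in second_pose)
    )
    return dot / norm if norm else 0.0


def retrieve(query_pose, reference_poses, top_k=1):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [
        (name, pose_similarity(pose, query_pose))
        for name, pose in reference_poses.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical reference poses, each a flattened list of keypoint coordinates.
references = {
    "ref_a": [1.0, 0.0, 1.0, 0.0],
    "ref_b": [0.0, 1.0, 0.0, 1.0],
}
best = retrieve([1.0, 0.0, 1.0, 0.1], references)
```

Here `best` holds the single reference whose pose vector is closest to the query; a display layer could then choose a display mode (for example a color) from the returned similarity value.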

Inventors

  • Ryo Kawai
  • Noboru Yoshida
  • Jianquan Liu
  • Satoshi Yamazaki
  • Tingting Dong
  • Karen Stephen
  • Youhei Sasaki
  • Naoki Shindou
  • Yuta Namiki

Assignees

  • NEC CORPORATION

Dates

Publication Date
20260512
Application Date
20220428

Claims (15)

  1. An information processing apparatus comprising: at least one memory storing instructions; and at least one processor configured to execute the instructions to: acquire first pose information regarding a first model configured with a plurality of model elements, the first pose information indicating a pose of a person shown in each of a plurality of reference images associated with a predetermined pose, and second pose information regarding a second model configured with the plurality of model elements, the second pose information indicating a pose of a capture target person shown in a query image; retrieve a reference image showing a person whose pose or action is similar to that of the capture target person shown in the query image, from among the plurality of reference images, based on a similarity degree between the first pose information and the second pose information; and cause a display to display at least one of the first pose information and the second pose information in a display mode according to an index used for retrieving the reference image, the index includes a weight indicating a degree to which each of model elements is emphasized in order to derive the similarity degree between pose estimation models in the predetermined pose.
  2. The information processing apparatus according to claim 1, wherein the index includes the similarity degree, and the at least one processor is further configured to execute the instructions to cause the display to display at least one of the first pose information and the second pose information in the display mode according to the similarity degree.
  3. The information processing apparatus according to claim 2, wherein the at least one processor is further configured to execute the instructions to, in a case where the second pose information is displayed, relate the second pose information to a specific reference image among the plurality of reference images; and cause the display to display the second pose information in the display mode according to the similarity degree between the second pose information and the first pose information associated with the specific reference image.
  4. The information processing apparatus according to claim 1, wherein the at least one processor is further configured to execute the instructions to cause the display to display a plurality of model elements making up at least one of the first model and the second model in the display mode according to the weight.
  5. The information processing apparatus according to claim 1, wherein the similarity degree includes at least one of an overall similarity degree being an overall similarity degree between the first model and the second model, and an element similarity degree being a similarity degree for each associated model element between the first model and the second model.
  6. The information processing apparatus according to claim 1, wherein the index includes the similarity degree, the at least one processor is further configured to execute the instructions to, in a case where the second model is displayed, relate the second model to a specific reference image among the plurality of reference images; and cause the display to display the second model in the display mode according to the similarity degree between the second model and the first model associated with the specific reference image, and the first model associated with the specific reference image is any of a model with a largest maximum overall similarity degree, a model including a largest element similarity degree, and a model specified by a user.
  7. The information processing apparatus according to claim 5, wherein, the at least one processor is further configured to execute the instructions to, in a case where the second model is displayed, derive an average value from the element similarity degrees for each of model elements; and cause the display to display the second model in the display mode according to the average value.
  8. The information processing apparatus according to claim 5, wherein, the at least one processor is further configured to execute the instructions to, in a case where the similarity degree includes the element similarity degrees, cause the display to display only a model element with an element similarity degree of the element similarity degrees equal to or more than a first criterion value and equal to or less than a second criterion value or with a weight equal to or more than a predetermined threshold value, among model elements making up at least one of the first model and the second model.
  9. The information processing apparatus according to claim 5, wherein the at least one processor is further configured to execute the instructions to derive the similarity degree by using the first model and the second model.
  10. The information processing apparatus according to claim 9, wherein the similarity degree derived by using the first model and the second model includes the overall similarity degree between the first model and the second model.
  11. The information processing apparatus according to claim 10, wherein the at least one processor is further configured to execute the instructions to derive the overall similarity degree by using a weight indicating a degree to which each of model elements is emphasized in order to derive a similarity degree between pose estimation models in the predetermined pose, and the element similarity degrees.
  12. The information processing apparatus according to claim 1, wherein a model element includes a joint element associated with a plurality of joints, and a trunk element and a bone element associated with each of a trunk and a skeleton connecting between the plurality of joints, and the at least one processor is further configured to execute the instructions to cause the display to display at least one of the trunk element and the bone element as a line with an arrow.
  13. The information processing apparatus according to claim 1, wherein the similarity degree includes a plurality of element similarity degrees, regarding each of the plurality of corresponding model elements between the first model and the second model, and an overall similarity degree between the first model and the second model, the overall similarity degree being calculated based on the plurality of element similarity degrees.
  14. An information processing method comprising, acquiring first pose information regarding a first model configured with a plurality of model elements, the first pose information indicating a pose of a person shown in each of a plurality of reference images associated with a predetermined pose, and second pose information regarding a second model configured with the plurality of model elements, the second pose information indicating a pose of a capture target person shown in a query image; retrieving a reference image showing a person whose pose or action is similar to that of the capture target person shown in the query image, from among the plurality of reference images, based on a similarity degree between the first pose information and the second pose information; and causing a display to display at least one of the first pose information and the second pose information in a display mode according to an index used for retrieving the reference image, the index includes a weight indicating a degree to which each of model elements is emphasized in order to derive the similarity degree between pose estimation models in the predetermined pose.
  15. A non-transitory computer readable medium storing a program for causing a computer to execute: acquiring first pose information regarding a first model configured with a plurality of model elements, the first pose information indicating a pose of a person shown in each of a plurality of reference images associated with a predetermined pose, and second pose information regarding a second model configured with the plurality of model elements, the second pose information indicating a pose of a capture target person shown in a query image; retrieving a reference image showing a person whose pose or action is similar to that of the capture target person shown in the query image, from among the plurality of reference images, based on a similarity degree between the first pose information and the second pose information; and causing a display to display at least one of the first pose information and the second pose information in a display mode according to an index used for retrieving the reference image, the index includes a weight indicating a degree to which each of model elements is emphasized in order to derive the similarity degree between pose estimation models in the predetermined pose.
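
The relationship between element similarity degrees, weights, and the overall similarity degree (claims 5, 7, 11, and 13) and the display filter of claim 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the claims do not fix a formula, so the weighted mean used here, the dictionary layout, and the names `overall_similarity`, `average_similarity`, and `elements_to_display` are all hypothetical.

```python
def overall_similarity(element_sims, weights):
    """Weighted mean of per-element similarity degrees (assumed formula)."""
    total_weight = sum(weights[name] for name in element_sims)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * sim for name, sim in element_sims.items())
    return weighted / total_weight


def average_similarity(element_sims):
    """Unweighted average of the element similarity degrees (cf. claim 7)."""
    return sum(element_sims.values()) / len(element_sims)


def elements_to_display(element_sims, weights, first_criterion,
                        second_criterion, weight_threshold):
    """Select elements per the condition in claim 8.

    An element is kept if its similarity lies within
    [first_criterion, second_criterion] or its weight is at least
    weight_threshold.
    """
    return [
        name
        for name, sim in element_sims.items()
        if first_criterion <= sim <= second_criterion
        or weights[name] >= weight_threshold
    ]


# Hypothetical model elements with example similarity degrees and weights.
sims = {"left_arm": 0.9, "right_arm": 0.5, "trunk": 0.8}
weights = {"left_arm": 2.0, "right_arm": 1.0, "trunk": 1.0}

overall = overall_similarity(sims, weights)
shown = elements_to_display(sims, weights, 0.0, 0.6, 2.0)
```

With these example values, the emphasized `left_arm` element pulls the overall similarity above the plain average, and the filter keeps `right_arm` (similarity within the criterion range) and `left_arm` (weight at the threshold) while hiding `trunk`.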

Description

This application is a National Stage Entry of PCT/JP2022/019290 filed on Apr. 28, 2022, the contents of all of which are incorporated herein by reference, in their entirety.

TECHNICAL FIELD

The present invention relates to an information processing apparatus, an information processing method, and a non-transitory computer readable medium.

BACKGROUND ART

For example, an image retrieval apparatus described in Patent Document 1 includes a pose estimation unit, a feature value extraction unit, a query generation unit, and an image retrieval unit. The pose estimation unit described in the document recognizes, from an input image, pose information of a retrieval target made up of a plurality of keypoints. The feature value extraction unit described in the document extracts a feature value from the pose information and an input image. The query generation unit described in the document generates a retrieval query from an image database that accumulates a feature value in relation to an input image, and pose information specified by a user. The image retrieval unit described in the document retrieves, from the image database, an image including a similar pose according to a retrieval query.

For example, an image processing apparatus described in Patent Document 2 includes an image acquisition unit, a skeleton structure detection unit, a query evaluation unit, a selection unit, a feature value computation unit, and a retrieval unit. The image acquisition unit described in the document acquires a candidate of a query image. The skeleton structure detection unit described in the document detects a two-dimensional skeleton structure of a person included in the candidate of the query image. The query evaluation unit described in the document computes an evaluation value of the candidate of the query image, based on a detection result of the two-dimensional skeleton structure. The selection unit described in the document selects, based on the evaluation value, a query image from among candidates of query images. The feature value computation unit described in the document computes a feature value of a two-dimensional skeleton structure detected from the query image. The retrieval unit described in the document retrieves, based on a similarity degree of the computed feature value, an analysis target image including a person with a pose similar to a pose of a person included in the query image, from among the analysis target images.

Note that Patent Document 3 discloses a technique for computing a feature value of each of a plurality of keypoints of a human body included in an image, and retrieving an image including a human body with a similar pose or a human body with a similar motion, based on the computed feature value, or gathering and categorizing those with similar poses or motions.

Non-Patent Document 1 describes a technique related to skeleton estimation of a person.

RELATED DOCUMENT

Patent Document

  • Patent Document 1: Japanese Patent Application Publication No. 2019-091138
  • Patent Document 2: International Patent Publication No. WO2021/250808
  • Patent Document 3: International Patent Publication No. WO2021/084677

Non-Patent Document

  • Non-Patent Document 1: Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 7291 to 7299

DISCLOSURE OF THE INVENTION

Technical Problem

Patent Documents 1 and 2 describe techniques for estimating a pose or action, based on an image. However, in Patent Documents 1 and 2, since it is not known whether a pose has been correctly estimated, it is difficult to improve accuracy of estimating a pose of a capture target person shown in an image.

Note that Patent Document 3 and Non-Patent Document 1 also do not disclose a technique for improving accuracy of detecting a person being in a previously determined pose from an image capturing a person.

In view of the problem described above, one example of an object of the present invention is to provide an information processing apparatus, an information processing method, and a storage medium that solve the problem of improving accuracy of estimating a pose of a capture target person shown in an image.

Solution to Problem

According to one aspect of the present invention, there is provided an information processing apparatus including:

a pose acquisition unit that acquires first pose information indicating a pose of a person shown in each of a plurality of reference images associated with a predetermined pose, and second pose information indicating a pose of a capture target person shown in a query image;

a retrieval unit that retrieves a reference image showing a person whose pose or action is similar to that of the capture target person shown in the query image, from among the plurality of reference images, based on a similarity degree between the first pose information and the second pose information; and

a display control