CN-122023546-A - Information processing apparatus, information processing method, computer program product, and storage medium
Abstract
An information processing apparatus, an information processing method, a computer program product, and a storage medium are provided. An obtaining unit of the information processing apparatus obtains encoded data of a three-dimensional image and metadata corresponding to the encoded data of the three-dimensional image, the metadata including region information related to one partial region in the three-dimensional image, first annotation information and second annotation information associated with the one partial region, first condition information indicating a first condition related to display of the first annotation information and corresponding to a first viewpoint position or a first viewing direction in a three-dimensional space of the three-dimensional image, and second condition information indicating a second condition related to display of the second annotation information and corresponding to a second viewpoint position or a second viewing direction in the three-dimensional space of the three-dimensional image. A generation unit of the information processing apparatus generates a three-dimensional image file storing the encoded data of the three-dimensional image and the metadata.
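The metadata layout summarized above can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the class and field names, the axis-aligned box used to describe the partial region, and the toy length-prefixed JSON container are hypothetical and are not the file format of the disclosure.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class DisplayCondition:
    # Hypothetical fields: the source only states that a condition
    # corresponds to a viewpoint position or a viewing direction.
    viewpoint_position: Optional[Vec3] = None
    viewing_direction: Optional[Vec3] = None


@dataclass
class Annotation:
    text: str
    condition: DisplayCondition


@dataclass
class RegionMetadata:
    # Assumption: the partial region is an axis-aligned box.
    region_min: Vec3
    region_max: Vec3
    annotations: List[Annotation] = field(default_factory=list)


def generate_file(encoded: bytes, meta: RegionMetadata) -> bytes:
    """Toy container: 4-byte big-endian metadata length, JSON metadata,
    then the encoded payload. Not a real 3D image file format."""
    meta_bytes = json.dumps(asdict(meta)).encode("utf-8")
    return len(meta_bytes).to_bytes(4, "big") + meta_bytes + encoded


def read_file(blob: bytes) -> Tuple[dict, bytes]:
    """Inverse of generate_file: return (metadata dict, encoded payload)."""
    n = int.from_bytes(blob[:4], "big")
    return json.loads(blob[4:4 + n]), blob[4 + n:]


if __name__ == "__main__":
    meta = RegionMetadata(
        region_min=(0.0, 0.0, 0.0),
        region_max=(1.0, 1.0, 1.0),
        annotations=[
            Annotation("exterior", DisplayCondition(viewpoint_position=(5.0, 0.0, 0.0))),
            Annotation("interior", DisplayCondition(viewing_direction=(0.0, 0.0, 1.0))),
        ],
    )
    restored, payload = read_file(generate_file(b"\x01\x02", meta))
    print([a["text"] for a in restored["annotations"]], payload)
```

Keeping both annotations and their conditions in one metadata record attached to a single region is the point of the sketch; a reproducing side could then choose between them at display time.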
Inventors
- Shan Benjun
- IMAO EIJI
- FUKATA MASANORI
Assignees
- Canon Inc. (佳能株式会社)
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-11-11
- Priority Date: 2024-11-12
Claims (20)
- 1. An information processing apparatus comprising: an obtaining unit configured to obtain encoded data of a three-dimensional image and metadata corresponding to the encoded data of the three-dimensional image, the metadata including: region information related to one partial region in the three-dimensional image; first annotation information and second annotation information associated with the one partial region; first condition information indicating a first condition related to display of the first annotation information and corresponding to a first viewpoint position or a first viewing direction in a three-dimensional space of the three-dimensional image; and second condition information indicating a second condition related to display of the second annotation information and corresponding to a second viewpoint position or a second viewing direction in the three-dimensional space of the three-dimensional image; and a generation unit configured to generate a three-dimensional image file storing the encoded data of the three-dimensional image and the metadata.
- 2. The information processing apparatus according to claim 1, wherein the first condition includes a condition for specifying a first range of the first viewpoint position in the three-dimensional space.
- 3. The information processing apparatus according to claim 2, wherein the first range is a range set based on user input.
- 4. The information processing apparatus according to claim 1, wherein the first condition includes a condition based on the one partial region and a projection range of the three-dimensional image on a display plane set based on the viewing direction in the three-dimensional space.
- 5. The information processing apparatus according to claim 4, wherein the first condition includes a condition based on a positional relationship between a straight line extending from the first viewpoint position to a reference point of the one partial region and the projection range.
- 6. The information processing apparatus according to claim 5, wherein the first annotation information is displayed on condition that a straight line from the first viewpoint position to the reference point passes through the projection range.
- 7. The information processing apparatus according to claim 2, wherein the first condition further includes a condition for specifying a second range of the first viewpoint position in the three-dimensional space.
- 8. The information processing apparatus according to claim 7, wherein the second range is a range in the three-dimensional space in which a distance from a reference point of the one partial region to the first viewpoint position satisfies a predetermined condition.
- 9. The information processing apparatus according to claim 7, wherein the second range is a range in which a distance between two points in the three-dimensional space corresponding to a reference point of the one partial region and a projection of the first viewpoint position on the xy plane satisfies a predetermined condition.
- 10. The information processing apparatus according to any one of claims 1 to 9, wherein the metadata obtained by the obtaining unit further includes priority information for indicating which of the first annotation information and the second annotation information is to be displayed if the first condition and the second condition are satisfied.
- 11. The information processing apparatus according to claim 1, wherein the metadata obtained by the obtaining unit further includes information indicating default annotation information to be displayed in the three-dimensional image in the case where the first condition and the second condition are not satisfied.
- 12. The information processing apparatus according to claim 1, wherein the metadata obtained by the obtaining unit further includes information indicating a reference point of the one partial region.
- 13. The information processing apparatus according to claim 1, wherein the metadata includes first region information related to a first partial region that is the one partial region, second region information related to a second partial region included in the first partial region, the first annotation information associated with the first partial region, and the second annotation information associated with the second partial region.
- 14. The information processing apparatus according to claim 13, wherein the metadata further includes condition information that indicates the first condition and the second condition and that corresponds to a viewpoint position or a viewing direction in the three-dimensional space of the three-dimensional image when the three-dimensional image file is reproduced.
- 15. The information processing apparatus according to claim 14, wherein the condition information includes a condition that, in a case where the viewpoint position is included in the first partial region and is not included in the second partial region when the three-dimensional image file is reproduced, the second annotation information is displayed if the second condition is satisfied, and the second annotation information is not displayed if the second condition is not satisfied.
- 16. The information processing apparatus according to claim 14, wherein the condition information includes a condition that the first annotation information is displayed in a case where a straight line from the viewpoint position to a reference point of the first partial region passes through a projection range of the three-dimensional image on a display plane set based on the viewing direction in the three-dimensional space, and the second annotation information is displayed in a case where a straight line from the viewpoint position to a reference point of the second partial region passes through the projection range.
- 17. The information processing apparatus according to claim 13, wherein one or both of the first annotation information and the second annotation information are annotation information that is always displayed when the three-dimensional image file is reproduced.
- 18. The information processing apparatus according to claim 13, wherein the metadata further includes a third partial region different from the second partial region and associated with the first partial region, and third annotation information associated with the third partial region.
- 19. An information processing apparatus comprising: an obtaining unit configured to obtain a three-dimensional image file storing encoded data of a three-dimensional image and metadata corresponding to the encoded data of the three-dimensional image, the metadata including: region information related to one partial region in the three-dimensional image; first annotation information and second annotation information associated with the one partial region; first condition information indicating a first condition related to display of the first annotation information and corresponding to a first viewpoint position or a first viewing direction in a three-dimensional space of the three-dimensional image; and second condition information indicating a second condition related to display of the second annotation information and corresponding to a second viewpoint position or a second viewing direction in the three-dimensional space of the three-dimensional image; and a reproduction unit configured to reproduce the three-dimensional image file.
- 20. An information processing method comprising: obtaining encoded data of a three-dimensional image and metadata corresponding to the encoded data of the three-dimensional image, the metadata including: region information related to one partial region in the three-dimensional image; first annotation information and second annotation information associated with the one partial region; first condition information indicating a first condition related to display of the first annotation information and corresponding to a first viewpoint position or a first viewing direction in a three-dimensional space of the three-dimensional image; and second condition information indicating a second condition related to display of the second annotation information and corresponding to a second viewpoint position or a second viewing direction in the three-dimensional space of the three-dimensional image; and generating a three-dimensional image file storing the encoded data of the three-dimensional image and the metadata.
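The display conditions recited in claims 5, 6, 8, 10 and 11 can be illustrated with a small Python sketch. It is only one possible reading of those claims, under stated assumptions: the display plane is modeled as a rectangle perpendicular to the viewing direction at distance plane_dist from the viewpoint, the distance condition is a simple minimum/maximum range, and the priority is an integer where the larger value wins. All function, class, and parameter names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a: Vec3) -> Vec3:
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def line_hits_projection_range(viewpoint: Vec3, direction: Vec3, ref_point: Vec3,
                               up: Vec3 = (0.0, 0.0, 1.0), plane_dist: float = 1.0,
                               half_w: float = 1.0, half_h: float = 1.0) -> bool:
    """True if the ray from the viewpoint toward ref_point crosses the display
    rectangle. Assumes direction is non-zero and not parallel to up."""
    fwd = _unit(direction)
    right = _unit(_cross(fwd, up))
    true_up = _cross(right, fwd)
    to_ref = _sub(ref_point, viewpoint)
    depth = _dot(to_ref, fwd)
    if depth <= 0.0:  # reference point is beside or behind the viewer
        return False
    scale = plane_dist / depth  # perspective projection onto the display plane
    return (abs(_dot(to_ref, right) * scale) <= half_w
            and abs(_dot(to_ref, true_up) * scale) <= half_h)


@dataclass
class ConditionalAnnotation:
    text: str
    min_distance: float = 0.0       # assumed form of the distance condition
    max_distance: float = math.inf
    priority: int = 0               # assumed: larger value wins


def select_annotation(candidates: List[ConditionalAnnotation], default: str,
                      viewpoint: Vec3, direction: Vec3, ref_point: Vec3) -> str:
    """Return the text to display for one region, or the default text."""
    if not line_hits_projection_range(viewpoint, direction, ref_point):
        return default
    dist = math.dist(viewpoint, ref_point)
    satisfied = [c for c in candidates
                 if c.min_distance <= dist <= c.max_distance]
    if not satisfied:
        return default
    return max(satisfied, key=lambda c: c.priority).text


if __name__ == "__main__":
    anns = [ConditionalAnnotation("far label", min_distance=5.0, priority=1),
            ConditionalAnnotation("near label", max_distance=5.0, priority=2)]
    origin = (0.0, 0.0, 0.0)
    print(select_annotation(anns, "default", (10.0, 0.0, 0.0), (-1.0, 0.0, 0.0), origin))
    print(select_annotation(anns, "default", (3.0, 0.0, 0.0), (-1.0, 0.0, 0.0), origin))
```

In this reading, the priority value only matters when more than one condition holds at once, and the default text covers every case in which no condition holds, including the case where the reference point falls outside the projection range.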
Description
Information processing apparatus, information processing method, computer program product, and storage medium
Technical Field
The present disclosure relates to an information processing apparatus, an information processing method, and a program product.
Background
In recent years, techniques have become known for generating three-dimensional data, such as free-viewpoint video and point cloud data, from data measured using images from a plurality of image pickup apparatuses, a Light Detection and Ranging (LiDAR) sensor, or the like. Volumetric media data and similar three-dimensional data are typically compressed and encoded to reduce the data size. The Moving Picture Experts Group (MPEG) is standardizing the format of volumetric media such as three-dimensional images and three-dimensional video. Methods for encoding three-dimensional data include, for example, geometry-based point cloud compression (G-PCC) for compressing point cloud data and visual volumetric video-based coding (V3C) for compressing volumetric media. Three-dimensional data compressed and encoded using G-PCC, V3C, or the like may be stored in a file whose format is derived from the ISO base media file format (ISOBMFF) of ISO/IEC 14496-12 or the like.
In recent years, annotations related to objects in image content have been generated by analyzing the image content. An annotation is information that indicates the result of object recognition as a human-readable or computer-readable string, or as a parameter for identifying and classifying an object. Annotations can be created by a person viewing the image, but in most cases they are generated by AI image recognition processing. In that case, an object recognized by the AI processing is indicated as a partial region in the image, and the annotation is provided to, or associated with, that partial region.
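For orientation, the ISOBMFF container mentioned in the background above is organized as a sequence of boxes, each beginning with a 32-bit big-endian size and a four-character type; a size of 1 signals that a 64-bit size follows the type, and a size of 0 means the box runs to the end of the file. The Python sketch below walks the top-level boxes of a byte string. The helper names iter_boxes and make_box are hypothetical, and the sample bytes are made up for illustration; they do not represent an actual G-PCC or V3C file, and this section does not state in which box the annotation metadata would be stored.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, offset: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, payload_start, payload_end) for each box found
    between offset and end, without descending into nested boxes."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the four-character type.
            if offset + 16 > end:
                raise ValueError("truncated 64-bit box header")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing data.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        yield box_type.decode("ascii", "replace"), offset + header, offset + size
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (illustrative helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    sample = make_box(b"ftyp", b"isom\x00\x00\x00\x00") + make_box(b"mdat", b"\x01\x02\x03")
    for name, start, stop in iter_boxes(sample):
        print(name, stop - start)
```

A reader built this way can skip boxes it does not understand, which is why formats derived from ISOBMFF can carry additional metadata alongside the encoded media.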
In such image recognition processing, a three-dimensional object may also be identified in three-dimensional volumetric data, and information indicating a three-dimensional partial region showing the identified object and information indicating an annotation to be provided to the partial region may be generated. In the technique described in Japanese Patent Laid-Open No. 2006-211531, annotation information is provided for any three-dimensional region in three-dimensional data. Further, in the technique described in Japanese Patent Application Laid-Open No. 2013-232730, annotation information (additional information) is provided for any three-dimensional region.
Consider, for example, displaying on a 2D display device a three-dimensional object enclosed by a three-dimensional region to which a plurality of annotations are provided. Whether the annotations need to be displayed, and whether they need to be displayed selectively, must be determined according to the distance between the viewpoint position and the object position in the three-dimensional space and according to how the three-dimensional object appears in the current viewport (display area). However, Japanese Patent Laid-Open No. 2006-211531 does not mention a method of selectively switching the annotations to be displayed when a plurality of annotations are provided for a three-dimensional region. In other words, with Japanese Patent Laid-Open No. 2006-211531, it is impossible to store selectable annotations. Further, Japanese Patent Application Laid-Open No. 2013-232730 does not mention a method of providing, for a three-dimensional region in a three-dimensional space, annotation information indicating the interior of that region as information that can be selectively switched.
For example, for a three-dimensional object represented by a three-dimensional region, annotation information for the case of viewing the region from the wider surrounding space and annotation information for the case of viewing the region from inside it cannot be individually defined as information that can be switched and selected. In other words, even annotation information that indicates the interior is associated with the three-dimensional media data in the same manner as annotation information for viewing from the wider space.
Disclosure of Invention
According to an embodiment of the present disclosure, there is provided an information processing apparatus that provides a three-dimensional image file enabling selection of annotation information to be displayed according to a viewpoint.
According to one embodiment of the present disclosure, an information processing apparatus includes an obtaining unit configured to obtain encoded data of a three-dimensional image and metadata corresponding to the encoded data of the three-dimensional image, the metadata including region informatio