CN-121986497-A - Method and apparatus for processing region-dependent annotations in an image file

Abstract

The invention relates to a method of encapsulating an image in an ISOBMFF-based media file, wherein the method comprises generating a plurality of region items, each region item describing a geometry of a region of the image, generating a data structure representing a region of the image and listing region items from the plurality of region items contained within the region of the image, and generating the media file comprising the image, the plurality of region items, and the data structure.

Inventors

  • Hev Lulun
  • Frederick Metz
  • Frank Denoir

Assignees

  • Canon Inc. (佳能株式会社)

Dates

Publication Date
2026-05-05
Application Date
2024-10-04
Priority Date
2023-10-09

Claims (20)

  1. A method of encapsulating an image in an ISOBMFF-based media file, wherein the method comprises: generating a plurality of region items, each region item describing a geometry of a region of the image; generating a data structure representing a region of the image and listing the region items, from the plurality of region items, that are contained within the region of the image; and generating the media file, the generated media file comprising the image, the plurality of region items, and the data structure.
  2. The method of claim 1, wherein the data structure includes a reference space and a description of the region of the image represented by the data structure relative to the reference space.
  3. The method of claim 2, wherein the data structure further comprises an indication of whether the data structure includes the reference space and the description of the region of the image represented by the data structure relative to the reference space.
  4. The method of claim 1, wherein the region of the image represented by the data structure is a region of the image associated with the data structure.
  5. The method of claim 4, wherein the data structure includes an indication of whether the region of the image represented by the data structure is a region of the image associated with the data structure.
  6. The method of any of claims 1 to 5, wherein the image is a grid of a plurality of input images and the data structure is associated with an input image of the plurality of input images.
  7. The method of any of claims 1 to 5, wherein the data structure is associated with the image.
  8. The method of any of claims 1 to 7, wherein the plurality of region items are associated with the image.
  9. The method of any of claims 1 to 8, wherein the region of the image represented by the data structure is a rectangular partition of the image.
  10. The method of any of claims 1 to 9, wherein the data structure is an extension of EntityToGroupBox.
  11. The method of any of claims 1 to 10, wherein the data structure is associated with the image using an item reference.
  12. A method of processing image-related data from an ISOBMFF-based media file, wherein the method comprises: obtaining a region of an image; obtaining a data structure from the media file, the data structure representing a region of the image and listing the region items, from a plurality of region items in the media file, that are contained within the region of the image, each region item of the plurality of region items describing a geometry of a region of the image; obtaining at least one of the plurality of region items based on the listed region items; and processing the obtained at least one region item.
  13. The method of claim 12, wherein the data structure is obtained based on a rendering scale of a region of the image to be processed.
  14. The method of claim 12 or 13, wherein the data structure comprises a reference space and a description of the region of the image represented by the data structure relative to the reference space.
  15. The method of claim 14, wherein the data structure further comprises an indication of whether the data structure includes the reference space and the description of the region of the image represented by the data structure relative to the reference space.
  16. The method of claim 12 or 13, wherein the region of the image represented by the data structure is a region of the image associated with the data structure.
  17. The method of claim 16, wherein the data structure includes an indication of whether the region of the image represented by the data structure is a region of the image associated with the data structure.
  18. The method of any of claims 12 to 17, wherein the region of the image represented by the data structure is a rectangular partition of the image.
  19. The method of any of claims 12 to 18, wherein the data structure is an extension of EntityToGroupBox.
  20. The method of any of claims 12 to 19, wherein the data structure is associated with the image using an item reference.
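
The reader-side method of claim 12 can be illustrated with a minimal Python sketch. It is not the file-format syntax of the invention: the class names `RegionItem` and `RegionGroup`, their fields, and the rectangle-overlap test are hypothetical stand-ins chosen for illustration. The sketch assumes each data structure carries a rectangular area and a list of region item IDs, and shows how a reader might fetch only the region items listed for the part of the image being displayed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (x, y, width, height), in units of the reference space (assumed convention).
Rect = Tuple[int, int, int, int]


@dataclass
class RegionItem:
    """Hypothetical stand-in for a region item: an ID plus a bounding rectangle."""
    item_id: int
    geometry: Rect


@dataclass
class RegionGroup:
    """Hypothetical stand-in for the claimed data structure.

    It represents one rectangular region of the image, described relative to
    a reference space, and lists the IDs of the region items in that region.
    """
    group_id: int
    reference_space: Tuple[int, int]  # (width, height)
    area: Rect
    region_item_ids: List[int]


def intersects(a: Rect, b: Rect) -> bool:
    """Return True when two rectangles share a non-empty area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def region_items_for_viewport(
    groups: List[RegionGroup],
    items: Dict[int, RegionItem],
    viewport: Rect,
) -> List[RegionItem]:
    """Collect the region items listed by the groups that overlap the viewport."""
    selected: List[RegionItem] = []
    seen = set()
    for group in groups:
        if not intersects(group.area, viewport):
            continue  # region items of this group need not be parsed
        for item_id in group.region_item_ids:
            if item_id not in seen:
                seen.add(item_id)
                selected.append(items[item_id])
    return selected
```

With two groups covering the left and right halves of a 1000x500 reference space, a viewport lying entirely in the right half returns only the region items listed by the right-hand group, so the items of the other group are never touched.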

Description

Method and apparatus for processing region-dependent annotations in an image file

Technical Field

The present disclosure relates to methods and apparatus for processing region-related information in an image file.

Background

Modern cameras and image analysis services enable the generation of localized metadata for images. Localized metadata is metadata related to a region, that is, a portion of the media content, rather than to the entire media content. The media content is typically an image, but may also be video content or a collection of images. For example, a camera may generate a focus region for a photograph or detect a face when taking a picture. As another example, a deep learning system may identify objects within the image. Such localized metadata may be considered region annotations.

An image captured by a camera or processed by an image analysis service is stored on a storage device such as a memory card. The image is typically encoded to reduce the size of the data on the storage device. Many coding standards may be used, such as JPEG, AV1, or the more recent HEVC standard.

The HEVC standard defines a profile for the encoding of still images and describes specific tools for compressing single still images or bursts of still images. An extension of the ISO Base Media File Format (ISOBMFF) for such image data has been proposed for inclusion in the ISO/IEC 23008 standard, in Part 12, under the name "HEIF" or "High Efficiency Image File Format".

HEIF (High Efficiency Image File Format) is a standard developed by the Moving Picture Experts Group (MPEG) for storing and sharing images and image sequences. MIAF (Multi-Image Application Format) is a standard developed by MPEG as Part 22 of the ISO/IEC 23000 standard.
The MIAF specification specifies a multimedia application format, the Multi-Image Application Format (MIAF), which defines precise interoperability points for creating, reading, parsing and decoding images embedded in the High Efficiency Image File (HEIF) format. The MIAF specification is fully compliant with the HEIF format and only defines additional constraints to ensure higher interoperability.

The HEIF and MIAF file formats provide a mechanism, based on region items, for linking annotations to regions of an image. However, for high-resolution images, the number of region items may become very large. When only a portion of a high-resolution image is displayed, the HEIF and MIAF file formats provide no means of determining which region items should be parsed and processed.

Disclosure of Invention

The present invention has been devised to address one or more of the problems described above.

According to a first aspect of the present invention, there is provided a method for encapsulating an image in an ISOBMFF-based media file, wherein the method comprises:

  • obtaining an image;
  • obtaining a plurality of region annotations associated with the image;
  • obtaining region partition information of the image;
  • generating a plurality of data structures representing region partitions of the image;
  • associating the region annotations with the data structures based on the region partition information;
  • embedding the image, the region annotations and the plurality of data structures in the media file.

In an embodiment, the region partition information includes information about the area covered by the region partition and rendering scale information.

In an embodiment, the data structure is a cell representing a spatial partition of the image.
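
The partition and association steps of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the encapsulation defined by the invention: the `Cell` class, the `make_grid` and `associate_annotations` helpers, the uniform grid layout, and the rule that an annotation is attached to every cell it overlaps are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (x, y, width, height) in image pixels (assumed convention).
Box = Tuple[int, int, int, int]


@dataclass
class Cell:
    """Hypothetical cell: the area it covers, a rendering scale, and the
    IDs of the region annotations associated with it."""
    area: Box
    rendering_scale: float
    annotation_ids: List[int] = field(default_factory=list)


def make_grid(width: int, height: int, cols: int, rows: int,
              rendering_scale: float = 1.0) -> List[Cell]:
    """Partition the image into rows x cols rectangular cells, row by row."""
    cells = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cells.append(Cell((x0, y0, x1 - x0, y1 - y0), rendering_scale))
    return cells


def overlaps(a: Box, b: Box) -> bool:
    """Return True when two rectangles share a non-empty area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def associate_annotations(cells: List[Cell], annotations: Dict[int, Box]) -> None:
    """Attach each annotation ID to every cell its bounding box overlaps.

    Listing a straddling annotation in several cells is an assumption of
    this sketch, not a rule taken from the text.
    """
    for annotation_id, box in annotations.items():
        for cell in cells:
            if overlaps(cell.area, box):
                cell.annotation_ids.append(annotation_id)
```

For a 100x100 image split into a 2x2 grid, an annotation at (10, 10, 5, 5) is attached only to the top-left cell, while one at (45, 45, 10, 10) straddles the centre and is attached to all four cells.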
In an embodiment, cells are organized in a tree representing a hierarchical spatial partitioning of the image, the cells at the top of the tree being associated with the image, and each cell being associated with its sub-cells (if present) in the tree.

In an embodiment, each layer in the tree represents a version of the image at a given rendering scale.

In an embodiment, the association of a region annotation with a cell is based on the location and size of the region and/or the associated rendering scale range and/or priority.

In an embodiment, the tree is a quadtree.

In an embodiment, each layer of the tree is associated with an overview image of the image at a given resolution.

In an embodiment, each overview image other than the one associated with the top of the tree is divided into tiles, the cells of the tree being associated with the tiles of the overview image.

In an embodiment, the region annotations are grouped into region items, and the data structure represents a group of region items.

In an embodiment, the data structure is an EntityToGroupBox (entity-to-group box).

According to another aspect of the present invention, there is provided a method for rendering a spatial portion of an image from an ISOBMFF-based media file, wherein the method comprises:

  • obtaining an image item describing an image from the media file and a