CN-121986483-A - Carriage of dynamic system stream unit header length precision information for volumetric video


Abstract

Systems and methods for point cloud encoding and decoding. In one example encoding method, at least a first track comprising a plurality of samples containing point cloud information is encoded in a container file, such as an ISOBMFF file. A first sample group description box associated with the first track is encoded, wherein the first sample group description box includes at least a first sample group description entry and a second sample group description entry. The first sample group description entry stores first unit header length precision information, and the second sample group description entry stores second unit header length precision information different from the first unit header length precision information. A first sample-to-group box is also encoded; it associates a first plurality of samples with the first sample group description entry and a second plurality of samples with the second sample group description entry.
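As a rough illustration of the mechanism summarized above, the following Python sketch models a sample group description box as a list of entries and a sample-to-group box as a list of runs. The names `PrecisionEntry`, `build_sample_groups`, and `precision_for_sample`, and the choice to express the precision as a byte count, are assumptions made for illustration only; they are not taken from the application or from any file format library. The 1-based group description index, with 0 meaning that a sample belongs to no group, follows general ISOBMFF sample grouping practice.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass(frozen=True)
class PrecisionEntry:
    """Hypothetical sample group description entry: bytes per unit length field."""
    unit_size_precision_bytes: int


def build_sample_groups(
    per_sample_precision: List[int],
) -> Tuple[List[PrecisionEntry], List[Tuple[int, int]]]:
    """Encoder side: derive description entries and (sample_count, index) runs."""
    entries: List[PrecisionEntry] = []
    runs: List[Tuple[int, int]] = []
    # groupby collapses consecutive samples that share the same precision.
    for precision, group in groupby(per_sample_precision):
        entry = PrecisionEntry(precision)
        if entry not in entries:
            entries.append(entry)
        # Indices are 1-based; 0 is reserved for "not in any group".
        runs.append((len(list(group)), entries.index(entry) + 1))
    return entries, runs


def precision_for_sample(
    sample_number: int,
    entries: List[PrecisionEntry],
    runs: List[Tuple[int, int]],
) -> int:
    """Decoder side: resolve the precision that applies to a 1-based sample number."""
    remaining = sample_number
    for sample_count, group_description_index in runs:
        if remaining <= sample_count:
            if group_description_index == 0:
                raise LookupError("sample is not mapped to any entry")
            return entries[group_description_index - 1].unit_size_precision_bytes
        remaining -= sample_count
    raise LookupError("sample number beyond the mapped runs")


# Six samples whose precision changes twice within the same track.
entries, runs = build_sample_groups([2, 2, 2, 4, 4, 2])
print(entries)  # two entries: precision 2 and precision 4
print(runs)     # [(3, 1), (2, 2), (1, 1)]
print(precision_for_sample(4, entries, runs))  # 4
```

Storing each distinct precision once and mapping runs of samples to it is what lets the precision vary over time within a single track without repeating the value for every sample.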

Inventors

Srinivas Gudumasu
Ahmad Hamza
C. Gaide

Assignees

InterDigital CE Patent Holdings, SAS

Dates

Publication Date: 20260505
Application Date: 20240930
Priority Date: 20231006

Claims (15)

1. A method of encoding, comprising: encoding at least a first track in a container file, the first track comprising a plurality of samples comprising point cloud information; encoding a first sample group description box associated with the first track in the container file, the first sample group description box including at least a first sample group description entry and a second sample group description entry, wherein the first sample group description entry stores first unit header length precision information and the second sample group description entry stores second unit header length precision information different from the first unit header length precision information; and encoding a first sample-to-group box in the container file, the first sample-to-group box associating a first plurality of samples with the first sample group description entry and associating a second plurality of samples with the second sample group description entry.
2. An encoding apparatus comprising one or more processors configured to perform at least the following: encoding at least a first track in a container file, the first track comprising a plurality of samples comprising point cloud information; encoding a first sample group description box associated with the first track in the container file, the first sample group description box including at least a first sample group description entry and a second sample group description entry, wherein the first sample group description entry stores first unit header length precision information and the second sample group description entry stores second unit header length precision information different from the first unit header length precision information; and encoding a first sample-to-group box in the container file, the first sample-to-group box associating a first plurality of samples with the first sample group description entry and associating a second plurality of samples with the second sample group description entry.
3. The method of claim 1 or the apparatus of claim 2, wherein the first track is an atlas track, a V3C bitstream track, or an atlas tile track.
4. A method according to claim 1 or claim 3 when dependent on claim 1, or an apparatus according to claim 2 or claim 3 when dependent on claim 2, wherein the unit header length precision information is system stream Network Abstraction Layer (NAL) unit header length precision information.
5. A method according to claim 1 or claim 3 when dependent on claim 1, or an apparatus according to claim 2 or claim 3 when dependent on claim 2, wherein the unit header length precision information is system stream V3C unit header length precision information.
6. The method of claim 1 or claims 3-5 when dependent on claim 1, or the apparatus of claim 2 or claims 3-5 when dependent on claim 2, further comprising: encoding at least a second track in the container file, the second track comprising a plurality of samples comprising point cloud information; encoding a second sample group description box associated with the second track in the container file, the second sample group description box including at least a third sample group description entry and a fourth sample group description entry, wherein the third sample group description entry stores third unit header length precision information and the fourth sample group description entry stores fourth unit header length precision information different from the third unit header length precision information; and encoding a second sample-to-group box in the container file, the second sample-to-group box associating a third plurality of samples with the third sample group description entry and a fourth plurality of samples with the fourth sample group description entry.
7. The method of claim 1 or claims 3-6 when dependent on claim 1, or the apparatus of claim 2 or claims 3-6 when dependent on claim 2, wherein the container file is an ISOBMFF file.
8. A decoding method, comprising: decoding at least a first track from a container file, the first track comprising a plurality of samples comprising point cloud information; decoding a first sample group description box associated with the first track from the container file, the first sample group description box including at least a first sample group description entry and a second sample group description entry, wherein the first sample group description entry provides first unit header length precision information and the second sample group description entry provides second unit header length precision information different from the first unit header length precision information; and decoding a first sample-to-group box from the container file, the first sample-to-group box associating a first plurality of samples with the first sample group description entry and a second plurality of samples with the second sample group description entry.
9. A decoding apparatus comprising one or more processors configured to perform at least the following: decoding at least a first track from a container file, the first track comprising a plurality of samples comprising point cloud information; decoding a first sample group description box associated with the first track from the container file, the first sample group description box including at least a first sample group description entry and a second sample group description entry, wherein the first sample group description entry provides first unit header length precision information and the second sample group description entry provides second unit header length precision information different from the first unit header length precision information; and decoding a first sample-to-group box from the container file, the first sample-to-group box associating a first plurality of samples with the first sample group description entry and a second plurality of samples with the second sample group description entry.
10. The method of claim 8 or the apparatus of claim 9, further comprising processing the point cloud information based on the unit header length precision information.
11. A method according to claim 8 or claim 10 when dependent on claim 8, or an apparatus according to claim 9 or claim 10 when dependent on claim 9, wherein the first track is an atlas track, a V3C bitstream track or an atlas tile track.
12. The method of claim 8 or of claims 10-11 when dependent on claim 8, or the apparatus of claim 9 or of claims 10-11 when dependent on claim 9, wherein the unit header length precision information is system stream Network Abstraction Layer (NAL) unit header length precision information.
13. The method of claim 8 or of claims 10-11 when dependent on claim 8, or the apparatus of claim 9 or of claims 10-11 when dependent on claim 9, wherein the unit header length precision information is system stream V3C unit header length precision information.
14. The method of claim 8 or claims 10-13 when dependent on claim 8, or the apparatus of claim 9 or claims 10-13 when dependent on claim 9, further comprising: decoding at least a second track from the container file, the second track comprising a plurality of samples comprising point cloud information; decoding a second sample group description box associated with the second track from the container file, the second sample group description box including at least a third sample group description entry and a fourth sample group description entry, wherein the third sample group description entry stores third unit header length precision information and the fourth sample group description entry stores fourth unit header length precision information different from the third unit header length precision information; and decoding a second sample-to-group box from the container file, the second sample-to-group box associating a third plurality of samples with the third sample group description entry and a fourth plurality of samples with the fourth sample group description entry.
15. The method of claim 8 or of claims 10-14 when dependent on claim 8, or the apparatus of claim 9 or of claims 10-14 when dependent on claim 9, wherein the container file is an ISOBMFF file.

Description

Carriage of dynamic system stream unit header length precision information for volumetric video

Cross reference

The present application claims priority from European patent application No. 23306718.0, filed on 6 October 2023 and entitled "Carriage of dynamic system stream unit header length precision information for volumetric video", the entire contents of which are incorporated herein by reference.

Background

High-quality three-dimensional point clouds have recently emerged as an advanced representation of immersive media. A point cloud consists of a set of points in three-dimensional space, where each point is represented by coordinates indicating its location and by one or more associated attributes, such as color, transparency, acquisition time, laser reflectivity, or material properties. Point clouds may be captured in several ways. For example, one technique uses multiple cameras and depth sensors. Light detection and ranging (LiDAR) laser scanners are also commonly used. The number of points required to realistically reconstruct objects and scenes can be on the order of millions (or even billions). Efficient representation and compression are therefore essential for storing and transmitting point cloud data.

In uncompressed form, volumetric video is represented by a massive amount of data. The visual volumetric video-based coding (V3C) specification of ISO/IEC 23090 reduces the amount of data required to store and transmit volumetric video by leveraging the compression efficiency of existing two-dimensional video codecs. A V3C encoder converts each volumetric frame into a two-dimensional image sequence and a set of related metadata, referred to as atlas data, that enables reconstruction of the volumetric frame.

The generated two-dimensional image sequences are then encoded using a conventional video or image encoder, such as one conforming to the ISO/IEC 14496-10 (H.264/AVC), ISO/IEC 23008-2 (HEVC), or ISO/IEC 23090-3 (VVC) international standard. The atlas data is encoded according to the mechanisms specified in the ISO/IEC 23090-5 standard. V3C is a generic volumetric video coding mechanism that can be used by different applications for compressing volumetric content, such as point clouds, immersive video, and mesh representations of visual volumetric frames. Examples of such applications include video-based point cloud compression (V-PCC) and MPEG Immersive Video (MIV). The V3C specification employs a High Level Syntax (HLS) design common in conventional two-dimensional video codecs to represent the encoded atlas data, which takes the form of a series of Network Abstraction Layer (NAL) units.

Recent advances in capturing and rendering three-dimensional points have enabled new applications in the fields of telepresence, virtual reality, and large-scale dynamic three-dimensional maps. The three-dimensional graphics subgroup of ISO/IEC JTC1/SC29/WG7 is currently developing two three-dimensional Point Cloud Compression (PCC) standards: a geometry-based compression standard for static point clouds and LiDAR-acquired time-varying sparse point clouds, and a video-based compression standard for dynamic point clouds. These standards aim to support efficient and interoperable storage and transmission of three-dimensional point clouds, including lossy and/or lossless coding of point cloud geometry and attributes. WG7 has released an FDIS version of the ISO/IEC 23090-5 standard for video-based compression of dynamic point clouds. The MPEG Systems working group (ISO/IEC JTC1/SC29/WG3) has developed a standard for the carriage of V3C data (ISO/IEC 23090-10) and has released its FDIS version.
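To make the role of unit header length precision concrete for the NAL-unit representation mentioned above, the following Python sketch splits a sample payload into units, assuming each unit is preceded by a big-endian size field whose width in bytes is given by the precision. The function names `split_units` and `pack_units`, the big-endian layout, and the error handling are illustrative assumptions and are not taken from the application; this is a minimal sketch rather than a conformant parser.

```python
from typing import Iterable, List


def pack_units(units: Iterable[bytes], precision_bytes: int) -> bytes:
    """Prefix each unit with its size, written in `precision_bytes` bytes."""
    return b"".join(
        len(unit).to_bytes(precision_bytes, "big") + unit for unit in units
    )


def split_units(sample: bytes, precision_bytes: int) -> List[bytes]:
    """Split a sample payload into units using the given size-field width."""
    if precision_bytes < 1:
        raise ValueError("precision must be at least one byte")
    units: List[bytes] = []
    offset = 0
    while offset < len(sample):
        size_end = offset + precision_bytes
        if size_end > len(sample):
            raise ValueError("truncated size field")
        size = int.from_bytes(sample[offset:size_end], "big")
        unit_end = size_end + size
        if unit_end > len(sample):
            raise ValueError("unit extends past end of sample")
        units.append(sample[size_end:unit_end])
        offset = unit_end
    return units


# Two hypothetical units written with a 2-byte size field.
payload = pack_units([b"\x01\x02", b"\x03"], precision_bytes=2)
print(split_units(payload, precision_bytes=2))  # [b'\x01\x02', b'\x03']
```

Reading the same payload with the wrong precision yields different unit boundaries, or fails outright, which is why a reader needs to know the precision that applies to each sample.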
Disclosure of Invention

An example point cloud encoding method includes: encoding at least a first track in a container file (e.g., an ISOBMFF file), the first track including a plurality of samples including point cloud information; encoding a first sample group description box associated with the first track in the container file, the first sample group description box including at least a first sample group description entry and a second sample group description entry, wherein the first sample group description entry stores first unit header length precision information and the second sample group description entry stores second unit header length precision information that is different from the first unit header length precision information; and encoding a first sample-to-group box in the container file, the first sample-to-group box associating a first plurality of samples with the first sample group description entry and a second plurality of samples with the second sample group description entry. The first track may be, for example, an atlas track, a V3C bitstream track, or an atlas tile track. In some embodiments, the unit header length precision information is syste