CN-122029813-A - Encoding and decoding point cloud geometry
Abstract
Concepts for encoding and decoding point cloud geometries are disclosed. In an encoding method, a current vertex of a triangle representing a portion of a point cloud geometry is determined, a value representing an average area of the triangle having the current vertex as a vertex is determined, and vertex information indicating a position of the current vertex is encoded in a bitstream based on the value. An encoder and decoder for encoding and decoding the point cloud geometry are also provided.
Inventors
- J. G.Y. Takai
- S. Russell
Assignees
- Koninklijke Philips N.V.
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-10-04
- Priority Date: 2023-10-13
Claims (13)
- 1. A method of encoding a point cloud geometry, comprising: determining a current vertex of a triangle representing a portion of a point cloud geometry; determining a value representing an average area of a triangle having the current vertex as a vertex; and encoding, in a bitstream based on the value, vertex information indicating a position of the current vertex.
- 2. A method of decoding a point cloud geometry, comprising: determining a value representing an average area of a triangle to be reconstructed from at least one triangle vertex representing a portion of a point cloud geometry and a current vertex; decoding, from a bitstream based on the value, vertex information indicating a position of the current vertex; determining the current vertex based on the vertex information; and decoding the portion of the point cloud based on voxelizing the triangle formed by the current vertex and the at least one triangle vertex.
- 3. The method of claim 2, further comprising reconstructing the triangle based on the current vertex and a pair of vertices of the at least one triangle.
- 4. The method of claim 2, wherein the vertex of the at least one triangle is a previously decoded vertex.
- 5. The method of claim 2, wherein determining the value comprises: determining, for the current vertex, a vertex predictor derived from the at least one triangle vertex; and determining the value representing an average area of a predicted triangle formed by the vertex predictor and the at least one triangle vertex.
- 6. The method of claim 2, wherein decoding the vertex information comprises decoding a quantized scalar value.
- 7. The method of claim 6, wherein decoding the quantized scalar value comprises: decoding at least one quantization parameter based on the value; and dequantizing the scalar value based on the at least one quantization parameter.
- 8. The method of claim 1 or 2, wherein the vertex information includes the location of the current vertex.
- 9. The method of claim 1 or 2, wherein the vertex information comprises a residual vector indicating a displacement from a vertex predictor to the current vertex.
- 10. A video encoder comprising circuitry connected to a memory, wherein the circuitry is arranged to: determine a current vertex of a triangle representing a portion of a point cloud geometry; determine a value representing an average area of a triangle having the current vertex as a vertex; and encode, in a bitstream based on the value, vertex information indicating a position of the current vertex.
- 11. A computer program product comprising instructions that, when executed by one or more processors, cause the one or more processors to perform a method comprising: determining a current vertex of a triangle representing a portion of a point cloud geometry; determining a value representing an average area of a triangle having the current vertex as a vertex; and encoding, in a bitstream based on the value, vertex information indicating a position of the current vertex.
- 12. A video decoder comprising circuitry connected to a memory, wherein the circuitry is arranged to: determine a value representing an average area of a triangle to be reconstructed from at least one triangle vertex representing a portion of a point cloud geometry and a current vertex; decode, from a bitstream based on the value, vertex information indicating a position of the current vertex; determine the current vertex based on the vertex information; and decode the portion of the point cloud based on voxelizing the triangle formed by the current vertex and the at least one triangle vertex.
- 13. A computer program product comprising instructions that, when executed by one or more processors, cause the one or more processors to perform a method comprising: determining a value representing an average area of a triangle to be reconstructed from at least one triangle vertex representing a portion of a point cloud geometry and a current vertex; decoding, from a bitstream based on the value, vertex information indicating a position of the current vertex; determining the current vertex based on the vertex information; and decoding the portion of the point cloud based on voxelizing the triangle formed by the current vertex and the at least one triangle vertex.
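As a rough illustration of the average-area idea running through claims 1-7, the Python sketch below computes the mean area of the triangles formed by a current vertex and pairs of previously decoded vertices, and derives a quantization step from it. All function names are hypothetical, and the specific mapping from average area to quantization step is an assumption for illustration, not the patent's normative rule:

```python
import numpy as np

def triangle_area(a, b, c):
    # Area of the triangle (a, b, c) via the cross-product formula.
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

def average_area(vertex, neighbor_pairs):
    # Mean area over the triangles formed by `vertex` and each pair of
    # previously decoded vertices (hypothetical helper).
    areas = [triangle_area(vertex, p, q) for p, q in neighbor_pairs]
    return sum(areas) / len(areas)

def quantization_step(avg_area, base_step=1.0):
    # Illustrative mapping (assumption): larger predicted triangles
    # tolerate a coarser vertex-position quantization.
    return base_step * max(1.0, np.sqrt(avg_area))
```

In this sketch, the encoder and decoder can both derive `average_area` from already-coded vertices (or a vertex predictor, as in claim 5), so the same quantization step is available on both sides without extra signaling.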
Description
Encoding and decoding point cloud geometry
Technical Field
The invention relates to encoding and decoding a point cloud representing an outer surface of a 3D object. In particular, the invention relates to encoding/decoding the geometry of such point clouds.
Background
Conventional visual data describes an object or scene using a series of points, each including a location in two dimensions (x and y) and one or more optional attributes (e.g., color). Volumetric visual data adds another location dimension: it describes an object or scene using a series of points, each including a location in three dimensions (x, y, and z) and one or more optional attributes (e.g., color, reflectivity, timestamp, etc.). Volumetric visual data may provide a more immersive way to experience visual data than conventional visual data. For example, an object or scene described by volumetric visual data may be viewed from any angle, while conventional visual data may generally only be viewed from the point at which it was captured or rendered. Volumetric visual data can be used in many applications, including Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR). Sparse volumetric visual data may be used in the automotive industry to represent 3D maps (cartography) or as input to a driving assistance system; in the latter case, the volumetric visual data is typically input to a driving decision algorithm. In another example, volumetric visual data may be used to store valuable objects in digital form. In applications for preserving cultural heritage, the goal is to preserve a representation of objects that may be threatened by natural disasters. For example, statues, vases, and temples may be fully scanned and stored as volumetric visual data having billions of samples.
Such use cases of volumetric visual data may be particularly relevant for valuable objects in locations where earthquakes, tsunamis, and typhoons are frequent. Volumetric visual data may take the form of a volumetric frame describing an object or scene captured at a particular moment in time, or of a sequence of volumetric frames describing an object or scene captured at a plurality of different moments in time (referred to as a volumetric sequence or volumetric video). One format for storing volumetric visual data is a point cloud. A point cloud comprises a collection of points in three-dimensional (3D) space. Each point in the point cloud may include geometry information indicating the position of the point in 3D space. For example, the geometry information may indicate the position of the point using three Cartesian coordinates (x, y, and z) or spherical coordinates (e.g., when acquired by a rotating sensor). The locations of points in a point cloud may be quantized with a spatial precision, which may be the same or different in each dimension. The quantization process may create a grid in 3D space. One or more points residing within each sub-grid volume may be mapped to the sub-grid center coordinates, referred to as voxels. A voxel (also referred to as a volume pixel) may be considered a 3D extension of a pixel, corresponding to the grid coordinates of a 2D image: just as pixels are the smallest units when dividing a 2D space (or 2D image) into discrete uniform (e.g., equal-sized) regions, voxels are the smallest volume units when dividing a 3D space into discrete uniform regions. The sub-grid center coordinates (which correspond to voxels) may be referred to as a voxelized grid. Points in the point cloud may also include one or more types of attribute information, which may indicate an attribute of the visual appearance of the point.
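The voxelization step described above, snapping each point to the center of the sub-grid volume that contains it and collapsing duplicates, can be sketched as follows (a minimal illustration assuming a uniform voxel size in all three dimensions):

```python
import numpy as np

def voxelize(points, voxel_size):
    # Map each 3D point to the center of the voxel (sub-grid volume)
    # that contains it.
    indices = np.floor(points / voxel_size).astype(np.int64)
    centers = (indices + 0.5) * voxel_size
    # Points falling in the same voxel map to the same center; keep
    # one representative per occupied voxel.
    return np.unique(centers, axis=0)
```

The returned array of unique voxel centers corresponds to the "voxelized grid" positions; a real codec would also carry the associated attributes through this mapping.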
For example, the attribute information may indicate the texture (e.g., color) of the point, the material type of the point, transparency information of the point, reflectivity information of the point, the normal vector to the surface at the point, the velocity at the point, the acceleration at the point, a timestamp indicating when the point was captured, or a modality (e.g., running, walking, or flying) indicating how the point was captured. In another example, points in the point cloud may include light field data in the form of multiple view-dependent texture information; the light field data may be another type of optional attribute information. Points in the point cloud may describe objects or scenes, for example the external surface and/or internal structure of an object or scene. The object or scene may be generated synthetically by a computer or may be generated from the capture of a real-world object or scene. The geometry information of the real-world object or scene may be obtained by 3D scanning and/or photogrammetry. The 3D scan may include a laser scan, a structured light scan, and/or a modulated light scan. The 3D scan may obtain geometry information by movi