JP-2026075921-A - 3D data decoding device and 3D data encoding device
Abstract
[Problem] To simplify the definition of conformance conditions and to enable high-quality encoding and decoding of 3D data by using integer arithmetic instead of floating-point arithmetic. [Solution] A 3D data decoding device for decoding encoded data comprises a mesh prediction unit that derives a predicted value of a base mesh normal vector from the encoded data, and a mesh decoding unit that derives the base mesh normal vector. The mesh prediction unit converts a decoded 3D normal vector into a 2D normal vector by integer-arithmetic conversion processing to derive a 2D predicted normal vector. The mesh decoding unit arithmetically decodes a prediction residual of the 2D normal vector, adds the 2D predicted normal vector and the prediction residual to derive the 2D normal vector, and converts that 2D normal vector into a 3D normal vector by integer-arithmetic conversion processing that includes division accompanied by a bit shift, thereby deriving the base mesh normal vector. [Selection Diagram] Figure 16
Inventors
- 徳毛 靖昭
- 猪飼 知宏
- 洪 秀俊
- 杉本 翔
Assignees
- シャープ株式会社
Dates
- Publication Date
- 20260511
- Application Date
- 20241023
Claims (8)
- [Claim 1] A 3D data decoding device for decoding encoded data, comprising: a mesh prediction unit that derives a predicted value of a base mesh normal vector from the encoded data; and a mesh decoding unit that derives the base mesh normal vector, wherein the mesh prediction unit converts a decoded 3D normal vector into a 2D normal vector by integer-arithmetic conversion processing to derive a 2D predicted normal vector, and the mesh decoding unit arithmetically decodes a prediction residual of the 2D normal vector, adds the 2D predicted normal vector and the prediction residual to derive the 2D normal vector, and converts the 2D normal vector into a 3D normal vector by integer-arithmetic conversion processing that includes division accompanied by a bit shift, thereby deriving the base mesh normal vector.
- [Claim 2] The 3D data decoding device according to claim 1, wherein the mesh decoding unit derives a 3D vector from the 2D normal vector, divides it by the magnitude (norm) of the 3D vector, and then right-shifts the quotient by a shift value (shift) to derive the resulting 3D vector.
- [Claim 3] The 3D data decoding device according to claim 1, wherein the mesh decoding unit derives a 3D vector from the 2D normal vector based on the difference between the components of the 2D normal vector and a predetermined value scale, multiplies the derived 3D vector by scale, divides the product by the magnitude norm of the 3D vector, right-shifts the quotient by a predetermined shift value shift, and adds a predetermined offset center to derive the result.
- [Claim 4] The 3D data decoding device according to claim 1, wherein the bit shift amount is variable according to the value range of the 3D normal vector.
- [Claim 5] The 3D data decoding device according to claim 1 or 2, wherein the mesh prediction unit prohibits the use of a zero vector as the predicted value of the base mesh normal vector.
- [Claim 6] A 3D data encoding device for encoding 3D data, comprising: a mesh prediction unit that derives a predicted value of a base mesh normal vector; and a mesh encoding unit that encodes a prediction residual of the base mesh normal vector, wherein the mesh prediction unit converts an encoded 3D normal vector into a 2D normal vector by integer-arithmetic conversion processing to derive a 2D predicted normal vector, and the mesh encoding unit converts the 3D normal vector to be encoded into a 2D normal vector by integer-arithmetic conversion processing, arithmetically encodes the prediction residual relative to the 2D predicted normal vector, and converts the 2D normal vector into a 3D normal vector by integer-arithmetic conversion processing that includes division accompanied by a bit shift, thereby deriving the encoded 3D normal vector.
- [Claim 7] The 3D data encoding device according to claim 6, wherein the bit shift amount is variable according to the value range of the 3D normal vector.
- [Claim 8] The 3D data encoding device according to claim 6 or 7, wherein the mesh prediction unit prohibits the use of a zero vector as the predicted value of the base mesh normal vector.
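The integer-only conversions recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names scale, shift, and center are taken from claim 3; everything else is an assumption made for illustration: the octahedral-style folding between 2D and 3D (the claims do not name a specific mapping), the sign-and-magnitude rounding, the use of an integer square root for norm, and the function names. The sketch also raises an error for a zero vector, which is one plausible reason (not stated in the claims) for prohibiting zero vectors as predicted values.

```python
from math import isqrt


def sign(a: int) -> int:
    """Return -1 for negative a, otherwise +1 (so sign(0) == 1)."""
    return -1 if a < 0 else 1


def normal_3d_to_2d(n3d, scale, center):
    """Map an offset 3D normal to 2D coordinates using integers only.

    Assumed octahedral-style projection: divide by the L1 norm, then fold
    the lower hemisphere (z < 0) into the corners of the 2D square.
    """
    x, y, z = (c - center for c in n3d)
    s = abs(x) + abs(y) + abs(z)
    if s == 0:
        raise ValueError("a zero vector cannot be projected to 2D")
    u = sign(x) * (abs(x) * scale // s)
    v = sign(y) * (abs(y) * scale // s)
    if z < 0:
        u, v = (scale - abs(v)) * sign(x), (scale - abs(u)) * sign(y)
    return (u + scale, v + scale)


def normal_2d_to_3d(n2d, scale, shift, center):
    """Map 2D coordinates back to an offset 3D normal using integers only.

    Follows the order of operations in claim 3: subtract scale, build the
    3D vector, multiply by scale, divide by norm, right-shift, add center.
    """
    x = n2d[0] - scale
    y = n2d[1] - scale
    z = scale - abs(x) - abs(y)
    if z < 0:  # undo the fold for the lower hemisphere
        x, y = (scale - abs(y)) * sign(x), (scale - abs(x)) * sign(y)
    norm = isqrt(x * x + y * y + z * z)  # integer magnitude, never zero here

    def component(c):
        # Work on the magnitude so rounding is symmetric (an assumption).
        mag = (abs(c) * scale // norm) >> shift
        return sign(c) * mag + center

    return (component(x), component(y), component(z))


def decode_normal(pred_3d, residual_2d, scale, shift, center):
    """Prediction in 2D, residual addition, then conversion back to 3D."""
    pu, pv = normal_3d_to_2d(pred_3d, scale, center)
    n2d = (pu + residual_2d[0], pv + residual_2d[1])
    return normal_2d_to_3d(n2d, scale, shift, center)
```

With the hypothetical parameters scale = 256, shift = 1, and center = 128, a call such as decode_normal((128, 128, 256), (0, 0), 256, 1, 128) reproduces the predicted normal, because a zero residual leaves the 2D prediction unchanged. Because every step is an integer operation, two implementations that follow the same rounding rules produce bit-identical results, which is the property that makes conformance conditions simple to state. The sketch does not clamp or wrap 2D values that leave the valid range after the residual is added.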
Description
Embodiments of the present invention relate to a 3D data encoding device and a 3D data decoding device.
To transmit or record 3D data efficiently, a 3D data encoding device converts the 3D data into 2D images and encodes them with a video encoding scheme to generate encoded data, and a 3D data decoding device decodes the 2D images from the encoded data and reconstructs the 3D data.
Specific 3D data encoding methods include, for example, MPEG-I ISO/IEC 23090-5 V3C (Visual Volumetric Video-based Coding) and V-PCC (Video-based Point Cloud Compression). V3C can encode and decode point clouds composed of point positions and attribute information. V3C is also used to encode and decode multi-view video and mesh video through ISO/IEC 23090-12 (MPEG Immersive Video, MIV) and ISO/IEC 23090-29 (Video-based Dynamic Mesh Coding, V-DMC), which is currently being standardized. The latest draft document for the V-DMC method is disclosed in Non-Patent Document 1.
These 3D data encoding methods encode and decode the geometry and attributes that make up the 3D data as images, using video encoding methods such as H.265/HEVC (High Efficiency Video Coding) and H.266/VVC (Versatile Video Coding). In the case of point clouds, the geometry image holds the depth to the projection plane, and the attribute image holds the attributes projected onto the projection plane.
3D data (a mesh) such as that described in Non-Patent Document 1 consists of a base mesh, mesh displacement, and a texture mapping image. The base mesh can be encoded using a vertex coding scheme such as Draco. Mesh displacement can be encoded as a 2D mesh displacement image using a video codec, or encoded directly using arithmetic coding. The texture mapping image is encoded as an attribute image using a video codec. HEVC or VVC, mentioned above, can be used as the video codec.
Non-Patent Document 1: Study of technologies for Video-based mesh coding, ISO/IEC JTC 1/SC 29/WG 7 N0960, July 2024
The drawings show the following:
- A schematic diagram showing the configuration of the 3D data transmission system according to this embodiment.
- A diagram showing the hierarchical structure of the encoded stream data.
- A functional block diagram showing the schematic configuration of the 3D data decoding device 31.
- A functional block diagram showing the configuration of the base mesh decoding unit 303.
- A functional block diagram showing the configuration of the mesh displacement decoding unit 305.
- A functional block diagram showing the configuration of the mesh reconstruction unit 307.
- An example of syntax for a configuration that transmits coordinate transformation parameters and context initialization parameters at the sequence level (ASPS).
- An example of syntax for a configuration that transmits coordinate transformation parameters and context initialization parameters at the picture/frame level (AFPS).
- A diagram illustrating the operation of the mesh reconstruction unit 307.
- A functional block diagram showing the schematic configuration of the 3D data encoding device 11.
- A functional block diagram showing the configuration of the base mesh coding unit 103.
- A functional block diagram showing the configuration of the mesh displacement coding unit 107.
- A functional block diagram showing the configuration of the mesh separation unit 115.
- A diagram illustrating the operation of the mesh separation unit 115.
- An example of the syntax structure for mesh displacement.
- A functional block diagram showing the configuration of the mesh decoding unit 3031.
- An example of the syntax structure for base mesh vertex positions.
- An example of the syntax structure for base mesh attributes.
- A diagram showing how to derive the context of the syntax elements for the base mesh vertex positions.
- A diagram showing how to derive the context of the syntax elements for the base mesh attributes.
- A functional block diagram showing the configuration of the mesh coding unit 1031.
The embodiments of the present invention are described below with reference to the drawings.
Figure 1 is a schematic diagram showing the configuration of the 3D data transmission system 1 according to this embodiment. The 3D data transmission system 1 transmits an encoded stream obtained by encoding the target 3D data, decodes the transmitted encoded stream, and displays the 3D data. The 3D data transmission system 1 comprises a 3D data encoding device 11, a network 21, a 3D data decoding device 31, and a 3D data display device 41. The 3D data encoding device 11 receives the 3D data T as input.
The network 21 transmits the encoded stream Te generated by the 3D data encoding device 11 to the 3D data decoding device 31. The network 21 is the Internet, a wide area network (WAN), a local area network (LAN), or a combination of these. The network 21 is not necessarily limited to a bidirectional communication network; it may also be a unidirectional