EP-4738824-A1 - THREE-DIMENSIONAL MESH INTER-FRAME PREDICTION ENCODING METHOD AND APPARATUS, THREE-DIMENSIONAL MESH INTER-FRAME PREDICTION DECODING METHOD AND APPARATUS, AND ELECTRONIC DEVICE

Abstract

This application discloses a three-dimensional mesh inter-frame prediction encoding method, decoding method, and apparatus, and an electronic device, and belongs to the field of three-dimensional dynamic mesh encoding and decoding technologies. The encoding method in embodiments of this application includes: performing, by an encoder side, first processing on a to-be-encoded three-dimensional mesh, to obtain a basemesh; performing, by the encoder side, submesh division on the basemesh, to obtain a P submesh; determining, by the encoder side, a target encoding mode of a to-be-encoded mesh vertex in the P submesh and a first target motion vector prediction MVP value, where the first target MVP value is one of N MVP values included in a first candidate list, each MVP value in the first candidate list corresponds to one index, and N is an integer greater than 1; and encoding, by the encoder side, the target encoding mode and an index corresponding to the first target MVP value, to obtain a bitstream including first information, where the first information is used to indicate the target encoding mode and the index corresponding to the first target MVP value.

Inventors

ZOU, Wenjie
ZHANG, Wei
YANG, Fuzheng
LV, Zhuoyi

Assignees

Vivo Mobile Communication Co., Ltd.

Dates

Publication Date: 2026-05-06
Application Date: 2024-06-24

Claims (20)

A three-dimensional mesh inter-frame prediction encoding method, comprising: performing, by an encoder side, first processing on a to-be-encoded three-dimensional mesh, to obtain a basemesh; performing, by the encoder side, submesh division on the basemesh, to obtain a P submesh; determining, by the encoder side, a target encoding mode of a to-be-encoded mesh vertex in the P submesh and a first target motion vector prediction MVP value, wherein the first target MVP value is one of N MVP values comprised in a first candidate list, each MVP value in the first candidate list corresponds to one index, and N is an integer greater than 1; and encoding, by the encoder side, the target encoding mode and an index corresponding to the first target MVP value, to obtain a bitstream comprising first information, wherein the first information is used to indicate the target encoding mode and the index corresponding to the first target MVP value.
The method according to claim 1, wherein before the determining, by the encoder side, a target encoding mode of a to-be-encoded mesh vertex in the P submesh and a first target MVP value, the method further comprises: constructing, by the encoder side, the first candidate list based on a neighboring encoded mesh vertex of the to-be-encoded mesh vertex in the P submesh.
The method according to claim 2, wherein the constructing, by the encoder side, the first candidate list based on a neighboring encoded mesh vertex of the to-be-encoded mesh vertex in the P submesh comprises: obtaining, by the encoder side, L neighboring encoded mesh vertexes and M neighboring encoded mesh vertexes of the to-be-encoded mesh vertex in the P submesh, wherein L and M are integers greater than 1; obtaining, by the encoder side, a motion vector MV value of each of the L neighboring encoded mesh vertexes, and determining L first MVP values of a to-be-encoded mesh vertex based on the MV values of the L encoded mesh vertexes; determining, by the encoder side, a second MVP value based on MV values respectively corresponding to the M encoded mesh vertexes; and constructing, by the encoder side, the first candidate list based on the L first MVP values and the second MVP value, wherein L+1≤N.
The method according to claim 3, wherein the determining, by the encoder side, a second MVP value based on MV values respectively corresponding to the M encoded mesh vertexes comprises: obtaining, by the encoder side, the MV values respectively corresponding to the M encoded mesh vertexes, and determining an average value of the M MV values as the second MVP value; or obtaining, by the encoder side, the MV values respectively corresponding to the M encoded mesh vertexes, performing weighted average calculation on the M MV values, and determining a calculation result as the second MVP value.
The method according to claim 3 or 4, wherein a sorting order of the L first MVP values in the first candidate list is related to first distances, wherein the first distance is a distance between an encoded mesh vertex corresponding to the first MVP value and the to-be-encoded mesh vertex.
The method according to claim 5, wherein the L first MVP values are sorted in ascending order of the first distances.
The method according to any one of claims 3 to 6, wherein when L+1<N, the constructing, by the encoder side, the first candidate list based on the L first MVP values and the second MVP value comprises: constructing, by the encoder side, a sub-candidate list based on the L first MVP values and the second MVP value; and performing, by the encoder side, a zero-padding operation on the sub-candidate list, to obtain the first candidate list comprising the N MVP values.
The method according to any one of claims 2 to 7, wherein after the constructing, by the encoder side, the first candidate list based on a neighboring encoded mesh vertex of the to-be-encoded mesh vertex in the P submesh, the method further comprises: resorting, by the encoder side, the N MVP values in the first candidate list, to obtain a resorted first candidate list; and determining the first target MVP value from the first candidate list comprises: determining, by the encoder side, the first target MVP value from the resorted first candidate list.
The method according to claim 8, wherein the resorting, by the encoder side, the N MVP values in the first candidate list comprises: obtaining, by the encoder side, a sum of errors between a second target MVP value and a motion vector MV of the neighboring encoded mesh vertex of the to-be-encoded mesh vertex, wherein the second target MVP value is one of the N MVP values; and resorting, by the encoder side, the N MVP values based on a sum of errors corresponding to each MVP value in the first candidate list.
The method according to claim 9, wherein the N MVP values in the resorted first candidate list are sorted in ascending order of the corresponding sum of errors.
The method according to any one of claims 1 to 10, wherein the determining the first target MVP value comprises: obtaining, by the encoder side in the first candidate list, a first rate-distortion cost of each MVP value in a first encoding mode and a second rate-distortion cost of each MVP value in a second encoding mode, to obtain N first rate-distortion costs and N second rate-distortion costs; and determining, by the encoder side, the first target MVP value in the N MVP values according to the N first rate-distortion costs and the N second rate-distortion costs.
The method according to claim 11, wherein a first rate-distortion cost or a second rate-distortion cost that corresponds to the first target MVP value is a smallest one of the N first rate-distortion costs and the N second rate-distortion costs.
The method according to claim 11 or 12, wherein when the first rate-distortion cost corresponding to the first target MVP value is the smallest one of the N first rate-distortion costs and the N second rate-distortion costs, the target encoding mode is the first encoding mode; or when the second rate-distortion cost corresponding to the first target MVP value is the smallest one of the N first rate-distortion costs and the N second rate-distortion costs, the target encoding mode is the second encoding mode.
The method according to claim 13, wherein the first encoding mode is a mode in which encoding is directly performed based on the MVP value, and the second encoding mode is a mode in which encoding is performed based on the MVP value and a motion vector difference MVD value; and when the target encoding mode is the second encoding mode, the method further comprises: encoding, by the encoder side, an MVD value of the to-be-encoded mesh vertex in the P submesh.
The method according to any one of claims 11 to 14, wherein a target rate-distortion cost is related to a first length difference and a first angle, the first length difference is a difference between a modulus of a third target MVP value and a modulus of an MV value of the to-be-encoded mesh vertex, and the first angle is an angle between the third target MVP value and the MV value of the to-be-encoded mesh vertex; and the target rate-distortion cost is the first rate-distortion cost or the second rate-distortion cost, and the third target MVP value is one of the N MVP values in the first candidate list.
A three-dimensional mesh inter-frame prediction decoding method, comprising: obtaining, by a decoder side, a bitstream sent by an encoder side, wherein the bitstream comprises a basemesh bitstream; decoding, by the decoder side, a basemesh type of the basemesh bitstream, to obtain a P submesh bitstream comprising first information, wherein the first information is used to indicate a target encoding manner and an index corresponding to a first target MVP value; determining, by the decoder side, a target encoding mode according to the first information, and determining the first target MVP value from a first candidate list according to the index corresponding to the first target MVP value, wherein the first candidate list comprises N MVP values and an index corresponding to each MVP value, the first target MVP value is one of the N MVP values, and N is an integer greater than 1; and decoding, by the decoder side, the P submesh bitstream according to the target encoding mode and the first target MVP value, to obtain an MV value of a to-be-decoded mesh vertex in a P submesh.
The method according to claim 16, wherein before the determining the first target MVP value from a first candidate list according to the index corresponding to the first target MVP value, the method further comprises: constructing, by the decoder side, the first candidate list based on a neighboring decoded mesh vertex of the to-be-decoded mesh vertex in the P submesh.
The method according to claim 17, wherein the constructing, by the decoder side, the first candidate list based on a neighboring decoded mesh vertex of the to-be-decoded mesh vertex in the P submesh comprises: obtaining, by the decoder side, L neighboring decoded mesh vertexes and M neighboring decoded mesh vertexes of the to-be-decoded mesh vertex in the P submesh, wherein L and M are integers greater than 1; obtaining, by the decoder side, an MV value of each of the L neighboring decoded mesh vertexes, and determining L first MVP values of the to-be-decoded mesh vertex based on the MV values of the L neighboring decoded mesh vertexes; determining, by the decoder side, a second MVP value based on MV values respectively corresponding to the M decoded mesh vertexes; and constructing, by the decoder side, the first candidate list based on the L first MVP values and the second MVP value, wherein L+1≤N.
The method according to claim 18, wherein the determining, by the decoder side, a second MVP value based on MV values respectively corresponding to the M decoded mesh vertexes comprises: obtaining, by the decoder side, the MV values respectively corresponding to the M decoded mesh vertexes, and determining an average value of the M MV values as the second MVP value; or obtaining, by the decoder side, the MV values respectively corresponding to the M decoded mesh vertexes, performing weighted average calculation on the M MV values, and determining a calculation result as the second MVP value.
The method according to claim 18 or 19, wherein a sorting order of the L first MVP values in the first candidate list is related to first distances, wherein the first distance is a distance between a decoded mesh vertex corresponding to the first MVP value and the to-be-decoded mesh vertex.
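For illustration only, the candidate list construction recited in claims 3 to 7 (mirrored on the decoder side in claims 18 to 20) and the resorting recited in claims 9 and 10 might be sketched in Python as follows. The function names `build_candidate_list` and `resort_candidates`, the choice of using every supplied neighbor as the M vertexes, the plain (unweighted) average, and the Euclidean error measure are assumptions made for this sketch; the claims do not fix any of them.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
ZERO_MV: Vec3 = (0.0, 0.0, 0.0)


def build_candidate_list(
    target_pos: Vec3,
    neighbors: Sequence[Tuple[Vec3, Vec3]],
    l: int,
    n: int,
) -> List[Vec3]:
    """Build an N-entry MVP candidate list for one vertex.

    neighbors holds (position, mv) pairs of already coded neighboring vertexes.
    """
    if l + 1 > n:
        raise ValueError("the claims require L + 1 <= N")
    if not neighbors:
        raise ValueError("at least one coded neighbor is needed")

    # L first MVP values: MVs of the nearest neighbors, in ascending
    # order of distance to the vertex being coded (claims 5 and 6).
    ranked = sorted(neighbors, key=lambda nb: math.dist(nb[0], target_pos))
    first_mvps = [mv for _, mv in ranked[:l]]

    # Second MVP value: average of the M neighbor MVs (claim 4, first option).
    # Assumption: M is simply the number of neighbors passed in.
    m = len(neighbors)
    second_mvp = tuple(sum(mv[i] for _, mv in neighbors) / m for i in range(3))

    # Zero-padding up to N entries when L + 1 < N (claim 7).
    candidates = first_mvps + [second_mvp]
    candidates += [ZERO_MV] * (n - len(candidates))
    return candidates


def resort_candidates(
    candidates: Sequence[Vec3], neighbor_mvs: Sequence[Vec3]
) -> List[Vec3]:
    """Resort candidates in ascending order of their summed error against
    the MVs of the coded neighbors (claims 9 and 10).

    Assumption: the error is the Euclidean distance between two vectors.
    """
    def error_sum(mvp: Vec3) -> float:
        return sum(math.dist(mvp, mv) for mv in neighbor_mvs)

    # sorted() is stable, so candidates with equal error keep their order.
    return sorted(candidates, key=error_sum)
```

Because both functions depend only on already coded vertexes, an encoder and a decoder running the same routine would arrive at the same list and the same indexes, which is what allows the bitstream to carry only an index.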

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 202310795698.X, filed in China on June 30, 2023, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

This application belongs to the field of three-dimensional dynamic mesh encoding and decoding technologies, and specifically, to a three-dimensional mesh inter-frame prediction encoding method, decoding method, and apparatus, and an electronic device.

BACKGROUND

In recent years, with the rapid development of multimedia technologies, the three-dimensional model has become a new generation of digital media following audio, images, and video, and the three-dimensional mesh is a common representation of a three-dimensional model. Mesh decimation, mesh parameterization, and subdivision and deformation are performed on the three-dimensional mesh to obtain a basemesh. When the basemesh is encoded, it is divided into three types of submesh, namely, an I submesh, a P submesh, and a skip (Skip) submesh. For the P submesh, an inter-frame encoding mode is used, and submesh reference information and a motion vector (Motion Vector, MV) of each vertex need to be encoded.

Currently, motion vector prediction is usually performed in a fixed manner. Such a prediction manner does not suit the MVs of all vertexes, so MV prediction is inaccurate for some vertexes.

SUMMARY

Embodiments of this application provide a three-dimensional mesh inter-frame prediction encoding method, decoding method, and apparatus, and an electronic device, to resolve a problem in the related art that accuracy of three-dimensional mesh inter-frame prediction is not high.
According to a first aspect, a three-dimensional mesh inter-frame prediction encoding method is provided, performed by an encoder side, and including:
performing, by the encoder side, first processing on a to-be-encoded three-dimensional mesh, to obtain a basemesh;
performing, by the encoder side, submesh division on the basemesh, to obtain a P submesh;
determining, by the encoder side, a target encoding mode of a to-be-encoded mesh vertex in the P submesh and a first target motion vector prediction MVP value, where the first target MVP value is one of N MVP values included in a first candidate list, each MVP value in the first candidate list corresponds to one index, and N is an integer greater than 1; and
encoding, by the encoder side, the target encoding mode and an index corresponding to the first target MVP value, to obtain a bitstream including first information, where the first information is used to indicate the target encoding mode and the index corresponding to the first target MVP value.
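As a rough illustration of the determining step in the first aspect, the following Python sketch searches all N candidates under two encoding modes and keeps the pair with the smallest rate-distortion cost, in the manner of claims 11 to 14. Mode 1 stands for encoding directly from the MVP value and mode 2 for encoding from the MVP value plus a motion vector difference (MVD). The function name `choose_mode_and_mvp`, the cost form J = D + λ·R, the squared-error distortion, the bit estimates, and the integer-valued MVs are all hypothetical; the application instead relates the cost to a length difference and an angle (claim 15) without giving a formula in this section.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[int, int, int]


def choose_mode_and_mvp(
    mv: Vec3, candidates: Sequence[Vec3], lam: float = 1.0
) -> Tuple[int, int, Optional[Vec3]]:
    """Return (mode, index, mvd) with the smallest assumed RD cost.

    mode 1: the decoder reuses the MVP value as the MV (no MVD is sent).
    mode 2: the decoder adds a transmitted MVD to the MVP value.
    """
    if not candidates:
        raise ValueError("the candidate list must not be empty")

    best_cost: Optional[float] = None
    best: Tuple[int, int, Optional[Vec3]] = (1, 0, None)
    for index, mvp in enumerate(candidates):
        mvd = tuple(a - b for a, b in zip(mv, mvp))
        # Assumed rate model: smaller indexes are cheaper to signal.
        index_bits = index + 1

        # Mode 1: distortion is the squared prediction error, rate is the index.
        cost_direct = sum(d * d for d in mvd) + lam * index_bits

        # Mode 2: exact reconstruction (zero distortion), rate is index + MVD.
        mvd_bits = sum(2 * abs(d) + 1 for d in mvd)
        cost_mvd = lam * (index_bits + mvd_bits)

        for mode, cost in ((1, cost_direct), (2, cost_mvd)):
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best = (mode, index, mvd if mode == 2 else None)
    return best
```

The returned mode and index correspond to the first information written to the bitstream; the MVD is returned only for mode 2, matching the additional MVD encoding step of claim 14.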
According to a second aspect, a three-dimensional mesh inter-frame prediction decoding method is provided, performed by a decoder side, and including:
obtaining, by the decoder side, a bitstream sent by an encoder side, where the bitstream includes a basemesh bitstream;
decoding, by the decoder side, a basemesh type of the basemesh bitstream, to obtain a P submesh bitstream including first information, where the first information is used to indicate a target encoding manner and an index corresponding to a first target MVP value;
determining, by the decoder side, a target encoding mode according to the first information, and determining the first target MVP value from a first candidate list according to the index corresponding to the first target MVP value, where the first candidate list includes N MVP values and an index corresponding to each MVP value, the first target MVP value is one of the N MVP values, and N is an integer greater than 1; and
decoding, by the decoder side, the P submesh bitstream according to the target encoding mode and the first target MVP value, to obtain an MV value of a to-be-decoded mesh vertex in a P submesh.
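The last two steps of the second aspect can be pictured with a minimal Python sketch, given here under stated assumptions rather than as the decoder of this application. It assumes the mode, the index, and (for the second mode) the MVD have already been parsed from the P submesh bitstream, and that mode 1 and mode 2 carry the meanings given in claim 14. The function name `decode_mv` and the numeric mode values are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[int, int, int]


def decode_mv(
    mode: int,
    index: int,
    candidates: Sequence[Vec3],
    mvd: Optional[Vec3] = None,
) -> Vec3:
    """Recover the MV of one vertex from the signalled mode and MVP index."""
    # Look up the first target MVP value by its index in the candidate list.
    mvp = candidates[index]

    if mode == 1:
        # Direct mode: the MVP value itself is used as the MV.
        return mvp
    if mode == 2:
        # MVP + MVD mode: the parsed difference is added to the prediction.
        if mvd is None:
            raise ValueError("mode 2 requires a decoded MVD")
        return tuple(p + d for p, d in zip(mvp, mvd))
    raise ValueError(f"unknown encoding mode: {mode}")
```

For this to recover the encoder's MV, the decoder must build its candidate list from decoded neighboring vertexes in the same way the encoder built its list from encoded ones, so that a given index selects the same MVP value on both sides.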
According to a third aspect, a three-dimensional mesh inter-frame prediction encoding apparatus is provided, including:
a processing module, configured to perform first processing on a to-be-encoded three-dimensional mesh, to obtain a basemesh;
a division module, configured to perform submesh division on the basemesh, to obtain a P submesh;
a first determining module, configured to determine a target encoding mode of a to-be-encoded mesh vertex in the P submesh and a first target motion vector prediction MVP value, where the first target MVP value is one of N MVP values included in a first candidate list, each MVP value in the first candidate list corresponds to one index, and N is an integer greater than 1; and
an encoding module, configured to encode the target encoding mode and an index corresponding to the first target MVP value, to obtain a bitstream including first information, where the first information is used to indicate the target encoding mode and the index corresponding to the first target MVP value.

According to a fourth aspect, a three-dimensional mesh inter-frame prediction decoding apparatus is provided,