CN-122029817-A - Encoding/decoding method, code stream, encoder, decoder, and storage medium

Abstract

The embodiments of the application disclose an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium, which can improve the coding efficiency of the base mesh in dynamic mesh coding. The decoding method comprises: parsing a code stream to determine first image-level syntax identification information corresponding to the base mesh of a current image (S1801); when the first image-level syntax identification information indicates inter-frame decoding, parsing the code stream to determine dimension syntax identification information corresponding to a vertex in the base mesh in each of a plurality of coordinate dimensions, wherein the dimension syntax identification information corresponding to each coordinate dimension represents the motion vector prediction mode for that coordinate dimension; and determining, based on the dimension syntax identification information corresponding to each of the coordinate dimensions, a motion vector decoded value of the vertex in each coordinate dimension (S1803), thereby determining the motion vector decoding information corresponding to the vertex.

Inventors

WEI HONGLIAN

Assignees

Guangdong OPPO Mobile Telecommunications Corp., Ltd. (OPPO广东移动通信有限公司)

Dates

Publication Date: 20260512
Application Date: 20231012

Claims (20)

A decoding method applied to a decoder, the method comprising: parsing a code stream to determine first image-level syntax identification information corresponding to a base mesh of a current image; when the first image-level syntax identification information indicates inter-frame decoding, determining, by parsing the code stream, dimension syntax identification information corresponding to a vertex in the base mesh in each of a plurality of coordinate dimensions; and determining, based on the dimension syntax identification information corresponding to each of the plurality of coordinate dimensions, a motion vector decoded value corresponding to the vertex in each coordinate dimension, thereby determining motion vector decoding information corresponding to the vertex.
The method of claim 1, wherein the determining, based on the dimension syntax identification information corresponding to each of the plurality of coordinate dimensions, a motion vector decoded value corresponding to the vertex in each coordinate dimension comprises: for each coordinate dimension, when the dimension syntax identification information corresponding to the coordinate dimension is a first value, determining the motion vector decoded value corresponding to the vertex in the coordinate dimension by parsing the code stream, thereby determining the motion vector decoded value corresponding to the vertex in each coordinate dimension, wherein the first value indicates that the motion vector prediction mode corresponding to the coordinate dimension is a no-prediction mode.
The method of claim 1, wherein the determining, based on the dimension syntax identification information corresponding to each of the plurality of coordinate dimensions, a motion vector decoded value corresponding to the vertex in each coordinate dimension comprises: for each coordinate dimension, when the dimension syntax identification information corresponding to the coordinate dimension is a second value, determining at least one current image neighbor point corresponding to the vertex in the current image according to the connection relation of the base mesh, wherein the at least one current image neighbor point comprises a vertex in the current image that has a connection relation with the vertex and precedes the vertex in decoding order, and the second value indicates that the motion vector prediction mode corresponding to the coordinate dimension is an intra-frame prediction mode; determining a first residual corresponding to the vertex in the coordinate dimension by parsing the code stream; and determining the motion vector decoded value corresponding to the vertex in the coordinate dimension according to the first residual and at least one first motion vector reconstructed value corresponding to the at least one current image neighbor point, thereby determining the motion vector decoded value corresponding to the vertex in each coordinate dimension.
A method according to claim 3, wherein said determining a motion vector decoded value for said vertex in the coordinate dimension from at least one first motion vector reconstructed value for said at least one current image neighbor point and said first residual, comprises: Determining a first motion vector predicted value corresponding to the vertex in the coordinate dimension by carrying out weighted average on the at least one first motion vector reconstructed value; and determining a motion vector decoding value corresponding to the vertex in the coordinate dimension according to the first motion vector predicted value and the first residual error.
The method of claim 1, wherein the determining, based on the dimension syntax identification information corresponding to each of the plurality of coordinate dimensions, a motion vector decoded value corresponding to the vertex in each coordinate dimension comprises: determining the motion vector decoded value corresponding to the vertex in each coordinate dimension according to the decoding mode of a reference base mesh and the dimension syntax identification information corresponding to each coordinate dimension, wherein the reference base mesh is the base mesh of the reference image corresponding to the current image.
The method of claim 5, wherein the determining the motion vector decoded value corresponding to the vertex in each coordinate dimension according to the decoding mode of the reference base mesh and the dimension syntax identification information corresponding to each coordinate dimension comprises: when the decoding mode of the reference base mesh is an intra-frame decoding mode, for each coordinate dimension: if the dimension syntax identification information corresponding to the coordinate dimension is a first value, determining the motion vector decoded value corresponding to the vertex in the coordinate dimension by parsing the code stream; if the dimension syntax identification information corresponding to the coordinate dimension is a second value, determining at least one current image neighbor point corresponding to the vertex in the current image, and determining a first residual corresponding to the vertex in the coordinate dimension by parsing the code stream; and determining the motion vector decoded value corresponding to the vertex in the coordinate dimension according to the first residual and at least one first motion vector reconstructed value corresponding to the at least one current image neighbor point, thereby determining the motion vector decoded value corresponding to the vertex in each coordinate dimension.
The method according to claim 5, wherein the determining the motion vector decoded value corresponding to the vertex in each coordinate dimension according to the decoding mode of the reference base mesh and the dimension syntax identification information corresponding to each coordinate dimension comprises: when the decoding mode of the reference base mesh is an inter-frame decoding mode, for each coordinate dimension: if the dimension syntax identification information corresponding to the coordinate dimension is a first value, determining the motion vector decoded value corresponding to the vertex in the coordinate dimension by parsing the code stream; if the dimension syntax identification information corresponding to the coordinate dimension is a second value, determining at least one current image neighbor point corresponding to the vertex in the current image, and determining a first residual corresponding to the vertex in the coordinate dimension by parsing the code stream; if the dimension syntax identification information corresponding to the coordinate dimension is a third value, determining at least one reference image neighbor point corresponding to the vertex in the reference image of the current image, and determining a second residual corresponding to the vertex in the coordinate dimension by parsing the code stream, wherein the at least one reference image neighbor point comprises a reference vertex determined based on a co-located point corresponding to the vertex in the reference image; and determining the motion vector decoded value corresponding to the vertex in the coordinate dimension according to the second residual and at least one second motion vector reconstructed value corresponding to the at least one reference image neighbor point, thereby determining the motion vector decoded value corresponding to the vertex in each coordinate dimension.
The method of claim 7, wherein the determining, from the at least one second motion vector reconstruction value corresponding to the at least one reference image neighbor point and the second residual, a motion vector decoding value corresponding to the vertex in the coordinate dimension comprises: determining a second motion vector predicted value corresponding to the vertex in each coordinate dimension by carrying out weighted average on the at least one second motion vector reconstructed value; And determining a motion vector decoding value corresponding to the vertex in the coordinate dimension according to the second motion vector predicted value and the second residual error.
The method of claim 7, wherein the determining at least one reference image neighbor point corresponding to the vertex in a reference image of the current image comprises: in the reference image, determining the reference vertex having the same vertex index as the vertex as the co-located point corresponding to the vertex; and determining at least one of the co-located point and a reference vertex that has a connection relation with the co-located point in the reference image as the at least one reference image neighbor point.
The method of any of claims 1-9, wherein the vertex comprises a vertex of a current group among at least one group of the base mesh, and the determining, by parsing the code stream, dimension syntax identification information corresponding to the vertex in each of the plurality of coordinate dimensions comprises: determining dimension syntax identification information corresponding to the current group in each coordinate dimension by parsing the code stream; and determining the dimension syntax identification information corresponding to the current group in each coordinate dimension as the dimension syntax identification information corresponding to the vertices in the current group in each coordinate dimension.
The method of any one of claims 1-9, wherein the vertex comprises a current vertex in the base mesh, and the determining dimension syntax identification information corresponding to each of the plurality of coordinate dimensions for the vertex by parsing a codestream comprises: And determining dimension grammar identification information corresponding to the current vertex in each coordinate dimension by analyzing the code stream.
The method of any of claims 1-9, wherein the vertices in the base mesh include vertices of a current group among at least one group of the base mesh, the method further comprising: determining sequence-level first syntax identification information and/or second image-level syntax identification information by parsing the code stream; wherein, when the sequence-level first syntax identification information and/or the second image-level syntax identification information indicates block decoding of the base mesh, the determining, by parsing the code stream, dimension syntax identification information corresponding to the vertex in the base mesh in each of the plurality of coordinate dimensions comprises: determining dimension syntax identification information corresponding to the current group in each coordinate dimension by parsing the code stream; and determining the dimension syntax identification information corresponding to the current group in each coordinate dimension as the dimension syntax identification information corresponding to the vertices in the current group in each coordinate dimension.
The method of claim 12, wherein the vertices in the base mesh include a current vertex in the base mesh, and the determining, by parsing the code stream, dimension syntax identification information corresponding to the vertex in the base mesh in each of the plurality of coordinate dimensions comprises: when the sequence-level first syntax identification information and/or the second image-level syntax identification information indicates that block decoding is not performed on the base mesh, determining dimension syntax identification information corresponding to the current vertex in each coordinate dimension by parsing the code stream.
The method of claim 11, wherein the method further comprises: Determining sequence-level second grammar identification information and/or third image-level grammar identification information by analyzing the code stream, wherein the sequence-level second grammar identification information and/or the third image-level grammar identification information represent the number of vertexes in a group; And determining the vertex in the current grouping based on the sequence-level second syntax identification information and/or third image-level syntax identification information.
The method of claim 14, wherein the determining the sequence-level second syntax identification information and/or the third picture-level syntax identification information by parsing the bitstream comprises: In case the sequence-level first syntax identification information and/or the second picture-level syntax identification information indicates a block decoding of the base mesh, determining the sequence-level second syntax identification information and/or the third picture-level syntax identification information by parsing the bitstream.
The method according to claim 14 or 15, wherein the method further comprises: when the sequence-level second syntax identification information and/or the third image-level syntax identification information is a preset identification value, determining, by parsing the code stream, dimension syntax identification information corresponding to the current vertex in the base mesh in each coordinate dimension.
The method of claim 10, wherein the method further comprises: Decoding a next group in the at least one group according to a preset decoding sequence, and determining motion vector decoding information corresponding to vertexes in the next group until motion vector decoding information corresponding to each vertex in the basic grid is determined; and determining reconstruction information of the basic grid according to motion vector decoding information corresponding to each vertex in the basic grid and combining the connection relation of the basic grid.
The method of any one of claims 1-9, wherein the method further comprises: in the case that the first image level syntax identification information is a third identification value, determining that the first image level syntax identification information indicates inter-frame decoding.
An encoding method applied to an encoder, the method comprising: determining a base mesh of a current image and a motion vector corresponding to a vertex in the base mesh in each coordinate dimension; when the base mesh is inter-coded, performing coding cost estimation in at least one mode on the motion vector corresponding to the vertex in each coordinate dimension, to determine at least one coding cost in each coordinate dimension; and determining, according to the at least one coding cost, dimension syntax identification information corresponding to the vertex in each coordinate dimension, wherein the dimension syntax identification information corresponding to each coordinate dimension represents the motion vector prediction mode corresponding to that coordinate dimension.
The method of claim 19, wherein the at least one coding cost comprises at least one of a first coding cost and a second coding cost, and the performing coding cost estimation in at least one mode on the motion vector corresponding to the vertex in each coordinate dimension, to determine at least one coding cost in each coordinate dimension, comprises: performing coding cost estimation on the motion vector corresponding to the vertex in each coordinate dimension, to determine a first coding cost corresponding to the vertex in each coordinate dimension; and/or determining at least one current image neighbor point corresponding to the vertex in the current image according to the connection relation of the base mesh, wherein the at least one current image neighbor point comprises a vertex in the current image that has a connection relation with the vertex and precedes the vertex in coding order; and performing coding cost estimation on the motion vector corresponding to the vertex in each coordinate dimension according to at least one first motion vector reconstructed value corresponding to the at least one current image neighbor point in each coordinate dimension, to determine a second coding cost corresponding to the vertex in each coordinate dimension.
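
The per-dimension decoding recited in claims 1 to 9 can be illustrated with a minimal Python sketch. It is not the reference decoder. The numeric mode values, the function names, the equal default weights, the rounding rule, the zero fallback predictor, and the rule "decoded value = predictor + residual" are all assumptions made for illustration; the claims only state that the decoded value is determined from the predictor and the residual. Per claim 7, the third value is only meaningful when the reference base mesh was itself inter-decoded.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical numeric codes for the "first", "second" and "third" values of the claims.
NO_PREDICTION, INTRA_PREDICTION, INTER_PREDICTION = 0, 1, 2


def predict(values: Sequence[int], weights: Optional[Sequence[float]] = None) -> int:
    """Weighted average of neighbor MV components (equal weights by default)."""
    if not values:
        return 0  # assumption: zero predictor when no neighbor is available
    if weights is None:
        weights = [1.0] * len(values)
    total = sum(w * v for w, v in zip(weights, values))
    return round(total / sum(weights))  # assumption: rounding rule is illustrative


def decode_vertex_mv(
    dim_modes: Sequence[int],        # dimension syntax identification info, one per dimension
    parsed: Sequence[int],           # per dimension: the MV component (no prediction) or a residual
    cur_mv: Dict[int, List[int]],    # already-decoded MVs of vertices in the current image
    ref_mv: Dict[int, List[int]],    # reconstructed MVs of vertices in the reference base mesh
    cur_neighbors: Sequence[int],    # connected vertices that precede this vertex in decoding order
    ref_neighbors: Sequence[int],    # co-located vertex and/or vertices connected to it
) -> List[int]:
    """Return the decoded motion vector of one vertex, one component per dimension."""
    mv = []
    for d, mode in enumerate(dim_modes):
        if mode == NO_PREDICTION:
            mv.append(parsed[d])
        elif mode == INTRA_PREDICTION:
            pred = predict([cur_mv[n][d] for n in cur_neighbors])
            mv.append(pred + parsed[d])  # assumption: predictor plus first residual
        elif mode == INTER_PREDICTION:
            pred = predict([ref_mv[n][d] for n in ref_neighbors])
            mv.append(pred + parsed[d])  # assumption: predictor plus second residual
        else:
            raise ValueError(f"unknown mode {mode} in dimension {d}")
    return mv


# Usage: x is sent directly, y is predicted from the current image, z from the reference image.
cur_mv = {0: [4, 0, -2], 1: [6, 2, -2]}
ref_mv = {0: [3, 1, 5], 2: [1, 1, 1]}
decoded = decode_vertex_mv([0, 1, 2], [7, 1, -1], cur_mv, ref_mv, [0, 1], [2, 0])
print(decoded)  # [7, 2, 2]
```

Because the mode is signaled separately for each coordinate dimension, one component of a vertex's motion vector can be parsed directly while another is predicted, as the usage lines show.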

Description

Encoding/decoding method, code stream, encoder, decoder, and storage medium

Technical Field

The embodiments of the application relate to the technical field of dynamic mesh coding, and in particular to an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium.

Background

In the standard reference software for Dynamic Mesh Coding (DMC) provided by the Moving Picture Experts Group (MPEG), the motion vectors of the vertices in the base mesh are typically encoded when the base mesh of a current image is inter-coded. At the decoding end, the base mesh corresponding to the current image is obtained by decoding the motion vectors of the vertices in the base mesh and combining them with the connection information in the base mesh of the reference image corresponding to the current image.

However, the current way of determining how the motion vectors are encoded and decoded is not well developed, so the coding cost of the motion vectors is high, which reduces the coding efficiency of the base mesh and in turn the coding performance of DMC.

Disclosure of Invention

The embodiments of the application provide an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium, which can improve the coding efficiency of the base mesh and thereby the coding performance of DMC.
The technical solutions of the embodiments of the application can be implemented as follows:

In a first aspect, an embodiment of the present application provides a decoding method applied to a decoder, the method comprising: parsing a code stream to determine first image-level syntax identification information corresponding to a base mesh of a current image; when the first image-level syntax identification information indicates inter-frame decoding, determining, by parsing the code stream, dimension syntax identification information corresponding to a vertex in the base mesh in each of a plurality of coordinate dimensions; and determining, based on the dimension syntax identification information corresponding to each of the plurality of coordinate dimensions, a motion vector decoded value corresponding to the vertex in each coordinate dimension, thereby determining motion vector decoding information corresponding to the vertex.

In a second aspect, an embodiment of the present application provides an encoding method applied to an encoder, the method comprising: determining a base mesh of a current image and a motion vector corresponding to a vertex in the base mesh in each coordinate dimension; when the base mesh is inter-coded, performing coding cost estimation in at least one mode on the motion vector corresponding to the vertex in each coordinate dimension, to determine at least one coding cost in each coordinate dimension; and determining, according to the at least one coding cost, dimension syntax identification information corresponding to the vertex in each coordinate dimension, wherein the dimension syntax identification information corresponding to each coordinate dimension represents the motion vector prediction mode corresponding to that coordinate dimension.
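
The encoder-side choice in the second aspect can be sketched as follows, under stated assumptions. The text does not define how coding cost is estimated, so this sketch substitutes the bit length of a signed order-0 Exp-Golomb code as a stand-in cost. The numeric mode values, the function names, and the tie-breaking rule are likewise hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical numeric codes for the prediction modes.
NO_PREDICTION, INTRA_PREDICTION, INTER_PREDICTION = 0, 1, 2


def se_golomb_bits(value: int) -> int:
    """Bit length of a signed order-0 Exp-Golomb code (stand-in cost, an assumption)."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (code_num + 1).bit_length() - 1


def choose_modes(mv: Sequence[int], predictions: Dict[int, Sequence[int]]) -> List[int]:
    """Pick, per coordinate dimension, the mode with the lowest estimated cost.

    mv          : motion vector of one vertex, one component per dimension
    predictions : available prediction modes mapped to their per-dimension predictors
    """
    chosen = []
    for d, component in enumerate(mv):
        # Cost of sending the component directly (no prediction).
        costs = {NO_PREDICTION: se_golomb_bits(component)}
        # Cost of sending the residual against each available predictor.
        for mode, pred in predictions.items():
            costs[mode] = se_golomb_bits(component - pred[d])
        # Assumption: ties are broken in favor of the smaller mode value.
        chosen.append(min(costs, key=lambda m: (costs[m], m)))
    return chosen


# Usage: each dimension ends up with a different mode.
modes = choose_modes(
    [0, 9, -5],
    {INTRA_PREDICTION: [4, 8, 0], INTER_PREDICTION: [3, 3, -5]},
)
print(modes)  # [0, 1, 2]
```

The chosen value for each dimension is what the dimension syntax identification information would carry in the code stream; a real encoder might use an entropy-coder-aware or rate-distortion cost in place of this bit count.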
In a third aspect, an embodiment of the present application provides a code stream, where the code stream is generated by bit encoding of information to be encoded, and the information to be encoded is determined based on dimension syntax identification information corresponding to a vertex in a base mesh of a current image in each coordinate dimension. The dimension syntax identification information corresponding to the vertex in each coordinate dimension is obtained, when the base mesh is inter-coded, by performing coding cost estimation in at least one mode on the motion vector corresponding to the vertex in each coordinate dimension, determining at least one coding cost in each coordinate dimension, and making the determination according to the at least one coding cost, wherein the dimension syntax identification information corresponding to each coordinate dimension represents the motion vector prediction mode corresponding to that coordinate dimension.

In a fourth aspect, an embodiment of the present application provides a decoder, including: a code stream parsing part, configured to parse the code stream to determine first image-level syntax identification information corresponding to a base mesh of a current image, and, when the first image-level syntax identification information indicates inter-frame decoding, to determine dimension syntax identification information corresponding to a vertex in the base mesh in each coordinate dimension by parsing the code stream, wherein the dimension syntax identification information corresponding to each coordinate dimension represents the motion vector prediction mode corresponding to that coordinate dimension; and a de