US-12621423-B2 - Image data encoding/decoding method and apparatus

US12621423B2US 12621423 B2US12621423 B2US 12621423B2US-12621423-B2

Abstract

Disclosed are methods and apparatuses for decoding an image. A method includes receiving a bitstream obtained by encoding the image; dividing a first coding block into a plurality of second coding blocks; generating a prediction block of a second coding block based on syntax information obtained from the bitstream; and reconstructing the second coding block based on the prediction block and a residual block of the second coding block, the residual block being obtained by performing a dequantization and an inverse-transform on quantized transform coefficients from the bitstream. The first coding block has a recursive division structure. The first coding block is divided based on at least one of a quad tree division, a binary tree division or a triple tree division.
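The abstract describes a first coding block with a recursive division structure, split by quad tree, binary tree, or triple tree rules into second coding blocks. A minimal sketch of such recursive partitioning is shown below; the function names, the 1:2:1 triple-tree ratio, and the split-decision callback are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of recursive block division (quad / binary / triple tree).
# All names and the 1:2:1 triple-tree ratio are assumptions for illustration.

def split_block(x, y, w, h, mode):
    """Return child (x, y, w, h) rectangles for one split mode."""
    if mode == "quad":                       # four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "binary_h":                   # two halves, horizontal cut
        hh = h // 2
        return [(x, y, w, hh), (x, y + hh, w, hh)]
    if mode == "binary_v":                   # two halves, vertical cut
        hw = w // 2
        return [(x, y, hw, h), (x + hw, y, hw, h)]
    if mode == "triple_v":                   # 1:2:1 vertical split
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError(mode)

def divide(x, y, w, h, decide, leaves=None):
    """Recursively divide; `decide` returns a split mode or None (leaf)."""
    if leaves is None:
        leaves = []
    mode = decide(x, y, w, h)
    if mode is None:
        leaves.append((x, y, w, h))
    else:
        for cx, cy, cw, ch in split_block(x, y, w, h, mode):
            divide(cx, cy, cw, ch, decide, leaves)
    return leaves

# Example: quad-split a 64x64 first coding block once; children become leaves.
leaf_blocks = divide(0, 0, 64, 64,
                     lambda x, y, w, h: "quad" if w == 64 else None)
```

In a real codec the split decision is signalled per block in the bitstream rather than computed by a callback; the callback here merely stands in for that parsed syntax.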

Inventors

  • Ki Baek Kim

Assignees

  • B1 INSTITUTE OF IMAGE TECHNOLOGY, INC.

Dates

Publication Date
20260505
Application Date
20250811
Priority Date
20161004

Claims (3)

  1. A method of decoding a 360-degree image performed by an image decoding apparatus, the method comprising: receiving a bitstream in which the 360-degree image is encoded, the bitstream including data of an expanded 2-dimensional image, the expanded 2-dimensional image including a 2-dimensional image and a predetermined expansion region, and the 2-dimensional image being projected from an image with a 3-dimensional projection structure and including one or more faces; generating a predicted image by performing prediction based on information on the prediction included in the bitstream; and reconstructing the expanded 2-dimensional image based on the predicted image and a residual image, wherein a size of the expansion region is determined based on first width information of the expansion region on a left side of the face and second width information of the expansion region on a right side of the face, both the first width information and the second width information being obtained from the bitstream, wherein both the first width information and the second width information are obtained based on whether the expansion region exists, and whether the expansion region exists is determined by flag information obtained from the bitstream, wherein sample values of the expansion region are determined by horizontally copying the sample values of the face according to a padding method selected from a plurality of padding methods, wherein the size of the expansion region is determined based on size information obtained from the bitstream, wherein the predicted image is added to the residual image to reconstruct the expanded 2-dimensional image, wherein the number of syntax elements for the size information is determined differently based on a projection format for the 3-dimensional projection structure, the projection format being one among a plurality of projection formats including an ERP format in which the 360-degree image is projected in a two-dimensional plane and 
a CMP format in which the 360-degree image is projected in a cube, wherein the residual image is obtained by decoding residual information for the residual image included in the bitstream, wherein the predicted image is generated by referring to at least one neighboring sample, wherein the 3-dimensional projection structure is selectively determined based on identification information, among a plurality of pre-defined projection formats including the ERP format in which the 360-degree image is projected in the two-dimensional plane and the CMP format in which the 360-degree image is projected in the cube, and wherein the sample values of the expansion region are determined differently according to the padding method selected from the plurality of padding methods, the padding method being determined for each of the one or more faces independently from each other.
  2. A method of encoding a 360-degree image performed by an image encoding apparatus, the method comprising: obtaining a 2-dimensional image projected from an image with a 3-dimensional projection structure and including at least one face; obtaining an expanded 2-dimensional image including the 2-dimensional image and a predetermined expansion region; generating a predicted image by performing prediction, information on the prediction being encoded into the bitstream; and encoding data of the expanded 2-dimensional image into the bitstream based on the predicted image and a residual image, wherein a size of the expansion region is encoded based on first width information of the expansion region on a left side of the face and second width information of the expansion region on a right side of the face, both the first width information and the second width information being encoded into the bitstream, wherein both the first width information and the second width information are encoded based on whether the expansion region exists, and whether the expansion region exists is encoded by flag information encoded into the bitstream, wherein sample values of the expansion region are determined by horizontally copying the sample values of the face according to a padding method selected from a plurality of padding methods, wherein size information on the size of the expansion region is encoded into the bitstream, wherein the residual image is obtained based on the expanded 2-dimensional image and the predicted image, wherein the residual image is encoded by encoding residual information for the residual image into the bitstream, wherein the number of syntax elements for the size information is determined differently based on a projection format for the 3-dimensional projection structure, the projection format being one among a plurality of projection formats including an ERP format in which the 360-degree image is projected in a two-dimensional plane and a CMP format in 
which the 360-degree image is projected in a cube, wherein the residual information is included in the bitstream by encoding the residual image, wherein the predicted image is generated by referring to at least one neighboring sample, wherein the 3-dimensional projection structure is selectively determined based on identification information, among a plurality of pre-defined projection formats including the ERP format in which the 360-degree image is projected in the two-dimensional plane and the CMP format in which the 360-degree image is projected in the cube, and wherein sample values of the expansion region are determined differently according to a padding method selected from a plurality of padding methods, the padding method being determined for each of the one or more faces independently from each other.
  3. A method of transmitting a bitstream performed by a transmitting apparatus, comprising: transmitting the bitstream to an image decoding apparatus, wherein the bitstream is generated by performing obtaining a 2-dimensional image projected from an image with a 3-dimensional projection structure and including at least one face; obtaining an expanded 2-dimensional image including the 2-dimensional image and a predetermined expansion region; generating a predicted image by performing prediction, information on the prediction being encoded into the bitstream; and encoding data of the expanded 2-dimensional image into the bitstream based on the predicted image and a residual image, wherein a size of the expansion region is encoded based on first width information of the expansion region on a left side of the face and second width information of the expansion region on a right side of the face, both the first width information and the second width information being encoded into the bitstream, wherein both the first width information and the second width information are encoded based on whether the expansion region exists, and whether the expansion region exists is encoded by flag information encoded into the bitstream, wherein sample values of the expansion region are determined by horizontally copying the sample values of the face according to a padding method selected from a plurality of padding methods, wherein size information on the size of the expansion region is encoded into the bitstream, wherein the residual image is obtained based on the expanded 2-dimensional image and the predicted image, wherein the residual image is encoded by encoding residual information for the residual image into the bitstream, wherein the number of syntax elements for the size information is determined differently based on a projection format for the 3-dimensional projection structure, the projection format being one among a plurality of projection formats including an ERP 
format in which the 360-degree image is projected in a two-dimensional plane and a CMP format in which the 360-degree image is projected in a cube, wherein residual information is included in the bitstream by encoding the residual image, wherein the predicted image is generated by referring to at least one neighboring sample, wherein the 3-dimensional projection structure is selectively determined based on identification information, among a plurality of pre-defined projection formats including the ERP format in which the 360-degree image is projected in the two-dimensional plane and the CMP format in which the 360-degree image is projected in the cube, and wherein sample values of the expansion region are determined differently according to a padding method selected from a plurality of padding methods, the padding method being determined for each of the one or more faces independently from each other.
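The claims above describe per-face expansion regions whose left and right widths are signalled in the bitstream, gated by a flag, with samples filled by horizontally copying the face's edge samples. A hedged sketch of that padding step follows; the function name and plain arguments (standing in for the parsed flag and width syntax elements) are illustrative assumptions.

```python
# Hypothetical sketch of the expansion-region padding described in the claims.
# `exists_flag`, `left_w`, `right_w` stand in for the signalled flag and the
# first/second width information; names are illustrative, not from the patent.

def expand_face(face, exists_flag, left_w, right_w):
    """face: list of rows (lists of samples). Returns the expanded rows."""
    if not exists_flag:                      # no expansion region signalled
        return [row[:] for row in face]
    out = []
    for row in face:
        left_pad = [row[0]] * left_w         # horizontally copy leftmost sample
        right_pad = [row[-1]] * right_w      # horizontally copy rightmost sample
        out.append(left_pad + row + right_pad)
    return out

face = [[10, 20, 30],
        [40, 50, 60]]
expanded = expand_face(face, exists_flag=True, left_w=2, right_w=1)
# expanded[0] is [10, 10, 10, 20, 30, 30]
```

Since the claims state that the padding method is selected per face from a plurality of methods, a fuller implementation would dispatch between several such fill functions rather than hard-coding edge replication.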

Description

RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 18/523,855, filed Nov. 29, 2023, which is a continuation application of U.S. patent application Ser. No. 17/073,225, filed Oct. 16, 2020, now U.S. Pat. No. 12,028,503, which is a continuation application of U.S. patent application Ser. No. 16/372,251, filed Apr. 1, 2019, which is a continuation application of International Patent Application Serial No. PCT/KR2017/011144, filed Oct. 10, 2017, which claims priority to Korean Patent Application Serial No. 10-2016-0127883, filed Oct. 4, 2016; Korean Patent Application Serial No. 10-2016-0129383, filed Oct. 6, 2016; and Korean Patent Application Serial No. 10-2017-0090613, filed Jul. 17, 2017. All of these applications are incorporated by reference herein in their entireties.

TECHNICAL FIELD

The present invention relates to image data encoding and decoding technology, and more particularly, to a method and apparatus for encoding and decoding a 360-degree image for realistic media service.

BACKGROUND

With the spread of the Internet and mobile terminals and the development of information and communication technology, the use of multimedia data is increasing rapidly. Recently, demand for high-resolution and high-quality images such as high definition (HD) and ultra high definition (UHD) images is emerging in various fields, and demand for realistic media services such as virtual reality and augmented reality is increasing rapidly. In particular, since multi-view images captured with a plurality of cameras are processed to produce 360-degree images for virtual reality and augmented reality, the amount of data generated for this processing increases massively, while the performance of image processing systems handling such large amounts of data remains insufficient.
As described above, the image encoding and decoding methods and apparatuses of the related art leave room for improved image processing performance, particularly in image encoding/decoding.

SUMMARY

It is an object of the present invention to provide a method for improving the image setting process in the initial steps of encoding and decoding. More particularly, the present invention is directed to providing an encoding and decoding method and apparatus that improve the image setting process in consideration of the characteristics of a 360-degree image.

According to an aspect of the present invention, there is provided a method of decoding a 360-degree image. Here, the method may include receiving a bitstream including an encoded 360-degree image, generating a predicted image with reference to syntax information acquired from the received bitstream, acquiring a decoded image by combining the generated predicted image with a residual image acquired by inversely quantizing and inversely transforming the bitstream, and reconstructing the decoded image into the 360-degree image according to a projection format.

Here, the syntax information may include projection format information for the 360-degree image. The projection format information may indicate at least one of an Equi-Rectangular Projection (ERP) format in which the 360-degree image is projected onto a 2D plane, a CubeMap Projection (CMP) format in which the 360-degree image is projected onto a cube, an OctaHedron Projection (OHP) format in which the 360-degree image is projected onto an octahedron, and an IcoSahedral Projection (ISP) format in which the 360-degree image is projected onto a polyhedron. Here, the reconstructing may include acquiring arrangement information according to region-wise packing with reference to the syntax information and rearranging blocks of the decoded image according to the arrangement information.
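For the ERP format mentioned in the summary, reconstructing the decoded 2D plane into the 360-degree image amounts to mapping each sample position to a direction on the sphere. The sketch below shows one conventional pixel-to-angle mapping for equirectangular projection; the function name, the sample-centre offset, and the linear mapping are assumptions for illustration and are not taken from the patent.

```python
import math

# Illustrative ERP pixel-to-sphere mapping (not the patent's method):
# each sample centre (u, v) in a W x H equirectangular plane maps to a
# (longitude, latitude) pair in radians on the 3-dimensional sphere.

def erp_to_sphere(u, v, W, H):
    """Map ERP pixel centre (u, v) to (longitude, latitude) in radians."""
    lon = ((u + 0.5) / W - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi)
    lat = (0.5 - (v + 0.5) / H) * math.pi          # latitude in [-pi/2, pi/2]
    return lon, lat

# Top-left sample of a tiny 4x2 ERP plane lands in the left half of the
# upper hemisphere.
lon, lat = erp_to_sphere(0, 0, 4, 2)
```

The other signalled formats (CMP, OHP, ISP) would each use their own face-wise mapping in place of this single-plane formula.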
Here, the generating of the predicted image may include performing image expansion on a reference picture acquired by restoring the bitstream, and generating a predicted image with reference to the reference picture on which the image expansion is performed.

Here, the performing of the image expansion may include performing image expansion on the basis of partitioning units of the reference picture. The performing of the image expansion on the basis of the partitioning units may include generating an expanded region individually for each partitioning unit by using the reference pixel of the partitioning unit.

Here, the expanded region may be generated using a boundary pixel of a partitioning unit spatially adjacent to a partitioning unit to be expanded or using a boundary pixel of a partitioning unit having image continuity with a partitioning unit to be expanded.

Here, the performing of the image expansion on the basis of the partitioning units may include generating an expanded image for a region where two or more partitioning units that are spatially adjacent to each other among the partitioning units are combined, usi