
EP-3906686-B1 - METHOD AND APPARATUS FOR VIDEO CODING

EP 3906686 B1

Inventors

  • XU, XIAOZHONG
  • LI, XIANG
  • LIU, SHAN

Dates

Publication Date
2026-05-06
Application Date
2020-01-06

Claims (12)

  1. A method for video decoding in a decoder, comprising: decoding (S1510) prediction information of a current block in a current coding tree unit, CTU, from a coded video bitstream, the prediction information indicating an intra block copy, IBC, mode; determining (S1520) padded values of a reference block based on a block vector that points to the reference block, the padded values of the reference block being copied from a reference sample line; and reconstructing (S1530) at least a sample of the current block based on the padded values of the reference block, characterized in that the method further comprises padding a top-left block, a bottom-left block, and a bottom-right block of a CTU (1350) adjacent to the left of the current CTU (1340) by a combination of horizontal and vertical padding when decoding the current coding block in a top-left block of the current CTU (1340) and when a search range only includes a top-right block of the CTU (1350) adjacent to the left of the current CTU (1340), wherein the top-left block and the bottom-left block of the CTU (1350) adjacent to the left of the current CTU (1340) are padded vertically using a reference sample line on top of the CTU (1350) adjacent to the left of the current CTU (1340) and the bottom-right block of the CTU (1350) adjacent to the left of the current CTU (1340) is padded horizontally using another reference sample line at a rightmost column of the CTU (1350) adjacent to the left of the current CTU (1340), wherein the vertical padding of the top-left block and the bottom-left block is performed based on a first distance between the top-left block and the bottom-left block, and the reference sample line above the current CTU (1340), and wherein the horizontal padding of the bottom-right block is performed based on a second distance between the bottom-right block and the another reference sample line to the left of the current CTU (1340).
  2. The method of claim 1, wherein reconstructed samples of the reference block are not stored in a reference sample memory, and the padded values of the reference block are stored in a memory that is different from the reference sample memory, wherein the reference sample memory is a memory that stores reference samples of previously decoded CUs for future IBC-based compensation.
  3. The method of claim 2, wherein a maximum size of the reference sample memory is limited to four sets of 64x64 luma samples and corresponding chroma samples.
  4. The method of claim 2, wherein an adjacent left CTU is partitioned into a top-left reference coding region, a top-right reference coding region, a bottom-left reference coding region, and a bottom-right reference coding region, and each of the reference coding regions in the left CTU including reconstructed samples that are not stored in the reference sample memory is padded by the reference sample line above the current CTU or to the left of the current CTU.
  5. The method of claim 2, wherein an adjacent left CTU is partitioned into a top-left reference coding region, a top-right reference coding region, a bottom-left reference coding region, and a bottom-right reference coding region, and the reference block is included in the top-right reference coding region or the bottom-right reference coding region of the left CTU, or in the current CTU.
  6. The method of claim 5, wherein a maximum size of the reference sample memory is limited to three sets of 64x64 luma samples and corresponding chroma samples.
  7. The method of claim 5, wherein each of the top-right reference coding region of the left CTU, the bottom-right reference coding region of the left CTU, and reference coding regions in the current CTU including reconstructed samples that are not stored in the reference sample memory is padded by the reference sample line above the current CTU or to the left of the current CTU.
  8. The method of claim 5, wherein a maximum size of the reference sample memory is limited to two sets of 64x64 luma samples and corresponding chroma samples.
  9. The method of claim 8, wherein each of the top-right reference coding region of the left CTU, the bottom-right reference coding region of the left CTU, and reference coding regions in the current CTU including reconstructed samples that are not stored in the reference sample memory is padded by the reference sample line above the current CTU or to the left of the current CTU.
  10. The method of claim 2, wherein the reference block is padded by boundary pixels of a reconstructed reference block in one of the current CTU and an adjacent left CTU, and reconstructed samples of the reconstructed reference block are stored in the reference sample memory.
  11. An apparatus, comprising: processing circuitry configured to perform the method for video decoding in the decoder of any one of claims 1 to 10.
  12. A non-transitory computer-readable medium storing instructions which when executed by a computer for video decoding cause the computer to perform the method for video decoding in the decoder of any one of claims 1 to 10.
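To make the geometry recited in claims 1 and 4 to 10 concrete, the following is a minimal C++ sketch of how a 64x64 reference region that has been evicted from the reference sample memory could be synthesised from retained reference sample lines: the top-left and bottom-left regions of the left CTU by vertical padding from the line along the top of that CTU, and the bottom-right region by horizontal padding from the line at its rightmost column. All type and function names, the 128x128-CTU / 64x64-region geometry, and the simple sample replication are assumptions made for illustration; they are not taken from the patent text or from any reference decoder.

#include <cstdint>
#include <vector>

// A padded 64x64 luma region, filled from a retained reference sample line
// rather than from the (unavailable) reconstructed samples of the left CTU.
struct PaddedRegion {
    static constexpr int kSize = 64;                       // assumed region size
    std::vector<uint8_t> samples = std::vector<uint8_t>(kSize * kSize);
    uint8_t& at(int x, int y) { return samples[y * kSize + x]; }
};

// Vertical padding (claim 1, top-left and bottom-left regions of the left CTU):
// each column of the region replicates the sample of the reference sample line
// running along the top of the left CTU. topLine[regX0 + x] is the retained
// sample directly above column x of the region; the same value is copied down
// the whole column regardless of the vertical distance to that line.
PaddedRegion padVertically(const std::vector<uint8_t>& topLine, int regX0) {
    PaddedRegion r;
    for (int y = 0; y < PaddedRegion::kSize; ++y)
        for (int x = 0; x < PaddedRegion::kSize; ++x)
            r.at(x, y) = topLine[regX0 + x];
    return r;
}

// Horizontal padding (claim 1, bottom-right region of the left CTU):
// each row of the region replicates the sample of the reference sample line at
// the rightmost column of the left CTU (the column retained for predicting the
// current CTU). leftLine[regY0 + y] is the retained sample in the same row.
PaddedRegion padHorizontally(const std::vector<uint8_t>& leftLine, int regY0) {
    PaddedRegion r;
    for (int y = 0; y < PaddedRegion::kSize; ++y)
        for (int x = 0; x < PaddedRegion::kSize; ++x)
            r.at(x, y) = leftLine[regY0 + y];
    return r;
}

Under this reading, an IBC block vector that points into one of the padded regions would be served from the padded copy rather than from the reference sample memory, consistent with the separate storage of padded values in claim 2.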

Description

This present application claims the benefit of priority to U.S. Patent Application No. 16/734,107, "Method and Apparatus for Video Coding," filed on January 3, 2020, which claims the benefit of priority to U.S. Provisional Application No. 62/788,935, "Intra picture block compensation with boundary padding," filed on January 6, 2019.

TECHNICAL FIELD

The present disclosure describes embodiments generally related to video coding.

BACKGROUND

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

Video coding and decoding can be performed using inter-picture prediction with motion compensation. Uncompressed digital video can include a series of pictures, each picture having a spatial dimension of, for example, 1920 x 1080 luminance samples and associated chrominance samples. The series of pictures can have a fixed or variable picture rate (informally also known as frame rate) of, for example, 60 pictures per second or 60 Hz. Uncompressed video has significant bitrate requirements. For example, 1080p60 4:2:0 video at 8 bits per sample (1920 x 1080 luminance sample resolution at a 60 Hz frame rate) requires close to 1.5 Gbit/s of bandwidth. An hour of such video requires more than 600 GBytes of storage space.

One purpose of video coding and decoding can be the reduction of redundancy in the input video signal through compression. Compression can help reduce the aforementioned bandwidth or storage space requirements, in some cases by two orders of magnitude or more. Both lossless and lossy compression, as well as a combination thereof, can be employed. Lossless compression refers to techniques where an exact copy of the original signal can be reconstructed from the compressed original signal. When using lossy compression, the reconstructed signal may not be identical to the original signal, but the distortion between the original and reconstructed signals is small enough to make the reconstructed signal useful for the intended application. In the case of video, lossy compression is widely employed. The amount of distortion tolerated depends on the application; for example, users of certain consumer streaming applications may tolerate higher distortion than users of television distribution applications. The achievable compression ratio can reflect that: higher allowable/tolerable distortion can yield higher compression ratios.

Motion compensation can be a lossy compression technique and can relate to techniques where a block of sample data from a previously reconstructed picture or part thereof (reference picture), after being spatially shifted in a direction indicated by a motion vector (MV henceforth), is used for the prediction of a newly reconstructed picture or picture part. In some cases, the reference picture can be the same as the picture currently under reconstruction. MVs can have two dimensions, X and Y, or three dimensions, the third being an indication of the reference picture in use (the latter, indirectly, can be a time dimension).
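As a rough illustration of the block-based motion compensation just described, here is a hypothetical C++ sketch that forms a prediction block by copying samples from a reference picture at a position displaced by an integer motion vector. The names, the border clamping, and the omission of sub-pel interpolation and chroma handling are simplifying assumptions, not details taken from any particular codec; in the IBC mode addressed by the claims, the "reference picture" is the current picture itself and the displacement is the block vector.

#include <algorithm>
#include <cstdint>
#include <vector>

// Minimal view of a reconstructed reference picture: a row-major luma plane.
struct Plane {
    int width = 0;
    int height = 0;
    std::vector<uint8_t> samples;                   // size == width * height
    uint8_t at(int x, int y) const {
        // Clamp to the picture border; real codecs define border extension explicitly.
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return samples[y * width + x];
    }
};

// Integer-pel motion compensation: predict a blkSize x blkSize block located at
// (blkX, blkY) in the picture under reconstruction from the reference picture
// displaced by the motion vector (mvX, mvY).
std::vector<uint8_t> motionCompensate(const Plane& ref,
                                      int blkX, int blkY, int blkSize,
                                      int mvX, int mvY) {
    std::vector<uint8_t> pred(blkSize * blkSize);
    for (int y = 0; y < blkSize; ++y)
        for (int x = 0; x < blkSize; ++x)
            pred[y * blkSize + x] = ref.at(blkX + x + mvX, blkY + y + mvY);
    return pred;
}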
In some video compression techniques, an MV applicable to a certain area of sample data can be predicted from other MVs, for example from those related to another area of sample data spatially adjacent to the area under reconstruction and preceding that MV in decoding order. Doing so can substantially reduce the amount of data required for coding the MV, thereby removing redundancy and increasing compression. MV prediction can work effectively, for example, because when coding an input video signal derived from a camera (known as natural video) there is a statistical likelihood that areas larger than the area to which a single MV is applicable move in a similar direction and, therefore, can in some cases be predicted using a similar motion vector derived from MVs of a neighboring area. That results in the MV found for a given area being similar or the same as the MV predicted from the surrounding MVs, which in turn can be represented, after entropy coding, in a smaller number of bits than would be used if coding the MV directly. In some cases, MV prediction can be an example of lossless compression of a signal (namely: the MVs) derived from the original signal (namely: the sample stream). In other cases, MV prediction itself can be lossy, for example because of rounding errors when calculating a predictor from several surrounding MVs.

Various MV prediction mechanisms are described in H.265/HEVC (ITU-T Rec. H.265, "High Efficiency Video Coding," December 2016). Out of the many MV prediction mechanisms that H.265 offers, described herein is a technique henceforth referred to as "spatial merge." Referring to FIG. 1, a current blo