US-12627827-B2 - Template based prediction
Abstract
Aspects of the disclosure include methods and apparatuses for video decoding and video encoding and a method of processing visual media data. Coded information in a bitstream is received. The coded information indicates whether filtering is to be applied to at least one of a current template of a current block in a current picture and a reference template of a reference block. The current block is predicted based on the reference block. When the coded information indicates that the filtering is to be applied, a plurality of samples in the at least one of the current template and the reference template is filtered. A linear model between the current template and the reference template is determined based on the filtered plurality of samples. The current block is reconstructed based on the linear model and the reference block.
Inventors
- Biao Wang
- Madhu PERINGASSERY KRISHNAN
- Lien-Fei CHEN
- Roman CHERNYAK
- Xin Zhao
- Shan Liu
Assignees
- Tencent America LLC
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-07-10
Claims (20)
- 1. A method for video decoding, the method comprising: receiving coded information in a bitstream, the coded information indicating whether filtering is to be applied to at least one of a current template of a current block in a current picture and a reference template of a reference block, the current block being predicted based on the reference block of the current block; and when the coded information indicates that the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block, filtering a plurality of samples in the at least one of the current template of the current block and the reference template of the reference block using at least one filter that includes one of a 3×3 filter, a 3×2 filter, and a 2×3 filter; determining a linear filter between the current template and the reference template based on the filtered plurality of samples in the at least one of the current template and the reference template, the linear filter being different from the at least one filter; applying the linear filter to the reference block to determine a prediction signal for the current block; and reconstructing the current block based on the prediction signal.
- 2. The method of claim 1, wherein the reference block is in the current picture when the current block is predicted according to an intra template matching prediction (IntraTMP) mode or an intra block copy (IBC) mode; the reference block is in a reference picture that is different from the current picture when the current block is predicted according to an inter prediction method; the current template includes neighboring reconstructed samples of the current block; and the reference template includes neighboring reconstructed samples of the reference block.
- 3. The method of claim 2, wherein when the reference block is in the current picture, a value of a sample in the prediction signal is a weighted sum of the sample in the reference block, a plurality of neighboring samples of the sample in the reference block, and a bias term according to the linear filter; and when the reference block is in the reference picture, the value of the sample in the prediction signal is a weighted sum of the sample in the reference block and the bias term according to the linear filter.
- 4. The method of claim 1, wherein the at least one of the current template and the reference template includes the current template and the reference template.
- 5. The method of claim 4, wherein the filtering comprises: filtering first samples of the plurality of samples in the current template of the current block with a first filter in the at least one filter; and filtering second samples of the plurality of samples in the reference template of the reference block with a second filter that is different from the first filter, the second filter being one of the at least one filter.
- 6. The method of claim 1, wherein the at least one of the current template and the reference template consists of the current template or the reference template.
- 7. The method of claim 1, wherein the at least one filter includes a filter that is $\begin{bmatrix} -1 & -1 & -1 \\ -1 & 10 & -1 \\ -1 & -1 & -1 \end{bmatrix}$.
- 8. The method of claim 1, wherein the coded information in the bitstream includes a flag indicating whether the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block, and the flag is signaled at a block level or a high-level that is higher than the block level, or whether the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block is determined implicitly based on already parsed information of the coded information, the already parsed information including an intra prediction mode.
- 9. The method of claim 1, wherein the filtering is applied to the at least one of the current template and the reference template only when the current block is a luma block.
- 10. The method of claim 1, wherein the current block is a luma block or a chroma block.
- 11. The method of claim 1, wherein the current template comprises one of: a top template that is directly above the current block, a left template that is to the left of the current block, and a top-left corner template between the top template and the left template; and the top template, the left template, the top-left corner template between the top template and the left template, a top-right template that is above and to the right of the current block, and a bottom-left template that is below and to the left of the current block; and the reference template has a same shape and a same size as the current template.
- 12. The method of claim 1, wherein the plurality of samples is a subset of samples in the at least one of the current template of the current block and the reference template of the reference block.
- 13. The method of claim 12, wherein the current template of the current block includes a line that includes two end samples and middle samples that are between the two end samples, each middle sample is adjacent to two samples in the line, each end sample is adjacent to only one sample in the line, and the subset of samples in the at least one of the current template of the current block and the reference template of the reference block includes the middle samples of the current template and excludes the two end samples of the current template.
- 14. The method of claim 12, wherein the current template of the current block includes two end lines and one or more middle lines, one of the two end lines is adjacent to the current block, the one or more middle lines are between the two end lines, and the subset of samples in the at least one of the current template of the current block and the reference template of the reference block includes samples in at least one of the one or more middle lines in the current template and excludes the two end lines in the current template.
- 15. The method of claim 1, wherein a filter shape used in the filtering depends on at least one of a location of the current template, a location of the reference template, a size of the current block, a shape of the current block, a size of the current template, and a shape of the current template.
- 16. A method for video encoding, the method comprising: when filtering is to be applied to at least one of a current template of a current block in a current picture and a reference template of a reference block, filtering a plurality of samples in the at least one of the current template of the current block and the reference template of the reference block using at least one filter that includes one of a 3×3 filter, a 3×2 filter, and a 2×3 filter; determining a linear filter between the current template and the reference template based on the filtered plurality of samples in the at least one of the current template and the reference template, the linear filter being different from the at least one filter; applying the linear filter to the reference block to determine a prediction signal for the current block; encoding the current block based on the prediction signal; and encoding, in a bitstream, a syntax element indicating whether the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block.
- 17. The method of claim 16, wherein the reference block is in the current picture when the current block is encoded according to an intra template matching prediction (IntraTMP) mode or an intra block copy (IBC) mode; the reference block is in a reference picture that is different from the current picture when the current block is encoded according to an inter prediction mode; the current template includes neighboring samples of the current block; and the reference template includes neighboring samples of the reference block.
- 18. The method of claim 16, wherein the at least one of the current template and the reference template includes the current template and the reference template.
- 19. The method of claim 18, wherein the filtering comprises: filtering first samples of the plurality of samples in the current template of the current block with a first filter in the at least one filter; and filtering second samples of the plurality of samples in the reference template of the reference block with a second filter that is different from the first filter, the second filter being one of the at least one filter.
- 20. A non-transitory computer-readable storage medium storing instructions which when executed by a processor cause the processor to perform an encoding method comprising: when filtering is to be applied to at least one of a current template of a current block in a current picture and a reference template of a reference block, filtering a plurality of samples in the at least one of the current template of the current block and the reference template of the reference block using at least one filter that includes one of a 3×3 filter, a 3×2 filter, and a 2×3 filter; determining a linear filter between the current template and the reference template based on the filtered plurality of samples in the at least one of the current template and the reference template, the linear filter being different from the at least one filter; applying the linear filter to the reference block to determine a prediction signal for the current block; encoding the current block based on the prediction signal; encoding, in a bitstream, a syntax element indicating whether the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block; and transmitting the bitstream.
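The filter-then-fit flow recited in the independent claims can be illustrated with a minimal Python sketch. It is not the claimed implementation: the rectangular template layout, the unity-gain smoothing kernel `SMOOTH_3X3`, the two-parameter (scale and offset) model, and the helper names `filter_interior`, `fit_linear_model`, and `predict_block` are all assumptions made for illustration. Skipping samples without a full 3×3 neighbourhood only loosely mirrors the sample subsets of claims 12 to 14.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]

# Hypothetical 3x3 smoothing kernel with unity gain; these coefficients are
# an illustrative assumption and are not taken from the claims.
SMOOTH_3X3: Grid = [[1 / 16, 2 / 16, 1 / 16],
                    [2 / 16, 4 / 16, 2 / 16],
                    [1 / 16, 2 / 16, 1 / 16]]


def filter_interior(template: Grid, kernel: Grid) -> List[float]:
    """Filter only samples with a full 3x3 neighbourhood; border samples are skipped."""
    rows, cols = len(template), len(template[0])
    out: List[float] = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * template[r + dr][c + dc]
            out.append(acc)
    return out


def fit_linear_model(ref: Sequence[float], cur: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for cur ~= scale * ref + offset."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_c = sum(cur) / n
    var_r = sum((x - mean_r) ** 2 for x in ref)
    if var_r == 0.0:
        # Flat reference template: fall back to a pure offset.
        return 1.0, mean_c - mean_r
    cov = sum((x - mean_r) * (y - mean_c) for x, y in zip(ref, cur))
    scale = cov / var_r
    return scale, mean_c - scale * mean_r


def predict_block(ref_block: Grid, scale: float, offset: float) -> Grid:
    """Apply the fitted model to every sample of the reference block."""
    return [[scale * s + offset for s in row] for row in ref_block]


# Synthetic data: the current template is an exact linear function of the
# reference template, so the fit should recover scale 2 and offset 3.
ref_template = [[10, 12, 15, 11, 9, 14],
                [13, 17, 12, 10, 16, 18],
                [11, 14, 19, 13, 12, 10],
                [15, 10, 12, 17, 14, 11]]
cur_template = [[2 * s + 3 for s in row] for row in ref_template]

scale, offset = fit_linear_model(filter_interior(ref_template, SMOOTH_3X3),
                                 filter_interior(cur_template, SMOOTH_3X3))
prediction = predict_block([[20, 22], [24, 26]], scale, offset)
```

A unity-gain kernel is used here so that a linear relation between the unfiltered templates carries over unchanged to the filtered ones; a kernel whose coefficients do not sum to one would scale the offset term, and how that case is handled is not something this sketch attempts to model.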
Description
RELATED APPLICATION
The present application claims the benefit of priority to U.S. Provisional Application No. 63/526,175, “METHOD AND APPARATUS FOR IMPROVEMENT ON TEMPLATE BASED PREDICTION” filed on Jul. 11, 2023, which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
The present disclosure describes aspects generally related to video coding.
BACKGROUND
The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
Image/video compression can help transmit image/video data across different devices, storage and networks with minimal quality degradation. In some examples, video codec technology can compress video based on spatial and temporal redundancy. In an example, a video codec can use techniques referred to as intra prediction that can compress an image based on spatial redundancy. For example, the intra prediction can use reference data from the current picture under reconstruction for sample prediction. In another example, a video codec can use techniques referred to as inter prediction that can compress an image based on temporal redundancy. For example, the inter prediction can predict samples in a current picture from a previously reconstructed picture with motion compensation. The motion compensation can be indicated by a motion vector (MV).
SUMMARY
Aspects of the disclosure include methods and apparatuses for video encoding/decoding. According to an aspect of the disclosure, a method for video decoding includes receiving coded information in a bitstream.
The coded information indicates whether filtering is to be applied to at least one of a current template of a current block in a current picture and a reference template of a reference block. The current block is predicted based on the reference block of the current block. When the coded information indicates that the filtering is to be applied to the at least one of the current template of the current block and the reference template of the reference block, the method includes filtering a plurality of samples in the at least one of the current template of the current block and the reference template of the reference block and determining a linear model between the current template and the reference template based on the filtered plurality of samples in the at least one of the current template and the reference template. The method includes reconstructing the current block based on the linear model and the reference block.
In an example, the reference block is in the current picture when the current block is predicted according to an intra template matching prediction (IntraTMP) mode. In an example, the reference block is in a reference picture that is different from the current picture when the current block is predicted according to an inter prediction method. The current template includes neighboring reconstructed samples of the current block. The reference template includes neighboring reconstructed samples of the reference block.
In an example, the method includes applying the linear model to the reference block to determine a prediction signal and reconstructing the current block based on the prediction signal. When the reference block is in the current picture, a value of a sample in the prediction signal is a weighted sum of the sample in the reference block, a plurality of neighboring samples of the sample in the reference block, and a bias term according to the linear model.
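The weighted-sum form of the prediction signal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `predict_sample`, the offset-keyed weight dictionary, the four-neighbour (cross-shaped) support, and the numeric weights are hypothetical, since the text does not fix the neighbourhood shape or the coefficient values.

```python
from typing import Dict, List, Tuple

Offset = Tuple[int, int]  # (row offset, column offset) relative to the sample


def predict_sample(picture: List[List[float]], row: int, col: int,
                   weights: Dict[Offset, float], bias: float) -> float:
    """Weighted sum of the sample at (row, col), selected neighbours, and a bias.

    The caller is assumed to keep every (row + dr, col + dc) inside the
    picture; no boundary padding is modelled in this sketch.
    """
    return bias + sum(w * picture[row + dr][col + dc]
                      for (dr, dc), w in weights.items())


picture = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]

# Multi-tap case: the co-located sample plus four neighbours (assumed shape).
multi_tap = {(0, 0): 0.5, (-1, 0): 0.125, (1, 0): 0.125,
             (0, -1): 0.125, (0, 1): 0.125}
value_multi = predict_sample(picture, 1, 1, multi_tap, bias=1.0)   # 6.0

# Single-tap case: with no neighbour weights the same routine reduces to a
# scaled sample plus bias.
single_tap = {(0, 0): 2.0}
value_single = predict_sample(picture, 1, 1, single_tap, bias=3.0)  # 13.0
```

Keying the weights by offset keeps one routine usable for any support shape, so the number of taps becomes data rather than code.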
When the reference block is in the reference picture, the value of the sample in the prediction signal is a weighted sum of the sample in the reference block and the bias term according to the linear model.
In an example, the at least one of the current template and the reference template includes the current template and the reference template. In an example, the method includes filtering first samples of the plurality of samples in the current template of the current block with a first filter and filtering second samples of the plurality of samples in the reference template of the reference block with a second filter that is different from the first filter.
In an example, the at least one of the current template and the reference template consists of the current template or the reference template.
In an example, the method includes filtering the plurality of samples in the at least one of the current template of the current block and the reference template of the reference block using a filter that is $\begin{bmatrix} -1 & -1 & -1 \\ -1 & 10 & -1 \\ -1 & -1 & -1 \end{bmatrix}$.
In an example, the coded information in the bitstream includes a flag indicating whether the filtering is to be applied to the at least one of the current template of the current bl