
US-20260129194-A1 - Interpolation Filter Clipping For Sub-Picture Motion Vectors


Abstract

A video coding mechanism is disclosed. The mechanism includes receiving a bitstream comprising a current picture including a sub-picture coded according to inter-prediction. A motion vector for a block of the sub-picture is determined. A clipping function is applied to sample locations in a reference block to support application of an interpolation filter when the motion vector points outside of the sub-picture and when a flag is set to indicate the sub-picture is treated as a picture. The interpolation filter is applied to results of the clipping function to obtain a predicted sample value. The block is decoded based on the predicted sample value. The block is forwarded for display as part of a decoded video sequence.
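In concrete terms, the abstract's mechanism clips every integer sample location an interpolation filter would read so that the location stays inside the sub-picture, and only then applies the filter. The following is a minimal illustrative sketch, not the VVC-conformant process: `bilinear_predict` is a hypothetical two-tap bilinear filter with 1/16-sample weights, and the reference array and boundary values are invented for the example.

```python
def clip3(lo, hi, v):
    """Clip3(x, y, z): return x if z < x, y if z > y, else z."""
    return lo if v < lo else hi if v > hi else v

def bilinear_predict(ref, x_int, y_int, frac_x, frac_y, bounds):
    """Predict one sample by bilinear interpolation, clipping each
    integer sample location into the sub-picture given by
    bounds = (left, right, top, bottom) before reading the reference.
    frac_x / frac_y are 1/16-sample fractional offsets (0..15)."""
    left, right, top, bottom = bounds
    # Clip the two integer columns and rows the 2-tap filter touches.
    xs = [clip3(left, right, x_int + i) for i in range(2)]
    ys = [clip3(top, bottom, y_int + i) for i in range(2)]
    # Weighted average of the four clipped neighbours.
    w_x = (16 - frac_x, frac_x)
    w_y = (16 - frac_y, frac_y)
    acc = 0
    for j in range(2):
        for i in range(2):
            acc += w_y[j] * w_x[i] * ref[ys[j]][xs[i]]
    return acc // 256  # normalise the 16 x 16 weight sum
```

Because the locations are clipped before the reference is read, a motion vector pointing outside the sub-picture simply re-reads the boundary samples instead of touching samples that belong to a neighbouring sub-picture.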

Inventors

  • Ye-Kui Wang
  • Jianle Chen
  • Fnu HENDRY

Assignees

  • HUAWEI TECHNOLOGIES CO., LTD.

Dates

Publication Date
May 7, 2026
Application Date
Dec. 30, 2025

Claims (18)

  1. A method implemented by a decoder comprising: receiving a bitstream comprising quantized transform coefficients of a sub-picture, where the sub-picture includes a block coded according to inter-prediction; obtaining a reconstructed residual block based on the quantized transform coefficients; determining a reference block for the block of the sub-picture; applying a clipping function to sample locations in the reference block during application of an interpolation process when a value of a flag indicates that the sub-picture is treated as a picture; applying the interpolation process to results of the clipping function to obtain a predicted sample value; and obtaining a reconstructed block based on the reconstructed residual block and a predicted block comprising the predicted sample value, wherein the interpolation process includes a luma sample bilinear interpolation process, wherein the block includes a block of luma samples, wherein the predicted sample value includes a predicted luma sample value, wherein the luma sample bilinear interpolation process receives inputs including a luma location in full sample units (xIntL, yIntL), wherein the luma sample bilinear interpolation process outputs the predicted luma sample value (predSampleLXL), and wherein the clipping function is applied to the sample locations and meets the following equations: xInti = Clip3( SubPicLeftBoundaryPos, SubPicRightBoundaryPos, xIntL + i ), and yInti = Clip3( SubPicTopBoundaryPos, SubPicBotBoundaryPos, yIntL + i ), where xInti and yInti are a sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
  2. The method of claim 1, wherein the clipping function is applied to the sample locations when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture and SubPicIdx is an index of the sub-picture.
  3. The method of claim 1, wherein the interpolation process includes a luma sample interpolation filtering process.
  4. The method of claim 1, wherein the interpolation process includes a luma sample eight-tap interpolation filtering process.
  5. The method of claim 1, wherein the interpolation process includes a chroma sample interpolation process, wherein a second block includes a block of chroma samples, and wherein a second predicted sample value includes a predicted chroma sample value.
  6. The method of claim 5, wherein the chroma sample interpolation process receives inputs including a chroma location in full sample units (xIntC, yIntC), wherein the chroma sample interpolation process outputs a predicted chroma sample value (predSampleLXC), and wherein the clipping function is applied to the sample locations according to: when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, the following applies: xInti = Clip3( SubPicLeftBoundaryPos / SubWidthC, SubPicRightBoundaryPos / SubWidthC, xIntC + i ), and yInti = Clip3( SubPicTopBoundaryPos / SubHeightC, SubPicBotBoundaryPos / SubHeightC, yIntC + i ), where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture, SubPicIdx is an index of the sub-picture, xInti and yInti are a clipped sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, SubWidthC and SubHeightC indicate a horizontal and a vertical sampling rate ratio between luma and chroma samples, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
  7. A method implemented by an encoder, comprising: obtaining a sub-picture, wherein the sub-picture includes a block; determining to encode the block according to inter-prediction; determining a reference block for the block of the sub-picture; applying a clipping function to sample locations in the reference block to support application of an interpolation process when a flag is set to indicate the sub-picture is treated as a picture; applying the interpolation process to results of the clipping function to obtain a predicted sample value; encoding the block into a bitstream based on the predicted sample value; and storing the bitstream for communication toward a decoder, wherein the interpolation process includes a luma sample bilinear interpolation process, wherein the block includes a block of luma samples, wherein the predicted sample value includes a predicted luma sample value, wherein the luma sample bilinear interpolation process receives inputs including a luma location in full sample units (xIntL, yIntL), wherein the luma sample bilinear interpolation process outputs the predicted luma sample value (predSampleLXL), and wherein the clipping function is applied to the sample locations and meets the following equations: xInti = Clip3( SubPicLeftBoundaryPos, SubPicRightBoundaryPos, xIntL + i ), and yInti = Clip3( SubPicTopBoundaryPos, SubPicBotBoundaryPos, yIntL + i ), where xInti and yInti are a sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
  8. The method of claim 7, wherein the clipping function is applied to the sample locations when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture and SubPicIdx is an index of the sub-picture.
  9. The method of claim 7, wherein the interpolation process includes a luma sample interpolation filtering process.
  10. The method of claim 7, wherein the interpolation process includes a luma sample eight-tap interpolation filtering process.
  11. The method of claim 7, wherein the interpolation process includes a chroma sample interpolation process, wherein a second block includes a block of chroma samples, and wherein a second predicted sample value includes a predicted chroma sample value.
  12. The method of claim 11, wherein the chroma sample interpolation process receives inputs including a chroma location in full sample units (xIntC, yIntC), wherein the chroma sample interpolation process outputs a predicted chroma sample value (predSampleLXC), and wherein the clipping function is applied to the sample locations according to: when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, the following applies: xInti = Clip3( SubPicLeftBoundaryPos / SubWidthC, SubPicRightBoundaryPos / SubWidthC, xIntC + i ), and yInti = Clip3( SubPicTopBoundaryPos / SubHeightC, SubPicBotBoundaryPos / SubHeightC, yIntC + i ), where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture, SubPicIdx is an index of the sub-picture, xInti and yInti are a clipped sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, SubWidthC and SubHeightC indicate a horizontal and a vertical sampling rate ratio between luma and chroma samples, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
  13. A decoder, comprising: a receiver configured to receive a bitstream comprising quantized transform coefficients of a sub-picture, where the sub-picture includes a block coded according to inter-prediction; and one or more processors coupled to the receiver and configured to: obtain a reconstructed residual block based on the quantized transform coefficients; determine a reference block for the block of the sub-picture; apply a clipping function to sample locations in the reference block during application of an interpolation process when a value of a flag indicates that the sub-picture is treated as a picture; apply the interpolation process to results of the clipping function to obtain a predicted sample value; and obtain a reconstructed block based on the reconstructed residual block and a predicted block comprising the predicted sample value, wherein the interpolation process includes a luma sample bilinear interpolation process, wherein the block includes a block of luma samples, wherein the predicted sample value includes a predicted luma sample value, wherein the luma sample bilinear interpolation process receives inputs including a luma location in full sample units (xIntL, yIntL), wherein the luma sample bilinear interpolation process outputs the predicted luma sample value (predSampleLXL), and wherein the clipping function is applied to the sample locations and meets the following equations: xInti = Clip3( SubPicLeftBoundaryPos, SubPicRightBoundaryPos, xIntL + i ), and yInti = Clip3( SubPicTopBoundaryPos, SubPicBotBoundaryPos, yIntL + i ), where xInti and yInti are a sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
  14. The decoder of claim 13, wherein the clipping function is applied to the sample locations when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture and SubPicIdx is an index of the sub-picture.
  15. The decoder of claim 13, wherein the interpolation process includes a luma sample interpolation filtering process.
  16. The decoder of claim 13, wherein the interpolation process includes a luma sample eight-tap interpolation filtering process.
  17. The decoder of claim 13, wherein the interpolation process includes a chroma sample interpolation process, wherein a second block includes a block of chroma samples, and wherein a second predicted sample value includes a predicted chroma sample value.
  18. The decoder of claim 17, wherein the chroma sample interpolation process receives inputs including a chroma location in full sample units (xIntC, yIntC), wherein the chroma sample interpolation process outputs a predicted chroma sample value (predSampleLXC), and wherein the clipping function is applied to the sample locations according to: when subpic_treated_as_pic_flag[SubPicIdx] is equal to one, the following applies: xInti = Clip3( SubPicLeftBoundaryPos / SubWidthC, SubPicRightBoundaryPos / SubWidthC, xIntC + i ), and yInti = Clip3( SubPicTopBoundaryPos / SubHeightC, SubPicBotBoundaryPos / SubHeightC, yIntC + i ), where subpic_treated_as_pic_flag is the flag set to indicate the sub-picture is treated as a picture, SubPicIdx is an index of the sub-picture, xInti and yInti are a clipped sample location at index i, SubPicRightBoundaryPos is a position of a right boundary of the sub-picture, SubPicLeftBoundaryPos is a position of a left boundary of the sub-picture, SubPicTopBoundaryPos is a position of a top boundary of the sub-picture, SubPicBotBoundaryPos is a position of a bottom boundary of the sub-picture, SubWidthC and SubHeightC indicate a horizontal and a vertical sampling rate ratio between luma and chroma samples, and Clip3 is the clipping function according to: Clip3( x, y, z ) = x when z < x; y when z > y; and z otherwise, where x, y, and z are numerical input values.
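The luma clipping in claims 1, 7, and 13 and the chroma clipping in claims 6, 12, and 18 differ only in that the chroma boundary positions are first divided by the sampling ratios SubWidthC and SubHeightC. A hedged sketch of the two location-clipping steps, with illustrative tap counts and boundary positions (not taken from the specification):

```python
def clip3(lo, hi, v):
    """Clip3(x, y, z): return x if z < x, y if z > y, else z."""
    return lo if v < lo else hi if v > hi else v

def clip_luma_locations(x_int_l, y_int_l, taps, bounds):
    """Per claim 1: xInti = Clip3(left, right, xIntL + i) and
    yInti = Clip3(top, bottom, yIntL + i) for each filter tap i.
    bounds = (left, right, top, bottom) sub-picture boundary positions."""
    left, right, top, bottom = bounds
    xs = [clip3(left, right, x_int_l + i) for i in range(taps)]
    ys = [clip3(top, bottom, y_int_l + i) for i in range(taps)]
    return xs, ys

def clip_chroma_locations(x_int_c, y_int_c, taps, bounds, sub_w_c, sub_h_c):
    """Per claim 6: the luma-domain boundaries are divided by
    SubWidthC / SubHeightC before clipping the chroma locations."""
    left, right, top, bottom = bounds
    xs = [clip3(left // sub_w_c, right // sub_w_c, x_int_c + i)
          for i in range(taps)]
    ys = [clip3(top // sub_h_c, bottom // sub_h_c, y_int_c + i)
          for i in range(taps)]
    return xs, ys
```

For 4:2:0 video (SubWidthC = SubHeightC = 2), a luma-domain right boundary of 63 becomes a chroma-domain boundary of 31, so chroma locations saturate at half the luma positions.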

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of U.S. application Ser. No. 18/742,363 filed on Jun. 13, 2024, which is a continuation of U.S. application Ser. No. 17/470,376 filed on Sep. 9, 2021, which is a continuation of International Application No. PCT/US2020/022083 filed Mar. 11, 2020, which claims the benefit of U.S. Provisional Patent Application No. 62/816,751 filed Mar. 11, 2019, and U.S. Provisional Patent Application No. 62/826,659 filed Mar. 29, 2019, all of which are hereby incorporated by reference.

TECHNICAL FIELD

The present disclosure is generally related to video coding, and is specifically related to coding sub-pictures of pictures in video coding.

BACKGROUND

The amount of video data needed to depict even a relatively short video can be substantial, which may result in difficulties when the data is to be streamed or otherwise communicated across a communications network with limited bandwidth capacity. Thus, video data is generally compressed before being communicated across modern day telecommunications networks. The size of a video could also be an issue when the video is stored on a storage device because memory resources may be limited. Video compression devices often use software and/or hardware at the source to code the video data prior to transmission or storage, thereby decreasing the quantity of data needed to represent digital video images. The compressed data is then received at the destination by a video decompression device that decodes the video data. With limited network resources and ever increasing demands of higher video quality, improved compression and decompression techniques that improve compression ratio with little to no sacrifice in image quality are desirable.
SUMMARY

In an embodiment, the disclosure includes a method implemented in a decoder, the method comprising: receiving, by a receiver of the decoder, a bitstream comprising a current picture including a sub-picture coded according to inter-prediction; determining, by a processor of the decoder, a motion vector for a block of the sub-picture; applying, by the processor, a clipping function to sample locations in a reference block to support application of an interpolation filter when the motion vector points outside of the sub-picture and when a flag is set to indicate the sub-picture is treated as a picture; applying, by the processor, the interpolation filter to results of the clipping function to obtain a predicted sample value; and decoding, by the processor, the block based on the predicted sample value. Inter-prediction may be performed according to one of several inter-prediction modes. Certain inter-prediction modes generate candidate lists of motion vector predictors at both the encoder and the decoder. This allows the encoder to signal a motion vector by signaling the index from the candidate list instead of signaling the entire motion vector. Further, some systems encode sub-pictures for independent extraction. This allows a current sub-picture to be decoded and displayed without decoding information from other sub-pictures. This may cause errors when a motion vector is employed that points outside of the sub-picture because the data pointed to by the motion vector may not be decoded and hence may not be available. The present disclosure includes a flag that indicates a sub-picture should be treated as a picture. When a current sub-picture is treated like a picture, the current sub-picture should be extracted without reference to other sub-pictures. Specifically, the present example employs a clipping function that is applied when applying interpolation filters.
This clipping function ensures that the interpolation filter does not rely on data from adjacent sub-pictures in order to maintain separation between the sub-pictures to support separate extraction. As such, the clipping function is applied when the flag is set and a motion vector points outside of the current sub-picture. The interpolation filter is then applied to the results of the clipping function. Accordingly, the present example provides additional functionality to a video codec by preventing errors when performing sub-picture extraction. Optionally, in any of the preceding aspects, another implementation of the aspect provides, wherein the interpolation filter includes a luma sample bilinear interpolation process, wherein the block includes a block of luma samples, and wherein the predicted sample value includes a predicted luma sample value. Optionally, in any of the preceding aspects, another implementation of the aspect provides, wherein the luma sample bilinear interpolation process receives inputs including a luma location in full sample units (xIntL, yIntL), wherein the luma sample bilinear interpolation process outputs a predicted luma sample value (predSampleLXL), and wherein the clipping function is applied to the sample locations according to: when subpic_treated_as_pic_flag[SubPicIdx]