US-20260129205-A1 - TECHNIQUES FOR BITSTREAM EXTRACTION FOR SUBPICTURE IN CODED VIDEO STREAM


Abstract

A method, computer program, and computer system are provided for video coding. The video coding includes decoding one or more previously encoded frames of a video source that are designated as reference frames for the video source, searching the reference frames for one or more candidate pixel blocks for an input frame of the video source, and encoding the input frame based on the one or more candidate pixel blocks.

Inventors

Byeongdoo CHOI
Stephan Wenger
Shan Liu

Assignees

Tencent America LLC

Dates

Publication Date: 20260507
Application Date: 20260105

Claims (20)

1. A method of video coding, executable by a processor, comprising: receiving video data having one or more subpictures; extracting resampling parameters and spatial scalability parameters corresponding to the subpictures; and decoding the video data based on the extracted resampling and spatial scalability parameters.
2. The method of claim 1, further comprising enabling an adaptive resolution change within the received video data based on the resampling parameters.
3. The method of claim 1, wherein the resampling parameters correspond to one or more flags signaled in a parameter set associated with the video data.
4. The method of claim 1, wherein spatial scalability parameters correspond to one or more flags signaled in a parameter set associated with the video data.
5. The method of claim 1, further comprising scaling the video data based on extracting a sub-bitstream associated with the resampling and spatial scalability parameters.
6. The method of claim 5, wherein the sub-bitstream is extracted from a bitstream associated with the video data based on a target output layer set index, a target highest temporal identification value, and an array of target subpicture index values associated with the video data.
7. The method of claim 1, wherein resampling of the video data during decoding is disabled based on the resampling parameters.
8. A computer system for video coding, the computer system comprising: one or more computer-readable non-transitory storage media configured to store computer program code; and one or more computer processors configured to access said computer program code and operate as instructed by said computer program code, said computer program code including: receiving code configured to cause the one or more computer processors to receive video data having one or more subpictures; extracting code configured to cause the one or more computer processors to extract resampling parameters and spatial scalability parameters corresponding to the subpictures; and decoding code configured to cause the one or more computer processors to decode the video data based on the extracted resampling and spatial scalability parameters.
9. The computer system of claim 8, further comprising enabling code configured to cause the one or more computer processors to enable an adaptive resolution change within the received video data based on the resampling parameters.
10. The computer system of claim 8, wherein the resampling parameters correspond to one or more flags signaled in a parameter set associated with the video data.
11. The computer system of claim 8, wherein spatial scalability parameters correspond to one or more flags signaled in a parameter set associated with the video data.
12. The computer system of claim 8, further comprising scaling code configured to cause the one or more computer processors to scale the video data based on extracting a sub-bitstream associated with the resampling and spatial scalability parameters.
13. The computer system of claim 12, wherein the sub-bitstream is extracted from a bitstream associated with the video data based on a target output layer set index, a target highest temporal identification value, and an array of target subpicture index values associated with the video data.
14. The computer system of claim 8, wherein resampling of the video data during decoding is disabled based on the resampling parameters.
15. A non-transitory computer readable medium having stored thereon a computer program for video coding, the computer program configured to cause one or more computer processors to: receive video data having one or more subpictures; extract resampling parameters and spatial scalability parameters corresponding to the subpictures; and decode the video data based on the extracted resampling and spatial scalability parameters.
16. The computer readable medium of claim 15, wherein the computer program is further configured to cause the one or more computer processors to enable an adaptive resolution change within the received video data based on the resampling parameters.
17. The computer readable medium of claim 15, wherein the resampling parameters correspond to one or more flags signaled in a parameter set associated with the video data.
18. The computer readable medium of claim 15, wherein spatial scalability parameters correspond to one or more flags signaled in a parameter set associated with the video data.
19. The computer readable medium of claim 15, wherein the computer program is further configured to cause the one or more computer processors to scale the video data based on extracting a sub-bitstream associated with the resampling and spatial scalability parameters.
20. The computer readable medium of claim 19, wherein the sub-bitstream is extracted from a bitstream associated with the video data based on a target output layer set index, a target highest temporal identification value, and an array of target subpicture index values associated with the video data.
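
The sub-bitstream extraction recited in claims 6, 13, and 20 takes three inputs: a target output layer set index, a target highest temporal identification value, and an array of target subpicture index values. The following is a minimal illustrative sketch of how such a filter might look, not the claimed implementation. The `NalUnit` class, the `ols_layers` table, and the function name are hypothetical and introduced only for illustration. The sketch also assumes that the array of target subpicture indices can be treated as a simple set of indices to keep, and it omits any rewriting of parameter sets that a real extractor may need.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical, simplified stand-in for one coded NAL unit."""
    layer_id: int
    temporal_id: int
    subpic_idx: Optional[int]  # None for non-slice data such as parameter sets
    payload: bytes = b""


def extract_sub_bitstream(
    bitstream: Sequence[NalUnit],
    ols_layers: Sequence[Set[int]],    # assumed table: OLS index -> layer IDs
    target_ols_idx: int,
    target_highest_tid: int,
    target_subpic_idx: Sequence[int],  # assumed: subpicture indices to keep
) -> List[NalUnit]:
    """Keep only the NAL units that match all three extraction targets."""
    layers = ols_layers[target_ols_idx]
    wanted_subpics = set(target_subpic_idx)
    kept = []
    for nal in bitstream:
        if nal.layer_id not in layers:
            continue  # layer is not part of the target output layer set
        if nal.temporal_id > target_highest_tid:
            continue  # temporal sub-layer is above the target
        if nal.subpic_idx is not None and nal.subpic_idx not in wanted_subpics:
            continue  # slice belongs to a subpicture that was not requested
        kept.append(nal)
    return kept


stream = [
    NalUnit(0, 0, None),  # parameter set in the base layer
    NalUnit(0, 0, 0),     # base layer, subpicture 0
    NalUnit(0, 0, 1),     # base layer, subpicture 1
    NalUnit(0, 1, 0),     # higher temporal sub-layer
    NalUnit(1, 0, 0),     # enhancement layer
]
sub = extract_sub_bitstream(
    stream,
    ols_layers=[{0}, {0, 1}],
    target_ols_idx=0,
    target_highest_tid=0,
    target_subpic_idx=[0],
)
print(len(sub))  # 2
```

In this example only the base-layer parameter set and the base-layer slice of subpicture 0 survive. The other three units are dropped, one by each of the three criteria.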

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation application of U.S. application Ser. No. 18/809,072, filed Aug. 19, 2024, which is a continuation application of U.S. application Ser. No. 18/153,131 (filed Jan. 11, 2023), now U.S. Pat. No. 12,069,269 issued Aug. 20, 2024, which is a continuation application of U.S. application Ser. No. 17/335,600 (filed Jun. 1, 2021), now U.S. Pat. No. 11,589,052 issued Feb. 21, 2023, which claims priority based on U.S. Provisional Application No. 63/037,202 (filed Jun. 10, 2020), in the U.S. Patent and Trademark Office, the disclosures of which are incorporated by reference herein in their entirety.

FIELD

This disclosure relates generally to the field of data processing, and more particularly to video coding.

BACKGROUND

Video coding and decoding using inter-picture prediction with motion compensation has been known for decades. Uncompressed digital video can consist of a series of pictures, each picture having a spatial dimension of, for example, 1920×1080 luminance samples and associated chrominance samples. The series of pictures can have a fixed or variable picture rate (informally also known as frame rate) of, for example, 60 pictures per second or 60 Hz. Uncompressed video has significant bitrate requirements. For example, 1080p60 4:2:0 video at 8 bits per sample (1920×1080 luminance sample resolution at 60 Hz frame rate) requires close to 1.5 Gbit/s of bandwidth. An hour of such video requires more than 600 GByte of storage space.

One purpose of video coding and decoding can be the reduction of redundancy in the input video signal through compression. Compression can help reduce the aforementioned bandwidth or storage space requirements, in some cases by two orders of magnitude or more. Both lossless and lossy compression, as well as a combination thereof, can be employed.
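
The bandwidth and storage figures quoted above for 1080p60 4:2:0 video can be checked with a short calculation. The sketch below is illustrative only. The function and parameter names are assumptions, and 4:2:0 sampling is modeled as half a chroma sample per luma sample (two chroma planes, each subsampled by two in both dimensions).

```python
def raw_video_bitrate(width, height, fps, bit_depth=8, chroma_samples_per_luma=0.5):
    """Return the uncompressed bitrate in bits per second.

    chroma_samples_per_luma=0.5 models 4:2:0 sampling: two chroma planes,
    each holding one quarter as many samples as the luma plane.
    """
    samples_per_picture = width * height * (1 + chroma_samples_per_luma)
    return samples_per_picture * bit_depth * fps


bitrate = raw_video_bitrate(1920, 1080, 60)      # bits per second
bytes_per_hour = bitrate * 3600 / 8              # bytes in one hour of video

print(f"{bitrate / 1e9:.2f} Gbit/s")             # 1.49 Gbit/s
print(f"{bytes_per_hour / 1e9:.0f} GByte/hour")  # 672 GByte/hour
```

Both results agree with the text: roughly 1.5 Gbit/s, and more than 600 GByte for one hour.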
Lossless compression refers to techniques where an exact copy of the original signal can be reconstructed from the compressed original signal. When using lossy compression, the reconstructed signal may not be identical to the original signal, but the distortion between the original and reconstructed signals is small enough to make the reconstructed signal useful for the intended application. In the case of video, lossy compression is widely employed. The amount of distortion tolerated depends on the application; for example, users of certain consumer streaming applications may tolerate higher distortion than users of television contribution applications. The achievable compression ratio can reflect that: higher allowable or tolerable distortion can yield higher compression ratios.

A video encoder and decoder can utilize techniques from several broad categories, including, for example, motion compensation, transform, quantization, and entropy coding, some of which will be introduced below.

Historically, video encoders and decoders tended to operate on a given picture size that was, in most cases, defined and stayed constant for a coded video sequence (CVS), Group of Pictures (GOP), or a similar multi-picture timeframe. For example, in MPEG-2, system designs are known to change the horizontal resolution (and, thereby, the picture size) dependent on factors such as activity of the scene, but only at I pictures, hence typically for a GOP.

The resampling of reference pictures for use of different resolutions within a CVS is known, for example, from ITU-T Rec. H.263 Annex P. However, here the picture size does not change; only the reference pictures are resampled, potentially resulting in only parts of the picture canvas being used (in the case of downsampling), or only parts of the scene being captured (in the case of upsampling). Further, H.263 Annex Q allows the resampling of an individual macroblock by a factor of two (in each dimension), upward or downward.
Again, the picture size remains the same. The size of a macroblock is fixed in H.263, and therefore does not need to be signaled.

Changes of picture size in predicted pictures became more mainstream in modern video coding. For example, VP9 allows reference picture resampling and change of resolution for a whole picture. Similarly, certain proposals made towards VVC (including, for example, Hendry et al., “On adaptive resolution change (ARC) for VVC”, Joint Video Experts Team document JVET-M0135-v1, Jan. 9-19, 2019, incorporated herein in its entirety) allow for resampling of whole reference pictures to different (higher or lower) resolutions. In that document, different candidate resolutions are suggested to be coded in the sequence parameter set and referred to by per-picture syntax elements in the picture parameter set.

SUMMARY

Embodiments relate to a method, system, and computer readable medium for video coding. According to one aspect, a method for video coding is provided. The method may include receiving video data having one or more subpictures. Resampling parameters and spatial scalability parameters corresponding to the subpictures are extra