EP-3769524-B1 - METHOD AND APPARATUS FOR IDENTITY TRANSFORM IN MULTIPLE TRANSFORM SELECTION

Inventors

ZHAO, XIN
LI, XIANG
LIU, SHAN

Dates

Publication Date: 20260513
Application Date: 20190830

Claims (5)

A method of controlling residual coding for decoding or encoding of a video sequence, the method being performed by at least one processor and comprising: determining (610) whether to replace a transform type of Multiple Transform Selection, MTS, by using Identity transform, IDT, based on whether a block height and a block width meet a predetermined condition, wherein the predetermined condition includes determining whether a width/height ratio or height/width ratio of the block is smaller than a pre-defined threshold value, and wherein the pre-defined threshold value is one of: 2, 4, 8, 16, and 32; and based on the block height and the block width meeting the predetermined condition, replacing (620) the transform type of MTS by using the IDT and keeping (630) a syntax of the MTS unchanged, wherein the IDT is a linear transform process using an NxN transform core in which each coefficient is non-zero with a value 2^M along a diagonal position, wherein the values of M depend on a transform size, and wherein N is an integer.
The method of claim 1, wherein the transform type of the MTS is discrete cosine transform DCT-4 or DCT-8.
An apparatus for controlling residual coding for decoding or encoding of a video sequence comprising: at least one memory configured to store computer program code; and at least one processor configured to access the at least one memory and operate according to the computer program code, the computer program code comprising: determining code configured to cause the at least one processor to determine whether to replace a transform type of Multiple Transform Selection, MTS, by using Identity transform, IDT, based on whether a block height and a block width meet a predetermined condition, wherein the predetermined condition includes determining whether a width/height ratio or height/width ratio of the block is smaller than a pre-defined threshold value, and wherein the pre-defined threshold value is one of: 2, 4, 8, 16, and 32; and replacing code configured to cause the at least one processor to, based on the block height and the block width meeting the predetermined condition, replace the transform type of MTS by using the IDT and keep a syntax of the MTS unchanged, wherein the IDT is a linear transform process using an NxN transform core in which each coefficient is non-zero with a value 2^M along a diagonal position, wherein the values of M depend on a transform size, and wherein N is an integer.
The apparatus of claim 3, wherein the transform type of the MTS is discrete cosine transform DCT-4 or DCT-8.
A non-transitory computer-readable storage medium storing instructions that cause at least one processor to implement operations of the method of any one of claims 1 to 2.
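
For illustration only, the decision and the transform core recited in the claims can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the string labels for transform types, and the `replaced_type` parameter are hypothetical; the condition is read here as "the larger of the two ratios is smaller than the threshold"; and the exponent M is passed in by the caller because its mapping to the transform size is not given in this section.

```python
ALLOWED_THRESHOLDS = (2, 4, 8, 16, 32)  # threshold values listed in the claims


def meets_condition(width, height, threshold):
    """Assumed reading: the larger side ratio must be smaller than the threshold."""
    if threshold not in ALLOWED_THRESHOLDS:
        raise ValueError("threshold must be one of 2, 4, 8, 16, 32")
    # max/min < threshold, written without division to stay in integers
    return max(width, height) < threshold * min(width, height)


def select_transform(signaled_type, width, height, threshold, replaced_type="DCT-8"):
    """Map a signaled MTS type to the transform actually applied.

    The MTS syntax is untouched: the same type is signaled and parsed, and
    only its meaning changes when the block shape meets the condition.
    `replaced_type` is a hypothetical configuration choice.
    """
    if signaled_type == replaced_type and meets_condition(width, height, threshold):
        return "IDT"
    return signaled_type


def idt_core(n, m):
    """N x N core with 2**m on the diagonal and zeros elsewhere (m is an assumed input)."""
    return [[(1 << m) if r == c else 0 for c in range(n)] for r in range(n)]


def apply_core(core, samples):
    """Apply a transform core to a 1-D list of residual samples."""
    return [sum(k * s for k, s in zip(row, samples)) for row in core]
```

With these definitions, `apply_core(idt_core(4, 7), [3, -1, 0, 5])` returns the input scaled by 2^7, which is what a diagonal core implies: the residual samples pass through unchanged apart from a uniform scaling.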

Description

BACKGROUND

1. Field

Methods and apparatuses consistent with embodiments relate to video coding, and more particularly, to a method and an apparatus for applying identity transform in Multiple Transform Selection (MTS).

2. Description of Related Art

In High Efficiency Video Coding (HEVC), a coding tree unit (CTU) is split into coding units (CUs) by using a quadtree structure, denoted as a coding tree, to adapt to various local characteristics. A decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at a CU level. Each CU can be further split into one, two or four prediction units (PUs) according to a PU splitting type. Inside one PU, the same prediction process is applied and relevant information is transmitted to a decoder on a PU basis. After obtaining a residual block by applying a prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU.

One of the key features of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU. In HEVC, a CU or a TU can only be square, while a PU may be square or rectangular for an inter-predicted block. In a later stage of HEVC, some contributions proposed allowing rectangular PUs for intra prediction and transform. At a picture boundary, HEVC implicitly imposes a quadtree split so that a block keeps quadtree splitting until its size fits the picture boundary.

In Versatile Video Coding (VVC), a quadtree plus binary tree (QTBT) structure removes the concepts of multiple partition types, i.e., removes the separation of the CU, PU and TU concepts, and supports more flexibility for CU partition shapes. In the QTBT block structure, a CU can have either a square or rectangular shape. As shown in FIG. 1A, a CTU is first partitioned by a quadtree structure. Quadtree leaf nodes are further partitioned by a binary tree structure.
There are two splitting types, symmetric horizontal splitting and symmetric vertical splitting, in binary tree splitting. Binary tree leaf nodes are called CUs, and that segmentation is used for prediction and transform processing without any further partitioning. This means that a CU, a PU and a TU have the same block size in the QTBT coding block structure.

In VVC, a CU sometimes consists of coding blocks (CBs) of different color components, e.g., one CU contains one luma CB and two chroma CBs in a case of P and B slices of a 4:2:0 chroma format, and a CU sometimes consists of a CB of a single component, e.g., one CU contains only one luma CB or just two chroma CBs in a case of I slices.

The following parameters are defined for a QTBT partitioning scheme.

CTU size: a root node size of a quadtree, the same concept as in HEVC
MinQTSize: a minimum allowed quadtree leaf node size
MaxBTSize: a maximum allowed binary tree root node size
MaxBTDepth: a maximum allowed binary tree depth
MinBTSize: a minimum allowed binary tree leaf node size

In an example of the QTBT partitioning structure, the CTU size is set as 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, the MinQTSize is set as 16×16, the MaxBTSize is set as 64×64, the MinBTSize (for both width and height) is set as 4×4, and the MaxBTDepth is set as 4. The quadtree partitioning is applied to a CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may each have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size). If the leaf quadtree node is 128×128, it will not be further split by a binary tree because the size exceeds the MaxBTSize (i.e., 64×64). Otherwise, the leaf quadtree node could be further partitioned by the binary tree. Therefore, the quadtree leaf node is also a root node for the binary tree, and has the binary tree depth as 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further splitting is considered.
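
As an illustration of how the parameters above constrain splitting, the example configuration can be sketched in Python. This is a minimal sketch under stated assumptions rather than a codec implementation: the class and function names are hypothetical, the labels "halve_width" and "halve_height" are neutral names chosen here instead of the horizontal/vertical terminology, and MinBTSize is applied simply as a lower bound on the width and height of the resulting binary tree nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtbtParams:
    ctu_size: int = 128     # quadtree root node size, in luma samples
    min_qt_size: int = 16   # MinQTSize
    max_bt_size: int = 64   # MaxBTSize
    max_bt_depth: int = 4   # MaxBTDepth
    min_bt_size: int = 4    # MinBTSize, for both width and height


def quadtree_leaf_sizes(p):
    """Possible square quadtree leaf sizes, from the CTU size down to MinQTSize."""
    sizes = []
    size = p.ctu_size
    while size >= p.min_qt_size:
        sizes.append(size)
        size //= 2
    return sizes


def allowed_binary_splits(width, height, bt_depth, p):
    """Return the set of binary splits still permitted for a node."""
    if max(width, height) > p.max_bt_size:
        return set()  # too large to be a binary tree root node
    if bt_depth >= p.max_bt_depth:
        return set()  # maximum binary tree depth reached
    splits = set()
    if width // 2 >= p.min_bt_size:
        splits.add("halve_width")
    if height // 2 >= p.min_bt_size:
        splits.add("halve_height")
    return splits
```

With the default parameters, `quadtree_leaf_sizes(QtbtParams())` gives `[128, 64, 32, 16]`, and a 128×128 quadtree leaf gets an empty set of binary splits because it exceeds MaxBTSize, matching the example in the text.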
When the binary tree node has the width equal to MinBTSize (i.e., 4), no further horizontal splitting is considered. Similarly, when the binary tree node has the height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. The maximum CTU size may be 256×256 luma samples. Portion (a) of FIG. 1A illustrates an example of block partitioning by using QTBT, and Portion (b) of FIG. 1A illustrates a corresponding tree representation. Solid lines indicate quadtree splitting and dotted lines indicate binary tree splitting. In each splitting (i.e., non-leaf) node of the binary tree, one flag is signaled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting. For the quadtree splitting, there is no need to indicate the splitting type because quadtree splitting always splits a block both horizontally a