BR-122025009714-A2 - METHOD FOR DECODING AND ENCODING IMAGES AND COMPUTER-READABLE MEDIA
Abstract
The present invention relates to a method for decoding a video signal based on adaptive multiple transforms (AMT), the method comprising the steps of: obtaining an AMT index from the video signal, wherein the AMT index indicates any one of a plurality of transform combinations in a transform configuration group, and the transform configuration group includes discrete sine transform type 7 (DST7) and discrete cosine transform type 8 (DCT8); deriving a transform combination corresponding to the AMT index, wherein the transform combination consists of a horizontal transform and a vertical transform and includes at least one of DST7 or DCT8; performing an inverse transform on a current block based on the transform combination; and restoring the video signal using the inversely transformed current block, wherein AMT denotes a transform scheme that is performed based on a transform combination adaptively selected from a plurality of transform combinations.
Inventors
- Moonmo KOO
Assignees
- GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD.
Dates
- Publication Date
- 20260310
- Application Date
- 20180806
- Priority Date
- 20170804
Claims (7)
- 1. METHOD FOR DECODING IMAGES characterized by comprising: determining a prediction mode as an inter-prediction mode to be used for a block; determining a transform configuration group from a plurality of transform configuration groups to be used for the inter-prediction mode block, wherein the plurality of transform configuration groups includes a plurality of transform combinations; obtaining a transform index, wherein the transform index (i) indicates a transform combination from the plurality of transform configuration groups, (ii) is a unique transform index corresponding to a first index value for a horizontal transform and a second index value for a vertical transform, and (iii) is composed of values 0, 1, 2, 3, and 4, wherein the first index value and the second index value are determined based on the transform index and the transform configuration group, and wherein the transform index is defined at a level of a coding unit; deriving a transform combination that corresponds to the transform index, wherein the transform combination is configured with a horizontal transform and a vertical transform and includes one of (DST (Discrete Sine Transform)-7, DST-7), (DST-7, DCT (Discrete Cosine Transform)-8), (DCT-8, DST-7), and (DCT-8, DCT-8); performing an inverse transform on a transform unit based on the transform combination; and reconstructing a video signal based on the transform unit.
- 2. METHOD, according to claim 1, characterized in that the transform combination is selected based on at least one of the prediction mode, a size, or a shape of the transform unit.
- 3. METHOD, according to claim 1, characterized by further comprising: checking whether the number of non-zero transform coefficients is greater than a threshold, wherein the transform index is obtained based on whether the number of non-zero transform coefficients is greater than the threshold.
- 4. METHOD, according to claim 1, characterized in that performing the inverse transform comprises applying an inverse DST-7 transform or an inverse DCT-8 transform to each row after applying the inverse DST-7 transform or the inverse DCT-8 transform to each column.
- 5. METHOD, according to claim 1, characterized in that the transform index is additionally defined at a level of at least one of a sequence, an image, a slice, a block, the transform unit, or a prediction unit.
- 6. METHOD FOR ENCODING IMAGES characterized by comprising: determining a prediction mode as an inter-prediction mode to be used for a block; deriving a transform combination that is applied to a transform unit, wherein the transform combination is configured with a horizontal transform and a vertical transform and includes one of (DST-7, DST-7), (DST-7, DCT-8), (DCT-8, DST-7), and (DCT-8, DCT-8); performing a transform on the transform unit based on the transform combination; determining a transform configuration group from a plurality of transform configuration groups to be used for the inter-prediction mode block, wherein the plurality of transform configuration groups includes a plurality of transform combinations; and generating a transform index that corresponds to the transform combination, wherein the transform index (i) indicates a transform combination of the plurality of transform configuration groups, (ii) is a unique transform index corresponding to a first index value for a horizontal transform and a second index value for a vertical transform, and (iii) is composed of values 0, 1, 2, 3, and 4, and wherein the first index value and the second index value are determined based on the transform index and the transform configuration group.
- 7. COMPUTER-READABLE MEDIA characterized by storing a bitstream generated by an encoding method, wherein the encoding method comprises: determining a prediction mode as an inter-prediction mode to be used for a block; deriving a transform combination that is applied to a transform unit, wherein the transform combination is configured with a horizontal transform and a vertical transform and includes one of (DST-7, DST-7), (DST-7, DCT-8), (DCT-8, DST-7), and (DCT-8, DCT-8); performing a transform on the transform unit based on the transform combination; determining a transform configuration group from a plurality of transform configuration groups to be used for the inter-prediction mode block, wherein the plurality of transform configuration groups includes a plurality of transform combinations; and generating a transform index that corresponds to the transform combination, wherein the transform index (i) indicates a transform combination of the plurality of transform configuration groups, (ii) is a unique transform index corresponding to a first index value for a horizontal transform and a second index value for a vertical transform, and (iii) is composed of values 0, 1, 2, 3, and 4, and wherein the first index value and the second index value are determined based on the transform index and the transform configuration group.
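As an informal illustration of the decoding steps recited in claims 1 and 4, the following Python sketch looks up a (horizontal, vertical) pair from a transform index and applies the inverse transform to the columns first and then to the rows. It is a minimal sketch under stated assumptions, not the claimed implementation: the table HYPOTHETICAL_INDEX_TABLE, the helper names, and the use of floating-point orthonormal kernels are all introduced here for illustration. The assignment of index values 1 to 4 to particular pairs is assumed, and index 0 is deliberately left unmapped because the claims do not state which combination it selects.

```python
import math

def dst7_kernel(n):
    """Orthonormal DST-7 kernel; row k is the k-th basis vector."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def dct8_kernel(n):
    """Orthonormal DCT-8 kernel; row k is the k-th basis vector."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]

KERNELS = {"DST-7": dst7_kernel, "DCT-8": dct8_kernel}

# ASSUMPTION: which index selects which (horizontal, vertical) pair is not
# given in the claims; this ordering is purely illustrative.
HYPOTHETICAL_INDEX_TABLE = {
    1: ("DST-7", "DST-7"),
    2: ("DST-7", "DCT-8"),
    3: ("DCT-8", "DST-7"),
    4: ("DCT-8", "DCT-8"),
}

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def forward_2d(block, hor, ver):
    """Encoder-side separable transform: coeffs = V * block * H^T."""
    return matmul(matmul(ver, block), transpose(hor))

def inverse_2d(coeffs, hor, ver):
    """Inverse transform on each column first, then on each row."""
    after_columns = matmul(transpose(ver), coeffs)  # vertical inverse
    return matmul(after_columns, hor)               # horizontal inverse

def decode_block(coeffs, transform_index):
    """Derive the transform pair from the index and invert the block."""
    hor_name, ver_name = HYPOTHETICAL_INDEX_TABLE[transform_index]
    rows, cols = len(coeffs), len(coeffs[0])
    hor = KERNELS[hor_name](cols)  # horizontal kernel acts along each row
    ver = KERNELS[ver_name](rows)  # vertical kernel acts along each column
    return inverse_2d(coeffs, hor, ver)
```

Because both kernels are orthonormal, the inverse of each one-dimensional stage is simply its transpose, which is why decode_block needs no separately stored inverse matrices. A practical codec would use scaled integer kernels and intermediate rounding instead.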
Description
TECHNICAL FIELD
[001] The present disclosure relates to a method and apparatus for processing a video signal and, more particularly, to a technology for configuring a transform combination for each transform configuration group distinguished based on at least one of a prediction mode, a block size, or a block shape.
BACKGROUND OF THE INVENTION
[002] Next-generation video content will feature high spatial resolution, a high frame rate, and high dimensionality of scene representation. Processing such content will require significant increases in memory storage, memory access rate, and processing power.
[003] Therefore, it is necessary to design a coding tool for processing next-generation video content more efficiently. In particular, it is necessary to design a transform that is more efficient in terms of coding efficiency and complexity when a transform is applied.
DISCLOSURE
TECHNICAL PROBLEM
[004] The present disclosure aims to design a more efficient transform configuration in terms of coding efficiency and complexity.
[005] The disclosure proposes a method for configuring a transform combination for each transform configuration group distinguished based on at least one of a prediction mode, a block size, or a block shape.
[006] Additionally, the disclosure proposes an encoder/decoder structure that incorporates a new transform scheme.
TECHNICAL SOLUTION
[007] In order to achieve these objectives, the disclosure proposes a method for replacing the discrete cosine transform type 8 (DCT8) with a modified form of the discrete sine transform type 7 (DST7), while using the DST7 kernel coefficient data without any alteration.
[008] Additionally, the disclosure proposes a method for replacing DST7 with DST4 and replacing DCT8 with a modified form of DCT4, while using the DST4 kernel coefficient data without any alteration.
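The reuse of kernel data described in paragraphs [007] and [008] can be illustrated with a well-known trigonometric identity: under the usual orthonormal definitions, the DCT8 kernel equals the DST7 kernel with each basis vector reversed and every odd-numbered basis vector negated, and the same relation holds between DCT4 and DST4. The Python sketch below checks this numerically. It is offered only as an assumption about what "a modified form" could mean; the text above does not state which modification is intended, and the function names are illustrative.

```python
import math

def dst7(n):
    """DST-7 kernel; row k holds the k-th basis vector."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def dct8(n):
    """DCT-8 kernel computed directly from its cosine definition."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]

def dst4(n):
    """DST-4 kernel; row k holds the k-th basis vector."""
    s = math.sqrt(2.0 / n)
    return [[s * math.sin(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n))
             for j in range(n)] for k in range(n)]

def dct4(n):
    """DCT-4 kernel computed directly from its cosine definition."""
    s = math.sqrt(2.0 / n)
    return [[s * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n))
             for j in range(n)] for k in range(n)]

def flip_and_alternate(kernel):
    """Reverse every basis vector and negate the odd-indexed ones.

    No coefficient magnitude changes, so the stored kernel data is reused
    as-is; only the read order and the sign differ.
    """
    return [[-v if k % 2 else v for v in reversed(row)]
            for k, row in enumerate(kernel)]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n,
              max_abs_diff(flip_and_alternate(dst7(n)), dct8(n)),
              max_abs_diff(flip_and_alternate(dst4(n)), dct4(n)))
```

In this sketch the derived kernels agree with the directly computed ones up to floating-point rounding, so a decoder following this approach would only need to store one kernel table per size.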
[009] Additionally, the disclosure proposes a method for configuring transform configuration groups based on at least one of a prediction mode, a block size, or a block shape, and for configuring the transform corresponding to each row or column differently, wherein a transform configuration group is configured with one or more transform combinations and a transform combination is configured with the transforms corresponding to all rows and columns.
[010] Additionally, the disclosure proposes a method for configuring the transforms for all rows and columns based on a single transform, such as DST7 or DST4, and a transform modified from it.
[011] Additionally, the disclosure proposes a method for deriving, for each transform, the set of transforms obtainable either by using the linear relationships among all trigonometric transforms (8 DCTs and 8 DSTs) or by adding a pre-/post-processing step at the transform input/output, computing the union of the derived transform sets, and using the union to determine a transform combination.
ADVANTAGEOUS EFFECTS
[012] When a still image or a moving image is encoded, the disclosure can generate transform coefficients with higher coding efficiency by configuring the transforms for all rows and columns for each transform configuration group based on a predetermined number of transforms.
DESCRIPTION OF THE DRAWINGS
[013] FIG. 1 is a block diagram illustrating the configuration of an encoder for encoding a video signal according to an embodiment of the present invention.
[014] FIG. 2 is a block diagram illustrating the configuration of a decoder for decoding a video signal according to an embodiment of the present invention.
[015] FIG. 3 illustrates embodiments to which the disclosure can be applied: FIG. 3A is a diagram for describing a block-splitting structure based on a quadtree (hereinafter referred to as “QT”), FIG. 3B is a diagram for describing a block-splitting structure based on a binary tree (hereinafter referred to as “BT”), FIG. 3C is a diagram for describing a block-splitting structure based on a ternary tree (hereinafter referred to as “TT”), and FIG. 3D is a diagram for describing a block-splitting structure based on an asymmetric tree (hereinafter referred to as “AT”).
[016] FIG. 4 is an embodiment to which the disclosure is applied and illustrates a schematic block diagram of a transform and quantization unit 120/130 and a dequantization and transform unit 140/150 within an encoder.
[017] FIG. 5 is an embodiment to which the disclosure is applied and illustrates a schematic block diagram of a dequantization and transform unit 220/230 within a decoder.
[018] FIG. 6 is an embodiment to which the disclosure is applied and is a table illustrating a transform configuration group to which adaptive multiple transforms (AMT) are applied.
[019] FIG. 7 is an embodiment to which the disclosure is applied and is a flowchart illustrating an encoding process in which adaptive multiple transforms (AMT) are performed.
[020] FIG. 8 is an embodiment to which the disclosure is applied and is a flowchart illustrating a decoding process in which adaptive multiple transforms