EP-3763127-B1 - FAST DST-7

Inventors

ZHAO, XIN
LI, XIANG
LIU, SHAN

Dates

Publication Date: 20260506
Application Date: 20190411

Claims (10)

1. A computer-implemented method for decoding a video sequence using a discrete sine transform (DST) type-VII transform core, comprising: generating a set of tuples of transform core elements associated with an n-point DST-VII transform core, wherein n is one of 16, 32, or 64, wherein a first sum of a first subset of transform core elements of a first tuple is equal to a second sum of a second subset of remaining transform core elements of the first tuple; performing a transform on the video sequence using the set of tuples of transform core elements associated with the n-point DST-VII transform core; wherein the set of tuples include {a, j, l}, {b, i, m}, {c, h, n}, {d, g, o}, and {e, f, p}, wherein the n-point DST-VII transform core is a 16-point DST-VII transform core which includes the transform core elements included in the set of tuples and an element k, and is represented as {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}, or the set of tuples include {a, l, A, n, y}, {b, k, B, o, x}, {c, j, C, p, w}, {d, i, D, q, v}, {e, h, E, r, u}, and {f, g, F, s, t}, wherein the n-point DST-VII transform core is a 32-point DST-VII transform core which includes the transform core elements included in the set of tuples, an element m and an element z, and is represented as {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, A, B, C, D, E, F}.
2. The method of claim 1, wherein a + j = l, b + i = m, c + h = n, d + g = o, and e + f = p.
3. The method of claim 1, wherein a + l + A = n + y, b + k + B = o + x, c + j + C = p + w, d + i + D = q + v, e + h + E = r + u, and f + g + F = s + t.
4. The method of claim 1, wherein n is equal to sixteen or sixty-four, and wherein each tuple, of the set of tuples, includes three transform core elements.
5. The method of claim 1, wherein n is equal to thirty-two, and wherein each tuple, of the set of tuples, includes five transform core elements.
6. The method of claim 1, wherein a tuple, of the set of tuples, includes a first transform core element, a second transform core element, a third transform core element, and a fourth transform core element, and wherein a sum of absolute values of the first transform core element and the second transform core element is equal to a sum of absolute values of the third transform core element and the fourth transform core element.
7. The method of claim 1, wherein a tuple, of the set of tuples, includes a first transform core element, a second transform core element, a third transform core element, and a fourth transform core element, and wherein a sum of absolute values of the first transform core element, the second transform core element, and the third transform core element is equal to an absolute value of the fourth transform core element.
8. The method of claim 1, wherein a tuple, of the set of tuples, includes a first transform core element, a second transform core element, a third transform core element, a fourth transform core element, and a fifth transform core element, and wherein a sum of absolute values of the first transform core element, the second transform core element, and the third transform core element is equal to a sum of the fourth transform core element and the fifth transform core element.
9. A device for decoding a video sequence using a discrete sine transform (DST) type-VII transform core, comprising: at least one memory configured to store program code; at least one processor configured to read the program code and operate as instructed by the program code to perform the method as claimed in any of claims 1-8.
10. A non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors of a device, cause the one or more processors to perform the method as claimed in any of claims 1-8.
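
The sum relations recited for the 16-point and 32-point tuples can be checked numerically. The following is a minimal sketch, not part of the claims, assuming the conventional unscaled first basis vector of an n-point DST-VII, sin(π(j + 1)/(2n + 1)) for j = 0, ..., n - 1, with elements labelled a to z and then A to F as in the claimed representation. The function names are hypothetical, and the fourth 32-point tuple is taken to be {d, i, D, q, v}.

```python
import math
import string


def dst7_first_row(n):
    """Unscaled first basis vector of an n-point DST-VII (assumed definition)."""
    return [math.sin(math.pi * (j + 1) / (2 * n + 1)) for j in range(n)]


def named_elements(n):
    """Map labels a..z, A..F (as many as needed) to the first-row values."""
    labels = (string.ascii_lowercase + string.ascii_uppercase)[:n]
    return dict(zip(labels, dst7_first_row(n)))


# Each pair is (left-hand labels, right-hand labels) of one sum relation.
RELATIONS_16 = [("aj", "l"), ("bi", "m"), ("ch", "n"), ("dg", "o"), ("ef", "p")]
RELATIONS_32 = [("alA", "ny"), ("bkB", "ox"), ("cjC", "pw"),
                ("diD", "qv"), ("ehE", "ru"), ("fgF", "st")]


def max_error(n, relations):
    """Largest absolute difference between the two sides over all relations."""
    elems = named_elements(n)
    return max(
        abs(sum(elems[s] for s in lhs) - sum(elems[s] for s in rhs))
        for lhs, rhs in relations
    )


if __name__ == "__main__":
    print("16-point max error:", max_error(16, RELATIONS_16))
    print("32-point max error:", max_error(32, RELATIONS_32))
```

Under this assumption both errors are at the level of floating-point rounding, which suggests that the relations are properties of the sine basis itself; an integer core would additionally need its rounded values chosen so that the relations still hold exactly.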

Description

Field

This disclosure is directed to video coding technologies. More specifically, the present disclosure is directed to a method and device for decoding a video sequence using a discrete sine transform (DST) type-VII transform core, and a medium.

Background

ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) promulgated the H.265/HEVC (High Efficiency Video Coding) standard in 2013 (version 1), and provided updates in 2014 (version 2), 2015 (version 3), and 2016 (version 4). Since then, the ITU has been studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the HEVC standard (including its extensions). In October 2017, the ITU issued the Joint Call for Proposals on Video Compression with Capability beyond HEVC (CfP). By February 15, 2018, a total of 22 CfP responses on standard dynamic range (SDR), 12 CfP responses on high dynamic range (HDR), and 12 CfP responses on 360 video categories had been submitted. In April 2018, all received CfP responses were evaluated at the 122nd MPEG / 10th JVET (Joint Video Exploration Team - Joint Video Expert Team) meeting. After careful evaluation, JVET formally launched the standardization of next-generation video coding beyond HEVC, i.e., the so-called Versatile Video Coding (VVC). The current version of the VVC Test Model (VTM) is VTM 1.

Compared with DCT-2, whose fast methods have been extensively studied, DST-7 is still implemented much less efficiently. For example, the DST-7 implementation in VTM 1 relies on matrix multiplication. In JVET-J0066, a method is proposed to approximate different types of DCTs and DSTs in JEM7 by applying adjustment stages to a transform in the DCT-2 family, which includes DCT-2, DCT-3, DST-2 and DST-3; an adjustment stage is a multiplication by a sparse matrix, which requires relatively few operations.
In JVET-J0017, a method for implementing an n-point DST-7 using a (2n + 1)-point Discrete Fourier Transform (DFT) is proposed.

Problem to be solved

The lack of an efficient fast implementation of DST-7 limits its use in practical video codec implementations. In some implementation scenarios, a matrix multiplication based implementation is preferred because its processing is more regular; in others, a fast method that significantly reduces the operation count is preferred. It is therefore highly desirable to identify a fast method whose output is substantially identical to that of a matrix multiplication based implementation, as with the DCT-2 design in HEVC, which supports both matrix multiplication and a partial butterfly implementation.

The existing fast methods for DST-7, e.g., JVET-J0066 and JVET-J0017, cannot support all the desirable features of a transform design in a video codec, including 16-bit intermediate operations, integer operations, and identical results between a fast method implementation and a matrix multiplication based implementation.

Summary

According to an aspect of the disclosure, a method for decoding a video sequence using a discrete sine transform (DST) type-VII transform core includes generating a set of tuples of transform core elements associated with an n-point DST-VII transform core, wherein a first sum of a first subset of transform core elements of a first tuple is equal to a second sum of a second subset of remaining transform core elements of the first tuple; generating the n-point DST-VII transform core based on generating the set of tuples of transform core elements; and performing a transform on a sample block using the n-point DST-VII transform core.
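
One way a sum relation inside a tuple can reduce the multiplication count while keeping integer results identical to a direct product is sketched below. This is an illustration under stated assumptions, not the disclosed implementation: it handles only a single inner product with the first basis vector of a 16-point core, the integer element values are illustrative numbers chosen so that a + j = l, b + i = m, c + h = n, d + g = o and e + f = p hold, and the function names are hypothetical.

```python
# Illustrative integer values for elements a..p (an assumption, chosen so that
# the third element of every tuple equals the sum of the first two).
CORE16 = dict(zip("abcdefghijklmnop",
                  [8, 17, 25, 33, 40, 48, 55, 62, 68, 73, 77, 81, 85, 87, 88, 88]))

# Tuples of element labels; element k belongs to no tuple.
TUPLES = ["ajl", "bim", "chn", "dgo", "efp"]


def dot_direct(x):
    """Reference inner product: one multiplication per element (16 in total)."""
    return sum(x[s] * CORE16[s] for s in CORE16)


def dot_fast(x):
    """Same inner product with 11 multiplications: 2 per tuple plus 1 for k."""
    total = x["k"] * CORE16["k"]
    for first, second, third in TUPLES:
        # Because CORE16[third] == CORE16[first] + CORE16[second]:
        #   x1*f + x2*s + x3*(f + s) == f*(x1 + x3) + s*(x2 + x3)
        total += CORE16[first] * (x[first] + x[third])
        total += CORE16[second] * (x[second] + x[third])
    return total


if __name__ == "__main__":
    # Hypothetical input samples, keyed by the label of the element they meet.
    samples = {s: (7 * i) % 13 - 6 for i, s in enumerate("abcdefghijklmnop")}
    print(dot_direct(samples), dot_fast(samples))
```

Because only integer additions and multiplications are regrouped, the two functions return the same value for any integer input, at the cost of ten extra additions in this sketch.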
According to an aspect of the disclosure, a device for decoding a video sequence using a discrete sine transform (DST) type-VII transform core includes at least one memory configured to store program code, and at least one processor configured to read the program code and operate as instructed by the program code to perform the method for decoding a video sequence.

According to an aspect of the disclosure, a non-transitory computer-readable medium stores instructions comprising one or more instructions that, when executed by one or more processors of a device, cause the one or more processors to perform the method for decoding a video sequence.

Brief description of the drawings

Further features, the nature, and various advantages of the disclosed subject matter will be more apparent from the following detailed description and the accompanying drawings in which:

FIG. 1 is a flowchart of an example process for a method for decoding a video sequence using a discrete sine transform (DST) type-VII transform core.
FIG. 2 is a simplified block diagram of a communication system according to an embodiment of the present disclosure.
FIG. 3 is a diagram of the placement of a video encoder and decoder in a streaming environment.
FIG. 4 is a functional block diagram of a