EP-3673652-B1 - SYSTEM AND METHOD OF CROSS-COMPONENT DYNAMIC RANGE ADJUSTMENT (CC-DRA) IN VIDEO CODING
Inventors
- RUSANOVSKYY, Dmytro
- RAMASUBRAMONIAN, Adarsh Krishnan
- KARCZEWICZ, Marta
Dates
- Publication Date: 2026-05-06
- Application Date: 2018-08-21
Claims (15)
- A method of processing video data, the method comprising: receiving (1900) high dynamic range and wide color gamut video data; determining (1902) luma scale parameters for each of a plurality of ranges of codeword values for luminance components of the video data; performing (1904) a dynamic range adjustment process on the luminance components using the luma scale parameters; determining (1906) chroma scale parameters for chrominance components of the video data using a function of the luma scale parameters and a quantization parameter used to encode the chrominance components; performing (1908) a dynamic range adjustment process on the chrominance components of the video data using the chroma scale parameters; encoding (1910) the video data after performing the dynamic range adjustment process on the luminance components and after performing the dynamic range adjustment process on the chrominance components, wherein encoding the video data comprises quantizing the chrominance components using the quantization parameter.
- A method of processing video data, the method comprising: decoding (2000) high dynamic range and wide color gamut video data, wherein decoding the video data comprises inverse quantizing the chrominance components using a quantization parameter; receiving (2002) the video data; determining (2004) luma scale parameters for each of a plurality of ranges of codeword values for luminance components of the video data; performing (2006) an inverse dynamic range adjustment process on the luminance components using the luma scale parameters; determining (2008) chroma scale parameters for chrominance components of the video data using a function of the luma scale parameters and the quantization parameter used to decode the chrominance components; and performing (2010) an inverse dynamic range adjustment process on the chrominance components of the video data using the chroma scale parameters; wherein decoding the video data is performed before performing an inverse dynamic range adjustment process on the luminance components and before performing an inverse dynamic range adjustment process on the chrominance components.
- The method of claim 1 or claim 2, wherein determining the chroma scale parameters comprises: determining chroma scale parameters for chrominance components associated with luminance components having a first range of codeword values of the plurality of ranges of codeword values using a function of the luma scale parameters determined for the luminance components having the first range of codeword values.
- The method of claim 1 or claim 2, wherein determining the chroma scale parameters comprises: determining chroma scale parameters for chrominance components associated with luminance components having a first range of codeword values and a second range of codeword values of the plurality of ranges of codeword values using a function of the luma scale parameters determined for the luminance components having the first range of codeword values and the second range of codeword values.
- The method of claim 1 or claim 2, wherein the luma scale parameters for each of the plurality of ranges of codeword values for the luminance components are represented by a discontinuous function, the method further comprising: applying a linearization process to the discontinuous function to produce linearized luma scale parameters; and determining the chroma scale parameters for the chrominance components of the video data using a function of the linearized luma scale parameters.
- The method of claim 5, wherein the linearization process is one or more of a linear interpolation process, a curve fitting process, an averaging process, a low pass filtering process, or a higher order approximation process.
- The method of claim 1 or claim 2, wherein determining the chroma scale parameters further comprises: determining the chroma scale parameters for the chrominance components of the video data using a function of the luma scale parameters, the quantization parameter used to decode the chrominance components, and a color representation parameter derived from characteristics of the chroma component of the video data.
- The method of claim 7, wherein the color representation parameter includes a transfer function associated with the video data.
- The method of claim 1 or claim 2, further comprising: determining initial chroma scale parameters for the chrominance components of the video data, wherein determining the chroma scale parameters comprises determining the chroma scale parameters for the chrominance components of the video data using a function of the luma scale parameters and the initial chroma scale parameters.
- The method of claim 1 or claim 2, further comprising: determining luma offset parameters for the luminance components; performing the dynamic range adjustment process on the luminance components using the luma scale parameters and the luma offset parameters; determining chroma offset parameters for the chrominance components; and performing the dynamic range adjustment process on the chrominance components of the video data using the chroma scale parameters and the chroma offset parameters.
- An apparatus configured to process video data, the apparatus comprising: means for receiving video data; means for determining luma scale parameters for each of a plurality of ranges of codeword values for luminance components of the video data; means for performing a dynamic range adjustment process on the luminance components using the luma scale parameters; means for determining chroma scale parameters for chrominance components of the video data using a function of the luma scale parameters and a quantization parameter used to encode the chrominance components; means for performing a dynamic range adjustment process on the chrominance components of the video data using the chroma scale parameters; and means for encoding (1910) the video data after performing the dynamic range adjustment process on the luminance components and after performing the dynamic range adjustment process on the chrominance components, wherein encoding the video data comprises quantizing the chrominance components using the quantization parameter.
- The apparatus of claim 11, further comprising a camera configured to capture the video data.
- An apparatus configured to process video data, the apparatus comprising: means for decoding high dynamic range and wide color gamut video data, wherein decoding the video data comprises inverse quantizing the chrominance components using a quantization parameter; means for receiving the video data; means for determining luma scale parameters for each of a plurality of ranges of codeword values for luminance components of the video data; means for performing an inverse dynamic range adjustment process on the luminance components using the luma scale parameters; means for determining chroma scale parameters for chrominance components of the video data using a function of the luma scale parameters and the quantization parameter used to decode the chrominance components; and means for performing an inverse dynamic range adjustment process on the chrominance components of the video data using the chroma scale parameters; wherein decoding the video data is performed before performing the inverse dynamic range adjustment process on the luminance components and before performing the inverse dynamic range adjustment process on the chrominance components.
- The apparatus of claim 13, further comprising a display configured to display the video data after performing the inverse dynamic range adjustment process on the luminance components and after performing the inverse dynamic range adjustment process on the chrominance components.
- A non-transitory computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to process video data to perform the method of any of claims 1-10.
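The cross-component derivation recited in the claims can be sketched in code: a piecewise dynamic range adjustment of luma over ranges of codeword values, with chroma scales derived as a function of the co-located luma scale and the chroma quantization parameter. The sketch below is purely illustrative, not the claimed implementation: the range boundaries, scale and offset values, and the QP-to-step-size mapping (an HEVC-style doubling of the quantizer step every 6 QP units) are all assumptions.

```python
# Illustrative sketch of the forward CC-DRA flow in claim 1.
# All numeric values and the QP mapping are hypothetical.

def luma_scale_for(codeword, ranges, scales):
    """Luma scale parameter for the codeword-value range containing codeword."""
    for (lo, hi), s in zip(ranges, scales):
        if lo <= codeword < hi:
            return s
    return 1.0  # identity outside the signaled ranges

def dra_luma(luma, ranges, scales, offsets):
    """Piecewise dynamic range adjustment of luma codewords."""
    out = []
    for y in luma:
        for (lo, hi), s, o in zip(ranges, scales, offsets):
            if lo <= y < hi:
                out.append(s * (y - lo) + o)  # scale within the range, then offset
                break
        else:
            out.append(y)  # codeword outside all ranges: pass through
    return out

def chroma_scale(luma_codeword, ranges, scales, chroma_qp):
    """Chroma scale as a function of the co-located luma scale and the chroma
    quantization parameter (assumed step size doubles every 6 QP units)."""
    s_y = luma_scale_for(luma_codeword, ranges, scales)
    return s_y / (2.0 ** (chroma_qp / 6.0))

# Three 10-bit codeword ranges with per-range scales and offsets.
ranges = [(0, 256), (256, 768), (768, 1024)]
scales = [1.2, 1.0, 0.8]
offsets = [0, 307, 819]  # chosen so adjacent output segments stay contiguous

adjusted = dra_luma([100, 500, 900], ranges, scales, offsets)
c_scale = chroma_scale(500, ranges, scales, chroma_qp=12)
```

A decoder-side inverse would divide by the same per-range scales, and the linearization step of claim 5 would smooth the discontinuous per-range scale function (e.g., by interpolation or low-pass filtering) before the chroma scales are derived.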
Description
This application claims the benefit of U.S. Provisional Application No. 62/548,236, filed August 21, 2017, and U.S. Application Number 16/__,__, filed August 20, 2018.

TECHNICAL FIELD

This disclosure relates to video processing.

BACKGROUND

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smart phones," video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10, Advanced Video Coding (AVC), ITU-T H.265, High Efficiency Video Coding (HEVC), and extensions of such standards. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.

Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures.
Pictures may be referred to as frames, and reference pictures may be referred to as reference frames. Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.

The total number of color values that may be captured, coded, and displayed may be defined by a color gamut. A color gamut refers to the range of colors that a device can capture (e.g., a camera) or reproduce (e.g., a display). Often, color gamuts differ from device to device. For video coding, a predefined color gamut for video data may be used such that each device in the video coding process may be configured to process pixel values in the same color gamut. Some color gamuts are defined with a larger range of colors than color gamuts that have been traditionally used for video coding. Such color gamuts with a larger range of colors may be referred to as a wide color gamut (WCG).

Another aspect of video data is dynamic range. Dynamic range is typically defined as the ratio between the maximum and minimum brightness (e.g., luminance) of a video signal.
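The ratio definition of dynamic range can be made concrete with a small computation. The luminance values below are hypothetical examples, and the ratio is also commonly expressed in f-stops (the base-2 logarithm of the ratio, i.e., the number of brightness doublings).

```python
import math

# Illustrative only: peak and minimum luminance values in cd/m^2
# for a hypothetical SDR display and a hypothetical HDR display.
sdr_peak, sdr_min = 100.0, 0.1
hdr_peak, hdr_min = 1000.0, 0.005

def dynamic_range_stops(peak, minimum):
    """Dynamic range expressed in f-stops (doublings of brightness)."""
    return math.log2(peak / minimum)

sdr_stops = dynamic_range_stops(sdr_peak, sdr_min)  # about 10 stops
hdr_stops = dynamic_range_stops(hdr_peak, hdr_min)  # about 17.6 stops
```

Under these assumed values, the HDR signal spans roughly 7 to 8 more stops than the SDR signal, which is the gap the dynamic range adjustment techniques in this disclosure are designed to handle efficiently.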
The dynamic range of common video data used in the past is considered to have a standard dynamic range (SDR). Other example specifications for video data define color data that has a larger ratio between the maximum and minimum brightness. Such video data may be described as having a high dynamic range (HDR).

Lasserre et al., in "Technicolor's response to CfE for HDR and WCG (category 1)", 112th MPEG meeting, 22-26 June 2015, Warsaw (Moving Picture Experts Group or ISO/IEC JTC1/SC29/WG11), no. m36263, 21 June 2015, discloses pre-processing that generates an SDR version from input HDR content. US 2017/105014 A1 discloses luma-driven chroma scaling for high dynamic range and wide color gamut contents.

SUMMARY

This disclosure is directed to the field of coding of video signals with high dynamic range (HDR) and wide color gamut (WCG) representations. More specifically, this disclosure describes signaling and operations applied to video data in certain color spaces to enable more efficient compression of HDR and WCG video data. This disclosure describes example techniques and devices for