US-12621446-B2 - Adaptive video filter classifier methods
Abstract
A method and apparatus for coding video data includes filtering video data using a loop filter, such as an adaptive loop filter (ALF), that is determined using a classifier from a plurality of classifiers. A video coder may reconstruct a block of video data to generate a reconstructed block, determine a filter class for the reconstructed block using a classifier from a plurality of classifiers, wherein the plurality of classifiers includes one or more of a first classifier based on a geometry partition and a second classifier based on a histogram of gradients, determine a filter based on the filter class, and apply the filter to the reconstructed block.
Inventors
- Marta Karczewicz
- Vadim SEREGIN
- Nan Hu
- Ikram Jumakulyyev
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date
- 20260505
- Application Date
- 20231218
Claims (20)
- 1. A method of coding video data, the method comprising: reconstructing a block of video data to generate a reconstructed block; determining a filter class for the reconstructed block using a classifier based on a geometry partition, including determining the filter class based on a partition structure, from a plurality of partition structures, that is associated with a lowest sum of activity values; determining a filter based on the filter class; and applying the filter to the reconstructed block.
- 2. The method of claim 1, wherein determining the filter class using the classifier based on the geometry partition comprises: determining respective sums of activity values for each partition structure of the plurality of partition structures, wherein each of the plurality of partition structures is defined by a respective angle and a respective offset distance; determining the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values; and determining the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values.
- 3. The method of claim 2, wherein determining the respective sums of activity values for each partition structure of a plurality of partition structures comprises: determining respective first activity values for respective first partitions of the plurality of partition structures using samples of the block of video data in the respective first partitions; determining respective second activity values for respective second partitions of the plurality of partition structures using samples of the block of video data in the respective second partitions; and adding the respective first activity values to the respective second activity values to determine the respective sums of activity values.
- 4. The method of claim 2, wherein each of the plurality of partition structures is defined by the respective angle and the respective offset distance in a function ƒ(x, y) = x cos(θ) − y sin(θ) + ρ relative to a boundary distance alpha (a), wherein ƒ(x, y) is a partitioning boundary of a partition structure, θ is the respective angle, and ρ is the respective offset distance.
- 5. The method of claim 2, wherein determining respective sums of activity values comprises: determining respective sums of activity values based on a horizontal gradient and a vertical gradient obtained using a 1D Laplacian.
- 6. The method of claim 2, wherein determining respective sums of activity values comprises: determining respective sums of activity values based on a variance of samples.
- 7. The method of claim 2, wherein determining the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values comprises: determining the filter class based on the respective angle and respective offset distance and an overall activity calculated for the block.
- 8. The method of claim 2, wherein determining the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values comprises: determining the filter class based on the respective angle and respective offset distance and ratio of an overall activity calculated for the block and the lowest sum of activity values of the plurality of partition structures.
- 9. The method of claim 1, wherein the filter is an adaptive loop filter.
- 10. The method of claim 1, wherein coding comprises encoding, and wherein reconstructing the block of video data to generate the reconstructed block comprises: reconstructing the block of video data in a reconstruction loop of a video encoding process to generate the reconstructed block.
- 11. The method of claim 1, wherein coding comprises decoding, and wherein reconstructing the block of video data to generate the reconstructed block comprises: decoding the block of video data to generate the reconstructed block.
- 12. An apparatus configured to code video data, the apparatus comprising: a memory configured to store a block of video data; and one or more processors in communication with the memory, the one or more processors configured to: reconstruct the block of video data to generate a reconstructed block; determine a filter class for the reconstructed block using a classifier based on a geometry partition, wherein the one or more processors are further configured to determine the filter class based on a partition structure, from a plurality of partition structures, that is associated with a lowest sum of activity values; determine a filter based on the filter class; and apply the filter to the reconstructed block.
- 13. The apparatus of claim 12, wherein to determine the filter class using the classifier based on the geometry partition, the one or more processors are further configured to: determine respective sums of activity values for each partition structure of the plurality of partition structures, wherein each of the plurality of partition structures is defined by a respective angle and a respective offset distance; determine the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values; and determine the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values.
- 14. The apparatus of claim 13, wherein to determine the respective sums of activity values for each partition structure of a plurality of partition structures, the one or more processors are further configured to: determine respective first activity values for respective first partitions of the plurality of partition structures using samples of the block of video data in the respective first partitions; determine respective second activity values for respective second partitions of the plurality of partition structures using samples of the block of video data in the respective second partitions; and add the respective first activity values to the respective second activity values to determine the respective sums of activity values.
- 15. The apparatus of claim 13, wherein each of the plurality of partition structures is defined by the respective angle and the respective offset distance in a function ƒ(x, y) = x cos(θ) − y sin(θ) + ρ relative to a boundary distance alpha (a), wherein ƒ(x, y) is a partitioning boundary of a partition structure, θ is the respective angle, and ρ is the respective offset distance.
- 16. The apparatus of claim 13, wherein to determine respective sums of activity values, the one or more processors are further configured to: determine respective sums of activity values based on a horizontal gradient and a vertical gradient obtained using a 1D Laplacian.
- 17. The apparatus of claim 13, wherein to determine respective sums of activity values, the one or more processors are further configured to: determine respective sums of activity values based on a variance of samples.
- 18. The apparatus of claim 13, wherein to determine the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values, the one or more processors are further configured to: determine the filter class based on the respective angle and respective offset distance and an overall activity calculated for the block.
- 19. The apparatus of claim 13, wherein to determine the filter class based on the respective angle and respective offset distance associated with the partition structure of the plurality of partition structures associated with the lowest sum of activity values of the respective sums of activity values, the one or more processors are further configured to: determine the filter class based on the respective angle and respective offset distance and ratio of an overall activity calculated for the block and the lowest sum of activity values of the plurality of partition structures.
- 20. The apparatus of claim 12, wherein the filter is an adaptive loop filter.
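As a rough illustration of the classifier recited in claims 2 to 4 and 6, the following Python sketch evaluates a small set of candidate partition structures, each given by an angle θ and an offset ρ, splits the block by the sign of ƒ(x, y) = x cos(θ) − y sin(θ) + ρ, and keeps the candidate whose two partitions have the lowest summed activity. Several details are assumptions made only for this example and are not taken from the claims: coordinates are measured from the block center, activity is the population variance of each partition, the boundary distance alpha is not modeled, and the filter class is simply the index of the winning (angle, offset) pair. All function names are hypothetical.

```python
import math


def population_variance(values):
    """Activity of one partition, assumed here to be the population variance."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_by_boundary(block, theta, rho):
    """Split samples by the sign of f(x, y) = x cos(theta) - y sin(theta) + rho.

    Assumption: x and y are measured from the block center.
    """
    height, width = len(block), len(block[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    first, second = [], []
    for y, row in enumerate(block):
        for x, sample in enumerate(row):
            f = (x - cx) * math.cos(theta) - (y - cy) * math.sin(theta) + rho
            (first if f >= 0 else second).append(sample)
    return first, second


def classify_geometry_partition(block, angles_deg, offsets):
    """Return (lowest_sum, angle, offset, class_index) over all candidates.

    Assumption: the class index is the position of the winning
    (angle, offset) pair in the enumeration order.
    """
    best = None
    class_index = 0
    for angle in angles_deg:
        for rho in offsets:
            first, second = split_by_boundary(block, math.radians(angle), rho)
            # Sum of the first and second partition activities (claim 3).
            total = population_variance(first) + population_variance(second)
            if best is None or total < best[0]:
                best = (total, angle, rho, class_index)
            class_index += 1
    return best


# Usage: an 8x8 block with a vertical edge between columns 3 and 4.
block = [[10] * 4 + [200] * 4 for _ in range(8)]
lowest_sum, angle, offset, filter_class = classify_geometry_partition(
    block, angles_deg=(0, 45, 90, 135), offsets=(-2, 0, 2)
)
print(lowest_sum, angle, offset, filter_class)
```

In this toy block, the candidate with angle 0 and offset 0 separates the two flat regions exactly, so both partitions have zero variance and that candidate is selected.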
Description
This application claims the benefit of U.S. Provisional Patent Application No. 63/478,333, filed Jan. 3, 2023, the entire content of which is incorporated by reference herein.
TECHNICAL FIELD
This disclosure relates to video encoding and video decoding.
BACKGROUND
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), ITU-T H.266/Versatile Video Coding (VVC), and extensions of such standards, as well as proprietary video codecs/formats such as AOMedia Video 1 (AV1) that was developed by the Alliance for Open Media. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture.
Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
SUMMARY
In general, this disclosure describes techniques for adaptive loop filtering, including techniques for determining classes of adaptive loop filters using classifiers. A video coder may be configured to determine a filter class using a classifier from a plurality of classifiers, including one or more classifiers based on a geometry partition and/or one or more classifiers based on a histogram of gradients (HoG). These classifiers may be better suited to determining the activity of samples in a block for some types of video content, which may make the selection of filters based on a filter class determined from such activity more accurate. As such, video data coded using the techniques of this disclosure may exhibit less distortion.
In one example, a method includes reconstructing a block of video data to generate a reconstructed block, determining a filter class for the reconstructed block using a classifier from a plurality of classifiers, wherein the plurality of classifiers includes one or more of a first classifier based on a geometry partition and a second classifier based on a histogram of gradients, determining a filter based on the filter class, and applying the filter to the reconstructed block.
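The flow in the example method (classify the reconstructed block, determine a filter from the class, apply the filter) can be sketched in Python as follows. This is only a toy under stated assumptions: the histogram-of-gradients classifier here uses central differences, four orientation bins, and the dominant bin as the class, and the filter table maps each class to a direction for a simple [1, 2, 1]/4 smoothing filter. None of these choices, nor the names `hog_filter_class`, `apply_directional_filter`, `SMOOTHING_DIRECTION`, or `loop_filter_block`, come from this disclosure.

```python
import math

# Hypothetical filter table: for each class, the (dx, dy) direction along
# which a [1, 2, 1] / 4 smoothing filter runs (roughly along the edge).
SMOOTHING_DIRECTION = {0: (0, 1), 1: (1, -1), 2: (1, 0), 3: (1, 1)}


def hog_filter_class(block, num_bins=4):
    """Return the index of the dominant gradient-orientation bin."""
    hist = [0.0] * num_bins
    for y in range(1, len(block) - 1):
        for x in range(1, len(block[0]) - 1):
            gx = block[y][x + 1] - block[y][x - 1]
            gy = block[y + 1][x] - block[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the gradient direction into an orientation in [0, pi).
            orientation = math.atan2(gy, gx) % math.pi
            bin_index = int(orientation / math.pi * num_bins) % num_bins
            hist[bin_index] += magnitude
    return max(range(num_bins), key=hist.__getitem__)


def apply_directional_filter(block, direction):
    """Apply a rounded [1, 2, 1] / 4 filter along `direction`, clamping at edges."""
    dx, dy = direction
    height, width = len(block), len(block[0])

    def at(x, y):
        return block[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    return [
        [(at(x - dx, y - dy) + 2 * block[y][x] + at(x + dx, y + dy) + 2) >> 2
         for x in range(width)]
        for y in range(height)
    ]


def loop_filter_block(reconstructed):
    """Classify the block, look up a filter for the class, and apply it."""
    filter_class = hog_filter_class(reconstructed)
    direction = SMOOTHING_DIRECTION[filter_class]
    return filter_class, apply_directional_filter(reconstructed, direction)


# Usage: a block with a vertical edge has horizontal gradients, so the toy
# classifier selects class 0 and smoothing runs vertically, along the edge.
block = [[10] * 4 + [200] * 4 for _ in range(8)]
filter_class, filtered = loop_filter_block(block)
print(filter_class, filtered == block)
```

Because the selected filter runs along the edge rather than across it, the edge in this toy block is left intact, which is the kind of content-dependent behavior that classification is meant to enable.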
In another example, a device includes a memory and one or more processors in communication with the memory, the one or more processors configured to reconstruct a block of video data to generate a reconstructed block, determine a filter class for the reconstructed block using a classifier from a plurality of classifiers, wherein the plurality of classifiers includes one or more of a first classifier based on a geometry partition and a second classifier based on a histogram of gradients, determine a filter based on the filter class, and apply the filter to the reconstructed block.
In another example, a device includes means for reconstructing a block of video data to generate a reconstructed block, means for determining a filter class for the reconstructed block using a classifier from a plurality of classifiers, wherein the plurality of classifiers includes one or more of a first classifier based on a geometry partition and a second classifier based on a histogram of gradients, means for determining a filter based on the filter class, and means for applying the filter to the reconstructed block.
In another example, a computer-readable storage medium is encoded with instructions that, when executed, cause a programmable processor to reconstruct a block of video data to generate a reconstructed block, determine a filter class for the reconstructed block using a classifier from a plurality of classifiers, wherein the plurality of cla