DE-112024002874-T5 - Template-based intra-mode derivation fusion with non-angular mode


Abstract

To better predict the pixels of a block in a video, we propose a method for predicting the pixel colors of an image block, comprising: - determining that template-based intra-mode derivation (TIMD) is applicable to predicting the block; - determining a set of intra-prediction modes and their respective costs; - determining a first TIMD mode, which is the intra-prediction mode of the set having the lowest cost, and a second TIMD mode, which is the intra-prediction mode of the set having the second-lowest cost; - determining a third TIMD mode, which is a non-angular intra-prediction mode; - verifying whether the third TIMD mode differs from the first TIMD mode and whether the third TIMD mode differs from the second TIMD mode; - based on successful verification, forming a final TIMD mode comprising a linear combination of the first TIMD mode, the second TIMD mode, and the third TIMD mode; and - forming a prediction of the block based on a template of the block and the final TIMD mode.

Inventors

Médéric Stéphane Blestel
Pierre Jean Andrivon

Assignees

KONINKLIJKE PHILIPS N.V.

Dates

Publication Date: 20260513
Application Date: 20240625
Priority Date: 20230707

Claims (13)

A method for predicting a block of pixel colors in an image, comprising: - determining that template-based intra-mode derivation (TIMD) is applicable to predicting the block; - determining a set of intra-prediction modes and their respective costs; - determining a first TIMD mode, which is the intra-prediction mode of the set having the lowest cost, and a second TIMD mode, which is the intra-prediction mode of the set having the second-lowest cost; - determining a third TIMD mode, which is a non-angular intra-prediction mode; - verifying that the third TIMD mode differs from the first TIMD mode and that the third TIMD mode differs from the second TIMD mode; - based on successful verification, forming a final TIMD mode that comprises a linear combination of the first TIMD mode, the second TIMD mode, and the third TIMD mode; and - forming a prediction of the block based on a template of the block and the final TIMD mode.
Method according to claim 1, wherein the respective costs are based on differences between predicted samples, obtained by applying the respective intra-prediction mode to samples of a template reference, and reconstructed template samples.
Method according to claim 1, wherein determining the third TIMD mode depends on the condition that the first TIMD mode and the second TIMD mode are not non-angular intra-prediction modes.
Method according to claim 1, wherein forming the final TIMD mode depends on a third cost, namely the cost of the non-angular intra-prediction mode, being less than a threshold.
Method according to claim 4, wherein the threshold is based on the lower of the costs of the first TIMD mode and the second TIMD mode.
Method according to claim 4, wherein the threshold is the product of a scale value and the lower of the costs of the first TIMD mode and the second TIMD mode.
Method according to any of the preceding claims, wherein the linear combination comprises multiplying each TIMD mode by a weight that depends on its cost.
Method according to claim 7, wherein each weight is calculated as a numerator, equal to the sum of all weights minus the respective weight of the respective TIMD mode, divided by a denominator that is a multiple of the sum of all weights.
Method according to any of the preceding claims, wherein the non-angular intra-prediction mode is a DC mode or a planar mode.
Method according to any of the preceding claims, wherein the non-angular intra-prediction mode is based on a reference block that is usable for intra-block copy of a block adjacent to the block, or on a reference block that is usable for intra-TMP prediction of a block adjacent to the block.
Method according to claim 1, wherein the non-angular intra-prediction mode is selected as the one with the lowest cost from a set of non-angular intra-prediction modes.
A video decoder comprising a computation circuit connected to a data storage device, wherein the computation circuit is arranged to: - receive a bitstream comprising encoded video data; - the encoded video data comprising a first indication that a pixel block is encoded based on template-based intra-mode derivation (TIMD); - the encoded video data comprising a second indication that a final TIMD mode is based on a non-angular intra-prediction mode; - determine a plurality of costs for a plurality of intra-prediction modes (IPMs), wherein each cost for a respective IPM is based on differences in a template of the block between: predicted samples obtained by prediction from reference samples of the template using the respective IPM; and reconstructed samples of the template; - determine a first TIMD mode, which is the intra-prediction mode of the plurality having the lowest cost, and a second TIMD mode, which is the intra-prediction mode of the plurality having the second-lowest cost; - determine a third TIMD mode, which is a non-angular intra-prediction mode; - verify whether the third TIMD mode differs from the first TIMD mode and whether the third TIMD mode differs from the second TIMD mode; - based on successful verification, form the final TIMD mode, which comprises a linear combination of the first TIMD mode, the second TIMD mode, and the third TIMD mode; and - form a prediction of the block based on a template of the block and the final TIMD mode.
A video encoder, comprising: - receiving an original video image comprising a block to be encoded using template-based intra-prediction coding; - determining a plurality of intra-prediction modes and their respective costs; - determining a first TIMD mode, which is the intra-prediction mode of the plurality having the lowest cost, and a second TIMD mode, which is the intra-prediction mode of the plurality having the second-lowest cost; - determining a third TIMD mode, which is a non-angular intra-prediction mode; - verifying whether the third TIMD mode differs from the first TIMD mode and whether the third TIMD mode differs from the second TIMD mode; - based on successful verification, forming a final TIMD mode comprising a linear combination of the first, second, and third TIMD modes; - forming a prediction of the block based on a template of the block and the final TIMD mode; and - encoding in encoded video data a first indication that a pixel block is encoded based on template-based intra-mode derivation, and a second indication that a final TIMD mode is based on a non-angular intra-prediction mode.
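
The mode selection and fusion steps recited in the claims above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the mode names, the example costs, and the default scale value of 2.0 are hypothetical, and the weight formula reads the weighting claim as operating on the template costs, i.e. w_i = (S - c_i) / ((n - 1) * S) with S the sum of the costs of the n fused modes, so that cheaper modes receive larger weights and the weights sum to 1.

```python
# Sketch only: names, the example costs, and the default scale are illustrative
# assumptions, not values taken from the claims.
NON_ANGULAR = {"planar", "dc"}


def select_timd_modes(costs, scale=2.0):
    """Pick the (mode, cost) pairs to fuse from a dict of template costs."""
    ranked = sorted(costs.items(), key=lambda item: item[1])
    first, second = ranked[0], ranked[1]

    # Third candidate: the cheapest non-angular mode, if any was evaluated.
    non_angular = [item for item in ranked if item[0] in NON_ANGULAR]
    if not non_angular:
        return [first, second]
    third = non_angular[0]

    # Verification step: the third mode must differ from both other modes.
    is_distinct = third[0] not in (first[0], second[0])
    # Threshold: scale value times the lower cost of the first two modes.
    threshold = scale * min(first[1], second[1])
    if is_distinct and third[1] < threshold:
        return [first, second, third]
    return [first, second]


def fusion_weights(selected):
    """Weight each mode by (total - own cost) / ((n - 1) * total)."""
    mode_costs = [cost for _, cost in selected]
    total = sum(mode_costs)
    n = len(mode_costs)
    if total == 0:
        return [1.0 / n] * n  # all templates matched perfectly: equal weights
    return [(total - cost) / ((n - 1) * total) for cost in mode_costs]


def fuse(predictions, weights):
    """Linear combination of per-mode predictions, sample by sample."""
    return [
        sum(w * pred[i] for w, pred in zip(weights, predictions))
        for i in range(len(predictions[0]))
    ]


example_costs = {"angular_50": 100, "angular_18": 150, "planar": 180, "dc": 400}
selected = select_timd_modes(example_costs)
weights = fusion_weights(selected)
```

With the example costs, the planar mode differs from both angular modes and its cost lies below the threshold, so three modes are fused. If the planar cost were at or above the threshold, the sketch falls back to fusing only the first and second modes, which is one possible fallback and is not stated in the claims.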

Description

FIELD OF THE INVENTION

The invention relates to the predictive coding of blocks from previously decoded blocks of the same image using template-based intra-mode derivation, which can be used in video encoders and decoders.

BACKGROUND OF THE INVENTION

Video coding has achieved high compression ratios by exploiting the fact that objects typically move but do not necessarily change shape. The information already decoded for a previous frame can therefore be reused to predict the pixels of that object in the current frame. For example, if the buildings in a cityscape undergo a simple panning motion, a global motion vector can be used to move the pixels from their position in the previous frame to the position of the block currently being decoded. As for prediction correction: if no other changes occur, such as a change in lighting, these predicted pixels will be close enough to the values of the original frame to serve as a faithful reconstruction. Even when there are changes, prediction is still useful, because communicating the difference (also called the residual) requires far fewer bits than encoding the entire pixel block itself.

However, motion-based prediction is not always the best prediction (e.g., at the beginning of a new scene). In this case, so-called intra-prediction can be used, in which a block at a certain position, which is currently being decoded and reconstructed, is predicted from already decoded blocks of the same image (located above or to the left of the current block). Linearly predictable features in the image can be exploited by so-called angular intra-prediction, which predicts the current pixel from a neighboring, already reconstructed pixel found along a specific angular direction. For example, a skyscraper might exhibit largely the same colors along vertical lines that follow the structure of the windows and the wall between them.
Retrieving a neighboring pixel located vertically above (corresponding to angular direction 26 in, for example, HEVC) then provides a good prediction for the color components of the current pixel.

In older MPEG codecs, the prediction direction or directions for a block were usually signaled explicitly. With newer codecs, it has become clear that, for some devices or communication systems, computation can be cheaper than communication. Consequently, one goal can be for the decoder to recognize vertical patterns on its own, eliminating the need for an explicit instruction to use the vertical angular prediction mode. Only correction bits need to be sent where necessary, which saves a few bits.

Template-based intra-mode derivation (TIMD) is a technique that allows the decoder to derive autonomously what should be a good prediction, such as an angular prediction at 45 degrees. In this approach, a template is formed, for example an area as wide as the block and 2 pixels high, located above a block (the current block to be predicted, as well as any surrounding blocks that could be good candidates for the prediction). The predictability is then checked, for example for an angular prediction at 45 degrees. The current block itself has not yet been decoded (the intra-prediction mode must first be determined), so its pixels cannot be used for the predictability check. However, another adjacent area can be used, located slightly further above the block and bordering the template; this second area is called the reference. If a diagonal pattern extends down through the reference area into the template (both for the block currently being decoded and for a good candidate reference block on which the prediction of the block's pixel color components can be based), it is likely that this pattern continues down from the template area into the area of the block, and that the diagonal direction is therefore still a good predictor there.
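
The predictability check described above can be illustrated with a toy sketch. It assumes a single reference row directly above a 2-row template, uses a plain sum of absolute differences (SAD) as the cost measure, and defines three simplified prediction modes; the function names, the sample values, and the clamping at the left edge are hypothetical simplifications and do not reproduce any codec's actual prediction process.

```python
# Toy sketch: one reference row above a 2-row template, SAD as the cost.
# All names and sample values here are illustrative assumptions.

def predict_vertical(reference, rows):
    """Copy each reference sample straight down into every template row."""
    return [list(reference) for _ in range(rows)]


def predict_dc(reference, rows):
    """Fill the template with the rounded mean of the reference samples."""
    dc = (sum(reference) + len(reference) // 2) // len(reference)
    return [[dc] * len(reference) for _ in range(rows)]


def predict_diagonal(reference, rows):
    """45-degree mode: row r takes the sample r + 1 columns to the left.

    Indices that would fall left of the reference row are clamped to 0.
    """
    width = len(reference)
    return [
        [reference[max(0, col - (row + 1))] for col in range(width)]
        for row in range(rows)
    ]


def sad(predicted, reconstructed):
    """Sum of absolute differences between two equally sized sample grids."""
    return sum(
        abs(p - q)
        for pred_row, rec_row in zip(predicted, reconstructed)
        for p, q in zip(pred_row, rec_row)
    )


def template_costs(reference, template, modes):
    """Cost of each mode: predict the template from the reference, compare."""
    rows = len(template)
    return {name: sad(fn(reference, rows), template) for name, fn in modes.items()}


MODES = {
    "vertical": predict_vertical,
    "dc": predict_dc,
    "diagonal": predict_diagonal,
}

# A vertical stripe that runs through the reference row into the template.
reference_row = [10, 10, 200, 200, 10, 10]
template_rows = [[10, 10, 200, 200, 10, 10], [10, 10, 200, 200, 10, 10]]
costs = template_costs(reference_row, template_rows, MODES)
best_mode = min(costs, key=costs.get)
```

Because the stripe continues unchanged from the reference into the template, the vertical mode reproduces the template exactly and receives the lowest cost, which is the behavior the skyscraper example relies on.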
Different prediction directions can be tested, and a cost measure can indicate which direction gives the best prediction (e.g., downwards for the skyscraper rather than diagonally). It is desirable to continuously improve these possibilities.

BRIEF SUMMARY OF THE INVENTION

The encoding of video images can be improved by a method for predicting a block of pixel colors in an image, comprising: - determining that template-based intra-mode derivation is applicable to predicting the block; - determining a plurality of intra-prediction modes, and determining the respective costs for the plurality of intra-prediction modes; - determining a first TIMD mode, which is the intra-prediction mode of the plurality having the lowest cost, and a second TIMD mode, which is the intra-prediction mode of the plurality having the second-lowest cost; - determining a third TIMD mode, which is a non-angular intra-prediction mode; - verifying whether the third TIMD mode differs from the first TIMD mode (i.e., is not identical to it) and whether the third TIMD mode differs from the second TIMD mode; - based on the successf