EP-4736408-A2 - INTRA AFFINE PREDICTION IN VIDEO CODING
Abstract
Methods and systems are described for video coding and decoding using intra affine prediction. Two or three control point best vectors (CPBV) to be used to derive motion vectors of affine predictions are generated based on creating for each CPBV a list of candidate best vectors, and selecting according to a criterion the two or three best ones. Methods to generate the list of the candidate best vectors include a first method based on neighboring coded units and a distance criterion among two consecutive best vectors in the list, and a second method based on template matching between a coded unit to be encoded or decoded and prior-decoded coded units in a reference area.
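The first list-generation method can be illustrated with a minimal sketch. It assumes the distance criterion takes the sum-of-absolute-differences form recited in claims 2 and 12, and that candidates arrive in some fixed scan order over neighboring coded units. The names `build_bv_list`, `l1_distance`, `th`, and `max_len` are hypothetical and chosen for illustration only; this is not the reference implementation.

```python
from typing import List, Tuple

# A vector as (horizontal, vertical) components; the tuple layout is an assumption.
BV = Tuple[int, int]


def l1_distance(a: BV, b: BV) -> int:
    """Sum of absolute differences of the horizontal and vertical components."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_bv_list(candidates: List[BV], th: int, max_len: int = 3) -> List[BV]:
    """Keep a candidate only if it is farther than `th` from the last kept entry.

    `candidates` stands in for vectors gathered from neighboring coded units;
    the scan order and the value of `th` are assumptions of this sketch.
    """
    bv_list: List[BV] = []
    for bv in candidates:
        if len(bv_list) >= max_len:
            break  # the list is capped (claims 3 and 13 mention 2 or 3 entries)
        if not bv_list or l1_distance(bv_list[-1], bv) > th:
            bv_list.append(bv)
    return bv_list


if __name__ == "__main__":
    neighbors = [(-16, 0), (-17, 0), (-16, -8), (-40, -8), (-41, -9)]
    print(build_bv_list(neighbors, th=4, max_len=3))
```

In this sketch the comparison is made only against the most recently accepted entry, which matches the "two consecutive" wording; comparing against every entry already in the list would be a stricter variant.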
Inventors
- ARUMUGAM, Jeeva Raj
- Natesan, Ashwin
- VALVAIKER, Vaibhav Pandurang
- Shingala, Jay Nitin
- PU, Fangjun
- LU, TAORAN
- YIN, PENG
- SARATE, Abhijeet Basavaraj
Assignees
- Dolby Laboratories Licensing Corporation
Dates
- Publication Date
- 20260506
- Application Date
- 20240626
Claims (20)
- 1. A method in a decoder for video decoding intra-coded coding units coded in an intra affine mode, the method comprising: receiving a coded unit (CU) coded using an intra-affine prediction mode; determining two or more control point best vectors (CPBVs) based on a list of best vectors (BVs) generated from prior decoded CUs neighboring the CU; determining motion vectors for the intra-affine prediction mode based on the two or more CPBVs; and decoding the CU using the determined motion vectors, wherein determining the two or more control point best vectors (CPBVs) comprises: for each one of the two or more CPBVs: creating a BV list of two or more best vectors, wherein two consecutive best vectors in the BV list satisfy a distance criterion.
- 2. The method of claim 1, wherein satisfying the distance criterion comprises adding best vector $B(i+1)_{bv}$ in the BV list if $|B(i)_{bv}^{x} - B(i+1)_{bv}^{x}| + |B(i)_{bv}^{y} - B(i+1)_{bv}^{y}| > th$, wherein $th$ denotes a threshold, $B(i)_{bv}$ and $B(i+1)_{bv}$ are two consecutive best vectors in the BV list with horizontal ($B_{bv}^{x}$) and vertical ($B_{bv}^{y}$) components, and $(i+1)$ does not exceed a maximum possible number of best vectors in the BV list.
- 3. The method of claim 2, wherein the BV list includes at most 2 or 3 best vectors.
- 4. A method for video decoding intra-coded coding units coded in an intra affine mode, the method comprising: receiving a coded unit (CU) coded using an intra-affine prediction mode; determining two control point best vectors (CPBVs) based on a list of best vectors (BVs) generated from a template matching process; determining motion vectors for the intra-affine prediction mode based on the two CPBVs; and decoding the CU using the determined motion vectors, wherein determining the two control point best vectors (CPBVs) comprises: for a first CPBV of the two CPBVs creating a first BV list of two or more best vectors based on template costs using a left or a top template for the CU; and for a second CPBV of the two CPBVs creating a second BV list of two or more best vectors based on template costs using a top or a left template for the CU.
- 5. The method of claim 4, wherein creating the first BV list comprises: computing template costs associated with performing template matching (TM) between the left template for the CU and two or more corresponding templates in a reference region for the CU; ranking the template costs in increasing cost order; and selecting as BVs vectors for the first BV list those vectors associated with two or more of the template costs with smaller template costs.
- 6. The method of claim 4, wherein creating the second BV list comprises: computing template costs associated with performing template matching (TM) between the top template for the CU and two or more corresponding templates in a reference region for the CU; ranking the template costs in increasing cost order; and selecting as BVs vectors for the second BV list those vectors associated with two or more of the template costs with smaller template costs.
- 7. The method of claim 4 wherein a syntax parameter received by the decoder determines whether the first BV list is associated with the left template and the second BV list is associated with the top template or whether the first BV list is associated with the top template and the second BV list is associated with the left template.
- 8. The method of claim 1 or claim 4, wherein determining the two or more CPBVs further comprises: generating based on the BV lists for the two or more CPBVs a final intra affine candidate list; and selecting according to a selection criterion the two or more CPBVs.
- 9. The method of claim 8, wherein the selection criterion comprises receiving a candidate index parameter from the decoder selecting the two or more CPBVs in the final intra affine candidate list.
- 10. The method of claim 9, wherein the selection criterion comprises performing template matching between the CU and candidate CPBVs in the final intra affine candidate list and selecting the two or more CPBVs with the smaller template-matching cost.
- 11. A method in an encoder for video coding intra-coded coding units coded in an intra affine mode, the method comprising: accessing a coded unit (CU) to be coded in intra-affine prediction mode; determining two or more control point best vectors (CPBVs) based on a list of best vectors (BVs) generated from prior encoded CUs neighboring the CU; determining motion vectors for the intra-affine prediction mode based on the two or more CPBVs; and encoding the CU using the determined motion vectors, wherein determining the two or more control point best vectors (CPBVs) comprises: for each one of the two or more CPBVs: creating a BV list of two or more best vectors, wherein two consecutive best vectors in the BV list satisfy a distance criterion.
- 12. The method of claim 11, wherein satisfying the distance criterion comprises adding best vector $B(i+1)_{bv}$ in the BV list if $|B(i)_{bv}^{x} - B(i+1)_{bv}^{x}| + |B(i)_{bv}^{y} - B(i+1)_{bv}^{y}| > th$, wherein $th$ denotes a threshold, $B(i)_{bv}$ and $B(i+1)_{bv}$ are two consecutive best vectors in the BV list with horizontal ($B_{bv}^{x}$) and vertical ($B_{bv}^{y}$) components, and $(i+1)$ does not exceed a maximum possible number of best vectors in the BV list.
- 13. The method of claim 12, wherein the BV list includes at most 2 or 3 best vectors.
- 14. A method for video encoding coding units in an intra affine mode, the method comprising: accessing a coded unit (CU) to be coded in intra-affine prediction mode; determining two control point best vectors (CPBVs) based on a list of best vectors (BVs) generated from a template matching process; determining motion vectors for the intra-affine prediction mode based on the two CPBVs; and encoding the CU using the determined motion vectors, wherein determining the two control point best vectors (CPBVs) comprises: for a first CPBV of the two CPBVs creating a first BV list of two or more best vectors based on template costs using a left or a top template for the CU; and for a second CPBV of the two CPBVs creating a second BV list of two or more best vectors based on template costs using a top or a left template for the CU.
- 15. The method of claim 14, wherein creating the first BV list comprises: computing template costs associated with performing template matching (TM) between the left template for the CU and two or more corresponding templates in a reference region for the CU; ranking the template costs in increasing cost order; and selecting as BVs vectors for the first BV list those vectors associated with two or more of the template costs with the smaller template costs.
- 16. The method of claim 14, wherein creating the second BV list comprises: computing template costs associated with performing template matching (TM) between the top template for the CU and two or more corresponding templates in a reference region for the CU; ranking the template costs in increasing cost order; and selecting as BVs vectors for the second BV list those vectors associated with two or more of the template costs with the smaller template costs.
- 17. The method of claim 14, wherein a syntax parameter transmitted by the encoder in a bitstream with the encoded CU specifies whether the first BV list is associated with the left template and the second BV list is associated with the top template or whether the first BV list is associated with the top template and the second BV list is associated with the left template.
- 18. The method of claim 11 or claim 14, wherein determining the two or more CPBVs further comprises: generating based on the BV lists for the two or more CPBVs a final intra affine candidate list; and selecting according to a selection criterion the two or more CPBVs.
- 19. The method of claim 18, wherein the selection criterion comprises selecting the two or more CPBVs according to a rate-distortion criterion, and sending by the encoder a candidate index parameter identifying the two or more CPBVs in the final intra affine candidate list.
- 20. The method of claim 19, wherein the selection criterion comprises performing template matching between the CU and candidate CPBVs in the final intra affine candidate list and selecting the two or more CPBVs with the smaller template-matching costs.
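The template-matching list construction recited in claims 5, 6, 15, and 16 (compute a cost per candidate template, rank in increasing cost order, keep the vectors with the smaller costs) can be sketched as follows. The claims do not name a cost metric, so the sum of absolute differences (SAD) used here is an assumption, as are the names `sad`, `rank_bvs_by_template_cost`, and `keep`, and the representation of templates as small 2-D lists of sample values.

```python
from typing import Dict, List, Sequence, Tuple

BV = Tuple[int, int]                   # (horizontal, vertical); layout is an assumption
Template = Sequence[Sequence[int]]     # rows of reconstructed sample values


def sad(a: Template, b: Template) -> int:
    """Sum of absolute differences between two equally sized templates.

    SAD is an assumed cost metric; the source only speaks of 'template costs'.
    """
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def rank_bvs_by_template_cost(
    cu_template: Template,
    candidates: Dict[BV, Template],
    keep: int = 2,
) -> List[BV]:
    """Return the `keep` vectors whose reference templates best match `cu_template`.

    `candidates` maps each candidate vector to the template found at the
    displaced position in the reference region (hypothetical input format).
    """
    costs = [(sad(cu_template, tmpl), bv) for bv, tmpl in candidates.items()]
    costs.sort(key=lambda item: item[0])   # increasing cost order
    return [bv for _, bv in costs[:keep]]


if __name__ == "__main__":
    left_template = [[10, 12], [11, 13]]
    reference = {
        (-8, 0): [[10, 12], [11, 14]],
        (-16, 0): [[0, 0], [0, 0]],
        (-8, -4): [[10, 12], [11, 13]],
    }
    print(rank_bvs_by_template_cost(left_template, reference, keep=2))
```

Running the same function once with the left template and once with the top template would yield the first and second BV lists; which template feeds which list is, per claims 7 and 17, signaled by a syntax parameter.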
Description
INTRA AFFINE PREDICTION IN VIDEO CODING
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This patent application claims the benefit of priority from Provisional Indian Patent Application Ser. No. 202311043461, filed on 28 June 2023, and Provisional Indian Patent Application Ser. No. 202311066925, filed on 5 October 2023, each of which is incorporated by reference herein in its entirety.
TECHNOLOGY
[0002] The present document relates generally to images and video coding. More particularly, an embodiment of the present invention relates to applications of Intra affine prediction tools in video coding.
BACKGROUND
[0003] In 2020, the MPEG group in the International Standardization Organization (ISO), jointly with the International Telecommunications Union (ITU), released the first version of the Versatile Video Coding Standard (VVC), also known as H.266 (Ref. [1]). More recently, the same group has been working on the development of the next generation coding standard that provides improved coding performance over existing video coding technologies. As part of this investigation, new coding techniques are also examined.
[0004] As appreciated by the inventors here, improved techniques for applying Intra affine prediction tools in image and video coding are desired, and they are described herein.
[0005] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] An embodiment of the present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
[0007] FIG. 1 depicts examples of Intra-block copying (IBC) and Affine prediction according to prior art;
[0008] FIG. 2 depicts an example of subblock based affine prediction according to prior art;
[0009] FIG. 3 depicts an example of motion vector prediction (MVP) in IBC-Affine prediction according to prior art;
[00010] FIG. 4 depicts derivation points to derive control point block vectors (CPBV) in advanced intra affine according to an embodiment of this invention;
[00011] FIG. 5 depicts subblock-based block vector derivation for the current block in advanced intra affine mode according to an embodiment of this invention;
[00012] FIG. 6 depicts examples of templates for using template matching in advanced intra affine mode according to an embodiment of this invention;
[00013] FIG. 7 depicts examples of deriving control point vectors for an intra affine transform using template matching according to an embodiment of this invention;
[00014] FIG. 8 depicts examples of deriving control point block vectors (CPBVs) for intra affine transform based on different model types according to an embodiment of this invention;
[00015] FIG. 9 depicts an example CPBV list construction process, according to an embodiment of this invention;
[00016] FIG. 10 depicts an example of an intra affine candidate list construction process according to an embodiment of this invention; and
[00017] FIG. 11 depicts control point intra prediction modes from spatial neighbors to be used for deriving the sub-block modes using an affine model, according to an embodiment of this invention.
DESCRIPTION OF EXAMPLE EMBODIMENTS
[00018] Example embodiments that relate to applying Intra Affine prediction tools in video coding are described herein. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the various embodiments of the present invention. It will be apparent, however, that the various embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating embodiments of the present invention.
SUMMARY
[00019] Example embodiments described herein relate to applying intra affine prediction tools in image and video coding. In intra affine prediction, an affine transform is generated based on motion vectors generated from two or three control point best vectors (CPBV). In embodiments described herein, the two or three CPBVs are selected from a final intra affine candidate list that includes a combination of generated best vectors (BVs). In a first embodiment, for a coded unit (CU), the BVs are generated based on neighboring CUs that were coded in IBC, intra TMP, or intra affine mode