US-12621497-B2 - Device and method for coding video data
Abstract
A method of encoding video data by an electronic device is provided. The method receives the video data including at least one image frame, each including one or more regions. The method signals a first affine flag in a sequence parameter set (SPS) associated with the at least one image frame when an affine mode including affine tools is enabled, and determines that a second affine flag is signaled in the SPS when the first affine flag is equal to one. The method determines that a third affine flag corresponding to an affine prediction refinement with optical flow (PROF) mode is signaled in a slice header associated with a specific region in a specific image frame when the second affine flag is equal to one. The method reconstructs the specific region based on first candidate modes, including the affine PROF mode, when the third affine flag is equal to zero.
Inventors
- Yu-Chiao Yang
Assignees
- SHARP KABUSHIKI KAISHA
Dates
- Publication Date
- 20260505
- Application Date
- 20240719
Claims (19)
- 1 . A method of encoding video data performed by an electronic device, the method comprising: receiving the video data, including at least one image frame, each of the at least one image frame including one or more regions; signaling a first affine flag in a sequence parameter set (SPS), associated with the at least one image frame, when an affine mode is enabled in the at least one image frame, wherein the SPS is included in encoded data and the affine mode includes a plurality of affine tools; determining that a second affine flag is signaled in the SPS when the first affine flag is equal to one; determining that a third affine flag is signaled in a slice header, associated with a specific one of the one or more regions in a specific one of the at least one image frame, when the second affine flag is equal to one, wherein: the slice header is included in the encoded data, the third affine flag corresponds to one of the plurality of affine tools, the one of the plurality of affine tools comprises an affine prediction refinement with optical flow (PROF) mode, and the third affine flag comprises an affine PROF disabled flag, indicating whether the affine PROF mode is disabled, when the specific one of the one or more regions associated with the slice header is reconstructed; and reconstructing the specific one of the one or more regions based on a plurality of first candidate modes, including the one of the plurality of affine tools, when the third affine flag is equal to zero.
- 2 . The method according to claim 1 , further comprising: determining that the third affine flag is not signaled in the slice header when the second affine flag is equal to zero.
- 3 . The method according to claim 2 , further comprising: determining that the second affine flag is not signaled in the SPS when the first affine flag is equal to zero; and inferring that the second affine flag is equal to zero when the second affine flag is not signaled in the SPS.
- 4 . The method according to claim 1 , further comprising: inferring that the third affine flag is equal to zero when the second affine flag is signaled in the SPS and the third affine flag is not signaled in the slice header.
- 5 . The method according to claim 1 , further comprising: disabling the affine PROF mode for the specific one of the one or more regions when the third affine flag is equal to one; and reconstructing the specific one of the one or more regions based on a plurality of second candidate modes when the third affine flag is equal to one, wherein the affine PROF mode is excluded from the plurality of first candidate modes to generate the plurality of second candidate modes.
- 6 . The method according to claim 1 , wherein: a syntax level of the SPS is higher than a syntax level of the slice header, the at least one image frame is included in a video sequence of the video data and the SPS corresponds to the video sequence, and the specific one of the one or more regions comprises a slice, that is included in the specific one of the at least one image frame in the video sequence, and the slice header corresponds to the slice.
- 7 . The method according to claim 1 , wherein: the first affine flag comprises an affine PROF enabled flag, indicating whether the affine PROF mode is enabled, when the at least one image frame, associated with the SPS, is reconstructed, and the second affine flag comprises an affine PROF present flag, indicating whether the affine PROF disabled flag is included in the slice header, when the second affine flag is equal to one.
- 8 . The method according to claim 7 , further comprising: disabling the affine PROF mode for the at least one image frame, associated with the SPS, when the affine PROF enabled flag is equal to zero; and reconstructing the at least one image frame based on a plurality of second candidate modes when the affine PROF enabled flag is equal to zero, wherein the affine PROF mode is excluded from the plurality of first candidate modes to generate the plurality of second candidate modes.
- 9 . The method according to claim 7 , further comprising: enabling the affine PROF mode for the specific one of the one or more regions, associated with the slice header, when the affine PROF disabled flag is equal to zero; signaling a block flag, in a syntax structure corresponding to a block unit, indicating whether the block unit in the specific one of the one or more regions is predicted by the affine mode, when the affine PROF disabled flag is equal to zero, wherein the syntax structure is different from the SPS and the slice header; predicting the block unit based on the affine mode to generate a prediction block when the block flag is equal to one; and refining the prediction block according to the affine PROF mode when the block flag is equal to one.
- 10 . An electronic device for encoding video data, the electronic device comprising: at least one processor; and at least one non-transitory computer-readable medium coupled to the at least one processor and storing one or more computer-executable instructions that, when executed by the at least one processor, cause the electronic device to: receive the video data, including at least one image frame, each of the at least one image frame including one or more regions; signal a first affine flag in a sequence parameter set (SPS), associated with the at least one image frame, when an affine mode is enabled in the at least one image frame, wherein the SPS is included in encoded data and the affine mode includes a plurality of affine tools; determine that a second affine flag is signaled in the SPS when the first affine flag is equal to one; determine that a third affine flag is signaled in a slice header, associated with a specific one of the one or more regions in a specific one of the at least one image frame, when the second affine flag is equal to one, wherein: the slice header is included in the encoded data, the third affine flag corresponds to one of the plurality of affine tools, the one of the plurality of affine tools comprises an affine prediction refinement with optical flow (PROF) mode, and the third affine flag comprises an affine PROF disabled flag, indicating whether the affine PROF mode is disabled, when the specific one of the one or more regions associated with the slice header is reconstructed; and reconstruct the specific one of the one or more regions based on a plurality of first candidate modes, including the one of the plurality of affine tools, when the third affine flag is equal to zero.
- 11 . The electronic device according to claim 10 , wherein the one or more computer-executable instructions, when executed by the at least one processor, further cause the electronic device to: determine that the second affine flag is not signaled in the SPS when the first affine flag is equal to zero; infer that the second affine flag is equal to zero when the second affine flag is not signaled in the SPS; and determine that the third affine flag is not signaled in the slice header when the second affine flag is equal to zero.
- 12 . The electronic device according to claim 10 , wherein the one or more computer-executable instructions, when executed by the at least one processor, further cause the electronic device to: determine that the third affine flag is not signaled in the slice header when the second affine flag is equal to zero; and infer that the third affine flag is equal to zero when the second affine flag is signaled in the SPS and the third affine flag is not signaled in the slice header.
- 13 . The electronic device according to claim 10 , wherein the one or more computer-executable instructions, when executed by the at least one processor, further cause the electronic device to: disable the affine PROF mode for the specific one of the one or more regions when the third affine flag is equal to one; and reconstruct the specific one of the one or more regions based on a plurality of second candidate modes when the third affine flag is equal to one, wherein the affine PROF mode is excluded from the plurality of first candidate modes to generate the plurality of second candidate modes.
- 14 . The electronic device according to claim 10 , wherein: the first affine flag comprises an affine PROF enabled flag, indicating whether the affine PROF mode is enabled, when the at least one image frame, associated with the SPS, is reconstructed, and the second affine flag comprises an affine PROF present flag, indicating whether the affine PROF disabled flag is included in the slice header, when the second affine flag is equal to one.
- 15 . The electronic device according to claim 14 , wherein the one or more computer-executable instructions, when executed by the at least one processor, further cause the electronic device to: disable the affine PROF mode for the at least one image frame, associated with the SPS, when the affine PROF enabled flag is equal to zero; and reconstruct the at least one image frame based on a plurality of second candidate modes when the affine PROF enabled flag is equal to zero, wherein the affine PROF mode is excluded from the plurality of first candidate modes to generate the plurality of second candidate modes.
- 16 . A method of encoding video data performed by an electronic device, the method comprising: receiving the video data, including at least one image frame, each of the at least one image frame including one or more regions; signaling a first affine flag in a sequence parameter set (SPS), associated with the at least one image frame, when an affine mode is enabled in the at least one image frame, wherein the SPS is included in encoded data and the affine mode includes a plurality of affine tools; determining, based on the first affine flag, whether a second affine flag is signaled in the SPS; determining, based on the second affine flag, whether a third affine flag is signaled in a slice header, associated with a specific one of the one or more regions in a specific one of the at least one image frame, wherein: the slice header is included in the encoded data, the third affine flag corresponds to one of the plurality of affine tools, the one of the plurality of affine tools comprises an affine prediction refinement with optical flow (PROF) mode, and the third affine flag comprises an affine PROF disabled flag, indicating whether the affine PROF mode is disabled, when the specific one of the one or more regions associated with the slice header is reconstructed; and reconstructing the specific one of the one or more regions based on a plurality of first candidate modes, including the one of the plurality of affine tools, when the third affine flag is equal to zero.
- 17 . The method according to claim 16 , further comprising: determining that the second affine flag is signaled in the SPS when the first affine flag is equal to one; and determining that the second affine flag is not signaled in the SPS and inferring that the second affine flag is equal to zero when the first affine flag is equal to zero.
- 18 . The method according to claim 16 , further comprising: determining that the third affine flag is signaled in the slice header when the second affine flag is equal to one; and determining that the third affine flag is not signaled in the slice header when the second affine flag is equal to zero.
- 19 . The method according to claim 16 , further comprising: disabling the affine PROF mode for the at least one image frame, associated with the SPS, when the first affine flag is equal to zero; and reconstructing the at least one image frame based on a plurality of second candidate modes when the affine PROF enabled flag is equal to zero, wherein the affine PROF mode is excluded from the plurality of first candidate modes to generate the plurality of second candidate modes, wherein the first affine flag comprises an affine PROF enabled flag, indicating whether the affine PROF mode is enabled, when the at least one image frame associated with the SPS is reconstructed.
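The conditional signaling recited in the claims above can be illustrated with a minimal Python sketch. It is not the claimed implementation or any codec's actual syntax: the names ProfFlags, write_flags, read_flags and candidate_modes, the mode label "affine_prof", and the representation of the SPS and slice header as plain lists of flag values are all hypothetical. The sketch also assumes that the first affine flag is treated as zero when the affine mode is not enabled, which the claims do not state.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ProfFlags:
    enabled: int = 0   # first affine flag (SPS): affine PROF enabled flag
    present: int = 0   # second affine flag (SPS): affine PROF present flag
    disabled: int = 0  # third affine flag (slice header): affine PROF disabled flag


def write_flags(affine_enabled: bool, f: ProfFlags) -> Tuple[List[int], List[int]]:
    """Encoder side: emit only the flags whose signaling condition holds."""
    sps: List[int] = []
    slice_header: List[int] = []
    if affine_enabled:
        sps.append(f.enabled)                   # first flag, only if affine mode is on
        if f.enabled == 1:
            sps.append(f.present)               # second flag, only if first == 1
            if f.present == 1:
                slice_header.append(f.disabled)  # third flag, only if second == 1
    return sps, slice_header


def read_flags(affine_enabled: bool, sps: List[int], slice_header: List[int]) -> ProfFlags:
    """Parse the signaled flags; any flag that is not signaled stays zero."""
    f = ProfFlags()  # assumption: unsignaled flags are inferred to be zero
    if affine_enabled:
        f.enabled = sps[0]
        if f.enabled == 1:
            f.present = sps[1]
            if f.present == 1:
                f.disabled = slice_header[0]
    return f


def candidate_modes(f: ProfFlags, first_candidates: List[str]) -> List[str]:
    """Return the first candidate modes, or the second set without affine PROF."""
    if f.enabled == 1 and f.disabled == 0:
        return list(first_candidates)
    return [m for m in first_candidates if m != "affine_prof"]


if __name__ == "__main__":
    modes = ["merge", "affine", "affine_prof"]  # hypothetical mode labels
    sps, sh = write_flags(True, ProfFlags(enabled=1, present=1, disabled=1))
    print(candidate_modes(read_flags(True, sps, sh), modes))  # ['merge', 'affine']
```

In this sketch a slice pays for the third flag only when the SPS announces that it is present, so sequences that never disable affine PROF per slice carry at most two extra SPS flags.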
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
This application is a continuation application of U.S. patent application Ser. No. 17/836,278, filed on Jun. 9, 2022, which is a continuation application of U.S. patent application Ser. No. 16/987,304, filed on Aug. 6, 2020, issued as U.S. Pat. No. 11,405,648, which claims the benefit of and priority to U.S. Provisional Patent Application Ser. No. 62/884,335, filed on Aug. 8, 2019, the contents of all of which are hereby incorporated herein fully by reference in their entirety for all purposes.
FIELD
The present disclosure generally relates to video coding, and more specifically, to techniques for controlling an affine tool to be enabled or disabled by different syntax structures having different syntax levels for reconstructing image frames in encoded data.
BACKGROUND
In a conventional video coding method, an encoder may encode video data to generate encoded data having multiple flags and provide the encoded data to a decoder. The flags may indicate whether multiple coding modes are enabled. For example, the encoded data may include a block-based affine flag indicating whether a block unit is predicted by an affine mode. In addition, the block unit is also refined according to an affine prediction refinement with optical flow (PROF) mode when the block-based affine flag indicates that the block unit is predicted by the affine mode. However, the coding efficiency is not always increased when the affine-predicted blocks are refined according to the affine PROF mode. In other words, the coding efficiency may decrease for some of the block units refined according to the affine PROF mode. Thus, the encoder and the decoder need to have more flags for the affine PROF mode. In addition, the selection of a syntax level of the affine PROF flag is critical to prevent the number of bits in the encoded data from increasing too much.
SUMMARY
The present disclosure is directed to a device and method for disabling an adjustment to an initial prediction result using several flags.
In a first aspect of the present disclosure, a method for decoding a bitstream by an electronic device is provided. The method includes receiving encoded data, as part of the bitstream, for at least one image frame, wherein each of the at least one image frame includes one or more regions; determining a first affine flag from a sequence parameter set (SPS) associated with the at least one image frame when an affine mode is enabled in the at least one image frame, wherein the SPS is included in the encoded data and the affine mode includes multiple affine tools; determining that a second affine flag is present in the SPS when the first affine flag is equal to one; determining that a third affine flag is present in a slice header associated with a specific one of the one or more regions in a specific one of the at least one image frame when the second affine flag is equal to one, wherein the slice header is included in the encoded data, and the third affine flag corresponds to one of the multiple affine tools; and reconstructing the specific one of the one or more regions based on multiple first candidate modes, including the one of the multiple affine tools, when the third affine flag is equal to zero.
In a second aspect of the present disclosure, a method for decoding a bitstream by an electronic device is provided.
The method includes receiving encoded data, as part of the bitstream, for at least one image frame, wherein each of the at least one image frame includes one or more regions; determining a first affine flag from a sequence parameter set (SPS) associated with the at least one image frame when an affine mode is enabled in the at least one image frame, wherein the SPS is included in the encoded data and the affine mode includes multiple affine tools; determining, based on the first affine flag, whether a second affine flag is present in the SPS; determining, based on the second affine flag, whether a third affine flag is present in a slice header associated with a specific one of the one or more regions in a specific one of the at least one image frame, wherein the slice header is included in the encoded data, and the third affine flag corresponds to one of the multiple affine tools; and reconstructing the specific one of the one or more regions based on multiple first candidate modes, including the one of the multiple affine tools, when the third affine flag is equal to zero.
In a third aspect of the present disclosure, a method of encoding video data and an electronic device for performing the method are provided. The method includes receiving the video data including at least one image frame, wherein each of the at least one image frame includes one or more regions; signaling a first affine flag in a sequence parameter set (SPS) associated with the at least one image frame when an affine mode is enabled in the at least one image frame, wherein the SPS is included in encoded data and the