
KR-20260066641-A - METHOD AND APPARATUS FOR VIDEO ENCODING AND DECODING OF SCREEN CONTENT AND GAMING CONTENT VIDEO


Abstract

The present invention identifies characteristics of computer-graphics-generated images and presents a video encoding method that exploits them. Images produced by graphics processing exhibit sharp differences between neighboring pixel values, a characteristic that distinguishes them from images captured by a camera and that can be exploited during encoding. The present invention aims to improve coding performance by not smoothing the differences between neighboring pixel values within a computer-graphics-generated image during the encoding/decoding process.

Inventors

  • 류창우
  • 이선영

Assignees

  • 가온그룹 주식회사
  • 주식회사 아틴스

Dates

Publication Date
2026-05-12
Application Date
2025-11-04
Priority Date
2024-11-04

Claims (20)

  1. A video decoding method comprising: obtaining, from a bitstream, region information indicating that at least a portion of a current picture is a region generated by computer graphics; identifying, based on the region information, at least one filtering process applied in a prediction process of a current block corresponding to the graphics-generated region; generating a prediction block for the current block by selectively disabling the identified filtering process; and reconstructing the current block using the prediction block.
  2. The method of claim 1, wherein the graphics-generated region corresponds to a part of the current picture, the method further comprising obtaining, from the bitstream, region information indicating the location and size of the part.
  3. The method of claim 2, wherein the region information is specified using a coding tree unit (CTU) index.
  4. The method of claim 2, wherein the region information is specified using a tile index.
  5. The method of claim 2, wherein a plurality of graphics-generated regions are present, and region information for each region is signaled individually.
  6. The method of claim 1, wherein the filtering process includes at least one of reference sample filtering, position dependent prediction combination (PDPC), and intra prediction fusion, and the at least one filtering process is controlled collectively by a single flag.
  7. The method of claim 6, further comprising individually controlling at least one of the at least one filtering process based on an additional flag.
  8. The method of claim 1, further comprising automatically determining whether the current block belongs to a graphics-generated region using a hash-block-based algorithm, wherein the block is determined to belong to the region when the hit ratio of the hash blocks exceeds a predetermined threshold.
  9. The method of claim 1, wherein the filtering process includes at least one of overlapped block motion compensation (OBMC), geometric partitioning mode (GPM) blending, and bi-directional optical flow (BDOF).
  10. The method of claim 1, wherein whether to disable the filtering process is determined adaptively at the coding unit (CU) level.
  11. The method of claim 1, wherein at least one of a transform skip mode and an intra block copy (IBC) mode is applied preferentially within the graphics-generated region.
  12. The method of claim 1, wherein the region information is signaled in at least one bitstream syntax unit among a sequence parameter set (SPS), a picture parameter set (PPS), a picture header (PH), and a slice header (SH).
  13. The method of claim 1, wherein, when a dual-tree structure is applied, the filtering process is controlled independently for chroma blocks.
  14. A video encoding method comprising: determining whether at least a portion of a current picture is a region generated by computer graphics; identifying, based on the determination, at least one filtering process applied in a prediction process of a current block corresponding to the graphics-generated region; generating a prediction block for the current block by selectively disabling the identified filtering process; encoding the current block using the prediction block; and including, in a bitstream, region information indicating the graphics-generated region.
  15. The method of claim 14, wherein the graphics-generated region corresponds to a part of the current picture, the method further comprising including, in the bitstream, region information indicating the location and size of the part.
  16. The method of claim 14, wherein the filtering process includes at least one of reference sample filtering, position dependent prediction combination (PDPC), and intra prediction fusion, and the at least one filtering process is controlled collectively by a single flag.
  17. The method of claim 14, further comprising determining the graphics-generated region by computing hash values in units of 4×4 or 8×8 blocks.
  18. A video decoding apparatus comprising: a parser configured to obtain, from a bitstream, region information indicating that at least a portion of a current picture is a region generated by computer graphics; a predictor configured to identify, based on the region information, at least one filtering process applied in a prediction process of a current block corresponding to the graphics-generated region, and to generate a prediction block for the current block by selectively disabling the identified filtering process; and a reconstructor configured to reconstruct the current block using the prediction block.
  19. The apparatus of claim 18, wherein the graphics-generated region corresponds to a part of the current picture, and the parser is further configured to obtain region information indicating the location and size of the part.
  20. The apparatus of claim 18, wherein the filtering process includes at least one of reference sample filtering, position dependent prediction combination (PDPC), and intra prediction fusion, and the predictor is configured to control the at least one filtering process collectively based on a single flag.
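To make the control flow of the independent claims concrete, the following is an illustrative sketch (not the patented implementation) of how a decoder might gate the smoothing filters of claims 1 and 6 on signaled region information. All names (`GraphicsRegion`, `filters_for_block`, the flag and filter labels) are hypothetical.

```python
# Sketch: disable intra-prediction smoothing filters for blocks that fall
# inside a signaled graphics-generated region, controlled by a single flag.
from dataclasses import dataclass

@dataclass
class GraphicsRegion:
    """Location and size of a graphics-generated region (cf. claim 2)."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, bx: int, by: int) -> bool:
        return self.x <= bx < self.x + self.w and self.y <= by < self.y + self.h

def filters_for_block(bx: int, by: int, regions: list[GraphicsRegion],
                      disable_flag: bool) -> dict[str, bool]:
    """Decide which intra-prediction filters to apply for a block.

    If the block lies in a graphics-generated region and the single control
    flag is set, the smoothing steps (reference sample filtering, PDPC,
    intra prediction fusion) are switched off together, as in claim 6.
    """
    in_region = any(r.contains(bx, by) for r in regions)
    smooth = not (in_region and disable_flag)
    return {"ref_sample_filter": smooth, "pdpc": smooth, "intra_fusion": smooth}

regions = [GraphicsRegion(x=0, y=0, w=64, h=64)]
print(filters_for_block(16, 16, regions, disable_flag=True))    # all False
print(filters_for_block(128, 128, regions, disable_flag=True))  # all True
```

Per claim 7, a real codec could layer per-filter flags on top of this single collective flag; the sketch models only the collective case.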
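Claims 8 and 17 describe detecting graphics regions by hashing small blocks and comparing the hash hit ratio against a threshold. The sketch below illustrates that idea under assumed details: the block size, the use of MD5, and the 0.5 threshold are all hypothetical choices, not values from the patent.

```python
# Sketch: hash-block-based detection of a graphics-generated region.
# Screen/graphics content tends to contain many exactly repeated small
# blocks, so a high share of duplicate block hashes suggests graphics.
import hashlib
from collections import Counter

def detect_graphics_region(frame: list[list[int]], block: int = 4,
                           threshold: float = 0.5) -> bool:
    """Return True when the ratio of repeated block hashes exceeds threshold."""
    h, w = len(frame), len(frame[0])
    hashes = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            data = bytes(frame[y + j][x + i]
                         for j in range(block) for i in range(block))
            hashes.append(hashlib.md5(data).hexdigest())
    counts = Counter(hashes)
    hits = sum(c for c in counts.values() if c > 1)  # blocks sharing a hash
    return hits / len(hashes) > threshold

# A flat synthetic frame: every 4x4 block is identical, hit ratio 1.0.
flat = [[128] * 16 for _ in range(16)]
print(detect_graphics_region(flat))   # True
# A gradient frame: every 4x4 block is distinct, hit ratio 0.0.
ramp = [[(y * 16 + x) % 256 for x in range(16)] for y in range(16)]
print(detect_graphics_region(ramp))   # False
```

An encoder would run such a test per candidate area (claim 17 suggests 4×4 or 8×8 blocks) and then signal the detected region in the bitstream.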

Description

Method and Apparatus for Encoding and Decoding Video of Screen Content and Game Content

The present invention relates to the field of digital video encoding and decoding, to a method for encoding and decoding digital video, to a method for recording such data, and to components, devices, and systems for realizing such a method. The present invention may correspond to the technical field of at least one of the digital video compression standards known by names such as MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP9, VP10, and Motion JPEG, to a technical field for improving the inherent efficiency of such a standard, or to a technical field for improving or replacing such a standard. The present invention relates to an adaptive filter control method and apparatus capable of reducing encoding/decoding complexity while maintaining or improving compression performance by selectively deactivating unnecessary filtering operations during the intra-prediction process, taking into account the characteristics of computer-generated images such as screen content and gaming content.

Digital video encoding and decoding are widely used in digital video applications. For example, digital television broadcasting, video transmission over communication networks, video calls and video chat, the recording and distribution of video content on optical media such as VCDs (video compact discs), DVDs (digital versatile discs), and Blu-ray discs, all procedures for the production, editing, collection, and distribution of video content, and video recording for personal, commercial, industrial, and security purposes, together with devices such as video recorders and camcorders used for these activities, all depend on video encoding and decoding technology.
Accordingly, embodiments that can be referred to as digital video encoders and decoders may form part of a wide range of devices for creating, recording, and delivering digital video, including digital televisions, digital broadcasting systems, wireless broadcasting systems, notebook/desktop/tablet computers, e-book readers, digital cameras, digital recording devices, digital multimedia players, video game devices/terminals/consoles, mobile phones with multimedia playback functions (including smartphones), video conferencing equipment, and other devices. Such encoders and decoders can be implemented according to digital video compression standards that are understood and widely used by those skilled in the art, including at least one of the standards known by names such as MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP9, VP10, and Motion JPEG. Video encoders and decoders can be implemented to encode or decode digital video information more efficiently while complying with these specifications, or by improving or modifying them; such modifications may also lead to new specifications. A well-known example is the Enhanced Compression Model (ECM), an effort to improve upon and succeed the H.266/VVC specification, currently being developed by the Joint Video Experts Team (JVET), a joint international standardization group of ISO, IEC, and ITU-T. Meanwhile, as part of recent advances in video coding technology, attempts are being made to improve the compression efficiency of videos containing screen content, such as text or graphics, and videos containing gaming content.
Unlike natural images, such screen content images may contain edges in arbitrary directions rather than only horizontal or vertical ones, and are characterized by a complex mixture of linear and curved components. Gaming content images may exhibit characteristics of both natural images and screen content simultaneously: for example, gaming content may mix parts with natural-image characteristics, such as in-game characters or backgrounds, with parts having screen-content characteristics, such as in-game interfaces or text. Consequently, it has been pointed out that smoothing-related processes designed for natural images, such as reference sample filtering, position dependent prediction combination (PDPC), and intra prediction fusion, may actually degrade the sharp edge characteristics of screen content.

FIG. 1 is a conceptual diagram of a video communication system according to an embodiment of the present invention. FIG. 2 is a conceptual diagram of the arrangement of an encoder and a decoder in a real-time video streaming environment according to an embodiment of the present invention. FIG. 3 is a conceptual diagram of a functional unit of a