JP-7855646-B2 - Encoder, decoder, and encoding and decoding methods involving complex processing for flexibly sized image partitions

JP7855646B2

Inventors

Robert Skupin
Yago Sánchez de la Fuente
Cornelius Hellge
Thomas Schierl
Karsten Sühring
Thomas Wiegand

Assignees

Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Dates

Publication Date: 20260508
Application Date: 20240627
Priority Date: 20181228

Claims (6)

A hardware device for video decoding, the hardware device being configured to: decode, from a data stream, a first flag that is included in a tile header of a current image and that indicates whether blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images; in response to the first flag indicating that the blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images, decode, from the data stream, a second flag that is included in the header of the tile containing the first flag and that indicates whether a current block of the tile is predicted by copying samples from a reference block of samples reconstructed with respect to the current image; in accordance with the second flag indicating that the current block is predicted by copying samples from the reference block, determine a prediction of the current block in the current image by copying samples from the reference block; and reconstruct the current block based on the prediction of the current block and a prediction residual decoded from the data stream.
The hardware device according to claim 1, characterized in that the tile is composed of one or more coding tree units.
A method for video decoding, the method comprising: decoding, from a data stream, a first flag that is included in a tile header of a current image and that indicates whether blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images; in response to the first flag indicating that the blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images, decoding, from the data stream, a second flag that is included in the header of the tile containing the first flag and that indicates whether a current block of the tile is predicted by copying samples from a reference block of samples reconstructed with respect to the current image; in accordance with the second flag indicating that the current block is predicted by copying samples from the reference block, determining a prediction of the current block in the current image by copying samples from the reference block; and reconstructing the current block based on the prediction of the current block and a prediction residual decoded from the data stream.
The method according to claim 3, characterized in that the tile is composed of one or more coding tree units.
A non-transitory digital storage medium storing a computer program which, when executed by a computer, causes the computer to perform: decoding, from a data stream, a first flag that is included in a tile header of a current image and that indicates whether blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images; in response to the first flag indicating that the blocks of the tile are predicted based on samples reconstructed only with respect to the current image and not with respect to other images, decoding, from the data stream, a second flag that is included in the header of the tile containing the first flag and that indicates whether a current block of the tile is predicted by copying samples from a reference block of samples reconstructed with respect to the current image; in accordance with the second flag indicating that the current block is predicted by copying samples from the reference block, determining a prediction of the current block by copying samples from the reference block; and reconstructing the current block in the current image based on the prediction of the current block and a prediction residual decoded from the data stream.
The non-transitory digital storage medium according to claim 5, characterized in that the tile is composed of one or more coding tree units.
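The decoding steps recited in claims 1, 3 and 5 can be sketched as follows. This is a minimal illustration only: the names `TileHeaderFlags`, `parse_tile_header_flags` and `reconstruct_by_copy` are hypothetical, the two flags are assumed to be the first two one-bit fields of the tile header, and 8-bit samples with square blocks are assumed. The claims do not specify the bitstream syntax, the entropy coding, or how the reference block position is signalled.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TileHeaderFlags:
    current_image_only: bool  # "first flag" of the claims
    block_copy: bool          # "second flag" of the claims


def parse_tile_header_flags(bits: Sequence[int]) -> TileHeaderFlags:
    """Read the first flag; read the second flag only if the first is set.

    Assumption: both flags are plain one-bit fields at the start of the header.
    """
    current_image_only = bool(bits[0])
    # The second flag is present only when the first flag is set.
    block_copy = bool(bits[1]) if current_image_only else False
    return TileHeaderFlags(current_image_only, block_copy)


def reconstruct_by_copy(
    image: List[List[int]],
    flags: TileHeaderFlags,
    cur_x: int, cur_y: int,
    ref_x: int, ref_y: int,
    size: int,
    residual: Sequence[Sequence[int]],
    max_value: int = 255,  # assumption: 8-bit samples
) -> None:
    """Predict a block by copying reconstructed samples of the same image,
    then add the decoded prediction residual and store the result in place."""
    if not flags.block_copy:
        raise ValueError("second flag does not enable copy-based prediction")
    # Prediction: copy of the already reconstructed reference block.
    prediction = [
        [image[ref_y + dy][ref_x + dx] for dx in range(size)]
        for dy in range(size)
    ]
    # Reconstruction: prediction plus residual, clipped to the sample range.
    for dy in range(size):
        for dx in range(size):
            value = prediction[dy][dx] + residual[dy][dx]
            image[cur_y + dy][cur_x + dx] = min(max(value, 0), max_value)
```

In this sketch the reference block position (`ref_x`, `ref_y`) is passed in directly; a real decoder would derive it from further syntax elements that the claims leave open.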

Description

This invention relates to video encoding and video decoding, and in particular to encoders, decoders, and encoding and decoding methods involving complex processing for flexibly sized image partitions.
H.265/HEVC (HEVC = High Efficiency Video Coding) is a video codec that already provides tools for enhancing, or even enabling, parallel processing in encoders and/or decoders. For example, HEVC supports subdividing an image into an array of tiles that are encoded independently of each other. Another concept supported by HEVC is WPP (wavefront parallel processing), in which the CTU rows, or CTU lines, of an image may be processed in parallel from left to right, for example in stripes, provided that a certain minimum CTU offset is maintained between the processing of consecutive CTU lines. However, it would be preferable to have a video codec available that supports the parallel processing capabilities of video encoders and/or video decoders even more efficiently.
The following provides an introduction to VCL partitioning according to the state of the art (VCL = Video Coding Layer).
Generally, in video coding, the coding process of image samples requires smaller partitions: samples are divided into rectangular areas for joint processing, such as prediction or transform coding. Therefore, the image is partitioned into blocks of a specific size that remains constant during the encoding of the video sequence. In the H.264/AVC standard (AVC = Advanced Video Coding), fixed-size blocks of 16 x 16 samples, so-called macroblocks, are used. In the state-of-the-art HEVC standard (see Non-Patent Literature 1), there are coding tree blocks (CTBs) or coding tree units (CTUs) with a maximum size of 64 x 64 samples. In the further description of HEVC, the more common term CTU is used for this type of block.
The CTUs are processed in raster scan order, starting with the top-left CTU, processing the CTUs within the image line by line, down to the bottom-right CTU.
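The raster scan order of CTUs and the wavefront idea described above can be sketched as follows. This is an illustration under stated assumptions, not part of the patent: the function names `ctu_raster_scan` and `wavefront_steps` are hypothetical, CTUs at the right and bottom image edges are simply cropped to the image size, every CTU is assumed to take one time step, and the default offset of two CTUs between consecutive CTU lines is an illustrative choice for the "certain minimum CTU offset" mentioned in the text.

```python
from typing import Dict, Iterator, List, Tuple


def ctu_raster_scan(
    width: int, height: int, ctu_size: int = 64
) -> Iterator[Tuple[int, int, int, int, int]]:
    """Yield (index, x, y, w, h) for each CTU in raster scan order.

    The scan starts at the top-left CTU, runs line by line, and ends at the
    bottom-right CTU. CTUs crossing the image edge are cropped (assumption).
    """
    cols = -(-width // ctu_size)   # ceiling division
    rows = -(-height // ctu_size)
    for row in range(rows):
        for col in range(cols):
            x, y = col * ctu_size, row * ctu_size
            w = min(ctu_size, width - x)
            h = min(ctu_size, height - y)
            yield row * cols + col, x, y, w, h


def wavefront_steps(
    rows: int, cols: int, min_offset: int = 2
) -> Dict[int, List[Tuple[int, int]]]:
    """Group CTUs (row, col) by the earliest time step at which they can run.

    Each CTU line is processed left to right, and a line starts only after
    the line above is `min_offset` CTUs ahead (assumed uniform CTU cost).
    """
    steps: Dict[int, List[Tuple[int, int]]] = {}
    for row in range(rows):
        for col in range(cols):
            steps.setdefault(col + min_offset * row, []).append((row, col))
    return steps
```

For a 1920 x 1080 image with 64 x 64 CTUs, `ctu_raster_scan` yields 30 columns and 17 lines of CTUs, with the bottom line cropped to a height of 56 samples. All CTUs listed under the same key of `wavefront_steps` could be processed in parallel under the stated assumptions.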
[1] ISO/IEC, ITU-T. High efficiency video coding. ITU-T Recommendation H.265 | ISO/IEC 23008-2 (HEVC), edition 1, 2013; edition 2, 2014.
Embodiments of this invention will be described in detail below with reference to the drawings.
Figure 1 shows a video encoder according to an embodiment.
Figure 2 shows a video decoder according to an embodiment.
Figure 3 shows a system according to an embodiment.
Figure 4 illustrates the effect of replacing a partial CTU that generates a tile boundary.
Figure 5 shows the per-image luma sample compensation from partial CTUs.
Figure 6 shows the CTU grid alignment mismatch.
Figure 7 shows the percentage of CTUs affected by grid mismatch.
Figure 8 shows two sets of tile boundaries.
Figure 9 shows the corresponding CTU rows in the reference image following a partial CTU in the current image.
Figure 10 shows a video encoder.
Figure 11 shows a video decoder.
Figure 12 shows the relationship between the reconstructed signal, i.e., the reconstructed image, on the one hand, and the combination of the prediction residual signal transmitted in the data stream and the prediction signal, on the other hand.
Figure 13 shows image segmentation with slices in raster scan order.
Figure 14 shows image partitioning using tiles.
The following description of the drawings begins with a description of the encoder and decoder of a block-based predictive codec for coding the images of a video, in order to form an example of a coding framework into which embodiments of this invention may be incorporated. The respective encoder and decoder are described with respect to Figures 10 to 12.
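The relationship listed for Figure 12, in which the reconstructed signal is the combination of the prediction signal and the transmitted prediction residual signal, can be illustrated with a small sketch. The assumptions are significant: the transform stage is omitted entirely, a uniform scalar quantizer with an integer step size stands in for whatever quantization the codec actually uses, and the names `quantize`, `dequantize` and `encode_decode` are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize(residual: Sequence[int], step: int) -> List[int]:
    """Uniform scalar quantization with rounding to the nearest level."""
    return [
        (1 if r >= 0 else -1) * ((abs(r) + step // 2) // step)
        for r in residual
    ]


def dequantize(levels: Sequence[int], step: int) -> List[int]:
    """Map quantization levels back to residual values."""
    return [level * step for level in levels]


def encode_decode(
    original: Sequence[int], prediction: Sequence[int], step: int
) -> Tuple[List[int], List[int]]:
    """Return (transmitted levels, reconstructed samples).

    Encoder side: residual = original - prediction, then quantization.
    Decoder side: reconstructed = prediction + dequantized residual.
    """
    residual = [o - p for o, p in zip(original, prediction)]
    levels = quantize(residual, step)
    recon_residual = dequantize(levels, step)
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, reconstructed
```

With a step size of 1 the sketch reconstructs the original samples exactly; with a larger step size the reconstructed samples generally differ from the original ones, which is the kind of coding loss that quantization of the prediction residual introduces.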
Thereafter, the description of embodiments of the concept of this invention is provided along with a description of how such concepts may be incorporated into the encoder and decoder of Figures 10 and 11, respectively, although the embodiments described with respect to Figures 1 to 3 and the subsequent drawings may also be used to form encoders and decoders that do not operate according to the coding framework underlying the encoder and decoder of Figures 10 and 11.
Figure 10 shows a video encoder, i.e., a device for predictively encoding an image 12 into a data stream 14, using transform-based residual coding as an example. The device, or encoder, is denoted by reference numeral 10. Figure 11 shows a corresponding video decoder 20, i.e., a device configured to predictively decode an image 12' from the data stream 14, likewise using transform-based residual decoding. The apostrophe indicates that the image 12' reconstructed by the decoder 20 deviates from the image 12 originally encoded by the device 10 in terms of the coding loss introduced by the quantization of the prediction residual signal. Figures 10 and 11 use transform-based prediction residual coding as an example, although embodiments of this application are not limited to this type of prediction residual coding. This also applies to other details described with respect to Figures 10 and 11, as outlined below. The encoder 10 is configur