CN-115984125-B - Wide code rate dynamic point cloud coding-oriented occupancy bitmap guide artifact removal method


Abstract

The invention discloses an occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding. The method comprises: extracting an occupancy bitmap from a compression block of video-based dynamic point cloud coding, and upsampling the occupancy bitmap so that its size equals that of the geometric compression block; concatenating the geometric compression block with the upsampled occupancy bitmap, inputting the result into a Uformer variant, and extracting effective features of the geometric compression block to obtain a recovery block; calculating the corresponding logarithmic error from the recovery block and the geometric compression block and taking it as the error loss; updating the learning rate with an SGD function, and training the model by iterating the model parameters with an adaptive gradient optimizer and balancing the error loss, to obtain a final optimal model; and concatenating a geometric block with its corresponding upsampled occupancy bitmap and inputting them into the optimal model to obtain a geometric video frame with the artifacts removed.
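The first two steps of the abstract (upsampling the occupancy bitmap to the size of the geometry block, then concatenating the two) can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the patented implementation: nearest-neighbour upsampling, the integer `factor`, the channel-first layout, and the function names `upsample_occupancy` and `build_model_input` are all hypothetical choices that the source does not specify.

```python
import numpy as np


def upsample_occupancy(om, factor):
    """Nearest-neighbour upsampling (an assumption): each occupancy entry
    is repeated so that it covers a factor x factor block of pixels."""
    return np.repeat(np.repeat(om, factor, axis=0), factor, axis=1)


def build_model_input(geometry_block, om, factor):
    """Stack the geometry block and the upsampled occupancy bitmap as two
    channels (channel-first layout is an assumption)."""
    om_up = upsample_occupancy(om, factor).astype(geometry_block.dtype)
    if om_up.shape != geometry_block.shape:
        raise ValueError("upsampled occupancy bitmap must match the geometry block size")
    return np.stack([geometry_block, om_up], axis=0)


# Hypothetical 4x4 geometry block with a 2x2 occupancy bitmap.
geometry = np.arange(16, dtype=np.float32).reshape(4, 4)
om = np.array([[1, 0], [0, 1]], dtype=np.uint8)
model_input = build_model_input(geometry, om, factor=2)
print(model_input.shape)  # (2, 4, 4)
```

The resulting two-channel array is what a network such as the Uformer variant would receive; the network itself is outside the scope of this sketch.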

Inventors

XIONG JIAN
WU JUNHAO
LUO WANG
GAO HAO

Assignees

Nanjing University of Posts and Telecommunications (南京邮电大学)

Dates

Publication Date: 20260508
Application Date: 20221209

Claims (6)

1. An occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding, characterized by comprising the following steps: extracting an occupancy bitmap from a compression block of video-based dynamic point cloud coding, and upsampling the occupancy bitmap so that its size and dimensions are equal to those of the geometric compression block; concatenating the geometric compression block with the upsampled occupancy bitmap, inputting the concatenated result into a Uformer variant, and extracting effective features of the geometric compression block to obtain a recovery block; calculating the corresponding logarithmic errors from the recovery blocks and the geometric compression blocks, and taking the logarithmic errors as the error loss; updating the learning rate with an SGD function, and training the model by iterating the model parameters with an adaptive gradient optimizer and balancing the error loss, to obtain a final optimal model; and concatenating the geometric blocks with the corresponding upsampled occupancy bitmaps and inputting them into the optimal model to obtain geometric video frames with the artifacts removed.
2. The occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding according to claim 1, wherein the input used to train the optimal model comprises far layer blocks extracted from compression blocks of video-based dynamic point cloud coding, the corresponding near layer blocks, and the occupancy bitmaps.
3. The occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding according to claim 1, wherein the occupancy bitmap is subjected to max pooling after being upsampled, and the max-pooled occupancy bitmap is input into the Uformer variant.
4. The occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding according to claim 1, wherein the error loss is calculated as follows: calculating the difference between the recovery block and the original block, and multiplying the difference by the upsampled occupancy bitmap so as to filter out the interference of empty pixels with the difference calculation for non-empty pixels; and calculating the logarithmic error point by point for the non-empty pixels and summing to obtain the final error loss.
5. The occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding according to claim 4, wherein the error loss is calculated by the following formula: [formula not recoverable from the source text]; in the formula, the first pair of symbols denotes the recovery blocks of the far layer and of the corresponding near layer, respectively, the second pair of symbols denotes the corresponding original blocks, and "om" denotes the upsampled occupancy bitmap.
6. The occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding according to claim 1, wherein the optimal model is trained on the basis of an incremental peak signal-to-noise ratio (PSNR), and the incremental PSNR training method comprises: taking as the training target the minimum MSE between the recovery blocks and the original blocks, wherein m is the number of training samples and the target compares each recovery block with the i-th original block; updating the parameters by gradient descent during training in order to obtain the optimal model, wherein η is the learning rate, which specifies the size of the parameter update, and the update rule further depends on the denoising error; for each iteration, taking as the goal the maximization of the sum of the incremental PSNRs, wherein ΔPSNR_i is the incremental PSNR of the i-th sample, that is, the PSNR gain after the artifacts are removed, and is obtained from n, the number of signal bits, and from the MSE errors of the recovery block and of the input compression block, respectively; rewriting the resulting equation, wherein the error before the artifacts are removed cannot change for a given batch of samples, so that the MSE error of the input compression block is a constant in each iteration and the expression simplifies accordingly; and setting the final objective function from the simplified expression, wherein the logarithmic operation in the final objective function adjusts the effect of the MSE on the overall cost.
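The masked loss described in claim 4 and the incremental PSNR described in claim 6 can be sketched as follows. This is a hedged illustration only: the claims do not give the exact per-pixel form of the logarithmic error, so the sketch assumes the logarithm of the squared difference, and `eps`, `masked_log_loss`, `psnr`, and `incremental_psnr` are hypothetical names introduced here. The PSNR expression is the standard definition for an n-bit signal.

```python
import numpy as np


def masked_log_loss(recovered, original, om_up, eps=1e-8):
    """Sum of point-wise log errors over occupied pixels only.

    Assumed form: log(squared difference + eps). eps is a hypothetical
    guard against log(0) and is not part of the source.
    """
    # Multiply by the occupancy bitmap so empty pixels contribute nothing.
    diff = (recovered - original) * om_up
    occupied = om_up > 0
    return float(np.sum(np.log(diff[occupied] ** 2 + eps)))


def psnr(mse, n_bits=8):
    """Standard PSNR in dB for a signal with n_bits per sample."""
    peak = (2 ** n_bits - 1) ** 2
    return 10.0 * np.log10(peak / mse)


def incremental_psnr(mse_recovered, mse_compressed, n_bits=8):
    """PSNR gain of the recovery block over the input compression block."""
    return psnr(mse_recovered, n_bits) - psnr(mse_compressed, n_bits)


# Hypothetical 2x2 block: the top row is occupied, the bottom row is empty.
original = np.array([[1.0, 4.0], [0.0, 0.0]])
recovered = np.array([[3.0, 7.0], [100.0, -50.0]])
om_up = np.array([[1.0, 1.0], [0.0, 0.0]])

loss = masked_log_loss(recovered, original, om_up)
gain = incremental_psnr(mse_recovered=1.0, mse_compressed=4.0)
print(loss, gain)
```

The large values in the empty bottom row have no influence on `loss`, which is the filtering effect claim 4 describes. The peak term cancels in `incremental_psnr`, so the gain depends only on the ratio of the two MSE values. For claim 5 the loss would presumably be evaluated for both the far layer and the near layer blocks; how the two terms are combined is not recoverable from the text and is left out here.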

Description

Wide code rate dynamic point cloud coding-oriented occupancy bitmap guide artifact removal method

Technical Field

The invention relates to an occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding, and belongs to the technical field at the intersection of image and video processing and machine learning.

Background

In recent years, point clouds have been widely used to describe three-dimensional objects. To better support the transmission and storage of point clouds in 3D applications, MPEG initiated the development of point cloud compression standards. Video-based point cloud compression is one of the standards developed for dynamic point cloud compression. In the video-based dynamic point cloud coding method, a block projection method is used in order to fully exploit temporal correlation: the geometry and attribute information of the point cloud is projected onto two-dimensional video, which is then compressed with video coding technologies such as HEVC and VVC. To build the video frames, adjacent points with similar normal vectors are first grouped into geometric projection blocks, and the blocks are then packed into a two-dimensional grid to construct the video. However, the geometric projection blocks are generally irregular in shape and there are gaps between blocks, so the generated video contains a large number of empty pixels. To facilitate encoding and decoding, the empty pixels are filled with adjacent non-empty pixels, and an occupancy bitmap (OM) must be provided to the decoder to indicate which pixels are empty. Lossy video compression in the video-based dynamic point cloud coding method introduces compression artifacts, which reduce the quality of the reconstructed point cloud.
Traditional compression artifact removal methods mainly exploit local prior knowledge or non-local similarity of the image, for example autoregressive models and sparse representation. However, these methods are often limited by the inefficiency of hand-crafted features and the strong assumptions of prior models. In recent years, owing to their strong ability to learn representations, deep-learning-based models such as convolutional neural networks (CNNs) have been widely used for compression artifact removal as well as image denoising. The main idea of CNN-based denoising is to predict the noise by extracting local context features. The goal of training is to minimize an error between the noisy signal and the original signal, such as the mean square error (MSE) or the mean absolute error (MAE). CNN-based models achieve excellent performance in compression artifact removal because of their advantage in extracting local context features of images. However, the video generated by the video-based dynamic point cloud coding method differs from natural image signals, so applying CNNs to artifact removal for video-based dynamic point cloud coding raises two main problems. First, the empty pixels are filled with adjacent non-empty pixels and therefore cannot truly reflect local context information. Second, because an MSE-based training cost updates the learnable parameters of the model unevenly across iterations, the model performs better at low bit rates than at high bit rates. To solve these problems, the invention provides an occupancy-bitmap-guided geometric compression artifact removal method for dynamic point cloud coding that is applicable over a wide bit rate range.
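The imbalance attributed to MSE-based training, and the logarithmic objective that claim 6 derives from the incremental PSNR, can be made concrete with a short derivation. This is a reconstruction under assumptions rather than the formulas of the patent, which are not reproduced in this text: the symbols $e_i$ (MSE of the recovery block of sample $i$), $c_i$ (MSE of the input compression block of sample $i$), and $\theta$ (model parameters) are introduced here for illustration, while $n$, $m$, and $\Delta\mathrm{PSNR}_i$ follow the wording of claim 6.

```latex
% Assumed notation (not from the source):
%   e_i   = MSE of the recovery block of sample i
%   c_i   = MSE of the input compression block of sample i
%   theta = model parameters, n = number of signal bits, m = number of samples
\begin{align}
\mathrm{PSNR} &= 10\log_{10}\frac{(2^{n}-1)^{2}}{\mathrm{MSE}} \\
\Delta\mathrm{PSNR}_i
  &= 10\log_{10}\frac{(2^{n}-1)^{2}}{e_i} - 10\log_{10}\frac{(2^{n}-1)^{2}}{c_i}
   = 10\left(\log_{10} c_i - \log_{10} e_i\right) \\
\max_{\theta}\sum_{i=1}^{m}\Delta\mathrm{PSNR}_i
  &\;\Longleftrightarrow\; \min_{\theta}\sum_{i=1}^{m}\ln e_i
  \qquad\text{(each } c_i \text{ is constant within an iteration)} \\
\frac{\partial \ln e_i}{\partial\theta}
  &= \frac{1}{e_i}\,\frac{\partial e_i}{\partial\theta}
\end{align}
```

Changing the base of the logarithm only rescales the objective by a positive constant, so it does not change the minimizer. The last line indicates why the logarithm rebalances training: each sample's gradient is scaled by $1/e_i$, so samples with large errors, which are typical of low bit rates, no longer dominate the parameter update as they do under a plain sum of MSE terms.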
However, the large number of empty pixels in the video, filled with neighboring non-empty pixels, cannot truly reflect local context information and can reduce the accuracy of noise prediction in compression artifact removal. Furthermore, models trained on the mean square error (MSE) perform better at low bit rates than at high bit rates, because the learnable parameters of the model are updated unevenly across iterations. Therefore, the invention provides a learning-based geometric compression artifact removal method for wide-bit-rate dynamic point cloud coding, which mainly comprises a context feature extraction scheme based on the occupancy bitmap and a training method based on the incremental peak signal-to-noise ratio (PSNR).

Disclosure of Invention

The invention aims to provide an occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding, in order to overcome the defects of the prior art, namely that local context information cannot be truly reflected and that the learnable parameters of the model are updated unevenly across iterations. An occupancy-bitmap-guided artifact removal method for wide code rate dynamic point cloud coding, the method comprising: extracting an occupancy bitmap from a compression block of video-based dynamic point cloud coding, and upsampling the occupancy bitmap to ensure that the size