US-12626336-B2 - Apparatus and method for improving image quality using James-Stein combination equation
Abstract
The present disclosure relates to an apparatus and method for improving image quality by combining an unbiased image and a biased image using a James-Stein combiner. An apparatus for improving image quality using a James-Stein combiner according to the present disclosure includes an unbiased image buffer configured to output an unbiased image block, a biased image buffer configured to output a biased image block, a sample variance buffer configured to output a sample variance of the unbiased image, an artificial neural network model configured to derive a variance weight per pixel, a variance estimator configured to output a variance estimate of the unbiased image block, and a James-Stein combiner configured to locally combine the biased image block, the unbiased image block, and the variance estimate of the unbiased image block by applying a James-Stein combination equation.
Inventors
- Bochang Moon
- Jeongmin GU
Assignees
- GWANGJU INSTITUTE OF SCIENCE AND TECHNOLOGY
Dates
- Publication Date: 2026-05-12
- Application Date: 2023-10-05
- Priority Date: 2022-10-19
Claims (20)
- 1 . An apparatus for improving image quality using a James-Stein combination equation, the apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, cause the processor to: output an unbiased image block centered on combination pixel c in an unbiased image; output a biased image block centered on combination pixel c in a biased image obtained by removing noise from the unbiased image; output a sample variance of the unbiased image; derive a variance weight per pixel by learning the biased image block, the unbiased image block, and the sample variance using an artificial neural network model; output a variance estimate of the unbiased image block based on the variance weight per pixel; and locally combine the biased image block, the unbiased image block, and the variance estimate of the unbiased image block by applying the James-Stein combination equation, wherein an error of an image combined by applying the James-Stein combination equation is smaller than an error of the unbiased image.
- 2 . The apparatus of claim 1 , wherein the instructions further cause the processor to: output a combined image based on a resulting value of the James-Stein combination equation.
- 3 . The apparatus of claim 2 , wherein the instructions further cause the processor to: average resulting values of the James-Stein combination equation derived for all pixels c belonging to an image block Ωi of any pixel i, and estimate the averaged value as a color value of the pixel i.
- 4 . The apparatus of claim 1 , wherein the unbiased image is a rendering image by a path tracing method.
- 5 . The apparatus of claim 1 , wherein the biased image is an image obtained by removing noise from the unbiased image by at least one of a kernel predicting convolution network (KPCN) method and an auxiliary feature guided self-attention (AFGSA) method.
- 6 . The apparatus of claim 1 , wherein the artificial neural network model is implemented as U-Net.
- 7 . An apparatus for improving image quality using a James-Stein combination equation, the apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, cause the processor to: output an unbiased image block centered on combination pixel c in an unbiased image; output a first modified biased image block and a second modified biased image block centered on combination pixel c, respectively, in a first modified biased image and a second modified biased image modified to remove noise from two mutually independent unbiased images and to include image feature information, respectively; output a sample variance of the unbiased image; derive an alpha value and a variance weight per pixel by learning the first modified biased image block, the second modified biased image block, the unbiased image block, and the sample variance using an artificial neural network model; output a variance estimate of the unbiased image block based on the variance weight per pixel; derive an alpha blended biased image block in which the first modified biased image block and the second modified biased image block are combined by performing alpha blending based on the first modified biased image block, the second modified biased image block, and the alpha value; and locally combine the alpha blended biased image block, the unbiased image block, and the variance estimate of the unbiased image block by applying the James-Stein combination equation, wherein an error of an image combined by applying the James-Stein combination equation is smaller than an error of the unbiased image.
- 8 . The apparatus of claim 7 , wherein the instructions further cause the processor to: perform alpha blending by applying the first modified biased image block (ŷᵢᴬ), the second modified biased image block (ŷᵢᴮ), and the alpha value (αᵢ) to Equation 8 below: ŷᵢ* = αᵢ·ŷᵢᴬ + (1 − αᵢ)·ŷᵢᴮ (Equation 8), where αᵢ is an alpha value of pixel i output from the artificial neural network model, and is a value between 0 and 1.
- 9 . The apparatus of claim 8 , wherein the first modified biased image is derived by reflecting image features derived as a second feature set to the first biased image, and the second modified biased image is derived by reflecting image features derived as a first feature set to the second biased image, and the first feature set includes rendering-related image features of the first biased image, and the second feature set includes rendering-related image features of the second biased image.
- 10 . The apparatus of claim 7 , wherein, of the two mutually independent unbiased images, one unbiased image includes an average of the first half of the samples, and the other unbiased image includes an average of the second half of the samples.
- 11 . The apparatus of claim 7 , wherein the instructions further cause the processor to: output a combined image based on a resulting value of the James-Stein combination equation.
- 12 . The apparatus of claim 11 , wherein the instructions further cause the processor to: average resulting values of the James-Stein combination equation derived for all pixels c belonging to an image block Ωi of any pixel i, and estimate the averaged value as a color value of the pixel i.
- 13 . The apparatus of claim 7 , wherein the unbiased image is a rendering image by a path tracing method.
- 14 . The apparatus of claim 7 , wherein the biased image is an image obtained by removing noise from the unbiased image by at least one of a kernel predicting convolution network (KPCN) method and an auxiliary feature guided self-attention (AFGSA) method.
- 15 . The apparatus of claim 7 , wherein the artificial neural network model is implemented as U-Net.
- 16 . A method of improving image quality using a James-Stein combination equation implemented by at least one processor of a computer system, the method comprising: determining, by the computer system, an unbiased image block centered on combination pixel c in an unbiased image and a sample variance of the unbiased image block; determining, by the computer system, a biased image block centered on combination pixel c in a biased image obtained by removing noise from the unbiased image; deriving, by the computer system, a variance weight per pixel by learning the biased image block, the unbiased image block, and the sample variance in an artificial neural network model; calculating, by the computer system, a variance estimate of the unbiased image block based on the variance weight per pixel; locally combining, by the computer system, the biased image block, the unbiased image block, and the variance estimate of the unbiased image block by applying the James-Stein combination equation; and outputting, by the computer system, a combined image based on a resulting value of the James-Stein combination equation, wherein an error of an image combined by the James-Stein combination equation is smaller than an error of the unbiased image.
- 17 . The method of claim 16 , wherein the outputting of the combined image based on the resulting value of the James-Stein combination equation by the computer system comprises averaging resulting values of the James-Stein combination equation derived for all pixels c belonging to an image block Ωi of any pixel i, and estimating the averaged value as a color value of the pixel i.
- 18 . The method of claim 16 , wherein the unbiased image is a rendering image by a path tracing method.
- 19 . The method of claim 16 , wherein the biased image is an image obtained by removing noise from the unbiased image by at least one of a kernel predicting convolution network (KPCN) method and an auxiliary feature guided self-attention (AFGSA) method.
- 20 . The method of claim 16 , wherein the artificial neural network model is implemented as U-Net.
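The claims invoke the James-Stein combination equation without reproducing it. As a minimal sketch, assuming the classical positive-part James-Stein shrinkage form (the function name, flattened block layout, and exact formula here are illustrative assumptions, not the patent's disclosed equation), combining an unbiased block with a biased block could look like:

```python
import numpy as np

def james_stein_combine(unbiased_block, biased_block, variance_estimate):
    """Positive-part James-Stein combination: shrink the unbiased block z
    toward the biased (denoised) block y by an amount governed by the
    variance estimate of z. Hypothetical form for illustration only."""
    z = np.asarray(unbiased_block, dtype=np.float64).ravel()
    y = np.asarray(biased_block, dtype=np.float64).ravel()
    d = z.size  # block dimensionality; JS dominates the plain mean only for d >= 3
    diff = z - y
    norm_sq = float(np.dot(diff, diff)) + 1e-12  # guard against division by zero
    shrink = max(0.0, 1.0 - (d - 2) * variance_estimate / norm_sq)
    # shrink == 1 returns the unbiased block; shrink == 0 returns the biased block
    return y + shrink * diff
```

When the variance estimate is small relative to the block's deviation from the biased image, the output stays close to the unbiased block; a large variance estimate shrinks it toward the biased (denoised) block, mirroring the error-reduction property recited in the claims.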
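Equation 8 of claim 8 is a per-pixel convex combination of the two modified biased image blocks. A minimal sketch, assuming per-pixel NumPy arrays and clipping α to [0, 1] (the clipping is an assumption; the claim only states that α lies between 0 and 1):

```python
import numpy as np

def alpha_blend(block_a, block_b, alpha):
    """Equation 8: per-pixel convex combination of the first and second
    modified biased image blocks, with alpha output by the network."""
    alpha = np.clip(np.asarray(alpha, dtype=np.float64), 0.0, 1.0)
    return alpha * np.asarray(block_a) + (1.0 - alpha) * np.asarray(block_b)
```

With α = 1 the blend reproduces the first block, with α = 0 the second; intermediate values interpolate per pixel.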
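Claims 3, 12, and 17 recover the final color of pixel i by averaging the James-Stein results over every block Ωi that covers it. A sketch under assumed square (2r+1)×(2r+1) blocks with image-boundary clipping (block shape and padding policy are not specified in the claims):

```python
import numpy as np

def aggregate_overlapping_blocks(combined_blocks, height, width, radius=1):
    """Average per-block James-Stein results over all blocks covering each
    pixel. combined_blocks maps a center (cy, cx) to its (2r+1)x(2r+1)
    combined block; pixels covered by no block are returned as 0."""
    accum = np.zeros((height, width))
    count = np.zeros((height, width))
    for (cy, cx), block in combined_blocks.items():
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = cy + dy, cx + dx
                if 0 <= y < height and 0 <= x < width:  # clip at image edges
                    accum[y, x] += block[dy + radius, dx + radius]
                    count[y, x] += 1
    return accum / np.maximum(count, 1)  # mean over all covering blocks
```

Each pixel's color is thus the mean of every combination result in which it participated, as the averaging claims describe.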
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of Korean Patent Application No. 10-2022-0134714 filed on Oct. 19, 2022, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.

BACKGROUND

1. Field of the Invention

The present disclosure relates to an apparatus and method for improving image quality, and more particularly, to an apparatus and method for improving image quality by combining an unbiased image and a biased image using a James-Stein combiner.

2. Description of the Related Art

Monte Carlo rendering is a widely used method for generating photorealistic images in games and film production. To obtain high-quality images during the rendering process, the number of samples per pixel (spp) must be increased, which leads to an increase in rendering time. FIG. 1 is a diagram comparing the quality of images resulting from traditional Monte Carlo rendering according to the number of samples, wherein (a) is a rendering result using 2,000 (2 k) samples per pixel, and (b) is a rendering result using 64,000 (64 k) samples per pixel. With 2,000 samples per pixel, the rendering result contains substantial noise. Since a Monte Carlo rendering method such as path tracing is unbiased and reaches the ground truth only when an infinite number of samples is used per pixel, noise (Monte Carlo variance) remains whenever the number of samples is finite. In addition, because a large number of samples is required to obtain a high-quality image, generating a noise-free image takes a long time. Accordingly, various approaches have been attempted that render a noisy image using Monte Carlo rendering with a limited number of samples and then remove the noise through deep learning.
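The background's point that Monte Carlo noise shrinks only as the sample count grows can be seen in a toy sketch, where a "radiance sample" is just a uniform random value (a stand-in assumption, not a real path tracer) and the empirical variance of the pixel estimate falls roughly as 1/spp:

```python
import random

def mc_pixel_estimate(sample_fn, spp, seed=0):
    """Average spp independent radiance samples: an unbiased estimator
    whose variance decreases as 1/spp (toy stand-in for path tracing)."""
    rng = random.Random(seed)
    return sum(sample_fn(rng) for _ in range(spp)) / spp

def empirical_variance(sample_fn, spp, trials=2000):
    """Estimate the variance of the pixel estimator over repeated trials."""
    ests = [mc_pixel_estimate(sample_fn, spp, seed=t) for t in range(trials)]
    mean = sum(ests) / trials
    return sum((e - mean) ** 2 for e in ests) / (trials - 1)
```

Raising spp from 4 to 64 cuts the estimator variance by roughly a factor of 16, which is exactly why noise-free images require so many samples, and so much time, without a denoiser.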
As a conventional technique for removing noise from a Monte Carlo rendering image using deep learning, Bako et al. proposed kernel-predicting convolutional networks (KPCN), which infer per-pixel weights of a general denoising kernel (Non-Patent Document 1). Yu et al. proposed auxiliary feature guided self-attention (AFGSA), a technology that effectively removes Monte Carlo rendering noise using a transformer-based network (Non-Patent Document 2). FIG. 2 is a graph illustrating the noise removal effect of existing denoising techniques for Monte Carlo rendering images versus the number of samples per pixel (spp), wherein (a) is a graph for a curly-hair image, (b) is a graph for a veach-ajar image, and (c) is a graph for a glass-of-water image. For each image, the relative L2 error with respect to the ground truth image is plotted while the noise of the Monte Carlo rendering image is removed by the path tracing (PT) method, the KPCN method of Non-Patent Document 1, and the AFGSA method of Non-Patent Document 2. For all images (see (a), (b), and (c) of FIG. 2), the relative error of the PT method decreases linearly as the number of samples per pixel increases. The PT method is thus a consistent estimation method, although its result may be limited by certain parameters. In contrast, for the curly-hair image (see (a) of FIG. 2) and the glass-of-water image (see (c) of FIG. 2), the relative errors of the KPCN and AFGSA methods become larger than that of the PT method as spp increases. For the veach-ajar image (see (b) of FIG. 2), the error reduction effect of the KPCN and AFGSA methods diminishes even as spp increases. In all images, the error reduction of the KPCN and AFGSA methods is inconsistent with respect to the number of samples per pixel.
In summary, conventional deep learning-based denoisers for Monte Carlo rendering cannot guarantee consistency: even when the number of samples per pixel is increased, the relative error may grow beyond that of the PT method, or the error reduction effect may diminish.

PRIOR ART DOCUMENT

Non-Patent Document

(Non-Patent Document 0001) Document 1: Steve Bako, Thijs Vogels, Brian McWilliams, Mark Meyer, Jan Novak, Alex Harvill, Pradeep Sen, Tony Derose, and Fabrice Rousselle. 2017. Kernel-predicting convolutional networks for denoising Monte Carlo renderings. ACM Trans. Graph. 36, 4 (2017), 14 pages.

(Non-Patent Document 0002) Document 2: Jiaqi Yu, Yongwei Nie, Chengjiang Long, Wenju Xu, Qing Zhang, and Guiqing Li. 2021. Monte Carlo Denoising via Auxiliary Feature Guided Self-Attention. ACM Trans. Graph. 40, 6 (2021), 13 pages.

(Non-Patent Document 0003) Document 3: Willard James and Charles Stein. 1961. Estimation with quadratic loss. In Proc. 4th Berkeley Symp. Mathematical Statistics Probability, Vol.