JP-2026075434-A - Information processing device and its control method
Abstract
[Problem] To improve the performance of a demosaicing processing unit. [Solution] The information processing device includes: acquisition means for acquiring a first mosaic image and a second mosaic image generated based on the same original image; estimation means for inputting the first mosaic image and the second mosaic image to a demosaicing processing unit and generating a first estimated image and a second estimated image in which pixel values have been interpolated; calculation means for calculating a first reconstruction error in the first estimated image and a second reconstruction error in the second estimated image; and update means for updating the parameters of the demosaicing processing unit based on the first and second reconstruction errors. The first mosaic image is a mosaic image in which the thinning pattern for generating the mosaic image and the original image are in a first positional relationship, and the second mosaic image is a mosaic image in which the thinning pattern and the original image are in a second positional relationship different from the first positional relationship. [Selected Drawing] Figure 2
Inventors
- 井関 茜
Assignees
- キヤノン株式会社
Dates
- Publication Date
- 20260508
- Application Date
- 20241022
Claims (13)
- An information processing apparatus comprising: an acquisition means for acquiring a first mosaic image and a second mosaic image generated based on the same original image; an estimation means for inputting the first mosaic image and the second mosaic image into a demosaicing processing unit to generate a first estimated image and a second estimated image in which pixel values have been interpolated; a calculation means for calculating a first reconstruction error in the first estimated image and a second reconstruction error in the second estimated image; and an update means for updating the parameters of the demosaicing processing unit based on the first reconstruction error and the second reconstruction error, characterized in that the first mosaic image is a mosaic image in which the thinning pattern for generating the mosaic image and the original image are in a first positional relationship, and the second mosaic image is a mosaic image in which the thinning pattern and the original image are in a second positional relationship different from the first positional relationship.
- The information processing apparatus according to claim 1, characterized in that the acquisition means acquires a first image obtained by applying a first geometric transformation to the original image, a second image obtained by applying a second geometric transformation different from the first geometric transformation to the original image, and the first mosaic image and the second mosaic image obtained by applying the same thinning pattern to the first image and the second image, respectively, and the calculation means calculates the first reconstruction error, which is the difference between the first estimated image and the first image, and the second reconstruction error, which is the difference between the second estimated image and the second image.
- The information processing apparatus according to claim 2, characterized in that the first geometric transformation and the second geometric transformation are rotation transformations in which the rotation angles are different from each other.
- The information processing apparatus according to claim 1, characterized in that the acquisition means acquires a first image and a second image obtained by applying the same geometric transformation to the original image, the first mosaic image obtained by applying a first thinning pattern to the first image, and the second mosaic image obtained by applying a second thinning pattern different from the first thinning pattern to the second image, and the calculation means calculates the first reconstruction error, which is the difference between the first estimated image and the first image, and the second reconstruction error, which is the difference between the second estimated image and the second image.
- The information processing apparatus according to claim 1, characterized in that the updating means updates the parameters of the neural network by backpropagation so that the weighted sum of the first reconstruction error and the second reconstruction error is less than or equal to a predetermined threshold.
- The information processing apparatus according to claim 1, further comprising an influence calculation means for calculating a degree of conversion influence, which is the difference between the first estimated image and the second estimated image, characterized in that the update means further updates the parameters based on the degree of conversion influence.
- The information processing apparatus according to claim 1, further comprising: a storage means for storing a plurality of first images obtained by applying a first geometric transformation to a plurality of mutually different original images, a plurality of second images obtained by applying a second geometric transformation different from the first geometric transformation to the plurality of original images, and a plurality of first mosaic images and a plurality of second mosaic images obtained by applying the same thinning pattern to the plurality of first images and the plurality of second images; and a selection means for selecting, from the storage means, an image group which is a subset of the plurality of first images, the plurality of second images, the plurality of first mosaic images, and the plurality of second mosaic images, characterized in that the selection means selects the image group such that the variation in the gradient direction of the high-frequency components in the image group is greater than a threshold, and the acquisition means acquires the image group.
- The information processing apparatus according to claim 7, characterized in that the selection means selects an image from among the plurality of first images and the plurality of second images in which the degree of degradation from the original image is less than or equal to a predetermined threshold.
- The information processing apparatus according to claim 7, characterized in that the selection means selects images such that the number of types of original images corresponding to the plurality of first images or the plurality of second images exceeds a predetermined threshold.
- The information processing apparatus according to claim 1, characterized in that the demosaicing processing unit is composed of a neural network (NN), and the parameters are the weight coefficients of the filter.
- The information processing apparatus according to claim 1, characterized in that the thinning pattern is a Bayer array.
- A method for controlling an information processing device that learns a machine learning-based demosaicing process, the method including: an acquisition step of acquiring a first mosaic image and a second mosaic image generated based on the same original image; an estimation step of inputting the first mosaic image and the second mosaic image into a demosaicing processing unit to generate a first estimated image and a second estimated image in which pixel values have been interpolated; a calculation step of calculating a first reconstruction error in the first estimated image and a second reconstruction error in the second estimated image; and an update step of updating the parameters of the demosaicing processing unit based on the first reconstruction error and the second reconstruction error, characterized in that the first mosaic image is a mosaic image in which the thinning pattern for generating the mosaic image and the original image are in a first positional relationship, and the second mosaic image is a mosaic image in which the thinning pattern and the original image are in a second positional relationship different from the first positional relationship.
- A program for causing a computer to execute the control method described in claim 12.
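As a rough, non-authoritative illustration of the learning recited in the claims above, the following Python sketch builds two images from one original by rotation (0 and 90 degrees), applies the same thinning pattern to both, and updates parameters using a weighted sum of the two reconstruction errors. Everything concrete here is an assumption made for illustration and is not taken from the publication: the use of NumPy, the RGGB tile, the toy linear 3x3 filter standing in for the demosaicing processing unit (a real NN would be trained by backpropagation, whereas this sketch uses the analytic gradient of a linear model), mean squared error as the reconstruction error, the equal weights, and all function names.

```python
import numpy as np

TILE = np.array([[0, 1], [1, 2]])  # assumed RGGB tile: 0=R, 1=G, 2=B

def bayer_mosaic(rgb):
    """Apply one fixed thinning pattern: keep a single color sample per pixel."""
    h, w, _ = rgb.shape
    mask = TILE[(np.arange(h) % 2)[:, None], (np.arange(w) % 2)[None, :]]
    return np.take_along_axis(rgb, mask[..., None], axis=2)[..., 0]

def patches(mosaic):
    """Gather the 3x3 neighborhood of every pixel into shape (H, W, 9)."""
    h, w = mosaic.shape
    p = np.pad(mosaic, 1, mode="reflect")
    return np.stack([p[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)], axis=-1)

def weighted_loss_and_grad(params, pairs, weights):
    """Weighted sum of per-image reconstruction errors (MSE) and its gradient."""
    total, grad = 0.0, np.zeros_like(params)
    for (mosaic, target), wgt in zip(pairs, weights):
        x = patches(mosaic)
        err = x @ params - target  # estimated image minus transformed image
        total += wgt * np.mean(err ** 2)
        grad += wgt * 2.0 * np.einsum("hwk,hwc->kc", x, err) / err.size
    return total, grad

def train(original, steps=200, lr=0.1, threshold=0.0, weights=(0.5, 0.5)):
    """Gradient descent on a toy linear stand-in for the demosaicing unit."""
    # Two geometric transformations of the same original image.
    images = [original, np.rot90(original)]
    # Same thinning pattern, hence two different positional relationships.
    pairs = [(bayer_mosaic(img), img) for img in images]
    params = np.zeros((9, 3))  # 3x3 filter taps for each of R, G, B
    history = []
    for _ in range(steps):
        loss, grad = weighted_loss_and_grad(params, pairs, weights)
        history.append(loss)
        if loss <= threshold:  # stop once the weighted sum is small enough
            break
        params -= lr * grad
    return params, history
```

Calling `train(np.random.default_rng(0).random((8, 8, 3)))` returns the fitted filter taps and the loss history. A position-independent linear filter is far too weak to demosaic well; it is used only so that the two-error update can be followed end to end.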
Description
This invention relates to image processing technology using machine learning.

In a digital camera, the subject image formed by an optical system such as a lens is captured as image information by measuring the amount of light with an image sensor in which multiple pixels are arranged spatially. Generally, the pixels of the image sensor cannot themselves distinguish colors. Therefore, a color filter in which different colors are arranged in a mosaic pattern is used so that each pixel receives light of one specific color. For example, by using a color filter with red (R), green (G), and blue (B) arranged in a mosaic pattern, a mosaic image is obtained in which each pixel corresponds to one of the RGB colors. The camera's development processing unit performs signal processing on this mosaic image, including demosaicing, which interpolates the pixel values of the two missing colors at each pixel (missing pixel values), to generate and output a color image (RGB image).

Generally, demosaicing calculates the pixel values of the two missing colors by rule-based interpolation using the pixel values of one or more surrounding pixels. However, rule-based interpolation has difficulty reproducing high spatial frequency components. In recent years, machine learning-based demosaicing using, for example, neural networks (NNs), as exemplified in Non-Patent Document 1, has been studied. In machine learning-based demosaicing, inference that converts a mosaic image into an RGB image is performed using a model trained on a large number of RGB images. Compared to rule-based methods, machine learning-based methods reproduce high spatial frequency components better and can estimate high-quality color images.
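The mosaic acquisition and rule-based interpolation described in the background above can be illustrated with a minimal Python sketch. It assumes NumPy, an RGGB Bayer tile, and a simple four-neighbor average for the missing green values; these choices and all function names are hypothetical illustrations and do not come from the publication.

```python
import numpy as np

def bayer_mask(height, width):
    """Channel index kept at each pixel for an assumed RGGB tile (0=R, 1=G, 2=B)."""
    tile = np.array([[0, 1], [1, 2]])
    return tile[(np.arange(height) % 2)[:, None], (np.arange(width) % 2)[None, :]]

def make_mosaic(rgb):
    """Simulate the color filter: keep one color sample per pixel."""
    mask = bayer_mask(*rgb.shape[:2])
    mosaic = np.take_along_axis(rgb, mask[..., None], axis=2)[..., 0]
    return mosaic, mask

def interpolate_green(mosaic, mask):
    """Rule-based estimate of G: average the four adjacent G samples where G is missing."""
    # Reflect padding preserves the Bayer parity, so the four neighbors of
    # every R or B pixel are G samples even at the image border.
    padded = np.pad(mosaic.astype(float), 1, mode="reflect")
    neighbors = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                 + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    return np.where(mask == 1, mosaic, neighbors)
```

On a flat image this averaging is exact, but across fine edges and textures it blurs or produces false colors, which is the weakness in high spatial frequency reproduction noted above.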
Non-Patent Document 1: Gharbi et al., "Deep Joint Demosaicking and Denoising", SIGGRAPH Asia 2016

- This diagram shows the hardware configuration of an information processing device.
- This figure shows the functional configuration of the information processing device (first embodiment).
- This is a flowchart of the inference process.
- This is a flowchart of the learning process (first embodiment).
- This diagram illustrates the creation of a mosaic image based on a thinning pattern.
- This diagram illustrates the processing flow of a demosaicing neural network.
- This figure shows the functional configuration of the information processing device (second embodiment).
- This is a flowchart of the learning process (second embodiment).
- This figure shows the functional configuration of the information processing device (third embodiment).
- This is a flowchart of the learning process (third embodiment).

The embodiments will be described in detail below with reference to the attached drawings. Note that the following embodiments do not limit the invention as defined in the claims. While multiple features are described in the embodiments, not all of these features are essential to the invention, and the features may be combined in any way. Furthermore, in the attached drawings, identical or similar configurations are given the same reference numerals, and redundant descriptions are omitted.

(First Embodiment)
As a first embodiment of the information processing device according to the present invention, an information processing device that trains a neural network for demosaicing (demosaicing NN) will be described below as an example. In the following description, demosaicing is described as generating an ordinary color image (an image with the three RGB components) from a mosaic image with a Bayer (RGB) array, but the array and color configuration are not limited to these.
<Overview>
When the relative positional relationship between the original image and the pixel arrangement of each color in the mosaic image changes, the colors that are thinned out at each pixel change, and so does the difficulty of interpolating the missing pixel values in demosaicing. As a result, even for the same original image, the demosaicing result (color and image quality) changes. Therefore, the demosaicing neural network is trained to mitigate such changes in the demosaicing result. Specifically, two or more mosaic images are created from the original image, each with a different relative positional relationship between the original image and the pixel arrangement of each color in the mosaic image. Then, the parameters of the demosaicing neural network are updated by backpropagation so that the reconstruction error between the demosaicing result (inference result) obtained from each mosaic image and the original image is minimized.

<Device configuration>
Figure 1 shows the hardware configuration of an information processing device that performs training of the demosaicing NN and/or inference using the demosaicing NN. The information processing device may be configured as, for example, a computer. The CPU 101 controls the entire device by executing a control program stored in the read-only memory (ROM) 102. The