JP-7856717-B2 - Image processing method, image processing device, program, method for manufacturing a trained machine learning model, processing device, image processing system

Inventors

日浅法人
木村良範
楠美祐一

Assignees

キヤノン株式会社

Dates

Publication Date: 20260511
Application Date: 20240919

Claims (19)

An image processing method characterized by comprising: a step of acquiring a captured image obtained by imaging, and information regarding the spread of a point spread function or a modulation transfer function representing the resolution performance of the optical instrument used for the imaging; and a step of generating, using a machine learning model and based on the captured image and the information, an output image having a smaller sampling pitch than the captured image.
The image processing method according to claim 1, characterized in that the output image is an image obtained by enlarging or demosaicing the captured image.
The image processing method according to claim 1 or 2, characterized in that the information differs depending on the position of the pixels in the captured image.
The image processing method according to any one of claims 1 to 3, characterized in that the information is a map in which values are arranged in a size corresponding to the number of pixels in the captured image.
The image processing method according to claim 4, characterized in that each of the values is based on the frequency at which the modulation transfer function takes a predetermined value.
The image processing method according to claim 4 or 5, characterized in that the information has a plurality of channel components representing different resolution performance components for the same pixel of the captured image.
The image processing method according to any one of claims 1 to 6, characterized in that the information is obtained using the type of the optical instrument or the state of the optical instrument during the imaging, and the state is information relating to at least one of the focal length, F-number, and focus distance.
The image processing method according to any one of claims 1 to 7, characterized in that the information is obtained using information relating to the pixel pitch of the image sensor used for the imaging.
The image processing method according to any one of claims 1 to 8, characterized in that the output image is an image in which the blur caused by the optical instrument in the captured image has been corrected.
The image processing method according to any one of claims 1 to 9, characterized in that the step of generating the output image generates the output image further based on information regarding the noise of the captured image.
The image processing method according to claim 10, characterized in that the information regarding the noise includes at least one of information regarding the intensity of the noise generated during the imaging and information regarding denoising performed on the captured image.
The image processing method according to any one of claims 1 to 11, characterized in that the step of generating the output image generates the output image based on the captured image and the information concatenated in the channel direction.
The image processing method according to any one of claims 1 to 12, characterized in that the machine learning model has one or more residual blocks.
The image processing method according to any one of claims 1 to 13, characterized in that, in the step of generating the output image, the output image is generated by adding a second intermediate image, which is generated using the captured image and the information and has a smaller sampling pitch than the captured image, to a first intermediate image generated by reducing the sampling pitch of the captured image without using the information.
A program characterized by causing a computer to execute the image processing method described in any one of claims 1 to 14.
An image processing device characterized by having: an acquisition means for acquiring a captured image obtained by imaging, and information regarding the spread of a point spread function or a modulation transfer function representing the resolution performance of the optical instrument used for the imaging; and a generation means for generating, using a machine learning model and based on the captured image and the information, an output image having a smaller sampling pitch than the captured image.
A method for manufacturing a trained machine learning model, characterized by comprising: a step of acquiring a first image, information regarding the spread of a point spread function or a modulation transfer function representing the resolution performance of an optical instrument corresponding to the first image, and a second image having a smaller sampling pitch than the first image; a step of generating, using a machine learning model and based on the first image and the information, an output image having a smaller sampling pitch than the first image; and a step of updating the weights of the machine learning model using the output image and the second image.
A processing device characterized by having: a data acquisition means for acquiring a first image, information regarding the spread of a point spread function or a modulation transfer function representing the resolution performance of an optical instrument corresponding to the first image, and a second image having a smaller sampling pitch than the first image; a computation means for generating, using a machine learning model and based on the first image and the information, an output image having a smaller sampling pitch than the first image; and an update means for updating the weights of the machine learning model using the output image and the second image.
An image processing system comprising the image processing device according to claim 16 and a control device capable of communicating with the image processing device, characterized in that the control device has means for transmitting a request to perform processing on the captured image, and the image processing device has means for performing the processing on the captured image in response to the request.
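
For illustration only, and not as part of the claims, the data flow recited in claims 4, 12, and 14 can be sketched as follows. The sketch assumes a single-channel image held as nested lists, nearest-neighbor replication as the information-free upsampling that produces the first intermediate image, and a hypothetical placeholder `zero_residual_model` standing in for the trained machine learning model. None of these names or choices come from the specification.

```python
from typing import Callable, List

Image = List[List[float]]          # H x W, single channel
Stack = List[Image]                # C x H x W


def nearest_upsample(img: Image, scale: int) -> Image:
    """First intermediate image: smaller sampling pitch, no resolution info used.

    Nearest-neighbor replication is an assumption made for this sketch.
    """
    height, width = len(img), len(img[0])
    return [[img[y // scale][x // scale] for x in range(width * scale)]
            for y in range(height * scale)]


def concat_channels(img: Image, res_map: Image) -> Stack:
    """Concatenate the captured image and the resolution map in the channel direction."""
    if len(img) != len(res_map) or len(img[0]) != len(res_map[0]):
        raise ValueError("resolution map must match the captured image size")
    return [img, res_map]


def zero_residual_model(stack: Stack, scale: int) -> Image:
    """Hypothetical stand-in for the trained model: returns an all-zero residual."""
    height, width = len(stack[0]), len(stack[0][0])
    return [[0.0] * (width * scale) for _ in range(height * scale)]


def upsample(img: Image, res_map: Image, scale: int,
             model: Callable[[Stack, int], Image]) -> Image:
    """Output image = first intermediate image + second intermediate image."""
    first = nearest_upsample(img, scale)                 # without the information
    second = model(concat_channels(img, res_map), scale)  # with the information
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]
```

With the zero-residual placeholder, `upsample` simply returns the replicated image. A trained model would instead supply the high-frequency correction that the resolution map helps it estimate.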

Description

This invention relates to image processing for reducing the sampling pitch of captured images.

Patent Document 1 discloses a method for generating a high-resolution enlarged image by first enlarging a low-resolution image to the same number of pixels as a high-resolution image using bicubic interpolation, and then inputting this enlarged image into a trained machine learning model. By using a machine learning model trained specifically for image enlargement, higher accuracy can be achieved than with general methods such as bicubic interpolation.

Patent Document 1: U.S. Patent Application Publication No. 2018/0075581

This figure shows the relationship between the modulation transfer function and the Nyquist frequency in Examples 1 and 2.
This is a block diagram of the image processing system in Example 1.
This is an external view of the image processing system in Example 1.
This is a flowchart of the machine learning model training in Example 1.
This diagram illustrates the process of generating an enlarged image in Example 1.
This figure shows the configuration of the machine learning models in Examples 1 and 2.
This is a flowchart for generating an enlarged image in Example 1.
This is a block diagram of the image processing system in Example 2.
This is an external view of the image processing system in Example 2.
This is a flowchart of the machine learning model training in Example 2.
This figure shows the relationship between the color filter array and the Nyquist frequency in Example 2.
This diagram shows the process of generating a demosaiced image in Example 2.
This is a flowchart for generating a demosaiced image in Example 2.

The embodiments of the present invention will be described in detail below with reference to the drawings. In each drawing, identical components are denoted by the same reference numerals, and redundant descriptions are omitted. Before detailing the embodiments, the gist of the present invention will be briefly explained.
In this invention, the process of reducing the sampling pitch of a captured image (hereinafter referred to as upsampling) uses resolution performance information, which is information regarding the resolution performance of the optical instrument used to capture the image. This improves the accuracy of upsampling. To explain why, the problem that arises in upsampling and the mechanism behind it are described in detail below.

When an image formed by an optical system is converted into a captured image by an image sensor, sampling is performed at the pixels of the image sensor. Therefore, among the frequency components forming the subject image, those exceeding the Nyquist frequency of the image sensor are mixed into the low-frequency components by aliasing, resulting in moiré patterns. When a captured image is upsampled, the Nyquist frequency increases because the sampling pitch becomes smaller; ideally, the generated image should be free of aliasing up to this increased Nyquist frequency. However, it is generally difficult for image processing to distinguish whether a structure in a captured image containing moiré is a moiré pattern or the original structure of the subject.

With conventional upsampling methods such as bilinear interpolation, moiré patterns remain even after the captured image is upsampled. In contrast, upsampling using a machine learning model can, to some extent, estimate from the moiré patterns the high-frequency components as they were before aliasing, and can thus potentially remove some of the moiré. However, as mentioned above, distinguishing between moiré and the structure of the subject is difficult. Therefore, even with a machine learning model, some moiré may be misidentified as the subject and remain, while some of the subject may be misidentified as moiré, generating false structures.
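
The aliasing mechanism described above can be illustrated with a small one-dimensional sketch. It is an illustration under assumed values only: the function names, the unit sampling pitch, and the test frequencies 0.7 and 0.3 are hypothetical and do not come from the embodiments. A cosine above the Nyquist frequency yields exactly the same samples as a lower-frequency cosine, which is why the two cannot be told apart from the captured image alone.

```python
import math
from typing import List


def nyquist(sampling_pitch: float) -> float:
    """Nyquist frequency for a given sampling pitch (cycles per unit length)."""
    return 1.0 / (2.0 * sampling_pitch)


def aliased_frequency(f: float, sampling_pitch: float) -> float:
    """Apparent frequency of a sinusoid of frequency f after sampling."""
    fs = 1.0 / sampling_pitch               # sampling frequency
    f_mod = math.fmod(abs(f), fs)           # fold into [0, fs)
    return min(f_mod, fs - f_mod)           # reflect into [0, fs / 2]


def sample_cosine(f: float, sampling_pitch: float, n_samples: int) -> List[float]:
    """Sample cos(2*pi*f*x) at x = 0, pitch, 2*pitch, ..."""
    return [math.cos(2.0 * math.pi * f * n * sampling_pitch)
            for n in range(n_samples)]


pitch = 1.0                                  # assumed pixel pitch (Nyquist = 0.5)
high = sample_cosine(0.7, pitch, 8)          # above Nyquist: aliases
low = sample_cosine(0.3, pitch, 8)           # below Nyquist: represented as-is
indistinguishable = all(math.isclose(a, b, abs_tol=1e-9)
                        for a, b in zip(high, low))

# Halving the pitch (2x upsampling) raises the Nyquist frequency to 1.0, so a
# 0.7 component could be represented without aliasing in the output image.
representable_after_upsampling = aliased_frequency(0.7, pitch / 2.0)
```

Here `indistinguishable` evaluates to true, and the apparent frequency of the 0.7 component at unit pitch is approximately 0.3. An upsampling algorithm therefore has to decide whether a 0.3 structure in the captured image is genuine or a folded 0.7 structure.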
For this reason, this invention uses the resolution performance information of the optical instrument used to capture the image when upsampling the captured image. This will be further explained with reference to Figures 1(A) and 1(B). Figures 1(A) and 1(B) show the frequency characteristics of the modulation transfer function (MTF) representing the resolution performance of an optical instrument. The horizontal axis represents the spatial frequency in a certain direction, and the vertical axis represents the MTF. Figure 1(A) shows a state where the cutoff frequency 003 of the optical instrument (in this specification, the cutoff frequency refers to the frequency above which the MTF becomes 0) is less than or equal to the Nyquist frequency 001. In this case, no moiré patterns exist in the captured image. This is because even if copies of the MTF are arranged periodically at intervals of the sampling frequency 002, no region exists where the copies overlap with each other. Therefore, when the resolution performance corresponds to Figure 1(A), providing this information to the algorithm allows the algorithm to determine that it is not necessary to estimate the high-frequency components b