US-12621574-B2 - Image processing apparatus to generate composite image, control method, and recording medium

Abstract

An image processing apparatus includes at least one processor and at least one memory containing instructions that cause the at least one processor to be configured to function as an acquisition unit, a decision unit, and a generation unit. The acquisition unit is configured to acquire a plurality of images including at least one High Dynamic Range (HDR) image. The decision unit is configured to decide a peak luminance value of the composite image. The generation unit is configured to generate the composite image by executing additive composition processing using the plurality of images. The generation unit controls the additive composition processing so that a signal level of each pixel of the composite image falls within an output dynamic range whose maximum value is set to a signal level corresponding to the peak luminance value decided by the decision unit.

Inventors

Hiroaki Kuchiki

Assignees

CANON KABUSHIKI KAISHA

Dates

Publication Date: 2026-05-05
Application Date: 2023-03-10
Priority Date: 2022-03-16

Claims (13)

1. An image processing apparatus for generating a composite image, comprising at least one processor; and at least one memory containing instructions that, when executed by the at least one processor, cause the at least one processor to be configured to function as following units: an acquisition unit configured to acquire a plurality of images including at least one High Dynamic Range (HDR) image; a decision unit configured to decide a peak luminance value of the composite image; and a generation unit configured to generate the composite image by executing additive composition processing using the plurality of images, wherein the additive composition processing sums signal levels of corresponding pixels from the plurality of images without applying transparency weighting coefficients, and wherein the generation unit controls the additive composition processing so that a signal level of each pixel of the composite image falls within an output dynamic range whose maximum value is set to a signal level corresponding to the peak luminance value decided by the decision unit.
2. The apparatus according to claim 1, wherein the HDR image is an HDR image encoded by a Perceptual Quantization (PQ) method standardized in ITU-R BT.2100, and the decision unit decides a peak luminance value of the HDR image as the peak luminance value of the composite image.
3. The apparatus according to claim 2, wherein if a plurality of HDR images are included in the plurality of images, the decision unit decides a maximum value of peak luminance values of the plurality of HDR images as the peak luminance value of the composite image.
4. The apparatus according to claim 1, wherein the HDR image is an HDR image encoded by a Perceptual Quantization (PQ) method standardized in ITU-R BT.2100, and the decision unit decides a peak luminance value corresponding to an image capturing condition of the HDR image as the peak luminance value of the composite image.
5. The apparatus according to claim 1, wherein the at least one processor further function as an input unit configured to accept an input of the peak luminance value of the composite image, and the decision unit decides the peak luminance value of the composite image based on the input accepted by the input unit.
6. The apparatus according to claim 1, wherein the generation unit generates an intermediate image by performing additive composition of the plurality of images, and generates the composite image by changing, with respect to a pixel whose signal level exceeds the maximum value of the output dynamic range among pixels included in the intermediate image, the signal level to the maximum value.
7. The apparatus according to claim 1, wherein the generation unit generates the composite image by converting a dynamic range of each of the plurality of images so a signal level of each pixel after additive composition does not exceed the maximum value of the output dynamic range, and performing additive composition of the plurality of images after the conversion.
8. The apparatus according to claim 7, wherein the dynamic range is converted to set, as a maximum value of display luminance of each of the plurality of images after conversion, a value obtained by dividing the maximum value of the output dynamic range by the number of the plurality of images.
9. The apparatus according to claim 7, wherein the dynamic range is converted to change a maximum value of display luminance of each of the plurality of images after conversion in accordance with a peak luminance value of each image.
10. The apparatus according to claim 9, wherein the dynamic range is converted with reference to a common conversion characteristic with respect to the plurality of images.
11. The apparatus according to claim 1, wherein the at least one processor further function as an output unit configured to output an image file that associates the composite image generated by the generation unit with the maximum value of the output dynamic range.
12. A control method for an image processing apparatus that generates a composite image, comprising: acquiring a plurality of images including at least one High Dynamic Range (HDR) image; deciding a peak luminance value of the composite image; and generating the composite image by executing additive composition processing using the plurality of images, wherein the additive composition processing sums signal levels of corresponding pixels from the plurality of images without applying transparency weighting coefficients, and wherein in the generating, the additive composition processing is controlled so that a signal level of each pixel of the composite image falls within an output dynamic range whose maximum value is set to a signal level corresponding to the peak luminance value decided in the deciding.
13. A computer-readable recording medium recording a program for causing a computer to function as each unit of an image processing apparatus defined in claim 1.
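
As an illustration only, the two ways of keeping the additive result within the output dynamic range recited in claims 6 and 8 may be sketched as follows. The sketch is not taken from the specification. It assumes that each image is a flat list of per-pixel display luminance values in nits, that all images have the same size, and that the dynamic range conversion of claim 8 is a plain linear scaling (the claims do not fix the conversion curve). The names `decide_peak`, `compose_clip`, and `compose_prescaled` are hypothetical.

```python
from typing import List, Sequence

# Hypothetical representation: one image = flat list of luminance values in nits.
Image = List[float]


def decide_peak(peaks: Sequence[float]) -> float:
    """Pick the largest per-image peak luminance (in the manner of claim 3)."""
    return max(peaks)


def compose_clip(images: Sequence[Image], peak: float) -> Image:
    """Sum corresponding pixels without weighting, then change any sum
    that exceeds the peak to the peak (in the manner of claim 6)."""
    return [min(sum(pixels), peak) for pixels in zip(*images)]


def compose_prescaled(images: Sequence[Image], peak: float) -> Image:
    """Limit each image to at most peak / N before summing, so the sum
    cannot exceed the peak (in the manner of claims 7 and 8).

    Assumption: the conversion is a linear 1/N scaling of values that are
    first limited to the peak; an actual conversion curve may differ.
    """
    n = len(images)
    return [sum(min(v, peak) / n for v in pixels) for pixels in zip(*images)]


# Two three-pixel images with peak luminance 1,000 nits and 400 nits.
images = [[100.0, 800.0, 1000.0], [50.0, 400.0, 400.0]]
peak = decide_peak([1000.0, 400.0])
clipped = compose_clip(images, peak)         # [150.0, 1000.0, 1000.0]
prescaled = compose_prescaled(images, peak)  # [75.0, 600.0, 700.0]
```

In this sketch, clipping after summation preserves the brightness of dark regions but flattens any pixel whose sum exceeds the peak, whereas converting before summation keeps gradation in bright regions at the cost of lowering overall brightness.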

Description

BACKGROUND

Technical Field

One disclosed aspect of the embodiments relates to an image processing apparatus, a control method, and a recording medium and, more particularly, to a technique of generating a composite image using an HDR image.

Description of the Related Art

As a method of generating a composite image of a multiple exposure expression, there is provided additive composition. In additive composition, pixel values of respective pixels of a plurality of images to be composited are added to decide the pixel value of a corresponding pixel of the composite image.

A general sRGB 8-bit image such as a JPEG image is a Standard Dynamic Range (SDR) image, in which the luminance (scene luminance) of a captured scene is represented by a pixel value falling within the range of 0 to 255. If a composite image of a multiple exposure expression is obtained by performing additive composition of SDR images, the output composite image is also an SDR image represented by pixel values each falling within the range of 0 to 255. The SDR image relatively expresses the brightness of an object, and the brightness when the SDR image obtained by additive composition is displayed on the display device tends not to be significantly different from the brightness of the image as the composition target.

On the other hand, in recent years, a display device called an HDR display in which the performance of a light emitting element such as an LED is improved and the display luminance dynamic range is wider than that of a conventional display device has appeared on the market, and the display device can display an image of a gradation expression corresponding to the dynamic range wider than that of the SDR image. Therefore, some image capturing apparatuses can record a High Dynamic Range (HDR) image so that an expression of a detail and color in each luminance range can be confirmed on the display device.

Such an HDR image has, as a pixel value, 10-bit display luminance, that is, display luminance from 0 to 1,023 generally obtained by converting the scene luminance. A signal characteristic representing the relationship between the display luminance and a video signal level in the HDR image is defined by an Electro-Optical Transfer Function (EOTF), and the following two kinds of methods are adopted. One method is a Hybrid Log Gamma (HLG) method standardized in ARIB STD-B67, in which a video signal level is converted into the relative value of the display luminance and the display luminance corresponding to the maximum luminance that can be output from the display device is obtained. The other method is a Perceptual Quantization (PQ) method standardized in SMPTE ST 2084 or ITU-R BT.2100, in which the video signal level is converted into the absolute value of the display luminance within a maximum range of 10,000 nits (or cd/m²). Therefore, when displaying an HDR image obtained by capturing a scene, scene luminance is converted into display luminance corresponding to the maximum luminance that can be output from the display device in the former method, and scene luminance is converted into display luminance that is absolutely determined regardless of the display device in the latter method. Thus, if display on the display device adopting the PQ method is assumed, for example, it is necessary to convert an image signal of scene luminance to indicate an absolute luminance value in encoding in the image capturing apparatus, thereby generating an HDR image. Therefore, in encoding in the PQ method of absolutely representing scene luminance, even if the same scene is captured, a peak luminance value (the maximum value of the display luminance and the maximum value of the output dynamic range) included in the HDR image may change.
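
For reference, the absolute relationship between a PQ video signal level and display luminance described above may be sketched as follows. The constants are those published in SMPTE ST 2084 and ITU-R BT.2100. The function names are hypothetical, and the full-range 10-bit quantization (signal level multiplied by 1,023) is an assumption made for simplicity; narrow-range quantization is also in common use.

```python
# PQ constants as published in SMPTE ST 2084 / ITU-R BT.2100.
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875
PQ_MAX_NITS = 10000.0


def pq_eotf(signal: float) -> float:
    """Convert a normalized PQ signal level (0.0 to 1.0) into absolute
    display luminance in nits."""
    p = signal ** (1.0 / M2)
    return PQ_MAX_NITS * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def pq_inverse_eotf(nits: float) -> float:
    """Convert absolute display luminance in nits into a normalized PQ
    signal level (0.0 to 1.0)."""
    y = (nits / PQ_MAX_NITS) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def to_10bit_code(signal: float) -> int:
    """Quantize a normalized signal level to a 10-bit code value.
    Assumption: full-range quantization (0 to 1,023)."""
    return round(signal * 1023)


# A peak luminance value maps to a fixed signal level regardless of the display.
signal_at_1000_nits = pq_inverse_eotf(1000.0)
code_at_10000_nits = to_10bit_code(pq_inverse_eotf(10000.0))
```

Because the mapping is absolute, two HDR images with different peak luminance values occupy different upper limits of the same signal range, which is why the peak luminance value has to be tracked separately when such images are combined.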
This is because the scene luminance with which the sensor output is saturated changes in accordance with an image capturing mode and the like and thus a gamma curve used for conversion varies to assign the absolute display luminance to the same scene luminance. For example, as shown in FIG. 1, the input/output characteristics (the relationships between the number of input stages and output luminance) in two kinds of image capturing modes of different exposure amounts are different in terms of the peak luminance value (the maximum value of the output luminance). In this example, an input/output characteristic 11 in the image capturing mode of a high exposure amount is indicated by a solid line and an input/output characteristic 12 in the image capturing mode of a low exposure amount is indicated by an alternate long and short dashed line. As shown in FIG. 1, in the two image capturing modes, a common input/output characteristic is indicated in a region other than a high-luminance region, and scene luminance is converted into the same display luminance regardless of the exposure amount while the peak luminance value varies between values 13 and 14 in the high-luminance region in accordance with a difference in luminance