US-12620070-B2 - Determining and using point spread function for image deblurring

US 12620070 B2

Abstract

A method including: obtaining images captured using camera(s), depth maps corresponding to images, and pose information; identifying image segment(s) for first image and second image; determining first relative pose of same object captured in first image; determining second relative pose of same object captured in second image; when same object is in-focus in first image, reprojecting at least image segment(s) of first image; and determining for given camera that captured second image a point spread function as function of optical depth; obtaining third image captured using given camera and third depth map corresponding to third image; and applying extended depth-of-field correction to image segment(s) of third image that is out of focus, by using point spread function.
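To make the reprojection step in the abstract concrete, the following Python sketch unprojects the pixels of an in-focus image segment using their depth-map values, applies the change of relative pose, and projects them into the second camera's frame. This is a minimal illustration, not part of the patent: the pinhole model, the shared intrinsic matrix K, the 4x4 pose convention, and all names are assumptions made for this sketch.

    import numpy as np

    def reproject_segment(pixels, depths, K, T_first_to_second):
        """Reproject pixels of an in-focus image segment from the first camera
        pose to the second, using per-pixel optical depths from the depth map.

        pixels: (N, 2) pixel coordinates (u, v) in the first image
        depths: (N,) optical depths of those pixels, in metres
        K:      (3, 3) pinhole intrinsics (assumed shared by both cameras)
        T_first_to_second: (4, 4) rigid transform between the two poses
        """
        ones = np.ones((pixels.shape[0], 1))
        rays = np.linalg.inv(K) @ np.hstack([pixels, ones]).T  # unproject to rays
        pts = rays * depths                                    # scale rays by depth
        pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])   # homogeneous coords
        pts2 = (T_first_to_second @ pts_h)[:3]                 # move to second pose
        proj = K @ pts2                                        # project to image
        return (proj[:2] / proj[2]).T                          # (N, 2) pixels

    # Example: three pixels at 2 m depth, under a 5 cm sideways camera shift.
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    T = np.eye(4)
    T[0, 3] = 0.05
    pix = np.array([[320.0, 240.0], [100.0, 80.0], [600.0, 400.0]])
    print(reproject_segment(pix, np.array([2.0, 2.0, 2.0]), K, T))

In practice each camera would have its own intrinsics, and occlusions between the two viewpoints would need handling that this sketch omits.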

Inventors

  • Kai Inha
  • Mikko Strandborg
  • Mikko Ollila

Assignees

  • Varjo Technologies Oy

Dates

Publication Date
2026-05-05
Application Date
2022-11-21

Claims (20)

  1. A computer-implemented method comprising: obtaining a plurality of images of a real-world environment captured using at least one camera, a plurality of depth maps captured corresponding to the plurality of images, and pose information indicative of corresponding camera poses from which the plurality of images and the plurality of depth maps are captured; for a given pair of a first image and a second image from amongst the plurality of images, the first image and the second image being captured from different camera poses using a fixed-focus camera, identifying at least one image segment of the first image and at least one image segment of the second image that represent a same object present in the real-world environment, wherein the same object is in-focus in the first image and out of focus in the second image, or is in-focus in the second image and out of focus in the first image; determining a first relative pose of the same object with respect to a first camera pose from which the first image is captured, based on optical depths in a segment of a first depth map corresponding to the at least one image segment of the first image, and a location of the at least one image segment in a field of view of the first image; determining a second relative pose of the same object with respect to a second camera pose from which the second image is captured, based on optical depths in a segment of a second depth map corresponding to the at least one image segment of the second image, and a location of the at least one image segment in a field of view of the second image; when the same object is in-focus in the first image, reprojecting at least the at least one image segment of the first image from the first relative pose to the second relative pose; and determining, for a given camera that captured the second image, a point spread function as a function of optical depth, based on a correlation between reprojected pixels of the at least one image segment of the first image and respective pixels of the at least one image segment of the second image, and respective optical depths in the segment of the second depth map corresponding to the at least one image segment of the second image, wherein determining the point spread function further comprises estimating the point spread function from the correlation and computing separate point spread functions for a plurality of optical-depth values indicated by the second depth map for the at least one image segment; obtaining a third image of the real-world environment captured using a given camera and a third depth map captured corresponding to the third image; and applying an extended depth-of-field correction to at least one image segment of the third image that is out of focus, by using a point spread function determined for the given camera, based on optical depths in a segment of the third depth map corresponding to the at least one image segment of the third image. [An illustrative sketch of such a PSF estimate follows the claims.]
  2. The computer-implemented method of claim 1, further comprising: when the same object is in-focus in the second image, reprojecting at least the at least one image segment of the second image from the second relative pose to the first relative pose; and determining, for a given camera that captured the first image, a point spread function as a function of optical depth, based on a correlation between reprojected pixels of the at least one image segment of the second image and respective pixels of the at least one image segment of the first image, and respective optical depths in the segment of the first depth map corresponding to the at least one image segment of the first image.
  3. The computer-implemented method of claim 1, wherein the extended depth-of-field correction is applied by employing a Wiener filter to deconvolve the at least one image segment of the third image with the point spread function determined for the given camera. [An illustrative Wiener-deconvolution sketch follows the claims.]
  4. The computer-implemented method of claim 1, further comprising updating the point spread function by employing a neural network to predict a value of the point spread function for a given optical depth based on values of the point spread function for at least two optical depths that are determined based on said correlation. [An illustrative sketch follows the claims.]
  5. The computer-implemented method of claim 1, wherein the first image and the second image are captured: simultaneously using different cameras, or using a same camera or different cameras at different instances of time, wherein at least one of: (i) the same camera or the different cameras, (ii) the same object, moves between the different instances of time.
  6. The computer-implemented method of claim 1, wherein the first image and the second image are captured using different fixed-focus cameras that are focused at different focal planes.
  7. The computer-implemented method of claim 1, wherein the at least one camera comprises at least one fixed-focus camera, and wherein the first image and the second image are captured at different temperatures of a camera lens of the at least one fixed-focus camera.
  8. The computer-implemented method of claim 1, wherein the first image and the second image are captured using different cameras that are focused at different focal planes or that have different apertures.
  9. The computer-implemented method of claim 1, further comprising: obtaining information indicative of a gaze direction of a user; determining a gaze region in the third image, based on the gaze direction of the user; and applying the extended depth-of-field correction to the at least one image segment of the third image that is out of focus, only when the at least one image segment of the third image overlaps with the gaze region.
  10. The computer-implemented method of claim 1, wherein the step of identifying the at least one image segment of the first image and the at least one image segment of the second image comprises: identifying a plurality of image segments of the first image and a plurality of image segments of the second image that represent same objects that are present in the real-world environment; computing weights for the plurality of image segments of the first image and the plurality of image segments of the second image, wherein a weight of a given image segment is calculated based on at least one of: a gradient of optical depth across the given image segment, when a given same object is out-of-focus in the given image segment, a difference in optical depth between the given same object and a neighbourhood of the given same object, when the given same object is out-of-focus in the given image segment, a contrast of features in the given image segment, when the given same object is in-focus in the given image segment; and selecting the at least one image segment of the first image and the at least one image segment of the second image, from amongst the plurality of image segments of the first image and the plurality of image segments of the second image, based on the weights computed for the plurality of image segments of the first image and the plurality of image segments of the second image. [An illustrative weighting sketch follows the claims.]
  11. The computer-implemented method of claim 1, wherein the at least one image segment of the first image represents the same object as well as a first portion of a neighbourhood of the same object as captured from a perspective of the first camera pose, and the at least one image segment of the second image represents the same object as well as a second portion of the neighbourhood of the same object as captured from a perspective of the second camera pose.
  12. A computer program product comprising a non-transitory machine-readable data storage medium having stored thereon program instructions that, when executed by a processor, cause the processor to execute steps of the computer-implemented method of claim 1.
  13. A system comprising at least one server that is configured to: obtain a plurality of images of a real-world environment captured using at least one camera, a plurality of depth maps captured corresponding to the plurality of images, and pose information indicative of corresponding camera poses from which the plurality of images and the plurality of depth maps are captured; for a given pair of a first image and a second image from amongst the plurality of images, the first image and the second image being captured from different camera poses using a fixed-focus camera, identify at least one image segment of the first image and at least one image segment of the second image that represent a same object present in the real-world environment, wherein the same object is in-focus in the first image and out of focus in the second image, or is in-focus in the second image and out of focus in the first image; determine a first relative pose of the same object with respect to a first camera pose from which the first image is captured, based on optical depths in a segment of a first depth map corresponding to the at least one image segment of the first image, and a location of the at least one image segment in a field of view of the first image; determine a second relative pose of the same object with respect to a second camera pose from which the second image is captured, based on optical depths in a segment of a second depth map corresponding to the at least one image segment of the second image, and a location of the at least one image segment in a field of view of the second image; when the same object is in-focus in the first image, reproject at least the at least one image segment of the first image from the first relative pose to the second relative pose; and determine, for a given camera that captured the second image, a point spread function as a function of optical depth, based on a correlation between reprojected pixels of the at least one image segment of the first image and respective pixels of the at least one image segment of the second image, and respective optical depths in the segment of the second depth map corresponding to the at least one image segment of the second image, wherein determining the point spread function further comprises estimating the point spread function from the correlation and computing separate point spread functions for a plurality of optical-depth values indicated by the second depth map for the at least one image segment; obtain a third image of the real-world environment captured using a given camera and a third depth map captured corresponding to the third image; and apply an extended depth-of-field correction to at least one image segment of the third image that is out of focus, by using a point spread function determined for the given camera, based on optical depths in a segment of the third depth map corresponding to the at least one image segment of the third image.
  14. The system of claim 13, wherein when the same object is in-focus in the second image, the at least one server is configured to: reproject at least the at least one image segment of the second image from the second relative pose to the first relative pose; and determine, for a given camera that captured the first image, a point spread function as a function of optical depth, based on a correlation between reprojected pixels of the at least one image segment of the second image and respective pixels of the at least one image segment of the first image, and respective optical depths in the segment of the first depth map corresponding to the at least one image segment of the first image.
  15. The system of claim 13, wherein the at least one server is configured to apply the extended depth-of-field correction by employing a Wiener filter to deconvolve the at least one image segment of the third image with the point spread function determined for the given camera.
  16. The system of claim 13, wherein the at least one server is configured to update the point spread function by employing a neural network to predict a value of the point spread function for a given optical depth based on values of the point spread function for at least two optical depths that are determined based on said correlation.
  17. The system of claim 13, wherein the first image and the second image are captured: simultaneously using different cameras, or using a same camera or different cameras at different instances of time, wherein at least one of: (i) the same camera or the different cameras, (ii) the same object, moves between the different instances of time.
  18. The system of claim 13, wherein the first image and the second image are captured using different fixed-focus cameras that are focused at different focal planes.
  19. The system of claim 13, wherein the at least one camera comprises at least one fixed-focus camera, and wherein the first image and the second image are captured at different temperatures of a camera lens of the at least one fixed-focus camera.
  20. The system of claim 13, wherein the first image and the second image are captured using different cameras that are focused at different focal planes or that have different apertures.
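Claims 1 and 13 determine the PSF from the correlation between a reprojected in-focus segment and the corresponding out-of-focus segment. One standard way to realize such an estimate, offered here only as a sketch (the claims do not prescribe a particular estimator), is a regularized inverse filter in the Fourier domain, modelling the blurred segment as the sharp segment convolved with the PSF. The regularizer eps, the kernel size, and the synthetic check are assumptions.

    import numpy as np

    def estimate_psf(sharp, blurred, eps=1e-3, size=15):
        """Estimate a PSF from an aligned sharp/blurred segment pair,
        modelling blurred = sharp (*) psf + noise, via a Tikhonov-
        regularized inverse filter; eps damps frequencies where the
        sharp segment carries little energy."""
        S = np.fft.fft2(sharp)
        B = np.fft.fft2(blurred)
        H = B * np.conj(S) / (np.abs(S) ** 2 + eps)
        psf = np.fft.fftshift(np.real(np.fft.ifft2(H)))
        c0, c1 = psf.shape[0] // 2, psf.shape[1] // 2
        k = psf[c0 - size // 2: c0 + size // 2 + 1,
                c1 - size // 2: c1 + size // 2 + 1]
        k = np.clip(k, 0.0, None)    # PSFs are non-negative...
        return k / k.sum()           # ...and integrate to one

    # Synthetic check: blur a random "segment" with a known 3x3 box PSF
    # and recover it.
    rng = np.random.default_rng(0)
    sharp = rng.random((64, 64))
    true_psf = np.zeros((64, 64))
    true_psf[31:34, 31:34] = 1.0 / 9.0   # centred 3x3 box kernel
    blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) *
                                   np.fft.fft2(np.fft.ifftshift(true_psf))))
    print(estimate_psf(sharp, blurred, size=5).round(3))

Repeating this estimate per optical-depth value, for example by binning pixels according to the second depth map, would yield the separate per-depth PSFs recited in claims 1 and 13.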
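Claims 3 and 15 apply the extended depth-of-field correction by employing a Wiener filter to deconvolve the out-of-focus segment with the determined PSF. A minimal frequency-domain version is sketched below; the constant noise-to-signal ratio nsr, the circular-convolution padding, and the synthetic example are assumptions, and a real pipeline would pick the PSF matching the segment's optical depths in the third depth map.

    import numpy as np
    from scipy.signal import fftconvolve

    def wiener_deconvolve(segment, psf, nsr=1e-3):
        """Wiener-deconvolve a blurred image segment with a known PSF:
        X = Y * conj(H) / (|H|^2 + nsr) in the frequency domain, where
        nsr is an assumed constant noise-to-signal power ratio."""
        # Zero-pad the PSF to the segment's size with its centre at the
        # origin, so frequency-domain multiplication matches circular
        # convolution.
        padded = np.zeros(segment.shape, dtype=float)
        kh, kw = psf.shape
        padded[:kh, :kw] = psf
        padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
        H = np.fft.fft2(padded)
        Y = np.fft.fft2(segment)
        X = Y * np.conj(H) / (np.abs(H) ** 2 + nsr)
        return np.real(np.fft.ifft2(X))

    # Example: blur a synthetic segment with a 5x5 box PSF, then restore it.
    rng = np.random.default_rng(1)
    img = rng.random((128, 128))
    psf = np.full((5, 5), 1.0 / 25.0)
    blurred = fftconvolve(img, psf, mode="same")   # linear blur, so borders differ
    restored = wiener_deconvolve(blurred, psf, nsr=1e-4)
    print(float(np.abs(restored - img)[8:-8, 8:-8].mean()))  # small away from borders

A library implementation such as scikit-image's restoration.wiener provides a comparable regularized filter if hand-rolled FFT code is undesirable.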
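For the neural-network update of claims 4 and 16, one possible shape is a small multilayer perceptron that maps optical depth to a flattened PSF kernel, fitted to the PSFs measured at two or more depths and then queried at intermediate depths. The sketch below uses PyTorch; the architecture, kernel size, softmax normalization, and training loop are all assumptions, since the claims state only that a network predicts a PSF value for a given depth from values at at least two depths.

    import torch
    from torch import nn

    K = 15  # assumed PSF kernel size

    class PsfNet(nn.Module):
        """Tiny MLP mapping optical depth (metres) to a flattened K x K PSF.
        Softmax keeps the predicted kernel non-negative and summing to one."""
        def __init__(self, k=K, hidden=64):
            super().__init__()
            self.k = k
            self.net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                                     nn.Linear(hidden, k * k))

        def forward(self, depth):                 # depth: (batch, 1)
            return torch.softmax(self.net(depth), dim=-1).view(-1, self.k, self.k)

    def fit(model, depths, psfs, steps=500, lr=1e-3):
        """Fit to PSFs measured at two or more optical depths."""
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(depths), psfs)
            loss.backward()
            opt.step()
        return model

    # Toy run with made-up PSFs "measured" at 0.5 m, 1 m, and 2 m.
    depths = torch.tensor([[0.5], [1.0], [2.0]])
    psfs = torch.rand(3, K, K)
    psfs = psfs / psfs.sum(dim=(1, 2), keepdim=True)
    model = fit(PsfNet(), depths, psfs)
    print(model(torch.tensor([[1.5]])).shape)     # PSF predicted at 1.5 m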
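Claim 10 weights candidate segments by the gradient of optical depth, the depth separation from the neighbourhood, and the contrast of features. A possible heuristic combination is sketched below; the claim names the cues but no formula, so the scaling and combination here are assumptions.

    import numpy as np

    def segment_weight(depth_seg, depth_neigh, intensity_seg, in_focus):
        """Heuristic weight for a candidate image segment, using the cues
        named in claim 10; the combination and scaling are assumptions."""
        if in_focus:
            # In-focus cue: contrast of features in the segment.
            return float(intensity_seg.std())
        # Out-of-focus cue 1: prefer flat depth (one well-defined blur level).
        gy, gx = np.gradient(depth_seg.astype(float))
        flatness = 1.0 / (1.0 + float(np.hypot(gx, gy).mean()))
        # Out-of-focus cue 2: prefer strong depth separation from the
        # neighbourhood of the same object.
        separation = abs(float(np.median(depth_seg)) - float(np.median(depth_neigh)))
        return flatness * separation

    # Example: a flat out-of-focus segment 1 m in front of its surroundings.
    seg = np.full((8, 8), 2.0)
    neigh = np.full((8, 8), 3.0)
    print(segment_weight(seg, neigh, seg, in_focus=False))  # -> 1.0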

Description

TECHNICAL FIELD

The present disclosure relates to computer-implemented methods for determining and using point spread functions (PSFs) for image deblurring. The present disclosure also relates to systems for determining and using PSFs for image deblurring. The present disclosure also relates to computer program products for determining and using PSFs for image deblurring.

BACKGROUND

In the past decade, three-dimensional (3D) telepresence has been actively explored by researchers to bring the world closer. Such 3D telepresence relies on evolving technologies, such as immersive extended-reality (XR) technologies, which make an individual feel as if they are present at a location different from their actual location. With recent advancements in such technologies, the demand for high-quality image generation has been increasing, and several image-generation techniques that rely on image reconstruction (namely, image resynthesis) are being developed. Despite progress in cameras used for image capturing, existing techniques and equipment for image generation have several limitations.

Firstly, cameras used for image capturing typically suffer from depth-of-field issues. Such issues can be resolved to some extent by adjusting the size of a given camera's aperture. However, when the aperture is significantly smaller, images of a real-world environment in a low-light setting are not captured properly; moreover, the larger the aperture, the narrower the depth of field. Hence, images of the real-world environment are captured sharply only within the focusing distance range of the given camera and are blurred outside that range. Furthermore, even when an auto-focus camera is employed, it is still not possible to capture sharp (i.e., in-focus) images across an entire field of view, because an auto-focus camera can be adjusted to only one focusing distance range at a time. The generated images are therefore of low quality and unrealistic, and are often generated with considerable latency.

Secondly, upon capturing images having defocus blur, some existing techniques employ machine-learning-based tuning algorithms to reverse (namely, remove) the blur. However, such tuning algorithms can only be implemented for cameras using a specific lens setup. Image correction thus lacks the resolution necessary for high-fidelity image generation, as said algorithms have limited capability, for example, in reproducing realistic and accurate visual details of the real-world environment. Moreover, some existing techniques employ several optical elements and cameras for image generation, which increases overall cost, power consumption, fault susceptibility, and the like.

Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with existing equipment and techniques for image generation.

SUMMARY

The present disclosure seeks to provide a computer-implemented method for determining and using a point spread function for image deblurring. The present disclosure also seeks to provide a system for determining and using a point spread function for image deblurring.
The present disclosure also seeks to provide a computer program product for determining and using a point spread function for image deblurring. An aim of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in the prior art.

In a first aspect, the present disclosure provides a computer-implemented method comprising: obtaining a plurality of images of a real-world environment captured using at least one camera, a plurality of depth maps captured corresponding to the plurality of images, and pose information indicative of corresponding camera poses from which the plurality of images and the plurality of depth maps are captured; for a given pair of a first image and a second image from amongst the plurality of images, identifying at least one image segment of the first image and at least one image segment of the second image that represent a same object that is present in the real-world environment, wherein the same object is in-focus in one of the first image and the second image, but is out-of-focus in the other of the first image and the second image; determining a first relative pose of the same object with respect to a first camera pose from which the first image is captured, based on optical depths in a segment of a first depth map corresponding to the at least one image segment of the first image, and a location of the at least one image segment in a field of view of the first image; determining a second relative pose of the same object with respect to a second camera pose from which the second image is captured, based on optical depths in a segment of a second depth map corresponding to the at least one image segment of the second image, and a location of the at least one image segment in a field of view of the second image;