US-12626338-B2 - Method and image processing device for improving signal-to-noise ratio of image frame sequences


Abstract

A method for improving signal-to-noise of image frames is provided. The method includes estimating a representative velocity of an optical flow in an image frame sequence. The method also includes determining an interpolation factor from the representative velocity of the optical flow. The method also includes employing a trained artificial neural network for generating an expanded image frame sequence. The expanded image frame sequence includes a number of interpolating image frames. Each interpolating image frame interpolates between subsequent image frames of the image frame sequence. The number of interpolating image frames corresponds to the interpolation factor. The method also includes computing a time-dependent combination of image frames from the expanded image frame sequence to generate an output image frame sequence.

Inventors

Kai WALTER
Constantin Kappel

Assignees

LEICA MICROSYSTEMS CMS GMBH

Dates

Publication Date: 2026-05-12
Application Date: 2021-10-08
Priority Date: 2020-10-28

Claims (15)

1. A method for improving signal-to-noise of image frames, the method comprising: estimating a representative velocity of an optical flow in an image frame sequence; determining an interpolation factor from the representative velocity of the optical flow, wherein a ratio of the representative velocity and the interpolation factor is equal to or less than one pixel per frame; employing a trained artificial neural network for generating an expanded image frame sequence, wherein the expanded image frame sequence comprises a number of interpolating image frames, wherein each interpolating image frame interpolates between subsequent image frames of the image frame sequence, wherein the number of interpolating image frames corresponds to the interpolation factor; generating an output image frame sequence by computing a time-dependent combination of image frames from the expanded image frame sequence.
2. The method of claim 1, wherein estimating the representative velocity of the optical flow comprises: calculating a histogram of the optical flow between successive image frames in the image frame sequence; and analyzing the histogram to determine the representative velocity.
3. The method of claim 2, wherein analyzing the histogram to determine the representative velocity comprises employing the histogram to determine the representative velocity as a quantile for a predetermined threshold value.
4. The method of claim 2, wherein calculating the histogram is based on estimating a pixel-wise optical flow.
5. The method of claim 4, wherein estimating the pixel-wise optical flow is based on Farnebäck's algorithm.
6. The method of claim 1, wherein the image frames of the image frame sequence are microscopy image frames.
7. The method of claim 1, wherein the trained artificial neural network is configured for generating the interpolating image frames by applying a feature reshaping operation with channel attention.
8. The method of claim 7, wherein the feature reshaping operation comprises a pixel shuffle operation.
9. The method of claim 1, wherein the interpolation factor is a power of two and wherein the trained artificial neural network is configured for recursively generating and adding the interpolating image frames to the image frame sequence, wherein a number of recursions corresponds to the power.
10. A method for improving signal-to-noise of image frames, the method comprising estimating a representative velocity of an optical flow in an image frame sequence; determining an interpolation factor based on the representative velocity of the optical flow; employing a trained artificial neural network for generating an expanded image frame sequence, wherein the expanded image frame sequence comprises a number of interpolating image frames, wherein each interpolating image frame interpolates between subsequent image frames of the image frame sequence, wherein the number of interpolating image frames corresponds to the interpolation factor; generating an output image frame sequence by computing a time-dependent combination of image frames by applying a rolling average to the expanded image frame sequence.
11. The method of claim 10, wherein the rolling average is a weighted rolling average and wherein applying the weighted rolling average comprises determining parameters for the weighted rolling average from the representative velocity.
12. The method of claim 11, wherein the parameters for the weighted rolling average are a window size determining a group of image frames from the expanded image frame sequence that contribute to the weighted rolling average, and a width determining a weight by which each image frame from the group of image frames contributes to the weighted rolling average.
13. The method of claim 1, further comprising applying a de-noising algorithm to the output image frame sequence.
14. The method of claim 1, wherein the trained artificial neural network has been trained by: pre-training an artificial neural network with image frame sequences that are not domain specific; and training the artificial neural network with domain-specific image frames.
15. An image processing device for improving signal-to-noise of image frames by carrying out the method of claim 1, the image processing device comprising: a memory configured for saving an image frame sequence at least temporarily; processing circuitry configured for estimating a representative velocity of an optical flow in the image frame sequence and for determining an interpolation factor from the representative velocity of the optical flow; and a trained artificial neural network for generating an expanded image frame sequence, wherein the expanded image frame sequence comprises a predetermined number of interpolating image frames, wherein each interpolating image frame interpolates between subsequent image frames of the image frame sequence, wherein the predetermined number corresponds to the interpolation factor, wherein the processing circuitry is further configured for computing a time-dependent combination of image frames from the expanded image frame sequence to generate an output image frame sequence.
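
As an illustration only, the numeric steps recited in claims 1 to 3, 9 and 12 might be sketched in Python with NumPy as follows. The function names, the 0.95 quantile threshold, the kernel normalisation, the edge padding and the reading of the window size as the support |t| <= w/2 of the kernel in Equation (1) of the description are assumptions made for this sketch and are not taken from the disclosure. The flow magnitudes are assumed to come from a separate pixel-wise optical flow estimate, and the neural-network interpolation step is omitted.

```python
import math

import numpy as np


def representative_velocity(flow_magnitudes, threshold=0.95):
    """Quantile of pixel-wise flow magnitudes, in pixels per frame.

    The threshold value is an assumption; the claims only call it predetermined.
    """
    values = np.asarray(flow_magnitudes, dtype=float)
    return float(np.quantile(values, threshold))


def interpolation_factor(velocity):
    """Smallest power of two such that velocity / factor <= 1 pixel per frame."""
    if velocity <= 1.0:
        return 1
    return 2 ** math.ceil(math.log2(velocity))


def rolling_average_kernel(window, width):
    """Weights exp(-|t| / width) for integer |t| <= window // 2.

    Normalising the weights to sum to one is an assumption of this sketch.
    """
    half = window // 2
    t = np.arange(-half, half + 1)
    weights = np.exp(-np.abs(t) / width)
    return weights / weights.sum()


def weighted_rolling_average(frames, window, width):
    """Convolve a (T, H, W) frame stack with the kernel along the time axis."""
    frames = np.asarray(frames, dtype=float)
    kernel = rolling_average_kernel(window, width)
    half = len(kernel) // 2
    # Repeat the first and last frame so the output keeps T frames (assumed).
    padded = np.pad(frames, ((half, half), (0, 0), (0, 0)), mode="edge")
    out = np.zeros_like(frames)
    # The kernel is symmetric, so correlation and convolution coincide.
    for offset, weight in enumerate(kernel):
        out += weight * padded[offset:offset + frames.shape[0]]
    return out
```

For a displacement of (2, −2) pixels per frame, the speed is about 2.83 pixels per frame, so this sketch picks a factor of 4, which corresponds to two doubling recursions in the sense of claim 9.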

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a U.S. National Phase application under 35 U.S.C. § 371 of International Application No. PCT/EP2021/077897, filed on Oct. 8, 2021, and claims benefit to European Patent Application No. EP 20204428.5, filed on Oct. 28, 2020. The International Application was published in English on May 5, 2022 as WO 2022/089917 A1 under PCT Article 21(2).

FIELD

This disclosure relates to systems and methods for improving signal-to-noise in image frame sequences.

BACKGROUND

High-resolution imaging techniques such as magnetic resonance imaging or fluorescence microscopy allow capturing time lapses of objects such as biological specimens or in-vivo biological tissue. Capturing high-resolution time lapses of biological structures is challenging because high frame rates are required to capture relatively fast movements, while the biological structures must also be protected from damage incurred by the microscopy method. For example, time lapse fluorescence microscopy of biological specimens requires high frame rates to capture fast movements of the cells, but also requires minimizing the excitation laser intensity to reduce cell damage. In addition, photobleaching leads to a loss of fluorescence signal. Such opposing requirements often lead to very low signal-to-noise ratios in the acquired images.

Other microscopy methods suffer from similar problems. In confocal microscopy the low signal-to-noise ratio is even more prominent because light excluded by the pinhole does not contribute to the images. STED microscopy also suffers from the described problems because it has an intrinsically low photon budget due to depletion of large parts of the excitation at the focal point.

To improve signal-to-noise of image frame sequences, time-dependent combinations of the image frames may be applied.
For example, a rolling average with exponential weighting may be applied, such as according to

$$I_{RA}(t, r) = [I * B](t, r), \qquad B(t) = e^{-|t|/\sigma}\,\Theta\!\left(|t| - \frac{w}{2}\right), \tag{1}$$

where I is the image frame sequence, σ is a width, w is a window size, Θ is the Heaviside function, and * denotes a convolution operation. However, because rolling averaging involves applying a blurring kernel, fast movements of objects captured in the image frame sequence are smeared. Rolling averaging therefore cannot be applied to improve signal-to-noise of image frame sequences that capture fast-moving objects.

Another approach for improving signal-to-noise in image frames is to apply recent deep learning methods, which now readily offer solutions for de-noising of images. In such approaches, a single image is provided as input without considering movement of captured objects in the images. However, it is well known that a significant problem of deep learning algorithms is the creation of spurious details, so-called hallucinations, from random patterns like noise. Such artifacts become particularly visible in noisy time lapse images.

FIG. 1A illustrates the problem of hallucinations in a de-noising approach of the state of the art. Column 12 of FIG. 1A reproduces two successive fluorescence microscopy image frames captured from a live biological specimen. Applying a de-noising algorithm of the state of the art, Nikon's denoise.ai, to the image frames of column 12 yields the image frames in column 14. As is evident, the de-noising algorithm infers shapes of objects that seem realistic from the very noisy images in column 12. However, a comparison of the de-noised images in column 14 for the successive image frames shows that shapes and positions of the predicted objects change substantially from the upper frame to the lower frame, casting strong doubt on the veracity of the predicted object shapes.
Hence, the result of de-noising depends strongly on the temporal realization of noise, which implies that de-noising cannot be applied reliably to noisy time lapses.

FIG. 1B illustrates the problem of blurring when applying a weighted rolling average according to the state of the art. Panel 16 illustrates an image frame of a simulated image frame sequence capturing a fast-moving object. As indicated by the white arrow, the object moves with a velocity ν=(2, −2) pixels per frame. Panel 18 reproduces results of applying a weighted rolling average according to Equation (1) with σ=4 and w=8. As shown in panel 18, the object is blurred due to contributions of neighboring image frames, heavily distorting the object's true shape.

SUMMARY

In an embodiment, the present disclosure provides a method for improving signal-to-noise of image frames. The method includes estimating a representative velocity of an optical flow in an image frame sequence. The method also includes determining an interpolation factor from the representative velocity of the optical flow. The method also includes employing a trained artificial neural network for generating