US-20260127726-A1 - Machine Learning Model Based Triggering Mechanism for Image Enhancement

Abstract

A method includes determining a respective delta quality score associated with each of a plurality of images by predicting, by an image enhancement model, an enhanced image corresponding to a given image, determining a first quality score associated with the given image and a second quality score associated with the enhanced image. The delta quality score is based on a difference of the first and second quality scores. The method includes generating a training dataset comprising the plurality of images associated with respective delta quality scores. The method includes training, based on the generated training dataset, a quality assessment model to predict a quality-improvability score associated with an input image. The quality-improvability score is indicative of a potential to increase a perceptual quality of the input image based on removal of one or more image degradation factors. The method includes outputting, by the computing device, the trained quality assessment model.
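
As a rough illustration of the labeling step summarized above, the sketch below computes a delta quality score for each image and collects (image, delta) pairs as a training set. The callables `enhance` and `score` are hypothetical stand-ins for the image enhancement model and the quality scorer. The toy image type and the toy functions in the usage lines are assumptions for demonstration only and do not come from the application. The sign convention follows the claims: the second (enhanced) score minus the first (original) score.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # toy image type (assumption): a flat list of pixel values


def delta_quality_score(
    image: Image,
    enhance: Callable[[Image], Image],
    score: Callable[[Image], float],
) -> float:
    """Return score(enhanced image) minus score(given image)."""
    first = score(image)             # first quality score: the given image
    second = score(enhance(image))   # second quality score: the enhanced image
    return second - first


def build_training_dataset(
    images: Sequence[Image],
    enhance: Callable[[Image], Image],
    score: Callable[[Image], float],
) -> List[Tuple[Image, float]]:
    """Pair every image with its delta quality score."""
    return [(img, delta_quality_score(img, enhance, score)) for img in images]


# Toy stand-ins (assumptions, not real models): "quality" is the mean pixel
# value, and "enhancement" adds 1 to each pixel, capped at 10.
def toy_score(img: Image) -> float:
    return sum(img) / len(img)


def toy_enhance(img: Image) -> Image:
    return [min(p + 1.0, 10.0) for p in img]


dataset = build_training_dataset(
    [[1.0, 3.0], [10.0, 10.0]], toy_enhance, toy_score
)
```

In this toy setup the first image gains from enhancement while the second is already at the cap, so its delta is zero. A quality assessment model trained on such pairs would then learn to predict the delta directly from the unenhanced image.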

Inventors

Hossein Talebi
Sungjoon Choi
Peyman Milanfar
Mauricio Delbracio

Assignees

GOOGLE LLC

Dates

Publication Date: 2026-05-07
Application Date: 2023-10-04

Claims (20)

1. A computer-implemented method, comprising: determining a respective delta quality score associated with each of a plurality of images, wherein the determining of the delta quality score comprises: predicting, by an image enhancement model, an enhanced image corresponding to a given image of the plurality of images, the image enhancement model having been trained to remove one or more image degradations associated with the given image, determining a first quality score associated with the given image and a second quality score associated with the predicted enhanced image, and wherein the delta quality score is based on a difference of the second quality score and the first quality score, and wherein the delta quality score is indicative of a degree of image enhancement in the predicted enhanced image; generating, by a computing device, a training dataset comprising the plurality of images associated with respective delta quality scores; training, based on the generated training dataset, a quality assessment model to predict a quality-improvability score associated with an input image, wherein the quality-improvability score is indicative of a potential to increase a perceptual quality of the input image based on removal of one or more image degradation factors; and outputting, by the computing device, the trained quality assessment model.
2. The computer-implemented method of claim 1, wherein the quality assessment model is a convolutional neural network, and wherein the training of the quality assessment model further comprises: receiving labeled data indicating the degree of image enhancement in the predicted enhanced image as perceived by human annotators; and fine-tuning a last layer of the convolutional neural network with the received labeled data.
3. The computer-implemented method of claim 2, wherein the convolutional neural network comprises a MobileNet architecture.
4. The computer-implemented method of claim 2, wherein the convolutional neural network comprises a fully connected layer configured to determine the delta quality score.
5. The computer-implemented method of claim 1, wherein the first quality score and the second quality score are neural image assessment (NIMA) scores.
6. The computer-implemented method of claim 1, wherein the first quality score and the second quality score are generated by an AlexNet based convolutional neural network (CNN) that has been trained on Aesthetic Visual Analysis (AVA) with a rank-based loss function.
7. The computer-implemented method of claim 1, wherein the one or more image degradation factors comprise one or more of a motion blur, a lens blur, an image noise, an image compression artifact, or an artifact caused by saturated pixels.
8. A computer-implemented method, comprising: receiving, by a computing device, an input image; predicting, by a quality assessment model, a quality-improvability score associated with the input image, wherein the quality-improvability score is indicative of a potential to increase a perceptual quality of the input image based on removal of one or more image degradation factors, the quality assessment model having been trained on a training dataset comprising a plurality of images associated with respective delta quality scores, the delta quality scores having been determined by: predicting, by an image enhancement model, an enhanced image corresponding to a given image of the plurality of images, the image enhancement model having been trained to remove one or more image degradations associated with the given image, determining a first quality score associated with the given image and a second quality score associated with the predicted enhanced image, and wherein the delta quality score is based on a difference of the second quality score and the first quality score, and wherein the delta quality score is indicative of a degree of image enhancement in the predicted enhanced image; and providing, by the computing device, an alert notification based on the predicted quality-improvability score.
9. The computer-implemented method of claim 8, wherein the quality assessment model is a convolutional neural network.
10. The computer-implemented method of claim 8, wherein the convolutional neural network comprises a MobileNet architecture.
11. The computer-implemented method of claim 8, wherein the convolutional neural network comprises a fully connected layer configured to determine the delta quality score.
12. The computer-implemented method of claim 8, wherein the first quality score and the second quality score are neural image assessment (NIMA) scores.
13. The computer-implemented method of claim 8, wherein the first quality score and the second quality score are generated by an AlexNet based CNN that has been trained on Aesthetic Visual Analysis (AVA) with a rank-based loss function.
14. The computer-implemented method of claim 8, further comprising: determining whether the predicted quality-improvability score exceeds a threshold score; and based upon a determination that the predicted quality-improvability score exceeds the threshold score, providing the input image to the image enhancement model to enhance the quality of the input image.
15. The computer-implemented method of claim 14, wherein the one or more image degradation factors comprises image blurring, and wherein the threshold score is a threshold deblurring score.
16. The computer-implemented method of claim 14, wherein the one or more image degradation factors comprises image noise, wherein the threshold score is a threshold denoising score.
17. The computer-implemented method of claim 14, wherein the one or more image degradation factors comprises an image compression artifact, and wherein the threshold score is a threshold compression artifact removal score.
18. The computer-implemented method of claim 14, wherein the one or more image degradation factors comprises an artifact caused by saturated pixels, and wherein the threshold score is a threshold saturated pixel artifact removal score.
19. The computer-implemented method of claim 14, wherein the providing of the alert notification comprises: triggering the alert notification upon a determination that the predicted quality-improvability score exceeds the threshold score.
20. The computer-implemented method of claim 14, wherein the providing of the alert notification comprises: upon a determination that the predicted quality-improvability score exceeds the threshold score, providing a recommendation to a user to enhance the input image.
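
The inference-time flow recited in claims 8 and 14 to 20 can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the names `maybe_enhance`, `predict_improvability`, `enhance`, and `notify` are hypothetical, and the threshold values are placeholders, since the claims do not specify a score range or any particular threshold.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

# Placeholder per-degradation thresholds (assumed values, not from the claims).
THRESHOLDS: Dict[str, float] = {
    "blur": 0.5,          # threshold deblurring score
    "noise": 0.4,         # threshold denoising score
    "compression": 0.6,   # threshold compression artifact removal score
    "saturation": 0.7,    # threshold saturated pixel artifact removal score
}


@dataclass
class TriggerResult:
    score: float
    triggered: bool
    enhanced: Optional[Any] = None


def maybe_enhance(
    image: Any,
    degradation: str,
    predict_improvability: Callable[[Any], float],
    enhance: Callable[[Any], Any],
    notify: Callable[[str], None],
    auto_enhance: bool = False,
) -> TriggerResult:
    """Alert, and optionally enhance, only when the score exceeds the threshold."""
    score = predict_improvability(image)
    if score <= THRESHOLDS[degradation]:
        return TriggerResult(score, False)  # not worth enhancing: no alert
    notify(f"Image may benefit from {degradation} removal (score {score:.2f})")
    enhanced = enhance(image) if auto_enhance else None
    return TriggerResult(score, True, enhanced)


# Usage with stub models (assumptions for demonstration only).
messages = []
result = maybe_enhance(
    "photo.jpg", "blur",
    predict_improvability=lambda img: 0.8,
    enhance=lambda img: img + ".enhanced",
    notify=messages.append,
    auto_enhance=True,
)
low = maybe_enhance(
    "sharp.jpg", "blur",
    predict_improvability=lambda img: 0.1,
    enhance=lambda img: img,
    notify=messages.append,
)
```

Gating the enhancement model behind a lightweight score means the costlier enhancement step runs only on images predicted to benefit from it. With `auto_enhance` left at `False`, the sketch only issues the recommendation and leaves the decision to the user.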

Description

CROSS-REFERENCE TO RELATED APPLICATIONS/INCORPORATION BY REFERENCE

This application claims priority to U.S. Provisional Patent Application No. 63/378,386, filed on Oct. 5, 2022, which is hereby incorporated by reference in its entirety.

BACKGROUND

Many modern computing devices, including mobile phones, personal computers, and tablets, include image capture devices, such as still and/or video cameras. The image capture devices can capture images, such as images that include people, animals, landscapes, and/or objects. Some image capture devices and/or computing devices can correct or otherwise modify captured images. For example, some image capture devices can provide “red-eye” correction that removes artifacts such as red-appearing eyes of people and animals that may be present in images captured using bright lights, such as flash lighting. After a captured image has been corrected, the corrected image can be saved, displayed, transmitted, printed to paper, and/or otherwise utilized.

SUMMARY

Removing blur, noise, and compression artifacts from images are longstanding problems in computational photography. Image degradations can come from several sources. Blur can arise when the photographer or the autofocus system sets the focus incorrectly (out-of-focus blur), or when the motion between the camera and the scene is too fast for the shutter speed (motion blur). Additionally, even in ideal acquisition conditions, there can be an intrinsic camera blur due to sensor resolution, light diffraction, lens aberrations, and anti-aliasing filters. Similarly, image noise is intrinsic to the capture of a discrete number of photons (shot noise) and to the analog-to-digital conversion and processing (read-out noise). In general, images are compressed, such as by using JPEG compression, before storage or transmission. The image compression can also degrade the image quality.

Powered by a system of machine-learned components, an image capture device may be configured to generate a trigger based on a determination that an image should be enhanced. The trigger may alert users, and users may be provided with recommendations to remove blur, noise, compression artifacts, and so forth, to create sharp images. In some aspects, mobile devices may be configured with these features so that an image can be enhanced in real time. In some instances, an image may be automatically enhanced by the mobile device. In other aspects, mobile phone users can non-destructively enhance an image to match their preference. Also, for example, pre-existing images in a user's image library can be enhanced based on techniques described herein.

In one aspect, a computer-implemented method is provided. The method includes determining a respective delta quality score associated with each of a plurality of images, wherein the determining of the delta quality score comprises: predicting, by an image enhancement model, an enhanced image corresponding to a given image of the plurality of images, the image enhancement model having been trained to remove one or more image degradations associated with the given image, determining a first quality score associated with the given image and a second quality score associated with the predicted enhanced image, and wherein the delta quality score is based on a difference of the second quality score and the first quality score, and wherein the delta quality score is indicative of a degree of image enhancement in the predicted enhanced image. The method includes generating, by a computing device, a training dataset comprising the plurality of images associated with respective delta quality scores.
The method includes training, based on the generated training dataset, a quality assessment model to predict a quality-improvability score associated with an input image, wherein the quality-improvability score is indicative of a potential to increase a perceptual quality of the input image based on removal of one or more image degradation factors. The method includes outputting, by the computing device, the trained quality assessment model.

In another aspect, a computing device is provided. The computing device includes one or more processors and data storage. The data storage has stored thereon computer-executable instructions that, when executed by one or more processors, cause the computing device to carry out functions. The functions include: determining a respective delta quality score associated with each of a plurality of images, wherein the determining of the delta quality score comprises: predicting, by an image enhancement model, an enhanced image corresponding to a given image of the plurality of images, the image enhancement model having been trained to remove one or more image degradations associated with the given image, determining a first quality score associated with the given image and a second quality score associated with the predicted enhanced image, and wherein the delta quality score is based on a difference of the seco