EP-4736127-A1 - PERFORMING ZOOM OPERATION ON COMPUTING DEVICE WITH CONVOLUTIONAL NEURAL NETWORK LAYERS

Abstract

Systems and methods for performing a zoom operation on a display of a computing device include identifying a region of interest (ROI) of an image presented on a computing device display based on a received zoom user input, determining a scaling ratio for a final zoomed presentation of the ROI, selecting two or more convolutional neural network (CNN) layers from among a plurality of CNN layers each including a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI, generating an intermediate image from each of the selected CNN layers including a sequence of intermediate size image frames of the ROI, and rendering an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI.

Inventors

ZHANG, NAN
ZHANG, WENHAO
LI, WANJUN

Assignees

Qualcomm Incorporated

Dates

Publication Date: 2026-05-06
Application Date: 2023-06-28

Claims (20)

A method of performing a zoom operation on a computing device, comprising: identifying a region of interest (ROI) of an image presented on a display of the computing device based on a zoom user input received by the computing device; determining a scaling ratio for a final zoomed presentation of the ROI; selecting two or more convolutional neural network (CNN) layers from among a plurality of CNN layers, each CNN layer comprising a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI; generating an intermediate image from each of the selected CNN layers, comprising a sequence of intermediate size image frames of the ROI each having dimensions smaller than dimensions of the final zoomed presentation of the ROI and greater than dimensions of the ROI; and rendering an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI.
The method of claim 1, wherein at least one of the selected CNN layers includes a non-integer scaling ratio.
The method of claim 1, wherein selecting two or more CNN layers from among the plurality of CNN layers comprises selecting a quantity of the plurality of CNN layers based on an application requirement of an application executing in the computing device that provided the image.
The method of claim 1, wherein selecting two or more CNN layers from among the plurality of CNN layers comprises selecting a quantity of the plurality of CNN layers based on an available quantity of memory of the computing device.
The method of claim 1, further comprising generating the plurality of CNN layers based on animation requirements of an image-providing application executing in the computing device.
The method of claim 5, wherein at least one of the generated plurality of CNN layers includes a non-integer scaling ratio.
A computing device, comprising: a processing system configured to: identify a region of interest (ROI) of an image presented on a display of the computing device based on a zoom user input received by the computing device; determine a scaling ratio for a final zoomed presentation of the ROI; select two or more convolutional neural network (CNN) layers from among a plurality of CNN layers, each CNN layer comprising a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI; generate an intermediate image from each of the selected CNN layers, comprising a sequence of intermediate size image frames of the ROI each having dimensions smaller than dimensions of the final zoomed presentation of the ROI and greater than dimensions of the ROI; and render an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI.
The computing device of claim 7, wherein at least one of the selected CNN layers includes a non-integer scaling ratio.
The computing device of claim 7, wherein the processing system is further configured to select a quantity of the plurality of CNN layers based on an application requirement of an application executing in the computing device that provided the image.
The computing device of claim 7, wherein the processing system is further configured to select a quantity of the plurality of CNN layers based on an available quantity of memory of the computing device.
The computing device of claim 7, wherein the processing system is further configured to generate the plurality of CNN layers based on animation requirements of an image-providing application executing in the computing device.
The computing device of claim 11, wherein the processing system is further configured such that at least one of the generated plurality of CNN layers includes a non-integer scaling ratio.
A computing device, comprising: means for identifying a region of interest (ROI) of an image presented on a display of the computing device based on a zoom user input received by the computing device; means for determining a scaling ratio for a final zoomed presentation of the ROI; means for selecting two or more convolutional neural network (CNN) layers from among a plurality of CNN layers, each CNN layer comprising a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI; means for generating an intermediate image from each of the selected CNN layers, comprising a sequence of intermediate size image frames of the ROI each having dimensions smaller than dimensions of the final zoomed presentation of the ROI and greater than dimensions of the ROI; and means for rendering an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI.
The computing device of claim 13, wherein at least one of the selected CNN layers includes a non-integer scaling ratio.
The computing device of claim 13, wherein means for selecting two or more CNN layers from among the plurality of CNN layers comprises means for selecting a quantity of the plurality of CNN layers based on an application requirement of an application executing in the computing device that provided the image.
The computing device of claim 13, wherein means for selecting two or more CNN layers from among the plurality of CNN layers comprises means for selecting a quantity of the plurality of CNN layers based on an available quantity of memory of the computing device.
The computing device of claim 13, further comprising means for generating the plurality of CNN layers based on animation requirements of an image-providing application executing in the computing device.
The computing device of claim 17, wherein at least one of the generated plurality of CNN layers includes a non-integer scaling ratio.
A non-transitory processor-readable medium having stored thereon processor-executable instructions configured to cause a processing system in a computing device to perform operations comprising: identifying a region of interest (ROI) of an image presented on a display of the computing device based on a zoom user input received by the computing device; determining a scaling ratio for a final zoomed presentation of the ROI; selecting two or more convolutional neural network (CNN) layers from among a plurality of CNN layers, each CNN layer comprising a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI; generating an intermediate image from each of the selected CNN layers, comprising a sequence of intermediate size image frames of the ROI each having dimensions smaller than dimensions of the final zoomed presentation of the ROI and greater than dimensions of the ROI; and rendering an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI.
The non-transitory processor-readable medium of claim 19, wherein the stored processor-executable instructions are further configured to cause the processing system in the computing device to perform operations such that at least one of the selected CNN layers includes a non-integer scaling ratio.

Description

PERFORMING ZOOM OPERATION ON COMPUTING DEVICE WITH CONVOLUTIONAL NEURAL NETWORK LAYERS

BACKGROUND

Computing devices, such as smartphones, can utilize Artificial Intelligence (AI) for image processing using a dedicated AI processor or AI accelerator that is designed to run machine learning algorithms efficiently. The AI processor/accelerator also may execute convolutional neural networks (CNNs) as part of a CNN-based AI image processing system. CNNs are well suited for various image processing tasks such as image recognition, object detection, image resizing, and image segmentation.

While CNNs can be used to enlarge images in support of image zooming operations, CNN processing for image magnification takes time to complete, and thus would result in zoom animations that are not smooth if used directly. CNN processing is also power hungry, consuming an inordinate amount of battery power to enlarge images. Consequently, in many computing devices, a run-time zoom or run-time magnify operation is only available within certain applications executing on the computing device.

SUMMARY

Various aspects include methods of performing a zoom operation on a computing device.
Various aspects may include identifying a region of interest (ROI) of an image presented on a display of the computing device based on a zoom user input received by the computing device, determining a scaling ratio for a final zoomed presentation of the ROI, selecting two or more convolutional neural network (CNN) layers from among a plurality of CNN layers in which each CNN layer includes a scaling ratio that is less than the scaling ratio for the final zoomed presentation of the ROI, generating an intermediate image from each of the selected CNN layers in which the ROI in each image has dimensions smaller than dimensions of the final zoomed presentation of the ROI and greater than dimensions of the ROI, and rendering an animation of changes in a displayed size of the ROI using the sequence of intermediate size image frames of the ROI. In some aspects, at least one of the selected CNN layers may include a non-integer scaling ratio. In some aspects, selecting two or more CNN layers from among the plurality of CNN layers may include selecting a quantity of the plurality of CNN layers based on an application requirement of an application executing in the computing device that provided the image. In some aspects, selecting two or more CNN layers from among the plurality of CNN layers may include selecting a quantity of the plurality of CNN layers based on an available quantity of memory of the computing device. Some aspects may include generating the plurality of CNN layers based on animation requirements of an image-providing application executing in the computing device. In some aspects, at least one of the generated plurality of CNN layers may include a non-integer scaling ratio. Further aspects may include a computing device having a processing system configured with processor-executable instructions to perform various operations corresponding to the methods discussed above. 
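The layer-selection and frame-sizing operations summarized above can be illustrated with a minimal sketch. This is not the applicants' implementation: the names `Roi`, `select_layer_ratios`, and `intermediate_frame_sizes` are hypothetical, the `max_layers` parameter is an assumed stand-in for an application requirement or memory budget, and no CNN inference is run. The sketch only computes which scaling ratios would be chosen and what frame dimensions they would yield.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Roi:
    """Pixel dimensions of the region of interest (hypothetical type)."""
    width: int
    height: int


def select_layer_ratios(available: List[float], final_ratio: float,
                        max_layers: int) -> List[float]:
    """Pick two or more ratios strictly between 1 and final_ratio.

    max_layers is an assumed cap standing in for an application
    requirement or an available-memory limit.
    """
    if max_layers < 2:
        raise ValueError("at least two layers must be selectable")
    # Keep only layers that enlarge the ROI but stay below the final zoom.
    candidates = sorted(r for r in available if 1.0 < r < final_ratio)
    if len(candidates) < 2:
        raise ValueError("need two or more layers below the final ratio")
    if len(candidates) <= max_layers:
        return candidates
    # Subsample evenly, keeping the smallest and largest candidate.
    last = len(candidates) - 1
    return [candidates[i * last // (max_layers - 1)]
            for i in range(max_layers)]


def intermediate_frame_sizes(roi: Roi,
                             ratios: List[float]) -> List[Tuple[int, int]]:
    """Frame dimensions each selected layer would produce, smallest first."""
    return [(round(roi.width * r), round(roi.height * r)) for r in ratios]


# Example: a 200x100 ROI zoomed to a final ratio of 3.0, capped at 3 layers.
ratios = select_layer_ratios([1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 4.0],
                             final_ratio=3.0, max_layers=3)
sizes = intermediate_frame_sizes(Roi(200, 100), ratios)
print(ratios)  # [1.25, 1.75, 2.5]
print(sizes)   # [(250, 125), (350, 175), (500, 250)]
```

Every frame in the resulting sequence is larger than the ROI and smaller than the 600x300 final presentation, which matches the constraint stated for the intermediate size image frames. Non-integer ratios such as 1.25 are permitted, consistent with the aspects in which a selected CNN layer includes a non-integer scaling ratio.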
Further aspects may include a non-transitory processor-readable storage medium having stored thereon processor-executable instructions configured to cause a processing system to perform various operations corresponding to the method operations discussed above. Further aspects may include a computing device having various means for performing functions corresponding to the method operations discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated herein and constitute part of this specification, illustrate exemplary embodiments of the claims, and together with the general description given and the detailed description, serve to explain the features herein.

FIG. 1A is a conceptual diagram illustrating aspects of a method 100 for performing a zoom operation on a computing device in accordance with various embodiments.
FIG. 1B is a conceptual diagram illustrating aspects of a method 150 for performing a zoom operation on a computing device in accordance with various embodiments.
FIGS. 2A and 2B illustrate an example neural network suitable for implementation in a computing device in accordance with various embodiments.
FIGS. 3A and 3B illustrate example functionality components that may be included in a convolutional neural network, which may be implemented in a computing device that is configured to implement a generalized framework for continual learning in accordance with various embodiments.
FIG. 4 is a component block diagram illustrating an example computing system implemented as a system in a package (SIP) suitable for implementing various embodiments.
FIG. 5A illustrates a method of performing a zoom operation on a computing device according to various embodiments.
FIG. 5B illustrates operations that may be performed as part of the method of performing a zoom operation on a computing device according to various embodiments.
FIG. 6 is a c