US-20260129208-A1 - IMAGE PROCESSING METHOD BASED ON GLOBAL MOTION ESTIMATION AND DEVICE USING THE SAME


Abstract

A method of image processing based on global motion estimation including: estimating global motion parameters corresponding to components of a global motion between a current image frame and a reference image frame by executing a global motion estimation model comprising one or more neural networks that input the current image frame and the reference image frame; generating a geometric transformation matrix by combining the global motion parameters; and generating at least one of an output image and an output video using the geometric transformation matrix.

Inventors

Seunghoon JEE
Paul OH
Dokwan OH
Junhee Lee
Chansol Hwang

Assignees

SAMSUNG ELECTRONICS CO., LTD.

Dates

Publication Date: 20260507
Application Date: 20250401
Priority Date: 20241106

Claims (20)

1. A method of image processing based on global motion estimation, the method comprising: estimating global motion parameters corresponding to components of a global motion between a current image frame and a reference image frame by executing a global motion estimation model comprising one or more neural networks that input the current image frame and the reference image frame; generating a geometric transformation matrix by combining the global motion parameters; and generating at least one of an output image and an output video using the geometric transformation matrix.
2. The method of claim 1, wherein the generating the at least one of the output image and the output video further comprises: generating the output video by encoding the current image frame and the reference image frame using the geometric transformation matrix.
3. The method of claim 2, wherein the generating the at least one of the output image and the output video further comprises: inputting the current image frame, the reference image frame, and the geometric transformation matrix into a video codec configured to execute one or more operations using the geometric transformation matrix.
4. The method of claim 3, wherein the video codec is configured to execute at least one of a translation mode using a global translation motion, a rotation mode using a global rotation motion, a zoom mode using a global zoom motion, a rotation and zoom mode using global rotation and global zoom, and an affine mode using the global translation motion, the global rotation motion, the global zoom motion, and a global shear motion.
5. The method of claim 4, wherein the global motion estimation model comprises one or more sub-models corresponding to at least one of the translation mode, the rotation mode, the zoom mode, the rotation and zoom mode, and the affine mode.
6. The method of claim 1, wherein the geometric transformation matrix is an affine transformation matrix.
7. The method of claim 1, wherein the generating of the geometric transformation matrix comprises: determining one or more function values by substituting one or more global motion parameters into one or more functions; and determining one or more elements of the geometric transformation matrix by combining the global motion parameters based on (i) operations between the global motion parameters, (ii) operations between the global motion parameters and the one or more function values, (iii) operations between a plurality of function values of the one or more function values, or (iv) a combination thereof.
8. The method of claim 1, wherein the generating of the at least one of the output image and the output video comprises: generating the output image by driving an image signal processor (ISP) using the geometric transformation matrix.
9. The method of claim 1, wherein the geometric transformation matrix is a homography transformation matrix.
10. The method of claim 1, further comprising: generating a scaled current image frame by scaling the current image frame to a target size; and generating a scaled reference image frame by scaling the reference image frame to the target size, wherein the global motion estimation model is executed based on the scaled current image frame and the scaled reference image frame.
11. The method of claim 10, wherein the target size is adjusted to guarantee that video encoding is performed within a predetermined amount of time.
12. The method of claim 1, wherein the global motion estimation model comprises a first estimation model configured to estimate first global motion parameters and a second estimation model configured to estimate second global motion parameters, and the geometric transformation matrix comprises an affine transformation matrix determined by a combination of the first global motion parameters and a homography transformation matrix determined by a combination of the second global motion parameters.
13. An electronic device comprising: a camera configured to generate a current image frame and a reference image frame; a memory storing one or more instructions; a video codec; and at least one processor operatively coupled to the memory, the camera, and the video codec, wherein the one or more instructions, when executed by the at least one processor, cause the electronic device to: store a global motion estimation model based on a neural network and estimate global motion parameters corresponding to components of a global motion between the current image frame and the reference image frame by executing the global motion estimation model comprising one or more neural networks that input the reference image frame; generate a geometric transformation matrix by combining the global motion parameters; and control the video codec to generate an output video using the geometric transformation matrix.
14. The electronic device of claim 13, wherein the one or more instructions, when executed by the at least one processor, cause the electronic device to execute at least one of a translation mode supporting a global translation motion, a rotation mode supporting a global rotation motion, a zoom mode supporting a global zoom motion, a rotation and zoom mode supporting global rotation and global zoom, and an affine mode supporting the global translation motion, the global rotation motion, the global zoom motion, and a global shear motion.
15. The electronic device of claim 13, wherein the geometric transformation matrix is an affine transformation matrix.
16. The electronic device of claim 13, wherein the one or more instructions, when executed by the at least one processor, cause the electronic device, to generate the geometric transformation matrix, to: determine one or more function values by substituting one or more global motion parameters into one or more functions, and determine one or more elements of the geometric transformation matrix by combining the global motion parameters based on (i) operations between the global motion parameters, (ii) operations between the global motion parameters and the one or more function values, (iii) operations between a plurality of function values of the one or more function values, or (iv) a combination thereof.
17. The electronic device of claim 13, wherein a scaled current image frame is generated by scaling the current image frame to a target size and a scaled reference image frame is generated by scaling the reference image frame to the target size, and wherein the global motion estimation model is executed based on the scaled current image frame and the scaled reference image frame.
18. An electronic device comprising: a camera configured to generate a current image frame and a reference image frame; a memory storing one or more instructions; an image signal processor (ISP); at least one processor operatively coupled to the memory, the camera, and the ISP; wherein the one or more instructions, when executed by the at least one processor, cause the electronic device to: store a global motion estimation model based on a neural network and estimate global motion parameters corresponding to components of a global motion between the current image frame and the reference image frame by executing the global motion estimation model based on the current image frame and the reference image frame; generate a geometric transformation matrix by combining the global motion parameters; and control the ISP to generate an output image using the geometric transformation matrix.
19. The electronic device of claim 18, wherein the one or more instructions, when executed by the at least one processor, cause the electronic device, to generate the geometric transformation matrix, to: determine one or more function values by substituting one or more global motion parameters into one or more functions, and determine one or more elements of the geometric transformation matrix by combining the global motion parameters based on (i) operations between the global motion parameters, (ii) operations between the global motion parameters and the one or more function values, (iii) operations between a plurality of function values of the one or more function values, or (iv) a combination thereof.
20. The electronic device of claim 18, wherein the geometric transformation matrix is a homography transformation matrix.
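
As a rough illustration of how claims 4 and 7 might fit together, the sketch below combines a handful of global motion parameters into a 3x3 affine matrix. The parameter names (`tx`, `ty`, `theta`, `zoom`, `shear`), the helper names, and the composition order (shear, then zoom, then rotation, then translation) are assumptions made for this example; the claims do not fix any particular formula. The cosine and sine of the rotation angle play the role of the "function values", and the matrix elements are products and sums of parameters and function values.

```python
import math


def compose_affine(tx=0.0, ty=0.0, theta=0.0, zoom=1.0, shear=0.0):
    """Combine hypothetical global motion parameters into a 3x3 affine matrix.

    Assumed convention (not taken from the source): A = T @ R @ Z @ H,
    i.e. shear is applied first, then zoom, then rotation, then translation.
    """
    # Function values obtained by substituting a parameter into a function.
    c, s = math.cos(theta), math.sin(theta)
    # 2x2 linear part: zoom * [[c, -s], [s, c]] @ [[1, shear], [0, 1]]
    a = zoom * c
    b = zoom * (c * shear - s)
    d = zoom * s
    e = zoom * (s * shear + c)
    return [[a, b, tx], [d, e, ty], [0.0, 0.0, 1.0]]


def apply_affine(m, x, y):
    """Map the point (x, y) through the affine matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Each mode is the same composition with unused parameters left at identity.
translation_mode = compose_affine(tx=2.0, ty=3.0)
rotation_zoom_mode = compose_affine(theta=math.pi / 6, zoom=1.1)
affine_mode = compose_affine(tx=2.0, ty=3.0, theta=0.1, zoom=1.1, shear=0.05)
```

Leaving unused parameters at their identity values lets one function cover the translation, rotation, zoom, rotation and zoom, and affine modes; a real codec may define its own parameterization and element order.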

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2024-0156411, filed on Nov. 6, 2024, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.
BACKGROUND
1. Field
The following description relates to an image processing method based on global motion estimation and a device using the same.
2. Description of Related Art
In a process of generating an image or a video, image signal processing may be performed to resolve degradation of the image, or video compression may be performed to store a video file efficiently. Based on the correlation between frames, image signal processing may improve the quality of the image and video compression may reduce the size of the video. The correlation between frames may be derived through motion estimation that compares the frames block-wise. The motion estimation may be performed within a predetermined search range. When the search range is restricted, the block matching rate may be improved by taking into account a global motion caused by, for example, a camera motion. Furthermore, the global motion may be used for image signal processing such as image stabilization.
SUMMARY
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
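
The Background's remark that a global motion can improve block matching under a restricted search range can be sketched as follows. This is a minimal illustration, not the disclosed method: the function names, the sum-of-absolute-differences cost, and the `offset` parameter (standing in for a global-motion prediction) are all assumptions introduced for the example. The search window stays the same size; only its centre moves.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
        for j in range(size)
        for i in range(size)
    )


def match_block(cur, ref, x, y, size, search, offset=(0, 0)):
    """Search `ref` for the block located at (x, y) in `cur`.

    The +/- `search` window is centred on (x, y) + offset, where `offset`
    is a hypothetical global-motion prediction. Returns (dx, dy, cost) for
    the best candidate, or None if no candidate lies inside `ref`.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(offset[1] - search, offset[1] + search + 1):
        for dx in range(offset[0] - search, offset[0] + search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that would read outside the reference frame.
            if 0 <= rx <= w - size and 0 <= ry <= h - size:
                cost = sad(cur, ref, x, y, rx, ry, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best
```

With a camera pan larger than `search`, the unshifted window never reaches the true match, whereas the same window centred on the global-motion prediction can find it at no extra search cost.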
According to an aspect of the disclosure, a method of image processing based on global motion estimation includes: estimating global motion parameters corresponding to components of a global motion between a current image frame and a reference image frame by executing a global motion estimation model comprising one or more neural networks that input the current image frame and the reference image frame; generating a geometric transformation matrix by combining the global motion parameters; and generating at least one of an output image and an output video using the geometric transformation matrix.
According to an aspect of the disclosure, an electronic device includes: a camera configured to generate a current image frame and a reference image frame; a memory storing one or more instructions; a video codec; and at least one processor operatively coupled to the memory, the camera, and the video codec, in which the one or more instructions, when executed by the at least one processor, cause the electronic device to: store a global motion estimation model based on a neural network and estimate global motion parameters corresponding to components of a global motion between the current image frame and the reference image frame by executing the global motion estimation model comprising one or more neural networks that input the reference image frame; generate a geometric transformation matrix by combining the global motion parameters; and control the video codec to generate an output video using the geometric transformation matrix.
According to an aspect of the disclosure, an electronic device includes: a camera configured to generate a current image frame and a reference image frame; a memory storing one or more instructions; an image signal processor (ISP); and at least one processor operatively coupled to the memory, the camera, and the ISP, in which the one or more instructions, when executed by the at least one processor, cause the electronic device to: store a global motion estimation model based on a neural network and estimate global motion parameters corresponding to components of a global motion between the current image frame and the reference image frame by executing the global motion estimation model comprising one or more neural networks that input the reference image frame; generate a geometric transformation matrix by combining the global motion parameters; and control the ISP to generate an output image using the geometric transformation matrix.
Other features and aspects will be apparent from the following detailed description, the drawings, and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating an example of operations of generating global motion parameters and a geometric transformation matrix using a global motion estimation model, according to one or more embodiments.
FIG. 2 is a diagram illustrating an example of operations of generating an affine transformation matrix as a geometric transformation matrix, according to one or more embodiments.
FIG. 3 is a diagram illustrating an example of a global motion estimation model including sub-models, according to one or more embodiments.
FIG. 4 is a diagram illustrating operations of generating a homography transformation matrix as a geometric transformation matrix, according to one or more embodiments.
FIG. 5 is a diagram illustrating an example of training and inferen