KR-20260067821-A - IMAGE PROCESSING METHOD BASED ON GLOBAL MOTION ESTIMATION AND APPARATUS USING THE SAME


Abstract

An image processing method based on global motion estimation and an apparatus utilizing the same are provided. The method may include the steps of executing a neural network-based global motion estimation model based on a current image frame and a reference image frame to estimate global motion parameters corresponding to components of global motion between the current image frame and the reference image frame, combining the global motion parameters to generate a geometric transformation matrix, and generating one or more of an output image and an output video using the geometric transformation matrix.

Inventors

지승훈
오바울
오도관
이준희
황찬솔

Assignees

Samsung Electronics Co., Ltd. (삼성전자주식회사)

Dates

Publication Date: 2026-05-13
Application Date: 2024-11-06

Claims (20)

An image processing method based on global motion estimation, the method comprising: executing a neural network-based global motion estimation model based on a current image frame and a reference image frame to estimate global motion parameters corresponding to components of global motion between the current image frame and the reference image frame; generating a geometric transformation matrix by combining the global motion parameters; and generating one or more of an output image and an output video using the geometric transformation matrix.
The image processing method of claim 1, wherein generating the one or more of the output image and the output video comprises generating the output video by encoding the current image frame and the reference image frame using the geometric transformation matrix.
The image processing method of claim 2, wherein generating the output video comprises inputting the current image frame, the reference image frame, and the geometric transformation matrix into a video codec that supports the geometric transformation matrix.
The image processing method of claim 3, wherein the video codec supports one or more of a translation mode using global translation motion, a rotation mode using global rotation motion, a zoom mode using global zoom motion, a rotation-and-zoom mode using global rotation motion and global zoom motion, and an affine mode using global translation motion, global rotation motion, global zoom motion, and global shear motion.
The image processing method of claim 4, wherein the global motion estimation model comprises sub-models corresponding to one or more of the translation mode, the rotation mode, the zoom mode, the rotation-and-zoom mode, and the affine mode.
The image processing method of claim 1, wherein the geometric transformation matrix is an affine transformation matrix.
The image processing method of claim 1, wherein generating the geometric transformation matrix comprises: determining one or more function values by substituting one or more of the global motion parameters into one or more functions; and determining elements of the geometric transformation matrix by combining the global motion parameters based on operations between the global motion parameters, operations between the global motion parameters and the one or more function values, operations between multiple function values of the one or more function values, or a combination thereof.
The image processing method of claim 1, wherein generating the one or more of the output image and the output video comprises driving an image signal processor (ISP) using the geometric transformation matrix to generate the output image.
The image processing method of claim 1, wherein the geometric transformation matrix is a homography transformation matrix.
The image processing method of claim 1, further comprising scaling the current image frame and the reference image frame to a target size to generate a scaled current image frame and a scaled reference image frame, wherein the global motion estimation model is executed based on the scaled current image frame and the scaled reference image frame.
The image processing method of claim 10, wherein the target size is adjusted to ensure real-time video encoding.
The image processing method of claim 1, wherein the global motion estimation model comprises a first estimation model for estimating first global motion parameters and a second estimation model for estimating second global motion parameters, and the geometric transformation matrix is a homography transformation matrix comprising an affine transformation matrix determined by combining the first global motion parameters and a homography transformation matrix determined by combining the second global motion parameters.
An electronic device comprising: a camera that generates a current image frame and a reference image frame; a global motion estimator that stores a neural network-based global motion estimation model and executes the global motion estimation model based on the current image frame and the reference image frame to estimate global motion parameters corresponding to components of global motion between the current image frame and the reference image frame; a transformation matrix generator that generates a geometric transformation matrix by combining the global motion parameters; and a video codec that generates an output video using the geometric transformation matrix.
The electronic device of claim 13, wherein the video codec supports one or more of a translation mode that supports global translation motion, a rotation mode that supports global rotation motion, a zoom mode that supports global zoom motion, a rotation-and-zoom mode that supports global rotation motion and global zoom motion, and an affine mode that supports global translation motion, global rotation motion, global zoom motion, and global shear motion.
The electronic device of claim 13, wherein the geometric transformation matrix is an affine transformation matrix.
The electronic device of claim 13, wherein the transformation matrix generator determines one or more function values by substituting one or more of the global motion parameters into one or more functions, and determines elements of the geometric transformation matrix by combining the global motion parameters based on operations between the global motion parameters, operations between the global motion parameters and the one or more function values, operations between multiple function values of the one or more function values, or a combination thereof.
The electronic device of claim 13, wherein the current image frame and the reference image frame are scaled to a target size to generate a scaled current image frame and a scaled reference image frame, and the global motion estimation model is executed based on the scaled current image frame and the scaled reference image frame.
An electronic device comprising: a camera that generates a current image frame and a reference image frame; a global motion estimator that stores a neural network-based global motion estimation model and executes the global motion estimation model based on the current image frame and the reference image frame to estimate global motion parameters corresponding to components of global motion between the current image frame and the reference image frame; a transformation matrix generator that generates a geometric transformation matrix by combining the global motion parameters; and an image signal processor that generates an output image using the geometric transformation matrix.
The electronic device of claim 18, wherein the transformation matrix generator determines one or more function values by substituting one or more of the global motion parameters into one or more functions, and determines elements of the geometric transformation matrix by combining the global motion parameters based on operations between the global motion parameters, operations between the global motion parameters and the one or more function values, operations between multiple function values of the one or more function values, or a combination thereof.
The electronic device of claim 18, wherein the geometric transformation matrix is a homography transformation matrix.
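
The claims above describe combining global motion parameters (translation, rotation, zoom, shear) into a geometric transformation matrix, with some matrix elements obtained by first substituting parameters into functions. A minimal Python sketch of how this might look for the affine case follows. The function names, the use of cosine and sine as the substituted functions, and the composition order (translation, then rotation, zoom, and shear applied to the linear part) are assumptions made for illustration; the source does not specify them.

```python
import math


def affine_from_global_motion(tx=0.0, ty=0.0, theta=0.0, zoom=1.0, shear=0.0):
    """Build a 3x3 affine matrix from global motion parameters.

    Hypothetical parameterization (not taken from the source):
    M = T(tx, ty) * R(theta) * Z(zoom) * Shear_x(shear).
    """
    # "Function values": the rotation angle substituted into cos and sin.
    c = math.cos(theta)
    s = math.sin(theta)
    # Matrix elements combine parameters with each other and with the
    # function values (products and sums).
    return [
        [zoom * c, zoom * (shear * c - s), tx],
        [zoom * s, zoom * (shear * s + c), ty],
        [0.0, 0.0, 1.0],
    ]


def apply_transform(matrix, x, y):
    """Map a pixel coordinate (x, y) through a 3x3 matrix.

    The division by w makes the same helper usable for a homography;
    for an affine matrix w is always 1.
    """
    u = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    v = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return u / w, v / w


if __name__ == "__main__":
    # A restricted mode corresponds to leaving the other parameters at
    # their identity defaults, e.g. a rotation-and-zoom mode:
    m = affine_from_global_motion(theta=math.radians(2.0), zoom=1.05)
    print(apply_transform(m, 100.0, 50.0))
```

In this sketch the translation, rotation, zoom, and rotation-and-zoom modes are simply special cases of the affine mode in which the unused parameters keep their identity values, which is one plausible way to relate the modes listed in the claims.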

Description

Image processing method based on global motion estimation and apparatus using the same
The following embodiments relate to a global motion estimation-based image processing method and an apparatus using the same.
During the generation of images or videos, image signal processing may be performed to mitigate image degradation, and video compression may be performed to store video files efficiently. Image signal processing or video compression can improve image quality or reduce video file size by exploiting the correlation between frames. This correlation can be derived through motion estimation, which compares frames in block units. Motion estimation can be performed within a specific search range. When the search range is limited, the block matching rate can be improved by considering global motion caused by factors such as camera motion. Global motion can also be utilized in image signal processing, such as image stabilization.
FIG. 1 is a diagram illustrating exemplary operations for generating global motion parameters and a geometric transformation matrix using a global motion estimation model according to one embodiment.
FIG. 2 is a diagram illustrating exemplary operations for generating an affine transformation matrix as a geometric transformation matrix according to one embodiment.
FIG. 3 is a diagram illustrating an exemplary global motion estimation model including sub-models according to one embodiment.
FIG. 4 is a diagram illustrating operations for generating a homography transformation matrix as a geometric transformation matrix according to one embodiment.
FIG. 5 is a diagram illustrating exemplary training and inference stages of a global motion estimation model according to one embodiment.
FIG. 6 is a diagram illustrating an exemplary frame prediction model used for training a global motion estimation model according to one embodiment.
FIG. 7 is a diagram illustrating an exemplary motion kernel estimation model of a frame prediction model according to one embodiment.
FIG. 8 is a diagram illustrating an exemplary unfolding operation of a motion kernel estimation model according to one embodiment.
FIG. 9 is a block diagram showing an exemplary configuration of an electronic device according to one embodiment.
FIG. 10 is a block diagram showing another exemplary configuration of an electronic device according to one embodiment.
FIG. 11 is a flowchart illustrating an exemplary global motion estimation-based image processing method according to one embodiment.
FIG. 12 is a block diagram showing another exemplary configuration of an electronic device according to one embodiment.
Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be modified and implemented in various forms. Accordingly, actual implementations are not limited to the specific embodiments disclosed, and the scope of this specification includes modifications, equivalents, or substitutions included in the technical concept described by the embodiments.
Terms such as "first" or "second" may be used to describe various components, but these terms should be interpreted solely for the purpose of distinguishing one component from another. For example, a first component may be named a second component, and similarly, a second component may be named a first component.
When it is stated that a component is "connected" to another component, it should be understood that the component may be directly connected or joined to that other component, or that other components may be present in between.
A singular expression includes the plural expression unless the context clearly indicates otherwise.
In this specification, terms such as "comprising" or "having" are intended to specify the existence of the described features, numbers, steps, actions, components, parts, or combinations thereof, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof. In this specification, each of the phrases such as “at least one of A or B” and “at least one of A, B, or C” may include any one of the items listed together with the corresponding phrase, or all possible combinations thereof. Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as generally understood by those skilled in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant technology, and should not be interpreted in an ideal or overly formal sense unless explicitly defined in this specification. Hereinafter, embodiments will be described in detail with reference to the attached drawings. In the description with reference to the attached drawings, identical components are given the same reference numeral regardless of the drawing number, and re