US-12620061-B2 - Method and apparatus with image processing

US12620061B2

Abstract

A processor-implemented method includes estimating a transformation model using a transformation determination neural network model, provided motion sensor detected motion data representing motion of an image sensor with respect to a first image frame and a subsequent second image frame captured by the image sensor, to perform a transformation based on global motion between the first image frame and the second image frame, and generating output image data by combining, by using the transformation model, the first image frame and the second image frame.
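As a rough illustration of the pipeline the abstract describes, motion-sensor data can be mapped to a 2-D transformation and used to align and combine two frames. This is a minimal sketch, not the patented implementation: the linear "network", the affine parameterization, and the averaging combiner below are all assumptions.

```python
import numpy as np

def estimate_transform(motion_data, weights):
    # Hypothetical stand-in for the transformation determination
    # neural network: a single linear layer mapping motion-sensor
    # samples to the six parameters of a 2-D affine transform.
    a, b, tx, c, d, ty = weights @ motion_data
    return np.array([[1 + a, b, tx],
                     [c, 1 + d, ty],
                     [0.0, 0.0, 1.0]])

def combine_frames(frame1, frame2, T):
    # Warp frame1 into frame2's coordinates with T (nearest
    # neighbour) and average the overlapping pixels -- one simple
    # way to "combine" two globally aligned frames.
    h, w = frame1.shape
    out = frame2.astype(float).copy()
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    mapped = T @ pts
    mx = np.round(mapped[0] / mapped[2]).astype(int)
    my = np.round(mapped[1] / mapped[2]).astype(int)
    ok = (mx >= 0) & (mx < w) & (my >= 0) & (my < h)
    out[my[ok], mx[ok]] = (out[my[ok], mx[ok]] + frame1.ravel()[ok]) / 2.0
    return out
```

With zero weights the estimated transform is the identity, so combining reduces to a plain average of the two frames.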

Inventors

  • Seunghoon JEE

Assignees

  • SAMSUNG ELECTRONICS CO., LTD.

Dates

Publication Date
2026-05-05
Application Date
2023-05-18
Priority Date
2022-11-15

Claims (18)

  1. A processor-implemented method comprising: estimating a transformation model using a transformation determination neural network model, provided motion sensor detected motion data representing motion of an image sensor with respect to a first image frame at a first time and a second image frame at a second time different from the first time captured by the image sensor, to perform a transformation based on global motion between the first image frame and the second image frame; and generating output image data by combining, by using the transformation model, the first image frame and the second image frame, wherein a first sensing period of the motion sensor is less than a second sensing period of the image sensor, and wherein motion data generated from output data of the motion sensor that is collected during the second sensing period is input to the transformation determination neural network model.
  2. The method of claim 1, further comprising: generating input image data comprising the first image frame and the second image frame using the image sensor; and generating the motion data using the motion sensor, wherein the generating of the output image data comprises encoding the first image frame and the second image frame into video data corresponding to the output image data using the transformation model.
  3. The method of claim 2, wherein the video data comprises matching data between pixel blocks of the first image frame and pixel blocks of the second image frame, and wherein the encoding comprises: setting a search area of the second image frame for block matching of a first pixel block of the first image frame by using the transformation model; and searching for a second pixel block matching with the first pixel block in the search area of the second image frame.
  4. The method of claim 3, wherein the setting of the search area comprises: transforming a first position of the first pixel block of the first image frame to a second position of the second image frame by using the transformation model; and setting the search area of the second image frame according to the second position.
  5. The method of claim 3, wherein the setting of the search area comprises transforming a search area of the first image frame according to the first pixel block of the first image frame to the search area of the second image frame by using the transformation model.
  6. The method of claim 1, wherein the generating of the output image data comprises generating photo data corresponding to the output image data by compensating the global motion between the first image frame and the second image frame by using the transformation model.
  7. The method of claim 1, wherein the motion data comprises at least some of acceleration data and angular velocity data according to the motion of the image sensor between a first time point and a second time point.
  8. The method of claim 1, further comprising generating the transformation determination neural network model by training an in-training transformation determination neural network model using training data based on a sensing result obtained by sensing a corresponding motion of the image sensor by the motion sensor with respect to a training image of a test pattern captured by the image sensor.
  9. The method of claim 1, further comprising generating the transformation determination neural network model by: determining a first test transformation model by performing vision analysis on test image data obtained by capturing a provided test pattern through the image sensor; estimating a second test transformation model by inputting, to the transformation determination neural network model, test motion data obtained by sensing the motion of the image sensor through the motion sensor while the image sensor captures the provided test pattern; determining first loss data corresponding to a difference between the first test transformation model and the second test transformation model; and generating the transformation determination neural network model based on the first loss data.
  10. The method of claim 9, wherein generating the transformation determination neural network model further comprises: generating a first result image by transforming an additional test image by using the first test transformation model; generating a second result image by transforming the additional test image by using the second test transformation model; determining second loss data corresponding to a difference between the first result image and the second result image; and generating the transformation determination neural network model by training an in-training transformation determination neural network model based on the first loss data and the second loss data.
  11. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.
  12. An electronic device comprising: one or more processors configured to: estimate a transformation model using a transformation determination neural network model, provided motion sensor detected motion data representing motion of an image sensor with respect to a first image frame at a first time and a second image frame at a second time different from the first time captured by the image sensor, to perform a transformation based on global motion between the first image frame and the second image frame; and generate output image data by combining, by using the transformation model, the first image frame and the second image frame, wherein a first sensing period of the motion sensor is less than a second sensing period of the image sensor, and wherein motion data generated from output data of the motion sensor that is collected during the second sensing period is input to the transformation determination neural network model.
  13. The electronic device of claim 12, further comprising: the image sensor configured to generate input image data comprising the first image frame and the second image frame; and the motion sensor configured to generate the motion data, wherein the one or more processors comprise a codec configured to encode the first image frame and the second image frame into video data corresponding to the output image data using the transformation model.
  14. The electronic device of claim 13, wherein the video data comprises matching data between pixel blocks of the first image frame and pixel blocks of the second image frame, and wherein the codec is configured to: set a search area of the second image frame for block matching of a first pixel block of the first image frame by using the transformation model; and search for a second pixel block matching with the first pixel block of the first image frame in the search area of the second image frame.
  15. The electronic device of claim 14, wherein the codec, to set the search area, is configured to transform a first position of the first pixel block of the first image frame to a second position of the second image frame by using the transformation model and set the search area of the second image frame according to the second position.
  16. The electronic device of claim 14, wherein the codec, to set the search area, is configured to transform a search area of the first image frame according to the first pixel block of the first image frame to the search area of the second image frame by using the transformation model.
  17. The electronic device of claim 12, wherein the one or more processors, to generate the output image data, are configured to generate photo data corresponding to the output image data by compensating the global motion between the first image frame and the second image frame by using the transformation model.
  18. The electronic device of claim 12, wherein the transformation determination neural network model is generated using training data based on a sensing result obtained by sensing a corresponding motion of the image sensor by the motion sensor with respect to a training image of a test pattern captured by the image sensor.
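Claims 3-5 and 14-16 describe motion-guided block matching: the transformation model projects a block's position from the first frame into the second frame, and the search window is placed around the projected point. The sketch below illustrates that idea with a square window and sum-of-absolute-differences (SAD) scoring; the function names, the 3x3 homogeneous transform convention, and the SAD criterion are assumptions, not the claimed codec.

```python
import numpy as np

def set_search_area(block_pos, T, radius, frame_shape):
    # Project the first frame's block position into the second frame
    # with the 3x3 homogeneous transformation model T, then place a
    # square search window of the given radius around the result.
    x, y = block_pos
    p = T @ np.array([x, y, 1.0])
    cx, cy = p[0] / p[2], p[1] / p[2]
    h, w = frame_shape
    return (int(max(0, cx - radius)), int(max(0, cy - radius)),
            int(min(w, cx + radius)), int(min(h, cy + radius)))

def block_match(frame2, block, block_pos, T, radius=4):
    # Search the motion-guided window for the block taken from the
    # first frame, scoring each candidate position by SAD.
    bh, bw = block.shape
    x0, y0, x1, y1 = set_search_area(block_pos, T, radius, frame2.shape)
    best, best_pos = np.inf, None
    for y in range(y0, min(y1, frame2.shape[0] - bh) + 1):
        for x in range(x0, min(x1, frame2.shape[1] - bw) + 1):
            sad = np.abs(frame2[y:y + bh, x:x + bw] - block).sum()
            if sad < best:
                best, best_pos = sad, (x, y)
    return best_pos, best
```

Guiding the window with the transformation model keeps the search small even under large global motion, which is the efficiency argument behind these claims.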

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2022-0152573, filed on Nov. 15, 2022, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.

BACKGROUND

1. Field

The following description relates to a method and apparatus with image processing.

2. Description of Related Art

A deep learning-based neural network may be used for image processing. The neural network may be trained based on deep learning and may perform inference for a desired purpose by mapping input data and output data that are in a nonlinear relationship to each other. Such a trained capability of generating the mapping may be referred to as a learning ability of the neural network.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

In one general aspect, a processor-implemented method may include estimating a transformation model using a transformation determination neural network model, provided motion sensor detected motion data representing motion of an image sensor with respect to a first image frame and a subsequent second image frame captured by the image sensor, to perform a transformation based on global motion between the first image frame and the second image frame; and generating output image data by combining, by using the transformation model, the first image frame and the second image frame.
The method may further include: generating input image data comprising the first image frame and the second image frame using the image sensor; generating the motion data using the motion sensor; and performing the generating of the output image data by encoding the first image frame and the second image frame into video data corresponding to the output image data using the transformation model.

The video data may include matching data between pixel blocks of the first image frame and pixel blocks of the second image frame, and the encoding may include setting a search area of the second image frame for block matching of a first pixel block of the first image frame by using the transformation model, and searching for a second pixel block matching with the first pixel block in the search area of the second image frame.

The setting of the search area may include transforming a first position of the first pixel block of the first image frame to a second position of the second image frame by using the transformation model, and setting the search area of the second image frame according to the second position. Alternatively, the setting of the search area may include transforming a search area of the first image frame according to the first pixel block of the first image frame to the search area of the second image frame by using the transformation model.

The generating of the output image data may include generating photo data corresponding to the output image data by compensating the global motion between the first image frame and the second image frame by using the transformation model. The motion data may include at least some of acceleration data and angular velocity data according to the motion of the image sensor between the first time and the second time.
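Since the motion sensor samples faster than the image sensor, several acceleration and angular-velocity readings accumulate during each frame interval. One illustrative way to combine them into a single network input is a simple numerical integration; this is a sketch under assumed units and a small-angle approximation, as the patent does not specify the aggregation.

```python
import numpy as np

def aggregate_motion(gyro_samples, accel_samples, dt):
    # Combine the motion-sensor output collected during one image
    # frame interval: integrate angular velocity (rad/s) into a net
    # rotation and acceleration (m/s^2) into a net velocity change,
    # then concatenate both as a 6-D feature for the network.
    gyro = np.asarray(gyro_samples, dtype=float)    # shape (n, 3)
    accel = np.asarray(accel_samples, dtype=float)  # shape (n, 3)
    rotation = gyro.sum(axis=0) * dt   # small-angle approximation
    dvel = accel.sum(axis=0) * dt
    return np.concatenate([rotation, dvel])
```

The fixed-size output is what lets a single network consume a whole frame interval's worth of higher-rate sensor data, matching the shorter motion-sensor sensing period described in the claims.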
In the method, a first sensing period of the motion sensor may be less than a second sensing period of the image sensor, and motion data generated by combining output data of the motion sensor that is collected during the second sensing period may be input to the transformation determination neural network model.

The method may further include generating the transformation determination neural network model by training an in-training transformation determination neural network model using training data based on a sensing result obtained by sensing a corresponding motion of the image sensor by the motion sensor with respect to a training image of a test pattern captured by the image sensor.

The method may further include generating the neural network model by: determining a first test transformation model by performing vision analysis on test image data obtained by capturing a provided test pattern through the image sensor; estimating a second test transformation model by inputting, to the neural network model, test motion data obtained by sensing the motion of the image sensor through the motion sensor while the image sensor captures the provided test pattern; determining first loss data corresponding to a difference between the first test transformation model and the second test transformation model; and generating the transformation determination neural network model by training an in-training transformation determination neural net