
KR-20260062267-A - APPARATUS AND METHOD FOR GENERATING SUPER-RESOLUTION IMAGE THROUGH PIXEL-LEVEL CLASSIFICATION


Abstract

The present invention relates to an apparatus and method for generating a super-resolution image through pixel-level classification. The apparatus comprises: an image input unit that receives a low-resolution image; a backbone network unit that feeds the low-resolution image into a backbone network as input data to generate a low-resolution feature map as output data; a pixel classification unit that receives the low-resolution feature map and the coordinates of a specific pixel, predicts the restoration difficulty of that pixel, and determines a restoration upsampler; an upsampling unit that includes a plurality of upsamplers constructed according to restoration difficulty and performs a per-pixel operation to upsample the specific pixel through the determined restoration upsampler; and a super-resolution image output unit that generates a super-resolution image by placing the upsampled pixel at the coordinates of the specific pixel of the low-resolution image.
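The pipeline described in the abstract (backbone → per-pixel difficulty classification → routed upsampling → output placement) can be sketched as follows. This is an illustrative stand-in, not the patented implementation: the real backbone, classifier, and upsamplers would be learned networks, whereas here local contrast serves as a crude difficulty cue and both upsamplers are trivial 2x replicators chosen only to show the routing flow.

```python
# Illustrative sketch: route each pixel of a low-resolution image to one of
# several "upsamplers" based on a predicted restoration difficulty.

def backbone(lr_image):
    """Stand-in feature map: local contrast as a crude difficulty cue."""
    h, w = len(lr_image), len(lr_image[0])
    feat = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [lr_image[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            feat[y][x] = max(nbrs) - min(nbrs)   # high contrast = hard pixel
    return feat

def classify_pixel(feat, y, x, threshold=0.5):
    """Predict restoration difficulty and pick an upsampler index."""
    return 1 if feat[y][x] > threshold else 0    # 0 = easy, 1 = hard

def upsample_easy(v):   # small-capacity upsampler: nearest-neighbour 2x
    return [[v, v], [v, v]]

def upsample_hard(v):   # large-capacity stand-in (same output here, for shape)
    return [[v, v], [v, v]]

UPSAMPLERS = [upsample_easy, upsample_hard]

def super_resolve(lr_image, scale=2):
    """Upsample each pixel with the upsampler its difficulty grade selects,
    then place the result at the corresponding output coordinates."""
    h, w = len(lr_image), len(lr_image[0])
    feat = backbone(lr_image)
    sr = [[0.0] * (w * scale) for _ in range(h * scale)]
    for y in range(h):
        for x in range(w):
            k = classify_pixel(feat, y, x)
            block = UPSAMPLERS[k](lr_image[y][x])
            for dy in range(scale):
                for dx in range(scale):
                    sr[y * scale + dy][x * scale + dx] = block[dy][dx]
    return sr
```

The point of the sketch is the control flow: capacity is spent per pixel, not per image or per patch, which is what distinguishes the claimed approach from uniform-allocation SR models.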

Inventors

  • 김선주
  • 정진호
  • 김진우

Assignees

  • 연세대학교 산학협력단 (Industry-Academic Cooperation Foundation, Yonsei University)

Dates

Publication Date
2026-05-07
Application Date
2024-10-28

Claims (10)

  1. An apparatus for generating a super-resolution image through pixel-level classification, comprising: an image input unit that receives a low-resolution image; a backbone network unit that feeds the low-resolution image into a backbone network as input data to generate a low-resolution feature map as output data; a pixel classification unit that receives the low-resolution feature map and the coordinates of a specific pixel, predicts the restoration difficulty of the specific pixel, and determines a restoration upsampler; an upsampling unit that includes a plurality of upsamplers constructed based on the restoration difficulty and performs a per-pixel operation to upsample the specific pixel through the determined restoration upsampler among the plurality of upsamplers; and a super-resolution image output unit that generates a super-resolution image by outputting the upsampled specific pixel at the coordinates of the specific pixel of the low-resolution image.
  2. The apparatus of claim 1, wherein the backbone network unit selects FSRCNN (Fast Super-Resolution Convolutional Neural Network), CARN (Cascading Residual Network), or SRResNet (Super-Resolution Residual Network) as the backbone network based on the restoration characteristics of the low-resolution image.
  3. The apparatus of claim 1, wherein the pixel classification unit determines, based on the low-resolution feature map, one of the restoration-difficulty grades assigned to the plurality of upsamplers as the restoration difficulty of the specific pixel.
  4. The apparatus of claim 1, wherein the pixel classification unit determines a relatively large-capacity upsampler when the specific pixel belongs to a relatively complex pattern or texture.
  5. The apparatus of claim 4, wherein the pixel classification unit determines a relatively small-capacity upsampler when the specific pixel belongs to a relatively simple region.
  6. The apparatus of claim 1, wherein the upsampling unit determines the number of the plurality of upsamplers by determining restoration-difficulty grades based on the low-resolution feature map.
  7. The apparatus of claim 1, wherein the upsampling unit implements the plurality of upsamplers to perform different upsampling techniques according to the restoration-difficulty grade.
  8. The apparatus of claim 1, wherein the super-resolution image output unit performs pixel-wise refinement on the super-resolution image to post-process artifact pixels when discontinuities occur between adjacent pixels restored through different ones of the plurality of upsamplers.
  9. The apparatus of claim 1, wherein the super-resolution image output unit determines discontinuities by applying PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity Index), or FLOPs (Floating-Point Operations) to the super-resolution image.
  10. A method for generating a super-resolution image through pixel-level classification, performed in an apparatus for generating a super-resolution image through pixel-level classification, the method comprising: an image input step of receiving a low-resolution image; a backbone network step of feeding the low-resolution image into a backbone network as input data to generate a low-resolution feature map as output data; a pixel classification step of receiving the low-resolution feature map and the coordinates of a specific pixel, predicting the restoration difficulty of the specific pixel, and determining a restoration upsampler; an upsampling step of upsampling the specific pixel, through a per-pixel operation, using the determined restoration upsampler among a plurality of upsamplers constructed based on the restoration difficulty; and a super-resolution image output step of generating a super-resolution image by outputting the upsampled specific pixel at the coordinates of the specific pixel of the low-resolution image.
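Claims 3 through 7 describe grading restoration difficulty and matching each grade to an upsampler of appropriate capacity (large capacity for complex patterns or textures, small capacity for simple regions). A minimal sketch of that grade-to-upsampler mapping follows; the grade boundaries, upsampler names, and capacity values are made-up illustrative numbers, not values from the patent.

```python
# Hypothetical grade table: (upper difficulty bound, upsampler name,
# relative capacity). Harder pixels get a larger-capacity upsampler.
GRADES = [
    (0.2, "tiny",   1),
    (0.6, "medium", 4),
    (1.0, "large", 16),
]

def grade_of(difficulty):
    """Return the index of the first grade whose bound covers the score."""
    for idx, (bound, _, _) in enumerate(GRADES):
        if difficulty <= bound:
            return idx
    return len(GRADES) - 1   # clamp out-of-range scores to the hardest grade

def pick_upsampler(difficulty):
    """Map a predicted difficulty score to an (upsampler, capacity) pair."""
    _, name, capacity = GRADES[grade_of(difficulty)]
    return name, capacity
```

Under this scheme the number of grades fixes the number of upsamplers (claim 6), and each grade can be backed by a different upsampling technique (claim 7).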

Description

Apparatus and Method for Generating a Super-Resolution Image Through Pixel-Level Classification

The present invention relates to super-resolution image generation technology, and more specifically to an apparatus and method for generating a super-resolution image through pixel-level classification that improves the efficiency of super-resolution generation by adaptively allocating computational resources at the pixel level.

Single Image Super-Resolution (SISR) is the task of restoring a high-resolution (HR) image from a low-resolution (LR) image. It is widely used in fields such as digital photography, medical imaging, surveillance, and security, and has advanced alongside deep neural networks (DNNs). However, as new SISR models have emerged, model size and computational cost have tended to grow, making them difficult to deploy in applications or on resource-constrained devices. Consequently, design effort has shifted toward simple, efficient, and lightweight models that balance performance against computational cost, and research continues on reducing the parameter count or the number of floating-point operations (FLOPs) of existing models without sacrificing performance.

Meanwhile, as platforms such as smartphones, high-definition TVs, and monitors supporting 2K to 8K resolutions deliver large-scale images to users, demand for efficient super-resolution (SR) is increasing. Because of limited computational resources, a large-scale image cannot be processed in a single pass.
Therefore, SR for large-scale images has used per-patch processing: a given low-resolution (LR) image is divided into patches, an SR model is applied to each patch independently, and the results are merged to obtain a high-resolution image. Recently, efficiency has been improved by classifying patches by restoration difficulty and allocating computational resources per patch. However, when restoration difficulty varies from pixel to pixel, uniformly allocating computational resources within a patch can actually reduce efficiency.

FIG. 1 illustrates a super-resolution image generation apparatus using pixel-level classification according to the present invention. FIG. 2 is a flowchart illustrating the corresponding method. FIGS. 3 to 6 illustrate experimental results related to the present invention. FIG. 7 illustrates the system configuration of the super-resolution image generation apparatus, and FIG. 8 illustrates a super-resolution image generation system according to the present invention.

The description of the present invention is merely an example for structural or functional explanation, and the scope of the present invention should not be interpreted as limited by the examples described herein. Since the embodiments admit various modifications and forms, the scope should be understood to include equivalents capable of realizing the technical concept. Furthermore, the objectives or effects presented herein do not imply that a specific embodiment must include all of them, or only such effects; the scope of the present invention should therefore not be understood as limited by them.
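The per-patch processing the background section describes (split the LR image into patches, apply an SR model to each patch independently, merge the results) can be sketched as below. The "SR model" here is plain nearest-neighbour 2x upscaling, a placeholder for a real network; patch size and scale are arbitrary illustrative choices.

```python
# Sketch of the conventional per-patch SR pipeline: split, process
# independently, merge. A real system would replace sr_model with a
# trained network applied to each patch.

def split_patches(img, p):
    """Split a 2-D image (list of rows) into p x p patches with origins."""
    h, w = len(img), len(img[0])
    return [(y, x, [row[x:x + p] for row in img[y:y + p]])
            for y in range(0, h, p) for x in range(0, w, p)]

def sr_model(patch, scale=2):
    """Placeholder SR model: nearest-neighbour upscaling of one patch."""
    return [[v for v in row for _ in range(scale)]
            for row in patch for _ in range(scale)]

def merge(patches, h, w, p, scale=2):
    """Upscale each patch independently and paste it into the output."""
    out = [[0] * (w * scale) for _ in range(h * scale)]
    for y, x, patch in patches:
        up = sr_model(patch, scale)
        for dy, row in enumerate(up):
            for dx, v in enumerate(row):
                out[y * scale + dy][x * scale + dx] = v
    return out
```

The inefficiency the invention targets is visible in this structure: every pixel inside a patch receives the same model capacity, even when only a few of its pixels are hard to restore.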
Meanwhile, the terms used in this application should be understood as follows.

Terms such as "first" and "second" are intended only to distinguish one component from another, and the scope of rights shall not be limited by these terms. For example, a first component may be named a second component, and similarly, a second component may be named a first component. When one component is described as "connected" to another component, it may be directly connected to that component, or intervening components may be present. Conversely, when one component is described as "directly connected" to another component, no intervening components are present. Other expressions describing relationships between components, such as "between" and "directly between," or "adjacent to" and "directly adjacent to," should be interpreted in the same way.

A singular expression includes the plural unless the context clearly indicates otherwise, and terms such as "include" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should not be understood to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.