
CN-116205789-B - Single image super-resolution reconstruction method based on channel fusion self-attention mechanism

CN116205789B

Abstract

The invention provides a single-image super-resolution reconstruction method based on a channel fusion self-attention mechanism, realized as a lightweight image super-resolution network built on that mechanism and comprising a shallow feature extraction module, channel fusion self-attention modules, a dense feature fusion module, and an image reconstruction module. Channel fusion self-attention is a new linear self-attention method proposed by the invention. Conventional self-attention mechanisms generate new pixel features by a weighted sum of similar features at neighboring locations in image space, whereas channel fusion self-attention generates new channel features by fusing image-region features across different channels. In addition, by combining channel fusion self-attention with lightweight convolution modules, the image super-resolution network of the invention reduces computational complexity, making the network sufficiently lightweight and flexible.
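To illustrate the core idea, here is a minimal numpy sketch of channel-wise self-attention, where the attention matrix relates channels to channels (C × C) instead of pixels to pixels (HW × HW), so the cost scales with the channel count rather than the image size. This is an illustrative sketch of the general technique, not the patent's exact formulation; the function name and scaling choice are assumptions.

```python
import numpy as np

def channel_fusion_attention(x, scale=None):
    """Channel-wise self-attention sketch: attention weights are computed
    between channels (C x C) instead of between pixels (HW x HW)."""
    C, N = x.shape                  # x: (channels, H*W) flattened feature map
    scale = scale or np.sqrt(N)
    logits = x @ x.T / scale        # (C, C) channel-to-channel similarity
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over channels
    return attn @ x                 # each output channel fuses all input channels

# toy example: 4 channels over an 8x8 spatial grid
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 64))
out = channel_fusion_attention(feat)
print(out.shape)  # (4, 64): spatial size unchanged, channels re-mixed
```

Because the attention matrix is C × C, complexity is linear in the number of pixels, which is what makes such a design attractive for lightweight SR networks.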

Inventors

  • Zeng Kun
  • Lin Hanjiang
  • Fang Jinsheng

Assignees

  • Minjiang University (闽江学院)

Dates

Publication Date
2026-05-05
Application Date
2022-12-20

Claims (2)

  1. A single image super-resolution reconstruction method based on a channel fusion self-attention mechanism, characterized in that a CMSAN network is used as the deep network for single image super-resolution reconstruction: a low-resolution image is input directly to the CMSAN network to reconstruct a high-resolution image; the CMSAN network comprises a shallow feature extraction module, a plurality of channel fusion self-attention blocks (CMSAB), a dense feature fusion module, and an up-sampling module; a convolution layer extracts shallow features from the low-resolution image, the CMSAB modules then extract hierarchical features, the generated hierarchical features are concatenated and fused, the result is added to the residual features, and the reconstructed high-resolution image is finally obtained through the up-sampling module; the channel fusion self-attention block CMSAB is built from channel fusion self-attention and lightweight convolution; channel fusion self-attention improves model efficiency by maintaining the Transformer's local attention and window-shifting mechanism in the channel dimension rather than in pixel space; the convolution part uses two 1×1 convolutions and one 3×3 depthwise convolution, and two BN layers are introduced to replace the LN layers of self-attention to improve model running speed.
  2. The single image super-resolution reconstruction method based on a channel fusion self-attention mechanism as set forth in claim 1, wherein: given a set of training data {I_LR^(i), I_HR^(i)}, i = 1, …, N, where N is the number of image blocks in the training database and I_LR^(i) and I_HR^(i) are respectively a low-resolution image block and a high-resolution image block, the loss function is expressed as: L(θ) = (1/N) Σ_{i=1}^{N} ‖H_CMSAN(I_LR^(i)) − I_HR^(i)‖₁, where H_CMSAN(I_LR^(i)) is the reconstructed high-resolution image and θ denotes the network parameters.
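The convolutional branch of the CMSAB described in claim 1 (1×1 convolution, BN, 3×3 depthwise convolution, BN, 1×1 convolution) can be sketched in numpy as below. The exact ordering of the BN layers and the residual connection are assumptions for illustration; weights are random placeholders, not trained parameters.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -> pointwise channel mixing
    return np.einsum('oc,chw->ohw', w, x)

def depthwise3x3(x, k):
    # x: (C, H, W), k: (C, 3, 3); each channel convolved with its own kernel
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[:, i, j, None, None] * xp[:, i:i + H, j:j + W]
    return out

def batchnorm(x, eps=1e-5):
    # per-channel normalization over spatial positions (inference-style, no affine)
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def cmsab_conv_branch(x, w1, k_dw, w2):
    # 1x1 conv -> BN -> 3x3 depthwise conv -> BN -> 1x1 conv, plus residual
    h = batchnorm(conv1x1(x, w1))
    h = batchnorm(depthwise3x3(h, k_dw))
    return x + conv1x1(h, w2)

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
x = rng.standard_normal((C, H, W))
y = cmsab_conv_branch(x, rng.standard_normal((C, C)),
                      rng.standard_normal((C, 3, 3)),
                      rng.standard_normal((C, C)))
print(y.shape)  # (8, 16, 16)
```

The depthwise convolution applies one 3×3 kernel per channel, which is what keeps this branch lightweight compared with a full 3×3 convolution.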
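The pixel-wise reconstruction loss of the kind described in claim 2 (mean absolute difference between reconstructed and ground-truth high-resolution blocks, i.e. an L1 loss) can be computed as follows; the array shapes are illustrative.

```python
import numpy as np

def l1_loss(sr, hr):
    # mean absolute error between reconstructed (SR) and ground-truth (HR) patches
    return float(np.mean(np.abs(sr - hr)))

rng = np.random.default_rng(0)
hr = rng.random((4, 3, 48, 48))                  # N = 4 HR image patches
sr = hr + 0.1 * rng.standard_normal(hr.shape)    # imperfect reconstructions
print(l1_loss(sr, hr))
```

Averaging over all N training blocks gives the scalar objective that the network parameters are optimized against.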

Description

Single image super-resolution reconstruction method based on channel fusion self-attention mechanism

Technical Field

The invention belongs to the technical field of image processing, and particularly relates to a single image super-resolution reconstruction method based on a channel fusion self-attention mechanism.

Background

Images are one of the main media through which humans acquire information; they contain a large amount of digital information and have very important applications in fields such as medicine, remote sensing, and surveillance. However, owing to factors such as the digital image acquisition equipment and the acquisition environment, captured images often suffer from insufficient resolution, blurring, and missing details, and therefore cannot be used directly. To address this series of problems in the image acquisition process, image super-resolution (SR) reconstruction technology has been developed. With the original hardware unchanged, SR reconstruction obtains a high-resolution image in software at low cost, which has very important research significance in the field of image processing.

Super-resolution (SR) reconstruction is a low-level computer vision task that reconstructs a high-resolution (HR) image from a low-resolution (LR) image. SR technology has broad applications and has attracted extensive attention in academia and industry. In recent years, convolutional neural networks (CNNs) have shown great potential in SR tasks, and a large number of CNN-based SR models have achieved remarkable results of high practical value. Dong et al. first proposed SRCNN [7], which learns the end-to-end mapping from LR images to HR images with a CNN containing only three convolution layers.
VDSR [8] and DRCN [9] then learned larger networks through residual learning and recursive learning respectively, further improving SR performance. By combining residual learning and recursive learning strategies, DRRN [11] achieves better performance with fewer parameters. MemNet [12] was proposed to solve the long-term dependency problem by mining persistent memory. In these methods, the original LR image is upsampled with bicubic interpolation to the scale of the HR image before being fed to the network. To increase SR speed, most newer SR models take the original LR image as input and increase the spatial resolution by deconvolution or sub-pixel convolution at the end of the network [14]. Unlike other SR methods, LapSRN [10] reconstructs SR images by progressively increasing the image resolution and predicting the sub-band residuals of HR images. SRResNet [15] and EDSR [16] improve SR performance by stacking a series of residual blocks. In particular, EDSR modifies the residual block by removing the Batch Normalization (BN) layer to achieve a performance improvement.

To recover the high-frequency details of an image, CNN-based SR models often stack very deep network structures, which consume a large number of parameters and considerable computing resources and are therefore unsuitable for real-world scenarios. Designing lightweight networks that balance model size against reconstruction performance has thus become one of the key research directions in the field of image processing and is of great significance. Hui et al. proposed IDN [13] to progressively extract long- and short-path features and distill more useful information for SR reconstruction. Based on IDN, IMDN [19] introduced a multiple-distillation and contrast-aware channel attention mechanism and won the AIM 2019 image super-resolution challenge.
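The sub-pixel convolution (pixel shuffle) upsampling mentioned above rearranges channel information into spatial resolution at the end of the network. A minimal numpy sketch of the rearrangement step, following the usual (C·r², H, W) → (C, H·r, W·r) convention, is shown below; the channel ordering matches the common deep-learning-framework convention but is an assumption here.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C*r^2, H, W) features into (C, H*r, W*r) -- the
    sub-pixel upsampling step used at the end of many SR networks."""
    Cr2, H, W = x.shape
    C = Cr2 // (r * r)
    x = x.reshape(C, r, r, H, W)        # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)      # (C, H, r, W, r)
    return x.reshape(C, H * r, W * r)   # interleave sub-pixels into space

feat = np.arange(8 * 3 * 3, dtype=float).reshape(8, 3, 3)  # C=2, r=2
up = pixel_shuffle(feat, 2)
print(up.shape)  # (2, 6, 6)
```

Because the network operates at LR resolution until this final rearrangement, post-upsampling designs are substantially cheaper than pre-upsampling ones such as the bicubic-then-CNN pipeline.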
Liu et al. proposed RFDN [20], which introduces feature distillation connections and shallow residual blocks for fast SR with fewer parameters than IMDN. Inspired by the ability of the human visual system to automatically focus on important regions, attention mechanisms are designed to focus on the most informative parts of the input signal. Recently, some works have introduced attention mechanisms [23] into the SR task. Zhang et al. proposed RCAN [24], which focuses on the most important channels by introducing a channel attention mechanism into a simplified residual block. Magid et al. proposed DFSA [25], which predicts the attention patterns of features in the frequency domain using a matrix multi-spectral channel attention mechanism. Liu et al. proposed an Enhanced Spatial Attention (ESA) module [20] to make efficient use of local spatial information with fewer parameters. In addition, non-local attention mechanisms aimed at capturing long-range spatial information have also been studied: NLRN [26], RNAN [27], CSNLN [28], ENLCN [29], and others have introduced non-local attention to improve performance
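The channel attention used in RCAN-style blocks gates each feature channel by a learned scalar derived from its global statistics. A minimal numpy sketch of this squeeze-and-excitation-style mechanism follows; the bottleneck sizes and random weights are illustrative assumptions, not the cited models' parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    # RCAN-style: global average pool -> 2-layer bottleneck -> sigmoid gates
    s = x.mean(axis=(1, 2))                       # (C,) per-channel statistics
    g = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))     # (C,) attention weights in (0, 1)
    return x * g[:, None, None]                   # rescale each channel

rng = np.random.default_rng(0)
C, red = 8, 2
x = rng.standard_normal((C, 12, 12))
w1 = rng.standard_normal((C // red, C))   # channel reduction
w2 = rng.standard_normal((C, C // red))   # channel restoration
y = channel_attention(x, w1, w2)
print(y.shape)  # (8, 12, 12)
```

Note the contrast with the channel fusion self-attention of the invention: here each channel is merely rescaled by a scalar gate, whereas channel fusion self-attention mixes features across channels via a full C × C attention matrix.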