CN-121998831-A - Content-aware image super-resolution method based on a lightweight hybrid-path model
Abstract
The invention discloses a content-aware image super-resolution method based on a lightweight hybrid-path model. The method introduces a grouped multi-path strategy that provides matched restoration paths for different types of image degradation; combines a content-aware mechanism that adaptively selects processing branches according to the complexity of each image region, achieving fine reconstruction of complex regions and efficient processing of simple regions; and further adopts a progressive calibration module that optimizes the fusion of simple-region and complex-region features within the content-aware module, reducing artifacts and improving reconstruction consistency. While keeping computational cost low, the method significantly improves super-resolution reconstruction quality for images under real-world complex degradation, and is suitable for fields such as remote sensing image enhancement, medical image analysis, and high-definition display.
Inventors
- REN CHAO
- ZHAO PEIKAI
- FANG QINGQING
- LUO XIAODONG
Assignees
- Sichuan University (四川大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-04-10
Claims (6)
- 1. An image super-resolution method using a content-aware lightweight hybrid-path model, characterized by comprising the following steps: Step one, constructing a content-aware hybrid-path super-resolution network model, wherein the network model adaptively selects computation paths of different complexity for different regions according to the local content complexity and content features of the input image; Step two, performing supervised training of the network model on a paired dataset of real degraded images, so that the network model learns the mapping from low resolution to high resolution.
- 2. The content-aware lightweight hybrid-path image super-resolution method according to claim 1, characterized in that, in constructing the content-aware hybrid-path super-resolution network model in step one, the overall architecture of the network model follows the processing flow of shallow feature extraction, progressive content-aware fusion, multi-path node feed-forward, dynamic path routing, and high-resolution reconstruction, and outputs a high-resolution result; in addition, the model adopts a grouped multi-path strategy that dynamically routes the input degraded image to a corresponding restoration path; a progressive content-aware feature extraction module (PCAF) is introduced in the content-aware fusion stage to assist path selection and realize differentiated processing of complex and simple regions; and a unified lightweight architecture is constructed, which significantly reduces the number of model parameters.
- 3. The content-aware lightweight hybrid-path image super-resolution method according to claim 1, characterized in that, in constructing the content-aware hybrid-path super-resolution network model in step one, a grouped multi-path strategy is adopted in the multi-path node feed-forward step; the strategy divides the backbone into several path groups according to network depth, each group containing several path nodes that share the same structure but have different model parameters; the numbers for the path groups satisfy the hierarchical distribution N1 ≥ N2 ≥ N3, following the design principle of "lighter shallow paths and stronger deep paths", so that the network can provide matched modeling paths for different types of complex degradation at different levels, allocate capacity on demand along the depth dimension, and structurally address the diversity and hierarchy of degradation.
- 4. The content-aware lightweight hybrid-path image super-resolution method according to claim 1, characterized in that, in constructing the content-aware hybrid-path super-resolution network model in step one, content-modulated nodes are constructed in the multi-path node feed-forward step; at the branch selection level, a content-modulated control mechanism is introduced so that the network adaptively selects branches of different complexity according to local content, dynamically balancing reconstruction quality against inference efficiency; given the input features, a gating network (jointly driven by content features and degradation-related features) outputs a weight vector over the path nodes [formula omitted in source]; the training stage adopts Top-k sparse selection [formula omitted in source], wherein each path node corresponds to an FFN branch of different complexity/inductive bias, and the Top-k operator returns the indices of the k largest elements of its input vector; forward and backward propagation are carried out only for the small number of nodes in the selected set, which preserves diverse representations while maintaining training stability and efficiency; the selection is subsequently switched to Top-1, so that each position activates only a single path node and the amount of computation is significantly reduced.
- 5. The content-aware lightweight hybrid-path image super-resolution method according to claim 1, characterized in that, in constructing the content-aware hybrid-path super-resolution network model in step one, a routing mechanism is adopted in the dynamic path routing step, the mechanism following the main line of "dual-source statistics, light projection, temperature-controlled normalization, sparse selection, weighted fusion"; first, three statistics (maximum, minimum, and mean) are extracted from the context features and from the content features respectively, forming a stable description of degradation intensity and content complexity; next, a convolution maps the six statistical channels to the node dimension, a temperature-dependent probability distribution over the path nodes is obtained, and during training only the selected nodes are activated to reduce wasted computation; finally, the outputs of the selected nodes are fused by linear weighting according to their weights, so that computing resources are automatically concentrated on the most relevant nodes and discontinuities caused by hard switching are avoided, balancing expressive capacity, stability, and efficiency; the router works in cooperation with the subsequent "complex/simple path" of PCAF and gating N, so that complex degraded regions are reinforced while simple regions retain simple reconstruction.
- 6. The content-aware lightweight hybrid-path image super-resolution method according to claim 1, characterized in that, in constructing the content-aware hybrid-path super-resolution network model in step one, the progressive content-aware fusion module (PCAF) restructures the channel-attention mixer into a two-stage structure of "coarse calibration before, fine calibration after" on the basis of asymmetric windows and learnable offsets: Stage one, coarse feature calibration: before entering the PCAF backbone, the settings of the small window, the magnification factor, and the learnable offsets are kept unchanged; the learnable offsets are generated by the predictor, and a warp function then spatially deforms and aligns the features; to reduce the statistical mismatch between the light path and the heavy path, the light-path features are first aligned at window level to the mean and variance of the context reference, giving the coarse calibration result [formula omitted in source], whose terms are the mean, the standard deviation, a scaling factor, and an offset; the coarse-calibrated features then enter parallel modeling by sparse-attention and convolution branches, and the rectified gate determines the proportion of complex and simple regions within the block, thereby aggregating more distant context and stabilizing the subsequent fusion without increasing query overhead; the statistical branch of the predictor generates the spatially adaptive gating weight map; Stage two, fine feature calibration: after the main computation is finished, a fine calibration is added at the output to further eliminate local brightness drift and narrow-band edge artifacts; a 3×3 depthwise separable convolution and learned channel-wise scale/offset fine-tuning are applied to the light-path residual, which is reinjected in residual form to obtain the fine-calibrated output [formula omitted in source]; finally, at fusion, the gating weight map serves as a spatially adaptive mask, and the outputs of the two parallel paths are weighted and summed to form the final output [formula omitted in source], whose two terms are the output of the attention path and the fine-calibrated output.
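As an illustration only, the routing main line named in claims 4 and 5 (dual-source statistics, light projection, temperature-controlled normalization, sparse selection, weighted fusion) can be sketched in plain Python for a single position. This is a minimal sketch under stated assumptions, not the patented implementation: the names `stats`, `softmax`, and `route` are hypothetical, the 6 x N matrix `proj` stands in for the convolution that maps the six statistical channels to the node dimension, the toy callables in `nodes` stand in for FFN branches, and renormalizing the weights over the selected nodes is an assumption the claims do not spell out.

```python
import math


def stats(values):
    """Maximum, minimum and mean of one feature vector."""
    return [max(values), min(values), sum(values) / len(values)]


def softmax(logits, temperature=1.0):
    """Temperature-controlled softmax; a lower temperature sharpens the distribution."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def route(context_feat, content_feat, proj, nodes, x, k=2, temperature=1.0):
    """Send x through the k most probable path nodes and fuse their outputs.

    proj  : hypothetical 6 x N projection (stand-in for the light convolution)
    nodes : list of N callables (stand-ins for FFN branches)
    """
    # Dual-source statistics: three values from each source, six channels in total.
    six = stats(context_feat) + stats(content_feat)
    n = len(nodes)
    # Light projection from the six statistical channels to one logit per node.
    logits = [sum(six[c] * proj[c][j] for c in range(6)) for j in range(n)]
    probs = softmax(logits, temperature)
    # Sparse selection: keep the indices of the k largest probabilities.
    top = sorted(range(n), key=lambda j: probs[j], reverse=True)[:k]
    # Assumption: weights are renormalized over the selected nodes only.
    norm = sum(probs[j] for j in top)
    weights = {j: probs[j] / norm for j in top}
    # Linear weighted fusion of the selected node outputs.
    fused = sum(weights[j] * nodes[j](x) for j in top)
    return fused, weights


# Toy setup: three "branches" of increasing strength and a fixed projection.
nodes = [lambda v: v, lambda v: 2.0 * v, lambda v: 3.0 * v]
proj = [[0.0, 0.1, 0.2] for _ in range(6)]
context_feat = [0.0, 1.0, 2.0]
content_feat = [1.0, 1.0, 1.0]

# Top-k selection (k=2) as described for the training stage.
train_out, train_w = route(context_feat, content_feat, proj, nodes, 1.0, k=2)
# Top-1 selection: a single node is active per position.
top1_out, top1_w = route(context_feat, content_feat, proj, nodes, 1.0, k=1)
```

With `k=2` the output is a soft blend of the two most probable nodes, which avoids a hard switch between branches; with `k=1` only one node is evaluated, which is where the computational saving described in claim 4 comes from.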
Description
Content-aware image super-resolution method based on a lightweight hybrid-path model
Technical Field
The invention relates to the field of computer vision and digital image processing, in particular to image super-resolution based on deep learning, and more specifically to a lightweight, efficient image super-resolution method that targets real-world complex degradation scenes and combines a content-aware mechanism with a hybrid-path model.
Background
With the wide deployment of image acquisition devices and the steady increase in resolution, the demand for high-quality images keeps growing. However, images acquired in real scenes are often degraded to low resolution by sensor limitations, transmission compression, motion blur, and similar factors, which directly affects the accuracy of subsequent visual analysis and understanding tasks. Image super-resolution techniques were developed in response to this challenge; they aim to recover high-resolution detail from degraded low-resolution images and are widely used in medical image processing, satellite remote sensing, video surveillance, object detection, and other fields.
Since it was first proposed, image super-resolution has developed rapidly from traditional methods to deep-learning-based methods. Early approaches relied mainly on non-deep-learning techniques such as interpolation-based algorithms (e.g., bilinear and bicubic interpolation) and reconstruction-based algorithms (e.g., iterative back-projection). Although simple and fast to compute, these methods are limited in feature representation capability and struggle with complex degradation. In recent years, deep learning has brought breakthrough progress to image super-resolution, and methods have evolved from deep convolutional neural networks (CNNs) to Transformer architectures.
CNN-based methods, by stacking more convolution layers and introducing channel/spatial attention mechanisms, significantly improved the representation capability of the model and made substantial progress over traditional methods on multiple benchmark datasets. Although convolutional networks are relatively simple and efficient, they are limited by the local receptive field of convolution. To overcome this limitation, Transformer architectures were introduced into image restoration; window self-attention and hierarchical structures allow longer-range receptive-field modeling and yield better performance than convolutional networks, and further optimization of the attention mechanism has improved the detail-restoration capability of the models. However, the high computational overhead of the Transformer limits its practical application to some extent. How to significantly reduce the computational complexity of the Transformer while maintaining or even improving its strong restoration capability has therefore become a key challenge for current research. Against this background, researchers have begun to focus on more dynamic and adaptive lightweight strategies, of which the Mixture of Experts (MoE) mechanism is representative. This mechanism dynamically balances model capacity against inference cost by sparsely activating a subset of expert sub-networks. For example, IM-LUT optimizes the interpolation lookup-table structure with an expert selection mechanism and achieves a good trade-off between efficiency and expressive power; GLDFN and SSIU design lightweight MoE modules from the perspectives of degradation modeling and structural sensitivity, respectively, to improve deployability; and CasArbi effectively enhances reconstruction consistency at arbitrary magnification factors through adaptive routing according to region scale.
Nevertheless, existing approaches still face fundamental challenges in real-world image super-resolution. First, real-world image degradation is extremely complex, diverse, and spatially non-uniform; it is a nonlinear superposition of various kinds of blur, noise, and compression artifacts, and is difficult to describe accurately with a simple analytical model. Most existing methods, whether CNN-based or Transformer-based, are trained and optimized under an assumed, relatively simple degradation model, so their generalization is severely inadequate when facing real complex degradation, and the recovered images are prone to artifacts, blurring, or texture distortion. Second, even where dynamic sparse computation is introduced, the routing mechanism of existing MoE models for super-resolution is often simplistic and does not fully use the content and semantic information of the image to guide path selection, so that the distribution of computing resources is not necessarily optimal, and the optimal performance recove