CN-121999269-A - Fine granularity image classification method, system and application based on image super-resolution and feature component attention


Abstract

The invention relates to a fine-grained image classification method, system and application based on image super-resolution and feature component attention. An improved fine-grained image classification network is constructed, comprising a backbone network, a feature component attention module and a super-resolution-recognition module arranged in sequence, wherein the super-resolution-recognition module comprises two output heads, one of which is connected to the input end of the backbone network. The improved fine-grained image classification network is trained, and images are then classified using the backbone network, the feature component attention module and the other output head of the super-resolution-recognition module in the trained network. The fine-grained image classification system is implemented based on the method, and the method is applied to distinguishing at least 2 entity categories whose degree of difference in visual appearance is smaller than a threshold, the classification among the entity categories being based on one or more distinguishing features. The invention can fully mine feature components of different granularities, obtain a better feature representation, and achieve better classification capability, including on low-resolution images.

Inventors

YE LEI
XU CHENQI
LIANG DEYUAN
Wang Dihong

Assignees

Zhejiang University of Technology (浙江工业大学)

Dates

Publication Date: 20260508
Application Date: 20251214

Claims (10)

1. A fine-grained image classification method based on image super-resolution and feature component attention, characterized by: constructing an improved fine-grained image classification network comprising a backbone network, a feature component attention module and a super-resolution-recognition module arranged in sequence, wherein the super-resolution-recognition module comprises two output heads, and one output head is connected to the input end of the backbone network; training the improved fine-grained image classification network; and classifying images using the backbone network, the feature component attention module and the other output head of the super-resolution-recognition module in the trained improved fine-grained image classification network.
2. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 1, wherein the backbone network outputs a hierarchical feature map; the feature component attention module obtains an adaptively sampled and dynamically weighted feature map based on the hierarchical feature map; the super-resolution-recognition module comprises a super-resolution reconstruction branch and a recognition branch, which are provided with a reconstruction output head and a recognition output head respectively; when the improved fine-grained image classification network is trained, the recognition output head outputs a primary recognition result, the reconstruction output head outputs a reconstructed image, the reconstructed image is input into the backbone network again, the recognition output head then outputs a secondary recognition result, and the primary recognition result and the secondary recognition result are added; and images are classified using the backbone network, the feature component attention module and the recognition branch of the super-resolution-recognition module in the trained improved fine-grained image classification network, with the recognition output head outputting the classification result.
3. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 2, wherein the feature component attention module comprises a feature sampling layer; the output of the feature sampling layer is added layer by layer to a learnable bias and then passed sequentially through a Dropout layer and an attention unit; the output end of the Dropout layer is further connected to a feature weight calibration unit; an attention matrix in the attention unit is combined with the feature weight calibration unit to obtain feature weights; and the adaptively sampled and dynamically weighted feature map is obtained based on the feature weights and the output of the attention unit.
4. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 3, wherein the recognition branch of the super-resolution-recognition module takes the feature weights and the adaptively sampled and dynamically weighted feature map as input; the recognition branch comprises a feature weight processing block, which receives the feature weights and corresponding learnable parameters; the recognition branch further comprises a feature map fusion block, which receives the adaptively sampled and dynamically weighted feature map; and the outputs of the feature weight processing block and the feature map fusion block are fused and then output through a normalization layer and a fully connected layer.
5. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 3, wherein the super-resolution reconstruction branch of the super-resolution-recognition module takes the adaptively sampled and dynamically weighted feature map as input, and a reconstruction branch and a skip branch are arranged after the input end of the super-resolution reconstruction branch; the reconstruction branch comprises a double-bit downsampling module, a random Gaussian noise processing block and a restoration block connected in sequence; the skip branch comprises a convolution layer and an activation function; and the outputs of the reconstruction branch and the skip branch are added to output the super-resolution reconstruction result.
6. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 5, wherein the restoration block comprises, in sequence, a pixel rearrangement layer, a convolution layer and an activation function.
7. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 1, wherein a loss function is established to train the improved fine-grained image classification network; and the loss function is associated with a feature component attention module loss, a classification loss and an image super-resolution loss.
8. The fine-grained image classification method based on image super-resolution and feature component attention according to claim 7, wherein the training weight of the image super-resolution loss increases gradually as the task proceeds.
9. A fine-grained image classification system based on image super-resolution and feature component attention, characterized by comprising: at least one processor, and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the fine-grained image classification method based on image super-resolution and feature component attention according to any one of claims 1-8.
10. An application of the fine-grained image classification method based on image super-resolution and feature component attention according to any one of claims 1-8, wherein the method is applied to distinguishing at least 2 entity categories whose degree of difference in visual appearance is less than a threshold, and the classification among the entity categories is based on one or more distinguishing features.
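
Illustrative sketch (not part of the claims). The training-time data flow of claim 2 and the loss weighting of claims 7 and 8 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: images and recognition results are represented as plain lists of floats, the three loss terms are combined as a weighted sum, and the weight of the super-resolution loss follows a linear ramp. The claims only state that the loss function is "associated with" the three terms and that the weight "increases gradually", so the sum, the linear schedule, and all names below (training_scores, inference_scores, sr_loss_weight, total_loss, toy_network) are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[float]    # stand-in for an image tensor
Scores = List[float]   # stand-in for per-class recognition scores
# A network maps an image to (recognition head output, reconstruction head output).
Network = Callable[[Image], Tuple[Scores, Image]]


def training_scores(network: Network, image: Image) -> Scores:
    """Claim 2, training: add the primary and secondary recognition results."""
    primary, reconstructed = network(image)
    # The reconstructed image is fed back into the input end of the backbone.
    secondary, _ = network(reconstructed)
    return [p + s for p, s in zip(primary, secondary)]


def inference_scores(network: Network, image: Image) -> Scores:
    """Claim 2, inference: only the recognition output head is used."""
    scores, _ = network(image)
    return scores


def sr_loss_weight(step: int, total_steps: int, max_weight: float = 1.0) -> float:
    """Assumed linear ramp; claim 8 only says the weight increases gradually."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * progress


def total_loss(attention_loss: float, classification_loss: float,
               sr_loss: float, step: int, total_steps: int) -> float:
    """Assumed weighted sum of the three loss terms named in claim 7."""
    return (attention_loss + classification_loss
            + sr_loss_weight(step, total_steps) * sr_loss)


def toy_network(image: Image) -> Tuple[Scores, Image]:
    """Hypothetical stand-in network, used only to exercise the functions above."""
    scores = [sum(image), max(image)]
    reconstructed = [0.5 * x for x in image]
    return scores, reconstructed
```

With the toy network, training_scores runs the network twice (once on the input and once on its own reconstruction) and adds the two score lists, while inference_scores runs it once and discards the reconstruction, which mirrors the use of only the recognition output head after training.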

Description

Fine-grained image classification method, system and application based on image super-resolution and feature component attention

Technical Field

The invention relates to the technical field of computing, calculating or counting, and in particular to a fine-grained image classification method, system and application based on image super-resolution and feature component attention in the technical field of computer vision.

Background

With the increasing degree of social informatization and the development of artificial intelligence, the application requirements of computer vision in specialized scenarios are becoming increasingly refined. Fine-grained object classification, one of the core tasks of computer vision, aims to distinguish fine subclasses within the same broad class, such as different bird species, vehicle models or pathological cell subtypes. In essence, a computer simulates the ability of biological vision to capture detailed features: image information is acquired by a sensor such as a camera and processed by an algorithm to classify the target accurately, ultimately giving the machine a fine-grained visual understanding similar to that of humans. Traditional fine-grained classification methods depend heavily on large-scale annotated data, such as position coordinates and attribute labels, but face multiple limitations in practical applications.
First, the cost of data annotation is extremely high: distinguishing fine-grained classes requires experts to annotate local features, such as the shape of a bird's beak or the design of a vehicle's wheel hub, and the manual annotation cost grows exponentially with the number of classes, making it difficult to cover all potential subclasses such as emerging species and novel products. Second, fine-grained image classification suffers from small inter-class differences and large intra-class differences: the distinguishing features between particular classes may exist only in local regions, such as subtle texture differences, while different individuals of the same class may differ markedly in appearance owing to changes in pose, illumination and viewing angle, which makes model generalization difficult. Finally, in practical application scenarios the captured images may have low resolution because of the capture equipment, so that even human experts find it difficult to discern the small differences between classes; since fine-grained image classification depends heavily on these features, the robustness of the model in practical application scenarios is ultimately affected.

Therefore, researchers have focused on converting the strongly supervised learning process into a weakly supervised or even self-supervised one, while strengthening the extraction and use of key image features. However, most existing fine-grained object classification methods rely on an object detection module or an attention mechanism and extract component features using rectangular bounding boxes, which makes it difficult to capture rich object shape information, and they ignore the low classification accuracy caused by low-resolution input images in real application scenarios. There is therefore a need to improve fine-grained object classification methods.
Disclosure of Invention

To address the problems in the prior art, the invention provides a fine-grained image classification method, system and application based on image super-resolution and feature component attention.

The technical scheme adopted by the invention is a fine-grained image classification method based on image super-resolution and feature component attention, in which an improved fine-grained image classification network is constructed, comprising a backbone network, a feature component attention module and a super-resolution-recognition module arranged in sequence, wherein the super-resolution-recognition module comprises two output heads, and one output head is connected to the input end of the backbone network; the improved fine-grained image classification network is trained; and images are classified using the backbone network, the feature component attention module and the other output head of the super-resolution-recognition module in the trained improved fine-grained image classification network.

Preferably, the backbone network outputs a hierarchical feature map; the feature component attention module obtains an adaptively sampled and dynamically weighted feature map based on the hierarchical feature map; the super-resolution-recognition module comprises a super-resolution reconstruction branch and a recognition branch, and a reconstruction output hea