CN-120047972-B - Residual error expansion fusion method for pedestrian re-identification
Abstract
The invention discloses a residual expansion fusion method for pedestrian re-identification. It comprises a data preprocessing module, a residual expansion fusion module, and a similarity calculation and result output module. The backbone network is built on ResNet-50 and initialized with a model pre-trained on ImageNet. The proposed residual expansion fusion module generates copies of the features through weight sharing, using residual units of different sizes. The copies are then expanded and fused in two directions, which effectively reduces the loss of important original-layer features during feature fusion. The method improves the robustness of the model across varied scenes, minimizes the loss of pedestrian representation information as features propagate through the network, and identifies pedestrians more accurately in difficult real-world scenes.
Inventors
- DENG ZELIN
- Ke Yongyang
Assignees
- Changsha University of Science and Technology (长沙理工大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20250224
Claims (4)
- 1. A residual expansion fusion method for pedestrian re-identification, characterized by comprising the following steps: (1) Data preprocessing: selecting the data sets required for the experiment from pedestrian re-identification data sets, and dividing and preprocessing the data; (2) Constructing the backbone network structure: constructing a backbone network based on ResNet-50 and initializing it with a pre-trained model, wherein each network layer uses a residual unit to generate a copy of its feature vector, and the copies generated by a lower network layer are expanded in the vertical and horizontal directions and fused, at the same size, with the copies generated by the adjacent higher network layer, so as to reduce the loss of important original-layer features during feature fusion; (3) First preprocessing the pedestrian samples sampled in batches and inputting them into the predefined model to extract high-order pedestrian features; inputting the low-level and high-level features extracted by the backbone network into the residual expansion fusion module, which reduces feature loss through expansion and fuses the high-level and low-level features at the same size; inputting the fused features into the loss function to calculate the loss; performing backpropagation and updating the model parameters; and iterating to minimize the value of the loss function, thereby forming an optimized pedestrian re-identification model; (4) After training, using the proposed pedestrian re-identification model as a pedestrian feature extractor, so that each pedestrian is represented by pedestrian features containing identity information; calculating the Euclidean distance between the feature vector of the target pedestrian and the feature vector of each pedestrian in the test data set; ranking the pedestrians in the data set by their similarity to the target pedestrian; and finally selecting the pedestrian image most similar to the target pedestrian and calculating the recognition accuracy of the model.
- 2. The residual expansion fusion method for pedestrian re-identification according to claim 1, wherein: in step (1), data augmentation is applied as preprocessing to the original pedestrian image data set, the augmentation comprising data normalization, random horizontal flipping, random erasing, random rotation, random flipping, and random brightness adjustment.
- 3. The residual expansion fusion method for pedestrian re-identification according to claim 1, wherein: (a) in step (2), the defined convolutional neural network takes ResNet-50 as its backbone network; its structure comprises 4 stages, each of which alleviates the vanishing-gradient problem through skip connections, so that a deeper network can be trained; this design enables the network to extract multi-level feature information effectively while maintaining high computational efficiency, and provides rich feature representations for subsequent tasks; (b) the residual expansion fusion module in step (2) is used to alleviate the loss of low-level feature information during propagation when features are fused; using residual units each consisting of two CBR units of different sizes, it generates a copy of the input feature vector for each network layer through weight sharing, and minimizes the loss of important original-layer features by retaining these copies; a CBR unit is a network layer consisting of a convolution filter, a batch normalization layer, and a rectified linear unit (ReLU).
- 4. The residual expansion fusion method for pedestrian re-identification according to claim 1, wherein: the loss function L constructed in step (3) combines a triplet loss and a center loss, each weighted by its own coefficient: $L = \alpha L_{\mathrm{tri}} + \beta L_{\mathrm{center}}$; wherein $L_{\mathrm{tri}}$ is the triplet loss function and $L_{\mathrm{center}}$ is the center loss of the samples; the two terms are defined in terms of the number of available triplets, a hyperparameter representing the margin, a range loss function, the class of each sample, and the center of all sample features of that class; and $\alpha$ and $\beta$ are the weights of the respective loss functions.
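Claim 4 describes a loss that weights a triplet loss and a center loss. The following is a minimal sketch of one such weighted combination, assuming the commonly used forms of both terms (a hinge on the distance gap for the triplet term, and half the mean squared distance to the class center for the center term), which may differ from the exact formula of the claim. The function names, the feature values, and the margin and weight values in the usage lines are illustrative assumptions, not taken from the patent.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(triplets, margin):
    """Mean hinge loss over (anchor, positive, negative) feature triplets.

    Assumed form: max(0, d(anchor, positive) - d(anchor, negative) + margin).
    """
    total = 0.0
    for anchor, positive, negative in triplets:
        gap = euclidean(anchor, positive) - euclidean(anchor, negative)
        total += max(0.0, gap + margin)
    return total / len(triplets)


def center_loss(features, labels, centers):
    """Half the mean squared distance from each feature to its class center.

    This is an assumed form; `centers` maps a class label to its center vector.
    """
    total = 0.0
    for feature, label in zip(features, labels):
        total += euclidean(feature, centers[label]) ** 2
    return 0.5 * total / len(features)


def combined_loss(triplets, features, labels, centers, margin, alpha, beta):
    """Weighted sum alpha * triplet + beta * center (weights are hyperparameters)."""
    return (alpha * triplet_loss(triplets, margin)
            + beta * center_loss(features, labels, centers))


# Toy usage with hand-picked 2-D features (all values are arbitrary).
triplets = [
    ([0.0, 0.0], [3.0, 4.0], [6.0, 8.0]),  # negative is far enough: contributes 0
    ([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]),  # negative is too close: contributes 4.3
]
features = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 0]
centers = {0: [0.0, 0.0]}
loss = combined_loss(triplets, features, labels, centers,
                     margin=0.3, alpha=1.0, beta=0.5)
print(round(loss, 4))
```

The weights `alpha` and `beta` trade off the two terms and would be tuned per data set; in a real training loop the same computation would be expressed with a tensor library so that gradients can be backpropagated.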
Description
Residual error expansion fusion method for pedestrian re-identification
Technical Field
The invention relates to the fields of computer vision and image retrieval, and in particular to a residual expansion fusion method for pedestrian re-identification.
Background
Pedestrian re-identification is a technique for identifying a target pedestrian in existing video sequences that may come from different sources and from cameras with non-overlapping views, and it is widely applied in fields such as public security and smart-city construction. In this context, pedestrian re-identification serves as a key means of urban safety monitoring: it helps city managers and security personnel track suspicious persons, and so more effectively safeguards social order and people's lives and property. Conventional pedestrian re-identification mainly relies on object detection and tracking algorithms, but their performance is often unsatisfactory under occlusion and in complex scenes, which has to some extent driven the combination of deep learning with pedestrian re-identification.
Pedestrian re-identification has a profound effect on urban management and safety protection, and is also useful for monitoring pedestrian behavior. By analyzing pedestrian behavior, abnormal behaviors such as loitering and tailing can be detected in time, so that criminal acts can be prevented and public safety ensured. The technology has therefore drawn considerable attention in computer vision and artificial intelligence research and has been studied intensively by many scholars. As researchers successfully applied deep learning to pedestrian re-identification, the field developed rapidly.
However, in real-world conditions, pedestrian re-identification faces challenges such as cluttered backgrounds, partial occlusion, varying poses, illumination changes, and low resolution, which greatly obscure the features of the same person. Under occlusion in particular, different body parts are often hidden and become invisible, which makes re-identification considerably harder. The common approach at present is to extract global and local features with a multi-scale feature fusion strategy. For example, Zhang et al. proposed a lightweight feature pyramid branch that extracts features from different levels of the network and aggregates them into a bidirectional pyramid structure. However, most current methods do not take into account that features extracted by the low-level layers pass through many network layers, so that the original feature information is lost and the diversity of the pedestrian features is reduced.
Disclosure of Invention
The invention aims to solve the following problem: in pedestrian feature extraction, richer representation information is usually obtained by fusing pedestrian features from multiple layers, but representation information is lost as low-level pedestrian features propagate toward the high-level features, which degrades the recognition performance of the model. To alleviate this problem, the invention provides a residual expansion fusion method for pedestrian re-identification.
The backbone network is built on a modified ResNet-50 to extract effective low-level and high-level features. The residual expansion fusion module then uses residual units of two different sizes, through a weight-sharing mechanism, to generate a copy of the input feature vector for each network layer. The copies are then expanded and fused in the vertical and horizontal directions, so as to retain the key feature information of the original layers and minimize the loss of important features during propagation.
The aim of the invention can be achieved by the following technical scheme:
A residual expansion fusion method for pedestrian re-identification comprises the following steps:
(1) The Market-1501 and CUHK pedestrian data sets are selected as experimental data sets, and each data set is divided into three parts: a training set, a test set, and a query set. Data preprocessing uses techniques such as data normalization, random horizontal flipping, random erasing, and random rotation;
(2) A convolutional neural network is used as the backbone network; its structure comprises 4 stages, each of which alleviates the vanishing-gradient problem through skip connections, so that deeper networks can be trained; this design enables the network to efficiently extract multi-scale c