CN-122024278-A - Pedestrian re-identification model training method that generates counterfactual samples using fractional Fourier transform
Abstract
The invention relates to the field of computer vision and deep learning, and discloses a method for training a pedestrian re-identification model that uses the fractional Fourier transform to generate counterfactual samples. The method comprises: performing a fractional Fourier transform on a pedestrian image to map it into a fractional frequency-domain space; generating a counterfactual spectrum by amplitude mixing; generating a counterfactual sample with style-intervention characteristics through the inverse transform; fusing the prediction results of the original sample and the counterfactual sample, based on front-door causal adjustment theory, to approximate the intervention distribution; and introducing a causal contrast loss function to strengthen the causal consistency constraint in the feature representation space. With this technical scheme, the influence of domain-related interference factors on the model is effectively weakened without explicitly observing the confounding variables, and the generalization performance and robustness of the pedestrian re-identification model under cross-scene and cross-camera conditions are significantly improved.
Inventors
- Jia Jieru
- Chen Yuwei
- Qian Yuhua
Assignees
- Shanxi University (山西大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-21
Claims (10)
- 1. A method for training a pedestrian re-identification model that uses the fractional Fourier transform to generate counterfactual samples, characterized by comprising the following steps: step 1, collecting pedestrian image samples from a plurality of cameras or monitoring viewpoints, and assigning corresponding identity labels to the pedestrian image samples to form a training sample set; step 2, selecting at least one image from the training sample set as a reference image, and generating at least one counterfactual sample in the frequency domain based on the pedestrian image sample and the reference image; step 3, inputting the original pedestrian image and the corresponding counterfactual sample image into a feature extraction network with shared parameters to obtain corresponding feature representations or prediction results; step 4, constructing, based on the prediction results, an identity loss function that constrains the consistency of identity discrimination; step 5, constructing, based on the similarity relationship between the feature representations of the original pedestrian image and the counterfactual sample image, a causal contrast loss function that pulls the features of same-identity samples closer together and pushes the features of different-identity samples apart; and step 6, training the pedestrian re-identification model according to the identity loss function and the causal contrast loss function to obtain a trained pedestrian re-identification model.
- 2. The method of claim 1, wherein constructing the counterfactual sample image in step 2 includes applying a frequency domain transform to the original pedestrian image and to the selected reference image, respectively, and combining the two based on their frequency domain features to generate counterfactual sample images with different appearance styles.
- 3. The method of claim 2, wherein the frequency domain transform is a fractional Fourier transform and the order of the fractional Fourier transform is within a predetermined range.
- 4. The method of claim 3, wherein the order of the fractional Fourier transform is adaptively determined by a preset strategy including, but not limited to, adjustment based on training stability or model performance metrics.
- 5. The method according to any of claims 2 to 4, characterized in that, in the frequency domain, the image representation is decomposed into amplitude information and phase information, and the amplitude information of the different images is combined while the phase information of the original image is preserved, to generate the counterfactual sample image.
- 6. The method of claim 5, wherein the amplitude information is combined by linear blending with weight coefficients, the weight coefficients being generated according to a preset distribution or a random strategy.
- 7. The method according to claim 1, wherein in step 4, the prediction results include a prediction result obtained from the original pedestrian image and a prediction result obtained from the counterfactual sample image, and the target prediction result for identity discrimination is obtained by weighted fusion of these prediction results.
- 8. The method according to claim 1, wherein in step 5, the similarity relationship is determined based on a distance or a similarity between the feature vectors corresponding to the original pedestrian image and the counterfactual sample image, and the similarity includes at least one of cosine similarity, Euclidean distance, or inner product similarity.
- 9. The method of claim 8, wherein the causal contrast loss function is used to constrain feature similarity between an original pedestrian image of the same identity and its corresponding counterfactual sample image to be higher than feature similarity between samples of different identities.
- 10. The method of claim 1, wherein the identity loss function and causal contrast loss function are weighted according to preset weighting coefficients to form a total loss function for model training.
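The amplitude mixing with phase preservation described in claims 5 and 6 can be illustrated with a minimal sketch. Two points are assumptions and not taken from the claims: the ordinary 2-D FFT from NumPy stands in for the fractional Fourier transform (it corresponds to the special case of order 1, so the fractional order itself is not modeled here), and the blending convention `(1 - lam) * original + lam * reference` with a uniformly sampled `lam` is only one possible reading of "linear blending". The function name `amplitude_mix_counterfactual` is hypothetical.

```python
import numpy as np

def amplitude_mix_counterfactual(image, reference, lam):
    """Blend the amplitude spectra of two images, keeping the phase of `image`.

    Assumption: np.fft.fft2 (ordinary Fourier transform) replaces the
    fractional Fourier transform named in the claims.
    image, reference: float arrays of shape (H, W, C); lam: weight in [0, 1].
    """
    spec_x = np.fft.fft2(image, axes=(0, 1))
    spec_r = np.fft.fft2(reference, axes=(0, 1))

    # Decompose into amplitude and phase.
    amp_x, phase_x = np.abs(spec_x), np.angle(spec_x)
    amp_r = np.abs(spec_r)

    # Linear blend of amplitudes (assumed convention), original phase kept.
    amp_mix = (1.0 - lam) * amp_x + lam * amp_r
    mixed = amp_mix * np.exp(1j * phase_x)

    # Inverse transform; the imaginary part is numerical noise for real inputs.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))

rng = np.random.default_rng(0)
image = rng.random((64, 32, 3))      # stand-in for a pedestrian image
reference = rng.random((64, 32, 3))  # stand-in for a reference image
lam = rng.uniform(0.0, 1.0)          # weight from a random strategy (assumed uniform)
counterfactual = amplitude_mix_counterfactual(image, reference, lam)
```

With `lam = 0` the sketch returns the original image unchanged, and with `lam = 1` the output carries the reference image's amplitude spectrum together with the original image's phase.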
Description
Pedestrian re-identification model training method that generates counterfactual samples using fractional Fourier transform
Technical Field
The invention belongs to the technical field of computer vision and deep learning, and particularly relates to a pedestrian re-identification model training method that generates counterfactual samples using the fractional Fourier transform.
Background
Person re-identification (ReID) aims to match and retrieve the same pedestrian identity across scenes captured by non-overlapping cameras, and is one of the key technologies in intelligent security, smart cities and public safety. Existing pedestrian re-identification methods are mainly based on convolutional neural networks or vision Transformer models, and complete the identity matching task by extracting local and global appearance features of the human body and performing metric learning. However, in real open scenes, pedestrian images are often affected by illumination variation, occlusion, viewpoint shift, resolution degradation, complex background noise and similar factors, so that the feature distribution of the same identity shifts severely across cameras. Mainstream methods currently rely on data augmentation and contrastive learning strategies to mitigate this distribution shift, but the augmentation is mostly limited to spatial-domain perturbation and can hardly model potential distribution changes from the frequency-domain perspective, which limits the generalization capability of the model on unseen scenes.
On the other hand, counterfactual inference, a causal inference paradigm, has gradually been introduced into visual representation learning in recent years. Its core idea is to generate a "counterfactual sample" that is semantically consistent with the original sample but in which the interference factors are artificially controlled, so as to force the model to learn more discriminative and robust features. However, most existing counterfactual sample generation methods rely on explicit image-generating generative adversarial networks, whose training process is complex, unstable and computationally expensive, and which can hardly control precisely how a sample changes in the spectral dimension. Therefore, in the pedestrian re-identification task, how to construct a model training method that requires no additional generative network, can finely regulate image style perturbation in the frequency domain, and can effectively generate high-quality counterfactual samples, thereby improving the robustness and generalization capability of the model in open scenes, is a key technical problem to be solved urgently.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for training a pedestrian re-identification model that uses the fractional Fourier transform to generate counterfactual samples.
In order to solve the above technical problems, the invention adopts the following technical scheme: a method for training a pedestrian re-identification model that uses the fractional Fourier transform to generate counterfactual samples, comprising the following steps:
Step 1, collecting pedestrian image samples from a plurality of cameras or monitoring viewpoints, and assigning corresponding identity labels to the pedestrian image samples to form a training sample set;
Step 2, selecting at least one image from the training sample set as a reference image, and generating at least one counterfactual sample in the frequency domain based on the pedestrian image sample and the reference image;
Step 3, inputting the original pedestrian image and the corresponding counterfactual sample image into a feature extraction network with shared parameters to obtain corresponding feature representations or prediction results;
Step 4, constructing, based on the prediction results, an identity loss function that constrains the consistency of identity discrimination;
Step 5, constructing, based on the similarity relationship between the feature representations of the original pedestrian image and the counterfactual sample image, a causal contrast loss function that pulls the features of same-identity samples closer together and pushes the features of different-identity samples apart;
Step 6, training the pedestrian re-identification model according to the identity loss function and the causal contrast loss function to obtain a trained pedestrian re-identification model.
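Steps 4 to 6 can be sketched as follows, purely for illustration. The text above does not give the loss formulas, so everything concrete here is an assumption: the identity loss is taken as cross-entropy on a weighted average of the two softmax outputs (weight `w`), the causal contrast loss is written in an InfoNCE-like form over cosine similarities with temperature `tau`, and the total loss uses a weighting coefficient `beta`. All function and parameter names are hypothetical, and random arrays stand in for the outputs of the shared feature extraction network.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fused_identity_loss(logits_orig, logits_cf, labels, w=0.5):
    """Cross-entropy on a weighted fusion of the two predictions (assumed form)."""
    probs = w * softmax(logits_orig) + (1.0 - w) * softmax(logits_cf)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def causal_contrast_loss(feat_orig, feat_cf, labels, tau=0.1):
    """InfoNCE-style loss on cosine similarity (assumed form).

    Positive pair: an original feature and its own counterfactual feature.
    Negatives: counterfactual features of samples with a different identity.
    """
    a = feat_orig / np.linalg.norm(feat_orig, axis=1, keepdims=True)
    b = feat_cf / np.linalg.norm(feat_cf, axis=1, keepdims=True)
    sim = a @ b.T / tau
    same_id = labels[:, None] == labels[None, :]
    # Keep the diagonal (positives) and all different-identity entries.
    keep = ~same_id | np.eye(len(labels), dtype=bool)
    sim = np.where(keep, sim, -np.inf)
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.diag(log_prob).mean())

# Stand-ins for network outputs on a batch of 4 images with 3 identities.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 2])
feat_orig = rng.normal(size=(4, 8))
feat_cf = feat_orig + 0.05 * rng.normal(size=(4, 8))
logits_orig = rng.normal(size=(4, 3))
logits_cf = rng.normal(size=(4, 3))

beta = 0.5  # preset weighting coefficient (assumed value)
total = (fused_identity_loss(logits_orig, logits_cf, labels)
         + beta * causal_contrast_loss(feat_orig, feat_cf, labels))
```

In an actual training loop the two losses would be computed with an automatic differentiation framework so that `total` can be backpropagated through the shared network; NumPy is used here only to keep the arithmetic explicit.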
Further, in step 2, the process of constructing the counterfactual sample image includes applying a frequency domain transform to the original pedestrian image and to the selected reference image, respectively, and combining the two based on their frequency domain features to generate