CN-121982638-A - Dense pedestrian detection method, dense pedestrian detection device, dense pedestrian detection equipment, dense pedestrian detection medium and dense pedestrian detection product

Abstract

The invention discloses a dense pedestrian detection method, device, equipment, medium and product, and relates to the technical field of computer vision. The method comprises: acquiring a dense pedestrian image to be detected; and inputting the dense pedestrian image into a preset dense pedestrian detection model to output a dense pedestrian detection result. The dense pedestrian detection model is constructed based on an improved YOLOv network model, the improvements comprising: introducing an adaptive frequency weighting mechanism on the basis of the large-receptive-field wavelet convolution WTConv to form an adaptive wavelet transform convolution AWTConv; embedding the adaptive wavelet transform convolution AWTConv into the C3k2 module of the Backbone network of the YOLOv network model to form a C3k2_AWTConv module; and introducing a density-adaptive mechanism on the basis of the WIoUv loss function to form an adaptive loss function WIOU-DC. The invention can effectively improve the accuracy and recall rate of dense pedestrian target detection.

Inventors

WU YAO
ZHANG YIZHI
GAN JING
LI ZHUOYU
GUI YAOCHENG
ZHOU JIBIAO

Assignees

Nanjing University of Posts and Telecommunications (南京邮电大学)

Dates

Publication Date: 20260505
Application Date: 20260116

Claims (10)

1. A dense pedestrian detection method, comprising: acquiring a dense pedestrian image to be detected; inputting the dense pedestrian image into a preset dense pedestrian detection model, and outputting a dense pedestrian detection result; wherein the dense pedestrian detection model is constructed based on an improved YOLOv network model, the improvement to the YOLOv network model comprising: introducing an adaptive frequency weighting mechanism on the basis of the large-receptive-field wavelet convolution WTConv to form an adaptive wavelet transform convolution AWTConv, and embedding the adaptive wavelet transform convolution AWTConv into the C3k2 module of the Backbone network of the YOLOv network model to form a C3k2_AWTConv module; and introducing a density-adaptive mechanism on the basis of the WIoUv loss function to form an adaptive loss function WIOU-DC, and adopting the adaptive loss function WIOU-DC to optimize the bounding box regression.
2. The dense pedestrian detection method of claim 1, wherein the training process of the dense pedestrian detection model comprises: acquiring a dense pedestrian data set, and labeling the dense pedestrian images in the dense pedestrian data set such that the label boxes of pedestrians, riders, and the unoccluded parts of occluded pedestrians are unified into a single category; dividing the labeled dense pedestrian data set into a training set, a test set and a validation set according to a preset ratio; and training the dense pedestrian detection model using the training set, the test set and the validation set.
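3. The dense pedestrian detection method of claim 1, wherein the adaptive wavelet transform convolution AWTConv comprises: decomposing the input image X into a low-frequency component, a horizontal high-frequency component, a vertical high-frequency component and a diagonal high-frequency component by means of a set of predefined Haar wavelet filters, a learnable scalar coefficient α being introduced in the decomposition process, which is expressed as: $[X_{LL}, X_{LH}, X_{HL}, X_{HH}] = \mathrm{Conv}([\tilde{f}_{LL}, \tilde{f}_{LH}, \tilde{f}_{HL}, \tilde{f}_{HH}], X) = \mathrm{WT}(X)$, with $\tilde{f}_{k} = \alpha_{k} f_{k}$ for $k \in \{LL, LH, HL, HH\}$; in the formula, $X_{LL}, X_{LH}, X_{HL}, X_{HH}$ are the low-frequency component, the horizontal high-frequency component, the vertical high-frequency component and the diagonal high-frequency component, respectively; $\mathrm{Conv}$ is the convolution operation; $f_{LL}, f_{LH}, f_{HL}, f_{HH}$ are the Haar wavelet filters corresponding to $X_{LL}, X_{LH}, X_{HL}, X_{HH}$, respectively; $\alpha_{LL}, \alpha_{LH}, \alpha_{HL}, \alpha_{HH}$ are the scalar coefficients corresponding to $X_{LL}, X_{LH}, X_{HL}, X_{HH}$, respectively; $\tilde{f}_{LL}, \tilde{f}_{LH}, \tilde{f}_{HL}, \tilde{f}_{HH}$ are the Haar wavelet filters modulated by the respective scalar coefficients; and $\mathrm{WT}$ is the wavelet transform; and using the inverse wavelet transform $\mathrm{IWT}$ to reconstruct and resynthesize the decomposition result to generate the output $Y$: $Y = \mathrm{IWT}(\mathrm{Conv}(W, \mathrm{WT}(X)))$; in the formula, $W$ is the weight tensor of the depthwise convolution kernel.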
4. The dense pedestrian detection method of claim 1, wherein the adaptive loss function WIOU-DC comprises: constructing a density factor based on the density of targets around a prediction box, the density factor being defined in terms of the intersection-over-union (IoU) between one prediction box and another prediction box, an IoU threshold, and an indicator function that takes the value 1 when the condition in brackets is satisfied and 0 otherwise; and constructing the adaptive loss function WIOU-DC based on the density factor and the WIoUv loss function.
5. The dense pedestrian detection method of claim 1, wherein the improvement to the YOLOv network model further comprises: introducing a weighted bi-directional feature pyramid BiFPN, and embedding the weighted bi-directional feature pyramid BiFPN into the Concat module of the Neck network of the YOLOv network model to form a BiFPN_Concat module.
6. The dense pedestrian detection method of claim 1, wherein the improvement to the YOLOv network model further comprises: introducing dynamic convolution DynamicConv into the Conv module of the Neck network of the YOLOv network model.
7. A dense pedestrian detection apparatus, characterized by comprising: an image acquisition module configured to acquire a dense pedestrian image to be detected; and a detection and identification module configured to input the dense pedestrian image into a preset dense pedestrian detection model and output a dense pedestrian detection result; wherein the dense pedestrian detection model is constructed based on an improved YOLOv network model, the improvement to the YOLOv network model comprising: introducing an adaptive frequency weighting mechanism on the basis of the large-receptive-field wavelet convolution WTConv to form an adaptive wavelet transform convolution AWTConv, and embedding the adaptive wavelet transform convolution AWTConv into the C3k2 module of the Backbone network of the YOLOv network model to form a C3k2_AWTConv module; and introducing a density-adaptive mechanism on the basis of the WIoUv loss function to form an adaptive loss function WIOU-DC, and adopting the adaptive loss function WIOU-DC to optimize the bounding box regression.
8. An electronic device, comprising a processor and a storage medium; the storage medium being used for storing instructions; the processor being operative according to the instructions to perform the steps of the method according to any one of claims 1-6.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1-6.
10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of any of claims 1-6.
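
To make the decomposition and reconstruction steps of claim 3 concrete, the following is a minimal pure-Python sketch, not the patented implementation. It assumes the standard orthonormal 2×2 Haar filters applied with stride 2 and one scalar coefficient per sub-band (the dictionary `alpha` and both function names are hypothetical). It omits the depthwise convolution that the claim applies between the wavelet transform and its inverse. The sub-band labels LL, LH, HL and HH, and which of LH/HL is called "horizontal", are also assumptions.

```python
from typing import Dict, List

Matrix = List[List[float]]

# Orthonormal 2x2 Haar analysis filters, applied with stride 2.
# The labels are a common convention, assumed here for illustration.
HAAR: Dict[str, Matrix] = {
    "LL": [[0.5, 0.5], [0.5, 0.5]],     # low-frequency (block average)
    "LH": [[0.5, -0.5], [0.5, -0.5]],   # difference across columns
    "HL": [[0.5, 0.5], [-0.5, -0.5]],   # difference across rows
    "HH": [[0.5, -0.5], [-0.5, 0.5]],   # diagonal difference
}


def adaptive_haar_decompose(x: Matrix, alpha: Dict[str, float]) -> Dict[str, Matrix]:
    """Split x (even height and width) into four half-size sub-bands.

    Each Haar filter is scaled by its scalar alpha[name] before being applied,
    which stands in for the learnable coefficient described in the claim.
    """
    h, w = len(x), len(x[0])
    bands: Dict[str, Matrix] = {}
    for name, f in HAAR.items():
        a = alpha[name]
        bands[name] = [
            [
                a * sum(f[di][dj] * x[2 * i + di][2 * j + dj]
                        for di in (0, 1) for dj in (0, 1))
                for j in range(w // 2)
            ]
            for i in range(h // 2)
        ]
    return bands


def haar_reconstruct(bands: Dict[str, Matrix]) -> Matrix:
    """Standard inverse Haar transform (transposed stride-2 convolution)."""
    h2, w2 = len(bands["LL"]), len(bands["LL"][0])
    y = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for name, f in HAAR.items():
        for i in range(h2):
            for j in range(w2):
                for di in (0, 1):
                    for dj in (0, 1):
                        y[2 * i + di][2 * j + dj] += f[di][dj] * bands[name][i][j]
    return y


if __name__ == "__main__":
    image = [[1.0, 2.0, 3.0, 4.0],
             [5.0, 6.0, 7.0, 8.0],
             [9.0, 10.0, 11.0, 12.0],
             [13.0, 14.0, 15.0, 16.0]]
    unit = {name: 1.0 for name in HAAR}
    print(haar_reconstruct(adaptive_haar_decompose(image, unit)))
```

With every coefficient set to 1 the round trip returns the input unchanged. Setting the three high-frequency coefficients to 0 leaves only the mean of each 2×2 block, which illustrates how the coefficients re-weight the frequency content before reconstruction.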

Description

Dense pedestrian detection method, dense pedestrian detection device, dense pedestrian detection equipment, dense pedestrian detection medium and dense pedestrian detection product

Technical Field

The invention relates to the technical field of computer vision, and in particular to a dense pedestrian detection method, device, equipment, medium and product.

Background

Pedestrian detection in dense scenes is a core task in pedestrian detection and computer vision. It aims to perform data processing and pattern recognition on images or videos containing target objects, and has important application value in fields such as traffic management and security monitoring. In recent years, single-stage detection algorithms represented by the YOLO series have attracted attention because of their high efficiency. Among them, YOLOv significantly improves detection speed and accuracy compared with earlier YOLO models, but it still has shortcomings in dense pedestrian scenes, such as limited multi-scale feature fusion, limited feature extraction capability of the Backbone, and insufficient performance of the loss function, so its detection performance still needs to be improved.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provide a dense pedestrian detection method, device, equipment, medium and product, so as to improve the detection performance of computer vision in dense pedestrian scenes.
To achieve the above purpose, the invention adopts the following technical solutions.
In a first aspect, the present invention provides a dense pedestrian detection method, comprising: acquiring a dense pedestrian image to be detected; inputting the dense pedestrian image into a preset dense pedestrian detection model, and outputting a dense pedestrian detection result; wherein the dense pedestrian detection model is constructed based on an improved YOLOv network model, the improvement to the YOLOv network model comprising: introducing an adaptive frequency weighting mechanism on the basis of the large-receptive-field wavelet convolution WTConv to form an adaptive wavelet transform convolution AWTConv, and embedding the adaptive wavelet transform convolution AWTConv into the C3k2 module of the Backbone network of the YOLOv network model to form a C3k2_AWTConv module; and introducing a density-adaptive mechanism on the basis of the WIoUv loss function to form an adaptive loss function WIOU-DC, and adopting the adaptive loss function WIOU-DC to optimize the bounding box regression.
Optionally, the training process of the dense pedestrian detection model includes: acquiring a dense pedestrian data set, and labeling the dense pedestrian images in the dense pedestrian data set such that the label boxes of pedestrians, riders, and the unoccluded parts of occluded pedestrians are unified into a single category; dividing the labeled dense pedestrian data set into a training set, a test set and a validation set according to a preset ratio; and training the dense pedestrian detection model using the training set, the test set and the validation set.
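
The density-adaptive mechanism in the first aspect above rests on a per-box measure of local crowding. As a minimal sketch, assuming the measure is simply the number of other prediction boxes whose IoU with a given box exceeds a threshold, the counting step could look like this in Python. The source text does not preserve the exact formula, and the names `iou` and `density_counts`, the `(x1, y1, x2, y2)` box layout and the default threshold of 0.5 are all hypothetical. How the resulting factor rescales the WIoUv loss is deliberately not shown, because that formula is not recoverable from the text.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2) with x2 > x1, y2 > y1


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def density_counts(boxes: List[Box], iou_threshold: float = 0.5) -> List[int]:
    """For each box, count the other boxes whose IoU with it exceeds the threshold.

    The comparison plays the role of the indicator function: it contributes 1
    when the condition holds and 0 otherwise. Any normalization of the count
    is an open assumption and is left out.
    """
    counts = []
    for i, box_i in enumerate(boxes):
        counts.append(sum(
            1
            for j, box_j in enumerate(boxes)
            if j != i and iou(box_i, box_j) > iou_threshold
        ))
    return counts


if __name__ == "__main__":
    predictions = [(0, 0, 10, 10), (1, 0, 11, 10), (100, 100, 110, 110)]
    print(density_counts(predictions))
```

A box standing in a crowd receives a larger count than an isolated one, which is the signal a density-adaptive loss would need in order to treat crowded and sparse regions differently.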
Optionally, the adaptive wavelet transform convolution AWTConv includes: decomposing the input image X into a low-frequency component, a horizontal high-frequency component, a vertical high-frequency component and a diagonal high-frequency component by means of a set of predefined Haar wavelet filters, a learnable scalar coefficient α being introduced in the decomposition process, which is expressed as: $[X_{LL}, X_{LH}, X_{HL}, X_{HH}] = \mathrm{Conv}([\tilde{f}_{LL}, \tilde{f}_{LH}, \tilde{f}_{HL}, \tilde{f}_{HH}], X) = \mathrm{WT}(X)$, with $\tilde{f}_{k} = \alpha_{k} f_{k}$ for $k \in \{LL, LH, HL, HH\}$. In the formula, $X_{LL}, X_{LH}, X_{HL}, X_{HH}$ are the low-frequency component, the horizontal high-frequency component, the vertical high-frequency component and the diagonal high-frequency component, respectively; $\mathrm{Conv}$ is the convolution operation; $f_{LL}, f_{LH}, f_{HL}, f_{HH}$ are the Haar wavelet filters corresponding to $X_{LL}, X_{LH}, X_{HL}, X_{HH}$, respectively; $\alpha_{LL}, \alpha_{LH}, \alpha_{HL}, \alpha_{HH}$ are the scalar coefficients corresponding to $X_{LL}, X_{LH}, X_{HL}, X_{HH}$, respectively; $\tilde{f}_{LL}, \tilde{f}_{LH}, \tilde{f}_{HL}, \tilde{f}_{HH}$ are the Haar wavelet filters modulated by the respective scalar coefficients; and $\mathrm{WT}$ is the wavelet transform. The inverse wavelet transform $\mathrm{IWT}$ is then used to reconstruct and resynthesize the decomposition result to generate the output $Y$: $Y = \mathrm{IWT}(\mathrm{Conv}(W, \mathrm{WT}(X)))$. In the formula, $W$ is the weight tensor of the depthwise convolution kernel.
Optionally, the adaptive loss function WIOU-DC includes: constructing a density factor based on the density of targets around a prediction box, the density factor being defined in terms of the intersection-over-union (IoU) between one prediction box and another prediction box, an IoU threshold, and an indicator function that takes the value 1 when the condition in brackets is satisfied and 0 otherwise; and constructing the adaptive loss function WIOU-DC based on the density factor and the WIoUv loss function.
Optionally, the im