CN-122024187-A - Low-illumination traffic target perception method and system based on bionic visual model
Abstract
The application discloses a low-illumination traffic target perception method and system based on a bionic visual model. The method comprises: obtaining an input image of a low-illumination traffic scene and preprocessing it to obtain a preprocessed image; converting the preprocessed image into a luminance map, generating modulation parameters from the luminance values of the luminance map, performing adaptive gain modulation on the preprocessed image or its features using the modulation parameters, and fusing the modulated image or features with structural texture enhancement features to obtain modulation features; and inputting the modulation features into a trained backbone network in which at least one trained feature-level separable learning module is embedded, performing channel segmentation, lightweight convolution learning, gated interactive modulation and fusion on the modulation features, and outputting enhanced multi-scale features. The method and system improve the stability, adaptability and real-time performance of target perception in low-illumination environments.
Inventors
- DONG SHIFENG
- JIAO LIN
Assignees
- 安徽大学 (Anhui University)
Dates
- Publication Date
- 20260512
- Application Date
- 20260413
Claims (11)
- 1. A low-illumination traffic target perception method based on a bionic visual model, characterized by comprising the following steps: S1, acquiring an input image of a low-illumination traffic scene and preprocessing the input image to obtain a preprocessed image; S2, converting the preprocessed image into a luminance map, generating modulation parameters based on luminance values of the luminance map, performing adaptive gain modulation on the preprocessed image or its features using the modulation parameters, and fusing the modulated image or features with structural texture enhancement features to obtain modulation features; S3, inputting the modulation features into a trained backbone network in which at least one trained feature-level separable learning module is embedded, performing channel segmentation, lightweight convolution learning, gated interactive modulation and fusion on the modulation features, and outputting enhanced multi-scale features; S4, inputting the multi-scale features into a trained low-illumination night vision image classification model, performing multi-scale feature alignment and fusion, feature aggregation and classification inference, and outputting the predicted category and confidence of the traffic target.
- 2. The bionic visual model-based low-illumination traffic target perception method according to claim 1, wherein the backbone network has a multi-stage hierarchical structure, and the aforementioned modulation features are input and processed sequentially through a feature transformation layer, a first stage, a second stage, a third stage and a fourth stage; the feature transformation layer performs convolution, normalization, activation and downsampling operations on the modulation features and outputs a first-level feature map; the first stage and the second stage each perform residual convolution processing and spatial downsampling on the feature map output by the preceding stage, and sequentially output a second-level feature map and a third-level feature map; the third stage and the fourth stage each perform residual convolution processing and spatial downsampling on the feature map output by the preceding stage, and further input the resulting feature map into a feature-level separable learning module, obtaining a fourth-level enhanced feature map and a fifth-level enhanced feature map respectively; and the third-level feature map, the fourth-level enhanced feature map and the fifth-level enhanced feature map are taken as the multi-scale features output by the backbone network.
- 3. The bionic visual model-based low-illumination traffic target perception method according to claim 1, wherein the low-illumination night vision image classification model comprises the following sequentially connected processing layers: a multi-scale feature alignment and fusion layer, which receives the multi-scale features from the backbone network as input, aligns feature maps of different scales to a uniform spatial size and channel number through upsampling and pointwise convolution, fuses the aligned feature maps, and outputs a unified fusion feature map; a feature aggregation layer, connected to the multi-scale feature alignment and fusion layer, which performs global average pooling on the unified fusion feature map and outputs a one-dimensional characterization vector; a classification inference layer, connected to the feature aggregation layer and comprising at least one fully connected layer, which maps the characterization vector into high-dimensional classification features, outputs a classification logits vector through the final fully connected layer, and obtains a predicted probability vector through Softmax normalization; and a result output layer, connected to the classification inference layer, which outputs the target category with the highest probability and the corresponding confidence according to the predicted probability vector.
- 4. The bionic visual model-based low-illumination traffic target perception method according to claim 1, wherein step S2 specifically comprises: converting the preprocessed image into a luminance map, and calculating the global average luminance or a local average luminance vector of the luminance map; calculating a global pupil modulation coefficient or a spatial modulation map based on the global average luminance or the local average luminance vector, a preset modulation intensity parameter and a truncation function; performing element-wise multiplicative gain modulation on the preprocessed image, or on features extracted from it, using the global pupil modulation coefficient or the spatial modulation map, to obtain an initial modulation result; performing dynamic range limitation and noise suppression on the initial modulation result; and extracting structural texture enhancement features from the initial modulation result, fusing them with early features of the backbone network, and outputting the modulation features.
- 5. The bionic visual model-based low-illumination traffic target perception method according to claim 4, wherein the global pupil modulation coefficient is calculated by the following formula: $g = \mathrm{clip}\left(\frac{\alpha}{\bar{L}+\epsilon},\, g_{\min},\, g_{\max}\right)$; wherein $g$ denotes the global pupil modulation coefficient, used to uniformly gain-modulate the features of the whole image; $\bar{L}$ denotes the global average luminance of the luminance map; $\alpha$ denotes a preset modulation intensity parameter; $\epsilon$ denotes a very small constant that prevents the denominator from being zero; $g_{\min}$ is the lower limit of the pupil modulation coefficient; $g_{\max}$ is the upper limit of the pupil modulation coefficient; and $\mathrm{clip}(\cdot)$ is the truncation function.
- 6. The bionic visual model-based low-illumination traffic target perception method according to claim 4, wherein the modulation coefficient at each pixel position $(x, y)$ in the spatial modulation map is calculated by the following formula: $g(x, y) = \mathrm{clip}\left(\frac{\alpha}{L(x, y)+\epsilon},\, g_{\min},\, g_{\max}\right)$; wherein $g(x, y)$ denotes the modulation coefficient of the spatial modulation map at pixel position $(x, y)$; $L(x, y)$ denotes the luminance value at pixel position $(x, y)$ in the luminance map, or the local luminance value of the region in which it lies; $\alpha$ denotes a preset modulation intensity parameter; $\epsilon$ denotes a very small constant that prevents the denominator from being zero; $g_{\min}$ is the lower limit of the pupil modulation coefficient; $g_{\max}$ is the upper limit of the pupil modulation coefficient; and $\mathrm{clip}(\cdot)$ is the truncation function.
- 7. The bionic visual model-based low-illumination traffic target perception method according to claim 4, wherein the gain modulation is specifically: $I' = g \odot I$; wherein $g$ denotes the global pupil modulation coefficient, used to uniformly gain-modulate the features of the whole image; $I$ is the preprocessed image; $I'$ is the initial modulation result, i.e. the modulated image; and $\odot$ denotes element-wise multiplication.
- 8. The method according to claim 4, wherein the dynamic range limitation and noise suppression comprise at least one of truncating pixel values to a preset value range, gamma correction, and applying a learnable noise suppression constraint; extracting the structural texture enhancement features from the initial modulation result comprises at least one of computing image gradients, computing Laplacian operator responses, and applying a set of learnable convolution filters for feature extraction; and fusing the structural texture enhancement features with the early features of the backbone network employs at least one of channel concatenation, weighted addition, and gated fusion.
- 9. The bionic visual model-based low-illumination traffic target perception method according to claim 1, wherein the training process of the low-illumination night vision image classification model specifically comprises: constructing a training data set containing low-illumination traffic scene image samples and corresponding ground-truth class labels; inputting the low-illumination traffic scene image samples into a network containing the low-illumination night vision image classification model to be trained, and performing forward propagation to obtain predicted class probabilities; calculating the cross-entropy loss between the predicted class probabilities and the ground-truth class labels; and, based on the cross-entropy loss, updating all trainable parameters of the low-illumination night vision image classification model using the back-propagation algorithm and a gradient descent optimizer until the model converges.
- 10. The bionic visual model-based low-illumination traffic target perception method according to claim 1, wherein step S1 specifically comprises: acquiring an input image of a low-illumination traffic scene from a vehicle-mounted camera, a roadside camera or a mobile terminal camera; performing geometric and imaging corrections on the input image, the corrections comprising at least one of lens distortion correction, de-jitter, white balance correction, defogging, and raindrop removal; scaling the corrected image to a preset network input size to complete size normalization; and performing pixel normalization on the size-normalized image so that the pixel value range meets the network input requirements, obtaining the preprocessed image.
- 11. A low-illumination traffic target perception system based on a bionic visual model, characterized by comprising: an image preprocessing module, configured to acquire an input image of a low-illumination traffic scene and preprocess it to obtain a preprocessed image; a light-adaptive pupil sensing module, configured to convert the preprocessed image into a luminance map, generate modulation parameters based on luminance values of the luminance map, perform adaptive gain modulation on the preprocessed image or its features using the modulation parameters, and fuse the modulated image or features with structural texture enhancement features to obtain modulation features; a feature decoupling learning module, configured to input the modulation features into a trained backbone network in which at least one trained feature-level separable learning module is embedded, perform channel segmentation, lightweight convolution learning, gated interactive modulation and fusion on the modulation features, and output enhanced multi-scale features; and a classification output module, configured to input the multi-scale features into a trained low-illumination night vision image classification model, perform multi-scale feature alignment and fusion, feature aggregation and classification inference, and output the predicted category and confidence of the traffic target.
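The preprocessing recited in claim 10 (size normalization to a preset network input size, then pixel normalization) can be sketched as follows. This is an illustrative, non-authoritative reading of the claim: the nearest-neighbour scaling, the 8-bit input range and the mean/std values are all assumptions, not part of the patent.

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour scaling to a preset network input size (illustrative)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[min(in_h - 1, i * in_h // out_h)][min(in_w - 1, j * in_w // out_w)]
             for j in range(out_w)] for i in range(out_h)]

def normalize(img, mean=0.5, std=0.25):
    """Pixel normalization: scale assumed 8-bit values to [0, 1], then standardize
    with hypothetical mean/std so the range meets the network input requirement."""
    return [[((p / 255.0) - mean) / std for p in row] for row in img]

raw = [[0, 64], [128, 255]]            # toy 2x2 8-bit image
resized = resize_nearest(raw, 4, 4)    # size normalization
net_in = normalize(resized)            # pixel normalization
print(len(net_in), len(net_in[0]))
```

Geometric corrections (distortion, de-jitter, defogging, raindrop removal) are camera- and library-specific and are deliberately omitted from this sketch.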
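The pupil-style gain modulation of claims 5 to 8 can be sketched in pure Python. The inverse-luminance gain form $g = \mathrm{clip}(\alpha/(\bar{L}+\epsilon), g_{\min}, g_{\max})$ follows the term definitions in the claims (darker scenes receive a larger gain, like a dilating pupil); the concrete parameter values and the flattened one-dimensional "image" are illustrative assumptions only.

```python
def clip(v, lo, hi):
    """Truncation function: bound v to [lo, hi]."""
    return max(lo, min(hi, v))

def global_pupil_gain(luminance, alpha=0.5, eps=1e-6, g_min=1.0, g_max=8.0):
    """Global pupil modulation coefficient from the mean luminance (claim 5)."""
    mean_l = sum(luminance) / len(luminance)
    return clip(alpha / (mean_l + eps), g_min, g_max)

def spatial_pupil_gains(luminance, alpha=0.5, eps=1e-6, g_min=1.0, g_max=8.0):
    """Per-pixel spatial modulation map (claim 6)."""
    return [clip(alpha / (l + eps), g_min, g_max) for l in luminance]

def gain_modulate(pixels, gains, gamma=0.8):
    """Element-wise gain modulation (claim 7), then dynamic-range truncation and
    gamma correction (two of the options listed in claim 8)."""
    out = []
    for p, g in zip(pixels, gains):
        v = clip(p * g, 0.0, 1.0)   # truncate to a preset value range
        out.append(v ** gamma)      # gamma correction
    return out

img = [0.02, 0.05, 0.10, 0.20]      # toy dark "image", luminance in [0, 1]
g = global_pupil_gain(img)
modulated = gain_modulate(img, [g] * len(img))
print(round(g, 3), [round(v, 3) for v in modulated])
```

Note how the truncation limits $g_{\min}$/$g_{\max}$ keep the gain bounded even for near-black inputs, which is the stability property the claims emphasize.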
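The feature-level separable learning module of step S3 (channel segmentation, lightweight convolution learning, gated interactive modulation, fusion) can be read as a split-transform-gate-merge block. The sketch below is a heavily simplified guess at that structure: features are lists of 1-D channels, the "light convolution" is a fixed 3-tap kernel, and gating is a sigmoid of the second channel group; none of these concrete choices comes from the patent.

```python
import math

def light_conv1d(channel, kernel=(0.25, 0.5, 0.25)):
    """Lightweight 3-tap convolution with zero padding, standing in for the
    'lightweight convolution learning' branch (kernel is illustrative)."""
    n, k = len(channel), len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(n)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def separable_learning_module(channels):
    """Channel split -> light conv -> gated interactive modulation -> fusion."""
    half = len(channels) // 2
    a, b = channels[:half], channels[half:]          # channel segmentation
    a = [light_conv1d(c) for c in a]                 # light convolution learning
    gates = [[sigmoid(v) for v in c] for c in b]     # gating branch from group b
    a = [[x * g for x, g in zip(c, gc)]              # gated interactive modulation
         for c, gc in zip(a, gates)]
    return a + b                                     # fusion by concatenation

feat = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [0.0, 1.0, 0.0], [2.0, -1.0, 0.0]]
out = separable_learning_module(feat)
print(len(out), [round(v, 3) for v in out[0]])
```

Only half the channels pass through the convolution branch, which is why such split designs are cheaper than convolving every channel; the other half is reused as a gate and forwarded unchanged.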
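The classification head of claim 3 (global average pooling, fully connected layers, Softmax, then the highest-probability class with its confidence) is standard enough to sketch directly. The feature map, weights and class count below are toy values invented for illustration.

```python
import math

def global_avg_pool(fmap):
    """Feature aggregation: list of 2-D channels -> 1-D characterization vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def linear(vec, weights, bias):
    """Fully connected layer: logits[k] = sum_j W[k][j]*vec[j] + b[k]."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Softmax normalization of the classification logits vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# toy 2-channel 2x2 feature map and a 3-class head with made-up weights
fmap = [[[0.2, 0.4], [0.6, 0.8]], [[1.0, 1.0], [1.0, 1.0]]]
vec = global_avg_pool(fmap)
W = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.2, -0.5]
probs = softmax(linear(vec, W, b))
cls = max(range(len(probs)), key=probs.__getitem__)
print(cls, round(probs[cls], 3))   # predicted category and its confidence
```

The result output layer of the claim corresponds to the final `argmax` plus the associated probability, reported as the confidence.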
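The training loop of claim 9 (forward propagation, cross-entropy loss against ground-truth labels, back-propagation with gradient descent) can be demonstrated on the smallest possible model: a linear softmax classifier, where the gradient of the cross-entropy with respect to the logits is simply `p - one_hot(label)`. Everything here (learning rate, toy sample, zero initialization) is an assumption for illustration; the patent's actual model is a deep network.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [x / s for x in e]

def cross_entropy(probs, label):
    """CE between predicted class probabilities and the true class label."""
    return -math.log(probs[label] + 1e-12)

def train_step(W, b, x, label, lr=0.5):
    """One forward + backward pass with a gradient-descent update.
    dCE/dlogits = p - one_hot(label) for a softmax classifier."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk
              for row, bk in zip(W, b)]
    p = softmax(logits)
    loss = cross_entropy(p, label)
    for k in range(len(W)):
        g = p[k] - (1.0 if k == label else 0.0)
        for j in range(len(x)):
            W[k][j] -= lr * g * x[j]
        b[k] -= lr * g
    return loss

# toy 2-D sample whose true class is 0; repeated steps drive the loss down
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
x, y = [1.0, -1.0], 0
losses = [train_step(W, b, x, y) for _ in range(5)]
print(round(losses[0], 4), round(losses[-1], 4))
```

The claim's "until the model converges" corresponds to iterating such steps over the whole training set until the loss plateaus.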
Description
Low-illumination traffic target perception method and system based on bionic visual model
Technical Field
The invention relates to the technical field of computer vision and intelligent transportation, and in particular to a low-illumination traffic target perception method and system based on a bionic visual model.
Background
Traffic target perception is a core foundational capability of systems such as automatic driving, vehicle-road coordination and intelligent traffic monitoring; its perception objects cover vehicles, pedestrians, non-motorized vehicles, traffic lights, traffic signs, road barriers, road structure information and the like. Reliable target perception is of great significance for guaranteeing driving safety and improving traffic efficiency. Images captured by an imaging system suffer severe illumination deficiency at night, at tunnel entrances and exits, under intense backlight, and in severe weather such as rain, fog and snow. A low-illumination environment causes underexposure, dynamic range compression, markedly amplified noise and color distortion in the image, so that target edges become blurred, texture details are lost, and contrast drops sharply. These degradation effects severely impair the performance of traditional visual perception algorithms designed for normal illumination conditions, manifesting as an increased target miss rate, more false detections and confused category identification, thereby directly restricting the reliable application of intelligent transportation systems under all-weather, all-scene conditions.
In order to cope with the low-illumination challenge, the prior art has mainly evolved along two directions. The first is a two-stage "enhance, then perceive" paradigm: an image enhancement technique (such as histogram equalization, methods derived from Retinex theory, or deep-learning-based enhancement networks) is first used to improve the visual quality of the low-illumination image, and the enhanced image is then fed into a target detection or classification model for inference. Because the enhancement objective (visual quality) and the perception-task objective (feature separability) are inconsistent, such schemes often introduce irrelevant artifacts or over-smoothing during enhancement, which instead damages the detail features critical to the perception task; moreover, the cascaded pipeline introduces additional computational overhead. The second is to design an end-to-end complex network architecture, improving the model's feature expression capability on difficult samples by introducing attention mechanisms, multi-branch fusion or deeper networks. However, such schemes tend to have limited adaptability to abrupt illumination changes and high model complexity, making it difficult to meet real-time requirements on vehicle-mounted or roadside edge devices with limited computational resources. Therefore, the lack of a low-illumination traffic target perception method that simultaneously achieves illumination-condition self-adaptation, efficient feature expression and computational efficiency is a technical problem to be solved in the art.
Disclosure of Invention
In order to solve the above technical problems in the background art, the invention provides a low-illumination traffic target perception method and system based on a bionic visual model.
The invention provides a low-illumination traffic target perception method based on a bionic visual model, comprising the following steps: S1, acquiring an input image of a low-illumination traffic scene and preprocessing the input image to obtain a preprocessed image; S2, converting the preprocessed image into a luminance map, generating modulation parameters based on luminance values of the luminance map, performing adaptive gain modulation on the preprocessed image or its features using the modulation parameters, and fusing the modulated image or features with structural texture enhancement features to obtain modulation features; S3, inputting the modulation features into a trained backbone network in which at least one trained feature-level separable learning module is embedded, performing channel segmentation, lightweight convolution learning, gated interactive modulation and fusion on the modulation features, and outputting enhanced multi-scale features; S4, inputting the multi-scale features into a trained low-illumination night vision image classification model, performing multi-scale feature alignment and fusion, feature aggregation and classification inference, and outputting the predicted category and confidence of the traffic target. Preferably, the backbone network has a multi-stage hierarchical structure, and the previous mo