CN-121982674-A - Traffic marking detection method and system for complex pavement environment

Abstract

The invention discloses a traffic marking detection method and system for a complex pavement environment, and belongs to the technical field of intelligent transportation. The method comprises: acquiring original road images and constructing a data set; and constructing a traffic marking detection model comprising a backbone network, a neck network and a detection head. A structure-gradient double-flow cooperative feature extraction module at the front end of the backbone network uses a physical gradient prior to preserve the true contour of the markings; an orthogonal space decoupling sensing module at the rear end of the backbone network captures the long-range positional dependence of the markings; a gradient calibration alignment fusion module in the neck network achieves pixel-level alignment of shallow edges and deep semantics; and the detection head is optimized with an orthogonal shape perception metric loss function. The trained model is then used to detect traffic markings in complex environments. The invention achieves high-recall recognition and high-precision localization of multiple types of road markings in complex scenes, and reduces the risk of lane misjudgment for directional markings.

Inventors

GUAN DEYONG
ZHU RUI
WANG KE
ZHANG YUTING
ZHENG LEZHI
ZHANG KE

Assignees

Shandong University of Science and Technology (山东科技大学)

Dates

Publication Date: 20260505
Application Date: 20260408

Claims (9)

1. A traffic marking detection method for a complex pavement environment, characterized by comprising the following steps: Step 1, acquiring an original road image, preprocessing the image, and constructing a data set; Step 2, constructing a traffic marking detection model comprising a backbone network, a neck network and a detection head; wherein a structure-gradient double-flow cooperative feature extraction module, namely an SGD-Stem module, is arranged at the input front end of the backbone network and uses a physical gradient prior to capture the true contour of the traffic marking, suppressing high-frequency pavement noise at the source so that the features flowing into the backbone network are clean and the physical boundary remains complete; an orthogonal space decoupling sensing module, namely an OD-SLU module, is arranged between the last C3k2 feature extraction module and the SPPF module of the backbone network, so as to capture the long-range positional dependence of traffic markings on the road and accurately identify elongated markings with directional semantics; a gradient calibration alignment fusion module, namely a GCAF module, is arranged in the neck network and converts the edges of the shallow features from the backbone network into a spatial guidance mask, which clips and calibrates the upsampled deep semantic features pixel by pixel, so as to align semantics with edge details and eliminate the upsampling ghosting of blurred markings; and an orthogonal shape perception metric loss function is used at the detection head to calculate the error between the predicted bounding box and the ground-truth bounding box, and the model is updated through back propagation; and Step 3, training the traffic marking detection model constructed in Step 2 on the data set from Step 1, and using the trained model to detect traffic markings in the complex pavement environment.
2. The traffic marking detection method for a complex pavement environment according to claim 1, wherein in Step 1 the original road image is acquired by a vehicle-mounted camera unit or a roadside surveillance camera; and the image is preprocessed as follows: the original road image is resized to 640 × 640 pixels and normalized so that pixel values are mapped to the interval [0, 1]; Mosaic data augmentation is applied, increasing sample diversity through random cropping, scaling and stitching; and a data set is constructed containing ten categories: crosswalk, left turn, right turn, straight ahead, straight or left turn, straight or right turn, stop line, yellow grid line, diamond line and bus lane.
3. The traffic marking detection method for a complex pavement environment according to claim 1, wherein the SGD-Stem module has the following processing flow: the preprocessed image is input into the SGD-Stem module; a physical gradient prior of the input image is first extracted using multidirectional Sobel operators, and a normalized spatial gradient attention map is generated through a 1 × 1 convolution and Sigmoid activation; the spatial gradient attention map is then applied to the original input image features as a weighting mask through a residual multiplication injection mechanism, so that the high-frequency edge signals of worn markings are enhanced before the spatial resolution is reduced, yielding gradient-guided enhanced features; and finally the gradient-guided enhanced features undergo learnable structured downsampling in a depthwise separable convolution module with a stride of 2, namely a DW-Conv module, to obtain a downsampled feature map.
4. The traffic marking detection method for a complex pavement environment according to claim 3, wherein the Sobel operators comprise a horizontal Sobel operator and a vertical Sobel operator, with which the gray-level gradient components of the image are calculated so as to capture the abrupt illumination boundaries between lane lines, guide arrows and the pavement background; the generated spatial gradient attention map shows high response values in the edge regions of traffic markings and low response values in flat pavement regions, thereby forming a spatial mask prior for subsequent processing; a residual multiplication strategy is adopted, in which the generated spatial gradient attention map is used as a weighting coefficient and a pixel-by-pixel Hadamard product is taken with the original input image feature $X_{in}$; the gradient-guided enhanced features are calculated as follows: [formula missing in source]; wherein the terms of the formula denote, respectively, the gradient-guided enhanced features, the original input image feature $X_{in}$, the spatial gradient attention map, and the Hadamard product operation; and the gradient-guided enhanced features are processed by batch normalization and the SiLU nonlinear activation function in the DW-Conv module, which outputs a downsampled feature map with a high signal-to-noise ratio and rich edge details.
5. The traffic marking detection method for a complex pavement environment according to claim 1, wherein the OD-SLU module has the following processing flow: global average pooling is applied to the input high-dimensional feature tensor along the height dimension and along the width dimension separately, yielding a feature vector that retains positional dependence in the width direction and a feature vector that retains positional dependence in the height direction; the two feature vectors then enter two independent convolutional encoding paths that do not share parameters, and Sigmoid attention weight maps for the two directions are generated using 1 × 1 convolutions and nonlinear activation functions; and finally the generated Sigmoid attention weight maps are used to weight the originally input high-dimensional feature tensor through an element-wise Hadamard product, and a semantic feature map is output.
6. The traffic marking detection method for a complex pavement environment according to claim 1, wherein the processing flow of the GCAF module is as follows: the two input features of the GCAF module are $F_{deep}$ and $F_{shallow}$, wherein $F_{deep}$ is the low-resolution semantic feature from the deep layers of the backbone network and $F_{shallow}$ is the high-resolution edge feature from the shallow layers of the network; the spatial scale of $F_{deep}$ is enlarged by a factor of two using bilinear interpolation, which computes each target pixel as a distance-weighted average of the values of its 4 neighboring pixels to generate an enlarged feature map, and feature smoothing and channel mapping are then performed with a 3 × 3 standard convolution to obtain the enlarged and smoothed deep feature; the mathematical representation of this process is: $F_{up} = \mathrm{Conv}_{3\times3}(\mathrm{Bilinear}(F_{deep}))$; wherein $F_{up}$ represents the enlarged and smoothed deep feature, $\mathrm{Bilinear}(\cdot)$ represents the bilinear interpolation function, and $\mathrm{Conv}_{3\times3}(\cdot)$ represents a standard convolution operation with a 3 × 3 kernel; the channel dimension of the high-resolution edge feature $F_{shallow}$ is reduced to 1 by a 1 × 1 convolution layer, and a spatial calibration mask is generated by a Sigmoid activation function; the mathematical representation of this process is: $M = \sigma(\mathrm{Conv}_{1\times1}(F_{shallow}))$; wherein $M$ represents the spatial calibration mask, $\mathrm{Conv}_{1\times1}(\cdot)$ represents the 1 × 1 convolution dimension reduction operation, and $\sigma(\cdot)$ represents the Sigmoid activation function; a Hadamard product is taken between the generated spatial calibration mask $M$ and the enlarged and smoothed deep feature $F_{up}$; the mathematical representation of this process is: $F_{cal} = M \odot F_{up}$; wherein $\odot$ represents the Hadamard product operation and $F_{cal}$ is the calibrated deep feature tensor; finally, the calibrated deep feature tensor $F_{cal}$ is concatenated with the original shallow high-resolution edge feature $F_{shallow}$ along the channel dimension and mapped by a 1 × 1 convolution layer to the number of output channels required by the next stage of the network, yielding the final fused feature tensor; the mathematical representation of this process is: $F_{out} = \mathrm{Conv}_{1\times1}(\mathrm{Concat}(F_{cal}, F_{shallow}))$; wherein $F_{out}$ is the final fused feature tensor.
7. The traffic marking detection method for a complex pavement environment according to claim 1, wherein the orthogonal shape perception metric loss function is calculated as follows: the ground-truth bounding box and the predicted bounding box of a traffic marking in the image are each described by the coordinates of their center point together with their width and height, and the width and height of the minimum enclosing rectangle containing both boxes are denoted $C_w$ and $C_h$ respectively; first, the center-point offset and width-height deformation errors between the predicted bounding box and the ground-truth bounding box are forcibly and orthogonally decomposed into two independent penalty terms along the horizontal X axis and the vertical Y axis, the mathematical expression of this process being: [formulas missing in source]; wherein the two terms are the horizontal independent distance penalty and the vertical independent distance penalty, and a very small constant prevents the divisor from being zero; cross penalty weight factors are then designed using the aspect ratio of the ground-truth bounding box, the mathematical expression of this process being: [formulas missing in source]; and the orthogonally decoupled, shape-weighted distance metrics are mapped into a smooth nonlinear penalty term by an exponential decay function, and the final bounding box regression loss is calculated in combination with the basic intersection over union (IoU), the mathematical expression of this process being: [formulas missing in source].
8. The traffic marking detection method for a complex pavement environment according to claim 6, further comprising the following steps: the final fused feature tensor is sent to the detection head of the network for decoding and prediction; the detection head comprises two decoupled parallel branches for classification and regression, which predict through convolutional mapping, respectively, the probability that a region belongs to each category of traffic marking and the center-point coordinates, width and height of the marking's bounding box; because, in actual prediction, a single traffic marking target produces several spatially overlapping candidate prediction boxes around it, a non-maximum suppression (NMS) post-processing algorithm is applied last; the NMS algorithm sorts the prediction boxes by confidence score from high to low and removes redundant bounding boxes whose geometric overlap with a higher-scoring box exceeds a set threshold; and the accurate marking category, confidence score and two-dimensional coordinates of the target bounding box are finally output, completing end-to-end automatic traffic marking detection.
9. A traffic marking detection system for a complex pavement environment, characterized by comprising the following modules: a preprocessing module, used for acquiring an original road image, preprocessing the image, and constructing a data set; and a prediction module, used for constructing a traffic marking detection model comprising a backbone network, a neck network and a detection head; wherein a structure-gradient double-flow cooperative feature extraction module, namely an SGD-Stem module, is arranged at the input front end of the backbone network and uses a physical gradient prior to capture the true contour of the traffic marking, suppressing high-frequency pavement noise at the source so that the features flowing into the backbone network are clean and the physical boundary remains complete; an orthogonal space decoupling sensing module, namely an OD-SLU module, is arranged between the last C3k2 feature extraction module and the SPPF module of the backbone network, so as to capture the long-range positional dependence of traffic markings on the road and accurately identify elongated markings with directional semantics; a gradient calibration alignment fusion module, namely a GCAF module, is arranged in the neck network and converts the edges of the shallow features from the backbone network into a spatial guidance mask, which clips and calibrates the upsampled deep semantic features pixel by pixel, so as to align semantics with edge details and eliminate the upsampling ghosting of blurred markings; an orthogonal shape perception metric loss function is used at the detection head to calculate the error between the predicted bounding box and the ground-truth bounding box, and the model is updated through back propagation; and the traffic marking detection model is trained on the preprocessed data, and the trained model is used to detect traffic markings in the complex pavement environment.
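
By way of illustration only, the data flow of the SGD-Stem attention (claims 3 and 4), the OD-SLU directional weighting (claim 5) and the GCAF fusion (claim 6) may be sketched in NumPy as follows. This is a simplified sketch under stated assumptions, not the claimed implementation: the learned 1 × 1 and 3 × 3 convolutions are replaced by fixed scalars or channel means, nearest-neighbour upsampling is swapped in for bilinear interpolation, the final 1 × 1 output convolution is omitted, and the residual multiplication is read as `x * (1 + attn)`. All function names and parameters (`gradient_attention`, `scale`, `bias`, `w_h`, `w_w`, and so on) are hypothetical.

```python
import numpy as np

# Fixed 3x3 Sobel kernels for horizontal and vertical gray-level gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def filter3x3(img, kernel):
    """Correlate a 2-D image with a 3x3 kernel, zero padded to keep the size."""
    padded = np.pad(img, 1)
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out


def gradient_attention(gray, scale=1.0, bias=-2.0):
    """Sobel gradient prior followed by a Sigmoid, giving a map in (0, 1).

    scale and bias stand in for the learned 1x1 convolution (assumption).
    """
    gx = filter3x3(gray, SOBEL_X)
    gy = filter3x3(gray, SOBEL_Y)
    return sigmoid(scale * (np.abs(gx) + np.abs(gy)) + bias)


def residual_inject(x, attn):
    """Residual multiplicative injection, read here as x + x * attn (assumption)."""
    return x * (1.0 + attn)


def directional_attention(feat, w_h=1.0, w_w=1.0):
    """Pool a (C, H, W) tensor along each spatial axis and reweight it.

    w_h and w_w stand in for the two unshared 1x1 convolution paths (assumption).
    """
    along_h = feat.mean(axis=2, keepdims=True)  # (C, H, 1): position kept along height
    along_w = feat.mean(axis=1, keepdims=True)  # (C, 1, W): position kept along width
    return feat * sigmoid(w_h * along_h) * sigmoid(w_w * along_w)


def gcaf_fuse(deep, shallow):
    """Upsample deep features x2, mask them with a shallow-edge mask, concatenate."""
    up = deep.repeat(2, axis=1).repeat(2, axis=2)        # nearest-neighbour, not bilinear
    mask = sigmoid(shallow.mean(axis=0, keepdims=True))  # (1, H, W) calibration mask
    return np.concatenate([up * mask, shallow], axis=0)


# Usage: a synthetic image with one vertical step edge at column 4.
gray = np.zeros((8, 8))
gray[:, 4:] = 1.0
attn = gradient_attention(gray)                 # high near the edge, low on flat areas
stem = residual_inject(gray[None, :, :], attn)  # (1, 8, 8) enhanced input
fused = gcaf_fuse(np.ones((6, 4, 4)), directional_attention(np.ones((4, 8, 8))))
print(attn.shape, stem.shape, fused.shape)
```

The attention map is computed before any downsampling, which mirrors the claimed ordering in which edge signals are enhanced before the spatial resolution is reduced.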

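The non-maximum suppression post-processing recited in claim 8 may likewise be sketched as follows. This is a minimal, class-agnostic sketch: the helper names `to_corners`, `iou` and `nms`, the corner box format `(x1, y1, x2, y2)`, and the default threshold of 0.5 are assumptions made for illustration and are not specified in the claims.

```python
def to_corners(cx, cy, w, h):
    """Convert a (center x, center y, width, height) box to (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, in descending order of score.

    Boxes are visited from highest to lowest confidence; a box is dropped
    when its IoU with any already-kept box exceeds iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Usage: two heavily overlapping candidates for one marking, plus a distant box.
boxes = [(10, 10, 50, 30), (12, 11, 52, 31), (100, 100, 140, 120)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

In a multi-class detector the same routine would typically be run per category, so that overlapping markings of different types do not suppress one another; the claims do not state which variant is used.
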
Description

Traffic marking detection method and system for complex pavement environment

Technical Field

The invention relates to the technical field of intelligent transportation, in particular to a traffic marking detection method and system for a complex pavement environment.

Background

Road traffic markings are a core element of urban traffic management systems and intelligent driving perception systems. As an intuitive visual navigation language, road traffic markings not only guide vehicle trajectories and allocate right of way, but also provide an important perceptual basis for lane keeping, path planning and traffic rule compliance in advanced driver assistance systems and autonomous vehicles.

With the rapid development of computer vision and artificial intelligence, automatic road marking detection has gradually evolved from early traditional image processing methods based on color threshold segmentation, edge detection, the Hough transform and the like to deep learning detection methods based on deep convolutional neural networks. Whereas traditional methods often depend on hand-crafted features, deep learning methods automatically extract high-dimensional image features through large-scale data training, and have made remarkable progress in detection efficiency and generalization capability.

Among current mainstream technical schemes, single-stage object detection algorithms represented by the YOLO series are widely applied in vehicle-mounted terminals and roadside monitoring equipment. Existing YOLOv and other models typically employ a backbone network (Backbone) of stacked convolutional layers to extract image features, use attention mechanisms to enhance feature saliency, and regress the target bounding box through a loss function based on the intersection over union (IoU).
However, in complex pavement environments, existing traffic marking detection technology suffers under complex interference from weak edge feature extraction, insufficient spatial perception of elongated directional targets, and a tendency toward edge misalignment and feature ghosting in the multi-scale feature fusion stage.

Disclosure of Invention

The invention aims to provide a traffic marking detection method for a complex pavement environment that solves the prior-art problems, under complex interference, of weak edge feature extraction, insufficient spatial perception of elongated directional targets, and edge misalignment and feature ghosting in the multi-scale feature fusion stage.

In order to achieve the above purpose, the invention adopts the following technical scheme:

A traffic marking detection method for a complex pavement environment comprises the following steps:

Step 1, acquiring an original road image, preprocessing the image, and constructing a data set;

Step 2, constructing a traffic marking detection model comprising a backbone network, a neck network and a detection head;

A structure-gradient double-flow cooperative feature extraction module, namely an SGD-Stem module, is arranged at the input front end of the backbone network and uses a physical gradient prior to capture the true contour of the traffic marking, suppressing high-frequency pavement noise at the source so that the features flowing into the backbone network are clean and the physical boundary remains complete;

An orthogonal space decoupling sensing module, namely an OD-SLU module, is arranged between the last C3k2 feature extraction module and the SPPF module of the backbone network, so as to capture the long-range positional dependence of traffic markings on the road and accurately identify elongated markings with directional semantics;

A gradient calibration alignment fusion module, namely a GCAF module, is arranged in the neck network and converts the edges of the shallow features from the backbone network into a spatial guidance mask, which clips and calibrates the upsampled deep semantic features pixel by pixel, so as to align semantics with edge details and eliminate the upsampling ghosting of blurred markings;

An orthogonal shape perception metric loss function is used at the detection head to calculate the error between the predicted bounding box and the ground-truth bounding box, and the model is updated through back propagation;

Step 3, training the traffic marking detection model constructed in Step 2 on the data set from Step 1, and using the trained model to detect traffic markings in the complex pavement environment.

In addition, on the basis of the traffic marking detection method for a complex pavement environment, the invention also