CN-121982667-A - Method and device for identifying pavement drivable area

Abstract

The invention discloses a method and a device for identifying a pavement drivable area. The method comprises: collecting pavement image information of a road through a vehicle-mounted camera; inputting the preprocessed pavement image information into a pre-trained pavement instance segmentation model; outputting, by the pavement instance segmentation model, segmented pavement instances according to the input pavement image information; and performing mask sensing and instance screening on the pavement instances output by the model to identify the drivable area. The pavement instance segmentation model comprises an instance segmentation model, an EMA module embedded in the backbone network of the instance segmentation model, and a SAM module embedded in the neck structure of the instance segmentation model. The invention can improve the accuracy of pavement drivable area identification, reduce the required computing power, and enable real-time, vehicle-mounted deployment.
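The final step of the abstract, instance screening on the segmented pavement instances, can be illustrated with a minimal sketch. The abstract does not state the screening criteria, so the confidence threshold `min_score`, the area threshold `min_area_ratio`, and the function names below are all assumptions chosen for illustration, not the patented procedure.

```python
from typing import List

Mask = List[List[int]]  # binary instance mask, rows of 0/1


def mask_area(mask: Mask) -> int:
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def screen_instances(masks: List[Mask], scores: List[float],
                     min_score: float = 0.5,
                     min_area_ratio: float = 0.2) -> Mask:
    """Keep confident, sufficiently large instances and merge them.

    Both thresholds are hypothetical; the source does not specify them.
    """
    if not masks:
        return []
    h, w = len(masks[0]), len(masks[0][0])
    drivable = [[0] * w for _ in range(h)]
    for mask, score in zip(masks, scores):
        if score < min_score:
            continue  # drop low-confidence instances
        if mask_area(mask) < min_area_ratio * h * w:
            continue  # drop tiny fragments
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    drivable[y][x] = 1  # union of the kept masks
    return drivable


road = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]
speck = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
unsure = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
result = screen_instances([road, speck, unsure], [0.9, 0.95, 0.3])
```

In this toy input only `road` survives: `speck` is confident but too small, and `unsure` is large enough but below the score threshold.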

Inventors

HU SHUGUANG
GUO YUANHAO
ZHANG JIE
CAO JIANKUN
WANG TAO
LI JIAQING

Assignees

中公高科(霸州)养护科技产业有限公司

Dates

Publication Date: 20260505
Application Date: 20251225

Claims (10)

1. A method for identifying a pavement drivable area, comprising: collecting pavement image information of a road through a vehicle-mounted camera; inputting the preprocessed pavement image information into a pre-trained pavement instance segmentation model; outputting, by the pavement instance segmentation model, segmented pavement instances according to the input pavement image information; and performing mask sensing and instance screening on the pavement instances output by the pavement instance segmentation model to identify a drivable area; wherein the pavement instance segmentation model comprises an instance segmentation model, an EMA module embedded in a backbone network of the instance segmentation model, and a SAM module embedded in a neck structure of the instance segmentation model.
2. The method according to claim 1, wherein outputting, by the pavement instance segmentation model, segmented pavement instances according to the input pavement image information specifically comprises: in the pavement instance segmentation model, the backbone network embedded with the EMA module extracts multi-scale feature maps from the input pavement image information, eliminates interference information through the EMA module, and outputs the multi-scale feature maps with the interference information eliminated.
3. The method of claim 2, wherein outputting, by the pavement instance segmentation model, segmented pavement instances according to the input pavement image information further comprises: in the pavement instance segmentation model, the neck structure embedded with the SAM module performs guided aggregation along horizontal strips and vertical strips on the multi-scale feature maps from which the interference information has been eliminated, obtains pavement instance features with continuous shape and clear boundaries, and outputs strip-enhanced multi-scale feature maps with continuous shape and clear pavement boundaries.
4. A pavement drivable area identification apparatus, comprising: an image acquisition module, configured to collect pavement image information of a road through a vehicle-mounted camera; an image preprocessing module, configured to preprocess the collected pavement image information; a pavement instance segmentation model, configured to output segmented pavement instances according to the input pavement image information; and a drivable area identification module, configured to perform mask sensing and instance screening on the pavement instances output by the pavement instance segmentation model to identify a drivable area; wherein the pavement instance segmentation model comprises an instance segmentation model, an EMA module embedded in a backbone network of the instance segmentation model, and a SAM module embedded in a neck structure of the instance segmentation model.
5. The apparatus of claim 4, wherein the EMA module is inserted at the end of each main feature extraction stage module in the backbone network of the instance segmentation model.
6. The apparatus of claim 5, wherein the EMA module comprises an input unit, a grouping unit, a direction attention unit, a channel attention unit, a multiplication unit, a recombination unit, and an output unit; the input unit is configured to receive the feature map output by the main feature extraction stage module connected to it; the grouping unit is configured to divide the feature map received by the input unit into a plurality of subgroups along the channel dimension and output the feature maps of the subgroups in sequence; the direction attention unit is configured to simultaneously perform one-dimensional average pooling in the horizontal and vertical directions on the feature maps of the subgroups output by the grouping unit, carrying out horizontal and vertical direction sensing to obtain direction-aware feature maps of the subgroups; the channel attention unit is configured to process the feature maps of the subgroups output by the grouping unit, enhance the response weight of pavement area features, and output feature maps of the subgroups with enhanced attention to the pavement area; the multiplication unit is configured to multiply the direction-aware feature maps output by the direction attention unit by the pavement-attention-enhanced feature maps output by the channel attention unit, and output the multiplied feature maps of the subgroups; the recombination unit is configured to recombine the feature maps of the subgroups output by the multiplication unit along the channel dimension and output the recombined feature map; and the output unit is configured to multiply the recombined feature map output by the recombination unit by the feature map received by the input unit and output the re-weighted feature map.
7. The apparatus of claim 4, wherein the SAM module is inserted at the end of each feature fusion path module in the neck structure of the instance segmentation model.
8. The apparatus of claim 7, wherein the SAM module comprises: an input unit, configured to receive the feature map output by the feature fusion path module connected to it; a first convolution unit, configured to perform a convolution operation on the feature map received by the input unit and output the convolved feature map; a channel splitting unit, configured to divide the convolved feature map output by the first convolution unit into two groups along the channel dimension and input the two groups into a first strip attention mechanism unit and a second strip attention mechanism unit respectively; the first strip attention mechanism unit, configured to perform a convolution operation on its input feature map, then perform strip convolution on the convolved feature map in the horizontal and vertical directions, add the strip-convolved feature maps of the two directions, and output the sum; the second strip attention mechanism unit, configured to perform a convolution operation on its input feature map, then perform strip convolution on the convolved feature map in the horizontal and vertical directions, add the strip-convolved feature maps of the two directions, and output the sum; a channel concatenation unit, configured to recombine the feature maps output by the first and second strip attention mechanism units along the channel dimension and output the recombined feature map; a second convolution unit, configured to perform a convolution operation on the recombined feature map output by the channel concatenation unit and output the convolved feature map; an adding unit, configured to add the feature map output by the second convolution unit to the feature map received by the input unit, obtaining feature maps that highlight strip-shaped road features of different sizes; and an output unit, configured to output the feature maps obtained by the adding unit.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, carries out the steps of the method for identifying a pavement drivable area according to any one of claims 1-3.
10. A computer-readable storage medium storing a computer program, wherein the computer program is executable by at least one processor to cause the at least one processor to perform the steps of the method for identifying a pavement drivable area according to any one of claims 1-3.
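The data flow of the EMA module in claim 6 (grouping, directional one-dimensional average pooling, channel attention, multiplication, recombination, re-weighting of the input) can be sketched as follows. This is a simplified illustration, not the claimed module: the claims do not say how the pooled vectors or channel statistics become weights, so the sigmoid gates, the omission of all learned convolutions, and the names `directional_gate`, `channel_gate`, and `ema_like_reweight` are assumptions.

```python
import math
from typing import List

Map2D = List[List[float]]
Feature = List[Map2D]  # C x H x W feature map as nested lists


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def split_groups(x: Feature, groups: int) -> List[Feature]:
    """Grouping unit: split the channels into equally sized subgroups."""
    size = len(x) // groups
    return [x[g * size:(g + 1) * size] for g in range(groups)]


def directional_gate(ch: Map2D) -> Map2D:
    """Direction attention: 1-D average pooling along rows and columns.

    Turning the pooled vectors into a gate via sigmoid is an assumption.
    """
    h, w = len(ch), len(ch[0])
    row_mean = [sum(row) / w for row in ch]                       # pool over width
    col_mean = [sum(ch[y][x] for y in range(h)) / h for x in range(w)]  # pool over height
    return [[sigmoid(row_mean[y]) * sigmoid(col_mean[x]) for x in range(w)]
            for y in range(h)]


def channel_gate(ch: Map2D) -> float:
    """Channel attention stand-in: sigmoid of the global average (assumption)."""
    h, w = len(ch), len(ch[0])
    return sigmoid(sum(map(sum, ch)) / (h * w))


def ema_like_reweight(x: Feature, groups: int = 2) -> Feature:
    if len(x) % groups:
        raise ValueError("channel count must be divisible by groups")
    weights: Feature = []
    # In this simplified sketch each channel is gated independently, so
    # grouping only fixes the order in which channels are processed.
    for group in split_groups(x, groups):
        for ch in group:
            c = channel_gate(ch)
            # Multiplication unit: direction gate times channel gate.
            weights.append([[d * c for d in row] for row in directional_gate(ch)])
    # Recombination keeps channel order; output unit re-weights the input.
    return [[[wv * xv for wv, xv in zip(wr, xr)] for wr, xr in zip(wm, xm)]
            for wm, xm in zip(weights, x)]


ones = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(4)]  # 4 channels, 2 x 2
out = ema_like_reweight(ones, groups=2)
```

The output keeps the input shape, and every weight lies strictly between 0 and 1, so the module only attenuates responses in this sketch; a trained module would learn which rows, columns, and channels to keep.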

Description

Method and device for identifying pavement drivable area

Technical Field

The invention relates to the field of computer technology, and in particular to a method and a device for identifying a pavement drivable area.

Background

Identification (extraction) of the drivable area is a key technology in environment perception for intelligent driving. It aims to sense the road environment around a moving vehicle with sensors, identify and segment the drivable area in the current driving scene, and prevent lane departure or illegal driving. The accuracy and robustness of drivable area segmentation (or drivable pavement instance segmentation) directly determine whether the vehicle can travel normally.

Existing deep-learning-based instance segmentation can generally be divided into three types:

(1) The two-stage/cascade "detect first, then segment" framework, represented by Mask R-CNN, is usually combined with FPN for multi-scale feature fusion and predicts masks in parallel with candidate-box classification and regression, so its accuracy is relatively high.
(2) The "label pixels, then cluster" approach first performs high-resolution semantic prediction and then groups pixels into instances; its overall accuracy is low and its computation and GPU memory overhead are high, which is unfavorable for real-time vehicle-mounted deployment.
(3) The "dense sliding window/tensor mask" approach (e.g., TensorMask) predicts masks or mask tensors directly on a dense grid; it has strong structural expressive power but higher algorithmic complexity and resource consumption.

Meanwhile, real-time-oriented single-stage methods (such as YOLACT) decouple the task into prototype masks plus per-instance coefficients, achieving real-time inference at about 33 FPS on COCO and pointing a direction for vehicle-mounted deployment.
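The "prototype mask + coefficient" decoupling mentioned for YOLACT can be illustrated with a short sketch: each instance mask is a linear combination of shared prototype masks weighted by that instance's coefficients, passed through a sigmoid. The function name `assemble_mask`, the toy prototypes, and the 0.5 threshold are illustrative assumptions, and the bounding-box cropping that YOLACT applies afterwards is omitted.

```python
import math
from typing import List

Map2D = List[List[float]]


def assemble_mask(prototypes: List[Map2D], coeffs: List[float],
                  threshold: float = 0.5) -> List[List[int]]:
    """Binary mask from sigmoid(sum_k coeffs[k] * prototypes[k])."""
    h, w = len(prototypes[0]), len(prototypes[0][0])
    mask = []
    for y in range(h):
        row = []
        for x in range(w):
            # Linear combination of the prototypes at this pixel.
            logit = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
            prob = 1.0 / (1.0 + math.exp(-logit))
            row.append(1 if prob > threshold else 0)
        mask.append(row)
    return mask


# Two toy prototypes: one responds on the left half, one on the right half.
left = [[1.0, 0.0], [1.0, 0.0]]
right = [[0.0, 1.0], [0.0, 1.0]]

left_instance = assemble_mask([left, right], [2.0, -2.0])
right_instance = assemble_mask([left, right], [-2.0, 2.0])
```

Because the prototypes are computed once per image and only a short coefficient vector is predicted per instance, the per-instance cost is a small weighted sum, which is what makes this family of methods fast.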
Anchor-free/dynamic-convolution paradigms (SOLOv, CondInst) have also emerged in recent years; their position- or instance-conditioned dynamic convolution kernels avoid ROI cropping and anchor design, achieving a better trade-off between accuracy and speed.

However, the prior art still faces common bottlenecks in drivable pavement instance segmentation:

(1) Two-stage frameworks have long pipelines and many branches, making them difficult to run stably and in real time under limited computing power and latency budgets; single-stage frameworks are more efficient but are insufficiently sensitive to narrow or slender structures (such as broken lane markings and narrow curb strips) and to small targets.
(2) Anchor design depends on priors and is sensitive to transfer across datasets, which affects accuracy and tuning cost; the anchor-free paradigm alleviates this problem, but mis-segmentation and post-processing burden remain for complex boundaries and adjoining targets.
(3) Complex operating conditions bring significant domain shift and degradation: night, backlight, shadow, rain, snow, fog, and the like reduce imaging quality and cause loss of semantics, and existing methods are insufficiently robust under these conditions. Related studies using cross-time-of-day adaptation and uncertainty assessment show that night-time semantic segmentation performance drops significantly.
(4) The road drivable area is large in extent, weakly textured, and irregular in boundary; common candidate generation or pixel clustering is prone to missed regions and spill-over at boundaries, and pixel-level modeling brings additional computation and storage pressure.

In summary, there is a need for a pavement drivable area identification method oriented to vehicle-mounted deployment that is highly robust to multi-scale targets and to occlusion and illumination changes while also running in real time.
Disclosure of Invention

In view of the above, the invention aims to provide a method and a device for identifying a pavement drivable area that improve identification accuracy, reduce the required computing power, and enable real-time, vehicle-mounted deployment.

Based on the above object, the invention provides a method for identifying a pavement drivable area, comprising:

collecting pavement image information of a road through a vehicle-mounted camera;
inputting the preprocessed pavement image information into a pre-trained pavement instance segmentation model;
outputting, by the pavement instance segmentation model, segmented pavement instances according to the input pavement image information;
performing mask sensing and instance screening on the pavement instances output by the pavement instance segmentation model to identify a drivable area;

wherein the pavement instance segmentation model comprises a YOLOv n-Seg instance segmentation model, and an EMA module embedded in the backbone network of the YOLOv n-Seg instance segmentation