CN-117237896-B - Road topology prediction method, electronic equipment and driving equipment


Abstract

The application relates to the technical field of image processing, and in particular to a road topology prediction method, electronic equipment and driving equipment. It aims to solve the problem that road topology cannot be effectively predicted for complex and diverse road scenes. The road topology prediction method comprises: obtaining at least one frame of data containing a road, the at least one frame of data being marked with basic element detection results of the road; performing semantic feature extraction on the at least one frame of data to obtain semantic features; inputting the semantic features into a network model; and obtaining the road topology according to the output of the network model. By adopting the network model, the method is suitable for more complex and diverse road scenes and predicts the road topology automatically and effectively.

Inventors

PENG QIXIANG
SHAN WEI
QIN HAIBO
YAO MAOQING

Assignees

安徽蔚来智驾科技有限公司

Dates

Publication Date: 20260508
Application Date: 20230920

Claims (9)

1. A road topology prediction method, comprising: acquiring at least one frame of data containing a road, wherein the at least one frame of data is marked with a basic element detection result of the road; performing semantic feature extraction on the at least one frame of data to obtain semantic features; and inputting the semantic features into a network model, and obtaining a topological structure of the road according to the output of the network model, wherein each frame of data in the at least one frame of data comprises a plurality of basic elements of the road, the network model comprises a Transformer model, and the Transformer model comprises a coding layer, a fusion layer, a first self-attention layer, a decoding layer and an output layer; aggregating, by using the coding layer, the point-level semantic feature set corresponding to each basic element to obtain a line-level semantic feature corresponding to the basic element; fusing, by using the fusion layer, the line-level semantic feature and the point-level semantic feature set corresponding to the basic element to obtain a target point-level semantic feature set; fusing, by using the first self-attention layer, the target point-level semantic feature sets corresponding to different basic elements to obtain fusion features; decoding the fusion features and road query features by using the decoding layer to obtain decoding features of the road; and inputting the decoding features into the output layer to obtain the topological structure of the road, wherein the topological structure at least comprises centerline position information of the road.
2. The method of claim 1, wherein the acquiring at least one frame of data containing a road comprises: acquiring multi-frame data containing the road, wherein the multi-frame data comprises at least one frame of historical frame data and current frame data arranged in time sequence.
3. The method of claim 2, further comprising, prior to performing semantic feature extraction on the at least one frame of data, performing on the multi-frame data at least one of the following preprocessing operations: frame extraction, resampling, coordinate transformation, and construction of higher-order features.
4. The method of any one of claims 1 to 3, wherein each basic element characterizes the basic element detection result with a plurality of sampling points, and performing semantic feature extraction on the at least one frame of data to obtain semantic features comprises: extracting semantic features of the sampling points corresponding to each basic element in the at least one frame of data to obtain a point-level semantic feature set corresponding to the basic element.
5. The method of claim 1, wherein the road query features include a predefined point-level query feature and a predefined line-level query feature, and the decoding layer includes a residual connection layer, a second self-attention layer, and a cross-attention layer; fusing the predefined point-level query feature and the predefined line-level query feature by using the residual connection layer to obtain fused query features; inputting the fused query features into the second self-attention layer to obtain self-attention layer output features; and inputting the self-attention layer output features and the fusion features into the cross-attention layer to obtain the decoding features of the road.
6. The method of claim 1, wherein after deriving the topology of the road based on the output of the network model, the method further comprises at least one of filtering, suppressing, and completing the topology of the road.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the road topology prediction method of any one of claims 1 to 6.
8. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory has stored therein a computer program which, when executed by the at least one processor, implements the road topology prediction method of any of claims 1 to 6.
9. A driving apparatus comprising a driving apparatus body and the electronic apparatus as claimed in claim 8.
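
As an illustration of the post-processing named in claim 6, the following is a minimal sketch of filtering and suppression only, written in Python. The claims do not specify how these operations are carried out. The per-line confidence `scores`, the thresholds `min_score` and `min_separation`, and the mean point-to-point distance criterion are all assumptions introduced here for illustration; completion is not covered.

```python
import math

def mean_point_distance(a, b):
    """Mean Euclidean distance between corresponding points of two
    polylines with the same number of points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def filter_and_suppress(lines, scores, min_score=0.5, min_separation=1.0):
    """Hypothetical post-processing of predicted centerlines.

    lines:  list of polylines, each a list of (x, y) points
    scores: assumed per-line confidence values (not defined in the source)
    """
    # Filtering: drop low-confidence predictions, then visit the rest
    # from highest to lowest score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= min_score),
        key=lambda i: -scores[i],
    )
    kept = []
    for i in order:
        # Suppression: skip a line that nearly coincides with an
        # already kept, higher-scoring line.
        if all(mean_point_distance(lines[i], lines[k]) >= min_separation
               for k in kept):
            kept.append(i)
    return [lines[i] for i in kept]
```

This sketch compares points in order, so two predictions of the same centerline traced in opposite directions would not be treated as duplicates.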

Description

Road topology prediction method, electronic equipment and driving equipment

Technical Field

The application relates to the technical field of image processing, and in particular provides a road topology prediction method, electronic equipment and driving equipment.

Background

While an autonomous vehicle is driving, acquiring the surrounding road topology in advance facilitates planning of the subsequent motion trajectory. In conventional autonomous driving solutions, the road topology is usually obtained by taking the lane line detection results provided by an upstream module and associating lanes using rule-based and experience-based schemes, which yields the lane topology. However, these rules and experience support only some conventional road scenes, such as highways, conventional ramps and conventional intersections. For more complex and diverse road scenes they cannot give accurate lane association information, that is, they cannot effectively predict the road topology.

Disclosure of Invention

The application aims to solve the above technical problem, namely that the prior art cannot effectively predict the road topology for complex and diverse road scenes.

In a first aspect, an embodiment of the present application provides a road topology prediction method, including: acquiring at least one frame of data containing a road, wherein the at least one frame of data is marked with a basic element detection result of the road; performing semantic feature extraction on the at least one frame of data to obtain semantic features; and inputting the semantic features into a network model, and obtaining the topological structure of the road according to the output of the network model.
In some embodiments, the acquiring at least one frame of data containing a road comprises: acquiring multi-frame data containing the road, wherein the multi-frame data comprises at least one frame of historical frame data and current frame data arranged in time sequence.

In some embodiments, before the semantic feature extraction is performed on the at least one frame of data, the method further comprises performing on the multi-frame data at least one of the following preprocessing operations: extracting designated frames, resampling, coordinate transformation, and construction of higher-order features.

In some embodiments, each frame of data in the at least one frame of data includes a plurality of basic elements of the road, each basic element characterizes a basic element detection result with a plurality of sampling points, and performing semantic feature extraction on the at least one frame of data includes: extracting semantic features of the sampling points corresponding to each basic element in the at least one frame of data to obtain a point-level semantic feature set corresponding to the basic element.
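
The resampling preprocessing and the representation of each basic element by a plurality of sampling points can be illustrated with a minimal Python sketch. The source does not state how resampling is performed. The function name `resample_polyline`, the 2D point format, and the choice of equal spacing by arc length are assumptions made here for illustration.

```python
import math

def resample_polyline(points, num_samples):
    """Resample a 2D polyline to num_samples points equally spaced by arc
    length (an assumed scheme; the source only names 'resampling')."""
    if num_samples < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two samples")
    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out = []
    seg = 0
    for i in range(num_samples):
        target = total * i / (num_samples - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        # Linear interpolation inside the segment.
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

# Example: an L-shaped lane line resampled to five sampling points.
samples = resample_polyline([(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)], 5)
```

Giving every basic element the same number of sampling points means each point-level feature set has a fixed size, which is convenient for batching, although the source does not say that this is the reason for the step.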
In some embodiments, the network model comprises a Transformer model comprising a coding layer, a fusion layer, a first self-attention layer, a decoding layer, and an output layer. The coding layer is used to aggregate the point-level semantic feature set corresponding to each basic element to obtain a line-level semantic feature corresponding to the basic element. The fusion layer is used to fuse the line-level semantic feature and the point-level semantic feature set corresponding to the basic element to obtain a target point-level semantic feature set. The first self-attention layer is used to fuse the target point-level semantic feature sets corresponding to different basic elements to obtain fusion features. The decoding layer is used to decode the fusion features and road query features to obtain decoding features of the road. The decoding features are input into the output layer to obtain the topological structure of the road, wherein the topological structure at least comprises centerline position information of the road.

In some embodiments, the road query features include a predefined point-level query feature and a predefined line-level query feature, and the decoding layer includes a residual connection layer, a second self-attention layer, and a cross-attention layer. The residual connection layer is used to fuse the predefined point-level query feature and the predefined line-level query feature to obtain fused query features. The fused query features are input into the second self-attention layer to obtain self-attention layer output features. The self-attention layer output features and the fusion features are input into the cross-attention layer to obtain the decoding features of the road.
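
The data flow through the layers described above can be sketched in plain Python. This is a toy illustration under several assumptions, not the applicant's implementation: attention uses identity projections with no learned weights, the coding layer is modeled as max-pooling, the fusion layer and the residual connection layer are modeled as elementwise addition, and the output layer is a fixed linear map `out_weight`. All function names are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention with identity projections (assumption)."""
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def encode_line(point_feats):
    """Coding layer stand-in: max-pool point-level features per channel."""
    return [max(column) for column in zip(*point_feats)]

def fuse(point_feats, line_feat):
    """Fusion layer stand-in: add the line-level feature to each point feature."""
    return [[p + l for p, l in zip(feat, line_feat)] for feat in point_feats]

def predict_centerlines(elements, point_queries, line_queries, out_weight):
    # elements: one list of point-level feature vectors per basic element.
    target_points = []
    for point_feats in elements:
        target_points.extend(fuse(point_feats, encode_line(point_feats)))
    # First self-attention layer: mix features across all basic elements.
    fused = attention(target_points, target_points, target_points)
    # Residual connection layer stand-in: point query + line query.
    queries = [[p + l for p, l in zip(pq, lq)]
               for lq in line_queries for pq in point_queries]
    # Second self-attention layer, then cross-attention to the fusion features.
    self_out = attention(queries, queries, queries)
    decoded = attention(self_out, fused, fused)
    # Output layer stand-in: linear map from feature space to (x, y).
    coords = [tuple(sum(w * x for w, x in zip(row, feat)) for row in out_weight)
              for feat in decoded]
    n = len(point_queries)
    return [coords[i * n:(i + 1) * n] for i in range(len(line_queries))]
```

With `L` line-level queries and `P` point-level queries, the sketch returns `L` centerlines of `P` points each, which matches the requirement that the topological structure at least comprises centerline position information. A trained model would replace every stand-in above with learned parameters.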
In some embodiments, after obtaining the topology of the road from the output of the network model, the method further comprises performing on the topology of the road at least one of the following post-processing operations: filtering, suppressing, and completing. In a sec