CN-122024197-A - Continuous road extraction method based on a hybrid neural network

Abstract

The invention provides a continuous road extraction method based on a hybrid neural network and relates to the technical field of neural networks. The method comprises: cutting returned satellite photos into blocks and then performing multi-modal data augmentation to obtain a preprocessed image block; inputting the preprocessed image block into the hybrid neural network; performing multi-scale context-aware feature extraction in a Swin Transformer encoder to output a layered road feature map; performing progressive decoding and cross-layer feature fusion in a strip convolution decoder through a four-way strip convolution operation to output a binary road segmentation map; locating road topology nodes; quantifying road segment traffic states; and outputting a multi-level road state heat map. The method addresses the technical problem that, in the prior art, the inherent structure of the convolutional neural network or the fully convolutional network leaves feature extraction confined to local regions and therefore inadequate, so that performance is poor on long-range dependencies and complex road topologies, which impairs the integrity of the road network and in turn makes map information inaccurate.

Inventors

LU YIWEI
HUANG BO
TAO YU
YANG YUANTAO
YANG RUOPENG
ZHAI LEYU
WANG YIZHI
YIN CHANGSHENG

Assignees

Information Support Force Engineering University of the Chinese People's Liberation Army (中国人民解放军信息支援部队工程大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-12

Claims (9)

1. A continuous road extraction method based on a hybrid neural network, characterized by comprising the following steps: cutting returned satellite photos into blocks and then performing multi-modal data augmentation to obtain a preprocessed image block; inputting the preprocessed image block into a hybrid neural network, wherein the hybrid neural network comprises a Swin Transformer encoder and a strip convolution decoder, in which: S1, in the Swin Transformer encoder, performing multi-scale context-aware feature extraction on the preprocessed image block using a hierarchical shifted-window mechanism, and outputting a layered road feature map; S2, in the strip convolution decoder, performing progressive decoding and cross-layer feature fusion on the layered road feature map through a four-way strip convolution operation, and outputting a binary road segmentation map; locating road topology nodes on the binary road segmentation map; and, taking the road topology nodes as road network structure control points, dynamically binding real-time multi-source traffic flow data to the binary road segmentation map through a spatio-temporal registration algorithm to quantify road segment traffic states, and outputting a multi-level road state heat map.
2. The continuous road extraction method based on the hybrid neural network according to claim 1, wherein performing, in the Swin Transformer encoder, multi-scale context-aware feature extraction on the preprocessed image block using a hierarchical shifted-window mechanism and outputting a layered road feature map comprises: dividing the preprocessed image block into a plurality of non-overlapping local windows, and then performing linear embedding to generate a feature vector sequence; driving 2 Swin blocks to alternately perform window self-attention and shifted-window self-attention on the feature vector sequence so as to capture pixel-level road textures, and outputting a first-layer road feature map; after downsampling the first-layer road feature map, driving 2 Swin blocks to alternately perform window self-attention and shifted-window self-attention so as to repair local road breaks, and outputting a second-layer road feature map; after downsampling the second-layer road feature map, driving 6 Swin blocks to perform cross-window long-range dependency modeling to obtain a third-layer road feature map; after downsampling the third-layer road feature map, driving 2 Swin blocks to perform global road topology perception to obtain a fourth-layer road feature map, wherein the fourth-layer road feature map is subjected to atrous spatial pyramid pooling to expand the receptive field and optimize the expression of road connectivity; wherein the first-layer road feature map, the second-layer road feature map, the third-layer road feature map and the fourth-layer road feature map together form the layered road feature map.
3. The continuous road extraction method based on the hybrid neural network according to claim 2, wherein driving 2 Swin blocks to alternately perform window self-attention and shifted-window self-attention on the feature vector sequence so as to capture pixel-level road textures, and outputting a first-layer road feature map, comprises: rearranging the feature vector sequence into a two-dimensional feature map to form a preliminary spatial feature map; dividing the preliminary spatial feature map into a non-overlapping local window array based on a hierarchical division size setting; driving a W-MSA-dominant Swin block to perform, on the local window array, local feature aggregation coding based on non-shifted-window multi-head self-attention to obtain a preliminary coding feature array; performing periodic coordinate displacement on a reference window division grid of the local window array to obtain a shifted window array; and performing, by an SW-MSA-dominant Swin block and according to the shifted window array, cross-window context modeling based on shifted-window multi-head self-attention on the preliminary coding feature array, and outputting the first-layer road feature map.
4. The continuous road extraction method based on the hybrid neural network according to claim 3, wherein performing periodic coordinate displacement on the reference window division grid of the local window array to obtain the shifted window array comprises: setting a cell reference coordinate origin for the reference window division grid to obtain a spatial coordinate reference system; applying a hierarchical coordinate offset vector to the spatial coordinate reference system along the orthogonal directions of the spatial plane to obtain an intermediate window grid; and performing cyclic boundary padding on the cell regions of the intermediate window grid that exceed the spatial extent of the feature map of the local window array, and outputting the shifted window array.
5. The continuous road extraction method based on the hybrid neural network according to claim 3, wherein driving the W-MSA-dominant Swin block to perform, on the local window array, local feature aggregation coding based on non-shifted-window multi-head self-attention to obtain the preliminary coding feature array comprises: after performing layer normalization on the features in each window unit of the local window array, performing multi-head self-attention computation in the non-shifted window space of each window unit so as to model the global dependencies within each window unit, and generating a self-attention weighted feature array; applying a residual connection between the self-attention weighted feature array and the original input feature array of the local window array to form a first intermediate feature array; after performing layer normalization on the first intermediate feature array, performing nonlinear transformation and feature enhancement through a multi-layer perceptron to generate a first high-dimensional nonlinear feature array; and applying a residual connection between the first high-dimensional nonlinear feature array and the first intermediate feature array, and outputting the preliminary coding feature array.
6. The continuous road extraction method based on the hybrid neural network according to claim 3, wherein performing, by the SW-MSA-dominant Swin block and according to the shifted window array, cross-window context modeling based on shifted-window multi-head self-attention on the preliminary coding feature array, and outputting the first-layer road feature map, comprises: performing cross-window feature recombination and aggregation on the preliminary coding feature array according to the shifted window array to generate a recombined feature array; performing multi-head self-attention computation in the shifted window space of each window unit of the recombined feature array so as to model context dependencies that cross the original local window boundaries, and generating a shifted self-attention weighted feature array; applying a residual connection between the shifted self-attention weighted feature array and the preliminary coding feature array, and outputting a second intermediate feature array; after performing layer normalization on the second intermediate feature array, performing nonlinear transformation and feature enhancement through a multi-layer perceptron to generate a second high-dimensional nonlinear transformation feature; and applying a residual connection between the second high-dimensional nonlinear transformation feature and the second intermediate feature array, and outputting the first-layer road feature map.
7. The continuous road extraction method based on the hybrid neural network according to claim 2, wherein performing, in the strip convolution decoder, progressive decoding and cross-layer feature fusion on the layered road feature map through a four-way strip convolution operation, and outputting a binary road segmentation map, comprises: performing multi-thread parallel convolution operations on the fourth-layer road feature map using a horizontal strip convolution kernel, a vertical strip convolution kernel, a first diagonal strip convolution kernel and a second diagonal strip convolution kernel to obtain a horizontal-direction feature map, a vertical-direction feature map, a first diagonal-direction feature map and a second diagonal-direction feature map; concatenating the horizontal-direction feature map, the vertical-direction feature map, the first diagonal-direction feature map and the second diagonal-direction feature map along the channel dimension, performing feature integration and dimension reduction through pointwise convolution, and outputting a fourth-layer decoding feature map; after upsampling the fourth-layer decoding feature map, performing cross-layer fusion with the third-layer road feature map to obtain a third-layer fusion feature map; performing the four-way strip convolution operation on the third-layer fusion feature map to obtain a third-layer decoding feature map; after upsampling the third-layer decoding feature map, performing cross-layer fusion with the second-layer road feature map to obtain a second-layer fusion feature map; performing the four-way strip convolution operation on the second-layer fusion feature map to obtain a second-layer decoding feature map; after upsampling the second-layer decoding feature map, performing cross-layer fusion with the first-layer road feature map to obtain a first-layer fusion feature map; performing the four-way strip convolution operation on the first-layer fusion feature map to obtain a first-layer decoding feature map; and, after upsampling the first-layer decoding feature map, performing pixel-level classification convolution and outputting the binary road segmentation map.
8. The continuous road extraction method based on the hybrid neural network according to claim 1, wherein taking the road topology nodes as road network structure control points, dynamically binding real-time multi-source traffic flow data to the binary road segmentation map through a spatio-temporal registration algorithm to quantify road segment traffic states, and outputting a multi-level road state heat map, comprises: performing topology vectorization conversion on the binary road segmentation map to identify and extract the road topology nodes, and constructing a structured digital road network; after performing spatio-temporal standardization preprocessing on the real-time multi-source traffic flow data, taking the road topology nodes as road network structure control points and dynamically binding the real-time multi-source traffic flow data to the structured digital road network through the spatio-temporal registration algorithm to obtain a road-segment-level spatio-temporally aligned observation data set; performing road-segment-level traffic flow data fusion calculation on the road-segment-level spatio-temporally aligned observation data set to obtain a plurality of traffic state quantization parameters for a plurality of road segments; and performing multi-scale state rendering and superposition of the plurality of traffic state quantization parameters on the structured digital road network, and outputting the multi-level road state heat map.
9. The continuous road extraction method based on the hybrid neural network according to claim 1, wherein cutting the returned satellite photos into blocks and then performing multi-modal data augmentation to obtain the preprocessed image block comprises: matching a cutting size according to the data type to which the satellite photos belong to obtain a target cutting size; cutting the satellite photos into blocks using the target cutting size to obtain image blocks of uniform size; and performing multi-modal data augmentation on the uniform-size image blocks using the Augmentor tool to generate the preprocessed image block; wherein the multi-modal data augmentation includes random horizontal flipping, random vertical flipping, random rotation and random scaling.
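
As a non-limiting illustration of the block cutting and data augmentation recited in claim 9, the following minimal sketch cuts an image into uniform tiles and applies random flips and rotation. It uses plain NumPy rather than the Augmentor tool named in the claim; the function names `tile_image` and `augment` are hypothetical, rotation is restricted to multiples of 90 degrees as a simplifying assumption, and random scaling is omitted.

```python
import numpy as np

def tile_image(image: np.ndarray, tile: int) -> list:
    """Cut an H x W x C image into non-overlapping tile x tile blocks.

    Edge remainders that do not fill a whole tile are dropped here;
    padding them instead would be an equally plausible choice.
    """
    h, w = image.shape[:2]
    return [
        image[r:r + tile, c:c + tile]
        for r in range(0, h - tile + 1, tile)
        for c in range(0, w - tile + 1, tile)
    ]

def augment(block: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random horizontal flip, vertical flip and quarter-turn rotation."""
    if rng.random() < 0.5:
        block = block[:, ::-1]       # horizontal flip
    if rng.random() < 0.5:
        block = block[::-1, :]       # vertical flip
    k = int(rng.integers(0, 4))      # number of 90-degree turns (assumption)
    return np.rot90(block, k, axes=(0, 1)).copy()

photo = np.arange(10 * 12 * 3).reshape(10, 12, 3)
blocks = tile_image(photo, 4)
sample = augment(blocks[0], np.random.default_rng(0))
```

For segmentation training, the same random transform would also have to be applied to the road label mask so that image and label stay aligned.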
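
As a non-limiting illustration of the window division of claim 3 and the periodic coordinate displacement with cyclic boundary padding of claim 4, the following sketch partitions a feature map into non-overlapping windows and then shifts the window grid with wrap-around. The names `window_partition` and `shifted_windows`, the window size of 4 and the offset of 2 are assumptions chosen for the example. The attention mask that the original Swin Transformer applies to wrapped-around regions is not shown.

```python
import numpy as np

def window_partition(x: np.ndarray, ws: int) -> np.ndarray:
    """Split an (H, W, C) map into non-overlapping (ws, ws, C) windows.

    Returns an array of shape (num_windows, ws, ws, C); H and W must be
    multiples of ws.
    """
    h, w, c = x.shape
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, c)

def shifted_windows(x: np.ndarray, ws: int, shift: int) -> np.ndarray:
    """Offset the window grid by `shift` along both axes with wrap-around.

    Rolling the feature map by -shift is equivalent to moving the grid by
    +shift; cells pushed past the border re-enter on the opposite side.
    """
    rolled = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(rolled, ws)

feature_map = np.arange(64).reshape(8, 8, 1)
regular = window_partition(feature_map, 4)    # reference window grid
shifted = shifted_windows(feature_map, 4, 2)  # grid displaced by half a window
```

Each shifted window straddles the boundaries of up to four reference windows, which is what lets the following attention step exchange information across those boundaries.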
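
As a non-limiting illustration of the block structure recited in claim 5 (layer normalization, window multi-head self-attention, residual connection, layer normalization, multi-layer perceptron, residual connection), the following NumPy sketch processes tokens that have already been grouped by window. All names and the parameter dictionary `p` are hypothetical; learnable normalization parameters, biases and the relative position bias are omitted, and ReLU stands in for the MLP activation as a simplifying assumption.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def window_attention(tokens, wq, wk, wv, wo, heads):
    """Multi-head self-attention computed separately inside each window.

    tokens: (num_windows, n, d); every projection matrix is (d, d).
    """
    nw, n, d = tokens.shape
    hd = d // heads

    def split(t):  # (nw, n, d) -> (nw, heads, n, hd)
        return t.reshape(nw, n, heads, hd).transpose(0, 2, 1, 3)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    attn = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(hd))
    out = (attn @ v).transpose(0, 2, 1, 3).reshape(nw, n, d)
    return out @ wo

def swin_block(tokens, p, heads=2):
    """LN -> window attention -> residual, then LN -> MLP -> residual."""
    attended = window_attention(
        layer_norm(tokens), p["wq"], p["wk"], p["wv"], p["wo"], heads
    )
    mid = tokens + attended                              # first intermediate features
    hidden = np.maximum(layer_norm(mid) @ p["w1"], 0.0)  # ReLU in place of GELU
    return mid + hidden @ p["w2"]                        # second residual connection
```

Because attention is computed per window, changing the tokens of one window leaves the output of every other window unchanged; the shifted-window block of claim 6 reuses the same structure on the displaced grid.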
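
As a non-limiting illustration of the four-way strip convolution of claim 7, the following sketch applies one 1-D kernel along the horizontal, vertical and two diagonal directions of a single-channel map, stacks the four results along a channel axis, and mixes them with a pointwise (1x1) convolution. The names `DIRECTIONS`, `strip_conv` and `four_way_strip_block`, the odd kernel length, the zero padding and the sharing of one kernel across directions are assumptions made for brevity; the specification does not fix them here.

```python
import numpy as np

# Unit steps (d_row, d_col) for the four strip orientations.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti_diagonal": (1, -1),
}

def strip_conv(x: np.ndarray, weights: np.ndarray, step: tuple) -> np.ndarray:
    """Correlate a 2-D map with a 1-D kernel laid out along `step`.

    The kernel length is assumed odd; zero padding keeps the output the
    same size as the input.
    """
    r = len(weights) // 2
    h, w = x.shape
    padded = np.pad(x, r)
    out = np.zeros((h, w), dtype=float)
    for i, wt in enumerate(weights):
        dr, dc = (i - r) * step[0], (i - r) * step[1]
        out += wt * padded[r + dr:r + dr + h, r + dc:r + dc + w]
    return out

def four_way_strip_block(x, weights, mix):
    """Run the four strip convolutions, stack them, and mix channels 1x1."""
    stacked = np.stack([strip_conv(x, weights, s) for s in DIRECTIONS.values()])
    return np.tensordot(mix, stacked, axes=1)  # pointwise convolution over 4 channels

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
horizontal = strip_conv(impulse, np.ones(3), DIRECTIONS["horizontal"])
fused = four_way_strip_block(impulse, np.ones(3), np.full(4, 0.25))
```

A single road pixel spreads its response along four line-shaped supports, which matches the elongated geometry of roads better than a square kernel of the same parameter count.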
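
As a non-limiting illustration of locating road topology nodes (claims 1 and 8), the following sketch marks endpoints and junctions on a one-pixel-wide road skeleton by counting 8-connected neighbours. The claims do not specify this procedure: the neighbour-count rule, the function name `locate_nodes` and the assumption that the binary road segmentation map has already been thinned to a skeleton are all illustrative.

```python
import numpy as np

def locate_nodes(skeleton: np.ndarray):
    """Return (endpoints, junctions) as lists of (row, col) on a 0/1 skeleton."""
    sk = (skeleton > 0).astype(int)
    padded = np.pad(sk, 1)
    h, w = sk.shape
    # Count the 8-connected road neighbours of every pixel.
    neighbours = sum(
        padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    endpoints = np.argwhere((sk == 1) & (neighbours == 1))
    junctions = np.argwhere((sk == 1) & (neighbours >= 3))
    return (
        [tuple(p) for p in endpoints.tolist()],
        [tuple(p) for p in junctions.tolist()],
    )

cross = np.zeros((5, 5), dtype=int)
cross[2, :] = 1   # east-west road
cross[:, 2] = 1   # north-south road
endpoints, junctions = locate_nodes(cross)
```

Pixels adjacent to a crossing also have three or more neighbours, so neighbouring junction pixels would need to be merged into a single node before being used as road network control points.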

Description

Continuous road extraction method based on a hybrid neural network

Technical Field

The invention relates to the technical field of neural networks, and in particular to a continuous road extraction method based on a hybrid neural network.

Background

With the spread of remote sensing technology and satellite imagery, road extraction from high-resolution satellite images has become one of the core problems in fields such as geographic information systems, autonomous driving, urban planning and intelligent transportation. Road extraction involves not only extracting the boundaries of ground objects but also ensuring the accuracy, continuity and topological soundness of the extraction result. High-resolution remote sensing images provide detailed ground information, but the complexity of satellite images makes extracting road information extremely challenging.

Traditional road extraction methods, such as LiDAR-based point cloud analysis and GPS trajectory aggregation, can achieve good results in some specific applications, but they suffer from low efficiency and poor adaptability; in large-scale urban environments in particular, they often cannot respond quickly to dynamic changes. Furthermore, these methods often rely on manual labeling, which makes processing inefficient in large-scale scenarios and makes it difficult to adapt to rapidly changing urban environments.

To improve the automation and accuracy of road extraction, deep-learning-based methods, particularly convolutional neural networks (CNNs), have become mainstream. With the introduction of the fully convolutional network (FCN) architecture, CNNs have shown great potential for pixel-level road segmentation.

However, the conventional CNN architecture has limitations and still struggles to deliver satisfactory results, especially in the presence of long-range dependencies, complex structures and road breaks. CNN-based road extraction relies on convolution kernels sliding over local regions, so the receptive field of the convolution operation is limited; as a result, a CNN is poor at capturing long-range dependencies and context that spans large image areas. For example, occlusion by buildings, trees and the like in an urban environment can cause road information to be lost and road segments to be broken, and conventional CNNs cannot effectively recover them. In satellite images, occlusions and the non-linear structure of roads often lead to discontinuities in the extraction results, which compromise the integrity of the road network, make map information inaccurate, and degrade the accuracy and usability of downstream urban planning, traffic monitoring and navigation systems.

It should be noted that the information disclosed in this background section is only intended to improve understanding of the general background of the invention and should not be taken as an acknowledgement, or any form of suggestion, that this information forms prior art already known to a person of ordinary skill in the art.

Disclosure of Invention

In view of the defects of the prior art and the need to improve on it, the invention provides a continuous road extraction method based on a hybrid neural network. It addresses the technical problem that, in the prior art, the inherent structure of the convolutional neural network or the fully convolutional network leaves feature extraction confined to local regions and therefore inadequate, so that performance is poor on long-range dependencies and complex road topologies, which impairs the integrity of the road network and in turn makes map information inaccurate.

The specific technical scheme is as follows:

The invention provides a continuous road extraction method based on a hybrid neural network, comprising: cutting a returned satellite photo into blocks and then performing multi-modal data augmentation to obtain a preprocessed image block; inputting the preprocessed image block into the hybrid neural network, wherein the hybrid neural network comprises a Swin Transformer encoder and a strip convolution decoder; S1, in the Swin Transformer encoder, performing multi-scale context-aware feature extraction on the preprocessed image block using a hierarchical shifted-window mechanism, and outputting a layered road feature map; S2, in the strip convolution decoder, performing progressive decoding and cross-layer feature fusion on the layered road feature map through a four-way strip convolution operation, and outputting a binary road segmentation map; locating road topology nodes on the binary road segmentation map; and carrying out road segment state quantization by taking the road topology nodes as road network structure control points and dynamically bind