
CN-121982577-A - Unmanned aerial vehicle inspection method and device based on multi-source data

CN121982577A

Abstract

The invention provides an unmanned aerial vehicle inspection method and device based on multi-source data, relating to the field of unmanned aerial vehicle inspection. The method comprises: acquiring multi-modal data through multi-source data acquisition of a target area by an unmanned aerial vehicle; performing feature extraction on the multi-modal data through a multi-modal feature extraction layer to obtain initial feature vectors; performing feature alignment and attention weighting on the initial feature vectors through a multi-modal feature processing layer to obtain comprehensive feature vectors; performing task feature extraction on the comprehensive feature vectors through a lightweight self-attention network to obtain task feature sets; and inputting the comprehensive feature vectors and the task feature sets into a multi-task recognition layer for task detection to obtain task recognition result sets, wherein the task recognition result sets comprise dam defect recognition results, ecological damage recognition results, intrusion target recognition results, natural disaster recognition results, electric power facility defect recognition results and flood discharge early warning results; and generating inspection reports according to the task recognition result sets.
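The abstract's pipeline (per-modality feature extraction, splicing into an initial feature vector, alignment to a shared dimension, then attention-weighted fusion into a comprehensive feature vector) can be sketched in NumPy. This is a minimal illustration, not the patent's implementation: all dimensions are assumptions, and the CNN / VFE / LSTM / 3D-CNN branches are replaced by fixed random projections.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, out_dim, seed):
    """Stand-in encoder: a fixed random linear map plus tanh (placeholder
    for the patent's per-modality networks)."""
    w = np.random.default_rng(seed).normal(size=(x.shape[0], out_dim))
    return np.tanh(x @ w / np.sqrt(x.shape[0]))

# Raw multi-modal inputs (sizes are illustrative assumptions).
raw = {"rgb": rng.normal(size=64), "infrared": rng.normal(size=64),
       "point_cloud": rng.normal(size=32), "video": rng.normal(size=48),
       "environment": rng.normal(size=16)}

# 1) Multi-modal feature extraction: one branch per modality, then
#    splicing (concatenation) into the initial feature vector.
branch_dims = {"rgb": 128, "infrared": 128, "point_cloud": 96,
               "video": 96, "environment": 32}
branch_feats = {m: project(x, branch_dims[m], seed=i)
                for i, (m, x) in enumerate(raw.items())}
initial_feature_vector = np.concatenate(list(branch_feats.values()))

# 2) Feature alignment: map every branch to a shared dimension D.
D = 64
aligned = np.stack([project(f, D, seed=100 + i)
                    for i, f in enumerate(branch_feats.values())])

# 3) Attention weighting: softmax scores over modalities, then a
#    weighted sum yields the comprehensive feature vector.
query = rng.normal(size=D)
scores = aligned @ query / np.sqrt(D)
weights = np.exp(scores - scores.max())
weights /= weights.sum()
comprehensive_feature_vector = weights @ aligned

print(initial_feature_vector.shape)        # (480,)
print(comprehensive_feature_vector.shape)  # (64,)
```

The splice step preserves every modality's features (128+128+96+96+32 = 480 dimensions here), while the alignment-plus-attention step compresses them into one fused vector whose modality weights sum to 1.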

Inventors

  • Chen Shibo
  • Li Guangqin
  • Liu Yusheng
  • Shao Guangzhe
  • Zhang Zhiliang
  • Zhou Miaolin
  • Su Junlong
  • He Enliang

Assignees

  • 广东粤电南水发电有限责任公司 (Guangdong Yudean Nanshui Power Generation Co., Ltd.)
  • 广东数字生态科技有限责任公司 (Guangdong Digital Ecology Technology Co., Ltd.)

Dates

Publication Date
2026-05-05
Application Date
2025-11-26

Claims (10)

  1. An unmanned aerial vehicle inspection method based on multi-source data, characterized by comprising the following steps: acquiring multi-modal data, including RGB images, infrared images, point cloud data, video sequences and environmental sensor data, through multi-source data acquisition of a target area by an unmanned aerial vehicle; constructing a comprehensive inspection large model, wherein the comprehensive inspection large model comprises a multi-modal feature extraction layer, a multi-modal feature processing layer, a lightweight self-attention network and a multi-task recognition layer; performing feature extraction on the multi-modal data through the multi-modal feature extraction layer to obtain an initial feature vector; performing feature alignment and attention weighting on the initial feature vector through the multi-modal feature processing layer to obtain a comprehensive feature vector, wherein the comprehensive feature vector comprises RGB image features, infrared image features, point cloud features, video sequence features and environmental features; performing task feature extraction on the comprehensive feature vector through the lightweight self-attention network to obtain a task feature set, wherein the task feature set comprises dam defect recognition task features, ecological damage recognition task features, intrusion target recognition task features, natural disaster recognition task features, electric power facility defect recognition task features and flood discharge early warning task features; and inputting the comprehensive feature vector and the task feature set into the multi-task recognition layer for task detection to obtain a task recognition result set, wherein the task recognition result set comprises a dam defect recognition result, an ecological damage recognition result, an intrusion target recognition result, a natural disaster recognition result, an electric power facility defect recognition result and a flood discharge early warning result, and generating an inspection report according to the task recognition result set.
  2. The unmanned aerial vehicle inspection method based on multi-source data according to claim 1, wherein performing feature extraction on the multi-modal data through the multi-modal feature extraction layer to obtain an initial feature vector comprises: the multi-modal feature extraction layer comprises a convolutional neural network, a VFE network, an LSTM network and a 3D-CNN network; extracting multi-scale spatial features of the RGB images and the infrared images through the convolutional neural network, extracting three-dimensional geometric features of the point cloud data through the VFE network, extracting frame features of the video sequences through the LSTM network, and extracting spatio-temporal features of the environmental sensor data through the 3D-CNN network; and splicing the multi-scale spatial features of the RGB images, the multi-scale spatial features of the infrared images, the three-dimensional geometric features, the frame features and the spatio-temporal features to obtain the initial feature vector.
  3. The unmanned aerial vehicle inspection method based on multi-source data according to claim 1, wherein performing feature alignment and attention weighting on the initial feature vector through the multi-modal feature processing layer to obtain a comprehensive feature vector comprises: mapping the multi-scale spatial features of the RGB images, the multi-scale spatial features of the infrared images, the three-dimensional geometric features, the frame features and the spatio-temporal features in the initial feature vector to the same dimension to obtain aligned feature vectors; the multi-modal feature processing layer comprises a cross-modal attention mechanism module, and attention weighting is performed on the aligned feature vectors through the cross-modal attention mechanism module to obtain weighted feature vectors; and acquiring a region-task relation graph, and generating the comprehensive feature vector by enhancing the task context information of the weighted feature vectors through the region-task relation graph.
  4. The unmanned aerial vehicle inspection method based on multi-source data according to claim 1, wherein performing task feature extraction on the comprehensive feature vector through the lightweight self-attention network to obtain a task feature set comprises: generating a task-aware modulation vector through the lightweight self-attention network according to the comprehensive feature vector, and calculating a task node set according to the task-aware modulation vector and the comprehensive feature vector; and calculating task correlation parameters among all task nodes in the task node set, and performing feature enhancement on the task nodes through the task correlation parameters to obtain the task feature set.
  5. The unmanned aerial vehicle inspection method based on multi-source data according to claim 1, wherein the multi-task recognition layer comprises a dam defect recognition sub-network, an ecological damage recognition sub-network, an intrusion target recognition sub-network, a natural disaster recognition sub-network, an electric power facility defect recognition sub-network and a flood discharge early warning sub-network.
  6. The unmanned aerial vehicle inspection method based on multi-source data according to claim 5, wherein inputting the comprehensive feature vector and the task feature set into the multi-task recognition layer for task detection to obtain a task recognition result set comprises: inputting the comprehensive feature vector and the dam defect recognition task features into the dam defect recognition sub-network for feature convolution and dam defect probability prediction to obtain a dam defect recognition result; inputting the comprehensive feature vector and the ecological damage recognition task features into the ecological damage recognition sub-network for feature fusion and ecological damage probability prediction to obtain an ecological damage recognition result; inputting the comprehensive feature vector and the intrusion target recognition task features into the intrusion target recognition sub-network for feature extraction and intrusion target detection to obtain an intrusion target recognition result; inputting the comprehensive feature vector and the natural disaster recognition task features into the natural disaster recognition sub-network for context feature extraction and natural disaster classification to obtain a natural disaster recognition result; inputting the comprehensive feature vector and the electric power facility defect recognition task features into the electric power facility defect recognition sub-network for multi-dimensional feature extraction and electric power facility defect classification to obtain an electric power facility defect recognition result; and inputting the comprehensive feature vector and the flood discharge early warning task features into the flood discharge early warning sub-network for flood discharge early warning prediction to obtain a flood discharge early warning result.
  7. The unmanned aerial vehicle inspection method based on multi-source data according to claim 6, wherein the dam defect recognition result comprises a dam defect probability map, a dam defect type and a dam defect confidence; the ecological damage recognition result comprises an ecological damage type and an affected area; the intrusion target recognition result comprises an intrusion target position, an intrusion target category and an intrusion target track; the natural disaster recognition result comprises a natural disaster category and a natural disaster risk index; the electric power facility defect recognition result comprises an electric power facility defect location, an electric power facility defect type and an electric power facility defect severity; and the flood discharge early warning result comprises a flood discharge risk index and a flood discharge early warning grade.
  8. An unmanned aerial vehicle inspection device based on multi-source data, for implementing the unmanned aerial vehicle inspection method based on multi-source data according to any one of claims 1 to 7, wherein the device comprises: a data acquisition module, configured to acquire multi-source data of a target area through an unmanned aerial vehicle to obtain multi-modal data, wherein the multi-modal data comprises RGB images, infrared images, point cloud data, video sequences and environmental sensor data; a model construction module, configured to construct a comprehensive inspection large model, wherein the comprehensive inspection large model comprises a multi-modal feature extraction layer, a multi-modal feature processing layer, a lightweight self-attention network and a multi-task recognition layer; an initial feature vector acquisition module, configured to perform feature extraction on the multi-modal data through the multi-modal feature extraction layer to obtain an initial feature vector; a comprehensive feature vector acquisition module, configured to perform feature alignment and attention weighting on the initial feature vector through the multi-modal feature processing layer to obtain a comprehensive feature vector, wherein the comprehensive feature vector comprises RGB image features, infrared image features, point cloud features, video sequence features and environmental features; a task feature set acquisition module, configured to perform task feature extraction on the comprehensive feature vector through the lightweight self-attention network to obtain a task feature set, wherein the task feature set comprises dam defect recognition task features, ecological damage recognition task features, intrusion target recognition task features, natural disaster recognition task features, electric power facility defect recognition task features and flood discharge early warning task features; and an inspection report acquisition module, configured to input the comprehensive feature vector and the task feature set into the multi-task recognition layer for task detection to obtain a task recognition result set, wherein the task recognition result set comprises a dam defect recognition result, an ecological damage recognition result, an intrusion target recognition result, a natural disaster recognition result, an electric power facility defect recognition result and a flood discharge early warning result, and an inspection report is generated according to the task recognition result set.
  9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the unmanned aerial vehicle inspection method based on multi-source data according to any one of claims 1 to 7 when executing the program.
  10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the unmanned aerial vehicle inspection method based on multi-source data according to any one of claims 1 to 7.
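The multi-task recognition layer of claims 5 and 6 can be pictured as six parallel sub-networks that each receive the shared comprehensive feature vector plus their own task feature. The NumPy sketch below is a placeholder illustration under stated assumptions: each sub-network is reduced to a fixed random linear scorer with a sigmoid, and all feature sizes and the probability-only output format are invented for the example, not taken from the patent.

```python
import numpy as np

# The six inspection tasks named in the patent's claims.
TASKS = ["dam_defect", "ecological_damage", "intrusion_target",
         "natural_disaster", "power_facility_defect", "flood_discharge_warning"]

def task_head(task_index, comprehensive, task_feature):
    """Stand-in sub-network: score the concatenation of the shared
    comprehensive feature vector and the task-specific feature with a
    fixed random linear map, then squash to a probability."""
    x = np.concatenate([comprehensive, task_feature])
    w = np.random.default_rng(task_index).normal(size=x.shape[0])
    z = x @ w / np.sqrt(x.shape[0])
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
comprehensive = rng.normal(size=64)                      # comprehensive feature vector
task_features = {t: rng.normal(size=16) for t in TASKS}  # task feature set

# Task recognition result set: one detection probability per task.
result_set = {t: float(task_head(i, comprehensive, task_features[t]))
              for i, t in enumerate(TASKS)}

# Inspection report generated from the result set.
report = "\n".join(f"{t}: probability {p:.2f}" for t, p in result_set.items())
print(report)
```

The design point the claims make is that the heads share one fused representation but condition on distinct task features, so adding a seventh task would mean adding one more head rather than a separate pipeline.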

Description

Unmanned aerial vehicle inspection method and device based on multi-source data

Technical Field

The invention relates to the field of unmanned aerial vehicle inspection, and in particular to an unmanned aerial vehicle inspection method and device based on multi-source data.

Background

With the development of unmanned aerial vehicle, remote sensing and artificial intelligence technologies, automatic inspection of large-scale water conservancy facilities, ecological environments, electric power facilities and natural disaster areas by unmanned aerial vehicle has become an important means of smart city construction, ecological environment protection and public safety management. To simplify data processing, conventional unmanned aerial vehicle inspection methods generally rely only on single-modal data for defect recognition, cannot handle the spatio-temporal scale differences among data of different modalities, and lack multi-source data fusion capability, so recognition accuracy is low in complex environments (such as at night, in bad weather or under vegetation occlusion). In addition, existing unmanned aerial vehicle inspection methods are usually designed for a single task only, for example detecting only dam cracks or monitoring only electric power facilities; they lack the capability to handle multiple tasks simultaneously and can hardly meet the requirement of "multi-target, multi-field parallel detection" in actual inspection scenarios. Moreover, single-task designs ignore the potential correlations among tasks, so the recognition results remain mutually isolated and the reliability of early warning is reduced.
Disclosure of Invention

To solve the above problems, the invention provides an unmanned aerial vehicle inspection method based on multi-source data, comprising the following steps: acquiring multi-modal data, including RGB images, infrared images, point cloud data, video sequences and environmental sensor data, through multi-source data acquisition of a target area by an unmanned aerial vehicle; constructing a comprehensive inspection large model, wherein the comprehensive inspection large model comprises a multi-modal feature extraction layer, a multi-modal feature processing layer, a lightweight self-attention network and a multi-task recognition layer; performing feature extraction on the multi-modal data through the multi-modal feature extraction layer to obtain an initial feature vector; performing feature alignment and attention weighting on the initial feature vector through the multi-modal feature processing layer to obtain a comprehensive feature vector, wherein the comprehensive feature vector comprises RGB image features, infrared image features, point cloud features, video sequence features and environmental features; performing task feature extraction on the comprehensive feature vector through the lightweight self-attention network to obtain a task feature set, wherein the task feature set comprises dam defect recognition task features, ecological damage recognition task features, intrusion target recognition task features, natural disaster recognition task features, electric power facility defect recognition task features and flood discharge early warning task features; and inputting the comprehensive feature vector and the task feature set into the multi-task recognition layer for task detection to obtain a task recognition result set, wherein the task recognition result set comprises a dam defect recognition result, an ecological damage recognition result, an intrusion target recognition result, a natural disaster recognition result, an electric power facility defect recognition result and a flood discharge early warning result, and generating an inspection report according to the task recognition result set. Optionally, performing feature extraction on the multi-modal data through the multi-modal feature extraction layer to obtain an initial feature vector comprises: the multi-modal feature extraction layer comprises a convolutional neural network, a VFE network, an LSTM network and a 3D-CNN network; extracting multi-scale spatial features of the RGB images and the infrared images through the convolutional neural network, extracting three-dimensional geometric features of the point cloud data through the VFE network, extracting frame features of the video sequences through the LSTM network, and extracting spatio-temporal features of the environmental sensor data through the 3D-CNN network; and splicing the multi-scale spatial features of the RGB images, the multi-scale spatial features of the infrared images, the three-dimensional geometric features, the frame features and the spatio-temporal features to obtain the initial feature vector. Optionally, performing feature alignment and attention weighting on the initial feature vector through the multi-modal feature processing layer to obtain a comprehensive feature vector comprises: Mapping the mul