
CN-121999040-A - Unmanned aerial vehicle target positioning method, equipment and medium based on multi-observation point cooperation

CN 121999040 A

Abstract

The application provides an unmanned aerial vehicle target positioning method, equipment, and medium based on multi-observation-point cooperation. The method constructs a physical framework for three-dimensional spatial positioning of dynamic and static targets and establishes an unmanned aerial vehicle target recognition model based on bidirectional cross-dimensional feature enhancement. An image-based multi-target association matching algorithm for unmanned aerial vehicles eliminates background interference and corrects viewing-angle effects, and an end-to-end deep neural network models the mapping between multiple observation points and unmanned aerial vehicle targets, thereby achieving multi-source feature fusion and improved positioning precision.

Inventors

  • LI HAO
  • TONG LEI
  • SHI LONG
  • HAN HONGGUI
  • WANG QI
  • WU XINSEN
  • YANG RUIXUE
  • YANG YUYUAN

Assignees

  • 通号低空智能科技有限公司

Dates

Publication Date
2026-05-08
Application Date
2025-12-16

Claims (10)

  1. An unmanned aerial vehicle target positioning method based on multi-observation-point cooperation, characterized by comprising the following steps: S1, deploying a plurality of cameras to form a three-dimensional monitoring network, and performing time synchronization and spatial coordinate calibration on each camera to acquire the three-dimensional position and attitude data of each camera; S2, identifying unmanned aerial vehicle targets in the images acquired by the cameras using a target recognition model, and acquiring the two-dimensional coordinates of the unmanned aerial vehicle targets in the images, wherein the target recognition model integrates a bidirectional feature refinement module into its backbone network, the module processing feature maps through channel splitting and a cross-attention mechanism so as to integrate spatial details and semantic information; S3, performing association matching on the targets identified in different camera images based on extracted target foreground appearance attribute features, so as to determine the plurality of observations belonging to the same unmanned aerial vehicle target; S4, according to the association matching result, inputting the observations belonging to the same unmanned aerial vehicle target into an end-to-end deep neural network, establishing a mapping between the multi-source observation data and three-dimensional world coordinates through the deep neural network, and outputting a three-dimensional position estimate of the unmanned aerial vehicle target in the world coordinate system; wherein the observation data belonging to the same unmanned aerial vehicle target comprises the three-dimensional position and attitude data of a camera and the two-dimensional coordinate data of the target in that camera's image.
  2. The method according to claim 1, wherein S1 specifically comprises: S11, selecting N cameras, where N > 3, and deploying them in a spatially symmetric topology to form a three-dimensional monitoring network covering the target monitoring area; S12, connecting all camera nodes to the same time synchronization server and performing time calibration at a set synchronization period, so as to synchronize the time signals of the cameras; S13, acquiring and recording the spatial coordinates and attitude parameters of each camera node to complete the spatial coordinate calibration; S14, planning a flight route for the unmanned aerial vehicle, the route being formed by computing the centers of gravity of the planes formed by camera vertices in the three-dimensional monitoring network and connecting them, so as to ensure that the flight track of the unmanned aerial vehicle passes through the area covered by the monitoring network; S15, acquiring image data synchronously captured by the multiple cameras while the unmanned aerial vehicle flies along the route, forming a data set for training and verification.
  3. The method of claim 1, wherein the target recognition model comprises a backbone network, a neck network, and a detection head; the backbone network is built on a cross-stage partial network architecture, with the bidirectional feature refinement module integrated into at least two feature extraction stages of different scales; the neck network adopts a bidirectional feature pyramid structure, fusing the multi-level feature maps from the backbone network through dual-path top-down and bottom-up connections; the detection head comprises a classification branch and a regression branch, wherein the regression branch is optimized with a CIoU loss function and the classification branch with a cross-entropy loss function, and the detection head additionally introduces a zoom loss function as an auxiliary optimization target during training.
  4. The method of claim 3, wherein the bidirectional feature refinement module processes an input feature map as follows: splitting the input feature map along the channel dimension into a first feature subset and a second feature subset, the split ratio being controlled by a hyperparameter α; performing a convolution operation and channel attention computation on the first feature subset in sequence to generate channel attention weights; performing a convolution operation and spatial attention computation on the second feature subset to generate spatial attention weights; multiplying the channel attention weights element-wise with the second feature subset to obtain channel-enhanced features; multiplying the spatial attention weights element-wise with the first feature subset to obtain spatially enhanced features; and fusing the channel-enhanced and spatially enhanced features through a feature aggregation operation to obtain the output feature map.
  5. The method according to claim 1, wherein S3 specifically comprises: S31, computing the gradient of the pixels within each target prediction box identified in each camera image, determining an adaptive threshold from the sorted gradient values, and marking pixels whose gradient is greater than or equal to the threshold as foreground; S32, extracting appearance attribute features of the target, such as color, texture, or shape features, from the foreground pixels; S33, computing the similarity of target appearance attribute feature vectors across observation points using cosine distance, normalizing the similarities with a softmax function to obtain association probabilities between the targets at the observation points, and completing the matching based on the association probabilities.
  6. The method according to claim 1, wherein in S4, before the observation data is input into the end-to-end deep neural network, the method further comprises a data preprocessing step: aligning the data acquired by each camera based on timestamps, and applying min-max normalization to the three-dimensional position data, the attitude data, and the two-dimensional coordinate data, the normalization formula being x' = (x − min(x)) / (max(x) − min(x)); wherein x stands for either P_i, the absolute geographic coordinates provided by the i-th observation-point camera edge device via a high-precision observation-point positioning module, or u_i, the two-dimensional coordinates and bounding-box size of the positioning target in the image acquired by the i-th camera, obtained through the unmanned aerial vehicle target recognition model; and min(x) and max(x) denote the minimum and maximum of the data, respectively.
  7. The method of claim 1, wherein in S4 the end-to-end deep neural network comprises an input layer, at least one hidden layer, and an output layer, the hidden layer performing nonlinear transformations with a Sigmoid activation function to learn the nonlinear mapping between the observation data and three-dimensional world coordinates.
  8. The method of claim 7, wherein the end-to-end deep neural network is trained with a composite loss function combining a mean squared error loss and a Euclidean distance loss, so as to optimize the network parameters and compensate for the measurement bias of the optical sensors.
  9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the unmanned aerial vehicle target positioning method based on multi-observation-point cooperation according to any one of claims 1 to 8.
  10. A storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the unmanned aerial vehicle target positioning method based on multi-observation-point cooperation according to any one of claims 1 to 8.
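The regression branch of claim 3 is optimized with a CIoU loss. The patent gives no formula, so the following is a minimal sketch of the standard Complete-IoU definition (IoU minus a center-distance term and an aspect-ratio term); the exact variant used by the patent is an assumption.

```python
import math

def ciou(box_a, box_b):
    """Complete-IoU between two (x1, y1, x2, y2) boxes:
    CIoU = IoU - rho^2 / c^2 - alpha * v."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection-over-union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared distance between box centers (rho^2).
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    # Squared diagonal of the smallest enclosing box (c^2).
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1.0 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v

# The regression loss is typically 1 - CIoU; box values here are illustrative.
loss = 1.0 - ciou((0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 12.0, 12.0))
```

For identical boxes CIoU equals 1 and the loss vanishes; as boxes drift apart the distance and aspect-ratio penalties grow the loss even when IoU alone is unchanged.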
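Claim 4's bidirectional feature refinement module can be sketched as follows. This is a toy model on a C×H×W feature map stored as nested lists: the real module's convolutions are omitted, with global pooling plus a sigmoid standing in for the two attention branches, and α = 0.5 is assumed so the two channel subsets match in size for the cross element-wise products.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bi_refine(fmap, alpha=0.5):
    """Toy bidirectional feature refinement: channel split, cross attention,
    concatenation (claim 4).  Convolutions are deliberately omitted."""
    c = len(fmap)
    split = int(c * alpha)
    first, second = fmap[:split], fmap[split:]      # channel split by alpha
    h, w = len(fmap[0]), len(fmap[0][0])
    # Channel attention from the first subset: one weight per channel.
    ch_w = [sigmoid(sum(sum(row) for row in ch) / (h * w)) for ch in first]
    # Spatial attention from the second subset: one weight per pixel.
    sp_w = [[sigmoid(sum(ch[i][j] for ch in second) / len(second))
             for j in range(w)] for i in range(h)]
    # Cross application: channel weights scale the SECOND subset, spatial
    # weights scale the FIRST subset (the claim's element-wise products).
    ch_enh = [[[ch_w[k] * second[k][i][j] for j in range(w)]
               for i in range(h)] for k in range(len(second))]
    sp_enh = [[[sp_w[i][j] * first[k][i][j] for j in range(w)]
               for i in range(h)] for k in range(len(first))]
    return ch_enh + sp_enh                          # aggregation by concat

# Two 2x2 channels of illustrative constants.
out = bi_refine([[[1.0, 1.0], [1.0, 1.0]],
                 [[2.0, 2.0], [2.0, 2.0]]])
```

The output keeps the input channel count, but each half has been rescaled by attention computed from the other half, which is the "bidirectional cross-dimension" exchange of spatial detail and semantic information the claim describes.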
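Step S33 of claim 5 scores cross-view candidates by cosine similarity and converts the scores to association probabilities with a softmax. A minimal sketch, with made-up appearance feature vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity of two appearance attribute feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_probs(feat_a, feats_b):
    """Association probabilities (S33): similarity of one target's feature
    against all candidates at another observation point, softmax-normalized."""
    sims = [cosine(feat_a, f) for f in feats_b]
    exps = [math.exp(s) for s in sims]
    z = sum(exps)
    return [e / z for e in exps]

# One target at camera A matched against two candidates at camera B.
probs = match_probs([1.0, 0.0, 1.0], [[1.0, 0.1, 0.9], [0.0, 1.0, 0.0]])
best = probs.index(max(probs))  # index of the matched candidate
```

Matching on the highest probability (optionally above a threshold) then groups the observations that belong to the same unmanned aerial vehicle target across cameras.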
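The preprocessing of claim 6 is min-max normalization, x' = (x − min(x)) / (max(x) − min(x)), applied per data channel before the observations enter the network. A minimal sketch with illustrative values:

```python
def minmax_normalise(values):
    """Min-max normalization from claim 6: maps data to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

# e.g. camera altitudes in meters (illustrative, not from the patent):
norm = minmax_normalise([9.0, 10.0, 12.0, 11.0])
```

Normalizing positions, attitudes, and image coordinates to a common [0, 1] range keeps no single channel from dominating the network's input scale when the multi-source observations are fused.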
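The composite training loss of claim 8 combines a mean squared error term with a Euclidean distance term over the predicted 3-D position. A sketch under stated assumptions: the patent does not give term weights, so `w_mse` and `w_dist` are illustrative.

```python
import math

def composite_loss(pred, target, w_mse=1.0, w_dist=1.0):
    """Composite loss (claim 8): per-coordinate MSE plus the Euclidean
    distance between predicted and true 3-D positions.  The weights are
    assumptions; the patent specifies only the two constituent losses."""
    sq = [(p - t) ** 2 for p, t in zip(pred, target)]
    mse = sum(sq) / len(pred)          # mean squared error term
    dist = math.sqrt(sum(sq))          # Euclidean distance term
    return w_mse * mse + w_dist * dist

# Predicted vs. true world coordinates in meters (illustrative values).
loss = composite_loss([10.0, 20.0, 5.0], [10.0, 21.0, 5.0])
```

The distance term penalizes overall position error directly, while the MSE term remains smooth near zero; together they match the claim's goal of optimizing parameters while compensating optical-sensor measurement bias.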

Description

Unmanned aerial vehicle target positioning method, equipment and medium based on multi-observation point cooperation

Technical Field

The document relates to the technical field of unmanned aerial vehicle positioning, and in particular to an unmanned aerial vehicle target positioning method, equipment, and medium based on multi-observation-point cooperation.

Background

With the rapid development of the low-altitude economy, unmanned aerial vehicles are increasingly widely applied across many fields, and the need for accurate, real-time positioning of unmanned aerial vehicle targets is increasingly urgent. However, existing unmanned aerial vehicle target positioning technology still has the following limitations:

Insufficient small-target detection precision. At medium to long range an unmanned aerial vehicle images as a small target; existing feature extraction networks struggle to effectively fuse spatial details with deep semantic information, leading to a high miss rate and inaccurate bounding boxes, with especially poor performance when targets of multiple scales coexist.

Poor reliability of multi-view target association. Multi-observation-point cooperation depends on cross-view target matching, but existing methods rely excessively on appearance features (such as color and texture) that are sensitive to viewpoint, so matching quality degrades markedly when the viewing angle changes.

Insufficient data consistency across observation points. Time synchronization of multiple cameras is imprecise, so the observation data carries time lag, and data such as camera attitudes and image coordinates lack standardized processing, causing error accumulation during fusion and reduced positioning precision.

Uncompensated optical sensor bias. Existing methods cannot fully model the nonlinear mapping between observation points and targets, cannot correct sensor measurement bias in imaging in real time, and lack an optimization mechanism that jointly considers position error and distance constraints, so positioning results are inaccurate in the world coordinate system.

These problems restrict the practical application of unmanned aerial vehicle target positioning in complex real-world scenes. The invention therefore provides a multi-observation-point cooperative unmanned aerial vehicle target positioning method aimed at improving small-target recognition, cross-view association, data cooperation, and bias compensation, realizing accurate real-time positioning, and providing reliable support for scenes such as airspace management.

Disclosure of Invention

The invention provides a method that constructs a physical framework for three-dimensional spatial positioning of dynamic and static targets, establishes a recognition model based on bidirectional cross-dimensional feature enhancement, and models the mapping between multiple observation points and targets via an image-based multi-target association matching algorithm and an end-to-end deep neural network.
According to an embodiment of the present invention, there is provided an unmanned aerial vehicle target positioning method based on multi-observation-point cooperation, comprising: S1, deploying a plurality of cameras to form a three-dimensional monitoring network, and performing time synchronization and spatial coordinate calibration on each camera to acquire the three-dimensional position and attitude data of each camera; S2, identifying the unmanned aerial vehicle target in the image acquired by each camera using a target recognition model that integrates a bidirectional feature refinement module, and acquiring the two-dimensional coordinates of the unmanned aerial vehicle target in the image; S3, performing association matching on the targets identified in different camera images based on extracted target foreground appearance attribute features, so as to determine the plurality of observations belonging to the same unmanned aerial vehicle target; S4, according to the association matching result, inputting the observations belonging to the same unmanned aerial vehicle target into an end-to-end deep neural network, establishing a mapping between the multi-source observation data and three-dimensional world coordinates through the deep neural network, and outputting a three-dimensional position estimate of the unmanned aerial vehicle target in the world coordinate system; wherein the observation data belonging to the same unmanned aerial vehicle target comprises the three-dimensional position and attitude data of a camera and the two-dimensional coordinate data of the target in that camera's image. According to an embodiment of the present invention, there is provided an electronic