
CN-122023725-A - Unmanned aerial vehicle incremental new view angle synthesis method and system based on air-ground coordination

CN 122023725 A

Abstract

The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination first collects sensor data and marks it with a unified time stamp, resolves the pose of the unmanned aerial vehicle in real time, encodes and compresses the image data while extracting key frame indexes, downsamples the point cloud data, and packages the results into lightweight structured data packets. The transmission priority and frame-loss strategy for each data type are dynamically adjusted according to a preset QoS strategy, preferentially guaranteeing transmission of the pose data and sparse point clouds, and the lightweight structured data packets are sent to a ground terminal. The ground terminal unpacks the received lightweight structured data packets and performs space-time alignment, directly initializes the center position parameters of the three-dimensional Gaussians using the spatial coordinates of the unpacked sparse point cloud, constructs an initial three-dimensional Gaussian parameter set carrying a physical scene prior, and performs incremental optimization and rendering in an asynchronous parallel mode. The three-dimensional reconstruction time of the unmanned aerial vehicle is short, meeting the time-critical requirements of earthquake rescue.

Inventors

  • LI XUDONG
  • LIU JIAN
  • HUANG YONG
  • WU KAI
  • YU CHENHUI

Assignees

  • Hubei Earthquake Administration (Institute of Seismology, China Earthquake Administration)

Dates

Publication Date
2026-05-12
Application Date
2026-02-12

Claims (10)

  1. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination is characterized by comprising the following steps: in the first step, laser radar point cloud, camera image and inertial measurement unit data are collected and marked with a unified time stamp; an onboard unit of the unmanned aerial vehicle receives the data, calculates the pose of the unmanned aerial vehicle in real time, encodes and compresses the image data while extracting key frame indexes, downsamples the point cloud data, and packages the pose, sparse point cloud and compressed image frames into a lightweight structured data packet; in the second step, the lightweight structured data packet is sent to a buffer area, the state of the wireless communication link is monitored in real time, and the sending priority and frame-loss strategy for each data type are dynamically adjusted according to a preset QoS strategy; in the third step, the ground terminal unpacks the received lightweight structured data packet and performs space-time alignment, directly initializes the center position parameters of the three-dimensional Gaussians using the spatial coordinates of the unpacked sparse point cloud, and calculates the initial covariance matrix of each three-dimensional Gaussian from the local neighborhood distribution of the point cloud, thereby constructing an initial three-dimensional Gaussian parameter set carrying a physical scene prior; in the fourth step, background processing at the ground terminal performs incremental optimization and rendering in an asynchronous parallel mode: during optimization, the unpacked compressed image frames serve as supervision signals, and the three-dimensional Gaussian parameter set is updated by gradient descent in combination with the corresponding accurate poses to obtain an optimized scene; during rendering, the currently updated three-dimensional Gaussian parameter set is read, virtual images at specified view angles are synthesized in real time through a rasterization pipeline and displayed, and the method ends after all virtual images have been synthesized.
  2. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 1, wherein in the first step the laser radar point cloud, camera image and inertial measurement unit data are collected as follows: a high-precision time-space synchronization circuit board at the airborne end generates a unified reference clock signal and a hardware trigger pulse and distributes them synchronously to the laser radar, the camera and the inertial measurement unit; the camera starts capturing an image when it receives the hardware trigger pulse, and the laser radar and the inertial measurement unit synchronously record point cloud data and inertial measurement data when they receive the hardware trigger pulse; the original data frame output by each sensor is marked with a unified hardware time stamp based on the reference clock signal and transmitted to the onboard unit, and the onboard unit reads the unified hardware time stamps to perform frame-level alignment and association of the multi-modal data acquired at the same time.
  3. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 1, wherein in the second step the QoS transmission strategy is executed as follows: the onboard unit builds a multi-priority queue in the transmission buffer, storing pose data associated with a hardware timestamp in the high-priority queue, sparse point clouds in the medium-priority queue, and compressed image key frames in the low-priority queue; the bandwidth and time delay of the wireless communication link are calculated in real time; when a good link state is detected, data from all queues are transmitted concurrently over the wireless communication link; when link congestion or a reduced signal-to-noise ratio is detected, a congestion control strategy is executed: the transmission rate of the low-priority queue is automatically reduced and hierarchical error control is applied, an automatic repeat request mechanism is started for the high-priority and medium-priority queue data, and an active queue management strategy is executed for the low-priority queue data, preferentially discarding data buffered longer than a preset time length when the buffer overflows.
  4. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 3, wherein in the second step transmission of the pose data and the sparse point cloud is preferentially guaranteed and the screened lightweight structured data packet is sent to the ground terminal over the wireless communication link as follows: the onboard unit fuses the laser radar point cloud and the inertial measurement unit data and outputs the six-degree-of-freedom pose of the body coordinate system in real time; the point cloud data are downsampled and denoised by voxel grid filtering, the local curvature features of the remaining point cloud are calculated, and plane feature points and edge feature points representing the geometric structure of the scene are extracted to generate the sparse point cloud; the image data are compressed by a hardware encoder and, in combination with the variation of the six-degree-of-freedom pose, the compressed image frame at the current moment is selected as a reconstruction key frame when the pose variation exceeds a preset spatial threshold; finally, the six-degree-of-freedom pose, the sparse point cloud and the compressed key frame data are packaged into a lightweight structured data packet.
  5. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 1, wherein in the third step the ground terminal unpacks the received lightweight structured data packet and performs space-time alignment, directly initializes the center position parameters of the three-dimensional Gaussians using the spatial coordinates of the unpacked sparse point cloud, and calculates the initial covariance matrix of each three-dimensional Gaussian from the local neighborhood distribution of the point cloud, thereby constructing an initial three-dimensional Gaussian parameter set carrying a physical scene prior, specifically as follows: after receiving the data stream, the ground terminal first verifies the integrity of each data packet, decodes the verified data to restore the reconstruction key frames, the sparse point cloud and the corresponding six-degree-of-freedom precise poses, completes frame-level space-time alignment of the multi-modal data according to the timestamps, and stores the data in a local cache; the three-dimensional Gaussian parameter set generation process is then started: for the position parameter, each three-dimensional point coordinate in the sparse point cloud is directly mapped to the initial mean vector of a three-dimensional Gaussian; for the scale parameter, the average Euclidean distance between each three-dimensional point and its K nearest neighbors is calculated and its logarithm is set as the scale of the three-dimensional Gaussian, preventing holes in the reconstructed scene; for the rotation parameter, the normal vector or covariance matrix direction of the local neighborhood of each three-dimensional point is calculated and converted to quaternion format as the initial rotation quaternion of the three-dimensional Gaussian; for the color parameter, each three-dimensional point is projected onto the corresponding reconstruction key frame plane using the six-degree-of-freedom precise pose and the camera parameters, and the color value at the projected pixel is taken as the initial color parameter; finally, these position, scale, rotation and color parameters together form the initial three-dimensional Gaussian parameter set to be optimized.
  6. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 1, wherein in the fourth step the background processing of the ground terminal performs incremental optimization and rendering in an asynchronous parallel mode as follows: the background optimization thread continuously monitors the data buffer; when new key frames and pose data are received, they are used as training supervision signals, the photometric consistency loss and structural similarity loss between the image rendered from the current three-dimensional Gaussian model at the corresponding view angle and the real key frame are calculated, gradients are computed by back propagation, and the position, rotation, scale, opacity and spherical harmonic coefficients of the three-dimensional Gaussian parameter set are updated by gradient descent; at the same time an adaptive density control strategy is executed according to the magnitude of the position gradient: small-scale Gaussians whose position gradient exceeds a threshold and whose geometric size is smaller than a preset value are cloned to fill in texture details, large-scale Gaussians whose position gradient exceeds the threshold and whose geometric size is larger than the preset value are split to refine edges, and low-opacity Gaussians are pruned, realizing incremental refinement of the three-dimensional model; rendering runs independently of optimization: in response to a virtual camera control instruction from the user, the intrinsic matrix and extrinsic pose of the virtual camera are obtained in real time, the latest three-dimensional Gaussian parameter set at the current moment is locked and read, and projection and multi-layer transparency compositing are performed through a tile-based efficient rasterization pipeline to synthesize a continuous image sequence at the user-specified view angle in real time.
  7. The unmanned aerial vehicle incremental new view angle synthesis method based on air-ground coordination according to claim 6, wherein the three-dimensional Gaussian parameter set combines geometric structure and texture to form a three-dimensional Gaussian scene model.
  8. The unmanned aerial vehicle incremental new view angle synthesis system based on air-ground coordination according to claim 1, characterized in that the system comprises an unmanned aerial vehicle acquisition module (1), an airborne preprocessing module (2), a data transmission module (3), a ground receiving end module (4) and a background processing module (5); the output end of the unmanned aerial vehicle acquisition module (1) is connected to the input end of the airborne preprocessing module (2), the output end of the airborne preprocessing module (2) is connected to the input end of the data transmission module (3), the output end of the data transmission module (3) is connected to the input end of the ground receiving end module (4), and the output end of the ground receiving end module (4) is connected to the input end of the background processing module (5); the unmanned aerial vehicle acquisition module (1) takes sensor acquisition instructions and hardware synchronization trigger signals as input and outputs multi-modal raw data; the airborne preprocessing module (2) takes the multi-modal raw data as input and outputs lightweight structured data packets; the data transmission module (3) monitors the link state, executes a quality-of-service adaptive strategy, and transmits the lightweight structured data packets; the ground receiving end module (4) takes the lightweight structured data packets as input, outputs the decoded data packets and stores them in a local cache; the background processing module (5) takes the aligned structured data and virtual camera parameters as input, and outputs new view angle synthesized images and a three-dimensional model.
  9. The unmanned aerial vehicle incremental new view angle synthesis system based on air-ground coordination according to claim 8, wherein the unmanned aerial vehicle acquisition module (1) collects visual images, three-dimensional point clouds and inertial measurement data of the scene and marks them with unified time stamps; the airborne preprocessing module (2) performs space-time correlation of the multi-source data, estimates the pose on the basis of the laser radar data and the inertial measurement data, thins the acquired three-dimensional point cloud data to generate a sparse point cloud, and packages the pose data, the sparse point cloud data and the encoded compressed image frame data into a lightweight structured data packet; the data transmission module (3) realizes reliable transmission of the lightweight structured data packets through a link-aware QoS adaptive transmission strategy, in which pose data associated with a hardware timestamp are set to high priority, sparse point cloud data to medium priority and compressed image frame data to low priority, combined with an error control and retransmission mechanism; the ground receiving end module (4) receives the lightweight structured data packets transmitted by the airborne end, completes data decoding, frame-level alignment and buffering, feeds link state information back to the data transmission module (3), and supports subsequent real-time processing; the background processing module (5) performs physics-based initialization of the three-dimensional Gaussian parameters using the returned pose data and the spatial geometric properties of the sparse point cloud, iteratively updates the three-dimensional Gaussian parameters with newly received image frames through an incremental optimization algorithm to construct a high-precision three-dimensional model, and performs real-time rasterization rendering of the currently optimized three-dimensional model in response to virtual camera parameters or user instructions to generate new view angle images.
  10. The unmanned aerial vehicle incremental new view angle synthesis system based on air-ground coordination according to claim 1, wherein the multi-modal raw data comprise the original three-dimensional point cloud data acquired by the unmanned aerial vehicle acquisition module (1), the original image data acquired by the camera, and the original inertial measurement data acquired by the inertial measurement unit.
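The hardware-timestamp frame-level alignment described in claim 2 can be sketched as a nearest-neighbor match on the unified timestamps. This is an illustrative Python sketch, not code from the patent; the function name, the microsecond unit and the 5 ms tolerance are assumptions.

```python
from bisect import bisect_left

def align_frames(camera_ts, lidar_ts, tol_us=5000):
    """For each camera timestamp, find the nearest lidar timestamp
    within tol_us microseconds; return (camera_ts, lidar_ts) pairs."""
    lidar_sorted = sorted(lidar_ts)
    pairs = []
    for t in camera_ts:
        i = bisect_left(lidar_sorted, t)
        # candidates: the neighbor on each side of the insertion point
        best = min(
            lidar_sorted[max(0, i - 1):i + 1],
            key=lambda c: abs(c - t),
            default=None,
        )
        if best is not None and abs(best - t) <= tol_us:
            pairs.append((t, best))
    return pairs

# camera at ~30 fps; the third frame has no lidar frame within tolerance
print(align_frames([0, 33333, 66666], [100, 33400, 99999]))
```

In a real system the same matching would run across all three sensors, and frames without a close enough partner would be buffered rather than silently dropped.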
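The three-level QoS queue of claim 3 (pose high, sparse point cloud medium, compressed key frames low) can be modeled with a priority heap. This is a minimal sketch under stated assumptions: the class and field names are invented, and dropping low-priority packets stands in for the rate reduction and active queue management the claim describes; the ARQ retransmission for high/medium data is not modeled.

```python
import heapq

PRIORITY = {"pose": 0, "points": 1, "image": 2}  # lower value = sent first

class QoSBuffer:
    def __init__(self):
        self._heap, self._seq = [], 0

    def push(self, kind, payload):
        # seq breaks ties so equal-priority packets keep arrival order
        heapq.heappush(self._heap, (PRIORITY[kind], self._seq, kind, payload))
        self._seq += 1

    def drain(self, congested=False):
        """Return packets in send order; under congestion, shed
        low-priority image frames instead of transmitting them."""
        out = []
        while self._heap:
            prio, _, kind, payload = heapq.heappop(self._heap)
            if congested and prio == PRIORITY["image"]:
                continue  # active queue management: drop low-priority frames
            out.append((kind, payload))
        return out

buf = QoSBuffer()
buf.push("image", b"kf0"); buf.push("pose", b"p0"); buf.push("points", b"c0")
print(buf.drain(congested=True))  # pose and points survive, image dropped
```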
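Two airborne preprocessing steps from claim 4, voxel-grid downsampling and pose-change-triggered key-frame selection, can be sketched as follows. The function names, the 0.5 m voxel size and the 2.0 m translation threshold are illustrative assumptions, and the curvature-based feature extraction of the claim is omitted.

```python
import math

def voxel_downsample(points, voxel=0.5):
    """Keep one representative point (the centroid) per occupied voxel."""
    cells = {}
    for x, y, z in points:
        key = (math.floor(x / voxel), math.floor(y / voxel), math.floor(z / voxel))
        cells.setdefault(key, []).append((x, y, z))
    return [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in cells.values()]

def is_keyframe(prev_pos, cur_pos, dist_thresh=2.0):
    """Flag the current frame as a reconstruction key frame when the
    translation since the last key frame exceeds a spatial threshold."""
    return math.dist(prev_pos, cur_pos) > dist_thresh

# two close points collapse into one voxel, the far point keeps its own
print(len(voxel_downsample([(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (3.0, 3.0, 3.0)])))
print(is_keyframe((0, 0, 0), (2.5, 0, 0)))
```

The claim also gates on pose variation generally, so a full implementation would additionally test rotation change against an angular threshold.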
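The physics-prior Gaussian initialization of claim 5 can be sketched in a few lines of NumPy: each sparse-cloud point seeds one Gaussian whose mean is the point itself, whose log-scale is the log of the mean distance to its K nearest neighbors, and whose rotation comes from the eigenvectors of the local neighborhood covariance. All names and K=3 are illustrative; the claim's final quaternion conversion and color projection are noted in comments but not implemented.

```python
import numpy as np

def init_gaussians(points, k=3):
    """Return (means, log_scales, rotations), one Gaussian per point."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbor
    order = np.argsort(dist, axis=1)[:, :k]
    knn_d = np.take_along_axis(dist, order, axis=1)
    log_scale = np.log(knn_d.mean(axis=1))      # log mean k-NN distance (claim 5)
    rots = []
    for i in range(len(pts)):
        nbrs = pts[order[i]] - pts[i]
        cov = nbrs.T @ nbrs / k                 # local neighborhood covariance
        _, vecs = np.linalg.eigh(cov)
        if np.linalg.det(vecs) < 0:
            vecs[:, 0] *= -1                    # enforce a proper rotation
        rots.append(vecs)                       # would be converted to a quaternion
    return pts, log_scale, np.stack(rots)

means, log_scales, rotations = init_gaussians(
    [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], k=2
)
```

In the full method each point would also be projected into its key frame with the precise pose and camera intrinsics to sample an initial color.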
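The adaptive density control rule of claim 6 (clone small high-gradient Gaussians, split large ones, prune transparent ones) can be sketched with plain dictionaries. The thresholds and the dict representation are assumptions; a real 3D Gaussian Splatting trainer applies these rules to tensors of parameters, and splitting also resamples the two children's positions.

```python
GRAD_T, SIZE_T, OPACITY_T = 0.01, 0.5, 0.05  # illustrative thresholds

def density_control(gaussians):
    """Each gaussian is a dict with 'grad', 'size', 'opacity' keys."""
    out = []
    for g in gaussians:
        if g["opacity"] < OPACITY_T:
            continue                      # prune near-transparent Gaussians
        out.append(g)
        if g["grad"] > GRAD_T:
            if g["size"] < SIZE_T:
                out.append(dict(g))       # clone: duplicate to fill texture detail
            else:
                half = dict(g, size=g["size"] / 2)
                out[-1] = dict(half)      # split: replace by two half-size copies
                out.append(dict(half))
    return out

demo = [
    {"grad": 0.02, "size": 0.1, "opacity": 0.9},   # cloned
    {"grad": 0.02, "size": 1.0, "opacity": 0.9},   # split
    {"grad": 0.00, "size": 0.3, "opacity": 0.01},  # pruned
]
print(len(density_control(demo)))  # → 4
```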

Description

Unmanned aerial vehicle incremental new view angle synthesis method and system based on air-ground coordination

Technical Field

The invention relates to an improvement of real-time three-dimensional environment sensing and reconstruction technology under an air-ground cooperative framework, belongs to the intersection of computer vision and unmanned aerial vehicle remote sensing and mapping, and particularly relates to an unmanned aerial vehicle incremental new view angle synthesis method and system based on air-ground coordination.

Background

Traditional geographic information collection and topographic survey rely largely on manual on-site measurement, which is not only inefficient and limited in accuracy, but also difficult to carry out in complex terrain such as hills, deserts and earthquake disaster areas, and in hazardous areas that personnel cannot reach. The emergence of unmanned aerial vehicle acquisition technology provides a brand-new technical path for emergency surveying and disaster rescue: it can achieve centimeter-level high-accuracy topographic survey, greatly improve data collection efficiency in the field and at disaster scenes, and, through a fused acquisition mode of five-lens oblique photography and laser radar, quickly generate high-accuracy three-dimensional models, providing key high-quality data support for earthquake rescue, surveying and reconnaissance, digital twins and other fields. However, the current mainstream unmanned aerial vehicle three-dimensional reconstruction schemes still have obvious technical shortcomings and are difficult to adapt to emergency scenarios such as earthquake rescue, where the requirements on timeliness are extremely high.

Most existing schemes adopt the traditional oblique photography (SfM/MVS) technical route, with a lengthy offline processing flow of flight acquisition, data copying, aerial triangulation, dense matching and texture mapping. The time spent on data computation and model reconstruction is long, and the complete flow usually takes hours or even days. In sudden geological disasters such as earthquakes, the golden rescue window is extremely limited, and an overlong data processing cycle cannot provide real-time data support for terrain analysis, collapsed-building modeling, rescue path planning and disaster area assessment, severely restricting the rapid response and efficient advancement of earthquake rescue actions.

The Chinese patent application with application number CN202510806533.7, filed on June 17, 2025, discloses a method for synthesizing new view images from sparse views. It first inputs images and camera poses of the same object at different view angles and obtains five-dimensional vectors of pre-sampling points; the sampling points' feature information is obtained, multi-level features of the pre-sampling points are extracted through wavelet decomposition and decoded through a frequency-domain feature coding network; the multi-level features and direction information of the pre-sampling points are fed into a double-layer decoder module to obtain volume density, color value and predicted visibility, and into a differentiable rendering module to generate a rendered image; loss between the rendered image and the input image is used to optimize the whole network, yielding a sparse-view new view angle synthesis network model. By constructing pre-sampling points from a small number of images and camera poses at different view angles and completing new view image generation through frequency-domain coding, differentiable rendering and an attention mechanism, the scheme improves synthesis accuracy under sparse input conditions and alleviates the loss of high-frequency information. Although this scheme improves accuracy, its processing time is long, and it still cannot meet the time-critical three-dimensional reconstruction requirements of rescue.

The disclosure of this background section is only intended to increase the understanding of the general background of the present patent application and should not be taken as an admission or any form of suggestion that this information forms the prior art already known to a person of ordinary skill in the art.

Disclosure of Invention

The invention aims to solve the problems that in the prior art the three-dimensional reconstruction time of an unmanned aerial vehicle is long and the time-critical requirements of earthquake rescue are difficult to meet, and provides an unmanned aerial vehicle incremental new view angle synthesis method and system based on air-ground coordination.