CN-122023344-A - Unmanned aerial vehicle multi-scene inspection model construction method integrating transfer learning and physical state guidance

CN-122023344-A

Abstract

The invention relates to the technical fields of artificial intelligence, unmanned aerial vehicle remote sensing, and cross-domain adaptive learning, and discloses a method for constructing an unmanned aerial vehicle multi-scene inspection model that integrates transfer learning and physical state guidance. The method addresses the large distribution differences in inspection data and the weak generalization capability of models under zooming, attitude change, and complex ambient illumination. It comprises the following steps: parsing the inspection data to construct an acquisition-state fingerprint vector containing geometric and environmental components; searching a geometric manifold index library for a source-domain anchor set and resolving geometric and environmental difference vectors; using the difference vectors to perform spatial transformation of the feature map through a mapping network and nonlinear weight fusion through a hyper-network, thereby simulating the physical viewing angle of the target domain and adapting to its environmental features; and back-calculating physically equivalent geometric parameters from the ground sampling distance, then computing a consistency loss against standard specification data to drive parameter updates.

Inventors

  • YANG GUIFANG
  • LIN YUXING
  • CHEN ZHUO
  • ZHU ZAILIANG

Assignees

  • 莆田市木兰数智科技服务有限公司
  • 莆田学院

Dates

Publication Date
2026-05-12
Application Date
2026-01-30

Claims (10)

  1. A method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance, characterized by comprising the following steps: parsing and vectorizing original inspection data to construct a standardized acquisition-state fingerprint vector, wherein the original inspection data comprises source-domain data and target-domain data; receiving the geometric state components of the source-domain data and constructing a geometric manifold index library; calculating a reference geometric state vector of the target-domain data, searching the geometric manifold index library with it as a query vector to obtain a source-domain anchor set, calculating a state difference vector between the target-domain vector and the anchor-set vectors, and decomposing the state difference vector into a geometric difference vector and an environmental difference vector; inputting the geometric difference vector into a mapping network to generate an affine transformation parameter matrix, and performing grid sampling and interpolation on the feature map extracted from the source-domain image; inputting the environmental difference vector into a hyper-network structure to generate a gating vector, and performing nonlinear weighted fusion on the weight parameters of a pre-trained model; and introducing a physical prior constraint in the back-propagation stage of model training: receiving the target-detection bounding-box coordinates output by the model, back-calculating the physically equivalent geometric parameters of the predicted target, and computing a consistency loss between those parameters and pre-stored standard specification data to drive parameter updates, thereby obtaining an inspection data analysis model.
  2. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of constructing a standardized acquisition-state fingerprint vector comprises: extracting the telemetry data and sensor metadata associated with each image frame in the original inspection data; constructing the geometric state component, which comprises relative height, shooting distance, gimbal pitch angle, and relative yaw angle, and characterizes the spatial geometric relationship during imaging; constructing the environmental state component, which comprises ground sampling distance, exposure value, and image texture entropy, and characterizes the ambient illumination and texture clarity during imaging; and performing Z-score standardization on the geometric and environmental state components using the statistical distribution parameters of source-domain historical data to generate the acquisition-state fingerprint vector.
  3. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 2, characterized in that the geometric manifold index library is constructed with a space-partitioning algorithm, specifically comprising: adopting a K-dimensional (k-d) tree as the space-partitioning structure; calculating the variance of the geometric state components of the source-domain data in each dimension and selecting the dimension with the largest variance as the split dimension of the current node; selecting the median in the split dimension as the partition node and recursively dividing the left and right subtree spaces until the number of samples contained in a leaf node is smaller than a preset capacity threshold; and using a weighted Euclidean distance as the metric of similarity between geometric state vectors, wherein the weight assigned to the shooting-distance dimension is larger than the weights assigned to the angle dimensions.
  4. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of calculating a reference geometric state vector of the target-domain data comprises: acquiring a target-domain calibration data set comprising a plurality of pre-acquired typical samples; standardizing the samples in the target-domain calibration data set using the global mean and global standard deviation of the source-domain data; and taking the arithmetic mean of all processed sample vectors of the target-domain calibration data set to obtain the reference geometric state vector.
  5. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of decomposing the state difference vector into a geometric difference vector and an environmental difference vector specifically comprises: calculating an initial difference vector between each anchor sample in the anchor set and the reference geometric state vector; calculating aggregation weights from the geometric distances between each anchor sample and the reference geometric state vector using a Softmax function with a temperature coefficient; and performing a weighted summation of the initial difference vectors of all anchor samples with the aggregation weights to obtain the aggregated geometric difference vector and environmental difference vector.
  6. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of performing grid sampling and interpolation on the feature map extracted from the source-domain image specifically comprises: generating a normalized sampling grid from the affine transformation parameter matrix; and, for non-integer coordinates in the sampling grid, sampling pixel values at the corresponding positions of the source-domain feature map with a bilinear interpolation algorithm to generate a geometrically corrected feature map.
  7. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, further comprising a confidence-gating step before performing grid sampling and interpolation on the feature map extracted from the source-domain image: calculating the Euclidean norm of the geometric difference vector; judging whether the norm exceeds a preset safe-transformation threshold; and, if it does, resetting the affine transformation parameter matrix to the identity transformation, or stopping the grid sampling operation and directly outputting the original source-domain feature map.
  8. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of inputting the hyper-network structure to generate the gating vector comprises: establishing, with the hyper-network, a mapping from the environmental difference features to the importance of the convolution channels, and outputting an importance descriptor of the feature channels; processing the feature-channel importance descriptor with an activation function to generate the gating vector mapped to a normalized numerical interval; and embodying the nonlinear weighted fusion as a channel-wise multiplicative recalibration of the convolution-kernel weights of a convolution layer using the gating vector.
  9. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 2, wherein the step of receiving the target-detection bounding-box coordinates output by the model and back-calculating the physically equivalent geometric parameters of the predicted target comprises: receiving the target-detection bounding box output by the model and acquiring its pixel dimensions; and multiplying the pixel dimensions by the ground-sampling-distance value recorded in the environmental state component to obtain the physically equivalent geometric parameters of the predicted target, wherein the ground-sampling-distance value characterizes the actual physical distance represented by a single pixel in the image.
  10. The method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance according to claim 1, wherein the step of calculating the consistency loss between the physically equivalent geometric parameters and pre-stored standard specification data specifically comprises: retrieving, from pre-stored standard component specification data, the statistical mean and standard deviation of the physically equivalent geometric parameters of the category to which the predicted target belongs; calculating, with a Gaussian kernel function, the physical consistency probability that the physically equivalent geometric parameters of the predicted target belong to that category; and constructing a loss function based on the physical consistency probability, or calibrating the confidence of the predicted target with the physical consistency probability as a weighting factor, thereby suppressing pseudo targets with physically unreasonable equivalent geometric parameters.
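As an illustrative sketch (not the patent's implementation), claim 2's fingerprint construction reduces to concatenating the geometric and environmental components and applying Z-score standardization with source-domain statistics. The field ordering and the source-domain mean/standard-deviation values below are hypothetical:

```python
import numpy as np

# Hypothetical source-domain statistics for the 7 fingerprint dimensions:
# [rel_height, shoot_dist, pitch, rel_yaw | gsd, exposure, texture_entropy]
SOURCE_MEAN = np.array([55.0, 80.0, -45.0, 0.0, 0.02, 0.0, 4.5])
SOURCE_STD  = np.array([20.0, 30.0,  25.0, 90.0, 0.01, 2.0, 1.0])

def state_fingerprint(rel_height, shoot_dist, pitch, rel_yaw,
                      gsd, exposure, tex_entropy):
    """Concatenate geometric and environmental components, then Z-score
    them with source-domain distribution parameters (claim 2)."""
    raw = np.array([rel_height, shoot_dist, pitch, rel_yaw,   # geometric
                    gsd, exposure, tex_entropy])              # environmental
    return (raw - SOURCE_MEAN) / SOURCE_STD

# A frame captured exactly at the source-domain mean maps to the zero vector.
fp = state_fingerprint(55.0, 80.0, -45.0, 0.0, 0.02, 0.0, 4.5)
```

Standardizing with *source-domain* statistics (rather than per-domain statistics) is what makes target-domain fingerprints directly comparable to the source-domain index library.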
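For claim 3's anchor search, note that a weighted Euclidean distance is equivalent to a plain Euclidean distance after scaling each axis by the square root of its weight, so a standard k-d tree can be built on the scaled coordinates. The sketch below uses a brute-force search in place of the tree to keep the metric visible; the weight values are hypothetical (only the larger weight on the shooting-distance dimension reflects the claim):

```python
import numpy as np

# Hypothetical weights: shooting distance (index 1) weighted above the angles.
WEIGHTS = np.array([1.0, 4.0, 1.0, 1.0])  # [height, distance, pitch, yaw]

def search_anchor_set(source_geo, query, k=3):
    """Return indices of the k nearest source-domain anchors under the
    weighted Euclidean metric (claim 3). A k-d tree on the scaled
    coordinates would return the same result."""
    scale = np.sqrt(WEIGHTS)
    d = np.linalg.norm((source_geo - query) * scale, axis=1)
    return np.argsort(d)[:k]

rng = np.random.default_rng(0)
source_geo = rng.normal(size=(100, 4))     # toy source-domain geometric vectors
query = source_geo[7] + 0.01               # query near sample 7
anchors = search_anchor_set(source_geo, query, k=3)
```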
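Claim 5's aggregation step can be sketched as a temperature-scaled Softmax over anchor distances: closer anchors get larger weights, and the per-anchor difference vectors are combined by weighted summation. The temperature value and toy vectors below are illustrative:

```python
import numpy as np

def aggregate_differences(anchor_vecs, reference, temperature=0.5):
    """Aggregate per-anchor difference vectors with Softmax weights
    derived from geometric distance (claim 5)."""
    diffs = anchor_vecs - reference                 # initial difference vectors
    dist = np.linalg.norm(diffs, axis=1)
    logits = -dist / temperature                    # closer anchor -> larger weight
    w = np.exp(logits - logits.max())               # numerically stable Softmax
    w /= w.sum()
    return w @ diffs                                # weighted summation

# Two equally close anchors and one distant outlier: the outlier's
# contribution is suppressed almost entirely.
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
agg = aggregate_differences(anchors, np.zeros(2))
```

Lowering the temperature sharpens the weighting toward the single nearest anchor; raising it approaches a uniform average over the anchor set.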
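Claim 6's grid sampling step follows the familiar affine-grid / grid-sample pattern: a 2x3 parameter matrix acts on a normalized coordinate grid, and the feature map is resampled with bilinear interpolation at the resulting non-integer positions. This minimal single-channel NumPy version is a sketch, not the patent's network code:

```python
import numpy as np

def affine_grid_sample(feat, theta):
    """feat: (H, W) feature map; theta: 2x3 affine matrix acting on
    normalized coordinates in [-1, 1] (claim 6)."""
    H, W = feat.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W),
                         indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)   # (H, W, 3)
    sample = coords @ theta.T                                # normalized sample grid
    # Map normalized coordinates back to pixel indices.
    px = (sample[..., 0] + 1) * (W - 1) / 2
    py = (sample[..., 1] + 1) * (H - 1) / 2
    x0 = np.clip(np.floor(px).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(py).astype(int), 0, H - 2)
    wx = np.clip(px - x0, 0, 1)
    wy = np.clip(py - y0, 0, 1)
    # Bilinear interpolation over the four neighboring pixels.
    return (feat[y0, x0] * (1 - wx) * (1 - wy) +
            feat[y0, x0 + 1] * wx * (1 - wy) +
            feat[y0 + 1, x0] * (1 - wx) * wy +
            feat[y0 + 1, x0 + 1] * wx * wy)

# Sanity check: the identity transform reproduces the input feature map.
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
feat = np.arange(16.0).reshape(4, 4)
same = affine_grid_sample(feat, identity)
```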
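Claim 7's confidence gate is a small safety check: if the geometric difference vector's norm exceeds a threshold, the affine parameters fall back to the identity so grid sampling leaves the source feature map unchanged. The threshold value here is hypothetical:

```python
import numpy as np

SAFE_NORM = 2.0  # hypothetical safe-transformation threshold

def gate_affine(theta, geo_diff, threshold=SAFE_NORM):
    """Reset the affine matrix to the identity when the geometric
    difference vector is out of the safe range (claim 7)."""
    if np.linalg.norm(geo_diff) > threshold:
        return np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])          # identity transform
    return theta

theta = np.array([[0.9, 0.1, 0.0], [-0.1, 0.9, 0.0]])
kept  = gate_affine(theta, np.array([0.5, 0.5, 0.1]))   # small diff: kept
reset = gate_affine(theta, np.array([3.0, 0.0, 0.0]))   # large diff: identity
```

The claim's alternative branch (skipping grid sampling entirely and returning the raw source feature map) is behaviorally equivalent to the identity fallback shown here.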
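Claim 8's hyper-network gating can be sketched as a small learned map from the environmental difference vector to a per-channel importance descriptor, squashed by a sigmoid into (0, 1) and applied as a channel-wise multiplicative recalibration of convolution weights. All shapes are illustrative, and the random matrix stands in for a trained hyper-network:

```python
import numpy as np

rng = np.random.default_rng(1)
env_dim, n_channels = 3, 8
W_hyper = rng.normal(scale=0.1, size=(n_channels, env_dim))  # untrained stand-in

def gating_vector(env_diff):
    """Map the environmental difference vector to a (0, 1) gate per
    output channel (claim 8)."""
    logits = W_hyper @ env_diff              # channel importance descriptor
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid to normalized interval

def recalibrate(conv_weights, gate):
    """Channel-wise multiplicative recalibration of convolution kernels.
    conv_weights: (out_channels, in_channels, kH, kW)."""
    return conv_weights * gate[:, None, None, None]

conv_w = rng.normal(size=(n_channels, 4, 3, 3))
gate = gating_vector(np.array([0.2, -0.1, 0.3]))
new_w = recalibrate(conv_w, gate)
```

Because the gate multiplies the *weights* rather than the activations, the adaptation is baked into the convolution layer itself for the given environmental condition.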
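Claims 9 and 10 combine into a short physical-plausibility check: the bounding box is converted from pixels to meters via the ground sampling distance, and a Gaussian kernel scores how well that size matches the class's specification statistics. The specification table below is invented for illustration:

```python
import numpy as np

# Hypothetical spec table: class -> (mean physical size in m, std in m).
SPEC = {"insulator": (1.2, 0.2)}

def physical_size(box_px, gsd_m_per_px):
    """Back-calculate the physically equivalent extent of a detection box
    (x1, y1, x2, y2) from the ground sampling distance (claim 9)."""
    w = (box_px[2] - box_px[0]) * gsd_m_per_px
    h = (box_px[3] - box_px[1]) * gsd_m_per_px
    return max(w, h)

def consistency_prob(size_m, cls):
    """Gaussian-kernel physical consistency probability (claim 10)."""
    mu, sigma = SPEC[cls]
    return float(np.exp(-0.5 * ((size_m - mu) / sigma) ** 2))

# A 60 x 20 px box at 0.02 m/px -> 1.2 m, matching the spec mean exactly.
size = physical_size((100, 100, 160, 120), gsd_m_per_px=0.02)
p = consistency_prob(size, "insulator")
```

A ground-debris false positive of the same *pixel* size at a shorter shooting distance would yield a much smaller physical size and hence a near-zero consistency probability, which is exactly the pseudo-target suppression the claim describes.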

Description

Unmanned aerial vehicle multi-scene inspection model construction method integrating transfer learning and physical state guidance

Technical Field

The invention relates to the technical fields of artificial intelligence, unmanned aerial vehicle remote sensing, and cross-domain adaptive learning, and in particular to a method for constructing an unmanned aerial vehicle multi-scene inspection model integrating transfer learning and physical state guidance.

Background

With the wide application of unmanned aerial vehicle (UAV) technology in power, traffic, and security inspection, image recognition based on deep learning has become the core means of processing massive inspection data. In practice, UAVs must perform tasks on different lines, in different seasons, and under different lighting conditions, which causes distribution differences in the acquired image data. To maintain model performance in a new scene without massive labeling of new data, transfer learning and domain adaptation techniques are widely adopted, aiming to transfer existing source-domain knowledge to a target domain. However, existing transfer learning methods have limitations when processing UAV inspection data. Most current domain adaptation algorithms align the statistical distributions of image features, focusing on reducing the distance between source and target domains in feature space while neglecting the physical acquisition state hidden behind the images. The differences among UAV inspection images come mainly from two dimensions: geometric viewing-angle differences caused by changes in flight attitude and shooting distance, and environmental texture differences caused by changes in weather and illumination.
The prior art typically confounds these two types of differences in a single global feature alignment, failing to decouple them purposefully. For example, when the source domain is captured mainly from a top-down view and the target domain from an oblique view, forced feature alignment often loses or distorts the spatial structure of the image, causing negative transfer and making it difficult to obtain a high-quality detection model. In addition, traditional visual inspection models rely primarily on pixel-level texture and shape features for reasoning and lack any perception of physical dimensions. When the UAV zooms or its flight height varies widely, the pixel size of the same class of parts varies drastically, while background objects (such as debris on the ground) may exhibit textures and pixel sizes similar to those of power components at a particular focal length. Because existing models do not introduce physical constraints such as the ground sampling distance, it is difficult to distinguish the true physical size of an object from visual features alone, and logically unreasonable false detections are easily produced, reducing the practicality and reliability of the inspection system.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a method for constructing a UAV multi-scene inspection model integrating transfer learning and physical state guidance, which solves the problems of large inspection-data distribution differences and weak model generalization under zooming, attitude change, and complex ambient illumination. The method is realized by the following technical scheme. First, the original inspection data is parsed and vectorized to construct a standardized acquisition-state fingerprint vector.
The original inspection data covers both source-domain and target-domain data. The acquisition-state fingerprint vector is explicitly decoupled into a geometric state component and an environmental state component, which respectively characterize the spatial position relationship and the optical environment during imaging. Second, the geometric state components of the source-domain data are received and a geometric manifold index library is constructed. Meanwhile, a reference geometric state vector of the target-domain data is calculated and used as a query vector to search the geometric manifold index library for a source-domain anchor set. The state difference vector between the target-domain vector and the anchor-set vectors is then decomposed into a geometric difference vector and an environmental difference vector, thereby quantifying the specific deviation between the target domain and the source domain in physical space and environmental conditions.