
CN-121999390-A - Method for identifying alien invasive species based on unmanned aerial vehicle and hyperspectral multi-modal data

CN 121999390 A

Abstract

The invention discloses a method for identifying alien invasive species based on unmanned aerial vehicle (UAV) and hyperspectral multi-modal data, relating to the technical fields of remote sensing and computer vision and deployed on UAVs or edge devices. A lightweight convolutional backbone initialized through transfer learning is combined with channel attention and an improved h-swish activation, and a draft decision head and a verification decision head are constructed for parallel inference. A dual-threshold decision driven by a comprehensive acceptance score is applied: gray-area samples are re-judged within a time window, and samples that remain uncertain are sent for remote review, whose results flow back for incremental learning and threshold updating. On the device side, INT8/FP8 quantization, operator fusion and an asynchronous two-stage pipeline reduce latency and energy consumption. The method outputs the species category, confidence, timestamp and review flag, and is suitable for rapid identification in the field.
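The dual-threshold decision driven by the comprehensive acceptance can be illustrated with a minimal Python sketch. The weighting scheme, the L1-based consistency term and all numeric thresholds below are illustrative assumptions; the patent only specifies that preset weights, an exponential consistency form and high/low thresholds are used.

```python
import numpy as np

def comprehensive_acceptance(p_draft, p_verify, target, weights=(0.4, 0.4, 0.2)):
    """Combine the two decision heads into a single acceptance score.

    p_draft, p_verify: 1-D probability distributions over species classes.
    target: index of the category proposed by the verification head.
    weights: illustrative preset weights for the three terms (not from the patent).
    """
    # Consistency as an exponential of the distributions' difference
    # (claim 8 only says "exponential form"; the L1 distance is an assumption).
    consistency = np.exp(-np.abs(p_draft - p_verify).sum())
    return (weights[0] * p_verify[target]
            + weights[1] * p_draft[target]
            + weights[2] * consistency)

def dual_threshold_decide(score, high=0.85, low=0.45):
    """Map the comprehensive acceptance onto the three decision branches of S8."""
    if score >= high:
        return "accept"         # output the recognition result directly
    if score >= low:
        return "re-judge"       # repeat verification within the time window
    return "upload-review"      # package context and send for remote review

# Example: two heads that broadly agree on class 2 fall into the re-judge band.
p_draft = np.array([0.05, 0.10, 0.80, 0.05])
p_verify = np.array([0.03, 0.07, 0.85, 0.05])
print(dual_threshold_decide(comprehensive_acceptance(p_draft, p_verify, target=2)))
```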

Inventors

  • CAO ZHIYONG
  • LI CHEN
  • GUI FURONG
  • ZHANG YUANHAO
  • ZHAO HONGBO
  • CAI JIYUAN
  • JIAN GUANGYAO
  • ZHANG SHUAI
  • LI YAHONG

Assignees

  • Yunnan Agricultural University (云南农业大学)

Dates

Publication Date
2026-05-08
Application Date
2025-12-02

Claims (10)

  1. A method for identifying alien invasive species based on a mobile-end lightweight network, applied to an unmanned aerial vehicle (UAV) processor or edge computing device, characterized by comprising the following steps:
     S1, training and deployment preparation: initializing a lightweight convolutional backbone with pre-trained weights from a public dataset, and performing transfer-learning training on task data to obtain the parameters of a draft decision head and a verification decision head and a device-side deployment package;
     S2, acquisition and preprocessing: acquiring images of a target area, writing the acquired images into a ring buffer, performing color correction based on the shooting time, exposure information and white-balance information in the image metadata, and performing long-side scaling, denoising and pixel normalization to obtain preprocessed images;
     S3, lightweight feature extraction: performing multi-scale feature extraction on the preprocessed images with the lightweight convolutional backbone to obtain a feature set;
     S4, attention and activation modulation: performing channel-attention recalibration on the feature set and applying an improved h-swish activation function for nonlinear mapping to obtain enhanced features, wherein the improved h-swish activation function contains learnable scaling, shift and gate-width parameters which, in the inference stage, are fused with normalization, scale transformation and linear mapping through a piecewise-linear approximation and executed on the device side as a single operator;
     S5, draft decision: feeding the enhanced features into the draft decision head and outputting candidate categories and a draft probability distribution;
     S6, verification decision: feeding the enhanced features into the verification decision head and outputting a verification probability distribution and a target category;
     S7, computing the comprehensive acceptance: combining, according to preset weights, the probability assigned by the verification decision head to the target category, the probability assigned by the draft decision head to the target category, and the consistency between the two probability distributions, to obtain the comprehensive acceptance;
     S8, dual-threshold decision: comparing the comprehensive acceptance with a high threshold and a low threshold; outputting the recognition result when the comprehensive acceptance is not lower than the high threshold; when it lies between the high and low thresholds, performing repeated verification within a time window, fusing the repeated results with weights and comparing again, and, if the re-comparison still does not reach the high threshold, packaging the sample under test together with its context image patch, adjacent-frame digests, acquisition time and exposure information, draft probability distribution and verification probability distribution and uploading them for remote review; uploading directly for review when the comprehensive acceptance is below the low threshold;
     S9, incremental learning and threshold updating: adding remotely reviewed samples and gray-area samples from the time window to a sample pool, performing small-step fine-tuning of the draft decision head, the verification decision head and the improved h-swish activation-function parameters, and updating the high and low thresholds on a validation-set curve based on misjudgment cost and upload cost.
  2. The method of claim 1, wherein the method runs on the device side with a producer-consumer division of work and two-stage pipelining, wherein the lightweight feature extraction, the attention and activation modulation, the draft and verification decisions, and the dual-threshold decision are staggered in time so that data movement overlaps with computation; wherein the comprehensive acceptance uniformly drives the decision flow of acceptance, re-judgment and upload review; wherein the improved h-swish activation function applies orthogonal mixing to the pre-activation input during training to sparsify abnormal activations, and in the inference stage its piecewise-linear approximation is fused with the normalization, scale transformation and linear mapping and executed as a single device-side operator; and wherein S1 comprises, during transfer-learning training, freezing the shallow layers of the lightweight convolutional backbone while training the higher layers and the decision heads, then unfreezing the whole network for joint fine-tuning, and employing weighted cross-entropy, focal loss or a combination of both as the classification loss to alleviate class imbalance.
  3. The method of claim 1, wherein S1 comprises data augmentation during transfer-learning training using a combination of random cropping, horizontal flipping, color perturbation and random occlusion.
  4. The method of claim 1, wherein S2 comprises performing luminance stretching and white-balance correction on the preprocessed image, scaling it proportionally to a preset long-side size, denoising with spatial filtering, and rejecting preprocessed images whose quality falls below a threshold based on a sharpness index.
  5. The method of claim 1, wherein the lightweight convolutional backbone in S3 adopts a bottleneck structure of depthwise-separable convolutions, comprising pointwise convolution and a linear layer to reduce the number of parameters, and upsamples and channel-aligns features of different scales through a feature pyramid to output a feature set with a uniform number of channels.
  6. The method of claim 1, wherein the channel-attention recalibration in S4 comprises three parts: global average pooling, channel compression and channel recalibration, the channel compression ratio being set within a fixed range to balance computational overhead against performance gain, and the channel-attention recalibration being applied immediately after the output of the lightweight convolutional backbone (a sketch follows the claims).
  7. The method of claim 1, wherein the improved h-swish activation function in S4 supports adaptive learning of per-channel or per-layer scaling, shift and gate-width parameters during the training phase, applies orthogonal mixing to the pre-activation input during training to sparsify abnormal activations, and during the inference phase fuses its piecewise-linear approximation with the normalization, scale transformation and linear mapping to enable device-side execution in integer or low-precision floating point, the polyline breakpoints being generated from the shift and gate-width parameters (a sketch follows the claims).
  8. The method of claim 1, wherein the consistency in S7 is expressed as an exponential of the difference between the two probability distributions, the time-window re-judgment in S8 applies temperature calibration to the repeated verification results and fuses them with inverse-variance weighting to obtain a fused probability distribution, and the fused probability distribution replaces the verification probability distribution in the re-comparison of the comprehensive acceptance (a fusion sketch follows the claims).
  9. The method of claim 1, wherein the method employs a joint optimization strategy of quantization, pruning, operator fusion and asynchronous pipelining during device-side execution, wherein weights are quantized per block to INT8 or FP8, activations are quantized per row to FP8, and structured channel pruning is used to meet real-time and power-consumption targets; the convolution, normalization and improved h-swish activation function are fused into a single execution unit; the producer stage handles on-chip memory transfers while the consumer stage performs convolution and decision; double ring buffers and event fences manage the access phases; and two-stage pipelining interleaves feature computation with activation, verification and re-decision.
  10. The method of claim 1, wherein the output interface comprises a species category identifier, a confidence level, a comprehensive acceptance, a high-threshold decision flag, a low-threshold decision flag, a timestamp and an upload-review flag, and records upload event logs and the threshold configuration, wherein the confidence level is the probability assigned by the verification decision head to the output species category, and wherein the output interface connects to a UAV task management system or an edge operation-and-maintenance platform for task trajectory tracing and auditing of the incremental-learning process.
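A minimal sketch of the channel-attention recalibration in claim 6, written as a squeeze-and-excitation-style block in NumPy. The reduction ratio, the ReLU/sigmoid choices and the weight shapes are assumptions for illustration; the claim only fixes the three stages of global average pooling, channel compression and channel recalibration.

```python
import numpy as np

def channel_attention(features, w_compress, w_recalibrate):
    """Channel-attention recalibration applied right after the backbone output.

    features:      (C, H, W) feature map from the lightweight backbone.
    w_compress:    (C // r, C) channel-compression weights (r = reduction ratio).
    w_recalibrate: (C, C // r) channel-recalibration weights.
    """
    c = features.shape[0]
    squeezed = features.reshape(c, -1).mean(axis=1)          # global average pooling
    hidden = np.maximum(w_compress @ squeezed, 0.0)          # channel compression + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w_recalibrate @ hidden)))  # per-channel gates in (0, 1)
    return features * gates[:, None, None]                   # recalibrate each channel

# Example with C = 16 channels and a reduction ratio of 4 (illustrative values).
rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 32, 32))
out = channel_attention(feats, rng.standard_normal((4, 16)), rng.standard_normal((16, 4)))
```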
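A sketch of the improved h-swish of claims 1 and 7, assuming it generalizes the standard h-swish x·relu6(x+3)/6 by making the shift, gate width and an output scale learnable; the piecewise-linear table used for device-side inference samples the curve on a grid that includes the two breakpoints derived from the shift and gate-width parameters. The parameterization and grid size are illustrative, not taken from the patent.

```python
import numpy as np

def modified_hswish(x, scale=1.0, shift=3.0, gate_width=6.0):
    """Training-time form: with the default parameters this reduces to the
    standard h-swish x * relu6(x + 3) / 6; scale, shift and gate width are
    the learnable parameters of claims 1 and 7 (parameterization assumed)."""
    gate = np.clip(x + shift, 0.0, gate_width) / gate_width
    return scale * x * gate

def piecewise_linear_table(scale, shift, gate_width, lo=-8.0, hi=8.0, n=17):
    """Inference-time piecewise-linear approximation of the activation.

    Claim 7 says the polyline breakpoints follow from the shift and gate-width
    parameters; here the sample grid simply includes the two exact breakpoints
    x = -shift and x = gate_width - shift (grid size is an assumption)."""
    knots = np.unique(np.concatenate(
        [np.linspace(lo, hi, n), [-shift, gate_width - shift]]))
    return knots, modified_hswish(knots, scale, shift, gate_width)

def fused_activation(x, knots, values):
    """Evaluate the fused single operator by linear interpolation; on hardware
    this lookup would be folded with normalization, scaling and linear mapping."""
    return np.interp(x, knots, values)

knots, values = piecewise_linear_table(scale=1.1, shift=2.5, gate_width=5.0)
y = fused_activation(np.linspace(-6, 6, 5), knots, values)
```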
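A sketch of the time-window fusion in claim 8, assuming temperature calibration of each repeated verification result followed by inverse-variance weighting, with each run's weight taken as the reciprocal of the variance of its probability distribution; the claim does not specify which variance is meant, so this is one plausible reading.

```python
import numpy as np

def temperature_calibrate(logits, temperature=1.5):
    """Temperature-calibrated softmax for one verification run
    (the temperature value is illustrative)."""
    z = logits / temperature
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def inverse_variance_fuse(prob_list):
    """Inverse-variance weighted fusion of repeated in-window verifications.

    Each run is weighted by the reciprocal of the variance of its probability
    distribution (plus a small epsilon); the fused distribution then replaces
    the verification distribution when the comprehensive acceptance is
    recomputed, as claim 8 describes."""
    probs = np.stack(prob_list)                      # (runs, classes)
    weights = 1.0 / (probs.var(axis=1) + 1e-6)       # one weight per run
    fused = (weights[:, None] * probs).sum(axis=0) / weights.sum()
    return fused / fused.sum()                       # renormalize

runs = [temperature_calibrate(np.array([0.2, 1.8, 0.5])),
        temperature_calibrate(np.array([0.1, 2.1, 0.3]))]
fused = inverse_variance_fuse(runs)
```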

Description

Method for identifying alien invasive species based on unmanned aerial vehicle and hyperspectral multi-modal data

Technical Field

The invention relates to the technical fields of remote sensing and computer vision, in particular to UAV-borne sensing and hyperspectral imaging, image processing and pattern recognition, deep-learning multi-modal fusion and edge intelligent inference, and specifically to a method for identifying alien invasive species based on UAV and hyperspectral multi-modal data.

Background

Alien invasive species propagate quickly, spread widely and are costly to control within an ecosystem, so early discovery and rapid identification are key to reducing ecological and economic losses. Traditional ground inspection relies on manual visual checking and suffers from limited coverage, low efficiency and strong subjectivity. With the development of UAV platforms and optical sensors, low-altitude remote sensing has become the trend for large-scale rapid inspection: hyperspectral imaging can distinguish species in fine-grained spectral dimensions, but it produces large data volumes and is sensitive to illumination and attitude, so it is usually combined with the spatial texture information of visible-light images to achieve stable identification against complex backgrounds. The prior art generally classifies with convolutional neural networks on visible-light images alone, computes vegetation indices or specific-band features from multispectral/hyperspectral data for threshold judgment, performs pixel-level classification after dimensionality reduction of the hyperspectral data (e.g. principal components or band selection), or uploads the data to the cloud for centralized inference and comparison. These approaches have the following limitations. First, single-modality methods are sensitive to illumination changes, seasonal and weather differences and background interference, and easily run into the contradiction of insufficient inter-class separability and excessive intra-class variation. Second, hyperspectral data are strongly affected by attitude, radiometric and geometric distortion; pixel mixing and striping make spectral features unstable, and a simple index threshold or global threshold is hard to adapt to different land covers and time periods. Third, cloud-based inference struggles to meet device-side real-time and bandwidth-cost requirements and lacks the capability for robust handling and on-site closed-loop updating of uncertain samples. Fourth, existing multi-modal fusion mostly stops at a loosely coupled pipeline of up-front dimensionality reduction followed by back-end classification; spatial and spectral information is not sufficiently aligned in a common feature space, so generalization is insufficient under occlusion, backlighting, shading and long-tail species, and, because training samples are imbalanced and labeling is costly, existing methods lack an engineering mechanism for adaptively updating thresholds and the sample library during deployment.
Therefore, there is a need for an alien-invasive-species identification method that cooperatively exploits multi-modal data such as visible-light and hyperspectral imagery on a UAV platform, provides radiometric/geometric consistency correction and joint spatial-spectral modeling, achieves efficient and robust inference on the device side, and manages uncertain samples in a closed loop with online evolution, so as to improve identification accuracy, real-time performance and operability in complex environments.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a method for identifying alien invasive species based on UAV and hyperspectral multi-modal data, so as to solve the above problems. The aim of the invention is achieved by the following technical scheme: a method for identifying alien invasive species based on a mobile-end lightweight network, applied to a UAV processor or edge computing device, comprising the following steps: S1, training and deployment preparation: initializing a lightweight convolutional backbone with pre-trained weights from a public dataset, and performing transfer-learning training on task data to obtain the parameters of a draft decision head and a verification decision head and a device-side deployment package; S2, acquisition and preprocessing: acquiring images of a target area, writing the acquired images into a ring buffer, performing color correction based on the shooting time, exposure information and white-balance information in the image metadata, and performing long-side scaling, denoising and pixel normalization to obtain preprocessed images
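A sketch of the S2 acquisition-and-preprocessing step, assuming an OpenCV/NumPy implementation; the ring-buffer depth, the per-channel white-balance gain model, the target long side and the Gaussian denoising kernel are illustrative choices, not values fixed by the patent.

```python
from collections import deque

import cv2
import numpy as np

RING_CAPACITY = 8                                  # illustrative ring-buffer depth
ring_buffer = deque(maxlen=RING_CAPACITY)          # stands in for the annular cache

def preprocess(frame_bgr, metadata, long_side=640):
    """S2 sketch: buffer the frame, correct color from metadata, scale the long
    side, denoise and normalize. All constants and the gain model are assumed."""
    ring_buffer.append((metadata.get("timestamp"), frame_bgr))

    # Per-channel gain correction derived from white-balance metadata
    # (the patent only says color correction uses the metadata fields).
    gains = np.asarray(metadata.get("wb_gains", (1.0, 1.0, 1.0)), dtype=np.float32)
    img = np.clip(frame_bgr.astype(np.float32) * gains, 0.0, 255.0)

    # Long-side scaling with preserved aspect ratio.
    h, w = img.shape[:2]
    s = long_side / max(h, w)
    img = cv2.resize(img, (int(round(w * s)), int(round(h * s))),
                     interpolation=cv2.INTER_AREA)

    # Light spatial denoising, then pixel normalization to [0, 1].
    img = cv2.GaussianBlur(img, (3, 3), 0)
    return img / 255.0
```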