CN-122024117-A - Unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention
Abstract
The invention provides an unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention, belonging to the technical field of computer vision and intelligent agricultural monitoring. The method comprises: collecting RGB images of a tea garden with an unmanned aerial vehicle and applying adaptive enhancement; constructing a lightweight detection network embedded with a multidimensional receptive field attention module, which extracts multi-scale spatial features through parallel dilated (atrous) convolutions, extracts frequency-domain features through the discrete cosine transform, and generates spatial and channel attention weights for feature reconstruction; automatically mining hard samples based on SLIC superpixel segmentation and feature clustering; introducing an adaptively weighted multi-task loss function that incorporates an attention guidance loss to improve the network's ability to focus on disease regions; and finally generating a visual detection result annotated with the bounding box, category, and confidence of each disease lesion. The method addresses inaccurate extraction of tea tree disease features and insufficient learning of hard samples against complex backgrounds, and achieves accurate localization and visual analysis of tea tree diseases.
Inventors
- MEI RONGFANG
- ZHANG XINYI
- DENG PANGBAI
- LI QI
- ZHU JUN
- HU RONG
- CAO YAOYUN
- WANG XU
- ZHANG XIA
- CAI BAICHUAN
- TANG YUFENG
- YU QINGCHUAN
- QU CHAOYANG
- LIU MINGXUAN
- LIU XIANCHAO
- CHEN DAYUN
Assignees
- Yibin Vocational and Technical College (宜宾职业技术学院)
- Yibin Jiabo Intelligent Technology Co., Ltd. (宜宾嘉博智能科技有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20260415
Claims (8)
- 1. An unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention, characterized by comprising the following steps: Step S1, acquiring a plurality of RGB images covering different areas of a tea garden with an unmanned aerial vehicle carrying a visible-light camera; Step S2, preprocessing the RGB images of step S1 into a plurality of sub-images of uniform size, and annotating bounding boxes of disease areas and disease categories on the sub-images to construct a training data set; Step S3, constructing a single-stage object detection network as a baseline detection model, and embedding a multidimensional receptive field attention module into the backbone network and the feature fusion network of the baseline detection model to form a multidimensional detection network; Step S4, performing initial training of the multidimensional detection network of step S3 using the training data set; Step S5, performing superpixel segmentation on the training images using the features extracted by the multidimensional detection network trained in step S4, to generate homogeneous regions; Step S6, performing end-to-end reinforcement training of the multidimensional detection network trained in step S4 using the hard sample set and an adaptively weighted multi-task loss function; and Step S7, preprocessing an image to be detected as in step S2, inputting it into the detection network trained in step S6, and outputting a disease detection result.
- 2. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 1, wherein in step S3 the multidimensional receptive field attention module processes the input feature map F by executing steps S31 and S32 separately and then step S33: Step S31, calculating a multi-scale spatial attention weight matrix $M_s$, comprising steps S311-S313: Step S311, processing the input feature map F through a group of parallel dilated convolution layers with different dilation rates to obtain multi-scale feature maps; Step S312, concatenating and fusing the multi-scale feature maps to obtain a fused feature $F_{ms}$; Step S313, applying a 1x1 convolution to the fused feature $F_{ms}$ and then a Sigmoid activation function to generate the multi-scale spatial attention weight $M_s$; Step S32, calculating a frequency-color channel attention weight matrix $M_c$, comprising steps S321-S324: Step S321, applying global average pooling and global max pooling to the input feature map F to obtain a global average pooling feature vector and a global max pooling feature vector; Step S322, applying a discrete cosine transform to the input feature map F and extracting the L×L low-frequency coefficients of each block to obtain a frequency-domain feature vector $F_{freq}$; Step S323, concatenating the global average pooling feature vector, the global max pooling feature vector, and the frequency-domain feature vector $F_{freq}$; Step S324, passing the concatenated feature vector through a fully connected layer and then a Sigmoid activation function to generate the frequency-color channel attention weight $M_c$; and Step S33, performing feature reconstruction: the input feature map F is multiplied element-wise by the frequency-color channel attention weight $M_c$ and then multiplied element-wise by the multi-scale spatial attention weight $M_s$ to obtain the output feature map $F_1$: $F_1 = (F \odot M_c) \odot M_s$, where $\odot$ denotes element-wise multiplication.
- 3. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 2, wherein in step S322 the frequency-domain feature vector $F_{freq}$ is obtained by dividing the input feature map F, in the spatial dimensions, into non-overlapping M×M blocks, applying a discrete cosine transform to each block, extracting the L×L low-frequency coefficients in the upper-left corner of each block, and reassembling these low-frequency coefficients into the frequency-domain feature vector $F_{freq}$, wherein M and L are positive integers, M is an integer power of 2 between 4 and 32, and L is smaller than M.
- 4. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 2, wherein in step S311 the group of dilated convolution layers with different dilation rates comprises three parallel 3x3 dilated convolution layers whose dilation rates are 1, 3, and 5 respectively.
- 5. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 1, wherein in step S5 potential hard-example regions are automatically identified by a clustering algorithm, specifically: the feature vectors of all superpixel regions are clustered by the clustering algorithm; a hard-example score is calculated from the distance between each feature vector and the healthy-leaf cluster center and its distances to the known disease cluster centers; and the N superpixel regions with the highest scores are selected as the potential hard-example regions. The hard-example score S is calculated as: $S = \alpha \times D_h + (1-\alpha) \times \min(D_{d_1}, D_{d_2}, \ldots, D_{d_{c-1}})$, where S is the hard-example score, a higher value indicating that the region is more likely to be a hard example; $\alpha$ is a configurable weight coefficient with $0 \le \alpha \le 1$; $D_h$ is the Euclidean distance from the feature vector of the superpixel region under evaluation to the healthy-leaf cluster center; $D_{d_1}, D_{d_2}, \ldots, D_{d_{c-1}}$ are the Euclidean distances from that feature vector to the cluster centers of the 1st to (c-1)-th known disease classes; and c is the total number of classes.
- 6. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 1, wherein in step S5 the superpixel segmentation is implemented by the simple linear iterative clustering (SLIC) algorithm.
- 7. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 1, wherein in step S6 the adaptively weighted multi-task loss function $L_{total}$ is formed as follows: $L_{total} = L_{det} + \beta \times L_{att}$, where $L_{det}$ is the baseline detection loss, comprising a bounding box regression loss, a classification loss, and a confidence loss, and $L_{att}$ is the attention guidance loss, defined as: $L_{att} = \frac{1}{N} \sum_{i=1}^{N} \left(1 - \mathrm{Mean}\left(M_s^{(i)} \odot M_c^{(i)}\right)\right) \cdot IoU_{gt}^{(i)}$, where N is the number of positive samples in the batch; $M_s^{(i)}$ and $M_c^{(i)}$ are the blocks cropped at the target box position of the i-th positive sample from the multi-scale spatial attention weight matrix and the frequency-color channel attention weight matrix respectively; Mean() denotes the average over the spatial dimensions; and $IoU_{gt}^{(i)}$ is the intersection over union between the i-th predicted box, computed with the current model parameters, and its matched ground-truth box. $\beta$ is an adaptive weight coefficient whose initial value lies between 0 and 1 and which is dynamically recalculated in subsequent training batches according to: $\beta = \gamma \times \tanh\left(\frac{L_{det}}{s \times L_{att} + \varepsilon}\right)$, where $\gamma$ is a hyperparameter, s is a scaling factor that balances the magnitudes of the two losses, and $\varepsilon$ is a smoothing term that prevents division by zero.
- 8. The unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention as recited in claim 2, wherein the multidimensional receptive field attention module is embedded after the last two residual feature fusion modules in the backbone network, at the junction following the up-sampling module and preceding the convolution module in the feature fusion network, and after the last convolution module of the feature fusion network.
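The feature reconstruction of claim 2 (step S33) and the pooling of step S321 can be illustrated with a minimal sketch in plain Python. The claims do not state the tensor shapes, so the sketch assumes a C×H×W feature map, a length-C channel weight vector for M_c, and a single-channel H×W map for M_s, with the weights broadcast over the missing dimensions. The function names and the sample values are hypothetical.

```python
def global_avg_pool(feature):
    """Per-channel mean over all spatial positions (step S321, average branch)."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]


def global_max_pool(feature):
    """Per-channel maximum over all spatial positions (step S321, max branch)."""
    return [max(map(max, ch)) for ch in feature]


def reconstruct(feature, channel_weights, spatial_weights):
    """Compute F1 = (F * M_c) * M_s element-wise with broadcasting.

    feature:         C x H x W nested lists (F)
    channel_weights: length-C list (M_c), one weight per channel (assumed shape)
    spatial_weights: H x W nested lists (M_s), shared by all channels (assumed shape)
    """
    return [
        [
            [(value * m_c) * spatial_weights[y][x] for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for channel, m_c in zip(feature, channel_weights)
    ]


# Illustrative values only: 2 channels, 2 x 2 spatial size.
feature = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
m_c = [0.5, 1.0]                  # channel attention weights in [0, 1]
m_s = [[1.0, 0.0], [0.5, 0.25]]   # spatial attention weights in [0, 1]
f1 = reconstruct(feature, m_c, m_s)
```

Because both weights lie in [0, 1] after the Sigmoid, the reconstruction can only attenuate activations; a spatial weight of 0 suppresses that position in every channel.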
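The block-wise frequency feature extraction of claim 3 might look like the following sketch for a single channel. Several points are assumptions, since the claim leaves them open: M×M is read as the block size, the transform is taken to be an orthonormal 2-D DCT-II, and the retained coefficients are simply concatenated block by block. The helper names `dct2_block` and `low_freq_vector` are hypothetical.

```python
import math


def dct2_block(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    m = len(block)

    def scale(k):
        return math.sqrt(1.0 / m) if k == 0 else math.sqrt(2.0 / m)

    out = [[0.0] * m for _ in range(m)]
    for u in range(m):
        for v in range(m):
            total = 0.0
            for y in range(m):
                for x in range(m):
                    total += (
                        block[y][x]
                        * math.cos(math.pi * (2 * y + 1) * u / (2 * m))
                        * math.cos(math.pi * (2 * x + 1) * v / (2 * m))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


def low_freq_vector(channel, m, l):
    """Split an H x W channel into non-overlapping m x m blocks and keep the
    top-left l x l DCT coefficients of each block, concatenated in block order."""
    h, w = len(channel), len(channel[0])
    if h % m or w % m:
        raise ValueError("spatial size must be a multiple of the block size")
    if not 0 < l < m:
        raise ValueError("l must satisfy 0 < l < m")
    vec = []
    for by in range(0, h, m):
        for bx in range(0, w, m):
            block = [row[bx:bx + m] for row in channel[by:by + m]]
            coeffs = dct2_block(block)
            vec.extend(coeffs[u][v] for u in range(l) for v in range(l))
    return vec


# A constant 8 x 8 channel: all energy sits in the DC term of each 4 x 4 block.
flat_channel = [[2.0] * 8 for _ in range(8)]
f_freq = low_freq_vector(flat_channel, m=4, l=2)
```

For a constant block, only the DC coefficient is non-zero, which is why low-frequency coefficients summarise smooth color and coarse texture while discarding fine detail.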
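The dilation rates in claim 4 determine how far each parallel branch looks. The short sketch below computes the span of one 3x3 kernel at each rate using the standard relation k_eff = d × (k − 1) + 1. It also computes the zero padding that would keep the branch outputs the same size; stride 1 and equal output sizes are assumptions here, made so that the maps can be concatenated in step S312.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in pixels, covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that preserves spatial size at stride 1 (odd kernels)."""
    return dilation * (kernel_size - 1) // 2


DILATION_RATES = (1, 3, 5)  # rates named in claim 4
spans = [effective_kernel_size(3, d) for d in DILATION_RATES]
pads = [same_padding(3, d) for d in DILATION_RATES]
```

With these rates the three branches cover 3, 7, and 11 pixels per axis while each still uses nine weights per kernel.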
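The hard-example score of claim 5 can be computed directly from its formula. The sketch below is a minimal illustration, assuming the cluster centers are already available; the clustering itself is not shown. The function names, the default α of 0.5, and the sample vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_example_score(feature, healthy_center, disease_centers, alpha=0.5):
    """S = alpha * D_h + (1 - alpha) * min(D_d1, ..., D_d[c-1])."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    d_h = euclidean(feature, healthy_center)
    d_d = min(euclidean(feature, center) for center in disease_centers)
    return alpha * d_h + (1.0 - alpha) * d_d


def select_hard_regions(features, healthy_center, disease_centers, n, alpha=0.5):
    """Indices of the n superpixel regions with the highest scores."""
    scored = sorted(
        ((hard_example_score(f, healthy_center, disease_centers, alpha), i)
         for i, f in enumerate(features)),
        reverse=True,
    )
    return [i for _, i in scored[:n]]


# Illustrative 2-D feature vectors for three superpixel regions.
healthy = (0.0, 0.0)
diseases = [(3.0, 0.0), (10.0, 10.0)]
regions = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
score = hard_example_score(regions[1], healthy, diseases, alpha=0.5)
top = select_hard_regions(regions, healthy, diseases, n=1)
```

A region scores highest when it lies far from the healthy-leaf center and also far from every known disease center, that is, when it fits none of the existing clusters well.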
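The loss of claim 7 can be traced numerically with scalar values. In this sketch, `mean_attention[i]` stands for the already computed Mean(M_s^(i) ⊙ M_c^(i)) of the i-th positive sample; cropping the attention maps to the target box is not shown. Returning 0 for an empty batch and the default values of γ, s, and ε are assumptions, as the claim does not specify them.

```python
import math


def attention_guidance_loss(mean_attention, ious):
    """L_att = (1/N) * sum_i (1 - mean_attention[i]) * ious[i]."""
    n = len(mean_attention)
    if n == 0:
        return 0.0  # assumption: no positive samples contributes no loss
    return sum((1.0 - a) * iou for a, iou in zip(mean_attention, ious)) / n


def adaptive_beta(l_det, l_att, gamma=1.0, s=1.0, eps=1e-6):
    """beta = gamma * tanh(L_det / (s * L_att + eps))."""
    return gamma * math.tanh(l_det / (s * l_att + eps))


def total_loss(l_det, l_att, beta):
    """L_total = L_det + beta * L_att."""
    return l_det + beta * l_att


# Illustrative batch with two positive samples.
l_att = attention_guidance_loss(mean_attention=[0.8, 0.5], ious=[0.5, 1.0])
l_det = 1.2
beta = adaptive_beta(l_det, l_att)
l_total = total_loss(l_det, l_att, beta)
```

Because tanh saturates at 1, β stays below γ however large the ratio of detection loss to attention loss becomes. In an automatic differentiation framework one would also have to decide whether gradients flow through β; the claim does not say.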
Description
Unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention
Technical Field
The invention relates to the technical field of computer vision and intelligent agricultural monitoring, and in particular to an unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention.
Background
Early detection and accurate localization of tea tree diseases are key to precise prevention and control and to green management of tea gardens, and directly affect the control of disease spread, the reduction and more efficient use of pesticides, and the assurance of tea yield and quality. However, in large-scale tea gardens, the particular viewing angle of unmanned aerial vehicle remote sensing and the complexity of the field environment challenge the adaptability of general-purpose models. In particular, when disease lesions against a complex canopy background are tiny, irregular in shape, or barely different in color from healthy leaves, or when abrupt illumination changes, overlapping and occluding leaves, or shadow interference are present, a model may suffer reduced detection confidence, missed small targets, or false detections in complex backgrounds, which can compromise the accuracy of practical prevention and control decisions.
To address these problems, the prior art mainly adopts two approaches. The first is large-scale data retraining: the whole model is retrained end to end after massive field-scene data have been collected and annotated. This approach has a long cycle and high annotation cost and places very high demands on computing resources. The second improves model robustness through general strategies such as data augmentation. This brings some improvement but often lacks targeted perception of the specific color, texture, and frequency-domain features of disease areas, and has limited ability to distinguish hard samples (such as early lesions similar in color to healthy leaves, or small lesions of varying scale) against a complex background. In addition, existing methods rely on spatial-domain features and cannot effectively fuse frequency-domain information sensitive to disease texture with color-space features sensitive to early color degradation, so the feature representation capability of the model is insufficient. A technical solution is therefore needed that specifically enhances the model's multidimensional feature perception and discrimination of disease areas in complex tea garden scenes, improves detection accuracy while effectively focusing the model on hard samples, and achieves efficient and accurate unmanned aerial vehicle remote sensing disease detection.
Disclosure of Invention
The main purpose of the invention is to overcome the shortcomings and deficiencies of the prior art and to provide an unmanned aerial vehicle tea tree disease detection method based on multidimensional receptive field attention.
The method is based on a multidimensional receptive field attention mechanism and adaptive hard-example mining. It constructs a detection network with strong perception of disease areas by fusing spatial, color, and frequency-domain features, and improves the robustness and detection accuracy of the model in real scenes by means of an attention guidance loss and a hard-example reinforcement training mechanism. The invention proposes a multidimensional receptive field attention module and an adaptive hard-example learning framework for unmanned aerial vehicle remote sensing images. By constructing parallel multi-scale spatial attention and frequency-color channel attention paths, the model can focus on disease lesions of different sizes and is sensitive to key features such as color degradation and texture variation, which enhances its feature discrimination against a complex canopy background. Meanwhile, through an automatic hard-sample mining mechanism based on superpixel segmentation and clustering, together with an adaptively weighted multi-task loss that couples the attention maps with detection performance, targeted reinforcement training on hard samples is realized and the generalization of the model to various complex scenes is improved. The invention also provides an end-to-end optimization flow of feature extraction, hard-example mining, and reinforcement training. First, a multidimensional receptive field attention module is embedded in the backbone network and the feature fusion network, so that the weighted reconstruction of input features in space and channe