CN-122023917-A - Wheat stripe rust monitoring method based on multisource satellite remote sensing and interpretable stacking integrated learning
Abstract
The invention relates to a wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning. The method comprises: acquiring a Sentinel-2 optical remote sensing image and a Sentinel-1 synthetic aperture radar image of a study area, together with contemporaneous ground field survey data of wheat stripe rust; preprocessing the data; constructing an initial feature set; screening an optimal feature combination; and constructing and training a stacked ensemble learning monitoring model to obtain a wheat stripe rust monitoring classification result. The invention fuses the optical spectral and texture information of Sentinel-2 with the radar polarization information of Sentinel-1 to realize all-weather, three-dimensional monitoring of wheat stripe rust, and markedly improves the generalization capability and robustness of the model. The monitoring accuracy reaches 89.0%, which is superior to that of a single machine learning model. The contribution of each feature to the decision can be quantitatively explained, which enhances the credibility and scientific soundness of the monitoring results in practical agricultural applications.
Inventors
- HUANG LINSHENG
- WU KEYANG
- RUAN CHAO
- ZHAO JINLING
- PANG DENGHAO
Assignees
- Anhui University (安徽大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (6)
- 1. A wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning, characterized by comprising the following steps in sequence: (1) data acquisition and preprocessing: acquiring a Sentinel-2 optical remote sensing image and a Sentinel-1 synthetic aperture radar image of a study area, together with contemporaneous ground field survey data of wheat stripe rust, and preprocessing them to obtain preprocessed data; (2) multi-source feature extraction: extracting vegetation indexes and texture features from the preprocessed Sentinel-2 optical remote sensing image, and polarization indexes from the preprocessed Sentinel-1 synthetic aperture radar image, to construct an initial feature set; (3) optimal feature selection: screening an optimal feature combination from the initial feature set by a strategy that combines the Relief algorithm, the minimum redundancy maximum relevance (mRMR) algorithm and the sequential forward selection (SFS) algorithm; (4) constructing and training a stacked ensemble learning monitoring model, and inputting the optimal feature combination into the trained stacked ensemble learning monitoring model to obtain a wheat stripe rust monitoring classification result; (5) introducing the SHAP framework to analyze the decision basis of the stacked ensemble learning monitoring model.
- 2. The wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning of claim 1, wherein the step (1) specifically comprises the following steps: (1a) collecting and preprocessing field survey data: using a five-point sampling method, surveying disease infection on 10 m × 10 m plots located within 20 m × 20 m areas of relatively uniform disease occurrence; for each quadrat, randomly selecting 10 wheat plants within a 1 m × 1 m area to record the disease index DI, and recording the center longitude and latitude at the same time; calculating the disease index DI and dividing the samples into two categories, healthy (DI < 5) and diseased (DI ≥ 5), which serve as the model training labels, wherein the calculation of the disease index DI involves the following quantities: the incidence of stripe rust, i.e., the number of infected leaves as a percentage of the total number of leaves; the average severity; the leaf severity grade; the number of leaves at each severity grade; and the total number of leaves investigated; (1b) collecting and preprocessing the optical remote sensing image and the synthetic aperture radar image: performing cloud removal and atmospheric correction on the Sentinel-2 optical remote sensing image, and performing thermal noise removal, radiometric calibration, terrain correction and speckle filtering on the Sentinel-1 synthetic aperture radar image to generate a backscattering coefficient image.
- 3. The wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning of claim 1, wherein the step (2) specifically comprises the following steps: (2a) extracting 21 vegetation indexes VIs for each sample point in the sample set formed from the wheat stripe rust field survey data, namely extracting from the Sentinel-2 optical remote sensing image the atmospherically resistant vegetation index ARVI, the normalized difference vegetation index NDVI, the structure insensitive pigment index SIPI, the difference vegetation index DVI, the triangular vegetation index TVI, the normalized difference greenness index NDGI, the enhanced vegetation index EVI, the optimized soil adjusted vegetation index OSAVI, the renormalized difference vegetation index RDVI, the modified simple ratio index MSR, the plant senescence reflectance index PSRI, the red-edge 1 plant senescence reflectance index PSRIre1, the red-edge 2 plant senescence reflectance index PSRIre2, the red-edge 3 plant senescence reflectance index PSRIre3, the red-edge 1 normalized difference vegetation index NDVIre1, the red-edge 2 normalized difference vegetation index NDVIre2, the red-edge 3 normalized difference vegetation index NDVIre3, the normalized red-edge 1 difference index NREDI1, the normalized red-edge 2 difference index NREDI2, the normalized red-edge 3 difference index NREDI3 and the red-edge disease stress index REDSI; (2b) extracting texture features TFs of the sample points: using the gray level co-occurrence matrix algorithm, selecting 11 effective bands of the Sentinel-2 optical remote sensing image and calculating, for each band, 8 statistics, namely mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation, to construct 88 texture features; (2c) extracting polarization indexes PIs of the sample points: extracting the backscattering coefficients of the VV and VH polarization modes from the Sentinel-1 synthetic aperture radar image, and constructing the polarization indexes through mathematical transformation.
- 4. The wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning of claim 1, wherein the step (3) specifically comprises the following steps: (3a) preliminary screening: calculating the weights of the vegetation indexes, texture features and polarization indexes with the Relief algorithm, and removing low-weight features using the median weight as the threshold to obtain a candidate feature set; (3b) redundancy removal: calculating the mutual information quotient (MIQ) value of each feature in the candidate feature set with the minimum redundancy maximum relevance (mRMR) algorithm, ranking the features by MIQ value, and removing highly redundant features; (3c) determination of the optimal feature combination: based on the sequential forward selection (SFS) algorithm, accumulating features one by one in MIQ ranking order, inputting them into a classifier, and determining the optimal feature combination for final modeling according to the accuracy of the classifier; specifically, constructing progressively expanding nested feature subsets based on the descending MIQ ranking obtained by the mRMR algorithm, wherein the first-ranked feature forms the first feature subset, the second-ranked feature is then added to obtain the second feature subset, and so on, so that the k-th feature subset consists of the k top-ranked features; each feature subset is used as input to the stacked ensemble learning monitoring model for 5-fold cross-validation on the sample set, the average classification performance obtained by the cross-validation is taken as the evaluation value of that subset, and the feature subset with the best evaluation value is taken as the optimal feature combination for final modeling.
- 5. The wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning of claim 1, wherein the step (4) specifically comprises the following steps: (4a) constructing base learners and generating metadata: (4a1) selecting five machine learning models with marked diversity, namely random forest, extreme gradient boosting, light gradient boosting machine, support vector machine and adaptive boosting, as base learners; (4a2) training each base learner by five-fold cross-validation, using the label of the wheat stripe rust field survey data, namely the disease index DI, wherein the sample set is divided into 5 subsets, and in each round 1 subset is selected in turn as the validation set and the remaining 4 subsets as the training set, the process being repeated 5 times so that each subset is used for validation exactly once; (4a3) obtaining the prediction output of the base learners: for each base learner, concatenating the predicted probability values of the validation sets obtained from the 5 rounds of cross-validation to generate one prediction vector whose length equals the number of samples, so that the 5 base learners generate 5 prediction vectors in total; (4b) constructing a meta-learner and outputting the monitoring result: (4b1) constructing the input data set of the meta-learner: stacking the 5 prediction vectors column-wise to form a new feature matrix with one row per sample and one column per base learner, which serves as the input data of the meta-learner; (4b2) selecting the categorical boosting model CatBoost as the meta-learner, and training the CatBoost model on the new feature matrix so that it learns the prediction biases of the base learners and an error correction mechanism, to obtain a trained CatBoost model; (4b3) outputting the final wheat stripe rust monitoring classification result, namely healthy or diseased, using the trained CatBoost model.
- 6. The wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning of claim 1, wherein the step (5) specifically comprises the following steps: (5a) calculating the SHAP value of each feature to quantify the marginal contribution of the feature to the model prediction result; (5b) generating a feature importance bar chart and a SHAP summary plot, analyzing the positive and negative correlations between the features and the probability of stripe rust occurrence, verifying the consistency of the model decision logic with the crop pathology mechanism, and outputting a visualized disease monitoring distribution map.
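As a non-limiting illustration of steps (4a2) to (4b1) of claim 5, the out-of-fold procedure may be sketched in Python as follows. The function names, the toy single-feature centroid learners and the synthetic data are assumptions introduced purely for illustration and do not appear in the claims; in practice the five base learners named in step (4a1) would take the place of the toy learners.

```python
import random
from typing import Callable, List

Vector = List[float]
# Hypothetical interface: a learner takes training rows, training labels and
# the rows to score, and returns one "diseased" probability per scored row.
Learner = Callable[[List[Vector], List[int], List[Vector]], List[float]]


def make_folds(n_samples: int, n_folds: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and deal them into n_folds disjoint subsets."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[k::n_folds] for k in range(n_folds)]


def out_of_fold_vector(learner: Learner, X: List[Vector], y: List[int],
                       folds: List[List[int]]) -> List[float]:
    """Steps (4a2)-(4a3): each subset is the validation set exactly once, and
    its predictions are written back into one vector of length len(X)."""
    oof = [0.0] * len(X)
    for val_idx in folds:
        held_out = set(val_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        probs = learner([X[i] for i in train_idx],
                        [y[i] for i in train_idx],
                        [X[i] for i in val_idx])
        for i, p in zip(val_idx, probs):
            oof[i] = p
    return oof


def build_meta_features(learners: List[Learner], X: List[Vector], y: List[int],
                        n_folds: int = 5, seed: int = 0) -> List[Vector]:
    """Step (4b1): stack one out-of-fold vector per base learner column-wise."""
    folds = make_folds(len(X), n_folds, seed)
    columns = [out_of_fold_vector(learner, X, y, folds) for learner in learners]
    return [list(row) for row in zip(*columns)]


def make_centroid_learner(feature: int) -> Learner:
    """Toy stand-in for a real base learner (assumption, not from the patent):
    scores a row by its relative distance to the two class means of one feature."""
    def learner(X_train, y_train, X_score):
        healthy = [r[feature] for r, lab in zip(X_train, y_train) if lab == 0]
        diseased = [r[feature] for r, lab in zip(X_train, y_train) if lab == 1]
        mu0 = sum(healthy) / len(healthy)
        mu1 = sum(diseased) / len(diseased)
        probs = []
        for row in X_score:
            d0, d1 = abs(row[feature] - mu0), abs(row[feature] - mu1)
            probs.append(0.5 if d0 + d1 == 0 else d0 / (d0 + d1))
        return probs
    return learner


# Synthetic data: 10 "healthy" (label 0) and 10 "diseased" (label 1) samples.
rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(2.0 * label, 1.0) for _ in range(5)] for label in y]

learners = [make_centroid_learner(j) for j in range(5)]
meta = build_meta_features(learners, X, y)
print(len(meta), len(meta[0]))  # 20 rows (samples), 5 columns (base learners)
```

Because every row of the resulting matrix is predicted by models that never saw that sample during training, a meta-learner such as the CatBoost model of step (4b2) can be trained on it without leaking the training labels.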
Description
Technical Field
The invention relates to the technical fields of agricultural remote sensing monitoring and smart agriculture, in particular to a wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning.
Background
Wheat stripe rust is one of the most harmful and difficult-to-control fungal diseases in the world; its occurrence can greatly reduce wheat yield and thereby seriously threaten food security. Achieving large-scale, high-precision disease monitoring and prediction is therefore of great significance for guiding large-area agricultural production and precise prevention and control. Traditional crop disease identification relies mainly on manual field surveys, which are accurate but time-consuming and labor-intensive, and can hardly meet the timeliness requirements of large-area monitoring. With the development of remote sensing technology, monitoring crop diseases with satellite images has become a research hotspot. Current mainstream methods rely on vegetation indexes and texture features extracted from optical satellite images to characterize crop growth. However, existing monitoring methods have the following limitations. First, optical remote sensing is highly susceptible to weather conditions: cloud cover often causes optical data to be missing during the critical crop growth season, so that the optimal window for disease monitoring is missed. Second, optical features alone mainly reflect pigment and texture changes on the canopy surface, and can hardly capture the subtle changes in the internal structure and moisture content of crops caused by disease.
Although synthetic aperture radar offers all-weather observation and is sensitive to canopy structure, its application in disease monitoring still lags behind, and a single data source can rarely characterize complex disease stress comprehensively. In addition, in terms of data processing algorithms, most existing studies use a single machine learning model such as random forest or support vector machine. Because classification scenarios and vegetation types differ, a single model often struggles to maintain stable generalization and is prone to overfitting. More importantly, conventional machine learning models are usually regarded as "black box" systems whose decision process lacks interpretability, so users can hardly understand which features a model bases its decisions on, which reduces the reliability of monitoring results in actual agricultural decision-making. There is therefore a need for a wheat stripe rust monitoring method that integrates optical and radar multi-source features, generalizes well, and offers decision transparency.
Disclosure of Invention
To solve the problems in the prior art of insufficient optical information, optical data loss, and the performance bottleneck, limited generalization and insufficient robustness of a single model, the primary aim of the invention is to provide a wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning, which fuses the optical spectral and texture information of Sentinel-2 with the radar polarization information of Sentinel-1 and markedly improves the generalization capability and robustness of the model.
To achieve this aim, the invention adopts the following technical scheme: a wheat stripe rust monitoring method based on multi-source satellite remote sensing and interpretable stacked ensemble learning, comprising the following steps in sequence: (1) data acquisition and preprocessing: acquiring a Sentinel-2 optical remote sensing image and a Sentinel-1 synthetic aperture radar image of a study area, together with contemporaneous ground field survey data of wheat stripe rust, and preprocessing them to obtain preprocessed data; (2) multi-source feature extraction: extracting vegetation indexes and texture features from the preprocessed Sentinel-2 optical remote sensing image, and polarization indexes from the preprocessed Sentinel-1 synthetic aperture radar image, to construct an initial feature set; (3) optimal feature selection: screening an optimal feature combination from the initial feature set by a strategy that combines the Relief algorithm, the minimum redundancy maximum relevance (mRMR) algorithm and the sequential forward selection (SFS) algorithm; (4) constructing and training a stacked ensemble learning monitoring model, and inputting the optimal feature combination into the trained stacking inte