CN-121982561-A - Regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features
Abstract
The invention provides a regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features, and relates to the technical field of agricultural remote sensing monitoring. The method comprises: first, generating pseudo-absence points by a buffer-zone exclusion method and constructing a monitoring data set with balanced positive and negative samples; second, combining multi-source data for the key wheat growth periods, such as meteorological, soil, topographic, and remote sensing vegetation index data, to construct an initial feature set reflecting dynamic environmental change; third, screening and dynamically weighting the feature set with a Lasso regression algorithm, so that the optimal feature combination is determined according to the environmental differences between regions; and finally, combining multiple machine learning algorithms through an ensemble learning strategy to construct a monitoring model. The method addresses the difficulty of obtaining negative samples and the poor cross-regional transferability of models in large-scale monitoring, and achieves high-precision dynamic monitoring of areas suitable for wheat rust.
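The buffer-based generation of pseudo-absence points described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the rectangular bounding box standing in for the wheat planting mask, the 1 km spacing between pseudo-absence points, and the sample coordinates are all assumptions made for the example; the 10 km default buffer is borrowed from claim 10.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def generate_pseudo_absences(presences, bbox, buffer_km=10.0,
                             min_spacing_km=1.0, seed=0, max_tries=100_000):
    """Draw as many pseudo-absence points as there are presence points.

    A candidate is kept only if it lies outside every presence buffer and
    farther than min_spacing_km from every pseudo-absence accepted so far.
    bbox = (lat_min, lat_max, lon_min, lon_max) is a hypothetical stand-in
    for the wheat planting distribution range.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    absences = []
    for _ in range(max_tries):
        if len(absences) == len(presences):
            break  # positive and negative samples are balanced
        cand = (rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max))
        if any(haversine_km(cand, p) <= buffer_km for p in presences):
            continue  # inside a buffer: may be an unrecorded infection
        if any(haversine_km(cand, a) <= min_spacing_km for a in absences):
            continue  # too close to an existing pseudo-absence
        absences.append(cand)
    return absences

# Illustrative occurrence points (made-up coordinates).
presences = [(34.30, 108.90), (34.50, 109.20), (34.10, 108.60)]
absences = generate_pseudo_absences(presences, bbox=(33.5, 35.5, 108.0, 110.0))
```

Rejection sampling keeps the sketch short; a production version would presumably sample only inside the wheat mask and might also thin the occurrence points themselves so that every pair of data points respects the spacing threshold.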
Inventors
- YANG YI
- DONG YINGYING
- HUANG WENJIANG
- ZHANG BIYAO
- LIU LINYI
- REN KEHUI
Assignees
- Aerospace Information Research Institute, Chinese Academy of Sciences (中国科学院空天信息创新研究院)
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-31
Claims (10)
- 1. A regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features, characterized by comprising the following steps: acquiring the wheat planting distribution range, wheat rust occurrence point data, and multi-source environmental data of a target area; generating pseudo-absence point data from the wheat rust occurrence point data according to a preset spatial constraint rule, and constructing a monitoring data set containing positive and negative samples; performing time-series segmentation of the multi-source environmental data based on crop phenological period characteristics, to construct an initial feature set reflecting environmental changes in different growth stages; evaluating feature importance on the initial feature set using a regression-analysis screening model, screening out a target feature set associated with regional disease occurrence according to the differences in environmental response among target areas, and determining feature weights; and, based on the target feature set, fusing a plurality of base learners using an ensemble learning strategy to construct a wheat rust monitoring model, and outputting a monitoring result using the model.
- 2. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 1, wherein generating pseudo-absence point data according to a preset spatial constraint rule comprises: constructing a buffer zone centered on each wheat rust occurrence point; randomly generating pseudo-absence points in the region outside the coverage of the buffer zones; and controlling the generated pseudo-absence points so that their number is balanced with the number of wheat rust occurrence points, with the spatial distance between any two data points being greater than a preset distance threshold.
- 3. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 1, wherein performing time-series segmentation of the multi-source environmental data based on crop phenological period characteristics comprises: determining a plurality of key phenological nodes of wheat growth; dividing a plurality of consecutive or overlapping time windows according to the key phenological nodes; computing, within each time window, statistics of the time-varying multi-source environmental data as dynamic monitoring features; and combining the dynamic monitoring features with static monitoring features that do not change over time to form the initial feature set.
- 4. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 3, wherein the key phenological nodes comprise a green-up period, a heading-grain filling period, and a maturity-harvest period, and the multi-source environmental data comprise at least meteorological data, soil data, and remote sensing vegetation index data.
- 5. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 1, wherein evaluating feature importance using a regression-analysis screening model comprises: adopting a linear regression algorithm with a regularization term as the regression-analysis screening model; inputting the initial feature set into the regression-analysis screening model, and shrinking the coefficients of non-key features to zero by adjusting the regularization parameter; and retaining the features whose coefficients are non-zero as the target feature set, and taking the values of the non-zero coefficients as the corresponding feature weights.
- 6. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 5, wherein the linear regression algorithm with a regularization term is a Lasso regression algorithm, and evaluating feature importance using a regression-analysis screening model further comprises: performing stability screening of the features using a leave-one-out cross-validation strategy, and retaining the features whose coefficients are non-zero in at least a preset proportion of the validation years.
- 7. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 1, wherein fusing the plurality of base learners using an ensemble learning strategy specifically comprises: selecting at least two machine learning algorithms based on different principles as base learners, the base learners comprising a decision-tree model, a probabilistic model, or a neural network model; and training the final wheat rust monitoring model using a stacked generalization (Stacking) strategy, in which the outputs of the base learners serve as the inputs of a secondary learner.
- 8. The regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features according to claim 7, wherein the base learners are a combination of at least two of random forest, artificial neural network, gradient boosting, and maximum entropy models.
- 9. A regionally adaptive wheat rust monitoring system using dynamically weighted multi-source features, comprising: a data acquisition module for acquiring the wheat planting distribution range, wheat rust occurrence point data, and multi-source environmental data of a target area; a sample construction module for generating pseudo-absence point data from the wheat rust occurrence point data according to a preset spatial constraint rule and constructing a monitoring data set; a feature engineering module for performing time-series segmentation of the multi-source environmental data based on crop phenological period characteristics to construct an initial feature set; a region-adaptive screening module for evaluating feature importance on the initial feature set using a regression-analysis screening model, screening out a target feature set according to the differences in environmental response among target areas, and determining feature weights; and an ensemble monitoring module for constructing a wheat rust monitoring model by fusing a plurality of base learners using an ensemble learning strategy based on the target feature set.
- 10. The system of claim 9, wherein the sample construction module is configured to generate pseudo-absence points in areas beyond 10 km from the wheat rust occurrence points, and wherein the region-adaptive screening module is configured to use a Lasso regression algorithm to output a differentiated combination of feature weights for each geographical area.
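The Lasso screening with year-wise stability filtering recited in claims 5, 6 and 10 can be illustrated with the sketch below. It is a simplified reading under stated assumptions, not the claimed implementation: the coordinate-descent solver, the function names, the regularization value `lam=0.05`, the 80% retention proportion, the choice to take final weights from a fit on all years, and the toy data are all hypothetical.

```python
import math

def standardise(X):
    """Scale each column to zero mean and unit variance (constant columns -> 0)."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        cols.append([(v - mean) / std if std > 0 else 0.0 for v in col])
    return [[cols[j][i] for j in range(p)] for i in range(n)]

def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by coordinate descent."""
    Xs = standardise(X)
    n, p = len(Xs), len(Xs[0])
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = z = 0.0
            for i in range(n):
                pred_without_j = sum(Xs[i][k] * w[k] for k in range(p) if k != j)
                rho += Xs[i][j] * (yc[i] - pred_without_j)
                z += Xs[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w

def stable_lasso_weights(X, y, years, names, lam=0.05, min_frac=0.8):
    """Keep features that are non-zero in >= min_frac of leave-one-year-out fits."""
    folds = sorted(set(years))
    hits = [0] * len(names)
    for held_out in folds:
        keep = [i for i, yr in enumerate(years) if yr != held_out]
        w = lasso([X[i] for i in keep], [y[i] for i in keep], lam)
        for j, wj in enumerate(w):
            hits[j] += wj != 0.0
    w_all = lasso(X, y, lam)  # weights taken from a fit on all years (assumption)
    return {names[j]: w_all[j] for j in range(len(names))
            if hits[j] / len(folds) >= min_frac and w_all[j] != 0.0}

# Toy data for one region: "humidity" tracks the 0/1 disease label, "noise" does not.
years = [2021] * 4 + [2022] * 4 + [2023] * 4
X = [[1, 1], [4, 1], [2, -1], [3, -1],
     [2, 1], [3, 1], [1, -1], [4, -1],
     [1, 1], [3, -1], [2, -1], [4, 1]]
y = [0, 1, 0, 1] * 3
weights = stable_lasso_weights(X, y, years, ["humidity", "noise"])
```

Running the same function separately on the samples of each geographical area would yield one weight dictionary per area, which is one plausible way to obtain the differentiated feature weight combinations mentioned in claim 10.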
Description
Regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features
Technical Field
The invention belongs to the technical field of agricultural remote sensing monitoring and plant protection. It relates to technology for monitoring crop diseases over large areas by combining remote sensing data, meteorological data, and agronomic knowledge with machine learning algorithms, and in particular to a regionally adaptive wheat rust monitoring method using dynamically weighted multi-source features.
Background
Wheat rust (including stripe rust, leaf rust, and stem rust) is an airborne fungal disease that severely threatens wheat production worldwide. In recent years, with global warming, changing rainfall patterns, and changes in cultivation systems, the occurrence and epidemic patterns of wheat rust have changed markedly: disease hot spots keep shifting, and outbreaks are becoming more frequent and more severe, posing a challenge to global food security. Timely and accurate monitoring of the occurrence and development of wheat rust is therefore important for guiding agricultural production and formulating scientific prevention and control strategies.
Traditional disease monitoring relies mainly on manual field surveys. Although these yield accurate point-level information, they are inefficient, time-consuming, limited in coverage, and labor-intensive. Large-scale real-time monitoring is difficult to achieve, disease occurrences are easily missed, and the needs of modern agriculture for rapid disease early warning cannot be met. With the development of remote sensing technology, monitoring crop diseases over large areas using satellite remote sensing imagery has become a research hotspot.
Existing remote sensing methods for monitoring wheat rust fall mainly into two categories. The first is monitoring based on spectral characteristics, which builds vegetation indices by analyzing changes in the spectral reflectance of the crop canopy under disease stress, or performs identification using hyperspectral data. The second is risk prediction based on environmental factors, which builds disease epidemic models from meteorological data such as temperature, humidity, and precipitation. However, when applied to large-scale, cross-regional wheat rust monitoring, the prior art still has the following major drawbacks.
Existing approaches tend to rely on a single data source. Optical remote sensing data are easily disturbed by cloudy and rainy weather, which causes data gaps, while using meteorological data alone ignores crop growth status and the heterogeneity of the surface environment. An effective mechanism for deeply coupling multi-source data such as meteorology, soil, topography, and vegetation phenology is lacking.
Moreover, the occurrence of wheat rust is influenced jointly by multiple environmental factors, and the dominant driving factors differ markedly between geographic regions. For example, in arid and semi-arid regions moisture may be the primary limiting factor, whereas in humid regions temperature may be more critical. Existing monitoring models are usually trained on data from a specific area (mostly small-scale experimental areas), with fixed feature selection and model parameters. When such a model is transferred directly to regions with different climate types or cropping systems, its accuracy often drops sharply, and it cannot meet monitoring needs at the global or intercontinental scale.
When a machine learning monitoring model is built with existing methods, disease occurrence points are needed as positive samples and healthy or non-occurrence points are needed as negative samples. In large-scale remote sensing monitoring, however, reliable non-occurrence point data are very difficult to obtain. Field surveys often record only disease points and lack systematic sampling of healthy areas. If background points are selected at random as negative samples, false negative samples, that is, points that actually have the disease but are labeled as healthy, are easily introduced, which biases model training. In addition, crop growth and disease development are dynamic processes, and environmental conditions in different phenological periods (such as green-up, heading, and grain filling) influence the disease with different weights. Most existing models use an average over the whole growth period or static data at a single moment, and so cannot finely describe the cumulative effect of dynamic environmental change within key phenological windows on the disease. In summary, how to construct a set