
CN-121996642-A - Automatic station soil humidity observation data layer-by-layer correction method based on meteorological environment factors and multi-model integration


Abstract

The invention discloses a layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration. The method takes into account that evaporation and precipitation affect soil humidity with a lag, and that the humidity of each soil layer is influenced by changes in the layer above it. Manually observed soil humidity serves as the reference, and meteorological and environmental factors such as air temperature, wind speed, precipitation and vegetation index are used as predictors. Several machine learning algorithms (Cubist, random forest, XGBoost and CatBoost) are integrated through a generalized additive model (GAM) to correct the automatic station soil humidity observations layer by layer. The corrected data show clearly improved observation accuracy, particularly in deep soil, which supports a soil humidity observation network with high precision and long-term stability and provides a reliable data basis for verifying remote sensing and numerical simulation soil humidity products.
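
The layer-by-layer scheme summarized above can be sketched as a cascade in which the corrected humidity of each layer becomes an extra predictor for the layer below. The following is only a control-flow sketch under stated assumptions: `BiasShift` and `MeanCombiner` are hypothetical stand-ins (not the Cubist, RF, XGBoost, CatBoost or GAM models named in the patent), and all function names and the data layout are invented for illustration.

```python
from statistics import mean


class BiasShift:
    """Hypothetical stand-in learner: station reading plus the mean training residual."""

    def fit(self, X, y):
        # Column 0 of each row is the station reading; other columns are ignored here.
        self.shift = mean(target - row[0] for row, target in zip(X, y))
        return self

    def predict(self, X):
        return [row[0] + self.shift for row in X]


class MeanCombiner:
    """Hypothetical stand-in for the GAM integrator: averages the sub-model outputs."""

    def fit(self, Z, y):
        return self

    def predict(self, Z):
        return [mean(row) for row in Z]


def build_features(station_layer, prev_corrected, env):
    """Row k = [station reading, corrected value of the layer above (if any), env factors]."""
    rows = []
    for k, reading in enumerate(station_layer):
        above = [] if prev_corrected is None else [prev_corrected[k]]
        rows.append([reading] + above + list(env[k]))
    return rows


def stack_outputs(bases, X):
    """One row per sample, one column per sub-model prediction."""
    return [list(sample) for sample in zip(*(b.predict(X) for b in bases))]


def fit_cascade(station, manual, env, base_factories, combiner_factory):
    """station[n][k] and manual[n][k]: layer n (0 = surface), sample k."""
    models, prev = [], None
    for n in range(len(station)):
        X = build_features(station[n], prev, env)
        bases = [make().fit(X, manual[n]) for make in base_factories]
        combiner = combiner_factory().fit(stack_outputs(bases, X), manual[n])
        prev = combiner.predict(stack_outputs(bases, X))  # feeds the next layer down
        models.append((bases, combiner))
    return models


def predict_cascade(models, station, env):
    """Apply the fitted per-layer models from the surface downwards."""
    corrected, prev = [], None
    for n, (bases, combiner) in enumerate(models):
        X = build_features(station[n], prev, env)
        prev = combiner.predict(stack_outputs(bases, X))
        corrected.append(prev)
    return corrected


# Toy data: two layers, three dates, one environmental factor per date.
station = [[10.0, 20.0, 30.0], [12.0, 22.0, 32.0]]
manual = [[12.0, 22.0, 32.0], [15.0, 25.0, 35.0]]
env = [[0.1], [0.2], [0.3]]
models = fit_cascade(station, manual, env, [BiasShift], MeanCombiner)
corrected = predict_cascade(models, station, env)
```

In a fuller version, each entry of `base_factories` would construct one of the four regressors and `combiner_factory` would construct a GAM; the cascade logic itself would stay the same.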

Inventors

  • LI HUIRONG
  • YANG DAJUN
  • JI MENG
  • XU CHENLU
  • XU YONGMING
  • HAN YANLI
  • CUI SHILIN
  • ZHANG SIHAI
  • GUO ZHENJIE
  • ZHANG LIWEI

Assignees

  • 锡林浩特国家气候观象台 (Xilinhot National Climate Observatory)
  • 南京信息工程大学 (Nanjing University of Information Science and Technology)

Dates

Publication Date
20260508
Application Date
20251230

Claims (8)

  1. A layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration, characterized by comprising the following steps:
     S1, dividing the soil into N layers according to the burial depths of the sensor probes installed at the automatic station, and numbering the layers n = 1, 2, ..., N, where n = 1 denotes the surface soil layer;
     S2, preprocessing the manually measured soil humidity and the soil humidity observed by the automatic station, and removing abnormal values from the automatic station soil humidity data; matching the preprocessed manual and automatic station soil humidity along two dimensions, soil layer depth and observation time, to obtain a matching data set that contains, for each date, paired manual and automatic station soil humidity values at every soil layer depth; comparing the manual and automatic station soil humidity, and removing outliers from the residuals between them using the 3σ principle;
     S3, acquiring, for the matching data set, hourly meteorological observations from the automatic station, including near-surface air temperature, wind speed, relative humidity, evaporation and precipitation; calculating the daily-scale near-surface air temperature, daily-scale wind speed and daily-scale relative humidity; calculating the soil humidity hysteresis weight coefficients from the soil humidity lag days m; and obtaining the weighted accumulated evaporation and weighted accumulated precipitation of the preceding m days;
     S4, letting n = 1, taking the manually observed surface soil humidity as the dependent variable, and taking the surface soil humidity observed by the automatic station, the daily-scale near-surface air temperature, the daily-scale wind speed, the daily-scale relative humidity, the weighted accumulated precipitation, the weighted accumulated evaporation and the normalized difference vegetation index NDVI as independent variables; constructing soil correction sub-models with four machine learning algorithms, namely Cubist, RF, XGBoost and CatBoost; taking the correction results of the four sub-models as independent variables, and constructing a surface soil humidity correction model with the generalized additive model (GAM) algorithm to output the corrected surface soil humidity;
     S5, setting n = n + 1, taking the manually observed soil humidity of the nth layer as the dependent variable, and taking the nth-layer soil humidity observed by the automatic station, the corrected soil humidity of the (n-1)th layer, the daily-scale near-surface air temperature, the daily-scale wind speed, the daily-scale relative humidity, the weighted accumulated precipitation, the weighted accumulated evaporation and the NDVI as independent variables; constructing soil correction sub-models with the four machine learning algorithms Cubist, RF, XGBoost and CatBoost; taking the correction results of the four sub-models as independent variables, and constructing the soil humidity correction model of the nth layer with the GAM algorithm to output the corrected soil humidity of the nth layer;
     S6, repeating step S5 until n = N, and integrating the soil humidity correction models of all soil layers to obtain a layer-by-layer soil humidity correction model;
     S7, training the layer-by-layer soil humidity correction model with the matching data set, outputting the trained model, verifying the accuracy of the correction results by cross-validation, and outputting the corrected soil humidity layer by layer.
  2. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 1, wherein step S2 further comprises: converting the manually measured soil humidity from mass water content to volumetric water content, so that its unit is consistent with the soil humidity observed by the automatic station, using the conversion formula $\theta_v = \theta_m \times \rho_b$, where $\theta_v$ is the volumetric water content of the soil, $\theta_m$ is the mass water content of the soil, and $\rho_b$ is the soil bulk density (volume weight); removing clearly erroneous or invalid values from the automatic station soil humidity data based on the valid threshold range of the automatic station observations, and judging any data sequence that remains unchanged for several consecutive days to be invalid and removing it; and matching the manually observed soil humidity with the automatic station soil humidity by soil layer depth and observation time.
  3. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 1, wherein step S2 further comprises: comparing the matched manual observations with the soil humidity observed by the automatic station to obtain a precision evaluation index for the automatic station soil humidity; and, if the precision evaluation index is smaller than a preset precision threshold, identifying outliers from the residuals between the two using the 3σ principle, removing the outliers to generate the matching data set, and recalculating the precision evaluation index.
  4. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 3, wherein the precision evaluation index comprises several or all of the correlation coefficient R, the mean absolute error MAE, the root mean square error RMSE and the mean bias MB.
  5. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 1, wherein step S3 further comprises: averaging the near-surface air temperature, wind speed and relative humidity over each day to obtain the daily mean near-surface air temperature, daily mean wind speed and daily mean relative humidity; summing the evaporation and precipitation over each day to obtain the daily evaporation and daily precipitation; for the 1st to Nth soil layers, calculating the multiple correlation coefficient between the manually observed soil humidity and the weighted accumulated evaporation and weighted accumulated precipitation of the preceding m days, where m = 1, 2, ..., 15, taking the value of m with the highest multiple correlation coefficient as the soil humidity lag days, and calculating the weighted accumulated evaporation and weighted accumulated precipitation corresponding to that lag, wherein the weighted accumulated values of the preceding m days are calculated as $X_t^{w} = \sum_{i=1}^{m} w_i X_{t-i}$, where $X_t^{w}$ is the weighted accumulated evaporation or precipitation corresponding to date t, $X_{t-i}$ is the daily precipitation or evaporation on the ith day before date t, m is the soil humidity lag days, and $w_i$ is the soil humidity hysteresis weight coefficient for day i (its defining formula is not reproduced in this text); and extracting the normalized difference vegetation index NDVI of the corresponding pixel and date from MODIS remote sensing data according to the location and date of each observation point, wherein, if the pixel is cloud-covered on the observation date, the value is replaced iteratively with the mean of the cloud-free acquisitions before and after the observation date.
  6. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 1, wherein in step S6 the soil humidity correction model of the ith layer is constructed with the generalized additive model (GAM) algorithm, based on the following formula, to output the corrected soil humidity of the ith layer: $M = \beta_0 + f_1(\mathrm{Cubist}) + f_2(\mathrm{RF}) + f_3(\mathrm{XGBoost}) + f_4(\mathrm{CatBoost}) + \varepsilon$, where M is the soil humidity; $\beta_0$ is the intercept; $f_1, \ldots, f_4$ are the model functions of the respective independent variables; Cubist, RF, XGBoost and CatBoost are the correction results output by the Cubist, RF, XGBoost and CatBoost models, respectively; and $\varepsilon$ is the error term.
  7. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 1, further comprising: taking the valid corrected soil humidity of each of the 1st to Nth soil layers in turn as the dependent variable and the valid corrected soil humidity of the other soil layers as independent variables, constructing interlayer fitting models with the Cubist, RF, XGBoost and CatBoost models, and integrating them with a generalized additive model (GAM) to construct an integrated interpolation model based on the correlation between soil layers; and obtaining the soil layer numbers of the abnormal values and outliers removed from the automatic station soil humidity observation data in step S2, and estimating and filling those abnormal values and outliers from the valid corrected humidity of the other soil layers using the integrated interpolation model.
  8. The layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration according to claim 7, wherein, when the soil humidity data of all soil layers are missing at a given time, the soil humidity at that time is estimated and filled by linear interpolation in time from the corrected soil humidity at adjacent times, generating a corrected soil humidity data set that is continuous in both time and depth.
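
The lag selection of claim 5 can be sketched as follows. The weight formula is not available in this text, so `lag_weights` uses linearly decaying, normalized weights purely as an assumption; the function names are hypothetical, and the multiple correlation coefficient is computed with the standard two-predictor formula.

```python
from math import sqrt


def lag_weights(m):
    """ASSUMED weights (source formula unavailable): linear decay, w_i proportional to m - i + 1."""
    total = m * (m + 1) / 2
    return [(m - i + 1) / total for i in range(1, m + 1)]


def weighted_accumulation(daily, t, m):
    """Weighted sum of the m daily values before index t (i = 1 is the day before t).

    The caller must ensure t >= m so that no index wraps around.
    """
    w = lag_weights(m)
    return sum(w[i - 1] * daily[t - i] for i in range(1, m + 1))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def multiple_r(y, x1, x2):
    """Multiple correlation coefficient of y on the two predictors x1 and x2."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r_sq = (r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12)
    return sqrt(max(r_sq, 0.0))


def best_lag(soil, evap, precip, max_lag=15):
    """Return (m, R) for the lag m in 1..max_lag with the highest multiple correlation."""
    days = range(max_lag, len(soil))  # same sample days for every candidate m
    target = [soil[t] for t in days]
    best = None
    for m in range(1, max_lag + 1):
        e = [weighted_accumulation(evap, t, m) for t in days]
        p = [weighted_accumulation(precip, t, m) for t in days]
        r = multiple_r(target, e, p)
        if best is None or r > best[1]:
            best = (m, r)
    return best
```

Calling `best_lag(soil, evap, precip)` on daily series for one soil layer returns the chosen lag and its correlation; the weighted accumulations at that lag would then serve as predictors in steps S4 and S5.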
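
The temporal gap filling of claim 8 amounts to linear interpolation between the nearest valid corrected values in time. A minimal sketch for one soil layer, assuming missing values are marked with `None` and times are numeric (for example, day numbers); the function name is hypothetical.

```python
def interpolate_missing(times, values):
    """Fill None entries by linear interpolation between the nearest valid neighbours in time."""
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None or right is None:
            continue  # no bracketing observations: leave the gap rather than extrapolate
        frac = (times[i] - times[left]) / (times[right] - times[left])
        filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Toy series: two consecutive missing days between valid corrected values.
series = interpolate_missing([0, 1, 2, 4], [10.0, None, None, 18.0])
```

Applying the same function to every layer at the missing times would yield a data set that is continuous in both time and depth, as the claim describes.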

Description

Automatic station soil humidity observation data layer-by-layer correction method based on meteorological environment factors and multi-model integration

Technical Field

The invention relates to the technical field of soil moisture monitoring, and in particular to a layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration.

Background

Soil moisture, one of the key climate variables in the Earth system, plays an important role in the hydrologic cycle, ecosystems, agricultural production, the carbon cycle and climate change research. Accurate soil moisture monitoring data are important for drought monitoring, farmland moisture management and climate change research, especially in ecologically vulnerable areas. Currently, the main approaches to obtaining soil moisture are ground observation, numerical simulation and remote sensing inversion. Remote sensing can provide soil moisture distribution over large areas, but it suffers from low spatial resolution, limited temporal resolution and reduced accuracy under vegetation cover or complex terrain. In contrast, ground observation provides high-precision point-scale soil moisture data and is an important basis for verifying remote sensing inversion products and numerical simulation results.

Ground observation comprises manual measurement and automatic station monitoring. Manual measurement is highly accurate but labor-intensive and limited in observation frequency. Automatic stations provide continuous, automated monitoring and are widely used to verify the accuracy of remote sensing and simulation products, but differences in sensor type, burial depth and installation conditions can introduce systematic deviations into the observations, which affects data reliability and subsequent applications.

In existing research, some scholars screen automatic station soil humidity observation data with quality control and threshold checks, for example removing abnormal values by comparing precipitation with changes in soil temperature, or evaluating time series quality with autocorrelation characteristics. However, systematic correction of automatic station soil moisture data has rarely been studied, and a systematic correction scheme that combines manual observation data with multi-model algorithms is lacking.

The invention with publication number CN120275612B discloses a mountain GNSS-R soil humidity inversion method that integrates a topographic wetness index. It uses a Stacking integrated model to assign weights to the soil humidity predictions output by a random forest model, an XGBoost model and a support vector machine model, and outputs a final soil humidity prediction. However, that method relies on inversion from remote sensing data and cannot accurately obtain humidity data for deep soil.

The invention with publication number CN103645295A discloses a multilayer soil moisture simulation method and system. It constructs a layered soil moisture balance model, adjusts the model structure and obtains the model parameters using remote sensing technology, builds a watershed hydrologic spatial information database, and carries out numerical simulation of the multilayer soil moisture process using the adjusted model and the database. That invention is likewise based on remote sensing inversion. In addition, although it treats the soil water in layers, its calculation is quite complex, it focuses mainly on the influence of vegetation on precipitation and evaporation, and it lacks generality.

The invention with publication number CN119986726A discloses a GNSS-R deep soil humidity inversion method that takes precipitation information into account. It applies a multi-step correction to the retrieved surface soil humidity in terms of change rate and precipitation, and finally derives the humidity of deeper soil from statistical results obtained from a large amount of data. However, that invention divides the correction intervals according to the soil humidity value and corrects the soil humidity by its change rate, and it does not address the correction needs of automatic station sensor data.

Disclosure of Invention

The invention aims to provide a layer-by-layer correction method for automatic station soil humidity observation data based on meteorological environment factors and multi-model integration, which simultaneously considers that the influence of evaporation and precipitation on soil humidity has