CN-121980210-A - Construction method of soil humidity space-time multi-step prediction model and storage medium
Abstract
The application belongs to the intersection of hydrology, data analysis and machine learning, and discloses a construction method and a storage medium for a soil humidity space-time multi-step prediction model. The method comprises: constructing a data set; constructing a connection matrix based on the time-series correlation of the soil humidity data in the training data set, and performing k-hop expansion and normalization on the connection matrix to obtain a k-hop normalized connection matrix; constructing, based on the k-hop normalized connection matrix, a graph convolution module for extracting wide-area spatial features and remote heterogeneous dependence of soil humidity; and constructing a prediction model comprising GConvLSTM units and ConvLSTM units, the GConvLSTM units and ConvLSTM units being interleaved or stacked in sequence to form a multi-layer memory flow architecture, yielding an initial KGCCLSTM prediction model. The method enables high-precision multi-step prediction of soil humidity in complex scenes.
Inventors
- CHEN NENGCHENG
- PAN ZIWEI
- LI ZEYUAN
- XU LEI
Assignees
- Hubei Luojia Laboratory (湖北珞珈实验室)
- China University of Geosciences (Wuhan) (中国地质大学(武汉))
Dates
- Publication Date
- 20260505
- Application Date
- 20251127
Claims (10)
- 1. A construction method of a soil humidity space-time multi-step prediction model, characterized by comprising the following steps: S10, acquiring soil humidity data of a research area and external environment data affecting the soil humidity, selecting a sample data set using a window of a specified length, taking the soil humidity data of the time steps following the sample data set as label data, preprocessing the sample data set and the label data, and dividing them into a training data set, a verification data set and a test data set according to a preset proportion; S20, constructing a connection matrix based on the time-series correlation of the soil humidity data in the training data set, performing k-hop expansion and normalization on the connection matrix to obtain a k-hop normalized connection matrix, and constructing, based on the k-hop normalized connection matrix, a graph convolution module for extracting wide-area spatial features and remote heterogeneous dependence of soil humidity; S30, constructing a prediction model comprising GConvLSTM units and ConvLSTM units, wherein the graph convolution module is embedded in the GConvLSTM unit; the input of the GConvLSTM unit comprises the original data at the current moment, the k-hop graph convolution result and the hidden state of the ConvLSTM unit at the previous moment, and the GConvLSTM unit outputs its hidden state and cell state at the current moment; the ConvLSTM unit takes the output of the GConvLSTM unit as input and outputs its hidden state and cell state at the current moment; the GConvLSTM units and ConvLSTM units are then interleaved or stacked in sequence to form a multi-layer memory flow architecture, yielding an initial KGCCLSTM prediction model; S40, inputting the training data set into the initial KGCCLSTM prediction model for training, minimizing a loss function with a preset optimizer to update the model parameters, and tuning the model parameters with the verification data set to obtain a final prediction model.
- 2. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S10 the preprocessing comprises data normalization and feature screening; the data normalization uses min-max normalization, $X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$, where $X'$ is the normalized data set, $X$ is the original data set, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the original data set; the feature screening is based on a correlation threshold between features: the correlation coefficient between any two external environment features is calculated, and if it exceeds the threshold, the feature more strongly correlated with the soil humidity is retained and the other is eliminated.
- 3. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S20 the time-series correlation index comprises any one or a combination of the Pearson correlation coefficient, the Spearman rank correlation coefficient and the Kendall correlation coefficient; the connection matrix is constructed from the time-series correlation index as follows: S21a, regarding each pixel of the soil humidity data as a node, the total number of nodes being N, so that the connection matrix A is an N × N matrix; S22a, calculating the time-series correlation coefficient between any two nodes over the time span of the training data set; S23a, for each node, selecting a preset positive-integer number of nodes having the largest time-series correlation coefficients with that node, marking these nodes as connected to that node in A and all other nodes as unconnected, to obtain the initial connection matrix A.
- 4. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S20 the k-hop expansion and normalization of the connection matrix comprise: S21b, adding self-loops to the initial connection matrix $A$ to obtain a connection matrix containing self-loops, $\tilde{A} = A + I_N$, where $I_N$ is the N-dimensional identity matrix; S22b, raising $\tilde{A}$ to the k-th matrix power to obtain the k-hop connection matrix $\tilde{A}^k$; S23b, normalizing $\tilde{A}^k$ element by element, where $\tilde{A}^k_{ij}$ denotes the element in the i-th row and j-th column of $\tilde{A}^k$, to obtain the k-hop normalized connection matrix.
- 5. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S20 the feature extraction formula of the graph convolution module relates the following quantities: the k-hop graph convolution result at time t, the soil humidity data at time t, and the weights of the k-hop graph convolution.
- 6. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S30 the input of the GConvLSTM unit comprises the raw data at the current time, the k-hop graph convolution output at the current time, and the hidden state of the ConvLSTM unit at the previous time; the GConvLSTM unit updates its cell state and hidden state through a gating mechanism, the formulas of which involve: the outputs of the input gate, the forget gate and the output gate; the candidate cell state; the cell states of the GConvLSTM unit at the previous and current times; the hidden states of the GConvLSTM unit at the previous and current times; trainable weight matrices; bias vectors; the two-dimensional convolution operation; the Hadamard product; the tensor concatenation operation; the sigmoid activation function; and the hyperbolic tangent activation function.
- 7. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S30 the ConvLSTM unit takes the hidden state of the GConvLSTM unit as input and updates its cell state and hidden state through a gating mechanism, the formulas of which involve: the outputs of the input gate, the forget gate and the output gate; the cell states of the ConvLSTM unit at the previous and current times; the hidden states of the ConvLSTM unit at the previous and current times; trainable weight matrices; bias vectors; the two-dimensional convolution operation; the Hadamard product; the sigmoid activation function; and the hyperbolic tangent activation function; the output of the ConvLSTM unit is passed through a convolution layer for feature dimension reduction to obtain the predicted soil humidity value.
- 8. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S30 the multi-layer memory flow architecture transfers states as follows: horizontal transfer: the cell state and hidden state of a GConvLSTM unit at the previous time step serve as inputs to the GConvLSTM unit at the current time step, and the cell state and hidden state of a ConvLSTM unit at the previous time step serve as inputs to the ConvLSTM unit at the current time step; vertical transfer: the hidden state of a GConvLSTM unit serves as an input to a ConvLSTM unit, and the hidden state of a ConvLSTM unit serves as an input to a GConvLSTM unit; multi-step prediction: the soil humidity at the next moment is predicted from the input data of a preceding span of moments, the predicted value is then used as the input data for that moment, and the soil humidity at the subsequent moments is predicted iteratively.
- 9. The construction method of a soil humidity space-time multi-step prediction model according to claim 1, wherein in step S40 the preset optimizer is any one of the Adam optimizer, the stochastic gradient descent (SGD) optimizer and the RMSprop optimizer, and the preset loss function is any one of the mean square error (MSE) loss function, the mean absolute error (MAE) loss function and the Huber loss function; the method further comprises evaluating the final prediction model with the test data set, the evaluation index comprising any one or a combination of the root mean square error (RMSE), the coefficient of determination, the mean bias (MB) and the standard deviation (BSD); the evaluation formulas involve: the true and predicted soil humidity values of a pixel; the mean of the true values over all pixels; the length and width of the soil humidity data; an arbitrarily designated pixel; the time-series length of the test data set; and the predicted and actual values of that pixel at a given time.
- 10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the construction method of a soil humidity space-time multi-step prediction model according to any one of claims 1 to 9.
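The preprocessing described in claim 2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names `min_max_normalize`, `pearson` and `screen_features` are hypothetical, the Pearson coefficient is assumed as the correlation measure, absolute correlation values are assumed when comparing against the threshold, and the toy data are invented.

```python
from itertools import combinations
from math import sqrt


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to zeros (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_features(features, soil_humidity, threshold):
    """From each feature pair correlated above the threshold, keep the feature
    more strongly correlated with soil humidity and drop the other."""
    kept = set(features)
    for a, b in combinations(features, 2):
        if a not in kept or b not in kept:
            continue  # one of the two was already eliminated
        if abs(pearson(features[a], features[b])) > threshold:
            score_a = abs(pearson(features[a], soil_humidity))
            score_b = abs(pearson(features[b], soil_humidity))
            kept.discard(b if score_a >= score_b else a)
    # Preserve the original feature order in the result.
    return [name for name in features if name in kept]


# Toy data, invented for illustration only.
soil = [0.1, 0.3, 0.2, 0.5, 0.4]
env = {
    "precipitation": [1.0, 3.0, 2.0, 5.0, 4.0],
    "rain_gauge": [1.1, 2.9, 2.2, 4.8, 4.3],
    "temperature": [20.0, 18.0, 22.0, 19.0, 21.0],
}
selected = screen_features(env, soil, threshold=0.9)
```

In this toy case the two rainfall-like features are nearly collinear, so only the one that tracks soil humidity more closely survives, while the weakly related temperature feature is untouched.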
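The graph construction of claims 3 and 4 (correlation-based connection matrix, self-loops, k-th matrix power, normalization) can be sketched as follows. Several points are assumptions rather than statements of the claimed method: links are stored as binary 0/1 entries, the Pearson coefficient is used as the correlation index, and the normalization step is taken to be row normalization because the normalization formula is not legible in this text. The names `top_m_adjacency`, `k_hop_normalized` and the parameter `m` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def top_m_adjacency(series, m):
    """series[i] is the training-period series of node (pixel) i.
    Link each node to the m nodes most correlated with it (binary links assumed)."""
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: pearson(series[i], series[j]),
            reverse=True,
        )
        for j in ranked[:m]:
            adj[i][j] = 1
    return adj


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def k_hop_normalized(adj, k):
    """Add self-loops, raise to the k-th power, then row-normalize (assumption)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
                  for i in range(n)]
    power = with_loops
    for _ in range(k - 1):
        power = matmul(power, with_loops)
    # Self-loops keep every diagonal entry positive, so no row sum is zero.
    return [[value / sum(row) for value in row] for row in power]


# Toy series for four pixels, invented for illustration only.
series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
adj = top_m_adjacency(series, m=1)
norm = k_hop_normalized(adj, k=2)
```

With k = 2, node 2 acquires a nonzero weight toward node 1 through node 3 even though the two are not directly linked, which is the kind of longer-range connection the k-hop expansion is meant to expose.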
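The iterative multi-step prediction at the end of claim 8 follows a rolling pattern: predict one step, feed the prediction back as input, and repeat. The sketch below illustrates only that loop. `model` is a hypothetical callable standing in for the trained predictor, and sliding the input window by dropping its oldest entry is an assumption, since the time indices of the claim are not legible in this text.

```python
def multi_step_forecast(model, history, steps):
    """Roll a one-step predictor forward `steps` times.

    model   -- hypothetical callable mapping an input window to the next value
    history -- most recent observations, oldest first
    steps   -- number of future moments to predict
    """
    window = list(history)
    predictions = []
    for _ in range(steps):
        next_value = model(window)
        predictions.append(next_value)
        # Feed the prediction back as input and drop the oldest entry (assumption).
        window = window[1:] + [next_value]
    return predictions


def linear_extrapolation(window):
    """Toy stand-in for the trained model: continue the last observed increment."""
    return window[-1] + (window[-1] - window[-2])


forecasts = multi_step_forecast(linear_extrapolation, [1, 2, 3], steps=3)
```

Because every step after the first consumes earlier predictions, errors can compound along the horizon, which is why multi-step accuracy is evaluated separately from single-step accuracy.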
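Three of the evaluation indices named in claim 9 can be sketched with their conventional definitions. The formulas of the claim are not legible in this text, so the definitions below are the standard textbook ones rather than confirmed reproductions: the sign convention of the mean bias (predicted minus true) is an assumption, and BSD is omitted because its definition cannot be recovered here. Function names are hypothetical.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_bias(y_true, y_pred):
    """Mean of (predicted - true); the sign convention is an assumption."""
    n = len(y_true)
    return sum(p - t for t, p in zip(y_true, y_pred)) / n


# Toy values, invented for illustration only.
truth = [0.2, 0.3, 0.4]
pred = [0.25, 0.3, 0.35]
scores = (rmse(truth, pred), r_squared(truth, pred), mean_bias(truth, pred))
```

In this toy case the over- and under-predictions cancel, so the mean bias is essentially zero even though the RMSE is not, which is why the two indices are reported together.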
Description
Construction method of soil humidity space-time multi-step prediction model and storage medium
Technical Field
The application belongs to the intersection of hydrology, data analysis and machine learning, and in particular relates to a construction method of a soil humidity space-time multi-step prediction model and a storage medium.
Background
Soil moisture (SM) is a key climate variable defined by the Global Climate Observing System (GCOS). It is a core link between land-atmosphere energy exchange and the water cycle, and plays an irreplaceable role in agricultural production (e.g., crop irrigation scheduling), weather forecasting (e.g., analysis of precipitation-triggering mechanisms), disaster response (e.g., drought and flood early warning) and global climate change research. Accurate soil humidity prediction provides a scientific basis for decisions in these fields and reduces the impact of extreme climate events on human society and natural ecosystems.
At present, soil humidity prediction methods fall into two categories: dynamic models based on physical mechanisms and data-driven machine learning models. Dynamic models based on physical mechanisms, such as global circulation models (GCM), land surface models (LSM) and hydrological models (HM), simulate the evolution of soil humidity by constructing complex physical equations (e.g., energy conservation equations and water movement equations).
These models have the core advantage of reflecting the physical relationships between soil humidity and factors such as atmosphere, vegetation and topography, but they have clear limitations. First, they depend on a large number of physical parameters (e.g., soil porosity and vegetation coverage) that are difficult to obtain and easily affected by observation errors. Second, solving the physical equations consumes very large computing resources, making real-time or near-real-time prediction difficult. Third, some key physical processes (e.g., moisture exchange at the soil-vegetation-atmosphere interface) are insufficiently characterized, so errors tend to accumulate in short-term (e.g., 1-7 days) multi-step prediction.
Data-driven machine learning models are widely used in soil humidity prediction thanks to their strong feature-learning capability, and fall into three types: spatial models, time-series models and spatiotemporal models. Spatial models (e.g., the convolutional neural network, CNN) are adept at processing high-dimensional raster data and extract local spatial features with sliding convolution kernels; they have been used to predict soil humidity in vegetation-covered areas in combination with remote sensing indices (e.g., NDVI, NDWI). Time-series models (e.g., the long short-term memory network, LSTM) capture long-term dependencies in time series and suppress redundant information through a gating mechanism, and are suited to single-step time-series prediction of soil humidity.
Spatiotemporal models (e.g., the convolutional long short-term memory network, ConvLSTM) fuse the spatial feature extraction capability of CNN with the time-series modeling capability of LSTM; by introducing two-dimensional convolution into the gate computations they can process spatiotemporally coupled data, and have become one of the mainstream models for short-term soil humidity prediction.
However, existing data-driven models still have the following key problems:
1. Limited local spatial information. The receptive fields of models such as CNN and ConvLSTM are limited by the convolution kernel size (e.g., 3×3 or 5×5), so they capture spatial information only within a local rectangular range and cannot effectively exploit soil humidity correlations between remote areas (e.g., non-adjacent areas with similar climatic conditions), which leads to insufficient prediction accuracy for soil humidity distributions with strong spatial heterogeneity.
2. Missing remote heterogeneous dependence. The spatial distribution of soil humidity is jointly influenced by topography (e.g., mountains and plains), hydrology (e.g., rivers and lakes) and climate (e.g., monsoons and cyclones), forming complex remote heterogeneous dependence (e.g., the lagged influence of upstream rainfall on downstream soil humidity). Existing models (e.g., TGC-LSTM) introduce graph convolution (GCN) to process non-Euclidean data, but do not fully exploit time-series correlation to construct the connection relations and do not form a deeply fused memory flow architecture with ConvLSTM, so remote dependence is difficult to capture effectively.
3. The c