
CN-116050652-B - Runoff prediction method based on local attention enhancement model


Abstract

The invention relates to a runoff prediction method based on a local attention enhancement model, and belongs to the field of time sequence prediction. The method comprises the steps of obtaining data, preprocessing the data, and inputting the processed data into a prediction model which is trained to obtain a predicted runoff sequence, wherein the runoff prediction model comprises a variable selection module, a local information enhancement module, an attention module, a model extraction module and the like. According to the prediction model, the difference of influence degrees of different covariates on results is considered, the covariates are weighted by the variable selection module, the local information of the runoff sequence is captured by the local information enhancement module, the short-term change trend characteristics of data at a single time point are obtained, the similarity and the attention information between the change trends are obtained by the self-attention module, the depth of an encoder and a decoder and the model extraction module are reasonably set, and more accurate prediction under the same available memory is realized.

Inventors

  • Liu Jianwei
  • Li Zhenghao
  • Zhao Xunyi
  • Liu Liangchen
  • Zeng Sidong
  • Wu Jun
  • Tang Shu

Assignees

  • Chongqing University of Posts and Telecommunications (重庆邮电大学)
  • Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences (中国科学院重庆绿色智能技术研究院)

Dates

Publication Date
2026-05-05
Application Date
2023-02-22

Claims (6)

  1. A runoff prediction method based on a local attention enhancement model, characterized by comprising the steps of obtaining runoff data, preprocessing the runoff data, and inputting the processed data into a trained prediction model to obtain a predicted runoff sequence, wherein the runoff prediction model comprises a variable selection module, a local information enhancement module, an attention module and a model extraction module, and the method specifically comprises the following steps: S1, preparing a data set and preprocessing: obtaining historical runoff data and related covariates, aligning them along the time dimension, normalizing, and dividing them into a training set, a validation set and a test set; S2, inputting the training set into the variable selection module, wherein the variable selection module weights the input variables through a gated residual network and screens out the related covariates that contribute more to the result; the variable selection module comprises a gated residual block GRB and a Softmax function, and the weights of the flattened input under the Softmax operation are obtained as: v_t = \mathrm{Softmax}(\mathrm{GRB}(\Xi_t)), with the weighted output \tilde{\xi}_t = \sum_{j=1}^{m} v_t^{(j)} \tilde{\xi}_t^{(j)}, wherein \xi_t^{(j)} represents the transformed input of the j-th variable at time t, \Xi_t represents the flattened vector of all variables at time t, v_t are the variable selection weights, \tilde{\xi}_t^{(j)} is the output of the j-th variable at time t after the GRB module, \tilde{\xi}_t is the output of the variable selection module for all variables at time t, and m represents the number of variables; S3, carrying out position coding on the time sequence, wherein a position coding layer injects position information into the input data; S4, inputting the position-coded sequence into an encoder-decoder structure, wherein the encoder-decoder structure comprises a local information enhancement module, an attention module and a model extraction module, the output of the encoder participates in the cross-attention calculation of the decoder, and the output of the decoder is the prediction sequence; S5, calculating a loss function for the predicted result using the mean absolute error MAE and the Nash correlation coefficient NSE; S6, setting an initial learning rate for the training model and decaying the learning rate while optimizing the model with the Adam algorithm; and S7, verifying the training effect with the validation set data, and stopping training early when the validation result degrades continuously, to prevent overfitting.
  2. The runoff prediction method according to claim 1, wherein in step S1 the related covariates of the runoff data are obtained by preliminary screening of the covariates with the Pearson correlation coefficient method, followed by normalization; for a prediction object y and a factor x, the Pearson correlation coefficient between them is calculated as: r_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}, wherein n is the number of data samples, x_i is the value of the i-th sample of x, y_i is the value of the i-th sample of y, \bar{x} is the sample mean of x, and \bar{y} is the sample mean of y.
  3. The runoff prediction method according to claim 1, wherein in step S3 the position coding layer injects position information into the input data, with the corresponding calculation formulas: PE_{(pos,\,2i)} = \sin\!\left(pos / 10000^{2i/d_{model}}\right) and PE_{(pos,\,2i+1)} = \cos\!\left(pos / 10000^{2i/d_{model}}\right), wherein PE denotes the position embedding, pos indicates the position of the data in the sequence, d_{model} represents the model dimension, 2i represents an even dimension and 2i+1 an odd dimension; for a fixed-length interval k, the relative positional relationship between PE_{pos+k} and PE_{pos} can be calculated with the trigonometric angle sum and difference formulas: \sin(pos+k) = \sin(pos)\cos(k) + \cos(pos)\sin(k), \cos(pos+k) = \cos(pos)\cos(k) - \sin(pos)\sin(k).
  4. The runoff prediction method according to claim 1, wherein in step S4 the local information enhancement module uses a temporal convolutional network to capture local context in the time sequence, the attention module performs a global attention calculation on the time sequence, and the model extraction module acts on the encoder, performing a one-dimensional pooling operation on the intermediate representations of the encoder to reduce the size of the stacked encoding layers.
  5. The runoff prediction method according to claim 1 or 4, wherein in step S4 the encoder is composed of a temporal convolution layer, a multi-head self-attention layer, residual connections and layer normalization, and multiple encoder layers can be stacked in series; the decoder is composed of a temporal convolution layer, a multi-head self-attention layer, a multi-head cross-attention layer, residual connections and layer normalization, multiple decoder layers can be stacked in series, and the decoder zero-pads the positions of the predicted sequence; the temporal convolution layer uses a convolutional neural network and comprises causal convolution, dilated convolution and a residual module, wherein the causal convolution learns short-term local features and enlarges the receptive field by stacking convolution layers to learn local sequence information; dilation is injected into the standard convolution to further increase the receptive field, the input of the dilated convolution being sampled at intervals; the residual connection of the residual module is calculated as: y = \mathrm{LayerNorm}(x + \mathrm{Sublayer}(x)), wherein x represents the input of the sub-layer, \mathrm{Sublayer}(x) represents the output of the sub-layer, and y represents the final output; the attention prediction sequence obtained through the multi-head self-attention layer and the multi-head cross-attention layer is: Q = \mathrm{Linear}(X)W^Q, K = \mathrm{Linear}(X)W^K, V = \mathrm{Linear}(X)W^V, \mathrm{Attention}(Q,K,V) = \mathrm{Softmax}\!\left(QK^T/\sqrt{d_k}\right)V, wherein X represents the initial time-series samples after normalization, Linear represents the linear layer, Q, K, V denote the query, key and value components, W^Q, W^K, W^V respectively represent the weight matrices corresponding to the query, key and value components, \mathrm{Attention}(Q,K,V) denotes the attention prediction sequence, Softmax represents the normalized exponential function, K^T represents the transpose of the key component, and d_k represents the model scale.
  6. The runoff prediction method according to claim 1, wherein in step S5 the mean absolute error MAE and the Nash correlation coefficient NSE are calculated as: \mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T}\left|\hat{y}_t - y_t\right|, \mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T}(\hat{y}_t - y_t)^2}{\sum_{t=1}^{T}(y_t - \bar{y})^2}, wherein T indicates the length of the runoff prediction period, \hat{y}_t represents the predicted runoff at time t, y_t represents the observed runoff at time t, \bar{y} represents the mean of the runoff observations, and t is the prediction time.
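As a rough illustration of the variable weighting in step S2 of claim 1, the sketch below combines per-variable feature vectors using softmax-normalized scores. This is a minimal stdlib-only sketch, not the patented module: the GRB transform is omitted, and the function names and the score inputs are hypothetical stand-ins for the gated residual network outputs.

```python
import math

def softmax(scores):
    """Normalized exponential: turns raw scores into weights summing to 1."""
    m = max(scores)                         # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def variable_selection(variables, scores):
    """Weight each variable's feature vector by its softmax score and sum.

    variables: list of m feature vectors (one per covariate), equal length.
    scores: list of m raw importance scores (stand-in for GRB output).
    """
    weights = softmax(scores)
    dim = len(variables[0])
    out = [0.0] * dim
    for w, var in zip(weights, variables):
        for k in range(dim):
            out[k] += w * var[k]            # weighted sum across variables
    return out
```

With equal scores every covariate receives weight 1/m, so the output is the plain average of the variable vectors; higher scores shift the mixture toward the more influential covariates.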
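The sinusoidal position coding of claim 3 can be sketched as follows. This is a minimal stdlib-only illustration of the standard Transformer-style encoding (sine on even dimensions, cosine on odd dimensions); the base 10000 follows the usual convention, and the function name is my own.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position embeddings."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)          # even dimension: sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimension: cosine
    return pe
```

Because sin(pos + k) and cos(pos + k) expand linearly in sin(pos) and cos(pos), the embedding of a position shifted by a fixed interval k is a fixed linear function of the original embedding, which is exactly the relative-position property the claim invokes.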
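The two evaluation metrics in claim 6 are standard and can be sketched directly; the following is a minimal stdlib-only version (function names are my own):

```python
def mae(pred, obs):
    """Mean absolute error over the prediction period."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def nse(pred, obs):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of residual variance
    to the variance of the observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((p - o) ** 2 for p, o in zip(pred, obs))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance
```

A perfect prediction gives MAE = 0 and NSE = 1; an NSE of 0 means the model is no better than predicting the observed mean, which is why NSE is a common skill score in hydrology.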

Description

Runoff prediction method based on local attention enhancement model

Technical Field

The invention belongs to the field of time sequence prediction and relates to a runoff prediction method based on a local attention enhancement model.

Background

Time sequence simulation and prediction can play an important supporting role in the optimal allocation of resources and in disaster prevention and reduction. Currently there are mainly physically driven models, which describe the sequence generation process, and data-driven models, which are based on the data itself. The physically driven model is based on physical concepts related to the data and emphasizes the complex process of time series formation. In the field of hydrologic prediction in particular, such a model usually needs to consider various parameters such as weather, topography and soil. It can obtain a good prediction effect for runoff prediction in regions with complete data, but because data completeness differs greatly between regions, the model structure and the uncertainty of the parameters produce corresponding error accumulation, and the model is greatly limited by its parameters and by the input conditions for future meteorological elements. In contrast, the data-driven model does not require an explicit physical generation process for the time sequence and can also produce a good prediction effect by establishing a functional relationship between the driving factors and the prediction factors. Thus, predicting a time series by means of a data-driven model is considered a simple and efficient prediction method when data are incomplete.
At present, models such as the long short-term memory (LSTM) network, a modified version of the recurrent neural network (RNN), show good multi-step prediction performance, but they are limited by the model structure: recurrent networks cannot be trained in parallel and suffer from vanishing gradients, which restricts the sequence length they can capture. In addition, the temporal convolutional network (TCN), based on the convolutional neural network (CNN), solves the parallel-training problem, but it relies on stacking hidden layers to obtain a larger receptive field, resulting in a larger memory footprint. TCN is therefore inadequate for capturing long time series.

Disclosure of Invention

In view of the above, the present invention aims to provide a runoff prediction method based on a local attention enhancement model, which improves the prediction accuracy of the prediction model. In order to achieve the above purpose, the present invention provides the following technical solution: a runoff prediction method based on a local attention enhancement model obtains runoff data, preprocesses it, and inputs the processed data into a trained prediction model to obtain a predicted runoff sequence. The runoff prediction model comprises a variable selection module, a local information enhancement module, an attention module, a model extraction module and the like.
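The causal and dilated convolutions underlying the TCN discussed above can be sketched as follows. This is a minimal stdlib-only illustration of the receptive-field idea, not the patented temporal convolution layer: the kernel layout (kernel[0] multiplies the current sample, later taps look further back) and the function name are my own assumptions.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """1-D causal convolution with dilation.

    The output at time t depends only on inputs at t, t-d, t-2d, ...
    (implicit zero padding on the left), so no future value leaks in.
    Increasing the dilation d widens the receptive field without adding taps.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation      # look back j * dilation steps
            if idx >= 0:                # out-of-range taps act as zeros
                acc += w * x[idx]
        out.append(acc)
    return out
```

Stacking such layers with dilations 1, 2, 4, ... grows the receptive field exponentially in depth, which is exactly the trade-off the background section criticizes: long-range context requires many stacked layers and hence more memory.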
The runoff prediction model designed by the invention takes into account characteristics such as the long runoff period and the irregular variation trend. It considers the difference in the degree of influence of different covariates on the result, weights the covariates with the variable selection module, captures local information of the runoff sequence with the local information enhancement module to obtain the short-term variation trend features of the data at a single time point, obtains the similarity and attention information between variation trends with the self-attention module, reasonably sets the depth of the encoder and decoder and the model extraction module, and realizes more accurate prediction under the same available memory. The method specifically comprises the following steps: S1, preparing a data set and preprocessing: obtaining historical runoff data and related covariates, aligning them along the time dimension, and normalizing them to eliminate the influence of differing value scales on the model; S2, inputting the training set into the variable selection module, which weights the input variables through a gated residual network and related modules, screening out the related covariates that contribute more to the result and thereby obtaining better prediction precision; S3, carrying out position coding on the time sequence, wherein a position coding layer injects position information into the input data; S4, inputting the position-coded sequence into an encoder-decoder structure, wherein the encoder-decoder structure comprises the local information enhancement module, the attention module, the model extraction module and the like, the output of the encoder participates in the cross-attention calculation of the decoder, and