CN-122020560-A - Time sequence prediction method and system based on multi-scale and interactive fusion
Abstract
A time sequence prediction method and system based on multi-scale and interactive fusion comprises the steps of embedding time sequence features, mapping an original time sequence to a unified hidden dimension, adapting the time sequence format, carrying out channel grouping, constructing a global branch based on time-channel coordinate attention to generate a global feature representation, extracting local time sequence patterns through local convolution, interactively fusing the global and local features through cross-branch time weights, carrying out sequence modeling in which long-term dependence is captured by an LSTM network, and carrying out prediction output in which the prediction result for future time steps is generated by a prediction head. The invention overcomes the defects of existing models in channel attention, multi-scale feature utilization, and computational cost through multi-scale decoupling and interactive fusion, remarkably improves prediction accuracy while keeping low computational complexity, and has wide application prospects in fields such as traffic flow prediction, energy load prediction, and financial analysis.
Inventors
- LI PEIJIN
- HU JILIN
Assignees
- 华东师范大学
Dates
- Publication Date
- 20260512
- Application Date
- 20260209
Claims (10)
- 1. The time sequence prediction method based on multi-scale and interactive fusion is characterized by comprising the following steps: S1, time sequence feature embedding, namely mapping the original time sequence to a unified hidden dimension; S2, time sequence format adaptation, namely rearranging into a [B, N, T, C] format and carrying out channel grouping; S3, global branch construction, namely generating a global feature representation based on time-channel coordinate attention; S4, local branch construction, namely extracting local time sequence patterns by local convolution; S5, interactive fusion, namely fusing global and local features through cross-branch time-weight cross-weighting; S6, sequence modeling, namely capturing long-term dependence with an LSTM network; S7, prediction output, namely generating the prediction result for future time steps through a prediction head.
- 2. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, wherein the time sequence feature embedding comprises: S11, multi-dimensional reconstruction, namely grouping and reorganizing the mapped hidden-layer features according to preset scale factors, and converting the feature tensor from a three-dimensional time sequence format to a four-dimensional space-time format adapted to multi-scale convolution; S12, dimension alignment, namely keeping the features of different groups aligned in the hidden dimension through dimension rearrangement or linear transformation so as to support subsequent interactive fusion; the feature mapping follows the following mathematical relationship: X ∈ R^{B×T×C}, X_embedded = Linear(X) ∈ R^{B×T×hidden_dim}, where B is the batch size, T is the input time-step length, C is the number of feature channels, hidden_dim is the hidden-layer dimension, X_embedded is the embedded feature tensor obtained after mapping, and R denotes the real number domain.
- 3. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, wherein the time sequence format adaptation comprises: S21, format rearrangement, namely rearranging the one-dimensional time sequence into a [B, N, T, C] format, wherein N denotes the number of branches and is obtained by copying along the branch dimension; S22, channel grouping, namely dividing the channel dimension into g groups, each group containing C_g = C/g channels, to form the grouped feature F; the format adaptation follows the following tensor reassembly expressions: X̃ = Reshape(X_embedded, [B, N, T, C]), C_g = C/g, F = Reshape(X̃, [B×g, N, T, C_g]), where N is the number of branches (virtual nodes), g is the number of channel groups, C_g is the number of channels per group, X̃ ∈ R^{B×N×T×C} is the rearranged input, and F ∈ R^{(B·g)×N×T×C_g} is the grouped intermediate feature.
- 4. The time sequence prediction method based on multi-scale and interactive fusion, characterized in that the global branch construction, namely the time-channel coordinate attention, comprises the following steps: S31, for each grouped feature F, performing adaptive average pooling AdaptiveAvgPool2d along the time dimension and the node/channel dimension respectively to obtain a time descriptor z_t and a channel descriptor z_c; S32, feature fusion, namely expanding z_t along the time dimension and z_c along the node dimension, concatenating them, mixing the information through a 1×1 convolution Conv_{1×1}, and splitting the result into time attention a_t and channel attention a_c; S33, attention weighting, namely applying Sigmoid activation and then element-wise multiplication with the original feature F to obtain the global branch representation G; the global branch construction follows the following space-time coordinate weighting relations: z_t = AdaptiveAvgPool2d_time(F) ∈ R^{B×N×1×C_g}, z_c = AdaptiveAvgPool2d_node(F) ∈ R^{B×1×T×C_g}, S = Concat(Expand(z_t, T), Expand(z_c, N)), U = Conv_{1×1}(S), [a_t, a_c] = Split(U), G = F ⊙ σ(a_t) ⊙ σ(a_c), where z_t is the time-dimension pooling result, z_c is the node/channel-dimension pooling result, ⊙ denotes element-wise multiplication (Hadamard product), σ(·) is the Sigmoid activation function, Concat(·, ·) denotes concatenation along the specified dimension, and G is the global branch feature.
- 5. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, wherein the interactive fusion comprises: S51, time-weight calculation, namely applying global average pooling GAP_node over the node dimension to the global branch G and the local branch L respectively, taking the mean Mean_c over the channel dimension, and applying Softmax normalization to obtain time-dimension attention weight sequences u_G(t) and u_L(t); S52, weight broadcasting, namely broadcasting the weights to the original feature dimensions to obtain weight tensors W_G and W_L; S53, cross weighting, namely weighting the global feature G with the local weight W_L to obtain G', and weighting the local feature L with the global weight W_G to obtain L'; S54, feature fusion, namely adding the cross-weighted features to obtain the fused feature F_out; the expressions are as follows: u_G(t) = Softmax(Mean_c(GAP_node(G))) ∈ R^{B×1×T}, u_L(t) = Softmax(Mean_c(GAP_node(L))) ∈ R^{B×1×T}, W_G = Broadcast(u_G, [B, N, T, C_g]), W_L = Broadcast(u_L, [B, N, T, C_g]), G' = G ⊙ W_L, L' = L ⊙ W_G, F_out = G' + L', where GAP_node(·) is global average pooling over the node dimension, Mean_c(·) is averaging over the channel dimension, Softmax(·) is Softmax normalization, Broadcast(·) is the broadcast operation extending a low-dimensional tensor to higher dimensions, G' is the interaction-weighted global feature, and L' is the interaction-weighted local feature.
- 6. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, characterized in that the interactive fusion adopts the following formulas: G' = G ⊙ W_L, L' = L ⊙ W_G, F_out = G' + L', wherein W_L is the time attention weight of the local branch and W_G is the time attention weight of the global branch, both obtained by Softmax normalization.
- 7. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, wherein the prediction output comprises: S71, feature projection, namely feeding the deep feature vector extracted after multi-scale attention and interactive fusion into a two-layer fully-connected network, wherein the first fully-connected layer maps the feature dimension to the hidden-layer dimension, combined with a nonlinear activation function and dropout, and the second fully-connected layer maps to the target feature dimension; S72, time-dimension transformation, namely feeding the projected features into a prediction head based on a recurrent structure or a linear mapping module, mapping the time dimension from the input length T to the prediction length T'; S73, result output, namely outputting the predicted sequence for T' future time steps; the prediction output follows the following relations: H_proj = OutputProjection(H) ∈ R^{B×T×C}, Ŷ = PredictionHead(H_proj) ∈ R^{B×T'×C}, where T' is the prediction time-step length, Ŷ ∈ R^{B×T'×C} is the prediction output, and PredictionHead performs the time-dimension transformation R^T → R^{T'}.
- 8. The time sequence prediction method based on multi-scale and interactive fusion according to claim 1, wherein the coordinate attention comprises time-dimension pooling and channel/node-dimension pooling, with attention weights generated via 1×1 convolution.
- 9. The time sequence prediction method based on multi-scale and interactive fusion according to claim 3, wherein the channel grouping number g ranges from 8 to 32 and the number of branches ranges from 4 to 16.
- 10. A time sequence prediction system based on multi-scale and interactive fusion, characterized by comprising the following modules connected in sequence: a feature embedding module, which receives the original time sequence X and outputs embedded features to the format adaptation module; a format adaptation module, which rearranges and groups the embedded features and outputs the grouped feature F to the global branch module and the local branch module; a global branch module, which applies time-channel joint coordinate attention to F and outputs the global feature G to the interactive fusion module; a local branch module, which applies 3×3 convolution to F and outputs the local feature L to the interactive fusion module; an interactive fusion module, which performs cross-weighted fusion of G and L and outputs the fused feature F_out to the sequence modeling module; a sequence modeling module, which performs sequence modeling on F_out and outputs the hidden state H to the prediction output module; and a prediction output module, which maps H to the prediction result and outputs it.
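The pipeline of claims 1–7 can be sketched as a minimal, hypothetical PyTorch implementation. All class, function, and parameter names below (CoordAttention, interactive_fusion, MultiScaleFusionModel, hidden_dim, n_branch, etc.) are illustrative assumptions, not part of the patent, and details such as the exact pooling operators and prediction-head structure are simplified:

```python
# Hypothetical sketch only: names and hyper-parameters are illustrative,
# not taken from the patent.
import torch
import torch.nn as nn


class CoordAttention(nn.Module):
    """Time-channel coordinate attention (claim 4) over F of shape (B', N, T, C_g)."""
    def __init__(self, c_g):
        super().__init__()
        # Stand-in for the 1x1 convolution Conv_{1x1} mixing the pooled descriptors.
        self.mix = nn.Linear(2 * c_g, 2 * c_g)

    def forward(self, f):                            # f: (B', N, T, C_g)
        z_t = f.mean(dim=2, keepdim=True)            # pool over time  -> (B', N, 1, C_g)
        z_c = f.mean(dim=1, keepdim=True)            # pool over nodes -> (B', 1, T, C_g)
        s = torch.cat([z_t.expand_as(f), z_c.expand_as(f)], dim=-1)
        a_t, a_c = self.mix(s).chunk(2, dim=-1)      # split into time / channel attention
        return f * torch.sigmoid(a_t) * torch.sigmoid(a_c)  # G = F * sig(a_t) * sig(a_c)


def interactive_fusion(g, l):
    """Cross-branch time weighting (claim 5): F_out = G * W_L + L * W_G."""
    u_g = torch.softmax(g.mean(dim=(1, 3)), dim=-1)  # (B', T) time weights of G
    u_l = torch.softmax(l.mean(dim=(1, 3)), dim=-1)  # (B', T) time weights of L
    w_g = u_g[:, None, :, None]                      # broadcast to (B', 1, T, 1)
    w_l = u_l[:, None, :, None]
    return g * w_l + l * w_g                         # cross weighting, then sum


class MultiScaleFusionModel(nn.Module):
    def __init__(self, c_in, hidden_dim, n_branch, groups, t_in, t_out):
        super().__init__()
        assert hidden_dim % groups == 0
        self.n, self.g, self.c_g = n_branch, groups, hidden_dim // groups
        self.embed = nn.Linear(c_in, hidden_dim)                      # S1 embedding
        self.attn = CoordAttention(self.c_g)                          # S3 global branch
        self.local = nn.Conv2d(self.c_g, self.c_g, 3, padding=1)      # S4 local 3x3 conv
        self.lstm = nn.LSTM(n_branch * hidden_dim, hidden_dim,
                            batch_first=True)                         # S6 sequence model
        self.head = nn.Sequential(nn.Linear(hidden_dim, hidden_dim),  # S7 two-layer head
                                  nn.ReLU(), nn.Dropout(0.1),
                                  nn.Linear(hidden_dim, c_in))
        self.time_map = nn.Linear(t_in, t_out)                        # maps T -> T'

    def forward(self, x):                            # x: (B, T, C_in)
        b, t, _ = x.shape
        h = self.embed(x)                            # (B, T, hidden)
        h = h.unsqueeze(1).repeat(1, self.n, 1, 1)   # S2: copy along branch dim N
        f = (h.reshape(b, self.n, t, self.g, self.c_g)
              .permute(0, 3, 1, 2, 4)                # group channels: (B, g, N, T, C_g)
              .reshape(b * self.g, self.n, t, self.c_g))
        g_feat = self.attn(f)                                            # S3
        l_feat = self.local(f.permute(0, 3, 1, 2)).permute(0, 2, 3, 1)   # S4
        fused = interactive_fusion(g_feat, l_feat)                       # S5
        fused = (fused.reshape(b, self.g, self.n, t, self.c_g)
                      .permute(0, 3, 2, 1, 4)        # back to (B, T, N, g, C_g)
                      .reshape(b, t, -1))
        h_seq, _ = self.lstm(fused)                  # S6: (B, T, hidden)
        y = self.head(h_seq)                         # (B, T, C_in)
        return self.time_map(y.transpose(1, 2)).transpose(1, 2)  # S7: (B, T', C_in)
```

Under this sketch, an input batch of shape (B, T, C) yields a prediction of shape (B, T', C); for instance, with hidden_dim=32, n_branch=4, and groups=8 (values within the ranges of claim 9), an input of shape (2, 16, 7) produces an output of shape (2, 8, 7).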
Description
Time sequence prediction method and system based on multi-scale and interactive fusion
Technical Field
The invention relates to the technical field of time sequence prediction and deep learning, in particular to a time sequence prediction method and system based on multi-scale and interactive fusion.
Background
Time sequence prediction is an important research direction in artificial intelligence and data mining, with wide application in traffic flow prediction, energy management, financial analysis, weather forecasting, and other fields. Traditional time series prediction methods such as ARIMA and exponential smoothing mainly rely on statistical assumptions and have difficulty capturing complex nonlinear temporal patterns. In recent years, deep learning methods have made remarkable progress in time series prediction: recurrent neural networks (RNN) and their variants LSTM and GRU can model long-term temporal dependence, convolutional neural networks (CNN) can extract local temporal patterns, and the Transformer with its attention mechanism excels at capturing global dependence. However, the prior art still has the following disadvantages. Channel attention is insufficient: in a multivariate time sequence, different feature channels differ in importance, yet existing methods usually apply uniform weights to all channels and do not fully exploit inter-channel correlations and differences. Multi-scale modeling is insufficient: a time sequence simultaneously contains global trends and local fluctuations, and single-scale feature extraction can hardly describe the temporal dynamics comprehensively.
Although multi-scale attention mechanisms such as EMA (Efficient Multi-scale Attention) have succeeded in computer vision, a dimension-mismatch problem arises when they are applied directly to time sequences. Existing multi-branch networks generally fuse global and local features by simple concatenation or addition and cannot explicitly model the interaction between features of different scales, so information integration is insufficient. Computational complexity is also high: although cross-space learning based on matrix multiplication can model long-range dependence, its computational cost is high in long-sequence scenarios and can hardly meet real-time prediction requirements. Therefore, a time sequence prediction method is needed that fully utilizes the two-dimensional time and channel information, efficiently fuses multi-scale features, and maintains low computational complexity.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a time sequence prediction method and system based on multi-scale and interactive fusion, which solve the technical problems of insufficient channel attention, insufficient multi-scale modeling, overly simple branch fusion, and high computational complexity in the prior art.
In order to achieve the above purpose, the invention adopts the following technical scheme. A time sequence prediction method based on multi-scale and interactive fusion comprises the following steps: S1, time sequence feature embedding, namely mapping the original time sequence to a unified hidden dimension; S2, time sequence format adaptation, namely rearranging into a [B, N, T, C] format and carrying out channel grouping; S3, global branch construction, namely generating a global feature representation based on time-channel coordinate attention; S4, local branch construction, namely extracting local time sequence patterns by local convolution; S5, interactive fusion, namely fusing global and local features through cross-branch time-weight cross-weighting; S6, sequence modeling, namely capturing long-term dependence with an LSTM network; S7, prediction output, namely generating the prediction result for future time steps through a prediction head. Wherein step S1, time sequence feature embedding, comprises: S11, multi-dimensional reconstruction, namely grouping and reorganizing the mapped hidden-layer features according to preset scale factors, and converting the feature tensor from a three-dimensional time sequence format to a four-dimensional space-time format adapted to multi-scale convolution; S12, dimension alignment, namely keeping the features of different groups aligned in the hidden dimension through dimension rearrangement or linear transformation so as to support subsequent interactive fusion; the feature mapping follows the following mathematical relationship: X ∈ R^{B×T×C}, X_embedded = Linear(X) ∈ R^{B×T×hidden_dim}, wherein B is the batch size, T is the input time-step length, C is the feature dimension (channel number), hi