CN-122025126-A - Multi-scale blood glucose prediction baseline model and method based on double-flow enhancement mechanism
Abstract
The invention discloses a multi-scale blood glucose prediction baseline model and method based on a double-flow enhancement mechanism, belonging to the technical field of blood glucose prediction. The model takes original blood glucose time series data as input and divides the sequence with a dynamic multi-scale patch (DMSP) algorithm. For the global flow, a DFB-Mamba model is proposed to model global features, and a forget gate is introduced to dynamically regulate how the two branches of Mamba are combined, enhancing the long-range modeling capability of the Mamba model. For the local flow, a LoRE module and a lightweight self-attention module are proposed. A DSSAF algorithm dynamically weights and fuses the two flows of the double-flow model, and a prediction result is output. The invention addresses the severe feature coupling between the two flows of a double-flow model and achieves near-SOTA performance on two public blood glucose prediction datasets (OhioT1DM and DCLP5).
Inventors
- GUO JINHONG
- WANG YUHAN
- ZOU YUANYUAN
- CHENG JIE
- SHI LU
Assignees
- Shanghai Jiao Tong University (上海交通大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260127
Claims (10)
- 1. A multi-scale blood glucose prediction baseline model based on a double-flow enhancement mechanism, characterized in that the model mainly comprises a double-flow structure consisting of a DFB-Mamba model, a LoRE module and a lightweight self-attention module, together with a DMSP algorithm and a DSSAF algorithm embedded in the double-flow structure, and specifically operates according to the following steps: Step one, taking original blood glucose time series data as the model input; Step two, carrying out sequence division based on a dynamic multi-scale patch (DMSP) algorithm, namely inputting the time series data of step one into the DMSP algorithm, which divides the original time series data into a global sequence describing the overall trend and a local sequence describing local fluctuations; Step three, dynamically adjusting the patch parameters according to the prediction length; Step four, in the global flow, taking the global patch sequence obtained by the DMSP algorithm as the input of the DFB-Mamba model, wherein the dimension of the input sequence is expressed as ; simultaneously feeding the input sequence into two parallel branches in the DFB-Mamba model to construct a double-branch structure, the two branches being denoted as the left branch  and the right branch ; performing linear mapping and forget gate construction in the right branch; performing nonlinear activation and gating fusion in the left branch; performing SSM selective modeling in the left branch; fusing the double-branch outputs and generating the Gate-Mamba output; introducing a dynamic convolution structure and multi-scale adaptation; and denoting the final output of the Gate-Mamba module as y, wherein the dimension of y is expressed as  and the formula for y is: ; Step five, in the local flow, taking the local patch sequence obtained by the DMSP algorithm as the input of the local flow, wherein the local patch sequence first undergoes feature mapping through an embedding layer, and the output of the embedding layer is taken as the input of the LoRE module, denoted as ; meanwhile, the lightweight self-attention module uses a local window algorithm to restrict the computation among tokens to a set window; Step six, adopting the DSSAF algorithm to dynamically weight and fuse the two flows of the double-flow model, wherein the DSSAF algorithm is embedded in the DSI module; Step seven, after fusion is completed, sending the fused output to an output head for regression output, the output result being: ; finally, the final predicted sequence is output through an output head consisting of a linear layer, and is denoted as: .
- 2. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism as claimed in claim 1, wherein the original blood glucose time series data is stored in CSV file format; the CSV file comprises at least two columns of data, wherein the first column is timestamp data, which is aligned by a preprocessing operation, and the second column is the blood glucose value corresponding to each timestamp, used as the target variable of the subsequent regression prediction.
- 3. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism as claimed in claim 2, wherein the original time series is expressed as , in which  denotes each time step in the sequence,  denotes the length of the sequence, and  denotes the dimension of the sequence; a global patch division operation is performed on the original time series to obtain a global patch sequence, recorded as ; a local patch division operation is performed on the original time series to obtain a local patch sequence, recorded as , in which  denotes the patch length of a local patch; and wherein  denotes the number of tokens after patching, P denotes the patch length and S denotes the step size.
- 4. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 3, wherein in step four, the linear mapping and forget gate construction of the right branch specifically comprise: (4.2.1) feeding the input z of the right branch into a linear mapping layer to apply a linear transformation; (4.2.2) feeding the linearly mapped right-branch output into a forget gate structure; (4.2.3) using the forget gate to dynamically adjust the right-branch information so as to control the degree to which information is transmitted to the left branch; (4.2.4) normalizing the output of the forget gate with a Sigmoid activation function to generate a gating weight.
- 5. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 4, wherein in step four, the nonlinear activation and gating fusion of the left branch specifically comprise: (4.3.1) feeding the input x of the left branch into the nonlinear activation function SiLU to obtain the activation output of the left branch; (4.3.2) performing element-wise multiplication of the gating weight generated by the forget gate and the SiLU-activated result of the left branch; (4.3.3) taking the multiplication result as one of the output results of the DFB-Mamba model.
- 6. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 5, wherein in step four, the SSM selective modeling of the left branch specifically comprises: (4.4.1) in the left branch, selectively modeling the input sequence based on an SSM, the SSM selective modeling being used to model the global dependencies in the input sequence; (4.4.2) taking the output of the left branch after SSM selective modeling as another output result of the DFB-Mamba model.
- 7. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 6, wherein step five comprises the following steps: Step (5.2), a low-rank feature learning process based on a feedforward neural network (FFN), specifically: the FFN module compresses and restores the feature dimension through two linear mappings, namely first using a projection matrix  to project the high-dimensional feature vectors into a low-dimensional subspace, and then using a projection matrix  to map the low-dimensional features back to the original high-dimensional space; Step (5.3), constructing a residual connection, specifically: after the low-rank feature mapping is completed, connecting the output of the FFN module with the original input feature X through a residual connection; Step (5.4), dynamic convolution modeling in LoRE; Step (5.5), the lightweight self-attention module uses a local window algorithm to restrict the computation among tokens to the set window, the specific calculation expression being: ; wherein  represent the query vector, key vector and value vector within the local window, respectively.
- 8. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 7, wherein step six specifically comprises the following steps: Step (6.1), double-flow weight generation based on Softmax, specifically: (6.1.1) feeding the original time series into an FFN module to extract features of the input sequence, the FFN module applying a nonlinear mapping to the input sequence; (6.1.2) feeding the feature representation output by the FFN module into a Softmax function; (6.1.3) normalizing the feature representation with the Softmax function; (6.1.4) the Softmax function outputting a global weight and a local weight; Step (6.2), double-flow dynamic fusion based on DSSAF.
- 9. The multi-scale blood glucose prediction baseline model and method based on the double-flow enhancement mechanism of claim 8, wherein step (6.2) specifically comprises: (6.2.1) weighting the global-flow output features and the local-flow output features by the global weight and the local weight, respectively; (6.2.2) fusing the features of the two weighted flows to obtain the fused output of the double-flow model.
- 10. A blood glucose prediction method using the multi-scale blood glucose prediction baseline model based on the double-flow enhancement mechanism according to claim 9, characterized in that blood glucose time series data is input into the model, and a prediction result is output by the model.
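The two-column CSV input described in claim 2 can be illustrated with a minimal loading sketch. The header row, the timestamp format, the function name `load_glucose_series`, and the handling of missing readings are all assumptions made for illustration; the claim only states that the first column holds timestamps and the second holds glucose values.

```python
import csv
import io
from datetime import datetime


def load_glucose_series(csv_text, time_format="%Y-%m-%d %H:%M:%S"):
    """Parse CSV text: column 1 is a timestamp, column 2 is the glucose value."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (assumed to be present)
    rows = []
    for record in reader:
        if len(record) < 2 or not record[1].strip():
            continue  # assumption: rows without a glucose reading are dropped
        stamp = datetime.strptime(record[0].strip(), time_format)
        rows.append((stamp, float(record[1])))
    rows.sort(key=lambda row: row[0])  # keep readings in chronological order
    timestamps = [stamp for stamp, _ in rows]
    glucose = [value for _, value in rows]
    return timestamps, glucose


# Hypothetical sample data, not taken from any dataset.
sample = """timestamp,glucose
2024-01-01 00:05:00,112
2024-01-01 00:00:00,110
2024-01-01 00:10:00,
2024-01-01 00:15:00,118
"""
timestamps, glucose = load_glucose_series(sample)
print(glucose)
```

The glucose list returned here would serve as the regression target; the alignment preprocessing mentioned in the claim is not specified in the text and is reduced to sorting in this sketch.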
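The patch division of claim 3, with patch length P and step size S, can be sketched as follows. The token-count expression used here, N = floor((L - P) / S) + 1, is the usual sliding-window count and is an assumption, since the claim's own expression is not reproduced in this text. The function name `make_patches` and the concrete patch sizes are hypothetical, and the dynamic adjustment of patch parameters by prediction length is not shown.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D series into patches of length P taken every S steps."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        return []  # series too short to form a single patch
    # Assumed token count: N = floor((L - P) / S) + 1
    num_tokens = (len(series) - patch_len) // stride + 1
    return [series[i * stride: i * stride + patch_len] for i in range(num_tokens)]


series = list(range(12))  # stand-in for a glucose sequence of length L = 12

# Longer patches for the global sequence, shorter ones for the local sequence
# (the sizes are illustrative only).
global_patches = make_patches(series, patch_len=8, stride=4)
local_patches = make_patches(series, patch_len=4, stride=2)
print(len(global_patches), len(local_patches))
```

Using a longer patch for the global sequence and a shorter one for the local sequence is one plausible reading of the global/local split; the text does not fix the actual values.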
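Steps (4.2.1) to (4.3.2) of claims 4 and 5 describe a forget gate built from the right-branch input z and applied to the SiLU-activated left-branch input x. A minimal sketch under stated assumptions: vectors are plain Python lists, the linear layer has a hypothetical weight matrix and bias, and the gate structure is reduced to a linear map followed by Sigmoid. The SSM path of claim 6 and the dynamic convolution are not modeled here.

```python
import math


def sigmoid(value):
    """Numerically stable logistic function."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    expo = math.exp(value)
    return expo / (1.0 + expo)


def silu(value):
    """SiLU activation: value * sigmoid(value)."""
    return value * sigmoid(value)


def forget_gate(z, weight, bias):
    """(4.2.1)-(4.2.4): linear map of the right-branch input, then Sigmoid."""
    mapped = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weight, bias)]
    return [sigmoid(m) for m in mapped]


def forget_gated_branch(x, z, weight, bias):
    """(4.3.1)-(4.3.2): gate the SiLU-activated left branch element-wise."""
    gate = forget_gate(z, weight, bias)
    activated = [silu(v) for v in x]
    return [g * a for g, a in zip(gate, activated)]


# Hypothetical 2-dimensional example with an identity linear layer.
x = [1.0, -2.0]
z = [0.5, 0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
gated = forget_gated_branch(x, z, identity, [0.0, 0.0])
print(gated)
```

Because the gate lies strictly between 0 and 1, it can only attenuate the left-branch activation, which matches the described role of controlling how much information is passed on.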
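Steps (5.2) and (5.3) of claim 7 describe compressing a feature vector to a low-dimensional subspace, mapping it back, and adding a residual connection. The sketch below assumes the two projections are plain matrices (named `down` and `up` here, which are hypothetical names) with no activation between them, since the claim mentions only two linear mappings.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def low_rank_residual(x, down, up):
    """Project d -> r with `down`, back r -> d with `up`, then add the input."""
    compressed = matvec(down, x)        # low-dimensional representation
    restored = matvec(up, compressed)   # back in the original dimension
    return [xi + ri for xi, ri in zip(x, restored)]


# Hypothetical example with d = 4 and rank r = 2.
x = [1.0, 2.0, 3.0, 4.0]
down = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0]]           # shape r x d
up = [[1.0, 0.0],
      [0.0, 1.0],
      [0.0, 0.0],
      [0.0, 0.0]]                       # shape d x r
out = low_rank_residual(x, down, up)
print(out)
```

With rank r much smaller than d, the two matrices hold 2·d·r parameters instead of the d² of a full linear layer, which is the usual motivation for a low-rank design.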
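Step (5.5) of claim 7 restricts attention computation to a set window. The claim's expression is not reproduced in this text, so the sketch below assumes standard scaled dot-product attention, softmax(QKᵀ/√d)V, evaluated over a sliding window of a hypothetical `radius` around each token. Whether the original method uses sliding or non-overlapping windows is not stated.

```python
import math


def softmax(scores):
    """Softmax with max-subtraction for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_window_attention(q, k, v, radius):
    """Each token attends only to tokens at most `radius` positions away."""
    dim = len(q[0])
    outputs = []
    for i, query in enumerate(q):
        lo, hi = max(0, i - radius), min(len(k), i + radius + 1)
        window = range(lo, hi)
        scores = [sum(a * b for a, b in zip(query, k[j])) / math.sqrt(dim)
                  for j in window]
        weights = softmax(scores)
        outputs.append([sum(w * v[j][c] for w, j in zip(weights, window))
                        for c in range(len(v[0]))])
    return outputs


# Hypothetical tokens; queries, keys and values are the same vectors here.
tokens = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
attended = local_window_attention(tokens, tokens, tokens, radius=1)
print(attended)
```

For a sequence of n tokens and window width w, this costs on the order of n·w score computations rather than n², which is what makes the module lightweight.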
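The weight generation and fusion of claims 8 and 9 can be sketched as a small FFN that produces two logits, a Softmax that turns them into a global weight and a local weight, and a weighted combination of the two flows. The ReLU nonlinearity, the weight matrices, the function names, and the choice of a weighted sum as the fusion operation are all assumptions; the claims state only that the FFN is nonlinear and that the weighted features are fused.

```python
import math


def softmax(scores):
    """Softmax with max-subtraction for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stream_weights(series, w_hidden, w_out):
    """(6.1.1)-(6.1.4): FFN features of the raw series -> two Softmax weights."""
    # Assumed nonlinearity: ReLU after the first linear map.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, series))) for row in w_hidden]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)  # [global_weight, local_weight]


def fuse(global_feat, local_feat, weights):
    """(6.2.1)-(6.2.2): weight each flow, then combine (a sum is assumed)."""
    w_global, w_local = weights
    return [w_global * g + w_local * l for g, l in zip(global_feat, local_feat)]


# Hypothetical inputs and parameters.
series = [0.2, 0.4, 0.6]
w_hidden = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
w_out = [[1.0, 0.0], [0.0, 1.0]]
weights = stream_weights(series, w_hidden, w_out)
fused = fuse([2.0, 4.0], [0.0, 0.0], weights)
print(weights, fused)
```

Because the two weights come from a Softmax they are positive and sum to one, so the fused output is a convex combination of the two flows that depends on the input sequence.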
Description
Multi-scale blood glucose prediction baseline model and method based on double-flow enhancement mechanism

Technical Field

The invention relates to the technical field of blood glucose prediction, in particular to a multi-scale blood glucose prediction baseline model and method based on a double-flow enhancement mechanism.

Background

Time series prediction plays an important role in critical applications such as climate monitoring, financial analysis and medical early warning. Advances in deep learning have markedly improved models' ability to capture complex dependencies and multi-time-scale dynamics, and in particular have advanced long-term time series forecasting, so that models can capture long-range associations (such as Khatri-Rao Neural Operators) while remaining lightweight (such as SparseTSF). Physiological signals are typically strongly non-stationary time series driven jointly by multiple factors, with high inter-individual variability and mode aliasing. Modeling local abrupt changes in long-term physiological time series and accurately predicting future changes in the signal has long been a difficult problem in the medical field. Among such signals, the glucose trace is particularly complex, and its local variability tends to be directly related to high-risk events. Specifically: first, blood glucose levels are easily affected by factors such as diet and physical activity, fluctuate widely and are often accompanied by abrupt jumps, and a model's prediction error in these critical regions is often significantly higher than in stable periods.
Although many studies have attempted to solve these problems, for example by handling postprandial blood glucose excursions in diabetes with personalized machine learning models (Brugger et al.) or by predicting abrupt changes in the blood glucose sequence in advance with interpretable machine learning models (Duckworth et al.), their sensitivity to local effects remains poor and the models do not generalize well. Secondly, as the length of the history sequence grows, it becomes harder for the model to capture local abrupt changes in the long sequence: because the model cannot distinguish global features from local features, an overly long input leads it to treat everything as global trend and to ignore local change points entirely. Although some approaches hard-divide sequences into long-term and short-term parts (Ding et al.) or use a dual-branch structure for separate modeling (Zhao et al.) to improve both long-range modeling and local perception, the results indicate that model performance is still limited, and that as sequences grow longer the models capture local fluctuations increasingly poorly. In addition, at the design level of the double-flow model, although the double-flow structure can decouple global and local modes architecturally, existing methods still struggle to capture local abrupt changes effectively, and the double-flow structure degrades in particular in long-sequence scenarios. A possible reason is that global and local features remain strongly coupled in the high-dimensional representation and the feature difference between the two paradigms is small, so that information interferes mutually; even if global and local features are captured separately in an explicit manner, the two cannot each be modeled effectively.
A simple double-flow design faces two major limitations: (1) coarse feature fusion usually just concatenates the two paradigms directly along a dimension and ignores the structural characteristics of the original sequence, so the fusion cannot truly reflect the semantic difference between the two types of modes; (2) the boundary between the global mode and the local mode is inherently ambiguous: the outputs of the two flows still share a large number of features, and strong coupling causes mutual interference, so that the advantages of the two flows cannot be fully exploited. Therefore, an effective double-flow framework must not only achieve structural separation, but also further enlarge the representational difference between the two modes and weaken information interaction between them in order to truly improve overall performance. Based on the above, the invention designs a multi-scale blood glucose prediction baseline model and method based on a double-flow enhancement mechanism to solve the above problems.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a multi-scale blood glucose prediction baseline model and method based on a double-flow enhancement mechanism. In order to achieve the above purpose, the invention is realized by the following technical sch