CN-115796343-B - Prediction method for micro-scene risk-cause-effect relationship of multi-element sequence of infant formula milk powder production sequence
Abstract
The invention provides a method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence. The method comprises: S1, selecting sequences from the process indexes of infant formula milk powder and extracting the sequence data to form M pieces of sequence data from different sources, which make up a data set D; and S2, performing stabilization processing on the sequence data in data set D and expanding D with a sliding time slice to form a new data set Data, which comprises a target sequence and variable sequences. For the sequences related to the raw material indexes, production process parameters, and finished product indexes of infant formula milk powder, the invention extracts similar sequence segments and excludes interfering sequence segments, selects the sequence segments most relevant to the target prediction index through causal analysis, and establishes related micro-scenes for predictive analysis. In this way it identifies the key control points and key process control parameters that have the greatest influence on the quality of the finished milk powder, and finds relatively optimal values for them.
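As an illustration of the sliding time slice in step S2, the following Python sketch cuts one recorded variable into overlapping windows. It is a minimal sketch under stated assumptions: one sample per minute is assumed, the 60 min window length and 10 min advance are taken from claim 2, and the function name sliding_windows and the synthetic readings are hypothetical and not part of the patent.

```python
def sliding_windows(series, window=60, step=10):
    """Return overlapping windows of `series` as a list of lists.

    Assumes one sample per minute, so window=60 and step=10 correspond to
    a 60 min window advanced by 10 min (values taken from claim 2).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(series) - window
    # When the series is shorter than the window, the range is empty.
    return [list(series[s:s + window]) for s in range(0, last_start + 1, step)]


# Hypothetical example: 120 minutes of readings from one instrument.
readings = [float(minute) for minute in range(120)]
windows = sliding_windows(readings)
print(len(windows), len(windows[0]))  # prints "7 60"
```

With a stride shorter than the window, consecutive subsequences overlap, which is how a single production run can yield many training subsequences per variable.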
Inventors
- LI GUANG
- GONG WEIWEN
- CHU XIAOJUN
- XIONG LINA
- BAO WEIHUA
- JIANG YANXI
- Yang Pinfeng
- DAI ZHIJUN
- KONG YING
- MA LI
Assignees
- Beingmate (Hangzhou) Food Research Institute Co., Ltd. (贝因美(杭州)食品研究院有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20221117
Claims (5)
- 1. A method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence, characterized by comprising the following steps: S1, selecting sequences from the process indexes of infant formula milk powder and extracting the sequence data to form M pieces of sequence data from different sources, which form a data set D; S2, performing stabilization processing on the sequence data in data set D, and expanding D with a sliding time slice to form a new data set Data, wherein Data comprises a target sequence and variable sequences; S3, clustering by extracting the most distinguishable subsequence of each variable sequence as the characteristic sequence of that variable sequence, denoted the Rs sequence, and clustering the extracted Rs sequences into Clu, wherein Clu_{m,i} denotes the i-th cluster center line of variable m and uniquely identifies a category, the number of clusters of each variable is denoted cnt(Clu_m), and Clu_{m,i} is the Rs sequence obtained above; S4, randomly combining the Clu of different variables m to form sequence cubes Scube, wherein Scube_k denotes the k-th sequence cube, each sequence is denoted S_{v,t}, v denotes the v-th variable sequence, and t denotes the t-th subsequence of that variable sequence; S5, screening each extracted sequence cube Scube to obtain the related micro-scenes (M-scenes), and defining a scene transfer entropy for each M-scene to describe the causal influence of the scene on the target index sequence; S6, performing similarity judgment, matching the micro-scene with the highest correlation, and performing risk prediction with the model of that micro-scene; wherein, in step S6, for the variable sequences recorded by all instruments and equipment at a given moment in the milk powder process, the preceding sequence set of length len = 60 min is taken, and the four sequences of base powder protein content, total bacterial colony count, water content, and vitamin content are removed, finally yielding M-4 variable sequences of length 60 min, each denoted preS_m, 0 < m ≤ M, where m denotes the m-th variable; [formula omitted in source]; the Clu_{m,i} with the minimum sequence similarity value is selected as the category of each sequence, the M-4 resulting preClu are combined, and the scene whose scene transfer entropy is the largest and exceeds an entropy threshold is selected from the M-scenes, this being the scene with the highest similarity and the highest correlation; and, in step S6, after the micro-scene fitting method of the multi-granularity space has been applied, a training set is constructed from the scene pattern and prediction is performed by ridge regression: the N process index sequences of a time period are spliced longitudinally with the target index sequence translated by step = 10 min, the sequence groups of all time periods of the scene are spliced transversely, the last row is used as the y value and the remaining rows as the X value, and prediction is performed with a ridge regression model.
- 2. The method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence according to claim 1, wherein in step S2 the data set D is expanded as follows: a sampling time slice T is determined, whose length is the duration of the whole milk powder production process; the time slice is then divided with a time window of length 60 min that advances by 10 min each time; sampling is repeated T/60min times within the time slice T, generating T/60min time-series subsequences, which form the data set Data; the number of sequences of each variable is denoted N, and there are M variables in total; each sequence is denoted S_{m,n}, 0 < m ≤ M, 0 < n ≤ N, where m denotes the m-th variable and n denotes the n-th sequence of the m-th variable; and each sequence has a length of 60 min.
- 3. The method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence according to claim 1, wherein in step S3 the clustering is performed as follows: first, an OrderRank array is defined, which records in increasing order, for a subsequence S and each time series [symbol omitted in source] of the multi-source time-series data set [symbol omitted in source], the maximum of the distances between S and all subsequences of that time series; the value of each split point dt on the OrderRank array is then calculated as the average of two adjacent distance values in the array; each split point dt divides OrderRank into two subsets and the entire data set Data into two corresponding subsets [symbols omitted in source], wherein: [formula omitted in source]; the value of a split point dt on the OrderRank array is evaluated by its gap value, calculated as: [formula omitted in source]; wherein the symbols of the formula [omitted in source] are defined on the two subsets, mean denotes the mean, and std denotes the variance; the dt with the maximum gap value is the optimal split point, and the larger the gap value, the wider the separation between the two subsets and the better the partition.
- 4. The method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence according to claim 1, wherein in step S3, for the same variable, all subsequences in the data set Data are assigned to categories by calculating the sequence similarity between each sequence and every Clu_{m,i} sequence and selecting the category with the smallest sequence similarity value.
- 5. The method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence according to claim 1, wherein the causality between the variable sequences and the target sequence is calculated by transfer entropy, using the following modified transfer entropy formula: [formula omitted in source]; wherein [symbol omitted in source] denotes the transfer entropy from variables A, B, and C to variable X, step denotes the offset, [auxiliary definitions omitted in source], A, B, and C are different variable names, k denotes the sequence length, and [symbol omitted in source] denotes the length-k sequence taken from variable X at time t; when [condition omitted in source], A, B, and C are causally related to X; because the time series contain many interference factors, further screening is carried out on the related micro-scenes and the top seventy percent are selected, so that every group of sequences in a related micro-scene has a strong causal relationship.
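Claim 5 relies on transfer entropy, but the text of its modified formula is not available here. Purely as an illustration, the Python sketch below computes the standard pairwise transfer entropy, TE(source -> target) = sum of p(x_{t+step}, x_t, y_t) * log2[ p(x_{t+step} | x_t, y_t) / p(x_{t+step} | x_t) ], for sequences that have already been discretized, with a history length of 1. This is the textbook quantity, not the three-variable, length-k variant of the claim; the function name transfer_entropy, the plug-in (frequency count) estimator, and the synthetic data are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target, step=1):
    """Plug-in estimate, in bits, of the transfer entropy source -> target.

    Both inputs are equal-length sequences of discrete symbols; `step` is the
    offset between the history (x_t, y_t) and the predicted value x_{t+step}.
    """
    if len(source) != len(target):
        raise ValueError("sequences must have equal length")
    triples = [(target[t + step], target[t], source[t])
               for t in range(len(target) - step)]
    n = len(triples)
    joint = Counter(triples)                                # (x_future, x_now, y_now)
    hist_both = Counter((x0, y0) for _, x0, y0 in triples)  # (x_now, y_now)
    fut_hist = Counter((x1, x0) for x1, x0, _ in triples)   # (x_future, x_now)
    hist = Counter(x0 for _, x0, _ in triples)              # x_now
    te = 0.0
    for (x1, x0, y0), count in joint.items():
        p_with_source = count / hist_both[(x0, y0)]  # p(x_future | x_now, y_now)
        p_without = fut_hist[(x1, x0)] / hist[x0]    # p(x_future | x_now)
        te += (count / n) * math.log2(p_with_source / p_without)
    return te


# Synthetic check: the target copies the source with a one-step delay.
random.seed(0)
src = [random.randint(0, 1) for _ in range(2000)]
tgt = [0] + src[:-1]
forward = transfer_entropy(src, tgt)   # expected to be close to 1 bit
backward = transfer_entropy(tgt, src)  # expected to be close to 0 bits
print(round(forward, 3), round(backward, 3))
```

The asymmetry between the two directions is what makes transfer entropy usable as a causal screening score: a process variable that helps predict the target beyond the target's own history scores high, while the reverse direction scores near zero.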
Description
Prediction method for micro-scene risk-cause-effect relationship of multi-element sequence of infant formula milk powder production sequence
Technical Field
The invention relates to the field of quality and safety risk monitoring of infant formula milk powder, and in particular to a method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence.
Background
The quality of infant formula milk powder is affected by raw materials, processes, and other factors, so it is important to identify the key working procedures and to set suitable process parameters. At present, most infant formula milk powder enterprises set their process control parameters to a middle value within the range specified by national standards, and a small number of enterprises combine past research experience with a limited number of single-factor tests to determine key control points and the related process control parameters. These approaches can guarantee product quality at a low threshold, but they are relatively coarse: key control points cannot be determined accurately, and relatively optimal process control parameters cannot be set.
For milk powder sequence prediction, the current time-series prediction algorithms compare as follows:
(1) Univariate time-series prediction. The ARIMA model (an autoregressive integrated moving average model) first converts a non-stationary time series into a stationary one, and then regresses the dependent variable only on its own lagged values and on the current and lagged values of a random error term. The method is suitable only for single-sequence prediction; multi-step prediction can be performed.
(2) LSTM. LSTM is suitable for learning and predicting a single long time series; its accuracy is high in single-step prediction but not in multi-step prediction.
(3) Multivariate time-series prediction (VAR model). The vector autoregressive model is a commonly used economic model that regresses all current-period variables in the model on several lagged values of all variables. It can perform multi-step prediction and risk prediction, but cannot mine the causal relationships among the sequences.
(4) Traditional statistical regression (Naive). All variable sequences are taken as x and the translated target sequence as y for prediction. The method is simple to operate and can perform multi-step prediction, but its accuracy is low.
In milk powder time series, many factors are linked by wide-ranging complex relationships, including correlations and causal relationships, and the interactions among these factors are unclear. The time series also contain many kinds of complex information, including much interference information that degrades prediction accuracy. Existing prediction methods therefore have two major shortcomings. First, accuracy in multi-step time-series prediction is low, so the influence of interference information must be reduced to improve accuracy. Second, in the predictive analysis of multi-element time series, existing methods cannot mine the causal relationships among the sequences; for example, the multivariate time-series prediction method can only perform risk prediction and cannot explain the cause of an abnormality.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence.
According to the invention, for the sequences related to the raw material indexes, production process parameters, and finished product indexes of infant formula milk powder, similar sequence segments are extracted and interfering sequence segments are excluded; the sequence segments most relevant to the target prediction index are selected through causal analysis; related micro-scenes are established for predictive analysis; and the key control points and key process control parameters that have the greatest influence on the quality of the finished milk powder are identified, and relatively optimal values are found for them.
To solve the above technical problems, the invention is realized through the following technical scheme:
A method for predicting micro-scene risk-cause-effect relationships in multi-element sequences of the infant formula milk powder production sequence, characterized by comprising the following steps: S1, selecting sequences in main process indexes of infant formula milk powder, extracting sequence data to form M pieces of sequence data of different