CN-115935218-B - Intermittent process measurement data anomaly detection method based on dual support vector data description
Abstract
The invention discloses an intermittent process measurement data anomaly detection method based on dual support vector data description (SVDD). First, the intermittent process is divided into modes with a fuzzy clustering method to obtain a data set for each mode, and the data of each stable mode are fused with the data of its adjacent transition modes to construct fused-mode data sets. Next, an inner SVDD model is trained on the stable-mode and transition-mode data sets, and an outer SVDD model is trained on the fused-mode data sets. Finally, the inner and outer control limits of the dual SVDD model are combined with a data anomaly discrimination strategy to detect anomalies in the intermittent process measurement data. Because the method takes full account of how the data change between different modes of the intermittent process, it can effectively reduce the false detection rate and improve the accuracy of anomaly detection for intermittent process measurement data.
Inventors
- WANG JIANLIN
- Ai Xingcong
- ZHOU XINJIE
Assignees
- Beijing University of Chemical Technology (北京化工大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20220421
Claims (4)
- 1. An intermittent process measurement data anomaly detection method based on dual support vector data description (SVDD), characterized by comprising the following steps: Step one: collecting batch intermittent process data, carrying out modal division on the data with the fuzzy C-means (FCM) algorithm, obtaining stable-mode and transition-mode data sets according to the modal division information, and obtaining fused-mode data sets by fusing the data of each stable mode with the data of its adjacent transition modes, wherein the batch intermittent process data are penicillin fermentation process variables including ventilation rate, stirring power, substrate flow rate, substrate flow temperature, substrate concentration, dissolved oxygen concentration, biomass concentration, penicillin concentration, reactor volume, carbon dioxide concentration, pH, reactor temperature, heat production, acid flow rate, alkali flow rate, cooling water flow rate and heating water flow rate; Step two: training an inner SVDD model on every stable-mode and transition-mode data set, training an outer SVDD model on every fused-mode data set, and constructing a dual-SVDD anomaly detection model for intermittent process measurement data; Step three: constructing a data anomaly discrimination strategy from the inner and outer hypersphere radii R_in and R_out of the dual SVDD model combined with the moving window idea, and carrying out online anomaly detection on the intermittent process measurement data; wherein step three specifically comprises the following: for an online sample, the mode to which the sample belongs is determined from its sampling time, and the sample is normalized with the mean and standard deviation of that mode's data set; the normalized sample is fed into the inner SVDD model of that mode to obtain the distance between the sample and the hypersphere center; this distance is compared with R_in and R_out: when the distance is less than or equal to R_in, the sample is directly judged to be a normal sample; when the distance is greater than or equal to R_out, the sample is directly judged to be an abnormal sample; when the distance is greater than R_in and smaller than R_out, the sample cannot be directly judged normal or abnormal, and its state is judged with a moving window, wherein the left end point of the moving window is denoted WIN_L and the right end point WIN_R; the window width need only be less than or equal to the length of the shortest mode; the movement of the window is confined to a single mode, which ensures that the state of the sample at the current moment refers only to the state information of data samples in the current mode; the window moves according to the following rules: (1) when the sampling time is the right end point of the window and is not equal to the right boundary of the current mode, the window moves backward by one width without exceeding the right boundary of the current mode; (2) when the sampling time is the right end point of the window and is equal to the right boundary of the current mode, the window moves directly backward by one width; (3) inside the current mode, when the sampling time coincides with the left end point of the window, the window moves forward until its right end point coincides with the sampling time; given the sampling time of the sample, its state is judged with the established moving window as follows: (1) when the center distance crosses from inside R_in to between R_in and R_out, the standard deviation of the center distances of the samples from the left end point of the window to the previous moment is calculated, together with the standard deviation after the current sample is added; if the two do not differ significantly, the sample is still considered a normal sample; if the standard deviation after adding the sample changes by a factor of 2 or more, the sample is considered an abnormal sample; (2) when the center distance crosses from outside R_out to between R_in and R_out, the mean center distance of the samples from the left end point of the window to the current moment is calculated; if this mean is smaller than the prescribed limit, the sample is considered a normal sample; if it is greater than or equal to that limit, the sample is considered to be still abnormal; furthermore, the state remains unchanged in two cases: when the sample is at the left end point of the current mode, or when the sample at the previous moment also lies between R_in and R_out, the sample is considered to have the same state as the sample at the previous moment.
- 2. The intermittent process measurement data anomaly detection method based on dual support vector data description according to claim 1, wherein step one comprises the following steps: collecting intermittent process measurement data X_i (K×J) for I batches, where i (1 ≤ i ≤ I) is the batch index, J is the number of variables and K is the number of sampling points; averaging the I batches of data along the batch direction to obtain the modal partition data X (K×J), and normalizing X by subtracting from each variable its mean and dividing by its standard deviation; for the normalized data, FCM modal partitioning is treated as the following optimization problem: $\min_{U,V} \sum_{k=1}^{K} \sum_{p=1}^{P} u_{kp}^{m} d_{kp}^{2}$, subject to $\sum_{p=1}^{P} u_{kp} = 1$ and $0 \le u_{kp} \le 1$ for every sampling point $k$ (1); where v_p denotes the p-th modal center; u_kp denotes the membership of process data sample x_k in the p-th mode; P denotes the number of intermittent process modes; d_kp is the Euclidean distance between process data sample x_k and modal center v_p; m is the fuzzification factor; U is the membership matrix; k denotes an arbitrary sampling point; from formula (1), the iterative update formulas for the membership u_kp and the modal center v_p are obtained as $u_{kp} = 1 / \sum_{q=1}^{P} (d_{kp} / d_{kq})^{2/(m-1)}$ and $v_p = \sum_{k=1}^{K} u_{kp}^{m} x_k / \sum_{k=1}^{K} u_{kp}^{m}$; where q is an index over the P modes and d_kq is the Euclidean distance between process data sample x_k and the corresponding modal center v_q; after the iterative update stops, the process is divided into stable modes and transition modes according to the membership matrix U, each mode being delimited by its right-boundary sampling instant, where odd-numbered modes are stable modes and even-numbered modes are transition modes; the data of each mode over the I batches are unfolded along the variable direction to obtain the corresponding modal data set, together with the number of samples of that data set; the data of each stable mode are then fused with the data of its adjacent transition modes: for the first stable mode, the fused data combine it with the following transition mode; for the last stable mode, the fused data combine it with the preceding transition mode; for any other stable mode, the fused data combine it with both the preceding and the following transition modes; in each case the fused data are unfolded along the variable direction to obtain the fused-mode data set, together with the number of samples of that data set; finally, each independent modal data set and each fused data set is normalized by subtracting the mean and dividing by the standard deviation along the variable direction.
- 3. The intermittent process measurement data anomaly detection method based on dual support vector data description according to claim 1, wherein step two specifically comprises the following steps: the SVDD hypersphere model is determined by the sphere center a and the hypersphere radius R, which satisfy the following optimization problem: $\min_{R, a, \xi} R^{2} + C \sum_{i=1}^{N} \xi_i$, subject to $\|\phi(x_i) - a\|^{2} \le R^{2} + \xi_i$ and $\xi_i \ge 0$; where C is the penalty parameter; ξ_i is a slack variable; N is the number of samples; φ(·) is a nonlinear mapping; introducing Lagrange multipliers α_i converts the problem into the dual form $\max_{\alpha} \sum_{i} \alpha_i K(x_i, x_i) - \sum_{i} \sum_{j} \alpha_i \alpha_j K(x_i, x_j)$, subject to $0 \le \alpha_i \le C$ and $\sum_{i} \alpha_i = 1$; where α_i and α_j are the Lagrange multipliers corresponding to samples x_i and x_j; K(x_i, x_j) is the Gaussian kernel function, an exponential function with the natural constant e as its base and with a kernel width parameter; for each normalized independent modal data set, an inner SVDD model is trained, yielding the sphere center and the hypersphere radius of the inner SVDD model of that independent mode, the radius being evaluated at a support vector of the corresponding model; similarly, for each normalized fused data set, an outer SVDD model is trained, yielding the sphere center and the hypersphere radius of the outer SVDD model trained on the data of a stable mode and its adjacent transition modes; for each stable mode, the inner control limit is the hypersphere radius of its inner SVDD model and the outer control limit is the hypersphere radius of its outer SVDD model; for each transition mode, the inner control limit is the hypersphere radius of its inner SVDD model, and the outer control limit is given in terms of the outer control limits of the stable modes immediately before and after the transition mode; the inner and outer control limits of all modes are calculated according to formulas (10) to (12), which completes the construction of the dual SVDD model.
- 4. The intermittent process measurement data anomaly detection method based on dual support vector data description according to claim 1, wherein step three comprises the following steps: for an online sample, the mode to which the sample belongs is determined from its sampling time, and the sample is normalized with the mean and standard deviation of that mode's data set; the normalized sample is fed into the inner SVDD model of that mode to obtain the distance between the sample and the hypersphere center; this distance is compared with R_in and R_out: when the distance is less than or equal to R_in, the sample is directly judged to be a normal sample; when the distance is greater than or equal to R_out, the sample is directly judged to be an abnormal sample; when the distance is greater than R_in and smaller than R_out, the sample cannot be directly judged normal or abnormal, and its state is judged with a moving window, wherein the left end point of the moving window is denoted WIN_L and the right end point WIN_R; the window width need only be less than or equal to the length of the shortest mode; the movement of the window is confined to a single mode, which ensures that the state of the sample at the current moment refers only to the state information of data samples in the current mode; the window moves according to the following rules: (1) when the sampling time is the right end point of the window and is not equal to the right boundary of the current mode, the window moves backward by one width without exceeding the right boundary of the current mode; (2) when the sampling time is the right end point of the window and is equal to the right boundary of the current mode, the window moves directly backward by one width; (3) inside the current mode, when the sampling time coincides with the left end point of the window, the window moves forward until its right end point coincides with the sampling time; given the sampling time of the sample, its state is judged with the established moving window as follows: (1) when the center distance crosses from inside R_in to between R_in and R_out, the standard deviation of the center distances of the samples from the left end point of the window to the previous moment is calculated, together with the standard deviation after the current sample is added; if the two do not differ significantly, the sample is still considered a normal sample; if the standard deviation after adding the sample changes by a factor of 2 or more, the sample is considered an abnormal sample; (2) when the center distance crosses from outside R_out to between R_in and R_out, the mean center distance of the samples from the left end point of the window to the current moment is calculated; if this mean is smaller than the prescribed limit, the sample is considered a normal sample; if it is greater than or equal to that limit, the sample is considered to be still abnormal; furthermore, the state remains unchanged in two cases: when the sample is at the left end point of the current mode, or when the sample at the previous moment also lies between R_in and R_out, the sample is considered to have the same state as the sample at the previous moment; the intermittent process data anomaly discrimination strategy based on the dual SVDD model is thereby obtained, in which the two standard deviations are computed about the mean center distances of all samples from the left end point of the window to the moment preceding the sample and to the current moment, respectively; the current sample state can be judged through this discrimination strategy.
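The discrimination strategy of claims 1 and 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The function name `classify_mode` is hypothetical. A trailing window clipped to the current mode stands in for the window movement rules. "Changes by a factor of 2 or more" is read as the new standard deviation being at least twice the old one. The limit applied to the window mean defaults to R_in, because the symbol for that limit is not recoverable from the text.

```python
import statistics

NORMAL, ABNORMAL = 0, 1


def classify_mode(distances, r_in, r_out, width, prev_state=NORMAL, mean_limit=None):
    """Label every sample of ONE mode from its distance to the inner SVDD center.

    distances  : center distances of consecutive samples in the mode
    r_in/r_out : inner and outer control limits of that mode
    width      : moving-window width (assumed: trailing window inside the mode)
    prev_state : state carried over when the first sample is ambiguous
    mean_limit : limit for the window mean (ASSUMPTION: defaults to r_in)
    """
    if mean_limit is None:
        mean_limit = r_in
    states = []
    for t, d in enumerate(distances):
        if d <= r_in:                      # inside the inner limit
            state = NORMAL
        elif d >= r_out:                   # outside the outer limit
            state = ABNORMAL
        else:                              # between the two limits
            left = max(0, t - width + 1)   # window never leaves the mode
            before = distances[left:t]     # window up to the previous moment
            upto = distances[left:t + 1]   # window including the current sample
            prev_d = distances[t - 1] if t > 0 else None
            if prev_d is None or r_in < prev_d < r_out:
                # Left end of the mode, or previous sample also ambiguous:
                # keep the previous state.
                state = states[-1] if states else prev_state
            elif prev_d <= r_in:
                # Crossed from inside: compare standard deviations.
                if len(before) < 2:
                    state = states[-1]     # too few samples, keep state (assumption)
                else:
                    s_before = statistics.pstdev(before)
                    s_after = statistics.pstdev(upto)
                    state = ABNORMAL if s_after >= 2.0 * s_before else NORMAL
            else:
                # Crossed from outside: compare the window mean with the limit.
                state = NORMAL if statistics.mean(upto) < mean_limit else ABNORMAL
        states.append(state)
    return states
```

For example, `classify_mode([0.5, 0.6, 0.55, 1.1, 1.2, 2.5, 1.5, 0.4], r_in=1.0, r_out=2.0, width=5)` labels the fourth sample by the standard-deviation test and the seventh by the window-mean test.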
Description
Intermittent process measurement data anomaly detection method based on dual support vector data description
Technical Field
The invention belongs to the technical field of intermittent process monitoring, and particularly relates to an intermittent process measurement data anomaly detection method based on dual support vector data description.
Background
Intermittent processes can produce high value-added products in response to changing market demand and customer-specific requirements, and are widely used in the chemical, pharmaceutical, food processing and other industries. Field instruments in intermittent production processes supply large amounts of process measurement data, which support data-driven process modeling. However, performance degradation of the field instruments or interference from the external environment can make the measurement data abnormal, which directly degrades the accuracy of data-driven process modeling. Detecting anomalies in intermittent process measurement data therefore provides reliable data for data-driven process modeling and promotes the application of intermittent process monitoring and optimal control methods and technologies.
Support vector data description (SVDD) detects anomalies in intermittent process measurement data by constructing, in a high-dimensional space, the minimum enclosing hypersphere of all training data and separating normal data from abnormal data at the hypersphere boundary. However, an anomaly detection method based on a single SVDD model ignores the multi-mode character of the intermittent process, which reduces the accuracy of process measurement data anomaly detection.
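As a rough illustration of the hypersphere idea described above, the following Python sketch fits a simplified SVDD and computes the distance of a sample to the sphere center in kernel space. It rests on several assumptions that are not taken from the patent: the hard-margin variant (no slack variables) is solved approximately with Frank-Wolfe iterations on the dual, the Gaussian kernel is taken as exp(-||x - y||^2 / sigma^2), the radius is set to the largest training distance, and the names `fit_svdd` and `svdd_distance` are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    # ASSUMED kernel form: exp(-||x - y||^2 / sigma^2).
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / sigma ** 2)


def svdd_distance(model, z):
    """Distance from sample z to the hypersphere center in kernel space."""
    sigma = model["sigma"]
    cross = sum(a * gaussian_kernel(z, x, sigma)
                for a, x in zip(model["alpha"], model["data"]))
    sq = gaussian_kernel(z, z, sigma) - 2.0 * cross + model["const"]
    return math.sqrt(max(sq, 0.0))


def fit_svdd(data, sigma=2.0, iters=300):
    """Approximate hard-margin SVDD (simplification: no penalty C, no slack)."""
    n = len(data)
    K = [[gaussian_kernel(a, b, sigma) for b in data] for a in data]
    alpha = [1.0 / n] * n                  # start at the center of the simplex
    for t in range(iters):
        # Gradient of alpha^T K alpha - sum_i alpha_i K_ii (to be minimized).
        grad = [2.0 * sum(K[i][j] * alpha[j] for j in range(n)) - K[i][i]
                for i in range(n)]
        i_min = min(range(n), key=grad.__getitem__)
        step = 2.0 / (t + 2.0)             # standard Frank-Wolfe step size
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[i_min] += step               # alpha stays on the simplex
    const = sum(alpha[i] * alpha[j] * K[i][j]
                for i in range(n) for j in range(n))
    model = {"data": data, "alpha": alpha, "sigma": sigma, "const": const}
    # ASSUMPTION: radius = largest training distance, so all training data fit.
    model["radius"] = max(svdd_distance(model, x) for x in data)
    return model
```

A sample `z` would then be flagged when `svdd_distance(model, z)` exceeds `model["radius"]`; the soft-margin formulation with penalty parameter and slack variables, as used in the claims, needs a general quadratic programming solver instead.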
Compared with the single-SVDD method, an intermittent process measurement data anomaly detection method based on multi-mode SVDD models achieves better detection performance, but it considers only the data characteristics within each mode and ignores the data change relation between adjacent modes, so its false detection rate remains high. The intermittent process measurement data anomaly detection method based on dual support vector data description therefore takes full account of the data change relation among different modes: it builds a dual SVDD anomaly detection model from the stable-mode and transition-mode data sets and detects anomalies in the intermittent process measurement data with a data anomaly discrimination strategy, which can effectively reduce the false detection rate and improve the accuracy of intermittent process measurement data anomaly detection.
Disclosure of Invention
The invention aims to improve the accuracy of anomaly detection for intermittent process measurement data, and provides an intermittent process measurement data anomaly detection method based on dual support vector data description, comprising the following steps:
Step one: collecting batch intermittent process data, carrying out modal partitioning on the data with the fuzzy C-means (FCM) algorithm, obtaining stable-mode and transition-mode data sets from the modal partitioning information, and fusing the data of each stable mode with the data of its adjacent transition modes to obtain fused-mode data sets;
Step two: training an inner SVDD model on every stable-mode and transition-mode data set, training an outer SVDD model on every fused-mode data set, and constructing a dual-SVDD anomaly detection model for intermittent process measurement data;
Step three: constructing a data anomaly discrimination strategy from the inner and outer hypersphere radii R_in and R_out of the dual SVDD model combined with the moving window idea, and carrying out online anomaly detection on the intermittent process measurement data.
Step one specifically comprises the following:
Intermittent process measurement data X_i (K×J) are collected for I batches, where i (1 ≤ i ≤ I) is the batch index, J is the number of variables and K is the number of sampling points.
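The data preparation and FCM partitioning of step one can be sketched as follows. This is a minimal pure-Python illustration, not the inventors' code. The helper names `batch_average`, `zscore` and `fcm`, the fuzzification factor m = 2, the fixed iteration count and the deterministic initialization of the modal centers from evenly spaced sampling instants are all assumptions. The rule that turns memberships into stable and transition modes is not reproduced here.

```python
import math


def batch_average(batches):
    """Average I batches (each K x J, as nested lists) along the batch direction."""
    n = len(batches)
    K, J = len(batches[0]), len(batches[0][0])
    return [[sum(b[k][j] for b in batches) / n for j in range(J)]
            for k in range(K)]


def zscore(X):
    """Subtract each variable's mean and divide by its standard deviation."""
    K, J = len(X), len(X[0])
    means = [sum(row[j] for row in X) / K for j in range(J)]
    # 'or 1.0' guards against constant variables (zero standard deviation).
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (K - 1)) or 1.0
            for j in range(J)]
    return [[(row[j] - means[j]) / stds[j] for j in range(J)] for row in X]


def fcm(X, P, m=2.0, iters=100, eps=1e-12):
    """Fuzzy C-means: returns the membership matrix U (K x P) and the P centers."""
    K, J = len(X), len(X[0])
    # ASSUMPTION: deterministic start from evenly spaced sampling instants.
    centers = [list(X[(2 * p + 1) * K // (2 * P)]) for p in range(P)]
    U = []
    for _ in range(iters):
        # Membership update: u_kp = 1 / sum_q (d_kp / d_kq)^(2 / (m - 1)).
        U = []
        for x in X:
            d = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, v))) + eps
                 for v in centers]
            U.append([1.0 / sum((d[p] / d[q]) ** (2.0 / (m - 1.0))
                                for q in range(P)) for p in range(P)])
        # Center update: v_p = sum_k u_kp^m x_k / sum_k u_kp^m.
        centers = [[sum(U[k][p] ** m * X[k][j] for k in range(K)) /
                    sum(U[k][p] ** m for k in range(K)) for j in range(J)]
                   for p in range(P)]
    return U, centers
```

Typical use would be `U, centers = fcm(zscore(batch_average(batches)), P)`, after which each sampling instant can be assigned to the mode with the largest membership.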
The I batches of data are averaged along the batch direction to obtain the modal partition data X (K×J), and each variable of X is normalized by subtracting its mean and dividing by its standard deviation. For the normalized data, FCM modal partitioning is treated as the following optimization problem: $\min_{U,V} \sum_{k=1}^{K} \sum_{p=1}^{P} u_{kp}^{m} d_{kp}^{2}$, subject to $\sum_{p=1}^{P} u_{kp} = 1$, where v_p denotes the p-th modal center, 0 ≤ u_kp ≤ 1 denotes the membership of process data sample x_k in the p-th mode, P denotes the number of intermittent process modes, d_kp represents