CN-121997237-A - Multi-view time sequence anomaly detection method, device, equipment and medium
Abstract
The invention discloses a multi-view time series anomaly detection method, apparatus, device and medium. The method comprises: preprocessing collected data to obtain an original time series set; extracting temporal features from each time series in the original time series set to obtain a time series feature representation; constructing a semantic description from the statistical attribute information of the variable corresponding to each time series, and semantically encoding the description to obtain an initial statistical semantic feature representation; performing semantic association modeling on the initial statistical semantic feature representation to obtain a statistical semantic representation; obtaining a fused feature set based on the statistical semantic representation; reconstructing the fused feature set to obtain a reconstructed feature set; determining a composite anomaly score from the reconstruction error, the depth feature score and the residual analysis index; and completing time series anomaly discrimination based on the composite anomaly score and outputting potential faults of the industrial system to be detected.
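The semantic description step can be illustrated with a minimal Python sketch. It assumes the five statistics named in claim 3 (mean, standard deviation, value range, overall trend strength, relative fluctuation degree). The function name `describe_variable`, the use of a least-squares slope as the trend measure, the coefficient of variation as the fluctuation measure, and the sentence template are all illustrative assumptions, not definitions taken from the patent.

```python
from statistics import fmean, pstdev


def describe_variable(name, values):
    """Build a plain-text statistical description of one variable's series."""
    n = len(values)
    mean = fmean(values)
    std = pstdev(values)
    lo, hi = min(values), max(values)

    # Assumed trend measure: least-squares slope against the time index.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    if denom:
        slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(values)) / denom
    else:
        slope = 0.0

    # Assumed fluctuation measure: coefficient of variation (std / |mean|).
    rel_fluct = std / abs(mean) if mean != 0 else float("inf")

    return (
        f"Variable {name}: mean {mean:.3f}, standard deviation {std:.3f}, "
        f"range [{lo:.3f}, {hi:.3f}], trend slope {slope:.3f}, "
        f"relative fluctuation {rel_fluct:.3f}."
    )


if __name__ == "__main__":
    # Hypothetical sensor readings for a single monitored variable.
    print(describe_variable("pump_pressure", [1.0, 2.0, 3.0, 4.0, 5.0]))
```

A text string of this kind is what a language-model encoder would consume in the semantic coding step; how the patent actually phrases the description is not stated in this section.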
Inventors
- YU XU
- DING PENGJU
- XI LIANG
- TAO YE
- ZHANG PEIYING
- LU HAO
- Chen Chenglizhao
- CAO SHAOHUA
Assignees
- 中国石油大学(华东)
Dates
- Publication Date
- 20260508
- Application Date
- 20260129
Claims (10)
- 1. A multi-view time series anomaly detection method, characterized by comprising the following steps: collecting, with sensors arranged in an industrial system to be detected, a multivariate time series generated during operation of the industrial system, and preprocessing the collected data to obtain an original time series set for anomaly detection on the industrial system to be detected; extracting temporal features from each time series in the original time series set to obtain a time series feature representation; constructing a semantic description based on statistical attribute information of the variable corresponding to each time series in the original time series set, and semantically encoding the semantic description to obtain an initial statistical semantic feature representation; based on the statistical semantic representation, constraining and enhancing the feature dimensions of the time series feature representation that relate to variable interaction relationships, to obtain a fused feature set; determining a reconstruction error based on the original time series and the reconstructed feature set; constructing a depth feature score based on the reconstruction process; determining a multivariate residual analysis index based on the distribution relationship of the reconstruction errors corresponding to a plurality of variables; determining a composite anomaly score based on the reconstruction error, the depth feature score and the multivariate residual analysis index; and completing time series anomaly discrimination based on the composite anomaly score and outputting potential faults of the industrial system to be detected.
- 2. The multi-view time series anomaly detection method of claim 1, wherein extracting temporal features from each time series in the original time series set to obtain the time series feature representation comprises: extracting the temporal features of each time series in the original time series set with a first Transformer encoder, wherein the first Transformer encoder comprises a first layer normalization module, a multi-head attention layer, a second layer normalization module and a feedforward neural network connected in sequence; after data preprocessing is completed, mapping the normalized subsequences into a unified feature representation space by linear projection, so that time series data from different time slices and different variables can be jointly modeled in the same feature space: $H_i^{(0)} = X_i W_e + b_e$, where $W_e$ is the linear embedding matrix, $b_e$ is the bias vector, $d$ is the hidden feature dimension, and $H_i^{(0)}$ is the initial hidden representation obtained by linearly mapping the $i$-th subsequence $X_i$.
- 3. The multi-view time series anomaly detection method of claim 1, wherein the semantic description is constructed based on statistical attribute information of the variable corresponding to each time series in the original time series set, the semantic description comprising the mean, the standard deviation, the value range of the series, the overall trend strength and the relative fluctuation degree.
- 4. The multi-view time series anomaly detection method of claim 1, wherein the semantic description is semantically encoded by a pre-trained large language model to obtain the initial statistical semantic feature representation, the pre-trained large language model comprising a masked attention layer, a first layer normalization module, a feedforward multi-head neural network layer and a second layer normalization module connected in sequence; and wherein semantic association modeling is performed on the initial statistical semantic feature representation by a semantic encoder to obtain the statistical semantic representation, in which the semantic embeddings are represented as a semantic token sequence with the variable as the basic unit, and the semantic token sequence is input into a semantic Transformer encoder for modeling.
- 5. The multi-view time series anomaly detection method of claim 1, wherein constraining and enhancing, based on the statistical semantic representation, the feature dimensions of the time series feature representation that relate to variable interaction relationships to obtain the fused feature set specifically comprises: constraining and enhancing through a mechanism constraint strategy guided by the statistical semantics; the time series feature representation $H_t$ and the semantic feature representation $H_s$ interact through a learnable cross-attention mechanism, in which the time series features guide the semantic retrieval process to obtain a semantically enhanced time series representation constrained by statistical semantic priors, computed as: $Z = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$, where $Q$ is the Query vector, $K$ is the Key vector, $V$ is the Value vector, $d_k$ is the feature dimension of the Query and Key vectors in the attention subspace, and $Z$ is the fused feature set; statistical constraints consistent with the current mechanism state are then introduced to obtain a stable, mechanism-consistent semantically enhanced time series representation $\tilde{Z}$: [equation not recoverable from the source text], where $\mathrm{LayerNorm}(\cdot)$ denotes the layer normalization operation, used to balance the scale difference between the statistical semantic constraint and the numerical dynamic features.
- 6. The multi-view time series anomaly detection method of claim 1, wherein reconstructing the fused feature set to obtain the reconstructed feature set specifically comprises: performing feature reconstruction on the fused feature set with a Transformer decoder, in which the fused semantically enhanced sequence representation $\tilde{Z}$ is reconstructed to obtain the reconstructed feature set $\hat{X}$: $\hat{X} = \mathrm{TransformerDecoder}(\tilde{Z})$.
- 7. The multi-view time series anomaly detection method of claim 1, wherein determining the reconstruction error based on the original time series and the reconstructed feature set comprises: obtaining, from the difference between the original time series data and the reconstructed feature set, a reconstruction error reflecting the reconstruction consistency of the time series; given the time series $x_{i,j}$ of the $i$-th sample on the $j$-th variable and its reconstruction result $\hat{x}_{i,j}$, the reconstruction error score $S_{rec}$ is defined as: [equation not recoverable from the source text]; relying solely on the reconstruction error often fails to reveal mechanism changes hidden in the high-dimensional latent representation space; constructing the depth feature score based on the reconstruction process specifically comprises: calculating, from the latent feature representation obtained during feature reconstruction, a depth feature score reflecting the degree of deviation of the latent feature distribution; letting $z_{i,j}$ denote the decoded feature representation corresponding to the $j$-th variable of the $i$-th sample, the depth feature score $S_{deep}$ is defined as: $S_{deep} = \sigma\left(W_2\,\mathrm{ReLU}(W_1 z_{i,j} + b_1) + b_2\right)$, where $W_1$ and $W_2$ are learnable parameters, $b_1$ and $b_2$ are bias vectors, $\mathrm{ReLU}$ is the rectified linear unit, and $\sigma$ is the Sigmoid function; determining the multivariate residual analysis index based on the distribution relationship of the reconstruction errors corresponding to the plurality of variables specifically comprises: letting $e_i$ denote the reconstruction error vector of the $i$-th sample, the residual analysis score $S_{res}$ is defined as: $S_{res} = W_4\,\mathrm{ReLU}(W_3 e_i + b_3) + b_4$, where $W_3$ and $W_4$ are learnable parameters, $b_3$ and $b_4$ are bias vectors, and $\mathrm{ReLU}$ is the rectified linear unit; determining the composite anomaly score based on the reconstruction error, the depth feature score and the multivariate residual analysis index comprises: performing weighted fusion of the reconstruction error $S_{rec}$, the depth feature score $S_{deep}$ and the multivariate residual analysis index $S_{res}$ to obtain the composite anomaly score $S$, completing time series anomaly discrimination based on the composite anomaly score, and outputting potential anomalies: $S = \alpha S_{rec} + \beta S_{deep} + \gamma S_{res}$, where $\alpha$, $\beta$ and $\gamma$ represent the relative contribution weights of each view score to mechanism disruption under different anomaly scenarios.
- 8. A multi-view time series anomaly detection system, characterized by comprising: an acquisition and preprocessing module configured to collect, with sensors arranged in an industrial system to be detected, a multivariate time series generated during operation of the industrial system, and to preprocess the collected data to obtain an original time series set for anomaly detection on the industrial system to be detected; a feature extraction module configured to extract temporal features from each time series in the original time series set to obtain a time series feature representation; a semantic coding module configured to construct a semantic description based on statistical attribute information of the variable corresponding to each time series in the original time series set, and to semantically encode the semantic description to obtain an initial statistical semantic feature representation; a constraint and enhancement module configured to constrain and enhance, based on the statistical semantic representation, the feature dimensions of the time series feature representation that relate to variable interaction relationships, to obtain a fused feature set; and an output module configured to determine a reconstruction error based on the original time series and the reconstructed feature set, construct a depth feature score based on the reconstruction process, determine a multivariate residual analysis index based on the distribution relationship of the reconstruction errors corresponding to a plurality of variables, determine a composite anomaly score based on the reconstruction error, the depth feature score and the multivariate residual analysis index, complete time series anomaly discrimination based on the composite anomaly score, and output potential faults of the industrial system to be detected.
- 9. An electronic device, comprising: a memory for non-transitory storage of computer-readable instructions; and a processor for executing the computer-readable instructions, wherein the computer-readable instructions, when executed by the processor, perform the method of any one of claims 1-7.
- 10. A storage medium, characterized by non-transitory storage of computer-readable instructions, wherein the method of any one of claims 1-7 is performed when the computer-readable instructions are executed by a computer.
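The weighted fusion of the three view scores described in claim 7 can be illustrated with a minimal Python sketch. The mean-squared form of the reconstruction error, the default weights, the sum-to-one check on the weights and the decision threshold are assumptions made for illustration; the function names are hypothetical. The depth feature score and the residual analysis score would come from learned scoring heads and are passed in here as plain numbers.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared difference between a series and its reconstruction.

    Mean squared error is an assumed form, chosen only for illustration.
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("series must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)


def composite_anomaly_score(s_rec, s_deep, s_res, weights=(0.5, 0.3, 0.2)):
    """Weighted fusion of the three view scores.

    The default weights and the sum-to-one check are illustrative assumptions.
    """
    alpha, beta, gamma = weights
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return alpha * s_rec + beta * s_deep + gamma * s_res


def is_anomalous(score, threshold=0.5):
    """Flag a sample when its composite score exceeds a hypothetical threshold."""
    return score > threshold


if __name__ == "__main__":
    s_rec = reconstruction_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
    # s_deep and s_res stand in for outputs of learned scoring heads.
    score = composite_anomaly_score(s_rec, s_deep=0.5, s_res=0.1)
    print(round(score, 4), is_anomalous(score))
```

Keeping the three scores separate until the final weighted sum makes it easy to inspect which view contributed most to a flagged sample.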
Description
Multi-view time sequence anomaly detection method, device, equipment and medium
Technical Field
The present invention relates to the field of time series anomaly detection, and in particular to a multi-view time series anomaly detection method, apparatus, device and medium.
Background
With the rapid development of the industrial Internet, network security, intelligent operation and maintenance and related fields, large-scale sensing equipment continuously generates high-dimensional, multivariate, strongly time-dependent data during operation. Such data comes from diverse sources, is sampled at high frequency and evolves dynamically, and it is an important basis for assessing the running state and health of a system. In practical applications, however, the system structure is complex, operating conditions change frequently and the external environment is variable, so time series data often exhibits non-stationary dynamics, complex cross-variable coupling and varying degrees of noise. As a result, an anomaly often does not appear as a significant numerical deviation but as a local breakdown of the interaction mechanisms and relationships among variables: during operation, the originally stable influence relationships among some variables change abnormally in a specific time period or under specific conditions, including interruption of causal relationships, abnormal drift of influence strength and the emergence of abnormal dependencies. If such abnormal states are not recognized in time, they can lead to equipment faults, degraded system performance and even safety accidents, causing serious economic loss.
Time series anomaly detection is a key technique for operation monitoring and risk early warning in complex systems. It aims to learn the normal operating pattern of a system from historical observations and to identify behavior that deviates from that pattern. Most existing methods are built around a predictive or reconstruction-based modeling framework and automatically learn local temporal dependencies and inter-variable associations through structures such as convolutional or recurrent neural networks; some further introduce attention mechanisms to better model long-range dependencies, which improves detection accuracy to some extent. Although these methods improve the representation of complex time series, they still face several challenges. First, mainstream methods generally rely on numerical features alone and lack explicit modeling of the system's normal operating mechanism and of the stability of the relationships among variables, so robust anomaly discrimination is difficult when anomaly signals are weak or noise is strong. Second, some methods attempt to introduce high-level prior information or external knowledge to increase the expressive power of the model, but the semantic information is often introduced without causal constraints linking it to the numerical dynamics; cross-modal alignment errors can then mask the real anomaly mechanism and may even introduce additional semantic noise that interferes with learning the structural features of the original time series.
Finally, the anomaly discrimination mechanism of existing methods is usually based on a single view, such as the prediction error or the reconstruction error. A single view cannot simultaneously capture multiple forms of anomaly, such as temporal dynamic deviations, anomalies in high-dimensional latent features and cross-variable structural changes, so weak or structural anomalies are easily missed and performance in complex real-world scenarios leaves room for improvement.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a multi-view time series anomaly detection method, apparatus, device and medium.
In one aspect, a multi-view time series anomaly detection method is provided, comprising: collecting, with sensors arranged in an industrial system to be detected, a multivariate time series generated during operation of the industrial system, and preprocessing the collected data to obtain an original time series set for anomaly detection on the industrial system to be detected, wherein the multivariate time series comprises time series data corresponding to a plurality of monitoring variables reflecting the