CN-115526234-B - Cross-domain model training and log anomaly detection method and device based on transfer learning


Abstract

The application provides a cross-domain model training method based on transfer learning, comprising the following steps: A1, performing sliding window division on a source system log message and a target system log message to obtain a corresponding source system log sequence and target system log sequence; A2, performing equal division on the source system log sequence and the target system log sequence to obtain a log sequence pair; A3, parsing and converting the source system log message and the target system log message to obtain a log template vector; A4, performing model training according to the log sequence pair, the log template vector and a total loss function to obtain a trained LSTM model and a hypersphere model. In this technical scheme, a contrastive learning method compares the similarity between pairs of features, thereby quantifying the difference between features, reducing the model training cost and enhancing the detection effect of log anomaly detection.
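Steps A1 and A2 of the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the window size, step, seed and helper names are illustrative assumptions.

```python
import random

def sliding_windows(log_messages, window_size=3, step=1):
    """A1: split a stream of log messages into fixed-size log sequences
    with a sliding window (window_size and step are illustrative)."""
    return [log_messages[i:i + window_size]
            for i in range(0, len(log_messages) - window_size + 1, step)]

def equal_division(source_seqs, target_seqs, seed=0):
    """A2: mix source and target log sequences without distinction,
    shuffle them randomly, and split into two equal subsequence sets."""
    mixed = source_seqs + target_seqs
    random.Random(seed).shuffle(mixed)
    half = len(mixed) // 2
    # first and second subsequence sets (the "log sequence pair")
    return mixed[:half], mixed[half:half * 2]

source = sliding_windows([f"s{i}" for i in range(6)])  # 4 sequences
target = sliding_windows([f"t{i}" for i in range(4)])  # 2 sequences
first_set, second_set = equal_division(source, target)
print(len(first_set), len(second_set))  # 3 3
```

The two halves then serve as the paired inputs for the contrastive training in step A4.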

Inventors

  • HE Shiming
  • CHEN Bowen
  • XIAO Jinpan
  • LI Wenjun
  • HU Peng
  • HU Jinbin

Assignees

  • Changsha University of Science and Technology (长沙理工大学)

Dates

Publication Date
2026-05-05
Application Date
2022-08-30

Claims (10)

  1. A cross-domain model training method based on transfer learning, applied to log anomaly detection and comprising the following steps: A1, performing sliding window division on a source system log message and a target system log message to obtain a corresponding source system log sequence and target system log sequence; A2, performing equal division on the source system log sequence and the target system log sequence to obtain a log sequence pair, wherein the log sequence pair comprises a first subsequence set and a second subsequence set, each comprising a plurality of log sequences; A3, parsing and converting the source system log message and the target system log message to obtain a log template vector; A4, performing model training according to the log sequence pair, the log template vector and a total loss function to obtain a trained long short-term memory (LSTM) model and a hypersphere model; wherein the total loss function comprises a hypersphere loss function, an alignment loss function and a uniform loss function, the alignment loss function being used for aligning or bringing closer the same pair of log sequence features, and the uniform loss function being used for uniformly distributing the log sequence features on the hypersphere; and wherein the equal division comprises: performing sliding window segmentation on the source system and target system logs to obtain a source system log sequence and a plurality of target system log sequences, mixing and randomly shuffling the source system log sequence and the plurality of target system log sequences, and dividing them, without distinguishing source from target, into two equal sub-data sets that serve as input to the subsequent model.
  2. The cross-domain model training method based on transfer learning of claim 1, wherein, in the hypersphere loss function, V1 represents a first log sequence feature set formed by extracting the first subsequence set through the LSTM model, V2 represents a second log sequence feature set formed by extracting the second subsequence set through the LSTM model, each element of these sets represents a single log sequence feature, and C represents the sphere-center feature of the hypersphere; and wherein the total loss function combines the alignment loss function, the uniform loss function and the hypersphere loss function using two hyper-parameters that balance the three loss functions.
  3. The cross-domain model training method based on transfer learning according to claim 2, wherein the alignment loss function is computed from the i-th log sequence feature in the first log sequence feature set, the i-th log sequence feature in the second log sequence feature set, and the total number of log sequence features in a single log sequence feature set.
  4. The cross-domain model training method based on transfer learning according to claim 2 or 3, wherein the uniform loss function is computed using the base of the natural logarithm, the i-th log sequence feature in the first log sequence feature set, the i-th log sequence feature in the second log sequence feature set, and the total number of log sequence features in a single log sequence feature set.
  5. The cross-domain model training method based on transfer learning according to claim 1, wherein A2 comprises: A21, mixing and randomly shuffling the source system log sequence and the target system log sequence to obtain a mixed log sequence; A22, dividing the mixed log sequence into two equal sub log sequence sets to obtain the log sequence pair.
  6. The cross-domain model training method based on transfer learning of claim 1, further comprising: extracting the log sequence features based on the LSTM model, and adjusting and determining a decision boundary, wherein the decision boundary is used for distinguishing normal log sequence features from abnormal log sequence features, and the decision boundary distance is the distance from the decision boundary to the sphere center of the hypersphere model.
  7. A cross-domain log anomaly detection method based on transfer learning, implemented by an LSTM model and a hypersphere model obtained by training according to the model training method of any one of claims 1 to 6, the log anomaly detection method comprising: B1, performing sliding window segmentation on a target system log message to be detected to obtain a target system log sequence; B2, parsing the target system log sequence into a log template and obtaining a log template vector according to the log template; B3, inputting the log template vector into the trained LSTM model to obtain a log sequence feature set comprising a plurality of log sequence features; and B4, inputting the log sequence features into the trained hypersphere model to obtain an anomaly detection result.
  8. The cross-domain log anomaly detection method based on transfer learning according to claim 7, wherein B4 comprises: B41, calculating a first distance from the log sequence feature to the sphere center of the hypersphere model; B42, comparing the first distance with the decision boundary distance to obtain a comparison result; and B43, obtaining the anomaly detection result according to the comparison result.
  9. A terminal device, comprising: a memory for storing a computer program; and a processor for reading the computer program in the memory and executing the cross-domain model training method based on transfer learning as claimed in any one of claims 1 to 6, or the cross-domain log anomaly detection method based on transfer learning as claimed in claim 7 or 8.
  10. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the cross-domain model training method based on transfer learning as claimed in any one of claims 1 to 6, or the cross-domain log anomaly detection method based on transfer learning as claimed in claim 7 or 8.
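The detection flow of claims 6-8 reduces to a distance test against the hypersphere. A minimal sketch follows, assuming Euclidean distance and illustrative feature vectors and boundary values (the claims as extracted here do not fix the distance metric):

```python
import math

def distance_to_center(feature, center):
    """B41: the first distance, from a log sequence feature to the
    sphere center of the hypersphere model (Euclidean, by assumption)."""
    return math.sqrt(sum((f - c) ** 2 for f, c in zip(feature, center)))

def detect_anomaly(feature, center, boundary_distance):
    """B42-B43: compare the first distance with the decision boundary
    distance; a feature farther from the center than the boundary is
    flagged as anomalous."""
    return distance_to_center(feature, center) > boundary_distance

# Illustrative 2-D features and boundary; real features come from the LSTM.
center = [0.0, 0.0]
print(detect_anomaly([3.0, 4.0], center, 2.5))  # True  (distance 5.0)
print(detect_anomaly([0.5, 0.5], center, 2.5))  # False (distance ~0.71)
```

Per claim 6, the decision boundary distance itself is tuned during training to separate normal from abnormal log sequence features.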

Description

Cross-domain model training and log anomaly detection method and device based on transfer learning

Technical Field

The application relates to the technical field of log anomaly detection, and in particular to a method, device and storage medium for cross-domain model training and log anomaly detection based on transfer learning.

Background

The system log records detailed operation information, and the cause of a fault is typically also logged in the system. Through analysis and detection of logs, information along various dimensions can be provided for fault location. Log anomaly detection helps with system debugging and root-cause analysis, providing reliable service for the system. In production, a newly deployed system has run for only a short time, so few logs have been collected; a detection model cannot be trained, and log anomaly detection faces a cold-start problem. Transfer learning is an effective method for solving this cold-start problem: it transfers knowledge from one field (the source domain) to another field (the target domain), so that the log anomaly detection effect can be greatly improved even when samples are insufficient. However, system logs from different manufacturers and device models differ in format, grammar and semantics, and log specifications are not uniform. When software systems serve different kinds of workloads, their component calls, IO output and fault types also differ. Migration can therefore be divided, according to the similarity between the service domains of the source and target systems, into same-domain cross-system migration and cross-domain migration. In same-domain cross-system migration, the service objects of the source and target systems are similar, and the systems differ only in log grammar and format.
Cross-domain migration refers to the case where the source and target systems differ in service objects and running logic. For example, BGL (Blue Gene/L supercomputer), HPC (high-performance cluster) and Thunderbird are all supercomputer systems; HDFS, Hadoop (WordCount, PageRank) and Spark are all distributed systems; and Windows, Linux and Mac are all operating systems. Cross-system migration is migration between systems in the same field, such as Windows -> Linux or BGL -> Thunderbird. Cross-domain migration crosses fields, such as Windows -> Hadoop or BGL -> Hadoop. Existing transfer learning methods address cross-system transfer, but in real environments the lack of data sets makes the need for cross-domain transfer learning greater, and when the fields of the source and target systems differ, the anomaly detection performance of the model on the target system is poor. How to improve the detection effect of log anomaly detection with only a small sample from the target system is therefore a problem to be solved. The above information disclosed in this background section is only for enhancement of understanding of the background of the application, and it may contain information that does not form prior art already known to a person of ordinary skill in the art.

Disclosure of Invention

The application provides a model training and log anomaly detection method, device and storage medium to solve the problems existing in the prior art.
The application provides a model training method comprising the following steps: A1, performing sliding window division on a source system log message and a target system log message to obtain a corresponding source system log sequence and target system log sequence; A2, performing equal division on the source system log sequence and the target system log sequence to obtain a log sequence pair, wherein the log sequence pair comprises a first subsequence set and a second subsequence set, each comprising a plurality of log sequences; A3, parsing and converting the source system log message and the target system log message to obtain a log template vector; A4, performing model training according to the log sequence pair, the log template vector and a total loss function to obtain a trained LSTM (Long Short-Term Memory) model and a hypersphere model, wherein the total loss function comprises a hypersphere loss function, an alignment loss function and a uniform loss function, the alignment loss function being used for aligning or bringing closer the same pair of log sequence features, and the uniform loss function being used for uniformly distributing the log sequence features on the hypersphere. In some embodiments, in the hypersphere loss function Loss_h, V1 represents a first log sequence feature set formed by extracting the first subsequence set through the LSTM model.
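The claim formulas themselves did not survive extraction into this text. Under the standard alignment/uniformity formulation from contrastive learning and a Deep SVDD-style hypersphere objective — an assumption about the likely form, not the patent's verbatim equations — the three losses described verbally in claims 2-4 would typically read:

```latex
% Hedged reconstruction: standard forms consistent with the textual
% descriptions in claims 2-4, not the patent's verbatim equations.
\[
\mathcal{L}_{h} = \frac{1}{N}\sum_{v \in V_1 \cup V_2} \lVert v - C \rVert^{2}
\quad \text{(pull all features toward the sphere center } C\text{)}
\]
\[
\mathcal{L}_{\mathrm{align}} = \frac{1}{N}\sum_{i=1}^{N} \lVert v^{1}_{i} - v^{2}_{i} \rVert^{2}
\quad \text{(same-pair features are drawn together)}
\]
\[
\mathcal{L}_{\mathrm{unif}} = \log \frac{1}{N}\sum_{i=1}^{N} e^{-2\lVert v^{1}_{i} - v^{2}_{i} \rVert^{2}}
\quad \text{(features spread uniformly on the hypersphere)}
\]
\[
\mathcal{L} = \mathcal{L}_{h} + \alpha\,\mathcal{L}_{\mathrm{align}} + \beta\,\mathcal{L}_{\mathrm{unif}}
\]
```

Here \(V_1, V_2\) are the two log sequence feature sets, \(v^{1}_{i}, v^{2}_{i}\) their i-th features, \(N\) the number of features per set, and \(\alpha, \beta\) the two balancing hyper-parameters mentioned in claim 2; the use of the base of the natural logarithm matches claim 4's description of the uniform loss.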