EP-4738126-A1 - VISUALIZATION OF SYSTEM LOGS USING TIME CURVES


Abstract

A computer-implemented method is presented for visualizing log data captured in a computer system. The method includes: receiving a plurality of log records, where each log record includes a severity indicator; grouping log records in the plurality of log records into groups of log records, such that log records in a given group of records are chronological; for each group of log records, extracting one or more templates from a given group of log records, where each template is comprised of a text string representing log records in the given group of log records; for each group of log records, computing a similarity measure between a given group of log records and the remaining groups of log records; and visualizing the groups of log records by projecting each group of log records onto a multi-dimensional plane based on the similarity measures for the groups of log records and connecting projections for the groups of log records in chronological order using a line. (Fig. 1)
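
The template-extraction and similarity steps of the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not part of the disclosure: the function names, the regular expression used to spot variables, the `<*>` wildcard, and the choice to aggregate by averaging the distance to the closest template. The Levenshtein distance is the one named in the dependent claims. The logarithmic weighting by group size is omitted here, and every group is assumed to be non-empty.

```python
import re

# Hypothetical variable pattern (numbers, dotted numbers, hex literals);
# real log parsers use far richer rules.
_VARIABLE = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)*)\b")


def extract_templates(messages):
    """Replace variable-looking tokens with a wildcard and deduplicate."""
    return sorted({_VARIABLE.sub("<*>", m) for m in messages})


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def group_distance(templates_a, templates_b):
    """Mean, over templates in A, of the normalized distance to the
    closest template in B (an assumed aggregation rule)."""
    def norm(s, t):
        return levenshtein(s, t) / max(len(s), len(t), 1)
    return sum(min(norm(s, t) for t in templates_b)
               for s in templates_a) / len(templates_a)


def distance_matrix(groups):
    """Symmetric pairwise distance matrix over groups of raw messages."""
    templates = [extract_templates(g) for g in groups]
    n = len(templates)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Average both directions so that d[i][j] == d[j][i].
            d[i][j] = d[j][i] = 0.5 * (
                group_distance(templates[i], templates[j])
                + group_distance(templates[j], templates[i])
            )
    return d
```

A matrix of this kind is what a multi-dimensional scaling routine would take as input to place each group on the plane. Connecting the resulting points in chronological order would then give the time curve.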

Inventors

WOHLFEIL, Esteban Pérez
BORYSENKOV, Dmytro

Assignees

Dynatrace LLC

Dates

Publication Date: 2026-05-06
Application Date: 2025-10-24

Claims (15)

1. A computer-implemented method for visualizing log data captured in a computer system, comprising:
   - receiving, by a computer processor, a plurality of log records, where each log record includes a severity indicator;
   - grouping, by the computer processor, log records in the plurality of log records into groups of log records, such that log records in a given group of records are chronological;
   - for each group of log records, extracting, by the computer processor, one or more templates from a given group of log records, where each template is comprised of a text string representing log records in the given group of log records;
   - for each group of log records, computing, by the computer processor, a similarity measure between a given group of log records and the remaining groups of log records; and
   - visualizing the groups of log records by projecting each group of log records onto a multi-dimensional plane based on the similarity measures for the groups of log records and connecting projections for the groups of log records in chronological order using a line.
2. The method of claim 1, wherein grouping log records comprises:
   - determining time gaps between consecutive log records in the plurality of log records;
   - determining position of k-largest time gaps between log records;
   - creating a current group of log records;
   - sequentially processing log records in the plurality of log records by adding log records to the current group of log records until a stop condition is met; and
   - creating a new group of log records and then continue sequential processing of log records in response to the stop condition being met, where the stop condition is met when severity indicator of next log record changes by a predefined amount, time gap to the next log record is one of k-largest time gaps, or number of log records in the current group of log records exceeds a threshold.
3. The method of claim 2, further comprising maintaining an average severity for a fixed number of log records immediately preceding the next log record, where the stop condition is met when the severity indicator of the next log record changes by a predetermined amount in relation to the average severity for the fixed number of log records.
4. The method of any one of the preceding claims, wherein extracting one or more templates further comprises:
   - clustering the groups of log records together into one or more clusters; and
   - for each cluster, parsing log records in a given cluster and replacing variables in the log records with a wildcard character, thereby forming templates.
5. The method of any one of the preceding claims, further comprising reordering log records in the plurality of log records chronologically according to timestamps prior to the step of grouping the log records.
6. The method of any one of the preceding claims, wherein the similarity measure between a given group of log records and the remaining groups of log records are normalized and weighted logarithmically by size of each group.
7. The method of claim 6, wherein computing a similarity measure for a given group of log records further comprises
   - for each template in the given group of log records, calculate a distance for a given template in the given group of log records to each of the templates in the other group of log records; and
   - aggregating the distances for each template in the given group of log records to derive a final distance for the given group of log records.
8. The method of claim 7, wherein the distance is further defined as a Levenshtein distance between text strings comprising a template.
9. The method of any one of the preceding claims, further comprising projecting each group of log records onto a multi-dimensional plane using multi-dimensional scaling.
10. The method of any one of the preceding claims, further comprising computing a coefficient of determination for the projected log records and displaying the coefficient of determination concurrently with the projected log records.
11. The method of any one of the preceding claims, wherein visualizing the groups of log records further comprises displaying information about the templates contained in a given group of log records.
12. The method of any one of the preceding claims, further comprising displaying a summary for each group of log records along with the templates contained within.
13. The method of claim 12, further comprising summarizing template contained in a group of log records using a large language model.
14. The method of any one of the preceding claims, wherein visualizing the groups of log records further comprises at least one of:
   - replaying a step-by-step animation of the line based on temporal evolution and
   - overlaying multiple curves corresponding to log data from different systems.
15. A non-transitory computer-readable medium having computer-executable instructions that, upon execution of the instructions by a processor of a computer, cause the computer to:
   - receive a plurality of log records, where each log record includes a severity indicator;
   - group log records in the plurality of log records into groups of log records, such that log records in a given group of records are chronological;
   - for each group of log records, extract one or more templates from a given group of log records, where each template is comprised of a text string representing log records in the given group of log records;
   - for each group of log records, compute a similarity measure between a given group of log records and the remaining groups of log records; and
   - visualize the groups of log records by projecting each group of log records onto a multi-dimensional plane based on the similarity measures for the groups of log records and connecting projections for the groups of log records in chronological order using a line.
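
The grouping procedure recited in the claims (time gaps, k-largest gaps, rolling average severity, size threshold) could be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the `LogRecord` fields, the numeric severity scale, the parameter names and their default values are all hypothetical, and "exceeds a threshold" is interpreted as "adding one more record would exceed `max_size`".

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class LogRecord:
    timestamp: float  # assumed: seconds since some epoch
    severity: int     # assumed numeric scale, e.g. 0 = DEBUG .. 4 = FATAL
    message: str


def group_records(records, k=2, severity_delta=2.0, window=3, max_size=1000):
    """Split records into consecutive, chronologically ordered groups."""
    # Reorder chronologically before grouping.
    records = sorted(records, key=lambda r: r.timestamp)
    if not records:
        return []

    # gaps[i] is the time gap between records[i] and records[i + 1].
    gaps = [b.timestamp - a.timestamp for a, b in zip(records, records[1:])]
    # Positions of the k largest gaps.
    largest = set(sorted(range(len(gaps)), key=lambda i: gaps[i],
                         reverse=True)[:k])

    groups = [[records[0]]]
    # Severities of the last `window` records seen, for the rolling average.
    recent = deque([records[0].severity], maxlen=window)

    for i, nxt in enumerate(records[1:]):
        avg = sum(recent) / len(recent)
        stop = (
            abs(nxt.severity - avg) >= severity_delta  # severity jump
            or i in largest                            # one of the k largest gaps
            or len(groups[-1]) >= max_size             # current group is full
        )
        if stop:
            groups.append([])  # start a new group
        groups[-1].append(nxt)
        recent.append(nxt.severity)
    return groups
```

For example, six records with timestamps 0, 1, 2, 100, 101, 102 and severities 1, 1, 1, 1, 1, 4, processed with `k=1`, would be split after the long pause and again at the severity jump, giving groups of three, two, and one record.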

Description

Field

The present disclosure relates to techniques for summarizing and visualizing system logs using time curves.

Background

Analyzing log data presents significant challenges due to its poor formatting, immense volume (billions of log messages), and rapid growth (up to millions of records per second). As software systems continue to expand, both the quantity of system logs and the rate at which they are generated are expected to increase. Despite these challenges, log messages contain valuable information, such as error stack traces and execution details, that is often not documented elsewhere. Unfortunately, the semi-structured nature of system logs and their overwhelming volume make it difficult for both humans and large language models (LLMs) to interpret and explain the monitored processes effectively.

There are many sophisticated, automated solutions for narrow tasks such as anomaly detection or failure prediction, which have proven that log data can be useful. However, the system log is still the last place people want to look when they need to figure out what happened to a large system. Therefore, this disclosure tackles a more general problem: the lack of explainability in log data analysis. In particular, it explores the potential of visualization techniques to explain the evolution of a computing system purely from its log data, and suggests a new approach to the problem.

The most common tools for visualizing log data are dashboards, which typically display numerous detailed histograms and line plots simultaneously and allow interactive navigation (e.g., selecting timeframes or applying filtering queries). While dashboards are theoretically well suited to problem analysis, mastering them can be challenging, and advanced interactions require a significant time investment. Consequently, dashboards are generally not intended for use by a broad audience.
However, an overall analysis of system evolution can be valuable even for individuals with limited knowledge of the system, such as small online service owners, salespeople, client support staff, and developers from related domains. This underscores the need for simple and intuitive visualization strategies.

For instance, streamgraphs or stacked bar charts can show composition over time (counts of similar logs over time), but they require the data to be categorized into distinct, disjoint classes with certain properties of interest. For logs this is not easy, since the most valuable information in a system log typically lies in the semantic meaning of the log messages, which is hard to measure.

This lack of intuitive features can potentially be addressed through the use of embeddings. Training text embeddings, or using pre-trained ones, and visualizing the embedding space (for example, through down-projections onto scatter plots) can provide insight into the composition of the semantic content. While a robust embedding transformation can effectively display the distribution of log messages and their semantic similarities, it struggles to illustrate the evolution of the system state because temporal context is lost.

To incorporate chronological order, one can adapt trajectory-based methods. However, in prior work on such methods the embeddings are designed manually, which is a challenging task with log data. Additionally, the sheer volume of log messages results in an overwhelming number of points on the graph. Even with effective trajectory bundling, the results can become visually complex.

In practice, system logs can easily reach millions of lines and gigabytes of data. Therefore, grouping consecutive log messages into events is crucial not only for visualization purposes but also because of computational limitations. For example, feeding an entire log file directly into a large language model (LLM) is either infeasible or prohibitively expensive.
While using the context window may suffice for tasks such as anomaly detection, it may be unsuitable for providing a comprehensive overview of system evolution. To enable AI assistance in analyzing the evolution of the system state, one needs to develop a more abstract and concise representation of the log messages that still preserves all essential semantic information contained in the original raw data.

Thinking in terms of groups of log records (referred to as events) allows for the consideration of visually simpler techniques, such as timelines. Timelines are easy to interpret and effectively display the flow of events over time. However, they do not incorporate semantic similarity, which makes state analysis challenging. One might attempt to extend timelines to 2-D graphs or to use various bump charts. This approach, however, requires splitting the sequence into meaningful subparts based on content, as demonstrated with categories of user actions. Unfortunately, this is not directly possible for log records, since in order to divide the monitored process