US-20260127208-A1 - SYSTEM AND METHOD FOR AN EXTENDABLE LOG ANALYSIS FRAMEWORK UTILIZING LARGE LANGUAGE MODELS

Abstract

A method for analyzing log data is disclosed. The method includes receiving log data; applying a plurality of filter modules to the log data to generate a plurality of filtered results; and selectively activating a subset of a plurality of checker modules to analyze the plurality of filtered results and generate a plurality of analysis results, wherein each of the subset of the plurality of checker modules is activated based on an established relationship with at least one of the plurality of filter modules and based on the filtered results. The method further includes generating report data from the plurality of analysis results by at least one writer module. A system for performing the method is also disclosed.
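
The filter, checker, and writer stages summarized above can be illustrated with a minimal sketch. It is not the disclosed implementation: the names `Filter`, `Checker`, `run_pipeline`, and the sample log lines are hypothetical, and the sketch assumes that a checker is "activated based on the filtered results" whenever at least one of its upstream filters returns a non-empty result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names here are hypothetical illustrations, not taken from the disclosure.

@dataclass
class Filter:
    name: str
    keep: Callable[[str], bool]          # predicate applied to each log line

@dataclass
class Checker:
    name: str
    upstream: List[str]                  # names of the filters it is related to
    analyze: Callable[[List[str]], str]  # filtered lines -> analysis result

def run_pipeline(log_lines, filters, checkers, writer):
    # Step 1: apply every filter module to the log data.
    filtered: Dict[str, List[str]] = {
        f.name: [ln for ln in log_lines if f.keep(ln)] for f in filters
    }
    # Step 2: activate only the checkers whose upstream filters produced output.
    # Assumption: "based on the filtered results" is read as "non-empty".
    results: Dict[str, str] = {}
    for c in checkers:
        lines = [ln for name in c.upstream for ln in filtered.get(name, [])]
        if lines:
            results[c.name] = c.analyze(lines)
    # Step 3: the writer module turns the analysis results into report data.
    return writer(results)

log = [
    "12:00:01 wifi: connected to AP",
    "12:00:05 wifi: ERROR beacon lost",
    "12:00:09 audio: stream started",
]
filters = [
    Filter("wifi_errors", lambda ln: "wifi" in ln and "ERROR" in ln),
    Filter("modem_errors", lambda ln: "modem" in ln and "ERROR" in ln),
]
checkers = [
    Checker("wifi_checker", ["wifi_errors"], lambda ls: f"{len(ls)} wifi error(s)"),
    Checker("modem_checker", ["modem_errors"], lambda ls: f"{len(ls)} modem error(s)"),
]

def writer(results):
    # Aggregate all analysis results into one plain-text report.
    return "\n".join(f"{name}: {res}" for name, res in sorted(results.items()))

report = run_pipeline(log, filters, checkers, writer)
print(report)
```

In this sketch the modem checker never runs, because its only upstream filter matched nothing; only the Wi-Fi checker contributes to the report.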

Inventors

Chien-Han SU
Hung-Chun Liu
Po-Yuan Jeng
Yu-Ting WAN
Lei Chen
Yung-Chih CHIU
Yu-Chieh Lin
Ji-Jie Lin

Assignees

MEDIATEK INC.

Dates

Publication Date: 2026-05-07
Application Date: 2025-11-06

Claims (20)

1. A method for analyzing log data, the method performed by a system comprising a processor and a memory, the method comprising: receiving log data by the processor; applying a plurality of filter modules to the log data by the processor to generate a plurality of filtered results; selectively activating a subset of a plurality of checker modules by the processor to analyze the plurality of filtered results and generate a plurality of analysis results, wherein each of the subset of the plurality of checker modules is activated based on an established relationship with at least one of the plurality of filter modules and based on the filtered results generated by the at least one of the plurality of filter modules; and generating report data from the plurality of analysis results by at least one writer module executed by the processor.
2. The method of claim 1, further comprising: defining an analysis workflow by a large language model (LLM) agent executed by the processor prior to applying the plurality of filter modules; wherein the analysis workflow comprises the plurality of filter modules, the plurality of checker modules, and the at least one writer module, and wherein the established relationship is defined within the analysis workflow.
3. The method of claim 2, wherein the LLM agent defines the analysis workflow further based on request data received by the processor, and wherein the request data comprises at least one of a user complaint or a user prompt.
4. The method of claim 3, wherein defining the analysis workflow comprises: selecting the at least one writer module from a plurality of predefined writer modules based on the request data; identifying the plurality of checker modules based on upstream requirements associated with the selected at least one writer module; and identifying the plurality of filter modules based on upstream requirements associated with the identified plurality of checker modules.
5. The method of claim 1, wherein the established relationship specifies at least one of: at least one of the plurality of filter modules is an upstream module for at least two of the plurality of checker modules; or at least two of the plurality of filter modules are upstream modules for at least one of the plurality of checker modules.
6. The method of claim 1, wherein generating the report data comprises aggregating the plurality of analysis results from one or more of the plurality of checker modules by the at least one writer module.
7. The method of claim 1, further comprising: receiving review feedback data by the processor, wherein the review feedback data is generated by a reviewer based on the report data; and re-applying the plurality of filter modules to the log data by the processor based on the review feedback data.
8. The method of claim 2, wherein the analysis workflow is structured as a directed acyclic graph (DAG), wherein the plurality of filter modules, the plurality of checker modules, and the at least one writer module constitute a plurality of nodes in the DAG.
9. The method of claim 2, wherein the LLM agent defines the analysis workflow by selecting from a plurality of predefined modules stored in the memory, and wherein the plurality of predefined modules comprise a plurality of writer templates and a plurality of checker rules embodying domain expertise.
10. The method of claim 1, wherein the report data comprises a plurality of predefined instructions, and wherein each of the plurality of predefined instructions is associated with at least one of the plurality of analysis results to provide actionable guidance to a reviewer.
11. A system for analyzing log data, the system comprising: a memory configured to store log data, a plurality of filter modules, a plurality of checker modules, and at least one writer module; and a processor coupled to the memory, the processor configured to: receive the log data; apply the plurality of filter modules to the log data to generate a plurality of filtered results; selectively activate a subset of the plurality of checker modules to analyze the plurality of filtered results and generate a plurality of analysis results, wherein each of the subset of the plurality of checker modules is activated based on an established relationship with at least one of the plurality of filter modules and based on the filtered results generated by the at least one of the plurality of filter modules; and execute the at least one writer module to generate report data from the plurality of analysis results.
12. The system of claim 11, wherein the processor is further configured to: execute a large language model (LLM) agent to define an analysis workflow prior to applying the plurality of filter modules; wherein the analysis workflow comprises the plurality of filter modules, the plurality of checker modules, and the at least one writer module, and wherein the established relationship is defined within the analysis workflow.
13. The system of claim 12, wherein the LLM agent defines the analysis workflow further based on request data received by the processor, and wherein the request data comprises at least one of a user complaint or a user prompt.
14. The system of claim 13, wherein the processor is configured to define the analysis workflow by: selecting the at least one writer module from a plurality of predefined writer modules based on the request data; identifying the plurality of checker modules based on upstream requirements associated with the selected at least one writer module; and identifying the plurality of filter modules based on upstream requirements associated with the identified plurality of checker modules.
15. The system of claim 11, wherein the established relationship specifies at least one of: at least one of the plurality of filter modules is an upstream module for at least two of the plurality of checker modules; or at least two of the plurality of filter modules are upstream modules for at least one of the plurality of checker modules.
16. The system of claim 11, wherein the processor is configured to execute the at least one writer module to generate the report data by aggregating the plurality of analysis results from one or more of the plurality of checker modules.
17. The system of claim 11, wherein the processor is further configured to: receive review feedback data, wherein the review feedback data is generated by a reviewer based on the report data; and re-apply the plurality of filter modules to the log data based on the review feedback data.
18. The system of claim 12, wherein the analysis workflow is structured as a directed acyclic graph (DAG), and wherein the plurality of filter modules, the plurality of checker modules, and the at least one writer module constitute a plurality of nodes in the DAG.
19. The system of claim 12, wherein the memory is further configured to store a plurality of predefined modules, and wherein the LLM agent defines the analysis workflow by selecting from the plurality of predefined modules, and wherein the plurality of predefined modules comprise a plurality of writer templates and a plurality of checker rules embodying domain expertise.
20. The system of claim 11, wherein the report data comprises a plurality of predefined instructions, and wherein each of the plurality of predefined instructions is associated with at least one of the plurality of analysis results to provide actionable guidance to a reviewer.
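
Claims 4 and 14 describe building the workflow backward from the writer, and claims 8 and 18 describe the result as a DAG. The sketch below illustrates one way this could look, under stated assumptions: the module names and the `UPSTREAM` registry are hypothetical, the writer is hard-coded rather than selected by an LLM agent from request data, and Python's standard `graphlib.TopologicalSorter` is used only to obtain an execution order. None of these choices come from the claims.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: each module name maps to the upstream modules it requires.
# kernel_filter feeds two checkers, and memory_checker has two upstream filters,
# mirroring the two relationship types listed in claims 5 and 15.
UPSTREAM = {
    "crash_report_writer": ["stack_checker", "memory_checker"],
    "stack_checker": ["kernel_filter"],
    "memory_checker": ["kernel_filter", "oom_filter"],
    "wifi_checker": ["wifi_filter"],
    "kernel_filter": [],
    "oom_filter": [],
    "wifi_filter": [],
}

def define_workflow(writer, upstream=UPSTREAM):
    """Walk upstream requirements from the writer back to the filters."""
    graph = {}
    pending = [writer]
    while pending:
        node = pending.pop()
        if node in graph:
            continue                      # already resolved via another path
        graph[node] = set(upstream[node])
        pending.extend(upstream[node])
    return graph

# Assumption: the writer would normally be chosen from the request data.
workflow = define_workflow("crash_report_writer")

# static_order() yields each node after all of its upstream nodes and raises
# graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(workflow).static_order())
print(order)
```

Modules unrelated to the selected writer (here `wifi_checker` and `wifi_filter`) never enter the workflow, and the resulting order always places filters before the checkers that depend on them and the writer last.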

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 63/716,772, filed on Nov. 6, 2024. The content of the application is incorporated herein by reference.

BACKGROUND

In the development of modern software and hardware products, system log analysis is an essential process for debugging and monitoring. Engineers analyze log data, which comprises a detailed record of system events and errors, to identify the root causes of issues. Conventionally, log analysis is performed through manual inspection, a process that is inefficient and prone to error when dealing with large volumes of data. The task of locating a single problematic entry within extensive logs is time-consuming and requires significant domain expertise.

While automated log analysis tools such as Logstash and Splunk exist, they present their own limitations. A primary drawback is that these conventional solutions perform optimally only with structured log data, requiring additional parsing and normalization for common unstructured free-text logs. Furthermore, many existing tools depend on brittle pattern-matching mechanisms, such as regular expressions (regex), which are tedious to create and maintain. Implementing custom, product-specific rules within these platforms also involves a steep learning curve for proprietary configurations, which hinders flexibility and rapid adaptation.

Therefore, a need exists for a more flexible and extendable log analysis framework capable of handling unstructured logs from multiple domains, while reducing the manual effort associated with pattern maintenance and custom logic implementation.

SUMMARY

In one embodiment, a method for analyzing log data performed by a system comprising a processor and a memory is disclosed. The method comprises receiving log data by the processor; applying a plurality of filter modules to the log data by the processor to generate a plurality of filtered results; selectively activating a subset of a plurality of checker modules by the processor to analyze the plurality of filtered results and generate a plurality of analysis results, wherein each of the subset of the plurality of checker modules is activated based on an established relationship with at least one of the plurality of filter modules and based on the filtered results generated by the at least one of the plurality of filter modules; and generating report data from the plurality of analysis results by at least one writer module executed by the processor.

In another embodiment, a system for analyzing log data is disclosed. The system comprises a memory and a processor. The memory is configured to store log data, a plurality of filter modules, a plurality of checker modules, and at least one writer module. The processor is coupled to the memory and is configured to receive the log data; apply the plurality of filter modules to the log data to generate a plurality of filtered results; selectively activate a subset of the plurality of checker modules to analyze the plurality of filtered results and generate a plurality of analysis results, wherein each of the subset of the plurality of checker modules is activated based on an established relationship with at least one of the plurality of filter modules and based on the filtered results generated by the at least one of the plurality of filter modules; and execute the at least one writer module to generate report data from the plurality of analysis results.

These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram of a system for analyzing log data according to an embodiment of the present invention.
FIG. 2 is a sequence and data flow diagram illustrating a process for analyzing log data performed by the system in FIG. 1.
FIG. 3 is a block diagram illustrating a three-layer framework for the log analysis method according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for analyzing log data according to an embodiment of the present invention.

DETAILED DESCRIPTION

FIG. 1 is a schematic block diagram of a system 100 for analyzing log data according to an embodiment of the present invention. The system 100 provides an extendable and automated framework for log analysis that overcomes the limitations of conventional manual inspection and brittle pattern-matching tools. The system 100 is configured to dynamically construct and execute a customized analysis workflow, thereby improving the efficiency and accuracy of identifying issues within complex log data. The system 100 includes a data collection module 10, a processor 11, and a memory 12. In some embodiments, the system 100 further interacts with a data reviewer 13. The system 100 can also