CN-122019219-A - Log abnormality diagnosis and repair method and system based on collaboration of knowledge base and large model

Abstract

The invention discloses a log abnormality diagnosis and repair method and system based on collaboration of a knowledge base and a large model, and relates to the technical field of log analysis. The method comprises: a log collection and processing process, which acquires system running logs from different sources, converts them to a unified format, and aggregates them; a log pre-analysis process, which cuts the logs, splits them into dimensions, and performs a preliminary analysis; a log knowledge base matching process, which performs multidimensional matching based on a vector database and the log knowledge base according to the preliminary analysis result; and a large model reasoning process, which generates an abnormality judgment and repair suggestions based on the matching result and the current log content. By introducing a multidimensional log pre-analysis mechanism, a knowledge base structure that supports association with preceding anomalies, and a collaborative reasoning framework combining a large model with vector retrieval, the method and system effectively overcome the shortcomings of the prior art in accuracy, generalization, automation, and intelligence.

Inventors

  • WANG WEIDI
  • YUAN JIACHENG

Assignees

  • 福建星网智慧软件有限公司
  • 福建星网锐捷通讯股份有限公司

Dates

Publication Date
20260512
Application Date
20251209

Claims (10)

  1. A log abnormality diagnosis and repair method based on the collaboration of a knowledge base and a large model, characterized by comprising the following steps: a log collection and processing process, which obtains system running logs from different sources, converts them to a unified format, and aggregates them; a log pre-analysis process, which cuts the logs, splits them into dimensions, and performs a preliminary analysis; a log knowledge base matching process, which performs multidimensional matching based on a vector database and the log knowledge base according to the preliminary analysis result; and a large model reasoning process, which generates an abnormality judgment and repair suggestions based on the matching result and the current log content.
  2. The method of claim 1, wherein the vector database comprises a vector representation of an anomaly log instance, a vector representation of an anomaly log description, and a corresponding anomaly log ID, and wherein the structure of the log knowledge base comprises an anomaly log ID, an anomaly log instance, an anomaly repair step, an anomaly log description, a preceding anomaly log ID, and a preceding anomaly log description.
  3. The method of claim 1, wherein the log pre-analysis process specifically comprises cutting a log file into a plurality of small files, invoking a large language model on each log line to perform semantic analysis, and outputting the following structured dimensions: the original log text; the log pre-attribution problem, i.e. the fault type to which the large model, based on its own knowledge, considers the log may be attributed; the log pre-solution, i.e. the repair direction preliminarily suggested by the large model; the previous error source log, which records the content of the most recent earlier line determined to be an error, for context association; and the timestamp, i.e. the time information extracted from the log, marked as missing if the log contains no time information.
  4. The method according to claim 2, wherein the log knowledge base matching process specifically comprises: vectorizing the log pre-attribution problem, searching the vector database for similar anomaly descriptions, and returning an ID list and a first similarity score; vectorizing the log pre-solution, searching the vector database for similar repair descriptions, and returning an ID list and a second similarity score; vectorizing the previous error source log, searching the vector database for preceding anomalies, and returning an ID list and a third similarity score; fusing the three similarity scores to generate a comprehensive matching ranking; and obtaining the corresponding repair steps from the log knowledge base according to the Top-N matching results.
  5. The method of claim 4, wherein the large model reasoning process specifically comprises inputting the current log, the matched repair steps, and the context information into the large model to generate a final diagnosis conclusion and an executable repair suggestion.
  6. A log abnormality diagnosis and repair system based on the collaboration of a knowledge base and a large model, characterized by comprising the following components: a log collection and processing module, used for obtaining system running logs from different sources, converting them to a unified format, and aggregating them; a log pre-analysis module, used for cutting the logs, splitting them into dimensions, and performing a preliminary analysis; a log knowledge base matching module, used for performing multidimensional matching based on a vector database and the log knowledge base according to the preliminary analysis result; and a large model reasoning module, used for generating an abnormality judgment and repair suggestions based on the matching result and the current log content.
  7. The system of claim 6, wherein the vector database comprises a vector representation of an anomaly log instance, a vector representation of an anomaly log description, and a corresponding anomaly log ID, and wherein the structure of the log knowledge base comprises an anomaly log ID, an anomaly log instance, an anomaly repair step, an anomaly log description, a preceding anomaly log ID, and a preceding anomaly log description.
  8. The system of claim 6, wherein the log pre-analysis module comprises: a cutting sub-module, used for cutting the log file into a plurality of small files; and a structuring sub-module, used for invoking a large language model on each log line to perform semantic analysis and outputting the following structured dimensions: the original log text; the log pre-attribution problem, i.e. the fault type to which the large model, based on its own knowledge, considers the log may be attributed; the log pre-solution, i.e. the repair direction preliminarily suggested by the large model; the previous error source log, which records the content of the most recent earlier line determined to be an error, for context association; and the timestamp, i.e. the time information extracted from the log, marked as missing if the log contains no time information.
  9. The system of claim 8, wherein the log knowledge base matching module specifically comprises: a first vectorization sub-module, used for vectorizing the log pre-attribution problem, searching the vector database for similar anomaly descriptions, and returning an ID list and a first similarity score; a second vectorization sub-module, used for vectorizing the log pre-solution, searching the vector database for similar repair descriptions, and returning an ID list and a second similarity score; a third vectorization sub-module, used for vectorizing the previous error source log, searching the vector database for preceding anomalies, and returning an ID list and a third similarity score; a comprehensive ranking sub-module, used for fusing the three similarity scores to generate a comprehensive matching ranking; and a repair step acquisition sub-module, used for obtaining the corresponding repair steps from the log knowledge base according to the Top-N matching results.
  10. The system of claim 9, wherein the large model reasoning module is specifically configured to input the current log, the matched repair steps, and the context information into the large model to generate the final diagnosis conclusion and the executable repair suggestion.
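
The claims state that the three similarity scores are fused into a comprehensive ranking and that repair steps are fetched for the Top-N matches, but they do not say how the fusion is computed. The following is a minimal sketch that assumes a weighted sum per anomaly log ID; the function names `fuse_scores` and `repair_steps_for`, the equal default weights, and the dictionary-shaped knowledge base are illustrative assumptions, not part of the claimed method.

```python
from collections import defaultdict


def fuse_scores(problem_hits, solution_hits, previous_error_hits,
                weights=(1.0, 1.0, 1.0), top_n=3):
    """Fuse three retrieval result lists into one Top-N ranking.

    Each *_hits argument is a list of (anomaly_log_id, similarity) pairs,
    one list per query dimension. ASSUMPTION: fusion is a weighted sum;
    the source does not specify the fusion formula.
    """
    fused = defaultdict(float)
    for hits, weight in zip((problem_hits, solution_hits, previous_error_hits), weights):
        for anomaly_log_id, similarity in hits:
            fused[anomaly_log_id] += weight * similarity
    # Highest fused score first; ties broken by ID so the order is deterministic.
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


def repair_steps_for(ranked, knowledge_base):
    """Look up repair steps for each ranked ID.

    knowledge_base is a hypothetical dict mapping anomaly log ID to repair steps.
    IDs missing from the knowledge base are skipped.
    """
    return [(anomaly_log_id, knowledge_base[anomaly_log_id])
            for anomaly_log_id, _ in ranked
            if anomaly_log_id in knowledge_base]
```

An ID returned by more than one of the three searches accumulates score from each, so a knowledge base entry that matches on the problem, the solution, and the previous error outranks one that matches on a single dimension.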

Description

Log abnormality diagnosis and repair method and system based on collaboration of knowledge base and large model

Technical Field

The invention relates to the technical field of log analysis, in particular to a log abnormality diagnosis and repair method and system based on the collaboration of a knowledge base and a large model.

Background

In modern complex software systems, logs are an important basis for diagnosing system anomalies and locating the root cause of faults. However, log analysis typically faces the following challenges:

  • The system often has multiple log collectors, which generate massive heterogeneous logs.
  • Anomaly identification is difficult: not all logs indicate a real anomaly, so warnings and debugging information must be distinguished from real faults.
  • Repair relies on experience: even when an anomaly is identified, the repair scheme depends heavily on the experience of operation and maintenance personnel, and standardized, automated means are lacking.
  • Context is missing: a single log line rarely reflects the full picture of a fault, so a comprehensive judgment must combine the historical error context.

In the prior art, some schemes use a rule engine or simple keyword matching to detect anomalies, but generalize poorly; others introduce a machine learning model for classification, but lack deep understanding of log semantics and the ability to generate repair suggestions. In addition, most existing knowledge bases are static rule bases that are difficult to adapt dynamically to new anomaly patterns.

Therefore, there is a need for an intelligent log analysis method and system that can automatically determine whether a log is truly anomalous, and that combines historical knowledge with large model reasoning capabilities to generate an executable repair suggestion.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a log abnormality diagnosis and repair method and system based on the collaboration of a knowledge base and a large model. By introducing a multidimensional log pre-analysis mechanism, a knowledge base structure that supports association with preceding anomalies, and a collaborative reasoning framework combining a large model with vector retrieval, it effectively overcomes the shortcomings of the prior art in accuracy, generalization, automation, and intelligence.

In a first aspect, the present invention provides a log abnormality diagnosis and repair method based on the collaboration of a knowledge base and a large model, including:

  • a log collection and processing process, which obtains system running logs from different sources, converts them to a unified format, and aggregates them;
  • a log pre-analysis process, which cuts the logs, splits them into dimensions, and performs a preliminary analysis;
  • a log knowledge base matching process, which performs multidimensional matching based on a vector database and the log knowledge base according to the preliminary analysis result;
  • a large model reasoning process, which generates an abnormality judgment and repair suggestions based on the matching result and the current log content.

Further, the vector database comprises vector representations of anomaly log instances, vector representations of anomaly log descriptions, and the corresponding anomaly log IDs, and the structure of the log knowledge base comprises an anomaly log ID, an anomaly log instance, an anomaly repair step, an anomaly log description, a preceding anomaly log ID, and a preceding anomaly log description.
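
The knowledge base and vector database fields listed above can be pictured as two record types linked by the anomaly log ID. The sketch below is only one possible layout under stated assumptions: the class names, the Python field names, the use of a list of strings for the repair steps, and the helper `preceding_chain` are all hypothetical; the source names the fields but not their types or storage format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class KnowledgeBaseRecord:
    """One log knowledge base entry (field types are assumptions)."""
    anomaly_log_id: str
    anomaly_log_instance: str
    anomaly_repair_steps: List[str]
    anomaly_log_description: str
    preceding_anomaly_log_id: Optional[str] = None
    preceding_anomaly_log_description: Optional[str] = None


@dataclass
class VectorEntry:
    """One vector database entry, keyed by the same anomaly log ID."""
    anomaly_log_id: str
    instance_vector: List[float]
    description_vector: List[float]


def preceding_chain(record_id: str,
                    records: Dict[str, KnowledgeBaseRecord]) -> List[str]:
    """Follow preceding-anomaly links from one record back to the earliest one.

    Hypothetical helper: stops at a missing link, an unknown ID, or a cycle.
    """
    chain: List[str] = []
    seen = set()
    current = records.get(record_id)
    while current is not None and current.anomaly_log_id not in seen:
        seen.add(current.anomaly_log_id)
        chain.append(current.anomaly_log_id)
        next_id = current.preceding_anomaly_log_id
        current = records.get(next_id) if next_id else None
    return chain
```

Storing the preceding anomaly ID on each record is what lets a match on the current error be traced back to an earlier error that may have caused it.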
Further, the log pre-analysis process specifically comprises cutting a log file into a plurality of small files, invoking a large language model on each log line to perform semantic analysis, and outputting the following structured dimensions:

  • the original log text;
  • the log pre-attribution problem, i.e. the fault type to which the large model, based on its own knowledge, considers the log may be attributed;
  • the log pre-solution, i.e. the repair direction preliminarily suggested by the large model;
  • the previous error source log, which records the content of the most recent earlier line determined to be an error, for context association;
  • the timestamp, i.e. the time information extracted from the log, marked as missing if the log contains no time information.

Further, the log knowledge base matching process specifically includes: vectorizing the log pre-attribution problem, searching the vector database for similar anomaly descriptions, and returning an ID list and a first similarity score; vectorizing the log pre-solution, searching the vector database for similar repair descriptions, and returning an ID list and a second similarity score; vectorizing the previous error source log, searching the vector database for preceding anomalies, and returning an ID list and a third similarity score; fusing the three similarity scores to generate comprehensive matching seq