CN-121996457-A - Intelligent early warning and root cause positioning method, system and storage medium for multi-source log
Abstract
The invention provides an intelligent early warning and root cause positioning method, system and storage medium for multi-source logs. First, a historical log stream is analyzed to generate an index dynamic baseline that adapts to periodic variation. Index extraction and deviation calculation are then performed on the real-time multi-source log stream, and an initial abnormal event carrying context is generated through a combined judgment of continuous deviation and associated deviation. Based on an association graph database, graph traversal and quantitative deviation analysis are performed on the event, and the event is upgraded to an early warning event when the calculated link deviation degree exceeds a threshold. A plurality of root cause speculation paths containing threshold comparisons are then executed in priority order according to a preset fault mode knowledge base, so that high probability root causes are locked quickly. Finally, all analysis processes and results are automatically integrated to generate and display a structured analysis report, thereby improving early warning accuracy and fault positioning efficiency.
Inventors
- ZENG HAIPING
- ZHOU QI
- WANG YUANPING
Assignees
- 深圳市锋驰科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260119
Claims (10)
- 1. An intelligent early warning and root cause positioning method for a multi-source log, characterized by comprising the following steps: acquiring a historical multi-source log stream, extracting a multi-dimensional index based on a preset time sequence characteristic, and generating an index dynamic baseline; acquiring a first multi-source log stream, and extracting a first multi-dimensional index from it; calculating the deviation between the first multi-dimensional index and the index dynamic baseline, and comparing the deviation with a preset deviation threshold; generating an initial abnormal event in response to continuous deviation of an index value or simultaneous deviation of multiple associated indexes; comparing the initial abnormal event against a preset association graph database to obtain a link deviation degree; if the link deviation degree exceeds a preset deviation threshold, generating an early warning event; based on a preset fault mode knowledge base, executing a plurality of root cause speculation paths on the early warning event to obtain a high probability root cause; and generating and displaying a multi-source log analysis report based on a preset structured analysis report rule according to the initial abnormal event, the early warning event and the high probability root cause.
- 2. The intelligent early warning and root cause positioning method for a multi-source log according to claim 1, wherein extracting the multi-dimensional index based on the preset time sequence characteristic and generating the index dynamic baseline specifically comprises: extracting time sequence characteristics of the multi-dimensional index from the historical multi-source log stream based on a plurality of preset time dimension periods; calculating a historical fluctuation pattern of each multi-dimensional index in each time dimension period based on the extracted time sequence characteristics; determining, according to the historical fluctuation pattern, a normal fluctuation range threshold of each multi-dimensional index in the corresponding time dimension period; and forming, according to the normal fluctuation range thresholds, the index dynamic baseline, which changes periodically along the time dimension.
- 3. The intelligent early warning and root cause positioning method for a multi-source log according to claim 1, wherein generating the initial abnormal event in response to continuous deviation of an index value or simultaneous deviation of multiple associated indexes specifically comprises: marking each index that exceeds a preset first deviation threshold as a primary deviation index according to the deviation calculation result between the first multi-dimensional index and the index dynamic baseline; in response to the deviation state of a single primary deviation index persisting beyond a preset duration threshold, determining that the index value deviates continuously; in response to a plurality of primary deviation indexes that belong to the same service link being marked within the same time window, determining that multiple associated indexes deviate simultaneously; and generating the initial abnormal event from the continuous deviation result or the simultaneous deviation result, together with the associated trigger index, time point and association tracking identifier.
- 4. The intelligent early warning and root cause positioning method for a multi-source log according to claim 3, wherein comparing the initial abnormal event against the preset association graph database to obtain the link deviation degree specifically comprises: taking the association tracking identifier contained in the initial abnormal event as a starting node, and performing a one-degree or multi-degree relationship expansion search in the association graph database; acquiring all associated log event entities and infrastructure entities within an associated time window; counting the number and type distribution of the associated log event entities, and the number and topological relationships of the associated infrastructure entities; and, based on the statistical results, separately calculating an event diffusion scale proportion, an entity influence coverage rate and a critical path performance attenuation amplitude, and fusing the proportion, the coverage rate and the amplitude to obtain the link deviation degree.
- 5. The intelligent early warning and root cause positioning method for a multi-source log according to claim 4, wherein executing the plurality of root cause speculation paths on the early warning event based on the preset fault mode knowledge base to obtain the high probability root cause specifically comprises: sequentially selecting and executing speculation paths according to a preset priority order in the fault mode knowledge base, wherein the speculation paths at least comprise an infrastructure layer inspection, a dependent service layer inspection and a network security layer inspection; in the speculation path currently being executed, performing threshold comparisons, in a preset logical order, on the entities and indexes associated with the early warning event; and, in response to all threshold comparison conditions in the current path being triggered in succession, terminating execution of the subsequent speculation paths and determining the inspection object pointed to by the current path as the high probability root cause.
- 6. The intelligent early warning and root cause positioning method for a multi-source log according to claim 1, wherein generating and displaying the multi-source log analysis report based on the preset structured analysis report rule specifically comprises: integrating the judgment process and result data of the initial abnormal event, the early warning event and the high probability root cause; organizing the integrated data into structured paragraphs according to the structured analysis report rule, wherein the structured paragraphs comprise the alarm triggering condition, an association graph snapshot, the threshold comparison logic chain and the basis for the root cause judgment; and associating the structured paragraphs with corresponding visualization elements, and generating and displaying the multi-source log analysis report.
- 7. An intelligent early warning and root cause positioning system for a multi-source log, characterized by comprising a memory and a processor, wherein the memory stores an intelligent early warning and root cause positioning method program for a multi-source log, and the program, when executed by the processor, implements the following steps: acquiring a historical multi-source log stream, extracting a multi-dimensional index based on a preset time sequence characteristic, and generating an index dynamic baseline; acquiring a first multi-source log stream, and extracting a first multi-dimensional index from it; calculating the deviation between the first multi-dimensional index and the index dynamic baseline, and comparing the deviation with a preset deviation threshold; generating an initial abnormal event in response to continuous deviation of an index value or simultaneous deviation of multiple associated indexes; comparing the initial abnormal event against a preset association graph database to obtain a link deviation degree; if the link deviation degree exceeds a preset deviation threshold, generating an early warning event; based on a preset fault mode knowledge base, executing a plurality of root cause speculation paths on the early warning event to obtain a high probability root cause; and generating and displaying a multi-source log analysis report based on a preset structured analysis report rule according to the initial abnormal event, the early warning event and the high probability root cause.
- 8. The intelligent early warning and root cause positioning system for a multi-source log according to claim 7, wherein extracting the multi-dimensional index based on the preset time sequence characteristic and generating the index dynamic baseline specifically comprises: extracting time sequence characteristics of the multi-dimensional index from the historical multi-source log stream based on a plurality of preset time dimension periods; calculating a historical fluctuation pattern of each multi-dimensional index in each time dimension period based on the extracted time sequence characteristics; determining, according to the historical fluctuation pattern, a normal fluctuation range threshold of each multi-dimensional index in the corresponding time dimension period; and forming, according to the normal fluctuation range thresholds, the index dynamic baseline, which changes periodically along the time dimension.
- 9. The intelligent early warning and root cause positioning system for a multi-source log according to claim 7, wherein generating the initial abnormal event in response to continuous deviation of an index value or simultaneous deviation of multiple associated indexes comprises: marking each index that exceeds a preset first deviation threshold as a primary deviation index according to the deviation calculation result between the first multi-dimensional index and the index dynamic baseline; in response to the deviation state of a single primary deviation index persisting beyond a preset duration threshold, determining that the index value deviates continuously; in response to a plurality of primary deviation indexes that belong to the same service link being marked within the same time window, determining that multiple associated indexes deviate simultaneously; and generating the initial abnormal event from the continuous deviation result or the simultaneous deviation result, together with the associated trigger index, time point and association tracking identifier.
- 10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer readable storage medium includes an intelligent early warning and root cause positioning method program of a multi-source log, and when the intelligent early warning and root cause positioning method program of the multi-source log is executed by a processor, the steps of the intelligent early warning and root cause positioning method of the multi-source log according to any one of claims 1 to 6 are implemented.
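The baseline and trigger steps recited in claims 2 and 3 can be illustrated with a minimal sketch. It is not the patented implementation: the `Sample` fields, the hour-of-day period, the mean plus or minus `k` standard deviations range, and the `min_run` and `window` parameters are all assumptions chosen for illustration, since the claims only require a periodic normal fluctuation range, a duration threshold and a shared time window.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Sample:
    metric: str   # index name, e.g. "latency_ms" (hypothetical)
    link: str     # service link the index belongs to (hypothetical)
    hour: int     # hour-of-day slot, standing in for one time dimension period
    tick: int     # increasing sample counter used as the time point
    value: float


def build_baseline(history, k=3.0):
    """Map (metric, hour) to an assumed normal range of mean +/- k * std."""
    buckets = defaultdict(list)
    for s in history:
        buckets[(s.metric, s.hour)].append(s.value)
    baseline = {}
    for key, values in buckets.items():
        mu, sigma = mean(values), pstdev(values)
        baseline[key] = (mu - k * sigma, mu + k * sigma)
    return baseline


def deviates(sample, baseline):
    """True when the sample falls outside the range for its period."""
    bounds = baseline.get((sample.metric, sample.hour))
    if bounds is None:          # no history for this slot: treat as normal
        return False
    low, high = bounds
    return not (low <= sample.value <= high)


def detect_initial_events(stream, baseline, min_run=3, window=2):
    """Return (mode, link, metrics, tick) tuples for both trigger modes."""
    run_length = defaultdict(int)   # (link, metric) -> consecutive deviations
    recent = defaultdict(dict)      # link -> {metric: last deviating tick}
    events = []
    for s in sorted(stream, key=lambda x: x.tick):
        key = (s.link, s.metric)
        if not deviates(s, baseline):
            run_length[key] = 0
            continue
        run_length[key] += 1
        recent[s.link][s.metric] = s.tick
        # Mode 1: one index stays deviated for min_run consecutive samples.
        if run_length[key] == min_run:
            events.append(("continuous", s.link, (s.metric,), s.tick))
        # Mode 2: several indexes on the same link deviate within the window.
        together = tuple(sorted(
            m for m, t in recent[s.link].items() if s.tick - t <= window))
        if len(together) >= 2:
            events.append(("simultaneous", s.link, together, s.tick))
    return events
```

In this sketch a simultaneous event is emitted each time the condition holds, so a production version would likely deduplicate events per link and window.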
Description
Intelligent early warning and root cause positioning method, system and storage medium for multi-source log
Technical Field
The invention relates to the field of information system observability, and in particular to an intelligent early warning and root cause positioning method, system and storage medium for a multi-source log.
Background
With the rapid development of distributed systems and cloud-native technologies, from basic micro-service monitoring to large-model inference and other highly concurrent, asynchronous complex systems, achieving accurate anomaly early warning and efficient root cause positioning from massive, heterogeneous multi-source logs has remained a key technical difficulty. In the prior art, first, traditional monitoring methods mostly rely on preset static thresholds to raise single-index alarms; they cannot adapt to the periodic and seasonal dynamic changes of system load and service patterns, so false alarms and missed alarms occur frequently. Second, although some technologies achieve log collection and preliminary association, anomaly judgment often treats the fluctuation of a single index in isolation and lacks the ability to identify coordinated anomaly patterns across multiple indexes in the same service link. Third, in the correlation analysis stage, existing schemes struggle to make effective use of the global topology and real-time dependency relationships of the system, so the influence range and degree of diffusion of a single abnormal event cannot be evaluated quantitatively, and early warning decisions lack an accurate and objective quantitative basis. Finally, after an early warning is generated, the root cause positioning process is generally tedious and depends heavily on expert experience; a structured, automated investigation path is lacking, and the mean time to repair is long.
Therefore, a closed-loop operation and maintenance analysis technology capable of intelligent perception, accurate assessment and automatic root cause positioning is needed.
Disclosure of Invention
In view of the above problems, the present invention aims to provide an intelligent early warning and root cause positioning method, system and storage medium for multi-source logs, which achieve a high degree of automation in early warning generation and root cause positioning by constructing a dynamic baseline, multi-mode anomaly triggering, graph-based quantitative evaluation and structured root cause speculation. First, by establishing a dynamic baseline bound to multiple time dimension periods, the anomaly detection benchmark fluctuates adaptively. Second, by fusing a judgment mechanism that covers two modes, "continuous deviation" and "simultaneous associated deviation", and associating the service link context, sensitivity to early compound anomalies is improved. Third, through expansion search and multi-dimensional quantitative influence analysis based on an association graph database, the influence range and severity of an abnormal event are evaluated. Meanwhile, by presetting priority-based fault speculation paths and a chained threshold triggering mechanism, expert diagnosis logic is simulated and root cause positioning time is shortened. Finally, by converting the complex analysis process and results into a clearly structured, standardized report that combines graphics and text, the readability of operation and maintenance information is improved and a visual basis is provided for decision-making.
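The graph-based evaluation and the priority-ordered speculation paths summarized above can be sketched as follows. This is an illustrative sketch only: the adjacency-dictionary graph, the weighted-sum fusion with weights `(0.4, 0.3, 0.3)`, the path tuples and every metric name are assumptions, because the text states that the three quantities are fused and that thresholds are compared in sequence without fixing a formula or data model.

```python
from collections import deque


def expand(graph, start, max_degree=2):
    """Breadth-first relationship expansion up to max_degree hops from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_degree:
            continue                      # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return set(depth) - {start}


def link_deviation(diffusion, coverage, attenuation, weights=(0.4, 0.3, 0.3)):
    """Fuse the three quantities; a weighted sum is assumed here."""
    return sum(w * x for w, x in zip(weights, (diffusion, coverage, attenuation)))


def infer_root_cause(paths, metrics):
    """Run speculation paths in priority order and stop at the first full match.

    paths: list of (priority, inspection_object, [(metric_name, threshold), ...]),
    where a lower priority number is checked first.
    """
    for _, target, checks in sorted(paths, key=lambda p: p[0]):
        # all() short-circuits, so checks are evaluated in their listed order
        # and a path is abandoned at its first threshold that is not exceeded.
        if all(metrics.get(name, 0.0) > limit for name, limit in checks):
            return target
    return None
```

A call such as `expand(graph, "trace-1")` would start from the association tracking identifier of the initial abnormal event. The entities it returns feed the counts behind the three fused quantities, and an early warning event is raised when `link_deviation(...)` exceeds the chosen threshold.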
The first aspect of the invention provides an intelligent early warning and root cause positioning method for a multi-source log, which comprises the following steps: acquiring a historical multi-source log stream, extracting a multi-dimensional index based on a preset time sequence characteristic, and generating an index dynamic baseline; acquiring a first multi-source log stream, and extracting a first multi-dimensional index from it; calculating the deviation between the first multi-dimensional index and the index dynamic baseline, and comparing the deviation with a preset deviation threshold; generating an initial abnormal event in response to continuous deviation of an index value or simultaneous deviation of multiple associated indexes; comparing the initial abnormal event against a preset association graph database to obtain a link deviation degree; if the link deviation degree exceeds a preset deviation threshold, generating an early warning event; based on a preset fault mode knowledge base, executing a plurality of root cause speculation paths on the early warning event to obtain a high probability root cause; and generating and displaying a multi-source log analysis report based on a preset structured analysis report rule according to the initial abnormal event, the early warning event and the high probability root cause. In this scheme, the extractio