CN-122028089-A - Intelligent operation and maintenance method and system of core network and electronic equipment

Abstract

The invention relates to the technical field of communication network operation and maintenance, and in particular to an intelligent operation and maintenance method and system for a core network, and an electronic device. In the method, a data acquisition layer aligns the KPIs, logs and virtual resource indicators of the core network onto a unified time series; a data analysis layer calls the Merlion library to perform anomaly scoring and output event profiles; a diagnosis decision layer uses NOTEARS to learn a time-lagged causal graph among the indicators and couples it with a Drools rule engine to generate a list of fault root causes ranked by degree of suspicion; and an operation and maintenance execution layer automatically triggers AMF/SMF virtualized capacity expansion or parameter adjustment according to policy and feeds the results back to the acquisition layer to form a closed loop. Live-network verification shows that the method compresses the traditional 72-hour optimization cycle to the minute level, reduces manual workload by 90%, and improves key KPIs such as the registration success rate by 1.8%.
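
The abstract names NOTEARS as the learner for the causal graph. As a rough illustration only, and not the patent's implementation: NOTEARS treats structure learning as continuous optimization under the acyclicity constraint h(W) = tr(exp(W∘W)) - d = 0 on a weighted adjacency matrix W. The pure-Python sketch below evaluates a truncated power-series form of that penalty. The function name and the three-indicator matrices are hypothetical, and the time-lag extension mentioned in the abstract is not modeled.

```python
def acyclicity_penalty(W):
    """Truncated NOTEARS-style penalty: sum over k=1..d of tr((W∘W)^k) / k!.

    Because W∘W is non-negative and any directed cycle has length at most d,
    the value is zero exactly when the graph encoded by W has no directed cycle.
    """
    d = len(W)
    A = [[w * w for w in row] for row in W]  # element-wise (Hadamard) square
    P = [row[:] for row in A]                # P holds A^k, starting with k = 1
    h, factorial = 0.0, 1.0
    for k in range(1, d + 1):
        factorial *= k
        h += sum(P[i][i] for i in range(d)) / factorial
        # Advance P from A^k to A^(k+1).
        P = [[sum(P[i][m] * A[m][j] for m in range(d)) for j in range(d)]
             for i in range(d)]
    return h


# Hypothetical graphs over three indicators; W[i][j] != 0 means i influences j.
W_DAG = [[0.0, 0.8, 0.0],
         [0.0, 0.0, 0.5],
         [0.0, 0.0, 0.0]]
W_CYCLE = [[0.0, 0.8, 0.0],
           [0.0, 0.0, 0.5],
           [0.3, 0.0, 0.0]]  # extra edge 2 -> 0 closes a cycle

h_dag = acyclicity_penalty(W_DAG)
h_cycle = acyclicity_penalty(W_CYCLE)
```

A gradient-based learner would drive this penalty toward zero while fitting W to historical indicator data; here it is only evaluated on fixed matrices.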

Inventors

  • ZHANG GONGQIN
  • ZHANG QUANYU

Assignees

  • 北京长焜科技有限公司

Dates

Publication Date
20260512
Application Date
20260204

Claims (10)

  1. An intelligent operation and maintenance method for a core network, characterized by comprising the following steps: collecting, from core network elements, multidimensional time-series data comprising key performance indicators (KPIs), signaling flow logs and system resource utilization, and cleaning and standardizing the data; identifying a primary abnormal event based on monitoring of the multidimensional time-series data of the core network, and automatically associating at least one indicator of another dimension that is adjacent to the primary abnormal event in the time domain and has a preset association relationship with it, to generate a multidimensional abnormal feature vector; inputting the multidimensional abnormal feature vector into a diagnosis decision model, wherein the diagnosis decision model performs reasoning by fusing an inter-indicator causal network learned from historical data with preset domain knowledge rules, and outputs a root cause localization result for the primary abnormal event; and automatically triggering and executing a targeted network repair operation according to the root cause localization result, and verifying the operation and maintenance closed loop based on data fed back after the repair.
  2. The method of claim 1, wherein the automatic association specifically comprises: determining, according to a pre-constructed inter-indicator causal network, other indicator nodes that have a strong causal association with the indicator node corresponding to the primary abnormal event, and integrating state change information of those other indicator nodes within a preset time window into the multidimensional abnormal feature vector.
  3. The method of claim 2, wherein the inter-indicator causal network is trained on historical operating data by a causal discovery algorithm, the causal discovery algorithm comprising a gradient-based acyclic graph structure learning algorithm, the PC algorithm, or a Granger causality test.
  4. The method of claim 1, wherein fusing the inter-indicator causal network learned from historical data with the preset domain knowledge rules comprises: calculating a first confidence for each potential root cause node using a probabilistic graphical model constructed from the inter-indicator causal network; performing logical reasoning on the multidimensional abnormal feature vector using the domain knowledge rules to generate a second confidence or a constraint condition for each potential root cause node; and combining the first confidence with the second confidence or constraint condition to determine a final root cause localization result and a repair suggestion.
  5. The method of claim 1 or 4, wherein the domain knowledge rules comprise IF-THEN rules characterizing the logical relationships among network devices, traffic flows, and failure modes.
  6. The method of claim 1, wherein the network repair operation comprises at least one of: elastically scaling the resources of a target network element, restarting a service process, adjusting a network load balancing policy, issuing key parameter configurations, or triggering network slice reconfiguration.
  7. An intelligent operation and maintenance system for a core network, configured to implement the method according to any one of claims 1 to 6, comprising: a data acquisition layer configured to collect, from core network elements, multidimensional time-series data comprising key performance indicators (KPIs), signaling flow logs and system resource utilization, and to clean and standardize the data; a data analysis layer configured to identify a primary abnormal event based on monitoring of the multidimensional time-series data of the core network, and to automatically associate at least one indicator of another dimension that is adjacent to the primary abnormal event in the time domain and has a preset association relationship with it, so as to generate a multidimensional abnormal feature vector; a diagnosis decision layer configured to input the multidimensional abnormal feature vector into a diagnosis decision model, wherein the diagnosis decision model performs reasoning by fusing an inter-indicator causal network learned from historical data with preset domain knowledge rules, and outputs a root cause localization result for the primary abnormal event; and an operation and maintenance execution layer configured to automatically trigger and execute a targeted network repair operation according to the root cause localization result, and to verify the operation and maintenance closed loop based on data fed back after the repair.
  8. The system of claim 7, wherein the data analysis layer comprises: a time-series anomaly detection unit for identifying the primary abnormal event from the multidimensional time-series data; and a causal association unit for accessing the stored inter-indicator causal network and executing the automatic association.
  9. The system of claim 7, wherein the diagnosis decision layer comprises: a probabilistic reasoning unit for running a probabilistic graphical model constructed from the inter-indicator causal network; and a rule reasoning unit for loading and running the domain knowledge rules.
  10. An electronic device comprising a processor and a memory, the memory having a computer program stored thereon, characterized in that the processor implements the method according to any one of claims 1 to 6 when executing the program.
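
The fusion described in claim 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the first confidences are taken as given (standing in for the output of the probabilistic graphical model), the IF-THEN rules are plain Python predicates rather than Drools rules, and the weighted sum with weight `alpha` is one possible way to combine the two confidences. All node names, feature values and thresholds are hypothetical.

```python
# Hypothetical first confidences, standing in for probabilistic-model output.
FIRST_CONF = {
    "amf_cpu_overload": 0.55,
    "smf_session_storm": 0.30,
    "transport_link_fault": 0.15,
}

# Hypothetical abnormal feature vector (standardized deviations per indicator).
FEATURES = {
    "registration_success_rate": -68.0,
    "amf_cpu_util": 54.8,
    "smf_session_count": 1.4,
}

# IF-THEN rules: (condition, root-cause node, second confidence).
CONFIDENCE_RULES = [
    (lambda f: f["amf_cpu_util"] > 3.0 and f["registration_success_rate"] < -3.0,
     "amf_cpu_overload", 0.9),
]

# Constraint rules: IF the condition holds THEN the node is ruled out.
CONSTRAINT_RULES = [
    (lambda f: abs(f["smf_session_count"]) < 3.0, "smf_session_storm"),
]


def apply_rules(features):
    """Evaluate the rules and return (second confidences, excluded nodes)."""
    second = {}
    for condition, node, conf in CONFIDENCE_RULES:
        if condition(features):
            second[node] = max(conf, second.get(node, 0.0))
    excluded = {node for condition, node in CONSTRAINT_RULES if condition(features)}
    return second, excluded


def fuse(first_conf, second_conf, excluded, alpha=0.6):
    """Rank candidates by alpha * first + (1 - alpha) * second, minus exclusions."""
    ranked = [
        (node, alpha * p + (1 - alpha) * second_conf.get(node, 0.0))
        for node, p in first_conf.items()
        if node not in excluded
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


second_conf, excluded = apply_rules(FEATURES)
ranked = fuse(FIRST_CONF, second_conf, excluded)
```

The top entry of `ranked` plays the role of the root cause localization result; a repair suggestion could then be looked up per node.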

Description

Intelligent operation and maintenance method and system of core network and electronic equipment

Technical Field

The invention relates to the technical field of communication network operation and maintenance, and in particular to an intelligent operation and maintenance method and system for a core network, and an electronic device.

Background

As 4G/5G networks grow more complex and move to the cloud, core network operation and maintenance faces great challenges. Massive multidimensional operation and maintenance data, such as key performance indicators (KPIs), signaling logs and resource utilization, are generated every day and form a "data black hole", while the traditional approach of investigating anomalies and locating root causes through manual experience is inefficient and slow to respond. The prior art generally uses alarm systems based on fixed thresholds or simple association rules, which have the following disadvantages:

  • Static thresholds adapt poorly to dynamic network changes and tend to generate large numbers of invalid alarms that drown out real fault signals.
  • Localization is difficult: a fault symptom (such as a drop in a service KPI) is often caused by interleaved problems across several layers, such as underlying resources, signaling flows and network topology, and existing methods lack automatic association and deep causal analysis of cross-layer, cross-domain indicators, so localization is inaccurate.
  • Response and repair lag: the path from problem discovery through analysis and localization to repair execution depends heavily on manual intervention, so the closed-loop cycle is long and the requirements of a high-availability network are hard to meet.

Therefore, a closed-loop operation and maintenance scheme is needed that can automatically sense anomalies, intelligently diagnose root causes and quickly execute repairs.

Disclosure of Invention

In view of this, the embodiments of the present application aim to provide an intelligent operation and maintenance method and system for a core network, and an electronic device, so as to solve the problems of low operation and maintenance efficiency, difficult localization and slow response in the prior art.

To achieve the above purpose, the invention adopts the following technical solutions:

In a first aspect, the invention provides an intelligent operation and maintenance method for a core network, comprising the following steps:

collecting, from core network elements, multidimensional time-series data comprising key performance indicators (KPIs), signaling flow logs and system resource utilization, and cleaning and standardizing the data;

identifying a primary abnormal event based on monitoring of the multidimensional time-series data of the core network, and automatically associating at least one indicator of another dimension that is adjacent to the primary abnormal event in the time domain and has a preset association relationship with it, to generate a multidimensional abnormal feature vector;

inputting the multidimensional abnormal feature vector into a diagnosis decision model, wherein the diagnosis decision model performs reasoning by fusing an inter-indicator causal network learned from historical data with preset domain knowledge rules, and outputs a root cause localization result for the primary abnormal event; and

automatically triggering and executing a targeted network repair operation according to the root cause localization result, and verifying the operation and maintenance closed loop based on data fed back after the repair.

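
The first two steps above (standardization, then detection of a primary abnormal event and association of time-adjacent indicators) can be illustrated with a small sketch. It rests on assumptions that are not in the source: a simple z-score against a baseline window stands in for the anomaly scoring, the preset association is a hand-written dictionary, and all indicator names, sample values, the threshold and the window size are hypothetical.

```python
from statistics import mean, stdev


def zscores(series, baseline_len):
    """Standardize a series against the mean and stdev of its first samples."""
    baseline = series[:baseline_len]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def build_feature_vector(data, primary, associations,
                         baseline_len=6, threshold=3.0, window=2):
    """Find the primary abnormal event and attach associated indicators."""
    z = {name: zscores(series, baseline_len) for name, series in data.items()}
    # Primary abnormal event: first post-baseline sample beyond the threshold.
    t0 = next((t for t in range(baseline_len, len(z[primary]))
               if abs(z[primary][t]) > threshold), None)
    if t0 is None:
        return None
    lo, hi = max(0, t0 - window), min(len(z[primary]), t0 + window + 1)
    features = {primary: z[primary][t0]}
    for name in associations.get(primary, []):
        # Largest-magnitude deviation of the associated indicator near t0.
        features[name] = max(z[name][lo:hi], key=abs)
    return {"primary": primary, "time_index": t0, "features": features}


# Hypothetical samples, one value per collection interval.
DATA = {
    "registration_success_rate": [99.1, 99.0, 99.2, 99.1, 99.0, 99.2,
                                  99.1, 93.0, 92.5, 99.0],
    "amf_cpu_util": [41, 40, 42, 41, 40, 42, 85, 90, 88, 45],
    "smf_session_count": [500, 502, 498, 501, 499, 500, 501, 500, 499, 502],
}
# Hypothetical preset association between indicators.
ASSOCIATIONS = {
    "registration_success_rate": ["amf_cpu_util", "smf_session_count"],
}

vec = build_feature_vector(DATA, "registration_success_rate", ASSOCIATIONS)
```

In this toy data the registration success rate drops at the eighth sample, the AMF CPU utilization deviates strongly within the surrounding window, and the SMF session count stays near its baseline, so only the first two carry large values in `vec["features"]`.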
Optionally, the automatic association specifically comprises: determining, according to a pre-constructed inter-indicator causal network, other indicator nodes that have a strong causal association with the indicator node corresponding to the primary abnormal event, and integrating state change information of those other indicator nodes within a preset time window into the multidimensional abnormal feature vector.

Optionally, the inter-indicator causal network is trained on historical operating data by a causal discovery algorithm, the causal discovery algorithm comprising a gradient-based acyclic graph structure learning algorithm, the PC algorithm, or a Granger causality test.

Optionally, fusing the inter-indicator causal network learned from historical data with the preset domain knowledge rules for reasoning comprises: calculating a first confidence for each potential root cause node using a probabilistic graphical model constructed from the inter-indicator causal network; performing logical reasoning on the multidimensional abnormal feature vector using the domain knowledge rules to generate a second confidence or a constraint condition for each pot