CN-121980390-A - Fault root cause classification method based on class-level self-adaptive model integration

Abstract

The invention provides a fault root cause classification method based on class-level self-adaptive model integration. The method comprises: obtaining a plurality of trained, mutually heterogeneous base models; using each base model to determine the original prediction probability that a sample to be discriminated belongs to each fault category, wherein the sample comprises multi-dimensional network performance indicators collected in a time window adjacent to a network fault; obtaining a class-level weight matrix that indicates, for each base model, a dedicated weight for its prediction of each fault category; and weighting the original prediction probabilities of each fault category according to the matrix to obtain the fusion probability that the sample belongs to each fault category, from which the fault category of the sample is determined. Because the class-level weight matrix captures the relative importance of each heterogeneous base model in predicting each fault category, the weighted fusion lets the relatively strongest model exert its advantage on each fault category, which improves the accuracy of fault classification.
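
The weighted fusion described in the abstract can be illustrated with a minimal sketch. It assumes the fusion is a per-category weighted sum over base models followed by picking the category with the highest fused probability; the function names `fuse_probabilities` and `predict_class` and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def fuse_probabilities(
    base_probs: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
) -> List[float]:
    """Fuse per-model probabilities with a class-level weight matrix.

    base_probs[m][c]: original probability from base model m for category c.
    weights[m][c]:    class-level weight of base model m for category c.
    Returns one fused probability per fault category.
    """
    n_models = len(base_probs)
    n_classes = len(base_probs[0])
    return [
        sum(weights[m][c] * base_probs[m][c] for m in range(n_models))
        for c in range(n_classes)
    ]


def predict_class(fused: Sequence[float]) -> int:
    """Return the index of the category with the highest fused probability."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Made-up example: 3 base models (rows), 3 fault categories (columns).
base_probs = [
    [0.6, 0.3, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
]
# Hypothetical weight matrix; each column sums to 1 across the models.
weights = [
    [0.5, 0.2, 0.2],
    [0.3, 0.6, 0.2],
    [0.2, 0.2, 0.6],
]

fused = fuse_probabilities(base_probs, weights)
predicted = predict_class(fused)
```

In this example the second model carries the largest weight for the second category, so its confident prediction for that category dominates the fused result.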

Inventors

WANG JIACHEN
WANG YUANYUAN
FAN XUANCHENG
DAI LULU
TIAN LIN
YUAN CHUNJING

Assignees

Institute of Computing Technology, Chinese Academy of Sciences (中国科学院计算技术研究所)

Dates

Publication Date: 20260505
Application Date: 20260228

Claims (10)

1. A fault root cause classification method for network quality degradation, comprising the following steps: acquiring a plurality of trained base models which are heterogeneous with each other, and determining, by using each base model, the original prediction probability that a sample to be discriminated belongs to each of a plurality of fault categories, wherein the sample to be discriminated comprises multi-dimensional network performance indicators acquired in a time window adjacent to a network fault; acquiring a class-level weight matrix indicating the class-level weight dedicated to each base model's prediction of each fault category, and weighting the original prediction probabilities of each fault category determined by the plurality of base models according to the weight matrix to obtain the fusion probability that the sample to be discriminated belongs to each fault category; and determining the fault category to which the sample to be discriminated belongs according to the fusion probability of each fault category.
2. The method of claim 1, wherein each of the plurality of base models is independently pre-trained with multi-fold cross-validation in the following manner: randomly dividing the data set into F subsets of equal size; performing F-fold cross-validation on the base model by using the subsets, wherein the f-th fold comprises selecting the f-th subset as validation fold data and the remaining F-1 subsets as training fold data, training the base model by using the training fold data to obtain the model trained in the f-th fold, and outputting, by the model trained in the f-th fold, the validation fold prediction probabilities of the f-th fold on the validation fold data; performing parameter-averaging integration on the models trained in folds 1 to F to obtain the trained base model; and concatenating the validation fold prediction probabilities of folds 1 to F to form a prediction probability matrix of the base model on all samples of the data set.
3. The method according to claim 2, wherein the class-level weight matrix is obtained as follows: for each fault category, acquiring, according to the labels, the prediction probabilities of the samples belonging to the fault category from the prediction probability matrix, and determining the recall rate of the base model on the fault category; for each fault category, normalizing the recall rate on the fault category across the plurality of base models to obtain the class-level weight of each base model for the fault category; and collecting the class-level weights of each base model for each fault category to construct the class-level weight matrix.
4. The method according to claim 3, wherein the class-level weights are obtained in the following way: $w_{m,c} = \dfrac{R_{m,c}}{\sum_{j=1}^{M} R_{j,c}}$, wherein $w_{m,c}$ is the class-level weight of the $m$-th base model on fault category $c$, $R_{m,c}$ is the recall rate of the $m$-th base model on fault category $c$, $R_{j,c}$ is the recall rate of the $j$-th base model on fault category $c$, and $M$ is the total number of base models.
5. The method of claim 4, wherein the fusion probability is determined as follows: $P_{c}(x) = \sum_{m=1}^{M} w_{m,c}\, p_{m,c}(x)$, wherein $P_{c}(x)$ is the fusion probability that the sample to be discriminated $x$ belongs to fault category $c$, $M$ is the total number of base models, $w_{m,c}$ is the class-level weight of the $m$-th base model on fault category $c$, and $p_{m,c}(x)$ is the original prediction probability, determined by the $m$-th base model, that the sample to be discriminated $x$ belongs to fault category $c$.
6. The method according to one of claims 1 to 5, wherein the fault class to which the sample to be discriminated belongs is the fault class to which the highest fusion probability corresponds.
7. The method of any one of claims 1-5, wherein the plurality of trained base models that are heterogeneous with respect to each other include LGBM models, XGB models, and CatBoost models.
8. A computer apparatus/device/system comprising a memory, a processor and a computer program/instruction stored on the memory, characterized in that the processor executes the computer program/instruction to carry out the steps of the method according to one of claims 1 to 7.
9. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program/instructions which, when executed by a processor, implement the steps of the method of one of claims 1 to 7.
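
The recall-based weighting of claims 3 and 4 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that each base model's out-of-fold predictions have already been reduced to hard labels, that the normalization is a plain ratio of recalls across models, and that a uniform weight is used when every model has zero recall on a category (this fallback is not stated in the source). The names `per_class_recall` and `class_level_weights` and the toy labels are hypothetical.

```python
from typing import List, Sequence


def per_class_recall(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> List[float]:
    """Recall of one base model on each fault category."""
    recalls = []
    for c in range(n_classes):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support if support else 0.0)
    return recalls


def class_level_weights(
    recalls_per_model: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Normalize recalls across models, separately for each category."""
    n_models = len(recalls_per_model)
    n_classes = len(recalls_per_model[0])
    weights = [[0.0] * n_classes for _ in range(n_models)]
    for c in range(n_classes):
        total = sum(recalls_per_model[m][c] for m in range(n_models))
        for m in range(n_models):
            if total > 0:
                weights[m][c] = recalls_per_model[m][c] / total
            else:
                # Assumption: fall back to uniform weights for this category.
                weights[m][c] = 1.0 / n_models
    return weights


# Toy data: true labels and out-of-fold predicted labels of two base models.
y_true = [0, 0, 1, 1, 2, 2]
oof_pred_a = [0, 0, 1, 0, 2, 1]  # stronger on category 0
oof_pred_b = [0, 1, 1, 1, 2, 2]  # stronger on categories 1 and 2

recalls = [
    per_class_recall(y_true, oof_pred_a, 3),
    per_class_recall(y_true, oof_pred_b, 3),
]
weights = class_level_weights(recalls)
```

With these toy labels the first model receives the larger weight on category 0 and the second model on categories 1 and 2, and the weights for each category sum to 1 across the models.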

Description

Fault root cause classification method based on class-level self-adaptive model integration

Technical Field

The invention relates to the field of computer technology, in particular to the field of artificial intelligence, and more particularly to a fault root cause classification method based on class-level self-adaptive model integration.

Background

In the operation and maintenance support system of a mobile communication network, the base station is a key network element whose running state directly affects the service quality of the whole network. As network scale continues to expand, the operation and maintenance data generated by base stations is high-dimensional, noisy, and strongly time-dependent. Traditional experience-based manual troubleshooting and threshold-based alarm rules cannot meet the requirement for high-precision, real-time discrimination of the root causes of poor network quality in complex scenarios. Here, the root cause of poor network quality is the underlying cause of network quality degradation (such as slow network speed, high delay, packet loss, large jitter, or frequent disconnection).

Root cause discrimination for poor network quality is essentially a complex multi-class classification problem based on multi-dimensional key performance indicator (KPI) features, so the industry has gradually adopted data-driven intelligent discrimination methods, including traditional machine learning methods, deep learning methods, and ensemble learning methods.

Traditional machine learning methods, such as the support vector machine (SVM) and K-nearest neighbors (KNN), were widely applied in the early stages. The theory behind these models is mature, but they face obvious bottlenecks on high-dimensional, sparse operation and maintenance data.
Specifically, SVM has high computational complexity and is sensitive to the choice of kernel function when facing large-scale, high-dimensional features, while KNN suffers from the curse of dimensionality in high-dimensional spaces and requires a large amount of computation at prediction time. In addition, traditional machine learning generally relies on manual feature engineering and has difficulty automatically modeling the nonlinear mappings and interactions among KPI indicators, so the generalization capability of such models in complex scenarios is insufficient.

As computing power has increased, deep learning methods have been used to process the temporal characteristics and multi-dimensional correlation structure of KPIs. Typical representatives include convolutional networks for modeling local temporal features, and recurrent networks and their variants (e.g., LSTM) for capturing long-term dependencies. Deep learning extracts features automatically and can reduce the workload of manual feature design. However, deep models typically rely on a large number of labeled samples for stable training. In wireless network fault data, the fault categories often follow a pronounced long-tail distribution, and the number of samples in some minority categories is very limited. In this case, a high-capacity deep model tends to memorize the minority-class samples: it performs well on the training set but generalizes poorly to test samples, which is an overfitting problem. Meanwhile, deep models have high training and inference costs and high deployment complexity, which limits their use in large-scale wireless operation and maintenance scenarios that require real-time response.
To further improve the overall performance of root cause discrimination for poor network quality, ensemble learning combines the prediction results of a plurality of base models and exploits the complementarity among the models to reduce the bias of any single model on a specific fault category. Common methods include bagging-based random forests, boosting-based gradient boosting methods, and stacking methods for multi-model fusion. Although these approaches improve model stability to some extent, they have significant limitations in wireless network fault scenarios, where the fault category distribution is highly imbalanced. Random forests rely on a bootstrap sampling mechanism, are insensitive to the imbalance among fault categories, and tend to be biased toward categories with many samples. The performance of a gradient boosting model on different fault categories is often affected by the bias of its model structure, so its performance varies widely across fault categories and a single model can hardly handle all of them well. In stacking methods, the different base models are fused in a globally uniform manner and lack an effective fine-grained cooperation mechanism. In addition, although ensemble learning can improve classification effects to a certain extent by utilizing complementarity of