CN-122019371-A - Method for capturing and reproducing software information state errors

Abstract

The invention discloses a method for capturing and reproducing software information state errors, which keeps a software system running stably through data acquisition, statistical analysis, anomaly association analysis and intelligent repair scheme recommendation. The method uses an API and a text analysis library to acquire key information automatically; calculates the mean and standard deviation of the file damage, configuration file and operation log features and uses them to identify abnormal data; reveals the connections among different anomalies through anomaly association analysis and establishes anomaly detection rules; fuses multi-source data; builds and trains a Bayesian network model; and optimizes model performance by cross-validation. It thereby effectively addresses the limitations of the prior art in capturing and reproducing software information state errors.

Inventors

  • XIN BAILIN
  • ZHENG HONGCHANG
  • Hu Caiqun

Assignees

  • 珠海古亦科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-01-19

Claims (7)

  1. A method for capturing and reproducing software information state errors, characterized by comprising: S1, acquiring software file information, registry information and system environment information through an API, acquiring configuration file information through a text analysis library, determining hash values, values of configuration items, key values of registry entries and log information as detection items, and outputting the acquired information to step S2; S2, taking the information obtained in step S1 as input, calculating the mean and standard deviation of the file damage, configuration file and operation log features, comparing the values of these features with the calculated mean and standard deviation, and using the formula $|x_i - \mu| > k\sigma$ to determine whether a feature value is abnormal data; S3, after the abnormal data are identified in step S2, performing anomaly association analysis, in which, for any two anomalies $A_1, A_2 \in A$, the intersection $A_1 \cap A_2$ is calculated to determine whether the two anomalies are associated; S4, after the anomaly association analysis of step S3, recommending a preliminary repair mode for the causes of the anomalies among damaged files, erroneous configuration items and erroneous environment variables; S5, performing multi-source data fusion on the software files, configuration files, registry information, operation logs, system environment information and user behavior information, training a Bayesian network model to recommend a repair scheme again, selecting the value of K by cross-validation, and addressing the stability and fitting problems of the model.
  2. The method for capturing and reproducing software information state errors according to claim 1, wherein in step S2: the mean and standard deviation of the file damage, configuration file and operation log features are calculated, wherein for each feature the mean $\mu$ and the standard deviation $\sigma$ are calculated first, according to the formulas $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$, where $x_i$ is the value of the i-th feature and n is the number of feature values; the file damage, configuration file and operation log feature values are compared with the calculated mean and standard deviation to judge whether they are abnormal data, a value $x_i$ being judged to be abnormal data if it satisfies $|x_i - \mu| > k\sigma$, where k is a preset threshold; mean: $\mu$ denotes the mean, which is the sum of all data in the data set divided by the number of data points, and describes the central position of the data; standard deviation: $\sigma$ denotes the standard deviation, which is the square root of the mean of the squared differences between the data points and the mean, and describes the degree of dispersion or fluctuation of the data.
  3. The method for capturing and reproducing software information state errors according to claim 2, wherein in the abnormal data judgment: $x_i$ denotes the value of the i-th feature, namely a feature value in the collected software operation data; k denotes a preset threshold used to judge whether a feature value is abnormal data, and the value of k is greater than 1; in the calculation process, the mean and standard deviation of each feature are calculated first, and the formula $|x_i - \mu| > k\sigma$ is then used to determine whether a feature value is abnormal data, a feature value that satisfies this condition being abnormal data.
  4. The method for capturing and reproducing software information state errors according to claim 1, wherein the anomaly association analysis of step S3 first defines a rule for each type of anomaly detection, including file corruption detection, configuration error detection, dependency problem detection and environment variable problem detection; secondly, data are collected, including file information, configuration information, registry information, operation logs and system environment information; each file f, each configuration item c, each dependency and each environment variable is then checked by applying the corresponding anomaly detection rule; next, the association relationships between the detected anomalies are analyzed by defining an anomaly set A, each element of which is a detected anomaly, and, for any two anomalies $A_1, A_2 \in A$, calculating their intersection $A_1 \cap A_2$ to determine whether they are associated; finally, the calculation process is as follows: the anomaly set A is initialized to the empty set; for each file f, a file damage anomaly is added to the set A if $H(f) \neq H_{\text{preset}}(f)$; for each anomaly $A_1$ in A and each anomaly $A_2$ in A, if $A_1 \neq A_2$, the intersection $A_1 \cap A_2$ is calculated, and if the intersection is not empty, the association between $A_1$ and $A_2$ is recorded.
  5. The method for capturing and reproducing software information state errors according to claim 1, wherein in step S5 a plurality of data sources, including software files, configuration files, registry information, operation logs, system environment information and user behavior information, are fused, anomaly analysis is performed on the fused data, and a repair scheme is recommended according to the anomaly analysis results, the recommended repair schemes including automatic repair, semi-automatic repair and manual repair.
  6. The method for capturing and reproducing software information state errors according to claim 5, wherein the repair scheme recommendation uses a Bayesian network to recommend a repair scheme based on the anomaly analysis results; the specific process of training the Bayesian network model comprises the following steps: defining the mathematical symbols: $\theta$, the parameters; D, the data set; L, the likelihood function; B, the Bayesian network; E, the expectation-maximization algorithm; data collection: collecting the various data D related to the system; defining the network structure: defining a Bayesian network B comprising nodes and edges; initializing parameters: initializing a conditional probability table $\mathrm{CPT}_i$ for each node i in the network; training the network: using the expectation-maximization algorithm E to estimate the parameters $\theta_i$ in the conditional probability table $\mathrm{CPT}_i$ by maximizing the likelihood function $L(D \mid \theta)$, where D is the data set and $\theta$ denotes the parameters; verification and adjustment: using cross-validation.
  7. The method for capturing and reproducing software information state errors according to claim 6, wherein the cross-validation divides the data set into a plurality of subsets and then trains and validates a model across these subsets, the cross-validation process comprising: data preprocessing: cleaning the collected data, including noise removal and outlier processing; data segmentation: dividing the data set D into k equal-sized subsets, denoted $D_1, D_2, \dots, D_k$; model training and verification: in the i-th iteration, the i-th subset $D_i$ is taken as the validation set and the other k-1 subsets $D_{-i} = \{D_1, \dots, D_{i-1}, D_{i+1}, \dots, D_k\}$ as the training set; a model $M_i$ is trained using the training set $D_{-i}$; the model $M_i$ is evaluated using the validation set $D_i$, and performance indexes including accuracy, recall and F1 score are calculated; model adjustment: the parameters or structure of the model are adjusted according to the cross-validation results, and steps 3 to 5 are repeated; final model verification: the final model is verified using the entire data set D.
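
The threshold rule of claims 2 and 3 can be illustrated with a minimal Python sketch. It assumes each feature arrives as a plain list of numbers; the function name `flag_abnormal`, the sample values and the choice k = 2 are hypothetical and do not come from the patent.

```python
from statistics import fmean, pstdev


def flag_abnormal(values, k=2.0):
    """Return the indices i for which |x_i - mu| > k * sigma."""
    if not values:
        return []
    mu = fmean(values)
    # Population standard deviation: square root of the mean squared deviation.
    sigma = pstdev(values, mu)
    return [i for i, x in enumerate(values) if abs(x - mu) > k * sigma]


# Hypothetical feature: number of error lines per log window.
error_counts = [3, 4, 2, 5, 3, 4, 40, 3]
print(flag_abnormal(error_counts, k=2.0))  # -> [6]
```

Here the mean is 8 and the standard deviation is about 12.1, so only the value 40 lies more than 2σ from the mean. When every value is identical, σ is 0 and the strict inequality flags nothing.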
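
The calculation process of claim 4 might be sketched as follows. The claim does not say what the elements of an anomaly are, so this sketch assumes each anomaly is a set of affected resource names (file names, configuration keys, variable names). SHA-256 stands in for the unspecified hash H, and all names and data below are hypothetical.

```python
import hashlib
from itertools import combinations


def corrupted_files(contents, preset_hashes):
    """Names of files f with H(f) != H_preset(f); H is SHA-256 here (an assumption)."""
    return {
        name
        for name, data in contents.items()
        if hashlib.sha256(data).hexdigest() != preset_hashes.get(name)
    }


def associate(anomalies):
    """Map each pair of distinct anomalies to their non-empty intersection."""
    links = {}
    for (n1, a1), (n2, a2) in combinations(anomalies.items(), 2):
        shared = a1 & a2
        if shared:
            links[(n1, n2)] = shared
    return links


# Hypothetical data: current file bytes, plus digests recorded for the intact files.
contents = {"config.ini": b"timeout=3O\n", "core.bin": b"\x00\x01"}
preset = {
    "config.ini": hashlib.sha256(b"timeout=30\n").hexdigest(),
    "core.bin": hashlib.sha256(b"\x00\x01").hexdigest(),
}

# Anomaly set A, keyed by anomaly type; the last two entries mimic rule outputs.
A = {
    "file_corruption": corrupted_files(contents, preset),
    "config_error": {"config.ini", "timeout"},
    "env_var_problem": {"PATH"},
}
links = associate(A)
print(links)
```

The claim iterates over every ordered pair with $A_1 \neq A_2$; `combinations` visits each unordered pair once, which records the same associations without duplicates.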
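
The data segmentation and per-fold evaluation of claim 7 can be sketched in the same way. The patent trains a Bayesian network in each fold; the majority-label baseline below is only a hypothetical stand-in so that the loop runs end to end, and the function names and sample data are assumptions as well.

```python
from collections import Counter


def k_fold_splits(dataset, k):
    """Yield (i, training_set, validation_set) for each of the k folds."""
    if not 2 <= k <= len(dataset):
        raise ValueError("k must be between 2 and len(dataset)")
    # D_1..D_k by striding; sizes differ by at most one when k does not divide n.
    folds = [dataset[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield i, training, validation


def scores(y_true, y_pred, positive=1):
    """Return (accuracy, recall, F1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, recall, f1


def train_majority(training):
    """Stand-in for model M_i: always predicts the most common training label."""
    label = Counter(y for _, y in training).most_common(1)[0][0]
    return lambda x: label


# Hypothetical labelled samples (feature value, anomaly label).
data = [(x, int(x > 6)) for x in range(10)]
for i, train, val in k_fold_splits(data, k=5):
    model = train_majority(train)
    y_true = [y for _, y in val]
    y_pred = [model(x) for x, _ in val]
    print(i, scores(y_true, y_pred))
```

The sketch does not shuffle the data before splitting; a real pipeline would usually shuffle or stratify first so that each fold reflects the overall label distribution.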

Description

Method for capturing and reproducing software information state errors

Technical Field

The invention relates to the technical field of software information, and in particular to a method for capturing and reproducing software information state errors.

Background

In the prior art, capturing and reproducing software information state errors is subject mainly to the following limitations: the detection range for key information is limited, since usually only digital signatures and registry information are checked, so software anomalies cannot be captured comprehensively; the repair mode is single, relying mainly on creating a startup item and adding a repair file, so the repair process is time-consuming and of limited effect; targeted repair is lacking, so different types of software anomaly cannot be repaired in a targeted manner; and user interaction is lacking, since the repair process completes automatically and the user can neither learn the cause of the anomaly nor take part in the repair process.

Chinese patent publication No. CN 104123223B discloses a method and apparatus for repairing software. The key information mentioned in that document includes only digital signatures and registry information; however, an abnormal software state can be caused by various factors, including:

  • Missing or damaged files: the digital signature and registry information are normal, but software files are damaged or missing, so the software cannot run normally.
  • Configuration file errors: an error in a software configuration file prevents the software from starting or causes it to function abnormally.
  • Environment variable problems: an environment variable is set incorrectly, so the software cannot find the resources or library files it needs.
  • Software dependency problems: other software or library files on which the software depends are missing or of an incompatible version, so the software cannot run normally.

Thus, detecting only the digital signature and registry information does not fully capture the abnormal state of the software. Meanwhile, the repair method mentioned in that document relies mainly on creating a startup item and adding a repair file; however, a repair file cannot solve every type of software anomaly, such as configuration file errors or environment variable problems.
The present method automatically acquires software file, registry and system environment information through an API and a text analysis library; calculates the mean and standard deviation of the features and judges by formula whether a feature value is abnormal data; performs anomaly association analysis to analyze the association relationships among different anomalies and defines a rule for each type of anomaly detection; recommends a preliminary repair mode; fuses multiple data sources and trains a Bayesian network model; selects the value of K by cross-validation and addresses the stability and fitting problems of the model; and finally recommends a repair scheme through the Bayesian network based on the current state of the system and the known fault modes.

Disclosure of Invention

The invention provides a method for capturing and reproducing software information state errors in order to solve the above technical problems. The technical scheme of the invention is realized by a method for capturing and reproducing software information state errors, which comprises the following steps:

S1, acquiring software file information, registry information and system environment information through an API, acquiring configuration file information through a text analysis library, determining hash values, values of configuration items, key values of registry entries and log information as detection items, and outputting the acquired information to step S2;

S2, taking the information obtained in step S1 as input, calculating the mean and standard deviation of the file damage, configuration file and operation log features, comparing the values of these features with the calculated mean and standard deviation, and using the formula $|x_i - \mu| > k\sigma$ to determine whether a feature value is abnormal data;

S3, after the abnormal data are identified in step S2, performing anomaly association analysis, in which, for any two anomalies $A_1, A_2 \in A$, the intersection $A_1 \cap A_2$ is calculated to determine whether the two anomalies are associated;

S4, after the anomaly association analysis of step S3, recommending a preliminary repair mode for the causes of the anomalies among damaged files, erroneous configuration items and erroneous environment variables;

S5, performing multi-source data fusion on the software files, configuration files, registry information, operation logs, system enviro