CN-115964200-B - Software reliability assessment method based on code index


Abstract

The invention discloses a software reliability assessment method based on code indexes. The method comprises the following steps: S1, acquiring the project fault problem lists and the project Commit records, mining the tracking relation between the fault problem lists and the Commits, determining the trigger type of each fault problem list, dividing the fault problem lists into a code trigger type and a non-code trigger type according to the trigger cause, and extracting all fault problem lists whose trigger type is the code trigger type. The invention is not limited by project running time or by the quantity of fault data: at the code completion stage, an automatic scanning tool obtains the code indexes used as input, so the reliability of the software is predicted in advance, which facilitates early risk estimation and operation-and-maintenance preparation for possible software faults. Meanwhile, the influence of code factors on reliability is considered and the correlation between the code indexes and reliability is analyzed, which gives developers objective data for evaluating the quality of written code and a reference for quality optimization.

Inventors

ZHANG HE
ZHANG JUZHENG
DONG LIMING
ZHOU XIN
RONG GUOPING
SHAO DONG

Assignees

江苏瀚天智能科技股份有限公司

Dates

Publication Date: 20260505
Application Date: 20221229

Claims (9)

1. A software reliability assessment method based on code indexes, characterized by comprising the following steps: S1, acquiring the project fault problem lists and the project Commit records, mining the tracking relation between the fault problem lists and the Commits, determining the trigger type of each fault problem list, dividing the fault problem lists into a code trigger type and a non-code trigger type according to the trigger cause, and extracting all fault problem lists whose trigger type is the code trigger type; S2, based on a software reliability growth model (SRGM), obtaining by regression analysis the trend and the function of software reliability as the project versions evolve, thereby quantifying a reliability value for each project version; S3, acquiring the source code of every version of the project, and calculating the code indexes corresponding to each version with a code index scanning tool; S4, analyzing the relation between reliability and the code indexes corresponding to each version of the project, and selecting a model feature set; S5, dividing the reliability values equally into four classes as the target feature, dividing the model feature set into a training set and a testing set, training a four-class classification model with the training set and the testing set, and thereby constructing a software reliability evaluation model based on the code indexes corresponding to each version of the project; S6, inputting the code indexes of the project version to be evaluated into the software reliability evaluation model, which outputs a reliability label for the code of that version, so that reliability is predicted at the code completion stage and code quality evaluation and optimization are assisted.
2. The method for evaluating the reliability of software based on code indexes according to claim 1, wherein said S1 comprises the steps of: S1-1, selecting the project fault problem lists and the Commit data, mining the tracking relation between them, and dividing the fault problem lists into a code-triggered type and a non-code-triggered type so as to extract the code-triggered problem lists required by the experiment; if the Commit data has a field marking the number of the problem list to which the Commit belongs, that field is associated with the number field of the fault problem lists, and the problem lists related to or solved in a Commit are code-triggered problem lists (the direct method); if the Commit data has no field marking the corresponding problem list number, but the number is mentioned in the description, the Commit description text is used: the problem list numbers mentioned in the description are found by rule matching and associated with the problem list number field, so that the code-triggered problem lists are mined indirectly (the indirect method), and the remaining problem lists are marked as non-code-triggered problem lists; and S1-2, labeling all problem lists by the direct method and the indirect method, the labels being code-triggered and non-code-triggered respectively, and extracting the code-triggered problem lists as the data input of the software reliability growth model.
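The direct and indirect labeling of claim 2 can be sketched as follows. This is only an illustrative sketch: the record layout (`issue_id`, `message`) and the reference pattern (`#123` or `ISSUE-123`) are assumptions made for the example, not details given in the claims.

```python
import re

# Hypothetical rule: issue references look like "#123" or "ISSUE-123".
ISSUE_REF = re.compile(r"(?:#|ISSUE-)(\d+)", re.IGNORECASE)


def label_issues(issues, commits):
    """Return {issue_id: "code" | "non-code"}.

    issues:  iterable of integer problem-list numbers.
    commits: list of dicts with an optional "issue_id" field and a
             "message" string (assumed layout).
    """
    linked = set()
    for commit in commits:
        # Direct method: the commit carries an explicit issue-number field.
        if commit.get("issue_id") is not None:
            linked.add(int(commit["issue_id"]))
            continue
        # Indirect method: rule-match issue numbers in the commit description.
        for match in ISSUE_REF.finditer(commit.get("message", "")):
            linked.add(int(match.group(1)))
    # Problem lists never linked to a commit are treated as non-code-triggered.
    return {i: ("code" if i in linked else "non-code") for i in issues}


issues = [101, 102, 103]
commits = [
    {"issue_id": 101, "message": "fix null pointer"},
    {"message": "Fixes #102: off-by-one in parser"},
    {"message": "update docs"},
]
labels = label_issues(issues, commits)
code_triggered = [i for i, label in labels.items() if label == "code"]
```

Only the entries in `code_triggered` would then be passed on as input to the reliability model of step S2.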
3. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said S2 comprises the steps of: S2-1, using the code-triggered project fault problem lists extracted in step S1 and a software reliability growth model to fit the function of the project reliability value as it changes with version; S2-2, calculating the reliability value of each version of the project from the fitted function, and summarizing the results.
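A minimal sketch of steps S2-1 and S2-2 follows. The claims do not name a specific growth model, so the Goel-Okumoto mean value function m(t) = a(1 - e^(-bt)) is assumed here purely for illustration, together with the standard NHPP reliability expression R(x | t) = exp(-(m(t + x) - m(t))). The coarse grid search stands in for a proper regression routine, and the fault counts are made-up example data.

```python
import math


def mean_value(t, a, b):
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t)) (assumed model)."""
    return a * (1.0 - math.exp(-b * t))


def reliability(t, x, a, b):
    """NHPP reliability over (t, t + x): exp(-(m(t + x) - m(t)))."""
    return math.exp(-(mean_value(t + x, a, b) - mean_value(t, a, b)))


def fit_goel_okumoto(versions, cumulative_faults):
    """Least-squares fit of (a, b) by coarse grid search (illustrative, not efficient)."""
    best = None
    max_faults = max(cumulative_faults)
    for a_step in range(100, 301):      # a from 1.00x to 3.00x the observed maximum
        a = max_faults * a_step / 100.0
        for b_step in range(1, 201):    # b from 0.01 to 2.00
            b = b_step / 100.0
            err = sum((mean_value(t, a, b) - n) ** 2
                      for t, n in zip(versions, cumulative_faults))
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]


# Hypothetical data: version numbers and cumulative code-triggered faults.
versions = [1, 2, 3, 4, 5, 6, 7, 8]
faults = [13, 23, 30, 35, 39, 42, 44, 45]

a, b = fit_goel_okumoto(versions, faults)
# Version number replaces t; the release interval (one version) replaces x.
per_version = [reliability(v, 1, a, b) for v in versions]
```

With fault discovery slowing down over versions, as in this example, the computed reliability values rise from one version to the next.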
4. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said S3 comprises the steps of: S3-1, acquiring the source code of every version of the project; and S3-2, scanning the code indexes corresponding to each version of the project with a plug-in or code scanning tool, and summarizing the results.
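The claims do not name the scanning tool or the individual code indexes. As a hedged illustration of what step S3-2 produces, the sketch below computes three simple, assumed metrics (non-comment lines of code, function count, branch points) for Python source using the standard `ast` module; a real project would rely on an existing scanner for its own language.

```python
import ast

# Node types counted as branch points (an assumed, simplified complexity proxy).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.BoolOp, ast.ExceptHandler)


def scan_metrics(source):
    """Return a small dict of code metrics for one Python source string."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    functions = sum(
        isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in nodes
    )
    branches = sum(isinstance(n, BRANCH_NODES) for n in nodes)
    # Count lines that are neither blank nor pure comments.
    loc = sum(
        1 for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )
    return {"loc": loc, "functions": functions, "branch_points": branches}


sample = '''
def clamp(x, lo, hi):
    # keep x within [lo, hi]
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
'''
metrics = scan_metrics(sample)
```

Running the scan once per project version and collecting the resulting dicts gives the per-version metric table that step S4 analyzes.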
5. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said S4 comprises the steps of: S4-1, using the results of steps S2 and S3, summarizing the reliability value of each version of the project together with the code indexes corresponding to that version, and analyzing which code indexes influence the reliability of the project; in analyzing the relation between code indexes and reliability, three feature selection schemes are used, namely the Pearson correlation coefficient, the chi-square test and information gain; and S4-2, analyzing the project reliability with each of the three schemes, and taking the set of code indexes with the greatest influence on reliability as the model feature set.
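The three feature selection schemes named in claim 5 can be sketched with the standard library alone. The metric names, the sample values and the binning thresholds below are hypothetical; the chi-square statistic and information gain are computed on discretized values, which is an assumption of this sketch.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_bins, labels):
    """H(labels) - H(labels | feature) for a discretized feature."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature_bins):
        subset = [l for f, l in zip(feature_bins, labels) if f == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def chi_square(feature_bins, labels):
    """Chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    f_counts, l_counts = Counter(feature_bins), Counter(labels)
    joint = Counter(zip(feature_bins, labels))
    stat = 0.0
    for f, fc in f_counts.items():
        for l, lc in l_counts.items():
            expected = fc * lc / n
            stat += (joint[(f, l)] - expected) ** 2 / expected
    return stat


# Hypothetical per-version data.
reliability = [0.95, 0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2]
features = {
    "complexity": [5, 8, 12, 20, 25, 30, 35, 40],
    "comment_ratio": [0.2, 0.1, 0.3, 0.2, 0.1, 0.3, 0.2, 0.1],
}

# Rank features by absolute Pearson correlation with reliability.
ranked = sorted(features, key=lambda k: abs(pearson(features[k], reliability)),
                reverse=True)

# Discretize for the chi-square test and information gain (assumed thresholds).
labels = ["high" if r > 0.5 else "low" for r in reliability]
complexity_bins = ["hi" if c > 20 else "lo" for c in features["complexity"]]
chi2 = chi_square(complexity_bins, labels)
gain = information_gain(complexity_bins, labels)
```

Each scheme yields its own ranking of the code indexes; how the three rankings are combined into the final feature set is left open here, since the claim only states that the most influential set is chosen.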
6. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said S5 comprises the steps of: S5-1, dividing the software reliability value into classes, wherein [0, 0.25] is low reliability, (0.25, 0.5] is relatively low reliability, (0.5, 0.75] is relatively high reliability, and (0.75, 1] is high reliability; S5-2, preprocessing the data, namely preprocessing the features in the model feature set, including handling of abnormal values and missing values, data standardization and continuous feature processing; S5-3, selecting a random forest classifier to train the four-class classification model, and verifying the classifier by K-fold cross-validation.
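The class division of S5-1 and the fold construction behind K-fold cross-validation in S5-3 can be sketched as below. The label names and the right-closed interval boundaries are assumptions of this sketch, and the random forest itself is left out; in practice it would come from a machine learning library rather than be written by hand.

```python
def reliability_label(r):
    """Map a reliability value in [0, 1] to one of four equal-width classes."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if r <= 0.25:
        return "low"
    if r <= 0.5:
        return "relatively low"
    if r <= 0.75:
        return "relatively high"
    return "high"


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


targets = [reliability_label(r) for r in (0.1, 0.25, 0.26, 0.75, 0.9, 1.0)]
folds = list(k_fold_indices(10, 3))
```

Each fold's training indices would be used to fit the classifier and its test indices to score it, with the scores averaged over the K folds.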
7. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said S6 comprises the steps of: S6-1, selecting the project version to be evaluated, calculating its code indexes with a code scanning tool, inputting them into the software reliability evaluation model, and obtaining the predicted reliability; S6-2, analyzing the code indexes according to the predicted reliability, to assist in evaluating and optimizing code quality.
8. The method for evaluating the reliability of software based on the code index according to claim 1, wherein said step S6 is applicable both to projects with complete data and to projects with missing data: when the project to be evaluated has complete data, a four-class classification model is trained using steps S1 to S5, and step S6 is then used to evaluate the reliability of the code of the project version to be evaluated; when the project to be evaluated has missing data, a model obtained from a complete-data project of similar scale to the project to be evaluated can be used, steps S1 to S5 are omitted, and step S6 is used directly for reliability evaluation.
9. The method for evaluating the reliability of the software based on the code index of claim 3, wherein the software reliability growth model (SRGM) models reliability from the perspective of software failure and uses mathematical means, mainly differential equations, to establish a quantitative functional model among several random parameters in the software testing process; SRGMs of the NHPP (non-homogeneous Poisson process) type are the most widely applied, and they satisfy the following basic common assumptions: (1) failure events occur randomly, and the observation of failures and the removal of faults by testers and debuggers follow an NHPP; (2) let {N(t), t ≥ 0} be the random counting process, where N(t) is the cumulative number of faults detected by the testers in [0, t], and let E[N(t)] = m(t), where m(t) is the mean value function satisfying m(0) = 0. The following formula is obtained: [formula not reproduced in this text] where, for the function of the project reliability value as it changes with version, E[N(t)] is the expected cumulative number of faults at time t, λ(t) is the fault occurrence rate at time t, a(t) is the fault function of t, and b(t) is the fault detection rate function of t. The software reliability R(x|t) can be calculated as follows: if t is the time at which the last failure occurred, with t ≥ 0 and x > 0, the software reliability over the interval (t, t+x) is: [formula not reproduced in this text] Because the release interval between project versions is the same, the version can be regarded as a uniformly changing variable; the independent variable t in the formula is replaced by the version number and the variable x by the version release interval, and curve fitting yields the function of project reliability as it changes with version.
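The two formulas referred to in claim 9 do not appear in this text. In the standard general NHPP formulation, which uses the same symbols m(t), λ(t), a(t) and b(t) as defined in the claim, they are usually written as below; this is a reconstruction from the standard model and an assumption about what the claim contains, not a verbatim copy.

```latex
m(t) = E[N(t)] = \int_0^t \lambda(s)\,ds, \qquad
\frac{dm(t)}{dt} = \lambda(t) = b(t)\,\bigl[a(t) - m(t)\bigr]

R(x \mid t) = \exp\bigl\{-\bigl[m(t + x) - m(t)\bigr]\bigr\}
```

Under the substitution described in the claim, t becomes the version number and x the release interval, so R(x | t) reads as the probability that no code-triggered failure occurs between one release and the next.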

Description

Software reliability assessment method based on code index

Technical Field

The invention relates to the technical field of software quality evaluation, in particular to a software reliability assessment method based on code indexes.

Background

Software reliability refers to the probability that software does not fail within a prescribed time. The main goal of software reliability research is to build a model that observes how well the software has functioned over a past period and predicts the probability that no failure occurs over a future period. Reliability prediction helps project teams anticipate whether the software is likely to fail in a given future period, so that they can estimate the maintenance workload and prepare for the risk of software failure.

Currently, the mainstream software reliability assessment model is the software reliability growth model (SRGM), which models reliability from the perspective of software failure and uses mathematical means, mainly differential equations (or systems of them), to establish a quantitative functional model among several random parameters in the software testing process. The recorded software fault data are used to fit the trend of the cumulative fault count over time, the fault increment is normalized into a reliability probability value by a specific formula, and a reliability calculation formula for the specific project is thereby obtained. Given the model inputs, the time point and the time interval, the reliability of a past or future period can be calculated. This approach involves function fitting, which is practicable only after the project has run for some time and fault data of a certain scale have accumulated, so reliability cannot be predicted in the initial stage of a project.
Meanwhile, a reliability growth model based on fault data does not address the causes of faults, so its prediction results cannot be analyzed and understood at a fundamental level. To address these shortcomings of existing models, a software reliability assessment method based on code indexes is provided. It considers the influence of code factors on reliability, is applicable to cross-project data for which a software reliability growth model function cannot be fitted because data are missing, and predicts reliability in advance, in the early stage of a project and at the code completion stage, so that code quality can be evaluated and optimized.

Disclosure of Invention

The invention aims to provide a software reliability assessment method based on code indexes, so as to solve the problems described in the background above. To solve these technical problems, the invention provides a software reliability assessment method based on code indexes, characterized by comprising the following steps: S1, acquiring the project fault problem lists and the project Commit records, mining the tracking relation between the fault problem lists and the Commits, determining the trigger type of each fault problem list, dividing the fault problem lists into a code trigger type and a non-code trigger type according to the trigger cause, and extracting all fault problem lists whose trigger type is the code trigger type; S2, based on a software reliability growth model, obtaining by regression analysis the trend and the function of software reliability as the project versions evolve, thereby quantifying a reliability value for each project version; S3, acquiring the source code of every version of the project, and calculating the code indexes corresponding to each version with a code index scanning tool; S4, analyzing the relation between reliability and the code indexes corresponding to each version of the project, and selecting a model feature set; S5, dividing the reliability values equally into four classes as the target feature, dividing the model feature set into a training set and a testing set, training a four-class classification model with the training set and the testing set, and thereby constructing a software reliability evaluation model based on the code indexes corresponding to each version of the project; S6, inputting the code indexes of the project version to be evaluated into the software reliability evaluation model, which outputs a reliability label for the code of that version, so that reliability is predicted at the code completion stage and code quality evaluation and optimization are assisted.

Further, the step S1 includes the following steps: S1-1, selecting the project fault problem lists and the Commit data, mining the tracking relation between them, and dividing the fault problem lists into a code-triggered type and a non-code-triggered type so as