CN-121996446-A - Open source big data system software reliability modeling method considering complex test and fault removal efficiency
Abstract
The invention belongs to the technical field of software reliability models, and particularly relates to an open source big data system software reliability modeling method considering complex testing and fault removal efficiency. Aiming at the reliability evaluation problem of open source big data system software, the invention takes the complex test environment and the fault removal efficiency into consideration in software reliability modeling. Related experiments verify the accuracy of the proposed model in predicting residual software faults and the effectiveness of its reliability evaluation of open source big data system software. The model can help software developers conduct fault prediction and reliability assessment in the actual development and testing process of open source big data system software.
Inventors
- WANG JINYONG
Assignees
- Shanxi University (山西大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-16
Claims (2)
- 1. An open source big data system software reliability modeling method considering complex testing and fault removal efficiency, characterized by comprising the following steps: assume that the testing process of the open source big data system obeys a non-homogeneous Poisson process with mean value function $m(t)$. During testing, fault detection can be regarded as a counting process; using $N(t)$ to represent the number of faults detected by time $t$:
  $$P\{N(t)=i\} = \frac{[m(t)]^{i}}{i!}\,e^{-m(t)}, \quad i = 0, 1, 2, \ldots \quad (1)$$
  Assuming that the complex testing process follows a Weibull distribution, the complex test rate function $d(t)$ is expressed as:
  $$d(t) = \alpha\beta t^{\beta-1} \quad (2)$$
  where $\alpha$ is a scale parameter and $\beta$ is a shape parameter. Considering the learning process of fault detection during open source big data system software testing, the fault detection rate function $b(t)$ is expressed as:
  $$b(t) = \frac{b}{1 + c\,e^{-bt}} \quad (3)$$
  where $b$ is the fault detection rate and $c$ is the inflection point coefficient. Assuming that, during fault detection in the open source big data system, the faults detected instantaneously are related to the number of actual residual faults in the software, the following equation is established:
  $$\frac{dm(t)}{dt} = d(t)\,b(t)\,\left[a - p\,m(t)\right] \quad (4)$$
  where $a$ represents the total number of faults that can initially be detected in the software and $p$ denotes the fault removal efficiency. Substituting equation (2) and equation (3) into equation (4) and solving with the initial condition $m(0)=0$ yields:
  $$m(t) = \frac{a}{p}\left[1 - \exp\!\left(-p\int_{0}^{t} d(s)\,b(s)\,ds\right)\right] \quad (5)$$
  Equation (5) is the expression of the proposed software reliability model.
- 2. The open source big data system software reliability modeling method considering complex testing and fault removal efficiency according to claim 1, characterized in that the parameters of the software reliability model are estimated by the maximum likelihood estimation method.
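Claims 1 and 2 can be illustrated numerically. The sketch below is not from the patent: it evaluates the mean value function under one standard reading of claim 1, namely m(t) = (a/p)·[1 − exp(−p ∫₀ᵗ d(s)b(s) ds)] with d(t) = αβt^(β−1) and b(t) = b/(1 + c·e^(−bt)), and forms the grouped-data NHPP log-likelihood that the maximum likelihood estimation of claim 2 would maximize. All parameter values, the trapezoid-rule quadrature, and the sample fault counts are illustrative assumptions.

```python
import math

# Illustrative sketch only. Parameter names follow claim 1: a (initial total
# faults), p (fault removal efficiency), alpha/beta (Weibull scale/shape),
# b (fault detection rate), c (inflection point coefficient).

def complex_test_rate(t, alpha, beta):
    """Eq. (2): Weibull-type complex test rate d(t) = alpha*beta*t^(beta-1)."""
    return alpha * beta * t ** (beta - 1.0)

def fault_detection_rate(t, b, c):
    """Eq. (3): inflection S-shaped (learning-curve) detection rate b(t)."""
    return b / (1.0 + c * math.exp(-b * t))

def mean_value(t, a, p, alpha, beta, b, c, steps=2000):
    """Eq. (5): m(t) = (a/p)*(1 - exp(-p * int_0^t d(s)b(s) ds)), the
    solution of eq. (4) with m(0) = 0. The inner integral is evaluated
    with the trapezoid rule (assumes beta >= 1 so d(s) is finite at 0)."""
    if t <= 0.0:
        return 0.0
    g = lambda s: complex_test_rate(s, alpha, beta) * fault_detection_rate(s, b, c)
    h = t / steps
    integral = 0.5 * (g(0.0) + g(t)) * h + h * sum(g(k * h) for k in range(1, steps))
    return (a / p) * (1.0 - math.exp(-p * integral))

def grouped_log_likelihood(obs_times, cum_counts, a, p, alpha, beta, b, c):
    """NHPP log-likelihood for grouped (cumulative) fault-count data; claim 2's
    maximum likelihood estimation would maximize this over the parameters."""
    ll, prev_n, prev_m = 0.0, 0, 0.0
    for t, n in zip(obs_times, cum_counts):
        m_t = mean_value(t, a, p, alpha, beta, b, c)
        dn, dm = n - prev_n, m_t - prev_m
        ll += dn * math.log(dm) - dm - math.lgamma(dn + 1)
        prev_n, prev_m = n, m_t
    return ll

if __name__ == "__main__":
    params = dict(a=100.0, p=0.9, alpha=0.05, beta=1.2, b=0.1, c=2.0)
    for t in (0, 10, 50, 1000):
        print(t, round(mean_value(t, **params), 2))  # m(t) rises toward a/p
    ll = grouped_log_likelihood([10, 20, 30], [3, 7, 12], **params)
    print("log-likelihood:", round(ll, 3))
```

Because m(t) → a/p as t grows, p < 1 (imperfect fault removal) makes the asymptotic fault count exceed the a faults initially detectable, which is the qualitative effect of the fault removal efficiency term in equation (4).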
Description
Open source big data system software reliability modeling method considering complex test and fault removal efficiency
Technical Field
The invention belongs to the technical field of software reliability models, and particularly relates to an open source big data system software reliability modeling method considering complex testing and fault removal efficiency.
Background
In recent years, big data technology has been widely used in the field of human social services. While people enjoy the benefits and convenience brought by big data technology, they also question the reliability of big data system software. The complexity of open source big data systems represented by Hadoop, Spark, Flink, HBase and the like mainly comes from the architectural characteristics of the systems, the characteristics of the open source ecosystem, and the special requirements of big data application scenarios; this complexity manifests in multiple dimensions, including system design, data characteristics, test scenarios and other layers. First, open source big data systems typically employ distributed architectures and componentized designs in order to handle large-scale data, which introduces multiple levels of complexity into testing. Second, open source projects generally adopt a community-driven mode; system versions iterate quickly and version fragmentation is severe, further increasing the complexity of testing. Third, the core of a big data system is data processing, and the requirements on data scale, data type and timeliness directly increase the difficulty of software testing. Fourth, performance is a key indicator of big data systems (such as throughput, latency, and concurrency), and the difficulty of testing performance and scalability far exceeds that of conventional software.
Fifth, open source big data systems typically require integration with upstream and downstream tools (e.g., relational databases, message queues, BI tools), and the scope of compatibility testing is extremely broad. Sixth, the fault tolerance of a distributed system (e.g., automatic recovery after node failure) is one of its core characteristics, but constructing the relevant test scenarios is extremely difficult. Finally, testing of open source big data systems lacks mature standardized tools, making it difficult to effectively improve automated test coverage. To address the reliability challenges of open source big data systems, various software reliability models have been developed. Tamura et al. propose a three-dimensional stochastic differential equation software reliability model that takes into account big data characteristics, fault factors and network factors. Wang et al. propose an open source big data software reliability model based on the Weibull-Weibull distribution. Cao and Gao propose a method for evaluating big data systems using fault tree analysis. Wang et al. further consider the efficiency of fault introduction and fault removal and propose a relative software reliability model for open source big data systems. Tamura and Yamada propose a software reliability assessment method suitable for cloud computing and big data environments based on fault data clustering and a hazard rate model. Kumar et al. use non-homogeneous Poisson process (NHPP) based software reliability growth models (SRGMs) to evaluate the software reliability of open source data systems. Although these models can predict the number of residual faults in software and can effectively evaluate the software reliability of open source big data systems under certain conditions, they do not fully consider the complexity of the software testing process of open source big data systems.
Furthermore, fault removal efficiency also plays a key role in the software reliability modeling process. Fault removal efficiency is defined as the ratio of the number of faults removed during software testing to the number of faults detected, and is used to quantify the proportion of detected faults that can actually be repaired.
Disclosure of Invention
Because the testing of open source big data system software is relatively complex, involving, for example, distributed system design, data complexity, the dynamic development of open source ecosystems, the skill of community testers, and the limitation of testing resources, the reliability of open source big data system software has drawn wide attention. Currently, research on the reliability of open source big data system software is limited. Aiming at the reliability evaluation problem of open source big data system software, the invention provides a method for modeling the reliability of open source big data system software that considers complex testing and fault removal efficiency. The invention adopts the following technical scheme to achieve this aim: the open source big data system software reliability modeling method considering complex test and fault removal efficiency