CN-122019339-A - Software defect testing method based on large model

Abstract

The invention provides a software defect testing method based on a large model, belonging to the technical field of software defect testing. The method comprises: extracting the test running conditions of the software and obtaining defect feature vectors through the large model; constructing a characterization matrix through potential characterization analysis; obtaining defect combinations through gain analysis; obtaining a defect scene set by comparing the tested code with the source code, and constructing a scene matrix from it; mapping and fusing the scene matrix with the characterization matrix to obtain defect expression vectors; and locating defects according to the defect expression vectors and generating repair suggestions in combination with historical repair schemes mined by the large model. The invention achieves accurate analysis and location of defects, improves testing efficiency, and reduces the rates of missed and false defect detection.

Inventors

Shen Zifeng
Liu Ying
Wang Ning

Assignees

Xi'an Zhongke Xince Technology Co., Ltd. (西安中科信测科技有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2026-03-09

Claims (8)

1. A software defect testing method based on a large model, comprising:
Step 1, extracting the test running conditions of the software to be tested under different test cases, and inputting the test running conditions into a running large model to obtain defect feature vectors of the software to be tested for the corresponding test cases;
Step 2, performing potential characterization analysis on each defect feature vector to obtain a potential defect type, a potential influence range level, and a permissible expansion influence threshold of the corresponding potential defect type, and dividing all defect feature vectors according to the test types of their test cases to obtain a characterization matrix for each group of same-type defect feature vectors, wherein the potential defect type, the potential influence range level, and the permissible expansion influence threshold of each defect feature vector are appended to the corresponding defect feature vector as a ternary array;
Step 3, performing gain analysis on the characterization matrix to obtain a plurality of defect combinations corresponding to the characterization matrix;
Step 4, capturing the tested code of the source code of the software to be tested under different test cases, and determining a defect scene set corresponding to the source code according to the code differences between the tested code and the source code;
Step 5, constructing a scene matrix from the defect scene sets corresponding to all test cases involved under the same characterization matrix, mapping the scene matrix with the corresponding characterization matrix, and combining all defect combinations under the corresponding characterization matrix to obtain defect expression vectors of the software to be tested; and
Step 6, locating defects in the software to be tested according to all the defect expression vectors, and giving defect repair suggestions, wherein the defect repair suggestions are generated based on historical defect repair schemes mined by the running large model, in combination with the specific features of the defect expression vectors.
2. The software defect testing method based on a large model according to claim 1, wherein, before extracting the test running conditions of the software to be tested under different test cases, the method further comprises:
according to the service business of the software to be tested, retrieving business scene data, type scene data, and tool scene data from a business database, wherein the business scene data refers to the data in the business database that relate to the execution logic of the service business and to the execution code layer of the service business, and the execution code layer comprises a plurality of independent business execution code segments;
extracting information from the business scene data to obtain the independent business execution code segments and the logic expansion code segments of each independent business, encoding the independent business execution code segments and the logic expansion code segments respectively to obtain corresponding first codes and second codes, and retaining each combination of a first code and a second code under the same independent business execution code segment;
combining each first code with every remaining second code other than the second code matched with that first code, performing ambiguity processing on each code combination, discarding the code combination if its ambiguity coefficient is larger than a preset coefficient, and otherwise retaining the code combination;
obtaining, from a combination-instance database, an instance to be analyzed that matches each code combination;
determining, for each code combination, the independent business execution code segments related to adjacent log generation times, and combining the tool dependency relationship of the adjacent log generation times, determined from a tool work tree constructed according to the tool scene data, with the radiation relationship of the log capture tool of the adjacent log generation times over those independent business execution code segments, to obtain an instance correction template for the corresponding code combination;
determining a plurality of instance filling templates related to the type scene data, and uniquely selecting an instance filling template according to the execution type corresponding to each code combination to obtain a first template; and
correcting and filling the corresponding instance to be analyzed according to the instance correction template and the first template to obtain a test case.
3. The software defect testing method based on a large model according to claim 2, wherein correcting and filling the corresponding instance to be analyzed according to the instance correction template and the first template to obtain a test case comprises:
performing initial correction and filling on the corresponding instance to be analyzed according to the instance correction template and the first template to obtain an initial instance;
acquiring, in real time, service flow data, code execution frequency data, and abnormal alarm data of the software to be tested in the current service period, and constructing a real-time service state feature set;
inputting the real-time service state feature set into a preset template optimization model to obtain a first adjustment parameter for the instance correction template and a second adjustment parameter for the first template, wherein the template optimization model is a neural network model that is trained on the basis of the running large model and is used to adjust template parameters according to the real-time service state;
performing a layered update of the correction rules of the instance correction template based on the first adjustment parameter, and performing a precise update of the filling rules of the first template based on the second adjustment parameter; and
optimizing the corresponding initial instance based on the updated instance correction template and the updated first template to obtain the test case.
4. The software defect testing method based on a large model according to claim 1, wherein obtaining the potential defect type, the potential influence range level, and the permissible expansion influence threshold of each potential defect type comprises:
quantizing the feature importance, the global contribution, and the historical relevance of the defect feature vectors, and converting each defect feature vector into a corresponding potential characterization vector;
calculating the cosine similarity of any two potential characterization vectors, and, if the similarity is larger than a preset clustering threshold, assigning the two potential characterization vectors to the same cluster, wherein each cluster corresponds to one potential defect type, and all potential characterization vectors in the cluster form the defect characterization set of that potential defect type;
extracting influence keywords from the business influence information corresponding to each potential defect type, wherein the influence keywords relate to business module identifiers, function failure descriptions, data anomaly types, and user influence range descriptions;
matching the influence keywords against a preset influence range rule base using a maximum coverage matching strategy, wherein each rule in the preset influence range rule base corresponds to an influence level and a corresponding permissible expansion influence threshold;
if the influence keywords hit only one rule, determining the influence level of that rule as the potential influence range level of the corresponding potential defect type;
if the influence keywords hit at least two rules, determining the highest influence level among the hit rules as the potential influence range level of the corresponding potential defect type;
acquiring the influence time series corresponding to the potential defect type, and averaging the influence time series and the corresponding standard influence time series respectively to obtain an actual influence mean and a standard influence mean;
multiplying the influence weight of the potential defect type in the business system by the actual influence mean and by the standard influence mean respectively to obtain a weighted actual influence mean and a weighted standard influence mean;
calculating the absolute value of the difference between the weighted actual influence mean and the weighted standard influence mean;
if the absolute value of the difference is larger than a preset influence threshold, determining the permissible expansion influence threshold of the potential defect type as the maximum permissible expansion threshold corresponding to the influence level in the influence range rule base; and
otherwise, determining the permissible expansion influence threshold as the minimum permissible expansion threshold corresponding to the influence level.
5. The software defect testing method based on a large model according to claim 1, wherein performing gain analysis on the characterization matrix to obtain the defect combinations corresponding to the characterization matrix comprises:
constructing a row analysis model of the characterization matrix for the rows of the characterization matrix, wherein the row analysis model is constructed by quantifying the coupling mapping relation among the row vectors of the characterization matrix in the potential defect type dimension, the potential influence range level dimension, and the permissible expansion influence threshold dimension, constructing a coupling matrix from the result, and applying the coupling matrix in the row analysis model;
determining prior knowledge of the characterization matrix according to the statistical regularities of historical defect characterization data under the same test type, and performing dimension reduction on the coupling matrix to obtain a new matrix, wherein the prior knowledge comprises mean, variance, and covariance constraints over the three dimensions of potential defect type, potential influence range level, and permissible expansion influence threshold;
acquiring a new feature vector of the new matrix and a coupling feature vector of the coupling matrix, and obtaining a constraint weight for each dimension of the ternary array based on the new feature vector and the coupling feature vector;
determining, for each row vector in the characterization matrix, its association relation with the remaining row vectors in each dimension of the ternary array to obtain a relation array, wherein the association relation is one of linear association, nonlinear association, and no association;
retaining the corresponding row vector when its relation array meets the analysis standard, and otherwise removing the row vector;
enhancing each retained row vector according to the constraint weights to obtain an enhanced row vector, and determining the row gain of the enhanced row vector with respect to the defect features; and
performing cluster analysis on all retained row vectors, and assigning row vectors whose row gain similarity is higher than a preset threshold and whose potential defect types are associated with each other to the same cluster, wherein each cluster corresponds to one defect combination.
6. The software defect testing method based on a large model according to claim 5, wherein the defect combinations include defect trends, trend types, and trend generating conditions jointly determined by the potential influence range levels and the permissible expansion influence thresholds of the row vectors within the clusters.
7. The software defect testing method based on a large model according to claim 1, wherein mapping the scene matrix with the corresponding characterization matrix and combining all defect combinations under the corresponding characterization matrix to obtain the defect expression vectors of the software to be tested comprises:
extracting, from all defect combinations under the corresponding characterization matrix, the row vector cluster center features and the core constraint factor of each defect combination, wherein the core constraint factor is determined based on the row vector cluster center features;
supplementing each defect scene vector in the scene matrix with association features that match the defect scene to the dimensions of the ternary array of the characterization matrix, to obtain a dimension-aligned scene matrix;
taking the core constraint factor of each defect combination as a weight factor and, in combination with the mapping priority of each row vector in the corresponding characterization matrix, assigning a current weight to the corresponding defect combination, wherein the mapping weight of a row vector corresponding to a defect combination of the direct trend type is higher than the mapping weight of a row vector corresponding to a defect combination of the potential trend type, the mapping weight of a row vector corresponding to a defect combination with high satisfaction of the trend generating condition threshold is higher than the mapping weight of a row vector corresponding to a defect combination with low satisfaction of the trend generating condition threshold, and the mapping priority of each row vector is determined according to its mapping weights for the trend type and for the trend generating condition threshold;
performing association mapping between the scene features of the dimension-aligned scene matrix and the row vectors of the characterization matrix to obtain a scene-characterization association matrix, and performing fusion mapping between the scene-characterization association matrix and the current weights of the defect combinations to obtain an intermediate defect expression matrix; and
performing row-dimension aggregation on the intermediate defect expression matrix along the defect scene dimension to obtain the defect expression vector of the software to be tested under the corresponding characterization matrix.
8. The software defect testing method based on a large model according to claim 7, wherein the association features comprise the historical probability that a defect scene triggers the corresponding potential defect type, and the degree of matching between the influence range of the defect scene and the potential influence range level of the characterization matrix.
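
Illustrative sketch for Step 4 of claim 1 (not part of the claims). The step determines a defect scene set from the code differences between the tested code and the source code. The sketch below assumes that both versions are available as plain text and that each changed region counts as one "scene"; the claims do not define the structure of a scene, so the dictionary layout, the function name `defect_scene_set`, and the sample snippets are all hypothetical. It uses `difflib` from the Python standard library.

```python
import difflib


def defect_scene_set(source_code: str, tested_code: str) -> list[dict]:
    """Return one hypothetical 'scene' per changed region between two texts."""
    src = source_code.splitlines()
    tst = tested_code.splitlines()
    matcher = difflib.SequenceMatcher(a=src, b=tst, autojunk=False)
    scenes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged regions carry no difference information
        scenes.append({
            "kind": tag,                    # 'replace', 'delete' or 'insert'
            "source_start_line": i1 + 1,    # 1-based position in the source
            "source_text": src[i1:i2],      # lines as written in the source
            "tested_text": tst[j1:j2],      # lines observed in the tested code
        })
    return scenes


# Hypothetical sample: the tested code has lost a validation call.
source = "def pay(amount):\n    check(amount)\n    return charge(amount)\n"
tested = "def pay(amount):\n    return charge(amount)\n"
scenes = defect_scene_set(source, tested)
for scene in scenes:
    print(scene["kind"], scene["source_start_line"], scene["source_text"])
```

Each resulting entry could then be turned into a defect scene vector for the scene matrix of Step 5; how that encoding is done is left open by the claims.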
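
Illustrative sketch for the clustering step of claim 4 (not part of the claims). Claim 4 places two potential characterization vectors in the same cluster when their cosine similarity exceeds a preset clustering threshold. The sketch assumes that this pairwise rule is applied transitively, so clusters are the connected components of the "similar enough" relation; the claim does not say this explicitly. The function names, the sample vectors, and the threshold value 0.9 are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_by_cosine(vectors, threshold):
    """Group vector indices; pairs above the threshold are merged (union-find)."""
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Hypothetical potential characterization vectors
# (feature importance, global contribution, historical relevance).
vectors = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.0), (0.0, 0.95, 0.05)]
clusters = cluster_by_cosine(vectors, threshold=0.9)
print(clusters)
```

Under the claim, each resulting group would correspond to one potential defect type, and its members to that type's defect characterization set.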
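
Illustrative sketch for the influence level and threshold selection of claim 4 (not part of the claims). Claim 4 takes the highest influence level among the hit rules, then chooses between the maximum and minimum permissible expansion threshold of that level by comparing weighted means of the actual and standard influence time series. The "maximum coverage matching strategy" is not defined in the claims; the sketch assumes a rule is hit when all of its keywords appear among the extracted influence keywords. The class `ImpactRule`, the function names, and every number below are hypothetical.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class ImpactRule:
    keywords: frozenset      # keywords that must all be covered (assumption)
    level: int               # influence level of this rule
    min_threshold: float     # minimum permissible expansion threshold
    max_threshold: float     # maximum permissible expansion threshold


def match_level(keywords, rules):
    """Return the hit rule with the highest influence level, or None."""
    hits = [rule for rule in rules if rule.keywords <= keywords]
    return max(hits, key=lambda rule: rule.level) if hits else None


def expansion_threshold(rule, actual, standard, weight, preset):
    """Pick the max threshold if the weighted means differ by more than preset."""
    weighted_actual = weight * fmean(actual)
    weighted_standard = weight * fmean(standard)
    if abs(weighted_actual - weighted_standard) > preset:
        return rule.max_threshold
    return rule.min_threshold


rules = [
    ImpactRule(frozenset({"payment"}), 2, 0.1, 0.3),
    ImpactRule(frozenset({"payment", "data loss"}), 4, 0.2, 0.6),
]
keywords = {"payment", "data loss", "all users"}

rule = match_level(keywords, rules)  # both rules hit, the level-4 rule wins
threshold = expansion_threshold(
    rule, actual=[0.8, 0.9, 1.0], standard=[0.5, 0.5, 0.5], weight=0.5, preset=0.1
)
print(rule.level, threshold)
```

Because the same influence weight multiplies both means, the comparison is equivalent to testing the unweighted difference against the preset threshold divided by the weight; the sketch keeps the two weighted means separate only to mirror the wording of the claim.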
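
Illustrative sketch for the mapping and fusion steps of claim 7 (not part of the claims). Claim 7 maps a dimension-aligned scene matrix onto the row vectors of the characterization matrix, fuses the result with the current weights of the defect combinations, and aggregates along the defect scene dimension. The claims do not fix the arithmetic, so the sketch makes three assumptions: association is a dot product between a scene vector and a row vector, fusion scales each column by the weight of the defect combination that the row vector belongs to, and aggregation is a sum over scenes. All names and numbers are hypothetical.

```python
def associate(scene_matrix, char_matrix):
    """Scene-characterization association matrix: one dot product per pair."""
    return [
        [sum(s * c for s, c in zip(scene_row, char_row)) for char_row in char_matrix]
        for scene_row in scene_matrix
    ]


def fuse(association, row_weights):
    """Scale each column by the current weight of its row vector's combination."""
    return [[a * w for a, w in zip(row, row_weights)] for row in association]


def aggregate_over_scenes(intermediate):
    """Sum over the scene axis, leaving one component per row vector."""
    return [sum(column) for column in zip(*intermediate)]


# Two dimension-aligned defect scene vectors and three characterization rows.
scene_matrix = [[1, 0], [0, 2]]
char_matrix = [[1, 1], [2, 0], [0, 1]]
# Hypothetical current weights, e.g. a direct-trend combination weighted
# higher than a potential-trend combination.
row_weights = [0.5, 1.0, 0.25]

association = associate(scene_matrix, char_matrix)
intermediate = fuse(association, row_weights)
expression_vector = aggregate_over_scenes(intermediate)
print(expression_vector)
```

Other aggregation choices, such as a mean or a maximum over scenes, would fit the wording of the claim equally well.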

Description

Software defect testing method based on large model

Technical Field

The invention relates to the technical field of software testing, and in particular to a software defect testing method based on a large model.

Background

In software development and in operation and maintenance, software defect testing is a core step in guaranteeing software quality and improving operational stability. As software systems grow more complex and business scenarios more diverse, traditional software defect testing methods, which rely on manually designed test cases and manual analysis of defect features, suffer from low testing efficiency and struggle to cover complex business scenarios and potential defect types. In recent years, large model technology has gradually been put into practice in the field of software testing, but existing large-model-based software defect testing schemes lack a systematic testing flow design, so missed and false defect detections occur easily during testing and the defect testing requirements of complex software systems cannot be met.

Therefore, the invention provides a software defect testing method based on a large model.

Disclosure of Invention

The invention provides a software defect testing method based on a large model to solve the above technical problems.
The invention provides a software defect testing method based on a large model, which comprises the following steps:
Step 1, extracting the test running conditions of the software to be tested under different test cases, and inputting the test running conditions into a running large model to obtain defect feature vectors of the software to be tested for the corresponding test cases;
Step 2, performing potential characterization analysis on each defect feature vector to obtain a potential defect type, a potential influence range level, and a permissible expansion influence threshold of the corresponding potential defect type, and dividing all defect feature vectors according to the test types of their test cases to obtain a characterization matrix for each group of same-type defect feature vectors, wherein the potential defect type, the potential influence range level, and the permissible expansion influence threshold of each defect feature vector are appended to the corresponding defect feature vector as a ternary array;
Step 3, performing gain analysis on the characterization matrix to obtain a plurality of defect combinations corresponding to the characterization matrix;
Step 4, capturing the tested code of the source code of the software to be tested under different test cases, and determining a defect scene set corresponding to the source code according to the code differences between the tested code and the source code;
Step 5, constructing a scene matrix from the defect scene sets corresponding to all test cases involved under the same characterization matrix, mapping the scene matrix with the corresponding characterization matrix, and combining all defect combinations under the corresponding characterization matrix to obtain defect expression vectors of the software to be tested;
Step 6, locating defects in the software to be tested according to all the defect expression vectors, and giving defect repair suggestions, wherein the defect repair suggestions are generated based on historical defect repair schemes mined by the running large model, in combination with the specific features of the defect expression vectors.

Preferably, before extracting the test running conditions of the software to be tested under different test cases, the method further comprises: according to the service business of the software to be tested, retrieving business scene data, type scene data, and tool scene data from a business database, wherein the business scene data refers to the data in the business database that relate to the execution logic of the service business and to the execution code layer of the service business, and the execution code layer comprises a plurality of independent business execution code segments; extracting information from the business scene data to obtain the independent business execution code segments and the logic expansion code segments of each independent business, encoding the independent business execution code segments and the logic expansion code segments respectively to obtain corresponding first codes and second codes, and retaining each combination of a first code and a second code under the same independent business execution code segment; combining each first code with every remaining second code other than the second code matched with that first code, performing ambiguity processing on each code combination, discarding the code combination if its ambiguity coefficient is larger than a preset coefficient, and otherwise retaining the code combination; obtaining an inst