Search

CN-121996538-A - Fuzzy test method based on thread execution sequence feedback

CN121996538ACN 121996538 ACN121996538 ACN 121996538ACN-121996538-A

Abstract

The invention discloses a fuzzy test method based on thread execution sequence feedback, which comprises the steps of inserting source codes used for dynamically analyzing memory safety and collecting type definition and function definition of thread execution sequence information into source code files of a program to be tested, analyzing the source codes of the program to be tested by a compiler to generate an abstract syntax tree, traversing the abstract syntax tree to obtain specific nodes and performing instrumentation so as to detect memory errors and monitor memory state information and collect thread execution sequence information when the program runs, compiling the source codes after instrumentation to obtain assembly codes, performing coverage instrumentation on the assembly codes, compiling and linking the assembly codes after instrumentation to obtain executable files, performing fuzzy test on the executable files, using the coverage information obtained by coverage instrumentation, the memory state information obtained by memory safety dynamic analysis instrumentation and the thread execution sequence information obtained by thread context instrumentation as feedback, and outputting a program input file enabling the program to be tested to have program breakdown or memory errors.

Inventors

  • CHEN ZHE
  • WANG ANQI

Assignees

  • 南京航空航天大学

Dates

Publication Date
20260508
Application Date
20251215

Claims (6)

  1. 1. The fuzzy test method based on thread execution sequence feedback is characterized by comprising the following steps: Step S1, a source code file of a multithreading program to be tested is opened, and source codes of type definition and function definition used for memory security dynamic analysis and source codes of type definition and function definition used for collecting thread execution sequence information are inserted into the source code file; step S2, preprocessing the source code of the multithreading program to be detected after the step S1 by utilizing a compiler to obtain a preprocessed source code, and analyzing the preprocessed source code by utilizing the compiler, wherein the analysis comprises lexical analysis, grammar analysis and semantic analysis to generate an abstract grammar tree; step S3, traversing an abstract syntax tree, obtaining a node to be instrumented, performing memory security dynamic analysis instrumentation and thread context instrumentation on the node to be instrumented, detecting memory errors, monitoring memory state information and collecting thread execution sequence information when the instrumented source code operates; s4, compiling the source code after the instrumentation in the step S3 to obtain assembly code; step S5, coverage rate instrumentation is carried out on the assembly codes, and instrumentation of source codes inserted in the step S1 and the step S3 is skipped; Step S6, assembling and linking the assembly codes after pile insertion to obtain an executable file; And S7, performing fuzzy test on the executable file, and using coverage rate information obtained by coverage rate instrumentation, memory state information obtained by memory security dynamic analysis instrumentation and thread execution sequence information obtained by thread context instrumentation as feedback to guide the fuzzy test and output a program input file which enables a to-be-tested multithreaded program to have program breakdown or memory errors.
  2. 2. The fuzzy test method based on the thread execution sequence feedback of claim 1, wherein the specific procedure of the step S1 is as follows: step S11, inserting a source code defined by a memory security detection function, wherein the function judges whether memory access is legal or not by checking whether a legal address range recorded by pointer metadata pmd contains an address range to be accessed by a multithread program to be tested, and simultaneously updates the minimum pointer boundary distance, namely the minimum difference value between a pointer value ptr and the boundary of the legal address range recorded by pointer metadata pmd, for guiding seed screening and priority calculation of fuzzy test; Step S12, inserting source codes defined by a memory allocation library function and a wrapper function of a memory release library function, judging whether the memory operation is successful or not by the wrapper function PRFmalloc based on a return value after calling the memory allocation library function malloc, and updating a memory allocation peak value if the memory operation is successful; Step S13, inserting a source code of type definition and variable definition of a thread execution sequence hash value, wherein the thread execution sequence hash value is a variable of a structural body type, each member in the variable is a 64-bit unsigned integer and represents a hash value corresponding to a function call sequence or a statement execution sequence; in the case of using a Pthread multithread library and an OpenMP multithread library, the definition of the structure type includes hash values corresponding to a function mutex_lock, a function mutex_unlock, and a function pthread_join call sequence in the Pthread multithread library, and also includes hash values corresponding to a critical statement start, a critical statement end, and a barrier statement execution sequence in the OpenMP multithread library; Step S14, inserting source codes defined by a hash value updating function of a thread execution sequence, wherein the hash value updating function collects thread context information of thread synchronization events in a multithreaded program to be tested written by using a Pthread multithreaded library and an OpenMP multithreaded library, updates the hash value of the thread execution sequence and is used for identifying a thread execution sequence corresponding to the multithreaded program to be tested, the collected thread context information comprises a thread number, a code line number and a hash number of a calling expression of a multithreaded library function, the thread number, the line number and the hash number are synthesized into an integer through bit operation, and the integer is merged into the hash value of the thread execution sequence by using the hash function.
  3. 3. The fuzzy test method based on the thread execution sequence feedback according to claim 2, wherein the specific procedure of the step S3 is as follows: Step S31, pile inserting is carried out at a main function inlet, and is used for establishing inter-process communication with a fuzzy tester, obtaining the address of a shared memory variable provided by the fuzzy tester, and initializing the shared memory variable, wherein the shared memory variable is used for storing the minimum pointer boundary distance, the memory allocation peak value and the thread execution sequence hash value; Step S32, inserting piles for the memory dereferencing expression, the array subscript expression and the member expression, replacing the initial address to be accessed by each expression with the call of a memory security detection function PRFcheck _dpv for detecting memory errors and updating the minimum pointer boundary distance at the same time, and storing the memory errors and the minimum pointer boundary distance in a shared memory variable provided by a fuzzy tester; Step S33, inserting piles for the calling expressions of the memory allocation library function and the memory release library function, replacing the calling of the memory allocation library function with the calling of the packaging function of the memory allocation library function, replacing the calling of the memory release library function with the calling of the packaging function of the memory release library function, updating the memory allocation peak value, and storing the memory allocation peak value in the shared memory variable provided by the fuzzy tester; the wrapper function PRFmalloc judges whether the quantity of the currently allocated memory is larger than a memory allocation peak value before the memory allocation library function malloc is called after the memory allocation library function malloc is called for memory allocation, and if so, updates the memory allocation peak value into a shared memory variable provided by the fuzzy tester; and step S34, locking, unlocking and synchronizing the threads in the Pthread multithread library and the OpenMP multithread library, inserting a call for updating a function of a hash value of a thread execution sequence, collecting thread context information, updating the hash value of the thread execution sequence, and identifying a thread execution sequence corresponding to the multithread program execution to be tested, wherein the hash value of the thread execution sequence is stored in a shared memory variable provided by a fuzzy tester.
  4. 4. The fuzzy test method based on the thread execution sequence feedback of claim 3, wherein the specific procedure of the step S7 is as follows: Step S71, the fuzzy tester provides an operation mode option, and allows a user to select whether to use at least one of a minimum pointer boundary distance, a memory allocation peak value, a Pthread thread execution sequence and an OpenMP thread execution sequence as a feedback guide fuzzy test; Step S72, when each test case is operated, judging whether the test case triggers a new program execution path or not by using coverage information obtained by coverage instrumentation, judging whether the test case generates smaller minimum pointer boundary distance or consumes larger memory allocation peak value by using memory state information obtained by memory security dynamic analysis instrumentation; Discarding the test case if the test case does not generate new execution characteristics including a new program execution path, a smaller minimum pointer boundary distance, a larger memory allocation peak value and a new thread execution sequence, otherwise, judging again, if the user simultaneously starts a running mode of thread execution sequence information guidance and memory state information guidance and generates a new thread execution sequence, but does not trigger the new program execution path, and does not generate a smaller minimum pointer boundary distance or a larger memory allocation peak value, discarding the test case; step S73, if a new test case is operated, a smaller minimum pointer boundary distance or a more consumed memory allocation peak value is generated, then the new test case is used for replacing the test case which has the same program execution path or the same thread execution sequence as the new test case in the current seed queue but has a larger minimum pointer boundary distance or a smaller memory allocation peak value, so that the seed queue is simplified; Step S74, for each seed in the seed queue, namely a test case, calculating the priority weight of the execution sequence of the seed according to the minimum pointer boundary distance and the memory allocation peak value corresponding to each seed, and for the same seed, if the current running time generates a smaller minimum pointer boundary distance or a larger memory allocation peak value, adding 1 to the current execution sequence priority weight of the seed, wherein the initial execution sequence priority weight of each seed is 0; and step S75, calculating excellent factors of the seeds according to the execution time, the file size, the minimum pointer boundary distance, the memory allocation peak value and the priority weight of the execution sequence of the seeds in the seed queue, sorting the seeds in the seed queue according to the order of the excellent factors from large to small, and performing mutation and fuzzy test on the seeds according to the order of the excellent factors from large to small.
  5. 5. A computer device comprising a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor, when executing the computer program, implements the steps of the thread execution sequence feedback based fuzzy test method of any of claims 1 to 4.
  6. 6. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the fuzzy test method based on thread execution sequence feedback of any of claims 1 to 4.

Description

Fuzzy test method based on thread execution sequence feedback Technical Field The invention relates to a fuzzy test method based on thread execution sequence feedback, and belongs to the technical field of multi-thread program fuzzy test. Background The C language is widely used for developing operating system kernels, embedded systems and high-performance computing software by virtue of the high efficiency and flexibility of the C language approaching the bottom layer. The C language allows a programmer to manually manage memory, while improving system controllability and execution efficiency, and increasing memory security risks. Common memory errors include buffer overflow, null pointer dereferencing, pointer use after release, repeated release, memory leakage, etc., which may cause program crashes or malicious exploitation in severe cases. With the popularity of multi-core processors, multi-threaded programming has become the dominant way to improve program performance. However, when C language is combined with multi-threading, the complexity and concealment of memory security issues is further amplified. Unlike single-threaded programs, the order of execution of multiple threads in a multi-threaded program has uncertainty, resulting in memory errors that are often triggered only under a particular thread execution sequence. Fuzzy test technology combined with dynamic analysis has become a mainstream program test method. The dynamic analysis technology can extract state information such as pointer values, memory allocation and release records and the like in the program execution process, and find potential memory errors by analyzing the information in the running process. However, the validity of dynamic analysis depends on test case inputs and thread execution sequences, as some errors can only be triggered by specific test case inputs and thread execution sequences. The fuzzy test technology solves the problem of automatically generating test case input, generates input through random variation, continuously optimizes input based on feedback information (such as code coverage rate), and guides the generated input to cover more program execution paths so as to find potential errors. Fuzzy testers incorporating dynamic analysis (such as AFL incorporating ASan) have shown good effectiveness in memory error detection for single-threaded programs. However, in a multithreaded program, the program behavior is not only dependent on test case input, but also is affected by the randomness of thread scheduling and thread execution sequences, and the uncertainty causes the limitation of the conventional technology in the memory error detection of the multithreaded program, and the memory error caused by the multithreaded interaction is difficult to find only depending on the code coverage rate. Specifically, the main problems faced by the prior art are as follows: (1) The existing fuzzy tester cannot effectively find memory errors caused by multi-thread interaction The existing fuzzy tester (such as AFL) mainly uses code coverage rate as feedback to guide the generation and optimization of test case input. But the method ignores the thread execution order in a multi-threaded program, i.e. for the same code coverage, different thread execution orders may lead to completely different execution results, triggering different errors, such as memory errors caused by concurrent access, data contention. (2) Seed screening mechanisms of existing fuzzy testers do not consider multithread features of seeds The existing fuzzy tester (such as AFL) mainly screens seeds according to code coverage rate, seed values caused by thread execution sequence differences among the seeds cannot be distinguished, so that the seeds with the same code coverage rate as the existing seeds and different thread execution sequences are discarded, and the testing efficiency is reduced. In summary, the existing fuzzy test technology combined with dynamic analysis still has significant defects in the test of the multithreading program, and a new method capable of combining thread contexts is needed to realize more comprehensive and efficient error detection. Disclosure of Invention The invention aims to solve the technical problem of providing a fuzzy test method based on thread execution sequence feedback, and remarkably improving the memory error discovery capability of a multithreaded program. The invention adopts the following technical scheme for solving the technical problems: the fuzzy test method based on thread execution sequence feedback comprises the following steps: Step S1, a source code file of a multithreading program to be tested is opened, and source codes of type definition and function definition used for memory security dynamic analysis and source codes of type definition and function definition used for collecting thread execution sequence information are inserted into the source code file; step S2, preprocessing the source cod