CN-116861442-B - Binary program-oriented data-oriented vulnerability retrieval method
Abstract
The invention provides a data-oriented vulnerability retrieval method for binary programs, belonging to the technical field of computers. The method comprises: inputting test data into a target binary program on a computer and running the target binary program; when execution reaches the vulnerable function containing a memory error vulnerability, saving a memory snapshot; marking the test data that was input into the buffer in the memory snapshot as the taint source; executing the target binary program starting from the first instruction of the vulnerable function; and, if both an arbitrary memory copy gadget, which copies data in memory to other memory locations, and an arbitrary address write gadget, which writes data into memory, are found while executing the target binary program, judging that the current memory error vulnerability can be escalated to a data-oriented programming (DOP) vulnerability and exploited. The method can analyze the exploitability of data-oriented vulnerabilities in binary programs.
Inventors
- FU CAI
- ZHU QINGCHEN
- LUO TIANYU
- LV JIANQIANG
- HAN LANSHENG
- LIU MING
- ZOU DEQING
Assignees
- Huazhong University of Science and Technology (华中科技大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20230710
Claims (6)
- 1. A binary program-oriented data-oriented vulnerability retrieval method, characterized by comprising the following steps: inputting test data into a target binary program on a computer and running the target binary program, and saving a memory snapshot when execution reaches the vulnerable function containing a memory error vulnerability; marking the test data in the memory snapshot as the taint source, and executing the target binary program starting from the first instruction of the vulnerable function; after the target binary program executes a data transfer instruction tainted by the taint source, if the result of that tainted data transfer instruction is written back to memory, clearing the instructions recorded for the destination operand, fetching the instructions recorded for the source operand in the order in which they were stored, and forming a gadget from the source operand's instructions together with the current instruction; if both an arbitrary memory copy (AMC) gadget, which copies data in memory to other memory locations, and an arbitrary address write (AMW) gadget, which writes data to arbitrary memory locations, are found while executing the target binary program, judging that the current memory error vulnerability can be escalated to a data-oriented programming (DOP) vulnerability and exploited; taking the instruction address of the first instruction of the vulnerable function as the analysis start point, the instruction address of the first gadget found, G1, as analysis end point 1, and the instruction address of the second gadget found, G2, as analysis end point 2; obtaining a z3 expression expr1 of analysis end point 1 relative to the analysis start point from the backward slice graph between the analysis start point and analysis end point 1; obtaining a z3 expression expr2 of analysis end point 2 relative to the analysis start point from the backward slice graph between the analysis start point and analysis end point 2; if the constraint expr1 = expr2 has a solution, a data flow dependency exists between analysis end point 1 and analysis end point 2, and otherwise no such dependency exists; and if a data flow dependency exists between analysis end point 1 and analysis end point 2, combining G1 and G2 into one gadget.
- 2. The binary program-oriented data-oriented vulnerability retrieval method of claim 1, wherein obtaining the backward slice graph between the analysis start point and analysis end point 1 comprises: collecting the binary code from the analysis start point to analysis end point 1 and converting it into a VEX IR instruction sequence; splitting the VEX IR instruction sequence into sub-expressions, and constructing a data flow graph (DFG) according to the VEX IR statement types and sub-expression types and the temporary variables, registers, and memory that are read; and extracting the backward slice graph between the analysis start point and analysis end point 1 from the DFG.
- 3. The binary program-oriented data-oriented vulnerability retrieval method of claim 2, further comprising: performing a level-order traversal of the backward slice graph between the analysis start point and analysis end point 1, and splitting each VEX IR node in the backward slice graph into sub-expressions; converting the sub-expressions into z3 expressions; solving for the ranges of the expressions of the corresponding registers and memory at analysis end point 1 using the z3 Optimize class, and determining the tainted memory from those ranges; executing the target binary program on the tainted memory, starting from the found gadget G1, until a second gadget G12 is found; treating G12 as a new arbitrary memory read (AMR) gadget if the source address of G12 is covered by G1's taint, treating G12 as a new arbitrary address write (AMW) gadget if the destination address of G12 is covered by G1's taint, and treating G12 as an arbitrary memory copy (AMC) gadget if the source address of G12 is covered by G1's taint and the destination address of G12 can be tainted; and continuing to execute the target binary program starting from G12 until no new gadget can be found or the gadgets explored so far make the program DOP-exploitable.
- 4. The binary program-oriented data-oriented vulnerability retrieval method of claim 3, wherein the gadget G1 is an arbitrary address write (AMW) gadget or an arbitrary memory copy (AMC) gadget.
- 5. The binary program-oriented data-oriented vulnerability retrieval method of claim 1, further comprising performing path exploration while executing the target binary program, comprising: during path exploration, defining every branch whose condition is tainted as a symbolized branch, and defining the basic block containing a symbolized branch as a symbolized node, wherein a symbolized node has the following form:
    mov register1, [address1]
    cmp [address2], register1
    jxx code_segment_address
  if the cmp instruction in the jump node of a symbolized branch is tainted, defining the current branch as a controllable branch; executing one path at a time while executing the target binary program, and recording all controllable branches traversed in the current run together with their jump outcomes; after exploration of a single path finishes, performing a child-generation search on the controllable branches, comprising: each time a symbolized branch is reached, recording the jump direction taken in the current execution, taking the path that reaches the symbolized branch, inverting the jump direction at the end of that path, and generating a new path for the next execution; and when executing a newly generated path, using symbolic execution to collect the constraints of the newly generated path, verifying that the path constraints are solvable, solving for the symbolized input that reaches the path, executing the path, and carrying out new path exploration.
- 6. The binary program-oriented data-oriented vulnerability retrieval method of claim 1, further comprising determining whether the current program execution is inside a loop structure by using a Dispatcher search algorithm, which controls the number of loop iterations and mitigates path explosion, wherein the Dispatcher search algorithm comprises: finding the loop structures in the target binary program with the LoopFinder tool and recording the entry edges and exit edges of all loops; using a stack to store the loop structures in which the current simulated execution is located, continuously obtaining the instructions being executed during simulation, and checking whether the current instruction address lies on an entry edge or an exit edge of a loop; if the current instruction lies on an entry edge of a loop, checking whether the element at the top of the stack is that loop, and if not, pushing the loop onto the stack, otherwise treating this as a second round of entering the same loop; and if the current instruction lies on an exit edge of a loop, popping the loop at the top of the stack and checking whether the stack is empty, an empty stack indicating that the current simulated execution is not inside any loop.
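The stack-based loop tracking described in claim 6 can be illustrated with a minimal Python sketch. It is an assumption-laden toy, not the patented implementation: the `LoopTracker` class, its field names, and the example addresses are hypothetical, loop entry and exit edges are simplified to single addresses (the loop header and the first address reached after leaving the loop), and the loop table is supplied by hand instead of coming from a LoopFinder analysis.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LoopTracker:
    """Toy loop tracker; all names are hypothetical, not taken from the patent."""
    entries: Dict[int, str]  # loop header address -> loop name (simplified entry edge)
    exits: Dict[int, str]    # first address after leaving -> loop name (simplified exit edge)
    stack: List[str] = field(default_factory=list)
    rounds: Dict[str, int] = field(default_factory=dict)

    def step(self, addr: int) -> bool:
        """Process one executed instruction address; return True while inside a loop."""
        if addr in self.entries:
            loop = self.entries[addr]
            if not self.stack or self.stack[-1] != loop:
                self.stack.append(loop)   # first entry into this loop: push it
                self.rounds[loop] = 1
            else:
                self.rounds[loop] += 1    # same loop on top: another round
        elif addr in self.exits and self.stack and self.stack[-1] == self.exits[addr]:
            self.stack.pop()              # leaving the loop on top of the stack
        return bool(self.stack)           # empty stack: not inside any loop


# Hypothetical trace: loop "L" has its header at 0x10 and exits to 0x20.
tracker = LoopTracker(entries={0x10: "L"}, exits={0x20: "L"})
executed = [0x0C, 0x10, 0x14, 0x10, 0x14, 0x10, 0x20]
in_loop = [tracker.step(a) for a in executed]
```

In this sketch the per-loop counter in `rounds` is what an explorer could compare against an iteration limit to bound loop unrolling; the limit itself is not modeled here.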
Description
Binary program-oriented data-oriented vulnerability retrieval method
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a binary program-oriented data-oriented vulnerability retrieval method.
Background
In the current binary security field, DOP (Data-Oriented Programming) vulnerabilities have become an important threat. DOP exploits the existing data fragments in a program to construct malicious behavior without depending on the program's control flow. This makes DOP vulnerabilities harder for conventional defense mechanisms and detection techniques to handle. A discovered memory error typically just crashes the program, but if the user carefully crafts the program input, the memory error may be escalated to a DOP vulnerability and exploited. Determining whether a given memory error can be escalated to a DOP vulnerability is therefore an important problem. Traditional defense mechanisms and detection tools focus mainly on conventional control-flow hijacking attacks and have limited ability to analyze DOP vulnerabilities, so no efficient and accurate method or system currently exists for analyzing whether a given memory error can be exploited as a DOP vulnerability.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a binary program-oriented data-oriented vulnerability retrieval method.
To achieve the above object, the invention provides the following technical solution: a binary program-oriented data-oriented vulnerability retrieval method comprising the following steps: inputting test data into a target binary program on a computer and running the target binary program, and saving a memory snapshot when execution reaches the vulnerable function containing a memory error vulnerability; marking the test data in the memory snapshot as the taint source, and executing the target binary program starting from the first instruction of the vulnerable function; after the target binary program executes a data transfer instruction tainted by the taint source, if the result of that tainted data transfer instruction is written back to memory, clearing the instructions recorded for the destination operand, fetching the instructions recorded for the source operand in the order in which they were stored, and forming a gadget from the source operand's instructions together with the current instruction; and if both an arbitrary memory copy (AMC) gadget, which copies data in memory to other memory locations, and an arbitrary address write (AMW) gadget, which writes data to arbitrary memory locations, are found while executing the target binary program, determining that the current memory error vulnerability can be escalated to a data-oriented programming (DOP) vulnerability and exploited.
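As a rough illustration of the taint tracking and gadget formation steps, the following Python sketch runs over a toy instruction trace. Everything in it is hypothetical: the three-operation instruction set (`mov`, `load`, `store`), the `Insn` and `Taint` types, the register names, and the classification rule are assumptions made for the example. It tracks taint on registers only, treats the initially tainted registers as already holding test data, and does not model the memory snapshot or the step that clears the destination operand's record.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Insn:
    addr: int  # instruction address
    op: str    # "mov", "load" or "store" (toy instruction set)
    a: str     # mov/load: destination register; store: register holding the target address
    b: str     # mov/store: source register; load: register holding the source address


@dataclass(frozen=True)
class Taint:
    history: Tuple[Insn, ...]  # instructions that produced the value, oldest first
    via_controlled_load: bool  # True if the value was read through a tainted address


Gadget = Tuple[str, Tuple[Insn, ...]]


def find_gadgets(trace: Iterable[Insn], tainted_regs: Iterable[str]) -> List[Gadget]:
    """Propagate taint through the trace and collect gadgets at tainted stores."""
    taint: Dict[str, Taint] = {r: Taint((), False) for r in tainted_regs}
    gadgets: List[Gadget] = []
    for insn in trace:
        if insn.op == "store":
            ptr, val = taint.get(insn.a), taint.get(insn.b)
            if ptr is not None and val is not None:
                # Write-back to memory: source operand's recorded instructions
                # plus the current instruction form one gadget.
                kind = "AMC" if val.via_controlled_load else "AMW"
                gadgets.append((kind, val.history + (insn,)))
            continue
        src = taint.get(insn.b)
        if src is None:
            taint.pop(insn.a, None)  # destination overwritten with clean data
        else:
            controlled = src.via_controlled_load or insn.op == "load"
            taint[insn.a] = Taint(src.history + (insn,), controlled)
    return gadgets


def dop_exploitable(gadgets: List[Gadget]) -> bool:
    """Both an AMC and an AMW gadget must have been found."""
    return {"AMC", "AMW"} <= {kind for kind, _ in gadgets}


# Hypothetical trace in which rdi, rsi and rdx already hold tainted test data.
trace = [
    Insn(0x1000, "mov", "rax", "rdi"),
    Insn(0x1004, "store", "rsi", "rax"),   # tainted value to tainted address
    Insn(0x1008, "load", "rcx", "rdx"),    # read through a tainted address
    Insn(0x100C, "store", "rsi", "rcx"),   # copy between two tainted addresses
]
gadgets = find_gadgets(trace, ["rdi", "rsi", "rdx"])
```

Here the first store yields an AMW gadget and the second an AMC gadget, so `dop_exploitable(gadgets)` reports that both gadget kinds are present.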
Further, the method also comprises: taking the instruction address of the first instruction of the vulnerable function as the analysis start point, the instruction address of the first gadget found, G1, as analysis end point 1, and the instruction address of the second gadget found, G2, as analysis end point 2; obtaining a z3 expression expr1 of analysis end point 1 relative to the analysis start point from the backward slice graph between the analysis start point and analysis end point 1; obtaining a z3 expression expr2 of analysis end point 2 relative to the analysis start point from the backward slice graph between the analysis start point and analysis end point 2; if the constraint expr1 = expr2 has a solution, a data flow dependency exists between analysis end point 1 and analysis end point 2, and otherwise no such dependency exists; and if a data flow dependency exists between analysis end point 1 and analysis end point 2, combining G1 and G2 into one gadget.
Further, obtaining the backward slice graph between the analysis start point and analysis end point 1 comprises: collecting the binary code from the analysis start point to analysis end point 1 and converting it into a VEX IR instruction sequence; splitting the VEX IR instruction sequence into sub-expressions, and constructing a data flow graph (DFG) according to the VEX IR statement types and sub-expression types and the temporary variables, registers, and memory that are read; and extracting the backward slice graph between the analysis start point and analysis end point 1 from the DFG.
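The DFG construction and backward slice extraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statement tuples are a hand-written stand-in for VEX IR (no lifter is invoked), `build_dfg` and `backward_slice` are hypothetical helper names, and the final check is a crude graph-reachability proxy that replaces, and is weaker than, solving expr1 = expr2 with z3.

```python
from typing import Dict, List, Set, Tuple

# (defined variable, operation, used variables): a toy stand-in for VEX IR statements
Stmt = Tuple[str, str, Tuple[str, ...]]


def build_dfg(stmts: List[Stmt]) -> Dict[int, Set[int]]:
    """Map each statement index to the indices of the statements it reads from."""
    last_def: Dict[str, int] = {}
    deps: Dict[int, Set[int]] = {}
    for i, (dst, _op, uses) in enumerate(stmts):
        # Names never defined in the sequence (e.g. input registers) have no producer.
        deps[i] = {last_def[u] for u in uses if u in last_def}
        last_def[dst] = i
    return deps


def backward_slice(deps: Dict[int, Set[int]], end: int) -> List[int]:
    """Return, in program order, every statement the end point depends on."""
    seen: Set[int] = set()
    work = [end]
    while work:
        node = work.pop()
        if node not in seen:
            seen.add(node)
            work.extend(deps[node])
    return sorted(seen)


# Hypothetical sequence; statement 4 plays analysis end point 1, statement 5 end point 2.
stmts: List[Stmt] = [
    ("t0", "GET", ("rdi",)),
    ("t1", "GET", ("rsi",)),
    ("t2", "Add", ("t0",)),
    ("t3", "Mul", ("t1",)),
    ("t4", "Load", ("t2",)),
    ("t5", "Add", ("t4", "t3")),
]
deps = build_dfg(stmts)
slice1 = backward_slice(deps, 4)
slice2 = backward_slice(deps, 5)

# Reachability proxy (not the z3 check): end point 2 reads data produced at end point 1.
depends = 4 in slice2
```

In a fuller pipeline, each slice would be converted statement by statement into a z3 expression, and the solver, not reachability, would decide whether the two end points are data-dependent.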
Further, the method also comprises: performing a level-order traversal of the backward slice graph between the analysis start point and analysis end point 1, and splitting each VEX IR node in the backward slice graph into sub-expressions; converting the sub-expressions into z3 expressions; solving the expressio