CN-122020654-A - Intelligent agent collaborative vulnerability reasoning method and system based on logic modeling and structural prompt


Abstract

The invention provides an agent collaborative vulnerability reasoning method and system based on logic modeling and structural prompt. The method comprises: modeling each vulnerability type using a structured prompt language to generate a structured vulnerability expression for each vulnerability type; querying and matching in a code repository to obtain matched code fragments; constructing positive samples and negative samples from all the matched code fragments and the structured vulnerability expressions to form a training sample set; fine-tuning a pre-trained large language model so that it learns the mapping between code fragments and vulnerability types, yielding a fine-tuned vulnerability model; inputting a program to be detected and the corresponding structured vulnerability expression into the fine-tuned vulnerability model to predict candidate vulnerability code; and inputting the candidate vulnerability code and the corresponding structured vulnerability expression into a preset agent, which performs context-aware verification on the candidate vulnerability code using an atomic operation toolkit to obtain a vulnerability detection result.

Inventors

YIN ZHONGXU
LI JUNRU
Zong Guoxiao
KONG LIYA
Sang Haiya
DONG WENJIE
ZHAO WENCHEN

Assignees

Information Engineering University of the PLA Cyberspace Force (中国人民解放军网络空间部队信息工程大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-18

Claims (8)

1. An agent collaborative vulnerability reasoning method based on logic modeling and structural prompt, characterized by comprising the following steps: Step 1, modeling each vulnerability type using a preset structured prompt language to generate a structured vulnerability expression for each vulnerability type; Step 2, taking the structured vulnerability expressions of all vulnerability types as matching rules, querying and matching in a given code repository based on the matching rules to obtain the code fragments matching each structured vulnerability expression, and constructing positive samples and negative samples from all the matched code fragments and the structured vulnerability expressions to form a training sample set; Step 3, fine-tuning a pre-trained large language model using the training sample set, so that the pre-trained large language model learns the mapping between code fragments and vulnerability types, to obtain a fine-tuned vulnerability model; Step 4, inputting a program to be detected and the corresponding structured vulnerability expression into the fine-tuned vulnerability model to predict candidate vulnerability code in the program to be detected, and inputting the predicted candidate vulnerability code and the corresponding structured vulnerability expression into a preset agent, so that the agent calls a preset atomic operation toolkit to perform context-aware verification on the candidate vulnerability code and obtain a vulnerability detection result.
2. The method of claim 1, wherein the structured prompt language is composed of a plurality of keywords, the keywords specifically including: operation node; type of operation result; comparison type; protection condition for the operation node; operation parameter; right/left operand of the operation; whether the operation node is protected by a condition; whether a condition indicates an operation; whether all conditions must be true or any one condition is true; whether a certain operation condition is present; and maximum/minimum value of a parameter.
3. The agent collaborative vulnerability reasoning method based on logic modeling and structural prompt according to claim 1, wherein in Step 4 the atomic operation toolkit specifically includes a function indexing and expansion operation, a code slicing operation, and a variable boundary identification operation, wherein: the function indexing and expansion operation comprises establishing a function index set for all functions in the program to be detected, so that when a function call occurs during reasoning and the called function exists in the function index set, context expansion is performed on the called function based on the function index set; the code slicing operation comprises taking the candidate vulnerability code as a suspected risk node, backtracking from the suspected risk node, and performing data-flow-sensitive reverse slicing on the program to be detected; and the variable boundary identification operation comprises extracting the variable-definition code from the candidate vulnerability code, extracting the variable types, and obtaining the upper and lower bounds of each variable according to its type.
4. The agent collaborative vulnerability reasoning method based on logic modeling and structural prompt according to claim 3, wherein establishing a function index set for all functions in the program to be detected, so that when a function call occurs during reasoning and the called function exists in the function index set, context expansion is performed on the called function based on the function index set, specifically comprises: generating an abstract syntax tree of the program to be detected; traversing the abstract syntax tree to identify all function nodes, extracting the function name, parameter list, return type, and function body of each function node to form the index of that function node, and combining the indexes of all function nodes to obtain the function index set; and, when a function call is encountered during reasoning, obtaining the function name of the called function and the current recursion depth, judging whether the current recursion depth is smaller than a preset maximum recursion depth, and if so, looking up in the function index set the function body corresponding to the function name of the called function and using it as the context of the called function.
5. The agent collaborative vulnerability reasoning method based on logic modeling and structural prompt according to claim 3, wherein backtracking from the suspected risk node and performing data-flow-sensitive reverse slicing on the program to be detected specifically comprises: generating an abstract syntax tree of the program to be detected; traversing all nodes in the abstract syntax tree and performing the following relevance judgment for each node: judging whether any variable contained in the node exists in the variable set of the suspected risk node, and if so, judging the node to be related to the suspected risk node, otherwise judging it to be unrelated; and obtaining the program code corresponding to all nodes related to the suspected risk node and aggregating all the related program code to form a reverse slicing result set.
6. An agent collaborative vulnerability reasoning system based on logic modeling and structural prompt, characterized by comprising: a vulnerability coding module, used for modeling each vulnerability type using a preset structured prompt language and generating a structured vulnerability expression for each vulnerability type; a training sample construction module, used for taking the structured vulnerability expressions of all vulnerability types as matching rules, querying and matching in a given code repository based on the matching rules to obtain the code fragments matching each structured vulnerability expression, and constructing positive samples and negative samples from all the matched code fragments and the structured vulnerability expressions to form a training sample set; a model fine-tuning module, used for fine-tuning a pre-trained large language model using the training sample set, so that the pre-trained large language model learns the mapping between code fragments and vulnerability types, to obtain a fine-tuned vulnerability model; and a vulnerability reasoning module, used for inputting a program to be detected and the corresponding structured vulnerability expression into the fine-tuned vulnerability model to predict candidate vulnerability code in the program to be detected, and inputting the predicted candidate vulnerability code and the corresponding structured vulnerability expression into a preset agent, so that the agent calls a preset atomic operation toolkit to perform context-aware verification on the candidate vulnerability code and obtain a vulnerability detection result.
7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 5.
8. A non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the method of any one of claims 1 to 5.
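
Illustrative note on claims 4 and 5: the function indexing and expansion operation and the relevance judgment used for reverse slicing can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. It assumes the program to be detected is Python source parsed with the standard `ast` module, whereas the claims do not name a target language or parser. The names `build_function_index`, `expand_call`, `variables_of`, `reverse_slice`, `max_depth`, and the sample program are hypothetical, and limiting the relevance check to simple statements is a simplification made here.

```python
import ast

# Simple (non-compound) statement kinds checked for relevance; an assumption.
SIMPLE_STMTS = (ast.Assign, ast.AugAssign, ast.AnnAssign, ast.Expr, ast.Return)


def build_function_index(source: str) -> dict:
    """Index each function by name: parameter list, return type, source text."""
    index = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index[node.name] = {
                "params": [arg.arg for arg in node.args.args],
                "returns": ast.unparse(node.returns) if node.returns else None,
                "body": ast.get_source_segment(source, node),
            }
    return index


def expand_call(index: dict, callee: str, depth: int, max_depth: int = 2):
    """Return the callee's text as context if it is indexed and depth allows."""
    if depth < max_depth and callee in index:
        return index[callee]["body"]
    return None


def variables_of(node: ast.AST) -> set:
    """Collect every name referenced or assigned under the node."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def reverse_slice(source: str, risk_line: int) -> list:
    """Keep statements that share a variable with the statement at risk_line."""
    stmts = [n for n in ast.walk(ast.parse(source)) if isinstance(n, SIMPLE_STMTS)]
    risk = next((s for s in stmts if s.lineno == risk_line), None)
    if risk is None:
        raise ValueError(f"no simple statement starts on line {risk_line}")
    risk_vars = variables_of(risk)
    related = [s for s in stmts if variables_of(s) & risk_vars]
    related.sort(key=lambda s: s.lineno)
    return [ast.get_source_segment(source, s) for s in related]


# Hypothetical program to be detected; line 8 is the suspected risk node.
SAMPLE = '''\
def scale(n: int) -> int:
    return n * 4

def copy(buf, n):
    unused = 0
    size = scale(n)
    total = size + 16
    return buf[:total]
'''

index = build_function_index(SAMPLE)
print(expand_call(index, "scale", depth=0))  # source text of scale()
print(expand_call(index, "scale", depth=2))  # None: depth limit reached
print(reverse_slice(SAMPLE, risk_line=8))
```

In this sketch the relevance judgment is applied once against the risk node's own variable set, as claim 5 words it, so `size = scale(n)` is not selected even though `total` depends on it. A transitive variant that grows the variable set while walking backwards would be a different design choice and is not shown.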

Description

Intelligent agent collaborative vulnerability reasoning method and system based on logic modeling and structural prompt

Technical Field

The invention relates to the technical field of computer applications, and in particular to an agent collaborative vulnerability reasoning method and system based on logic modeling and structural prompt.

Background

With the continued development of the internet and software technology, modern software systems have grown significantly in size and complexity. Developing accurate and interpretable vulnerability detection has become a central challenge for software security. Traditional vulnerability detection techniques mainly include symbolic execution, static code analysis, and pattern matching. With the advance of deep learning, existing work has explored vulnerability detection based on large language models (LLMs); these models can learn vulnerability patterns from large-scale code corpora, which improves detection performance to some extent. However, these methods mostly rely on manual experience or simple automated tools and have significant limitations, specifically:

Limited understanding of code context: a large model's semantic understanding of code depends on patterns in its training data, but complex vulnerabilities require deep program analysis (e.g., data flow and control flow), whereas the model may capture only surface syntactic features.

Poor understanding of deep semantic logic: vulnerabilities in code often depend on implicit logic (such as timing race conditions and the ordering of asynchronous callbacks), and a large model cannot deeply understand the semantic logic of the code, so its vulnerability recognition capability is weak.

Weak interpretability: the model takes the accuracy of the result as its optimization target and does not explain its reasoning.
Disclosure of Invention

The invention provides an agent collaborative vulnerability reasoning method and system based on logic modeling and structural prompt, aiming to solve the problems that existing vulnerability detection methods based on a Large Language Model (LLM) commonly suffer from "hallucinated output" and an "unexplainable reasoning process", and adapt poorly to continuously evolving code structures and variant vulnerability patterns.

In a first aspect, the present invention provides an agent collaborative vulnerability reasoning method based on logic modeling and structural prompt, including:

Step 1, modeling each vulnerability type using a preset structured prompt language to generate a structured vulnerability expression for each vulnerability type;

Step 2, taking the structured vulnerability expressions of all vulnerability types as matching rules, querying and matching in a given code repository based on the matching rules to obtain the code fragments matching each structured vulnerability expression, and constructing positive samples and negative samples from all the matched code fragments and the structured vulnerability expressions to form a training sample set;

Step 3, fine-tuning a pre-trained large language model using the training sample set, so that the pre-trained large language model learns the mapping between code fragments and vulnerability types, to obtain a fine-tuned vulnerability model;

Step 4, inputting a program to be detected and the corresponding structured vulnerability expression into the fine-tuned vulnerability model to predict candidate vulnerability code in the program to be detected, and inputting the predicted candidate vulnerability code and the corresponding structured vulnerability expression into a preset agent, so that the agent calls a preset atomic operation toolkit to perform context-aware verification on the candidate vulnerability code and obtain a vulnerability detection result.
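
The construction of the training sample set used in Step 3 can be pictured with the following minimal sketch. The abstract states that positive and negative samples are built from the matched code fragments and the structured vulnerability expressions, but the text does not say how negatives are formed. The sketch therefore assumes that a negative sample pairs a matched fragment with an expression whose rule it did not match. All names (`VulnExpression`, `query_repository`, `build_training_set`), the substring predicates standing in for the repository query, and the placeholder expression texts are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VulnExpression:
    vuln_type: str                  # hypothetical label for the vulnerability type
    text: str                       # structured vulnerability expression (opaque here)
    rule: Callable[[str], bool]     # stand-in for the repository matching rule


def query_repository(expr: VulnExpression, repository: list) -> list:
    """Return the code fragments in the repository that the rule matches."""
    return [frag for frag in repository if expr.rule(frag)]


def build_training_set(expressions: list, repository: list) -> list:
    """Pair every matched fragment with every expression and label the pair."""
    matched = {e.vuln_type: query_repository(e, repository) for e in expressions}

    # Pool of all fragments matched by at least one expression, in first-seen order.
    pool = []
    for frags in matched.values():
        for frag in frags:
            if frag not in pool:
                pool.append(frag)

    samples = []
    for e in expressions:
        for frag in pool:
            # Assumption: label 1 (positive) if this expression matched the
            # fragment, label 0 (negative) if only another expression did.
            label = int(frag in matched[e.vuln_type])
            samples.append({"expression": e.text, "code": frag, "label": label})
    return samples


# Hypothetical repository and rules, for illustration only.
REPOSITORY = [
    "len = a * b; buf = malloc(len);",
    "strcpy(dst, src);",
    "return 0;",
]
EXPRESSIONS = [
    VulnExpression("type-A", "<structured expression A>",
                   lambda code: "*" in code and "malloc" in code),
    VulnExpression("type-B", "<structured expression B>",
                   lambda code: "strcpy" in code),
]

SAMPLES = build_training_set(EXPRESSIONS, REPOSITORY)
for sample in SAMPLES:
    print(sample["label"], sample["expression"], "|", sample["code"])
```

Each record carries the expression text alongside the code, mirroring Step 4, where the program to be detected and the corresponding structured vulnerability expression are fed to the fine-tuned model together.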
Further, the structured prompt language is composed of a plurality of keywords, which specifically include: operation node; type of operation result; comparison type; protection condition for the operation node; operation parameter; right/left operand of the operation; whether the operation node is protected by a condition; whether a condition indicates an operation; whether all conditions must be true or any one condition is true; whether a certain operation condition is present; and maximum/minimum value of a parameter.

Further, in Step 4, the atomic operation toolkit specifically includes a function indexing and expansion operation, a code slicing operation, and a numerical boundary identification operation, wherein:

the function indexing and expansion operation comprises establishing a function index set for all functions in the program to be detected, so that when a function call occurs during reasoning and the called function exists in the function index set, context expansion is performed on the called function based on the function index set;

the code slicing operation comprises taking the candidate vulnerability code as a suspected risk node, backtracking from the suspected risk node, and performing data-flow-sensitive reverse slicing on the program to be detected;