CN-121996526-A - Intelligent agent element debugging method, intelligent agent element debugging device, intelligent agent element debugging equipment and storage medium
Abstract
The invention provides an agent element debugging method, device, equipment and storage medium, applied in the technical field of artificial intelligence. The method comprises: obtaining a plurality of sets of element combination data to be debugged; debugging the agent in parallel according to each set of element combination data and obtaining, in parallel, the debugging result corresponding to each set; and performing structured analysis on the debugging results of the sets of element combination data to obtain a structured analysis result. Each set of element combination data comprises a combination of prompt words, test data and a large language model of the agent, and different sets of element combination data differ in at least one of the prompt words, the test data and the large language model. The structured analysis result is used to assist a user in selecting, from the element combination data, the target prompt words or target large language model with optimal performance for the agent. The technical scheme of the invention can improve agent debugging efficiency.
Inventors
- ZHAO SHUCHAO
- LU SHAOXUN
- CAI XIHUI
- MEI JINLING
- HAN YAODONG
- ZHANG HAN
- WEI XINGBO
Assignees
- 商飞智能技术有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260409
Claims (10)
- 1. An agent element debugging method, characterized by comprising the following steps: acquiring a plurality of sets of element combination data to be debugged, wherein each set of element combination data comprises a combination of prompt words, test data and a large language model of an agent, and different sets of element combination data differ in at least one of the prompt words, the test data and the large language model; debugging the agent in parallel according to each set of element combination data, and obtaining in parallel the debugging result corresponding to each set of element combination data; and performing structured analysis on the debugging results of the sets of element combination data to obtain a structured analysis result, wherein the structured analysis result is used to assist a user in selecting, from the element combination data, target prompt words or a target large language model with optimal performance for the agent.
- 2. The agent element debugging method according to claim 1, wherein acquiring the plurality of sets of element combination data to be debugged comprises: acquiring a plurality of prompt words, a plurality of test data and a plurality of large language models; acquiring the prompt words, test data and large language models that a user has selected from the plurality of prompt words, the plurality of test data and the plurality of large language models; and combining the prompt words, test data and large language models selected by the user to obtain the plurality of sets of element combination data to be debugged.
- 3. The agent element debugging method according to claim 2, wherein combining the prompt words, test data and large language models selected by the user to obtain the plurality of sets of element combination data to be debugged comprises: acquiring a data combination mode input by the user, wherein the data combination mode comprises any one of full permutation and combination, random sampling combination, and specified combination; and combining the prompt words, test data and large language models selected by the user according to the data combination mode to obtain the plurality of sets of element combination data to be debugged.
- 4. The agent element debugging method according to any one of claims 1 to 3, wherein each test data comprises a test case and an expected output text corresponding to the test case, the debugging result of each set of element combination data comprises a model output text corresponding to that set, and performing structured analysis on the debugging results of the sets of element combination data to obtain the structured analysis result comprises: determining a comprehensive quantized value corresponding to each set of element combination data according to the model output text of that set and the corresponding expected output text, wherein the magnitude of the comprehensive quantized value is proportional to the performance of the agent when the agent uses the corresponding set of element combination data; and determining the structured analysis result according to the comprehensive quantized value corresponding to each set of element combination data.
- 5. The agent element debugging method according to claim 4, wherein the debugging result of each set of element combination data further comprises the model running time and the token consumption information of the model run corresponding to that set, and determining the comprehensive quantized value corresponding to each set of element combination data according to the model output text of that set and the corresponding expected output text comprises: determining a first quantized value corresponding to each set of element combination data according to the similarity between the model output text of that set and the corresponding expected output text; determining a second quantized value corresponding to each set of element combination data according to the model running time of that set; determining a third quantized value corresponding to each set of element combination data according to the token consumption information of that set; and determining the comprehensive quantized value corresponding to each set of element combination data according to at least one of the first quantized value, the second quantized value and the third quantized value of that set.
- 6. The agent element debugging method according to any one of claims 1 to 3, wherein performing structured analysis on the debugging results of the sets of element combination data to obtain the structured analysis result comprises: performing structured analysis on the debugging results of the sets of element combination data by using a pivot table to obtain the structured analysis result, wherein the structured analysis result comprises a multi-dimensional analysis view corresponding to the sets of element combination data, a first dimension of the multi-dimensional analysis view is the large language model element, a second dimension of the multi-dimensional analysis view is the prompt word element, each cell of the multi-dimensional analysis view represents the structured analysis result of the element combination data under the corresponding large language model and prompt words, and the first dimension and the second dimension are different, one being the rows and the other the columns of the multi-dimensional analysis view.
- 7. The agent element debugging method according to any one of claims 1 to 3, wherein the structured analysis result comprises a structured analysis result for each set of element combination data, the method further comprising: acquiring a first prompt word input by a user, determining, according to the structured analysis results of the different large language models corresponding to the first prompt word, a first target large language model with optimal performance from the large language models included in the sets of element combination data, and recommending the first target large language model to the user; or acquiring a second large language model input by the user, determining, according to the structured analysis results of the different prompt words corresponding to the second large language model, a second target prompt word with optimal performance from the prompt words included in the sets of element combination data, and recommending the second target prompt word to the user.
- 8. An agent element debugging device, comprising: a combined data acquisition module, configured to acquire a plurality of sets of element combination data to be debugged, wherein each set of element combination data comprises a combination of prompt words, test data and a large language model of an agent, and different sets of element combination data differ in at least one of the prompt words, the test data and the large language model; a parallel debugging module, configured to debug the agent in parallel according to each set of element combination data and to obtain in parallel the debugging result corresponding to each set of element combination data; and an analysis module, configured to perform structured analysis on the debugging results of the sets of element combination data to obtain a structured analysis result, wherein the structured analysis result is used to assist a user in selecting, from the element combination data, target prompt words or a target large language model with optimal performance for the agent.
- 9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the agent element debugging method of any one of claims 1 to 7 when executing the computer program.
- 10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the agent element debugging method of any one of claims 1 to 7.
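The scoring, pivot-table and recommendation steps of claims 4 to 7 can be illustrated with a minimal Python sketch. It is not the patented implementation. The use of `difflib.SequenceMatcher` as the similarity measure, the weights, the normalisation limits `max_seconds` and `max_tokens`, the rule that lower running time and lower token consumption give higher values, and all function and field names are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity_score(output_text, expected_text):
    """First quantized value: text similarity in [0, 1] (difflib is an assumption)."""
    return SequenceMatcher(None, output_text, expected_text).ratio()


def composite_score(result, weights=(0.6, 0.2, 0.2),
                    max_seconds=10.0, max_tokens=1000):
    """Comprehensive quantized value; higher means better agent performance.

    The weights and normalisation limits are hypothetical defaults.
    """
    first = similarity_score(result["output"], result["expected"])
    # Second value: faster runs score higher (assumed mapping).
    second = max(0.0, 1.0 - result["seconds"] / max_seconds)
    # Third value: lower token consumption scores higher (assumed mapping).
    third = max(0.0, 1.0 - result["tokens"] / max_tokens)
    w1, w2, w3 = weights
    return w1 * first + w2 * second + w3 * third


def pivot(results):
    """Pivot view: rows are models, columns are prompts, cells are mean scores."""
    cells = {}
    for r in results:
        cells.setdefault((r["model"], r["prompt"]), []).append(composite_score(r))
    table = {}
    for (model, prompt), scores in cells.items():
        table.setdefault(model, {})[prompt] = sum(scores) / len(scores)
    return table


def best_model_for_prompt(table, prompt):
    """Recommend the model with the highest cell value in the prompt's column."""
    candidates = {m: row[prompt] for m, row in table.items() if prompt in row}
    return max(candidates, key=candidates.get)


# Illustrative, made-up debugging results for two models under one prompt.
results = [
    {"model": "m1", "prompt": "p1", "output": "hello world",
     "expected": "hello world", "seconds": 1.0, "tokens": 100},
    {"model": "m2", "prompt": "p1", "output": "goodbye",
     "expected": "hello world", "seconds": 1.0, "tokens": 100},
]
table = pivot(results)
recommended = best_model_for_prompt(table, "p1")
```

A symmetric helper that fixes a model and scans its row would cover the second branch of claim 7, recommending a prompt word for a given large language model.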
Description
Intelligent agent element debugging method, intelligent agent element debugging device, intelligent agent element debugging equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular to a method, an apparatus, a device, and a storage medium for debugging agent elements.
Background
With the rapid development of large language models (LLMs), building LLM-based AI (artificial intelligence) agents that can autonomously plan, use tools, and accomplish complex tasks has become a hotspot of current artificial intelligence application development. In the development life cycle of an AI agent, debugging is a critical and time-consuming stage. A successful agent usually relies on the careful coordination of three core elements: (1) high-quality prompt engineering, that is, instructions that guide the model's behavior; (2) representative test data, that is, input samples covering various edge cases; and (3) a suitable base model, since models from different vendors and with different parameters differ significantly in reasoning capability and instruction-following capability. In the related art, an agent is usually debugged by the control-variable method: for example, the model and test data are fixed while the prompt words are repeatedly modified and the output is observed, or the prompt words are fixed while different models are switched for comparison, until a suitable model and suitable prompt words are finally selected. However, this approach suffers from low debugging efficiency.
Disclosure of Invention
The invention provides an agent element debugging method, device, equipment and storage medium, which address the low agent debugging efficiency of the prior art and improve agent debugging efficiency by jointly debugging a plurality of elements of an agent in parallel.
The invention provides an agent element debugging method, which comprises the following steps: acquiring a plurality of sets of element combination data to be debugged, wherein each set of element combination data comprises a combination of prompt words, test data and a large language model of an agent, and different sets of element combination data differ in at least one of the prompt words, the test data and the large language model; debugging the agent in parallel according to each set of element combination data, and obtaining in parallel the debugging result corresponding to each set of element combination data; and performing structured analysis on the debugging results of the sets of element combination data to obtain a structured analysis result, wherein the structured analysis result is used to assist a user in selecting, from the element combination data, target prompt words or a target large language model with optimal performance for the agent.
According to the agent element debugging method provided by the invention, acquiring the plurality of sets of element combination data to be debugged comprises: acquiring a plurality of prompt words, a plurality of test data and a plurality of large language models; acquiring the prompt words, test data and large language models that a user has selected from the plurality of prompt words, the plurality of test data and the plurality of large language models; and combining the prompt words, test data and large language models selected by the user to obtain the plurality of sets of element combination data to be debugged.
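The acquisition and parallel debugging steps described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation disclosed by the invention. The `Combination` container, the `build_combinations` and `debug_in_parallel` helpers, and the `call_model` stand-in (which fakes a model response instead of calling a real large language model) are all hypothetical names. The three modes mirror the full, random sampling and specified combination modes named in the claims, and the choice of a thread pool for parallelism is an assumption.

```python
import itertools
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Combination:
    """One set of element combination data (hypothetical container)."""
    prompt: str
    test_case: str
    model: str


def build_combinations(prompts, test_cases, models, mode="full",
                       sample_size=None, specified=None, seed=0):
    """Combine the user-selected elements according to a combination mode."""
    full = [Combination(p, t, m)
            for p, t, m in itertools.product(prompts, test_cases, models)]
    if mode == "full":
        return full
    if mode == "random":
        # Random sampling without replacement from the full product.
        k = len(full) if sample_size is None else min(sample_size, len(full))
        return random.Random(seed).sample(full, k)
    if mode == "specified":
        # The user lists (prompt, test_case, model) triples explicitly.
        return [Combination(*triple) for triple in (specified or [])]
    raise ValueError(f"unknown combination mode: {mode}")


def call_model(combo):
    """Stand-in for a real LLM call; returns a deterministic fake output."""
    return f"[{combo.model}] {combo.prompt} {combo.test_case}"


def debug_in_parallel(combos, run=call_model, max_workers=4):
    """Run every combination concurrently and map each one to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(run, combos))  # map preserves input order
    return dict(zip(combos, outputs))


combos = build_combinations(["p1", "p2"], ["t1"], ["m1", "m2"])
results = debug_in_parallel(combos)
sampled = build_combinations(["p1", "p2"], ["t1"], ["m1", "m2"],
                             mode="random", sample_size=2)
```

Threads suit this sketch because real model calls are typically network-bound; a production system might instead use asynchronous requests or a task queue.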
According to the agent element debugging method provided by the invention, combining the prompt words, test data and large language models selected by the user to obtain the plurality of sets of element combination data to be debugged comprises: acquiring a data combination mode input by the user, wherein the data combination mode comprises any one of full permutation and combination, random sampling combination, and specified combination; and combining the prompt words, test data and large language models selected by the user according to the data combination mode to obtain the plurality of sets of element combination data to be debugged.
According to the agent element debugging method provided by the invention, each test data comprises a test case and an expected output text corresponding to the test case, the debugging result of each set of element combination data comprises a model output text corresponding to that set, and performing structured analysis on the debugging results of the sets of element combination data to obtain the structured analysis result comprises: determining a comprehensive quantized value corresponding to each set of element combination data according to the model output text of that set and the corresponding expected output text, wherein the magnitude of the comprehensive quan