CN-122021705-A - AI intelligent agent for unmanned cluster task planning
Abstract
The invention relates to an AI intelligent agent for unmanned cluster task planning and belongs to the technical field of artificial intelligence. It addresses the shortcomings of existing general large models in unmanned cluster task planning: lack of domain knowledge, deviation from user intention, lack of environment interaction and inadequate memory capacity. The AI intelligent agent comprises an external memory system, a prompt word management module, a task planning module, a scheme conversion module and a task evaluation module. The external memory system stores the history dialogue records of task planning and preset unmanned cluster information. The prompt word management module searches the external memory system based on input information and environment information to obtain task related information and generates task planning prompt words. The task planning module obtains a task planning scheme based on the task planning prompt words. The scheme conversion module converts the task planning scheme into executable control instructions of the unmanned cluster, so that the unmanned cluster is controlled to execute the task and a task execution result is obtained. The task evaluation module evaluates the task execution result and stores the evaluation result in the external memory system.
Inventors
- GUO XINYU
- QIU ZHEWEI
- LI TIANMING
- FENG FUYONG
- DANG RUINA
- LI ZHENGKUN
- HUANG WEIHAN
- WANG DI
- WANG ZE
Assignees
- 中兵智能创新研究院有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260325
Claims (10)
- 1. An AI intelligent agent for unmanned cluster task planning, characterized by comprising a prompt word management module, a task planning module, a scheme conversion module, an external memory system and a task evaluation adjustment module; the external memory system is used for storing history dialogue records of unmanned cluster task planning and preset unmanned cluster information; the prompt word management module is used for searching the external memory system based on the received input information and environment information to obtain corresponding task related information and generating task planning prompt words; the task planning module is used for obtaining a task planning scheme from the task planning prompt words by means of a task planning large model trained on a pre-acquired unmanned cluster task planning instruction data set and preference data; the scheme conversion module is used for converting the task planning scheme into executable control instructions of the unmanned cluster, so as to control the unmanned cluster to execute the task and obtain a task execution result; and the task evaluation module is used for evaluating the task execution result of the unmanned cluster and storing the evaluation result in the external memory system as a history dialogue record.
- 2. The AI agent of claim 1, wherein the task planning large model is a decoder-only model using a Transformer architecture, and training the task planning large model comprises: obtaining an instruction data set for unmanned cluster task planning, wherein the instruction data set comprises a plurality of groups of task planning instructions and corresponding expected task planning schemes; performing parameter fine-tuning on the task planning large model based on the instruction data set to obtain a fine-tuned task planning large model; acquiring unmanned cluster task planning preference data and constructing a preference data set, wherein the preference data set comprises a plurality of groups of task planning instructions with corresponding preferred task planning data and non-preferred task planning data; and training the fine-tuned task planning large model based on the preference data set to obtain the trained task planning large model.
- 3. The AI agent of claim 2, wherein the parameter fine-tuning of the task planning large model uses the low-rank adaptation (LoRA) method to fine-tune the parameters efficiently, and comprises: configuring a trainable low-rank adapter in a linear projection layer of a Transformer layer of the task planning large model, and freezing the original parameters of the task planning large model; loading the instruction data set, training the configured low-rank adapter parameters with the objective of minimizing the cross-entropy loss between the output of the task planning large model and the expected task planning scheme, and ending training when the loss function value converges to obtain the trained low-rank adapter; and merging the parameters of the trained low-rank adapter with the original parameters of the task planning large model to obtain the fine-tuned task planning large model.
- 4. The AI agent of claim 3, wherein training the fine-tuned task planning large model comprises: constructing a model with the same structure as the fine-tuned task planning large model as a reference model, and freezing its parameters during training; using the fine-tuned task planning large model as a policy model; and loading the preference data set into the reference model and the policy model respectively, training the policy model with a direct preference optimization (DPO) loss function, updating the parameters of the policy model by gradient backpropagation, and ending training when the loss function value converges, the trained policy model serving as the task planning large model.
- 5. The AI agent of claim 1, wherein the external memory system comprises a structured database and an unstructured vector database; the structured database is used for storing the capability parameters and task rules of each unmanned unit in the unmanned cluster; and the unstructured vector database is used for storing vectorized representations of task related text documents.
- 6. The AI agent of claim 5, wherein the prompt word management module is configured to: identify a task type based on the input information so as to select a corresponding prompt word template; retrieve the external memory system based on the input information to obtain task related information; and integrate the input information, the environment information and the task related information into the prompt word template to obtain the task planning prompt word.
- 7. The AI agent of claim 6, wherein retrieving the external memory system based on the input information to obtain task related information comprises: constructing a retrieval prompt word from a preset prompt word template based on the input information and the table structure information of the structured database, and generating a database query statement using a large language model; querying the structured database with the database query statement to obtain structured data records related to the user's query intention; when the number of structured data records is greater than or equal to a preset number, converting the structured data records into natural language description text as the task related information; otherwise, performing semantic similarity retrieval on the unstructured vector database based on the input information to obtain task supplementary information, converting the structured data records into natural language description text, and concatenating that text with the task supplementary information to obtain the task related information.
- 8. The AI agent of any of claims 6-7, wherein identifying a task type based on the input information so as to select a corresponding prompt word template comprises: performing word segmentation on the input information to obtain a keyword set; matching the keyword set against the preset verb set and phrase set of each task type to obtain the verb match count and phrase match count of each task type; taking the task type with the highest phrase match count as the current task type; when several task types share the highest phrase match count, taking the one among them with the highest verb match count as the current task type; and when the phrase match count of every task type is 0, taking the task type with the highest verb match count as the current task type.
- 9. The AI agent of claim 1, wherein the task planning module further comprises a feasibility check module configured to check, based on preset rules, the task planning scheme obtained by the task planning large model and, when an error is found, to append the error information to the task planning prompt word and regenerate the task planning scheme.
- 10. The AI agent of claim 1, wherein converting the task planning scheme into executable control instructions of the unmanned cluster comprises: extracting the task allocation information of each group from the task planning scheme by predefined regular expression matching, wherein the task allocation information comprises the type and number of unmanned units, the task type, and the name of the area in which the task is executed; assigning a unique code to each group in scheduling order, based on the time sequence division in the task planning scheme; and generating the executable control instructions of the unmanned cluster in a preset JSON format from the group codes and the corresponding task allocation information.
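For reference, the two training objectives named in claims 3 and 4 are commonly written as below. These are the standard published forms of LoRA and direct preference optimization, not formulas stated in this document; the symbols W_0, A, B, r, \alpha, \beta, \pi_\theta, \pi_{\mathrm{ref}} and \mathcal{D} are assumptions introduced here for illustration.

```latex
% LoRA (claim 3): the original weight W_0 stays frozen, only A and B are trained,
% and the low-rank update is merged into the weight after training.
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)

% DPO (claim 4): the policy model \pi_\theta is trained against the frozen
% reference model \pi_{\mathrm{ref}} on triples of instruction x,
% preferred plan y_w and non-preferred plan y_l.
\mathcal{L}_{\mathrm{DPO}}(\theta) =
-\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
\left[ \log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right) \right]
```

Here \sigma is the logistic function and \beta controls how far the policy model may move away from the reference model.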
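The selection rule in claim 8 can be sketched as follows. This is a minimal illustration that assumes word segmentation has already produced the keyword set; the `TaskType` class, the example task types and their vocabularies are hypothetical, and the handling of a tie on both counts (the first listed type wins) is an assumption, since the claim does not specify it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TaskType:
    """A task type with its preset verb set and phrase set (hypothetical structure)."""
    name: str
    verbs: frozenset[str]
    phrases: frozenset[str]


# Hypothetical catalog; a real system would load these from configuration.
TASK_TYPES = [
    TaskType("reconnaissance", frozenset({"scout", "observe"}),
             frozenset({"area search", "target recon"})),
    TaskType("strike", frozenset({"attack", "destroy"}),
             frozenset({"fire strike"})),
]


def identify_task_type(keywords: set[str], task_types: list[TaskType]) -> str:
    """Pick the task type by phrase match count, breaking ties by verb match count."""
    scored = [
        (len(keywords & t.phrases), len(keywords & t.verbs), t.name)
        for t in task_types
    ]
    best_phrase_count = max(phrase_count for phrase_count, _, _ in scored)
    # Keep only the types that share the highest phrase match count. When every
    # phrase count is 0, all types remain, so the verb count alone decides.
    candidates = [s for s in scored if s[0] == best_phrase_count]
    # max() returns the first maximal item, so a full tie goes to the first listed type.
    return max(candidates, key=lambda s: s[1])[2]
```

Note that the third rule of the claim (all phrase counts equal to 0) falls out of the second rule, because a shared count of 0 is simply a tie across every task type.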
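The conversion described in claim 10 can be sketched as follows. The document does not give the text layout of a task planning scheme or the preset JSON format, so the "Phase N: ..." layout, both regular expressions, the "G1, G2, ..." codes and every JSON key below are assumptions made only for illustration.

```python
import json
import re

# Assumed plan layout, one line per time phase:
#   "Phase 1: 3 UAV perform reconnaissance in Area A; 2 UGV perform patrol in Area B"
PHASE_RE = re.compile(r"^Phase\s+(\d+):\s*(.+)$", re.MULTILINE)
GROUP_RE = re.compile(r"(\d+)\s+(\w+)\s+perform\s+(\w+)\s+in\s+(Area\s+\w+)")


def plan_to_instructions(plan_text: str) -> str:
    """Extract per-group task allocations and emit them as a JSON string."""
    # Order phases by their number so that codes follow the scheduling sequence.
    phases = sorted(PHASE_RE.findall(plan_text), key=lambda p: int(p[0]))
    groups = []
    for phase_no, body in phases:
        for count, unit_type, task_type, area in GROUP_RE.findall(body):
            groups.append({
                "group_id": f"G{len(groups) + 1}",  # unique code in scheduling order
                "phase": int(phase_no),
                "unit_type": unit_type,
                "unit_count": int(count),
                "task_type": task_type,
                "area": area,
            })
    return json.dumps({"groups": groups}, indent=2)
```

Calling `plan_to_instructions` on a two-phase plan in the assumed layout returns a JSON document with one entry per group, numbered across phases in the order they are scheduled.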
Description
AI intelligent agent for unmanned cluster task planning
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an AI intelligent agent for unmanned cluster task planning.
Background
An unmanned cluster system is a cooperative system consisting of a large number of autonomous or semi-autonomous unmanned units (such as unmanned vehicles and unmanned aerial vehicles). Its core advantages are strong system robustness, high task execution efficiency and high organizational flexibility. Task planning is the key to realizing the effectiveness of an unmanned cluster: complex tasks are decomposed into subtasks and reasonably allocated to the individual units in the cluster, a unified understanding of the environment is formed from the information acquired by the multiple sensors in the cluster, and cooperative control instructions are finally generated.
Traditional unmanned cluster task planning relies primarily on symbolic methods or reinforcement learning methods. Symbolic methods require symbolic modeling of the problem, mainly through rules and models; they suit structured, stable environments, but they have limited effect in highly dynamic, complex real-world scenes and generalize poorly. Reinforcement learning methods can learn policies through trial and error, but they generally require a large number of samples, the reward function is difficult to design, data collection is time-consuming and costly, and training is difficult to converge. In recent years, a number of data-driven task planning and decision-making approaches have been developed that draw on the powerful generation and reasoning capabilities of large language models.
Because pre-trained large models have strong generalization capability, control decision algorithms driven by them do not depend on explicit modeling of the environment and are therefore suited to high-dimensional, nonlinear and dynamic environments. Early studies were mostly directed at single tasks such as vision-manipulation or vision-navigation, whose underlying model capabilities are questionable. Later work uses the large model as a task controller that understands the tasks input by humans, plans and makes decisions for a specific environment, and finally generates control responses, which forms the prototype of the large-model intelligent agent.
However, directly applying a general large model to task planning in the specialized field of unmanned clusters still faces several serious challenges. First, a general pre-trained large model is pre-trained on massive general data but lacks domain-specific knowledge, and for unmanned platform control tasks there is a lack of vertical-domain training data for specific task scenarios. As a result, when a pre-trained large model is used directly for unmanned cluster task planning in a specific task scenario, it may fail to understand the task input by the human and produce hallucinations, so that the generated control scheme is inconsistent with the actual equipment configuration, environmental conditions and the like. In addition, a standalone large language model can only generate text based on the current input. It cannot actively perceive environment information (such as the real-time capability of each unit and the physical position of the elements in the task scene), and because the format of the generated content is uncertain, the task planning scheme cannot be converted into specific control signals for the unmanned units.
Disclosure of Invention
In view of the above analysis, the embodiments of the invention aim to provide an AI intelligent agent construction method for unmanned cluster task planning, which solves the problems of lack of domain knowledge, deviation from user intention, lack of environment interaction and inadequate memory capacity of existing general large models in unmanned cluster task planning. The aim of the invention is mainly achieved by the following technical scheme: The invention provides an AI intelligent agent for unmanned cluster task planning, which comprises a prompt word management module, a task planning module, a scheme conversion module, an external memory system and a task evaluation adjustment module; the external memory system is used for storing history dialogue records of unmanned cluster task planning and preset unmanned cluster information; the prompt word management module is used for searching the external memory system based on the received input information and environment information to obtain corresponding task related information and generating task planning prompt words; the task planning module is used for obtaining a task planning scheme based on the task planning prompt word, and training a task planning large model by uti