CN-122024017-A - Large model-intelligent agent cooperative dynamic extensible remote sensing target interpretation system, method and medium


Abstract

The application provides a large model-agent cooperative dynamic extensible remote sensing target interpretation system, method and medium. The target interpretation system comprises a cognition module, a planning module and an execution module. The cognition module performs task decomposition on a text task to be processed through a large language model, obtains the subtasks corresponding to the text task to be processed, and obtains the direct dependency relationship between every two subtasks. The planning module performs task sequence planning on the subtasks based on the direct dependency relationships between the subtasks and constructs a task execution sequence, which comprises a first subtask for serial execution and a second subtask for parallel execution. The execution module performs task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence, and outputs a task execution result.

Inventors

Qi Xiyu
LUO YICHANG
LIU SHIXIONG
LIU XIAOXUAN
WANG LEI
SHI HANRU
WAN YUNFENG
YANG YING
ZHAO XINYU

Assignees

Aerospace Information Research Institute, Chinese Academy of Sciences (中国科学院空天信息创新研究院)

Dates

Publication Date: 20260512
Application Date: 20260127

Claims (10)

1. A large model-agent cooperative dynamic extensible remote sensing target interpretation system, characterized in that the system comprises a cognition module, a planning module and an execution module, wherein: the cognition module is used for carrying out task decomposition on a text task to be processed through a large language model, obtaining subtasks corresponding to the text task to be processed, and obtaining a direct dependency relationship between every two subtasks; the planning module is used for carrying out task sequence planning on the subtasks based on the direct dependency relationship between the subtasks and constructing a task execution sequence, wherein the task execution sequence comprises a first subtask for serial execution and a second subtask for parallel execution; and the execution module is used for carrying out task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and outputting a task execution result.
2. The large model-agent cooperative dynamic extensible remote sensing target interpretation system of claim 1, wherein the planning module includes a relationship construction module, a task acquisition module, and a sequence generation module; the relationship construction module is used for constructing a directed acyclic graph based on the task execution steps of the subtasks; the task acquisition module is used for acquiring the first subtask and the second subtask through a graph algorithm based on the directed acyclic graph, wherein the first subtask is constructed based on the subtasks having the direct dependency relationship; and the sequence generation module is used for generating the task execution sequence based on the first subtask and the second subtask.
3. The large model-agent cooperative dynamic extensible remote sensing target interpretation system according to claim 2, characterized in that the planning module is also used for acquiring a first image processing algorithm set corresponding to the task type of the subtask based on a first mapping relation between the task type and the image processing algorithm, acquiring a second image processing algorithm set corresponding to the mode of the multi-mode remote sensing image based on a second mapping relation between the image mode and the image processing algorithm, and performing algorithm evaluation based on the first image processing algorithm set and the second image processing algorithm set to acquire an optimal image processing algorithm; and the execution module is further configured to perform task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and the optimal image processing algorithm, and output the task execution result.
4. The large model-agent cooperative dynamic extensible remote sensing target interpretation system according to claim 3, characterized in that the planning module is further configured to determine candidate image processing algorithms in the first image processing algorithm set and the second image processing algorithm set, determine a performance evaluation result of each candidate image processing algorithm based on a utility function, and use the candidate image processing algorithm with the best evaluation result as the optimal image processing algorithm.
5. The large model-agent cooperative dynamic extensible remote sensing target interpretation system according to claim 3 or 4, characterized in that the execution module is further used for determining error information of a third subtask based on the task execution result after the task execution result is output, and sending the error information of the third subtask to the cognition module, wherein the third subtask is a subtask whose task processing failed; the cognition module is further used for analyzing the error information of the third subtask to obtain an error analysis result, and sending the error analysis result to the planning module; the planning module is further configured to obtain an optimal candidate execution policy corresponding to the error analysis result based on a third mapping relation between the analysis result and the execution policy, and send the optimal candidate execution policy to the execution module, where the optimal candidate execution policy includes one or more of: extending the execution time of the third subtask, reducing the sampling number of the multi-mode remote sensing image corresponding to the third subtask, and updating the optimal image processing algorithm of the third subtask; and the execution module is further configured to re-execute the third subtask based on the optimal candidate execution policy.
6. A large model-agent cooperative dynamic extensible remote sensing target interpretation method, characterized in that the method is applied to a large model-agent cooperative dynamic extensible remote sensing target interpretation system comprising a cognition module, a planning module and an execution module, and the method comprises the following steps: the cognition module carries out task decomposition on a text task to be processed through a large language model, acquires subtasks corresponding to the text task to be processed, and acquires a direct dependency relationship between every two subtasks; the planning module performs task sequence planning on the subtasks based on the direct dependency relationship between the subtasks and constructs a task execution sequence, wherein the task execution sequence comprises a first subtask for serial execution and a second subtask for parallel execution; and the execution module performs task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and outputs a task execution result.
7. The method of claim 6, wherein the method further comprises: the planning module obtains a first image processing algorithm set corresponding to the task type of the subtask based on a first mapping relation between the task type and the image processing algorithm; the planning module obtains a second image processing algorithm set corresponding to the mode of the multi-mode remote sensing image based on a second mapping relation between the image mode and the image processing algorithm; the planning module carries out algorithm evaluation based on the first image processing algorithm set and the second image processing algorithm set to obtain an optimal image processing algorithm; and the step in which the execution module performs task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and outputs a task execution result includes: the execution module performs task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and the optimal image processing algorithm, and outputs the task execution result.
8. The method according to claim 6 or 7, characterized in that the method further comprises: after the task execution result is output, the execution module determines error information of a third subtask based on the task execution result and sends the error information of the third subtask to the cognition module, wherein the third subtask is a subtask whose task processing failed; the cognition module analyzes the error information of the third subtask to obtain an error analysis result, and sends the error analysis result to the planning module; the planning module obtains an optimal candidate execution policy corresponding to the error analysis result based on a third mapping relation between the analysis result and the execution policy, and sends the optimal candidate execution policy to the execution module, wherein the optimal candidate execution policy comprises one or more of: prolonging the execution time threshold of the third subtask, reducing the sampling number of the multi-mode remote sensing image corresponding to the third subtask, and updating the optimal image processing algorithm of the third subtask; and the execution module re-executes the third subtask based on the optimal candidate execution policy.
9. A large model-agent collaborative dynamically extensible remote sensing target interpretation system comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the method of any one of claims 6-8 when executing the program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 6-8.
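
As an illustration of the algorithm selection in claims 3 and 4 and the failure recovery in claims 5 and 8, the following is a minimal Python sketch, not the applicant's implementation. Everything concrete in it is an assumption: the algorithm names, accuracy and latency figures, the weighted utility function, the reading of "candidate" as the intersection of the two algorithm sets, the error categories, and the doubling and halving factors are all hypothetical, since the claims only state that the three mapping relations and a utility function exist.

```python
# All names, scores, weights, and error categories below are hypothetical.
TASK_TO_ALGOS = {"detection": {"algo_a", "algo_b", "algo_c"}}   # first mapping relation
MODALITY_TO_ALGOS = {"SAR": {"algo_b", "algo_c", "algo_d"}}     # second mapping relation
PROFILES = {
    "algo_a": {"accuracy": 0.95, "latency_s": 2.0},
    "algo_b": {"accuracy": 0.90, "latency_s": 4.0},
    "algo_c": {"accuracy": 0.85, "latency_s": 1.0},
    "algo_d": {"accuracy": 0.80, "latency_s": 0.5},
}
ERROR_TO_POLICY = {                                             # third mapping relation
    "timeout": "extend_time",
    "out_of_memory": "reduce_samples",
    "poor_result": "switch_algorithm",
}


def utility(name, w_accuracy=1.0, w_latency=0.05):
    """Assumed utility: reward accuracy, penalize latency."""
    profile = PROFILES[name]
    return w_accuracy * profile["accuracy"] - w_latency * profile["latency_s"]


def select_algorithm(task_type, modality, exclude=frozenset()):
    """Pick the highest-utility algorithm that fits both task type and modality."""
    pool = (TASK_TO_ALGOS[task_type] & MODALITY_TO_ALGOS[modality]) - set(exclude)
    if not pool:
        raise LookupError("no candidate algorithm available")
    # Sort first so that ties are broken deterministically by name.
    return max(sorted(pool), key=utility)


def recover(subtask, error_kind):
    """Return an adjusted copy of a failed subtask, ready for re-execution."""
    policy = ERROR_TO_POLICY[error_kind]
    updated = dict(subtask)
    if policy == "extend_time":
        updated["timeout_s"] = subtask["timeout_s"] * 2
    elif policy == "reduce_samples":
        updated["num_samples"] = max(1, subtask["num_samples"] // 2)
    else:  # switch_algorithm: choose the next-best candidate
        updated["algorithm"] = select_algorithm(
            subtask["task_type"], subtask["modality"], exclude={subtask["algorithm"]}
        )
    return updated


subtask = {"task_type": "detection", "modality": "SAR", "timeout_s": 30, "num_samples": 8}
subtask["algorithm"] = select_algorithm("detection", "SAR")
retry = recover(subtask, "poor_result")
print(subtask["algorithm"], "->", retry["algorithm"])  # algo_c -> algo_b
```

In this sketch only algo_b and algo_c satisfy both mappings, and algo_c wins on utility (0.80 against 0.70) because of its lower latency. Keeping the mappings as plain lookup tables means a new algorithm could be registered by adding entries rather than changing the selection logic, which is one possible reading of the "dynamic extensible" goal.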

Description

Large model-agent cooperative dynamic extensible remote sensing target interpretation system, method and medium

Technical Field

The application relates to the technical field of remote sensing image processing and artificial intelligence, and in particular to a large model-agent cooperative dynamic extensible remote sensing target interpretation system and method and a computer readable storage medium.

Background

A remote sensing target interpretation system is an important technical support in fields such as earth observation, environmental monitoring, urban planning, and national defense security. It aims to automatically extract ground object targets and their semantic information from multi-mode remote sensing data such as visible light, infrared, synthetic aperture radar (SAR), and hyperspectral data.

In the related art, most remote sensing target interpretation systems rely on a fixed task processing pipeline that domain experts design around a predefined algorithm flow and a fixed working mode, combining tasks such as preprocessing, detection, classification, and segmentation serially. This approach is reasonably stable for specific tasks, but when facing complex, changeable, and unstructured remote sensing scenes it suffers from a rigid workflow and a low degree of intelligence, so complex tasks take longer to execute. In addition, when a new algorithm or a new data type is introduced, the underlying code must be modified and redeployed, which makes it difficult to meet current requirements for multi-source heterogeneous data fusion and dynamic task execution.
Disclosure of Invention

The embodiments of the application provide a large model-agent cooperative dynamic extensible remote sensing target interpretation system and method and a computer readable storage medium.

In a first aspect, an embodiment of the application provides a large model-agent cooperative dynamic extensible remote sensing target interpretation system, which comprises a cognition module, a planning module and an execution module. The cognition module is used for carrying out task decomposition on a text task to be processed through a large language model, obtaining subtasks corresponding to the text task to be processed, and obtaining the direct dependency relationship between every two subtasks. The planning module is used for carrying out task sequence planning on the subtasks based on the direct dependency relationships between the subtasks to construct a task execution sequence, which comprises a first subtask for serial execution and a second subtask for parallel execution. The execution module is used for carrying out task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence and outputting a task execution result.
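
The planning step described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the method of the application: the subtask names and dependency pairs are hypothetical, and Kahn's topological layering is used as one possible graph algorithm, since the text does not name a specific one. Stages are executed one after another (serial), while subtasks that land in the same stage have no direct dependency on each other and may run in parallel.

```python
from collections import defaultdict


def plan_stages(subtasks, dependencies):
    """Group subtasks into ordered stages using Kahn's algorithm.

    dependencies is a list of (before, after) pairs, meaning that
    `after` directly depends on `before`.
    """
    indegree = {task: 0 for task in subtasks}
    successors = defaultdict(list)
    for before, after in dependencies:
        successors[before].append(after)
        indegree[after] += 1

    # Subtasks with no unmet dependency form the first stage.
    ready = sorted(task for task, degree in indegree.items() if degree == 0)
    stages = []
    while ready:
        stages.append(ready)
        next_ready = []
        for task in ready:
            for successor in successors[task]:
                indegree[successor] -= 1
                if indegree[successor] == 0:
                    next_ready.append(successor)
        ready = sorted(next_ready)

    # Any subtask left unscheduled is part of a dependency cycle.
    if sum(len(stage) for stage in stages) != len(subtasks):
        raise ValueError("dependency cycle detected; the graph is not a DAG")
    return stages


# Hypothetical decomposition of a text task into four subtasks.
subtasks = ["preprocess", "detect_ships", "detect_aircraft", "summarize"]
dependencies = [
    ("preprocess", "detect_ships"),
    ("preprocess", "detect_aircraft"),
    ("detect_ships", "summarize"),
    ("detect_aircraft", "summarize"),
]
stages = plan_stages(subtasks, dependencies)
print(stages)
# [['preprocess'], ['detect_aircraft', 'detect_ships'], ['summarize']]
```

Here the two detection subtasks share a stage because neither depends on the other, while preprocessing and summarization must run before and after them respectively.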
In a second aspect, an embodiment of the application provides a large model-agent cooperative dynamic extensible remote sensing target interpretation method, which is applied to a large model-agent cooperative dynamic extensible remote sensing target interpretation system comprising a cognition module, a planning module and an execution module. In the method, the cognition module carries out task decomposition on a text task to be processed through a large language model, obtains the subtasks corresponding to the text task to be processed, and obtains the direct dependency relationship between every two subtasks. The planning module performs task sequence planning on the subtasks based on the direct dependency relationships between the subtasks and constructs a task execution sequence, which comprises a first subtask for serial execution and a second subtask for parallel execution. The execution module performs task processing on the multi-mode remote sensing image to be processed corresponding to the subtasks based on the task execution sequence, and outputs a task execution result.

In a third aspect, an embodiment of the application provides a large model-agent cooperative dynamic extensible remote sensing target interpretation system, including a memory and a processor, the memory storing a computer program executable on the processor, and the processor implementing the method of the second aspect when executing the program.

In a fourth aspect, an embodiment of the application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of the second aspect.
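
The execution step of the method can be sketched as follows, assuming the task execution sequence is represented as an ordered list of stages in which each stage holds subtasks that may run together. This representation, the thread pool, the subtask names, and the `count_inputs` handler are all hypothetical choices made for illustration; the text does not specify how serial and parallel subtasks are scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequence(stages, handler, max_workers=4):
    """Run stages in order; run the subtasks inside one stage concurrently."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stages:
            # Each subtask sees only the results of earlier stages.
            upstream = dict(results)
            futures = {task: pool.submit(handler, task, upstream) for task in stage}
            for task, future in futures.items():
                # result() blocks until done and re-raises any handler error.
                results[task] = future.result()
    return results


def count_inputs(task, upstream):
    # Hypothetical stand-in for a real image processing step: it only
    # reports how many upstream results were available to the subtask.
    return len(upstream)


stages = [["preprocess"], ["detect_aircraft", "detect_ships"], ["summarize"]]
results = run_sequence(stages, count_inputs)
print(results)
# {'preprocess': 0, 'detect_aircraft': 1, 'detect_ships': 1, 'summarize': 3}
```

Waiting for every future in a stage before starting the next one preserves the serial ordering between dependent subtasks, at the cost of idling workers when one subtask in a stage is much slower than the others.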
In the embodiment of the application, firstly, a cognitive module decomposes an input text task through a large language model to obtain corresponding subtasks and dependency