CN-121981276-A - Unmanned system autonomous control method and system based on large language model


Abstract

The application discloses a human-machine fusion autonomous control method and system for unmanned systems based on a large language model, and relates to the fields of large language model applications and unmanned systems. The method acquires multi-sensor environment data at the current moment and constructs a current local environment topological graph; retrieves an expert strategy database using the current instruction and the topological graph as a fusion query vector, and selects a preset number of historical expert strategies as a candidate strategy set; generates an optimal strategy for the current moment through a large language model based on the current instruction, the topological graph, and the candidate strategy set, and controls the unmanned system to execute it; and, if the task is not completed, updates the topological graph and repeats these steps until the current task is completed. The application enables the unmanned system to perform mapping, retrieval, decision-making, and execution, and improves its responsiveness, task adaptability, and execution reliability in complex unknown environments.

Inventors

XIN LIMING
WU CHENXU
ZHU CHUNZHEN
LIU YUEHUA

Assignees

Shanghai University (上海大学)

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (10)

1. An unmanned system autonomous control method based on a large language model, characterized by comprising the following steps: obtaining an unstructured high-level semantic instruction sequence, wherein the unstructured high-level semantic instruction sequence comprises a plurality of unstructured high-level semantic instructions; traversing the unstructured high-level semantic instruction sequence and executing the following steps for each unstructured high-level semantic instruction, until the tasks corresponding to all unstructured high-level semantic instructions in the unstructured high-level semantic instruction sequence are completed, whereupon control ends: taking the start time corresponding to the current unstructured high-level semantic instruction in the unstructured high-level semantic instruction sequence as the current moment, and acquiring multi-sensor environment data at the current moment, wherein the multi-sensor environment data comprises lidar data, vision sensor data, and inertial measurement unit data; constructing a current local environment topological graph based on the multi-sensor environment data at the current moment; retrieving an expert strategy database using the current unstructured high-level semantic instruction and the current local environment topological graph as a fusion query vector, and selecting a preset number of historical expert strategies as a current candidate strategy set, wherein the expert strategy database comprises historical fusion query vectors and corresponding historical expert strategies; obtaining an optimal strategy at the current moment through a large language model, based on the current unstructured high-level semantic instruction, the current local environment topological graph, and the current candidate strategy set; controlling the unmanned system, based on the optimal strategy at the current moment, to execute the current task corresponding to the current unstructured high-level semantic instruction; and, if the current task is not completed, acquiring multi-sensor environment data at the next moment to update the current local environment topological graph, and repeating the above steps until the current task is completed.
2. The unmanned system autonomous control method based on the large language model according to claim 1, wherein constructing the current local environment topological graph based on the multi-sensor environment data at the current moment specifically comprises: performing state estimation and local mapping on the multi-sensor environment data at the current moment through a preset fusion mapping algorithm to obtain pose information of the unmanned system and a local environment point cloud map at the current moment, wherein the preset fusion mapping algorithm comprises the FAST-LIVO2 algorithm or a similar multi-sensor fusion mapping algorithm; and constructing the current local environment topological graph based on the pose information of the unmanned system and the local environment point cloud map at the current moment.
3. The unmanned system autonomous control method based on the large language model according to claim 1, wherein retrieving the expert strategy database using the current unstructured high-level semantic instruction and the current local environment topological graph as the fusion query vector, and selecting a preset number of historical expert strategies as the current candidate strategy set, comprises: inputting the current unstructured high-level semantic instruction and the current local environment topological graph into a cross-modal fusion characterization model to obtain the fusion query vector corresponding to the current unstructured high-level semantic instruction and the current local environment topological graph, wherein the cross-modal fusion characterization model comprises a text encoder, a graph encoder, and a modal fusion layer, the text encoder is used for extracting instruction semantic features of the unstructured high-level semantic instruction, the graph encoder is used for extracting topological graph structural features of the local environment topological graph, and the modal fusion layer is used for fusing the instruction semantic features and the corresponding topological graph structural features to obtain the fusion query vector; calculating the cosine similarity between the fusion query vector corresponding to the current unstructured high-level semantic instruction and the current local environment topological graph and each historical fusion query vector in the expert strategy database; sorting all cosine similarities from high to low to obtain a similarity ranking set; and selecting the historical expert strategies corresponding to the preset number of highest cosine similarities in the similarity ranking set as the current candidate strategy set.
4. The unmanned system autonomous control method based on the large language model according to claim 1, wherein obtaining the optimal strategy at the current moment further comprises: performing strategy constraint verification on the optimal strategy at the current moment, wherein the strategy constraint verification comprises a safety constraint and a dynamic constraint; if all constraints in the strategy constraint verification are satisfied, the strategy constraint verification is passed; and if any constraint in the strategy constraint verification is not satisfied, the strategy constraint verification is not passed.
5. The unmanned system autonomous control method based on the large language model according to claim 1, wherein controlling the unmanned system, based on the optimal strategy at the current moment, to execute the current task corresponding to the current unstructured high-level semantic instruction specifically comprises: mapping the optimal strategy at the current moment to a preset action primitive database to obtain a current bottom-layer executable action primitive; converting the current bottom-layer executable action primitive into a bottom-layer control instruction, wherein the bottom-layer control instruction comprises speed, heading, attitude, and altitude; and inputting the bottom-layer control instruction into a bottom-layer controller of the unmanned system to control the unmanned system to execute the current task corresponding to the current unstructured high-level semantic instruction.
6. The unmanned system autonomous control method based on the large language model according to claim 4, wherein the safety constraint specifically comprises: projecting the optimal strategy at the current moment onto the current local environment topological graph to obtain a predicted path of the unmanned system; calculating, based on the predicted path and the current local environment topological graph, the minimum distance between the unmanned system and each obstacle; if the minimum distance between the unmanned system and each obstacle is greater than or equal to a preset safety threshold, the safety constraint is satisfied; and if the minimum distance between the unmanned system and any obstacle is smaller than the preset safety threshold, the safety constraint is not satisfied; and wherein the dynamic constraint specifically comprises: extracting the speed, the acceleration, and the yaw rate change rate from the optimal strategy at the current moment; if the speed is greater than a preset maximum allowable speed, or the acceleration is greater than a preset maximum allowable acceleration, or the yaw rate change rate is greater than a preset maximum allowable yaw rate change rate, the dynamic constraint is not satisfied; and if the speed is smaller than or equal to the preset maximum allowable speed, the acceleration is smaller than or equal to the preset maximum allowable acceleration, and the yaw rate change rate is smaller than or equal to the preset maximum allowable yaw rate change rate, the dynamic constraint is satisfied.
7. The unmanned system autonomous control method based on the large language model according to claim 6, wherein if the cosine similarities corresponding to all historical expert strategies in the current candidate strategy set are lower than a preset similarity threshold, or the strategy constraint verification fails, a safety guarantee mechanism is triggered, the safety guarantee mechanism comprising switching to an autonomous navigation mode based on geometric obstacle avoidance to control the unmanned system, or controlling the unmanned system to return to a preset safety point.
8. The unmanned system autonomous control method based on the large language model according to claim 5, further comprising: in the process of inputting the bottom-layer control instruction into the bottom-layer controller of the unmanned system to control the unmanned system to execute the current task corresponding to the current unstructured high-level semantic instruction, locally adjusting the bottom-layer control instruction through a lidar-based autonomous obstacle avoidance algorithm, so as to achieve real-time obstacle avoidance.
9. The unmanned system autonomous control method based on the large language model according to claim 2, wherein constructing the current local environment topological graph based on the pose information of the unmanned system and the local environment point cloud map at the current moment specifically comprises: extracting geometric position information and semantic tags of key entities in the environment through a point cloud semantic segmentation algorithm, and constructing the spatial connection relationships among the key entities in the environment through a topological relation analysis algorithm, to obtain the current local environment topological graph; the current local environment topological graph is: $G_t = (V_t, E_t, X_t)$; $x_i = p_i \,\Vert\, s_i$; wherein $G_t$ denotes the local environment topological graph at the current moment $t$; $V_t$ denotes the set of key entity nodes in the environment at the current moment $t$; $E_t$ denotes the spatial topological connection relationships among the entities at the current moment $t$; $X_t$ denotes the multimodal feature matrix at the current moment $t$; $x_i$ denotes the multimodal feature of the i-th node in $X_t$; $\Vert$ denotes the vector concatenation operation; $p_i$ denotes the geometric position information of the i-th node; and $s_i$ denotes the semantic tag of the i-th node.
10. An unmanned system autonomous control system based on a large language model, characterized in that the system applies the unmanned system autonomous control method based on the large language model of any one of claims 1 to 9, and comprises: an instruction acquisition module, used for obtaining an unstructured high-level semantic instruction sequence, wherein the unstructured high-level semantic instruction sequence comprises a plurality of unstructured high-level semantic instructions; a task scheduling module, used for traversing the unstructured high-level semantic instruction sequence and executing the following steps for each unstructured high-level semantic instruction, until the tasks corresponding to all unstructured high-level semantic instructions in the unstructured high-level semantic instruction sequence are completed, whereupon control ends: an environment sensing module, used for taking the start time corresponding to the current unstructured high-level semantic instruction in the unstructured high-level semantic instruction sequence as the current moment and acquiring multi-sensor environment data at the current moment, wherein the multi-sensor environment data comprises lidar data, vision sensor data, and inertial measurement unit data; a topology construction module, used for constructing a current local environment topological graph based on the multi-sensor environment data at the current moment; a strategy retrieval module, used for retrieving an expert strategy database using the current unstructured high-level semantic instruction and the current local environment topological graph as a fusion query vector, and selecting a preset number of historical expert strategies as a current candidate strategy set, wherein the expert strategy database comprises historical fusion query vectors and corresponding historical expert strategies; a decision generation module, used for obtaining an optimal strategy at the current moment through a large language model based on the current unstructured high-level semantic instruction, the current local environment topological graph, and the current candidate strategy set; a task execution module, used for controlling the unmanned system, based on the optimal strategy at the current moment, to execute the current task corresponding to the current unstructured high-level semantic instruction; and a loop control module, used for acquiring multi-sensor environment data at the next moment to update the current local environment topological graph if the current task is not completed, and repeating the above steps until the current task is completed.
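
As an illustration only, the retrieval step recited in claim 3 can be sketched as a top-k search by cosine similarity. The sketch assumes the fusion query vectors have already been produced by the cross-modal fusion characterization model; the names `cosine_similarity`, `retrieve_candidates`, and `EXPERT_DB`, the three-dimensional vectors, and the strategy strings are hypothetical and do not come from the application.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(
    query: Sequence[float],
    database: List[Tuple[Sequence[float], str]],
    k: int,
) -> List[Tuple[float, str]]:
    """Score every historical vector, sort high to low, keep the first k."""
    scored = [(cosine_similarity(query, vec), strategy) for vec, strategy in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy database of (historical fusion query vector, expert strategy).
EXPERT_DB = [
    ([1.0, 0.0, 0.0], "follow corridor"),
    ([0.0, 1.0, 0.0], "circle landmark"),
    ([0.7, 0.7, 0.0], "search along wall"),
]

candidates = retrieve_candidates([1.0, 0.1, 0.0], EXPERT_DB, k=2)
for similarity, strategy in candidates:
    print(f"{similarity:.3f}  {strategy}")
```

A linear scan is used here for clarity; a large expert strategy database would more plausibly use an approximate nearest-neighbor index, which the application does not specify.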
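
The strategy constraint verification of claims 4 and 6 and the fallback trigger of claim 7 can likewise be sketched in a few lines. This is a minimal sketch under stated assumptions: the predicted path is represented as a list of 2D waypoints and each obstacle as a single 2D point, and the names `Limits`, `safety_ok`, `dynamics_ok`, and `select_mode` are hypothetical. The application itself does not fix these representations.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # assumed 2D position; the application does not fix this


@dataclass
class Limits:
    safety_distance: float      # preset safety threshold
    max_speed: float            # preset maximum allowable speed
    max_acceleration: float     # preset maximum allowable acceleration
    max_yaw_rate_change: float  # preset maximum allowable yaw rate change rate


def safety_ok(path: List[Point], obstacles: List[Point], limits: Limits) -> bool:
    """Satisfied when every obstacle stays at least safety_distance from the path."""
    for obstacle in obstacles:
        nearest = min((math.dist(p, obstacle) for p in path), default=math.inf)
        if nearest < limits.safety_distance:
            return False
    return True


def dynamics_ok(speed: float, acceleration: float, yaw_rate_change: float,
                limits: Limits) -> bool:
    """Satisfied only when all three quantities are within their maxima."""
    return (speed <= limits.max_speed
            and acceleration <= limits.max_acceleration
            and yaw_rate_change <= limits.max_yaw_rate_change)


def select_mode(similarities: List[float], similarity_threshold: float,
                path: List[Point], obstacles: List[Point],
                speed: float, acceleration: float, yaw_rate_change: float,
                limits: Limits) -> str:
    """Return "fallback" if all candidates are too dissimilar or verification fails."""
    all_dissimilar = all(s < similarity_threshold for s in similarities)
    verified = (safety_ok(path, obstacles, limits)
                and dynamics_ok(speed, acceleration, yaw_rate_change, limits))
    return "fallback" if all_dissimilar or not verified else "execute"


limits = Limits(safety_distance=1.0, max_speed=5.0,
                max_acceleration=2.0, max_yaw_rate_change=0.5)
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
mode = select_mode([0.9, 0.4], 0.5, path, [(1.0, 2.0)], 3.0, 1.0, 0.2, limits)
print(mode)
```

In this sketch an empty candidate set also yields "fallback", because `all` over an empty list is true; what "fallback" then does (geometric obstacle-avoidance navigation or return to a preset safety point) is left to the caller.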

Description

Unmanned system autonomous control method and system based on large language model

Technical Field

The application relates to the technical field of large language model applications and unmanned systems, and in particular to a human-machine fusion unmanned system autonomous control method and system based on a large language model.

Background

With the continuous iteration and upgrading of unmanned system technology, task scenarios have become increasingly diverse, expanding rapidly from early preset-trajectory flight and basic fixed-point operation to highly unstructured and uncertain dynamic environments such as emergency search and rescue, exploration of unknown airspace, and situation awareness at complex sites. In such scenarios, relying only on low-level flight control capabilities is no longer sufficient; unmanned systems urgently need the ability to understand intent and to combine expert knowledge and experience in making logical decisions.

Given the capabilities of a large language model (Large Language Model, LLM) in semantic understanding and logical reasoning, fusing it with an unmanned system to enhance high-level cognition and decision-making has become an important direction, but it still faces the following challenges. First, high-level instructions are difficult for an unmanned system to understand reliably and convert into executable strategies. Existing unmanned systems usually take GPS (Global Positioning System) waypoints, trajectory curves, or velocity vectors as their inputs and constraints, so they are better suited to executing explicit navigation and maneuvering instructions. In complex tasks such as search, rescue, and reconnaissance, existing techniques struggle to convert unstructured high-level intent into specific task strategies, and therefore cannot generate stable, executable bottom-layer control instructions. Second, existing methods concentrate on bottom-layer obstacle avoidance and local path planning based on geometric features such as point clouds, and lack strategy-level mechanisms for semantic understanding and experience retrieval. During execution it is difficult to align current environmental semantic information quickly with prior experience and complete strategies, so strategy selection lags or is ill-suited to the situation, and the unmanned system has difficulty adjusting in time to maintain task execution efficiency.

Therefore, there is a need for a human-machine fusion unmanned system autonomous control method based on a large language model, so that the unmanned system can understand task intent in an unknown environment, draw on expert experience, and adaptively generate executable actions, thereby improving its autonomous adaptability and task completion efficiency in unstructured dynamic environments.

Disclosure of Invention

The application aims to provide a large language model-based unmanned system autonomous control method and system that can solve technical problems of traditional unmanned systems in complex unstructured environments, such as poor understanding of high-level instructions, the lack of a strategy selection mechanism supported by expert experience, and poor task adaptability caused by excessive dependence on preset waypoints.
In order to achieve the above object, the application provides the following solutions: In a first aspect, the application provides a large language model-based human-machine fusion unmanned system autonomous control method, comprising the following steps: obtaining an unstructured high-level semantic instruction sequence, wherein the unstructured high-level semantic instruction sequence comprises a plurality of unstructured high-level semantic instructions; traversing the unstructured high-level semantic instruction sequence and executing the following steps for each unstructured high-level semantic instruction, until the tasks corresponding to all unstructured high-level semantic instructions in the unstructured high-level semantic instruction sequence are completed, whereupon control ends: taking the start time corresponding to the current unstructured high-level semantic instruction in the unstructured high-level semantic instruction sequence as the current moment, and acquiring multi-sensor environment data at the current moment, wherein the multi-sensor environment data comprises lidar data, vision sensor data, and inertial measurement unit data; constructing a current local environment topological graph based on the multi-sensor environment data at the current moment; retrieving an expert strategy database using the current unstructured high-level semantic instruction and the current local environment topological graph as a fusion query vector, and selecting a preset