CN-121979630-A - Task planning method, computing device and machine-readable storage medium

CN 121979630 A

Abstract

The application discloses a task planning method, a computing device and a machine-readable storage medium, relating to the technical field of embodied intelligence. The method comprises: determining a target environment object based on an input task instruction; querying a preset scene knowledge base based on the target environment object to obtain background knowledge corresponding to the target environment object, wherein the background knowledge comprises at least one of the attribute, position and function of the target environment object; generating an enhancement instruction based on the background knowledge and the task instruction; and inputting the enhancement instruction into a pre-trained task planning model to obtain a task planning result corresponding to the task instruction. By introducing a dynamically constructed scene knowledge base and an intelligent retrieval-and-verification mechanism, the method enhances the task planning system's ability to perceive and understand the physical world, overcomes the traditional methods' reliance on static internal knowledge, and improves the practicality and reliability of task planning in embodied-intelligence scenarios.

Inventors

  • YIN QIANQIAN
  • ZENG GUANG
  • FU LING
  • TONG XING
  • SHE LINGJUAN
  • ZHOU ZHIZHONG

Assignees

  • Zhongke Yungu Technology Co., Ltd. (中科云谷科技有限公司)

Dates

Publication Date
2026-05-05
Application Date
2025-12-24

Claims (10)

  1. A task planning method, comprising: determining a target environment object based on an input task instruction; querying a preset scene knowledge base based on the target environment object to obtain background knowledge corresponding to the target environment object, wherein the background knowledge comprises at least one of the attribute, position and function of the target environment object; generating an enhancement instruction based on the background knowledge and the task instruction; and inputting the enhancement instruction into a pre-trained task planning model to obtain a task planning result corresponding to the task instruction.
  2. The task planning method according to claim 1, wherein the step of constructing the scene knowledge base comprises: constructing a semantic map of a target physical scene based on observation data of the target physical scene, wherein the semantic map comprises the attributes and positions of the environment objects in the target physical scene; acquiring descriptive semantic information for each environment object, wherein the descriptive semantic information defines the function of the environment object or describes functional association information between the environment object and other environment objects in the same target physical scene; and, for each environment object, storing the attribute, the position and the descriptive semantic information in association, to generate the scene knowledge base corresponding to the target physical scene.
  3. The task planning method according to claim 2, wherein the background knowledge further comprises membership relations between the target environment object and other environment objects in the target physical scene; and before the step of querying the preset scene knowledge base based on the target environment object, the method further comprises: acquiring spatial relationships among the environment objects in the semantic map; and determining membership relations among the environment objects based on the spatial relationships, and storing the membership relations in the scene knowledge base.
  4. The task planning method according to claim 1, wherein querying the preset scene knowledge base based on the target environment object to obtain the background knowledge corresponding to the target environment object comprises: querying the preset scene knowledge base based on the target environment object and determining a first preset number of query results; ranking the query results from high to low according to the semantic relevance between each query result and the target environment object; and selecting, from the ranked results, a second preset number of top-ranked query results as the background knowledge corresponding to the target environment object.
  5. The task planning method according to claim 1, wherein generating the enhancement instruction based on the background knowledge and the task instruction comprises: generating result verification information based on the background knowledge, the task instruction and observation data of the target physical scene; inputting the result verification information into a pre-trained result verification model, and screening, through the result verification model, target background knowledge related to the task instruction from the background knowledge; and generating the enhancement instruction based on the target background knowledge and the task instruction.
  6. The task planning method according to claim 1, wherein generating the enhancement instruction based on the background knowledge and the task instruction comprises: combining the background knowledge with a preset reasoning-step guide template to obtain a combined text, wherein the reasoning-step guide template instructs the task planning model to reason step by step according to preset reasoning steps; and splicing the combined text with the task instruction to form the enhancement instruction.
  7. The task planning method according to claim 1, wherein generating the enhancement instruction based on the background knowledge and the task instruction comprises: combining the background knowledge with a preset reasoning output instruction to obtain a combined text, wherein the reasoning output instruction instructs the task planning model to output the reasoning process of the task planning while outputting the task planning result; and splicing the combined text with the task instruction to form the enhancement instruction.
  8. The task planning method according to claim 6 or 7, further comprising: judging, based on a preset compliance rule, whether the reasoning process that generated the task planning result is compliant, wherein the preset compliance rule comprises at least one of a physical feasibility rule, an operation safety rule and a task logic dependency rule; outputting the task planning result if the reasoning process is compliant; and outputting planning failure prompt information if the reasoning process is not compliant.
  9. A computing device, comprising: a memory configured to store instructions; and a processor configured to invoke the instructions from the memory and, when executing the instructions, to implement the task planning method according to any one of claims 1 to 8.
  10. A machine-readable storage medium having stored thereon instructions for causing a machine to perform the task planning method according to any one of claims 1 to 8.
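Taken together, claims 1, 9 and 10 describe a retrieval-augmented planning loop. The Python sketch below illustrates that loop under stated assumptions: the class and function names, the naive keyword-match grounding step, and the stand-in planning model are all illustrative, not the patented implementation.

```python
from dataclasses import dataclass

@dataclass
class EnvObject:
    name: str
    attributes: dict      # e.g. {"color": "red"}
    position: tuple       # (x, y, z) in the scene frame
    function: str         # natural-language description of what it is for

class SceneKnowledgeBase:
    """Maps object names to their stored background knowledge."""
    def __init__(self, objects):
        self._objects = {o.name: o for o in objects}

    def names(self):
        return list(self._objects)

    def query(self, target_name):
        obj = self._objects.get(target_name)
        if obj is None:
            return None
        return {"attribute": obj.attributes,
                "position": obj.position,
                "function": obj.function}

def plan(task_instruction, kb, planning_model):
    # 1. Determine the target environment object (here: naive keyword match).
    target = next((n for n in kb.names() if n in task_instruction), None)
    # 2. Query the scene knowledge base for background knowledge.
    knowledge = kb.query(target) if target else None
    # 3. Generate the enhancement instruction by prepending the knowledge.
    enhanced = f"Background: {knowledge}\nTask: {task_instruction}"
    # 4. Feed the enhancement instruction to the pre-trained planning model.
    return planning_model(enhanced)

kb = SceneKnowledgeBase([EnvObject("cup", {"color": "red"}, (1.0, 0.5, 0.8),
                                   "holds liquids; usually on the table")])
result = plan("bring me the cup", kb, planning_model=lambda p: p.splitlines())
print(result[1])  # → Task: bring me the cup
```

In a real system the `planning_model` callable would be a vision-language model and the grounding step would use the model itself, but the data flow (instruction → target object → background knowledge → enhancement instruction → plan) follows claim 1.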

Description

Task planning method, computing device and machine-readable storage medium

Technical Field

The application relates to the technical field of embodied intelligence, in particular to a task planning method, a computing device and a machine-readable storage medium.

Background

In the field of embodied artificial intelligence, vision-language models are becoming a core component of robots for high-level task planning. Existing mainstream methods typically map a brief user instruction directly into an action sequence based on a single current observation image. However, this paradigm has an inherent limitation: the planning process relies heavily on the static general knowledge that the vision-language model internalized during the pre-training phase, and cannot efficiently perceive and fuse the context information provided by the task execution environment beyond the current field of view. As a result, when facing complex open-world tasks that require long-horizon reasoning or depend on a specific environment layout, robots are very likely to become disconnected from the specific physical reality, exhibiting shortcomings in reliability, safety and adaptability. Therefore, how to break through the excessive dependence of existing vision-language models on internal static knowledge and improve their situation awareness has become a key challenge for improving the actual performance of embodied intelligent systems in complex physical environments.

Disclosure of Invention

In view of the foregoing deficiencies of the prior art, it is an object of the embodiments of the present application to provide a task planning method, a computing device and a machine-readable storage medium.
In order to achieve the above object, a first aspect of the present application provides a task planning method, comprising: determining a target environment object based on an input task instruction; querying a preset scene knowledge base based on the target environment object to obtain background knowledge corresponding to the target environment object, wherein the background knowledge comprises at least one of the attribute, position and function of the target environment object; generating an enhancement instruction based on the background knowledge and the task instruction; and inputting the enhancement instruction into a pre-trained task planning model to obtain a task planning result corresponding to the task instruction.

In an embodiment of the application, the step of constructing the scene knowledge base comprises: constructing a semantic map of a target physical scene based on observation data of the target physical scene, wherein the semantic map comprises the attributes and positions of the environment objects in the target physical scene; acquiring descriptive semantic information for each environment object, wherein the descriptive semantic information defines the function of the environment object or describes functional association information between the environment object and other environment objects in the same target physical scene; and, for each environment object, storing the attribute, the position and the descriptive semantic information in association, to generate the scene knowledge base corresponding to the target physical scene.
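The knowledge-base construction step above (a semantic map of attributes and positions, joined per object with its descriptive semantic information) can be sketched as a keyed record store. The field names and the toy observation data below are assumptions for illustration, not the format the patent prescribes:

```python
def build_scene_kb(semantic_map, descriptions):
    """Store attribute, position and descriptive semantics in association.

    semantic_map: {name: {"attributes": dict, "position": tuple}}
                  (as built from observation data of the target physical scene)
    descriptions: {name: text defining the object's function or its
                   functional association with other objects in the scene}
    """
    kb = {}
    for name, obs in semantic_map.items():
        # One record per environment object, keeping the three kinds of
        # background knowledge (attribute, position, function) together.
        kb[name] = {
            "attribute": obs["attributes"],
            "position": obs["position"],
            "description": descriptions.get(name, ""),
        }
    return kb

kb = build_scene_kb(
    {"fridge": {"attributes": {"state": "closed"}, "position": (0, 2, 0)}},
    {"fridge": "keeps food cold; the milk is usually inside the fridge"},
)
print(sorted(kb["fridge"]))  # → ['attribute', 'description', 'position']
```

The associated storage is what lets a later query return all three kinds of background knowledge for an object in a single lookup.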
In an embodiment of the application, the background knowledge further comprises membership relations between the target environment object and other environment objects in the target physical scene. Before the step of querying the preset scene knowledge base based on the target environment object, the method further comprises: acquiring spatial relationships among the environment objects in the semantic map; and determining membership relations among the environment objects based on the spatial relationships, and storing the membership relations in the scene knowledge base.

In an embodiment of the application, querying the preset scene knowledge base based on the target environment object to obtain the background knowledge corresponding to the target environment object comprises: querying the preset scene knowledge base based on the target environment object and determining a first preset number of query results; ranking the query results from high to low according to the semantic relevance between each query result and the target environment object; and selecting, from the ranked results, a second preset number of top-ranked query results as the background knowledge corresponding to the target environment object.

In an embodiment of the application, generating the enhancement instruction based on the background knowledge and the task instruction comprises: generating result verification information based on the background knowledge, the task instruction and observation data of the target physical scene; in
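The two-stage query described above (retrieve a first preset number of candidate results, rank them by semantic relevance to the target object, then keep a second preset number of top-ranked results) can be sketched as follows. The word-overlap score is a crude stand-in for whatever semantic relevance measure the patent contemplates, and the knowledge-base layout is assumed for illustration:

```python
def query_background_knowledge(kb, target, k1=5, k2=2):
    """Two-stage retrieval: k1 candidates, then semantic rerank, keep top k2."""
    def relevance(entry_name):
        # Stand-in for semantic relatedness: count of words shared between
        # the target phrase and the stored description.
        target_words = set(target.lower().split())
        desc_words = set(kb[entry_name]["description"].lower().split())
        return len(target_words & desc_words)

    # Stage 1: determine a first preset number (k1) of query results.
    candidates = list(kb)[:k1]
    # Stage 2: rank by relevance, high to low, and keep the top k2
    # as the background knowledge for the target environment object.
    ranked = sorted(candidates, key=relevance, reverse=True)
    return ranked[:k2]

kb = {
    "mug":    {"description": "a mug for coffee on the desk"},
    "kettle": {"description": "boils water for coffee or tea"},
    "sofa":   {"description": "a place to sit"},
}
print(query_background_knowledge(kb, "coffee mug"))  # → ['mug', 'kettle']
```

Limiting the final answer to the top-ranked subset keeps the enhancement instruction short, so the planning model sees only knowledge relevant to the instruction rather than the whole scene.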