CN-121998085-A - Physical state deduction-based task decomposition system and method for embodied intelligent robot


Abstract

The invention discloses a physical state deduction-based task decomposition system and method for an embodied intelligent robot, and relates to the technical field of intelligent robots. The system comprises a task instruction receiving and initial decomposing module, a state sensing and formatting input module, a task feasibility deduction module, a feedback and optimization closed-loop module, and a decision and output module. The invention builds a closed-loop technical framework of sensing, deduction and optimization, so that the robot can virtually deduce the outcome of a task sequence before executing any actual physical action and thereby identify environment state conflicts and execution path risks in advance. This converts the task execution mode from passive, trial-and-error response into proactive, reliable planning based on physical state deduction, and improves the adaptability, safety and deployment efficiency of robots executing complex tasks in open environments.

Inventors

LIU HEHUI
Yang Ganxu
ZHONG WEIKANG

Assignees

广州瞬擎智合技术有限公司

Dates

Publication Date: 20260508
Application Date: 20260119

Claims (10)

1. A physical state deduction-based task decomposition system for an embodied intelligent robot, characterized by comprising: a task instruction receiving and initial decomposing module, used for establishing an interaction channel between a user and the system and converting the user's natural language task requirement into a subtask list that the system can process; a state sensing and formatting input module, used for acquiring the robot's own state and the state information of the external environment and generating a current structured state description; a task feasibility deduction module, used for performing step-by-step physical state deduction according to the received subtask list and the current structured state description, and verifying the physical feasibility of the subtask list; a feedback and optimization closed-loop module, used for generating, when deduction fails, feedback information containing conflict reasons and modification suggestions based on the output of the task feasibility deduction module, and driving the task instruction receiving and initial decomposing module to re-optimize the subtask list; and a decision and output module, used for formatting, manual confirmation and storage for reuse of the subtask list, and for interfacing with the execution end of the robot.
2. The system of claim 1, wherein the task instruction receiving and initial decomposing module comprises: a human-machine interaction unit, used for providing a natural language or graphical interface and receiving a task instruction input by a user; and a task decomposition unit, used for filling the task instruction and the current structured state description into the corresponding prompt template, and calling an LLM to generate a subtask list conforming to a preset format.
3. The system of claim 1, wherein the state sensing and formatting input module comprises: a robot state sensing unit, used for reading the robot's internal sensor data in real time; and an environment sensing unit, used for acquiring a current environment image through a camera mounted on the robot body and calling a VLM to generate a preliminary description.
4. The system of claim 1, wherein the task feasibility deduction module comprises: an input unit, used for receiving the current structured state description and a subtask list, wherein each subtask in the subtask list comprises a precondition, a task description and a post-state change; a deduction processing unit, used for traversing the subtask list in sequence and executing the following operations on each subtask: precondition checking, namely extracting the precondition of the subtask and calling an LLM through a preset precondition-check prompt template to judge whether the precondition is consistent with the current deduction state; state updating, namely, if the precondition is satisfied, calling the LLM through a preset state-update prompt template to update the current deduction state according to the task description and the post-state change of the subtask; failure processing, namely, if the precondition is not satisfied, judging that deduction has failed and recording the identifier of the failed subtask and the unsatisfied precondition; and an output unit, used for outputting the deduction result and the final deduction state sequence.
5. The system of claim 1, wherein the feedback and optimization closed-loop module comprises: a receiving unit, used for receiving the failed subtask information, the failure reason and the state context at the time of failure; an analysis processing unit, used for calling an LLM to analyze the failure reason and generate natural language feedback for guiding re-planning; and a feedback unit, used for outputting the generated natural language feedback text for guiding re-planning.
6. The system of claim 1, wherein the decision and output module comprises: a feasible task output unit, used for formatting the subtask list that has passed deduction so that it is human-readable; and a manual editing and confirmation unit, used for displaying the formatted subtask list and supporting manual editing, review and confirmation.
7. The system of claim 6, wherein the decision and output module further comprises: a task storage unit, used for storing the valid subtask list confirmed by personnel so that it can be retrieved for subsequent similar tasks, thereby improving planning efficiency.
8. A physical state deduction-based task decomposition method for an embodied intelligent robot, characterized in that the system according to any one of claims 1-7 is applied, the method comprising the following steps: S100, system initialization and knowledge preparation: predefining and storing all prompt templates, and configuring a robot kinematic model and a sensor interface; S200, real-time state sensing and construction: synchronously acquiring an environment image and the robot's internal sensor data through the sensor interface, and calling a VLM to fuse the environment image and the internal sensor data into a current structured state description; S300, running the task instruction receiving and initial decomposing module: filling the received user task instruction and the current structured state description into the corresponding prompt template, and calling an LLM to obtain an initial subtask list; S400, closed-loop feasibility deduction and optimization: taking the current structured state description as the initial deduction state, traversing and checking the subtasks in the output subtask list one by one, and executing precondition checking and state updating; if the precondition is met, synchronously updating the current structured state description; if the precondition is not met, generating precise feedback and driving the LLM to re-optimize the subtask list until every subtask in the list passes deduction; and S500, result output and confirmation: outputting the verified subtask list and, after manual confirmation, transmitting it to the robot execution engine.
9. The method of claim 8, wherein the prompt templates include templates for environment description, task decomposition, precondition checking, state updating, and feedback generation.
10. The method of claim 8, wherein step S400 comprises the following steps: Step S401, setting the current deduction state Current_State equal to the current structured state State_Structured; Step S402, traversing each subtask S_i in the subtask list Subtask_List in sequence; Step S403, precondition checking: filling Current_State and the preconditions of S_i into the check prompt template, and calling the LLM to judge; Step S404, if the LLM returns yes, updating the state: filling Current_State and the post-state change (postconditions) of S_i into the update prompt template, calling the LLM, replacing Current_State with the resulting output, and jumping to S402 to process the next subtask; Step S405, if the LLM returns no together with the unsatisfied preconditions, the deduction fails, the feedback and optimization closed-loop module is triggered, and the feedback text Feedback is generated; Step S406, inputting the task instruction Task_Cmd, the current structured state State_Structured and the feedback text Feedback together into the task decomposition unit to request regeneration or local adjustment of the subtask list Subtask_List, thereby forming closed-loop optimization.
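
The loop of steps S401 to S406 in claim 10 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the LLM calls of S403 and S404 are replaced here by plain set operations (a STRIPS-style stand-in), the state is assumed to be a set of fact strings rather than an LLM-generated description, and the names `Subtask`, `check`, `update`, `deduce`, `plan`, `toy_decompose` and the `max_rounds` cap are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

State = FrozenSet[str]  # assumption: state is a set of facts, not free text


@dataclass(frozen=True)
class Subtask:
    """One subtask: description, preconditions and post-state change."""
    description: str
    preconditions: FrozenSet[str]
    adds: FrozenSet[str] = frozenset()
    removes: FrozenSet[str] = frozenset()


def check(state: State, s: Subtask) -> Tuple[bool, FrozenSet[str]]:
    """Stand-in for the S403 LLM check: returns (ok, unmet preconditions)."""
    unmet = s.preconditions - state
    return (not unmet, unmet)


def update(state: State, s: Subtask) -> State:
    """Stand-in for the S404 LLM state update."""
    return (state - s.removes) | s.adds


def deduce(state_structured: State, subtask_list: List[Subtask]):
    """S401-S405: returns (ok, failed_index, unmet, state reached)."""
    current_state = state_structured                  # S401
    for i, s in enumerate(subtask_list):              # S402
        ok, unmet = check(current_state, s)           # S403
        if not ok:                                    # S405
            return False, i, unmet, current_state
        current_state = update(current_state, s)      # S404
    return True, None, frozenset(), current_state


def plan(task_cmd, state_structured, decompose, max_rounds=3):
    """S406 loop; max_rounds is an assumed safety cap, not in the source."""
    feedback = None
    for _ in range(max_rounds):
        subtask_list = decompose(task_cmd, state_structured, feedback)
        ok, i, unmet, _ = deduce(state_structured, subtask_list)
        if ok:
            return subtask_list
        feedback = (f"subtask {i} ({subtask_list[i].description}) "
                    f"lacks: {sorted(unmet)}")
    return None


# Toy scenario: the gripper is occupied, so opening the fridge must fail first.
OPEN = Subtask("open fridge", frozenset({"gripper_free", "fridge_closed"}),
               adds=frozenset({"fridge_open"}),
               removes=frozenset({"fridge_closed"}))
GRASP = Subtask("grasp cola", frozenset({"gripper_free", "fridge_open"}),
                adds=frozenset({"holding_cola"}),
                removes=frozenset({"gripper_free"}))
PUT_DOWN = Subtask("put down cup", frozenset({"holding_cup"}),
                   adds=frozenset({"gripper_free"}),
                   removes=frozenset({"holding_cup"}))


def toy_decompose(task_cmd, state, feedback):
    """Hypothetical decomposition unit: repairs the plan once feedback exists."""
    return [OPEN, GRASP] if feedback is None else [PUT_DOWN, OPEN, GRASP]


initial = frozenset({"holding_cup", "fridge_closed"})
result = plan("fetch the cola", initial, toy_decompose)
print([s.description for s in result])
```

In this toy run the first subtask list fails at "open fridge" because the fact `gripper_free` is missing, the feedback text triggers a second decomposition that first puts down the cup, and that list passes deduction. In the claimed system each of `check`, `update` and `toy_decompose` would instead be an LLM call driven by the corresponding prompt template.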

Description

Physical state deduction-based task decomposition system and method for an embodied intelligent robot

Technical Field

The invention relates to the technical field of artificial intelligence robots, in particular to a physical state deduction-based task decomposition system and method for an embodied intelligent robot.

Background

With the development of artificial intelligence and Large Language Model (LLM) technology, using an LLM to understand and decompose high-level natural language instructions into a task sequence executable by a robot has become a frontier direction in robot autonomy. However, reliably applying the powerful semantic understanding of LLMs to robots in the physical world still faces fundamental challenges. Existing technical solutions mainly have the following problems and shortcomings:

1. The "physical hallucination" problem of LLMs. Existing methods generally rely directly on an LLM for task decomposition. An LLM is essentially a language model whose training data and reasoning process severely lack a grounded understanding of basic laws of the physical world, such as spatial geometry, object dynamics and resource exclusivity. As a result, the subtask lists it produces are often logically reasonable but physically infeasible. For example, an LLM may plan steps that violate physical common sense, such as "unscrew without tools" or "open the refrigerator door while both arms are occupied holding something". This "physical hallucination" means the decomposition result cannot be used directly to drive a real robot.

2. Open-loop decomposition lacks a verification mechanism. The current mainstream approach is an open-loop mode of "decompose once, execute directly". The system has no automatic feasibility verification step, based on physical common sense, for the decomposition result output by the LLM. Correctness depends heavily on prompt engineering and on chance in the LLM's output, so reliability cannot be guaranteed. Once a physical contradiction appears in the decomposition, it is exposed only at the actual robot execution stage, which may cause task failure, equipment damage or safety accidents, with high risk and high debugging cost.

3. Lack of a structured representation and reasoning method for physical states. For a machine to verify the physical feasibility of a task, the states of the robot and the environment must be described in a structured way, and how actions change these states must be modeled. Existing methods lack a physical state representation that is aligned with the robot's low-level control model and supports reasoning. They typically rely on unstructured natural language generated by large models to describe the state, e.g. "robot near table". Such descriptions are semantically vague and cannot be linked to precise geometric and physical models such as the robot kinematic model or the environment map, so the system cannot automatically and rigorously deduce and verify the geometric feasibility (e.g. reachability, collision) or the physical feasibility (e.g. overweight, stability) of actions such as moving to a location or grasping an object.

Therefore, how to construct a task representation that can be understood by an LLM and also supports low-level physical and automatic reasoning is a problem to be solved by those skilled in the art.

Disclosure of Invention

In view of the above, the present invention is proposed to provide a physical state deduction-based task decomposition system and method for an embodied intelligent robot that overcomes or at least partially solves the above problems.
In order to achieve the above purpose, the present invention adopts the following technical scheme: in a first aspect, an embodiment of the present invention provides a physical state deduction-based task decomposition system for an embodied intelligent robot, comprising: a task instruction receiving and initial decomposing module, used for establishing an interaction channel between a user and the system and converting the user's natural language task requirement into a subtask list that the system can process; a state sensing and formatting input module, used for acquiring the robot's own state and the state information of the external environment and generating a current structured state description; a task feasibility deduction module, used for performing step-by-step physical state deduction according to the received subtask list and the current structured state description, and verifying the physical feasibility of the subtask list; a feedback and optimization closed-loop module, used for generating, when deduction fails, feedback information containing conflict reasons and modification suggestions based on the output of the task feasibility deduction module, and driving the