CN-122020150-A - Method and device for generating training data of robot decision model


Abstract

The invention discloses a method and a device for generating training data of a robot decision model, and belongs to the technical field of computer vision. The method includes: dividing a task to be executed by a robot into a plurality of actions; determining a decision frame of a current action and a next action; generating a data pair of the current action according to the decision frame of the current action and the next action; and splicing the plurality of data pairs according to the execution sequence of the plurality of actions to generate training data of the task to be executed for the robot decision model. The method can improve data acquisition efficiency and computational performance when training a robot task decision model.

Inventors

CHEN FENG
CHEN ANQI

Assignees

Tsinghua University (清华大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-11

Claims (10)

1. A method for generating training data of a robot decision model, characterized by comprising the following steps: dividing a task to be executed by the robot into a plurality of actions; determining a decision frame of the current action and the next action; generating a data pair of the current action according to the decision frame of the current action and the next action; and splicing the plurality of data pairs according to the execution sequence of the plurality of actions to generate training data of the task to be executed for the robot decision model.
2. The generating method according to claim 1, wherein the decision frame is used for representing the semantics of the current action, and the decision frame has a mapping relationship with the execution result of the current action.
3. The generating method according to claim 2, wherein the decision frame is the start state frame of the current action, that is, the state frame after the previous action has been performed.
4. The generating method according to claim 1, wherein dividing the task to be executed by the robot into a plurality of actions includes: generating an action sequence according to the environment state corresponding to the task to be executed, a preset action library, and the task to be executed; and dividing the task to be executed into the plurality of actions according to the action sequence.
5. The generating method according to claim 1, wherein the training data further comprises: the initial environment state in which the robot executes the task to be executed.
6. The generating method according to claim 5, wherein the training data further comprises: a state frame after the last action of the plurality of actions is executed.
7. A device for generating training data of a robot decision model, comprising: a task dividing module for dividing a task to be executed by the robot into a plurality of actions; a decision frame determining module for determining a decision frame of the current action and the next action; a data pair generation module for generating a data pair of the current action according to the decision frame of the current action and the next action; and a training data generation module for splicing the plurality of data pairs according to the execution sequence of the plurality of actions to generate training data of the task to be executed for the robot decision model.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 6 when executing the computer program.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1 to 6.
10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any of claims 1 to 6.

Description

Method and device for generating training data of robot decision model

Technical Field

The invention relates to the technical field of computer vision, and in particular to a method and a device for generating training data of a robot decision model.

Background

In the prior art, generating training data for a robot decision model often depends on a physical simulation environment. The simulation generally needs to reproduce every frame of the robot's motion, so that each step from the initial state to the target state is continuous and realistic. This approach guarantees physical consistency but incurs significant computational overhead. Especially in training scenarios that require large-scale data across many tasks and many scenes, the simulation cost often grows exponentially. The training period of the robot decision model is therefore long, which becomes a key bottleneck restricting improvement of system efficiency.

This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.

Disclosure of Invention

The embodiments of the invention provide a method and a device for generating training data of a robot decision model, which solve at least some of the technical problems in the prior art.

The specification provides a method for generating training data of a robot decision model, which comprises the following steps: dividing a task to be executed by the robot into a plurality of actions; determining a decision frame of the current action and the next action; generating a data pair of the current action according to the decision frame of the current action and the next action; and splicing the plurality of data pairs according to the execution sequence of the plurality of actions to generate training data of the task to be executed for the robot decision model.

In one embodiment, the decision frame is used for representing the semantics of the current action, and a mapping relationship exists between the decision frame and the execution result of the current action.

In one embodiment, the decision frame is the start state frame of the current action, that is, the state frame after the previous action has been performed.

In one embodiment, dividing the task to be executed by the robot into a plurality of actions includes: generating an action sequence according to the environment state corresponding to the task to be executed, a preset action library, and the task to be executed; and dividing the task to be executed into the plurality of actions according to the action sequence.

In one embodiment, the training data further comprises the initial environment state in which the robot executes the task to be executed.

In one embodiment, the training data further comprises a state frame after the last action of the plurality of actions is executed.

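The steps above can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (`Frame`, `DataPair`, `TrainingSample`, `divide_task`, `result_frame`) is hypothetical. It assumes one possible reading of the text: the data pair of an action couples that action's decision frame (its start state frame) with the decision frame of the next action, so no intermediate motion frames are needed. The toy planner that matches library actions by substring only stands in for whatever action-sequence generator an actual system would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = str  # hypothetical stand-in for an image or state snapshot


@dataclass(frozen=True)
class DataPair:
    action: str
    decision_frame: Frame       # start state frame of the current action
    next_decision_frame: Frame  # start state frame of the next action


@dataclass
class TrainingSample:
    task: str
    initial_state: Frame    # initial environment state of the task
    pairs: List[DataPair]   # data pairs spliced in execution order
    final_frame: Frame      # state frame after the last action


def divide_task(task: str, environment_state: Frame,
                action_library: Sequence[str]) -> List[str]:
    """Toy placeholder planner (assumption): keep the library actions that
    are named in the task text, ordered by where they appear. A real planner
    would also condition on environment_state, which is unused here."""
    found = [(task.find(a), a) for a in action_library if a in task]
    return [a for _, a in sorted(found)]


def generate_training_data(task: str, initial_state: Frame,
                           action_library: Sequence[str],
                           result_frame: Callable[[Frame, str], Frame],
                           ) -> TrainingSample:
    """Build one data pair per action and splice them in execution order.
    result_frame(frame, action) is a hypothetical callback returning the
    state frame after the action, without simulating the frames between."""
    actions = divide_task(task, initial_state, action_library)
    pairs: List[DataPair] = []
    frame = initial_state  # assumed decision frame of the first action
    for action in actions:
        next_frame = result_frame(frame, action)
        pairs.append(DataPair(action, frame, next_frame))
        frame = next_frame  # becomes the next action's decision frame
    return TrainingSample(task, initial_state, pairs, frame)


sample = generate_training_data(
    task="pick cup then place cup",
    initial_state="s0",
    action_library=["place cup", "pick cup", "open drawer"],
    result_frame=lambda frame, action: f"{frame}>{action}",
)
for pair in sample.pairs:
    print(pair)
```

In this sketch each action contributes exactly one pair, so the amount of data grows with the number of actions rather than with the number of simulated motion frames, which is the efficiency argument the description makes.
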
The specification also provides a device for generating training data of a robot decision model, which comprises: a task dividing module for dividing a task to be executed by the robot into a plurality of actions; a decision frame determining module for determining a decision frame of the current action and the next action; a data pair generation module for generating a data pair of the current action according to the decision frame of the current action and the next action; and a training data generation module for splicing the plurality of data pairs according to the execution sequence of the plurality of actions to generate training data of the task to be executed for the robot decision model.

In one embodiment, the task dividing module includes: an action sequence generating unit for generating an action sequence according to the environment state corresponding to the task to be executed, a preset action library, and the task to be executed; and a task dividing unit for dividing the task to be executed into the plurality of actions according to the action sequence.

The embodiments of the invention also provide a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the above method for generating robot decision model training data when executing the computer program.

The present specification also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the above method for generating robot decision model training data.

The present specification also provides a computer program product comprising a computer program which, when executed by a processor, implements the above method for generating robot decision model training data.

As can be seen from the above description, the em