CN-121981153-A - Method and device for generating teaching annotation data for an embodied agent, and electronic device


Abstract

The invention provides a method and a device for generating teaching annotation data for an embodied agent, and an electronic device. The method comprises: acquiring raw simulation data and acquiring an operation task of a target embodied agent; obtaining, based on the raw simulation data, an action semantic annotation and a scene semantic annotation matching the raw simulation data; decomposing the operation task of the target embodied agent into a plurality of operation subtasks, and annotating each operation subtask to obtain a subtask set annotation matching the operation task; and obtaining, based on the action semantic annotation and/or the scene semantic annotation and/or the subtask set annotation, teaching annotation data for the target embodied agent executing the operation task. The teaching annotation data is thereby generated automatically and efficiently.

Inventors

LIU XUECHENG
XIE SHAOXUAN
YAO GUOCAI
NI ZIQIANG

Assignees

北京智源人工智能研究院 (Beijing Academy of Artificial Intelligence)

Dates

Publication Date: 20260505
Application Date: 20251224

Claims (10)

1. A method for generating teaching annotation data for an embodied agent, the method comprising: acquiring raw simulation data and acquiring an operation task of a target embodied agent, wherein the raw simulation data is operation trajectory data obtained by simulating the target embodied agent executing the operation task; obtaining, based on the raw simulation data, an action semantic annotation and a scene semantic annotation matching the raw simulation data; decomposing the operation task of the target embodied agent into a plurality of operation subtasks, and annotating each operation subtask to obtain a subtask set annotation matching the operation task; and obtaining, based on the action semantic annotation, the scene semantic annotation and/or the subtask set annotation, teaching annotation data for the target embodied agent executing the operation task.
2. The method for generating teaching annotation data for an embodied agent according to claim 1, wherein the raw simulation data includes a simulated operation trajectory of the target embodied agent executing the operation task, and the obtaining, based on the raw simulation data, an action semantic annotation matching the raw simulation data comprises: extracting operation action features of the target embodied agent based on the simulated operation trajectory; structuring the operation action features to obtain structured action features; and obtaining the action semantic annotation matching the raw simulation data based on the structured action features.
3. The method for generating teaching annotation data for an embodied agent according to claim 1, wherein the raw simulation data includes simulation scene metadata of the target embodied agent executing the operation task, and the obtaining, based on the raw simulation data, a scene semantic annotation matching the raw simulation data comprises: extracting scene description information of the simulation scene based on the simulation scene metadata; vectorizing the scene description information to obtain a vectorized scene description; and obtaining the scene semantic annotation matching the raw simulation data based on the vectorized scene description.
4. The method for generating teaching annotation data for an embodied agent according to claim 1, wherein before the annotating each operation subtask to obtain a subtask set annotation matching the operation task, the method further comprises: acquiring a teaching video of the target embodied agent executing the operation task; decomposing the teaching video according to the operation subtasks to obtain a plurality of groups of teaching video frames; and matching and mapping the teaching video frames to the operation subtasks to obtain mapping combinations of the teaching video frames and the operation subtasks; and wherein the annotating each operation subtask to obtain a subtask set annotation matching the operation task comprises: annotating each operation subtask based on the mapping combinations to obtain the subtask set annotation matching the operation task.
5. The method for generating teaching annotation data for an embodied agent according to any one of claims 1 to 4, wherein the method further comprises: setting version numbers for the raw simulation data, the action semantic annotation, the scene semantic annotation and the subtask set annotation, respectively; and, when the version number of the raw simulation data is updated, performing teaching data annotation again based on the raw simulation data with the updated version number, to obtain an action semantic annotation with an updated version number, a scene semantic annotation with an updated version number and a subtask set annotation with an updated version number.
6. The method for generating teaching annotation data for an embodied agent according to any one of claims 1 to 4, wherein the raw simulation data is acquired by: obtaining a pre-configured mapping table, wherein the mapping table records correspondences between different embodied agents and different simulation configuration parameters, the simulation configuration parameters being configuration parameters used to simulate an embodied agent executing the operation task; determining, based on the target embodied agent and the correspondences in the mapping table, target simulation configuration parameters for simulating the target embodied agent executing the operation task; and running a simulation according to the target simulation configuration parameters to obtain the raw simulation data of the target embodied agent executing the operation task.
7. The method for generating teaching annotation data for an embodied agent according to claim 1, wherein after the acquiring raw simulation data and acquiring an operation task of a target embodied agent, the method further comprises: acquiring computing power levels of different computing nodes; wherein the obtaining, based on the raw simulation data, an action semantic annotation and a scene semantic annotation matching the raw simulation data comprises: predicting a first computing power demand for performing action semantic annotation and a second computing power demand for performing scene semantic annotation, respectively; screening a plurality of computing nodes, based on the first computing power demand and the computing power levels of the different computing nodes, to obtain a first computing node matching the first computing power demand, and screening the plurality of computing nodes to obtain a second computing node matching the second computing power demand; and obtaining, by the first computing node and based on the raw simulation data, the action semantic annotation matching the raw simulation data, and obtaining, by the second computing node and based on the raw simulation data, the scene semantic annotation matching the raw simulation data; and wherein the annotating each operation subtask to obtain a subtask set annotation matching the operation task comprises: predicting a third computing power demand for performing subtask set annotation; screening the plurality of computing nodes, based on the third computing power demand and the computing power levels of the different computing nodes, to obtain a third computing node matching the third computing power demand; and annotating each operation subtask by the third computing node to obtain the subtask set annotation matching the operation task.
8. A device for generating teaching annotation data for an embodied agent, the device comprising: an acquisition module configured to acquire raw simulation data and to acquire an operation task of a target embodied agent, wherein the raw simulation data is operation trajectory data obtained by simulating the target embodied agent executing the operation task; a processing module configured to obtain, based on the raw simulation data, an action semantic annotation and a scene semantic annotation matching the raw simulation data; a decomposition module configured to decompose the operation task of the target embodied agent into a plurality of operation subtasks and to annotate each operation subtask to obtain a subtask set annotation matching the operation task; and a generating module configured to obtain, based on the action semantic annotation, the scene semantic annotation and/or the subtask set annotation, teaching annotation data for the target embodied agent executing the operation task.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for generating teaching annotation data for an embodied agent according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method for generating teaching annotation data for an embodied agent according to any one of claims 1 to 7.
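The node screening recited in claim 7 can be illustrated with a minimal sketch. The claims only require that the selected node "match" the predicted demand; the best-fit rule used here (the weakest node that still covers the demand), the `ComputeNode` type, the node names and the numeric power levels are all assumptions made for illustration and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeNode:
    name: str
    power_level: float  # hypothetical unit; the source does not define one


def select_node(nodes, demand):
    """Return the weakest node whose power level still covers `demand`.

    Best-fit matching is an assumed rule, not one stated in the claims.
    """
    candidates = [n for n in nodes if n.power_level >= demand]
    if not candidates:
        raise ValueError(f"no node satisfies demand {demand}")
    return min(candidates, key=lambda n: n.power_level)


def assign_nodes(nodes, demands):
    """Map each annotation stage to a node based on its predicted demand."""
    return {stage: select_node(nodes, d) for stage, d in demands.items()}


# Hypothetical node pool and predicted demands for the three annotation stages.
nodes = [
    ComputeNode("cpu-small", 2.0),
    ComputeNode("gpu-mid", 20.0),
    ComputeNode("gpu-large", 80.0),
]
demands = {"action": 15.0, "scene": 1.5, "subtask": 40.0}

assignment = assign_nodes(nodes, demands)
for stage, node in assignment.items():
    print(stage, "->", node.name)
```

Under this assumed rule, each stage is placed on the smallest node that can serve it, which keeps larger nodes free for heavier stages. Other matching rules (for example, load-aware scheduling) would satisfy the claim language equally well.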

Description

Method and device for generating teaching annotation data for an embodied agent, and electronic device

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a method and a device for generating teaching annotation data for an embodied agent, and an electronic device.

Background

In the training of embodied intelligence models, high-quality, structured teaching annotation data is a key foundation for improving the generalization ability of a policy. At present, an embodied intelligence model needs to learn complex action policies from multi-modal interaction data, and the accuracy and efficiency of data annotation directly affect the upper limit of the model's performance.

In the related art, annotation is generally performed manually, frame by frame. For example, for video streams or motion trajectory data, annotators must manually identify and label the action type for each image frame or trajectory point. As a result, generating teaching annotation data suffers from low efficiency, high labor cost and inconsistent annotation standards.

Finding a method for generating teaching annotation data automatically and efficiently has therefore become a current research focus.

Disclosure of Invention

The invention provides a method and a device for generating teaching annotation data for an embodied agent, and an electronic device, which generate the teaching annotation data automatically and efficiently.
The invention provides a method for generating teaching annotation data for an embodied agent, comprising: acquiring raw simulation data and acquiring an operation task of a target embodied agent, wherein the raw simulation data is operation trajectory data obtained by simulating the target embodied agent executing the operation task; obtaining, based on the raw simulation data, an action semantic annotation and a scene semantic annotation matching the raw simulation data; decomposing the operation task of the target embodied agent into a plurality of operation subtasks, and annotating each operation subtask to obtain a subtask set annotation matching the operation task; and obtaining, based on the action semantic annotation and/or the scene semantic annotation and/or the subtask set annotation, teaching annotation data for the target embodied agent executing the operation task.
According to the method for generating teaching annotation data for an embodied agent provided by the invention, the raw simulation data includes a simulated operation trajectory of the target embodied agent executing the operation task, and obtaining the action semantic annotation matching the raw simulation data based on the raw simulation data comprises: extracting operation action features of the target embodied agent based on the simulated operation trajectory; structuring the operation action features to obtain structured action features; and obtaining the action semantic annotation matching the raw simulation data based on the structured action features.
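The action annotation steps above (extract features from the trajectory, structure them, derive semantic annotations) and the final "and/or" combination can be sketched as follows. This is a minimal illustration under stated assumptions: the `TrajectoryPoint` fields, the segmentation heuristic (split wherever the gripper state changes), and the labels `approach` and `carry` are hypothetical and are not specified in the source.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrajectoryPoint:
    t: float              # timestamp in seconds (assumed field)
    position: tuple       # (x, y, z) end-effector position (assumed field)
    gripper_closed: bool  # gripper state (assumed field)


def extract_action_features(trajectory):
    """Split the trajectory into segments wherever the gripper state changes.

    This segmentation heuristic is an assumption for illustration only.
    """
    segments, start = [], 0
    for i in range(1, len(trajectory) + 1):
        at_end = i == len(trajectory)
        if at_end or trajectory[i].gripper_closed != trajectory[start].gripper_closed:
            segments.append(trajectory[start:i])
            start = i
    return segments


def structure_features(segments):
    """Turn raw segments into uniform records (the structuring step)."""
    return [
        {
            "t_start": seg[0].t,
            "t_end": seg[-1].t,
            "gripper_closed": seg[0].gripper_closed,
            "distance": math.dist(seg[0].position, seg[-1].position),
        }
        for seg in segments
    ]


def annotate_actions(records):
    """Attach a semantic label to each structured record (hypothetical labels)."""
    return [
        {**r, "label": "carry" if r["gripper_closed"] else "approach"}
        for r in records
    ]


def build_teaching_annotation(action=None, scene=None, subtasks=None):
    """Combine whichever annotation sets are available (the "and/or" in the text)."""
    parts = {"action": action, "scene": scene, "subtasks": subtasks}
    return {k: v for k, v in parts.items() if v is not None}


# Hypothetical four-point trajectory: move with open gripper, then closed.
trajectory = [
    TrajectoryPoint(0.0, (0.0, 0.0, 0.0), False),
    TrajectoryPoint(1.0, (0.3, 0.0, 0.0), False),
    TrajectoryPoint(2.0, (0.3, 0.0, 0.0), True),
    TrajectoryPoint(3.0, (0.3, 0.4, 0.0), True),
]
actions = annotate_actions(structure_features(extract_action_features(trajectory)))
annotation = build_teaching_annotation(
    action=actions, subtasks=["reach object", "grasp and move"]
)
```

Here `build_teaching_annotation` simply omits any annotation set that was not supplied, which is one plausible reading of the "and/or" wording; how the annotation sets are actually merged is not detailed in this part of the text.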
According to the method for generating teaching annotation data for an embodied agent provided by the invention, the raw simulation data includes simulation scene metadata of the target embodied agent executing the operation task, and obtaining the scene semantic annotation matching the raw simulation data based on the raw simulation data comprises: extracting scene description information of the simulation scene based on the simulation scene metadata; vectorizing the scene description information to obtain a vectorized scene description; and obtaining the scene semantic annotation matching the raw simulation data based on the vectorized scene description.
According to the method for generating teaching annotation data for an embodied agent provided by the invention, before each operation subtask is annotated to obtain the subtask set annotation matching the operation task, the method further comprises: acquiring a teaching video of the target embodied agent executing the operation task; decomposing the teaching video according to the operation subtasks to obtain a plurality of groups of teaching video frames; and matching and mapping the teaching video frames to the operation subtasks to obtain mapping combinations of the teaching video frames and the operation subtasks; each operation subtask is then annotated to obtain the subtask set annotation matching the operation task.
According to the method for generating teaching annotation data for an embodied agent provided by the invention, the method further comprises: setting version numbers for the raw simulation data, the action semantic annotation, the scene semantic annotation and the subtask set annotation, respectively; and, when an update of the version number of the raw simulation data is detected, performing teaching data annotation again based on the raw simulation data with the updated version number, so as to obtain the action semantic annotation updated by the version number, the scene semantic annotation updated by