CN-121997480-A - Robot body autonomous generation method and system based on artificial intelligence

Abstract

The invention provides an artificial intelligence-based method and system for autonomously generating a robot body. The method comprises: text-encoding a text description of a robot structure to generate a text embedding; using a component generator to predict and generate all components, including component types and attributes, over multiple iterations based on the text embedding; using a connection relation predictor to predict connection probabilities among the components and screening the high-probability connections to form a component connection relation; deriving the parent-child connection relation among the components from the generated component types and the component connection relation, connecting each component to its parent component through the component's joint, and assigning the attributes to form a URDF model of the whole robot, in which each component is represented by a simple geometric body defined by the component's shape and size attributes; and replacing each simple geometric body using a mesh library, so that a three-dimensional solid model of the whole robot is formed once all component models have been replaced. The invention can thus generate a robot body automatically.

Inventors

Kong Xinhao
Liu Yang
Yan Yan
Gan Shuai

Assignees

北京唯实具身智能研究院

Dates

Publication Date: 20260508
Application Date: 20251216

Claims (10)

1. An artificial intelligence-based method for autonomously generating a robot body, characterized by comprising the following steps: S1, text-encoding a text description of a robot structure to generate a text embedding; S2, based on the generated text embedding, using a component generator to predict and generate all components over multiple iterations, each component comprising a component type and attributes; S3, predicting connection probabilities among the components using a connection relation predictor, and screening the high-probability connections to form a component connection relation; S4, deriving the parent-child connection relation among the components from the generated component types and the component connection relation, connecting each component to its parent component through the component's joint, and assigning the attributes to form a URDF model of the whole robot, wherein each component in the URDF model is represented by a simple geometric body defined by the shape attribute and the size attribute of the component; S5, replacing each simple geometric body using a mesh library, and forming a three-dimensional solid model of the whole robot after all component models have been replaced.
2. The method of claim 1, wherein the component generator is a fully-connected network; in the first iteration, the input is the text embedding and an encoding of the existing component information, the existing component information initially being empty, and the output is the component type probability, joint type probability, axial direction vector, position vector, shape type probability and size vector of the first component together with the probability of adding a new component, from which the component of the first iteration, including its component type and attributes, is generated; the probability of adding a new component output by the first iteration is compared with a preset threshold, and a second iteration is performed when that probability is greater than the preset threshold, wherein the second iteration takes the encoding of the component generated by the first iteration, together with the text embedding, as the network input, and outputs the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component of the second iteration, from which the component of the second iteration, including its component type and attributes, is generated; the probability of adding a new component output by the second iteration is in turn compared with the preset threshold, and so on, so that all components, including their component types and attributes, are finally generated through multiple iterations.
3. The method according to claim 1, wherein the connection relation predictor is a graph neural network (GNN); all input components and their attributes are encoded into vectors, and the connection relation is initialized, according to the number of components $n$, as a vector of dimension $(n-1)+(n-2)+(n-3)+\dots+1$; the GNN outputs the predicted connection probabilities as a vector of the same dimension, $(n-1)+(n-2)+(n-3)+\dots+1$, and the high-probability connections are screened to form the component connection relation.
4. The method of claim 1, wherein the training process of the component generator comprises: each group of training data in the training set comprises a text description and component information, the component information comprising component type, joint type, axial direction, position, shape, size and component connection relation; each group of training data in the training set is converted as follows: the text description is encoded as a text embedding; the component information is encoded, including the component type, joint type, axial direction, position, shape and size of each component currently contained; the outputs are the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component; each group of training data in the training set is converted into this input and output format, with probabilities expressed as 1 or 0; the loss function has three parts, namely a classification loss on component types using cross-entropy loss, a regression loss on component attributes using mean squared error loss, and a loss on the probability of adding a new component using binary cross-entropy loss, the total loss being the sum of the three; and the component generator is trained over multiple iterations of loading data, forward propagation, loss calculation, backpropagation and optimization.
5. The method of claim 1, wherein the training process of the connection relation predictor comprises: each group of training data in the training set comprises a text description and component information, the component information comprising component type, joint type, axial direction, position, shape, size and component connection relation; the training data is converted into the following format: the node features comprise the type and attributes of each component, the type of each component being represented as a one-hot vector and the attributes as a numerical vector; the loss function is a binary cross-entropy loss that measures the difference between the connection probabilities predicted by the model and the actual connection labels; and the connection relation predictor is trained over multiple iterations of loading data, forward propagation, loss calculation, backpropagation and optimization.
6. An artificial intelligence-based system for autonomously generating a robot body, the system comprising: a text encoding module, configured to text-encode a text description of a robot structure to generate a text embedding; a component generation module, configured to use a component generator to predict and generate all components over multiple iterations based on the generated text embedding, each component comprising a component type and attributes; a connection relation prediction module, configured to predict connection probabilities among the components using a connection relation predictor and to screen the high-probability connections to form a component connection relation; a URDF model forming module, configured to derive the parent-child connection relation among the components from the generated component types and the component connection relation, connect each component to its parent component through the component's joint, and assign the attributes to form a URDF model of the whole robot, wherein each component in the URDF model is represented by a simple geometric body defined by the shape attribute and the size attribute of the component; and a three-dimensional solid model forming module, configured to replace each simple geometric body using a mesh library, a three-dimensional solid model of the whole robot being formed after all component models have been replaced.
7. The system of claim 6, wherein the component generator is a fully-connected network; in the first iteration, the input is the text embedding and an encoding of the existing component information, the existing component information initially being empty, and the output is the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component, from which the component of the first iteration, including its component type and attributes, is generated; the probability of adding a new component output by the first iteration is compared with a preset threshold, and a second iteration is performed when that probability is greater than the preset threshold, wherein the second iteration takes the encoding of the component generated by the first iteration, together with the text embedding, as the network input, and outputs the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component of the second iteration, from which the component of the second iteration, including its component type and attributes, is generated; the probability of adding a new component output by the second iteration is in turn compared with the preset threshold, and so on, so that all components, including their component types and attributes, are finally generated through multiple iterations.
8. The system of claim 7, wherein the connection relation predictor is a graph neural network (GNN); all input components and their attributes are encoded into vectors, and the connection relation is initialized, according to the number of components $n$, as a vector of dimension $(n-1)+(n-2)+(n-3)+\dots+1$; the GNN outputs the predicted connection probabilities as a vector of the same dimension, $(n-1)+(n-2)+(n-3)+\dots+1$, and the high-probability connections are screened to form the component connection relation.
9. The system of claim 6, wherein the training process of the component generator comprises: each group of training data in the training set comprises a text description and component information, the component information comprising component type, joint type, axial direction, position, shape, size and component connection relation; each group of training data in the training set is converted as follows: the text description is encoded as a text embedding; the component information is encoded, including the component type, joint type, axial direction, position, shape and size of each component currently contained; the outputs are the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component; each group of training data in the training set is converted into this input and output format, with probabilities expressed as 1 or 0; the loss function has three parts, namely a classification loss on component types using cross-entropy loss, a regression loss on component attributes using mean squared error loss, and a loss on the probability of adding a new component using binary cross-entropy loss, the total loss being the sum of the three; and the component generator is trained over multiple iterations of loading data, forward propagation, loss calculation, backpropagation and optimization.
10. The system of claim 6, wherein the training process of the connection relation predictor comprises: each group of training data in the training set comprises a text description and component information, the component information comprising component type, joint type, axial direction, position, shape, size and component connection relation; the training data is converted into the following format: the node features comprise the type and attributes of each component, the type of each component being represented as a one-hot vector and the attributes as a numerical vector; the loss function is a binary cross-entropy loss that measures the difference between the connection probabilities predicted by the model and the actual connection labels; and the connection relation predictor is trained over multiple iterations of loading data, forward propagation, loss calculation, backpropagation and optimization.
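As an illustration of the connection vector in claims 3 and 8: a vector of dimension (n-1)+(n-2)+...+1 has n(n-1)/2 entries, one per unordered pair of components. The Python sketch below shows one possible way to map such a vector back to component pairs and keep the high-probability ones. The pair ordering (all pairs of component 0 first, then component 1, and so on), the function names `pair_index` and `select_connections`, and the 0.5 threshold are assumptions made for illustration; the claims do not specify them.

```python
from itertools import combinations


def pair_index(n: int) -> list[tuple[int, int]]:
    """List the unordered component pairs (i, j) with i < j.

    Assumed ordering: the first n-1 entries pair component 0 with the rest,
    the next n-2 entries pair component 1 with the rest, and so on, which
    gives (n-1)+(n-2)+...+1 entries in total.
    """
    return list(combinations(range(n), 2))


def select_connections(
    probs: list[float], n: int, threshold: float = 0.5
) -> list[tuple[int, int]]:
    """Keep the pairs whose predicted connection probability exceeds the threshold.

    `threshold` is a hypothetical value; the source only says that
    high-probability connections are screened.
    """
    pairs = pair_index(n)
    if len(probs) != len(pairs):
        raise ValueError(f"expected {len(pairs)} probabilities, got {len(probs)}")
    return [pair for pair, p in zip(pairs, probs) if p > threshold]


# Four components give 3 + 2 + 1 = 6 candidate connections.
predicted = [0.9, 0.1, 0.2, 0.8, 0.3, 0.7]
connections = select_connections(predicted, n=4)
```

In this toy input the surviving pairs form a chain 0-1-2-3; a real predictor would supply `predicted` from the GNN output.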

Description

Robot body autonomous generation method and system based on artificial intelligence

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to an artificial intelligence-based method and system for autonomously generating a robot body.

Background

Traditional robot design often relies on engineers' experience and manual work, requiring long periods of manual modeling, optimization and tuning.

Disclosure of Invention

To solve the technical problems in the prior art, the invention provides an artificial intelligence-based method and system for autonomously generating a robot body. The technical scheme is as follows:

In one aspect, an artificial intelligence-based method for autonomously generating a robot body is provided, the method comprising:
S1, text-encoding a text description of a robot structure to generate a text embedding;
S2, based on the generated text embedding, using a component generator to predict and generate all components over multiple iterations, each component comprising a component type and attributes;
S3, predicting connection probabilities among the components using a connection relation predictor, and screening the high-probability connections to form a component connection relation;
S4, deriving the parent-child connection relation among the components from the generated component types and the component connection relation, connecting each component to its parent component through the component's joint, and assigning the attributes to form a URDF model of the whole robot, wherein each component in the URDF model is represented by a simple geometric body defined by the shape attribute and the size attribute of the component;
S5, replacing each simple geometric body using a mesh library, and forming a three-dimensional solid model of the whole robot after all component models have been replaced.
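Step S4 can be pictured with a minimal Python sketch that assembles a URDF document from a list of components and a parent-child map, using only the standard library. The `Component` fields, the `build_urdf` function, the joint naming scheme and the demo robot are hypothetical; the sketch also assumes the parent-child relation has already been derived and emits only the visual geometry of each link.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical container for one generated component and its attributes."""
    name: str
    shape: str                      # "box", "cylinder" or "sphere"
    size: tuple                     # box: (x, y, z); cylinder: (radius, length); sphere: (radius,)
    joint_type: str = "fixed"       # URDF joint type linking this component to its parent
    axis: tuple = (0.0, 0.0, 1.0)   # joint axis direction
    position: tuple = (0.0, 0.0, 0.0)  # joint origin relative to the parent


def _fmt(values) -> str:
    """Format a numeric tuple as a space-separated URDF attribute value."""
    return " ".join(f"{v:g}" for v in values)


def _add_geometry(visual: ET.Element, comp: Component) -> None:
    """Attach the simple geometric body given by the shape and size attributes."""
    geom = ET.SubElement(visual, "geometry")
    if comp.shape == "box":
        ET.SubElement(geom, "box", size=_fmt(comp.size))
    elif comp.shape == "cylinder":
        ET.SubElement(geom, "cylinder", radius=f"{comp.size[0]:g}", length=f"{comp.size[1]:g}")
    elif comp.shape == "sphere":
        ET.SubElement(geom, "sphere", radius=f"{comp.size[0]:g}")
    else:
        raise ValueError(f"unsupported shape: {comp.shape}")


def build_urdf(robot_name: str, components: list, parent_of: dict) -> str:
    """Create one link per component and one joint per parent-child pair."""
    robot = ET.Element("robot", name=robot_name)
    for comp in components:
        link = ET.SubElement(robot, "link", name=comp.name)
        _add_geometry(ET.SubElement(link, "visual"), comp)
    for comp in components:
        parent = parent_of.get(comp.name)
        if parent is None:
            continue  # the root component has no joint to a parent
        joint = ET.SubElement(robot, "joint", name=f"{parent}_to_{comp.name}", type=comp.joint_type)
        ET.SubElement(joint, "parent", link=parent)
        ET.SubElement(joint, "child", link=comp.name)
        ET.SubElement(joint, "origin", xyz=_fmt(comp.position))
        if comp.joint_type != "fixed":
            ET.SubElement(joint, "axis", xyz=_fmt(comp.axis))
    return ET.tostring(robot, encoding="unicode")


# Demo: a box base with one continuously rotating cylindrical wheel.
parts = [
    Component("base", "box", (0.4, 0.3, 0.1)),
    Component("wheel", "cylinder", (0.05, 0.02), "continuous", (0, 1, 0), (0.15, 0.16, 0)),
]
urdf = build_urdf("demo_robot", parts, {"wheel": "base"})
```

Joint types such as revolute or prismatic additionally require a limit element in URDF, which this sketch omits. Step S5 would then swap each primitive geometry element for a mesh element that references a file from the mesh library.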
Optionally, the component generator is a fully-connected network; in the first iteration, the input is the text embedding and an encoding of the existing component information, the existing component information initially being empty, and the output is the component type probability, joint type probability, axial direction vector, position vector, shape type probability and size vector of the first component together with the probability of adding a new component, from which the component of the first iteration, including its component type and attributes, is generated; the probability of adding a new component output by the first iteration is compared with a preset threshold, and a second iteration is performed when that probability is greater than the preset threshold, wherein the second iteration takes the encoding of the component generated by the first iteration, together with the text embedding, as the network input, and outputs the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component of the second iteration, from which the component of the second iteration, including its component type and attributes, is generated; the probability of adding a new component output by the second iteration is in turn compared with the preset threshold, and so on, so that all components, including their component types and attributes, are finally generated through multiple iterations.
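The iteration described above can be sketched as a simple loop in Python. The network itself is replaced here by a stand-in callable, `toy_generator`, that returns a component and a probability of adding a new component; the names `generate_components` and `toy_generator`, the 0.5 threshold and the `max_components` safety cap are assumptions for illustration and do not come from the source.

```python
from typing import Callable, Sequence

# A generator maps (text embedding, existing components) to
# (next component, probability of adding a further component).
Generator = Callable[[Sequence[float], list], tuple]


def generate_components(
    text_embedding: Sequence[float],
    generator: Generator,
    threshold: float = 0.5,     # hypothetical preset threshold
    max_components: int = 32,   # hypothetical cap to guarantee termination
) -> list:
    """Generate components one per iteration until the add-probability drops."""
    components: list = []       # the existing component information starts empty
    while len(components) < max_components:
        component, p_add = generator(text_embedding, components)
        components.append(component)
        if p_add <= threshold:  # continue only when the probability exceeds the threshold
            break
    return components


def toy_generator(text_embedding: Sequence[float], existing: list) -> tuple:
    """Stand-in for the fully-connected network: emits a fixed three-part plan."""
    plan = ["base", "arm", "gripper"]
    idx = len(existing)
    component = {"type": plan[idx], "joint_type": "fixed", "shape": "box", "size": (0.1, 0.1, 0.1)}
    p_add = 0.9 if idx < len(plan) - 1 else 0.1
    return component, p_add


parts = generate_components([0.0, 0.0], toy_generator)
```

A trained network would additionally need its raw outputs decoded, for example by taking the most probable component type, joint type and shape type, before the component is appended and re-encoded as input for the next iteration.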
Optionally, the connection relation predictor is a graph neural network (GNN); all input components and their attributes are encoded into vectors, and the connection relation is initialized, according to the number of components $n$, as a vector of dimension $(n-1)+(n-2)+(n-3)+\dots+1$; the GNN outputs the predicted connection probabilities as a vector of the same dimension, $(n-1)+(n-2)+(n-3)+\dots+1$, and the high-probability connections are screened to form the component connection relation.

Optionally, the training process of the component generator includes: each group of training data in the training set comprises a text description and component information, the component information comprising component type, joint type, axial direction, position, shape, size and component connection relation; each group of training data in the training set is converted as follows: the text description is encoded as a text embedding; the component information is encoded, including the component type, joint type, axial direction, position, shape and size of each component currently contained; the outputs are the component type probability, joint type probability, axial direction vector, position vector, shape type probability, size vector and probability of adding a new component; each group of training data in the training set is converted into this input and output format, wherein the probability is expressed b