CN-116009537-B - Multi-robot navigation method for cooperative transportation of large-scale components

CN116009537B

Abstract

The invention discloses a multi-robot navigation method for the cooperative transportation of large-scale components, comprising the steps of: S1, obtaining the starting point position S0=(px0,py0) and end point position Sg=(gx,gy) of the multi-robot motion and the obstacle information So in the motion scene; S2, designing the formation of the multi-robot formation according to the shape of the object to be transported, and obtaining the relative position constraints among the robots; S3, establishing a deep neural network whose input is the state of the multi-robot formation and whose output is the execution action of the multi-robot formation; S4, training the deep neural network with the PPO algorithm; S5, inputting the state of the multi-robot formation into the trained Actor network, obtaining the action of each step, and thereby obtaining the navigation path of the multi-robot formation from the starting point to the end point. The method can obtain the navigation path and attitude of the multi-robot formation according to the environment and the distance constraints among the robots.
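The closed-loop path generation summarized in step S5 can be sketched as follows. This is an illustrative sketch only, not the patented implementation: `actor_policy` is a hypothetical stand-in for the trained Actor network (here a trivial greedy rule so the sketch is self-contained), and the four-direction action set with unit step size is an assumption for the example.

```python
# Hypothetical sketch of step S5: repeatedly feed the formation state to a
# policy, apply the chosen action, and record the resulting path.
# `actor_policy` stands in for the trained Actor network of the patent.

STEP = 1.0  # assumed translation per action

def actor_policy(state):
    """Stand-in policy: move toward the goal along the larger axis gap."""
    (px, py), (gx, gy) = state["pos"], state["goal"]
    dx, dy = gx - px, gy - py
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"

MOVES = {"left": (-STEP, 0.0), "right": (STEP, 0.0),
         "up": (0.0, STEP), "down": (0.0, -STEP)}

def generate_path(start, goal, max_steps=100):
    """Roll the policy forward from the start until the goal is reached."""
    state = {"pos": start, "goal": goal}
    path = [start]
    for _ in range(max_steps):
        if state["pos"] == goal:           # end point reached
            break
        act = actor_policy(state)          # Actor network output in the patent
        dx, dy = MOVES[act]
        px, py = state["pos"]
        state["pos"] = (px + dx, py + dy)  # motion-parameter update
        path.append(state["pos"])
    return path
```

In the patented method the state also carries speed and attitude parameters and the update follows the kinematics of claim 4; this sketch keeps only the position to show the rollout structure.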

Inventors

  • SUN XUEYING
  • WEI YIFEI
  • ZHANG QIANG
  • QI LIANG
  • ZHANG YONGWEI
  • YE SHUXIA
  • BAO LIN
  • LI CHANGJIANG

Assignees

  • Jiangsu University of Science and Technology (江苏科技大学)

Dates

Publication Date
2026-05-05
Application Date
2022-12-12

Claims (9)

  1. A multi-robot navigation method for the cooperative transportation of large-scale components, comprising the steps of: S1, acquiring the starting point position S0=(px0,py0) and end point position Sg=(gx,gy) of the multi-robot motion, and the obstacle information So in the motion scene; the obstacle information So=[(ox1,oy1),(ox2,oy2),…,(oxN,oyN)], where (oxn,oyn) represents the coordinates of the n-th obstacle edge sampling point, 1≤n≤N, and N represents the total number of obstacle edge sampling points; S2, designing the formation of the multi-robot formation according to the shape of the object to be transported, and acquiring the relative position constraints among the robots; S3, establishing a deep neural network, wherein the input of the deep neural network is the state of the multi-robot formation and the output of the deep neural network is the execution action of the multi-robot formation; the state S of the multi-robot formation is S=[Sg, sr, So], where sr=[(px,py), vx, vy, arc] are the motion parameters of the multi-robot formation: the position parameter (px,py) is the position of the multi-robot formation reference point, the speed parameters vx and vy are the speeds of the reference point in the x and y directions respectively, and the attitude parameter arc is the rotation radian of the formation about the reference point; the reference point of the multi-robot formation is the geometric center of the formation; the deep neural network comprises an Actor network and a Critic network, wherein the input of the Actor network is the state S of the multi-robot formation and its output is the execution action act of the formation, and the input of the Critic network is the state S and the action act and its output is an evaluation value; S4, training the deep neural network with the PPO algorithm; S5, inputting the state of the multi-robot formation into the trained Actor network to obtain the execution action of each step, calculating the motion parameters at the next moment according to the current motion parameters of the formation and the execution action act, and thereby obtaining the navigation path of the formation from the starting point to the end point; step S4 specifically comprises: S41, randomly initializing the parameters of the Actor network and the Critic network, and setting the iteration count k=1; S42, setting t=0, starting the multi-robot formation from the starting point S0, and initializing the motion parameters of the formation; S43, obtaining the execution action from the Actor network according to the current state of the formation at time t in the k-th iteration, calculating the state of the formation at time t+1, and calculating the current raw reward r_t according to that state; judging from the state at time t+1 whether the formation has reached the end point or collided with an obstacle; if it has done neither, setting t=t+1 and performing step S43 again until the formation reaches the end point or collides with an obstacle; S44, recording the trajectory of the current iteration, where T_k denotes the motion duration of the formation in the k-th iteration, i.e. the moment at which the end point is reached or a collision with an obstacle occurs; calculating the discount reward at each moment of the iteration, the discount reward at time t being R_t = Σ_{i=t..T_k} γ^(i−t)·r_i, where γ is the discount coefficient; S45, optimizing the parameters of the Actor network by stochastic gradient descent, the optimization objective being to maximize the discount reward at each moment, and optimizing the weights of the Critic network, the optimization objective being to minimize the error between the output value of the Critic network and the discount reward at each moment; S46, setting k=k+1 and jumping to step S42 for the next iteration, until the change in the discount reward between two successive iterations is smaller than a preset value.
  2. The multi-robot navigation method of claim 1, wherein the Actor network comprises 4 hidden layers and an output layer; the numbers of neurons in the 4 hidden layers are 128, 256, 256 and 64 respectively, and the activation functions are tanh functions; the output layer comprises 6 output nodes which respectively represent the probabilities of the different actions in the action space, and the action with the largest probability value is selected as the execution action act; the Critic network comprises 4 hidden layers and an output layer; the numbers of neurons in the 4 hidden layers are 128, 256, 256 and 64 respectively, and the activation functions are tanh functions; the output layer comprises 1 output node, which represents the evaluation value.
  3. The multi-robot navigation method of claim 2, wherein the action space comprises six actions: action 1 represents a leftward movement of the multi-robot formation, action 2 represents an upward movement, action 3 represents a rightward movement, action 4 represents a downward movement, action 5 represents adjusting the attitude of the formation by rotating left, and action 6 represents adjusting the attitude of the formation by rotating right.
  4. The multi-robot navigation method of claim 1, wherein the calculation of the state of the multi-robot formation at time t+1 in step S43 specifically comprises: (1) when the execution action act adjusts the direction of motion: if the time interval between times t and t+1 is greater than the speed adjustment time of the robot formation, the speed of the formation reference point at time t+1 is the speed after executing act, and the position of the reference point at time t+1 is computed from the preset acceleration value and the speeds of the reference point in the x and y directions at time t of the k-th iteration; if the time interval between times t and t+1 is less than the speed adjustment time, the speeds of the reference point in the x and y directions at time t+1 are the preset maximum x-direction speed and maximum y-direction speed respectively, and the position of the reference point at time t+1 is computed accordingly; (2) when the execution action act adjusts the attitude, the position of the formation reference point is unchanged, the rotation attitude is adjusted about the reference point, and the attitude parameter of the formation at time t+1 is the attitude parameter after executing act.
  5. The multi-robot navigation method of claim 4, wherein when the transported object is a rod-shaped component, there are two robots in the multi-robot formation; the distance between the two robots is L, and L is the length of the rod-shaped component; when the execution action act adjusts the attitude, the position coordinates of the two robots at time t+1 are (px + (L/2)·cos(arc), py + (L/2)·sin(arc)) and (px − (L/2)·cos(arc), py − (L/2)·sin(arc)), where (px,py) is the position parameter of the multi-robot formation reference point at time t+1 and arc is the attitude parameter; the motion directions of the two robots during the rotation are perpendicular to the rod and opposite to each other.
  6. The multi-robot navigation method of claim 1, wherein the raw reward r_t in step S43 is the sum of four terms, r_t = r_g + r_s + r_o + r_d, where r_g is the reward for reaching the target point: r_g takes the first reward value when the distance d_g between the multi-robot formation reference point and the end point at time t+1 is smaller than the first distance threshold, and 0 otherwise; r_s is the step-size penalty, equal to the first penalty value multiplied by the number of steps the formation has taken from the starting point to the current moment in this iteration; r_o is the obstacle-distance penalty: r_o takes the second penalty value when the minimum distance from the obstacle edge sampling points to the transported component at time t+1 is smaller than the second distance threshold, and 0 otherwise; r_d is the end-point distance penalty, equal to the third penalty value multiplied by the distance d_g.
  7. The multi-robot navigation method of claim 6, wherein the first distance threshold is 0.2, the first reward value is 200, the first penalty value is −0.5, the second distance threshold is 1, the second penalty value is −150, and the third penalty value is −2.
  8. A computer storage medium having computer instructions stored thereon, characterized in that the computer instructions, when run, perform the multi-robot navigation method of any one of claims 1 to 7.
  9. A computer device comprising a processor and a storage medium, the storage medium being the computer storage medium of claim 8, the processor loading and executing the instructions and data in the storage medium to implement the multi-robot navigation method of any one of claims 1 to 7.
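The reward structure in claims 6 and 7 can be read as an additive shaping reward. The following sketch assumes that additive form (the exact formula is not fully legible in the translated claims); the constant names are hypothetical, while the values are those recited in claim 7.

```python
# Hypothetical sketch of the raw reward of claims 6-7. Constant names are
# illustrative; the numeric values come from claim 7. The additive
# combination of the four terms is an assumption about the claimed formula.

R_GOAL, D_GOAL = 200.0, 0.2   # first reward value / first distance threshold
P_STEP = -0.5                 # first penalty value (applied per step taken)
D_OBS, P_OBS = 1.0, -150.0    # second distance threshold / second penalty value
P_DIST = -2.0                 # third penalty value (scales end-point distance)

def raw_reward(d_goal, d_obstacle, n_steps):
    """d_goal: distance from the formation reference point to the end point;
    d_obstacle: minimum distance from obstacle edge samples to the load;
    n_steps: steps taken since the start in this iteration."""
    r = 0.0
    if d_goal < D_GOAL:        # reached the target point
        r += R_GOAL
    r += P_STEP * n_steps      # step-count penalty
    if d_obstacle < D_OBS:     # too close to an obstacle
        r += P_OBS
    r += P_DIST * d_goal       # penalty proportional to remaining distance
    return r
```

For example, a formation 0.1 units from the goal, far from obstacles, after 10 steps would receive 200 − 5 − 0.2 = 194.8 under these assumptions, so the terminal reward dominates the shaping penalties.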

Description

Multi-robot navigation method for cooperative transportation of large-scale components

Technical Field

The invention belongs to the technical field of mobile robots, and particularly relates to a navigation method for the cooperative transportation of large-scale components by a multi-robot formation.

Background

During the production and transportation of large equipment, it is often necessary to handle large components. Large-component handling systems are widely applied in fields such as shipbuilding, large aircraft manufacturing and concrete pipe pile production. Existing large-component handling equipment mainly comprises hoisting equipment and jacking translation machines: the component is hoisted by the hoisting equipment and transported by the jacking translation mechanism, so the existing transportation path is relatively fixed. When the start or end point of the delivery changes, the track must be laid again, resulting in low efficiency and high cost. Another feasible scheme is cooperative transportation by multiple robots: several robots jointly support one target object, and the consistency of the group motion is ensured through cooperative motion control among the robots, realizing the movement of the target object. Navigation and obstacle avoidance for multi-robot cooperative transportation are usually realized by setting a guide path using electromagnetic induction, laser or vision, but when the environment changes the guide path must be laid again, which is inefficient.
In addition, one approach obtains a navigation path through a traditional single-robot navigation and obstacle avoidance algorithm and then computes the poses of the multiple robots from that path. This method is complex, and a navigation path found for a single robot may not be applicable to multiple robots: cooperative transportation constrains the relative positions and motions of the robots, so a path a single robot can pass through may be impassable for the formation.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a multi-robot navigation method for the cooperative transportation of large-scale components, which can obtain the navigation path and attitude of a multi-robot formation according to the environment and the distance constraints among the robots. The invention adopts the following technical scheme.

A multi-robot navigation method for the cooperative transportation of large-scale components comprises the following steps:

S1, acquiring the starting point position S0=(px0,py0) and end point position Sg=(gx,gy) of the multi-robot motion, and the obstacle information So in the motion scene, where So=[(ox1,oy1),(ox2,oy2),…,(oxN,oyN)]; (oxn,oyn) represents the coordinates of the n-th obstacle edge sampling point, 1≤n≤N, and N represents the total number of obstacle edge sampling points.

S2, designing the formation of the multi-robot formation according to the shape of the object to be transported, and acquiring the relative position constraints among the robots.

S3, establishing a deep neural network whose input is the state of the multi-robot formation and whose output is the execution action of the multi-robot formation. The state S of the multi-robot formation is S=[Sg, sr, So], where sr=[(px,py), vx, vy, arc] are the motion parameters of the multi-robot formation: the position parameter (px,py) is the position of the multi-robot formation reference point, the speed parameters vx and vy are the speeds of the reference point in the x and y directions respectively, and the attitude parameter arc is the rotation radian of the formation about the reference point. The deep neural network comprises an Actor network and a Critic network; the input of the Actor network is the state S of the multi-robot formation, and its output is the execution action act of the formation.

S4, training the deep neural network with the PPO algorithm.

S5, inputting the state of the multi-robot formation into the trained Actor network to obtain the execution action of each step, calculating the motion parameters at the next moment according to the current motion parameters of the formation and the execution action act, and thereby obtaining the navigation path of the formation from the starting point to the end point.

Further, in the deep neural network, the Actor network comprises 4 hidden layers and an output layer, the number of neurons in the 4 hidden layers is 128,