CN-121973243-A - Dynamic path planning method and device for tunnel construction operation mechanical arm

Abstract

The invention relates to the technical field of tunnel construction and discloses a dynamic path planning method and device for a tunnel construction operation mechanical arm. The method comprises: collecting environmental data in a tunnel and real-time attitude parameters of the mechanical arm in real time; constructing a real-time three-dimensional grid map of the tunnel inner wall based on the environmental data in the tunnel, while detecting and marking obstacles in the tunnel in real time; constructing a real-time state vector of the mechanical arm, and constructing a real-time joint state vector based on the real-time state vector of the mechanical arm and a tunnel environment state vector; and obtaining pose adjustment parameters of the mechanical arm from a pre-trained in-tunnel mechanical arm motion path model according to the real-time joint state vector. The in-tunnel mechanical arm motion path model is obtained by training a reinforcement learning model whose reward function comprises at least a path reward function, a dynamic collision penalty function and a step number penalty function. The method and device can solve the problem of path planning for a mechanical arm in a tunnel construction scene.

Inventors

LI CHUANGUI
JIANG XINBO
MA YA
TU WENFENG
DUAN CHUANBO
MA ZHAO
CHEN CHANGYUAN
SUN ZHIPING
LIU HONGLIANG
LI HAISHENG
WANG YIPENG
LV XINJIAN

Assignees

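Shandong Hi-Speed Construction Management Group Co., Ltd. (山东高速建设管理集团有限公司)
Shandong University (山东大学)
Shandong Hi-Speed Linteng Highway Co., Ltd. (山东高速临滕公路有限公司)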

Dates

Publication Date: 2026-05-05
Application Date: 2026-04-02

Claims (10)

1. A dynamic path planning method for a tunnel construction operation mechanical arm, characterized by comprising the following steps: acquiring, in real time, environmental data in a tunnel and real-time attitude parameters of a mechanical arm; constructing a real-time three-dimensional grid map of the tunnel inner wall based on the environmental data in the tunnel, detecting obstacles in the tunnel in real time, and marking the detected obstacles in the three-dimensional grid map; constructing a real-time state vector of the mechanical arm according to the real-time attitude parameters of the mechanical arm, constructing a tunnel environment state vector according to the tunnel inner wall and the obstacles, and constructing a real-time joint state vector based on the real-time state vector of the mechanical arm and the tunnel environment state vector; and obtaining pose adjustment parameters of the mechanical arm from a pre-trained in-tunnel mechanical arm motion path model according to the real-time joint state vector, wherein the in-tunnel mechanical arm motion path model is obtained by training a reinforcement learning model, and the reward function used in training comprises at least a path reward function, a dynamic collision penalty function and a step number penalty function.
2. The dynamic path planning method for a tunnel construction operation mechanical arm according to claim 1, wherein detecting obstacles in the tunnel in real time comprises: identifying, from point cloud data, target point clouds that may be obstacles; dividing the tunnel space into three-dimensional grid cells of a preset size based on the point cloud data, each grid cell being marked with one of three states: free, occupied or unknown; identifying the outline of an obstacle based on image information, and confirming that the obstacle is real by matching its position against the obstacle to be identified; and continuously tracking each preliminarily detected obstacle, and distinguishing static obstacles from dynamic obstacles according to changes in the position range and outline of the obstacle.
3. The dynamic path planning method for a tunnel construction operation mechanical arm according to claim 1, wherein the dimensions of the real-time state vector of the mechanical arm comprise the real-time pose of the mechanical arm, the distance between the end effector and a fixed obstacle, and the distance between the end effector and a target point, and the dimensions of the tunnel environment state vector comprise at least the obstacle position, the obstacle speed and the obstacle type identifier.
4. The dynamic path planning method for a tunnel construction operation mechanical arm according to claim 3, wherein the in-tunnel mechanical arm motion path model is trained as follows: pre-training a mechanical arm path planning model based on a reinforcement learning model, using a basic state vector in combination with the mechanical arm action space, wherein the reward function used during pre-training comprises at least a path reward function, a collision penalty function and a step number penalty function; and iteratively fine-tuning the pre-trained mechanical arm path planning model, using the tunnel joint state vector in combination with the mechanical arm action space, wherein the reward function used during fine-tuning comprises at least a path reward function, a dynamic collision penalty function and a step number penalty function.
5. The dynamic path planning method for a tunnel construction operation mechanical arm according to claim 4, wherein the dimensions of the tunnel environment state vector further comprise the dust concentration and a set of key geometric feature points of the tunnel inner wall, and the reward function used during iterative fine-tuning of the mechanical arm path planning model further comprises a stability detection function and a narrow channel reward function, established according to the dust concentration and based on a set mapping relation between dust concentration and the upper limit of the movement speed of the mechanical arm.
6. The dynamic path planning method for a tunnel construction operation mechanical arm according to claim 4 or 5, wherein the total reward function used in the fine-tuning stage is the sum of the individual reward functions; the dynamic collision penalty function is a piecewise function that takes a set value when the mechanical arm collides and 0 otherwise; the path reward function, the step number penalty function, the stability detection function and the narrow channel reward function are each the product of a weight and a reward term; when one or more dynamic obstacles are detected, the weights of all sub-terms of the reward function are adjusted according to the obstacle coefficient of the highest-priority obstacle, based on an obstacle priority mapping table; when the weight of the dynamic collision penalty function is increased, the weights of the narrow channel reward function and the dust environment reward function are compressed first, and the weights of the path reward function and the step number penalty function are retained, so that the sum of the weights stays within the set value range; and when the dust concentration is lower than a set low-dust threshold, the weight of the stability detection function is reduced and the weights of the other functions are increased, so that the sum of the weights stays within the set value range.
7. A dynamic path planning device for a tunnel construction operation mechanical arm, characterized by comprising: a real-time data acquisition module configured to acquire, in real time, environmental data in a tunnel and real-time attitude parameters of a mechanical arm; a grid map construction module configured to construct a real-time three-dimensional grid map of the tunnel inner wall based on the environmental data in the tunnel, detect obstacles in the tunnel in real time, and mark the detected obstacles in the three-dimensional grid map; a state vector construction module configured to construct a real-time state vector of the mechanical arm according to the real-time attitude parameters of the mechanical arm, construct a tunnel environment state vector according to the tunnel inner wall and the obstacles, and construct a real-time joint state vector based on the real-time state vector of the mechanical arm and the tunnel environment state vector; and a dynamic path planning module configured to obtain pose adjustment parameters of the mechanical arm from a pre-trained in-tunnel mechanical arm motion path model according to the real-time joint state vector, wherein the in-tunnel mechanical arm motion path model is obtained by training a reinforcement learning model, and the reward function used in training comprises at least a path reward function, a dynamic collision penalty function and a step number penalty function.
8. An electronic device comprising a processor and a memory having stored thereon computer instructions that, when executed by the processor, cause the electronic device to perform the method of any of claims 1 to 6.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 6.
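
The weight adjustment recited in claim 6 can be illustrated with a short sketch. It is not the patented implementation: the weight names and numeric values, the target weight sum of 1.0, the proportional compression rule, and the treatment of the "dust environment" term as the stability detection term are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical default weights; the claims give no numeric values.
    path: float = 0.4
    step: float = 0.1
    collision: float = 0.2
    stability: float = 0.15  # assumed to be the dust-related term
    narrow: float = 0.15     # narrow channel term


def raise_collision_weight(w: RewardWeights, new_collision: float,
                           total: float = 1.0) -> RewardWeights:
    """Raise the collision weight, keep the path and step weights, and
    shrink the stability and narrow-channel weights proportionally so
    that all weights sum to `total` (assumed target)."""
    budget = total - w.path - w.step - new_collision
    if budget < 0:
        raise ValueError("collision weight leaves no room for path/step weights")
    old = w.stability + w.narrow
    # If both compressible weights are already 0, they simply stay at 0.
    scale = budget / old if old > 0 else 0.0
    return RewardWeights(w.path, w.step, new_collision,
                         w.stability * scale, w.narrow * scale)


def total_reward(w: RewardWeights, path_term: float, step_term: float,
                 stability_term: float, narrow_term: float,
                 collided: bool, collision_value: float = -1.0) -> float:
    """Sum of weighted terms; the collision term is piecewise:
    a set value on collision, 0 otherwise."""
    collision_term = collision_value if collided else 0.0
    return (w.path * path_term
            + w.step * step_term
            + w.collision * collision_term
            + w.stability * stability_term
            + w.narrow * narrow_term)
```

In this sketch, a dynamic obstacle with a high priority coefficient would translate into a larger `new_collision` value; how that coefficient is looked up in the obstacle priority mapping table is left out because the claims do not specify it.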

Description

Dynamic path planning method and device for tunnel construction operation mechanical arm

Technical Field

The invention belongs to the technical field of tunnel construction, and particularly relates to a dynamic path planning method and device for a tunnel construction operation mechanical arm.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

In tunnel construction, the mechanical arm is key equipment widely used for operations such as drilling, grouting and material handling. The quality of its path planning directly determines operation efficiency, construction safety and operation precision, and is central to advancing the automation and intelligent upgrading of tunnel construction. During construction, obstacles such as temporary stockpiles, grouting machines and transport vehicles frequently appear and change dynamically, static and dynamic obstacles are interleaved, and the state of the tunnel inner wall changes slightly as construction progresses. These scene characteristics place very high demands on the real-time performance, adaptability and safety of mechanical arm path planning.

Existing tunnel mechanical arm path planning schemes are mainly based on predefined trajectories or static environment models. They cannot respond quickly to position changes of dynamic obstacles in the tunnel, provide insufficient obstacle avoidance safety and path adaptability, and struggle to meet the practical requirements of dynamic tunnel construction scenes. In addition, the working space in a tunnel construction scene is narrow and enclosed, and the dust concentration fluctuates widely, which readily interferes with environment perception and affects operation accuracy and safety.
Disclosure of Invention

In view of the above, the invention provides a dynamic path planning method and device for a tunnel construction operation mechanical arm, to solve the problem of path planning for a mechanical arm in a tunnel construction scene.

The first aspect of the invention provides a dynamic path planning method for a tunnel construction operation mechanical arm, comprising the following steps:

acquiring, in real time, environmental data in a tunnel and real-time attitude parameters of a mechanical arm;

constructing a real-time three-dimensional grid map of the tunnel inner wall based on the environmental data in the tunnel, detecting obstacles in the tunnel in real time, and marking the detected obstacles in the three-dimensional grid map;

constructing a real-time state vector of the mechanical arm according to the real-time attitude parameters of the mechanical arm, constructing a tunnel environment state vector according to the tunnel inner wall and the obstacles, and constructing a real-time joint state vector based on the real-time state vector of the mechanical arm and the tunnel environment state vector; and

obtaining pose adjustment parameters of the mechanical arm from a pre-trained in-tunnel mechanical arm motion path model according to the real-time joint state vector, wherein the in-tunnel mechanical arm motion path model is obtained by training a reinforcement learning model, and the reward function used in training comprises at least a path reward function, a dynamic collision penalty function and a step number penalty function.
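
As a rough illustration of the state vector step above, the following sketch concatenates an arm state and a tunnel environment state into one fixed-length joint vector. The field choices follow the dimensions listed in claim 3; the class names, the six-joint pose, the three-component obstacle velocity, the cap of two obstacles and the zero padding are assumptions made for this example and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence

OBSTACLE_DIM = 7  # 3 position + 3 velocity + 1 type id (assumed layout)


@dataclass
class ArmState:
    joint_pose: Sequence[float]       # real-time pose, e.g. joint angles
    dist_to_fixed_obstacle: float     # end effector to nearest fixed obstacle
    dist_to_target: float             # end effector to target point

    def to_vector(self) -> List[float]:
        return [*self.joint_pose, self.dist_to_fixed_obstacle, self.dist_to_target]


@dataclass
class Obstacle:
    position: Sequence[float]         # x, y, z
    velocity: Sequence[float]         # vx, vy, vz (all zero for static obstacles)
    type_id: int                      # obstacle type identifier

    def to_vector(self) -> List[float]:
        return [*self.position, *self.velocity, float(self.type_id)]


def joint_state(arm: ArmState, obstacles: Sequence[Obstacle],
                max_obstacles: int = 2) -> List[float]:
    """Concatenate the arm state with up to `max_obstacles` obstacle
    states, zero-padding so the vector length is constant."""
    env: List[float] = []
    for ob in list(obstacles)[:max_obstacles]:
        env.extend(ob.to_vector())
    env.extend([0.0] * (OBSTACLE_DIM * max_obstacles - len(env)))
    return arm.to_vector() + env
```

A fixed-length vector is used here because most reinforcement learning policies expect a constant input size; the patent does not state how a varying number of obstacles is encoded.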
In some embodiments, detecting obstacles in the tunnel in real time comprises: identifying, from point cloud data, target point clouds that may be obstacles; dividing the tunnel space into three-dimensional grid cells of a preset size based on the point cloud data, each grid cell being marked with one of three states: free, occupied or unknown; identifying the outline of an obstacle based on image information, and confirming that the obstacle is real by matching its position against the obstacle to be identified; and continuously tracking each preliminarily detected obstacle, and distinguishing static obstacles from dynamic obstacles according to changes in the position range and outline of the obstacle.

In some embodiments, the dimensions of the real-time state vector of the mechanical arm comprise the real-time pose of the mechanical arm, the distance between the end effector and a fixed obstacle, and the distance between the end effector and a target point, and the dimensions of the tunnel environment state vector comprise at least the obstacle position, the obstacle speed and the obstacle type identifier.

In some embodiments, the in-tunnel mechanical arm motion path model is trained as follows: pre-training a mechanical arm path planning model based on a reinforcement learning model by adopting a basic state vector and combining with a mechanical arm action