CN-122022679-A - Multi-device cooperative scheduling method, device, medium and program product

Abstract

The application relates to a multi-device cooperative scheduling method, a device, a medium and a program product. The method comprises: obtaining multi-source fusion data for a plurality of logistics devices in a warehousing system, wherein the multi-source fusion data is obtained at least by performing data fusion on tasks to be executed, device sensor data and warehousing space topology data of the plurality of logistics devices in the warehousing system; performing, by a policy network, task allocation and path collaborative prediction based on the multi-source fusion data to obtain an initial action instruction set for the plurality of logistics devices, wherein the initial action instruction set comprises initial action instructions corresponding to the plurality of logistics devices; performing action conflict detection based on the initial action instruction set to obtain a conflict detection result; performing instruction optimization on the initial action instruction set based on the conflict detection result to obtain an execution action instruction set; and controlling the plurality of logistics devices to act based on the execution action instruction set.

Inventors

Lu Renpu
Fan Yang
Liu Yang
Zhong Yuanhuai
Zhou Feng
Gong Wenjie
Tang Song
Li Zheng
Wang Zhuang
Deng Rongming
Liu Shijie

Assignees

四川中烟工业有限责任公司 (China Tobacco Sichuan Industrial Co., Ltd.)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-23

Claims (10)

1. A multi-device cooperative scheduling method, the method comprising: obtaining multi-source fusion data for a plurality of logistics devices in a warehousing system, wherein the multi-source fusion data is obtained at least by performing data fusion on tasks to be executed, device sensor data and warehousing space topology data of the plurality of logistics devices in the warehousing system; performing, by a policy network, task allocation and path collaborative prediction based on the multi-source fusion data to obtain an initial action instruction set for the plurality of logistics devices, wherein the initial action instructions contained in the initial action instruction set are at least used to indicate the moving paths and the operating parameters of the logistics devices; performing action conflict detection based on the initial action instruction set to obtain a conflict detection result; performing instruction optimization on the initial action instruction set based on the conflict detection result to obtain an execution action instruction set; and controlling the plurality of logistics devices to perform cooperative action based on the execution action instruction set.
2. The method of claim 1, wherein after controlling the plurality of logistics devices to perform cooperative action based on the execution action instruction set, the method further comprises: acquiring device fault information and execution feedback data of the cooperative action of the plurality of logistics devices; performing multi-objective reward calculation based on the execution feedback data to determine a policy gradient update amount; determining an environment state update amount based on the device fault information; and updating parameters of the policy network based on the policy gradient update amount and the environment state update amount.
3. The method of claim 2, wherein the execution feedback data includes an actual completion time and an effective working time of each logistics device, and a number of conflict occurrences; and performing the multi-objective reward calculation based on the execution feedback data to determine the policy gradient update amount comprises: determining a task completion efficiency factor based on the deviation rate between the actual completion time and the planned task time of each logistics device; determining a device utilization coefficient based on the ratio of the effective working time to the total working time of each logistics device; and performing the multi-objective reward calculation based on the task completion efficiency factor, the device utilization coefficient and the number of conflict occurrences to determine the policy gradient update amount.
4. The method of claim 1, wherein obtaining the multi-source fusion data for the plurality of logistics devices in the warehousing system comprises: acquiring tasks to be executed, device sensor data and warehousing space topology data corresponding to the plurality of logistics devices in the warehousing system; sorting the tasks to be executed based on their task type and task priority to obtain a hierarchical task sequence; and performing space-time correlation on the hierarchical task sequence, the device sensor data and the warehousing space topology data to obtain the multi-source fusion data for the plurality of logistics devices in the warehousing system.
5. The method of claim 1, wherein the policy network comprises a graph convolution layer, a long short-term memory (LSTM) layer and a fully connected layer; and performing, by the policy network, task allocation and path collaborative prediction based on the multi-source fusion data to obtain the initial action instruction set for the plurality of logistics devices comprises: performing, by the graph convolution layer in the policy network, topological structure feature extraction on the multi-source fusion data to generate a topological neighborhood feature map; performing, by the long short-term memory layer in the policy network, device trajectory prediction on the topological neighborhood feature map to generate a dynamic trajectory prediction vector; and performing, by the fully connected layer in the policy network, task allocation and path collaborative prediction on the dynamic trajectory prediction vector to generate the initial action instruction set for the plurality of logistics devices.
6. The method of claim 1, wherein performing action conflict detection based on the initial action instruction set to obtain the conflict detection result comprises: performing space-time trajectory prediction for each logistics device based on the initial action instruction set to obtain a predicted space-time trajectory corresponding to each logistics device; performing space-time overlap detection based on the predicted space-time trajectories corresponding to the logistics devices to determine a detection result for each predicted space-time trajectory, wherein the detection result comprises at least one of a conflict time, conflict space coordinates, a conflict type and a conflict severity; and determining the conflict detection result based on the detection result of each predicted space-time trajectory.
7. The method of claim 6, wherein the operating parameters include at least a moving speed; and performing instruction optimization on the initial action instruction set based on the conflict detection result to obtain the execution action instruction set comprises: determining, based on the conflict detection result, an instruction to be optimized in the initial action instruction set; determining a speed attenuation coefficient based on the detection result of the predicted space-time trajectory corresponding to the instruction to be optimized, the task priority of the corresponding task and a spatial traffic condition; re-determining an alternative path based on the starting point and the end point contained in the instruction to be optimized; and performing instruction optimization on the instruction to be optimized based on the speed attenuation coefficient and the alternative path to obtain the execution action instruction set.
8. A multi-device cooperative scheduling apparatus, the apparatus comprising: an acquisition module, configured to obtain multi-source fusion data for a plurality of logistics devices in a warehousing system, wherein the multi-source fusion data is obtained at least by performing data fusion on tasks to be executed, device sensor data and warehousing space topology data of the plurality of logistics devices in the warehousing system; a decision module, configured to perform, by a policy network, task allocation and path collaborative prediction based on the multi-source fusion data to obtain an initial action instruction set for the plurality of logistics devices, wherein the initial action instructions contained in the initial action instruction set are at least used to indicate the moving paths and the operating parameters of the logistics devices; a conflict detection module, configured to perform action conflict detection based on the initial action instruction set to obtain a conflict detection result; an optimization module, configured to perform instruction optimization on the initial action instruction set based on the conflict detection result to obtain an execution action instruction set; and a control module, configured to control the plurality of logistics devices to perform cooperative action based on the execution action instruction set.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
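
For illustration only, the space-time overlap detection recited in claim 6 can be sketched as follows. The sketch assumes that predicted space-time trajectories are already available as lists of grid cells indexed by discrete time steps; the names `Waypoint`, `Conflict` and `detect_conflicts`, the grid model and the two conflict types ("vertex" and "swap") are hypothetical and do not come from the application. Conflict severity is omitted because the text does not say how it is computed.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Waypoint:
    t: int  # discrete time step (assumed time model)
    x: int  # grid column (assumed spatial model)
    y: int  # grid row


@dataclass(frozen=True)
class Conflict:
    time: int                 # conflict time
    cell: tuple[int, int]     # conflict space coordinates
    kind: str                 # "vertex": same cell; "swap": cells exchanged
    devices: tuple[str, str]  # the two devices involved


def detect_conflicts(trajectories: dict[str, list[Waypoint]]) -> list[Conflict]:
    """Compare every pair of predicted trajectories step by step."""
    conflicts: list[Conflict] = []
    for (a, traj_a), (b, traj_b) in combinations(sorted(trajectories.items()), 2):
        pos_a = {w.t: (w.x, w.y) for w in traj_a}
        pos_b = {w.t: (w.x, w.y) for w in traj_b}
        for t in sorted(pos_a.keys() & pos_b.keys()):
            if pos_a[t] == pos_b[t]:
                # Both devices occupy the same cell at the same time step.
                conflicts.append(Conflict(t, pos_a[t], "vertex", (a, b)))
            elif (
                t + 1 in pos_a
                and t + 1 in pos_b
                and pos_a[t] == pos_b[t + 1]
                and pos_a[t + 1] == pos_b[t]
            ):
                # The devices exchange cells between t and t + 1.
                conflicts.append(Conflict(t + 1, pos_a[t + 1], "swap", (a, b)))
    return conflicts


if __name__ == "__main__":
    demo = {
        "agv1": [Waypoint(0, 0, 0), Waypoint(1, 1, 0), Waypoint(2, 2, 0)],
        "agv2": [Waypoint(0, 2, 0), Waypoint(1, 1, 0), Waypoint(2, 0, 0)],
    }
    for c in detect_conflicts(demo):
        print(c)
```

In this sketch, each reported `Conflict` carries the conflict time, coordinates and type that an instruction optimization step such as the one in claim 7 could consume; a real system would likely work on continuous trajectories and device footprints rather than single grid cells.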

Description

Multi-device cooperative scheduling method, device, medium and program product
Technical Field
The application relates to the technical field of warehouse logistics, and in particular to a multi-device cooperative scheduling method, apparatus, device, medium and program product.
Background
With the development of automation technology, modern intelligent warehousing systems have emerged. Such systems use multi-device cooperative scheduling to automate logistics and improve operating efficiency. Multi-device cooperative scheduling mainly addresses task allocation, path planning and dynamic coordination for conveying equipment in a warehouse, such as AGVs (Automated Guided Vehicles) and shuttle vehicles, and optimizes material circulation efficiency through a dynamic decision mechanism.
Related cooperative scheduling techniques suffer from frequent conflicts among devices, so a multi-device cooperative scheduling method is needed to improve the scheduling effect.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a multi-device cooperative scheduling method, apparatus, device, medium and program product that can reduce planning conflicts among multiple devices.
In a first aspect, the present application provides a multi-device cooperative scheduling method, including: obtaining multi-source fusion data for a plurality of logistics devices in a warehousing system, wherein the multi-source fusion data is obtained at least by performing data fusion on tasks to be executed, device sensor data and warehousing space topology data of the logistics devices in the warehousing system; performing, by a policy network, task allocation and path collaborative prediction based on the multi-source fusion data to obtain an initial action instruction set for the logistics devices, wherein the initial action instruction set comprises initial action instructions at least used to indicate moving paths and operating parameters of the logistics devices; performing action conflict detection based on the initial action instruction set to obtain a conflict detection result; performing instruction optimization on the initial action instruction set based on the conflict detection result to obtain an execution action instruction set; and controlling the logistics devices to perform cooperative action based on the execution action instruction set.
In one embodiment, after the plurality of logistics devices are controlled to perform cooperative action based on the execution action instruction set, the method further comprises: obtaining device fault information and execution feedback data of the cooperative action of the plurality of logistics devices; performing multi-objective reward calculation based on the execution feedback data to determine a policy gradient update amount; determining an environment state update amount based on the device fault information; and updating parameters of the policy network based on the policy gradient update amount and the environment state update amount.
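
As a minimal sketch of the multi-objective reward calculation, the three quantities named in claim 3 (task completion efficiency factor, device utilization coefficient and number of conflict occurrences) could be combined into a single scalar. The application does not give a formula, so the linear weighting, the default weights, the clipping of the efficiency factor and the names `DeviceFeedback` and `multi_objective_reward` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceFeedback:
    planned_time: float    # planned task time, seconds
    actual_time: float     # actual completion time, seconds
    effective_time: float  # effective working time, seconds
    total_time: float      # total working time, seconds


def multi_objective_reward(
    feedback: list[DeviceFeedback],
    conflict_count: int,
    w_eff: float = 1.0,    # assumed weight on task completion efficiency
    w_util: float = 1.0,   # assumed weight on device utilization
    w_conf: float = 0.5,   # assumed penalty per conflict occurrence
) -> float:
    if not feedback:
        raise ValueError("feedback must contain at least one device")
    n = len(feedback)
    # Deviation rate of actual completion time from planned task time.
    mean_deviation = sum(
        abs(f.actual_time - f.planned_time) / f.planned_time for f in feedback
    ) / n
    # Assumed mapping: smaller deviation gives a factor closer to 1.
    efficiency_factor = max(0.0, 1.0 - mean_deviation)
    # Ratio of effective working time to total working time.
    utilization = sum(f.effective_time / f.total_time for f in feedback) / n
    return w_eff * efficiency_factor + w_util * utilization - w_conf * conflict_count


if __name__ == "__main__":
    fleet = [
        DeviceFeedback(100.0, 110.0, 80.0, 100.0),
        DeviceFeedback(100.0, 100.0, 60.0, 100.0),
    ]
    print(multi_objective_reward(fleet, conflict_count=1))
```

The sketch stops at the scalar reward; how that reward is turned into a policy gradient update amount depends on the training algorithm, which this part of the text does not specify.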
In one embodiment, the execution feedback data includes an actual completion time and an effective working time of each logistics device, and a number of conflict occurrences; performing the multi-objective reward calculation based on the execution feedback data to determine the policy gradient update amount comprises: determining a task completion efficiency factor based on the deviation rate between the actual completion time and the planned task time of each logistics device; determining a device utilization coefficient based on the ratio of the effective working time to the total working time of each logistics device; and performing the multi-objective reward calculation based on the task completion efficiency factor, the device utilization coefficient and the number of conflict occurrences.
In one embodiment, obtaining the multi-source fusion data for the plurality of logistics devices in the warehousing system comprises: acquiring tasks to be executed, device sensor data and warehousing space topology data corresponding to the logistics devices in the warehousing system; sorting the tasks to be executed based on their task types and task priorities to obtain a hierarchical task sequence; and performing space-time correlation on the hierarchical task sequence, the device sensor data and the warehousing space topology data to obtain the multi-source fusion data for the logistics devices in the warehousing system.
In one embodiment, the policy network comprises a graph convolution layer, a lon