CN-120343049-B - Cooperative working method and system of multiple robots and Internet of things equipment

Abstract

The invention provides a cooperative working method and system for multiple robots and Internet of things devices, relating to the technical field of warehouse logistics automation. The method comprises: acquiring a task to be processed and the real-time load states of a plurality of robots; generating a capability map of an Internet of things device group according to acquired real-time environment data of a plurality of Internet of things devices; generating, based on a dynamic task evaluation model, a three-dimensional task priority matrix of the task to be processed according to the task to be processed, the real-time load states of the plurality of robots, and the capability map of the Internet of things device group; and allocating, by a reinforcement learning resource matching algorithm, a target robot-Internet of things device combination to the task to be processed based on its three-dimensional task priority matrix. By using the real-time environment data of the Internet of things devices together with the pre-established dynamic task evaluation model to match a target robot-Internet of things device combination to the task to be processed, the invention enables multiple robots and Internet of things devices to work cooperatively across multiple scenarios.

Inventors

ZHANG ZHIJIE
LIU JIA

Assignees

北京涵鑫盛科技有限公司

Dates

Publication Date: 20260508
Application Date: 20250415

Claims (8)

1. A cooperative working method for multiple robots and Internet of things devices, characterized by comprising the following steps:
acquiring a task to be processed and real-time load states of a plurality of robots;
generating a capability map of an Internet of things device group according to real-time environment data of a plurality of Internet of things devices acquired by a distributed sensor network, wherein the capability map is obtained by extracting multidimensional capability vectors of the Internet of things devices based on the real-time environment data;
generating, based on a pre-established dynamic task evaluation model, a three-dimensional task priority matrix of the task to be processed according to the task to be processed, the real-time load states of the plurality of robots, and the capability map of the Internet of things device group;
allocating, by a reinforcement learning resource matching algorithm, a target robot-Internet of things device combination to the task to be processed based on the three-dimensional task priority matrix of the task to be processed;
wherein the three-dimensional task priority matrix is expressed as follows: P(t,s,r)=α·T(t)+β·S(s)+γ·R(r), where P(t,s,r) is the three-dimensional task priority matrix, T(t) is the time dimension priority function, S(s) is the spatial dimension priority function, R(r) is the resource dimension priority function, α is the time dimension priority weight coefficient, β is the spatial dimension priority weight coefficient, and γ is the resource dimension priority weight coefficient;
the time dimension priority function T(t) is expressed as follows: where d_i is the deadline of the task to be processed, t_now is the current system time, ε is a small value that prevents division by zero, λ is the attenuation factor, and ω_i is the remaining workload of the task to be processed;
the time dimension priority weight coefficient α is expressed as follows: where α_0 is the initial time dimension weight coefficient, d_i is the deadline of the task to be processed, t_now is the current system time, and τ_critical is a preset critical time threshold;
the spatial dimension priority function S(s) is expressed as follows: where C_path is the dynamic path cost, (x_r, y_r) are the position coordinates of the robot, (x_t, y_t) are the position coordinates of the task target, and σ is the spatial attenuation radius; the summation is taken over all edges on the path, v_r is the moving speed of the robot, μ is the obstacle penalty coefficient, obstacle_density(e) is the obstacle density of the area in which path segment e is located, and the remaining term is the time cost;
the resource dimension priority function R(r) is expressed as follows: where n is the total number of resource types, ω_i is the weight of the i-th resource type, and η is the excess capacity reward coefficient; the other quantities in the expression are the normalized task demand value for the i-th class of resource, the normalized device capability value of device j for the i-th class of resource, the total margin between the normalized device capability of device j for the i-th class of resource and the normalized task demand value for the i-th class of resource, and the mapping of that total margin onto the interval (0, 1); the normalized task demand value is computed from the demand value of the task for the i-th class of resource and the maximum value of the task demand vector, which holds the demand values for all resource types; and the normalized device capability value is computed from the capability value of Internet of things device j for the i-th class of resource and the maximum value of the capability vector of Internet of things device j, which holds the capability values of device j for all resource types.
2. The method of claim 1, wherein the process of establishing the dynamic task evaluation model comprises: calculating an environmental obstacle topological structure entropy value and a dynamic interference factor based on the capability map of the Internet of things device group, to obtain the real-time environment complexity of the Internet of things device group; constructing a multidimensional capability vector based on the specification parameters and historical performance data of the Internet of things devices, to obtain a capability quantization vector of the Internet of things devices; and establishing the dynamic task evaluation model based on the real-time environment complexity and the capability quantization vector.
3. The method of claim 1, wherein after allocating the target robot-Internet of things device combination to the task to be processed, the method further comprises: while the target robot-Internet of things device combination executes the task to be processed, monitoring state parameters of the Internet of things device group, and dynamically adjusting the resource allocation strategy through an adaptive feedback mechanism.
4. The method of claim 3, wherein dynamically adjusting the resource allocation strategy through the adaptive feedback mechanism comprises: establishing a composite evaluation index comprising task completion timeliness, device reliability, and energy consumption efficiency; and optimizing control parameters based on the composite evaluation index and a deep deterministic policy gradient (DDPG) algorithm, and adjusting the resource allocation strategy through the control parameters.
5. The method of claim 4, wherein the composite evaluation index is expressed as follows: where the quantities in the expression are the composite evaluation index value, the task completion timeliness value, the energy consumption efficiency value, the reliability index of the Internet of things devices, the task completion timeliness weight coefficient, the energy consumption efficiency weight coefficient, and the Internet of things device reliability weight coefficient; and the deep deterministic policy gradient algorithm is expressed as follows: where the quantities in the expression are the parameterized policy function, the policy network parameters, the objective function maximized over the policy network parameters, the expected value of the action-value function, the policy update constraint coefficient, and the KL divergence between the new policy and the old policy.
6. A cooperative working system for multiple robots and Internet of things devices, comprising:
an original data acquisition module, used for acquiring a task to be processed and real-time load states of a plurality of robots;
a device group state acquisition module, used for generating a capability map of an Internet of things device group according to real-time environment data of a plurality of Internet of things devices acquired by a distributed sensor network;
a task priority matrix generation module, used for generating, based on a pre-established dynamic task evaluation model, a three-dimensional task priority matrix of the task to be processed according to the task to be processed, the real-time load states of the plurality of robots, and the capability map of the Internet of things device group;
a robot-device allocation module, used for allocating, by a reinforcement learning resource matching algorithm, a target robot-Internet of things device combination to the task to be processed based on the three-dimensional task priority matrix of the task to be processed;
wherein the three-dimensional task priority matrix is expressed as follows: P(t,s,r)=α·T(t)+β·S(s)+γ·R(r), where P(t,s,r) is the three-dimensional task priority matrix, T(t) is the time dimension priority function, S(s) is the spatial dimension priority function, R(r) is the resource dimension priority function, α is the time dimension priority weight coefficient, β is the spatial dimension priority weight coefficient, and γ is the resource dimension priority weight coefficient;
the time dimension priority function T(t) is expressed as follows: where d_i is the deadline of the task to be processed, t_now is the current system time, ε is a small value that prevents division by zero, λ is the attenuation factor, and ω_i is the remaining workload of the task to be processed;
the time dimension priority weight coefficient α is expressed as follows: where α_0 is the initial time dimension weight coefficient, d_i is the deadline of the task to be processed, t_now is the current system time, and τ_critical is a preset critical time threshold;
the spatial dimension priority function S(s) is expressed as follows: where C_path is the dynamic path cost, (x_r, y_r) are the position coordinates of the robot, (x_t, y_t) are the position coordinates of the task target, and σ is the spatial attenuation radius; the summation is taken over all edges on the path, v_r is the moving speed of the robot, μ is the obstacle penalty coefficient, obstacle_density(e) is the obstacle density of the area in which path segment e is located, and the remaining term is the time cost;
the resource dimension priority function R(r) is expressed as follows: where n is the total number of resource types, ω_i is the weight of the i-th resource type, and η is the excess capacity reward coefficient; the other quantities in the expression are the normalized task demand value for the i-th class of resource, the normalized device capability value of device j for the i-th class of resource, the total margin between the normalized device capability of device j for the i-th class of resource and the normalized task demand value for the i-th class of resource, and the mapping of that total margin onto the interval (0, 1); the normalized task demand value is computed from the demand value of the task for the i-th class of resource and the maximum value of the task demand vector, which holds the demand values for all resource types; and the normalized device capability value is computed from the capability value of Internet of things device j for the i-th class of resource and the maximum value of the capability vector of Internet of things device j, which holds the capability values of device j for all resource types.
7. An electronic device, characterized by comprising at least one processor and a memory, wherein the memory and the processor are connected through a bus; the memory is used for storing one or more programs; and when the one or more programs are executed by the at least one processor, the cooperative working method for multiple robots and Internet of things devices according to any one of claims 1 to 5 is implemented.
8. A readable storage medium having stored thereon an executable program which, when executed, implements the cooperative working method for multiple robots and Internet of things devices according to any one of claims 1 to 5.
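Claim 1 describes normalizing each task demand value against the maximum of the task demand vector, and each device capability value against the maximum of that device's capability vector. The claim's own expressions are not reproduced in this text, so the following is only a minimal sketch that assumes plain division by the maximum. The function name `normalize_by_max` and the sample vectors are hypothetical.

```python
def normalize_by_max(values):
    """Scale a vector so that its largest entry becomes 1.0.

    Assumption: normalization is division by the vector maximum; the
    patent's exact expression is not available in this text.
    """
    peak = max(values)
    if peak <= 0:
        raise ValueError("vector needs at least one positive entry")
    return [v / peak for v in values]


# Hypothetical demand of one task and capability of one device,
# listed per resource type (n = 3 resource types).
task_demand = [2.0, 4.0, 1.0]
device_capability = [3.0, 6.0, 6.0]

norm_demand = normalize_by_max(task_demand)            # [0.5, 1.0, 0.25]
norm_capability = normalize_by_max(device_capability)  # [0.5, 1.0, 1.0]
```

Each vector is scaled independently, so the normalized values express relative weight within a task or a device rather than absolute quantities.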

Description

Cooperative working method and system of multiple robots and Internet of things equipment
Technical Field
The invention relates to the technical field of warehouse logistics automation, and in particular to a cooperative working method and system for multiple robots and Internet of things devices.
Background
With the rapid development of Internet of things technology, Internet of things robots are widely applied in many fields such as industry, agriculture, medical treatment, and the military. An Internet of things robot collects data from the working site in real time through devices such as sensors and transmits the data over a network to a remote control center, thereby enabling remote control of the robot. However, in existing cooperative working scenarios involving robots and Internet of things devices, the number of robots is large, the tasks are complex, and the environment is changeable. As a result, Internet of things devices in existing distributed systems do not respond to real-time environment data in a timely manner, existing algorithms are inefficient at dynamic task allocation, and control accuracy and real-time performance are difficult to guarantee, so the requirements of cooperative work between multiple robots and Internet of things devices across multiple scenarios cannot be met.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a cooperative working method for multiple robots and Internet of things devices, which comprises the following steps:
acquiring a task to be processed and real-time load states of a plurality of robots;
generating a capability map of an Internet of things device group according to real-time environment data of a plurality of Internet of things devices acquired by a distributed sensor network;
generating, based on a pre-established dynamic task evaluation model, a three-dimensional task priority matrix of the task to be processed according to the task to be processed, the real-time load states of the plurality of robots, and the capability map of the Internet of things device group;
allocating, by a reinforcement learning resource matching algorithm, a target robot-Internet of things device combination to the task to be processed based on the three-dimensional task priority matrix of the task to be processed.
Preferably, the process of establishing the dynamic task evaluation model includes: calculating an environmental obstacle topological structure entropy value and a dynamic interference factor based on the capability map of the Internet of things device group, to obtain the real-time environment complexity of the Internet of things device group; constructing a multidimensional capability vector based on the specification parameters and historical performance data of the Internet of things devices, to obtain a capability quantization vector of the Internet of things devices; and establishing the dynamic task evaluation model based on the real-time environment complexity and the capability quantization vector.
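The four-step method in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the per-dimension scores are supplied by a hypothetical `score_fn` that stands in for the dynamic task evaluation model (including robot load states and the capability map), the weights are fixed numbers, and a simple greedy matcher is swapped in for the reinforcement learning resource matching algorithm. Only the weighted sum P = α·T + β·S + γ·R follows the expression given for the priority matrix in this description; all names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scores:
    """Hypothetical per-dimension scores for one (task, robot, device) triple."""
    time: float      # stands in for T(t)
    space: float     # stands in for S(s)
    resource: float  # stands in for R(r)


def priority(s, alpha, beta, gamma):
    """Weighted sum P = alpha*T + beta*S + gamma*R."""
    return alpha * s.time + beta * s.space + gamma * s.resource


def assign(tasks, robots, devices, score_fn, weights):
    """Greedy stand-in for the reinforcement learning matcher (assumption).

    Ranks every (task, robot, device) triple by priority and takes the best
    triples first, using each task, robot, and device at most once.
    """
    alpha, beta, gamma = weights
    ranked = sorted(
        ((priority(score_fn(t, r, d), alpha, beta, gamma), t, r, d)
         for t, r, d in product(tasks, robots, devices)),
        key=lambda item: item[0],
        reverse=True,
    )
    busy_robots, busy_devices, plan = set(), set(), {}
    for score, t, r, d in ranked:
        if t in plan or r in busy_robots or d in busy_devices:
            continue
        plan[t] = (r, d, score)
        busy_robots.add(r)
        busy_devices.add(d)
    return plan


# Hypothetical scores; any triple not listed gets a low default score.
table = {
    ("t1", "r1", "d1"): Scores(0.9, 0.8, 0.7),
    ("t2", "r2", "d2"): Scores(0.6, 0.5, 0.4),
}


def score_fn(task, robot, device):
    return table.get((task, robot, device), Scores(0.1, 0.1, 0.1))


plan = assign(["t1", "t2"], ["r1", "r2"], ["d1", "d2"], score_fn, (0.5, 0.3, 0.2))
```

With these sample scores, task `t1` is matched to robot `r1` and device `d1`, and `t2` to `r2` and `d2`. A learned matcher could replace `assign` without changing how the priority values are computed.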
Preferably, the three-dimensional task priority matrix is expressed as follows:
P(t,s,r)=α·T(t)+β·S(s)+γ·R(r)
where P(t,s,r) is the three-dimensional task priority matrix, T(t) is the time dimension priority function, S(s) is the spatial dimension priority function, R(r) is the resource dimension priority function, α is the time dimension priority weight coefficient, β is the spatial dimension priority weight coefficient, and γ is the resource dimension priority weight coefficient.
Preferably, the time dimension priority function T(t) is expressed as follows: where d_i is the deadline of the task to be processed, t_now is the current system time, ε is a small value that prevents division by zero, λ is the attenuation factor, and ω_i is the remaining workload of the task to be processed.
The time dimension priority weight coefficient α is expressed as follows: where α_0 is the initial time dimension weight coefficient, d_i is the deadline of the task to be processed, t_now is the current system time, and τ_critical is a preset critical time threshold.
The spatial dimension priority function S(s) is expressed as follows: where C_path is the dynamic path cost, (x_r, y_r) are the position coordinates of the robot, (x_t, y_t) are the position coordinates of the task target, and σ is the spatial attenuation radius; v_r is the moving speed of the robot, μ is the obstacle penalty coefficient, obstacle_density(e) is the obstacle density of the area in which path segment e is located, and the remaining term is the time cost.
The resource dimension priority function R(r) is expressed as follows: where n is the total number of resource types, ω_i is the weight of the i-th resource type, and η is the excess capacity reward coefficient; Normalized task demand values for class i r