CN-121094375-B - Emergency task scheduling method and device in limited space


Abstract

The application relates to an emergency task scheduling method and device in a limited space. In a scenario with multiple emergency tasks in which task executors autonomously select emergency tasks, the method takes into account that the movement of a task executor affects the task executed duration. The task executed durations of N emergency tasks serve as the state of a reinforcement learning model, and settable, staged incentive values serve as its actions. Driven by the reward, the reinforcement learning model adjusts the actions at successive time steps and guides task executors at shorter distances to select and execute the corresponding emergency tasks. This achieves dynamically optimized scheduling of the task executors executing the emergency tasks and thereby minimizes the task executed durations of the multiple emergency tasks.

Inventors

JIANG HAIBO
TAO MING
ZHAO WU
ZHOU ZHE
GAN QUAN
QING HUIGUANG

Assignees

湖南为安科技有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2025-08-04

Claims (10)

1. An emergency task scheduling method in a limited space, characterized by being applied to the scheduling of N emergency tasks of an emergency task management system in the limited space, wherein the emergency task management system comprises: a plurality of communication devices, each of which is configured with at least one emergency task and is used for issuing the emergency task to movable task executors within its coverage range, issuing a plurality of incentive values, and receiving task execution results uploaded by the task executors; and task executors, each of which can choose to execute or exit the emergency task and, when executing the emergency task, uploads the task execution result and generates a task incentive, wherein the task incentive is determined based on the one incentive value corresponding to the communication distance between the task executor and the communication device; the method comprising: obtaining the task executed duration of each emergency task at the current time step, wherein the task executed duration comprises the task execution duration of the task executor and the upload duration of the task execution result uploaded by the task executor, and the upload duration is related to the communication distance between the task executor and the communication device; taking the task executed durations of the N emergency tasks as the state in a reinforcement learning model, taking the plurality of incentive values as the action in the reinforcement learning model, setting the reward of the reinforcement learning model, and learning, based on the reinforcement learning model and with minimizing the task executed durations of the N emergency tasks as the objective function, a decision for setting the action of the next time step; and completing scheduling based on the decision of each time step.
2. The method for scheduling emergency tasks in a limited space according to claim 1, wherein the function corresponding to the reward is: [formula not preserved in this text]; wherein the symbols of the formula denote, in the order listed: the mapping function of the reward; two weight values; the task executed duration of the i-th emergency task; the task executed duration of the i-th emergency task at the next time step after adjustment by the action; the desired duration of the i-th emergency task; two mapping functions, each taking the smaller of two values; the number of emergency tasks; the time step; and the mapping pair of the state and the action at a time step.
3. The method for scheduling emergency tasks in a limited space according to claim 1, wherein obtaining the upload duration of each task execution result comprises: [three formulas not preserved in this text]; wherein the symbols of the formulas denote, in the order listed: the upload duration; the data amount of the task execution result; the communication rate between the communication device and the task executor in the limited space; the communication bandwidth; the signal-to-noise ratio in the limited space; the channel fading factor; the path loss factor; the communication distance between the task executor and the communication device; the transmit power during the upload; and the Gaussian noise density in the limited space.
4. The method for scheduling emergency tasks in a limited space according to claim 3, wherein the process of determining the task incentive comprises: determining the current communication distance between the task executor and the communication device at the current time step; determining a current incentive value based on the current communication distance; calculating the positive incentive of the task executor based on the task execution duration and the current incentive value; determining the task execution cost and the upload cost of the task executor, and calculating a total cost based on the task execution cost and the upload cost; and calculating the difference between the positive incentive and the total cost to obtain the task incentive.
5. The method of claim 4, further comprising, prior to said determining a current incentive value based on the current communication distance: dividing the communication range of the communication device into a plurality of communication range segments; and assigning different incentive values to different communication range segments; wherein said determining a current incentive value based on the current communication distance comprises: determining the communication range segment to which the current communication distance belongs; and taking the incentive value corresponding to that communication range segment as the current incentive value for the current communication distance.
6. The method of claim 5, further comprising, prior to said learning, based on the reinforcement learning model and with minimizing the task executed durations of the N emergency tasks as the objective function, a decision for setting the action of the next time step: determining the number of task executors within the coverage range of each communication device at the current time step; and determining a plurality of performance standard values according to the number, wherein the performance standard values are used for determining the task executors that participate in the emergency task, and only task executors whose residual performance is higher than the performance standard value are allowed to participate in the emergency task; wherein taking the settable incentive values as the action in the reinforcement learning model comprises: taking the settable incentive values and the settable performance standard values as the action in the reinforcement learning model.
7. The method of claim 1, wherein the reinforcement learning model is a DQN model.
8. An emergency task scheduling device in a limited space, characterized by being applied to the scheduling of N emergency tasks of an emergency task management system in the limited space, wherein the emergency task management system comprises: a plurality of communication devices, each of which is configured with at least one emergency task and is used for issuing the emergency task to movable task executors within its coverage range, issuing a plurality of incentive values, and receiving task execution results uploaded by the task executors; and task executors, each of which can choose to execute or exit the emergency task and, when executing the emergency task, uploads the task execution result and generates a task incentive, wherein the task incentive is determined based on the one incentive value corresponding to the communication distance between the task executor and the communication device; the device comprising: a duration acquisition module for obtaining the task executed duration of each emergency task at the current time step, wherein the task executed duration comprises the task execution duration of the task executor and the upload duration of the task execution result uploaded by the task executor, and the upload duration is related to the communication distance between the task executor and the communication device; an action decision module for taking the task executed durations of the N emergency tasks as the state in a reinforcement learning model, taking the plurality of incentive values as the action in the reinforcement learning model, setting the reward of the reinforcement learning model, and learning, based on the reinforcement learning model and with minimizing the task executed durations of the N emergency tasks as the objective function, a decision for setting the action of the next time step; and a task scheduling module for completing scheduling based on the decision of each time step.
9. An electronic device comprising at least one controller and a memory communicatively connected to the at least one controller, the memory storing instructions executable by the at least one controller to cause the at least one controller to perform the method for scheduling emergency tasks in a limited space as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method for scheduling emergency tasks in a confined space as claimed in any one of claims 1 to 7.
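
The quantities named in claims 3 to 5 can be illustrated with a minimal Python sketch. The formulas of claim 3 are not preserved in this text, so the sketch assumes a standard Shannon-capacity link model (rate = bandwidth x log2(1 + SNR), with SNR falling off as distance raised to the negative path loss factor); this form is an assumption, not the claimed formula. Likewise, the positive incentive is assumed to be the product of the incentive (excitation) value and the task execution duration, since claim 4 only states that it is "based on" the two. All function and parameter names are hypothetical.

```python
import bisect
import math


def upload_duration(data_bits, bandwidth_hz, tx_power_w, fading,
                    path_loss_exp, distance_m, noise_density_w_per_hz):
    """Upload duration = data amount / communication rate.

    ASSUMPTION: Shannon-style rate with a distance-dependent SNR;
    the patent's own formulas are missing from the source text.
    """
    snr = (tx_power_w * fading * distance_m ** (-path_loss_exp)
           / (noise_density_w_per_hz * bandwidth_hz))
    rate_bps = bandwidth_hz * math.log2(1.0 + snr)
    return data_bits / rate_bps


def incentive_for_distance(distance_m, segment_upper_bounds_m, incentive_values):
    """Claim 5: return the incentive value of the range segment that
    contains the current communication distance.

    segment_upper_bounds_m must be sorted ascending; segment k covers
    distances up to and including segment_upper_bounds_m[k].
    """
    idx = bisect.bisect_left(segment_upper_bounds_m, distance_m)
    if idx >= len(incentive_values):
        raise ValueError("distance is outside the coverage range")
    return incentive_values[idx]


def task_incentive(exec_duration_s, incentive_value, exec_cost, upload_cost):
    """Claim 4: task incentive = positive incentive - total cost.

    ASSUMPTION: positive incentive = incentive value * execution duration.
    """
    positive_incentive = incentive_value * exec_duration_s
    total_cost = exec_cost + upload_cost
    return positive_incentive - total_cost


if __name__ == "__main__":
    bounds = [50.0, 100.0, 200.0]   # hypothetical segment upper bounds (m)
    values = [3.0, 2.0, 1.0]        # hypothetical incentive per segment
    for d in (10.0, 80.0, 150.0):
        t_up = upload_duration(2e6, 1e6, 1.0, 1.0, 2.0, d, 1e-9)
        v = incentive_for_distance(d, bounds, values)
        print(d, round(t_up, 4), v, task_incentive(10.0, v, 5.0, t_up))
```

Under these assumptions the upload duration grows with the communication distance, which is the effect the staged incentive values are meant to counteract by making nearby executors more willing to take the task.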

Description

Emergency task scheduling method and device in limited space

Technical Field

The embodiments of the application relate to the technical field of task scheduling, and in particular to a method and a device for scheduling emergency tasks in a limited space.

Background

A limited space generally refers to a special working environment that is sealed or partially sealed and relatively isolated from the outside, such as an underground mine roadway or a tunnel. Traditional limited-space operations rely mainly on manual work, but because of the constraints of the space environment, workers can hardly stay for long, which seriously affects operating efficiency. With the rapid development of 5G communication technology, movable operation devices are gradually replacing manual work and executing various operation tasks in limited spaces, which both ensures operational safety and improves working efficiency.

When multiple emergency tasks are executed in a limited space, cooperative processing by multiple movable operation devices can significantly improve task processing quality. However, existing cooperative processing schemes tend to ignore the negative effects of device mobility: when the distance between a movable operation device and the task issuing device (e.g., a base station) increases, the channel quality deteriorates significantly under the influence of the communication distance and environmental interference (e.g., dust). This degradation manifests itself in particular as signal-to-noise ratio fluctuation and reduced communication rate, so that the upload duration of the task execution result increases greatly and the overall task completion time is prolonged.
In particular, in an emergency scenario in which movable operation devices select tasks autonomously, how to intelligently schedule suitable movable operation devices to execute the emergency tasks by dynamically adjusting the parameter settings of the task issuing device, and thereby reduce the overall execution duration of multiple emergency tasks, has become a key technical problem to be solved urgently.

Disclosure of Invention

The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.

The main purpose of the disclosed embodiments is to provide an emergency task scheduling method and device in a limited space that can guide task executors at suitable distances to select and execute the corresponding emergency tasks and realize dynamically optimized scheduling of the task executors executing the emergency tasks, so as to minimize the task executed durations of multiple emergency tasks.
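
The reinforcement learning framing summarized above (state = task executed durations of the N emergency tasks, action = the set of incentive values, objective = minimizing those durations) can be sketched as follows. Claim 7 names a DQN model; this sketch deliberately swaps in plain tabular Q-learning to stay short and dependency-free. The reward used here (negative sum of durations), the duration binning, and every name and constant are assumptions for illustration, since the reward function of claim 2 is not preserved in this text.

```python
import itertools
import random

# Hypothetical discrete incentive levels, one chosen per communication range segment.
INCENTIVE_LEVELS = (1.0, 2.0, 3.0)
N_SEGMENTS = 2

# An action is one incentive value per segment, e.g. (3.0, 1.0).
ACTIONS = list(itertools.product(INCENTIVE_LEVELS, repeat=N_SEGMENTS))


def discretize(durations, bin_width=5.0):
    """Map the N task executed durations (seconds) to a hashable state."""
    return tuple(int(d // bin_width) for d in durations)


def reward(durations):
    """ASSUMED reward: shorter total executed duration is better."""
    return -sum(durations)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of an action index for the next time step."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update (stand-in for the DQN of claim 7)."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(len(ACTIONS)))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)


if __name__ == "__main__":
    rng = random.Random(0)
    q = {}
    durations = [12.0, 7.5]            # hypothetical observed durations
    state = discretize(durations)
    action = choose_action(q, state, epsilon=0.1, rng=rng)
    # In a real system the environment would return new durations after the
    # communication devices broadcast ACTIONS[action]; here they are reused.
    q_update(q, state, action, reward(durations), state)
    print(ACTIONS[action], q)
```

A full implementation would replace the Q-table with a neural network and obtain the next state from the durations actually measured after the new incentive values are issued.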
A first aspect of an embodiment of the present application proposes an emergency task scheduling method in a limited space, applied to the scheduling of N emergency tasks of an emergency task management system in the limited space, where the emergency task management system includes: a plurality of communication devices, each of which is configured with at least one emergency task and is used for issuing the emergency task to movable task executors within its coverage range, issuing a plurality of incentive values, and receiving task execution results uploaded by the task executors; and task executors, each of which can choose to execute or exit the emergency task and, when executing the emergency task, uploads the task execution result and generates a task incentive, wherein the task incentive is determined based on the one incentive value corresponding to the communication distance between the task executor and the communication device.

The method comprises the following steps: obtaining the task executed duration of each emergency task at the current time step, wherein the task executed duration comprises the task execution duration of the task executor and the upload duration of the task execution result uploaded by the task executor, and the upload duration is related to the communication distance between the task executor and the communication device; taking the task executed durations of the N emergency tasks as the state in a reinforcement learning model, taking the plurality of incentive values as the action in the reinforcement learning model, setting the reward of the reinforcement learning model, and learning, based on the reinforcement learning model and with minimizing the task executed durations of the N emergency tasks as the objective function, a decision for setting the action of the next time step; and completing scheduling based on the decision of each time step.

The emergency task scheduling method in the limited space provided by the embodiment has at least the following advantages: In order to solve the problems, under th