CN-122022620-A - Transfer equipment cooperative scheduling method and system in steelmaking production, medium and terminal

Abstract

The application discloses a cooperative scheduling method and system for transfer equipment in steelmaking production, a medium and a terminal. It relates to the technical field of multi-objective cooperative scheduling and the field of steelmaking, and mainly aims to solve the problem that conventional scheduling optimization methods struggle to meet the cooperative scheduling requirements of multi-device, multi-constraint coupling. The method incorporates multiple kinds of transfer devices and multiple kinds of tasks into a multi-agent reinforcement learning framework and trains an action policy generation model in a simulation environment that integrates a hardware space model and constraint logic modeling, so that the scheduling instructions better fit the coupling characteristics of the ladle transportation system and cooperative scheduling capability improves. A transfer device interaction graph is constructed to represent the spatial adjacency and task coupling relationships between transfer devices; this acts as a communication mechanism, allowing the temporal behaviour information of surrounding transfer devices to be used in local decisions, which further improves cooperative scheduling capability. The scheduling instructions are then corrected using a conflict resolution strategy, which avoids conflicts and improves the executability of the scheduling instructions.
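The interaction graph described in the abstract can be illustrated with a minimal sketch. The `Device` fields, the one-dimensional track position, the `radius` threshold and the rule for adding an edge are all assumptions made for illustration; the application does not specify how adjacency or task coupling is computed.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical snapshot of one transfer device taken from the global state."""
    name: str
    kind: str                # "crane" or "trolley"
    position: float          # position along its track in metres (assumed 1-D)
    task_id: Optional[str]   # ladle transportation task currently assigned, if any


def build_interaction_graph(devices, radius=20.0):
    """Return an adjacency dict mapping each device name to its neighbours.

    Assumed edge rule: two devices are linked when they are of the same kind
    and within `radius` metres (spatial adjacency), or when they serve the
    same ladle transportation task (task coupling).
    """
    graph = {d.name: set() for d in devices}
    for a, b in combinations(devices, 2):
        spatially_adjacent = a.kind == b.kind and abs(a.position - b.position) <= radius
        task_coupled = a.task_id is not None and a.task_id == b.task_id
        if spatially_adjacent or task_coupled:
            graph[a.name].add(b.name)
            graph[b.name].add(a.name)
    return graph


devices = [
    Device("crane_1", "crane", 10.0, "T1"),
    Device("crane_2", "crane", 25.0, None),
    Device("trolley_1", "trolley", 80.0, "T1"),
    Device("trolley_2", "trolley", 200.0, None),
]
graph = build_interaction_graph(devices)
```

In this example `crane_1` is linked to `crane_2` by proximity and to `trolley_1` by the shared task, while `trolley_2` has no neighbours. A graph of this shape is what a per-node neighbourhood aggregation step would consume.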

Inventors

  • CHEN BAO
  • LUO XIAOCHUAN
  • SU QIANWEI
  • Nong Siyuan

Assignees

  • Northeastern University (东北大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-22

Claims (10)

  1. A cooperative scheduling method for transfer equipment in steelmaking production, characterized by comprising the following steps: during the steelmaking process, collecting global state information of a target steelmaking system at the current moment in real time; constructing a transfer device interaction graph for the current moment based on the current-moment global state information, wherein each node in the graph represents a transfer device and each edge represents an interaction relationship between transfer devices; for each transfer device, generating a current-moment preliminary scheduling instruction for the transfer device from the current-moment global state information and the current-moment transfer device interaction graph, based on the trained action policy generation model corresponding to that transfer device, wherein the action policy generation model is obtained by incorporating multiple kinds of transfer devices and multiple kinds of tasks into a multi-agent reinforcement learning framework and training in a simulation environment that integrates a hardware space model and constraint logic modeling; predicting, based on each current-moment preliminary scheduling instruction, the running trajectory of the corresponding transfer device over a preset future period to obtain a predicted running trajectory for each transfer device; and, when conflicts exist among the predicted running trajectories, correcting the current-moment preliminary scheduling instructions based on a conflict resolution strategy to obtain a current-moment conflict-free scheduling instruction for each transfer device, so as to control each transfer device to execute its ladle transportation task based on the current-moment conflict-free scheduling instructions, wherein the transfer equipment comprises overhead cranes and trolleys.
  2. The method of claim 1, wherein generating, for each transfer device, the current-moment preliminary scheduling instruction of the transfer device from the current-moment global state information and the current-moment transfer device interaction graph, based on the trained action policy generation model corresponding to that transfer device, comprises: for each transfer device, extracting the current-moment local observation state information corresponding to the transfer device from the current-moment global state information, and appending it to a historical observation state information sequence maintained by the transfer device to obtain a current-moment historical observation state information sequence, wherein the current-moment local observation state information comprises the transfer device's own state information at the current moment, state information of other observable transfer devices, state information of observable stations, and the ladle transportation task information of the transfer device; converting the current-moment historical observation state information sequence into a current-moment temporal embedding vector using a recurrent neural network layer of the action policy generation model; determining, in the current-moment transfer device interaction graph, the node corresponding to the transfer device and the neighbouring nodes of that node, and using a graph attention mechanism layer of the action policy generation model to perform weighted aggregation of the node's current-moment temporal embedding vector and the current-moment temporal embedding vectors of all its neighbouring nodes, obtaining a node feature vector for the node; and screening the transfer device's current-moment selectable actions from a preset action set corresponding to the transfer device based on the current-moment historical observation state information sequence, calculating a current-moment action value for each current-moment selectable action from the node feature vector using a local value evaluation layer of the action policy generation model, and taking the selectable action with the highest current-moment action value as the current-moment preliminary scheduling instruction of the transfer device.
  3. The method of claim 1, wherein correcting the current-moment preliminary scheduling instructions based on the conflict resolution strategy to obtain the current-moment conflict-free scheduling instruction of each transfer device comprises: if the conflict is a conflict between overhead cranes, determining the highest-priority overhead crane based on an overhead crane priority evaluation network, keeping the current-moment preliminary scheduling instruction of the highest-priority overhead crane unchanged, and correcting the current-moment preliminary scheduling instructions of the other overhead cranes into avoidance instructions to obtain the current-moment conflict-free scheduling instructions; if the conflict is a conflict between trolleys, determining a winning trolley based on trolley space-time resource reservation and bidding network requests, keeping the current-moment preliminary scheduling instruction of the winning trolley unchanged, and correcting the current-moment preliminary scheduling instructions of the other trolleys into avoidance instructions to obtain the current-moment conflict-free scheduling instructions.
  4. The method of claim 1, wherein, before generating, for each transfer device, the current-moment preliminary scheduling instruction of the transfer device from the current-moment global state information and the current-moment transfer device interaction graph based on the trained action policy generation model corresponding to that transfer device, the method further comprises: constructing a corresponding agent model for each transfer device, and constructing a global value evaluation model and a target global value evaluation model for the target steelmaking system, wherein each agent model comprises an action policy generation model and a target action policy generation model; in a simulation environment, acquiring global state information at the current simulation time and constructing a transfer device interaction graph for the current simulation time based on it; for each transfer device, extracting the corresponding current-simulation-time local observation state information from the current-simulation-time global state information, appending it to the maintained historical observation state information sequence to obtain a current-simulation-time historical observation state information sequence, and generating a current-simulation-time preliminary scheduling instruction from that sequence and the current-simulation-time transfer device interaction graph based on the corresponding action policy generation model; performing running trajectory prediction based on each current-simulation-time preliminary scheduling instruction and, when a conflict exists between predicted running trajectories, correcting the current-simulation-time preliminary scheduling instructions based on a conflict resolution strategy to obtain a current-simulation-time conflict-free scheduling instruction for each transfer device; controlling each transfer device to carry out its ladle transportation task based on the current-simulation-time conflict-free scheduling instructions, obtaining global state information at the next simulation time, and calculating a current-simulation-time global reward value; combining the current-simulation-time global state information, the current-simulation-time local observation state information set, the current-simulation-time conflict-free scheduling instruction set, the current-simulation-time global reward value, the next-simulation-time global state information, the next-simulation-time local observation state information set and the current-simulation-time conflict flag into a current-simulation-time steelmaking system vector, so as to generate a plurality of steelmaking system vectors in a loop; performing non-uniform probability sampling according to the current-simulation-time sampling priority of each steelmaking system vector to obtain a training sample set; for each sample in the training sample set, using the action policy generation model of each transfer device to calculate a current-simulation-time selectable-action evaluation value set for each transfer device from the current-simulation-time local observation state information set, and using the global value evaluation model to combine these sets nonlinearly, conditioned on the current-simulation-time global state information, into a current-simulation-time global action evaluation value; using the target action policy generation model of each transfer device to calculate a next-simulation-time selectable-action evaluation value set for each transfer device from the next-simulation-time local observation state information set, and combining these sets nonlinearly, conditioned on the next-simulation-time global state information, into a current-simulation-time global action expected value; and calculating a temporal-difference error loss value from the current-simulation-time global reward value, the current-simulation-time global action expected value and the current-simulation-time global action evaluation value, and updating model parameters by gradient descent based on the temporal-difference error loss value until model training is complete.
  5. The method of claim 4, wherein, before constructing a corresponding agent model for each transfer device and constructing a global value evaluation model and a target global value evaluation model for the target steelmaking system, the method further comprises: acquiring layout information of the target steelmaking system and constructing a hardware space model based on the layout information, wherein the layout information comprises station layout information, overhead crane and overhead crane track layout information, and trolley track layout information; acquiring a production plan, generating a ladle transportation task sequence based on the production plan, and configuring ladle process parameters for each ladle transportation task; constructing constraint logic, wherein the constraint logic comprises ladle temperature constraint logic, ladle waiting time constraint logic and station capacity constraint logic; constructing cost calculation logic; and embedding the ladle transportation task sequence, the constraint logic and the cost calculation logic into the hardware space model to obtain a simulation environment, so that steelmaking system vectors can be generated in a loop in the simulation environment.
  6. The method of claim 4, wherein the global reward function is expressed by a formula [formula and symbols not reproduced in the source text] defined over the following quantities: the global reward value at time t; the task delay penalty increment; the increment in the running time or distance of the transfer devices; the comprehensive cost increment covering time-varying energy consumption, temperature-drop overrun risk and maintenance cost; the number and severity of actual or serious potential conflicts occurring at time t; and weighting coefficients.
  7. The method according to any one of claims 1-6, further comprising: constructing a primary simulation environment and performing primary model training in the primary simulation environment to obtain a plurality of trained primary models, wherein the primary models comprise a plurality of primary action policy generation models, a primary global value evaluation model and a primary conflict resolution strategy; constructing an intermediate simulation environment, taking the parameters of each primary model as initial values, and performing intermediate model training in the intermediate simulation environment to obtain a plurality of trained intermediate models; and constructing an advanced simulation environment, taking the parameters of each intermediate model as initial values, and performing advanced model training in the advanced simulation environment to obtain a plurality of trained action policy generation models, global value evaluation models and conflict resolution strategies, wherein the scale of the advanced simulation environment matches that of the target steelmaking system and is larger than that of the intermediate simulation environment, and the scale of the intermediate simulation environment is larger than that of the primary simulation environment.
  8. A cooperative scheduling system for transfer equipment in steelmaking production, comprising: a global state information acquisition module, configured to collect global state information of a target steelmaking system at the current moment in real time during the steelmaking process; a transfer device interaction graph construction module, configured to construct a transfer device interaction graph for the current moment based on the current-moment global state information, wherein each node in the graph represents a transfer device and each edge represents an interaction relationship between transfer devices; a preliminary scheduling instruction generation module, configured to generate, for each transfer device, a current-moment preliminary scheduling instruction for the transfer device from the current-moment global state information and the current-moment transfer device interaction graph, based on the trained action policy generation model corresponding to that transfer device, wherein the action policy generation model is obtained by incorporating multiple kinds of transfer devices and multiple kinds of tasks into a multi-agent reinforcement learning framework and training in a simulation environment that integrates a hardware space model and constraint logic modeling; and a conflict resolution module, configured to predict, based on each current-moment preliminary scheduling instruction, the running trajectory of the corresponding transfer device over a preset future period to obtain a predicted running trajectory for each transfer device, to correct the current-moment preliminary scheduling instructions based on a conflict resolution strategy when conflicts exist among the predicted running trajectories so as to obtain a current-moment conflict-free scheduling instruction for each transfer device, and to control each transfer device to execute its ladle transportation task based on the current-moment conflict-free scheduling instructions, wherein the transfer equipment comprises overhead cranes and trolleys.
  9. A storage medium storing at least one executable instruction, wherein the executable instruction causes a processor to perform operations corresponding to the cooperative scheduling method for transfer equipment in steelmaking production according to any one of claims 1-7.
  10. A terminal comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus; the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the cooperative scheduling method for transfer equipment in steelmaking production according to any one of claims 1-7.
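The predict-then-correct step recited in claims 1 and 3 can be sketched for overhead cranes sharing one rail. This is only an illustration under stated assumptions: instructions are modelled as target velocities, trajectories use a constant-velocity model, the avoidance instruction is "hold position" (velocity 0.0), and fixed priority scores stand in for the priority evaluation network named in claim 3. All function and variable names are hypothetical.

```python
def predict_positions(position, velocity, horizon, dt=1.0):
    """Constant-velocity prediction of future positions (assumed motion model)."""
    steps = int(horizon / dt)
    return [position + velocity * dt * k for k in range(1, steps + 1)]


def find_conflicts(trajectories, safety_gap):
    """Return pairs of devices whose predicted positions come closer than
    `safety_gap` at the same time step."""
    names = sorted(trajectories)
    conflicts = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs = zip(trajectories[a], trajectories[b])
            if any(abs(pa - pb) < safety_gap for pa, pb in pairs):
                conflicts.append((a, b))
    return conflicts


def resolve(instructions, conflicts, priority, avoid=0.0):
    """Keep the higher-priority device's instruction in each conflicting pair
    and replace the other device's instruction with the avoidance instruction."""
    corrected = dict(instructions)
    for a, b in conflicts:
        loser = a if priority[a] < priority[b] else b
        corrected[loser] = avoid
    return corrected


positions = {"crane_1": 0.0, "crane_2": 30.0}      # metres along the shared rail
preliminary = {"crane_1": 2.0, "crane_2": -2.0}    # preliminary instructions, m/s
priority = {"crane_1": 0.9, "crane_2": 0.4}        # stand-in for a learned score

trajectories = {n: predict_positions(positions[n], preliminary[n], horizon=10.0)
                for n in positions}
conflicts = find_conflicts(trajectories, safety_gap=5.0)
final = resolve(preliminary, conflicts, priority)

# Re-predict with the corrected instructions to check the conflict is gone.
rechecked = {n: predict_positions(positions[n], final[n], horizon=10.0)
             for n in positions}
remaining = find_conflicts(rechecked, safety_gap=5.0)
```

Here the two cranes approach each other and are predicted to come within 5 m of each other, so the lower-priority `crane_2` is told to hold position. Holding position does not remove every conflict in general, which is why the sketch re-runs the prediction on the corrected instructions.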

Description

Transfer equipment cooperative scheduling method and system in steelmaking production, medium and terminal

Technical Field

The application relates to the technical field of multi-objective cooperative scheduling and the field of steelmaking, and in particular to a cooperative scheduling method and system for transfer equipment in steelmaking production, a medium and a terminal.

Background

The steelmaking process generally comprises several process stages, such as a converter (or electric furnace), a refining furnace, a tundish and a continuous caster, and molten steel has to be transferred between these fixed process units in ladles. The ladle transportation system generally consists of several overhead cranes mounted on overhead rails and several trolleys running on ground rails, which together form a multi-dimensional logistics network. The scheduling efficiency of this network directly determines the continuity of the production process, the stability of the molten steel temperature, and the overall energy consumption and economy. At present, most existing scheduling optimization methods treat the overhead cranes as the sole scheduling core and regard the ground trolleys as a fixed delay. Because key constraints are ignored, such as the limited number of trolleys, congestion of the track network, and the precise spatio-temporal handover between overhead cranes and trolleys at lifting points, the resulting "optimal" schedule often cannot be carried out in practice owing to trolley conflicts or handover waiting. Its executability is poor, and it struggles to meet the cooperative scheduling requirements of multi-device, multi-constraint coupling.

Disclosure of Invention

In view of the above, the application provides a cooperative scheduling method and system for transfer equipment in steelmaking production, a medium and a terminal, with the aim of solving the problem that conventional scheduling optimization methods struggle to meet the cooperative scheduling requirements of multi-device, multi-constraint coupling.

According to one aspect of the application, there is provided a cooperative scheduling method for transfer equipment in steelmaking production, comprising: during the steelmaking process, collecting global state information of a target steelmaking system at the current moment in real time; constructing a transfer device interaction graph for the current moment based on the current-moment global state information, wherein each node in the graph represents a transfer device and each edge represents an interaction relationship between transfer devices; for each transfer device, generating a current-moment preliminary scheduling instruction for the transfer device from the current-moment global state information and the current-moment transfer device interaction graph, based on the trained action policy generation model corresponding to that transfer device, wherein the action policy generation model is obtained by incorporating multiple kinds of transfer devices and multiple kinds of tasks into a multi-agent reinforcement learning framework and training in a simulation environment that integrates a hardware space model and constraint logic modeling; predicting, based on each current-moment preliminary scheduling instruction, the running trajectory of the corresponding transfer device over a preset future period to obtain a predicted running trajectory for each transfer device; and, when conflicts exist among the predicted running trajectories, correcting the current-moment preliminary scheduling instructions based on a conflict resolution strategy to obtain a current-moment conflict-free scheduling instruction for each transfer device, so as to control each transfer device to execute its ladle transportation task based on the current-moment conflict-free scheduling instructions, wherein the transfer equipment comprises overhead cranes and trolleys.

According to another aspect of the application, there is provided a cooperative scheduling system for transfer equipment in steelmaking production, comprising: a global state information acquisition module, configured to collect global state information of a target steelmaking system at the current moment in real time during the steelmaking process; a transfer device interaction graph construction module, configured to construct a transfer device interaction graph for the current moment based on the current-moment global state information, wherein each node in the graph represents a transfer device and each edge represents an interaction relationship between transfer devices; a preliminary scheduling instruction generation module, configured to, for each transfer device, based on an action policy generation model of completed m