CN-122022393-A - Scheduling method for port railway shunting locomotives


Abstract

The invention relates to the technical field of port railway dispatching and specifically discloses a scheduling method for port railway shunting locomotives. The method comprises: acquiring real-time state information of the port railway; inputting the real-time state information into a pre-trained port railway shunting locomotive scheduling model to obtain real-time scheduling instructions, wherein the pre-trained model is obtained by extracting global spatial topological features from a port full-scene heterogeneous graph observation model, performing individual temporal memory enhancement based on the extracted global spatial topological features to obtain a policy network, and performing optimization training on the policy network in combination with a reinforcement learning reward mechanism; and sending the real-time scheduling instructions to the corresponding shunting locomotives so that the shunting locomotives operate according to those instructions. The method can improve the turnover efficiency of railway wagons.

Inventors

ZHAO BING
YAO ZHANPENG
HAN LIANG
WANG YANDAI
SHAN HONGFENG
ZHANG XISHU
LU XIANGCHAO
WANG XIANG
XU HAIYANG
ZHANG JIANYONG
WANG XIAOWEI

Assignees

山东港口日照港集团有限公司
江苏集萃清联智控科技有限公司
中车株洲电力机车研究所有限公司

Dates

Publication Date: 20260512
Application Date: 20260410

Claims (10)

1. A scheduling method for port railway shunting locomotives, characterized by comprising the following steps: acquiring real-time state information of the port railway, wherein the real-time state information at least comprises port railway position attribute information, shunting locomotive attribute information, task attribute information, classification channel attribute information and operation train attribute information; inputting the real-time state information into a pre-trained port railway shunting locomotive scheduling model to obtain a real-time scheduling instruction, wherein the pre-trained model is obtained by extracting global spatial topological features from a port full-scene heterogeneous graph observation model, performing individual temporal memory enhancement based on the extracted global spatial topological features to obtain a policy network, and performing optimization training on the policy network in combination with a reinforcement learning reward mechanism; and sending the real-time scheduling instruction to the corresponding shunting locomotive so that the shunting locomotive operates according to the real-time scheduling instruction.
2. The scheduling method for port railway shunting locomotives according to claim 1, wherein obtaining the pre-trained port railway shunting locomotive scheduling model by extracting global spatial topological features from the port full-scene heterogeneous graph observation model, performing individual temporal memory enhancement based on the extracted global spatial topological features to obtain a policy network, and performing optimization training on the policy network in combination with a reinforcement learning reward mechanism comprises: constructing the port full-scene heterogeneous graph observation model, and constructing from it a global state vector describing macroscopic environmental features, wherein the port full-scene heterogeneous graph observation model at least comprises heterogeneous nodes abstracted from the physical entities, logical objects and operation subjects in the port railway, and a graph structure describing the relations among the heterogeneous nodes; extracting global spatial topological features from the port full-scene heterogeneous graph observation model; constructing an individual temporal memory enhancement module according to the global spatial topological features and the global state vector describing macroscopic environmental features, so as to obtain an individual temporal enhancement feature vector; constructing a multi-semantic fusion vector according to the individual temporal enhancement feature vector and the global spatial topological features, and performing action scoring on the multi-semantic fusion vector to obtain a policy network; and performing optimization training on the policy network according to the reinforcement learning reward mechanism to obtain the pre-trained port railway shunting locomotive scheduling model.
3. The scheduling method for port railway shunting locomotives according to claim 2, wherein constructing the port full-scene heterogeneous graph observation model comprises: abstracting the physical entities, logical objects and operation subjects in the port railway into heterogeneous nodes, wherein the heterogeneous nodes comprise position nodes, shunting locomotive nodes, task nodes, classification channel nodes and operation train nodes; and establishing, according to the heterogeneous nodes, a set of spatial topology and logical association edges between port entities, wherein the edge set comprises a physical connection relation, a position mapping relation and a task association relation, the physical connection relation represents the topological connectivity between port railway stations and wharfs, the position mapping relation represents the real-time distribution of dynamic entities in the road network, and the task association relation represents the membership of transportation, return and recombination tasks to their associated operation trains.
4. The scheduling method for port railway shunting locomotives according to claim 2, wherein extracting global spatial topological features from the port full-scene heterogeneous graph observation model comprises: projecting the heterogeneous nodes into a unified hidden feature space to obtain initial node embedding vectors; computing, with a meta-relation-based heterogeneous multi-head attention mechanism, the importance score of a source node to a target node under a specific semantic relation; performing weighted aggregation and message synthesis on cross-type neighbor information to generate a comprehensive heterogeneous context representation of each node; and updating the node embeddings through residual connection and layer normalization, so that multi-hop neighbor information evolves layer by layer.
5. The scheduling method for port railway shunting locomotives according to claim 4, wherein constructing an individual temporal memory enhancement module according to the global spatial topological features and the global state vector describing macroscopic environmental features, so as to obtain an individual temporal enhancement feature vector, comprises: fusing the initial node embedding vector with the global state vector describing macroscopic environmental features to obtain a shunting locomotive decision input vector; screening and retaining historical operation logic with the reset gate and update gate mechanism of a gated recurrent unit (GRU) to obtain the individual temporal enhancement module; and inputting the shunting locomotive decision input vector into the individual temporal enhancement module to obtain the individual temporal enhancement feature vector.
6. The scheduling method for port railway shunting locomotives according to claim 5, wherein screening and retaining historical operation logic with the reset gate and update gate mechanism of the GRU to obtain the individual temporal enhancement module comprises: determining the hidden state of each shunting locomotive; screening and retaining the historical operation logic of the shunting locomotive with the reset gate and the update gate of the GRU, wherein the reset gate determines the proportion of historical information to be forgotten, and the update gate determines the proportion in which the historical state is blended with new information at the current moment; filtering the hidden state of the shunting locomotive at the previous moment with the reset gate, and generating candidate memory information at the current moment in combination with the input information at the current moment; and obtaining the individual temporal enhancement module of the shunting locomotive by using the update gate to perform weighted fusion of the hidden state of the shunting locomotive at the previous moment and the candidate memory information at the current moment.
7. The scheduling method for port railway shunting locomotives according to claim 2, wherein constructing a multi-semantic fusion vector according to the individual temporal enhancement feature vector and the global spatial topological features, and performing action scoring on the multi-semantic fusion vector to obtain a policy network, comprises: for each candidate action, constructing a multi-semantic fusion vector from the individual temporal enhancement feature vector, the action type vector and the task node embedding vector; dynamically generating a list of currently legal candidate actions; inputting the multi-semantic fusion vectors and the legal candidate action list into a multi-layer perceptron for unified action scoring, to obtain the raw scores of all legal candidate actions; and obtaining the policy network of the shunting locomotive under a unified action space according to the raw scores of all legal candidate actions.
8. The scheduling method for port railway shunting locomotives according to claim 7, wherein dynamically generating the list of currently legal candidate actions comprises: searching the road network topology for all station and wharf position nodes reachable from the node where the current shunting locomotive is located, and taking these position nodes as candidate parameters of the movement action; and retrieving all feasible tasks at the current position node and taking them as candidate parameters of the execution action, wherein the feasible tasks comprise ready tasks and non-ready tasks that meet the reservation conditions.
9. The scheduling method for port railway shunting locomotives according to claim 8, wherein obtaining the policy network of the shunting locomotive under a unified action space according to the raw scores of all legal candidate actions comprises: performing type-level and parameter-level mask control according to the legal candidate action list, so as to mask the raw scores of illegal actions; normalizing the masked raw scores of the candidate actions with a Softmax operator to obtain the policy distribution of the current shunting locomotive over the unified action space; and, if the current decision action is a movement, calculating the waiting time and the track occupation path according to the current occupation of the track road network, wherein this calculation comprises traversing the planned paths of the other shunting locomotives in the road network and mapping each track edge to a set of time windows according to train length and safety distance, and performing a space-time A* path search in the state space, with the DRL time step as the unit, to obtain the number of in-place waiting actions and a conflict-free path, wherein the number of in-place waiting actions is the waiting time required before the movement action starts.
10. The scheduling method for port railway shunting locomotives according to claim 2, wherein performing optimization training on the policy network according to the reinforcement learning reward mechanism comprises: constructing action rewards based on job progress and wagon contribution; and iteratively training the policy network under a multi-agent proximal policy optimization (MAPPO) framework.
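
As a non-limiting illustration of the gate mechanism recited in claim 6 and the masked Softmax normalization recited in claim 9, the following minimal Python sketch shows one possible reading. The weight names (`Wz`, `Uz`, `Wr`, `Ur`, `Wh`, `Uh`), the omission of bias terms, and the convention that the update gate weights the previous hidden state are assumptions made for illustration; the claims do not specify them.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, p):
    """One GRU update of a single shunting locomotive's hidden state.

    x      -- decision input vector at the current moment
    h_prev -- hidden state at the previous moment
    p      -- dict of hypothetical weight matrices (biases omitted)
    """
    # Update gate: how the previous state and the candidate memory are blended.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev))]
    # Reset gate: how much of the previous state survives into the candidate.
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h_prev))]
    # Candidate memory from the current input and the reset-filtered state.
    filtered = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], filtered))]
    # Assumed convention: z weights the previous state, (1 - z) the candidate.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, cand)]


def masked_softmax(scores, legal):
    """Turn raw action scores into a distribution over legal actions only."""
    if not any(legal):
        raise ValueError("at least one candidate action must be legal")
    # Illegal actions receive a score of -inf, hence probability zero.
    masked = [s if ok else -math.inf for s, ok in zip(scores, legal)]
    top = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `masked_softmax([1.0, 2.0, 3.0], [True, False, True])` assigns zero probability to the second (illegal) action and renormalizes over the remaining two. With all-zero weight matrices, both gates in `gru_step` equal 0.5 and the candidate memory is zero, so the hidden state is simply halved.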

Description

Scheduling method for port railway shunting locomotives

Technical Field

The invention relates to the technical field of port railway dispatching, and in particular to a scheduling method for port railway shunting locomotives.

Background

In modern port collection and distribution systems, the port railway is the core link connecting national trunk railways with the port operation areas, and it carries large-scale cargo collection and transportation tasks. With the advance of smart port construction, port railway operations are gradually shifting toward automated and unmanned operation. The shunting locomotive (hereinafter referred to as the shunter) is the key motive power for hauling, breaking up and marshalling trains and for transferring them between nodes within the port, and its scheduling efficiency directly determines the turnover speed of wagons in the port and the smoothness of overall logistics. Unlike the operation of a simple marshalling yard, the port railway operation flow has pronounced full-process, multi-node characteristics. The process generally comprises: hump break-up after an inbound train arrives at the port station; hauling of the wagons by a shunter to each wharf operation area; return of the wagons to the station by a shunter after loading and unloading at the wharf is complete; and finally a second hump break-up followed by departure from the port. In this process, the shunter is responsible not only for hump classification of freight wagons (hereinafter referred to as wagons) within the station, but also for long-distance transfer tasks between the station and the dispersed wharf nodes. However, existing port railway dispatching modes face many challenges in coping with such complex scenarios.
Firstly, the operation links are tightly coupled: the arrival time of inbound trains, the break-up capacity of the hump and the loading and unloading progress of the wharf operation areas constrain one another, and a fluctuation in any link dynamically affects the task allocation of the shunting locomotives. Secondly, shunting locomotive resources are limited and task types are varied, so complex trade-offs must be made between executing hump break-up and executing inter-node transfer. At present, conventional scheduling relies on manual experience or rule-based static strategies (such as first-come, first-served), which makes it difficult to coordinate the cooperation between the station and the wharfs from a global perspective. Such localized dispatching readily increases the empty running of shunting locomotives between operation nodes and causes poor hand-over between operations, so that the average dwell time of trains in the port becomes excessive and the overall operating efficiency of the port railway collection and distribution system is severely restricted.
The mature port railway scheduling methods at present fall mainly into mathematical programming methods based on operations research models and general deep reinforcement learning methods, but both have shortcomings in handling the heterogeneity, temporal dependence and dynamic uncertainty specific to port railways: 1) mathematical programming methods based on operations research models can in theory obtain a globally optimal solution, but their computational complexity grows exponentially with the scale of the road network and they depend heavily on a deterministic operation schedule, so they cope poorly with uncertain factors such as fluctuations in wharf operation time and sudden faults, and the resulting scheduling scheme has poor real-time performance and easily becomes invalid in a dynamic environment; 2) general deep reinforcement learning (DRL) methods are capable of online decision-making, but they struggle to extract the complex non-Euclidean road network features of the port railway effectively, face sparse rewards and difficult convergence in long-sequence decisions, and lack a complete task reservation mechanism, which easily leads to conflicts or coordination failures among multiple shunting locomotives. Therefore, how to allocate break-up, transfer, and pick-up and delivery tasks among multiple automated shunting locomotives so as to improve wagon turnover efficiency is a technical problem to be solved by those skilled in the art.
Disclosure of Invention

The invention provides a scheduling method for port railway shunting locomotives, which solves the problem in the related art that break-up, transfer, and pick-up and delivery tasks cannot be effectively and dynamically allocated among multiple automated shunting locomotives, so that the turn