CN-121998359-A - Intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction

Abstract

The invention relates to an intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction. Through the coordinated operation of a six-layer architecture, it achieves autonomous learning of scheduling strategies, early prediction of production risks, and human-machine collaborative optimization. The architecture comprises a data acquisition and fusion layer, a micro-service business core layer, a deep learning engine layer, a scheduling execution layer, a visual decision layer and a system guarantee layer. The data acquisition and fusion layer serves as the foundation and aggregates multi-source production data; the deep learning engine layer serves as the core and performs scheduling decisions and risk prediction; the scheduling execution layer serves as the bridge and converts AI decisions into production instructions; and the visual decision layer enables human-machine collaboration. Finally, feedback data drives continuous optimization of the models in the deep learning engine layer, forming a self-evolving intelligent closed loop. The micro-service business core layer provides the necessary business data standardization function, and the system guarantee layer ensures stable operation of the AI services. The invention is applicable to the technical field of manufacturing production scheduling.

Inventors

Lin Tiaoda
Xiong Shenghui
Liu Xinyu

Assignees

长园智能装备(广东)有限公司

Dates

Publication Date: 20260508
Application Date: 20260128

Claims (9)

1. An intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction, characterized by comprising a data acquisition and fusion layer, a micro-service business core layer, a deep learning engine layer, a scheduling execution layer, a visual decision layer and a system guarantee layer, wherein the data acquisition and fusion layer serves as the foundation and aggregates multi-source production data; the deep learning engine layer serves as the core and performs scheduling decisions and risk prediction; the scheduling execution layer serves as the bridge and converts AI decisions into production instructions; human-machine collaboration is achieved through the visual decision layer; finally, feedback data drives continuous optimization of the models in the deep learning engine layer, forming a self-evolving intelligent closed loop; the micro-service business core layer provides the necessary business data standardization function; and the system guarantee layer ensures stable operation of the AI services.
2. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction according to claim 1, wherein the deep learning engine layer comprises a scheduling policy optimization sub-module, a production risk prediction sub-module and an AI decision interpretation sub-module; the scheduling policy optimization sub-module performs multi-objective dynamic decision-making based on a deep reinforcement learning (DRL) model; the production risk prediction sub-module performs time sequence prediction and linkage based on an LSTM + attention risk prediction model; the AI decision interpretation sub-module analyzes and explains the logic of the AI scheduling scheme; and the three sub-modules work in close coordination to provide the integrated functions of dynamic scheduling, risk prediction and decision interpretation.
3. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction as set forth in claim 2, wherein the scheduling policy optimization sub-module is implemented as follows: A. algorithm selection and division of labor: a hybrid 'PPO (proximal policy optimization) + DQN (deep Q-network)' DRL architecture is adopted, with a division-of-labor mechanism in which PPO is responsible for decisions in the continuous action space and DQN is responsible for decisions in the discrete action space; B. model training design: 1) training sample construction: historical production scheduling data of the past 3 years are extracted from HDFS, each sample comprising 'input features (equipment load, order information and material inventory at that moment) + scheduling scheme (process assignment result at that moment) + execution result (on-time delivery rate and equipment load balance rate at that moment)', and more than 100,000 valid samples are constructed; 2) state space definition: the state space comprises equipment states, order attributes, material inventory and process constraints; 3) reward function design: a multi-objective reward function is calculated with 'on-time delivery rate, equipment load balance rate and disturbance response speed' as its core; 4) training process: an 'offline pre-training + online fine-tuning' mode based on the TensorFlow framework is adopted, ensuring that the model adapts to the latest production scenarios; C. model deployment and inference: the trained model is deployed as an inference service through TensorFlow Serving (which supports hot model updates without downtime), takes real-time data in Redis as input, outputs a scheduling scheme, and transmits it directly to the micro-service business core layer.
4. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction as set forth in claim 3, wherein in the model training design step 3), the specific weights and calculation of the reward function are as follows: reward value = on-time delivery rate × 40% + equipment load balance rate × 30% + disturbance response speed × 30%, wherein the on-time delivery rate is 1 when the actual delivery time is no later than the required delivery date and is reduced by 0.2 for each day of delay; the equipment load balance rate is higher the smaller the standard deviation of the load rates of all equipment (maximum 1, minimum 0); and the disturbance response speed is 1 when the time taken to adjust the schedule after a rush-order insertion or a fault is less than or equal to 100 ms and is reduced by 0.1 for each additional 50 ms.
5. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction as set forth in claim 2, wherein the production risk prediction sub-module is implemented as follows: D. rationale for algorithm selection: production risks (equipment overload and material shortage) are correlated over time (when the equipment load rate exceeds 80% for 3 consecutive hours, the overload risk within the following 12 hours rises sharply), and LSTM (long short-term memory network) is good at capturing long-term dependencies in time-series data and avoids the vanishing-gradient problem of the traditional RNN; E. risk types and feature inputs: the sub-module focuses on three risks, namely equipment overload, material shortage and process delay; F. prediction and linkage logic: 1) risk prediction: the model outputs a 'risk probability', where a probability greater than or equal to 70% is judged 'high risk', 50% to 70% is 'medium risk', and less than 50% is 'low risk'; 2) linkage: when the result is 'high risk', the scheduling policy optimization sub-module is automatically triggered to recalculate, whereas a 'medium risk' result only sends an early warning to the visual decision layer for manual reference.
6. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction as set forth in claim 2, wherein the AI decision interpretation sub-module is implemented as follows: G. interpretation dimensions and methods: 1) feature contribution analysis, in which the contribution ratio of each feature is calculated using the SHAP algorithm; 2) natural-language conversion, in which the analysis result is converted into a plain-language description; 3) comparative explanation, in which the reasons why alternative schemes were not selected are explained; H. output mode: the interpretation result is transmitted to the visual decision layer together with the scheduling scheme, and clicking any process in the scheduling Gantt chart displays the corresponding AI decision interpretation.
7. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction according to claim 1, wherein the data acquisition and fusion layer acquires data through the MQTT/OPC UA protocols, transmits the data through Kafka, and, after cleaning, stores the data in tiers in Redis (real-time data for inference) and HDFS (historical data for training), ensuring that the data are complete, accurate and timely; the micro-service business core layer builds a micro-service infrastructure based on Spring Cloud Alibaba and develops system management services, basic data services, low-code services and the like, achieving elastic deployment and efficient cooperation of business modules; and the scheduling execution layer converts AI decisions into equipment-executable instructions through a standardized API, issues the instructions to the various data exchange modules, and returns execution progress and abnormal data in real time.
8. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction as set forth in claim 1, wherein the visual decision layer is implemented as follows: I. visualization: a production scheduling Gantt chart and a risk early-warning heat map are built with ECharts to display the scheduling scheme and risk warnings intuitively; J. human-machine collaborative intervention: managers are supported in adjusting parameters and modifying schemes so as to intervene manually.
9. The intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction according to claim 1, wherein the system guarantee layer achieves containerized deployment and elastic scaling of the core AI services through Kubernetes, uses a Prometheus + Grafana monitoring platform to track performance indicators of the model services (inference latency and resource usage) in real time, and performs full-link log analysis with the ELK (Elasticsearch, Logstash, Kibana) technology stack, ensuring long-term stable operation of the system and a rapid fault recovery capability.
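
The reward calculation recited in claim 4 can be illustrated with a minimal Python sketch. The weights (40% / 30% / 30%), the 0.2-per-day delivery penalty and the 0.1-per-50-ms response penalty come from the claim. Everything else is an assumption made for illustration: the function names, the floor of 0 on the delivery and response scores, the treatment of a partly elapsed 50 ms step as a full step, and the mapping from standard deviation to a balance score (the claim only says that a smaller standard deviation scores higher, between 0 and 1).

```python
import math
from statistics import pstdev


def delivery_score(days_late: float) -> float:
    """1.0 when delivered on time; minus 0.2 per day of delay (floor at 0 is assumed)."""
    if days_late <= 0:
        return 1.0
    return max(0.0, 1.0 - 0.2 * days_late)


def balance_score(load_rates: list[float]) -> float:
    """Higher when equipment load rates are closer together.

    Assumed mapping: load rates lie in [0, 1], so their population standard
    deviation is at most 0.5; the score falls linearly from 1 (std = 0) to 0 (std = 0.5).
    """
    spread = pstdev(load_rates)
    return min(1.0, max(0.0, 1.0 - spread / 0.5))


def response_score(adjust_ms: float) -> float:
    """1.0 when rescheduling takes <= 100 ms; minus 0.1 per additional 50 ms.

    Assumptions: a started 50 ms step counts as a full step, and the score is floored at 0.
    """
    if adjust_ms <= 100:
        return 1.0
    steps = math.ceil((adjust_ms - 100) / 50)
    return max(0.0, 1.0 - 0.1 * steps)


def reward(days_late: float, load_rates: list[float], adjust_ms: float) -> float:
    """Weighted sum from claim 4: 40% delivery, 30% load balance, 30% response."""
    return (
        0.4 * delivery_score(days_late)
        + 0.3 * balance_score(load_rates)
        + 0.3 * response_score(adjust_ms)
    )
```

Under these assumptions, an order delivered one day late on two machines loaded at 60% and 80%, with a 150 ms rescheduling time, scores 0.8, 0.8 and 0.9 on the three components, giving a reward of roughly 0.83.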

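The prediction and linkage logic recited in claim 5 can be sketched in the same way. The thresholds (70% and 50%) and the two reactions (recalculation on high risk, early warning on medium risk) come from the claim. The names `classify_risk` and `dispatch_risk` and the two callback parameters are hypothetical stand-ins for the scheduling policy optimization sub-module and the visual decision layer. Whether a high-risk result also raises a warning is not stated in the claim, so this sketch only triggers the recalculation.

```python
from typing import Callable

HIGH_RISK_THRESHOLD = 0.70    # >= 70% is "high risk" (claim 5)
MEDIUM_RISK_THRESHOLD = 0.50  # 50% up to 70% is "medium risk"; below 50% is "low risk"


def classify_risk(probability: float) -> str:
    """Map a predicted risk probability in [0, 1] to a risk level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= HIGH_RISK_THRESHOLD:
        return "high"
    if probability >= MEDIUM_RISK_THRESHOLD:
        return "medium"
    return "low"


def dispatch_risk(
    probability: float,
    reschedule: Callable[[], None],
    warn: Callable[[float], None],
) -> str:
    """Apply the linkage rule and return the risk level.

    reschedule: hypothetical hook into the scheduling policy optimization sub-module.
    warn: hypothetical hook that pushes an early warning to the visual decision layer.
    """
    level = classify_risk(probability)
    if level == "high":
        reschedule()       # high risk: trigger recalculation automatically
    elif level == "medium":
        warn(probability)  # medium risk: warn only, for manual reference
    return level
```

Passing the two reactions as callbacks keeps the threshold logic independent of how the other layers are actually reached, which the claims do not specify.
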
Description

Intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction

Technical Field

The invention relates to the technical field of manufacturing production scheduling, in particular to an intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction.

Background

As intelligent manufacturing advances, manufacturing enterprises face the challenge of transforming to 'multi-variety, small-lot, short-run, high-disturbance' production modes. With frequent order fluctuations, numerous resource constraints and complex, changeable production environments, enterprises place higher demands on the intelligence, real-time performance and flexibility of production scheduling systems. At present, mainstream production scheduling systems in the manufacturing industry have the following inherent limitations in their technical implementation.

In terms of core algorithms, most systems are based on classical operations research methods, such as heuristic rules (e.g., greedy algorithms), linear programming, or meta-heuristic algorithms (e.g., genetic algorithms). These methods rely on predefined fixed rules and weights to handle multi-objective optimization problems such as delivery delay and resource load, so their scheduling logic is static and deterministic.

In terms of system architecture, mainstream solutions generally adopt a monolithic or traditional layered architecture in which functional modules such as order management, resource scheduling and the scheduling engine are tightly coupled. Such an architecture makes the system difficult to maintain and hard to extend: adding a new scheduling strategy or connecting a new data source often requires refactoring code, development cycles are long, costs are high, and high-frequency real-time data is difficult to process.
Meanwhile, the overall architecture of such systems is highly complex and the dependencies among modules are intricate, which makes maintenance and upgrading extremely difficult. More importantly, traditional systems generally lack deep learning capability: they cannot automatically extract features and patterns from massive production data and can only make decisions according to manually set rules, which severely limits the intelligence and adaptability of the system.

In terms of risk handling, existing solutions rely mainly on alarms based on static thresholds or on inspection based on manual experience. This approach can only respond passively after a problem has occurred and cannot predict or actively intervene in potential risks such as equipment overload and material shortage, which often leads to production interruptions and delayed order delivery.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the defects of the prior art and to provide an intelligent scheduling system architecture integrating deep reinforcement learning and time sequence prediction, in which autonomous learning of scheduling strategies, early prediction of production risks and human-machine collaborative optimization are achieved through the coordinated operation of a six-layer architecture.
The technical solution adopted by the invention is as follows. The architecture comprises a data acquisition and fusion layer, a micro-service business core layer, a deep learning engine layer, a scheduling execution layer, a visual decision layer and a system guarantee layer. The data acquisition and fusion layer serves as the foundation and aggregates multi-source production data; the deep learning engine layer serves as the core and performs scheduling decisions and risk prediction; the scheduling execution layer serves as the bridge and converts AI decisions into production instructions; and human-machine collaboration is achieved through the visual decision layer. Finally, feedback data drives continuous optimization of the models in the deep learning engine layer, forming a self-evolving intelligent closed loop. The micro-service business core layer provides the necessary business data standardization function, and the system guarantee layer ensures stable operation of the AI services.

Further, the deep learning engine layer comprises a scheduling policy optimization sub-module, a production risk prediction sub-module and an AI decision interpretation sub-module. The scheduling policy optimization sub-module performs multi-objective dynamic decision-making based on a deep reinforcement learning (DRL) model; the production risk prediction sub-module performs time sequence prediction and linkage based on an LSTM + attention risk prediction model; and the AI decision interpretation sub-module analyzes and explains the logic of the AI scheduling scheme. The three sub-modules work in close coordination to provide the integrated functions of dynamic scheduling, risk prediction and decision interpretation.

Further, the technique of the scheduling