CN-116261690-B - Computer system and method for providing operating instructions for blast furnace thermal control


Abstract

A computer system (100), a computer-implemented method, and a computer program product are provided for training a reinforcement learning model (130) to provide operating instructions for blast furnace thermal control. A domain-adaptive machine learning model (110) generates a first domain-invariant dataset (22) from historical operating data (21) obtained as multivariate time series and reflecting thermal states of respective blast furnaces (BF1 to BFn) of a plurality of domains. A transient model (121) of a generic blast furnace process is used to generate artificial operating data (24a) as multivariate time series reflecting the thermal state of a generic blast furnace (BFg) for a particular thermal control action (26a). A generative deep learning network (122) generates a second domain-invariant dataset (23a) by transferring features learned from the historical operating data (21) to the artificial operating data (24a). The reinforcement learning model (130) determines (1400) a reward (131) for the particular thermal control action (26a) in view of a given objective function by processing the combined first and second domain-invariant datasets (22, 23a). The second domain-invariant dataset is regenerated based on modified parameters (123-2) in accordance with the reward (131), and the determination of the reward is repeated to learn optimized operating instructions for optimized thermal control actions to be applied in respective operating states of one or more blast furnaces.

Inventors

Sidric Shokart
HANSEN FABRICE
HAUSEMER LIONEL
Mariam Banyasadi
Philip Bemis

Assignees

Paul Wurth S.A. (保尔伍斯股份有限公司)

Dates

Publication Date: 20260512
Application Date: 20210928
Priority Date: 20200930

Claims (15)

1. A computer-implemented method (1000) for training a reinforcement learning model (130) to provide operating instructions for blast furnace thermal control, the method characterized by the steps of: processing, by a domain-adaptive machine learning model (110) trained by transfer learning, historical operating data (21) obtained as multivariate time series and reflecting the thermal states of respective blast furnaces (BF1 to BFn) of a plurality of domains, to generate a first domain-invariant dataset (22) representative of the thermal state of any of the blast furnaces (BF1 to BFn) irrespective of the domain; generating artificial operating data (24a) as multivariate time series reflecting the thermal state of a generic blast furnace (BFg) for a particular thermal control action (26a) by using a transient model (121) of a generic blast furnace process, wherein the transient model (121) reflects the respective physical, chemical, thermal and flow conditions of the generic blast furnace and provides a solution for the upward gas flow and the downward movement of the layered solid structure in the generic blast furnace while heat, mass and momentum are exchanged; processing the artificial operating data (24a) by a generative deep learning network (122) trained on the multivariate time series of the historical operating data (21) to generate a second domain-invariant dataset (23a) by transferring features learned from the historical operating data (21) to the artificial operating data (24a); determining, by the reinforcement learning model (130), a reward (131) for the particular thermal control action (26a) by processing the combined first domain-invariant dataset (22) and second domain-invariant dataset (23a) in view of a given objective function; and regenerating the second domain-invariant dataset based on modified parameters (123-2) in accordance with the reward (131), wherein a genetic search and/or Bayesian optimization algorithm (123-1) guides the search for the modified parameters for further thermal control actions based on a current environment (25a) of the reinforcement learning model (130) and the output of the particular thermal control action (26a) of the current learning step, and repeating the determining step to learn optimized operating instructions for optimized thermal control actions to be applied in respective operating states of one or more blast furnaces.
2. The method of claim 1, further comprising: predicting, by the reinforcement learning model (130), an optimized operating instruction for at least one actuator of a particular blast furnace in production based on current operating state data of the particular blast furnace; after a thermal control action according to the optimized operating instruction has been applied to the at least one actuator, determining the reward based on the new state of the particular blast furnace after the thermal control action; and, if the reward is below a predefined threshold, regenerating the second domain-invariant dataset for one or more alternative operating instructions using the transient model in order to retrain the reinforcement learning model.
3. The method of claim 1, wherein the domain-adaptive machine learning model (110) is implemented by a generative deep learning neural network having convolutional and/or recurrent layers trained to extract domain-invariant features from the historical operating data (21) as the first domain-invariant dataset.
4. The method of claim 1, wherein the domain-adaptive machine learning model (110) has been trained to learn a plurality of mappings of corresponding raw data from the plurality of blast furnaces (BF1 to BFn) to a reference blast furnace (BFr), wherein each mapping is a representation of a transformation of the respective blast furnace into the reference blast furnace, and the plurality of mappings corresponds to the first domain-invariant dataset.
5. The method of claim 4, wherein the domain-adaptive machine learning model (110) is implemented by a generative deep learning architecture based on a CycleGAN architecture.
6. The method of claim 1, wherein the reinforcement learning model is trained to learn the optimized operating instructions such that the associated target measurements lie within a predefined range of the Pareto front of the corresponding multi-dimensional objective function.
7. The method of claim 1, wherein the transient model (121) comprises a plurality of computational cells, wherein each cell represents a respective layer of raw material of the generic blast furnace charged at one time, wherein each computational cell solves the gas phase equations iteratively to satisfy relative gas phase parameter tolerances within each iteration time interval and, once the gas phase parameters have converged to a predetermined tolerance value, solves the solid phase equations sequentially within the same iteration time interval.
8. The method of claim 7, wherein iteratively solving the gas phase equations comprises, for each iteration of a pressure-velocity correction loop: calculating gas, solid and liquid properties; calculating reaction rates and heat transfer coefficients; and calculating gas temperature, species, velocity and pressure drop; and wherein sequentially solving the solid phase equations comprises: calculating solid temperature and species; calculating liquid temperature and species; and calculating solid velocity.
9. The method of claim 1, wherein the transient model (121) receives one or more of the following input parameters: charged material quantities and chemical analysis, temperature, pressure, PCI rate, and oxygen enrichment; and uses an energy equation for predicting hot metal temperature, one or more species equations for calculating hot metal chemistry, and one or more gas phase equations for predicting top gas temperature, efficiency (Eta CO), and pressure.
10. The method of claim 1, wherein the reinforcement learning model is implemented by a recurrent neural network.
11. The method of claim 1, further comprising: predicting information about the future thermal evolution of a particular blast furnace state based on the historical operating data (21) and/or further measured environmental data related to the environment of the blast furnace by using one or more correspondingly trained associated machine learning models (ML1 to MLn), to supplement the historical operating data (21) with future multivariate time series data relating to future points in time; and processing the future multivariate time series data by the domain-adaptive machine learning model (110) to augment the first domain-invariant dataset (22) with data relating to the future points in time.
12. The method of claim 11, wherein training a particular model (MLT) of the associated machine learning models (ML1 to MLn) comprises: training (703) a plurality of base models with different selections of operating data (701) and/or environmental data (702) using one or more machine learning algorithms to provide base-model-specific future multivariate time series data as training input for the particular model; and training (706) the particular model with the base-model-specific future multivariate time series data to learn which combination of base models is best suited to which state of the blast furnace.
13. The method of claim 12, wherein the particular model of the machine learning models (ML1 to MLn) is trained to predict, for the future point in time, one of: anomalies in the blast furnace process; the thermal state of the blast furnace and hot metal production KPIs; charging matrix optimization; blast furnace phenomena based on process inspection by tuyere cameras; tap hole opener recommendations for optimal operation; phenomena based on TMT SOMA and KPIs; phenomena flagged by process rules.
14. A computer program product which, when loaded into a memory of a computer system and executed by at least one processor of the computer system, performs the steps of the computer implemented method according to any of the preceding claims.
15. A computer system (100) comprising a plurality of functional modules which, when executed by the computer system, perform the steps of the computer-implemented method according to any one of claims 1 to 13.
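
By way of illustration only, the Pareto-front criterion referred to in claim 6 can be sketched as follows. This is a minimal sketch, not part of the claimed subject matter: it assumes that every objective is to be minimized, and it interprets "within a predefined range" as a Euclidean distance to the nearest point of the front, which the claims do not specify. The names `dominates`, `pareto_front`, `within_range` and the sample values are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a dominates b when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def within_range(candidate: Point, front: Sequence[Point], max_distance: float) -> bool:
    """Assumed criterion: Euclidean distance to the nearest front point."""
    return min(dist(candidate, f) for f in front) <= max_distance


# Hypothetical (fuel rate, CO2 emission) outcomes, both to be minimized.
outcomes = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(outcomes)
```

In this sketch the outcome (3.0, 4.0) is excluded from the front because (2.0, 3.0) is better in both objectives. A learned operating instruction would be accepted if its measured outcome passes `within_range` for a chosen `max_distance`.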

Description

Computer system and method for providing operating instructions for blast furnace thermal control

Technical Field

The present invention relates generally to systems for controlling blast furnaces and, more particularly, to methods, computer program products, and systems for generating operating instructions for a blast furnace using machine learning methods.

Background

Blast furnaces are used to produce molten iron as a raw material for steel. The blast furnace process is very complex to model because it depends on multivariate process inputs and disturbances. The aim is to reduce material and fuel consumption so as to optimize the efficiency and stability of the furnace as a whole, improve hot metal quality, and extend the life of the furnace. It is therefore desirable to provide optimized operating instructions for complex production target definitions.

Disclosure of Invention

This technical problem is solved by the features of the independent claims, by training a reinforcement learning (RL) model implemented by a recurrent neural network to provide operating instructions for blast furnace thermal control. The operating instructions relate to corresponding thermal control actions. As used herein, a thermal control action refers to any action that affects an actuator in order to thermally control the blast furnace process. Depending on the level of control automation, the operating instructions may provide guidance to a human operator for correctly controlling the blast furnace, or they may directly instruct the thermal controller of the blast furnace, which may execute such instructions without human interaction. To this end, real-world (measured) operating data from multiple blast furnaces is used together with a simulation model (transient model) of the blast furnace process to train a recurrent neural network model through reinforcement learning.
This can be understood as offline RL model training at the data level and at the simulation model level. From the historical data, a number of additional features can be generated that provide better insight into the characterization of the blast furnace process. These features are either phenomena defined by rules applied to the recorded raw data, or process phenomena available in the form of predictions provided by machine learning models.

Once trained, the RL model provides recommendations of operating instructions for the main actuators of the blast furnace, for example tuyere and blast setpoints such as pulverized coal injection (PCI) rate (kg/s), blast flow rate (Nm³/s), and oxygen enrichment (%), and/or burden composition and charging setpoints such as coke rate (kg/charge), basicity, and burden distribution. The recommendations ensure that, once the process has reached thermal balance, the objective function is optimized after the recommendations have been implemented either by a virtual operator (autonomy level 5, the maximum autonomy level) or manually by a human operator.

The objectives are defined by blast furnace experts and may comprise several targets, such as (1) minimizing fuel consumption, (2) maximizing blast furnace life, (3) minimizing CO2 emissions, and (4) stabilizing hot metal quality and quantity in blast furnace operation. Each target is weighted (e.g., by an expert) to define a global target for training the RL model. Once the model is trained and deployed in production, it can continue to learn from the deviation between the global target and the actual target that is reached after the recommended operating instructions have been executed for the thermal control of the respective blast furnace (online RL model training).
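
The weighting of individual targets into a global target, and the deviation used for continued online learning, can be illustrated with a minimal sketch. It assumes that each target value has already been normalized to the range 0 to 1 and that the global target is a weighted average. The description does not prescribe a particular combination rule, so the names `Target`, `global_score`, `deviation` and all weights and values below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Target:
    name: str
    weight: float    # relative importance, e.g. chosen by an expert
    minimize: bool   # True if smaller values are better


def global_score(targets: Sequence[Target], values: Dict[str, float]) -> float:
    """Weighted average of per-target scores; values are assumed to lie in [0, 1]."""
    total_weight = sum(t.weight for t in targets)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = 0.0
    for t in targets:
        v = values[t.name]
        # Flip minimized targets so that a higher score is always better.
        score += t.weight * ((1.0 - v) if t.minimize else v)
    return score / total_weight


def deviation(goal: float, achieved: float) -> float:
    """Gap between the global target and the score actually reached."""
    return goal - achieved


# Hypothetical weights for the four example targets named in the text.
TARGETS = [
    Target("fuel_consumption", 0.4, minimize=True),
    Target("furnace_life", 0.2, minimize=False),
    Target("co2_emissions", 0.2, minimize=True),
    Target("hot_metal_stability", 0.2, minimize=False),
]

# Hypothetical normalized values observed after applying a recommendation.
observed = {
    "fuel_consumption": 0.25,
    "furnace_life": 0.5,
    "co2_emissions": 0.5,
    "hot_metal_stability": 1.0,
}
achieved = global_score(TARGETS, observed)
gap = deviation(1.0, achieved)
```

With these assumed numbers the achieved score is approximately 0.7, leaving a gap of approximately 0.3 to an ideal score of 1.0. In an online setting, such a gap would be one possible signal for further training.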
In one embodiment, a computer-implemented method is provided for training a reinforcement learning model to provide operating instructions for blast furnace thermal control. For example, the reinforcement learning model may be implemented by a recurrent neural network. A domain-adaptive machine learning model trained by transfer learning processes historical operating data obtained as multivariate time series from a plurality of blast furnaces of a plurality of domains. The historical operating data reflects the thermal states of the respective blast furnaces of the plurality of domains. Typically, each blast furnace has thousands of sensors measuring operating parameters such as temperature, pressure, and chemical content. These parameters, measured at a specific point in time, define the thermal state of the blast furnace at that point in time. Because each blast furnace has its own characteristics (e.g., operating mode, size, input material composition), two blast furnaces (a source blast furnace and a target blast furnace) cannot be compared directly without applying a dedicated transformation to the multivariate time series data. The domain adaptive machine learning model generates as output a first domain i