CN-121978909-A - Medical glass kiln working condition self-adaptive regulation and control method based on deep reinforcement learning

Abstract

The invention discloses a method for adaptive regulation and control of the working condition of a medical glass kiln based on deep reinforcement learning, in the technical field of production regulation and control. The method comprises the following steps: S1, dividing fan regulation task types and determining the matching relation between each operation stage and its corresponding fan regulation task; S2, based on the layout of all fans inside the existing kiln, establishing a relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters in the kiln; S3, constructing a deep reinforcement learning model from the environmental data monitored by sensors in the kiln and the relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters, and obtaining optimal fan scheduling strategies under different environmental parameter conditions; S4, extracting a quality attribute feature set and a historical data set of the medical glass liquid, constructing a decision tree to judge whether the melting quality of the medical glass liquid meets the standard, and outputting instructions to the fan regulation system to control whether it operates.

Inventors

Zeng Yuanzhong
SUN ZHONGHONG
XU QIAN
YANG SHUANG
YU XIANGLIN

Assignees

攀科药用包装(四川)有限公司

Dates

Publication Date: 20260505
Application Date: 20251226

Claims (10)

1. A method for adaptively regulating and controlling the working condition of a medical glass kiln based on deep reinforcement learning, characterized by comprising the following steps: S1, dividing fan regulation task types based on the glass melting operation flow in the kiln, defining the matching relation between each operation stage and its corresponding fan regulation task, and constructing a correspondence set between the fan regulation tasks and the operation flow of the kiln melting operation; S2, marking the coordinate positions of all fans based on the correspondence set and the layout of the fans inside the existing kiln, and establishing a relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters in the kiln; S3, constructing a deep reinforcement learning model based on the environmental data monitored by sensors in the kiln and the relation model, and obtaining an optimal scheduling strategy for the fans in the kiln under different environmental parameter conditions; S4, establishing a quality attribute feature set of the medicinal glass liquid based on production history data of the medicinal glass liquid in the kiln, constructing a decision tree from the glass liquid data set monitored by the sensors and the quality attribute feature set to judge whether the quality of the medicinal glass liquid melted in the kiln meets the standard, and outputting an instruction to the fan regulation system to control whether the fan regulation system operates.
2. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 1, wherein step S1 specifically comprises: dividing the melting operation flow into core stages based on the glass melting operation flow in the kiln, the core stages comprising pre-melting of the production raw materials, clarification of the glass liquid, homogenization, and cooling; determining, according to the production requirements of the different stages, the corresponding fan regulation task types, which comprise temperature regulation, air pressure regulation, and gas content regulation; and matching each core stage with its corresponding fan regulation task according to historical regulation data, to obtain the correspondence set between the fan regulation tasks and the operation flow of the kiln melting operation.
3. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 2, wherein step S2 specifically comprises: based on the correspondence set between the fan regulation tasks and the operation flow and on the layout of the fans inside the existing kiln, constructing a three-dimensional coordinate system with the center of the kiln as the origin, and recording and marking the coordinate position of each fan, the coordinate position comprising the height, the length distance, and the width distance of the fan from the origin.
4. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 3, wherein step S2 further comprises: based on the coordinate positions of all fans, calculating the effective coverage area and the radiation depth of a single fan, and then obtaining the volume of its radiation space as the product of the two, wherein the calculation uses the fan blade edge airflow attenuation coefficient, the fan blade rotation diameter d, the wind speed v, the wind speed attenuation coefficient k, the mounting height H, and the airflow diffusion angle θ.
5. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 4, wherein step S2 further comprises: based on the radiation ranges of the m fans and their spatial positions, measuring the environmental parameter adjustment effect of the combined action of the fans within a given spatial range in the kiln from the following quantities for each i-th fan: an on/off indicator whose value of 1 or 0 indicates that the i-th fan is operating or turned off; the fan's effect factor function for environmental parameter adjustment, which depends on the distance and the service life of the fan; and the radiation space volume of the i-th fan; and thereby establishing the relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters in the kiln.
6. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 5, wherein step S2 further comprises: establishing a multiple linear regression equation based on the historical use time and action position information of the fans and the corresponding environmental parameter adjustment effects, the fan's effect factor function for environmental parameter adjustment being $f = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, wherein $x_1$ is the distance variable, $x_2$ is the use time variable, $\beta_1$ and $\beta_2$ are the regression coefficients, $\beta_0$ is the constant term, and $\varepsilon$ is the error term; and fitting the equation to obtain the functional relation between the use time and action distance of a fan and its environmental parameter adjustment effect.
7. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 6, wherein step S2 further comprises: based on the use time of each fan, assigning negative infinity to the effect factor function of any fan whose use time exceeds its service life, feeding the position coordinates of that fan back to the fan regulation system, and displaying a warning pop-up window prompting replacement of the fan at the corresponding position.
8. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 7, wherein step S3 specifically comprises: establishing a reinforcement learning model as a Markov decision process, based on the correspondence set between the fan regulation tasks and the operation flow and on the relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters in the kiln; first establishing the environmental state set S inside the kiln, each state being a four-dimensional vector of operation stage, temperature, air pressure, and gas content; defining the fan regulation action set A, each action being a three-dimensional vector of fan position coordinates, fan switch state, and fan regulation task; taking the environmental parameter adjustment effect of the combined action of the fans within a given spatial range in the kiln as the reward function R, with $\gamma$ as the discount factor of the reward; expressing the policy $\pi$, the probability of taking fan regulation action a when the environmental state inside the kiln is s, as $\pi(a \mid s) = P(A_t = a \mid S_t = s)$; expressing the expected return of environmental state s inside the kiln as $V^{\pi}(s) = \mathbb{E}_{\pi}[G_t \mid S_t = s]$; setting the action value function, based on the expected return for environmental state s and fan regulation action a, as $Q^{\pi}(s, a) = \mathbb{E}_{\pi}[G_t \mid S_t = s, A_t = a]$, where $G_t$ denotes the discounted return from time t; and, after training the model, inputting an environmental state s to obtain the fan regulation action scheme with the highest value Q, the scheme comprising the fan position coordinates, fan switch states, and fan regulation tasks of the corresponding operation flow.
9. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 8, wherein step S4 specifically comprises: extracting a glass liquid quality attribute feature set and an in-kiln glass liquid historical data set D based on production history data of the medicinal glass liquid in the kiln, and initially dividing D into K categories; expressing the empirical entropy of D as $H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|}$, wherein $C_k$ is the subset of D belonging to the k-th class, $|C_k|$ is the number of elements of that subset, and $|D|$ is the number of elements of D; expressing the empirical conditional entropy of a glass liquid attribute feature A on D, that is, at a branch node, as $H(D \mid A) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i)$, wherein $D_i$ is the subset of D on which A takes the i-th of its n values; and calculating the difference of the two to obtain the information gain of each glass liquid attribute feature: $g(D, A) = H(D) - H(D \mid A)$.
10. The method for adaptively regulating and controlling the working condition of the medical glass kiln based on deep reinforcement learning according to claim 9, wherein step S4 further comprises: based on the calculated information gain of every glass liquid attribute feature, taking the attribute feature with the maximum information gain as the current decision node; removing the attribute feature already used, and updating the in-kiln glass liquid historical data set and the glass liquid attribute set; dividing the in-kiln glass liquid historical data set into branches according to the values of the attribute feature; recalculating the information gain on the subset under each value, again taking the attribute with the maximum information gain as the current decision node; and repeating this operation until all glass liquid attribute features have been divided, completing the construction of the decision tree; and, taking the real-time glass liquid data monitored by the sensors as the input set, judging whether the quality of the molten glass in the kiln meets the standard and outputting an instruction to the fan regulation system, wherein regulation by the fan system is stopped if the quality of the molten glass meets the standard, and otherwise the fan system continues to operate and regulate.
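
The attribute selection described in claims 9 and 10 can be illustrated with a minimal Python sketch, assuming base-2 logarithms and discrete attribute values. The attribute names (`bubble_level`, `temperature_band`), the pass/fail labels, and the sample rows are hypothetical and do not come from the patent; the functions only show how empirical entropy, conditional entropy, and information gain could pick the next decision node.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical entropy H(D) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """g(D, A) = H(D) - H(D|A) for one discrete attribute."""
    total = len(rows)
    # Group the class labels by the value the attribute takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Conditional entropy: size-weighted entropy of each subset D_i.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def best_feature(rows, labels, features):
    """Attribute with the largest information gain (the next decision node)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical historical records: attribute values and a quality label.
rows = [
    {"bubble_level": "low", "temperature_band": "in"},
    {"bubble_level": "low", "temperature_band": "out"},
    {"bubble_level": "high", "temperature_band": "in"},
    {"bubble_level": "high", "temperature_band": "out"},
]
labels = ["pass", "pass", "fail", "fail"]
root = best_feature(rows, labels, ["bubble_level", "temperature_band"])
```

In this toy data `bubble_level` separates the two classes completely, so it is chosen as the root. A full tree would repeat the selection on each branch's subset with the used attribute removed, as claim 10 describes.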

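Claims 6 and 7 describe fitting a fan's effect factor by multiple linear regression and assigning negative infinity to fans past their service life. The following is a minimal pure-Python sketch of that idea, assuming ordinary least squares over two predictors (distance and use time). The function names, the `service_life_hours` parameter, and the units are hypothetical illustrations, not the patent's implementation.

```python
import math


def fit_effect_factor(samples):
    """Least-squares fit of f = b0 + b1*distance + b2*hours.

    samples: list of (distance, hours, observed_effect) tuples.
    Returns the coefficients (b0, b1, b2).
    """
    # Accumulate the normal equations (X^T X) b = X^T y, rows X = [1, d, h].
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for distance, hours, effect in samples:
        row = (1.0, distance, hours)
        for i in range(3):
            xty[i] += row[i] * effect
            for j in range(3):
                xtx[i][j] += row[i] * row[j]
    # Solve the 3x3 system by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine a unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


def effect_factor(coeffs, distance, hours, service_life_hours):
    """Predicted effect factor; -inf once use time exceeds the service life."""
    if hours > service_life_hours:
        return -math.inf  # such a fan can never be preferred by a maximizer
    b0, b1, b2 = coeffs
    return b0 + b1 * distance + b2 * hours
```

Returning negative infinity means any scheduler that maximizes the combined adjustment effect would never select an expired fan, which matches the intent stated in claim 7; the replacement warning itself is outside the scope of this sketch.
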
Description

Medical glass kiln working condition self-adaptive regulation and control method based on deep reinforcement learning

Technical Field

The invention relates to the technical field of production regulation and control, and in particular to a method for adaptively regulating and controlling the working condition of a medical glass kiln based on deep reinforcement learning.

Background

Medicinal glass is a key material for pharmaceutical packaging and is mainly used to manufacture containers such as ampoules, penicillin bottles, and infusion bottles. Its production depends on stable operation of the kiln: kiln parameters such as temperature, pressure, and atmosphere directly affect the melting, clarification, and forming of the glass. Traditional kiln control methods often use fixed set values, which adapt poorly to different glass types and to dynamic changes during production, leading to unstable glass quality and low production efficiency.

Disclosure of Invention

To solve the above technical problems, the invention provides a method for adaptively regulating and controlling the working condition of a medical glass kiln based on deep reinforcement learning.

To achieve the above purpose, the invention adopts the following technical scheme:

A method for adaptively regulating and controlling the working condition of a medical glass kiln based on deep reinforcement learning comprises the following steps:

S1, dividing fan regulation task types based on the glass melting operation flow in the kiln, defining the matching relation between each operation stage and its corresponding fan regulation task, and constructing a correspondence set between the fan regulation tasks and the operation flow of the kiln melting operation;

S2, marking the coordinate positions of all fans based on the correspondence set and the layout of the fans inside the existing kiln, and establishing a relation model between the fans, their radiation ranges, and their adjustment effect on environmental parameters in the kiln;

S3, constructing a deep reinforcement learning model based on the environmental data monitored by sensors in the kiln and the relation model, and obtaining an optimal scheduling strategy for the fans in the kiln under different environmental parameter conditions;

S4, establishing a quality attribute feature set of the medicinal glass liquid based on production history data of the medicinal glass liquid in the kiln, constructing a decision tree from the glass liquid data set monitored by the sensors and the quality attribute feature set to judge whether the quality of the medicinal glass liquid melted in the kiln meets the standard, and outputting an instruction to the fan regulation system to control whether the fan regulation system operates.

Preferably, step S1 specifically includes: dividing the melting operation flow into core stages based on the glass melting operation flow in the kiln, the core stages comprising pre-melting of the production raw materials, clarification of the glass liquid, homogenization, and cooling; determining, according to the production requirements of the different stages, the corresponding fan regulation task types, which comprise temperature regulation, air pressure regulation, and gas content regulation; and matching each core stage with its corresponding fan regulation task according to historical regulation data, to obtain the correspondence set between the fan regulation tasks and the operation flow of the kiln melting operation.

Preferably, step S2 specifically includes: based on the correspondence set between the fan regulation tasks and the operation flow and on the layout of the fans inside the existing kiln, constructing a three-dimensional coordinate system with the center of the kiln as the origin, and recording and marking the coordinate position of each fan, the coordinate position comprising the height, the length distance, and the width distance of the fan from the origin.

Further, step S2 further includes: based on the coordinate positions of all fans, calculating the effective coverage area and the radiation depth of a single fan, and then obtaining the volume of its radiation space as the product of the two, wherein the calculation uses the fan blade edge airflow attenuation coefficient, the fan blade rotation diameter d, the wind speed v, the wind speed attenuation coefficient k, the mounting height H, and the airflow diffusion angle θ.

Further, step S2 further includes: based on the radiation ranges of m fans and the spatial positions of the m fans, the environmental parameter adjusting effect of the combined action of the fans in a certain spatial range in the k