CN-122022781-A - Vapor recovery device and method for coal electricity production


Abstract

The invention discloses a steam recovery device and method for coal power production, belonging to the technical field of energy recovery. The method comprises: collecting system operation data in real time and extracting working condition feature vectors; obtaining a dynamic system response matrix; modeling three objectives (inter-stage pressure loss optimization, supercooling degree control, and zero-approaching operation) as intelligent agents; constructing a cooperative game model whose objective is to maximize the weighted Nash product, and solving it for the optimal equipment adjustment quantity; executing control instructions in layers on different time scales; and adaptively updating the strategy parameters by reinforcement learning, based on a long-term operation benefit evaluation result. The invention achieves multi-objective collaborative optimization, real-time dynamic response, and system self-learning, and significantly improves the energy efficiency and economy of steam recovery.

Inventors

LI FEI
JIN TAO
ZHUANG WENBIN
QIAO DAN

Assignees

National Energy Group Ningxia Electric Power Co., Ltd. (国家能源集团宁夏电力有限公司)

Dates

Publication Date: 20260512
Application Date: 20260130

Claims (10)

1. A steam recovery method for coal power production, characterized by comprising the following steps: S1, collecting operation data of a steam recovery system in real time, and extracting a working condition feature vector; S2, acquiring a time-dependent dynamic system response matrix that characterizes the influence of equipment adjustment quantities on changes in the performance indexes; S3, modeling three objectives, namely inter-stage pressure loss optimization, supercooling degree control, and zero-approaching operation, as three intelligent agents, defining a profit function for each agent, constructing, from the dynamic system response matrix, the current performance index values, and the profit functions, a cooperative game model whose objective is to maximize the weighted Nash product, and solving the model to obtain an optimal equipment adjustment vector; S4, based on the optimal equipment adjustment vector and in combination with a preset quick response strategy, executing control instructions and strategy calibration in layers on different time scales, including a minute level; S5, based on a long-term operation benefit evaluation result, adaptively updating the decision strategy parameters and updating the optimization knowledge base.
2. The steam recovery method for coal power production according to claim 1, wherein in step S1, the operation data includes at least the steam pressure of each stage, temperature, flow rate, valve opening, pump frequency, unit load, ambient temperature, circulating water temperature, real-time electricity price, and unit price of standard coal, wherein the subscripts denote the different measuring points or devices and the time instant.
3. The steam recovery method for coal power production according to claim 2, wherein in step S2, the performance index variation is a vector output by a preset neural network prediction model, the expression giving the predicted performance index change vector at a given time as the output of the neural network prediction function, whose inputs are the working condition feature vector at that time and the device adjustment vector to be executed at that time, and whose parameters are the parameter set of the prediction model.
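As a sketch of the form these definitions imply, with every symbol name assumed here for illustration rather than taken from the claim ($\Delta\hat{Y}$ for the predicted change vector, $f_{\mathrm{NN}}$ for the prediction function, $X$ for the feature vector, $\Delta U$ for the adjustment vector, $\Theta$ for the model parameters):

```latex
\Delta \hat{Y}(t) = f_{\mathrm{NN}}\bigl(X(t),\, \Delta U(t);\, \Theta\bigr)
```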
4. The steam recovery method for coal power production according to claim 3, wherein in step S2, the dynamic system response matrix at a given time is calculated from the working condition feature vector at that time and the device adjustment vector to be executed at that time; the dimension of the matrix depends on the number of adjustable devices; and each matrix element is the impact coefficient, under the working condition at that time, of a unit adjustment of one device on the predicted variation of one performance index.
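One way such a response matrix could be obtained from a prediction model is numerical differentiation with respect to the adjustment vector. The sketch below is only an illustration under that assumption: the central-difference scheme, the function names `response_matrix` and `toy_predict`, and the toy linear model are all hypothetical and are not taken from the claim.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def response_matrix(
    predict: Callable[[Vector, Vector], Vector],
    features: Vector,
    adjustment: Vector,
    step: float = 1e-4,
) -> Matrix:
    """Approximate the sensitivity of each predicted index change to each
    device adjustment by central differences.

    Returns one row per performance index and one column per device.
    """
    n_devices = len(adjustment)
    n_indexes = len(predict(features, adjustment))
    matrix = [[0.0] * n_devices for _ in range(n_indexes)]
    for j in range(n_devices):
        up = list(adjustment)
        down = list(adjustment)
        up[j] += step
        down[j] -= step
        y_up = predict(features, up)
        y_down = predict(features, down)
        for i in range(n_indexes):
            matrix[i][j] = (y_up[i] - y_down[i]) / (2.0 * step)
    return matrix


def toy_predict(features: Vector, adjustment: Vector) -> Vector:
    # Hypothetical stand-in for the neural network prediction model:
    # two performance indexes responding linearly to two device adjustments.
    load = features[0]
    return [
        0.5 * adjustment[0] - 0.2 * adjustment[1],
        0.1 * load * adjustment[0] + 0.3 * adjustment[1],
    ]


G = response_matrix(toy_predict, features=[2.0], adjustment=[0.0, 0.0])
```

With the toy model, `G` approximates the coefficients of the linear map, so the entry in row `i`, column `j` reads as the effect of a unit adjustment of device `j` on index `i`.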
5. The steam recovery method for coal power production according to claim 1, wherein in step S3, the agent profit functions are defined as follows: the profit function of the pressure loss optimization agent is expressed in terms of the profit value of the pressure loss optimization agent, the economic benefit coefficient of the pressure loss optimization coefficient, the pressure loss optimization safety penalty coefficient, the current pressure loss optimization coefficient, the allowable lower limit of the pressure loss optimization coefficient, the actual pressure loss of the steam flowing through the passage, and the maximum pressure loss allowed for safe operation; the profit function of the supercooling degree control agent is expressed in terms of the profit value of the supercooling degree control agent, the economic benefit coefficient of the supercooling degree control deviation, the safety penalty coefficient of the supercooling degree control deviation, the supercooling degree control deviation, the allowable threshold of the supercooling degree deviation, the standard deviation of the supercooling degree within a recent time window, the reference value of the supercooling degree standard deviation, and the stability reward coefficient; the profit function of the zero-approaching operation agent is expressed in terms of the profit value of the zero-approaching operation agent, the economic coefficient of the proportion of time spent in zero-approaching operation, the proportion of time spent in zero-approaching operation, the energy consumption penalty coefficient of that proportion, the precision reward coefficient, the auxiliary equipment power consumed to maintain a low terminal temperature difference (end difference), the current terminal temperature difference of the condenser, the optimal set value of the condenser terminal temperature difference, and the tolerance of the terminal temperature difference control precision.
6. The steam recovery method for coal power production according to claim 5, wherein in step S3, the optimization problem of the cooperative game model maximizes the weighted Nash product subject to constraints, and the optimal device adjustment vector is the solution of this optimization problem, obtained by a sequential quadratic programming algorithm; wherein the quantities involved are the device adjustment vector to be optimized, the profit threshold acceptable to each agent, the negotiation weight of each agent, the vector of actual performance index values at the given time, the allowable lower and upper limit vectors of the performance indexes, the lower and upper limit vectors of the device adjustment quantity in a single control period, the device state vector at the given time, and the lower and upper limit vectors of the operating range of the device state.
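The weighted Nash product named in this claim can be illustrated with a small sketch. It assumes the standard form, the product over agents of (profit minus threshold) raised to the negotiation weight, and maximizes its logarithm. A coarse grid search over a single scalar adjustment is swapped in for the sequential quadratic programming solver to keep the example dependency-free. The three toy profit functions, the weights, and all names are hypothetical.

```python
import math
from typing import Callable, Optional, Sequence


def weighted_log_nash(
    profits: Sequence[float],
    thresholds: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Log of prod_i (U_i - d_i) ** w_i; -inf if any agent is at or below its threshold."""
    total = 0.0
    for u, d, w in zip(profits, thresholds, weights):
        if u <= d:
            return -math.inf
        total += w * math.log(u - d)
    return total


def solve_by_grid(
    profit_fns: Sequence[Callable[[float], float]],
    thresholds: Sequence[float],
    weights: Sequence[float],
    lower: float,
    upper: float,
    points: int = 2001,
) -> Optional[float]:
    """Grid search over one scalar adjustment within its per-period bounds."""
    best_du: Optional[float] = None
    best_val = -math.inf
    for k in range(points):
        du = lower + (upper - lower) * k / (points - 1)
        val = weighted_log_nash([f(du) for f in profit_fns], thresholds, weights)
        if val > best_val:
            best_du, best_val = du, val
    return best_du


# Hypothetical toy agents: one gains from a larger adjustment, one loses,
# and one is indifferent.
profit_fns = [
    lambda du: 1.0 + du,   # pressure loss agent (assumed shape)
    lambda du: 1.0 - du,   # supercooling agent (assumed shape)
    lambda du: 2.0,        # zero-approaching agent (assumed constant)
]
best = solve_by_grid(profit_fns, thresholds=[0.0, 0.0, 0.0],
                     weights=[2.0, 1.0, 1.0], lower=-1.0, upper=1.0)
```

Because the first agent carries twice the weight of the second, the compromise shifts toward a positive adjustment rather than settling at zero; in a real system the performance index and device state bounds would enter as additional constraints.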
7. The steam recovery method for coal power production according to claim 1, wherein in step S4, the control law of the minute-level optimizing control layer determines the device set point vector for the next control period from the current device state vector and the optimal equipment adjustment vector.
8. The steam recovery method for coal power production according to claim 1, wherein in step S5, the long-term operation benefit evaluation value, which represents the actual economic benefit improvement per unit evaluation period, is calculated from the total length of the evaluation period, the estimated running cost of the reference strategy at each time, and the actual running cost of the present method at each time.
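A minimal sketch of such an evaluation, assuming the value is the average per-step cost saving relative to the reference strategy over the evaluation period. This averaging form and the function name `long_term_benefit` are assumptions made for illustration, not the claim's formula.

```python
from typing import Sequence


def long_term_benefit(
    reference_costs: Sequence[float],
    actual_costs: Sequence[float],
) -> float:
    """Average saving per time step: mean of (reference cost - actual cost)."""
    if not reference_costs or len(reference_costs) != len(actual_costs):
        raise ValueError("cost series must be non-empty and of equal length")
    total_saving = sum(r - a for r, a in zip(reference_costs, actual_costs))
    return total_saving / len(reference_costs)


# Hypothetical costs over a three-step evaluation period.
benefit = long_term_benefit([10.0, 12.0, 11.0], [9.0, 10.0, 11.0])
```

A positive value means the method ran more cheaply than the reference strategy over the period; a negative value would indicate the opposite.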
9. The steam recovery method for coal power production according to claim 8, wherein in step S5, the decision strategy parameters are updated within a reinforcement learning framework according to the long-term accumulated revenue expectation; wherein the quantities involved are the long-term accumulated discounted revenue expectation under the given strategy parameters, the expectation operator, the real-time reward at each time, the discount factor, the index of the future time step, the learning rate, and the gradient of the long-term accumulated revenue expectation with respect to the strategy parameters.
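The quantities listed here match a standard policy-gradient update: a discounted sum of future rewards, and a parameter step along the gradient scaled by the learning rate. The sketch below assumes that standard form; the function names are hypothetical, and the gradient is passed in directly rather than estimated.

```python
from typing import List, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**k * r_k over future steps k = 0, 1, 2, ..."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def gradient_ascent_step(
    theta: Sequence[float],
    grad: Sequence[float],
    learning_rate: float,
) -> List[float]:
    """Move the strategy parameters along the gradient of the expected return."""
    return [p + learning_rate * g for p, g in zip(theta, grad)]


# Hypothetical values for illustration only.
ret = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
new_theta = gradient_ascent_step([0.0, 1.0], grad=[2.0, -1.0], learning_rate=0.1)
```

In practice the gradient would be estimated from sampled operating episodes, and the expectation would be approximated by averaging discounted returns over those samples.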
10. A steam recovery device for coal power production, characterized in that the steam recovery method for coal power production according to any one of claims 1 to 9 is adopted.

Description

Vapor recovery device and method for coal electricity production

Technical Field

The invention belongs to the technical field of energy recovery, and particularly relates to a steam recovery device and method for coal power production.

Background

In coal power production, a large amount of steam waste heat is discharged directly, which wastes energy and increases running costs and the environmental burden. Research and development of steam recovery devices and methods is therefore important for improving energy efficiency and reducing coal consumption.

At present, steam recovery systems mostly adopt control strategies based on fixed rules or local optimization, such as PID regulation and static parameter setting. Although these realize the basic recovery function, they cope poorly with dynamically changing working conditions and with multi-objective collaborative optimization. In existing methods, the objectives of inter-stage pressure loss, supercooling degree control, and zero-approaching operation often conflict with one another, and no overall coordination mechanism exists, which limits the economy and stability of system operation. In addition, the prior art generally relies on manual experience or single-model prediction and does not fuse real-time operation data with a multi-objective game mechanism, which leads to lagging adjustment response, difficulty in balancing optimization objectives, and insufficient long-term adaptive capacity.

Therefore, there is a need for a steam recovery optimization method and device that responds in real time, coordinates multiple objectives, and has self-learning capability.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a steam recovery device and method for coal power production, which solve the above problems.
To achieve this purpose, the invention adopts the following technical scheme. The steam recovery method for coal power production comprises the following steps: S1, collecting operation data of a steam recovery system in real time, and extracting a working condition feature vector; S2, acquiring a time-dependent dynamic system response matrix that characterizes the influence of equipment adjustment quantities on changes in the performance indexes; S3, modeling three objectives, namely inter-stage pressure loss optimization, supercooling degree control, and zero-approaching operation, as three intelligent agents, defining a profit function for each agent, constructing, from the dynamic system response matrix, the current performance index values, and the profit functions, a cooperative game model whose objective is to maximize the weighted Nash product, and solving the model to obtain an optimal equipment adjustment vector; S4, based on the optimal equipment adjustment vector and in combination with a preset quick response strategy, executing control instructions and strategy calibration in layers on the second-level, minute-level, and hour-level time scales; S5, based on a long-term operation benefit evaluation result, adaptively updating the decision strategy parameters and updating the optimization knowledge base.

On the basis of this technical scheme, the invention also provides the following optional technical schemes. Further, in step S1, the operation data includes at least the steam pressure of each stage, temperature, flow rate, valve opening, pump frequency, unit load, ambient temperature, circulating water temperature, real-time electricity price, and unit price of standard coal, wherein the subscripts denote the different measuring points or devices and the time instant.
In a further technical scheme, in step S2, the performance index variation is a vector output by a preset neural network prediction model. The expression gives the predicted performance index change vector at a given time as the output of the neural network prediction function, whose inputs are the working condition feature vector at that time and the device adjustment vector to be executed at that time, and whose parameters are the parameter set of the prediction model.

In a further technical scheme, in step S2, the dynamic system response matrix at a given time is calculated from the working condition feature vector at that time and the device adjustment vector to be executed at that time; the dimension of the matrix depends on the number of adjustable devices; and each matrix element indicates, under the working condition at that time, the effect of a unit adjustment of one device on the individ