CN-121997266-A - Papermaking process carbon footprint prediction method based on two-stage machine learning

Abstract

The invention discloses a papermaking process carbon footprint prediction method based on two-stage machine learning, belonging to the technical field at the intersection of industrial environmental protection and machine learning. The method addresses the difficulty that conventional life cycle assessment has in handling the multivariable nonlinear coupling of the papermaking process, as well as its insufficient prediction accuracy and poor interpretability. It comprises: constructing a feature data set covering all life cycle stages and preprocessing it; training with a random forest algorithm to obtain an initial prediction model and outputting its prediction result; calculating the prediction residual; training an XGBoost model on the residual and the original features to construct a residual prediction model; and superposing the prediction results of the two models to obtain the final carbon footprint prediction value. The method markedly improves prediction accuracy in the high-emission interval and improves model interpretability, and is suitable for carbon emission accounting and process optimization in the paper industry.
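The two-stage scheme summarized above can be written compactly. The following is a sketch under assumed notation that does not appear in the original: x_i is the preprocessed feature vector of sample i, y_i is its actual carbon footprint value, f_RF is the initial random forest model, and g_XGB is the residual model. The squared-error loss in the second line is also an assumption, since the source does not state the XGBoost objective.

```latex
\begin{align}
r_i &= y_i - f_{\mathrm{RF}}(x_i)
  && \text{(prediction residual of the initial model)} \\
g_{\mathrm{XGB}} &\approx \arg\min_{g} \sum_i \bigl(r_i - g(x_i)\bigr)^2
  && \text{(residual model fitted on the original features; loss assumed)} \\
\hat{y}(x) &= f_{\mathrm{RF}}(x) + g_{\mathrm{XGB}}(x)
  && \text{(final carbon footprint prediction value)}
\end{align}
```

Because the residual is defined as actual minus predicted, the correction is added to, not subtracted from, the initial prediction.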

Inventors

REN SHIXUE
Sun Rushen
ZHOU ZAILI
WANG WEI

Assignees

Northeast Forestry University (东北林业大学)
Mudanjiang Hengfeng Paper Co., Ltd. (牡丹江恒丰纸业股份有限公司)

Dates

Publication Date: 2026-05-08
Application Date: 2026-01-28

Claims (10)

1. A papermaking process carbon footprint prediction method based on two-stage machine learning, characterized by comprising the following steps: constructing a feature data set covering all life cycle stages of papermaking, and preprocessing it; training on the preprocessed data set with a random forest algorithm to obtain an initial carbon footprint prediction model, and outputting a prediction result; calculating a prediction residual of the initial carbon footprint prediction model; training an XGBoost model on the prediction residual and the original feature data set to construct a residual prediction model; and superposing the prediction result of the initial carbon footprint prediction model and the output of the residual prediction model to obtain a final carbon footprint prediction value.
2. The method of claim 1, wherein constructing the feature data set covering all life cycle stages of papermaking and preprocessing it comprises: integrating raw material structure, energy structure, process parameter, and waste treatment data to construct a feature matrix; applying one-hot encoding to categorical variables and a logarithmic transformation to long-tailed continuous variables; and dividing the data set into a training set and a test set, with all preprocessing operations fitted on the training set only.
3. The method of claim 1, wherein training on the preprocessed data set with the random forest algorithm comprises: constructing a plurality of decision trees under the constraints of a minimum of 1 sample per leaf node and unrestricted tree depth; randomly selecting a feature subset at each node split of each decision tree to determine the optimal split; and outputting an initial prediction result by ensembling the trained decision trees.
4. The method of claim 1, wherein calculating the prediction residual of the initial carbon footprint prediction model comprises: subtracting the predicted value of the initial carbon footprint prediction model from the actual calculated carbon footprint value to obtain a raw residual; and analyzing the correlation between the raw residual and the predicted value to confirm whether the residual contains a learnable systematic deviation.
5. The method of claim 1, wherein training the XGBoost model to construct the residual prediction model comprises: taking the preprocessed original feature data set as input features; and taking the prediction residual as the training target and fitting it with the XGBoost algorithm, wherein the XGBoost algorithm is guided by the residual of the previous fitting round and progressively approximates the target function through an additive tree model.
6. The method of claim 1, further comprising, after training with the random forest algorithm: performing permutation importance analysis based on the results of the random forest algorithm; and identifying key process parameters affecting the carbon footprint according to the feature importance ranking.
7. The method of claim 1, further comprising, after obtaining the final carbon footprint prediction value: evaluating the performance of the final carbon footprint prediction value on an independent test set using the coefficient of determination and the root mean square error; and applying the final carbon footprint prediction value to carbon emission trend analysis for a specific paper type.
8. The method of claim 7, wherein the specific paper type is cultural paper.
9. A computer terminal device, comprising: one or more processors; and a memory coupled to the one or more processors and storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the steps of the method of any one of claims 1-8.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1-8.
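The residual steps recited in claims 1, 4, 5 and 7 (compute residuals, check them for systematic structure, fit a residual model, superpose, evaluate on held-out data) can be illustrated in a few lines. The following is a minimal, self-contained sketch and not the patented implementation. The data are hypothetical toy values, `stage1` is a fixed stand-in for the trained random forest, `fit_line` (a least-squares line) is a stand-in for the XGBoost residual model, and all function names are assumptions introduced here.

```python
import math
from statistics import mean


def residuals(y_true, y_pred):
    """Raw residual per sample: actual value minus stage-1 prediction."""
    return [t - p for t, p in zip(y_true, y_pred)]


def pearson(a, b):
    """Pearson correlation, used to check residuals for systematic structure."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fit_line(x, target):
    """Least-squares line; a simple stand-in for the XGBoost residual model."""
    mx, mt = mean(x), mean(target)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxt = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, target))
    slope = sxt / sxx
    intercept = mt - slope * mx
    return lambda xi: intercept + slope * xi


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def stage1(x):
    """Stand-in for the trained random forest; it underestimates high values."""
    return 3.0 * x


# Hypothetical toy data: a single feature x with target y = x**2.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [x ** 2 for x in x_train]
x_test = [1.5, 3.5, 5.5]
y_test = [x ** 2 for x in x_test]

# Stage-1 predictions and raw residuals on the training set.
p_train = [stage1(x) for x in x_train]
r_train = residuals(y_train, p_train)

# Diagnostic: a strong correlation between residual and prediction
# suggests a learnable systematic deviation.
corr = pearson(r_train, p_train)

# Stage 2: fit the residual model on the original feature, residual as target.
stage2 = fit_line(x_train, r_train)

# Superpose both stages and evaluate on the held-out test set.
p_test = [stage1(x) for x in x_test]
final_test = [p + stage2(x) for p, x in zip(p_test, x_test)]

print("residual/prediction correlation:", round(corr, 3))
print("stage-1 RMSE:", round(rmse(y_test, p_test), 3),
      "R2:", round(r2(y_test, p_test), 3))
print("two-stage RMSE:", round(rmse(y_test, final_test), 3),
      "R2:", round(r2(y_test, final_test), 3))
```

In practice the two stand-ins would be replaced by fitted random forest and XGBoost regressors. The residual sign convention (actual minus predicted) is what makes the final step a simple addition.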

Description

Papermaking process carbon footprint prediction method based on two-stage machine learning

Technical Field

The invention belongs to the technical field at the intersection of industrial environmental protection and machine learning, and particularly relates to a papermaking process carbon footprint prediction method based on two-stage machine learning.

Background

Against the background of global climate change and national carbon neutrality targets, the scientific quantification and prediction of the carbon emission levels of industrial products is a key issue. The paper industry, a typical energy- and resource-intensive process manufacturing industry, relies primarily on traditional life cycle assessment to evaluate its life cycle carbon footprint. Life cycle assessment systematically quantifies greenhouse gas emissions across raw material acquisition, production, energy supply, transportation, waste treatment, and other stages, and lays a theoretical foundation for understanding the industry's emission patterns. In actual papermaking production, however, conventional life cycle assessment has significant limitations because the process chain is long, process variables are numerous, and plants differ considerably from one another. The method generally depends on preset scenarios and fixed parameters, has difficulty absorbing and processing massive continuous process data, and cannot effectively describe the complex nonlinear coupling among process parameters. This limits its potential for accurate prediction and process optimization under actual production conditions that combine multiple plants and multiple processes.

With the development of machine learning in the environmental field, algorithms such as random forest and XGBoost have been tried for carbon emission prediction.
These data-driven models show advantages in handling high-dimensional inputs and nonlinear relationships. However, when applied to complex process industries such as papermaking, the prior art still has shortcomings. First, most studies predict with a single-stage machine learning model and lack systematic diagnosis and correction of the model's residual structure, so prediction accuracy is insufficient under extreme operating conditions or in high-emission intervals. Second, model interpretability is weak: the analysis stops at variable importance ranking and is not deeply integrated with the specific physicochemical mechanisms of the papermaking process, so it is difficult to derive actionable emission reduction suggestions that can guide production practice from the prediction results. Finally, the choice of data preprocessing method has a marked influence on model performance, but the prior art lacks a consensus on a standardized preprocessing strategy suited to the characteristics of papermaking carbon footprint data, which hinders stable reproduction and reliable application of the models.

Therefore, when predicting the carbon footprint of a complex system such as the paper industry, the prior art faces multiple challenges, including limited prediction accuracy, insufficient adaptability to key operating conditions, poor interpretability, and inconsistent preprocessing methods. A prediction method that deeply integrates process mechanisms, has residual learning capability, and is more robust is needed.

Disclosure of Invention

To solve the above technical problems, the invention provides a papermaking process carbon footprint prediction method based on two-stage machine learning.
To achieve the above object, the invention provides a papermaking process carbon footprint prediction method based on two-stage machine learning, comprising the following steps: constructing a feature data set covering all life cycle stages of papermaking, and preprocessing it; training on the preprocessed data set with a random forest algorithm to obtain an initial carbon footprint prediction model, and outputting a prediction result; calculating a prediction residual of the initial carbon footprint prediction model; training an XGBoost model on the prediction residual and the original feature data set to construct a residual prediction model; and superposing the prediction result of the initial carbon footprint prediction model and the output of the residual prediction model to obtain a final carbon footprint prediction value.

Optionally, constructing the feature data set covering all life cycle stages of papermaking and preprocessing it comprises: integrating raw material structure, energy structure, process parameter, and waste treatment data to construct a feature matri