CN-115222110-B - Method for improving LSTM interpretability and building energy consumption prediction accuracy

Abstract

The invention discloses a method for improving LSTM interpretability and building energy consumption prediction accuracy, in the technical field of building energy consumption prediction. The method comprises the following steps: S1, analyzing the spatial dimension, namely the influence of different variables on the final prediction result; S2, analyzing the temporal dimension, namely the influence of different hours on the final result. The method quantitatively analyzes the contribution of the time and space dimensions of the input data to the final building energy consumption prediction, which improves the interpretability of the model while fully retaining its expressive capacity. For each data instance, a relevance score can be assigned to each feature, so that the contribution of each feature to the final decision can be explained and trust in the prediction is improved. Finally, the building energy consumption prediction accuracy of the long short-term memory network model is improved by eliminating the spatial and temporal features with low relevance.
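
As a rough illustration of how per-variable and per-hour relevance could be read off a relevance matrix, the sketch below averages a toy N × M matrix along each axis. The function names, the variable labels and the numbers are all hypothetical and are not taken from the patent; the relevance matrix itself is assumed to come from some relevance propagation step that is not shown here.

```python
import numpy as np


def spatial_relevance(relevance: np.ndarray) -> np.ndarray:
    """Average an N x M relevance matrix over its rows (hours).

    Returns one score per input variable, shape (M,).
    """
    return relevance.mean(axis=0)


def temporal_relevance(relevance: np.ndarray) -> np.ndarray:
    """Average an N x M relevance matrix over its columns (variables).

    Returns one score per hour, shape (N,).
    """
    return relevance.mean(axis=1)


# Toy 3-hour x 3-variable relevance matrix (made-up values, not patent data).
R = np.array([[0.2, 0.1, 0.0],
              [0.4, 0.3, 0.0],
              [0.6, 0.2, 0.3]])
variables = ["air_temperature", "wind_speed", "wind_direction"]  # hypothetical labels

per_variable = spatial_relevance(R)
per_hour = temporal_relevance(R)

# Candidates for removal: the variable and the hour with the lowest mean relevance.
least_relevant_variable = variables[int(np.argmin(per_variable))]
least_relevant_hour = int(np.argmin(per_hour))
print(least_relevant_variable, least_relevant_hour)
```

The sketch ranks by the raw mean; ranking by mean absolute relevance would be an alternative when scores can be negative, but the source does not say which is intended.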

Inventors

LI GUANNAN
Xiao Junan
HE WEIJIAN
ZHAN LEI
LI FAN
XU CHENGLIANG
GAO JIAJIA
WU YUBEI
WANG YONG
WANG ZIXI
CHEN LIANG
XIONG CHENGLONG

Assignees

Wuhan University of Science and Technology (武汉科技大学)

Dates

Publication Date: 20260512
Application Date: 20220629

Claims (4)

1. A method for improving LSTM interpretability and building energy consumption prediction accuracy, comprising the following steps:
S1, analyzing the spatial dimension, namely the influence of different variables on the final prediction result:
A. preprocessing the raw data, wherein the preprocessed data comprises all data of M spatial variable features within N hours and forms an N × M matrix;
B. dividing the preprocessed data into a training set and a test set, wherein the training set is used to train a long short-term memory network model, and the trained model is applied to the test set to generate preliminary predicted values;
C. processing the generated predicted values by a space-time layer-wise relevance propagation method, which propagates each predicted value back to a matrix of the same size as the input data, and averaging the matrix column by column to obtain a 1 × M vector whose entries represent the relevance of the respective input spatial variable features to the predicted values;
D. removing the spatial variable feature with the lowest relevance, and inputting the data without that feature into long short-term memory network model 1 to output predicted values; evaluating the prediction result with the coefficient of variation of the root mean square error (CV-RMSE) and comparing it with the preliminary predicted values; if the CV-RMSE of the prediction result obtained after processing by the space-time layer-wise relevance propagation method is larger, directly outputting the label of the removed spatial variable feature and the optimal CV-RMSE value; if the CV-RMSE of the prediction result obtained by the space-time layer-wise relevance propagation method is smaller, further removing other spatial variable features, whereby the input data comprises all data of M-1 variables within N hours, the training set and the test set are divided and long short-term memory network model 2 is trained to generate predicted values through steps B to C, and further parameter variables are then removed by the space-time layer-wise relevance propagation method until the CV-RMSE of the prediction result obtained by the space-time layer-wise relevance propagation method is larger than the CV-RMSE of the prediction result obtained by the original long short-term memory network model; the labels of the i removed parameter variables and the optimal CV-RMSE value are then output, thereby optimizing the prediction performance of the long short-term memory network;
S2, analyzing the temporal dimension, namely the influence of different hours on the final result:
1) after i = a parameter variables have been removed in step S1, the input data is all data of (M-a) parameter variables within N hours and forms an N × (M-a) matrix;
2) dividing the data into a training set and a test set, wherein the training set is used to train long short-term memory network model 1, and the trained model is applied to the test set to generate preliminary predicted values;
3) processing the generated predicted values by the space-time layer-wise relevance propagation method, which propagates each predicted value back to a matrix of the same size as the input data, and averaging the matrix row by row to obtain an N × 1 vector whose entries represent the relevance of the respective time dimension features to the predicted values;
4) removing the time dimension feature with the lowest relevance, and inputting the data without that feature into long short-term memory network model 2 to output predicted values; evaluating the prediction result with the CV-RMSE and comparing it with the preliminary predicted values; if the CV-RMSE of the prediction result obtained after processing by the space-time layer-wise relevance propagation method is larger, directly outputting the label of the removed time dimension feature and the optimal CV-RMSE value; if the CV-RMSE of the prediction result obtained by the space-time layer-wise relevance propagation method is smaller, further removing other time dimension features, whereby the input data comprises all data of the (M-a) spatial variable features within N-1 hours, the training set and the test set are divided and the long short-term memory network model is trained to generate predicted values through steps 2) and 3), and one further variable is then removed by the space-time layer-wise relevance propagation method until the CV-RMSE of the prediction result obtained by the space-time layer-wise relevance propagation method is larger than the CV-RMSE of the prediction result obtained by the original long short-term memory network model, thereby optimizing the prediction performance of the long short-term memory network model;
wherein the raw data in step S1, after preprocessing, contains 9 variables: month, week, hour, air temperature, dew point temperature, sea level pressure, wind direction, wind speed and energy consumption.
2. The method for improving the interpretability of LSTM and the accuracy of building energy consumption prediction as recited in claim 1, wherein the training set to test set ratio in step B is η : (1 - η).
3. The method for improving the interpretability of LSTM and the accuracy of building energy consumption prediction as recited in claim 1, wherein the training set to test set ratio in step 2) is from 1 to 1.
4. The method for improving the interpretability of LSTM and the accuracy of building energy consumption prediction as recited in claim 1, wherein the preprocessing of the raw data in step A comprises missing value filling, normalization and sliding window processing.
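
The remove-retrain-compare loop of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: `cv_rmse`, `backward_eliminate` and `linear_stand_in` are hypothetical names; CV-RMSE is taken in its usual form (RMSE divided by the mean of the observed values), which this section does not spell out; and a least-squares linear model with |w·x| relevance stands in for the LSTM and the space-time relevance propagation step so that the example runs without a deep learning library. The sketch also discards the final, error-increasing removal, which is one possible reading of the stopping rule.

```python
import numpy as np


def cv_rmse(y_true, y_pred):
    """CV-RMSE in its usual form: RMSE divided by the mean observed value (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.mean(y_true)


def backward_eliminate(X, y, names, evaluate):
    """Drop the least relevant feature until the test CV-RMSE gets worse.

    `evaluate(X_subset, y)` is a hypothetical callback that trains a model and
    returns (cv_rmse_on_test_set, per_feature_relevance).
    """
    keep = list(range(X.shape[1]))
    best_score, relevance = evaluate(X[:, keep], y)
    removed = []
    while len(keep) > 1:
        drop = keep[int(np.argmin(relevance))]      # least relevant remaining feature
        trial = [j for j in keep if j != drop]
        score, trial_relevance = evaluate(X[:, trial], y)
        if score > best_score:                      # error increased: stop, keep previous set
            break
        keep, best_score, relevance = trial, score, trial_relevance
        removed.append(names[drop])
    return removed, best_score


def linear_stand_in(X, y, eta=0.8):
    """Stand-in for 'train LSTM + relevance propagation' (NOT the patented model).

    Splits chronologically at ratio eta : (1 - eta), fits least squares with an
    intercept, and uses mean |w_j * x_j| on the test set as feature relevance.
    """
    split = int(eta * len(X))
    design = np.column_stack([X[:split], np.ones(split)])
    coef = np.linalg.lstsq(design, y[:split], rcond=None)[0]
    w, b = coef[:-1], coef[-1]
    y_pred = X[split:] @ w + b
    relevance = np.mean(np.abs(X[split:] * w), axis=0)
    return cv_rmse(y[split:], y_pred), relevance


# Synthetic data: two informative columns and two pure-noise columns (made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 10 + 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)
names = ["temp", "dew_point", "noise_1", "noise_2"]  # hypothetical labels

removed, best = backward_eliminate(X, y, names, linear_stand_in)
print("removed:", removed, "best CV-RMSE:", round(float(best), 4))
```

Passing the training and scoring step in as a callback keeps the loop independent of the model, so the same loop could be reused for the temporal pass of step S2 by supplying per-hour relevance instead of per-variable relevance.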

Description

Method for improving LSTM interpretability and building energy consumption prediction accuracy

Technical Field

The invention relates to the technical field of building energy consumption prediction, and in particular to a space-time layer-wise relevance propagation method for improving the interpretability of a long short-term memory network model and the accuracy of building energy consumption prediction.

Background

Over the past decades, the energy consumption of buildings has increased dramatically due to population growth, increased demand for building functions, and global climate change. Building energy consumption accounts for 30% of global energy consumption on average. Industry and academia regard building energy consumption prediction as a critical and challenging task. Accurate building energy consumption prediction can provide effective guidance for energy resource allocation, the formulation of energy-saving measures, and the improvement of energy systems.

With the advent of the big data era, data-driven methods (such as artificial neural networks, support vector machines, random forests, and deep neural networks) have become increasingly important in building energy management systems and have demonstrated high computational efficiency. Long short-term memory network models (LSTM) are widely used for time series prediction problems. Although LSTM models have been widely used in building energy consumption prediction and have achieved high prediction accuracy, their multi-layer, non-linear structure reveals little about the underlying physical properties of the monitored system, and their predictions are opaque and untraceable, so they are often regarded as "black box" methods. Because the developed models typically have high complexity and low interpretability, it is a great challenge for building professionals to fully understand the learned reasoning mechanisms and to trust the predictions made.
Traditional model interpretation methods do not quantitatively analyze the contribution of the input data to the final prediction. Scholars have therefore proposed the layer-wise relevance propagation method, which assigns a relevance score to each feature so that the contribution of each feature to the final decision can be judged. Currently, while machine learning is becoming more and more powerful, the developed models, particularly artificial neural networks such as long short-term memory network models, are becoming more and more complex, resulting in lower model interpretability. The complex reasoning mechanism behind machine learning makes the models unintelligible to ordinary building professionals, thereby reducing trust in the predictions made. Accordingly, the present invention proposes a method for improving LSTM interpretability and building energy consumption prediction accuracy to solve the above problems.

Disclosure of Invention

The present invention aims to solve the above problems by providing a method for improving LSTM interpretability and building energy consumption prediction accuracy, which quantitatively analyzes the contribution of the time and space dimensions of the input data to the final building energy consumption prediction, increases the interpretability of the model, and fully retains the expressive power of the model. The building energy consumption prediction accuracy of the long short-term memory network model is ultimately improved by eliminating features with low relevance.
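
To make the idea of assigning a relevance score to each input feature more concrete, the sketch below applies the generic epsilon rule of layer-wise relevance propagation to a single dense layer. It is an assumption-laden illustration only: the function name `lrp_epsilon_dense` and the toy numbers are hypothetical, and the rules needed for LSTM gates, as well as the space-time variant described in this document, are not shown.

```python
import numpy as np


def lrp_epsilon_dense(x, W, b, relevance_out, eps=1e-6):
    """Redistribute output relevance to the inputs of one dense layer (epsilon rule).

    x: inputs, shape (d_in,); W: weights, shape (d_in, d_out); b: bias, shape (d_out,);
    relevance_out: relevance of each output unit, shape (d_out,).
    Each input i receives x_i * W_ij / (z_j + eps * sign(z_j)) of output j's relevance.
    """
    z = x @ W + b                                     # pre-activations of the layer
    z_stab = z + eps * np.where(z >= 0, 1.0, -1.0)    # stabiliser avoids division by zero
    s = relevance_out / z_stab
    return x * (W @ s)


# Toy layer with three inputs and one output (made-up numbers).
x = np.array([1.0, 2.0, 0.0])
W = np.array([[1.0], [0.5], [3.0]])
b = np.array([0.0])

output = x @ W + b                  # the "prediction" of this tiny layer
relevance_in = lrp_epsilon_dense(x, W, b, relevance_out=output)
print(relevance_in)                 # the zero-valued third input receives no relevance
```

With a zero bias the input relevances sum to approximately the output relevance, which is the conservation property that makes the scores readable as contributions to the prediction.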
In order to achieve the above purpose, the technical scheme of the invention is as follows: a method for improving LSTM interpretability and building energy consumption prediction accuracy, comprising the following steps:
S1, analyzing the spatial dimension, namely the influence of different variables on the final prediction result:
A. preprocessing the raw data, wherein the preprocessed data comprises all data of M spatial variable features within N hours and forms an N × M matrix;
B. dividing the preprocessed data into a training set and a test set, wherein the training set is used to train a long short-term memory network model, and the trained model is applied to the test set to generate preliminary predicted values;
C. processing the generated predicted values by a space-time layer-wise relevance propagation method, which propagates each predicted value back to a matrix of the same size as the input data, and averaging the matrix column by column to obtain a 1 × M vector whose entries represent the relevance of the respective input spatial variable features to the predi