
CN-122026318-A - Clustering and ensemble learning-based power load prediction method, system, equipment and medium


Abstract

The invention discloses a power load prediction method, system, equipment and medium based on clustering and ensemble learning. The method comprises: obtaining sample data from historical power load data by a sliding window method; clustering the sample data with a clustering algorithm, calculating the silhouette coefficient for different numbers of clusters, and selecting the optimal clustering; training a plurality of pre-selected prediction algorithms to obtain their prediction errors, analyzing the degree of difference among the prediction algorithms, and selecting an optimal base learner combination; within each cluster, training each selected prediction algorithm in turn as the meta learner and taking the one with the smallest prediction error as the meta learner of that cluster, thereby establishing an ensemble model for each cluster; and obtaining a power load data sequence acquired in real time, determining the cluster to which the sequence belongs, and calling the corresponding ensemble model to predict the power load. The method combines load classification characteristics with the advantages of multiple models, and improves prediction accuracy and generalization capability in complex scenarios.
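
As an illustration only, the sliding-window step named in the abstract might be sketched as follows. The function name `sliding_window_samples`, the `window` and `horizon` parameters, and the choice of the next `horizon` values as the label are assumptions made for this sketch; the text shown here does not specify the window length or the label layout.

```python
import numpy as np

def sliding_window_samples(series, window, horizon=1):
    """Cut a 1-D load series into (input window, label) pairs.

    Assumption for this sketch: each sample uses `window` consecutive
    readings as features and the following `horizon` readings as the label.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window and horizon")
    # Row i holds readings i .. i+window-1; its label starts at i+window.
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = np.stack([series[i + window:i + window + horizon] for i in range(n_samples)])
    return X, y

# Toy series of six readings, windows of three readings, one-step-ahead label.
X, y = sliding_window_samples([0, 1, 2, 3, 4, 5], window=3, horizon=1)
```

With the toy series above, `X` has three rows (`[0, 1, 2]`, `[1, 2, 3]`, `[2, 3, 4]`) and `y` holds the corresponding next readings 3, 4 and 5.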

Inventors

  • ZHU LIANG
  • ZHOU KEHUI
  • ZHU JIRAN
  • XIA JUN
  • LI CHENKUN
  • ZHOU HENGYI
  • QI FEI
  • REN LEI
  • TANG HAIGUO
  • ZHANG YI

Assignees

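  • State Grid Hunan Electric Power Co., Ltd. Electric Power Research Institute (国网湖南省电力有限公司电力科学研究院)
  • State Grid Hunan Electric Power Co., Ltd. (国网湖南省电力有限公司)
  • State Grid Corporation of China (国家电网有限公司)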

Dates

Publication Date
2026-05-12
Application Date
2026-01-12

Claims (14)

  1. A power load prediction method based on clustering and ensemble learning, characterized by comprising the following steps: preprocessing historical power load data and acquiring sample data by a sliding window method; clustering the sample data with a clustering algorithm, calculating the silhouette coefficient for different numbers of clusters, and selecting the optimal clustering; training a plurality of pre-selected prediction algorithms, obtaining the prediction error of each prediction algorithm, analyzing the degree of difference among the prediction algorithms, and selecting an optimal prediction algorithm combination as the base learner combination; within each cluster, training each selected prediction algorithm in turn as the meta learner and selecting the prediction algorithm with the smallest prediction error as the meta learner of that cluster, thereby building an ensemble model for each cluster; and acquiring a power load data sequence collected in real time, determining the cluster to which the sequence belongs, and calling the ensemble model of that cluster to predict the power load.
  2. The power load prediction method based on clustering and ensemble learning according to claim 1, wherein preprocessing the historical power load data comprises performing data cleaning and normalization on the historical power load data.
  3. The power load prediction method based on clustering and ensemble learning according to claim 1, wherein selecting the optimal clustering comprises: setting different numbers of clusters and clustering the sample data for each number of clusters with the K-Medoids algorithm; and calculating the silhouette coefficient for each number of clusters and selecting the clustering result whose number of clusters gives the largest silhouette coefficient as the optimal clustering.
  4. The power load prediction method based on clustering and ensemble learning according to claim 1, wherein selecting the base learner combination comprises: pre-selecting a plurality of prediction algorithms; training each pre-selected prediction algorithm independently on the sample data to obtain its prediction error; analyzing the degree of difference among the prediction algorithms using the Pearson correlation coefficient; and sorting the prediction algorithms in ascending order of prediction error, then selecting the first m prediction algorithms whose Pearson correlation coefficient is smaller than a threshold as the optimal prediction algorithm combination, which is used as the base learner combination.
  5. The power load prediction method based on clustering and ensemble learning according to claim 1, wherein building the ensemble model of each cluster comprises: constructing an ensemble model framework in which the first layer is formed by the base learner combination and the second layer is formed by any one prediction algorithm of the optimal prediction algorithm combination acting as the meta learner, the meta learner taking both the original input and the outputs of the first-layer base learner combination as its input; and retraining the base learner combination within each cluster, then training each selected prediction algorithm in turn as the meta learner, and selecting the prediction algorithm with the smallest prediction error as the meta learner of that cluster to obtain the ensemble model of the cluster.
  6. The power load prediction method based on clustering and ensemble learning according to claim 5, wherein retraining the base learner combination within each cluster comprises: letting the base learner combination be $\{f_1, f_2, \ldots, f_m\}$, where $f_m$ denotes the m-th base learner; for each cluster $C_k$, dividing the samples in $C_k$ into a training set $D_{\text{train},k}$ and a test set $D_{\text{test},k}$; and dividing $D_{\text{train},k}$ into m parts using an m-fold cross-validation strategy, taking each of the m parts in turn as the validation subset and the remaining m-1 parts as the training subset to train each base learner in sequence; and wherein selecting the meta learner comprises: constructing a training sample data set and a test sample data set for the meta learner by combining each original sample with the predicted value of each base learner for that sample to obtain a meta learner training sample $(x_i, \hat{y}_i^{(1)}, \ldots, \hat{y}_i^{(m)}, y_i)$, where $x_i$ and $y_i$ are the input features and the label of the i-th original sample respectively and $\hat{y}_i^{(m)}$ is the predicted value of the m-th base learner for the i-th original sample, with $(x_i, \hat{y}_i^{(1)}, \ldots, \hat{y}_i^{(m)})$ used as the input of the meta learner and $y_i$ as the corresponding label; and taking each selected prediction algorithm in turn as the meta learner, training it on the meta learner training sample data set, testing it on the meta learner test sample data set, and selecting the prediction algorithm with the smallest prediction error as the meta learner of the cluster.
  7. A power load prediction system based on clustering and ensemble learning, comprising: a sample data acquisition module for preprocessing historical power load data and acquiring sample data by a sliding window method; a clustering module for clustering the sample data with a clustering algorithm, calculating the silhouette coefficient for different numbers of clusters, and selecting the optimal clustering; a base learner selection module for training a plurality of pre-selected prediction algorithms, obtaining the prediction error of each prediction algorithm, analyzing the degree of difference among the prediction algorithms, and selecting an optimal prediction algorithm combination as the base learner combination; an ensemble model building module for training, within each cluster, each selected prediction algorithm in turn as the meta learner and selecting the prediction algorithm with the smallest prediction error as the meta learner of that cluster, thereby building an ensemble model for each cluster; and a real-time prediction module for acquiring a power load data sequence collected in real time, determining the cluster to which the sequence belongs, and calling the ensemble model of that cluster to predict the power load.
  8. The power load prediction system based on clustering and ensemble learning according to claim 7, wherein the preprocessing of the historical power load data by the sample data acquisition module comprises data cleaning and normalization of the historical power load data.
  9. The power load prediction system based on clustering and ensemble learning according to claim 7, wherein the selection of the optimal clustering by the clustering module comprises: setting different numbers of clusters and clustering the sample data for each number of clusters with the K-Medoids algorithm; and calculating the silhouette coefficient for each number of clusters and selecting the clustering result whose number of clusters gives the largest silhouette coefficient as the optimal clustering.
  10. The power load prediction system based on clustering and ensemble learning according to claim 7, wherein the selection of the base learner combination by the base learner selection module comprises: pre-selecting a plurality of prediction algorithms; training each pre-selected prediction algorithm independently on the sample data to obtain its prediction error; analyzing the degree of difference among the prediction algorithms using the Pearson correlation coefficient; and sorting the prediction algorithms in ascending order of prediction error, then selecting the first m prediction algorithms whose Pearson correlation coefficient is smaller than a threshold as the optimal prediction algorithm combination, which is used as the base learner combination.
  11. The power load prediction system based on clustering and ensemble learning according to claim 7, wherein the building of the ensemble model of each cluster by the ensemble model building module comprises: constructing an ensemble model framework in which the first layer is formed by the base learner combination and the second layer is formed by any one prediction algorithm of the optimal prediction algorithm combination acting as the meta learner, the meta learner taking both the original input and the outputs of the first-layer base learner combination as its input; and retraining the base learner combination within each cluster, then training each selected prediction algorithm in turn as the meta learner, and selecting the prediction algorithm with the smallest prediction error as the meta learner of that cluster to obtain the ensemble model of the cluster.
  12. The power load prediction system based on clustering and ensemble learning according to claim 11, wherein retraining the base learner combination within each cluster comprises: letting the base learner combination be $\{f_1, f_2, \ldots, f_m\}$, where $f_m$ denotes the m-th base learner; for each cluster $C_k$, dividing the samples in $C_k$ into a training set $D_{\text{train},k}$ and a test set $D_{\text{test},k}$; and dividing $D_{\text{train},k}$ into m parts using an m-fold cross-validation strategy, taking each of the m parts in turn as the validation subset and the remaining m-1 parts as the training subset to train each base learner in sequence; and wherein selecting the meta learner comprises: constructing a training sample data set and a test sample data set for the meta learner by combining each original sample with the predicted value of each base learner for that sample to obtain a meta learner training sample $(x_i, \hat{y}_i^{(1)}, \ldots, \hat{y}_i^{(m)}, y_i)$, where $x_i$ and $y_i$ are the input features and the label of the i-th original sample respectively and $\hat{y}_i^{(m)}$ is the predicted value of the m-th base learner for the i-th original sample, with $(x_i, \hat{y}_i^{(1)}, \ldots, \hat{y}_i^{(m)})$ used as the input of the meta learner and $y_i$ as the corresponding label; and taking each selected prediction algorithm in turn as the meta learner, training it on the meta learner training sample data set, testing it on the meta learner test sample data set, and selecting the prediction algorithm with the smallest prediction error as the meta learner of the cluster.
  13. An electronic device, comprising: a memory having a computer program stored thereon; and a processor for loading and executing the computer program to implement the power load prediction method based on clustering and ensemble learning according to any one of claims 1 to 6.
  14. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the power load prediction method based on clustering and ensemble learning according to any one of claims 1 to 6.
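
For illustration, the base learner selection recited in claims 4 and 10 might be sketched as follows. The claims do not state which quantities the Pearson correlation coefficient is computed on or how the threshold is applied, so this sketch makes two assumptions: the coefficient is taken between the per-sample prediction error vectors of two algorithms, and a candidate is kept only if the absolute coefficient against every already selected algorithm is below the threshold. The function name `select_base_learners` and the candidate names `a` to `d` are hypothetical.

```python
import numpy as np

def select_base_learners(errors, m, threshold):
    """Greedy selection of up to m weakly correlated, low-error algorithms.

    errors: dict mapping an algorithm name to a 1-D array of per-sample
    prediction errors, all computed on the same samples (an assumption of
    this sketch, not something stated in the claims).
    """
    # Rank candidates by mean absolute error, smallest first.
    ranked = sorted(errors, key=lambda name: float(np.mean(np.abs(errors[name]))))
    selected = []
    for name in ranked:
        # Keep the candidate only if it differs enough from every learner kept so far.
        diverse = all(
            abs(np.corrcoef(errors[name], errors[kept])[0, 1]) < threshold
            for kept in selected
        )
        if diverse:
            selected.append(name)
        if len(selected) == m:
            break
    return selected

# Toy error vectors: "b" is a scaled copy of "a" and is therefore rejected.
base = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
toy_errors = {
    "a": 0.10 * base,
    "b": 0.15 * base,
    "c": 0.20 * np.array([1.0, 1.0, -1.0, -1.0, 1.0, 1.0]),
    "d": 0.30 * np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0]),
}
chosen = select_base_learners(toy_errors, m=3, threshold=0.5)
```

In this toy case `chosen` is `['a', 'c', 'd']`: algorithm `b` has the second-smallest error but is perfectly correlated with `a`, so it adds no diversity and is skipped.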

Description

Clustering and ensemble learning-based power load prediction method, system, equipment and medium

Technical Field

The invention relates to the technical field of machine learning and power load prediction, in particular to a power load prediction method, system, equipment and medium based on clustering and ensemble learning.

Background

Power load prediction is a key technology in power system operation and planning. It predicts power demand over a future time period by analyzing information such as historical load data, weather and holiday factors. Accurate load prediction is important for the economic operation, safety and stability, and environmental protection of the power system. With accurate load prediction, a power company can arrange its generation plan more effectively, reduce unnecessary generation cost, and improve energy utilization efficiency. Accurate load prediction also helps grid operators adjust their power supply strategy in time, prevents grid faults caused by an imbalance between supply and demand, and safeguards the safe and stable operation of the power system.

At present, load prediction mainly relies on traditional statistical models and machine learning methods. Traditional methods such as the autoregressive integrated moving average (ARIMA) model rely on a linearity assumption and therefore have difficulty capturing the non-stationary characteristics of the load. Single machine learning models such as the support vector machine (SVM) and the long short-term memory network (LSTM) can extract temporal dependencies, but have limited ability to fuse multidimensional features.
In recent years, ensemble learning techniques (such as random forests and gradient boosting trees) have partially improved prediction robustness by combining the predictions of multiple models, but their base learners are highly homogeneous and they do not account for the heterogeneity of load data. For example, the electricity consumption patterns of residential, commercial and industrial loads differ markedly; when they are modeled together, the patterns interfere with one another, so the model is insufficiently sensitive to local characteristics and prediction accuracy falls.

The limitations of the prior art are mainly as follows. First, most methods do not classify the load data, and the mixed input of heterogeneous loads makes it difficult for a model to capture the characteristic patterns of individual scenarios. Second, a single model or a simple ensemble strategy has difficulty balancing bias and variance, so over-fitting or under-fitting occurs easily in complex scenarios. Third, although deep learning models can extract temporal dependencies, their high computational complexity limits their use in real-time scheduling. Therefore, a prediction method that combines load classification characteristics with the advantages of multiple models is needed to improve prediction accuracy and generalization capability in complex scenarios.
Disclosure of Invention

The invention provides a power load prediction method, system, equipment and medium based on clustering and ensemble learning, which are used for solving the problems of low prediction accuracy and weak robustness in existing load prediction technology caused by insufficient handling of data heterogeneity, insufficient model generalization capability, and poor adaptability to complex scenarios.

In a first aspect, a power load prediction method based on clustering and ensemble learning is provided, including the steps of:

  1. Preprocessing historical power load data, and acquiring sample data by a sliding window method;
  2. Clustering the sample data with a clustering algorithm, calculating the silhouette coefficient for different numbers of clusters, and selecting the optimal clustering;
  3. Training a plurality of pre-selected prediction algorithms, obtaining the prediction error of each prediction algorithm, analyzing the degree of difference among the prediction algorithms, and selecting an optimal prediction algorithm combination as the base learner combination;
  4. Within each cluster, training each selected prediction algorithm in turn as the meta learner, and selecting the prediction algorithm with the smallest prediction error as the meta learner of that cluster, thereby building an ensemble model for each cluster;
  5. Acquiring a power load data sequence collected in real time, determining the cluster to which the sequence belongs, and calling the ensemble model of that cluster to predict the power load.

Further, preprocessing the historical power load data comp