CN-115526430-B - Load interval prediction method, system and medium for multi-distance clustering and information aggregation
Abstract
The invention discloses a load interval prediction method, system and medium based on multi-distance clustering and information aggregation. The method comprises: clustering normalized load curve data with a K-means algorithm based on multiple distances to obtain K classes of curve sets; for the i-th class result, dividing the result and the corresponding meteorological data into a training set and a test set; inputting the training set and the test set into an intra-day local fluctuation probability prediction model and an intra-day overall trend probability prediction model respectively for training; inputting the feature values of the test set into each of the two models to obtain the upper and lower limits of each model's interval at the t quantile; and aggregating the upper limits and the lower limits of the two models' prediction intervals separately, the aggregated prediction interval being calculated with a Choquet integral aggregation function.
Inventors
- Yue Shouzhi
- HONG HAISHENG
- DENG QI
- XU CHENDE
- LUO FENG
Assignees
- Guangzhou Power Supply Bureau, Guangdong Power Grid Co., Ltd.
Dates
- Publication Date
- 20260512
- Application Date
- 20221031
Claims (6)
- 1. A load interval prediction method for multi-distance clustering and information aggregation, characterized by comprising the following steps:
  clustering the normalized load curve data by a K-means algorithm based on multiple distances to obtain $K$ classes of curve sets;
  for the $i$-th class result, dividing the result and the corresponding meteorological data into a training set and a test set;
  respectively inputting the training set and the test set into an intra-day local fluctuation probability prediction model and an intra-day overall trend probability prediction model for training, to obtain both models with updated parameters;
  the intra-day local fluctuation probability prediction model is a BiLSTM comprising a forward LSTM network and a backward LSTM network, with a quantile regression layer added at the end of the network; the regression model of the quantile regression layer is: $Q_Y(t \mid X) = \beta_0(t) + \beta_1(t) X_1 + \dots + \beta_p(t) X_p = \beta_0(t) + X^{\top}\beta(t)$; where $Q_Y(t \mid X)$ is the $t$-th conditional quantile of the response variable $Y$; $p$ is the number of explanatory variables $X$; $\beta_0(t)$ is the intercept; $\beta(t)$ is the regression coefficient vector at the $t$ quantile, with $t$ taking values in $(0, 1)$, and is obtained by solving the optimization problem: $\min_{\beta_0, \beta} \sum_{i=1}^{N} \rho_t\big(Y_i - \beta_0(t) - X_i^{\top}\beta(t)\big)$, with check function $\rho_t(u) = u\,(t - \mathbb{1}[u < 0])$; where $X_i$ is the explanatory variable column vector;
  the response variable of the quantile regression layer at the $t$ quantile is: $Q_Y(t) = g\big(\sum_{j=1}^{J} W_j(t)\, h_j + b(t)\big)$; where $J$ is the number of hidden-layer units; $g$ is the output-layer activation function; $h_j$ is the output of the LSTM hidden layer; $W_j(t)$ and $b(t)$ are the weights and bias of the output layer;
  the intra-day overall trend probability prediction model is a Gaussian process regression model, which learns the mapping between the input daily meteorological data and the real load value at each sampling point of the output day, so as to obtain an interval prediction result that reflects the intra-day overall trend and matches the meteorological data of the predicted day, specifically:
  for a training set containing $n$ samples, the input feature matrix is $X = [x_1, \dots, x_n]^{\top}$, where $x_i$ is the input vector of the $i$-th sample, of length $d$; the corresponding output response is $y = [y_1, \dots, y_n]^{\top}$, where $y_i$ is the $i$-th output response value; define $f(x_i)$ as the function value corresponding to $x_i$; the set of random variables $\{f(x_1), \dots, f(x_n)\}$ obeys a joint Gaussian distribution, and the Gaussian process is expressed as: $f(x) \sim \mathcal{GP}\big(m(x), k(x, x')\big)$; where $x, x' \in \mathbb{R}^d$ are arbitrary random variables, $m(x)$ is the mean function and $k(x, x')$ is the covariance function, calculated as: $m(x) = \mathbb{E}[f(x)]$; $k(x, x') = \mathbb{E}\big[(f(x) - m(x))(f(x') - m(x'))\big]$; set $m(x) = 0$, i.e. the function is expected to be 0 in the absence of any observations;
  the regression model is expressed as: $y = f(x) + \varepsilon$; where $\varepsilon \sim \mathcal{N}(0, \sigma_n^2)$, i.e. Gaussian white noise with mean 0 and variance $\sigma_n^2$, $x$ is a $d$-dimensional random vector, and $y$ is the observed value;
  the prior distribution of the output observations $y$ is: $y \sim \mathcal{N}\big(0, K(X, X) + \sigma_n^2 I_n\big)$; where $K(X, X)$ is the $n$-order covariance matrix with elements $K_{ij} = k(x_i, x_j)$, and $I_n$ is the $n$-order identity matrix;
  given the test set input $X_*$, whose $j$-th vector is denoted $x_{*j}$, and the test set output prediction $f_*$, the joint prior distribution of the training set observations $y$ and the prediction $f_*$ is: $\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(X, X) + \sigma_n^2 I_n & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right)$;
  the posterior distribution of the prediction $f_*$ is: $f_* \mid X, y, X_* \sim \mathcal{N}\big(\bar{f}_*, \operatorname{cov}(f_*)\big)$, with $\bar{f}_* = K(X_*, X)\,[K(X, X) + \sigma_n^2 I_n]^{-1} y$ and $\operatorname{cov}(f_*) = K(X_*, X_*) - K(X_*, X)\,[K(X, X) + \sigma_n^2 I_n]^{-1} K(X, X_*)$; where $\bar{f}_*$ and $\operatorname{cov}(f_*)$ respectively represent the mean and the variance of the test set output prediction; the mean corresponds to the deterministic prediction result, and a confidence interval can be constructed by combining it with the variance;
  respectively inputting the feature values of the test set into the two models to obtain the upper and lower limits of the two model intervals at the $t$ quantile;
  respectively carrying out information aggregation on the upper limits and the lower limits of the prediction intervals of the two models, and calculating the aggregated prediction interval through a Choquet integral aggregation function.
- 2. The load interval prediction method for multi-distance clustering and information aggregation according to claim 1, wherein the multi-distance-based K-means algorithm specifically comprises:
  (1) inputting the equal-length curve sample data set to be clustered, $S = \{s_1, s_2, \dots, s_N\}$, where $s_i$ is the $i$-th curve sample;
  (2) randomly selecting $K$ different samples in $S$ as the initial cluster centers of K-means;
  (3) traversing all samples, calculating the distance $D_{ij}$ between the $i$-th sample and every cluster center $c_j$ by the following formula, and assigning the sample to the class with the smallest distance: $D_{ij} = \omega_1 D^{\mathrm{E}}_{ij} + \omega_2 D^{\mathrm{DTW}}_{ij} + \omega_3 D^{\mathrm{dDTW}}_{ij}$; where $D^{\mathrm{E}}_{ij}$ is the distance between corresponding sampling points of the two samples calculated with the Euclidean distance, $D^{\mathrm{DTW}}_{ij}$ is the distance between corresponding sampling points of the two samples calculated with the DTW distance, and $D^{\mathrm{dDTW}}_{ij}$ is the distance between corresponding sampling points of the two samples calculated with the DTW distance after applying the previous-term difference method; $\omega_1$, $\omega_2$ and $\omega_3$ are the weights of the three distances, determined by the entropy weight method;
  (4) traversing all classes and recalculating the cluster center of each class; if the current cluster centers have changed compared with the previous iteration, repeating steps (3) and (4) until the cluster centers no longer change, and then outputting the clustering result.
- 3. The load interval prediction method for multi-distance clustering and information aggregation according to claim 1, wherein the multi-distance K-means algorithm uses the DBI index to measure the clustering effect and takes the number of clusters with the best DBI as the $K$ value, the DBI index $I_{\mathrm{DBI}}$ being: $I_{\mathrm{DBI}} = \frac{1}{K} \sum_{i=1}^{K} \max_{j \neq i} \frac{\bar{d}_i + \bar{d}_j}{\lVert c_i - c_j \rVert}$; where $c_i$ is the $i$-th cluster center, $\bar{d}_i$ is the average distance from the samples in the $i$-th class to its cluster center, and $\bar{d}_j$ is the average distance from the samples in the $j$-th class to its cluster center; this average distance represents the degree of dispersion of the curves within a class and is calculated as: $\bar{d}_i = \frac{1}{n_i} \sum_{l=1}^{n_i} \lVert x_{il} - c_i \rVert$; where $n_i$ is the number of samples in the $i$-th class and $x_{il}$ is the $l$-th sample of the $i$-th class.
- 4. The load interval prediction method for multi-distance clustering and information aggregation according to claim 1, wherein the prediction intervals of the two models are aggregated into one interval by an aggregation function, the aggregation function uses a Choquet integral based on a fuzzy measure, and the fuzzy measure represents the degree of relation between the elements to be aggregated, specifically:
  for the $n$ elements to be aggregated $x_1, \dots, x_n$, the aggregated value $y$ is: $y = A(x_1, \dots, x_n)$; where $A$ represents an aggregation function, i.e. $A: [0,1]^n \to [0,1]$, if and only if the following conditions hold: (1) boundary conditions: $A(0, \dots, 0) = 0$ and $A(1, \dots, 1) = 1$; (2) monotonicity: if $x_i \le y_i$ for all $i \in \{1, \dots, n\}$, then $A(x_1, \dots, x_n) \le A(y_1, \dots, y_n)$;
  definition of the fuzzy measure: for the reference set $N = \{1, \dots, n\}$, $2^N$ is the power set of $N$; a function $\mu: 2^N \to [0,1]$ is called a fuzzy measure if it satisfies the following properties: (a) boundary conditions: $\mu(\emptyset) = 0$ and $\mu(N) = 1$; (b) monotonicity: if $X \subseteq Y \subseteq N$, then $\mu(X) \le \mu(Y)$;
  the power mean of the cardinality of the set of values to be aggregated is used as the fuzzy measure, with $q$ taken as 2, specifically: $C_{\mu}(x_1, \dots, x_n) = \sum_{i=1}^{n} \big(x_{(i)} - x_{(i-1)}\big) \left(\frac{|A_{(i)}|}{n}\right)^{q}$, with $x_{(0)} = 0$; where $x_{(i)}$ is the $i$-th element of the new set obtained by rearranging $x_1, \dots, x_n$ in increasing order, and $A_{(i)}$ represents the $n - i + 1$ largest elements of $x_1, \dots, x_n$.
- 5. A load interval prediction system for multi-distance clustering and information aggregation, characterized by being applied to the load interval prediction method for multi-distance clustering and information aggregation according to any one of claims 1-4, and comprising a clustering module, a local-global model training module, a local-global model prediction module and an information aggregation module; the clustering module is used for clustering the normalized load curve data through a K-means algorithm based on multiple distances to obtain K classes of curve sets and, for the i-th class result, dividing the result and the corresponding meteorological data into a training set and a test set; the local-global model training module is used for respectively inputting the training set and the test set into the intra-day local fluctuation probability prediction model and the intra-day overall trend probability prediction model for training, to obtain both models with updated parameters; the local-global model prediction module is used for respectively inputting the feature values of the test set into the two models to obtain the upper and lower limits of the two model intervals at the t quantile; the information aggregation module is used for respectively carrying out information aggregation on the upper limits and the lower limits of the prediction intervals of the two models, and calculating the aggregated prediction interval through a Choquet integral aggregation function.
- 6. A storage medium storing a program, wherein the program when executed by a processor implements the load interval prediction method for multi-distance clustering and information aggregation according to any one of claims 1 to 4.
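The sample-to-center distance used in step (3) of claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `dtw_distance`, `multi_distance` and `assign_to_nearest` are hypothetical, the "previous term difference method" is assumed to mean first-order differencing (`numpy.diff`), and the three weights are passed in by hand rather than computed by the entropy weight method.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |a_i - b_j| as local cost."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated-cost table
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


def multi_distance(x, c, weights):
    """Weighted sum of Euclidean, DTW and differenced-DTW distances.

    `weights` = (w1, w2, w3) is an assumption of this sketch; the claim
    derives the weights with the entropy weight method.
    """
    x = np.asarray(x, dtype=float)
    c = np.asarray(c, dtype=float)
    d_euclid = float(np.linalg.norm(x - c))
    d_dtw = dtw_distance(x, c)
    # Assumed reading of "previous term difference": first-order differences.
    d_diff_dtw = dtw_distance(np.diff(x), np.diff(c))
    w1, w2, w3 = weights
    return w1 * d_euclid + w2 * d_dtw + w3 * d_diff_dtw


def assign_to_nearest(samples, centers, weights):
    """Return, for each sample, the index of the closest cluster center."""
    return [
        int(np.argmin([multi_distance(s, c, weights) for c in centers]))
        for s in samples
    ]
```

With equal weights, a flat curve and a peaked curve are assigned to the matching centers, for example `assign_to_nearest([[0, 0.1, 0, 0, 0], [0, 1, 2, 1, 0.1]], [[0, 0, 0, 0, 0], [0, 1, 2, 1, 0]], (1/3, 1/3, 1/3))`. The DTW terms let two curves with the same shape but a small time shift stay close even when their point-by-point Euclidean distance is large.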
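The interval aggregation of claim 4 can likewise be sketched in Python. This is an illustration under stated assumptions, not the patented implementation: the fuzzy measure is assumed to be the cardinality-based power measure (|A|/n)^q with q = 2, the inputs are assumed to be non-negative (for example normalized loads in [0, 1]), and the names `choquet_power_measure` and `aggregate_interval` are hypothetical.

```python
import numpy as np


def choquet_power_measure(values, q=2.0):
    """Discrete Choquet integral with the assumed measure mu(A) = (|A| / n) ** q.

    Values are sorted in increasing order; the i-th increment is weighted by
    the measure of the set holding the (n - i + 1) largest elements.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    total, previous = 0.0, 0.0  # previous plays the role of x_(0) = 0
    for i, xi in enumerate(x):
        remaining = n - i  # number of elements that are >= xi
        total += (xi - previous) * (remaining / n) ** q
        previous = xi
    return float(total)


def aggregate_interval(lowers, uppers, q=2.0):
    """Aggregate lower bounds and upper bounds separately, one value each.

    `lowers` and `uppers` hold one bound per model (two models in the claims).
    """
    return choquet_power_measure(lowers, q), choquet_power_measure(uppers, q)
```

For two models with lower bounds 0.40 and 0.44 and upper bounds 0.80 and 0.90, `aggregate_interval([0.40, 0.44], [0.80, 0.90])` gives approximately (0.41, 0.825). With q = 2 and two inputs, the single-element sets receive measure 0.25, so each aggregated bound lies between the two model bounds and closer to the smaller one.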
Description
Load interval prediction method, system and medium for multi-distance clustering and information aggregation
Technical Field
The invention belongs to the technical field of power load prediction, and particularly relates to a load interval prediction method, system and medium for multi-distance clustering and information aggregation.
Background
Load prediction can be divided into deterministic prediction and probabilistic prediction according to the form of the prediction result. Deterministic prediction yields a single-point expected value of the predicted object at a future time, whereas probabilistic prediction gives the result at the future time as a probability distribution or confidence interval, which allows the prediction uncertainty to be analyzed quantitatively. Compared with deterministic prediction, probabilistic prediction provides the power system with richer and more accurate uncertainty information.
Luo Fengzhang, Zhang Xu, Yang Xin, et al. Load analysis and prediction of integrated energy distribution systems based on deep learning [J]. High Voltage Engineering, 2021, 47(01): 23-32, constructs a load prediction model based on a convolutional neural network and support vector regression, in which the convolutional neural network extracts implicit features of the data to obtain a relatively accurate prediction result.
Li Dan, Zhang Yuanhang, Yang Baohua, et al. Short-term power load probability prediction method based on constrained parallel LSTM quantile regression [J]. Power System Technology, 2021, 45(04): 1356-1364, proposes a quantile regression prediction method based on a constrained parallel long short-term memory (LSTM) neural network, which combines LSTM with quantile regression and takes the constraint relation between quantile predictions into account, thereby obtaining better probability prediction results.
Zhang Shuqing, Li Jun, Jiang Anqi, et al. Novel two-stage short-term power load prediction based on FPA-VMD and BiLSTM neural networks [J]. Power System Technology, 2022, 46(08): 3269-3279, presents a two-stage load prediction method that combines optimized variational mode decomposition with a bidirectional long short-term memory neural network (BiLSTM), exploits the ability of the BiLSTM network to mine the temporal characteristics of the data, and verifies its effectiveness with a worked example.
Huang Natian, Liu Debao, Cai Guowei, et al. Interval prediction of electric vehicle charging load based on scenario generation with multiple correlated days [J]. Proceedings of the CSEE, 2021, 41(23): 7980-7990, introduces a scenario generation method before prediction, which enriches the expression of scenario features and further improves the accuracy of the prediction model.
A further reference [Automation of Electric Power Systems, 2015, 39(12): 56-61] provides a load prediction method based on multi-level clustering and a support vector machine, which performs multi-level clustering on cell loads according to each attribute to obtain a more accurate prediction result.
However, because daily load curves do not follow a uniform pattern, the above documents, which rely on a single model to learn all the variation patterns of the load curves, suffer from unsatisfactory model training and low prediction accuracy.
Disclosure of Invention
The main purpose of the invention is to overcome the defects and shortcomings of the prior art and to provide a load interval prediction method, system and medium for multi-distance clustering and information aggregation.
In order to achieve the above purpose, the present invention adopts the following technical scheme: the invention provides a load interval prediction method for multi-distance clustering and information aggregation, which comprises the following steps: clustering the normalized load curve data through a K-means algorithm based on multiple distances to obtain K classes of curve sets; for the i-th class result, dividing the result and the corresponding meteorological data into a training set and a test set; respectively inputting the training set and the test set into an intra-day local fluctuation probability prediction model and an intra-day overall trend probability prediction model for training, to obtain both models with updated parameters; respectively inputting the feature values of the test set into the two models to obtain the upper and lower limits of the two model intervals at the t quantile; and respectively carrying out information aggregation on the upper limits and the lower limits of the prediction intervals of the two models, and calculating the aggregated prediction interval through a Choquet integral aggregation function. As a preferred technical scheme, the K-means algorithm based on multiple distances is specifically as follows: (1) Inputting an equal-length curve sample data set S, S= { S 1,s1