CN-122022895-A - Method and system for predicting day-ahead electricity price of electric power market based on enhanced linear Stacking framework
Abstract
The invention relates to the technical field of electric power market prediction, and in particular to a method and system for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework. The method comprises: obtaining multi-source heterogeneous data and preprocessing it to obtain a standardized feature matrix; calculating feature importance with a random forest and screening an optimal feature subset by combining K-fold time-series cross-validation with marginal benefit analysis; constructing two base learners and training them with K-fold time-series cross-validation to obtain out-of-fold prediction vectors; extracting statistical features and using them, together with the out-of-fold prediction vectors, as inputs to a meta-learner; searching the optimal hyperparameters of the base learners by Bayesian optimization and refitting them; training the meta-learner to obtain a Stacking ensemble model; and inputting a test sample into the model and applying inverse transformation to obtain the predicted day-ahead electricity price on the original scale. The invention can improve the accuracy and stability of electricity price prediction and is suitable for day-ahead electricity price prediction in electric power markets.
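The out-of-fold training summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the single-feature ridge regressor stands in for the base learners, the helper names (`time_series_folds`, `fit_ridge_1d`, `out_of_fold_predictions`) and the toy data are hypothetical, and the splitting rule (contiguous blocks, train on all earlier blocks, validate on the next one) is one assumed reading of the K-fold time-series cross-validation.

```python
from statistics import fmean


def time_series_folds(n_samples, n_blocks):
    """Yield (train_idx, val_idx) pairs for expanding-window validation.

    The samples are cut into n_blocks contiguous blocks in time order.
    Round k trains on blocks 0..k-1 and validates on block k, so every
    validation sample is later than every training sample (assumed scheme).
    """
    bounds = [i * n_samples // n_blocks for i in range(n_blocks + 1)]
    for k in range(1, n_blocks):
        train_idx = list(range(0, bounds[k]))
        val_idx = list(range(bounds[k], bounds[k + 1]))
        yield train_idx, val_idx


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit of y = w * x + b for one feature.

    Only the slope w is penalised by alpha; the intercept is not.
    """
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    w = sxy / (sxx + alpha)
    return w, y_bar - w * x_bar


def out_of_fold_predictions(xs, ys, n_blocks, alpha):
    """Return {sample index: prediction} from models that never saw that sample."""
    oof = {}
    for train_idx, val_idx in time_series_folds(len(xs), n_blocks):
        w, b = fit_ridge_1d([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx], alpha)
        for i in val_idx:
            oof[i] = w * xs[i] + b
    return oof


# Hypothetical toy data: a noiseless linear relation y = 2x + 1.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 for x in xs]
folds = list(time_series_folds(len(xs), 4))
oof = out_of_fold_predictions(xs, ys, n_blocks=4, alpha=0.0)
```

Under this splitting rule the earliest block never receives an out-of-fold prediction, because no earlier data exists to train on. A meta-learner fitted on `oof` therefore sees only predictions made without access to the corresponding targets, which is the property that guards against information leakage.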
Inventors
- YOU XINYA
- WANG XING
- YAO SHUANGLONG
- LIU YE
- ZHAO JIANGMIN
Assignees
- Linyi University (临沂大学)
- Yicai (Beijing) Technology Co., Ltd. (益彩(北京)科技有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-03-31
Claims (9)
- 1. A method for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework, characterized by comprising the following steps: S1, acquiring multi-source heterogeneous data of the electric power market, constructing a feature matrix from a plurality of groups of data samples, and systematically preprocessing the raw data to obtain a standardized feature matrix; S2, constructing a random forest model, calculating importance scores of all features in the standardized feature matrix, sorting the features in descending order of importance score, evaluating the prediction performance of different feature combinations through K-fold time-series cross-validation, and screening an optimal feature subset based on marginal benefit analysis; S3, constructing two base learners, namely a linear regression model with L2 regularization as the first base learner and a multi-layer perceptron model comprising three hidden layers as the second base learner, wherein both base learners take the optimal feature subset as input and each outputs a predicted electricity price value; S4, for the standardized feature matrix restricted to the optimal feature subset, training the two base learners with a K-fold time-series cross-validation strategy to obtain the out-of-fold prediction vector of each base learner; S5, constructing a linear regression meta-learner that takes as inputs the out-of-fold prediction vectors of the two base learners and the statistical features extracted from the standardized feature matrix, and that dynamically fuses the prediction results by learning weight coefficients; S6, searching the regularization coefficient of the first base learner and the numbers of hidden-layer neurons and the dropout rate of the second base learner with a Bayesian optimization algorithm, determining the optimal hyperparameters, refitting the two base learners, and training the meta-learner on the outputs of the refitted base learners to obtain a trained multi-layer Stacking ensemble model; and S7, inputting a test sample into the trained multi-layer Stacking ensemble model to obtain a predicted electricity price value, and sequentially applying inverse standardization and inverse differencing to the predicted value to restore the predicted day-ahead electricity price to the original price scale.
- 2. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 1, wherein S1 is specifically as follows: S1.1, data acquisition: the electric power market operation data comprise the day-ahead marginal electricity price, the real-time marginal electricity price, system-level electricity demand, generation data for each energy type, and meteorological data; a sampling frequency is set, and the multidimensional feature variables collected at the same time step form one data sample, each data sample being a vector of original features; data are collected continuously over a period of time to form time-series data samples; S1.2, data preprocessing: data samples containing null values are deleted if the missing proportion exceeds the missing-proportion threshold, and a forward-filling strategy is adopted if the missing proportion is below the threshold; a first-order differencing operation is performed on all data samples after missing-value processing to eliminate the non-stationarity of the time series; for the data samples after missing-value processing and first-order differencing, the feature matrix is built with a sliding window of a set length and a step length of 1: the window slides backwards in sequence from the 1st data sample, and each time one window length of consecutive data samples is intercepted as one training sample, so that sliding-window processing produces the full set of training samples; the feature matrix of each training sample, whose rows are the time steps of the window and whose columns are the original features, is flattened into a one-dimensional feature vector whose length is the flattened feature dimension; the flattened vectors of all training samples together form the complete feature matrix, in which the element in row i and column j represents the j-th feature of the i-th training sample; and Z-score standardization is performed on each original feature dimension of the complete feature matrix to obtain the standardized feature matrix.
- 3. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 1, wherein S2 is specifically as follows: a random forest regressor is constructed, the standardized feature matrix is input, and the average contribution of each feature dimension to the reduction in prediction error at decision-tree node splits is calculated to obtain its importance score; the features are sorted in descending order of importance score to form a candidate feature sequence, whose first element is the most important feature and whose last element is the feature with the lowest importance score; the first m features are selected from the candidate feature sequence and K-fold time-series cross-validation is performed on the standardized feature matrix, in which the first k folds are used for training and the (k+1)-th fold for validation, to obtain the mean square error of each fold, and the per-fold mean square errors are averaged to obtain the mean MSE corresponding to the first m features; m is increased step by step from 1 to the total number of candidate features, the mean MSE is calculated for each m, and after the traversal the results form a sequence of mean MSE values indexed by m; taking the mean MSE obtained when only the most important feature is used as the baseline, the marginal benefit of the prediction performance after the first m features are included is evaluated as the ratio of the difference between the baseline and the mean MSE of the first m features to the baseline; and as m goes from 1 to the total number of candidate features, when the value of the marginal benefit tends to stabilize, the current m is selected as the optimal feature number and the corresponding features form the optimal feature subset.
- 4. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 1, wherein S3 is specifically as follows: the optimal feature subset is input into each of the two base learners, the optimal feature subset comprising all training samples, with the feature dimension of each training sample reduced after step S2 to the optimal feature number; S3.1, a first base learner is constructed based on a linear regression model, and each training sample in the optimal feature subset is passed through the first base learner to output the corresponding predicted electricity price value; the first base learner is trained with a regularized loss function: Ridge regression is adopted, which adds an L2-norm penalty term to the standard least-squares loss (the residual sum of squares), and the optimal parameters of the first base learner are solved by minimizing the regularized loss function; S3.2, a second base learner is constructed based on a multi-layer perceptron (MLP), wherein the MLP is fully connected and comprises an input layer, three hidden layers and an output layer, each of the three hidden layers has its own number of neurons, and the dimension of the output layer is 1; the training samples in the optimal feature subset are input into the second base learner, whose network generates the corresponding predicted electricity price values through layer-by-layer transformation; the second base learner is trained by updating its trainable parameters with the Adam optimizer, which adapts the learning rate by maintaining first-order and second-order moment estimates of the gradient, the first-order moment estimate being an exponentially weighted moving average of the gradient and the second-order moment estimate being an exponentially weighted moving average of the squared gradient, and the loss function is the mean square error.
- 5. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 4, wherein S4 is specifically as follows: S4.1, the optimal feature subset is divided in time order into K mutually non-overlapping subsets, K being the number of folds; under the K-fold time-series cross-validation strategy, in the k-th round of validation the subsets preceding the validation fold are used as the training set and the validation fold is used as the validation set; S4.2, for the first base learner, the out-of-fold prediction of the k-th fold is carried out: the L2-regularized loss function is first minimized on the training set of the k-th fold to solve the parameters of the linear regression model, giving the parameters of the first base learner for the k-th fold, and these parameters are then used to predict each sample in the validation set, generating the out-of-fold predicted values of the first base learner for the k-th fold; S4.3, for the second base learner, the out-of-fold prediction of the k-th fold is carried out: the mean square error loss function is first minimized on the training set of the k-th fold with the Adam optimizer to solve the trainable parameters of the second base learner for the k-th fold, and these parameters are then used to predict each sample in the validation set, generating the out-of-fold predicted values of the second base learner for the k-th fold; S4.4, steps S4.2 and S4.3 are repeated for k from 1 to K, and the results of all folds are concatenated in the order of the original sample data to form complete out-of-fold prediction vectors, one for the first base learner and one for the second base learner.
- 6. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 5, wherein S5 is specifically as follows: S5.1, for the i-th sample, the corresponding out-of-fold predicted values are taken from the out-of-fold prediction vector of the first base learner and that of the second base learner respectively; S5.2, for each training sample in the standardized feature matrix, statistical indicators based on a historical observation window are extracted, comprising the sequence mean, standard deviation, maximum, minimum and a trend indicator; S5.3, the out-of-fold predicted values and the statistical indicators corresponding to each sample are input into the meta-learner, which forms a weighted sum of its inputs and outputs the final predicted electricity price value for the i-th sample.
- 7. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 6, wherein S6 is specifically as follows: S6.1, hyperparameter optimization is carried out on each of the two base learners with a Bayesian optimization algorithm; for the first base learner, the search space contains an indicator variable τ for the base learner type and an L2 regularization coefficient α, where τ takes the value 0 or 1, corresponding to standard linear regression and Ridge regression respectively, and α takes a value between 0.001 and 10.0 and takes effect only when τ = 1, the optimal indicator variable and L2 regularization coefficient being obtained through Bayesian optimization; for the second base learner, the numbers of neurons in the first and second hidden layers and the Dropout rate are optimized by Bayesian optimization subject to a constraint on their values, the number of neurons in the third hidden layer is calculated with a geometric-mean heuristic, and the optimal neuron numbers and dropout rate are obtained through Bayesian optimization; S6.2, after the optimal hyperparameters of the two base learners are determined, the base learners are refitted and the meta-learner is trained on the outputs of the refitted base learners to obtain a trained multi-layer Stacking ensemble model, which comprises the two refitted base learners and the trained meta-learner.
- 8. The method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to claim 7, wherein S7 is specifically as follows: new data are obtained as a test sample and input into the trained multi-layer Stacking ensemble model to obtain a predicted electricity price value; the predicted value is inverse-standardized using the standardization parameters determined in step S1, which restores it to the differenced scale; inverse differencing is then performed using the original electricity price of the time step preceding the current test sample, that is, the predicted value on the differenced scale is added to the original electricity price of the preceding time step; and the predicted day-ahead electricity price on the original price scale is output.
- 9. A system for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework, which performs the method for predicting the day-ahead electricity price of the electric power market based on the enhanced linear Stacking framework according to any one of claims 1 to 8, characterized by comprising the following modules: a data acquisition module for acquiring multi-source heterogeneous data of the electric power market; a preprocessing module for performing missing-value processing, differencing, feature matrix construction and standardization; a feature screening module for evaluating random forest feature importance and performing time-series cross-validation; a hyperparameter optimization module for searching the optimal hyperparameter configuration of the first base learner and the second base learner through Bayesian optimization; a model training module comprising the first base learner, the second base learner and the meta-learner, for constructing and training the multi-layer Stacking ensemble model; a post-processing module for applying inverse transformation to the prediction output by the multi-layer Stacking ensemble model; and an output module for outputting the predicted day-ahead electricity price on the original price scale.
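The differencing, standardization and sliding-window steps of claim 2, together with the inverse transformation of claim 8, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: the function names and the price values are hypothetical, a single price series stands in for the multi-feature data, the population standard deviation is assumed for the Z-score, and the window count `len(values) - length + 1` is the usual result for a step length of 1, which the claims do not state explicitly.

```python
from statistics import fmean, pstdev


def first_difference(series):
    """Return d[t] = series[t] - series[t-1] for t >= 1."""
    return [b - a for a, b in zip(series, series[1:])]


def zscore_fit(values):
    """Return the (mean, std) pair used for Z-score standardisation.

    A zero standard deviation is replaced by 1.0 to avoid division by zero.
    """
    std = pstdev(values)
    if std == 0:
        std = 1.0
    return fmean(values), std


def sliding_windows(values, length):
    """Return every window of `length` consecutive values, moving with step 1."""
    return [values[i:i + length] for i in range(len(values) - length + 1)]


def restore_price(pred_standardised, mean, std, previous_price):
    """Undo standardisation, then undo differencing.

    The de-standardised value is a predicted price change; adding the raw
    price of the preceding time step returns it to the original price scale.
    """
    pred_difference = pred_standardised * std + mean
    return previous_price + pred_difference


# Hypothetical day-ahead prices, for illustration only.
prices = [310.0, 325.0, 318.0, 340.0, 335.0, 352.0]
diffs = first_difference(prices)
mean, std = zscore_fit(diffs)
scaled = [(d - mean) / std for d in diffs]
windows = sliding_windows(scaled, 3)

# Treat the last standardised difference as if it were a perfect model output.
restored = restore_price(scaled[-1], mean, std, previous_price=prices[-2])
```

Feeding the true last standardized difference through `restore_price` returns the last raw price up to floating-point error, which is a quick way to check that the inverse steps are applied in the right order: inverse standardization first, inverse differencing second.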
Description
Method and system for predicting day-ahead electricity price of electric power market based on enhanced linear Stacking framework
Technical Field
The invention relates to the technical field of electric power market prediction, and in particular to a method and system for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework.
Background
The day-ahead electricity price serves as a core signal for market clearing. It directly influences the bidding decisions and revenue management of power generation enterprises, and it plays a key role in supply-demand balance scheduling by grid operators, price monitoring by market regulators, and risk assessment by investors. Accurate prediction of the day-ahead electricity price provides forward-looking information to market participants, helps power generation enterprises optimize unit commitment and bidding strategies, reduces the economic losses caused by price fluctuation, and gives system operators decision support for maintaining safe and stable operation of the power grid.
However, electricity price formation is influenced by the complex coupling of multiple factors such as the supply-demand relationship, fuel cost, transmission constraints, weather conditions and market gaming, so the price series exhibits significant nonlinearity, non-stationarity and high volatility, which makes accurate prediction very challenging. The prior art has the following shortcomings. Traditional statistical methods (such as ARIMA and LEAR) rely on linear assumptions and have difficulty capturing nonlinear fluctuations of the electricity price. Traditional machine learning methods (such as SVR and KNN) can model nonlinearity but describe long-range temporal dependencies poorly. Deep learning methods (LSTM, GRU, Transformer) have strong time-series modeling capability, but in power-market settings with limited samples and high noise they are prone to overfitting; their structure is complex and their interpretability is poor, so they form "black box" models that undermine the reliability of market decisions. A single model cannot meet the requirements of nonlinear fitting, generalization and interpretability at the same time, and struggles to capture both the regular trend and the sudden disturbances of the electricity price. Existing ensemble learning schemes also have clear shortcomings: most stack complex models as base learners and remain prone to overfitting; only the prediction results are used, without dynamic weighting that incorporates market statistical features; training lacks an out-of-fold prediction mechanism, so information leakage occurs; preprocessing is not standardized and reproducibility is low; and the ensemble weights are opaque, so interpretability is insufficient.
Overall, current methods struggle to balance model complexity, generalization performance, interpretability and engineering practicality, and cannot meet the high-accuracy, high-credibility day-ahead electricity price prediction requirements of a mature electric power market. In summary, the problems the invention sets out to overcome are: the contradiction between model complexity and generalization capability; the contradiction between prediction accuracy and decision interpretability; the modeling conflict between linear trend and nonlinear fluctuation; the lack of market-state identification and dynamic weight adjustment; and the risk of information leakage in ensemble learning. Therefore, the invention provides a method and a system for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework to solve these problems.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a method and a system for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework. The invention integrates feature screening, hyperparameter optimization and ensemble learning to construct a prediction model that captures nonlinear complex relationships while achieving high generalization robustness and strong decision interpretability, thereby improving the accuracy and stability of electricity price prediction. In one aspect, the technical solution adopted to solve the technical problem is a method for predicting the day-ahead electricity price of an electric power market based on an enhanced linear Stacking framework, comprising the following steps: S1, acquiring multi-source heterogeneous data of an electric power market, constructing a feature matrix from a plurality of groups of data samples, and per