CN-121980275-A - Multi-source driving factor layering analysis and interpretable modeling method, system, equipment and medium for power system
Abstract
The invention belongs to the technical field of electricity consumption prediction and discloses a method, a system, equipment and a medium for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system, addressing the insufficient prediction accuracy, poor model interpretability and disjointed analysis workflow of the prior art. The method comprises: S1, multi-source heterogeneous data fusion and preprocessing; S2, analysis of the intrinsic patterns of electricity consumption based on time-series autocorrelation; S3, quantification of multi-factor correlation strength based on the Spearman correlation coefficient; S4, analysis of nonlinear contribution based on XGBoost and SHAP values; S5, comprehensive importance fusion based on a two-dimensional weighted geometric average; and S6, multidimensional sensitivity and marginal effect analysis based on interaction terms. The invention achieves high-accuracy, interpretable, closed-loop power load analysis with quantified driving factor contributions.
Inventors
- FANG ZHICHUN
- YANG LINHUA
- YAN YUAN
- YE HONGDOU
- CHEN LINHONG
- CHEN YUYANG
- LOU XIAWEI
- WU HAOTIAN
- ZHOU LUYAO
- WANG QIANYING
- GAO HAN
Assignees
- 国网浙江省电力有限公司营销服务中心 (State Grid Zhejiang Electric Power Co., Ltd. Marketing Service Center)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-04-03
Claims (10)
- 1. A method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system, characterized by comprising the following steps: S1, acquiring historical electricity consumption, meteorological elements and economic index data of a target area, and preprocessing them to form a standardized feature matrix; S2, performing time-series autocorrelation analysis on the historical electricity consumption sequence in the standardized feature matrix, and identifying its inherent periodic fluctuation and long-term trend; S3, quantifying the linear association strength between each external driving factor and the electricity consumption based on the standardized feature matrix to obtain a Spearman rank correlation coefficient and a coefficient of determination, and performing preliminary screening based on the Spearman rank correlation coefficient and the coefficient of determination to obtain a preliminary candidate factor set; S4, constructing a plurality of feature subsets based on the preliminary candidate factor set, training an electricity consumption prediction model using the XGBoost algorithm, calculating the average marginal contribution of each feature to the model output based on the SHAP method, obtaining the SHAP value and the global SHAP importance of each feature, and ranking the preliminary candidate factors by global SHAP importance; S5, separately normalizing the absolute value of the Spearman correlation coefficient and the global SHAP importance of each feature in the preliminary candidate factor set to obtain a correlation dimension score and a SHAP dimension score, calculating a comprehensive importance score based on a two-dimensional weighted geometric average, and, after sorting by the comprehensive importance score, extracting an important meteorological factor set and an important economic factor set respectively; and S6, constructing a polynomial regression model based on the important meteorological factor set and the important economic factor set, and outputting a key driving factor contribution quantification result through calculation.
- 2. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein the preprocessing in step S1 comprises data cleaning, time alignment and standardization, and the standardization uses the Z-score method.
- 3. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein in step S4, constructing a plurality of feature subsets based on the preliminary candidate factor set specifically comprises: for each target factor in the preliminary candidate factor set, constructing a feature subset that contains the target factor and a feature subset that does not contain the target factor.
- 4. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein the global SHAP importance in step S4 is obtained by averaging, for each feature dimension, the SHAP values of that feature over all samples.
- 5. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein in step S5 the comprehensive importance score based on the two-dimensional weighted geometric average is calculated by the following formula: $S_i = C_i^{\alpha} \cdot H_i^{\beta}$, where $S_i$ is the comprehensive importance score of the $i$-th feature, $C_i$ is the normalized correlation dimension score of the $i$-th feature, $H_i$ is the normalized SHAP dimension score of the $i$-th feature, and $\alpha$ and $\beta$ are weight coefficients satisfying $\alpha + \beta = 1$.
- 6. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein the polynomial regression model in step S6 comprises linear terms, quadratic terms and interaction terms, and is expressed as: $y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{1 \le i < j \le n} \beta_{ij} x_i x_j + \sum_{i=1}^{n} \beta_{ii} x_i^2 + \varepsilon$, where $y$ is the predicted electricity consumption, $\beta_0$ is the intercept term, $\beta_i$ is the coefficient of the linear term of the $i$-th feature, $\beta_{ij}$ is the coefficient of the interaction term between the $i$-th and $j$-th features, $\beta_{ii}$ is the coefficient of the quadratic term of the $i$-th feature, $\varepsilon$ is the error term, and $n$ is the number of key driving factors involved in the polynomial modeling.
- 7. The method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to claim 1, wherein in step S6, outputting the key driving factor contribution quantification result through calculation specifically comprises: calculating partial derivatives to obtain the marginal effect and interaction effect indexes of each key driving factor under different meteorological and economic scenarios, and outputting the key driving factor contribution quantification result, which comprises sensitivity curves and interaction effect indexes of each key factor under different meteorological and economic scenarios.
- 8. A system for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system, characterized by being configured to implement the method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to any one of claims 1 to 7.
- 9. A computer device comprising a memory, a processor and a computer program, wherein the computer program, when executed by the processor, implements the method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to any one of claims 1 to 7.
- 10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system according to any one of claims 1 to 7.
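As an illustration of the fusion step (S5) and the partial-derivative analysis (S6) recited in the claims, the following Python sketch shows one way these two calculations could be carried out. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the divide-by-maximum normalization, the default equal weights, the requirement that the two weights sum to 1, and all example numbers are assumptions made for illustration, since the claims do not fix them.

```python
from math import isclose


def normalize_by_max(values):
    """Scale non-negative scores into (0, 1] by dividing by the maximum.

    Assumption: the claims only say "normalizing"; this scheme is one choice.
    """
    top = max(values)
    if top <= 0:
        raise ValueError("at least one score must be positive")
    return [v / top for v in values]


def fused_importance(abs_rho, shap_importance, w_corr=0.5, w_shap=0.5):
    """Two-dimensional weighted geometric mean of the normalized scores.

    abs_rho: |Spearman coefficient| per feature.
    shap_importance: global SHAP importance per feature.
    Assumption: the weights are required to sum to 1.
    """
    if not isclose(w_corr + w_shap, 1.0):
        raise ValueError("weights are assumed to sum to 1")
    corr_scores = normalize_by_max(abs_rho)
    shap_scores = normalize_by_max(shap_importance)
    return [c ** w_corr * h ** w_shap
            for c, h in zip(corr_scores, shap_scores)]


def marginal_effect(k, x, linear, quadratic, interaction):
    """Partial derivative of the polynomial model with respect to x[k].

    Model: y = b0 + sum_i linear[i]*x[i]
               + sum_{i<j} interaction[(i, j)]*x[i]*x[j]
               + sum_i quadratic[i]*x[i]**2
    so dy/dx_k = linear[k] + 2*quadratic[k]*x[k]
                 + sum over partners j of the interaction coefficient * x[j].
    """
    effect = linear[k] + 2.0 * quadratic[k] * x[k]
    for (i, j), coef in interaction.items():
        if i == k:
            effect += coef * x[j]
        elif j == k:
            effect += coef * x[i]
    return effect


if __name__ == "__main__":
    # Hypothetical scores for three candidate factors.
    scores = fused_importance([0.8, 0.4, 0.2], [2.0, 4.0, 1.0])
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print("fused scores:", scores, "ranking:", ranking)

    # Hypothetical fitted coefficients for two factors (e.g. temperature, GDP).
    effect = marginal_effect(0, x=[2.0, 1.0], linear=[1.0, 2.0],
                             quadratic=[0.5, 0.0], interaction={(0, 1): 3.0})
    print("marginal effect of factor 0:", effect)
```

Evaluating `marginal_effect` over a grid of values for one factor while holding the others at scenario-specific levels would yield the kind of sensitivity curve the claims describe. Because the fused score is a geometric mean, a factor that scores near zero in either dimension is pulled toward zero overall, which is a design property worth keeping in mind when choosing the normalization.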
Description
Multi-source driving factor layering analysis and interpretable modeling method, system, equipment and medium for power system
Technical Field
The invention belongs to the technical field of electricity consumption prediction, and particularly relates to a method, a system, equipment and a medium for hierarchical analysis and interpretable modeling of multi-source driving factors of a power system.
Background
Accurate load prediction for the power system is a core foundation for ensuring stable grid operation, optimizing resource allocation and improving energy efficiency. As power systems continue to grow in scale and the share of renewable energy continues to rise, the volatility, randomness and complexity of the power load have increased markedly, placing higher demands on the accuracy and interpretability of prediction methods. Traditional power load prediction methods rely mainly on time series analysis of historical electricity consumption data, for example autoregressive integrated moving average (ARIMA) models and their seasonal variants. Such methods rest on linearity and stationarity assumptions and can effectively identify and predict the periodic patterns and long-term trends inherent in load data. Their inherent limitation is that they cannot readily incorporate the direct effects of external multi-source driving factors such as weather conditions and economic activity, and in particular cannot characterize the nonlinear correlations and interactions that are prevalent between these factors and the power load. Under external shocks such as extreme weather events, holiday effects or macroeconomic policy adjustments, the prediction accuracy of traditional time series models often drops significantly.
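The periodic patterns that time series methods pick up can be made concrete with the sample autocorrelation function. The following Python sketch is illustrative only: the function name `autocorrelation`, the synthetic eight-week daily series and the range of lags examined are assumptions, not taken from the patent.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a non-negative integer lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


if __name__ == "__main__":
    # Synthetic daily consumption with a weekly pattern (lower on weekends),
    # repeated for eight weeks. The values are invented for illustration.
    week = [100, 102, 101, 103, 104, 80, 75]
    series = week * 8

    for lag in range(1, 11):
        print(lag, round(autocorrelation(series, lag), 3))

    # The lag with the largest autocorrelation indicates the dominant period.
    dominant = max(range(1, 11), key=lambda k: autocorrelation(series, k))
    print("dominant lag:", dominant)
```

On this synthetic series the weekly lag stands out clearly from its neighbours, which is how an autocorrelation analysis exposes inherent periodic fluctuation. Real consumption data would also need detrending or differencing before the lags are interpreted, which this sketch leaves out.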
To improve prediction performance, the prior art has introduced advanced machine-learning prediction models, such as ensemble learning algorithms including gradient boosting decision trees and random forests. Such models can automatically learn complex nonlinear mappings from high-dimensional data including historical loads, weather and economic indicators, and generally achieve higher point prediction accuracy than conventional statistical models. However, they are commonly regarded as "black boxes": their internal decision mechanisms lack transparency, and they do not provide a quantitative interpretation of the specific contribution of each input feature to the final prediction. This limits the reliability and operability of the prediction results in scenarios that require explicit causal logic and decision support, such as power scheduling and demand-side management.
To alleviate the interpretability problem, the prior art has tried introducing post-hoc interpretation techniques, such as game-theoretic SHAP value analysis, to attribute individual predictions of a trained model to its features. Statistical correlation analysis (such as Spearman rank correlation) is also used for preliminary screening of input features to reduce model complexity. However, existing schemes typically organize these steps into a loose, staged pipeline: first, preliminary feature screening based on simple statistical correlation; then, training of a machine learning model in pursuit of optimal prediction accuracy; and finally, interpretability analysis as a separate, after-the-fact add-on.
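To make the correlation-based screening step concrete, the following Python sketch computes a Spearman rank correlation from scratch and applies it to two synthetic relationships. It is a sketch under stated assumptions: the helper names `average_ranks`, `pearson` and `spearman` and the synthetic data are invented for illustration, and the patent does not specify how the coefficient is computed.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Synthetic temperature deviation from a comfortable set point.
    deviation = list(range(-10, 11))
    monotonic_load = [d ** 3 for d in deviation]  # rises steadily
    u_shaped_load = [d * d for d in deviation]    # heating AND cooling demand

    print("monotonic:", spearman(deviation, monotonic_load))
    print("U-shaped: ", spearman(deviation, u_shaped_load))
```

The monotonic relationship scores close to 1, while the symmetric U-shaped relationship scores close to 0 even though the load depends entirely on the temperature deviation. A screening threshold on the absolute coefficient would therefore discard the second factor.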
This framework has three obvious defects. First, preliminary feature screening based on linear or monotonic correlation may prematurely remove key factors whose contribution is complex and nonlinear. Second, the interpretability analysis is disconnected from model training and optimization, so the in-depth interpretation it produces cannot be fed back into, and organically fused with, feature evaluation and model decisions. Third, there is no integrated analysis means that can systematically quantify the nonlinear interaction effects among multiple factors and their scenario-dependent influence. This disconnection makes it difficult for the prior art to achieve a closed-loop transformation from "high-accuracy prediction" to "interpretable, quantifiable, actionable insight", and it fails to provide decision support that is both accurate and transparent for power system scheduling under complex multi-factor coupling conditions.
Disclosure of Invention
In view of the above shortcomings of the prior art, an object of the present invention is to solve at least one of the above problems of the prior art; in other words, to provide a method, a system, a device and a medium for hierarchical analysis and