CN-121598090-B - Time sequence load data enhancement generation method and device based on hidden Markov model
Abstract
The invention provides a time-series load data enhancement generation method and device based on a hidden Markov model, relating to the technical field of power system data processing. Historical and real-time raw time-series load data are divided and time-aligned according to key time periods, and fluctuation features are extracted to obtain historical and real-time fluctuation features. A hidden Markov model is used to model the different fluctuation operating states in the historical fluctuation features, and an online identification model of the fluctuation operating states is established; with the real-time fluctuation features as input, the model outputs the identified fluctuation operating state in real time. Noise corresponding to the identified state is then adaptively generated and injected into the time-aligned real-time load data, thereby generating enhanced load data. By adopting a training-free load data synthesis framework based on the hidden Markov model, the invention can automatically and accurately identify the periodic characteristics and the different fluctuation operating states in the load data without relying on a large amount of historical data, and completes data enhancement through adaptive noise injection.
Inventors
- QIAN JIANGUO
- SONG JINGYUN
- XU FENG
- HU JIHENG
- DIAO RUISHENG
- CHEN JIACHENG
- LIU HAN
- ZHANG HUI
- HUANG CHENGSI
- WO JIANDONG
- ZHANG JING
- HUANG YINQIANG
- QUE LINGYAN
- LV QIN
- SHI YUNHUI
- CHEN SHIXUAN
- Qiao Tiantong
Assignees
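- State Grid Zhejiang Electric Power Co., Ltd. Jinhua Power Supply Company (国网浙江省电力有限公司金华供电公司)
- State Grid Zhejiang Electric Power Co., Ltd. (国网浙江省电力有限公司)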
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-29
Claims (8)
- 1. A time-series load data enhancement generation method based on a hidden Markov model, characterized by comprising the following steps: S1, dividing and time-aligning the acquired historical and real-time raw time-series load data according to key time periods to obtain time-aligned historical load data and real-time load data; S2, extracting fluctuation features from the time-aligned historical load data and real-time load data to obtain historical fluctuation features and real-time fluctuation features, modeling the different fluctuation operating states in the historical fluctuation features with a hidden Markov model, establishing an online identification model of the fluctuation operating states, taking the real-time fluctuation features as input, and outputting the identified fluctuation operating state in real time; S3, adaptively generating corresponding noise according to the identified fluctuation operating state, and injecting the noise into the time-aligned real-time load data to generate enhanced load data; wherein S2 comprises: S21, aggregating the time-aligned load data at each of several time scales to obtain aggregated load data for each time scale, and calculating the logarithmic return of the aggregated load data at each time scale; S22, extracting fluctuation index features of the aggregated load data at each time scale from the corresponding logarithmic return to describe its stochastic behavior; S23, interpolating the fluctuation index features of all time scales to a uniform time resolution to form a comprehensive fluctuation feature vector, wherein the comprehensive fluctuation feature vector exhibits clustering patterns corresponding to the different fluctuation operating states; S24, modeling the time evolution of the different fluctuation operating states in the comprehensive fluctuation feature vector with a Gaussian hidden Markov model, and establishing the online identification model of the fluctuation operating states; S25, after parameter estimation, the online identification model takes the real-time fluctuation features as input and outputs the identified fluctuation operating state in real time; and wherein S3 comprises: S31, calculating the mean fluctuation level and the adaptive scaling factor of each identified fluctuation operating state; S32, generating, based on the mean fluctuation level and the adaptive scaling factor, adaptive noise proportional to the fluctuation characteristics of the current fluctuation operating state; S33, injecting the adaptive noise into the reference load data, combining it with the reference load data, and generating the enhanced load data through upper- and lower-limit clipping.
- 2. The time-series load data enhancement generation method based on a hidden Markov model according to claim 1, wherein in S1 the dividing and time alignment according to key time periods specifically divides the data into weekdays and weekends and aligns the load data of a historical source year to the calendar of a target year, so as to correct the weekday/weekend misalignment caused by each year starting on a different day of the week, and specifically comprises: S11, calculating the weekday offset between the source year and the target year; S12, converting the weekday offset into an hour offset, cyclically shifting the original time-series source load data sequence by the hour offset using a modulo operation, calculating the time-aligned load value at each time t, and taking the time-aligned load data sequence as the predicted reference load data; S13, establishing a hierarchical autoregressive model for the residual between the actually observed load data and the predicted reference load data, the hierarchical autoregressive model being used to eliminate systematic deviation, and obtaining the reference load data after deviation elimination.
- 3. The time-series load data enhancement generation method based on a hidden Markov model according to claim 1, wherein in S22 the fluctuation index features include a realized volatility feature for quantifying the short-term fluctuation intensity of the load data, and a GARCH-type volatility clustering feature for modeling the temporal clustering of fluctuations in the load data.
- 4. The time-series load data enhancement generation method based on a hidden Markov model according to claim 1, wherein in S2, modeling the different fluctuation operating states in the historical fluctuation features with the hidden Markov model specifically uses a Gaussian hidden Markov model with K latent states, each latent state representing one fluctuation operating state with distinct stochastic characteristics; the time evolution of the different fluctuation operating states is modeled by a first-order Markov chain; and the Gaussian hidden Markov model consists of a state transition model and an observation model.
- 5. The time-series load data enhancement generation method based on a hidden Markov model according to claim 4, wherein in S25 the model parameters are estimated by an expectation-maximization algorithm with a diagonal covariance constraint, and a forward algorithm is used to predict and identify the fluctuation operating state.
- 6. A time-series load data enhancement generation method based on a hidden Markov model, characterized by comprising the following steps: S10, dividing and time-aligning the acquired historical and real-time raw time-series load data according to key time periods to obtain time-aligned historical load data and real-time load data, and time-synchronizing the acquired historical and real-time external condition data with the time-aligned historical load data and real-time load data, respectively, to obtain time-synchronized historical external condition data and real-time external condition data; S20, extracting fluctuation features from the time-aligned historical load data and real-time load data to obtain historical fluctuation features and real-time fluctuation features, performing feature coding on the time-synchronized historical external condition data and real-time external condition data to obtain historical external condition features and real-time external condition features, jointly modeling the different fluctuation operating states in the historical fluctuation features and the historical external condition features with a conditional hidden Markov model, establishing an improved online identification model of the fluctuation operating states, taking the real-time fluctuation features and the real-time external condition features as inputs, and outputting the identified fluctuation operating state in real time; wherein S20 comprises: S21, aggregating the time-aligned load data at each of several time scales to obtain aggregated load data for each time scale, and calculating the logarithmic return of the aggregated load data at each time scale; S22, extracting fluctuation index features of the aggregated load data at each time scale from the corresponding logarithmic return to describe its stochastic behavior; S23, interpolating the fluctuation index features of all time scales to a uniform time resolution to form a comprehensive fluctuation feature vector, wherein the comprehensive fluctuation feature vector exhibits clustering patterns corresponding to the different fluctuation operating states; S24, modeling the time evolution of the different fluctuation operating states in the comprehensive fluctuation feature vector with the Gaussian hidden Markov model to establish the online identification model of the fluctuation operating states; S25, after parameter estimation, the online identification model takes the real-time fluctuation features as input and outputs the identified fluctuation operating state in real time; S30, adaptively generating corresponding noise according to the identified fluctuation operating state, and injecting the noise into the time-aligned real-time load data to generate enhanced load data; wherein S30 comprises: S31, calculating the mean fluctuation level and the adaptive scaling factor of each identified fluctuation operating state; S32, generating, based on the mean fluctuation level and the adaptive scaling factor, adaptive noise proportional to the fluctuation characteristics of the current fluctuation operating state; S33, injecting the adaptive noise into the reference load data, combining it with the reference load data, and generating the enhanced load data through upper- and lower-limit clipping.
- 7. The time-series load data enhancement generation method based on a hidden Markov model according to claim 6, wherein the external condition data include weather data, calendar information and economic indicators.
- 8. A time-series load data enhancement generation device based on a hidden Markov model, for executing the method of any one of claims 1 to 5, comprising: a time alignment module, used for dividing and time-aligning the acquired historical and real-time raw time-series load data according to key time periods to obtain time-aligned historical load data and real-time load data; a real-time identification module, used for extracting fluctuation features from the time-aligned historical load data and real-time load data to obtain historical fluctuation features and real-time fluctuation features, modeling the different fluctuation operating states in the historical fluctuation features with a hidden Markov model, establishing an online identification model of the fluctuation operating states, taking the real-time fluctuation features as input, and outputting the identified fluctuation operating state in real time, wherein the real-time identification module performs: S21, aggregating the time-aligned load data at each of several time scales to obtain aggregated load data for each time scale, and calculating the logarithmic return of the aggregated load data at each time scale; S22, extracting fluctuation index features of the aggregated load data at each time scale from the corresponding logarithmic return to describe its stochastic behavior; S23, interpolating the fluctuation index features of all time scales to a uniform time resolution to form a comprehensive fluctuation feature vector, wherein the comprehensive fluctuation feature vector exhibits clustering patterns corresponding to the different fluctuation operating states; S24, modeling the time evolution of the different fluctuation operating states in the comprehensive fluctuation feature vector with a Gaussian hidden Markov model to establish the online identification model of the fluctuation operating states; S25, after parameter estimation, the online identification model takes the real-time fluctuation features as input and outputs the identified fluctuation operating state in real time; and a data enhancement module, used for adaptively generating corresponding noise according to the identified fluctuation operating state and injecting the noise into the time-aligned real-time load data to generate enhanced load data, wherein the data enhancement module performs: S31, calculating the mean fluctuation level and the adaptive scaling factor of each identified fluctuation operating state; S32, generating, based on the mean fluctuation level and the adaptive scaling factor, adaptive noise proportional to the fluctuation characteristics of the current fluctuation operating state; S33, injecting the adaptive noise into the reference load data, combining it with the reference load data, and generating the enhanced load data through upper- and lower-limit clipping.
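The calendar alignment of claim 2 (S11 and S12) can be illustrated with a minimal Python sketch. The claims do not state the sign convention of the shift or the sampling resolution, so the hourly resolution, the choice of 1 January as the reference day, and the function names `weekday_offset_days` and `align_to_target_calendar` are assumptions made for illustration only; the residual correction of S13 is not shown.

```python
from datetime import date

import numpy as np


def weekday_offset_days(source_year: int, target_year: int) -> int:
    """S11 (assumed form): weekday offset between 1 January of both years, in days."""
    ws = date(source_year, 1, 1).weekday()  # Monday = 0 ... Sunday = 6
    wt = date(target_year, 1, 1).weekday()
    return (wt - ws) % 7


def align_to_target_calendar(source_load, source_year, target_year, steps_per_day=24):
    """S12 (assumed form): cyclic shift so weekdays match the target calendar.

    aligned[t] = source[(t + shift) mod N], i.e. a modulo-indexed circular shift.
    """
    source_load = np.asarray(source_load, dtype=float)
    shift = weekday_offset_days(source_year, target_year) * steps_per_day
    return np.roll(source_load, -shift)


# Demo: encode the source series as "weekday of each hour" to make the effect visible.
hours = 365 * 24
source_weekday = np.array(
    [(date(2023, 1, 1).weekday() + h // 24) % 7 for h in range(hours)]
)
target_weekday = np.array(
    [(date(2024, 1, 1).weekday() + h // 24) % 7 for h in range(hours)]
)
aligned = align_to_target_calendar(source_weekday, 2023, 2024)
shift_hours = weekday_offset_days(2023, 2024) * 24
# The last `shift_hours` samples wrap around from the start of the source year,
# so weekday agreement is only expected before the wrap-around point.
```

Because a 365-day year is not a whole number of weeks, the samples that wrap around at the end of the series no longer match the target weekdays; a practical implementation would need to handle that tail separately.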
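Steps S21 to S23, together with the two feature types named in claim 3, might look roughly like the following Python sketch. The claims do not give the time scales, window length, or GARCH parameters, so the scales `(1, 4, 24)`, the rolling window of 6, the fixed GARCH(1,1) coefficients, and all function names are hypothetical; in particular the GARCH recursion here uses preset coefficients rather than fitted ones.

```python
import numpy as np


def aggregate(load, scale):
    """S21: block-average the series over `scale` base steps."""
    n = (len(load) // scale) * scale
    return load[:n].reshape(-1, scale).mean(axis=1)


def log_returns(x):
    """S21: logarithmic return r_t = ln(x_t / x_{t-1}); requires x > 0."""
    return np.diff(np.log(x))


def realized_volatility(r, window):
    """Rolling square root of the sum of squared log returns."""
    sq = np.concatenate(([0.0], np.cumsum(r ** 2)))
    out = np.empty(len(r))
    for i in range(len(r)):
        lo = max(0, i + 1 - window)
        out[i] = np.sqrt(sq[i + 1] - sq[lo])
    return out


def garch_variance(r, omega=1e-6, alpha=0.1, beta=0.85):
    """GARCH(1,1) conditional variance with assumed (not estimated) coefficients."""
    h = np.empty(len(r))
    h[0] = np.var(r)
    for t in range(1, len(r)):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
    return h


def fluctuation_features(load, scales=(1, 4, 24), window=6):
    """S22 and S23: per-scale features interpolated to the base resolution."""
    load = np.asarray(load, dtype=float)
    base_t = np.arange(len(load))
    cols = []
    for s in scales:
        agg = aggregate(load, s)
        r = log_returns(agg)
        # Position (in base steps) of each return: end of its aggregation block.
        t = (np.arange(1, len(agg)) + 1) * s - 1
        for feat in (realized_volatility(r, window), np.sqrt(garch_variance(r))):
            cols.append(np.interp(base_t, t, feat))
    return np.column_stack(cols)


# Demo on two weeks of synthetic hourly load (strictly positive by construction).
rng = np.random.default_rng(0)
h = np.arange(24 * 14)
load = 100 + 20 * np.sin(2 * np.pi * h / 24) + rng.normal(0, 2, h.size)
features = fluctuation_features(load)
```

Each row of `features` is one comprehensive fluctuation feature vector in the sense of S23, with two columns per time scale.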
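The online identification described in claims 4 and 5 (a Gaussian hidden Markov model with diagonal covariance, evaluated with the forward algorithm) can be sketched as a normalized forward filter. This is an illustration only: the parameter values below are made up rather than estimated by expectation-maximization, and the names `log_gauss_diag` and `forward_filter` are hypothetical.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log density of x under K diagonal-covariance Gaussians.

    x: (D,), means and variances: (K, D); returns shape (K,).
    """
    return -0.5 * np.sum(
        np.log(2 * np.pi * variances) + (x - means) ** 2 / variances, axis=1
    )


def forward_filter(obs, start_prob, trans, means, variances):
    """Normalized forward recursion: P(state_t | observations up to t)."""
    obs = np.asarray(obs, dtype=float)
    belief = np.asarray(start_prob, dtype=float)
    post = np.empty((len(obs), len(belief)))
    for t in range(len(obs)):
        if t > 0:
            belief = belief @ trans  # one-step prediction via the transition model
        logb = log_gauss_diag(obs[t], means, variances)
        w = belief * np.exp(logb - logb.max())  # observation update, rescaled
        belief = w / w.sum()
        post[t] = belief
    return post


# Demo with K = 2 assumed states: "calm" (state 0) and "volatile" (state 1).
start_prob = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
means = np.array([[0.01, 0.02], [0.05, 0.08]])
variances = np.array([[1e-4, 1e-4], [4e-4, 4e-4]])
obs = np.vstack([np.tile(means[0], (5, 1)), np.tile(means[1], (5, 1))])
posterior = forward_filter(obs, start_prob, trans, means, variances)
states = posterior.argmax(axis=1)
```

Because the filter only uses observations up to the current step, it can be run incrementally on streaming features, which is what makes it suitable for real-time state output.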
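Steps S31 to S33 can be sketched as follows. The claims do not give formulas for the mean fluctuation level or the adaptive scaling factor, so this sketch assumes the level is the mean of a volatility feature within each state and the factor is that level divided by the overall mean; the `base_noise` parameter, the default clipping bounds, and the function names are likewise hypothetical.

```python
import numpy as np


def state_noise_scales(volatility, states, n_states):
    """S31 (assumed form): per-state mean fluctuation level and scaling factor."""
    volatility = np.asarray(volatility, dtype=float)
    states = np.asarray(states)
    global_level = volatility.mean()
    levels = np.array(
        [
            volatility[states == k].mean() if np.any(states == k) else global_level
            for k in range(n_states)
        ]
    )
    factors = levels / global_level  # > 1 for states more volatile than average
    return levels, factors


def inject_adaptive_noise(
    reference, states, factors, base_noise=0.02, lower=0.0, upper=None, rng=None
):
    """S32 and S33 (assumed form): state-scaled Gaussian noise plus clipping."""
    reference = np.asarray(reference, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    upper = 1.2 * reference.max() if upper is None else upper
    # Noise standard deviation grows with the state's factor and the load level.
    std = base_noise * factors[np.asarray(states)] * np.abs(reference)
    noisy = reference + rng.normal(0.0, 1.0, size=reference.shape) * std
    return np.clip(noisy, lower, upper)


# Demo: first half of the series in state 0, second half in state 1.
h = np.arange(240)
reference = 100 + 20 * np.sin(2 * np.pi * h / 24)
states = (h >= 120).astype(int)
volatility = np.where(states == 0, 0.01, 0.03)
levels, factors = state_noise_scales(volatility, states, n_states=2)
enhanced = inject_adaptive_noise(
    reference, states, factors, rng=np.random.default_rng(0)
)
```

Drawing with different random seeds yields different enhanced series from the same reference load, while the clipping step keeps every sample inside the chosen bounds.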
Description
Time sequence load data enhancement generation method and device based on hidden Markov model
Technical Field
The invention relates to the technical field of power system data processing, and in particular to a time-series load data enhancement generation method and device based on a hidden Markov model.
Background
As the penetration of distributed renewable energy sources (e.g., photovoltaic and wind power) in power systems continues to increase, the grid net load exhibits highly nonlinear behavior and strong uncertainty. In this context, long-term operation optimization, scheduling, planning and risk assessment of power systems depend increasingly on massive, high-quality load data. However, because of data privacy, business confidentiality and network security, power enterprises strictly limit the sharing of real load data, so traditional planning methods and artificial intelligence models based on historical data face a serious "data scarcity" problem.
To address this scarcity, synthetic load data generation has become an important research direction. The prior art mainly falls into three categories: first, methods based on generative adversarial networks (GANs), for example using an ensemble GAN, or a conditional GAN combined with kernel density estimation, to generate residential load patterns; second, methods that optimize time-series data generation, for example reconstructing the GAN framework with a bidirectional LSTM and style transfer to improve the diversity of photovoltaic generation time-series data; and third, methods based on probabilistic analysis, for example probabilistic analysis of net load curves for scenarios with high penetration of distributed photovoltaics.
Nevertheless, the prior art faces significant technical bottlenecks when applied to long-term planning and operation scheduling of power grids:
1. Data scarcity: high-dimensional, long-term load data sets are difficult to obtain, and because of strict privacy restrictions, generative models have insufficient training data and generalize poorly, so accuracy drops markedly when they handle new load scenarios or extended operating conditions.
2. Existing methods struggle to automatically and effectively identify and preserve key time-period characteristics in the load data (such as the significant differences between weekdays and weekends), so the generated synthetic data cannot accurately reflect real long-term load evolution patterns.
3. Limited applicability of the evaluation framework: current evaluation methods mostly rely on comparing statistical distributions and ignore key temporal dynamics such as multi-year periodic variation and short-term operational transients, which greatly reduces the usefulness of the synthetic data in actual grid decision-making.
Therefore, there is an urgent need for a new method that accurately captures load time-series characteristics, does not rely on a large amount of training data, and adaptively generates high-quality synthetic load data.
Disclosure of Invention
The invention provides a time-series load data enhancement generation method and device based on a hidden Markov model that accurately capture load time-series characteristics, do not rely on a large amount of training data, and adaptively generate high-quality synthetic load data. It thereby addresses the technical problems that existing synthetic load data generation methods depend heavily on large amounts of historical data and are difficult to train, that key periodic and fluctuation characteristics in the load data are difficult to capture and preserve accurately, and that the generated data are insufficiently applicable to long-term grid planning and operation analysis.
To achieve the above purpose, the invention provides the following technical solutions.
In a first aspect, the invention provides a time-series load data enhancement generation method based on a hidden Markov model, comprising:
S1, dividing and time-aligning the acquired historical and real-time raw time-series load data according to key time periods to obtain time-aligned historical load data and real-time load data;
S2, extracting fluctuation features from the time-aligned historical load data and real-time load data to obtain historical fluctuation features and real-time fluctuation features, modeling the different fluctuation operating states in the historical fluctuation features with a hidden Markov model, establishing an online identification model of the fluctuation operating states, taking the real-time fluctuation features as input, outputting the identified fluctuation operating states in real time