CN-120822406-B - Construction and application of multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction
Abstract
The scheme provides a construction method and application of a multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction. A multi-mode diffusion model is designed on the basis of the diffusion model, and reverse denoising is performed by a coupled U-shaped network formed by coupling a sky image sequence branch and a photovoltaic power sequence branch. Different possible future sky images and the corresponding photovoltaic power distributions are generated jointly within a unified end-to-end multi-mode diffusion framework. This fully exploits the correlation among the multi-mode information, avoids the limits that a traditional "two-stage" model places on information interaction and uncertainty propagation, and improves the accuracy of ultra-short-term photovoltaic power prediction.
Inventors
- YAN KE
- Shao Baipeng
- YIN LI
- HUANG JING
Assignees
- 杭州易启未来能源科技有限公司
Dates
- Publication Date: 20260512
- Application Date: 20250625
Claims (9)
- 1. A method for constructing a multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction, characterized by comprising the following steps: acquiring a training data set, wherein the training data set comprises a plurality of groups of training data, each group of training data comprises condition pairing data and prediction pairing data corresponding to the condition pairing data, the condition pairing data comprises a first sky image sequence and a corresponding first photovoltaic power sequence, and the prediction pairing data comprises a second sky image sequence and a corresponding second photovoltaic power sequence; inputting the training data set into a multi-mode diffusion framework for training to obtain the multi-mode diffusion model, wherein the multi-mode diffusion framework is built on the basis of a diffusion model, reverse denoising is performed by a coupled U-shaped network formed by coupling a sky image sequence branch and a photovoltaic power sequence branch, the sky image sequence branch and the photovoltaic power sequence branch of the coupled U-shaped network share a plurality of encoding layers, transition layers and a plurality of decoding layers, each encoding layer of the sky image sequence branch has a skip connection to the decoding layer of the corresponding level, each encoding layer of the photovoltaic power sequence branch has a skip connection to the decoding layer of the corresponding level, the sky image sequence branch comprises a sky image sequence encoder arranged before the encoding layers and a sky image sequence decoder arranged after the decoding layers, and the photovoltaic power sequence branch comprises an image sequence encoder arranged before the encoding layers and a photovoltaic power sequence decoder arranged after the decoding layers; each encoding layer, each decoding layer and each transition layer is a high-efficiency multi-mode module, and each high-efficiency multi-mode module comprises a sky image sequence sub-network, a step number embedding branch and a photovoltaic power sequence sub-network arranged in parallel, wherein the sky image sequence of the sky image sequence branch is input into the sky image sequence sub-network of the high-efficiency multi-mode module, the photovoltaic power sequence of the photovoltaic power sequence branch is input into the photovoltaic power sequence sub-network of the high-efficiency multi-mode module, the discrete step of the denoising process is input into the step number embedding branch for linear processing and scaling and shifting to obtain an embedded vector, the embedded vector acts on the sky image sequence sub-network and the photovoltaic power sequence sub-network respectively, and the output sequences of the sky image sequence sub-network and the photovoltaic power sequence sub-network undergo cross-modal attention processing in a random-offset-based PV multi-mode attention network to obtain the sky image sequence corresponding to the sky image sequence branch and the photovoltaic power sequence corresponding to the photovoltaic power sequence branch.
- 2. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein, for each group of training data input to the multi-mode diffusion model, noise is added to the second photovoltaic power sequence, conditioned on the first photovoltaic power sequence, to obtain a noise-added photovoltaic sequence; noise is added to the second sky image sequence, conditioned on the first sky image sequence, to obtain a noise-added sky image sequence; and the noise-added photovoltaic sequence, the first photovoltaic power sequence, the noise-added sky image sequence and the first sky image sequence are input together into the coupled U-shaped network for coupled denoising to obtain a sky image denoised sequence and a photovoltaic denoised sequence.
- 3. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein the reverse denoising process is defined as: $$p_\theta(x_{t-1}, y_{t-1} \mid x_t, y_t, x^c, y^c) = \mathcal{N}\big(x_{t-1}, y_{t-1};\ \mu_\theta(x_t, y_t, x^c, y^c, t),\ \sigma_t^2 \mathbf{I}\big)$$ wherein $x^c$ and $y^c$ represent the first photovoltaic power sequence and the first sky image sequence respectively, $x_t$ and $y_t$ respectively represent the denoising results of the noise-added photovoltaic sequence and the noise-added sky image sequence at step $t$, $x_{t-1}$ is the denoising result of the noise-added photovoltaic sequence at step $t-1$, $y_{t-1}$ is the denoising result of the noise-added sky image sequence at step $t-1$, both being generated in simultaneous dependence on $x_t$, $y_t$, $x^c$ and $y^c$, $t$ represents the number of diffusion steps, $\theta$ represents the multi-mode diffusion model, $p_\theta$ is the probability distribution of the reverse denoising process, $\mathcal{N}(\cdot)$ represents a Gaussian distribution, and $\mu_\theta$ represents the mean function of the Gaussian distribution.
- 4. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein the photovoltaic power sequence sub-network comprises a group normalization layer, a sigmoid linear unit, a one-dimensional convolution, a one-dimensional self-attention and a normalization layer connected in sequence; the photovoltaic power sequence of the photovoltaic power sequence branch is input into the photovoltaic power sequence sub-network, is processed in turn by the group normalization layer, the sigmoid linear unit, the one-dimensional convolution, the one-dimensional self-attention and the normalization layer, and is then residually connected with the photovoltaic power sequence originally input into the photovoltaic power sequence sub-network; and the embedded vector is input into the normalization layer.
- 5. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein the sky image sequence sub-network comprises a group normalization layer, a sigmoid linear unit, a two-dimensional spatial convolution, a one-dimensional temporal convolution, an up/down-sampling layer, a one-dimensional/two-dimensional self-attention and a normalization layer connected in sequence; the sky image sequence of the sky image sequence branch is input into the sky image sequence sub-network, is processed in turn by the group normalization layer, the sigmoid linear unit, the two-dimensional spatial convolution, the one-dimensional temporal convolution, the up/down-sampling layer, the one-dimensional/two-dimensional self-attention and the normalization layer, and is then residually connected with the sky image sequence originally input into the sky image sequence sub-network; and the embedded vector is input into the normalization layer.
- 6. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein, for the photovoltaic power sequence and the sky image sequence input into the random-offset-based PV multi-mode attention network, a window smaller than the number of frames is set and offset values are randomly sampled within the window; for each photovoltaic segment of the photovoltaic power sequence features, the sky image sequence segment corresponding to the current photovoltaic segment is calculated from the offset values, the window and the number of frames; and cross-modal attention is computed between the photovoltaic segment and the corresponding sky image sequence segment.
- 7. The method for constructing the multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction according to claim 1, wherein if only a historical sky image sequence is input, the historical photovoltaic power sequence is set to a zero tensor; if only the historical photovoltaic power sequence is input, the historical sky image sequence is set to a zero tensor; and the loss function of the coupled U-shaped network is as follows: $$L = \mathbb{E}_{\epsilon \sim \mathcal{N}(0, \mathbf{I})}\Big[\big\| \epsilon - \epsilon_\theta(x_t, y_t, x^c, y^c, t) \big\|_2^2\Big]$$ wherein $L$ is the loss function of the coupled U-shaped network, $\mathbb{E}_{\epsilon \sim \mathcal{N}(0, \mathbf{I})}$ is the expectation over the Gaussian-distributed noise $\epsilon$ of the noise-added photovoltaic power sequence, $\epsilon$ is the noise of the noise-added photovoltaic power sequence, $\epsilon_\theta$ is the added noise estimated by the multi-mode model, $x^c$ and $y^c$ represent the first photovoltaic power sequence and the first sky image sequence respectively, $x_t$ and $y_t$ respectively represent the denoising results of the second photovoltaic power sequence and the second sky image sequence at step $t$, $t$ is the number of diffusion steps, and $\|\cdot\|_2$ is the L2 norm.
- 8. An ultra-short-term photovoltaic power probability prediction method, characterized by comprising the following steps: inputting a historical sky image sequence and the corresponding historical photovoltaic data into the multi-mode diffusion model, and outputting a future sky image sequence and a future photovoltaic power sequence; or inputting a historical photovoltaic power sequence into the multi-mode diffusion model, and outputting a future photovoltaic power sequence; or inputting a historical sky image sequence into the multi-mode diffusion model, and outputting a future sky image sequence and a future photovoltaic power sequence; wherein the multi-mode diffusion model is constructed according to the method for constructing a multi-mode diffusion model as set forth in any one of claims 1 to 7.
- 9. A readable storage medium, characterized in that the readable storage medium stores a computer program comprising program code for controlling a processor to execute a process comprising the method for constructing a multi-mode diffusion model according to any one of claims 1 to 7 or the ultra-short-term photovoltaic power probability prediction method according to claim 8.
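The step number embedding branch of claim 1, whose embedded vector is fed to the normalization layer in claims 4 and 5, can be illustrated with a minimal sketch. The claims only state that the discrete step undergoes linear processing and scaling and shifting; the sinusoidal embedding, the plain normalization used in place of the claimed normalization layers, and all function names below are assumptions made for illustration, not the patented implementation.

```python
import math


def step_embedding(t, dim):
    """Sinusoidal embedding of the discrete step t (an assumed choice)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def linear(x, weight, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def normalize(h, eps=1e-5):
    """Zero-mean, unit-variance normalization over one feature vector."""
    mean = sum(h) / len(h)
    var = sum((v - mean) ** 2 for v in h) / len(h)
    return [(v - mean) / math.sqrt(var + eps) for v in h]


def modulate(h, emb, weight, bias):
    """Normalize h, then apply a scale and a shift derived from the step embedding."""
    proj = linear(emb, weight, bias)  # expected length: 2 * len(h)
    n = len(h)
    scale, shift = proj[:n], proj[n:]
    return [hn * (1.0 + s) + b for hn, s, b in zip(normalize(h), scale, shift)]
```

In this sketch the same embedded vector would be passed to `modulate` once for the sky image features and once for the photovoltaic power features, so that both sub-networks are conditioned on the same denoising step.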
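The training procedure described in claims 2 and 7 (noise added to the future sequences, a missing historical modality replaced by a zero tensor, and an L2 loss on the photovoltaic noise) can be sketched as follows. The forward-noising formula is the standard DDPM form and is an assumption, since the claims do not state the noise schedule. Sequences are flattened to plain lists, and `zero_predictor` is a hypothetical stand-in for the coupled U-shaped network.

```python
import math
import random


def add_noise(x0, alpha_bar_t, rng):
    """Forward noising x_t = sqrt(a) * x0 + sqrt(1 - a) * eps (standard DDPM form, assumed)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * v + b * e for v, e in zip(x0, eps)], eps


def mask_condition(cond, available):
    """Replace a missing historical modality with a zero tensor (claim 7)."""
    return list(cond) if available else [0.0] * len(cond)


def l2_noise_loss(eps, eps_pred):
    """Squared L2 distance between the true and the predicted noise."""
    return sum((e - p) ** 2 for e, p in zip(eps, eps_pred))


def zero_predictor(pv_t, sky_t, pv_cond, sky_cond, t):
    """Hypothetical placeholder for the coupled U-shaped network."""
    return [0.0] * len(pv_t)


def training_step(pv_hist, sky_hist, pv_future, sky_future, alpha_bar_t, t,
                  predictor, rng, has_pv=True, has_sky=True):
    """One loss evaluation on a single group of training data."""
    pv_cond = mask_condition(pv_hist, has_pv)
    sky_cond = mask_condition(sky_hist, has_sky)
    pv_t, eps_pv = add_noise(pv_future, alpha_bar_t, rng)
    sky_t, _ = add_noise(sky_future, alpha_bar_t, rng)
    eps_pred = predictor(pv_t, sky_t, pv_cond, sky_cond, t)
    return l2_noise_loss(eps_pv, eps_pred)


rng = random.Random(0)
loss = training_step(
    pv_hist=[0.6, 0.5, 0.4], sky_hist=[0.1] * 12,      # toy history
    pv_future=[0.3, 0.35, 0.5], sky_future=[0.2] * 12,  # toy targets
    alpha_bar_t=0.5, t=10, predictor=zero_predictor, rng=rng,
    has_sky=False,  # only the photovoltaic history is available
)
```

Zeroing one condition while keeping the network interface unchanged is what would let a single trained model serve all three input cases listed in claim 8.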
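The random-offset pairing of claim 6 can be sketched in a similar way. The claim states only that the sky image segment is calculated from the offset, the window and the number of frames. The wrap-around index rule, the sampling range of the offset, and the single-head dot-product attention below are assumptions, and only the photovoltaic-to-sky direction is shown.

```python
import math
import random


def aligned_frames(segment_idx, offset, window, num_frames):
    """Indices of `window` consecutive frames starting at segment_idx + offset,
    wrapping around the end of the clip (assumed index rule)."""
    start = (segment_idx + offset) % num_frames
    return [(start + k) % num_frames for k in range(window)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys, values):
    """Single-head scaled dot-product attention of one query over a few keys."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def random_offset_attention(pv_feats, sky_feats, window, rng):
    """Let each photovoltaic segment attend to a short, randomly shifted run of frames."""
    num_frames = len(sky_feats)
    if not 0 < window < num_frames:
        raise ValueError("window must be positive and smaller than the number of frames")
    offset = rng.randrange(window)  # sampled within the window; the exact range is assumed
    out = []
    for i, pv in enumerate(pv_feats):
        frames = [sky_feats[j] for j in aligned_frames(i, offset, window, num_frames)]
        out.append(cross_attend(pv, frames, frames))
    return out, offset
```

Restricting each segment to `window` frames keeps the attention cost proportional to the window size rather than to the full clip length, and resampling the offset at each call would expose every segment to different neighbouring frames over the course of training.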
Description
Construction and application of multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction
Technical Field
The application relates to the field of photovoltaic prediction, and in particular to the construction and application of a multi-mode diffusion model for ultra-short-term photovoltaic power probability prediction.
Background
Renewable energy sources, particularly solar photovoltaics (PV), are becoming an important pillar of future power systems. However, the inherent instability of photovoltaic generation, especially the severe power fluctuations caused by cloud changes on time scales from seconds to 30 minutes (the "ultra-short-term" forecasting range), poses a serious challenge to the stable operation of the grid. Accurate ultra-short-term photovoltaic power probability prediction is important for optimizing scheduling, reducing management cost and improving system reliability. If such predictions cannot be obtained promptly and accurately, grid scheduling becomes more complex, and additional economic and risk burdens arise in application scenarios such as residential, commercial and public-utility settings.
In recent years, sky images captured by ground-based cameras have become a key data source for monitoring cloud dynamics and assisting ultra-short-term photovoltaic power probability prediction, owing to their high spatio-temporal resolution. The rapid development of deep learning for video prediction and time-series analysis has also prompted a growing body of research that automatically extracts spatio-temporal features from sky images using structures such as convolutional neural networks, recurrent neural networks (RNNs) and Transformers, and combines them with historical photovoltaic data or other meteorological information to perform end-to-end regression prediction of future photovoltaic power.
However, such end-to-end models generally face two major challenges. First, they have difficulty accurately capturing rapidly changing cloud dynamics, and prediction errors due to time lag often occur when cloud cover changes sharply. Second, they lack effective uncertainty quantification (UQ) capability, and UQ is critical to risk management, particularly in scenarios with a high proportion of grid-connected renewable energy. Although there is much research on probabilistic solar prediction, most of it is based on time-series photovoltaic measurement data, and the exploration of probability prediction using sky images remains relatively limited. That is, there is currently no generative model that can jointly generate different possible future sky images and the corresponding probability distributions of photovoltaic power sequences within a unified end-to-end framework.
Diffusion models have shown great potential in computer vision and photovoltaic prediction in recent years, owing to their strong probabilistic generation capability and their advantages in modeling high-dimensional, complex data distributions. Through step-by-step denoising, a diffusion model is not only stable to train but can also flexibly describe multimodal distributions and extreme values. In solar photovoltaic prediction, SHADECAST and other studies have applied diffusion models to probabilistic spatio-temporal prediction of satellite images, using cloud evolution information to guide ensemble prediction. However, such work typically relies on a single input modality (e.g., satellite images) and fails to effectively fuse multi-mode information such as photovoltaic power sequences.
With the success of multi-mode diffusion models in fields such as text-image and audio-video generation, performing joint probability prediction of visual data and physical quantities (such as photovoltaic power) within a unified diffusion framework is expected to significantly improve the prediction accuracy and UQ level of photovoltaic power under extreme weather, and to overcome the shortcomings of existing methods in deep multi-mode fusion and end-to-end uncertainty quantification. However, constructing such an end-to-end multi-mode diffusion model for photovoltaic power probability prediction faces the following key challenges. (1) Data heterogeneity: sky images (high-dimensional spatio-temporal RGB sequences) and photovoltaic power (one-dimensional time series) differ significantly in modality and representation, and there is no established theory on how to process and fuse them efficiently in parallel within a unified model. (2) Cross-modal alignment and interaction: the photovoltaic power sequence is closely correlated with the sky image sequence in time, so the model must be able to capture the cross-modal dependence between the two, and bidirectional information interaction