CN-121997090-A - Rapid lightweight LDP-MST (local differential privacy, minimum spanning tree) method based on clustering of novel loads and distributed resources

CN121997090A

Abstract

The invention relates to power system control technology and discloses a rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources. The method comprises: collecting novel load data and distributed resource data and preprocessing the collected data; extracting key features, calculating the weight of each feature by the entropy method, and constructing a multidimensional feature vector; applying privacy protection to the features based on an LDP mechanism; constructing a lightweight minimum spanning tree (MST) from the privacy-protected features; partitioning the MST into clusters; dynamically updating the clustering result; generating power grid adaptation parameters as standardized parameter output; and optimizing the efficiency of the lightweight LDP-MST algorithm. The method achieves rapid, accurate, and dynamic clustering of massive numbers of novel loads and distributed resources, and thereby provides efficient and reliable technical support for real-time scheduling and planning of the power grid.

Inventors

YANG HAIYUN
TIAN CHENGJI
ZHU GUOJIAO
ZHANG LIANG
ZHENG GU
WANG HENG
WANG MENG
XU WEI
WANG LIN
HAN KEXUE
PU XING
YUAN XIAOJUN

Assignees

国网四川省电力公司巴中供电公司

Dates

Publication Date: 2026-05-08
Application Date: 2025-12-26

Claims (10)

1. A rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources, characterized by comprising the following steps: collecting novel load data and distributed resource data, and preprocessing the collected data; extracting key features, calculating the weight of each feature by the entropy method, and constructing a multidimensional feature vector; performing privacy protection processing on the features based on an LDP mechanism; constructing a lightweight minimum spanning tree (MST) based on the privacy-protected features; performing cluster partitioning based on the MST to obtain clusters; dynamically updating the clustering result; generating power grid adaptation parameters as standardized parameter output; and optimizing the efficiency of the lightweight LDP-MST algorithm.
2. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 1, wherein preprocessing the collected data comprises: cleaning the collected data to remove abnormal values; filling missing values by interpolation: $P_{miss} = \frac{P_{a}\,\Delta t_{b} + P_{b}\,\Delta t_{a}}{\Delta t_{a} + \Delta t_{b}}$, where $P_{a}$ and $P_{b}$ are the valid points immediately before and after the missing point, and $\Delta t_{a}$ and $\Delta t_{b}$ are the time intervals between the missing point and the preceding and following valid points, respectively; and finally normalizing the data by the extremum (min-max) method: $P' = \frac{P - P_{\min}}{P_{\max} - P_{\min}}$, where $P'$ is the normalized power value, $P$ is the original power value, and $P_{\min}$ and $P_{\max}$ are the minimum and maximum power in the sample set, respectively; the preprocessed data form the sample set to be clustered: $X = \{x_1, x_2, \ldots, x_n\}$, where n is the total sample size.
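As a non-limiting illustration, the preprocessing recited in claim 2 may be sketched in Python as follows. The function names `fill_missing_linear` and `min_max_normalize`, the use of NaN to mark missing points, and the handling of gaps at the ends of the series are assumptions made for this sketch and do not come from the source.

```python
import math

def fill_missing_linear(times, power):
    """Fill NaN entries from the nearest valid points before and after the gap.

    Assumes at least one valid (non-NaN) value exists in `power`.
    """
    filled = list(power)
    valid = [i for i, p in enumerate(power) if not math.isnan(p)]
    for i, p in enumerate(power):
        if not math.isnan(p):
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if not before:
            # Assumption: a gap at the start copies the first valid value.
            filled[i] = power[after[0]]
        elif not after:
            # Assumption: a gap at the end copies the last valid value.
            filled[i] = power[before[-1]]
        else:
            a, b = before[-1], after[0]
            dt_a = times[i] - times[a]  # interval to the preceding valid point
            dt_b = times[b] - times[i]  # interval to the following valid point
            # The closer neighbour receives the larger weight.
            filled[i] = (power[a] * dt_b + power[b] * dt_a) / (dt_a + dt_b)
    return filled

def min_max_normalize(power):
    """Scale values to [0, 1] using the minimum and maximum of the sample set."""
    p_min, p_max = min(power), max(power)
    if p_max == p_min:
        return [0.0 for _ in power]  # constant series: avoid division by zero
    return [(p - p_min) / (p_max - p_min) for p in power]

times = [0, 1, 2, 3, 4]
raw = [2.0, float("nan"), 6.0, 8.0, 4.0]
filled = fill_missing_linear(times, raw)
normalized = min_max_normalize(filled)
```

In this example the missing reading at time 1 lies midway between 2.0 and 6.0, so it is filled with their average before normalization.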
3. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 1, wherein the extracted features comprise time-series features, fluctuation features, and controllability features; the time-series features include the times at which the peak and the valley occur and the daily average load factor [formula missing in source], whose terms are the rated power of the device and the power value at time t; the fluctuation features include the short-term fluctuation frequency and the power fluctuation coefficient [formula missing in source], whose terms include the average power; and the controllability features include the regulation response speed and the reducible capacity ratio [formula missing in source], whose terms include the maximum reducible power.
4. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 3, wherein the weight of each feature is calculated by the entropy method as follows: $p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}$, $e_j = -\frac{1}{\ln n}\sum_{i=1}^{n} p_{ij}\ln p_{ij}$, $w_j = \frac{1 - e_j}{\sum_{j=1}^{m}(1 - e_j)}$, where $p_{ij}$ is the normalized proportion of the j-th feature of the i-th sample, $m$ is the feature dimension, $w_j$ is the weight of the j-th feature, and $e_j$ is the information entropy of the j-th feature.
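As a non-limiting illustration, the entropy weighting recited in claim 4 may be sketched in Python as follows. The sketch uses the commonly published form of the entropy weight method (column proportions, entropy scaled by ln n, weights proportional to one minus the entropy). Treating that form as the one intended by the source is an assumption, as are the function name `entropy_weights` and the requirement that all feature values be non-negative with a positive column sum.

```python
import math

def entropy_weights(samples):
    """Return one weight per feature column using the entropy weight method.

    `samples` is a list of rows; every value is assumed non-negative and
    every column is assumed to have a positive sum.
    """
    n = len(samples)
    m = len(samples[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in samples]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total  # normalized proportion of sample i for feature j
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # scale the entropy to the range [0, 1]
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if total_divergence == 0:
        # Every feature is uniform: fall back to equal weights (assumption).
        return [1.0 / m for _ in range(m)]
    return [d / total_divergence for d in divergences]

# Feature 0 is identical across samples, feature 1 varies.
weights = entropy_weights([[0.5, 0.1], [0.5, 0.9]])
```

A feature that takes the same value for every sample carries no discriminating information, so it receives a weight of approximately zero, while the varying feature receives nearly all of the weight.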
5. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 1, wherein the privacy protection processing adopts a tiered noise-addition strategy, specifically: high-sensitivity features receive strong privacy protection: $\tilde{x}_{h} = x + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon_{h}}\right)$, where $\tilde{x}_{h}$ is the high-sensitivity feature value after differential privacy protection, $x$ is the original true feature value, and $\mathrm{Lap}(\cdot)$ is the Laplace distribution; low-sensitivity features receive weak privacy protection: $\tilde{x}_{l} = x + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon_{l}}\right)$, where $\tilde{x}_{l}$ is the low-sensitivity feature value after differential privacy protection, $\Delta f$ is the feature sensitivity, and $\varepsilon_{h}$ and $\varepsilon_{l}$ are the privacy budgets for high-sensitivity and low-sensitivity features, respectively.
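As a non-limiting illustration, the tiered noise addition recited in claim 5 may be sketched in Python as follows. The sketch assumes the standard Laplace mechanism, in which the noise scale equals the sensitivity divided by the privacy budget. The budget values 0.5 and 5.0, the sensitivity of 1.0, and all function names are hypothetical and chosen only to make the difference between the two tiers visible.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(value, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to one feature value."""
    return value + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(noisy, true_value):
    return sum(abs(v - true_value) for v in noisy) / len(noisy)

rng = random.Random(0)
true_value = 1.0
# Hypothetical budgets: a smaller epsilon means stronger protection.
high_sensitivity = [perturb(true_value, 1.0, 0.5, rng) for _ in range(2000)]
low_sensitivity = [perturb(true_value, 1.0, 5.0, rng) for _ in range(2000)]
high_error = mean_abs_error(high_sensitivity, true_value)
low_error = mean_abs_error(low_sensitivity, true_value)
```

With these hypothetical budgets the high-sensitivity tier is perturbed with ten times the noise scale of the low-sensitivity tier, which is the intended trade-off between protection strength and clustering accuracy.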
6. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 1, wherein constructing the lightweight minimum spanning tree (MST) comprises: first, reducing the dimensionality of the privacy-protected feature set by principal component analysis (PCA), calculating the covariance matrix: $\Sigma = \frac{1}{n}\sum_{i=1}^{n}(\tilde{x}_i - \mu)(\tilde{x}_i - \mu)^{T}$, where $\Sigma$ is the covariance matrix, $\tilde{x}_i$ is the privacy-protected feature vector of the i-th sample, and $\mu$ is the mean vector of the privacy-protected feature vectors of all samples; performing eigenvalue decomposition on the covariance matrix to obtain the eigenvector matrix and selecting the first k principal components, with the mapping: $Y = \tilde{X} V_k$, where $V_k$ is the eigenvector matrix and $\tilde{X}$ is the privacy-protected feature set; then defining a weighted Euclidean distance: $d_{ij} = \sqrt{\sum_{l=1}^{k} w_l\,(y_{il} - y_{jl})^2}$, where $d_{ij}$ is the weighted Euclidean distance between the i-th sample and the j-th sample, $w_l$ is the weight of the l-th feature, and $y_{il}$ and $y_{jl}$ are the projections of the i-th sample and the j-th sample on the l-th principal component; setting a distance threshold and retaining only the sample pairs that satisfy it [condition missing in source]; and computing the distance matrix in parallel with FPGA hardware acceleration, and constructing the MST with the Kruskal algorithm combined with a union-find (disjoint-set) structure.
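As a non-limiting illustration, the MST construction recited in claim 6 (Kruskal algorithm with a union-find structure over thresholded weighted distances) may be sketched in Python as follows. The sketch runs on a CPU and does not model the FPGA acceleration. The form of the weighted distance, the rule that a pair is kept when its distance does not exceed the threshold, and the names `weighted_distance` and `kruskal_mst` are assumptions made for this sketch.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two projected samples (assumed form)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def kruskal_mst(points, weights, max_distance=float("inf")):
    """Build a minimum spanning tree (or forest) with Kruskal and union-find.

    Pairs farther apart than `max_distance` are discarded before sorting,
    which is the assumed reading of the distance threshold in claim 6.
    """
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = weighted_distance(points[i], points[j], weights)
            if d <= max_distance:
                edges.append((d, i, j))
    edges.sort()

    parent = list(range(n))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components: keep it
            parent[root_i] = root_j
            mst.append((i, j, d))
    return mst

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
mst = kruskal_mst(points, weights=[1.0, 1.0])
```

If the threshold removes too many pairs, the result is a spanning forest rather than a single tree, so a practical threshold has to be loose enough to keep the graph connected.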
7. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 6, wherein performing cluster partitioning based on the MST topology comprises: calculating the median of all edge weights of the MST and setting a cutting threshold: $\theta = \alpha \cdot d_{med}$, where $d_{med}$ is the median of all edge weights of the MST and $\alpha$ is the adjustment coefficient; when the target number of clusters is C, $\alpha$ is adjusted by bisection so that the number of clusters converges to C: starting from an initial value of $\alpha$ [value missing in source], $\alpha$ is increased if the number of clusters is greater than C and decreased if the number of clusters is less than C, until the number of clusters converges to C; C connected components are formed after cutting, each corresponding to one cluster $G_1, G_2, \ldots, G_C$, such that the distance between samples of the same cluster is smaller than the cutting threshold and the distance between samples of different clusters is greater than the cutting threshold.
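As a non-limiting illustration, the threshold cutting recited in claim 7 may be sketched in Python as follows. The sketch assumes that the cutting threshold is the adjustment coefficient alpha multiplied by the median MST edge weight, and that edges heavier than the threshold are removed. The search bracket [0, 10] for alpha, the iteration limit, and the function names are hypothetical.

```python
from statistics import median

def count_components(n, edges, threshold):
    """Count connected components after removing edges heavier than `threshold`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, d in edges:
        if d <= threshold:
            parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})

def tune_alpha(n, mst_edges, target, lo=0.0, hi=10.0, max_iter=50):
    """Bisect alpha until cutting at alpha * median yields `target` clusters."""
    med = median(d for _, _, d in mst_edges)
    alpha, clusters = hi, 1
    for _ in range(max_iter):
        alpha = (lo + hi) / 2.0
        clusters = count_components(n, mst_edges, alpha * med)
        if clusters == target:
            break
        if clusters > target:
            lo = alpha  # too many clusters: raise the threshold
        else:
            hi = alpha  # too few clusters: lower the threshold
    return alpha, clusters

# MST of four collinear samples: two tight pairs joined by one long edge.
mst_edges = [(0, 1, 1.0), (2, 3, 1.0), (1, 2, 4.0)]
alpha, clusters = tune_alpha(4, mst_edges, target=2)
```

Bisection is applicable because the number of components never increases as alpha grows. A target that falls between two achievable component counts cannot be reached exactly, which is why the sketch caps the number of iterations.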
8. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 7, wherein dynamically updating the clustering result comprises: collecting new data daily and calculating the average feature deviation between the features of each new sample and the cluster to which it belongs [formula missing in source], defined in terms of the feature deviation between new sample i and its cluster, the mean of the j-th feature of cluster c, the j-th feature value of new sample i, the weight of the j-th feature, and the feature dimension; setting a deviation threshold based on validation against historical data; and, when the feature deviation of a sample exceeds the threshold, judging the sample to have undergone a significant feature change, recalculating its distances to the other samples, updating the local topology of the MST, and adjusting its cluster membership.
9. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 7, wherein generating the power grid adaptation parameters as standardized parameter output comprises: a cluster typical curve [formula missing in source], defined in terms of the sample weight, the typical power value of the c-th cluster at time t, and the actual power value of the i-th sample at time t; regulation potential indicators, including the total adjustable capacity and the average response time of the cluster, where the total adjustable capacity of the cluster [formula missing in source] is defined in terms of the maximum adjustable (curtailable) power of the i-th sample, and the average response time [formula missing in source] is defined in terms of the average response time of cluster c, the individual response time of the i-th sample, and the c-th cluster; and a clustering confidence [formula missing in source], defined in terms of the weighted Euclidean distance.
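As a non-limiting illustration, three of the cluster-level outputs named in claim 9 may be sketched in Python as follows. Because the formulas are not reproduced in the source, the sketch assumes the simplest reading of each definition: the typical curve is a weight-normalized average of the member curves, the total adjustable capacity is the sum of the members' maximum adjustable power, and the average response time is the arithmetic mean of the members' response times. All function names and sample values are hypothetical, and the clustering confidence is omitted because its form cannot be inferred.

```python
def typical_curve(curves, sample_weights):
    """Weighted average of member power curves, one value per time step."""
    total_weight = sum(sample_weights)
    horizon = len(curves[0])
    return [
        sum(w * curve[t] for w, curve in zip(sample_weights, curves)) / total_weight
        for t in range(horizon)
    ]

def total_adjustable_capacity(max_adjustable_power):
    """Sum of the maximum adjustable power over all members of the cluster."""
    return sum(max_adjustable_power)

def average_response_time(response_times):
    """Arithmetic mean of the members' individual response times."""
    return sum(response_times) / len(response_times)

# Hypothetical cluster with two members and a two-step time horizon.
curve = typical_curve([[1.0, 2.0], [3.0, 4.0]], sample_weights=[1.0, 3.0])
capacity = total_adjustable_capacity([0.5, 1.5])
response = average_response_time([2.0, 4.0])
```

Normalizing by the sum of the weights keeps the typical curve in the same units as the member curves even when the sample weights do not sum to one.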
10. The rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources according to claim 7, wherein optimizing the efficiency of the lightweight LDP-MST algorithm comprises:
S1, wavelet-transform compression, with adaptive data compression and block-wise computation, specifically:
S11, performing a three-level decomposition of the time-series feature data with the db4 wavelet basis, where the first level retains the high-frequency coefficients, the second level retains the mid-frequency coefficients, and the third level quantizes the low-frequency coefficients; the compression formula [formula missing in source] is defined in terms of the wavelet transform matrix W, the compression error, the sample set after wavelet-transform compression, and the preprocessed sample set to be clustered;
S12, block-wise clustering: dividing the samples into sub-blocks by voltage level, constructing an MST independently for each sub-block, calculating inter-block distances through core samples, and performing global cluster fusion [formula missing in source], defined in terms of the global clustering result, the core samples of sub-block c, the core samples of sub-block d, the inter-block fusion threshold, and the core-sample distance between the c-th sub-block and the d-th sub-block;
S2, cooperative acceleration on heterogeneous hardware: calculating sample distances in parallel on a plurality of GPUs [formula missing in source], defined in terms of the GPU-accelerated distance between the i-th sample and the j-th sample, the parallel computing function executed cooperatively by the GPUs, the vectors of the i-th sample and the j-th sample in the dimension-reduced feature space, and the feature weight vector; calculating and outputting local cluster centers, and completing global clustering in the cloud based on the center samples;
S13, pruning optimization of the MST construction;
S131, randomly sampling S samples to construct an initial MST and correcting the key edges through k-nearest-neighbor search, with an approximation error of no more than 3% and an optimized time complexity [expression missing in source];
S132, dynamic threshold pruning: adjusting the distance threshold according to the sample density [formula missing in source], defined in terms of the sample density and a reference threshold;
S14, an improved time-series-aware distance metric;
S141, dynamic time warping (DTW), replacing the Euclidean distance to handle time-series offsets [formula missing in source], defined in terms of the time alignment function, the DTW distance between the i-th sample and the j-th sample, the power value of the i-th sample at time t, the power value of the j-th sample at the corresponding time after alignment, and a minimization that finds, among all possible time-alignment paths, the optimal path with the smallest total distance;
S142, physical-constraint weighting: fusing the geographic distance and the time-series distance into a total distance [formula missing in source], defined in terms of the geographic distance and the maximum geographic span;
S15, feature engineering with embedded domain knowledge;
S151, constructing cross features [formula missing in source], defined in terms of the peak-load period of the i-th sample, the peak-load period of the j-th sample, the Pearson correlation coefficient, the peak-load coincidence degree, the source-load matching degree, the generation power of the i-th sample at time t, and the load power of the j-th sample at time t;
S152, median normalization: processing the fluctuation features with an extreme-value-resistant scaling: $x' = \frac{x - \mathrm{median}(x)}{\mathrm{IQR}(x)}$, $\mathrm{IQR}(x) = Q_{75} - Q_{25}$, where $x$ is the original feature value, $\mathrm{median}(x)$ is the median of x, $\mathrm{IQR}(x)$ is the interquartile range of x, $x'$ is the normalized feature value, and $Q_{25}$ and $Q_{75}$ are the 25th and 75th percentiles of the data, respectively;
S16, data-distribution-aware noise adjustment: reducing the noise in high-density feature regions, with the scale parameter [formula missing in source] defined in terms of the probability density of the feature;
S17, differentially private aggregation: aggregating the region-level samples first and then adding noise, so that the aggregated noise is a fraction [factor missing in source] of the individual noise, calculated as [formula missing in source], where N is the total number of samples and the result is the aggregated feature value after Laplace noise is added;
S18, post-clustering correction;
S181, median smoothing: replacing the mean with the median within each cluster to eliminate noise [formula missing in source], which yields the typical feature value of the c-th cluster after median smoothing;
S182, boundary sample verification: reclassifying cross-cluster boundary samples in combination with physical labels.
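As a non-limiting illustration, two of the optimizations recited in claim 10, the dynamic time warping distance of step S141 and the median normalization of step S152, may be sketched in Python as follows. The sketch uses the textbook dynamic-programming form of DTW with an absolute-difference local cost and the usual median and interquartile-range scaling. Whether the source uses an absolute or a squared local cost is not recoverable, so that choice is an assumption, as are the function names and the "inclusive" quantile method.

```python
from statistics import median, quantiles

def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def robust_scale(values):
    """Scale by median and interquartile range to limit the effect of outliers."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    iqr = q75 - q25
    med = median(values)
    if iqr == 0:
        return [0.0 for _ in values]  # degenerate spread: avoid division by zero
    return [(v - med) / iqr for v in values]

load = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, one step later
advanced = [1.0, 2.0, 1.0, 0.0, 0.0]      # same shape, one step earlier
scaled = robust_scale([1.0, 2.0, 3.0, 4.0, 100.0])
```

The delayed curve aligns with the original at zero cost, whereas a point-by-point comparison would penalize the shift. The outlier 100.0 in the scaling example does not distort the scaled values of the other four samples, which is the purpose of replacing the mean and range with the median and interquartile range.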

Description

Rapid lightweight LDP-MST (local differential privacy, minimum spanning tree) method based on clustering of novel loads and distributed resources

Technical Field

The invention relates to the technical field of power system control, and in particular to a rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources.

Background

As the energy system shifts toward decentralization and multi-energy complementarity, the number of distributed resources and novel loads grows exponentially. If the characteristics of each resource or load are analyzed one by one, the computational complexity rises sharply and the computation time far exceeds what real-time scheduling of the power grid allows (a response within seconds is usually required). Traditional clustering algorithms such as K-Means, DBSCAN, and conventional MST clustering suffer from low clustering accuracy and lack privacy protection.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources.

The aim of the invention is achieved by the following technical scheme: a rapid lightweight LDP-MST method based on clustering of novel loads and distributed resources, comprising the following steps: collecting and preprocessing novel load data and distributed resource data; extracting key features, calculating the weight of each feature by the entropy method, and constructing a multidimensional feature vector; performing privacy protection processing based on an LDP mechanism; constructing a lightweight minimum spanning tree (MST); performing cluster partitioning based on the MST to obtain clusters; dynamically updating the clustering result; generating power grid adaptation parameters as standardized parameter output; and optimizing the efficiency of the lightweight LDP-MST algorithm.
Specifically, collecting and preprocessing the novel load data and distributed resource data comprises: acquiring the novel load data and distributed resource data; cleaning the acquired data to remove abnormal values; filling missing values by interpolation: $P_{miss} = \frac{P_{a}\,\Delta t_{b} + P_{b}\,\Delta t_{a}}{\Delta t_{a} + \Delta t_{b}}$, where $P_{a}$ and $P_{b}$ are the valid points immediately before and after the missing point, and $\Delta t_{a}$ and $\Delta t_{b}$ are the time intervals between the missing point and the preceding and following valid points, respectively; and finally normalizing the data by the extremum (min-max) method: $P' = \frac{P - P_{\min}}{P_{\max} - P_{\min}}$, where $P'$ is the normalized power value, $P$ is the original power value, and $P_{\min}$ and $P_{\max}$ are the minimum and maximum power in the sample set, respectively. The preprocessed data form the sample set to be clustered: $X = \{x_1, x_2, \ldots, x_n\}$, where n is the total sample size.

Specifically, the extracted features include time-series features, fluctuation features, and controllability features. The time-series features include the times at which the peak and the valley occur and the daily average load factor [formula missing in source], whose terms are the rated power of the device and the power value at time t. The fluctuation features include the short-term fluctuation frequency and the power fluctuation coefficient [formula missing in source], whose terms include the average power. The controllability features include the regulation response speed and the reducible capacity ratio [formula missing in source], whose terms include the maximum reducible power.

Specifically, the weight of each feature is calculated by the entropy method as follows: $p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}$, $e_j = -\frac{1}{\ln n}\sum_{i=1}^{n} p_{ij}\ln p_{ij}$, $w_j = \frac{1 - e_j}{\sum_{j=1}^{m}(1 - e_j)}$, where $p_{ij}$ is the normalized proportion of the j-th feature of the i-th sample, $m$ is the feature dimension, $w_j$ is the weight of the j-th feature, and $e_j$ is the information entropy of the j-th feature.
Specifically, the privacy protection processing adopts a tiered noise-addition strategy. High-sensitivity features receive strong privacy protection: $\tilde{x}_{h} = x + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon_{h}}\right)$, where $\tilde{x}_{h}$ is the high-sensitivity feature value after differential privacy protection, $x$ is the original true feature value, and $\mathrm{Lap}(\cdot)$ denotes the Laplace distribution used to generate random noise that meets the differential privacy requirement. Low-sensitivity features receive weak privacy protection: $\tilde{x}_{l} = x + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon_{l}}\right)$. The probability density function of the Laplace noise is $f(z \mid b) = \frac{1}{2b}\exp\!\left(-\frac{|z|}{b}\right)$, where $b$ is the scale parameter, $\Delta f$ is the feature sensitivity, and $\varepsilon_{h}$ and $\varepsilon_{l}$ are the privacy budgets for high-sensitivity and low-sensitivity features, respectively.

Specifically, constructing the lightweight minimum spanning tree (MST) comprises: first, reducing the dimensionality of the privacy-protected feature set by principal component analysis (PCA), calculating the covariance matrix: $\Sigma = \frac{1}{n}\sum_{i=1}^{n}(\tilde{x}_i - \mu)(\tilde{x}_i - \mu)^{T}$, where $\Sigma$ is the covariance matrix, $\tilde{x}_i$ is the privacy-protected feature vector of the i-th sample, and $\mu$ is the mean vector of the privacy-protected feature vectors of all samples; performing eigenvalue decomposition on the covariance matrix to obtain an eigenvector matrix, and selecting the first k principal components, wherein the ma