CN-121414423-B - Electric energy substitution potential user identification method and system based on contrast learning

Abstract

The invention discloses a method and system for identifying electric energy substitution potential users based on contrastive learning. The method comprises the following steps: S1, dividing users into high-potential users and common users according to each of a plurality of electric energy substitution potential indexes, so as to construct positive and negative sample pairs corresponding to each index; S2, extracting feature vectors corresponding to the different potential indexes from a user association graph using a plurality of graph convolutional network encoders; S3, training each graph convolutional network encoder on the positive and negative sample pairs with a contrastive learning loss function, so as to output optimized feature vectors; S4, dynamically fusing the optimized feature vectors through an attention mechanism to generate a fused feature vector; and S5, based on the fused feature vector, outputting with a classifier the probability that a user is an electric energy substitution potential user, and classifying that probability against an optimized threshold. The invention enables collaborative fusion of multi-dimensional features to accurately identify electric energy substitution potential users.
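Step S1 can be illustrated with a minimal Python sketch. It assumes that each index is turned into two groups by a single cutoff and that positive pairs are users from the same group while negative pairs take one user from each group; the function names, the cutoff value and the toy scores are hypothetical and are not taken from the patent.

```python
from itertools import combinations, product


def split_by_index(scores, threshold):
    """Split user ids into high-potential and common users for one index.

    `scores` maps user id -> index value; `threshold` is a hypothetical cutoff.
    """
    high = sorted(u for u, s in scores.items() if s >= threshold)
    common = sorted(u for u, s in scores.items() if s < threshold)
    return high, common


def build_pairs(high, common):
    """Positive pairs: two users of the same group. Negative pairs: one of each."""
    positives = list(combinations(high, 2)) + list(combinations(common, 2))
    negatives = list(product(high, common))
    return positives, negatives


# Toy values for a single potential index (illustrative only).
index_scores = {"u1": 0.82, "u2": 0.75, "u3": 0.10, "u4": 0.30}
high, common = split_by_index(index_scores, threshold=0.5)
positives, negatives = build_pairs(high, common)
```

Repeating the split once per index would give one set of pairs per index, which matches the per-index encoders of step S2.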

Inventors

LI WEISONG
YANG XINGWEI
CHEN JIANHUA
YUAN HAOLIANG

Assignees

Guangdong University of Technology (广东工业大学)

Dates

Publication Date: 20260508
Application Date: 20251029

Claims (6)

1. A method for identifying electric energy substitution potential users based on contrastive learning, characterized by comprising the following steps: S1, dividing users into high-potential users and common users according to each of a plurality of electric energy substitution potential indexes, so as to construct positive and negative sample pairs corresponding to each index, wherein the potential indexes comprise at least a self-transformation basic index, a community linkage value index and a global adaptation priority index; the self-transformation basic index is the replaceable proportion of unit energy consumption, calculated as the ratio of the amount of a user's energy consumption that can be converted by electric energy substitution technology to the total energy consumed by that user; the community linkage value index is the community transformation synergy degree, calculated for a user from the number of that user's neighbor users, the transformation synergy coefficient between the user and each neighbor user, which takes values within a specified range, and the transformation value weight of each neighbor user; the global adaptation priority index is the global adaptation score, calculated as a weighted combination of a policy adaptation degree, a power grid load-bearing adaptation degree, which is determined according to whether the residual capacity of the regional power grid where the user is located can support the power demand after transformation, and a regional development target fitness, each of the three terms having a weight coefficient and the weight coefficients satisfying a specified constraint; S2, extracting feature vectors corresponding to the different potential indexes from a user association graph using a plurality of graph convolutional network encoders; S3, training each graph convolutional network encoder on the positive and negative sample pairs with a contrastive learning loss function, so as to output optimized feature vectors; S4, dynamically fusing the optimized feature vectors through an attention mechanism to generate a fused feature vector; S5, based on the fused feature vector, outputting with a classifier the probability that a user is an electric energy substitution potential user, and classifying that probability against an optimized threshold to obtain the identification result of whether the user is a potential user.
2. The method for identifying electric energy substitution potential users based on contrastive learning according to claim 1, wherein the graph convolutional network encoder comprises a graph convolutional network layer and a fully connected layer, and the encoding process of the graph convolutional network layer obtains the user features of the next layer by applying a nonlinear activation function to the product of a normalized adjacency matrix, the user features learned at the current layer and a trainable parameter matrix, wherein the user features at layer zero are the initial features of the users, and the normalized adjacency matrix is formed from the degree matrix and the adjacency matrix after self-loops are added.
3. The method for identifying electric energy substitution potential users based on contrastive learning according to claim 1, wherein the contrastive learning loss function is an InfoNCE loss function defined in terms of: a temperature coefficient, which scales the similarity values and prevents gradients from vanishing or exploding during training; for each user node, the set of user nodes of the same class, namely those with similar electric energy substitution potential characteristics; the set of user nodes of a different class, namely those whose electric energy substitution potential characteristics differ greatly; and the cosine similarity between the features of user nodes.
4. The method for identifying electric energy substitution potential users based on contrastive learning according to claim 1, wherein the dynamic fusion through the attention mechanism comprises the steps of: combining a plurality of the feature vectors into a feature matrix; calculating a query vector, a key vector and a value vector from the feature matrix through a learnable parameter matrix; calculating attention weights from the query vector and the key vector, the attention weights representing the importance of the different feature vectors in the fusion; and performing a weighted summation of the value vectors using the attention weights, and outputting the fused feature matrix.
5. A system for identifying electric energy substitution potential users based on contrastive learning, comprising: a sample construction module, configured to divide users into high-potential users and common users according to each of a plurality of electric energy substitution potential indexes, so as to construct positive and negative sample pairs corresponding to each index, wherein the self-transformation basic index is the replaceable proportion of unit energy consumption, calculated as the ratio of the amount of a user's energy consumption that can be converted by electric energy substitution technology to the total energy consumed by that user; the community linkage value index is the community transformation synergy degree, calculated for a user from the number of that user's neighbor users, the transformation synergy coefficient between the user and each neighbor user, which takes values within a specified range, and the transformation value weight of each neighbor user; and the global adaptation priority index is the global adaptation score, calculated as a weighted combination of a policy adaptation degree, a power grid load-bearing adaptation degree, which is determined according to whether the residual capacity of the regional power grid where the user is located can support the power demand after transformation, and a regional development target fitness, each of the three terms having a weight coefficient and the weight coefficients satisfying a specified constraint; an encoder module, comprising a plurality of graph convolutional network encoders and configured to extract feature vectors corresponding to the different potential indexes from a user association graph; a contrastive learning module, configured to train each graph convolutional network encoder on the positive and negative sample pairs with a contrastive learning loss function, so as to output optimized feature vectors; a feature fusion module, configured to dynamically fuse the optimized feature vectors through an attention mechanism to generate a fused feature vector; and a classification module, configured to output with a classifier, based on the fused feature vector, the probability that a user is an electric energy substitution potential user, and to classify that probability against an optimized threshold to obtain the identification result of whether the user is a potential user.
6. The system of claim 5, wherein the encoder module comprises a graph convolutional network and a fully connected layer; and/or the feature fusion module comprises an attention calculation unit configured to stack a plurality of feature vectors into a matrix, calculate a query matrix, a key matrix and a value matrix through a learnable parameter matrix, calculate attention weights based on the query matrix and the key matrix, and perform a weighted summation of the value matrix using the attention weights to generate a fused feature vector; and/or the classification module comprises a sigmoid activation function and a threshold optimization unit.
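The quantities defined in claims 1 to 3 can be written out as equations. The block below is a reconstruction from the variable definitions in the claims, not the patent's own notation: every symbol is an assumed name, and the averaging over neighbors, the $[0,1]$ range of the synergy coefficient, the requirement that the three weights sum to one, and the exact form of the InfoNCE ratio are assumptions that the text of the claims does not confirm.

```latex
% Assumed notation throughout; none of these symbols is taken from the source.
% Self-transformation basic index: replaceable share of user i's energy consumption.
\[ R_i = \frac{E_i^{\mathrm{sub}}}{E_i^{\mathrm{total}}} \]

% Community transformation synergy degree over the neighbor set N_i of user i.
% Assumed: averaging by |N_i| and c_{ij} in [0,1].
\[ C_i = \frac{1}{|\mathcal{N}_i|} \sum_{j \in \mathcal{N}_i} c_{ij}\, w_j,
   \qquad c_{ij} \in [0,1] \]

% Global adaptation score: policy, grid load-bearing and regional target terms.
% Assumed: the weights sum to one.
\[ G_i = \alpha P_i + \beta L_i + \gamma T_i,
   \qquad \alpha + \beta + \gamma = 1 \]

% Graph convolution layer (claim 2), with H^{(0)} = X the initial user features.
\[ H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \tilde{A}\, \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right),
   \qquad \tilde{A} = A + I \]

% InfoNCE loss for node i (claim 3), with positives P(i), negatives N(i),
% cosine similarity sim and temperature tau. Assumed form of the ratio.
\[ \mathcal{L}_i = -\log
   \frac{\sum_{p \in P(i)} \exp\!\big(\mathrm{sim}(z_i, z_p)/\tau\big)}
        {\sum_{p \in P(i)} \exp\!\big(\mathrm{sim}(z_i, z_p)/\tau\big)
         + \sum_{n \in N(i)} \exp\!\big(\mathrm{sim}(z_i, z_n)/\tau\big)} \]
```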
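The encoder and loss of claims 2 and 3 can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the patented implementation: it uses the widely used symmetric normalization of the adjacency matrix with self-loops, ReLU as the nonlinear activation, and an InfoNCE form in which the positives are summed inside the logarithm. All function names and toy values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj_norm, h, w):
    """One graph convolution layer; ReLU is an assumed choice of activation."""
    z = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def info_nce(z, anchor, positives, negatives, tau=0.5):
    """InfoNCE loss for one anchor node (assumed form, temperature tau)."""
    pos = sum(math.exp(cosine(z[anchor], z[p]) / tau) for p in positives)
    neg = sum(math.exp(cosine(z[anchor], z[n]) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy graph: users 0 and 1 are linked, user 2 is isolated.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
x = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # initial user features
w = [[1.0, 0.0], [0.0, 1.0]]               # identity weights for readability
adj_norm = normalized_adjacency(adj)
z = gcn_layer(adj_norm, x, w)
loss = info_nce(z, anchor=0, positives=[1], negatives=[2], tau=0.5)
```

A smaller temperature sharpens the contrast between positive and negative similarities; in training, the loss would be averaged over anchors and minimized with respect to the layer weights, which this sketch leaves out.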
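The attention fusion of claims 4 and 6 and the thresholded classification of step S5 can be sketched in the same way. Several details here are assumptions, because the claims do not fix them: separate query, key and value matrices, scaling the scores by the square root of the key dimension, averaging the attended rows into one fused vector, a linear layer before the sigmoid, and choosing the threshold by a grid search that maximizes F1 on validation data. All names, weights and data are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, wq, wk, wv):
    """Fuse per-index feature vectors (one per row) with self-attention."""
    q, k, v = matmul(features, wq), matmul(features, wk), matmul(features, wv)
    scale = math.sqrt(len(k[0]))  # assumption: scaled dot-product attention
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    fused_rows = matmul(weights, v)
    # Assumption: average the attended rows to obtain a single fused vector.
    fused = [sum(col) / len(fused_rows) for col in zip(*fused_rows)]
    return fused, weights


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def f1_score(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_threshold(probs, labels, candidates):
    """Pick the candidate threshold with the highest F1 (first one on ties)."""
    return max(candidates, key=lambda t: f1_score(labels, [int(p >= t) for p in probs]))


identity = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three index-specific vectors
fused, weights = attention_fuse(features, identity, identity, identity)
prob = sigmoid(sum(c * f for c, f in zip([0.8, -0.3], fused)) + 0.1)

val_probs = [0.9, 0.8, 0.4, 0.2]
val_labels = [1, 1, 0, 0]
threshold = best_threshold(val_probs, val_labels, [0.3, 0.5, 0.7])
is_potential_user = prob >= threshold
```

Each row of `weights` sums to one, so it can be read as the relative importance of the index-specific vectors in the fusion.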

Description

Electric energy substitution potential user identification method and system based on contrast learning

Technical Field

The invention relates to the technical field of data identification, and in particular to a method and system for identifying electric energy substitution potential users based on contrastive learning.

Background

Accurately identifying electric energy substitution potential users is one of the key technologies for the clean and efficient transformation of the energy consumption side. The development of this technology is closely tied to the degree of data accumulation in the energy industry and to the evolution of artificial intelligence, and its core challenge is achieving high-precision user identification in real scenarios where labeled data are limited, user characteristics are complex, and association relations are diverse.

Early identification methods relied mainly on traditional rules: energy domain experts set screening criteria from experience, for example by directly listing users in high-energy-consumption industries as potential targets, or by setting hard thresholds on fossil energy consumption, equipment capacity and the like. Although simple to implement, this approach is highly subjective, adapts poorly to complex, individualized application scenarios, and has a high miss rate in practice.

With the spread of machine learning, supervised learning methods became mainstream. These methods use models such as logistic regression, random forests and XGBoost, take static attributes such as the user's industry type, energy consumption scale, equipment age and regional electricity price as features, and train the model on historical transformation cases.
In addition, these methods focus on the independent characteristics of individual users and completely ignore the association relations between users, so their recognition accuracy is clearly insufficient for clustered user scenarios such as industrial parks and communities.

In recent years, with the wide deployment of electricity consumption information acquisition systems and online energy consumption monitoring platforms, multi-source data such as user attributes, electricity consumption behavior and association structures have been effectively integrated. Graph neural networks, with their strength in processing graph-structured data, have been introduced to capture complex association features among users, while semi-supervised learning can use a small amount of labeled data to guide learning from a large amount of unlabeled data, offering an important path toward solving label scarcity. However, existing schemes based on graph neural networks still have clear limitations: on the one hand, model performance still depends on a fairly large amount of labeled data to ensure accuracy; on the other hand, the domain prior knowledge contained in unlabeled data cannot be fully mined, so feature learning is inefficient and it is difficult to meet the need to accurately screen high-potential users in actual business.

At present, prior art schemes similar in approach to the invention fall mainly into the following three categories:

The first category is user identification schemes based on semi-supervised graph learning. The core of these methods is to use the structural information of the user association graph to propagate supervision signals; typical representatives are label propagation algorithms and their improved models.
The basic label propagation algorithm assumes that adjacent nodes tend to belong to similar categories, and propagates the label information of a small number of labeled nodes to unlabeled nodes through the edge weights. This method is somewhat effective in scenarios such as industrial user identification, but its performance depends heavily on the connectivity of the graph, and it performs poorly on isolated users with few associations. An improved scheme adopts a semi-supervised graph convolutional network model: a graph convolutional network learns a feature representation that fuses the user's attributes and structure, and the loss function accounts for both the classification loss of labeled nodes and the consistency loss of unlabeled nodes, so that a higher F1 value (a comprehensive evaluation index) is obtained when training with a small amount of labeled data. Although this outperforms traditional supervised learning, it essentially treats unlabeled data as a passive carrier for propagating supervision signals, and the prior index information directly related to the el