
CN-122020281-A - Power user classification method and device for confusable industries, electronic device, and storage medium


Abstract

The application provides a method and a device for classifying power users in confusable industries, an electronic device, and a storage medium. The method comprises: obtaining multidimensional industry feature data of the power users to be classified; generating a fusion feature vector for each power user to be classified according to the multidimensional industry feature data; clustering the power users to be classified based on the fusion feature vectors to determine a fuzzy set of confusable-industry users; performing an initial prediction of the industry of each power user to be classified in the fuzzy set of confusable-industry users; and calculating, using a preset weighted dot product model, a comprehensive discrimination score for each industry corresponding to the user, so as to determine the user's home industry. Embodiments of the application can solve the technical problem of disordered classification of electricity consumption data caused by similar industry classifications, improve the purity and accuracy of industry electricity consumption statistics, and provide reliable data support for macroeconomic analysis and industry supervision based on electricity data.

Inventors

  • FENG SEN
  • GE XIN
  • WANG PEI
  • YIN QING
  • WANG CHAO
  • JU JIAN

Assignees

  • State Grid Shaanxi Electric Power Co., Ltd. Marketing Service Center (Metering Center) (国网陕西省电力有限公司营销服务中心(计量中心))

Dates

Publication Date
20260512
Application Date
20251229

Claims (16)

  1. A method of classifying power users in confusable industries, the method comprising: acquiring multidimensional industry feature data of power users to be classified; generating, using an attention mechanism, a fusion feature vector for each power user to be classified according to the multidimensional industry feature data; clustering all power users to be classified based on the fusion feature vectors using a preset clustering algorithm to determine a fuzzy set of confusable-industry users; performing, using a preset hybrid discrimination model, an initial prediction of the industry to which any one power user to be classified in the fuzzy set of confusable-industry users belongs, and determining a contribution vector of each industry feature of that user to the initial prediction result; calculating, using a preset weighted dot product model and based on the contribution vector of each industry feature of that user to the initial prediction result, a comprehensive discrimination score for each industry corresponding to that user; and determining the home industry of that user according to the comprehensive discrimination scores.
  2. The method of claim 1, wherein determining the home industry of the user according to the comprehensive discrimination score comprises: determining the industry with the highest comprehensive discrimination score as the home industry of the user.
  3. The method of claim 2, wherein determining the home industry of the user according to the comprehensive discrimination score comprises: reclassifying the user based on a dual-threshold hierarchical decision mechanism and re-determining the home industry of the user; wherein reclassifying the user based on the dual-threshold hierarchical decision mechanism and re-determining the home industry of the user comprises: calculating the optimal industry category of the user in the fuzzy set of confusable-industry users and its corresponding first comprehensive discrimination score, and the suboptimal industry of the user and its corresponding second comprehensive discrimination score; calculating a score difference between the first comprehensive discrimination score and the second comprehensive discrimination score; and comparing the first comprehensive discrimination score with a preset confidence threshold, comparing the score difference with a preset dominance difference threshold, and determining the industry to which the user belongs according to the comparison results.
  4. The method of claim 3, wherein the preset confidence threshold comprises a maximum confidence threshold and a minimum confidence threshold, and wherein comparing the first comprehensive discrimination score with the preset confidence threshold, comparing the score difference with the preset dominance difference threshold, and determining the industry to which the user belongs according to the comparison results comprises: determining that the user belongs to the industry with the highest comprehensive discrimination score when the first comprehensive discrimination score is greater than or equal to the maximum confidence threshold and the score difference is greater than or equal to the preset dominance difference threshold; submitting the industry to which the user belongs to manual verification when the first comprehensive discrimination score is less than the minimum confidence threshold or the score difference is less than the preset dominance difference threshold; and marking the industry to which the user belongs as pending when the first comprehensive discrimination score is less than the maximum confidence threshold and greater than or equal to the minimum confidence threshold, and the score difference is greater than or equal to the preset dominance difference threshold.
  5. The method of claim 1, wherein acquiring multidimensional industry feature data of the power users to be classified comprises: acquiring the multidimensional industry feature data of the power users to be classified from the electricity consumption side, the user side, and the verification side respectively, and constructing an industry user feature profile of each power user to be classified according to the multidimensional industry feature data.
  6. The method of claim 5, wherein the multidimensional industry feature data comprises a load curve depth feature, a production timing feature, and an energy consumption intensity feature, wherein: the load curve depth feature comprises at least one of a production continuity index, a frequency-domain energy ratio of load fluctuation, and a non-power-frequency load ratio; the production timing feature comprises at least one of intelligent shift pattern recognition and a seasonal production feature vector; and the energy consumption intensity feature comprises an index of electricity consumption per unit of product.
  7. The method of claim 1, wherein generating, using an attention mechanism, a fusion feature vector for each power user to be classified according to the multidimensional industry feature data comprises: preprocessing the multidimensional industry feature data; standardizing and encoding the multidimensional industry feature data; concatenating the standardized and encoded multidimensional industry feature data in a preset order to generate an original feature vector; and performing a weighted summation on the original feature vector using an attention mechanism to obtain the fusion feature vector.
  8. The method of claim 7, wherein preprocessing the multidimensional industry feature data comprises: denoising the multidimensional industry feature data using a wavelet transform method; deleting outliers from the multidimensional industry feature data; and filling in missing values in the multidimensional industry feature data using linear interpolation.
  9. The method of claim 7, wherein standardizing and encoding the multidimensional industry feature data comprises: standardizing the multidimensional industry feature data using the Z-score standardization method; and converting categorical data in the multidimensional industry feature data into binary vectors using one-hot encoding.
  10. The method of claim 7, wherein performing a weighted summation on the original feature vector using an attention mechanism to obtain the fusion feature vector comprises: calculating the attention weight of each feature in the original feature vector according to a preset attention parameter; and performing a weighted summation on the original feature vector according to the attention weights to obtain the fusion feature vector.
  11. The method of claim 1, wherein clustering all power users to be classified based on the fusion feature vectors using a preset clustering algorithm to determine a fuzzy set of confusable-industry users comprises: determining adaptive optimal neighborhood radius parameters using the K-distance graph method; performing initial clustering on all power users to be classified using a tight adaptive optimal neighborhood radius parameter to form high-confidence core clusters, and taking the set of users outside the high-confidence core clusters as a pending classification set; performing secondary clustering on the pending classification set using a loose adaptive optimal neighborhood radius parameter, and determining weakly associated substructures in the pending classification set; and taking the weakly associated substructures formed by the secondary clustering, together with the set of noise-point users that remain unclustered after the secondary clustering, as the fuzzy set of confusable-industry users.
  12. The method of claim 1, wherein performing, using a preset hybrid discrimination model, an initial prediction of the industry to which any one power user to be classified in the fuzzy set of confusable-industry users belongs, and determining a contribution vector of each industry feature of that user to the initial prediction result, comprises: training a benchmark industry classifier with a gradient boosting tree method on users with labeled industries and their corresponding fusion feature vectors, and predicting the industry probabilities of the users in the fuzzy set of confusable-industry users; taking the industry membership probability predicted by the benchmark industry classifier for each user as the target to be explained, and taking the fusion feature vector corresponding to each user as the feature set input to the explainer; and inputting the defined explanation targets and explanation features into a Shapley additive explanations (SHAP) explainer to obtain a contribution vector for the prediction of each industry category, wherein the contribution vector is expressed as: $\boldsymbol{\phi}_i^{k} = \left[\phi_{i,1}^{k}, \phi_{i,2}^{k}, \ldots, \phi_{i,M}^{k}\right]$; where $\boldsymbol{\phi}_i^{k}$ denotes the contribution vector, $M$ denotes the number of features participating in the discrimination, $i$ denotes the i-th user, and $k$ denotes the k-th industry to which the i-th user may belong.
  13. The method according to claim 1, wherein calculating, using a preset weighted dot product model and based on the contribution vector of each industry feature of the user to the initial prediction result, the comprehensive discrimination score for each industry corresponding to the user comprises: calculating the global mean absolute SHAP value of each feature over the users in the fuzzy set of confusable-industry users; constructing a judgment matrix using the 1-9 scale method according to the pairwise comparison values among the features; calculating the feature weight vector from the judgment matrix; and calculating the comprehensive discrimination score according to the global mean absolute SHAP values and the feature weight vector, wherein the comprehensive discrimination score is expressed as: $S_i^{k} = \sum_{j=1}^{M} w_j \, \phi_{i,j}^{k}$; where $i$ denotes the i-th user to be discriminated in the fuzzy set of confusable-industry users; $k$ denotes the k-th industry to which the i-th user may belong; $j$ denotes the j-th industry feature of the i-th user; $M$ denotes the number of features participating in the discrimination; $\phi_{i,j}^{k}$ denotes the contribution of the j-th industry feature; and $w_j$ denotes the feature weight of the j-th industry feature.
  14. A power user classification device for confusable industries, the device comprising: an acquisition module for acquiring multidimensional industry feature data of power users to be classified; a vector calculation module for generating, using an attention mechanism, a fusion feature vector for each power user to be classified according to the multidimensional industry feature data; a clustering module for clustering all power users to be classified based on the fusion feature vectors using a preset clustering algorithm to determine a fuzzy set of confusable-industry users; a prediction module for performing, using a preset hybrid discrimination model, an initial prediction of the industry to which any one power user to be classified in the fuzzy set of confusable-industry users belongs, and determining a contribution vector of each industry feature of that user to the initial prediction result; a score calculation module for calculating, using a preset weighted dot product model and based on the contribution vector of each industry feature of that user to the initial prediction result, a comprehensive discrimination score for each industry corresponding to that user; and an output module for determining the home industry of that user according to the comprehensive discrimination scores.
  15. An electronic device, comprising a memory and a processor, the memory storing a computer program to be run by the processor, wherein the computer program, when run by the processor, causes the processor to perform the method of classifying power users in confusable industries according to any one of claims 1 to 13.
  16. A storage medium having stored thereon a computer program which, when executed by a processor, causes the processor to perform the method of classifying power users in confusable industries according to any one of claims 1 to 13.
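
For illustration only, the dual-threshold hierarchical decision of claims 3 and 4 can be sketched in Python as follows. The function name `decide_industry`, the parameter names, the industry labels, and the returned status strings are all hypothetical; the claims do not specify threshold values or an implementation.

```python
from typing import Dict, Tuple


def decide_industry(scores: Dict[str, float],
                    max_conf: float,
                    min_conf: float,
                    min_gap: float) -> Tuple[str, str]:
    """Return (decision, best_industry) for one user.

    scores   : comprehensive discrimination score per candidate industry
    max_conf : maximum confidence threshold (hypothetical value chosen by caller)
    min_conf : minimum confidence threshold
    min_gap  : dominance difference threshold between best and second best
    """
    if len(scores) < 2:
        raise ValueError("need at least two candidate industries")
    # Rank industries by score, highest first.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, first_score), (_, second_score) = ranked[0], ranked[1]
    gap = first_score - second_score

    # Low confidence or no clear winner: hand over to manual verification.
    if first_score < min_conf or gap < min_gap:
        return "manual_review", best
    # High confidence and a clear winner: accept the best industry.
    if first_score >= max_conf:
        return "assigned", best
    # Clear winner but only medium confidence: mark as pending.
    return "pending", best
```

The three branches cover every combination of score and score difference described in claim 4, so each user receives exactly one outcome.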
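
The standardization, encoding, and attention steps of claims 9 and 10 can likewise be sketched under stated assumptions. The claims do not say how attention weights are derived from the preset attention parameter; the sketch below assumes a softmax over the product of each feature value and its parameter, followed by element-wise reweighting, which is only one possible reading. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def z_score(column: Sequence[float]) -> List[float]:
    """Z-score standardization of one numeric feature across users."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    if std == 0.0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - mean) / std for v in column]


def one_hot(value: str, categories: Sequence[str]) -> List[float]:
    """One-hot encode a categorical value as a binary vector."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return [1.0 if value == c else 0.0 for c in categories]


def attention_weights(raw: Sequence[float], params: Sequence[float]) -> List[float]:
    """Assumed form: softmax over params[j] * raw[j]."""
    if len(raw) != len(params):
        raise ValueError("raw vector and parameters must have equal length")
    logits = [p * x for p, x in zip(params, raw)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(raw: Sequence[float], params: Sequence[float]) -> List[float]:
    """Reweight each feature of the original vector by its attention weight."""
    weights = attention_weights(raw, params)
    return [w * x for w, x in zip(weights, raw)]
```

With all parameters equal to zero the weights are uniform, so the fused vector is simply the original vector scaled by one over the number of features.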
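
The two-stage clustering of claim 11 can be sketched as follows, assuming a DBSCAN-style density clustering. The claim's references to a neighborhood radius, a K-distance graph, and noise points suggest such an algorithm, but the claims do not name it. The function names, the `min_pts` parameter, and the choice to leave radius selection to the caller are assumptions for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence

Point = Sequence[float]


def k_distances(points: Sequence[Point], k: int) -> List[float]:
    """Sorted distance of every point to its k-th nearest neighbour.

    Plotting this list gives the K-distance graph; a radius is usually
    read off near its elbow.
    """
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Minimal DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    labels: List[Optional[int]] = [None] * n
    neighbours = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # core point: keep expanding
    return [lab if lab is not None else -1 for lab in labels]


def confusable_set(points: Sequence[Point], tight_eps: float,
                   loose_eps: float, min_pts: int) -> Dict[str, list]:
    """Tight pass finds core clusters; loose pass structures the remainder."""
    first = dbscan(points, tight_eps, min_pts)
    pending = [i for i, lab in enumerate(first) if lab == -1]
    second = dbscan([points[i] for i in pending], loose_eps, min_pts)
    weak: Dict[int, List[int]] = {}
    noise: List[int] = []
    for idx, lab in zip(pending, second):
        if lab == -1:
            noise.append(idx)
        else:
            weak.setdefault(lab, []).append(idx)
    return {"core_labels": first,
            "weak_substructures": list(weak.values()),
            "noise": noise}
```

In this reading, the fuzzy set of confusable-industry users is the union of the indices in `weak_substructures` and `noise`.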

Description

Power user classification method and device for confusable industries, electronic device, and storage medium

Technical Field

The application relates to the technical field of electricity consumption behavior analysis, and in particular to a method and a device for classifying power users in confusable industries, an electronic device, and a storage medium.

Background

Currently, electric power big data has become a key element for gaining insight into economic and social operating conditions, optimizing the allocation of energy resources, and supporting the formulation of industrial policy. Electricity consumption data by industry is a core statistical indicator, and its classification accuracy and purity directly determine the reliability of macro-level analysis. In China, industry electricity statistics mainly follow the national standard Industrial Classification for National Economic Activities (GB/T 4754-2017); power grid enterprises assign industry labels to users according to user-reported information, business registration information, and the like, and report statistical data based on those labels. However, in practical applications, a large number of industry classifications suffer from fuzzy boundaries and highly overlapping electricity consumption behavior characteristics, which distorts the statistical data.

This manifests mainly in two ways. First, some industry classifications are adjacent: for example, the electrical machinery and equipment manufacturing industry and the computer, communication and other electronic equipment manufacturing industry overlap in production processes and electrical equipment. Second, a single user may carry out multiple production activities: for example, a large manufacturing park may cover both metal processing and equipment assembly, so its mixed electricity consumption characteristics are difficult to assign to a single industry. User groups that are adjacent in the standard classification and similar in actual electricity consumption behavior form a set of confusable-industry users, which is difficult to handle with traditional methods. Furthermore, existing technical solutions have obvious shortcomings in meeting this challenge: existing methods rely on the time-domain characteristics of a single load curve and cannot deeply mine the frequency-domain patterns of load fluctuation, the timing characteristics of production operations, or the energy consumption differences inherent in the process, so their ability to distinguish confusable industries is limited.

Disclosure of Invention

The present application has been made in view of at least one of the above problems in the prior art.
According to one aspect of the application, there is provided a method of classifying power users in confusable industries, the method comprising: acquiring multidimensional industry feature data of power users to be classified; generating, using an attention mechanism, a fusion feature vector for each power user to be classified according to the multidimensional industry feature data; clustering all power users to be classified based on the fusion feature vectors using a preset clustering algorithm to determine a fuzzy set of confusable-industry users; performing, using a preset hybrid discrimination model, an initial prediction of the industry to which any one power user to be classified in the fuzzy set of confusable-industry users belongs, and determining a contribution vector of each industry feature of that user to the initial prediction result; calculating, using a preset weighted dot product model and based on the contribution vector of each industry feature of that user to the initial prediction result, a comprehensive discrimination score for each industry corresponding to that user; and determining the home industry of that user according to the comprehensive discrimination scores.

In some embodiments, determining the home industry of the user according to the comprehensive discrimination score comprises: determining the industry with the highest comprehensive discrimination score as the home industry of the user.
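
As a non-limiting illustration of the weighted dot product model and the highest-score rule described above, the following Python sketch computes a score per industry as the dot product of a feature weight vector and that industry's contribution vector. The function names and industry labels are hypothetical. The weights are derived here from a 1-9 scale judgment matrix with the row geometric-mean approximation, which is an assumption: the application states that weights are calculated from the judgment matrix but does not say by which procedure.

```python
import math
from typing import Dict, List, Sequence


def ahp_weights(judgment: Sequence[Sequence[float]]) -> List[float]:
    """Approximate priority weights from a pairwise judgment matrix.

    Uses the row geometric mean, normalized to sum to 1 (assumed method).
    """
    n = len(judgment)
    if any(len(row) != n for row in judgment):
        raise ValueError("judgment matrix must be square")
    geo_means = [math.prod(row) ** (1.0 / n) for row in judgment]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def discrimination_scores(contributions: Dict[str, Sequence[float]],
                          weights: Sequence[float]) -> Dict[str, float]:
    """Weighted dot product of feature weights and per-industry contributions."""
    scores = {}
    for industry, vec in contributions.items():
        if len(vec) != len(weights):
            raise ValueError("contribution vector and weights differ in length")
        scores[industry] = sum(w * c for w, c in zip(weights, vec))
    return scores


def home_industry(scores: Dict[str, float]) -> str:
    """Pick the industry with the highest comprehensive discrimination score."""
    return max(scores, key=scores.get)
```

For example, with a two-feature judgment matrix `[[1, 2], [0.5, 1]]` the weights are about 0.667 and 0.333, so a contribution on the first feature counts twice as much as the same contribution on the second.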
In some embodiments, determining the home industry of the user according to the comprehensive discrimination score comprises: reclassifying the user based on a dual-threshold hierarchical decision mechanism and re-determining the home industry of the user; wherein reclassifying the user based on the dual-threshold hierarchical decision mechanism and re-determining the home industry of the user comprises: calculating the optimal industry category of the user in the fuzzy set of confusable-industry users and its corresponding first comprehensive discrimination