CN-121980147-A - Method, medium and equipment for generating electric vehicle charging behavior data set

Abstract

The application provides a method, medium, and electronic device for generating an electric vehicle charging behavior data set. A trained first low-rank matrix and a trained second low-rank matrix are obtained, and a weight matrix of a large language model is determined from them. A retrieval data set is then determined from an electric vehicle charging and driving data set and used as the real data source from which the large language model generates the electric vehicle charging behavior data set. Finally, semantic consistency verification and numerical rationality verification are performed on the generated data set. Large-scale electric vehicle charging behavior data can thus be obtained without being limited by the privacy concerns of electric vehicle users; the hardware and storage resources consumed by fine-tuning are reduced; the excessive data repetition caused by nearest-neighbor retrieval is avoided; and the generated data is more timely and cheaper to acquire.

Inventors

  • YUAN MINGHAN
  • LIU YANGYANG
  • JI KUNHUA
  • CHEN SONG
  • GE WEIDONG
  • YAO YIN

Assignees

  • East China Branch of State Grid Corporation of China (国家电网有限公司华东分部)

Dates

Publication Date
2026-05-05
Application Date
2026-01-28

Claims (10)

  1. A method for generating an electric vehicle charging behavior data set, characterized by comprising the following steps: acquiring an electric vehicle charging and driving data set; iteratively training a first low-rank matrix and a second low-rank matrix according to a weight matrix of a first large language model to obtain a trained first low-rank matrix and a trained second low-rank matrix; determining a weight matrix of a second large language model according to the trained first low-rank matrix and the trained second low-rank matrix, wherein the weight matrix of the second large language model is different from the weight matrix of the first large language model; determining a retrieval data set according to the electric vehicle charging and driving data set; generating the electric vehicle charging behavior data set through the second large language model, with the retrieval data set as a real data source; and performing semantic consistency verification and numerical rationality verification on the electric vehicle charging behavior data set.
  2. The method of claim 1, wherein iteratively training the first low-rank matrix and the second low-rank matrix according to the weight matrix of the first large language model comprises: determining an incremental weight matrix as the matrix product of the second low-rank matrix and the first low-rank matrix; updating the weight matrix of the first large language model to the sum of the incremental weight matrix and the weight matrix of the first large language model; determining an output vector according to an input vector and the updated first large language model, wherein the input vector is determined according to the electric vehicle charging and driving data set; calculating a loss value of the output vector according to a preset loss function; adjusting the first low-rank matrix and the second low-rank matrix with the aim of minimizing the loss value of the output vector; and determining an updated incremental weight matrix according to the adjusted first low-rank matrix and the adjusted second low-rank matrix, and repeating the adjustment of the first low-rank matrix and the second low-rank matrix with the updated incremental weight matrix until an iteration stop condition is met.
  3. The method of claim 2, wherein the loss function is a negative log-likelihood loss function formulated as follows: $L = -\sum_{t} \log P(y_t \mid y_{<t})$, wherein $L$ is the loss function, $y_t$ is the $t$-th token to be output, $y_{<t}$ denotes the tokens preceding $y_t$, $P$ is the token probability distribution based on the model parameters, and a token is a character.
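  4. The method of claim 2, wherein the first low-rank matrix and the second low-rank matrix are adjusted as follows: $A \leftarrow A - \eta \frac{\partial L}{\partial A}$ and $B \leftarrow B - \eta \frac{\partial L}{\partial B}$, wherein $A$ is the first low-rank matrix, $B$ is the second low-rank matrix, and $\eta$ is the learning rate.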
  5. The method of claim 1, wherein determining the retrieval data set according to the electric vehicle charging and driving data set comprises: dividing the electric vehicle charging and driving data set into a plurality of data blocks; vectorizing the plurality of data blocks to obtain retrieval set vectors; dividing the retrieval set vectors into a plurality of subclasses according to data category, and calculating the center vector of each subclass; calculating the cosine similarity between an original query vector and the center vector of each subclass; and uniformly extracting a plurality of pieces of data from the subclass with the maximum similarity to form the retrieval data set.
  6. The method of claim 1, characterized in that the method further comprises: determining data dimensions of the electric vehicle charging behavior data set according to the electric vehicle charging and driving data set, wherein the data dimensions comprise vehicle model, battery type, battery capacity, battery duration, access time, charging electric quantity, remaining electric quantity, charging end time, driving distance, driving duration, vehicle energy consumption, parking time, charging power, driving electric quantity, average vehicle speed, initial SOC, and off-grid SOC.
  7. The method of claim 6, wherein performing semantic consistency verification on the electric vehicle charging behavior data set comprises: acquiring, from the electric vehicle charging behavior data set, a data set related to verification dimensions, wherein the verification dimensions comprise initial SOC, driving distance, average vehicle speed, parking time, and vehicle energy consumption; randomly sampling the data set related to the verification dimensions to obtain a sampled data set; inputting the sampled data set into the second large language model to obtain a verification data set; and calculating a semantic consistency index (ESC) between the verification data set and the retrieval data set, wherein the ESC is expressed by a formula over the following quantities: the verification vectors in the verification data set; the data vectors in the retrieval data set; the Euclidean distance between each verification vector and the semantically closest data vector in the retrieval data set; and the Euclidean distance between each verification vector and the center vector of its subclass.
  8. The method of claim 7, wherein performing numerical rationality verification on the electric vehicle charging behavior data set comprises: acquiring a plurality of vector pairs, each consisting of a verification vector in the verification data set and the data vector in the retrieval data set at minimum Euclidean distance from that verification vector; judging, for each vector pair, whether the verification vector meets a rationality condition according to the verification vector and the data vector in the pair; and calculating a data rationality index according to the number of verification vectors in the vector pairs that are reasonable vectors, expressed by the formula $\mathrm{DRI} = N_{\mathrm{r}} / N$, wherein DRI is the data rationality index, $N$ is the number of vector pairs, and $N_{\mathrm{r}}$ is the number of verification vectors that are reasonable vectors.
  9. A computer-readable medium having stored thereon computer-readable instructions executable by a processor to implement the method of any one of claims 1 to 8.
  10. An electronic device comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, cause the electronic device to perform the method of any one of claims 1 to 8.
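
The retrieval step of claim 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `build_retrieval_set` and its parameters are hypothetical, the data blocks are assumed to be vectorized already, the center vector is taken to be the mean of a subclass, and "uniform extraction" is read as taking evenly spaced members of the chosen subclass.

```python
import numpy as np

def build_retrieval_set(vectors, labels, query, n_samples):
    """Return the best-matching subclass and the row indices drawn from it.

    vectors   : (n, d) array-like of already vectorized data blocks
    labels    : length-n data category of each vector
    query     : length-d original query vector
    n_samples : how many rows to extract from the chosen subclass
    """
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    query = np.asarray(query, dtype=float)

    classes = np.unique(labels)
    # Assumption: the center vector of a subclass is the mean of its members.
    centers = np.stack([vectors[labels == c].mean(axis=0) for c in classes])

    # Cosine similarity between the query vector and every center vector.
    sims = centers @ query / (np.linalg.norm(centers, axis=1) * np.linalg.norm(query))
    best = classes[int(np.argmax(sims))]

    # Assumption: "uniform extraction" means evenly spaced members of the subclass.
    members = np.flatnonzero(labels == best)
    k = min(n_samples, members.size)
    positions = np.linspace(0, members.size - 1, num=k).astype(int)
    return best, members[positions]
```

Sampling evenly across the whole subclass, rather than taking only the nearest neighbors of the query, is one way to read the stated goal of avoiding excessive data repetition.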

Description

Method, medium and equipment for generating electric vehicle charging behavior data set

Technical Field

The invention relates to the technical field of new energy, and in particular to a method, medium and equipment for generating an electric vehicle charging behavior data set.

Background

Large-scale electric vehicle charging behavior data is currently required as a basis for research such as charging load prediction and electric vehicle demand response. The more complete the electric vehicle charging behavior data, the more accurately the charging behavior of electric vehicle users can be characterized. However, owing to factors such as the privacy protection of electric vehicle users, large-scale electric vehicle charging behavior data is difficult to collect completely and is hard to acquire. In addition, acquiring electric vehicle charging behavior data through a traditional vehicle-mounted data acquisition system incurs high equipment costs and user authorization costs, while electric vehicle charging behavior data obtained from a traditional probability model depends on the quality of the original data, which is updated slowly and has poor timeliness. A technical solution capable of obtaining large-scale electric vehicle charging behavior data is therefore needed.

Disclosure of Invention

The application aims to provide a method, medium and equipment for generating an electric vehicle charging behavior data set, so as to solve the problems in the prior art that electric vehicle charging behavior data is difficult to acquire because of user privacy and that the acquired data is of low quality.
To achieve the above object, some embodiments of the present application provide a method for generating an electric vehicle charging behavior data set, the method including: acquiring an electric vehicle charging and driving data set; iteratively training a first low-rank matrix and a second low-rank matrix according to a weight matrix of a first large language model to obtain a trained first low-rank matrix and a trained second low-rank matrix; determining a weight matrix of a second large language model according to the trained first low-rank matrix and the trained second low-rank matrix, wherein the weight matrix of the second large language model is different from the weight matrix of the first large language model; determining a retrieval data set according to the electric vehicle charging and driving data set; generating the electric vehicle charging behavior data set through the second large language model, with the retrieval data set as a real data source; and performing semantic consistency verification and numerical rationality verification on the electric vehicle charging behavior data set.
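
To make the relationship between the two weight matrices concrete, the following worked example assumes that the weights of the second large language model are the weights of the first model plus the trained low-rank product, in line with the training step of claim 2. The symbols $W_1$, $W_2$, $d$, $k$ and $r$, and the sizes used, are illustrative assumptions rather than values from the application.

```latex
\Delta W = BA, \qquad W_2 = W_1 + \Delta W, \qquad
W_1 \in \mathbb{R}^{d \times k},\; B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}

\text{trainable parameters: } r(d + k) \quad \text{versus} \quad dk \text{ for full fine-tuning}

d = k = 4096,\; r = 8: \qquad r(d + k) = 65{,}536, \qquad dk = 16{,}777{,}216, \qquad
\frac{r(d + k)}{dk} = \frac{1}{256}
```

When $r$ is much smaller than $d$ and $k$, only a small fraction of the parameters is trained and stored, which is consistent with the stated reduction in the hardware and storage resources used for fine-tuning.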
Further, iteratively training the first low-rank matrix and the second low-rank matrix according to the weight matrix of the first large language model includes: determining an incremental weight matrix as the matrix product of the second low-rank matrix and the first low-rank matrix; updating the weight matrix of the first large language model to the sum of the incremental weight matrix and the weight matrix of the first large language model; determining an output vector according to an input vector and the updated first large language model, wherein the input vector is determined according to the electric vehicle charging and driving data set; calculating a loss value of the output vector according to a preset loss function; adjusting the first low-rank matrix and the second low-rank matrix with the aim of minimizing the loss value of the output vector; and determining an updated incremental weight matrix according to the adjusted first low-rank matrix and the adjusted second low-rank matrix, and repeating the adjustment of the first low-rank matrix and the second low-rank matrix with the updated incremental weight matrix until an iteration stop condition is met.
Further, the loss function is a negative log-likelihood loss function, formulated as follows: $L = -\sum_{t} \log P(y_t \mid y_{<t})$; wherein $L$ is the loss function, $y_t$ is the $t$-th token to be output, $y_{<t}$ denotes the tokens preceding $y_t$, $P$ is the token probability distribution based on the model parameters, and a token is a character.
Further, the first low-rank matrix and the second low-rank matrix are adjusted as follows: $A \leftarrow A - \eta \frac{\partial L}{\partial A}$; $B \leftarrow B - \eta \frac{\partial L}{\partial B}$; wherein $A$ is the first low-rank matrix, $B$ is the second low-rank matrix, and $\eta$ is the learning rate.
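
The iterative training described above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the application's implementation: a single frozen weight matrix `W` followed by a softmax stands in for the first large language model, each training pair `(x, y)` is one input vector and one target token index, plain gradient descent is assumed for the adjustment step, and the stop condition (loss change below `tol`) is hypothetical. The function names and default values are likewise illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def nll_loss(W, A, B, xs, ys):
    """Negative log-likelihood of the targets under the weights W + B @ A."""
    W_eff = W + B @ A
    return float(-sum(np.log(softmax(W_eff @ x)[y]) for x, y in zip(xs, ys)))

def train_low_rank(W, xs, ys, rank=2, lr=0.05, max_iters=500, tol=1e-6, seed=0):
    """Adjust only the low-rank matrices A and B; W itself stays frozen."""
    rng = np.random.default_rng(seed)
    n_out, n_in = W.shape
    A = rng.normal(scale=0.1, size=(rank, n_in))  # first low-rank matrix
    B = np.zeros((n_out, rank))                   # second low-rank matrix
    prev = nll_loss(W, A, B, xs, ys)
    for _ in range(max_iters):
        W_eff = W + B @ A                         # base weights plus increment B @ A
        grad_A = np.zeros_like(A)
        grad_B = np.zeros_like(B)
        for x, y in zip(xs, ys):
            g = softmax(W_eff @ x)
            g[y] -= 1.0                           # gradient of the loss w.r.t. the logits
            grad_B += np.outer(g, A @ x)          # dL/dB = g (A x)^T
            grad_A += np.outer(B.T @ g, x)        # dL/dA = (B^T g) x^T
        A = A - lr * grad_A                       # gradient-descent update of A
        B = B - lr * grad_B                       # gradient-descent update of B
        loss = nll_loss(W, A, B, xs, ys)
        if abs(prev - loss) < tol:                # hypothetical iteration stop condition
            break
        prev = loss
    return A, B
```

After training, `W + B @ A` would play the role of the second model's weight matrix in this sketch. Starting `B` at zero is a common low-rank adaptation choice, made here so that the first iteration begins from the unmodified base weights.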
Further, determining a retrieval data set according to the electric vehicle charging and driving data set includes: dividing the electric vehicle charging and driving data set into a plurality of data blocks; vectorizing a plurality of data block