CN-121981319-A - User traffic prediction method and device based on multi-model fusion
Abstract
An embodiment of the invention discloses a user traffic prediction method and device based on multi-model fusion. The method comprises: performing data integration on an acquired source data set to obtain a processed user data set; performing data reconstruction on the processed user data set to obtain a reconstructed user data set; labeling each item of reconstructed user data in the reconstructed user data set to generate a high-dimensional user tag matrix set; performing user hierarchy division according to a dimension model in a user data lake and the high-dimensional user tag matrix set to generate a user hierarchy tag sequence set; and performing user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate a user traffic prediction information set. This embodiment can improve the accuracy of user traffic prediction and reduce deviation in resource allocation.
Inventors
- FANG ZHENYING
- YU BIN
- JIAN HAIXIA
- HUANG YANHAO
- HUANG SHUIPING
Assignees
- 北京鹏龙行汽车贸易有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-31
Claims (10)
- 1. A user traffic prediction method based on multi-model fusion, characterized by comprising the following steps: performing data integration on an acquired source data set by using a data integration layer in a pre-built user data lake to obtain a processed user data set, wherein the user data lake further comprises an index system model and a dimension model; performing data reconstruction on the processed user data set to obtain a reconstructed user data set; labeling each item of reconstructed user data in the reconstructed user data set according to the index system model in the user data lake to generate a high-dimensional user tag matrix set; performing user hierarchy division according to the dimension model in the user data lake and the high-dimensional user tag matrix set to generate a user hierarchy tag sequence set, wherein the high-dimensional user tag matrices correspond one-to-one to the user hierarchy tag sequences, and each high-dimensional user tag matrix and its corresponding user hierarchy tag sequence represent a user portrait; and performing user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate a user traffic prediction information set, and performing a resource allocation operation according to the user traffic prediction information set.
- 2. The method according to claim 1, wherein the method further comprises: storing the user traffic prediction information set, and issuing a traffic loss early warning in response to determining that the user traffic prediction information set meets a preset loss early warning condition.
- 3. The method according to claim 1, wherein performing data integration on the acquired source data set by using the data integration layer in the pre-built user data lake to obtain the processed user data set comprises: using the data integration layer in the user data lake, performing the following data processing steps for each item of source data in the acquired source data set: performing incremental data extraction on the source data in time order to obtain extracted data; performing data cleaning on the extracted data to obtain cleaned data; and inserting the cleaned data into the historical data corresponding to the same user to obtain processed user data.
- 4. The method according to claim 3, wherein performing data reconstruction on the processed user data set to obtain the reconstructed user data set comprises: for each item of processed user data in the processed user data set, performing the following steps: dividing the processed user data according to a preset time period group to obtain a divided user data sequence, wherein each item of divided user data in the divided user data sequence comprises the same attribute tag sequence; constructing, according to a preset attribute tag sequence, each item of divided user data in the divided user data sequence as a user behavior vector to obtain a user behavior vector group; and determining the user behavior vector group as reconstructed user data in the reconstructed user data set.
- 5. The method according to claim 4, wherein labeling each item of reconstructed user data in the reconstructed user data set according to the index system model in the user data lake to generate the high-dimensional user tag matrix set comprises: using the index system model in the user data lake, for each item of reconstructed user data in the reconstructed user data set, performing the following steps: performing binary classification on the reconstructed user data to obtain a classification label, wherein the classification label characterizes a retained channel class or a newly added user class; determining, according to the classification label, a data tag for the divided user data corresponding to each user behavior vector in the reconstructed user data to obtain a data tag set, wherein a classification label characterizing the retained channel class corresponds to a first tag coding group, and a classification label characterizing the newly added user class corresponds to a second tag coding group; sorting the data tags in the data tag set in time order to obtain a user time-series tag sequence set; and constructing the high-dimensional user tag matrix set from the obtained user time-series tag sequence sets.
- 6. The method according to claim 5, wherein performing user hierarchy division according to the dimension model in the user data lake and the high-dimensional user tag matrix set to generate the user hierarchy tag sequence set comprises: updating the user grading rule included in the dimension model by using the user behavior vectors included in the reconstructed user data set to obtain an updated dimension model, wherein the updated dimension model comprises an updated grading rule, and the updated grading rule comprises a user hierarchy identification group; and using the updated dimension model, for each high-dimensional user tag matrix in the high-dimensional user tag matrix set, performing the following step: performing user hierarchy division according to the updated grading rule in the updated dimension model and each user time-series tag in the high-dimensional user tag matrix to generate a user hierarchy tag sequence.
- 7. The method according to claim 6, wherein performing user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate the user traffic prediction information set comprises: for each high-dimensional user tag matrix in the high-dimensional user tag matrix set, performing the following steps according to the user hierarchy tag sequence set of the high-dimensional user tag matrix and a preset traffic prediction model, wherein the preset traffic prediction model comprises a basic feature extraction module, a hierarchy feature extraction module, a cross-level attention fusion module, a feature fusion module and a prediction output module: generating multi-dimensional semantic features according to the high-dimensional user tag matrix and the basic feature extraction module; generating a hierarchy feature sequence according to the user hierarchy tag sequence set and the hierarchy feature extraction module; generating an enhanced hierarchy feature sequence according to the multi-dimensional semantic features, the hierarchy feature sequence and the cross-level attention fusion module; generating multi-layer fusion features according to the multi-dimensional semantic features, the enhanced hierarchy feature sequence and the feature fusion module; and performing user traffic prediction according to the multi-layer fusion features and the prediction output module to generate user traffic prediction information.
- 8. A user traffic prediction device based on multi-model fusion, comprising: a data integration unit configured to perform data integration on an acquired source data set by using a data integration layer in a pre-built user data lake to obtain a processed user data set, wherein the user data lake further comprises an index system model and a dimension model; a data reconstruction unit configured to perform data reconstruction on the processed user data set to obtain a reconstructed user data set; a labeling unit configured to label each item of reconstructed user data in the reconstructed user data set according to the index system model in the user data lake to generate a high-dimensional user tag matrix set; a user hierarchy division unit configured to perform user hierarchy division according to the dimension model in the user data lake and the high-dimensional user tag matrix set to generate a user hierarchy tag sequence set, wherein the high-dimensional user tag matrices correspond one-to-one to the user hierarchy tag sequences, and each high-dimensional user tag matrix and its corresponding user hierarchy tag sequence represent a user portrait; and a user traffic prediction unit configured to perform user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate a user traffic prediction information set, and to perform a resource allocation operation according to the user traffic prediction information set.
- 9. An electronic device, comprising: one or more processors; and a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
- 10. A computer readable medium, characterized in that a computer program is stored thereon, wherein the program, when executed by a processor, implements the method according to any of claims 1-7.
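As a non-limiting illustration of the module chain recited in claim 7, the following minimal Python sketch passes one tag matrix and one hierarchy tag sequence through five stand-in stages. Everything concrete here is an assumption made for illustration and is not taken from the claims: the function names, the column-mean and sinusoidal feature extractors, the residual dot-product attention, the mean-pool-and-concatenate fusion, and the fixed output weights. A trained network would presumably replace each stand-in.

```python
import math
from typing import List

Vector = List[float]


def basic_features(tag_matrix: List[List[float]]) -> Vector:
    """Basic feature extraction (stand-in): column-wise mean over time steps."""
    return [sum(col) / len(tag_matrix) for col in zip(*tag_matrix)]


def level_features(level_seq: List[int], dim: int) -> List[Vector]:
    """Hierarchy feature extraction (stand-in): a fixed sinusoidal code per level tag."""
    return [[math.sin((lvl + 1) * (k + 1)) for k in range(dim)] for lvl in level_seq]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_level_attention(semantic: Vector, levels: List[Vector]) -> List[Vector]:
    """Cross-level attention fusion (stand-in).

    The semantic features act as the query and each hierarchy feature as a key.
    Each hierarchy feature is enhanced by adding the query scaled by its weight.
    """
    scale = math.sqrt(len(semantic))
    scores = [sum(q * k for q, k in zip(semantic, lvl)) / scale for lvl in levels]
    weights = softmax(scores)
    return [[x + w * q for x, q in zip(lvl, semantic)] for lvl, w in zip(levels, weights)]


def fuse(semantic: Vector, enhanced: List[Vector]) -> Vector:
    """Feature fusion (stand-in): concatenate semantic features with the mean-pooled sequence."""
    pooled = [sum(col) / len(enhanced) for col in zip(*enhanced)]
    return semantic + pooled


def predict(fused: Vector, weights: Vector, bias: float = 0.0) -> float:
    """Prediction output (stand-in): logistic score in (0, 1) from hypothetical weights."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def forward(tag_matrix: List[List[float]], level_seq: List[int], weights: Vector) -> float:
    """Run the five stages in the order recited in claim 7."""
    semantic = basic_features(tag_matrix)
    levels = level_features(level_seq, dim=len(semantic))
    enhanced = cross_level_attention(semantic, levels)
    return predict(fuse(semantic, enhanced), weights)


# Toy input: 3 time steps x 3 tag dimensions, and one level tag per time step.
tag_matrix = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
level_seq = [0, 1, 1]
score = forward(tag_matrix, level_seq, weights=[0.1] * 6)
print(f"traffic score: {score:.3f}")
```

The sketch keeps the stages as separate functions so that the data flow between modules matches the order of the claim; it says nothing about how the actual modules are parameterized or trained.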
Description
User traffic prediction method and device based on multi-model fusion
Technical Field
The embodiments of the present disclosure relate to the technical field of computers, in particular to the fields of user traffic prediction and user portrait construction, and specifically to a user traffic prediction method and device based on multi-model fusion.
Background
User traffic prediction is a technique for predicting trends in user variation. At present, user traffic prediction is generally implemented on a technical architecture of static rules and batch-processed data. However, this approach does not establish a hierarchical association mechanism between tags, so user tags become fragmented, which lowers the accuracy of user traffic prediction and in turn causes deviation in resource allocation.
Disclosure of Invention
This section is intended to introduce, in simplified form, concepts that are described further in the detailed description below. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter. Some embodiments of the present disclosure propose a user traffic prediction method and device based on multi-model fusion to solve the technical problems mentioned in the Background section above.
In a first aspect, some embodiments of the present disclosure provide a user traffic prediction method based on multi-model fusion. The method includes: performing data integration on an acquired source data set by using a data integration layer in a pre-built user data lake to obtain a processed user data set, wherein the user data lake further includes an index system model and a dimension model; performing data reconstruction on the processed user data set to obtain a reconstructed user data set; labeling each item of reconstructed user data in the reconstructed user data set according to the index system model in the user data lake to generate a high-dimensional user tag matrix set; performing user hierarchy division according to the dimension model in the user data lake and the high-dimensional user tag matrix set to generate a user hierarchy tag sequence set, wherein the high-dimensional user tag matrices correspond one-to-one to the user hierarchy tag sequences, and each high-dimensional user tag matrix and its corresponding user hierarchy tag sequence represent a user portrait; and performing user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate a user traffic prediction information set, and performing a resource allocation operation according to the user traffic prediction information set.
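The first four steps of the method (integration, reconstruction, labeling, hierarchy division) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the record fields (`user`, `ts`, `visits`, `orders`), the period boundaries, the tag code bases 100 and 200, the cutoff used to separate retained from newly added users, and the mean-based grading threshold are all hypothetical and do not come from the disclosure.

```python
from collections import defaultdict
from statistics import mean

ATTRS = ["visits", "orders"]        # hypothetical preset attribute tag sequence
PERIODS = [(0, 10), (10, 20)]       # hypothetical preset time periods, [start, end)
RETAINED_BASE, NEW_BASE = 100, 200  # hypothetical first / second tag coding groups


def integrate(source, history, watermark):
    """Incremental extraction by time, cleaning, then insertion into per-user history."""
    extracted = [r for r in source if isinstance(r.get("ts"), int) and r["ts"] > watermark]
    cleaned = [r for r in extracted
               if r.get("user") is not None
               and all(isinstance(r.get(a), (int, float)) for a in ATTRS)]
    for rec in sorted(cleaned, key=lambda r: r["ts"]):
        history[rec["user"]].append(rec)
    # The new watermark marks how far the source has been read.
    return max((r["ts"] for r in extracted), default=watermark)


def reconstruct(records):
    """Split one user's history by period and build one behavior vector per period."""
    return [[sum(r[a] for r in records if start <= r["ts"] < end) for a in ATTRS]
            for start, end in PERIODS]


def label(records, vectors, new_user_cutoff):
    """Binary classification (retained vs. newly added), then one tag code per attribute."""
    is_new = min(r["ts"] for r in records) >= new_user_cutoff
    base = NEW_BASE if is_new else RETAINED_BASE
    # Rows follow PERIODS, so the tag matrix is already in time order.
    return [[base + (1 if v > 0 else 0) for v in vec] for vec in vectors]


def update_rule(all_vectors):
    """Refresh the grading threshold from the observed behavior vectors."""
    return mean(sum(1 for v in vec if v > 0) for vec in all_vectors)


def divide_levels(tag_matrix, threshold):
    """Assign a level identifier to each time step of a tag matrix."""
    return ["L2" if sum(code % 100 for code in row) >= threshold else "L1"
            for row in tag_matrix]


source = [
    {"user": "u1", "ts": 1, "visits": 3, "orders": 1},
    {"user": "u1", "ts": 12, "visits": 2, "orders": 0},
    {"user": "u2", "ts": 15, "visits": 1, "orders": 0},
    {"user": None, "ts": 16, "visits": 9, "orders": 9},  # dropped by cleaning
]
history = defaultdict(list)
watermark = integrate(source, history, watermark=0)
vectors = {u: reconstruct(recs) for u, recs in history.items()}
matrices = {u: label(history[u], vectors[u], new_user_cutoff=10) for u in history}
threshold = update_rule([vec for vs in vectors.values() for vec in vs])
levels = {u: divide_levels(m, threshold) for u, m in matrices.items()}
print(matrices)
print(levels)
```

Each user ends up with one tag matrix and one level sequence of equal length, mirroring the one-to-one correspondence between tag matrices and hierarchy tag sequences described above.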
In a second aspect, some embodiments of the present disclosure provide a user traffic prediction device based on multi-model fusion. The device includes: a data integration unit configured to perform data integration on an acquired source data set by using a data integration layer in a pre-built user data lake to obtain a processed user data set, wherein the user data lake further includes an index system model and a dimension model; a data reconstruction unit configured to perform data reconstruction on the processed user data set to obtain a reconstructed user data set; a labeling unit configured to label each item of reconstructed user data in the reconstructed user data set according to the index system model in the user data lake to generate a high-dimensional user tag matrix set; a user hierarchy division unit configured to perform user hierarchy division according to the dimension model in the user data lake and the high-dimensional user tag matrix set to generate a user hierarchy tag sequence set, wherein the high-dimensional user tag matrices correspond one-to-one to the user hierarchy tag sequences, and each high-dimensional user tag matrix and its corresponding user hierarchy tag sequence represent a user portrait; and a user traffic prediction unit configured to perform user traffic prediction according to the high-dimensional user tag matrix set and the user hierarchy tag sequence set to generate a user traffic prediction information set, and to perform a resource allocation operation according to the user traffic prediction information set.
In a third aspect, some embodiments of the present disclosure provide an electronic device comprising one or more processors and a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect above. The embodiments of the present disclosure have the advantages that the accuracy of user traffic prediction can be improved and the deviation of