
CN-121981775-A - User behavior data processing method and device, electronic equipment and computer program product


Abstract

The application provides a method and a device for processing user behavior data, an electronic device, and a computer program product, and belongs to the field of advertisement data processing. The method comprises: acquiring and integrating heterogeneous data from a plurality of data sources; assigning entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in a predefined evaluation period, the entity sets comprising a main entity set and an aggregate entity set; generating a combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity, the combined feature set comprising first-type features obtained from the behavior data and attribute data of all users in the corresponding entity set within a preset time period and second-type features obtained from historical time series data of the corresponding entity set; and inputting the combined feature set into a trained machine learning model to obtain a target index related to the entity set.

Inventors

  • ZHANG XU
  • LI TAO

Assignees

  • 麒麟合盛网络技术股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-30

Claims (10)

  1. A method for processing user behavior data, the method comprising the steps of: acquiring and integrating heterogeneous data from a plurality of data sources, wherein the heterogeneous data comprises behavior data of a user within a preset time period, attribute data of the user, and advertisement delivery data related to the user; assigning entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in a predefined evaluation period, the entity sets comprising a main entity set and an aggregate entity set; generating a combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity, wherein the combined feature set comprises first-type features obtained from the behavior data and attribute data of all users in the corresponding entity set within the preset time period and second-type features obtained from historical time series data of the corresponding entity set; and inputting the combined feature set into a trained machine learning model to obtain a target index related to the entity set.
  2. The method of claim 1, wherein the preset time period is a limited time window starting from when the user installs the application or completes a specified event.
  3. The method according to claim 1 or 2, wherein the entity is an advertising campaign, and wherein assigning the entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in the predefined evaluation period comprises: assigning the advertising campaigns that satisfy a preset revenue contribution condition and/or a user scale condition to the main entity set, and assigning the remaining advertising campaigns to the aggregate entity set.
  4. The method according to claim 1, wherein generating the combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity comprises the steps of: computing behavior statistical features from the behavior data within the preset time period; and applying historical-target-based encoding to the categorical variables in the attribute data.
  5. The method according to claim 1, wherein generating the combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity comprises the steps of: based on the historical time series data, performing a time-shift operation and calculating statistics over one or more rolling time windows preceding the current processing time point, the historical time series data including at least one of historical return on investment, lifetime value, or cost.
  6. The method of claim 5, wherein the second-type features further comprise ratio features between the corresponding statistics of rolling time windows of different lengths.
  7. The method of claim 1, wherein, in inputting the combined feature set into the trained machine learning model to obtain the target index related to the entity set, the machine learning model is configured to simultaneously output a plurality of predicted values corresponding to different future time horizons or different business objectives.
  8. A device for processing user behavior data, the device comprising: a data integration module configured to acquire and integrate heterogeneous data from a plurality of data sources, wherein the heterogeneous data comprises behavior data of a user within a preset time period, attribute data of the user, and advertisement delivery data related to the user; an entity grouping module configured to assign entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in a predefined evaluation period, the entity sets comprising a main entity set and an aggregate entity set; a feature generation module configured to generate a combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity, wherein the combined feature set comprises first-type features obtained from the behavior data and attribute data of all users in the corresponding entity set within the preset time period and second-type features obtained from historical time series data of the corresponding entity set; and an index output module configured to input the combined feature set into a trained machine learning model to obtain a target index related to the entity set.
  9. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method of any one of claims 1 to 7.
  10. A computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, implement the steps of the method of any one of claims 1 to 7.
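
The rolling-window features of claims 5 and 6 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `rolling_features`, the default window lengths, the choice of the mean as the statistic, and the acceptance of partial windows are all assumptions made for illustration. The time shift is modeled by computing each statistic only from values strictly before the current time point, so the current value never enters its own features.

```python
from statistics import mean


def rolling_features(series, short=3, long=7):
    """Build shifted rolling-window features for one entity set.

    `series` holds one historical value per day (for example ROI, lifetime
    value, or cost), oldest first. All names and defaults are hypothetical.
    """
    rows = []
    for t in range(len(series)):
        # Time shift: only values strictly before time point t are visible.
        history = series[:t]
        # Assumption: partial windows are allowed while history is short.
        short_mean = mean(history[-short:]) if history else None
        long_mean = mean(history[-long:]) if history else None
        # Ratio feature between windows of different lengths (claim 6);
        # left undefined when there is no history or the long mean is zero.
        ratio = short_mean / long_mean if long_mean else None
        rows.append(
            {"short_mean": short_mean, "long_mean": long_mean, "ratio": ratio}
        )
    return rows
```

Calling `rolling_features(daily_roi, short=3, long=7)` on a list of daily ROI values returns one feature row per day. A ratio above 1 suggests that the recent window is running above the longer-term level, which is the kind of trend signal a ratio feature is meant to capture.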

Description

User behavior data processing method and device, electronic equipment and computer program product

Technical Field

The present application relates to the field of advertisement data processing, and in particular to a method, an apparatus, an electronic device, and a computer program product for processing user behavior data.

Background

In the field of mobile internet advertising, advertisers rely on accurate predictions of long-term return on investment (ROI) for budget allocation and optimization. Conventional approaches typically rely on historical revenue series data for a number of days after user installation and use a time series model to extrapolate future revenue. However, such methods lag significantly: they cannot provide effective long-term ROI predictions early in delivery, so advertisers struggle to adjust their strategies in time and miss high-quality traffic windows. They also lack stable modeling capability for long-tail advertising campaigns with sparse samples.

Disclosure of Invention

The embodiments of the present application provide a method and a device for processing user behavior data, an electronic device, and a computer program product, which can reliably estimate the long-term advertising effect at the initial stage of user behavior.
In a first aspect, an embodiment of the present application provides a method for processing user behavior data, the method comprising the following steps: acquiring and integrating heterogeneous data from a plurality of data sources, wherein the heterogeneous data comprises behavior data of a user within a preset time period, attribute data of the user, and advertisement delivery data related to the user; assigning entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in a predefined evaluation period, the entity sets comprising a main entity set and an aggregate entity set; generating a combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity, wherein the combined feature set comprises first-type features obtained from the behavior data and attribute data of all users in the corresponding entity set within the preset time period and second-type features obtained from historical time series data of the corresponding entity set; and inputting the combined feature set into a trained machine learning model to obtain a target index related to the entity set.
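
The entity grouping step can be sketched as follows, assuming the entities are advertising campaigns. This is an illustrative sketch only: the `Campaign` fields, the function name `group_campaigns`, and both threshold values are hypothetical, and the two conditions are combined with a logical "or", which is just one reading of the "and/or" wording used in the claims.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    campaign_id: str
    revenue: float  # revenue attributed to the campaign in the evaluation period
    users: int      # users acquired by the campaign in the evaluation period


def group_campaigns(campaigns, min_revenue_share=0.05, min_users=1000):
    """Split campaigns into a main entity set and an aggregate entity set.

    Thresholds are hypothetical; the source does not specify their values.
    """
    total_revenue = sum(c.revenue for c in campaigns)
    groups = {"main": [], "aggregate": []}
    for c in campaigns:
        # Revenue contribution of this campaign relative to all campaigns.
        share = c.revenue / total_revenue if total_revenue > 0 else 0.0
        # Assumption: meeting either condition is enough for the main set.
        if share >= min_revenue_share or c.users >= min_users:
            groups["main"].append(c.campaign_id)
        else:
            groups["aggregate"].append(c.campaign_id)
    return groups
```

Under this reading, each campaign in the main set would be modeled on its own, while the campaigns in the aggregate set would be pooled so that sparse long-tail campaigns share one larger sample when features are computed.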
In a second aspect, an embodiment of the present application provides a device for processing user behavior data, the device comprising: a data integration module configured to acquire and integrate heterogeneous data from a plurality of data sources, wherein the heterogeneous data comprises behavior data of a user within a preset time period, attribute data of the user, and advertisement delivery data related to the user; an entity grouping module configured to assign entities related to advertisement delivery to different entity sets according to the performance indicators of the entities in a predefined evaluation period, the entity sets comprising a main entity set and an aggregate entity set; a feature generation module configured to generate a combined feature set based on the integrated heterogeneous data with the entity set as the operation granularity, wherein the combined feature set comprises first-type features obtained from the behavior data and attribute data of all users in the corresponding entity set within the preset time period and second-type features obtained from historical time series data of the corresponding entity set; and an index output module configured to input the combined feature set into a trained machine learning model to obtain a target index related to the entity set.
In a third aspect, an embodiment of the present application provides an electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method for processing user behavior data according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, implement the steps of the method for processing user behavior data according to the first aspect.
According to the embodiments of the present application, by using the behavior data of the user within the preset time period, future long-term indices can be predicted directly at the initial stage of advertisement delivery, without relying on data accumulated over multiple days, and decision support can be provided for real-time adjustment