CN-121996845-A - User barrel dividing method and device

Abstract

Embodiments of this specification relate to a method and a device for user bucketing. The method comprises multiple rounds of iteration. Any one round comprises: obtaining the current division points, and the corresponding current combination of division intervals, for each of multiple attribute dimensions of users, where the attribute dimensions are selected from one or more of frequency attributes, activity attributes, consumption attributes, and interaction attributes; for a first division point among the current division points of any first attribute dimension, searching for candidate points within a target range of the first division point; and, if a first candidate point that satisfies an optimization condition is found, updating the first division point to the first candidate point. The optimization condition includes that the value of an objective function decreases after the target user group is bucketed again according to the updated combination of division intervals determined from the first candidate point. The objective function measures the degree of dispersion of the number of users across the buckets.

Inventors

  • LI XINJIA
  • LIU ZHEHAO
  • ZHAO FENG

Assignees

  • 支付宝(杭州)数字服务技术有限公司

Dates

Publication Date
2026-05-08
Application Date
2026-01-27

Claims (17)

  1. A method for user bucketing, comprising multiple rounds of iteration, any one round comprising: obtaining current division points, and a corresponding current combination of division intervals, for each of a plurality of attribute dimensions of users, wherein the attribute dimensions are selected from one or more of a frequency attribute, an activity attribute, a consumption attribute, and an interaction attribute; and, for a first division point among the current division points of any first attribute dimension, searching for candidate points within a target range of the first division point, and, if a first candidate point satisfying an optimization condition is found, updating the first division point to the first candidate point, wherein the optimization condition comprises that a value of an objective function decreases after a target user group is bucketed again according to an updated combination of division intervals determined based on the first candidate point, and the objective function measures the degree of dispersion of the number of users among a plurality of buckets.
  2. The method of claim 1, wherein, when the round is the first round, obtaining the current division points of each of the plurality of attribute dimensions of users comprises: performing equal-frequency division on any attribute dimension to obtain a plurality of candidate points, wherein the equal-frequency division makes the number of users contained in each interval the same; and selecting several candidate points from the plurality of candidate points of the attribute dimension as the current division points.
  3. The method of claim 2, wherein selecting several candidate points from the plurality of candidate points of the attribute dimension as the current division points comprises: randomly selecting several candidate points from the plurality of candidate points as the current division points; or uniformly selecting several candidate points from the plurality of candidate points as the current division points.
  4. The method of claim 1, wherein the target range of the first division point includes each candidate point lying between the preceding division point and the following division point of the first division point in the sequence of current division points ordered by value.
  5. The method of claim 4, wherein the target range of the first division point includes each candidate point that lies between the preceding division point and the following division point and is no more than k candidate points away from the first division point.
  6. The method of claim 1, wherein bucketing the target user group again according to the updated combination of division intervals determined based on the first candidate point comprises: replacing the first division point with the first candidate point and updating the combination of division intervals of the first attribute dimension; and determining a plurality of buckets from the Cartesian product of the respective combinations of division intervals of the attribute dimensions, and assigning each user in the target user group to the corresponding bucket.
  7. The method of claim 1, wherein the objective function comprises any one of: the standard deviation of the number of users in each bucket; the quotient of the standard deviation and the average of the number of users in each bucket; and the mean absolute error of the number of users in each bucket.
  8. The method of claim 1, wherein the optimization condition further comprises that the number of users in any bucket after the re-bucketing operation is greater than a first threshold and less than a second threshold.
  9. The method of claim 1, wherein the optimization condition further comprises that the number of users in any bucket after the re-bucketing operation is greater than a first threshold, and that the proportion of buckets whose number of users is greater than a second threshold does not exceed a first proportion.
  10. The method of claim 8 or 9, wherein the optimization condition further comprises that the proportion of buckets whose number of users is less than a third threshold after the re-bucketing operation does not exceed a second proportion.
  11. The method of claim 8 or 9, wherein the optimization condition further comprises that the number of non-empty buckets after the re-bucketing operation is greater than a first number.
  12. The method of claim 1, wherein the round of iteration further comprises: for a second division point among the current division points of the first attribute dimension, searching for candidate points within a second target range of the second division point; and, if a combination of a first candidate point and a second candidate point satisfying the optimization condition is found, updating the first division point to the first candidate point and the second division point to the second candidate point.
  13. The method of claim 12, wherein, in the sequence of current division points ordered by value, the division point preceding the second division point is the first division point and the division point following it is a third division point, and the second target range of the second division point includes the candidate points between the temporary value of the first division point in the search and the third division point.
  14. The method of claim 12, wherein, in the sequence of current division points ordered by value, the division point preceding the second division point is a fourth division point and the division point following it is the first division point, and the second target range of the second division point includes the candidate points between the fourth division point and the temporary value of the first division point in the search.
  15. An apparatus for user bucketing, configured to perform multiple rounds of iteration, comprising: an acquisition unit configured to obtain current division points, and a corresponding current combination of division intervals, for each of a plurality of attribute dimensions of users, wherein the attribute dimensions are selected from one or more of a frequency attribute, an activity attribute, a consumption attribute, and an interaction attribute; and a search unit configured to, for a first division point among the current division points of any first attribute dimension, search for candidate points within a target range of the first division point, and, if a first candidate point satisfying an optimization condition is found, update the first division point to the first candidate point, wherein the optimization condition comprises that a value of an objective function decreases after a target user group is bucketed again according to an updated combination of division intervals determined based on the first candidate point, and the objective function measures the degree of dispersion of the number of users among a plurality of buckets.
  16. A computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1 to 14.
  17. A computing device comprising a memory and a processor, wherein the memory stores executable code which, when executed by the processor, implements the method of any one of claims 1 to 14.
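
The bucketing step of claim 6, the objective functions of claim 7, and the size constraint of claim 8 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical, intervals are assumed to be closed on the left (a value equal to a division point falls into the upper interval), the standard deviation is taken as the population standard deviation, and "mean absolute error" is interpreted as the mean absolute deviation of the bucket counts from their average.

```python
from bisect import bisect_right
from collections import Counter
from itertools import product
from statistics import mean, pstdev


def assign_buckets(users, division_points):
    """Return the number of users in every bucket.

    users: list of tuples, one value per attribute dimension.
    division_points: one sorted list of division points per dimension.
    A bucket is one cell of the Cartesian product of the per-dimension
    intervals (assumption: intervals are closed on the left).
    """
    counts = Counter(
        tuple(bisect_right(points, value)
              for points, value in zip(division_points, user))
        for user in users
    )
    # Enumerate every cell so that empty buckets are counted as zero.
    cells = product(*(range(len(points) + 1) for points in division_points))
    return [counts.get(cell, 0) for cell in cells]


def std_dev(counts):
    """Population standard deviation of the bucket sizes."""
    return pstdev(counts)


def coefficient_of_variation(counts):
    """Standard deviation divided by the average bucket size.

    Raises ZeroDivisionError if there are no users at all.
    """
    return pstdev(counts) / mean(counts)


def mean_absolute_error(counts):
    """Mean absolute deviation of bucket sizes from their average (assumed reading)."""
    average = mean(counts)
    return mean(abs(c - average) for c in counts)


def within_size_limits(counts, first_threshold, second_threshold):
    """Claim 8 style constraint: every bucket size lies strictly between the thresholds."""
    return all(first_threshold < c < second_threshold for c in counts)
```

For example, four users with two attributes each, `[(1, 10), (2, 20), (3, 30), (4, 40)]`, and division points `[[3], [25]]` produce the bucket counts `[2, 0, 0, 2]`: two of the four Cartesian cells are empty, so a constraint that requires more than zero users per bucket would reject this combination.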

Description

User barrel dividing method and device

Technical Field

One or more embodiments of this specification relate to the field of data processing, and in particular to a method and apparatus for user bucketing.

Background

With the rapid development of Internet services and digital platforms, the scale and dimensionality of user data have grown exponentially, and understanding user characteristics efficiently has become key to improving user experience. User profiling, an important means of user modeling, builds a structured representation of a user by integrating the user's attributes, and thereby provides basic support for downstream applications such as personalized recommendation, precision marketing, and risk control. To further improve processing efficiency and policy interpretability, user groups are typically bucketed: users with similar features or behavior patterns are placed in the same category (a "bucket"), so that a unified policy or model can be applied at the group level. However, bucketing results in the related art have uneven numbers of users across buckets, which degrades the performance of downstream tasks. A method is therefore needed to improve the uniformity of user bucketing results.

Disclosure of Invention

One or more embodiments of this specification describe a method and apparatus for user bucketing that improve the uniformity of user bucketing results.
In a first aspect, a method for user bucketing is provided, comprising multiple rounds of iteration, any one round comprising: obtaining current division points, and a corresponding current combination of division intervals, for each of a plurality of attribute dimensions of users, wherein the attribute dimensions are selected from one or more of a frequency attribute, an activity attribute, a consumption attribute, and an interaction attribute; and, for a first division point among the current division points of any first attribute dimension, searching for candidate points within a target range of the first division point, and, if a first candidate point satisfying an optimization condition is found, updating the first division point to the first candidate point, wherein the optimization condition comprises that a value of an objective function decreases after a target user group is bucketed again according to an updated combination of division intervals determined based on the first candidate point, and the objective function measures the degree of dispersion of the number of users among a plurality of buckets.
In some possible implementations, when the round is the first round, obtaining the current division points of each of the plurality of attribute dimensions of users comprises: performing equal-frequency division on any attribute dimension to obtain a plurality of candidate points, wherein the equal-frequency division makes the number of users contained in each interval the same; and selecting several candidate points from the plurality of candidate points of the attribute dimension as the current division points.
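
The first-round initialization and a single-point search round described above can be sketched as follows. This is a minimal Python sketch under assumptions, not the method as claimed in full: all function names are hypothetical, the objective is taken to be the population standard deviation of the bucket counts, the target range is every candidate strictly between the neighbouring division points, each division point moves to the best candidate found in its range, and the loop stops when a round brings no improvement (the stopping rule is an assumption).

```python
from bisect import bisect_right
from collections import Counter
from itertools import product
from statistics import pstdev


def equal_frequency_candidates(values, n_intervals):
    """Candidate points that cut the sorted values into intervals of (nearly) equal user count."""
    ordered = sorted(values)
    n = len(ordered)
    return sorted({ordered[i * n // n_intervals] for i in range(1, n_intervals)})


def pick_uniform(candidates, n_points):
    """Choose n_points candidates spread evenly over the list (needs n_points <= len(candidates))."""
    step = len(candidates) / (n_points + 1)
    return [candidates[int(step * (i + 1))] for i in range(n_points)]


def objective(users, points_per_dim):
    """Population standard deviation of user counts over the Cartesian-product buckets."""
    counts = Counter(
        tuple(bisect_right(p, v) for p, v in zip(points_per_dim, user))
        for user in users
    )
    cells = product(*(range(len(p) + 1) for p in points_per_dim))
    return pstdev([counts.get(cell, 0) for cell in cells])


def run_iteration(users, points_per_dim, candidates_per_dim):
    """One round: try to move each division point to a better candidate between its neighbours."""
    current = [list(p) for p in points_per_dim]
    best = objective(users, current)
    for d, candidates in enumerate(candidates_per_dim):
        for i in range(len(current[d])):
            lo = current[d][i - 1] if i > 0 else float("-inf")
            hi = current[d][i + 1] if i + 1 < len(current[d]) else float("inf")
            for c in candidates:
                if not lo < c < hi:
                    continue  # outside the target range of this division point
                trial = [list(p) for p in current]
                trial[d][i] = c
                value = objective(users, trial)
                if value < best:  # optimization condition: the objective decreases
                    current, best = trial, value
    return current, best


def optimize(users, points_per_dim, candidates_per_dim, max_rounds=20):
    """Repeat rounds until no round lowers the objective (assumed stopping rule)."""
    points = [list(p) for p in points_per_dim]
    best = objective(users, points)
    for _ in range(max_rounds):
        new_points, value = run_iteration(users, points, candidates_per_dim)
        if value >= best:
            break
        points, best = new_points, value
    return points, best
```

With one attribute taking the values 1 to 12 and six equal-frequency intervals, the candidate points are 3, 5, 7, 9, and 11. Starting from the division points `[3, 9]`, the search moves the first point to 5 and ends at `[5, 9]`, which gives three buckets of four users each. Starting from `[3, 5]`, it stops at `[3, 7]` with bucket sizes 2, 4, and 6, because no single-point move lowers the objective further; moving two division points together, as in the joint search of a first and a second division point, is one way such a stall might be avoided.
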
In some possible implementations, selecting several candidate points from the plurality of candidate points of the attribute dimension as the current division points comprises: randomly selecting several candidate points from the plurality of candidate points as the current division points; or uniformly selecting several candidate points from the plurality of candidate points as the current division points.
In some possible embodiments, the target range of the first division point includes each candidate point lying between the preceding division point and the following division point of the first division point in the sequence of current division points ordered by value.
In some possible embodiments, the target range of the first division point includes each candidate point that lies between the preceding division point and the following division point and is no more than k candidate points away from the first division point.
In some possible embodiments, bucketing the target user group again according to the updated combination of division intervals determined based on the first candidate point comprises: replacing the first division point with the first candidate point and updating the combination of division intervals of the first attribute dimension; and determining a plurality of buckets from the Cartesian product of the respective combinations of division intervals of the attribute dimensions, and assigning each user in the target user group to the corresponding bucket.
In some possible embodiments, the objective function includes any one of: the standard deviation of the number of users in each bucket, the quotient of the standard deviation and the average of the number of users in each bucket, and the mean absolute error of the number of users in each bu