CN-122019626-A - User value evaluation method, device, equipment, medium and product
Abstract
The application provides a method, a device, equipment, a medium and a product for evaluating user value. The method comprises: encrypting a local user identifier to obtain a first encrypted identifier, and performing a secure intersection with a second encrypted identifier of a data holder to obtain intersection users; sending a query request to the data holder and obtaining statistical user quantities to construct a feature distribution data set; screening based on the feature distribution data set to obtain an optimal feature set; performing federated learning with the data holder according to the optimal feature set to construct a user value evaluation model; and obtaining a user set to be evaluated and inputting it into the user value evaluation model to obtain a user value evaluation result. In this way, while the data privacy of each participant is protected, modeling accuracy can be improved and user value can be evaluated accurately and efficiently, which supports precise marketing and improved user satisfaction.
Inventors
- YU PENGSHUAI
- GUO YE
- LAI CHUNJIANG
- JIANG XINLEI
- LI JIAO
Assignees
- 中移动信息技术有限公司
- 中国移动通信集团有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260130
Claims (20)
- 1. A method of evaluating user value, the method being applied to a data querying party, the method comprising: encrypting a locally held user identifier to obtain a first encrypted identifier, and performing a secure intersection operation based on the first encrypted identifier and a second encrypted identifier provided by a data holder to obtain intersection users jointly owned by the data querying party and the data holder; sending a query request to the data holder, and receiving fragment information returned by the data holder according to the query request, wherein the query request carries a plurality of features to be queried of a user and the condition to be met by each feature to be queried; determining, according to the query request and the fragment information, the statistical quantity of users among the intersection users for which each feature to be queried meets its corresponding condition; constructing a feature distribution data set based on a plurality of statistical quantities for the plurality of features to be queried; performing multicollinearity analysis and correlation evaluation based on the feature distribution data set, and screening out an optimal feature set; sending the optimal feature set to the data holder, and performing federated learning with the data holder based on the optimal feature set to jointly construct a user value evaluation model; and acquiring a user set to be evaluated, inputting the user set to be evaluated into the user value evaluation model, and obtaining a user value evaluation result output by the user value evaluation model.
- 2. The method of claim 1, wherein acquiring the user set to be evaluated, inputting the user set to be evaluated into the user value evaluation model, and obtaining the user value evaluation result output by the user value evaluation model comprises: acquiring the user set to be evaluated, constructing a polynomial commitment based on the user set to be evaluated, and sending the polynomial commitment to the data holder, wherein the data holder is configured to verify, according to the polynomial commitment, the batch property that the number of users in the user set to be evaluated is greater than a preset threshold; generating validity verification data based on the user set to be evaluated, and sending the validity verification data to the data holder, wherein the data holder is configured to verify the validity of the user set to be evaluated; and after the batch verification and the validity verification are passed, inputting the user set to be evaluated into the user value evaluation model to obtain the user value evaluation result output by the user value evaluation model.
- 3. The method of claim 1, wherein the fragment information comprises second secret fragments and a second statistical result fragment, and wherein sending the query request to the data holder, receiving the fragment information returned by the data holder according to the query request, and determining, according to the query request and the fragment information, the statistical quantity of users among the intersection users for which each feature to be queried meets its corresponding condition comprises: splitting the query request into a first secret fragment set, sending some of the first secret fragments in the first secret fragment set to the data holder, and receiving from the data holder some of the second secret fragments corresponding to the features; performing homomorphic computation on the first secret fragments remaining locally and the second secret fragments received from the data holder to obtain difference fragments between the conditions and the features; converting the difference fragments into flag-bit fragments through a secure comparison protocol, and homomorphically accumulating the flag-bit fragments of each of the intersection users to obtain a first statistical result fragment; and receiving the second statistical result fragment obtained by the data holder, merging the first statistical result fragment with the second statistical result fragment, and performing fragment reconstruction to recover the statistical quantity of users in plaintext.
- 4. The method of claim 1, wherein performing multicollinearity analysis and correlation evaluation based on the feature distribution data set, and screening out an optimal feature set comprises: constructing a linear regression model by taking a target feature as the dependent variable and the features in the feature distribution data set other than the target feature as independent variables, determining a coefficient of determination based on the linear regression model, and calculating a variance inflation factor value of the target feature based on the coefficient of determination; comparing the variance inflation factor value of each feature with a preset variance inflation factor threshold; deleting, according to the comparison result, all features whose variance inflation factor values are higher than the variance inflation factor threshold from the feature distribution data set to generate a de-redundant feature set; calculating, based on a preset user value label local to the data querying party, a correlation coefficient between each feature in the de-redundant feature set and the user value evaluation result; and screening out, from the de-redundant feature set, the features whose correlation coefficients are higher than a preset correlation coefficient threshold to form the optimal feature set.
- 5. The method of claim 1, wherein obtaining a set of users under evaluation, constructing a polynomial commitment based on the set of users under evaluation, and transmitting the polynomial commitment to the data holder comprises: Taking the user set to be evaluated as coefficients of a polynomial to construct the polynomial; generating a polynomial commitment of the polynomial based on elliptic curve cryptography and transmitting the polynomial commitment to the data holder.
- 6. The method of claim 5, wherein after acquiring the user set to be evaluated, constructing the polynomial commitment based on the user set to be evaluated, and sending the polynomial commitment to the data holder, the method further comprises: in response to a random challenge point sent by the data holder based on the polynomial commitment, calculating the value of the polynomial at the random challenge point, constructing a quotient polynomial based on the value, and generating a commitment to the quotient polynomial; and sending the value and the commitment to the quotient polynomial to the data holder, wherein the data holder is configured to verify, according to the value and the commitment to the quotient polynomial, the batch property that the number of users in the user set to be evaluated is greater than a preset threshold.
- 7. The method of claim 1, wherein generating validity verification data based on the user set to be evaluated and sending the validity verification data to the data holder comprises: performing recursive hash calculation on the user set to be evaluated to construct a Merkle tree, and calculating a root hash of the Merkle tree; generating, based on the Merkle tree, a corresponding Merkle path for each user in the user set to be evaluated; and sending the root hash, all the Merkle paths and the user set to be evaluated to the data holder as the validity verification data, wherein the data holder is configured to verify the validity of the user set to be evaluated based on the validity verification data.
- 8. The method of claim 3, wherein, in the case that there are a plurality of data holders, receiving from the data holder some of the second secret fragments corresponding to the features comprises: receiving Reed-Solomon encoded fragments from each data holder, wherein each encoded fragment is generated by the corresponding data holder by encoding all of its local second secret fragments into a polynomial; constructing an error locator polynomial based on the encoded fragments using the Berlekamp-Welch algorithm; calculating the roots of the error locator polynomial to locate the evaluation point of a malicious data holder that provided an erroneous second secret fragment; constructing a set of repair equations based on the evaluation point and the redundant information of the encoded fragments; and solving the set of repair equations to recover the correct second secret fragment corresponding to the malicious data holder.
- 9. The method according to any one of claims 1-8, wherein after the batch verification and the validity verification are passed, the user set to be evaluated is input into the user value evaluation model, and after the user value evaluation result output by the user value evaluation model is obtained, the method further comprises: dividing the users in the user set to be evaluated into different value grades according to the user value evaluation result; and executing differentiated marketing strategies for users of different value grades.
- 10. A method of evaluating user value, the method being applied to a data holder, the method comprising: encrypting a locally held user identifier to obtain a second encrypted identifier, and performing a secure intersection operation based on the second encrypted identifier and a first encrypted identifier provided by a data querying party to obtain intersection users jointly owned by the data querying party and the data holder; receiving a query request sent by the data querying party, generating fragment information according to the query request, and sending the fragment information to the data querying party, wherein the query request carries a plurality of features to be queried of a user and the condition to be met by each feature to be queried; and receiving an optimal feature set sent by the data querying party, and performing federated learning with the data querying party based on the optimal feature set to jointly construct a user value evaluation model, wherein the data querying party is configured to input a user set to be evaluated into the user value evaluation model to obtain a user value evaluation result output by the user value evaluation model.
- 11. The method of claim 10, wherein after jointly constructing the user value evaluation model, the method further comprises: receiving a polynomial commitment that is constructed based on the user set to be evaluated and sent by the data querying party, and verifying, according to the polynomial commitment, the batch property that the number of users in the user set to be evaluated is greater than a preset threshold; receiving validity verification data that is generated based on the user set to be evaluated and sent by the data querying party, and verifying the validity of the user set to be evaluated according to the validity verification data; and after the batch verification and the validity verification are passed, sending a verification-passed indication to the data querying party, wherein the data querying party is configured to input the user set to be evaluated into the user value evaluation model after receiving the verification-passed indication, to obtain the user value evaluation result output by the user value evaluation model.
- 12. The method of claim 10, wherein the fragment information comprises second secret fragments and a second statistical result fragment, and wherein receiving the query request sent by the data querying party, generating the fragment information according to the query request, and sending the fragment information to the data querying party comprises: receiving from the data querying party some of the first secret fragments corresponding to the query request; splitting the locally held user features into a second secret fragment set, and sending some of the second secret fragments in the second secret fragment set to the data querying party; performing homomorphic computation on the second secret fragments remaining locally and the first secret fragments received from the data querying party to obtain difference fragments between the conditions and the features; converting the difference fragments into flag-bit fragments through a secure comparison protocol, and homomorphically accumulating the flag-bit fragments of each of the intersection users to obtain the second statistical result fragment; and sending the second statistical result fragment to the data querying party, wherein the data querying party is configured to merge the second statistical result fragment with a first statistical result fragment obtained by the data querying party and perform fragment reconstruction to recover the statistical quantity of users in plaintext.
- 13. The method of claim 12, wherein, in the case that there are a plurality of data holders, after splitting the locally held user features into the second secret fragment set, the method further comprises: generating a random number, sending each second secret fragment in the second secret fragment set, together with the generated random number, to the other data holders, and receiving from the other data holders their second secret fragments and the random numbers they generated; encoding all second secret fragments held locally into a polynomial through Reed-Solomon encoding to generate encoded fragments with redundant information; and sending the encoded fragments to the data querying party, wherein the data querying party is configured to use the Berlekamp-Welch algorithm on the encoded fragments to locate the evaluation point of a malicious data holder that provided an erroneous second secret fragment, to construct a set of repair equations based on the evaluation point and the encoded fragments, and to recover the correct second secret fragment by solving the set of repair equations.
- 14. The method of claim 10, wherein verifying, according to the polynomial commitment, the batch property that the number of users in the user set to be evaluated is greater than a preset threshold comprises: sending a random challenge point to the data querying party based on the polynomial commitment; receiving a value and a commitment to a quotient polynomial that are determined based on the random challenge point and returned by the data querying party; and verifying, using bilinear pairing, whether the polynomial commitment, the random challenge point, the value and the commitment to the quotient polynomial satisfy a preset equation, so as to verify the batch property that the number of users in the user set to be evaluated is greater than the preset threshold.
- 15. The method of claim 10, wherein the validity verification data comprises a root hash of a Merkle tree generated by the data querying party based on the user set to be evaluated, and Merkle paths of all users in the user set to be evaluated, and wherein verifying the validity of the user set to be evaluated according to the validity verification data comprises: for each user in the user set to be evaluated, verifying the validity of the user by reconstructing the hash path based on the Merkle path corresponding to the user and the root hash.
- 16. An apparatus for evaluating user value, the apparatus being applied to a data querying party, the apparatus comprising: a first secure intersection module, configured to encrypt a locally held user identifier to obtain a first encrypted identifier, and to perform a secure intersection operation based on the first encrypted identifier and a second encrypted identifier provided by a data holder to obtain intersection users jointly owned by the data querying party and the data holder; a sending module, configured to send a query request to the data holder and to receive fragment information returned by the data holder according to the query request, wherein the query request carries a plurality of features to be queried of a user and the condition to be met by each feature to be queried; a determining module, configured to determine, according to the query request and the fragment information, the statistical quantity of users among the intersection users for which each feature to be queried meets its corresponding condition; a construction module, configured to construct a feature distribution data set based on a plurality of statistical quantities for the plurality of features to be queried; a screening module, configured to perform multicollinearity analysis and correlation evaluation based on the feature distribution data set and to screen out an optimal feature set; a first model construction module, configured to send the optimal feature set to the data holder and to perform federated learning with the data holder based on the optimal feature set to jointly construct a user value evaluation model; and a value evaluation module, configured to input a user set to be evaluated into the user value evaluation model to obtain a user value evaluation result output by the user value evaluation model.
- 17. An apparatus for evaluating user value, the apparatus being applied to a data holder, the apparatus comprising: a second secure intersection module, configured to encrypt a locally held user identifier to obtain a second encrypted identifier, and to perform a secure intersection operation based on the second encrypted identifier and a first encrypted identifier provided by a data querying party to obtain intersection users jointly owned by the data querying party and the data holder; a receiving module, configured to receive a query request sent by the data querying party, to generate fragment information according to the query request, and to send the fragment information to the data querying party, wherein the query request carries a plurality of features to be queried of a user and the condition to be met by each feature to be queried; and a second model construction module, configured to receive an optimal feature set sent by the data querying party and to perform federated learning with the data querying party based on the optimal feature set to jointly construct a user value evaluation model, wherein the data querying party is configured to input a user set to be evaluated into the user value evaluation model to obtain a user value evaluation result output by the user value evaluation model.
- 18. Network device comprising a processor, a memory and a program stored on the memory and executable on the processor, the program implementing the steps of a method for evaluating a user value according to any of claims 1 to 9 when being executed by the processor or the steps of a method for evaluating a user value according to any of claims 10 to 15 when being executed by the processor.
- 19. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of a method of evaluating a user value according to any one of claims 1 to 9, or which computer program, when being executed by the processor, implements the steps of a method of evaluating a user value according to any one of claims 10 to 15.
- 20. A computer program product comprising computer instructions which, when executed by the processor, implement the steps of a method of assessing a user's value according to any one of claims 1 to 9, or which, when executed by the processor, implement the steps of a method of assessing a user's value according to any one of claims 10 to 15.
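As an illustration of the counting step in claims 3 and 12, the following minimal sketch assumes a two-party additive secret-sharing scheme over a prime field; the claims do not name a concrete scheme, so the modulus `P`, the helper names and the toy flag bits are all hypothetical. The secure comparison protocol is not implemented here: the per-user flag bits are simply given, split into fragments, accumulated locally by each party, and reconstructed into the plaintext count.

```python
import secrets

# Hypothetical field modulus (the Mersenne prime 2**61 - 1); not from the source.
P = 2**61 - 1


def split(value):
    """Split a value into two additive fragments that sum to it modulo P."""
    first = secrets.randbelow(P)
    second = (value - first) % P
    return first, second


def reconstruct(first, second):
    """Merge two fragments to recover the plaintext value."""
    return (first + second) % P


# Toy flag bits: 1 if an intersection user meets the queried condition.
# In the claimed method these bits exist only in fragment form.
flags = [1, 0, 1, 1, 0]
fragments = [split(bit) for bit in flags]

# Each party accumulates only its own fragments (addition is linear).
first_result_fragment = sum(f[0] for f in fragments) % P
second_result_fragment = sum(f[1] for f in fragments) % P

# The querying party merges both result fragments to obtain the count.
user_count = reconstruct(first_result_fragment, second_result_fragment)
```

Neither result fragment alone reveals the count, because each one is uniformly distributed modulo `P`; only their sum is meaningful.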
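The screening in claim 4 can be sketched as follows, under stated assumptions: the variance inflation factor is computed as 1 / (1 - R^2) from an ordinary least-squares regression with intercept, the correlation coefficient is taken to be Pearson's, and the feature names, toy columns, labels and thresholds are hypothetical values chosen only for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """Coefficient of determination of y regressed on columns (with intercept)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(features):
    """Variance inflation factor of every feature against all the others."""
    result = {}
    for name, column in features.items():
        others = [c for n, c in features.items() if n != name]
        r2 = r_squared(column, others)
        result[name] = float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long columns."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def screen(features, labels, vif_max, corr_min):
    """Drop high-VIF features, then keep those correlated with the labels."""
    scores = vif(features)
    kept = {n: c for n, c in features.items() if scores[n] <= vif_max}
    return [n for n, c in kept.items() if pearson(c, labels) > corr_min]


# Hypothetical toy data: "minutes" is almost exactly twice "calls".
features = {
    "calls": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "minutes": [2.11, 3.78, 6.11, 8.05, 9.90, 12.05],
    "data_gb": [5.0, 3.0, 6.0, 2.0, 7.0, 1.0],
}
labels = [50.0, 32.0, 61.0, 20.0, 68.0, 12.0]  # local user value labels

scores = vif(features)
optimal = screen(features, labels, vif_max=10.0, corr_min=0.5)
```

Following the wording of the claim, every feature above the variance inflation factor threshold is deleted, so both members of the near-collinear pair are removed here; a variant that deletes one feature at a time and recomputes would retain one of them. The solver assumes the regressors are not exactly collinear.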
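The challenge and response in claims 6 and 14 rest on an algebraic fact that can be checked without any cryptography: for a polynomial p and a challenge point z, the quotient q(x) = (p(x) - p(z)) / (x - z) is itself a polynomial. The sketch below shows only that relation over a prime field; the elliptic-curve commitment and the bilinear pairing check are not implemented, and the modulus, the coefficients (stand-ins for encoded user identifiers) and the function names are hypothetical.

```python
# Hypothetical field modulus (the Mersenne prime 2**61 - 1); not from the source.
P = 2**61 - 1


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from highest degree down) at x."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % P
    return acc


def divide_by_linear(coeffs, z):
    """Synthetic division by (x - z): returns (quotient coefficients, p(z))."""
    acc = 0
    partial = []
    for c in coeffs:
        acc = (acc * z + c) % P
        partial.append(acc)
    return partial[:-1], partial[-1]


# Toy polynomial 3x^3 + 2x + 5 whose coefficients stand in for user entries.
coeffs = [3, 0, 2, 5]
challenge = 2  # random challenge point chosen by the data holder

quotient, value = divide_by_linear(coeffs, challenge)


def relation_holds(r):
    """Check p(r) == q(r) * (r - challenge) + value at a test point r."""
    lhs = evaluate(coeffs, r)
    rhs = (evaluate(quotient, r) * (r - challenge) + value) % P
    return lhs == rhs
```

In a pairing-based scheme the verifier would check this same identity on commitments rather than on plaintext coefficients, so the polynomial itself is never revealed.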
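The Merkle tree construction and path verification of claims 7 and 15 can be sketched as below. The hash function (SHA-256), the rule of duplicating the last node on odd-sized levels, and the user identifiers are assumptions made for the example; the claims do not specify them. The sketch assumes a non-empty user set.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption, not stated in the source."""
    return hashlib.sha256(data).digest()


def pad(level):
    """Duplicate the last node when a level has an odd number of nodes."""
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(users):
    """Return all tree levels, from the leaf hashes up to the root."""
    level = [h(u.encode()) for u in users]
    levels = [level]
    while len(level) > 1:
        padded = pad(level)
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def merkle_path(levels, index):
    """Collect (sibling hash, sibling_is_left) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        padded = pad(level)
        sibling = index ^ 1
        path.append((padded[sibling], sibling < index))
        index //= 2
    return path


def verify(user, path, root):
    """Recompute the hash path for one user and compare it with the root."""
    node = h(user.encode())
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical user identifiers in the set to be evaluated.
users = ["u1", "u2", "u3", "u4", "u5"]
levels = build_levels(users)
root = levels[-1][0]
paths = [merkle_path(levels, i) for i in range(len(users))]
```

The data holder needs only the root hash, a user entry and that user's path to run `verify`; a path is logarithmic in the size of the user set.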
Description
User value evaluation method, device, equipment, medium and product
Technical Field
The embodiments of the application relate to the technical field of data processing, and in particular to a method, a device, equipment, a medium and a product for evaluating user value.
Background
In the current digital age, user value evaluation has become an important means for enterprises to achieve precise marketing and improve customer satisfaction. As business volume keeps growing and diversifying, effectively identifying and analyzing user value, and then providing targeted services, has become an inevitable trend in enterprise transformation and development. However, existing user value evaluation methods have a number of technical problems. Many of them rely mainly on rule-based screening and on the collection and analysis of personal information. Such methods are often limited to low-dimensional information and non-standardized data, so user value cannot be judged accurately. Moreover, while some enterprises try to analyze users' historical data and build user portraits with data warehouses and data analysis tools, this approach risks privacy disclosure when processing sensitive information. Even when user data is encrypted in storage and in transit, it must be decrypted for computation, so the risk of privacy disclosure remains. Meanwhile, the prior art cannot effectively obtain highly correlated features, so the accuracy of user value scores is low and the actual needs of enterprises are hard to meet. The attribute data held by different enterprises is diverse and differs in format, and using these huge data sets directly often incurs high computation and communication costs.
In addition, traditional rule-based screening and data analysis methods often depend on surface-level user features and struggle to capture users' complex behavior patterns and potential needs, so the evaluation results are unsatisfactory. In summary, the prior art suffers from high privacy risk, low evaluation accuracy and insufficient processing efficiency in user value evaluation.
Disclosure of Invention
The embodiments of the application provide a user value evaluation method, device, equipment, medium and product, which are used to solve the technical problems of high privacy risk, low evaluation accuracy and insufficient processing efficiency in user value evaluation in the prior art.
To solve the above technical problems, the application is implemented as follows:
In a first aspect, an embodiment of the application provides a method for evaluating user value, where the method is applied to a data querying party, and the method includes:
encrypting a locally held user identifier to obtain a first encrypted identifier, and performing a secure intersection operation based on the first encrypted identifier and a second encrypted identifier provided by a data holder to obtain intersection users jointly owned by the data querying party and the data holder;
sending a query request to the data holder, and receiving fragment information returned by the data holder according to the query request, wherein the query request carries a plurality of features to be queried of a user and the condition to be met by each feature to be queried;
determining, according to the query request and the fragment information, the statistical quantity of users among the intersection users for which each feature to be queried meets its corresponding condition;
constructing a feature distribution data set based on a plurality of statistical quantities for the plurality of features to be queried;
performing multicollinearity analysis and correlation evaluation based on the feature distribution data set, and screening out an optimal feature set;
sending the optimal feature set to the data holder, and performing federated learning with the data holder based on the optimal feature set to jointly construct a user value evaluation model;
and acquiring a user set to be evaluated, inputting the user set to be evaluated into the user value evaluation model, and obtaining a user value evaluation result output by the user value evaluation model.
Optionally, acquiring the user set to be evaluated, inputting the user set to be evaluated into the user value evaluation model, and obtaining the user value evaluation result output by the user value evaluation model includes:
acquiring the user set to be evaluated, constructing a polynomial commitment based on the user set to be evaluated, and sending the polynomial commitment to the data holder, wherein the data holder is configured to verify, according to the polynomial commitment, the batch property that the number of users in the user set to be evaluated is greater than a preset threshold according to the polynomial commitm