CN-116011594-B - Non-sampling factorization machine service method under a vertical federated learning architecture

CN116011594B

Abstract

The invention discloses a service method for a non-sampling factorization machine under a vertical federated learning architecture, belonging to the field of recommendation systems. The method comprises: initializing the local models of the two parties participating in vertical federated learning, and computing their respective auxiliary vectors from the initialized local models; a trusted third party transmitting a homomorphic-encryption public key to both parties; one party transmitting its encrypted model and auxiliary vector to the other party, so that the other party obtains the encrypted predicted value of the vertical federated factorization machine algorithm; the receiving party computing the loss function from local data and the received data; both parties solving their respective encrypted gradients and transmitting them to the trusted third party for decryption; both parties updating their models after receiving the decrypted gradients; and using the converged models for recommendation tasks. The method solves the technical problem of poor training efficiency of implicit-feedback factorization machine recommendation models under a vertical federated learning architecture when trained without negative sampling.

Inventors

  • HUANG HAI
  • GUO HONGZUO

Assignees

  • Harbin University of Science and Technology (哈尔滨理工大学)

Dates

Publication Date
2026-05-08
Application Date
2022-12-08

Claims (5)

  1. A non-sampling factorization machine service method under a vertical federated learning architecture, characterized by comprising the following steps: S1, company A and company B each initialize a local model; that is, company A initializes parameters w_A, v_A and calculates a vector P_A based on them, and company B initializes parameters w_B, v_B and calculates a vector Q_B based on them; the calculation involves the feature vector of company A, the feature vector of company B, the auxiliary neuron weights h_aux of the prediction layer, and the number of hidden factors; P_A and Q_B depend only on the feature data local to company A and company B, respectively; the local models of companies A and B participating in the joint modelling each carry a global bias, per-variable weights for each of the respective numbers of features in the data of companies A and B, and the hidden vectors v_A and v_B representing the second-order interactions within company A's data and within company B's data, respectively; the prediction layer carries neuron weights both for the sum of the internal feature-interaction terms of companies A and B and for the cross-company feature-interaction terms; S2, the trusted third-party server C sends the public key pub to companies A and B, and company B encrypts the intermediate result Q_B to obtain [[Q_B]] and transmits it to company A; S3, company A receives the encrypted intermediate result [[Q_B]] transmitted by company B and, from P_A, [[Q_B]], and h_aux, solves the encrypted prediction [[ŷ]] of the vertical federated factorization machine algorithm; company A sends [[P_A]] and [[ŷ]] back to company B; S4, companies A and B each calculate the loss function L using their local data and the data transmitted by the other party; company A solves the encrypted gradients of its global bias, linear weights, and hidden vectors, and company B likewise; S5, companies A and B respectively upload their encrypted parameter gradients to the third-party server C for decryption, and C returns the results to companies A and B, which update their parameters by gradient descent; S6, steps S2 to S5 are repeated until the model converges.
  2. The non-sampling factorization machine service method under the vertical federated learning architecture according to claim 1, characterized in that the encryption in step S2 is performed as [[Q_B]] = Enc_pub(Q_B), where Enc_pub is an additive homomorphic encryption function using the public key pub and [[Q_B]] is the result after homomorphic encryption.
  3. The non-sampling factorization machine service method under the vertical federated learning architecture according to claim 1, characterized in that in step S3 the predicted value [[ŷ]] of the vertical federated factorization machine algorithm is solved from P_A, [[Q_B]], and h_aux, where [[x_A]] and [[x_B]] denote the homomorphic encryptions, under the public key pub, of the feature vector of company A and the feature vector of company B, h_aux denotes the auxiliary neuron weights of the prediction layer, and P_A and Q_B depend only on the feature data local to companies A and B, respectively.
  4. The non-sampling factorization machine service method under the vertical federated learning architecture according to claim 1, characterized in that in step S4 the loss function L is calculated over U, the data set of company B, and the data set of items with positive feedback in company A; the predicted value is evaluated for the u-th element of the data set U and the v-th element of the positive-feedback set; the summation further runs over the hidden factors, and involves the i-th and j-th components of the vectors P_A, Q_B, and h_aux, respectively.
  5. The non-sampling factorization machine service method under the vertical federated learning architecture according to claim 1, characterized in that in step S5 companies A and B first receive the gradient data decrypted at the C end, and then update their respective parameters using the received data, as follows: S5.1, companies A and B receive the results decrypted by the C end using the additive homomorphic decryption function, company A receiving the decrypted gradients of its global bias, linear weights, and hidden vectors, and company B receiving the decrypted gradients of its global bias, linear weights, and hidden vectors; S5.2, the parameters are updated by gradient descent, each parameter being decreased by the learning rate η times its decrypted gradient, where the global biases are those of the local models of companies A and B participating in the joint modelling, w_A and w_B are the linear weights of the local models of companies A and B, respectively, and v_A and v_B are the hidden vectors of the local models of companies A and B, respectively.
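The auxiliary vectors P_A and Q_B of claim 1 exploit a standard factorization-machine identity: all cross-company second-order interactions collapse into a single inner product of two locally computable vectors. A minimal Python sketch of that identity only (toy data; the patent's auxiliary prediction-layer weights h_aux are omitted, and every name other than P_A and Q_B is invented for illustration):

```python
import random

random.seed(0)
d = 4                                    # number of hidden factors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rand_vec(n):
    return [random.random() for _ in range(n)]

# Toy local data/parameters for company A (3 features) and B (5 features)
x_A, w0_A, w_A = rand_vec(3), 0.1, rand_vec(3)
v_A = [rand_vec(d) for _ in range(3)]    # one hidden vector per A-feature
x_B, w0_B, w_B = rand_vec(5), 0.2, rand_vec(5)
v_B = [rand_vec(d) for _ in range(5)]

def agg(x, v):
    """Auxiliary vector: factor-weighted sum of purely local features."""
    return [sum(v[i][f] * x[i] for i in range(len(x))) for f in range(d)]

def pairwise(x, v):
    """Within-party second-order FM interactions (O(n*d) identity)."""
    s = agg(x, v)
    sq = sum(v[i][f] ** 2 * x[i] ** 2
             for i in range(len(x)) for f in range(d))
    return 0.5 * (dot(s, s) - sq)

P_A, Q_B = agg(x_A, v_A), agg(x_B, v_B)  # each depends on local data only

# Cross-party interactions collapse to a single inner product <P_A, Q_B>
y_split = (w0_A + w0_B + dot(w_A, x_A) + dot(w_B, x_B)
           + pairwise(x_A, v_A) + pairwise(x_B, v_B) + dot(P_A, Q_B))

# Reference: an ordinary FM run on the concatenated feature vector
x, v, w = x_A + x_B, v_A + v_B, w_A + w_B
y_full = w0_A + w0_B + dot(w, x) + pairwise(x, v)
assert abs(y_split - y_full) < 1e-9
```

Because P_A and Q_B are the only quantities the parties must exchange for the cross terms, encrypting Q_B (step S2) suffices to keep B's raw features private.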
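Claim 2 requires an additive homomorphic encryption function under the public key pub. The patent does not name a concrete scheme; the Paillier cryptosystem is the usual choice, and a toy, deliberately insecure sketch of its additive property (demo-sized primes, fixed generator g = n + 1) looks like:

```python
import math
import random

def keygen(p=1_000_003, q=1_000_033):
    """Toy Paillier keys (tiny primes: illustration only, not secure)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)     # Carmichael's lambda(n)
    mu = pow(lam, -1, n)             # valid because we fix g = n + 1
    return n, (lam, mu, n)

def encrypt(pub_n, m):
    """[[m]] = g^m * r^n mod n^2, with g = n + 1 so g^m = 1 + m*n."""
    n2 = pub_n * pub_n
    r = random.randrange(1, pub_n)
    return (1 + m * pub_n) * pow(r, pub_n, n2) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n

def add_enc(pub_n, c1, c2):
    """Additive homomorphism: Enc(a) * Enc(b) mod n^2 decrypts to a + b."""
    return c1 * c2 % (pub_n * pub_n)

pub, priv = keygen()
c = add_enc(pub, encrypt(pub, 123), encrypt(pub, 456))
assert decrypt(priv, c) == 579
```

This is exactly the property step S2 relies on: B can ship [[Q_B]] to A, and sums over ciphertexts remain decryptable only by the key holder C.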
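Claim 3 has company A combine its plaintext P_A with B's ciphertext [[Q_B]]. Under an additive homomorphic scheme, raising a ciphertext to a plaintext power multiplies the underlying message (Enc(q)^p decrypts to p·q), so A can form the encrypted inner product [[⟨P_A, Q_B⟩]] without decrypting anything. A hedged sketch reusing the same toy Paillier scheme (non-negative integer vectors only; real deployments need fixed-point encoding and secure parameters):

```python
import math
import random

# -- minimal toy Paillier (illustration only, not secure) --
def keygen(p=1_000_003, q=1_000_033):
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n), n)

def encrypt(n, m):
    n2 = n * n
    return (1 + m * n) * pow(random.randrange(1, n), n, n2) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

# -- step S3: A computes [[<P_A, Q_B>]] from plaintext P_A and [[Q_B]] --
def enc_dot(n, p_vec, enc_q):
    """Enc(q_i)^(p_i) = Enc(p_i * q_i); multiplying ciphertexts adds them."""
    n2 = n * n
    acc = 1                                   # Enc(0) with blinding r = 1
    for p_i, c_i in zip(p_vec, enc_q):
        acc = acc * pow(c_i, p_i, n2) % n2    # accumulate p_i * q_i
    return acc

n, priv = keygen()
P_A = [3, 1, 4]                 # A's plaintext auxiliary vector (toy values)
Q_B = [2, 7, 1]                 # B's auxiliary vector, received encrypted
enc_Q = [encrypt(n, q) for q in Q_B]
assert decrypt(priv, enc_dot(n, P_A, enc_Q)) == 17   # 3*2 + 1*7 + 4*1
```

Only the key holder (the coordinator C in the patent's protocol) can recover the plaintext inner product.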
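Claim 4's exact loss formula did not survive extraction, but "non-sampling" training in the recommendation literature generally means a whole-data weighted squared loss: every user-item pair contributes, with unobserved pairs down-weighted by a small confidence weight instead of being negatively sampled. An illustrative sketch of that general idea only (not the patent's formula; all names and the weight c0 are invented):

```python
def non_sampling_loss(preds, positives, c0=0.05):
    """Whole-data weighted squared loss.

    preds: dict mapping (user, item) -> model score for EVERY pair;
    positives: set of (user, item) pairs with observed implicit feedback.
    Positives target 1 with full confidence; all other pairs target 0
    with low confidence c0 -- no negative sampling is performed.
    """
    loss = 0.0
    for (u, v), y_hat in preds.items():
        if (u, v) in positives:
            loss += (1.0 - y_hat) ** 2          # observed pair, weight 1
        else:
            loss += c0 * (0.0 - y_hat) ** 2     # unobserved pair, weight c0
    return loss

preds = {("u1", "i1"): 0.9, ("u1", "i2"): 0.2, ("u2", "i1"): 0.4}
pos = {("u1", "i1")}
expected = (1 - 0.9) ** 2 + 0.05 * 0.2 ** 2 + 0.05 * 0.4 ** 2
assert abs(non_sampling_loss(preds, pos) - expected) < 1e-12
```

Summing over all pairs is what makes naive non-sampling training expensive, and is the efficiency problem the patent's reformulation targets.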
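Step S5.2 of claim 5 is ordinary gradient descent applied per parameter group once C has returned the decrypted gradients. A minimal sketch (toy values; the dict keys follow the claim's grouping into global bias, linear weights, and hidden vectors, but are otherwise invented):

```python
def sgd_update(params, grads, lr=0.01):
    """One gradient-descent step: theta <- theta - lr * grad, per group."""
    return {name: [p - lr * g for p, g in zip(vec, grads[name])]
            for name, vec in params.items()}

# Company A's parameter groups and the decrypted gradients returned by C
params_A = {"bias": [0.5], "linear": [0.1, -0.2], "hidden": [0.3, 0.4]}
grads_A  = {"bias": [1.0], "linear": [0.5,  0.5], "hidden": [-1.0, 2.0]}

new_A = sgd_update(params_A, grads_A, lr=0.1)
assert abs(new_A["bias"][0] - 0.4) < 1e-12      # 0.5 - 0.1 * 1.0
assert abs(new_A["linear"][0] - 0.05) < 1e-12   # 0.1 - 0.1 * 0.5
assert abs(new_A["linear"][1] + 0.25) < 1e-12   # -0.2 - 0.1 * 0.5
```

Company B performs the symmetric update on its own groups; neither party ever sees the other's plaintext gradients.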

Description

Non-sampling factorization machine service method under a vertical federated learning architecture

Technical Field

The invention relates to the technical field of recommendation systems, and in particular to a non-sampling factorization machine service method under a vertical federated learning architecture.

Background

The factorization machine is a general framework that combines the flexibility of feature engineering with the high-precision prediction of latent-factor models. Federated learning, as a new machine-learning paradigm, allows different participants to collaboratively build intelligent systems without revealing private data, addressing both privacy and data-sparsity issues. Vertical federated learning applies to federated scenarios in which the participants' data sets share the same sample space but have different feature spaces; it can be understood as federated learning partitioned by features. Suppose two companies A and B want to cooperatively train a machine-learning recommendation model, each holding its own data. For example, consider a social portal website and an e-commerce website. The social portal holds only implicit feedback of users on commodities, such as click behaviour, together with personal information such as occupation and gender, and the user sets of companies A and B largely overlap. The e-commerce website wants to use the social portal's data set for joint training with its own data, while preserving user privacy, so as to provide customized recommendation services. For reasons of user privacy and data security, party A and party B cannot exchange data directly, and a third-party coordinator C is needed to guarantee data confidentiality during training.
C is a semi-honest third party whose main role is to assist the participants in secure federated learning. C is independent of the participants: it collects the intermediate results needed for gradient and loss-value computation and forwards the results to each participant. The information received from the parties is encrypted or obfuscated, so the parties' original data are never exposed to each other, and each party receives only the model parameters related to the features it holds. In this scenario, providing more accurate recommendations requires modelling context features in addition to user-item interactions, but traditional non-sampled federated factorization methods are inefficient.

Disclosure of the Invention

(I) Technical problem to be solved

In view of the deficiencies of the prior art, the invention provides a non-sampling factorization machine service method under a vertical federated learning architecture. Two companies A and B wish to cooperatively train a machine-learning recommendation model, each holding its own data, with the same sample space but different feature spaces across the two data sets. The invention combines vertical federated learning with a non-sampling factorization machine: vertical federated learning guarantees data privacy, while an improved non-sampling factorization machine model is used for training. The traditional factorization machine model is converted into a matrix-factorization form by mathematical transformation, and the loss function is optimized, which solves the problem of poor efficiency of non-sampling factorization machine recommendation frameworks under the vertical federated learning architecture.
(II) Technical scheme

To achieve the above purpose, the invention provides a non-sampling factorization machine service method under a vertical federated learning architecture, comprising the following steps:

S1: Companies A and B each initialize a local model; company A initializes parameters w_A, v_A and calculates a vector P_A based on them, and company B initializes parameters w_B, v_B and calculates a vector Q_B based on them.

S2: The trusted third-party server C sends the public key pub to companies A and B; company B encrypts the intermediate result Q_B to obtain [[Q_B]] and transmits it to company A.

S3: Company A receives the encrypted intermediate result [[Q_B]] transmitted by company B and obtains the encrypted prediction [[ŷ]] of the vertical federated factorization machine algorithm from P_A, [[Q_B]], and h_aux; company A sends [[P_A]] and [[ŷ]] back to company B.

S4: Companies A and B each calculate the loss function L using their local data and the data transmitted by the other party, and each solves its encrypted gradients.

S5: Companies A and B upload their encrypted parameter gradients to the third-party server C for decryption; C returns the results, and both parties update their parameters by gradient descent.

S6: Steps S2 to S5 are repeated until the model converges.