CN-122027117-A - Large-model federated fine-tuning privacy protection method and system based on selective homomorphic encryption
Abstract
The invention discloses a large-model federated fine-tuning privacy protection method and system based on selective homomorphic encryption. Local heterogeneous data are used to evaluate model-parameter sensitivity, the parameters to be selectively encrypted are negotiated and their number is capped, which reduces the ciphertext communication overhead of homomorphic encryption while still effectively protecting data privacy. In addition, each client exchanges the positions of model parameters and encrypts only the negotiated subset, reducing privacy risk while keeping computation efficient. The server aggregates the plaintext and ciphertext results separately, and the client fuses them, which guarantees correct aggregation while avoiding homomorphic-encryption ciphertext expansion. The method combines homomorphic encryption to improve the privacy of federated large-model fine-tuning, controls the resulting overhead, and can effectively resist state-of-the-art gradient inversion attacks.
Inventors
- YAN LI
- LIU JIANMIN
- LI BORUI
Assignees
- Xi'an Jiaotong University (西安交通大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260316
Claims (10)
- 1. A large-model federated fine-tuning privacy protection method based on selective homomorphic encryption, characterized by comprising the following steps: S1, performing model-parameter sensitivity evaluation according to local heterogeneous data and an encryption budget, determining a local high-sensitivity parameter subset to be encrypted, and applying order-preserving encryption to the positions and sensitivities of the parameters to be encrypted; S2, negotiating the set of parameters to be encrypted according to the global encryption budget and the parameter sensitivities to obtain a global encryption parameter set; S3, intercepting a subset of the global encryption parameter set according to the local encryption budget to serve as the local final encryption parameter subset; S4, iteratively updating the model parameters, and performing column-position exchange and partial encryption of the model parameters according to the local encryption parameter subset; S5, separately aggregating the plaintext model parameters and the partially encrypted ciphertext model parameters, and generating an aggregated result; S6, decrypting the aggregated ciphertext model parameters with the locally known encryption parameter subset, fusing the plaintext and ciphertext parameters using a reparameterization technique, and starting a new round of federated learning training; S7, repeating steps S4 to S6 until the model converges, completing federated fine-tuning privacy protection of the large model.
- 2. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 1, wherein: S1, a client performs model-parameter sensitivity evaluation according to local heterogeneous data and an encryption budget, determines a local high-sensitivity parameter subset to be encrypted, and sends the positions and sensitivities of the parameters to be encrypted to an aggregation server using order-preserving encryption; S2, the aggregation server negotiates the set of parameters to be encrypted according to the global encryption budget and the parameter sensitivities to obtain a global encryption parameter set; S3, the client intercepts a subset of the global encryption parameter set according to the local encryption budget to serve as the local final encryption parameter subset; S4, the client iteratively updates the model parameters, performs column-position exchange and partial encryption of the model parameters according to the local encryption parameter subset, and sends the partially encrypted plaintext and ciphertext model parameters to the server; S5, the server separately aggregates the plaintext model parameters and the partially encrypted ciphertext model parameters, generates an aggregated result, and sends it to the client; S6, the client decrypts the aggregated ciphertext model parameters with the locally known encryption parameter subset, fuses the plaintext and ciphertext parameters using a reparameterization technique, and starts a new round of federated learning training; and S7, steps S4 to S6 are repeated, with model training performed on each client and model-parameter aggregation performed on the server, until the model converges, thereby completing large-model federated fine-tuning privacy protection.
- 3. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 2, wherein the client performing model-parameter sensitivity assessment according to the local heterogeneous data and encryption budget, determining the local high-sensitivity parameter subset to be encrypted, and sending the positions and sensitivities of the parameters to be encrypted to the aggregation server using order-preserving encryption; the aggregation server negotiating the set of parameters to be encrypted according to the global encryption budget and the parameter sensitivities to obtain the global encryption parameter set; and the client intercepting a subset of the global encryption parameter set according to the local encryption budget as the local final encryption parameter subset, specifically comprises the following steps: the client performs local training on the local heterogeneous data, evaluates the sensitivity of the model parameters on that data, and uploads the parameter sensitivities and their positions to the server under order-preserving encryption; the server negotiates a global encryption parameter set from the personalized parameter subsets to be encrypted of all clients, also considering the parameters most sensitive across all clients and the parameters important to most clients; based on the global encryption parameter set, the client intercepts a subset conforming to its local encryption budget to obtain its local encryption parameter subset.
- 4. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 2, wherein the client iteratively updating the model parameters, performing column-position exchange and partial encryption of the model parameters according to the local encryption parameter subset, and transmitting the partially encrypted plaintext and ciphertext model parameters to the server; the server separately aggregating the partially encrypted plaintext and ciphertext model parameters, generating an aggregated result, and transmitting it to the client; and the client decrypting the aggregated ciphertext model parameters with the locally known encryption parameter subset, fusing the plaintext and ciphertext parameters using a reparameterization technique, and starting a new round of federated learning training, specifically comprises: after each client finishes its local parameter update, it performs column-position exchange according to the encryption parameter subset, then performs selective model-parameter encryption, and sends the plaintext and ciphertext model parameters to the aggregation server; the aggregation server aggregates the plaintext and ciphertext parameters separately to obtain plaintext and ciphertext aggregation results and returns them to the client, wherein the plaintext parameters require SVD decomposition; the client decrypts the received ciphertext aggregate parameters, restores the true positions of the parameters using the reparameterization technique, and fuses the plaintext and ciphertext aggregation results.
- 5. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 2, wherein the client-side local parameter-sensitivity evaluation may be performed one or more times during the whole training process as required; in a client's first federated-learning local training iteration, the following formula (not reproduced in the source text) is used to obtain the model-parameter column sensitivity for each client, wherein the variables denote, respectively: the model parameter located at a given row and column; the sensitivity of the corresponding parameter column; the feature of a given dimension of the local data; and the LoRA rank.
- 6. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 2, wherein the aggregation server negotiates the global encryption parameter set not only according to each client's personalized parameter subset to be encrypted, but also considering the parameter subset most sensitive across all clients and the parameter subset important to most clients; the aggregation server aggregates the plaintext and ciphertext separately, and SVD decomposition is performed on the plaintext aggregation result.
- 7. The large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to claim 6, wherein the client performs column-position exchange again on the aggregated result to restore the correct parameter positions, and performs feature fusion through the SVD matrix-decomposition technique.
- 8. A large-model federated fine-tuning privacy protection system based on selective homomorphic encryption, characterized by comprising: a parameter sensitivity evaluation unit, in which sensitivity evaluation is performed on the updated model parameters based on the local heterogeneous data, the server negotiates a global encryption parameter set according to the parameter sensitivities of the different clients, and returns to each client an encryption parameter subset that meets its budget; a model parameter confusion unit, in which each client performs parameter column-position exchange according to the encryption positions of its local encryption parameter subset so as to obfuscate the original model parameters; a model parameter encryption unit, in which the client iteratively updates the model parameters and partially encrypts them using the encryption parameter subset; a model parameter aggregation unit, in which the client transmits the partially encrypted plaintext and ciphertext model parameters to the server and the server aggregates them separately; a model parameter decryption unit, in which the client receives the plaintext and ciphertext model parameters sent by the server and decrypts the ciphertext model parameters; and a model convergence unit, in which the client performs feature fusion of the plaintext parameters and the decrypted ciphertext parameters so as to start the next round of federated learning training, until the model converges.
- 9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to any one of claims 1-7.
- 10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the large-model federated fine-tuning privacy protection method based on selective homomorphic encryption according to any one of claims 1-7.
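As a rough illustration of the sensitivity-reporting and negotiation steps (claims 1-3 and 6), the Python sketch below uses synthetic sensitivity scores as a stand-in for the patent's column-sensitivity formula, which is not reproduced in this text. The voting heuristic (favor columns proposed by the most clients, tie-broken by total sensitivity) and all function names are assumptions for illustration, not the patent's exact procedure, and the order-preserving encryption of the uploads is omitted.

```python
# Hypothetical sketch: clients rank parameter columns by sensitivity,
# the server negotiates a global encryption set under a global budget,
# and each client intercepts a budget-sized local subset of it.
from collections import Counter

def propose_columns(sensitivity, local_budget):
    """Step S1: keep the local_budget most sensitive columns."""
    ranked = sorted(range(len(sensitivity)),
                    key=lambda j: sensitivity[j], reverse=True)
    return ranked[:local_budget]

def negotiate_global_set(proposals, sensitivities, global_budget):
    """Step S2 (and claim 6): favour columns proposed by most clients,
    breaking ties by total sensitivity across clients."""
    votes = Counter(c for p in proposals for c in p)
    total = {c: sum(s[c] for s in sensitivities) for c in votes}
    ranked = sorted(votes, key=lambda c: (votes[c], total[c]), reverse=True)
    return set(ranked[:global_budget])

def local_final_subset(global_set, sensitivity, local_budget):
    """Step S3: intercept a budget-sized subset of the global set,
    preferring the locally most sensitive columns."""
    ranked = sorted(global_set, key=lambda j: sensitivity[j], reverse=True)
    return set(ranked[:local_budget])

# Two clients, six parameter columns, toy sensitivity scores.
sens = [[0.9, 0.1, 0.8, 0.2, 0.7, 0.1],
        [0.2, 0.9, 0.8, 0.1, 0.6, 0.1]]
proposals = [propose_columns(s, 3) for s in sens]
global_set = negotiate_global_set(proposals, sens, global_budget=4)
finals = [local_final_subset(global_set, s, 2) for s in sens]
print(global_set, finals)
```

Note how the negotiated set contains both columns that every client flagged and columns that only one client considered highly sensitive, mirroring the claim-6 requirement to account for the most sensitive subset of all clients as well as the subset important to most clients.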
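The per-round exchange-encrypt-aggregate-fuse loop of claims 4-7 can be sketched in Python as a single simulated round. The additive "mask" cipher below is a toy stand-in for a real additively homomorphic scheme such as Paillier, used only to show why plaintext and ciphertext must be aggregated separately; all function names are illustrative, and the SVD decomposition that the patent applies to the plaintext aggregate is omitted for brevity.

```python
# Hypothetical single-round sketch of claims 4-7: shared column
# permutation, selective encryption, separate server-side aggregation,
# then client-side decryption, position restoration, and fusion.
MASK = 1_000_003.0  # toy shared secret standing in for an HE keypair

def encrypt(x):
    # Toy additive homomorphism: a sum of k ciphertexts decrypts by
    # subtracting k * MASK, so the server can add without decrypting.
    return x + MASK

def decrypt_sum(c, k):
    return c - k * MASK

def permute_and_split(params, enc_cols, perm):
    """Step S4: column-position exchange, then selective encryption
    of the negotiated columns only."""
    shuffled = [params[j] for j in perm]
    plain = {i: v for i, v in enumerate(shuffled) if perm[i] not in enc_cols}
    cipher = {i: encrypt(v) for i, v in enumerate(shuffled) if perm[i] in enc_cols}
    return plain, cipher

def server_aggregate(plains, ciphers):
    """Step S5: aggregate plaintext (averaged) and ciphertext
    (summed homomorphically, additions only) streams separately."""
    n = len(plains)
    agg_plain = {i: sum(p[i] for p in plains) / n for i in plains[0]}
    agg_cipher = {i: sum(c[i] for c in ciphers) for i in ciphers[0]}
    return agg_plain, agg_cipher

def client_fuse(agg_plain, agg_cipher, perm, n):
    """Steps S6 and claim 7: decrypt, restore true column positions,
    and fuse the plaintext and decrypted ciphertext parameters."""
    fused_shuffled = dict(agg_plain)
    for i, c in agg_cipher.items():
        fused_shuffled[i] = decrypt_sum(c, n) / n
    out = [0.0] * len(perm)
    for i, v in fused_shuffled.items():
        out[perm[i]] = v  # shuffled position i holds true column perm[i]
    return out

# Two clients, four columns; columns 1 and 3 were negotiated for
# encryption; perm is the column exchange shared by all clients.
enc_cols, perm = {1, 3}, [2, 0, 3, 1]
updates = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
splits = [permute_and_split(u, enc_cols, perm) for u in updates]
agg_plain, agg_cipher = server_aggregate([p for p, _ in splits],
                                         [c for _, c in splits])
fused = client_fuse(agg_plain, agg_cipher, perm, n=len(updates))
print(fused)  # element-wise average of the two updates, in true order
```

Because the server never holds the permutation or the key, it sees neither the true column positions nor the sensitive columns' values, while the separate plaintext/ciphertext aggregation keeps ciphertext expansion confined to the small encrypted subset.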
Description
Large-model federated fine-tuning privacy protection method and system based on selective homomorphic encryption

Technical Field
The invention belongs to the technical fields of large-language-model fine-tuning, federated learning, and homomorphic encryption, and in particular relates to a large-model federated fine-tuning privacy protection method and system based on selective homomorphic encryption.

Background
Large language models (LLMs) achieve excellent performance in many task scenarios, but when deployed in private domains (medical, financial, etc.), effective scene adaptation is often impossible owing to data-privacy constraints. Federated learning (FL) is becoming increasingly popular because it enables collaborative machine-learning model training without exposing users' sensitive data. However, several studies have shown that model parameters transmitted during FL training can be exploited by gradient inversion attacks to reconstruct users' private data. To defend against such attacks, currently proposed methods include differential privacy (DP), secure multi-party computation (SMC), homomorphic encryption (HE), and the like. DP perturbs the transmitted model parameters with noise and has strong theoretical security, but can significantly reduce model accuracy. SMC enables multiple participants to perform distributed computing tasks correctly without exposing private information, but requires complex computation and synchronization protocols, making it unsuitable for devices with heterogeneous data and system capabilities and for large-model scenarios. In contrast, HE receives a great deal of attention due to its strong privacy-preserving capability and its support for computation over ciphertext.
However, homomorphic encryption incurs relatively large communication and computational overhead due to the many complex encryption operations involved. Especially in scenarios with huge parameter counts, such as large language models, the aggregation process is often accompanied by severe ciphertext expansion, and traditional homomorphic encryption methods cannot achieve an optimal balance between security and encryption overhead.

Disclosure of Invention
The invention aims to provide a large-model federated fine-tuning privacy protection method and system based on selective homomorphic encryption, so as to solve the prior-art problems of high homomorphic-encryption communication and computation cost and the imbalance between security and encryption overhead in large-model fine-tuning scenarios with heterogeneous clients. To achieve the above purpose, the technical scheme adopted by the invention is as follows: the large-model federated fine-tuning privacy protection method based on selective homomorphic encryption comprises the following steps: S1, performing model-parameter sensitivity evaluation according to local heterogeneous data and an encryption budget, determining a local high-sensitivity parameter subset to be encrypted, and applying order-preserving encryption to the positions and sensitivities of the parameters to be encrypted; S2, negotiating the set of parameters to be encrypted according to the global encryption budget and the parameter sensitivities to obtain a global encryption parameter set; S3, intercepting a subset of the global encryption parameter set according to the local encryption budget to serve as the local final encryption parameter subset; S4, iteratively updating the model parameters, and performing column-position exchange and partial encryption of the model parameters according to the local encryption parameter subset; S5, separately aggregating the plaintext model parameters and the partially encrypted ciphertext model parameters, and generating an aggregated result; S6, decrypting the aggregated ciphertext model parameters with the locally known encryption parameter subset, fusing the plaintext and ciphertext parameters using a reparameterization technique, and starting a new round of federated learning training; S7, repeating steps S4 to S6 until the model converges, completing federated fine-tuning privacy protection of the large model. Preferably, S1, a client performs model-parameter sensitivity evaluation according to local heterogeneous data and an encryption budget, determines a local high-sensitivity parameter subset to be encrypted, and sends the positions and sensitivities of the parameters to be encrypted to an aggregation server using order-preserving encryption; S2, the aggregation server negotiates the set of parameters to be encrypted according to the global encryption budget and the parameter sensitivities to obtain a global encryption parameter set; S3, the c