CN-122021808-A - Distributed large model self-adaptive training method, user terminal and server
Abstract
The invention relates to the field of artificial intelligence and distributed models, and in particular to a distributed large model self-adaptive training method, a user terminal and a server. The method comprises: collecting historical dialogue data between a user and a customer service agent; generating gradient data and an interaction feature vector characterizing the user's interaction behavior pattern; performing hash encryption on the gradient data to obtain encrypted gradient data; performing differential privacy perturbation on the interaction feature vector to obtain an encrypted interaction feature vector; uploading the encrypted gradient data and the encrypted interaction feature vector to a server; receiving first parameters issued by the server and applying them to a globally shared backbone model; and receiving cluster-level adapter parameters issued by the server and applying them to a lightweight adapter. The method balances the generality and personalization of the large model while protecting the user's private information.
Inventors
- DING ZIJIAN
- ZHAO BINGJIE
- DING JUNWEI
- CHEN DEPIN
Assignees
- 钛动科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (10)
- 1. A distributed large model adaptive training method for a user terminal, the method comprising: acquiring historical dialogue data between a user and a customer service agent, wherein the distributed large model comprises a globally shared backbone model and a lightweight adapter; performing local training of the distributed large model based on the historical dialogue data to generate gradient data, and computing statistics over the historical dialogue data to generate an interaction feature vector characterizing the user's interaction behavior pattern, wherein the gradient data is the gradient of the parameters of the lightweight adapter; performing hash encryption on the gradient data to obtain encrypted gradient data, and performing differential privacy perturbation on the interaction feature vector to obtain an encrypted interaction feature vector; uploading the encrypted gradient data and the encrypted interaction feature vector to a server; and receiving first parameters issued by the server and applying them to the globally shared backbone model to update it, and receiving cluster-level adapter parameters issued by the server and applying them to the lightweight adapter to update it.
- 2. The privacy-preserving distributed large model adaptive training method of claim 1, wherein obtaining the encrypted gradient data comprises: acquiring the gradient data generated by the user terminal during local training; dividing the gradient data into a plurality of gradient blocks; and performing a hash mapping operation on each gradient block to generate a hash gradient value that cannot be inverted, the hash gradient values serving as the encrypted gradient data.
- 3. The privacy-preserving distributed large model adaptive training method of claim 2, wherein the gradient data is derived by the user terminal for the lightweight adapter parameters based on a local loss function, the expression of the local loss function being:
  \mathcal{L}_{local} = \mathbb{E}\left[\,\mathcal{L}_{task}\big(f(x;\theta,\phi),\,y\big) + \lambda\, D_{KL}\big(p_{local}(\cdot \mid x)\,\|\,p_{global}(\cdot \mid x)\big)\,\right]
  where \mathcal{L}_{local} denotes the total loss value of local training, \mathbb{E} denotes the mathematical expectation, \mathcal{L}_{task} denotes the prediction (task) loss function, f(x;\theta,\phi) denotes the predicted output of the model, \theta denotes the parameters of the frozen shared backbone model, \phi denotes the parameters of the trainable lightweight adapter, the KL term denotes the consistency constraint, \lambda denotes the regularization coefficient, D_{KL} denotes the KL divergence, p_{local}(\cdot \mid x) denotes the output distribution of the current local model over input x, and p_{global}(\cdot \mid x) denotes the output distribution of the global model over the same input x.
- 4. The privacy-preserving distributed large model adaptive training method according to any one of claims 1-3, wherein the interaction feature vector comprises an interaction round-count statistic, an emotion distribution variance and an intent entropy value, and the encrypted interaction feature vector is generated by adding Laplace noise or Gaussian noise to the interaction round-count statistic, the emotion distribution variance and the intent entropy value based on a local differential privacy mechanism.
- 5. A user terminal comprising a processor and a memory, the memory storing computer program instructions, characterized in that the computer program instructions, when executed by the processor, implement the distributed large model adaptive training method for a user terminal according to any of claims 1-4.
- 6. A distributed large model adaptive training method for a server, comprising: receiving encrypted gradient data and encrypted interaction feature vectors uploaded by a plurality of user terminals; performing a global aggregation operation on the encrypted gradient data to generate a global gradient for the globally shared backbone model, and calculating the updated parameters of the globally shared backbone model from the global gradient and its pre-update parameters, the updated parameters serving as first parameters; clustering the plurality of user terminals based on the encrypted interaction feature vectors to generate a plurality of user clusters; for each user cluster, acquiring the lightweight adapter parameters of the user terminals belonging to that cluster, calculating an aggregate value of those parameters, and taking the aggregate value as the cluster-level adapter parameters of that cluster; and issuing the first parameters and the cluster-level adapter parameters matched to a target user terminal to the target user terminal, to be loaded for local reasoning on the target user terminal.
- 7. The privacy-preserving distributed large model adaptive training method of claim 6, wherein clustering the plurality of user terminals based on the encrypted interaction feature vectors to generate a plurality of user clusters specifically comprises: performing unsupervised clustering on the encrypted interaction feature vectors uploaded by the plurality of user terminals to determine a plurality of clusters; and grouping the user terminals whose encrypted interaction feature vectors belong to the same cluster into the same set to form a user cluster.
- 8. The privacy-preserving distributed large model adaptive training method of claim 6, wherein issuing the first parameters and the cluster-level adapter parameters matched to the target user terminal comprises: acquiring the current encrypted interaction feature vector of the target user terminal; calculating the Euclidean distance between the current encrypted interaction feature vector of the target user terminal and the cluster center of each user cluster; determining the user cluster with the smallest distance as the matched cluster; and issuing the first parameters and the cluster-level adapter parameters corresponding to the matched cluster to the target user terminal as initial parameters for local reasoning on the target user terminal.
- 9. The privacy-preserving distributed large model adaptive training method according to any one of claims 6-8, further comprising a user forgetting step, comprising: receiving a user logout request sent by a specific user terminal; and, in response to the user logout request, identifying the target user cluster to which the specific user terminal belongs, and setting the weight of the cluster-level adapter parameters corresponding to the target user cluster to zero.
- 10. A server comprising a processor and a memory, the memory storing computer program instructions, characterized in that the computer program instructions, when executed by the processor, implement the distributed large model adaptive training method for a server of any of claims 6-9.
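To illustrate the client-side steps of claims 1-4, the sketch below is a minimal Python rendering of block-wise gradient hashing, the KL-consistency local loss, and Laplace perturbation of the interaction features. All function names, the block size, the regularization coefficient and the privacy budget are hypothetical choices for illustration, not values taken from the patent:

```python
import hashlib
import math
import random

def hash_encrypt_gradients(gradients, block_size=4):
    """Claim 2 sketch: partition the adapter-gradient vector into blocks
    and map each block to an irreversible SHA-256 digest."""
    blocks = [gradients[i:i + block_size]
              for i in range(0, len(gradients), block_size)]
    # Quantize to fixed precision so the mapping is deterministic.
    return [hashlib.sha256(",".join(f"{g:.6f}" for g in b).encode()).hexdigest()
            for b in blocks]

def kl_divergence(p, q):
    """D_KL(p || q) for discrete output distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def local_loss(task_loss, p_local, p_global, lam=0.1):
    """Claim 3 sketch: task loss plus a KL consistency term that keeps
    the locally adapted model close to the global model's outputs."""
    return task_loss + lam * kl_divergence(p_local, p_global)

def laplace_perturb(features, sensitivity=1.0, epsilon=1.0):
    """Claim 4 sketch: local differential privacy via Laplace noise with
    scale = sensitivity / epsilon added to each interaction feature."""
    scale = sensitivity / epsilon
    # Laplace(0, b) sampled as the difference of two Exp(1) draws times b.
    return [f + scale * (random.expovariate(1.0) - random.expovariate(1.0))
            for f in features]
```

A terminal would hash the adapter gradients and perturb the feature vector (round count, emotion variance, intent entropy) before uploading both, so the server never sees the plaintext values.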
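Similarly, the server-side steps of claims 6-9 might look like the following sketch. It assumes simple mean aggregation, cluster assignments already produced by unsupervised clustering (e.g., k-means) over the perturbed feature vectors, and reads the zeroing step of claim 9 as zeroing the cluster's aggregation weight; these are interpretive assumptions, not details fixed by the patent:

```python
import math

def aggregate(vectors):
    """Element-wise mean, used both for the global gradient (claim 6)
    and for the cluster-level adapter parameters."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def update_backbone(params, global_grad, lr=0.01):
    """First parameters: one gradient-descent step on the shared backbone."""
    return [p - lr * g for p, g in zip(params, global_grad)]

def cluster_adapters(assignments, adapter_params):
    """Claim 6 sketch: average adapter parameters per user cluster."""
    groups = {}
    for cid, params in zip(assignments, adapter_params):
        groups.setdefault(cid, []).append(params)
    return {cid: aggregate(members) for cid, members in groups.items()}

def match_cluster(feature_vec, centroids):
    """Claim 8 sketch: pick the cluster whose centroid is nearest
    (Euclidean distance) to the terminal's current feature vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda cid: dist(feature_vec, centroids[cid]))

def forget_cluster(cluster_weights, target_cid):
    """Claim 9 sketch: on a logout request, zero the target cluster's
    weight so its cluster-level adapter parameters stop contributing."""
    updated = dict(cluster_weights)
    updated[target_cid] = 0.0
    return updated
```

The matched cluster's adapter parameters, together with the first parameters, would then be issued to the target terminal as its initial parameters for local reasoning.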
Description
Distributed large model self-adaptive training method, user terminal and server

Technical Field

The invention relates to the field of artificial intelligence and distributed model training. More particularly, the invention relates to a distributed large model self-adaptive training method, a user terminal and a server.

Background

With the rapid development of deep learning technology, large language models (Large Language Model, LLM) have demonstrated excellent capabilities in natural language processing tasks and are widely applied in intelligent customer service, personal assistants and other scenarios. To enhance the user experience, an agent needs personalized service capability, i.e., the ability to adapt itself to the language habits, emotion patterns and interaction preferences of a specific user. The traditional personalized training approach typically uploads the massive dialogue data generated by a user to a cloud server for centralized fine-tuning. However, this centralized approach faces serious data privacy and security challenges. A user's historical dialogue data often contains highly sensitive personal information such as names, addresses, financial status and health conditions. Directly uploading raw data not only violates increasingly strict data privacy protection regulations, but also increases the risk of data leakage during transmission and storage. Federated learning (Federated Learning, FL) was developed to address data silos and privacy protection. Federated learning allows users to train models on private data at their local terminals and upload only model updates (e.g., gradients or parameter differences) to a server for aggregation, so that raw data never leaves the local domain.
Although federated learning reduces privacy risks to some extent, the prior art still exhibits the following significant technical problems when applied to personalized training of large models. First, privacy protection is insufficient due to gradient leakage. Although federated learning does not transmit raw data, studies have shown that the transmitted plaintext gradients or parameter updates still carry a significant amount of semantic information about the raw data. An attacker (e.g., a malicious aggregation server or an eavesdropper) may use a gradient inversion attack or reconstruction attack to recover the user's original training samples at the pixel or character level by analyzing the uploaded gradient data. Existing defenses, such as directly adding Gaussian noise (differential privacy), often face a difficult tradeoff between privacy protection strength and model utility: excessive noise causes significant degradation of large model performance. Second, global model performance degrades and personalization is insufficient because the data are non-independent and identically distributed (Non-IID). In practical applications, different users differ greatly in interaction behavior, intent distribution and language style (i.e., statistical heterogeneity of the data). The traditional federated averaging (FedAvg) algorithm attempts to train a single "global model" for all users; when aggregating highly diverse gradients from different users, the model often converges to a mediocre compromise that fails to meet the personalized needs of any particular user. Moreover, if a user fine-tunes excessively at the local side, the model easily loses its general global reasoning capability, exhibiting "catastrophic forgetting": while adapting to personal preferences, it forgets basic language logic and general knowledge. Third, there is a lack of a privacy-preserving user clustering mechanism.
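The tradeoff described above can be made concrete with the standard calibration of the Gaussian mechanism (a general differential-privacy formula, not a formula specified by this patent): the required noise standard deviation grows as the privacy budget epsilon shrinks, so stronger privacy directly costs model utility.

```python
import math

def gaussian_mechanism_sigma(sensitivity, epsilon, delta):
    """Noise scale for (epsilon, delta)-DP via the Gaussian mechanism
    (valid for epsilon < 1): sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
    Halving epsilon (stronger privacy) doubles the noise added to every update."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
```

For example, with sensitivity 1 and delta = 1e-5, tightening epsilon from 0.5 to 0.1 raises sigma roughly from 9.7 to 48.4, which is exactly the utility degradation the passage refers to.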
To address the Non-IID problem, cluster-based federated learning has been proposed, which groups similar users to train group-specific models. However, existing clustering methods typically rely on analyzing users' model parameters or explicit behavioral characteristics, which in effect constitutes "user profiling" and may indirectly reveal a user's behavioral habits or group membership (e.g., clustering may reveal that a group shares consultation characteristics for a specific disease). There is currently a lack of a mechanism that can effectively group users by their interaction behavior features while strictly protecting the privacy of those features themselves. In summary, the existing large model training and federated training methods mainly suffer from the following technical problems: on the premise of ensuring that the uploaded data (including gradients and behavior features) cannot be inverted to recover the original private information, they cannot resolve the conflict between the global generalization capability of the full model and its personalized adaptation capability caused by the statistical heterogeneity (Non-IID) of user data, nor the resulting problems of catastrophic forgetting of the model and the difficulty of accurate user grouping. Disclos