CN-122021791-A - Task processing method, device, equipment and storage medium based on large language model
Abstract
The application belongs to the technical field of artificial intelligence and relates to a task processing method based on a large language model. The method comprises: processing a financial corpus with the large language model to analyze the semantic importance information of each network layer in the model; generating a designated rank allocation scheme from the semantic importance information and a parameter budget; training the large language model under the designated rank allocation scheme to obtain a first large language model; fine-tuning the first large language model to obtain a second large language model; if the second large language model fails semantic fidelity verification, adjusting the designated rank allocation scheme to obtain a target rank allocation scheme; performing model training, fine-tuning and evaluation on the large language model with the target rank allocation scheme until an iteration termination condition is met and a target large language model is obtained; and performing task processing based on the target large language model. The method and the device can be applied to task processing scenarios in the field of financial technology, and improve the accuracy of a large language model in task processing.
Inventors
- QU XIAOYANG
Assignees
- Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20260113
Claims (10)
- 1. A task processing method based on a large language model, characterized by comprising the following steps: acquiring financial corpus data collected in advance; processing the financial corpus data based on a preset large language model, and analyzing semantic importance information of each network layer contained in the large language model based on the obtained model output data; performing scheme generation processing on the semantic importance information and a preset parameter budget based on a preset dynamic rank allocation engine to obtain a corresponding designated rank allocation scheme; training the large language model by using a preset sparse regularization constraint module based on a preset training data set and the designated rank allocation scheme to obtain a trained first large language model; fine-tuning the first large language model based on a preset test data set to obtain a corresponding second large language model, and performing semantic fidelity verification on the second large language model; if the second large language model fails the semantic fidelity verification, adjusting the designated rank allocation scheme based on a preset adjustment strategy to obtain a corresponding target rank allocation scheme; based on a preset iteration processing strategy, performing model training, fine-tuning and evaluation processing on the large language model by using the target rank allocation scheme until a preset iteration termination condition is met and a corresponding target large language model is obtained; and processing task data to be processed based on the target large language model.
- 2. The task processing method based on a large language model according to claim 1, wherein the step of analyzing semantic importance information of each network layer contained in the large language model based on the obtained model output data specifically comprises: cleaning the financial corpus data to obtain corresponding target corpus data; processing the target corpus data based on the large language model, and recording the output features of each network layer in the large language model; performing information extraction processing based on the output features of each network layer to obtain corresponding semantic information of each network layer; and performing semantic importance evaluation processing on the semantic information of each network layer to obtain the semantic importance information of each network layer.
- 3. The task processing method based on a large language model according to claim 2, wherein the step of performing semantic importance evaluation processing on the semantic information of each network layer to obtain the semantic importance information of each network layer specifically comprises: calling a preset scoring function; acquiring designated semantic information of a designated network layer, wherein the designated network layer is any one of the network layers contained in the large language model; calculating the designated semantic information based on the scoring function to obtain corresponding scoring data; normalizing the scoring data to obtain corresponding target scoring data; and taking the target scoring data as the designated semantic importance information of the designated network layer.
- 4. The task processing method based on a large language model according to claim 1, wherein the step of performing scheme generation processing on the semantic importance information and the preset parameter budget based on the preset dynamic rank allocation engine to obtain the corresponding designated rank allocation scheme specifically comprises: calling a preset mapping relation table based on the dynamic rank allocation engine; querying, from the mapping relation table, the rank of each network layer corresponding to the semantic importance information of that network layer, and constructing a corresponding initial rank allocation scheme based on the rank of each network layer; performing parameter calculation based on the ranks of the network layers to obtain a corresponding total parameter quantity; acquiring the parameter budget of the large language model, and analyzing the total parameter quantity and the parameter budget to obtain a corresponding analysis result; adjusting the initial rank allocation scheme based on the analysis result to obtain a first rank allocation scheme meeting the budget requirement; and taking the first rank allocation scheme as the designated rank allocation scheme.
- 5. The task processing method based on a large language model according to claim 1, wherein the step of performing scheme generation processing on the semantic importance information and the preset parameter budget based on the preset dynamic rank allocation engine to obtain the corresponding designated rank allocation scheme specifically comprises: invoking a preset dynamic rank allocation model based on the dynamic rank allocation engine; acquiring the parameter budget corresponding to the large language model; performing allocation processing on the parameter budget and the semantic importance information based on the dynamic rank allocation model to obtain a corresponding allocation result; and taking the allocation result as the designated rank allocation scheme.
- 6. The task processing method based on a large language model according to claim 1, wherein the step of performing semantic fidelity verification on the second large language model specifically comprises: performing term consistency verification on the second large language model; if the second large language model passes the term consistency verification, performing business logic verification on the second large language model; if the second large language model passes the business logic verification, performing performance verification on the second large language model; and if the second large language model passes the performance verification, judging that the second large language model passes the semantic fidelity verification; otherwise, judging that the second large language model does not pass the semantic fidelity verification.
- 7. The task processing method based on a large language model according to claim 1, wherein after the step of performing model training, fine-tuning and evaluation processing on the large language model by using the target rank allocation scheme based on a preset iteration processing strategy until a preset iteration termination condition is met and a corresponding target large language model is obtained, the method further comprises: acquiring an optimal second rank allocation scheme corresponding to the target large language model; performing report generation processing on the second rank allocation scheme based on a preset report generation strategy to obtain a corresponding rank allocation analysis report; performing format conversion processing on the rank allocation analysis report to obtain a corresponding target rank allocation analysis report; and outputting the target rank allocation analysis report.
- 8. A task processing device based on a large language model, comprising: a first acquisition module, configured to acquire financial corpus data collected in advance; a first processing module, configured to process the financial corpus data based on a preset large language model and analyze semantic importance information of each network layer contained in the large language model based on the obtained model output data; a first generation module, configured to perform scheme generation processing on the semantic importance information and a preset parameter budget based on a preset dynamic rank allocation engine to obtain a corresponding designated rank allocation scheme; a training module, configured to train the large language model by using a preset sparse regularization constraint module based on a preset training data set and the designated rank allocation scheme to obtain a trained first large language model; a verification module, configured to fine-tune the first large language model based on a preset test data set to obtain a corresponding second large language model, and perform semantic fidelity verification on the second large language model; an adjustment module, configured to adjust the designated rank allocation scheme based on a preset adjustment strategy to obtain a corresponding target rank allocation scheme if the second large language model fails the semantic fidelity verification; a second processing module, configured to perform model training, fine-tuning and evaluation processing on the large language model by using the target rank allocation scheme based on a preset iteration processing strategy until a preset iteration termination condition is met and a corresponding target large language model is obtained; and a third processing module, configured to process task data to be processed based on the target large language model.
- 9. A computer device, comprising a memory and a processor, the memory having stored therein computer-readable instructions which, when executed by the processor, implement the steps of the task processing method based on a large language model according to any one of claims 1 to 7.
- 10. A computer-readable storage medium having stored thereon computer-readable instructions which, when executed by a processor, implement the steps of the task processing method based on a large language model according to any one of claims 1 to 7.
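Claims 2 and 3 describe the layer-importance step but leave the scoring function itself unspecified. As a rough illustration only: the sketch below records per-layer output features, scores each layer with feature variance (an assumed stand-in for the patent's preset scoring function), and min-max normalizes the scores as claim 3 requires; all layer names, shapes and the variance proxy are hypothetical.

```python
import numpy as np

def layer_importance_scores(layer_outputs: dict) -> dict:
    """Score each layer's recorded output features and min-max normalize.

    `layer_outputs` maps layer name -> feature matrix (tokens x hidden_dim)
    recorded while the base model processes the cleaned corpus. Per-feature
    variance is used here purely as an illustrative proxy for semantic
    information content; the patent does not define the scoring function.
    """
    raw = {name: float(feats.var()) for name, feats in layer_outputs.items()}
    lo, hi = min(raw.values()), max(raw.values())
    span = (hi - lo) or 1.0  # guard against all layers scoring identically
    return {name: (score - lo) / span for name, score in raw.items()}

# Synthetic features: later layers are given a wider spread so the ranking
# is visible in the normalized output.
rng = np.random.default_rng(0)
outputs = {f"layer_{i}": rng.normal(scale=1.0 + i, size=(32, 64)) for i in range(4)}
scores = layer_importance_scores(outputs)
```

After normalization the least informative layer scores 0.0 and the most informative scores 1.0, which is the form the rank-lookup step in claim 4 consumes.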
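Claim 4 chains a rank lookup, a total-parameter calculation, and a budget-driven adjustment. A minimal sketch of that flow, assuming a simple threshold table in place of the patent's mapping relation table and a halve-the-least-important-layer adjustment strategy (the table, dimensions, scores and budget are all illustrative, not from the patent):

```python
# (min importance score, rank) thresholds standing in for the mapping relation table.
RANK_TABLE = [(0.75, 32), (0.50, 16), (0.25, 8), (0.0, 4)]

def allocate_ranks(importance: dict, dims: dict, param_budget: int) -> dict:
    # Initial scheme: look up each layer's rank from its importance score.
    ranks = {name: next(r for thresh, r in RANK_TABLE if score >= thresh)
             for name, score in importance.items()}

    def total(rs):
        # A rank-r LoRA adapter on a (d_out x d_in) matrix adds r*(d_in + d_out) params.
        return sum(r * (dims[n][0] + dims[n][1]) for n, r in rs.items())

    # Adjustment: while over budget, halve the rank of the least important
    # layer that can still shrink (floor of 1).
    order = sorted(ranks, key=importance.get)
    while total(ranks) > param_budget and any(r > 1 for r in ranks.values()):
        for name in order:
            if ranks[name] > 1:
                ranks[name] //= 2
                break
    return ranks

importance = {"layer_0": 0.0, "layer_1": 0.4, "layer_2": 0.8}
dims = {name: (64, 64) for name in importance}
scheme = allocate_ranks(importance, dims, param_budget=5000)
```

The adjustment deliberately shrinks low-importance layers first, so the semantically important layer keeps its high rank while the scheme is squeezed under the budget.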
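Claim 6 defines semantic fidelity verification as a fail-fast chain: term consistency, then business logic, then performance. A minimal sketch of that gate, with dummy dictionary-lookup checks standing in for the three unspecified verification stages:

```python
# The three stage checks are hypothetical placeholders; the patent fixes
# only their order and the staged pass/fail behaviour.
def term_consistency(result: dict) -> bool:
    return result.get("terms_ok", False)

def business_logic(result: dict) -> bool:
    return result.get("logic_ok", False)

def performance(result: dict) -> bool:
    return result.get("perf_ok", False)

CHECKS = [term_consistency, business_logic, performance]

def passes_semantic_fidelity(result: dict) -> bool:
    # all() short-circuits, so a failure at any stage skips the later
    # stages, mirroring the claim's "if ... passes, then verify ..." chain.
    return all(check(result) for check in CHECKS)

ok = passes_semantic_fidelity({"terms_ok": True, "logic_ok": True, "perf_ok": True})
rejected = passes_semantic_fidelity({"terms_ok": True, "logic_ok": False})
```

A model that fails this gate would, per claim 1, trigger the adjustment of the designated rank allocation scheme and another training iteration.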
Description
Task processing method, device, equipment and storage medium based on a large language model
Technical Field
The application relates to the technical field of artificial intelligence and can be applied to the field of financial technology; in particular, it relates to a task processing method, a task processing device, computer equipment and a storage medium based on a large language model.
Background
With the current development of artificial intelligence technology, large language models are increasingly widely applied in the financial field by virtue of their strong language understanding and generation capabilities, covering key business scenarios such as financial risk assessment, market trend prediction and customer service interaction. Parameter-efficient fine-tuning is a key means of adapting a large language model to a specific financial task, and is very important for improving model performance in financial scenarios. Currently, the LoRA (Low-Rank Adaptation) method is a typical representative of parameter-efficient fine-tuning techniques and shows certain advantages in various fields. However, in a financial context, the conventional LoRA method has significant drawbacks. Specifically, the conventional LoRA method adopts a fixed rank configuration, uniformly applying the same rank value to all layers of the large language model. But financial semantics exhibit distinct hierarchical characteristics, and the complexity and importance of the semantic information processed by different network layers differ. A fixed rank configuration cannot adapt to these hierarchical characteristics, so key semantic information is difficult to capture accurately when the model processes a financial task, and the accuracy of the large language model in financial task processing is therefore low.
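To make the fixed-rank contrast concrete: LoRA freezes a weight matrix W and learns a low-rank update W' = W + BA, so a rank-r adapter on a d×d matrix adds r·2d trainable parameters. The sketch below (the dimensions and rank choices are illustrative assumptions, not values from the patent) shows that a per-layer rank scheme can spend exactly the same parameter budget as a fixed-rank scheme while concentrating capacity in selected layers:

```python
import numpy as np

d = 768        # assumed hidden size of each adapted weight matrix
n_layers = 12  # assumed number of adapted layers

def lora_params(rank: int, dim: int = d) -> int:
    # A rank-r adapter is B (dim x r) @ A (r x dim): r * 2 * dim parameters.
    return rank * 2 * dim

fixed = n_layers * lora_params(8)   # conventional LoRA: rank 8 everywhere
per_layer = [4] * 6 + [12] * 6      # hypothetical importance-driven ranks
dynamic = sum(lora_params(r) for r in per_layer)

# The low-rank structure itself: the update B @ A can never exceed rank r.
rng = np.random.default_rng(0)
B, A = rng.normal(size=(d, 8)), rng.normal(size=(8, d))
delta_rank = np.linalg.matrix_rank(B @ A)
```

Both schemes train the same number of parameters; a dynamic scheme simply reallocates rank from layers judged less important to layers judged more important.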
For example, in a claim audit scenario in the field of financial insurance, insurance claim materials contain rich and complex text information such as accident descriptions, medical diagnoses and billing statements. Different network layers have different processing requirements for this information: some layers need to process basic semantic information, while other layers need to deeply mine key details to judge the compliance of the claim. The fixed rank configuration of the conventional LoRA method cannot be flexibly adjusted to these differing requirements, so the model may miss important information or understand it inaccurately during auditing, ultimately biasing the claim audit result and affecting the operational efficiency and customer satisfaction of the insurance company. Therefore, there is a need for an intelligent parameter-efficient fine-tuning technique to improve the accuracy of large language models in financial task processing.
Disclosure of Invention
The embodiment of the application aims to provide a task processing method, device, computer equipment and storage medium based on a large language model, so as to solve the technical problem that existing large language models have low accuracy in financial task processing.
In a first aspect, a task processing method based on a large language model is provided, comprising: acquiring financial corpus data collected in advance; processing the financial corpus data based on a preset large language model, and analyzing semantic importance information of each network layer contained in the large language model based on the obtained model output data; performing scheme generation processing on the semantic importance information and a preset parameter budget based on a preset dynamic rank allocation engine to obtain a corresponding designated rank allocation scheme; training the large language model by using a preset sparse regularization constraint module based on a preset training data set and the designated rank allocation scheme to obtain a trained first large language model; fine-tuning the first large language model based on a preset test data set to obtain a corresponding second large language model, and performing semantic fidelity verification on the second large language model; if the second large language model fails the semantic fidelity verification, adjusting the designated rank allocation scheme based on a preset adjustment strategy to obtain a corresponding target rank allocation scheme; based on a preset iteration processing strategy, performing model training, fine-tuning and evaluation processing on the large language model by using the target rank allocation scheme until a preset iteration termination condition is met and a corresponding target large language model is obtained; and processing task data to be processed based on the target large language model. In a second aspect, there is provided a task processing device based on a large langu