CN-122021636-A - Mixed fine-tuning framework and method for large language model in vertical field

CN122021636A

Abstract

The application provides a hybrid fine-tuning framework and method for a vertical-domain large language model. The framework comprises a domain knowledge graph construction module, a knowledge fusion module, a multi-granularity fine-tuning module, a model evaluation module and a dynamic calibration module. The domain knowledge graph construction module acquires vertical-domain data and constructs a structured domain knowledge graph from it; the knowledge fusion module fuses the structured knowledge in the domain knowledge graph into the semantic representation of the large language model; the multi-granularity fine-tuning module performs multi-granularity fine-tuning on the large language model fused with the domain knowledge; the model evaluation module monitors and calculates multidimensional evaluation indexes of the model in real time during fine-tuning; and the dynamic calibration module dynamically generates an adjustment strategy according to the multidimensional evaluation indexes and, based on that strategy, dynamically calibrates the knowledge injection strength of the knowledge fusion module and the fine-tuning parameters of the multi-granularity fine-tuning module. The application aims to improve the knowledge accuracy, reasoning capability and output reliability of a large language model in a specific domain.

Inventors

  • XIAO XIN
  • FU MINGXING
  • LIU SHUQIANG

Assignees

  • 北京安晨信息技术有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-17

Claims (10)

  1. A vertical-domain large language model hybrid fine-tuning framework, comprising: a domain knowledge graph construction module, connected with the knowledge fusion module, for acquiring vertical-domain data and constructing a structured domain knowledge graph based on the vertical-domain data; the knowledge fusion module, connected with the domain knowledge graph construction module, the multi-granularity fine-tuning module and the dynamic calibration module, for fusing the structured knowledge in the domain knowledge graph into the semantic representation of a large language model; the multi-granularity fine-tuning module, connected with the knowledge fusion module, the model evaluation module and the dynamic calibration module, for performing multi-granularity fine-tuning on the large language model fused with the domain knowledge; the model evaluation module, connected with the multi-granularity fine-tuning module and the dynamic calibration module, for monitoring and calculating multidimensional evaluation indexes of the model in real time during fine-tuning; and the dynamic calibration module, connected with the knowledge fusion module, the multi-granularity fine-tuning module and the model evaluation module, for dynamically generating an adjustment strategy according to the multidimensional evaluation indexes and dynamically calibrating the knowledge injection strength of the knowledge fusion module and the fine-tuning parameters of the multi-granularity fine-tuning module based on the adjustment strategy.
  2. The framework of claim 1, wherein the domain knowledge graph construction module specifically comprises: a data acquisition unit, connected with the knowledge extraction unit, for acquiring vertical-domain data, the vertical-domain data comprising text data, table data and expert rule data acquired from authoritative vertical-domain data sources; the knowledge extraction unit, connected with the data acquisition unit and the graph construction unit, for extracting structured knowledge elements representing domain knowledge from the vertical-domain data by a hybrid strategy combining a domain pre-trained model with expert rules; and the graph construction unit, connected with the knowledge extraction unit, for performing entity linking, relation disambiguation and conflict detection on the extracted structured knowledge elements, constructing the domain knowledge graph, and incrementally updating the domain knowledge graph.
  3. The framework of claim 1, wherein the knowledge fusion module specifically comprises: a knowledge embedding unit, connected with the knowledge injection unit, for converting the entities and relations in the domain knowledge graph into low-dimensional vector representations; a knowledge prompt generation unit, connected with the knowledge injection unit, for generating knowledge-guidance prompts based on the entity-relation structure of the domain knowledge graph; and the knowledge injection unit, connected with the knowledge embedding unit and the knowledge prompt generation unit, for fusing the low-dimensional vector representations with the word embedding vectors of the large language model through an attention mechanism, and splicing the knowledge-guidance prompts with the original input text.
  4. The framework of claim 1, wherein the multi-granularity fine-tuning module is specifically configured to: perform full-parameter fine-tuning on the top encoder layers and the decoder layers of the large language model; fine-tune the intermediate encoder layers of the large language model by a low-rank adaptation strategy; and design a domain-specific prompt template library at the input layer of the large language model and fine-tune the prompt parameters.
  5. The framework of claim 4, wherein the multidimensional evaluation indexes include knowledge accuracy, reasoning consistency and training stability.
  6. The framework of claim 5, wherein the dynamic calibration module is specifically configured to: if the knowledge accuracy is lower than a first threshold, generate a corresponding instruction to enhance the attention weight of the knowledge fusion module and supplement relevant training samples; if the reasoning consistency is lower than a second threshold, generate a corresponding instruction to raise the learning rate in the full-parameter fine-tuning process and introduce domain reasoning rules; and if the training stability is lower than a third threshold, generate a corresponding instruction to reduce the rank parameter of the low-rank adaptation fine-tuning and enable gradient clipping.
  7. A vertical-domain large language model hybrid fine-tuning method based on the framework of any one of claims 1 to 6, comprising: acquiring vertical-domain data, and constructing a structured domain knowledge graph based on the vertical-domain data; fusing the structured knowledge in the domain knowledge graph into the semantic representation of a large language model; performing multi-granularity fine-tuning on the large language model fused with the domain knowledge, and monitoring and calculating multidimensional evaluation indexes of the model in real time during fine-tuning; and dynamically generating an adjustment strategy according to the multidimensional evaluation indexes, and dynamically calibrating the knowledge injection strength and the fine-tuning parameters based on the adjustment strategy to obtain a final vertical-domain large language model.
  8. The method of claim 7, wherein dynamically generating an adjustment strategy according to the multidimensional evaluation indexes, and dynamically calibrating the knowledge injection strength and the fine-tuning parameters based on the adjustment strategy, comprises: if the knowledge accuracy is lower than a first threshold, generating a corresponding instruction to enhance the attention weight and supplement relevant training samples; if the reasoning consistency is lower than a second threshold, generating a corresponding instruction to raise the learning rate in the full-parameter fine-tuning process and introduce domain reasoning rules; and if the training stability is lower than a third threshold, generating a corresponding instruction to reduce the rank parameter of the low-rank adaptation fine-tuning and enable gradient clipping.
  9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the vertical-domain large language model hybrid fine-tuning method according to any one of claims 7 to 8.
  10. A computer-readable storage medium storing a program for implementing information transfer, wherein the program, when executed by a processor, implements the steps of the vertical-domain large language model hybrid fine-tuning method according to any one of claims 7 to 8.
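The threshold-driven calibration logic of claims 6 and 8 amounts to a small rule table mapping low metrics to adjustment instructions. A minimal sketch follows; the metric keys, threshold values and instruction names are illustrative assumptions, since the patent specifies none of them concretely:

```python
# Hypothetical sketch of the claim-6/8 calibration rules.
# Metric names, thresholds and instruction strings are illustrative, not from the patent.

def generate_adjustment_strategy(metrics, thresholds):
    """Map multidimensional evaluation metrics to a list of adjustment instructions."""
    actions = []
    if metrics["knowledge_accuracy"] < thresholds["knowledge_accuracy"]:
        # claim 6: boost the fusion module's attention weight, add domain samples
        actions += ["increase_knowledge_attention_weight",
                    "supplement_domain_training_samples"]
    if metrics["reasoning_consistency"] < thresholds["reasoning_consistency"]:
        # claim 6: raise the full-parameter learning rate, add reasoning rules
        actions += ["raise_full_parameter_learning_rate",
                    "introduce_domain_reasoning_rules"]
    if metrics["training_stability"] < thresholds["training_stability"]:
        # claim 6: shrink the LoRA rank and turn on gradient clipping
        actions += ["reduce_lora_rank", "enable_gradient_clipping"]
    return actions

if __name__ == "__main__":
    metrics = {"knowledge_accuracy": 0.72, "reasoning_consistency": 0.91,
               "training_stability": 0.55}
    thresholds = {"knowledge_accuracy": 0.80, "reasoning_consistency": 0.85,
                  "training_stability": 0.60}
    print(generate_adjustment_strategy(metrics, thresholds))
```

In this toy run only the knowledge-accuracy and training-stability checks fire, so the strategy contains their four instructions; a real implementation would translate each instruction into concrete optimizer or data-pipeline changes.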

Description

Mixed fine-tuning framework and method for large language model in vertical field

Technical Field

The invention relates to the technical field of natural language processing and artificial intelligence, and in particular to a hybrid fine-tuning framework and method for a vertical-domain large language model.

Background

In recent years, large language models (LLMs) built on the Transformer architecture have achieved breakthroughs in general natural language processing tasks, but their application in vertical domains still faces three core bottlenecks:

1. Lack of domain knowledge and knowledge bias. The training data of general large language models covers a wide range but has a low density of vertical-domain professional knowledge, so the models readily generate "hallucinated" content inconsistent with domain facts. For example, disease typing and treatment plans may be confused in medical scenarios, and regulatory policy requirements may be misjudged in financial scenarios.

2. Imbalance between fine-tuning efficiency and generalization. Traditional full-parameter fine-tuning consumes large amounts of computing resources, and insufficient domain data easily causes overfitting; small-sample fine-tuning (such as Prompt Tuning) reduces the resource requirement but struggles to fully mine the deep associations in domain knowledge, leaving the model output insufficiently stable.

3. Lack of a dynamic adaptation mechanism in the fine-tuning process. Existing fine-tuning methods mostly adopt fixed training strategies and parameter-update logic, and cannot dynamically adjust the optimization direction according to the model's real-time performance on domain tasks (such as knowledge accuracy and reasoning rationality), so model convergence is slow and optimal performance is hard to guarantee.
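The trade-off in the second bottleneck is what low-rank adaptation (LoRA) addresses: the pretrained weight stays frozen while two small adapter matrices train, cutting trainable parameters from d_in*d_out to r*(d_in + d_out). A minimal NumPy sketch of the idea, with illustrative shapes and the common alpha/r scaling convention (both assumptions, not taken from the patent):

```python
import numpy as np

# Minimal LoRA sketch: frozen base weight W plus trainable low-rank adapters A, B.
# Shapes, rank and scaling are illustrative assumptions.
rng = np.random.default_rng(0)
d_in, d_out, r = 768, 768, 8

W = rng.standard_normal((d_in, d_out))      # frozen pretrained weight
A = rng.standard_normal((d_in, r)) * 0.01   # trainable down-projection
B = np.zeros((r, d_out))                    # trainable up-projection, zero-initialized

def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass: frozen base output plus a scaled low-rank update."""
    return x @ W + (x @ A @ B) * (alpha / A.shape[1])

x = rng.standard_normal((4, d_in))
# With B zero-initialized, the adapted layer starts identical to the base layer,
# so fine-tuning begins from the pretrained model's behavior.
assert np.allclose(lora_forward(x, W, A, B), x @ W)
print("trainable params:", A.size + B.size, "vs full fine-tuning:", W.size)
```

Here the adapters hold 12,288 parameters against 589,824 for the full matrix, which is why the patent reserves full-parameter fine-tuning for the top encoder and decoder layers and applies low-rank adaptation to the intermediate layers.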
As a structured form of knowledge representation, the knowledge graph can clearly model the entities, relations and attributes in a vertical domain, providing important support for remedying the knowledge deficiencies of large language models. However, existing fine-tuning methods that incorporate knowledge graphs mostly remain at the "static knowledge injection" stage (for example, using knowledge graph information as a prefix to the training data); they cannot achieve deep fusion of knowledge and model, lack the ability to dynamically calibrate the fine-tuning process, and therefore cannot effectively resolve the bottlenecks above.

Disclosure of Invention

The invention aims to provide a vertical-domain large language model hybrid fine-tuning framework and method to solve the problems in the prior art.

An embodiment of the invention provides a vertical-domain large language model hybrid fine-tuning framework, comprising: a domain knowledge graph construction module, connected with the knowledge fusion module, for acquiring vertical-domain data and constructing a structured domain knowledge graph based on the vertical-domain data; the knowledge fusion module, connected with the domain knowledge graph construction module, the multi-granularity fine-tuning module and the dynamic calibration module, for fusing the structured knowledge in the domain knowledge graph into the semantic representation of the large language model; the multi-granularity fine-tuning module, connected with the knowledge fusion module, the model evaluation module and the dynamic calibration module, for performing multi-granularity fine-tuning on the large language model fused with the domain knowledge; the model evaluation module, connected with the multi-granularity fine-tuning module and the dynamic calibration module, for monitoring and calculating multidimensional evaluation indexes of the model in real time during fine-tuning; and the dynamic calibration module, connected with the knowledge fusion module, the multi-granularity fine-tuning module and the model evaluation module, for dynamically generating an adjustment strategy according to the multidimensional evaluation indexes and dynamically calibrating the knowledge injection strength of the knowledge fusion module and the fine-tuning parameters of the multi-granularity fine-tuning module based on the adjustment strategy.

An embodiment of the invention provides a vertical-domain large language model hybrid fine-tuning method, comprising the following steps: acquiring vertical-domain data, and constructing a structured domain knowledge graph based on the vertical-domain data; fusing the structured knowledge in the domain knowledge graph into the semantic representation of a large language model; performing multi-granularity fine-tuning on the large language model fused with the domain knowledge, and monitoring and calculating multidimensional evaluation indexes of the model in real time during fine-tuning
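The knowledge fusion step (claim 3) fuses low-dimensional knowledge graph embeddings with the model's word embeddings through an attention mechanism, under a tunable injection strength that the dynamic calibration module can adjust. The following is an illustrative one-step cross-attention sketch of that idea, not the patent's exact method; the dimensions and the residual-mix formulation are assumptions:

```python
import numpy as np

# Illustrative sketch: tokens attend over knowledge-graph entity embeddings,
# and the gathered knowledge is mixed back with a tunable injection strength.
# Dimensions and the residual formulation are assumptions for illustration.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def inject_knowledge(token_emb, entity_emb, strength=0.5):
    """Single cross-attention step fusing KG entity vectors into token embeddings."""
    d = token_emb.shape[-1]
    scores = token_emb @ entity_emb.T / np.sqrt(d)   # (n_tokens, n_entities)
    attn = softmax(scores)                            # attention over entities
    gathered = attn @ entity_emb                      # knowledge vector per token
    return token_emb + strength * gathered            # residual mix

rng = np.random.default_rng(1)
tokens = rng.standard_normal((5, 64))     # 5 tokens, model dimension 64
entities = rng.standard_normal((10, 64))  # 10 KG entity embeddings

fused = inject_knowledge(tokens, entities, strength=0.5)
assert fused.shape == tokens.shape
# strength 0 disables injection entirely, the lower end of the calibration range
assert np.allclose(inject_knowledge(tokens, entities, strength=0.0), tokens)
```

Because `strength` is a single scalar outside the attention computation, a calibration loop of the kind described above can raise or lower it between training steps without touching the model's weights.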