CN-121996749-A - Poultry field large language model question-answering method based on progressive knowledge enhancement


Abstract

The invention discloses a question-answering method for a large language model in the poultry field based on progressive knowledge enhancement. First, a fine-tuned planning large language model decomposes a complex user question into a set of sub-questions. The question and the sub-questions are vectorized by a domain-adapted embedding model and matched against a domain knowledge base using a hierarchical matching strategy; when the similarity is greater than or equal to a threshold, the answer is returned directly. Otherwise, a triplet extraction model parses the sub-question to generate standardized triplets, and missing triplets are supplemented from a poultry disease knowledge graph; if the graph cannot complete them, a web retrieval module is activated to extract information and complete the structured supplementation. The large language model then fuses the multi-source knowledge, performing feature alignment and consistency verification, and generates the final answer. Dynamic knowledge evolution is realized at the same time: high-confidence triplets from web retrieval are filtered, and knowledge is written back to the knowledge graph when user satisfaction reaches the threshold. The method significantly improves the question-answering capability of large language models in the poultry field.
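
The tiered fallback described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `embed` callable, and the `graph_lookup` and `web_lookup` callables are hypothetical stand-ins for the domain-adapted embedding model, the knowledge graph query, and the web retrieval module, and the knowledge base is assumed to be a list of (question vector, answer) pairs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vec, knowledge_base, threshold):
    """Return the answer of the most similar stored question,
    or None when the maximum similarity is below the threshold."""
    best_score, best_answer = -1.0, None
    for question_vec, answer in knowledge_base:
        score = cosine_similarity(query_vec, question_vec)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None


def resolve_sub_question(sub_question, embed, knowledge_base, threshold,
                         graph_lookup, web_lookup):
    """Try the knowledge base first, then the knowledge graph, then the web.

    embed, graph_lookup and web_lookup are hypothetical callables supplied
    by the caller; each lookup returns an answer or None.
    """
    answer = best_match(embed(sub_question), knowledge_base, threshold)
    if answer is not None:
        return "knowledge_base", answer
    answer = graph_lookup(sub_question)
    if answer is not None:
        return "knowledge_graph", answer
    return "web_retrieval", web_lookup(sub_question)
```

Each sub-question is tagged with the tier that answered it, so a later fusion step can tell which source each answer came from.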

Inventors

LIU FAN
ZHANG HAOJIAN
Zhou Daojie
ZHOU JUNJIE
MIN RUI
HUA YUCHEN
YAO LIANG
CHEN ZIBIN
ZHANG XINLEI

Assignees

Hohai University (河海大学)

Dates

Publication Date: 20260508
Application Date: 20251211

Claims (8)

1. A question-answering method for a large language model in the poultry field based on progressive knowledge enhancement, characterized by comprising the following steps: Step 1, based on the characteristics of complex questions in the field of poultry diseases, constructing a structured data set oriented to triplet extraction optimization, and using the structured data set to fine-tune a planning large language model, so that each sub-question obtained when the fine-tuned planning large language model decomposes a complex question satisfies an atomicity constraint mechanism; Step 2, calculating the cosine similarity between the complex question input by the user and the question of each question-answer pair in a pre-constructed poultry disease knowledge base, finding the maximum of all the cosine similarities, and judging whether the maximum is greater than or equal to a first preset threshold; if so, returning the answer of the question-answer pair corresponding to the maximum to the user; otherwise, proceeding to step 3; Step 3, decomposing the complex question input by the user into a plurality of sub-questions using the fine-tuned planning large language model and, for each sub-question, calculating the cosine similarity between the sub-question and the question of each question-answer pair in the pre-constructed poultry disease knowledge base, finding the maximum of all the cosine similarities, and judging whether the maximum is greater than or equal to the first preset threshold; if so, inputting the answer of the question-answer pair corresponding to the maximum, together with the corresponding sub-question, into a question-answering large language model; otherwise, proceeding to step 4; Step 4, for each sub-question whose maximum is smaller than the first preset threshold, performing a matching query in a pre-constructed poultry disease knowledge graph; if an answer to the sub-question is found, inputting the answer and the corresponding sub-question into the question-answering large language model; otherwise, proceeding to step 5; Step 5, for each sub-question whose maximum is smaller than the first preset threshold and for which no answer is found in the pre-constructed poultry disease knowledge graph, obtaining the answer to the sub-question by web retrieval, and inputting the answer and the corresponding sub-question into the question-answering large language model; Step 6, using the question-answering large language model to perform multi-source information fusion on all the sub-questions and the answers to the sub-questions obtained in steps 3 to 5, generating the answer to the complex question, and returning it to the user; Step 7, performing dynamic collaborative optimization of the pre-constructed poultry disease knowledge base and the pre-constructed poultry disease knowledge graph, for use in answering subsequent complex questions.
2. The question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to claim 1, wherein step 4 specifically comprises: performing deep semantic analysis on the sub-question using a triplet extraction large language model to generate a to-be-completed question triplet; and constructing a SPARQL structured query statement from the to-be-completed question triplet and executing a targeted retrieval in the pre-constructed poultry disease knowledge graph; if a corresponding answer is retrieved, inputting the answer and the corresponding sub-question into the question-answering large language model; otherwise, proceeding to step 5.
3. The question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to claim 2, wherein step 5 specifically comprises: Step 5.1, obtaining web page content related to the sub-question through a search interface using SEARCHAPI, and extracting the top-ranked web pages from the related web page content; Step 5.2, extracting the text content of the top-ranked web pages using the Python package BeautifulSoup, and inputting the to-be-completed question triplet and the text content of the web pages into the large model kimi-128k to generate completion triplets based on web knowledge; Step 5.3, calculating the confidence of each completion triplet, deleting the completion triplets whose confidence is smaller than a second preset threshold, and inputting the remaining completion triplets into the question-answering large language model, wherein the confidence is calculated by a formula in which one term is a credibility score of the domain name of the web page, one term is a semantic similarity score between the text content of the web page and the sub-question, one term represents the support proportion of each completion triplet among all completion triplets, and the remaining symbols are weight coefficients.
4. The question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to claim 3, wherein step 6 specifically comprises: Step 6.1, using the pre-constructed poultry disease knowledge base, mapping into natural language fragments the sub-questions and corresponding answers matched in the pre-constructed poultry disease knowledge base in step 3, the sub-questions and corresponding answers matched in the pre-constructed poultry disease knowledge graph in step 4, and the remaining completion triplets obtained by web retrieval in step 5; Step 6.2, using the question-answering large language model to fuse all the sub-questions and the natural language fragments corresponding to the sub-questions, generating the final answer to the complex question.
5. The question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to claim 4, wherein step 7 specifically comprises: Step 7.1, dynamically completing and optimizing the pre-constructed poultry disease knowledge graph using the remaining completion triplets obtained in step 5.3; Step 7.2, monitoring an explicit satisfaction signal and an implicit satisfaction score of the user for the answer to the complex question, and, when the user is satisfied with the answer and the satisfaction score reaches a third preset threshold, triggering knowledge write-back and storing the complex question and the corresponding answer in the pre-constructed poultry disease knowledge base.
6. A question-answering system for a large language model in the poultry field based on progressive knowledge enhancement, the system comprising: a planning large language model fine-tuning module, configured to construct, based on the characteristics of complex questions in the field of poultry diseases, a structured data set oriented to triplet extraction optimization, and to fine-tune a planning large language model using the structured data set, so that each sub-question obtained when the fine-tuned planning large language model decomposes a complex question satisfies an atomicity constraint mechanism; a poultry disease knowledge base comparison module, configured to calculate the cosine similarity between the complex question input by the user and the question of each question-answer pair in a pre-constructed poultry disease knowledge base, find the maximum of all the cosine similarities, judge whether the maximum is greater than or equal to a first preset threshold, and, if so, return the answer of the question-answer pair corresponding to the maximum to the user; a complex question decomposition and sub-question matching module, configured to decompose the complex question input by the user into a plurality of sub-questions using the fine-tuned planning large language model and, for each sub-question, calculate the cosine similarity between the sub-question and the question of each question-answer pair in the pre-constructed poultry disease knowledge base, find the maximum of all the cosine similarities, and judge whether the maximum is greater than or equal to the first preset threshold; if so, the answer of the question-answer pair corresponding to the maximum and the corresponding sub-question are input into a question-answering large language model; otherwise, processing passes to the knowledge graph enhancement matching module; a knowledge graph enhancement matching module, configured to perform a matching query for the sub-question in a pre-constructed poultry disease knowledge graph; if an answer to the sub-question is found, the answer and the corresponding sub-question are input into the question-answering large language model; otherwise, processing passes to the web retrieval enhancement module; a web retrieval enhancement module, configured to obtain, by web retrieval, the answer to each sub-question whose maximum is smaller than the first preset threshold and for which no answer is found in the pre-constructed poultry disease knowledge graph, and to input the answer and the corresponding sub-question into the question-answering large language model; a fusion module, configured to perform multi-source information fusion on all the sub-questions and the answers to the sub-questions using the question-answering large language model, generate the answer to the complex question, and return it to the user; and a dynamic collaborative optimization module, configured to perform dynamic collaborative optimization of the pre-constructed poultry disease knowledge base and the pre-constructed poultry disease knowledge graph, for use in answering subsequent complex questions.
7. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to any one of claims 1 to 5.
8. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the question-answering method for a large language model in the poultry field based on progressive knowledge enhancement according to any one of claims 1 to 5.
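
Claim 3 filters web-derived completion triplets by a confidence score, but the formula itself is not reproduced in this text. The sketch below assumes, purely for illustration, that the score is a weighted sum of the three described components (domain-name score, semantic similarity, and support proportion). The linear form, the default weight values, the function names, and the input layout are all assumptions, not the patent's definition.

```python
from collections import Counter


def support_ratios(triplets):
    """Proportion of each distinct triplet among all completion triplets."""
    counts = Counter(triplets)
    total = len(triplets)
    return {triplet: count / total for triplet, count in counts.items()}


def confidence(domain_score, similarity, support, weights=(0.4, 0.3, 0.3)):
    """ASSUMED scoring rule: a weighted sum of the three components.

    The source does not give the formula; the weights here are placeholders.
    """
    w_domain, w_similarity, w_support = weights
    return w_domain * domain_score + w_similarity * similarity + w_support * support


def filter_triplets(candidates, threshold, weights=(0.4, 0.3, 0.3)):
    """Keep triplets whose confidence reaches the threshold.

    candidates: list of (triplet, domain_score, similarity), one entry per
    web page that produced the triplet. When a triplet appears on several
    pages, its best-scoring occurrence is used (also an assumption).
    """
    ratios = support_ratios([triplet for triplet, _, _ in candidates])
    best = {}
    for triplet, domain_score, similarity in candidates:
        score = confidence(domain_score, similarity, ratios[triplet], weights)
        best[triplet] = max(score, best.get(triplet, 0.0))
    return {triplet: score for triplet, score in best.items() if score >= threshold}
```

With three candidate entries, two of which agree on the same triplet, the agreed triplet receives a support proportion of 2/3 and is more likely to pass the threshold than the triplet seen only once.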

Description

Poultry field large language model question-answering method based on progressive knowledge enhancement

Technical Field

The invention relates to a question-answering method for a large language model in the poultry field based on progressive knowledge enhancement, and belongs to the technical field of natural language processing.

Background

In recent years, the production scale of the poultry industry has continued to expand, and the industry structure is shifting rapidly from dispersed production toward intensification and whole-industry-chain integration. However, against a backdrop of high production capacity and market prices under pressure, the industry faces multiple challenges, including rising feed costs, changing consumer demand, increasingly complex epidemic risks, and the need to improve fine and deep processing capability. In this process, digital and intelligent transformation has become a key consensus for reducing costs, improving efficiency, and achieving high-quality development. Meanwhile, given the excellent performance of large language models in general domains, their potential in vertical professional fields such as agriculture and animal husbandry is increasingly prominent.
However, in a knowledge-intensive and highly specialized scenario such as poultry farming, directly using a general-purpose large language model for intelligent question answering presents significant challenges. First, the model is exposed to limited poultry-domain expertise during pre-training; when facing complex questions about specific disease prevention and control, nutrition management, environmental control, and the like, it readily produces "hallucinated" answers that seem reasonable but contain factual errors, so its reliability is low. Second, existing methods based on retrieval-augmented generation (RAG) rely on simple matching against a single vector store; their capacity to handle complex questions that require multi-step reasoning and decomposition into sub-questions is insufficient, a static knowledge base struggles to cover rapidly changing production practice and epidemic dynamics, and answer efficiency is low. Finally, most systems lack an effective knowledge closed loop and self-evolution mechanism and cannot learn continuously from interaction, so their answering capability stagnates and cannot keep pace with the continuous development of domain knowledge. The prior art therefore cannot meet the urgent need for reliable, accurate, and traceable intelligent decision support as the poultry industry scales up and becomes more intelligent. A specialized intelligent question-answering scheme that deeply fuses domain knowledge, can understand and reason about complex questions, and supports closed-loop knowledge growth is urgently needed.
Disclosure of Invention

The technical problem to be solved by the invention is to provide a question-answering method for a large language model in the poultry field based on progressive knowledge enhancement, which addresses the problems that existing general-purpose large language models have low answer accuracy and insufficient reliability in question answering for professional fields such as poultry, and cannot continuously accumulate knowledge or self-optimize.

The invention adopts the following technical solution to solve the above technical problem:

A question-answering method for a large language model in the poultry field based on progressive knowledge enhancement comprises the following steps: Step 1, based on the characteristics of complex questions in the field of poultry diseases, constructing a structured data set oriented to triplet extraction optimization, and using the structured data set to fine-tune a planning large language model, so that each sub-question obtained when the fine-tuned planning large language model decomposes a complex question satisfies an atomicity constraint mechanism; Step 2, calculating the cosine similarity between the complex question input by the user and the question of each question-answer pair in a pre-constructed poultry disease knowledge base, finding the maximum of all the cosine similarities, and judging whether the maximum is greater than or equal to a first preset threshold; if so, returning the answer of the question-answer pair corresponding to the maximum to the user; otherwise, proceeding to step 3; Step 3, decomposing the complex question input by the user into a plurality of sub-questions using the fine-tuned planning large language model and, for each sub-question, calculating the cosine similarity between the sub-question and the question of each question-answer pair in the pre-constructed poultry disease knowledge base, finding the maximum of all the cosine similarities, and judging whether the maximum is greater than or equal to the first preset threshold; if
so, inputting answers of the question-answer pairs corresponding to the maxi