CN-121998078-A - Knowledge-intensive multi-document question-answering method and product based on multi-agent collaboration
Abstract
The invention discloses a knowledge-intensive multi-document question-answering method and product based on multi-agent collaboration. The method comprises: step 1, dynamically generating a multi-dimensional, complementary set of expert roles based on role-based multi-view agent collaboration, forming a candidate strategy set from the expert roles, and selecting an optimal strategy for integrating knowledge through a voting mechanism; step 2, constructing a closed-loop flow of reflective knowledge refinement, extracting atomic-level facts through a sliding window and a reflection mechanism, and recursively performing task decomposition and knowledge distillation to obtain essence knowledge; and step 3, performing cross-document comprehensive reasoning based on the optimal strategy from step 1 and the essence knowledge from step 2 to generate a final answer. The invention overcomes the limitation of a single role perspective, provides a high-quality knowledge base for subsequent reasoning, flexibly handles different types of knowledge-intensive cross-document tasks, efficiently processes scattered multi-source information, and has broad application prospects and market value.
Inventors
- HU MINGHAO
- TIAN YU
- GENG GUOTONG
- LUO WEI
- LUO ZHUNCHEN
- TIAN CHANGHAI
- YE YUMING
- ZHOU XIAN
Assignees
- 中国人民解放军军事科学院军事科学信息研究中心
Dates
- Publication Date
- 20260508
- Application Date
- 20251218
Claims (10)
- 1. A knowledge-intensive multi-document question-answering method based on multi-agent collaboration, comprising the following steps: Step 1, dynamically generating a multi-dimensional, complementary set of expert roles based on role-based multi-view agent collaboration, forming a candidate strategy set from the expert roles, and selecting an optimal strategy for integrating knowledge through a voting mechanism; Step 2, constructing a closed-loop flow of reflective knowledge refinement, extracting atomic-level facts through a sliding window and a reflection mechanism, and recursively performing task decomposition and knowledge distillation to obtain essence knowledge; Step 3, based on the optimal strategy from step 1 and the essence knowledge obtained in step 2, performing cross-document comprehensive reasoning and generating a final answer.
- 2. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 1, wherein step 1 includes: Step 101, constructing a general role-definition prompt template based on the task description, and guiding a large language model to dynamically generate a multi-dimensional, complementary set of expert roles, wherein each expert role is a structured agent comprising a role description, voting standards, and a task consistency weight; Step 102, generating in parallel, based on each expert role, a plurality of strategies for integrating representations of cross-document long-text information, to form a candidate strategy set; Step 103, having each expert role vote on the candidate strategy set, and, in combination with the task consistency weights, selecting the highest-ranked strategy as the optimal strategy for integrating knowledge.
- 3. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 2, wherein the prompt template of step 101 includes a target task and related document titles.
- 4. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 2, wherein in step 101, semantic constraints ensure minimal semantic overlap among the features of the expert roles, so that redundant perspectives are avoided.
- 5. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 2, wherein in step 102, the candidate strategies include at least one of knowledge graphs, comparison tables, timelines, and flowcharts that are dynamically generated based on document content.
- 6. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 1, wherein step 2 includes: Step 201, processing a document with a document parsing tool, extracting atomic-level facts by combining a sliding window with a self-reflection mechanism, generating a local summary for each sliding window, updating the task memory related to the current window, and aggregating the results to form a document abstract; Step 202, traversing the document set based on the optimal strategy, and preliminarily constructing coarse-grained cross-document knowledge; Step 203, splitting the task into a subtask set, calling a large language model to extract content related to the subtasks and further generate essence knowledge, and iterating these steps until a reflection mechanism confirms that the subtask set and the essence knowledge are in final form.
- 7. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 6, wherein the sliding window size in step 201 is a configurable parameter, and each window processes adjacent paragraphs and generates a local summary and task memory.
- 8. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 6, wherein in step 203, if the task cannot be split, the task splitting step is skipped.
- 9. The multi-agent collaboration-based knowledge-intensive multi-document question-answering method according to claim 6, wherein step 3 includes: taking the optimal strategy, the document abstract, the final subtask set, and the essence knowledge as inputs, guiding the large language model through a prompt to perform cross-document comprehensive reasoning, and generating a final answer.
- 10. A computer program product comprising computer readable instructions which, when run on a computer device, cause the computer device to perform the method of any of claims 1 to 9.
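Illustrative sketch (not part of the claims). The claims do not fix how votes and task consistency weights are combined in Step 103, nor how windows advance in Step 201, so the following Python is only one possible reading. The names ExpertRole, select_strategy, sliding_windows, the score callback (a stand-in for a large language model call), the weighted-sum aggregation, and the stride parameter are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Sequence


@dataclass(frozen=True)
class ExpertRole:
    """Hypothetical structured agent with the three fields named in Step 101."""
    description: str
    voting_standard: str
    consistency_weight: float  # task consistency weight


def select_strategy(
    roles: Sequence[ExpertRole],
    candidates: Sequence[str],
    score: Callable[[ExpertRole, str], float],
) -> str:
    """Pick the top-ranked candidate strategy (one reading of Step 103).

    `score` stands in for an LLM call that rates a candidate under a role's
    voting standard. Summing weight * score is an assumed aggregation rule.
    """
    if not candidates:
        raise ValueError("candidate strategy set is empty")
    totals = {
        c: sum(r.consistency_weight * score(r, c) for r in roles)
        for c in candidates
    }
    # max() keeps the first maximal item, so ties go to the earlier candidate.
    return max(candidates, key=lambda c: totals[c])


def sliding_windows(
    paragraphs: Sequence[str], size: int, stride: int = 1
) -> Iterator[list[str]]:
    """Yield windows of adjacent paragraphs (Step 201; size is configurable).

    The stride is an assumption; the claims only say the size is configurable.
    """
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be positive")
    for start in range(0, len(paragraphs), stride):
        yield list(paragraphs[start:start + size])
        if start + size >= len(paragraphs):
            break  # the last paragraph has been covered


# Made-up roles and scores, purely for demonstration.
legal = ExpertRole("legal analyst", "traceability of clauses", 0.6)
finance = ExpertRole("budget analyst", "ease of numeric comparison", 0.4)
demo_scores = {
    (legal.description, "knowledge graph"): 0.5,
    (legal.description, "comparison table"): 0.9,
    (legal.description, "timeline"): 0.2,
    (finance.description, "knowledge graph"): 0.9,
    (finance.description, "comparison table"): 0.6,
    (finance.description, "timeline"): 0.4,
}
best = select_strategy(
    [legal, finance],
    ["knowledge graph", "comparison table", "timeline"],
    lambda role, cand: demo_scores[(role.description, cand)],
)
windows = list(sliding_windows(["p1", "p2", "p3", "p4", "p5"], size=2, stride=2))
```

With these made-up scores the weighted totals are 0.66, 0.78, and 0.28, so best is "comparison table", and the five paragraphs are split into three windows, the last holding only "p5". In a fuller pipeline each window would be passed to the model to extract atomic-level facts and update the task memory; that part is omitted here because the claims leave its interface open.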
Description
Knowledge-intensive multi-document question-answering method and product based on multi-agent collaboration
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a knowledge-intensive multi-document question-answering method and product based on multi-agent collaboration.
Background
Because of their excellent performance in a variety of scenarios, large language models are widely expected to be applied to cross-document question-answering tasks, such as legal analysis and financial report analysis. However, lacking in-depth domain-specific knowledge, large language models still struggle to generate factually accurate content in tasks that require accurate factual answers (the so-called "hallucination" problem). To address this limitation, the retrieval-augmented generation (RAG) framework has been proposed, which enhances the ability of large language models to answer questions by incorporating external knowledge. However, conventional RAG methods process knowledge through simple text chunking. Document-level knowledge may then be too long to fit into the context window of the language model, so the model fails to take all of the knowledge into account, and critical information in longer documents tends to be overlooked. Furthermore, the information required for these tasks is often scattered across multiple documents, which makes it difficult for the model to efficiently integrate and use that information for reasoning. For example, financial budget analysis requires extracting and correlating key data from multiple historical budget files, expense reports, and related policy documents, which demands that the model integrate and compare information across documents.
However, these documents often contain a great deal of noise, making it difficult for the model to identify the relationships between pieces of information and to reason accurately over the knowledge. Many studies indicate that treating the working mechanism of large language models as analogous to human reasoning helps them cope better with complex tasks. Like humans, large language models need not be limited to direct reading when processing information; they tend to refine information into a structured form to reduce cognitive load and improve judgment accuracy. This process is typically accompanied by self-reflection and verification. Large language models likewise have reflective reasoning capability: they can identify and correct errors, simulating the deliberate judgment process of humans. Cognitive fit theory further states that humans tend to adopt different forms of knowledge organization for different types of tasks, such as using tables for statistical analysis or relying on graphs to assist long-chain reasoning. Recent studies have shown that large language models can also build diverse knowledge structures. However, these approaches rely on a single strategy, which limits their effectiveness in processing complex real-world information. Relying on one strategy makes it difficult to establish logical connections and causes cross-document information to be lost, which makes reasoning challenging, especially over fragmented content. While large language models can produce diverse knowledge structures and reflect on their own reasoning, they cannot dynamically combine multiple strategies. This limits their ability to efficiently process and integrate scattered information, an ability that is critical for more complex tasks.
In summary, the prior art has the following problems in cross-document question-answering tasks. First, because conventional retrieval-augmented generation frameworks process knowledge through simple text chunking, the knowledge is difficult to fit into the context window of a large language model, key information in long documents is easily overlooked, and related information scattered across multiple documents cannot be effectively integrated, which limits reasoning performance. Second, existing knowledge-processing methods that rely on a single strategy struggle to establish logical connections between complex pieces of information and easily lose cross-document information, so reasoning is especially difficult when fragmented content is processed. Third, although large language models can generate diverse knowledge structures and perform reflective reasoning, they cannot dynamically combine multiple strategies, so they struggle to process and integrate scattered information efficiently and to meet the requirements of complex question-answering tasks over multiple long documents.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a knowledge-intensive multi-document question-answering method and product based on multi-agent collaboration. In view of the a