CN-121681787-B - Retrieval enhancement generation method and system based on multidimensional reordering
Abstract
The invention discloses a retrieval enhancement generation method and system based on multidimensional reordering, relating to the technical field of information retrieval. The method comprises the following steps: S1, constructing a tensor index, a keyword index and compressed abstracts; S2, driving a Qwen large model, through a dual-task query processing template, to perform query expansion and hypothetical answer generation; S3, performing hybrid retrieval over the semantic and keyword dual-path indexes, and performing two-stage reordering with an improved DistilBERT model and the Qwen large model; S4, performing multidimensional evaluation based on sub-question decomposition; S5, iterating through the sub-questions in order to generate the sub-question answers step by step; and S6, performing multidimensional evaluation and correction to generate a final answer. The invention overcomes the limitations of conventional retrieval enhancement generation techniques in information matching precision, context relevance evaluation and answer generation quality, and provides an efficient and accurate solution.
Inventors
- WEI HAIZHI
- QI XIAOFANG
- CHEN JIALONG
Assignees
- Kexun Jialian Information Technology Co., Ltd. (科讯嘉联信息技术有限公司)
- Southeast University (东南大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-02-11
Claims (8)
- 1. A retrieval enhancement generation method based on multidimensional reordering, characterized by comprising the following steps: S1, segmenting an original document into text segments, constructing a tensor index and a keyword index over the text segments, and extracting information from each text segment with a lightweight language model to generate a corresponding compressed abstract; S2, defining a role, a task description and input/output formats through a dual-task query processing template, driving a Qwen large model to perform query expansion and hypothetical answer generation, and converting the user's original query into an optimized query and a hypothetical perfect answer; S3, using the optimized query and the hypothetical perfect answer to generate an initial candidate set through hybrid retrieval over the semantic and keyword dual-path indexes, performing two-stage reordering with an improved DistilBERT model and the Qwen large model, and extracting a fine-ranked candidate segment set; S4, inputting the user's original query, the fine-ranked candidate segment set and the compressed abstract of each segment in that set into the Qwen large model, decomposing the user's original query into a plurality of sub-questions, performing multidimensional evaluation based on the sub-questions, outputting structured scores, and generating a final context by ranking according to the structured scores; S5, iterating through the sub-questions in order, inputting the current sub-question, the historical answers and the final context into the Qwen large model, generating answers step by step until all sub-questions have been answered, and integrating all sub-question answers into a final answer manuscript; S6, performing multidimensional evaluation and correction on the final answer manuscript with the Qwen large model to generate a final answer, a corresponding confidence score and a correction log; wherein the improved DistilBERT model comprises a feature coding layer, a late-interaction coding layer, a dynamic feature extraction layer, a convolutional-recurrent enhancement layer, a gated adaptive fusion layer and a relevance classification output layer: the feature coding layer receives the optimized query and a text segment and generates a query token tensor matrix and a document token tensor matrix through a DistilBERT encoder; the late-interaction coding layer generates an initial interaction score by computing the cosine similarity between each tensor in the query token tensor matrix and all tensors in the document token tensor matrix, taking the maximum value for each query token, and summing the maxima; the dynamic feature extraction layer receives the query token tensor matrix and the document token tensor matrix and average-pools each to generate a query average tensor and a document average tensor; the convolutional-recurrent enhancement layer extracts local features through one-dimensional convolution, models sequence dependencies with a bidirectional LSTM, and outputs a recurrent enhancement tensor; the gated adaptive fusion layer concatenates the query average tensor and the document average tensor, generates weights through a Sigmoid gating network, and weights the recurrent enhancement tensor element by element to generate a final fused representation tensor; and the relevance classification output layer concatenates the initial interaction score and the final fused representation tensor and outputs a relevance probability score through a linear transformation and Sigmoid activation.
- 2. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein step S1 comprises: S11, segmenting the original document with the LangChain recursive character text splitter according to a preset chunk size and context overlap length to obtain text segments, and assigning a globally unique segment ID to each text segment; S12, semantically embedding each text segment with a pre-trained all-MiniLM-L6-v2 model to generate a sentence tensor of a preset dimension, and storing the sentence tensors with their segment IDs in a FAISS index library to form the tensor index; S13, performing word segmentation and stop-word removal on each text segment, calculating the weight of each token with the BM25 algorithm based on the processed tokens, and constructing an inverted index keyed to the segment IDs of the corresponding text segments to form the keyword index; S14, processing each text segment with a T5-Small model fine-tuned on a summarization dataset, with a specified maximum generation length, to generate the compressed abstract, and storing the compressed abstract in association with its segment ID; S15, with the segment ID as the primary key, associating the segment ID of each text segment, its original text content, its compressed abstract and its reference information in the tensor index and the keyword index, constructing a unified data record, and storing the unified data record in a database.
- 3. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein the dual-task query processing template comprises a role definition, a task description and input/output format requirements; the role definition is an information retrieval expert; the task description comprises a query optimization task and a hypothetical answer generation task, the query optimization task comprising expanding synonyms, supplementing context and mining latent intent, and the hypothetical answer generation task comprising generating, based on the optimized query, a paragraph containing core facts and a logical structure; the input/output format is JSON; and converting the user's original query into the optimized query and the hypothetical perfect answer comprises: retrieving a synonym list based on word-vector similarity, filling in context information through coreference resolution in combination with historical queries, retrieving intent guide words through entity recognition and intent-tree matching, fusing the synonyms, context information and intent guide words to generate the optimized query text, and inputting the optimized query text into the Qwen large model for inference to generate the hypothetical perfect answer paragraph.
- 4. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein step S3 specifically comprises: S31, extracting the query text from the optimized query, semantically embedding the query text with the all-MiniLM-L6-v2 model to generate a 384-dimensional query tensor, inputting the query tensor into the FAISS index library, calculating the cosine similarity between the query tensor and all sentence tensors in the FAISS index library, and obtaining the indexes of a preset number of sentence tensors with the highest similarity, the corresponding text segments forming a first candidate set; S32, performing word segmentation and stop-word removal on the hypothetical perfect answer to obtain a keyword list, traversing each keyword in the keyword list, looking up each keyword in the inverted index to obtain the IDs of the text segments containing it, merging the text segment IDs returned for all keywords, counting the number of keyword occurrences for each ID, sorting the text segment IDs by keyword frequency, selecting a preset number of top-ranked text segment IDs, and retrieving the corresponding original text segments from the database by ID to form a second candidate set; S33, merging the first candidate set and the second candidate set into the initial candidate set, inputting each text segment in the initial candidate set together with the optimized query into the improved DistilBERT model, and outputting a relevance probability score; S34, sorting the initial candidate set by relevance probability score, selecting a preset number of highest-scoring segments to form an elite segment set, inputting the elite segment set and the compressed abstract of each segment into the Qwen large model to generate fine-ranking relevance probability scores, sorting the elite segment set by fine-ranking relevance probability score, and selecting a preset number of highest-scoring segments to form the fine-ranked candidate segment set.
- 5. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein step S4 specifically comprises: S41, constructing a context reordering instruction template comprising a role definition, a task description, input content, evaluation dimensions and output format requirements, wherein the role definition is a senior information analyst, the task description is to decompose the user query into sub-questions and to evaluate and rank the candidate segments along multiple dimensions based on the sub-questions, the input content is the user's original query, the candidate segment list and the corresponding compressed abstracts, the evaluation dimensions comprise coverage of each sub-question, and answer quality and information novelty with respect to the sub-questions, and the output format is a JSON array in which each element comprises a segment ID, a coverage score, an answer quality score and a structured score; S42, combining the user's original query, the fine-ranked candidate segment set and the compressed abstract of each segment in that set according to the format requirements of the context reordering instruction template to generate a reordering prompt; S43, inputting the reordering prompt into the Qwen large model, which performs sub-question decomposition and multidimensional evaluation and outputs a JSON array meeting the output format requirements; S44, parsing the JSON array, extracting the structured score of each fine-ranked candidate segment, sorting the fine-ranked candidate segments in descending order of structured score, and selecting a preset number of highest-scoring segments to form the final context.
- 6. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein step S5 specifically comprises: S51, constructing an iterative answer generation instruction template comprising a role definition, a task description, input content, an output format and constraints, wherein the role definition is an expert researcher, the task description is to answer the sub-questions step by step based on the context and the dialogue history, the input content is the current sub-question, the already-answered sub-questions with their answers, and the final context, the output format is structured JSON comprising the answer to the current sub-question and a flag indicating whether it is the last sub-question, and the constraints require that each answer be strictly based on the provided final context and refer to the logic of the answers already given; S52, initializing an empty answer list and taking the first sub-question from the sub-question list as the current sub-question; S53, combining the current sub-question, the answer list and the final context into a generation prompt according to the format requirements of the iterative answer generation instruction template; S54, inputting the generation prompt into the Qwen large model to obtain output JSON, parsing the JSON, extracting the answer to the current sub-question, and appending it to the answer list; S55, determining whether the current sub-question is the last one in the sub-question list and, if not, taking the next sub-question from the list as the new current sub-question and returning to S53 to continue iterating; S56, inputting the answer list into the Qwen large model and integrating it into the final answer manuscript according to logical and temporal relationships.
- 7. The retrieval enhancement generation method based on multidimensional reordering according to claim 1, wherein step S6 specifically comprises: S61, constructing a multidimensional evaluation and correction instruction template comprising a role definition, a task description, evaluation dimensions, an output format and correction requirements, wherein the role definition is a senior fact checker and logic analyst, the task description is to critically evaluate and correct the final answer manuscript and to quantify confidence, the evaluation dimensions comprise factual accuracy, logical coherence and content completeness, the output format is structured JSON comprising the corrected final answer, a list of confidence scores and a correction log, and the correction requirements comprise clearly identifying each error, providing the basis for the correction and explaining the reason for the correction; S62, combining the final answer manuscript and the corresponding final context according to the format requirements of the multidimensional evaluation and correction instruction template to generate an evaluation prompt; S63, inputting the evaluation prompt into the Qwen large model, which performs fact checking, logic checking and completeness evaluation and outputs JSON meeting the output format requirements; S64, parsing the JSON output by the Qwen large model, and extracting and outputting the corrected final answer, the confidence scores and the correction log.
- 8. A retrieval enhancement generation system based on multidimensional reordering, which performs the retrieval enhancement generation method based on multidimensional reordering according to any one of claims 1 to 7, comprising the following modules: a knowledge base construction module, configured to segment the original document into text segments, construct a tensor index and a keyword index over the text segments, and extract information from each text segment with a lightweight language model to generate a corresponding compressed abstract; a query understanding module, configured to define a role, a task description and input/output formats through the dual-task query processing template, drive the Qwen large model to perform query expansion and hypothetical answer generation, and convert the user's original query into an optimized query and a hypothetical perfect answer; an information recall module, configured to use the optimized query and the hypothetical perfect answer to generate an initial candidate set through hybrid retrieval over the semantic and keyword dual-path indexes, perform two-stage reordering with the improved DistilBERT model and the Qwen large model, and extract a fine-ranked candidate segment set; a context optimization module, configured to input the user's original query, the fine-ranked candidate segment set and the compressed abstract of each segment in that set into the Qwen large model, decompose the user's original query into a plurality of sub-questions, perform multidimensional evaluation based on the sub-questions, output structured scores, and generate a final context by ranking according to the structured scores; an answer synthesis module, configured to iterate through the sub-questions in order, input the current sub-question, the historical answers and the final context into the Qwen large model, generate answers step by step until all sub-questions have been answered, and integrate all sub-question answers into a final answer manuscript; and a result quality control module, configured to perform multidimensional evaluation and correction on the final answer manuscript with the Qwen large model and generate a final answer, a corresponding confidence score and a correction log.
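The late-interaction score recited in claim 1 (for each query token, the maximum cosine similarity against all document tokens, summed over the query tokens) can be illustrated with a minimal pure-Python sketch. The function names and the toy two-dimensional token vectors are illustrative assumptions and do not come from the patent; in the claimed model the token tensors would be produced by the DistilBERT encoder.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def late_interaction_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Sum, over query tokens, of the max cosine similarity to any document token."""
    if not doc_tokens:
        return 0.0
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token vectors standing in for encoder output (hypothetical values).
query = [(1.0, 0.0), (0.0, 1.0)]
document = [(1.0, 0.0), (1.0, 1.0)]
score = late_interaction_score(query, document)
```

Here the first query token matches a document token exactly, while the second only partially matches the diagonal document token, so the score is the sum of those two per-token maxima.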
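The keyword recall of step S32 and the candidate merge of step S33 in claim 4 can be sketched as below. This is one possible reading of the claim, not the patentee's implementation: the helper names, the tie-breaking by segment ID, and the toy segments are assumptions, and the semantic (FAISS) branch is represented only by a hard-coded first candidate list.

```python
from collections import Counter
from typing import Dict, List, Set


def build_inverted_index(segments: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each token to the set of segment IDs whose token list contains it."""
    index: Dict[str, Set[str]] = {}
    for seg_id, tokens in segments.items():
        for token in tokens:
            index.setdefault(token, set()).add(seg_id)
    return index


def keyword_recall(keywords: List[str], index: Dict[str, Set[str]], top_k: int) -> List[str]:
    """Rank segment IDs by the number of keyword hits and keep the top_k IDs."""
    hits: Counter = Counter()
    for keyword in keywords:
        for seg_id in index.get(keyword, ()):
            hits[seg_id] += 1
    # Descending hit count; ties broken by ID so the order is deterministic (assumption).
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [seg_id for seg_id, _ in ranked[:top_k]]


def merge_candidates(first: List[str], second: List[str]) -> List[str]:
    """Union of both candidate lists, de-duplicated, keeping first-seen order."""
    return list(dict.fromkeys(first + second))


# Toy tokenized segments (stop words already removed); all values are hypothetical.
segments = {
    "s1": ["hybrid", "retrieval", "index"],
    "s2": ["keyword", "index"],
    "s3": ["reranking", "model"],
}
index = build_inverted_index(segments)
second_candidates = keyword_recall(["hybrid", "index", "retrieval"], index, top_k=2)
first_candidates = ["s3", "s1"]  # stand-in for the semantic (FAISS) branch
initial_candidates = merge_candidates(first_candidates, second_candidates)
```

De-duplicating at merge time keeps a segment recalled by both branches from being scored twice by the downstream reranker.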
Description
Retrieval enhancement generation method and system based on multidimensional reordering
Technical Field
The invention relates to the technical field of information retrieval, and in particular to a retrieval enhancement generation method and system based on multidimensional reordering.
Background
With the rapid development of artificial intelligence and big data technology, intelligent question-answering systems have become a core vehicle for information acquisition and human-computer interaction. In knowledge-intensive fields such as professional consulting, scientific research assistance and enterprise knowledge management, whether a system can generate accurate and reliable answers from a massive, complex knowledge base directly determines its application value and user experience. However, when confronted with large-scale, long-text knowledge bases and complex, ambiguous user queries, existing retrieval enhancement generation methods face serious challenges in retrieval recall accuracy, depth of context understanding and quality of the final generated answer.
The main limitations of conventional retrieval enhancement generation methods are shallow information matching and blind context construction. Most existing methods rely on a single semantic or keyword retrieval path and do not understand the user's query intent deeply enough, so the recalled information segments are insufficiently relevant. Meanwhile, when constructing the context, retrieved segments are often ranked and truncated by a simple similarity score alone, without a comprehensive evaluation of the logical relationships among segments, their informational complementarity, or their coverage of the query's core questions.
When the knowledge base is highly specialized and its documents are long, this shallow matching and blind construction very easily introduces noise or misses key information, so that the model's answers exhibit factual deviations, logical breaks or hollow content, seriously affecting answer accuracy and credibility.
In addition, at the answer generation stage, conventional methods usually generate the answer in a single pass and lack a mechanism for self-checking and iteratively optimizing answer quality. The prior art generally feeds the retrieved context directly into the Qwen large model for single-round generation, provides no structured guidance for the generation process, and performs no critical evaluation or correction of the result. This makes it difficult for a system to guarantee the logical consistency and factual accuracy of answers to complex questions that require multi-step reasoning and the integration of multiple information sources. Even where some methods introduce a reordering or iteration mechanism, they cannot effectively fuse multidimensional information such as query intent, content quality and answer credibility for closed-loop optimization, and so struggle to generate efficient, accurate and reliable high-quality answers.
Therefore, how to provide a retrieval enhancement generation method and system based on multidimensional reordering is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a retrieval enhancement generation method and system based on multidimensional reordering, which significantly improve the accuracy and reliability of answers in complex question-answering scenarios by constructing a hybrid retrieval and multi-stage reordering mechanism combined with iterative generation and closed-loop evaluation.
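The iterative generation referred to above, in which sub-questions are answered in order and earlier answers are fed back into each new prompt, can be sketched as follows. This is a minimal illustration under stated assumptions: the prompt layout, the JSON field names, and the `fake_llm` stand-in are hypothetical, and a real system would call the Qwen large model in place of the stub.

```python
import json
from typing import Callable, Dict, List


def answer_sub_questions(
    sub_questions: List[str],
    final_context: str,
    llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Answer sub-questions in order, passing earlier answers into each new prompt."""
    answered: List[Dict[str, str]] = []
    for question in sub_questions:
        # Hypothetical prompt layout: current question, history, and final context.
        prompt = json.dumps(
            {
                "current_sub_question": question,
                "answered": answered,
                "final_context": final_context,
            },
            ensure_ascii=False,
        )
        reply = json.loads(llm(prompt))  # the model is expected to return JSON
        answered.append({"question": question, "answer": reply["answer"]})
    return answered


def fake_llm(prompt: str) -> str:
    """Stand-in for the large model: reports how many earlier answers it received."""
    request = json.loads(prompt)
    return json.dumps({"answer": f"seen {len(request['answered'])} earlier answers"})


history = answer_sub_questions(
    ["What is A?", "How does A affect B?"], "context text", fake_llm
)
```

Because the answer list grows inside the loop, the second prompt carries the first answer, which is what lets a later sub-question build on an earlier one.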
The method comprises: constructing tensor and keyword dual-path indexes over the original document and generating compressed abstracts; deepening the understanding of user intent through query expansion and hypothetical answer generation; performing two-stage reordering with an improved DistilBERT model and a large model to precisely screen candidate segments; constructing an optimal context based on multidimensional evaluation; and ensuring the logical rigor and factual accuracy of the final answer through iterative answer generation and multidimensional evaluation and correction. The invention overcomes the limitations of low information matching precision, blind context construction and uncontrollable answer generation quality in conventional retrieval enhancement generation technology, and provides an efficient and accurate solution for intelligent question-answering systems that must handle complex queries, long-document knowledge bases and high-precision answer generation.
According to an embodiment of the invention, the retrieval enhancement generation method based on multidimensional reordering comprises the following steps: S1, performing segmentation processing on an o