CN-122019711-A - Knowledge recall method, device, equipment and storage medium based on document type
Abstract
In response to a user's input question, a target retrieval strategy corresponding to the document type of the document to be retrieved is used to retrieve, from a knowledge base, first knowledge contents related to the question. The first knowledge contents are screened against a preset similarity to obtain second knowledge contents, namely the first knowledge contents whose similarity is greater than the preset similarity. Mixed reordering is then performed on the first knowledge contents based on the semantic relevance between the second knowledge contents and the user's question, yielding a reordered knowledge list. When the matching degree of the reordered knowledge list does not satisfy a preset condition, a large model performs secondary semantic matching between the user's input question and the second knowledge contents, and a final recall knowledge list is determined based on the matching result. The invention improves the accuracy of knowledge recall.
Inventors
- ZHAO YUNTAO
- LUO ZIHAN
- HUANG CHUAN
- REN SIYU
- HUANG DAN
Assignees
- 吉旗(成都)科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-20
Claims (10)
- 1. A document type based knowledge recall method, comprising: in response to a user's input question, retrieving, from a knowledge base, each first knowledge content related to the user's input question by using a target retrieval strategy corresponding to the document type of a document to be retrieved; screening the first knowledge contents based on a preset similarity to obtain each second knowledge content, being a first knowledge content whose similarity is greater than the preset similarity, and performing mixed reordering on the first knowledge contents based on the semantic relevance between each second knowledge content and the user's input question to obtain a reordered knowledge list; when the matching degree of the reordered knowledge list does not satisfy a preset condition, performing secondary semantic matching between the user's input question and each second knowledge content by means of a large model; and determining a final recall knowledge list based on the matching result.
- 2. The document type-based knowledge recall method of claim 1, wherein the document types comprise a first type, being unstructured documents, and a second type, being structured question-answer pairs, and wherein, before retrieving each first knowledge content related to the user's input question from the knowledge base by using the target retrieval strategy corresponding to the document type of the document to be retrieved, the method further comprises: determining, from the format, structure or content characteristics of the document to be retrieved, that a document to be retrieved which conforms to a preset question-answer format is a structured question-answer pair of the second type; and determining that a document to be retrieved which does not conform to the preset question-answer format is an unstructured document of the first type.
- 3. The document type-based knowledge recall method of claim 2, wherein, before retrieving each first knowledge content related to the user's input question from the knowledge base by using the target retrieval strategy corresponding to the document type of the document to be retrieved, the method further comprises: for unstructured documents of the first type, performing text parsing and semantic slicing to form a plurality of document slices, and generating a corresponding vectorized representation for each document slice to construct a first index; and for structured question-answer pairs of the second type, determining a plurality of independent processing units based on the integrity of each question-answer pair, and generating a corresponding vectorized representation for the question portion of each processing unit to construct a second index.
- 4. The document type-based knowledge recall method of claim 1, wherein performing mixed reordering on the first knowledge contents based on the semantic relevance between each second knowledge content and the user's input question to obtain a reordered knowledge list comprises: calculating a first semantic relevance between the user's input question and each retrieved question-answer pair, and calculating a second semantic relevance between the user's input question and each document slice; sorting the question-answer pairs uniformly by the first semantic relevance to obtain a sorted question-answer pair list; sorting the document slices uniformly by the second semantic relevance to obtain a sorted document slice list; and determining the reordered knowledge list based on the sorted question-answer pair list and the sorted document slice list.
- 5. The document type-based knowledge recall method of claim 4, wherein, before performing secondary semantic matching between the user's input question and each second knowledge content by means of a large model when the matching degree of the reordered knowledge list does not satisfy the preset condition, the method further comprises: configuring a first relevance threshold for unstructured documents of the first type and a second relevance threshold for structured question-answer pairs of the second type; screening, from the sorted document slice list, knowledge content that satisfies the preset condition according to the first relevance threshold; and screening, from the sorted question-answer pair list, knowledge content that satisfies the preset condition according to the second relevance threshold.
- 6. The document type-based knowledge recall method of claim 5, wherein performing secondary semantic matching between the user's input question and each second knowledge content by means of a large model comprises: determining candidate knowledge content in the reordered knowledge list whose relevance to the user's input question is above a minimum benchmark but below the first relevance threshold or the second relevance threshold; inputting the candidate knowledge content into the large model and guiding the large model, through a preset prompt, to judge the semantic relevance between the candidate knowledge content and the user's input question, so as to obtain associated candidate knowledge content; and determining the associated candidate knowledge content as the matching result.
- 7. A document type based knowledge recall device, comprising: a differential retrieval module, configured to retrieve, in response to a user's input question, each first knowledge content related to the user's input question from a knowledge base by using a target retrieval strategy corresponding to the document type of a document to be retrieved; a mixed reordering module, configured to screen the first knowledge contents based on a preset similarity to obtain each second knowledge content, being a first knowledge content whose similarity is greater than the preset similarity, and to perform mixed reordering on the first knowledge contents based on the semantic relevance between each second knowledge content and the user's input question to obtain a reordered knowledge list; a depth judging module, configured to perform secondary semantic matching between the user's input question and each second knowledge content by means of a large model when the matching degree of the reordered knowledge list does not satisfy a preset condition; and a recall module, configured to determine a final recall knowledge list based on the matching result.
- 8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the document type based knowledge recall method of any one of claims 1 to 6 when the computer program is executed.
- 9. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the document type based knowledge recall method of any one of claims 1 to 6.
- 10. A computer program product comprising a computer program which, when executed by a processor, implements the document type based knowledge recall method of any one of claims 1 to 6.
Description
Knowledge recall method, device, equipment and storage medium based on document type
Technical Field
The invention relates to the technical field of artificial intelligence and natural language processing, and in particular to a knowledge recall method, device, equipment and storage medium based on document type.
Background
A knowledge base serves as a plug-in capability of an intelligent agent and is one of its information sources when answering user questions. When proprietary industry or enterprise knowledge is involved, information from the knowledge base must be supplied as input for answering the question; otherwise the agent falls back on information synthesized by the model itself and may answer inaccurately or make things up. At present, because the document types and structures in a knowledge base are diverse, semantic retrieval precision varies across them during knowledge recall, which reduces recall accuracy.
Disclosure of Invention
To address these defects in the prior art, the invention provides a knowledge recall method, device, equipment and storage medium based on document type that improves recall accuracy.
In a first aspect, the present invention provides a document type based knowledge recall method comprising the following steps: in response to a user's input question, retrieving, from a knowledge base, each first knowledge content related to the user's input question by using a target retrieval strategy corresponding to the document type of a document to be retrieved; screening the first knowledge contents based on a preset similarity to obtain each second knowledge content, being a first knowledge content whose similarity is greater than the preset similarity, and performing mixed reordering on the first knowledge contents based on the semantic relevance between each second knowledge content and the user's input question to obtain a reordered knowledge list; when the matching degree of the reordered knowledge list does not satisfy a preset condition, performing secondary semantic matching between the user's input question and each second knowledge content by means of a large model; and determining a final recall knowledge list based on the matching result.
According to the document type based knowledge recall method provided by the invention, the document types comprise a first type, being unstructured documents, and a second type, being structured question-answer pairs. Before retrieving each first knowledge content related to the user's input question from the knowledge base by using the target retrieval strategy corresponding to the document type of the document to be retrieved, the method further comprises: determining, from the format, structure or content characteristics of the document to be retrieved, that a document to be retrieved which conforms to a preset question-answer format is a structured question-answer pair of the second type; and determining that a document to be retrieved which does not conform to the preset question-answer format is an unstructured document of the first type.
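The steps of the first aspect and the document type determination can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed by the invention. The names `Candidate`, `classify_document`, `recall`, `accept_threshold` and `llm_is_relevant` are hypothetical. The "Q:/A:" pattern stands in for the unspecified preset question-answer format. The preset condition on the matching degree is assumed to be "the top-ranked item reaches `accept_threshold`". The large model is represented by a caller-supplied function. For simplicity the sketch reorders only the screened (second) knowledge contents by a precomputed score.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str
    doc_type: str  # "qa_pair" or "unstructured"
    score: float   # precomputed similarity to the question (assumed to lie in [0, 1])


# Assumed stand-in for the preset question-answer format: a "Q:" part followed by an "A:" part.
QA_PATTERN = re.compile(r"^\s*Q[:：].+?\n\s*A[:：].+", re.DOTALL)


def classify_document(text: str) -> str:
    """Return "qa_pair" if the text conforms to the assumed format, else "unstructured"."""
    return "qa_pair" if QA_PATTERN.match(text) else "unstructured"


def recall(
    question: str,
    first_contents: List[Candidate],
    preset_similarity: float,
    accept_threshold: float,
    llm_is_relevant: Callable[[str, str], bool],
) -> List[Candidate]:
    # Screening: keep the contents whose similarity exceeds the preset similarity.
    second_contents = [c for c in first_contents if c.score > preset_similarity]

    # Reordering: highest score first (a simplification of the mixed reordering step).
    reordered = sorted(second_contents, key=lambda c: c.score, reverse=True)

    # Assumed preset condition: the best item must reach accept_threshold.
    if reordered and reordered[0].score >= accept_threshold:
        return [c for c in reordered if c.score >= accept_threshold]

    # Otherwise, secondary semantic matching: the model judges each screened content.
    return [c for c in reordered if llm_is_relevant(question, c.text)]
```

In this sketch the model is consulted only when no screened item clears `accept_threshold`, so confident matches are returned without a model call.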
According to the document type based knowledge recall method provided by the invention, before retrieving each first knowledge content related to the user's input question from the knowledge base by using the target retrieval strategy corresponding to the document type of the document to be retrieved, the method further comprises: for unstructured documents of the first type, performing text parsing and semantic slicing to form a plurality of document slices, and generating a corresponding vectorized representation for each document slice to construct a first index; and for structured question-answer pairs of the second type, determining a plurality of independent processing units based on the integrity of each question-answer pair, and generating a corresponding vectorized representation for the question portion of each processing unit to construct a second index.
According to the document type based knowledge recall method provided by the invention, performing mixed reordering on the first knowledge contents based on the semantic relevance between each second knowledge content and the user's input question to obtain a reordered knowledge list comprises: calculating a first semantic relevance between the user's input question and each retrieved question-answer pair, and calculating a second semantic relevance between the user's input question and each document slice; sorting the question-answer pairs uniformly by the first semantic relevance to obtain a sorted question-answer pair list; sorting the document slices uniformly by the second semantic relevance to form a sorted document slice l