CN-121996780-A - Rule retrieval method, medium and device based on knowledge graph multi-hop reasoning query
Abstract
The application discloses a rule retrieval method, medium and device based on knowledge-graph multi-hop reasoning query. The method first constructs a financial and tax knowledge graph and a vector database. After a query request is received, tax labels and tax concepts are extracted from it, and a first candidate fragment set is obtained through vector retrieval. A one-hop query is then performed in the knowledge graph from the first candidate fragment set to obtain the related tax concept nodes, which are filtered to form an extended tax concept set. A further one-hop query, starting from the extended tax concept set, retrieves additional relevant regulation fragments, which form a second candidate fragment set. Finally, the two candidate fragment sets are merged, sorted and screened according to regulation effectiveness level, release time and semantic similarity, and a structured final search result is generated. Through two directional queries and intermediate concept filtering, the method addresses incomplete regulation recall and interference from irrelevant information in multi-hop reasoning scenarios.
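The two directional one-hop queries with intermediate concept filtering can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the in-memory edge list, the node names, the function names `build_index` and `expand_candidates`, and the use of exact-match set intersection as the concept filter are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy graph: each edge links a regulation clause node to a tax concept node.
EDGES = [
    ("clause:A", "concept:small-scale taxpayer"),
    ("clause:A", "concept:invoice"),
    ("clause:B", "concept:small-scale taxpayer"),
    ("clause:C", "concept:invoice"),
]


def build_index(edges):
    """Build adjacency maps in both directions (clause -> concepts, concept -> clauses)."""
    clause_to_concepts = defaultdict(set)
    concept_to_clauses = defaultdict(set)
    for clause, concept in edges:
        clause_to_concepts[clause].add(concept)
        concept_to_clauses[concept].add(clause)
    return clause_to_concepts, concept_to_clauses


def expand_candidates(first_clauses, key_concepts, edges):
    """Return (extended concept set, second candidate set, merged candidate set)."""
    clause_to_concepts, concept_to_clauses = build_index(edges)

    # First one-hop query: clauses of the first candidate set -> directly linked concepts.
    initial_concepts = set()
    for clause in first_clauses:
        initial_concepts |= clause_to_concepts.get(clause, set())

    # Intermediate filtering (assumed here to be exact matching against
    # the key tax concepts extracted from the query).
    extended_concepts = initial_concepts & set(key_concepts)

    # Second one-hop query: each retained concept -> directly linked clauses.
    second_clauses = set()
    for concept in extended_concepts:
        second_clauses |= concept_to_clauses.get(concept, set())

    # Merge the first and second candidate sets.
    merged = set(first_clauses) | second_clauses
    return extended_concepts, second_clauses, merged


extended, second, merged = expand_candidates(
    ["clause:A"], ["concept:small-scale taxpayer"], EDGES
)
```

In this toy graph, `clause:B` is recalled through the shared concept even though it was not in the first candidate set, while `clause:C` is not recalled because the `invoice` concept was filtered out before the second hop.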
Inventors
- BAI XUEFANG
- CHEN GUN
- ZHAN DONG
- Du Bingzhu
- GAO WEI
- LIU DONGDONG
Assignees
- 福建博思软件股份有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260225
Claims (10)
- 1. A rule retrieval method based on knowledge-graph multi-hop reasoning query, characterized by comprising the following steps: constructing a financial and tax knowledge base comprising a financial and tax knowledge graph and a vector database, wherein the vector database stores document fragments that have been segmented from regulation texts and vectorized; receiving a query request input by a user, parsing the query request, and extracting tax labels and key tax concepts from it, wherein the tax labels are obtained by a large model matching the query request against a preset keyword library for the financial and tax field, and the key tax concepts are extracted by a named entity recognition model based on the large model; filtering the document fragments in the vector database based on the tax labels, calculating the semantic similarity between the query request and the filtered document fragments by hybrid retrieval, and selecting the K document fragments with the highest semantic similarity as a first candidate fragment set; performing a one-hop query in the financial and tax knowledge graph starting from the regulation clause nodes corresponding to the first candidate fragment set, obtaining the tax concept nodes directly associated with those regulation clause nodes to form an initial extended concept set, and filtering the initial extended concept set using the key tax concepts extracted from the query request to obtain a filtered extended tax concept set; taking each tax concept node in the filtered extended tax concept set as a starting point, performing a one-hop query in the financial and tax knowledge graph to obtain the regulation clause nodes directly associated with that tax concept node, obtaining the corresponding document fragments based on those regulation clause nodes to form a second candidate fragment set, and merging the first candidate fragment set and the second candidate fragment set to form a related regulation candidate set; and ranking and screening the document fragments in the related regulation candidate set based on preset multidimensional features to generate a structured final search result, wherein the multidimensional features comprise the regulation effectiveness level, the release time, and the semantic similarity between the document fragment and the query request.
- 2. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein generating the structured final search result comprises: inputting the top-ranked candidate document fragments together with the query request into a large language model, so that the large language model performs relevance judgment, redundancy removal and summary generation on the input candidate fragments according to the query request, and outputs the structured final search result.
- 3. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein constructing the financial and tax knowledge base comprises: preprocessing the full text of financial and tax regulations to obtain document fragments, wherein the preprocessing comprises text cleaning, paragraph segmentation and fragment segmentation; extracting, from the document fragments, regulation clause entities and tax concept entities by combining rules and models, and identifying a first association relation between the regulation clause entities and the tax concept entities and a second association relation between the tax concept entities and the regulation clause entities to which they belong; constructing the financial and tax knowledge graph with the regulation clause entities and the tax concept entities as nodes and the first association relation or the second association relation as edges; and vectorizing the document fragments with a pre-trained semantic vector model, and storing the vectorized document fragments and their metadata in the vector database, wherein the metadata comprise the regulation to which each fragment belongs, the effectiveness level and the release time.
- 4. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein calculating the semantic similarity between the query request and the filtered document fragments by hybrid retrieval comprises: calculating the cosine similarity between the semantic vector of the query request and the semantic vector of each filtered document fragment to obtain a first similarity score; calculating a keyword matching degree based on the frequency and position of the tax labels in the document fragment to obtain a second similarity score; and performing a weighted fusion of the first similarity score and the second similarity score according to preset weight coefficients to obtain a comprehensive semantic similarity.
- 5. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, further comprising, after the filtered extended tax concept set is obtained: determining a priority order based on the type of association relation between each tax concept node and the regulation clause nodes corresponding to the first candidate fragment set, which specifically comprises: identifying the relation edges connecting the regulation clause nodes and the tax concept nodes in the financial and tax knowledge graph; assigning a high priority to the corresponding tax concept node if the relation edge is a specific relation representing a change of the regulation over time; and assigning an ordinary priority, lower than the high priority, to the corresponding tax concept node if the relation edge is any other relation; wherein performing the one-hop query in the financial and tax knowledge graph with each tax concept node in the filtered extended tax concept set as a starting point comprises: executing the one-hop queries in the priority order, starting from the tax concept nodes with high priority.
- 6. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein the regulation clause nodes in the financial and tax knowledge graph are associated with a structured condition list, and the structured condition list describes the legal constituent conditions that must be satisfied for the regulation clause to apply; the method further comprising: when the tax labels and key tax concepts are extracted from the query request, also extracting a structured fact element list from the query request; after the related regulation candidate set is formed, performing a condition-to-fact matching calculation for each regulation clause node in the related regulation candidate set, which specifically comprises comparing the structured condition list associated with the regulation clause node with the structured fact element list extracted from the query request and calculating a matching degree score; and when the document fragments in the related regulation candidate set are ranked and screened based on the preset multidimensional features, the multidimensional features further comprise the matching degree score.
- 7. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein the regulation clause nodes in the financial and tax knowledge graph are associated with their corresponding historical version nodes and revision summary information to form a regulation evolution chain; after the structured final search result is generated, the method further comprises: for each regulation clause involved in the final search result, if the regulation clause has associated historical version nodes in the financial and tax knowledge graph, extracting a summary of the key revisions between the current version and the historical versions of the regulation clause based on the regulation evolution chain, and generating evolution prompt information and appending it under the regulation clause.
- 8. The rule retrieval method based on knowledge-graph multi-hop reasoning query according to claim 1, wherein the financial and tax knowledge graph further comprises a risk early-warning subgraph, the risk early-warning subgraph comprising risk nodes, risk clause nodes and penalty basis nodes connected by relations, wherein the risk nodes represent illegal behaviors, the risk clause nodes represent prohibitive or obligatory clauses, and the penalty basis nodes represent the corresponding penalties; when the query request is parsed, an illegal behavior or risk scenario implied in the query request is identified by a large language model and extracted as a risk keyword; the method further comprises: performing a matching query in the risk early-warning subgraph based on the risk keyword, and, if related risk clause nodes and/or penalty basis nodes are matched, obtaining the corresponding regulation text fragments, generating independent risk prompt information, and outputting the risk prompt information together with the final search result.
- 9. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the rule retrieval method based on knowledge-graph multi-hop inference query as claimed in any one of claims 1 to 8.
- 10. An electronic device, characterized by comprising a processor and a storage medium on which a computer program is stored, wherein the computer program, when executed by the processor, implements the rule retrieval method based on knowledge-graph multi-hop reasoning query as claimed in any one of claims 1 to 8.
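The hybrid similarity of claim 4 (cosine similarity fused with a keyword matching degree) can be sketched as follows. This is an illustrative sketch only: the claims do not give the keyword formula or the weight values, so the saturating frequency term, the position term, the 0.5/0.5 split inside `keyword_score`, and the default weight `w_vec=0.7` are all assumptions, as are the function names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def keyword_score(label, text):
    """Hypothetical keyword matching degree in [0, 1] from frequency and position."""
    if not label or label not in text:
        return 0.0
    # Frequency term: saturates after three occurrences (assumed cap).
    freq = min(text.count(label), 3) / 3
    # Position term: an earlier first occurrence scores higher.
    pos = 1.0 - text.find(label) / len(text)
    return 0.5 * freq + 0.5 * pos


def hybrid_score(query_vec, frag_vec, label, text, w_vec=0.7):
    """Weighted fusion of the first (vector) and second (keyword) similarity scores."""
    first = cosine(query_vec, frag_vec)
    second = keyword_score(label, text)
    return w_vec * first + (1.0 - w_vec) * second


score = hybrid_score([1.0, 0.0], [1.0, 0.0], "VAT", "VAT rate applies")
```

Keeping the two scores separate until the final fusion makes the preset weight coefficients easy to tune without touching either scoring function.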
Description
Rule retrieval method, medium and device based on knowledge graph multi-hop reasoning query
Technical Field
The application relates to the technical field of knowledge graph information retrieval, and in particular to a rule retrieval method, medium and device based on knowledge graph multi-hop reasoning query.
Background
Financial and tax regulation retrieval is a branch of information retrieval characterized by strong domain specificity, high accuracy requirements and complex multi-hop reasoning. Conventional knowledge-graph-based hybrid retrieval methods (GraphRAG) have clear shortcomings when handling multi-hop reasoning queries in the financial and tax field. One common approach finds seed nodes in the graph through vector retrieval and then builds a subgraph from their 1-hop relations; because of the hop limit, this approach cannot recall the complete set of target regulations in a multi-hop reasoning scenario. Another approach uses Text2Cypher to convert the user's question directly into a graph query language; this depends on how well the user's question is expressed and also struggles with multi-hop reasoning problems. In addition, existing methods generally do not distinguish relation types when constructing the subgraph, so a single expansion introduces a large amount of irrelevant information, which markedly increases the cost and difficulty of subsequent screening and degrades retrieval accuracy and efficiency.
Disclosure of Invention
In view of these problems, the application provides a rule retrieval method, medium and device based on knowledge graph multi-hop reasoning query, which address incomplete regulation recall and interference from irrelevant information in multi-hop reasoning scenarios and improve the accuracy and efficiency of financial and tax regulation retrieval.
To achieve the above object, in a first aspect, the present application provides a rule retrieval method based on knowledge-graph multi-hop reasoning query, the method comprising: constructing a financial and tax knowledge base comprising a financial and tax knowledge graph and a vector database, wherein the vector database stores document fragments that have been segmented from regulation texts and vectorized; receiving a query request input by a user; parsing the query request and extracting tax labels and key tax concepts from it, wherein the tax labels are obtained by a large model matching the query request against a preset keyword library for the financial and tax field, and the key tax concepts are extracted by a named entity recognition model based on the large model; filtering the document fragments in the vector database based on the tax labels, calculating the semantic similarity between the query request and the filtered document fragments by hybrid retrieval, and selecting the K document fragments with the highest semantic similarity as a first candidate fragment set; performing a one-hop query in the financial and tax knowledge graph starting from the regulation clause nodes corresponding to the first candidate fragment set, obtaining the tax concept nodes directly associated with those regulation clause nodes to form an initial extended concept set, and filtering the initial extended concept set using the key tax concepts extracted from the query request to obtain a filtered extended tax concept set; taking each tax concept node in the filtered extended tax concept set as a starting point, performing a one-hop query in the financial and tax knowledge graph to obtain the regulation clause nodes directly associated with that tax concept node, obtaining the corresponding document fragments based on those regulation clause nodes to form a second candidate fragment set, and merging the first candidate fragment set and the second candidate fragment set to form a related regulation candidate set; and ranking and screening the document fragments in the related regulation candidate set based on preset multidimensional features to generate a structured final search result, wherein the multidimensional features comprise the regulation effectiveness level, the release time, and the semantic similarity between the document fragment and the query request.
Further, generating the structured final search result comprises: inputting the top-ranked candidate document fragments together with the query request into a large language model, so that the large language model performs relevance judgment, redundancy removal and summary generation on the input candidate fragments according to the query request, and outputs the structured final search result.
Further, constructing the financial and tax knowledge base comprises: preprocessing the full text of financial and tax regulations to obtain document fragments, wherein the preprocessing comprises text cleaning, paragraph segmentation and fragment segmentation; Base