CN-121998058-A - Method, device, equipment and medium for solving problem based on context map
Abstract
The invention relates to the technical field of artificial intelligence and provides a context-map-based problem solving method, device, equipment and medium. The context map addresses the problems of non-traceability and insufficient interpretability, and construction cost is effectively reduced by establishing a mapping between the context map and a vector index based on a similar entity edge-increment updating mechanism and a conflict resolution mechanism. A seed set is generated from the similarity retrieval result of a user query in the vector index together with the joint index system; sub-graph expansion and pruning are performed in the context map with the seed set as a starting point; and a fusion score of the graph ranking score and the vector similarity score of each context segment node is calculated to extract a target context segment node. This improves retrieval efficiency and accuracy, so that an accurate solution to the problem can be generated.
Inventors
- XU CHENGJIN
- JIANG XUHUI
- ZHOU HAO
- SUN YUANLIANG
- CHEN MINGZHEN
Assignees
- 数创弧光(深圳)科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260127
Claims (10)
- 1. A context-map-based problem solving method, the method comprising: obtaining unstructured text, segmenting the unstructured text into a plurality of context segments, and constructing a target structured mapping according to the plurality of context segments; constructing a context map according to the target structured mapping, and constructing a target vector index according to the target structured mapping; establishing a mapping between the context map and the target vector index based on a similar entity edge-increment updating mechanism and a conflict resolution mechanism to obtain a joint index system; in response to a problem solving instruction triggered by a target user query, performing similarity retrieval in the target vector index according to the target user query, and generating a target seed set according to a retrieval result and the joint index system; performing sub-graph expansion and pruning in the context map with the target seed set as a starting point to obtain a candidate subgraph; calculating a fusion score of a graph ranking score and a vector similarity score of each context segment node in the candidate subgraph, and extracting a target context segment node from the candidate subgraph according to the fusion score; taking the context segment corresponding to the target context segment node as a constraint context, and combining the constraint context with the target user query to obtain a target prompt; and obtaining feedback information from a large language model based on the target prompt to obtain target solution data for the target user query.
- 2. The context-map-based problem solving method according to claim 1, wherein constructing the target structured mapping according to the plurality of context segments comprises: generating a context segment identifier and positioning meta-information for each context segment; extracting the entity set included in each context segment, and extracting relations among entities in the entity set as triple facts using a structured extraction component; binding the context segment identifier of each context segment to the corresponding entity set and triple facts to obtain an initial structured mapping; standardizing entity names and fact texts in the initial structured mapping; performing a hash calculation on each standardized entity name to obtain an entity identifier, and performing a hash calculation on each standardized fact text to obtain a fact identifier; merging identical entities across context segments in the initial structured mapping according to the entity identifiers, and merging identical facts across context segments in the initial structured mapping according to the fact identifiers, to obtain the target structured mapping; wherein the number of references of each entity and the number of occurrences of each fact are recorded in the target structured mapping.
- 3. The context-map-based problem solving method according to claim 2, wherein constructing the context map according to the target structured mapping and constructing the target vector index according to the target structured mapping comprises: extracting the context segment identifier from the target structured mapping as a unique key, and extracting the context text and the positioning meta-information as attributes, to construct the context segment node; extracting the entity identifier from the target structured mapping as a unique key, and extracting the entity name, the entity type and the entity attributes as attributes, to construct an entity node; constructing a mention relation edge, wherein the mention relation edge connects the context segment node and the entity node, and the edge attribute weight of the mention relation edge is configured according to the number of mentions or the extraction confidence of the corresponding entity; constructing a fact relation edge, wherein the fact relation edge connects two entity nodes according to the triple facts, and the edge attributes of the fact relation edge comprise the fact identifier, the fact text, and a weight configured according to the number of occurrences or the accumulated confidence of the fact; generating the context map according to the context segment nodes, the entity nodes, the mention relation edges and the fact relation edges; generating a context segment vector according to the context text, generating an entity vector according to the entity name, and generating a fact vector according to the fact text; and calling a vector retrieval library to build vector indexes for the context segment vector, the entity vector and the fact vector respectively, to obtain the target vector index.
- 4. The context-map-based problem solving method according to claim 3, wherein establishing the mapping between the context map and the target vector index based on the similar entity edge-increment updating mechanism and the conflict resolution mechanism to obtain the joint index system comprises: performing a neighbor search over the vector index of the entity vectors to calculate the vector similarity between entity vectors, selecting entity vectors whose vector similarity is higher than a similarity threshold to form entity pairs, and creating similarity relation edges between the entity pairs in the context map to update the context map, wherein the attribute of each similarity relation edge is the vector similarity between the corresponding entity vectors; and/or, upon detecting an update instruction triggered by a context segment to be processed, obtaining update data of the context segment to be processed and incrementally updating the context map according to the update data; and/or identifying whether entity attribute conflicts and fact conflicts across context segments exist in the context map, and eliminating the entity attribute conflicts and the fact conflicts in the context map based on a conflict resolution strategy to update the context map; and establishing a mapping relation between the nodes and edges of the context map and the target vector index to obtain the joint index system.
- 5. The context-map-based problem solving method according to claim 4, wherein performing similarity retrieval in the target vector index according to the target user query and generating the target seed set according to the retrieval result and the joint index system comprises: vectorizing the target user query to obtain a query vector; calculating, in the target vector index, the similarity between the query vector and the context segment vectors, between the query vector and the entity vectors, and between the query vector and the fact vectors, and recalling candidate objects from the target vector index according to the calculation results; constructing a candidate seed set according to the candidate objects and the joint index system, comprising: when a candidate object is a candidate context segment vector, obtaining the candidate mention relation edges corresponding to the candidate context segment vector according to the joint index system, extracting the entity nodes corresponding to the candidate context segment vector according to the candidate mention relation edges, and constructing the candidate seed set from the extracted entity nodes; and/or, when a candidate object is a candidate entity vector, obtaining the entity node corresponding to the candidate entity vector according to the joint index system and constructing the candidate seed set; and/or, when a candidate object is a candidate fact vector, obtaining the candidate fact relation edge corresponding to the candidate fact vector according to the joint index system, extracting the head entity node and the tail entity node corresponding to the candidate fact vector according to the candidate fact relation edge, and constructing the candidate seed set from the extracted head and tail entity nodes; and introducing a lightweight ranking model to re-rank the seeds in the candidate seed set, and filtering out low-confidence seeds according to the re-ranking result to obtain the target seed set.
- 6. The context-map-based problem solving method according to claim 5, wherein performing sub-graph expansion and pruning in the context map with the target seed set as a starting point to obtain the candidate subgraph comprises: taking each seed in the target seed set as a starting point, performing multi-hop expansion along the mention relation edges, the fact relation edges and the similarity relation edges according to a preset step length to obtain an expanded subgraph, wherein the preset step length is dynamically adjusted according to query complexity; retaining the neighbor nodes of edges whose weight is greater than a pruning parameter in the expanded subgraph to obtain an intermediate subgraph, wherein the pruning parameter is adaptively adjusted according to query relevance and a retrieval confidence signal; calculating the similarity between the query vector and each node in the intermediate subgraph; and filtering out, from the intermediate subgraph, nodes whose similarity is less than a preset threshold to obtain the candidate subgraph.
- 7. The context-map-based problem solving method according to claim 5, wherein calculating the fusion score of the graph ranking score and the vector similarity score of each context segment node in the candidate subgraph comprises: calculating an importance score of each context segment node in the candidate subgraph through multiple rounds of random walk, and taking the importance score as the graph ranking score of that context segment node; calculating the similarity score between the query vector and each context segment node as the vector similarity score of that context segment node; and fusing the graph ranking score and the vector similarity score of each context segment node according to a preset fusion weight to obtain the fusion score of that context segment node.
- 8. A context-map-based problem solving device, comprising: a building unit, configured to obtain unstructured text, segment the unstructured text into a plurality of context segments, and construct a target structured mapping according to the plurality of context segments; the building unit being further configured to construct a context map according to the target structured mapping, and construct a target vector index according to the target structured mapping; a mapping unit, configured to establish a mapping between the context map and the target vector index based on a similar entity edge-increment updating mechanism and a conflict resolution mechanism to obtain a joint index system; a generation unit, configured to, in response to a problem solving instruction triggered by a target user query, perform similarity retrieval in the target vector index according to the target user query, and generate a target seed set according to a retrieval result and the joint index system; a processing unit, configured to perform sub-graph expansion and pruning in the context map with the target seed set as a starting point to obtain a candidate subgraph; an extraction unit, configured to calculate a fusion score of a graph ranking score and a vector similarity score of each context segment node in the candidate subgraph, and extract a target context segment node from the candidate subgraph according to the fusion score; a combination unit, configured to take the context segment corresponding to the target context segment node as a constraint context and combine the constraint context with the target user query to obtain a target prompt; and an acquisition unit, configured to obtain feedback information from a large language model based on the target prompt to obtain target solution data for the target user query.
- 9. A computer device, the computer device comprising: a memory storing at least one instruction, and a processor executing the instructions stored in the memory to implement the context-map based problem solving method according to any of claims 1 to 7.
- 10. A computer readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement a context-map based problem solving method according to any of claims 1 to 7.
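Illustrative sketch (not part of the claims): the standardization, hash identification and cross-segment merging steps recited in claim 2 might look roughly like the following. The normalization rule (whitespace collapsing and lowercasing), the use of SHA-256, the 16-character truncation and all function names are assumptions made for illustration; the claims do not specify them.

```python
import hashlib
from collections import Counter

def normalize(text: str) -> str:
    # Assumed standardization: trim, collapse whitespace, lowercase.
    return " ".join(text.split()).lower()

def make_id(text: str, prefix: str) -> str:
    # Hash the standardized text; SHA-256 and the truncation are illustrative choices.
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    return f"{prefix}_{digest[:16]}"

def merge_segments(segments):
    """segments: iterable of (segment_id, entity_names, fact_texts).

    Returns the per-segment bindings plus the reference count of each
    entity and the occurrence count of each fact across all segments.
    """
    entity_refs = Counter()
    fact_counts = Counter()
    bindings = {}
    for segment_id, entity_names, fact_texts in segments:
        entity_ids = [make_id(name, "ent") for name in entity_names]
        fact_ids = [make_id(fact, "fact") for fact in fact_texts]
        # Identical identifiers from different segments fall into the same counter key,
        # which is what merges the same entity or fact across segments.
        entity_refs.update(entity_ids)
        fact_counts.update(fact_ids)
        bindings[segment_id] = {"entities": entity_ids, "facts": fact_ids}
    return bindings, entity_refs, fact_counts
```

Because identifiers are derived from standardized text, surface variants such as "Acme  Corp" and "acme corp" collapse to one entity under this assumed rule; a real system would likely need a richer standardization step.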
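Illustrative sketch (not part of the claims): the expansion, pruning and similarity filtering of claim 6 can be approximated with a bounded breadth-first traversal. As a simplification, this sketch prunes low-weight edges during expansion rather than in a separate pass, treats the hop limit and thresholds as fixed inputs rather than dynamically adjusted values, and assumes node-to-query similarities are precomputed. The adjacency-list layout and all names are hypothetical.

```python
from collections import deque

def expand_and_prune(graph, seeds, max_hops, min_weight):
    """graph: dict mapping node -> list of (neighbor, edge_weight).

    Returns a dict of reached node -> hop distance from the nearest seed,
    following only edges whose weight exceeds min_weight.
    """
    visited = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        hops = visited[node]
        if hops >= max_hops:
            continue  # hop budget exhausted for this branch
        for neighbor, weight in graph.get(node, []):
            if weight > min_weight and neighbor not in visited:
                visited[neighbor] = hops + 1
                queue.append(neighbor)
    return visited

def filter_by_similarity(nodes, similarity, threshold):
    # Drop nodes whose (assumed precomputed) similarity to the query is below the threshold.
    return [n for n in nodes if similarity.get(n, 0.0) >= threshold]
```

For example, with seeds taken from the target seed set, `filter_by_similarity(expand_and_prune(graph, seeds, 2, 0.5), sims, 0.3)` yields the node list of a candidate subgraph under these assumptions.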
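Illustrative sketch (not part of the claims): claim 7 fuses a random-walk importance score with a vector similarity score using a preset weight. The sketch below substitutes a deterministic power-iteration form of random walk with restart for the "multiple rounds of random walk", and uses a plain weighted sum as the fusion rule; both choices, the restart probability, the round count and all names are assumptions. Every node, including each neighbor, must appear as a key of `graph`, and `seeds` must be a non-empty subset of those keys.

```python
def random_walk_scores(graph, seeds, restart=0.15, rounds=20):
    """graph: dict mapping node -> list of neighbor nodes."""
    nodes = list(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(rounds):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbors = graph[n]
            if not neighbors:
                # Dangling node: hand its remaining mass back to the seeds.
                for m in nodes:
                    nxt[m] += (1 - restart) * scores[n] * start[m]
                continue
            share = (1 - restart) * scores[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        scores = nxt
    return scores

def fuse(graph_scores, sim_scores, segment_nodes, weight=0.5):
    # Weighted sum of the graph ranking score and the vector similarity score.
    return {
        n: weight * graph_scores[n] + (1 - weight) * sim_scores[n]
        for n in segment_nodes
    }

def top_segments(fused, k):
    # Context segment nodes with the highest fusion scores.
    return sorted(fused, key=fused.get, reverse=True)[:k]
```

In practice the two scores live on different scales, so some normalization before fusion would probably be needed; the sketch omits it to keep the weighted sum visible.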
Description
Method, device, equipment and medium for solving problem based on context map
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a context-map-based problem solving method, device, equipment and medium.
Background
In recent years, large language models (LLMs) have shown strong capabilities in tasks such as text generation, automatic summarization and dialogue systems, but they remain limited with respect to domain-specific knowledge and complex factual information. On the one hand, the knowledge parameterized in a model is difficult to update in time and rarely covers the latest or private-domain knowledge; on the other hand, a model may "hallucinate" during generation, that is, produce inaccurate or even fabricated content. In scenarios with high requirements for accuracy and reliability, these problems significantly affect the usability and reliability of the system.
The prior art mainly adopts knowledge graphs and retrieval-augmented generation (RAG) to address these problems, but still faces the following challenges:
First, a traditional knowledge graph is represented mainly as triples and often lacks an explicit mechanism for binding entities and relations to the context in which they occur (such as the source paragraph, sentence or fragment). In the question-answering stage, structured knowledge is therefore difficult to align precisely with the original contextual evidence, which impairs the traceability and interpretability of answers.
Second, entity and relation extraction, graph construction, and update maintenance over large volumes of unstructured text are costly, making it difficult to meet the engineering requirements of high-throughput library construction and continuous evolution.
Third, traditional RAG recalls candidates mainly by vector similarity. As the library grows, results that are similar but irrelevant, or relevant but not recalled, become common. For multi-hop questions that require combining multiple facts across segments, similarity retrieval alone can hardly capture the structural associations among texts explicitly, which leads to incomplete answers or missing key linking evidence.
Fourth, the retrieval stage must balance fast recall against cross-segment associative reasoning. Relying solely on text similarity retrieval tends to miss implicit associations, while relying solely on graph-structure reasoning may incur high computational cost and noise diffusion. Without a mechanism for unified modeling, joint ranking and control of graph-structure signals and vector-similarity signals, it is difficult in engineering practice to satisfy both fast recall and complex-question reasoning.
Therefore, a context-map-driven scheme is needed that can be constructed efficiently from unstructured content and that provides contextual evidence backtracking and structured association capabilities in the question-answering stage, so as to improve the accuracy, interpretability and engineering scalability of retrieval-augmented question-answering systems.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a context-map-based problem solving method, device, equipment and medium that address the non-traceability, insufficient interpretability, high cost, low efficiency and low accuracy of existing solutions.
A context-map-based problem solving method, comprising: obtaining unstructured text, segmenting the unstructured text into a plurality of context segments, and constructing a target structured mapping according to the plurality of context segments; constructing a context map according to the target structured mapping, and constructing a target vector index according to the target structured mapping; establishing a mapping between the context map and the target vector index based on a similar entity edge-increment updating mechanism and a conflict resolution mechanism to obtain a joint index system; in response to a problem solving instruction triggered by a target user query, performing similarity retrieval in the target vector index according to the target user query, and generating a target seed set according to a retrieval result and the joint index system; performing sub-graph expansion and pruning in the context map with the target seed set as a starting point to obtain a candidate subgraph; calculating a fusion score of a graph ranking score and a vector similarity score of each context segment node in the candidate subgraph, and extracting a target context segment node from the candidate subgraph according to the fusion score; taking the context segment corresponding to the target
context segmen