
CN-122021653-A - Context semantic consistency control method, device, computer equipment and storage medium supporting user correction

CN122021653A

Abstract

The application provides a method, apparatus, computer device, and storage medium for controlling context semantic consistency with support for user correction. The method comprises: obtaining a source interaction data tree and a context view tree corresponding to a current dialogue event stream; displaying a first tree diagram and a second tree diagram in a visualization interface; determining a semantic drift score for each interaction node based on the source interaction data tree and the context view tree, and determining whether a target interaction node exists; if a target interaction node exists, highlighting it in the context view tree; and, in response to a user's processing operation on the target interaction node, updating the source interaction data tree and the context view tree, updating the first and second tree diagrams, and returning to the step of determining the semantic drift score of each interaction node until no target interaction node remains. With this method, semantic distortion of a large language model's context content after compression can be reduced.

Inventors

  • Yu Miaopi
  • Qu Zuomin
  • Lai Boyu
  • Yang Daiwei
  • Liang Zhihong
  • Hong Chao
  • Liu Haonan
  • Xiong Zhentao
  • Dai Tao
  • Yang Yifan
  • Huang Changlin
  • Du Jinran
  • Yang Chunyan
  • Deng Liting
  • Xuan Jiantong
  • Chen Zezheng

Assignees

  • China Southern Power Grid Research Institute Co., Ltd. (南方电网科学研究院有限责任公司)

Dates

Publication Date
2026-05-12
Application Date
2026-04-15

Claims (10)

  1. A method for controlling context semantic consistency with support for user correction, the method comprising: obtaining a source interaction data tree and a context view tree corresponding to a current dialogue event stream, wherein the source interaction data tree arranges the interaction nodes of the current dialogue event stream in time order, and each interaction node is laterally linked to the interaction content of its past compressed versions; displaying, in a visualization interface, a first tree diagram generated from the source interaction data tree and a second tree diagram generated from the context view tree; determining a semantic drift score for each interaction node based on the source interaction data tree and the context view tree, and determining, from the semantic drift scores, whether a target interaction node whose semantic drift level reaches a preset level exists; if a target interaction node exists, highlighting the target interaction node in the second tree diagram; and, in response to a user's processing operation on the target interaction node, updating the source interaction data tree and the context view tree so as to update the first and second tree diagrams, and returning to the step of determining the semantic drift score of each interaction node until no target interaction node remains, wherein the updated context view tree is used to assemble the context content that is input into a large language model.
  2. The method of claim 1, wherein the source interaction data tree maintains a compressed version chain for each interaction node, and wherein determining a semantic drift score for each interaction node based on the source interaction data tree and the context view tree comprises: determining, from the historical compressed version interaction content of each interaction node, an interaction content back-source reference for the node based on its compressed version chain; determining a semantic drift value for each interaction node under each semantic drift evaluation dimension based on the node's interaction content back-source reference and its latest compressed version interaction content; and taking a weighted sum of each node's semantic drift values across the semantic drift evaluation dimensions to obtain the node's semantic drift score.
  3. The method of claim 2, wherein determining the interaction content back-source reference for each interaction node based on its compressed version chain comprises: if interaction content of a back-source priority version is recorded in the node's compressed version chain, using the interaction content of that back-source priority version as the node's interaction content back-source reference; and if no back-source priority version is recorded in the compressed version chain, using the original version interaction content recorded in the chain as the node's interaction content back-source reference.
  4. The method of claim 2, wherein determining the semantic drift value of each interaction node under each semantic drift evaluation dimension comprises: performing key constraint extraction on each node's interaction content back-source reference and latest compressed version interaction content, respectively, to obtain a reference power-service key constraint set and a latest power-service key constraint set for the node; determining an interaction content similarity from the node's back-source reference and latest compressed version interaction content, and determining a power-service key constraint coverage rate, a structured field consistency deviation, and a time sequence consistency deviation from the node's reference and latest key constraint sets; and taking the interaction content similarity, power-service key constraint coverage rate, structured field consistency deviation, and time sequence consistency deviation of each node as the node's semantic drift values under the respective semantic drift evaluation dimensions.
  5. The method of claim 1, wherein the processing operations include a review trigger operation and a correction operation, and wherein updating the source interaction data tree and the context view tree in response to a user's processing operation on the target interaction node comprises: in response to the user's review trigger operation on the target interaction node, obtaining the node's interaction content back-source reference and latest compressed version interaction content, and generating comparison information for display on the visualization interface, the comparison information providing guidance when the user edits the node's interaction content; in response to the user's correction operation on the node's latest compressed version interaction content, obtaining the pre-correction content and the post-correction content, and determining a correction type from them; and updating the source interaction data tree and the context view tree according to the update policy corresponding to the correction type.
  6. The method of claim 5, wherein updating the source interaction data tree and the context view tree according to the update policy corresponding to the correction type comprises: if the correction type is an expansion correction, judging whether the token count of the corrected content exceeds the token upper limit of the target interaction node; if so, determining a number of tokens to be released from the corrected content and the token upper limit, and, in topic groups other than the one to which the target interaction node belongs, selecting unconfirmed and uncorrected interaction nodes in ascending order of importance score to form a node set to be compressed; if the total releasable token count of the node set to be compressed is less than the number of tokens to be released, supplementing the set with unconfirmed and uncorrected interaction nodes from the target node's own topic group, selected in ascending order of importance score, to obtain a first-supplemented node set; if the total releasable token count of the first-supplemented set is still less than the number of tokens to be released, supplementing with confirmed but uncorrected interaction nodes from topic groups other than the target node's, selected in ascending order of importance score, to obtain a second-supplemented node set; if still insufficient, supplementing with confirmed but uncorrected interaction nodes from the target node's own topic group to obtain a third-supplemented node set; if still insufficient, supplementing with corrected interaction nodes from topic groups other than the target node's to obtain a fourth-supplemented node set; if still insufficient, supplementing with corrected interaction nodes from the target node's own topic group to obtain a fifth-supplemented node set; if the total releasable token count of the fifth-supplemented set is greater than or equal to the number of tokens to be released, performing a back-source compression operation on the fifth-supplemented set to release the required tokens, thereby updating the source interaction data tree and the context view tree; and if the total releasable token count of the fifth-supplemented set remains less than the number of tokens to be released, generating a token budget shortage prompt and displaying it on the visualization interface, the prompt guiding the user to perform the correction operation again.
  7. The method of claim 5, wherein updating the source interaction data tree and the context view tree according to the update policy corresponding to the correction type comprises: if the correction type is a simplification correction, determining a releasable token amount from the pre-correction content and the post-correction content; selecting over-compressed nodes from the interaction nodes, wherein an over-compressed node is one whose current token count is below a preset proportion of its token upper limit, and sorting them in descending order of importance score to form a node set to be released; allocating tokens to the nodes in the set in descending order of importance score until the releasable token amount is exhausted; performing back-source expansion on each target node that receives a token allocation by tracing back along its compressed version chain and searching for at least one target version interaction content whose token count does not exceed the node's token upper limit and whose compression degree is lower than that of the current compressed version; if such content exists, taking the target version interaction content with the lowest compression degree as the node's latest compressed version interaction content, thereby updating the source interaction data tree and the context view tree; and if no such content exists, obtaining the node's previous compressed version interaction content, recompressing it with the node's token upper limit as the target length, and taking the compression result as the node's latest compressed version interaction content, thereby updating the source interaction data tree and the context view tree.
  8. A context semantic consistency control apparatus supporting user correction, the apparatus comprising: an acquisition module configured to obtain a source interaction data tree and a context view tree corresponding to a current dialogue event stream, wherein the source interaction data tree arranges the interaction nodes of the current dialogue event stream in time order and each interaction node is laterally linked to the interaction content of its past compressed versions; a visualization module configured to display, in a visualization interface, a first tree diagram generated from the source interaction data tree and a second tree diagram generated from the context view tree; a determining module configured to determine a semantic drift score for each interaction node based on the source interaction data tree and the context view tree, and to determine, from the semantic drift scores, whether a target interaction node whose semantic drift level reaches a preset level exists; a highlighting module configured to highlight the target interaction node in the second tree diagram if it exists; and a response module configured to update the source interaction data tree and the context view tree in response to a user's processing operation on the target interaction node so as to update the first and second tree diagrams, and to return to the step of determining the semantic drift score of each interaction node until no target interaction node remains, wherein the updated context view tree is used to assemble the context content that is input into a large language model.
  9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.

Description

Context semantic consistency control method, device, computer equipment and storage medium supporting user correction

Technical Field

The present application relates to the field of large language models, and in particular to a method, an apparatus, a computer device, and a storage medium for controlling context semantic consistency with support for user correction.

Background

With the widespread use of large language models in specialized fields, the models are often required to handle complex interaction scenarios involving large amounts of specialized information and long-range multi-round conversations. Because the context window is limited in token count, compression such as summarization, rewriting, or merging often has to be applied to historical interaction content to accommodate new interaction information. Commonly adopted context compression schemes usually compress automatically in a "black box" manner, simplifying historical interaction content by summarization, rewriting, merging, and the like. However, interaction data in many business scenarios carries strong logical associations, timing dependencies, or professional constraints, and such "black box" compression easily loses or distorts key information, operation sequences, or core constraints, causing semantic drift and making the output of the large language model inconsistent with earlier conclusions or established rules. The prior art therefore suffers from the problem that the context on which a large language model bases its reasoning is easily distorted.
Disclosure of Invention

Based on this, the present application aims to address at least one of the above technical drawbacks, in particular the drawback that the context content on which a large language model relies is easily distorted after compression, and provides a method, an apparatus, a computer device, and a storage medium for controlling context semantic consistency with support for user correction, which can reduce distortion of the large language model's context content after compression.

In a first aspect, the present application provides a method for controlling context semantic consistency supporting user correction, comprising: obtaining a source interaction data tree and a context view tree corresponding to a current dialogue event stream, wherein the source interaction data tree arranges the interaction nodes of the current dialogue event stream in time order and each interaction node is laterally linked to the interaction content of its past compressed versions; displaying, in a visualization interface, a first tree diagram generated from the source interaction data tree and a second tree diagram generated from the context view tree; determining a semantic drift score for each interaction node based on the source interaction data tree and the context view tree, and determining, from the semantic drift scores, whether a target interaction node whose semantic drift level reaches a preset level exists; if a target interaction node exists, highlighting it in the second tree diagram; and, in response to a user's processing operation on the target interaction node, updating the source interaction data tree and the context view tree so as to update the first and second tree diagrams, and returning to the step of determining the semantic drift score of each interaction node until no target interaction node remains, wherein the updated context view tree is used to assemble the context content that is input into the large language model.

In an exemplary embodiment, the source interaction data tree maintains a compressed version chain for each interaction node, and determining the semantic drift score for each interaction node based on the source interaction data tree and the context view tree comprises: determining, from the historical compressed version interaction content of each interaction node, an interaction content back-source reference for the node based on its compressed version chain; determining a semantic drift value for each interaction node under each semantic drift evaluation dimension based on the node's interaction content back-source reference and its latest compressed version interaction content; and taking a weighted sum of each node's semantic drift values across the semantic drift evaluation dimensions to obtain the node's semantic drift score. In an exemplary embodiment, determining the interaction content back-source reference
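The expansion-correction branch of claim 6 amounts to a fixed-priority cascade: candidate nodes are drawn from successively more "protected" pools, from unconfirmed and uncorrected nodes in other topic groups down to corrected nodes in the target's own group, until enough releasable tokens are gathered. The sketch below is one hedged reading of that selection loop; the dictionary fields `importance`, `releasable`, `confirmed`, `corrected`, and `group` are hypothetical names, not terms from the patent.

```python
def select_nodes_to_compress(nodes, target_group, tokens_needed):
    """Gather nodes to compress, following claim 6's five-stage supplement order.

    Each stage filters candidates by (confirmed, corrected, in-target-group)
    status and adds them in ascending order of importance score until the
    summed releasable token count covers tokens_needed.
    Returns (selected_nodes, enough).
    """
    stages = [
        # (confirmed, corrected, same_group_as_target); None = don't care
        (False, False, False),  # initial pool: other topic groups
        (False, False, True),   # 1st supplement: target's own group
        (True,  False, False),  # 2nd supplement: confirmed, other groups
        (True,  False, True),   # 3rd supplement: confirmed, own group
        (None,  True,  False),  # 4th supplement: corrected, other groups
        (None,  True,  True),   # 5th supplement: corrected, own group
    ]
    selected, total = [], 0
    for confirmed, corrected, same_group in stages:
        pool = [n for n in nodes
                if n not in selected
                and n["corrected"] == corrected
                and (confirmed is None or n["confirmed"] == confirmed)
                and (n["group"] == target_group) == same_group]
        for n in sorted(pool, key=lambda n: n["importance"]):
            if total >= tokens_needed:
                break
            selected.append(n)
            total += n["releasable"]
        if total >= tokens_needed:
            return selected, True
    # Claim 6: if even the fifth-supplemented set falls short, the UI raises
    # a token-budget-shortage prompt instead of compressing anything.
    return selected, False
```

When `enough` is False, the caller would display the token budget shortage prompt described in claim 6 rather than perform the back-source compression.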