CN-121996737-A - Dialogue data storage method and related device based on large language model


Abstract

The application discloses a dialogue data storage method based on a large language model and a related device, relating to the field of data storage. The method comprises: calling a memory encoder to acquire dialogue data between a user and an intelligent agent; extracting key information from the dialogue data using a large language model; and performing integrity verification and rationality verification on the key information. According to an importance evaluation value, the key information is stored in a long-term memory database, a short-term memory database, or a working memory database. The application thereby achieves reliable storage of dialogue data based on a large language model.

Inventors

  • LIU PENG
  • CUI XIUYUAN
  • YAO BIN

Assignees

  • 青岛巨商汇网络科技有限公司

Dates

Publication Date
2026-05-08
Application Date
2025-12-31

Claims (10)

  1. A large language model-based dialogue data storage method, comprising: calling a memory encoder to acquire dialogue data between a user and an intelligent agent, wherein the dialogue data is natural-language dialogue data; performing a key information extraction operation on the dialogue data using a large language model to obtain key information; performing integrity verification and rationality verification on the key information; if the key information passes the integrity verification and the rationality verification, performing an importance evaluation on the key information to obtain an importance evaluation value; if the importance evaluation value is greater than a preset first importance evaluation value, storing the key information in a long-term memory database; if the importance evaluation value is less than a preset second importance evaluation value, storing the key information in a working memory database; and if the importance evaluation value is neither less than the preset second importance evaluation value nor greater than the preset first importance evaluation value, storing the key information in a short-term memory database.
  2. The large language model-based dialogue data storage method as claimed in claim 1, further comprising, after obtaining the importance evaluation value: if the integrity verification and the rationality verification are not passed, using an enhanced prompt to guide the large language model to re-perform the key information extraction operation on the dialogue data, and returning to the step of performing the integrity verification and the rationality verification; and if the re-extracted key information still fails the integrity verification and the rationality verification, supplementing the incomplete key information in the re-extracted key information, deleting the unreasonable key information from the re-extracted key information, and performing the importance evaluation on the key information after the supplementing and deleting operations are completed, to obtain the importance evaluation value.
  3. The large language model-based dialogue data storage method of claim 1, wherein the integrity verification and rationality verification of the key information comprise: converting the key information into preset structured information, checking whether the subject information and semantic content in the structured information are complete, and checking whether the structured information contains time conflicts or data distortion.
  4. The large language model-based dialogue data storage method of claim 1, wherein the importance evaluation process comprises: acquiring the context persistence, repeat-trigger frequency, and time decay factor of the key information, wherein the context persistence represents the degree to which the key information is referenced across multiple dialogue rounds, the repeat-trigger frequency represents the probability that the key information is repeatedly triggered, and the time decay factor represents the trend of the key information's importance over time; and performing the importance evaluation on the context persistence, the repeat-trigger frequency, and the time decay factor using a preset evaluation model to obtain the importance evaluation value.
  5. The large language model-based dialogue data storage method as claimed in claim 2, wherein supplementing the incomplete key information in the re-extracted key information, deleting the unreasonable key information from the re-extracted key information, and performing the importance evaluation on the key information after the supplementing and deleting operations are completed to obtain the importance evaluation value includes: identifying missing semantic elements in the incomplete key information using a large language model, and supplementing the missing semantic elements with reference to the original dialogue data to generate supplemented key information; determining, using a large language model, unreasonable key information that semantically conflicts with the original dialogue data or attributes content to the wrong subject, and removing the unreasonable key information from the key information; and performing the importance evaluation on the key information after the supplementing and deleting operations are completed, to obtain the importance evaluation value.
  6. The large language model-based dialogue data storage method as claimed in claim 1, further comprising: calling a forgetting unit to acquire the key information in all databases and calculate a forgetting score for the key information, wherein the forgetting unit encapsulates a preset forgetting algorithm, and the forgetting score, computed by the preset forgetting algorithm, represents the importance of the key information in the databases; if the forgetting score of the key information is greater than a first forgetting threshold and less than a second forgetting threshold, generating summary information of the key information using a large language model and replacing the key information with the summary information; if the forgetting score of the key information is not less than the second forgetting threshold, deleting the key information from the database; and if the forgetting score of the key information is not greater than the first forgetting threshold, retaining the key information.
  7. The large language model-based dialogue data storage method according to claim 1, wherein performing the key information extraction operation on the dialogue data using the large language model to obtain the key information comprises: calling a large language model to segment the dialogue data, and performing semantic understanding and intention recognition on the segmented dialogue data to obtain core semantic content; and performing de-duplication and semantic merging on the core semantic content to obtain the key information.
  8. A computer program product comprising computer-readable instructions which, when run on an electronic device, cause the electronic device to implement the large language model-based dialogue data storage method of any one of claims 1 to 7.
  9. An electronic device comprising at least one processor and a memory coupled to the processor, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to implement the large language model-based dialogue data storage method of any one of claims 1 to 7.
  10. A computer storage medium carrying one or more computer programs which, when executed by an electronic device, cause the electronic device to implement the large language model-based dialogue data storage method of any one of claims 1 to 7.
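The tiered forgetting mechanism of claim 6 can be sketched as follows. This is a minimal illustration only: the claim leaves the forgetting algorithm itself unspecified, so the score is taken as a precomputed field, and the record layout and the `summarize` callback (standing in for the large language model's summary generation) are assumptions.

```python
def apply_forgetting(records, first_threshold, second_threshold, summarize):
    """Route each record by its forgetting score:
    retain when score <= first_threshold,
    summarize when first_threshold < score < second_threshold,
    delete when score >= second_threshold.
    """
    kept = []
    for record in records:
        score = record["forgetting_score"]
        if score >= second_threshold:
            continue  # delete the record from the database
        if score > first_threshold:
            # replace the full text with an LLM-generated summary
            record = dict(record, text=summarize(record["text"]))
        kept.append(record)  # retain (possibly summarized)
    return kept
```

For example, with thresholds 0.3 and 0.8, a record scoring 0.2 is kept verbatim, one scoring 0.5 is replaced by its summary, and one scoring 0.9 is dropped.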

Description

Dialogue data storage method and related device based on large language model

Technical Field

The application relates to the technical field of data storage, and in particular to a dialogue data storage method and a related device based on a large language model.

Background

With the development of large language models and intelligent agent technologies, dialogues with intelligent agents are widely used in intelligent customer service, intelligent assistants, office automation, and other scenarios, and a large amount of natural-language dialogue data is continuously generated during these dialogues. In the prior art, dialogue data is usually stored as simple log records or as a whole, and the stored dialogue data is prone to reliability problems. Therefore, a dialogue data storage method based on a large language model is needed to improve the reliability of dialogue data management.

Disclosure of Invention

In view of the above problems, the present application provides a dialogue data storage method based on a large language model and a related device, so as to achieve reliable storage of dialogue data.
The specific scheme is as follows. A first aspect of the present application provides a dialogue data storage method based on a large language model, comprising: calling a memory encoder to acquire dialogue data between a user and an intelligent agent, wherein the dialogue data is natural-language dialogue data; performing a key information extraction operation on the dialogue data using a large language model to obtain key information; performing integrity verification and rationality verification on the key information; if the integrity verification and the rationality verification are passed, performing an importance evaluation on the key information to obtain an importance evaluation value; if the importance evaluation value is greater than a preset first importance evaluation value, storing the key information in a long-term memory database; if the importance evaluation value is less than a preset second importance evaluation value, storing the key information in a working memory database; and if the importance evaluation value is neither less than the preset second importance evaluation value nor greater than the preset first importance evaluation value, storing the key information in a short-term memory database.
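The three-tier routing just described can be sketched as follows, together with one assumed form of the importance evaluation: a weighted sum of context persistence, repeat-trigger frequency, and an exponential time-decay factor. The weights, the half-life, and the exponential form are illustrative assumptions; the application only specifies a preset evaluation model over those three factors.

```python
import math

def importance_score(persistence, trigger_freq, age_hours, half_life=24.0):
    """Assumed weighted-sum model over the three factors, each in [0, 1]."""
    decay = math.exp(-age_hours * math.log(2) / half_life)  # time decay factor
    return 0.5 * persistence + 0.3 * trigger_freq + 0.2 * decay

def route(score, second_threshold, first_threshold):
    """Pick the target store from the two preset importance thresholds."""
    if score > first_threshold:
        return "long_term"   # above the first importance evaluation value
    if score < second_threshold:
        return "working"     # below the second importance evaluation value
    return "short_term"      # otherwise
```

With thresholds of 0.3 and 0.7, a score of 0.9 routes to the long-term memory database, 0.1 to the working memory database, and 0.5 to the short-term memory database.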
Optionally, after obtaining the importance evaluation value, the method further includes: if the integrity verification and the rationality verification are not passed, using an enhanced prompt to guide the large language model to re-perform the key information extraction operation on the dialogue data, and returning to the step of performing the integrity verification and the rationality verification; and if the re-extracted key information still fails the integrity verification and the rationality verification, supplementing the incomplete key information in the re-extracted key information, deleting the unreasonable key information from the re-extracted key information, and performing the importance evaluation on the key information after the supplementing and deleting operations are completed, to obtain the importance evaluation value. Optionally, the integrity verification and rationality verification of the key information include: converting the key information into preset structured information, checking whether the subject information and semantic content in the structured information are complete, and checking whether the structured information contains time conflicts or data distortion.
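The verification step just described can be illustrated with a small structured record. The field names and the concrete checks (non-empty subject and content for integrity; end time not earlier than start time for rationality) are assumptions for illustration, since the application does not fix the structured format.

```python
from dataclasses import dataclass

@dataclass
class KeyInfo:
    subject: str       # the main entity the extracted fact is about
    content: str       # the semantic content of the fact
    start_time: float  # event start, epoch seconds
    end_time: float    # event end, epoch seconds

def verify(info: KeyInfo) -> tuple[bool, bool]:
    """Return (integrity_ok, rationality_ok) for one structured record."""
    # integrity: subject information and semantic content must be present
    integrity_ok = bool(info.subject.strip()) and bool(info.content.strip())
    # rationality: reject time conflicts (event ends before it starts)
    rationality_ok = info.end_time >= info.start_time
    return integrity_ok, rationality_ok
```

A record failing either check would be routed back through re-extraction with an enhanced prompt, per the optional step above.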
Optionally, the importance evaluation process includes: acquiring the context persistence, repeat-trigger frequency, and time decay factor of the key information, wherein the context persistence represents the degree to which the key information is referenced across multiple dialogue rounds, the repeat-trigger frequency represents the probability that the key information is repeatedly triggered, and the time decay factor represents the trend of the key information's importance over time; and performing the importance evaluation on the context persistence, the repeat-trigger frequency, and the time decay factor using a preset evaluation model to obtain the importance evaluation value. Optionally, supplementing the incomplete key information in the re-extracted key information, deleting the unreasonable key information from the re-extracted key information, and performing the importance evaluation on the key information after the supplementing and deleting operations are completed to obtain the importance evaluation value includes: identifying missing semantic elements in incomplete