
CN-117992584-B - Large text model hallucination mitigation method, device, equipment, and medium

CN117992584B

Abstract

The invention relates to a method, device, equipment, and medium for mitigating hallucination in a large text model. The method comprises: acquiring updated document data; preprocessing the updated document data; extracting knowledge triplets from the preprocessed document data and storing them in a graph database; acquiring a Query input by a user and performing entity recognition on the Query; searching the graph database to retrieve the corresponding entity nodes and the content associated with those entities; splicing the retrieved entity nodes and associated content into a Prompt; and splicing the Prompt with the Query and taking the splice as the input of the large text model to generate a response result. By selectively interpreting the retrieved information with the reasoning capability of the large text model, the invention outputs factually correct content and thereby effectively mitigates the hallucination problem of large text models.
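The retrieval-augmented flow summarized above can be sketched in Python. This is a minimal illustration, not the patented implementation: the graph database is mocked as a dictionary, a plain substring match stands in for Bert + CRF entity recognition, and names such as `recognize_entities`, `retrieve`, and `build_prompt` are invented for illustration.

```python
# Minimal sketch of the pipeline: entity recognition -> graph lookup ->
# Prompt assembly -> splice with the Query. The graph database and the
# large text model are stand-ins; all names here are illustrative.

# A toy "graph database": entity -> facts associated with its node.
GRAPH = {
    "Eiffel Tower": ["Eiffel Tower -- located_in -> Paris",
                     "Eiffel Tower -- height -> 330 m"],
}

def recognize_entities(query: str) -> list[str]:
    # Stand-in for Bert + CRF entity recognition: simple substring match.
    return [e for e in GRAPH if e.lower() in query.lower()]

def retrieve(entities: list[str]) -> list[str]:
    # Stand-in for a one-hop graph search over the entity nodes.
    facts = []
    for e in entities:
        facts.extend(GRAPH.get(e, []))
    return facts

def build_prompt(facts: list[str]) -> str:
    # Splice the retrieved node contents into a Prompt.
    return "Known facts:\n" + "\n".join(facts)

def answer(query: str) -> str:
    prompt = build_prompt(retrieve(recognize_entities(query)))
    model_input = prompt + "\n\nQuestion: " + query   # Prompt + Query splice
    # A real system would feed model_input to the large text model here.
    return model_input

out = answer("How tall is the Eiffel Tower?")
```

Grounding the model's input in facts retrieved from the graph, rather than in its parameters alone, is what lets updated knowledge reach the model without retraining.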

Inventors

  • TANG BO
  • WANG YINING
  • LIU SHENGPING
  • LIANG JIAEN
  • HUANG WEI

Assignees

  • Unisound (Shanghai) Intelligent Technology Co., Ltd.
  • Unisound (Hangzhou) Intelligent Technology Co., Ltd.

Dates

Publication Date
2026-05-12
Application Date
2023-12-23

Claims (5)

  1. A method for mitigating hallucination in a large text model, comprising: acquiring updated document data; preprocessing the updated document data; extracting knowledge triplets from the preprocessed document data; storing the knowledge triplets in a graph database; acquiring a Query input by a user and performing entity recognition on the Query; searching the graph database and retrieving the corresponding entity nodes and the content associated with the entities; splicing the retrieved entity nodes with the associated content to obtain a Prompt, and splicing the Prompt with the Query; and taking the splice of the Prompt and the Query as the input of the large text model to generate a response result.
     Extracting the knowledge triplets from the preprocessed document data comprises: taking the preprocessed document data as an input text set T = {t_1, t_2, …, t_n}; encoding each element t_i and mapping it into a hidden state space H, h_i = Encoder(t_i), h_i ∈ H; and identifying and extracting the entities in the text data, and the relationships between them, with the information extraction algorithm GPlinker to obtain the knowledge triplets (head entity, relation, tail entity).
     Storing the knowledge triplets in the graph database comprises: creating a graph space in the graph database Nebula, defining a schema for the knowledge triplets, and writing the knowledge triplets into the graph database.
     Acquiring the Query input by the user and performing entity recognition on the Query comprises: representing the Query as q = {w_1, w_2, …, w_m}; converting it into a hidden state space through the pre-trained language model Bert, h_t = Bert(w_t); and computing over the hidden state space with a CRF algorithm to convert the state of each position into a probability distribution over the BIEO tags, where B is an entity beginning, I an entity middle, E an entity end, and O a non-entity, via
     P(y | q) = (1/Z) · exp( Σ_t s(y_t) + Σ_t T(y_{t-1}, y_t) ),
     where s(y_t) is the unnormalized score of the predicted target tag (i.e. the pre-softmax emission score), T(y_{t-1}, y_t) is the transition probability between adjacent words, and Z is the normalization factor; each state is then assigned its label B, I, E, or O.
     Searching the graph database for the corresponding entity node and the content associated with the entity comprises: querying with an nGQL statement of the form MATCH (event: { clause }) RETURN <hop> (eventi) to obtain the one-hop or multi-hop result of the entity node, where clause is the query condition, eventi the final query result, and hop the hop count, one hop denoting a single hop and multi-hop denoting multiple hops; the retrieved result follows from the chosen hop count.
     Taking the splice of the Prompt and the Query as the input of the large text model and generating the response result comprises: formally representing the splice as x = [Prompt; Query]; encoding x with the large text model to obtain a hidden layer representation, where h_t^(l) denotes the output of layer l of the model at time step t and the final hidden layer output is h_t^(L); and computing, at each time step, the generation probability over the vocabulary
     P(y_t) = softmax(W · h_t^(L) + b),
     where P(y_t) is the probability distribution over the words in the vocabulary and W and b are trained parameters; the word with the highest probability in the distribution is taken as the response generated at the current time step.
  2. The method for mitigating hallucination in a large text model according to claim 1, wherein preprocessing the updated document data comprises: performing data cleaning, de-duplication, and sensitive-information filtering on the updated document data.
  3. A large text model hallucination mitigation device employing the method according to any one of claims 1-2, comprising: an acquisition module for acquiring updated document data; a preprocessing module for preprocessing the updated document data; an extraction module for extracting knowledge triplets from the preprocessed document data; a storage module for storing the knowledge triplets in a graph database; an entity recognition module for acquiring a Query input by a user and performing entity recognition on the Query; a search module for searching the graph database and retrieving the corresponding entity nodes and the content associated with the entities; a splicing module for splicing the retrieved entity nodes with the associated content to obtain a Prompt and splicing the Prompt with the Query; and an input module for taking the splice of the Prompt and the Query as the input of the large text model to generate a response result.
  4. An electronic device, comprising a processor and a memory, wherein the processor is configured to perform the large text model hallucination mitigation method according to any one of claims 1-2 by invoking a program or instructions stored in the memory.
  5. A computer-readable storage medium storing a program or instructions that cause a computer to perform the large text model hallucination mitigation method according to any one of claims 1-2.
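Claim 1's CRF step ends with one BIEO tag per token; recovering entity spans from such a tag sequence is a standard decoding routine, sketched below. Tag meanings follow the claim (B = entity beginning, I = entity middle, E = entity end, O = non-entity); the code itself is illustrative, not the patent's.

```python
# Recover entity spans from a B/I/E/O tag sequence, as produced by the
# Bert + CRF step of claim 1. Standard span decoding; toy data.

def decode_bieo(tokens: list[str], tags: list[str]) -> list[str]:
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i                       # open a new entity span
        elif tag == "E" and start is not None:
            entities.append(" ".join(tokens[start:i + 1]))
            start = None                    # close the span
        elif tag == "O":
            start = None                    # a stray tag resets the span
    return entities

tokens = ["Who", "built", "the", "Eiffel", "Tower", "?"]
tags   = ["O",   "O",     "O",   "B",      "E",     "O"]
spans = decode_bieo(tokens, tags)
```

The decoded spans are the entity mentions that the method then looks up as nodes in the graph database.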

Description

Large text model hallucination mitigation method, device, equipment, and medium

Technical Field

The invention relates to the technical field of model hallucination mitigation, and in particular to a method, device, equipment, and medium for mitigating hallucination in a large text model.

Background

A triplet is a structured representation in a knowledge graph, used to organize and present knowledge about the real world. It represents knowledge in the form of a graph, where entities are the nodes and relationships are the edges. Large text models are currently the strongest available technique: they model objective world knowledge with a large-scale Transformer decoder structure and realize a variety of text understanding and text generation tasks, including question answering, through text generation. Training a large text model is expensive, however, so injecting the latest knowledge by retraining is impractical. How to update knowledge efficiently, so that the large text model does not hallucinate and produce false or outdated information, is therefore the main problem to be solved.

Disclosure of Invention

The invention provides a method, device, equipment, and medium for mitigating hallucination in a large text model, which solve the above technical problems.
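The background's notion of a triplet, with entities as nodes and relationships as edges, can be made concrete with a small example. The data and names below are illustrative only:

```python
# A knowledge triplet is (head entity, relation, tail entity); heads and
# tails become graph nodes, relations become edges. Illustrative data.
from collections import namedtuple

Triplet = namedtuple("Triplet", ["head", "relation", "tail"])

triples = [
    Triplet("Marie Curie", "born_in", "Warsaw"),
    Triplet("Marie Curie", "field", "physics"),
]

# Group outgoing edges by head node, as a graph database would.
graph = {}
for t in triples:
    graph.setdefault(t.head, []).append((t.relation, t.tail))
```

A lookup on a node then returns all of its outgoing (relation, tail) pairs, which is exactly the "content associated with the entity" that the method retrieves.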
In a first aspect, an embodiment of the present invention provides a large text model hallucination mitigation method, including: acquiring updated document data; preprocessing the updated document data; extracting knowledge triplets from the preprocessed document data; storing the knowledge triplets in a graph database; acquiring a Query input by a user and performing entity recognition on the Query; searching the graph database and retrieving the corresponding entity nodes and the content associated with the entities; splicing the retrieved entity nodes with the associated content to obtain a Prompt, and splicing the Prompt with the Query; and taking the splice of the Prompt and the Query as the input of the large text model to generate a response result. Further, in the above method, preprocessing the updated document data includes performing data cleaning, de-duplication, and sensitive-information filtering on the updated document data. Further, in the above method, extracting the knowledge triplets from the preprocessed document data includes: taking the preprocessed document data as an input text set, encoding the elements of the text set and mapping them into a hidden state space, and identifying and extracting the entities in the text data, and the relationships between them, with the information extraction algorithm GPlinker to obtain the knowledge triplets. Further, in the above method, storing the knowledge triplets in the graph database includes: creating a graph space in the graph database Nebula, defining a schema for the knowledge triplets, and writing the knowledge triplets into the graph database.
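Storing triplets in Nebula as described above can be sketched as nGQL statements. The statement shapes follow Nebula Graph's general DDL/DML style and are an assumption, not text from the patent; a real system would execute them through a Nebula client session rather than merely build strings.

```python
# Sketch of a Nebula Graph schema plus triplet inserts, composed as nGQL
# strings so the schema shape is visible. Names (space, tag, edge) are
# illustrative; syntax follows Nebula's general style, not the patent.

def schema_statements(space: str) -> list[str]:
    return [
        f"CREATE SPACE IF NOT EXISTS {space} (vid_type = FIXED_STRING(64));",
        f"USE {space};",
        "CREATE TAG IF NOT EXISTS entity(name string);",
        "CREATE EDGE IF NOT EXISTS relation(name string);",
    ]

def insert_triplet(head: str, rel: str, tail: str) -> list[str]:
    # One vertex per entity, one edge per relation between them.
    return [
        f'INSERT VERTEX entity(name) VALUES "{head}":("{head}");',
        f'INSERT VERTEX entity(name) VALUES "{tail}":("{tail}");',
        f'INSERT EDGE relation(name) VALUES "{head}"->"{tail}":("{rel}");',
    ]

stmts = schema_statements("kg") + insert_triplet("Beijing", "capital_of", "China")
```

Defining the schema once and streaming triplet inserts afterwards is what makes knowledge updates cheap compared with retraining the model.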
Further, in the above method, acquiring the Query input by the user and performing entity recognition on the Query comprises: converting the Query into a hidden state space through the pre-trained language model Bert, and computing over the hidden state space with a CRF algorithm to convert the state of each position into probabilities of entity beginning, entity middle, entity end, and non-entity. Further, in the above method, searching the graph database and retrieving the corresponding entity nodes and the content associated with the entities includes: querying the graph database with nGQL statements to obtain the one-hop or multi-hop result of the entity node. Further, in the above method, taking the splice of the Prompt and the Query as the input of the large text model and generating the response result includes: formally representing the splice of the Prompt and the Query; encoding the formal representation with the large text model to obtain a hidden layer representation; outputting the hidden layer representation according to the encoding structure of the large text model; and computing, from the hidden layer representation, the generation probability over the vocabulary at each time step, the word with the highest probability in the distribution being taken as the response generated at the current time step. In a second aspect, an embodiment of the present invention further provides a large text model hallucination mitigation apparatus, including: an acquisition module for acquiring updated document
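The generation step described above, projecting the hidden state with trained parameters W and b, applying a softmax over the vocabulary, and emitting the argmax word, can be sketched with toy sizes. The numbers below are illustrative, not actual model parameters.

```python
# One greedy decoding step: logits_i = W_i . h + b_i, softmax over the
# vocabulary, emit the word with the highest probability. Pure Python,
# toy vocabulary and dimensions for illustration.
import math

VOCAB = ["Paris", "London", "Rome"]
W = [[2.0, 0.5], [0.1, 0.3], [0.4, 0.2]]   # |V| x d projection matrix
b = [0.0, 0.1, -0.2]                        # per-word bias

def softmax(scores):
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)                            # normalization factor
    return [e / z for e in exps]

def decode_step(h):
    logits = [sum(wi * hi for wi, hi in zip(row, h)) + bi
              for row, bi in zip(W, b)]
    probs = softmax(logits)
    return VOCAB[probs.index(max(probs))], probs

word, probs = decode_step([1.0, 0.5])        # toy final hidden state
```

Repeating this step per time position, feeding each emitted word back in, yields the full response; the argmax choice is what the claim calls taking the highest-probability word as the result of the current moment.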