
CN-121996997-A - Large-model fine-grained hallucination detection and correction method and product


Abstract

The invention discloses a large-model fine-grained hallucination detection and correction method and product. Reference information is constructed from the title and description information of a reference document; entities in the reference information are extracted with a large language model, the encoded representation of each entity is obtained, and a reference-information entity set is constructed. Entities in the content to be detected are likewise extracted with a large language model, their encoded representations are obtained, and a content entity set is constructed. A support score is then computed between each entity in the content entity set and the entities in the reference-information entity set; the support scores determine whether each content entity is supported by the references, and suspicious entities are located. Hallucination-correction prompts are constructed from the located suspicious entities, the text-understanding capability of the large language model is used to judge whether each suspicious entity involves hallucinated content, any such content is corrected with the smallest possible change, and the corrected content is output.

Inventors

  • Hu Minghao
  • Liu Shilong
  • Wang Fang
  • Geng Guotong
  • Bai Xiaoying
  • Luo Wei
  • Luo Zhunchen
  • Tian Changhai
  • Hu Wenpeng

Assignees

  • Military Science Information Research Center, Academy of Military Sciences of the Chinese People's Liberation Army (中国人民解放军军事科学院军事科学信息研究中心)

Dates

Publication Date
2026-05-08
Application Date
2025-12-29

Claims (7)

  1. A large-model fine-grained hallucination detection and correction method, comprising: step 1, constructing reference information from the title and description information of a reference document, extracting entities in the reference information through a large language model, obtaining the encoded representation of each entity, and constructing a reference-information entity set; step 2, extracting entities in the content to be detected through a large language model, obtaining the encoded representation of each entity, and constructing a content entity set; step 3, for each entity in the content entity set, calculating a support score between that entity and the entities in the reference-information entity set, determining from the support score whether the entity is supported by the reference document, and locating suspicious entities; and step 4, constructing a hallucination-correction prompt based on the located suspicious entities, judging whether each suspicious entity involves hallucinated content by using the text-understanding capability of the large language model, correcting any hallucinated content with the smallest possible change, and outputting the corrected content.
  2. The large-model fine-grained hallucination detection and correction method according to claim 1, wherein step 1 comprises: step 101, constructing reference information from the title and description information of the reference document given by the user; step 102, extracting entities from the reference document through a large language model using a preset entity-extraction prompt, wherein the large language model is the Qwen-32B model; step 103, de-duplicating the extracted entities to construct the reference-information entity set; and step 104, encoding the entities in the reference-information entity set through a pre-trained large language model to obtain hidden-layer representations of the entity text and constructing a reference-information encoding set, wherein the pre-trained model is a Sentence-BERT model.
  3. The large-model fine-grained hallucination detection and correction method according to claim 1, wherein step 2 comprises: step 201, locating whitespace positions and establishing a character-to-word index mapping when the content to be detected is English, and performing word segmentation with a bidirectional maximum matching algorithm when the content to be detected is Chinese; step 202, splitting the content to be detected into clauses according to punctuation and extracting entities from each clause through a large language model using the preset entity-extraction prompt, wherein the large language model is the Qwen-32B model; step 203, de-duplicating and validity-checking the extracted entities to construct the content entity set; and step 204, encoding the entities in the content entity set through a pre-trained large language model to obtain hidden-layer representations of the entity text and constructing a content encoding set, wherein the pre-trained model is a Sentence-BERT model.
  4. The method according to claim 1, wherein the support score of step 3 is a weighted average of an entailment score and a BM25 score, both weights being preset to 0.5.
  5. The large-model fine-grained hallucination detection and correction method according to claim 4, wherein step 3 further comprises performing hard matching between the entities in the content entity set and the entities in the reference-information entity set, or determining whether the support score exceeds a preset threshold, to locate the suspicious entities.
  6. The large-model fine-grained hallucination detection and correction method according to claim 1, wherein step 4 comprises: step 401, for each sentence of the content to be detected that contains a suspicious entity, combining the original sentence and the suspicious entity to form a hallucination-correction prompt; judging through a large language model, according to the hallucination-correction prompt, whether the sentence contains a hallucination; if so, correcting the sentence with the smallest possible change and outputting the corrected sentence; if not, marking that no correction is needed, wherein the large language model is the Qwen-32B model; and step 402, replacing the corresponding sentences in the original content with the corrected sentences to form the final corrected, hallucination-free content.
  7. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
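
The four claimed steps compose into a single detect-then-correct pipeline. The Python sketch below is a minimal illustration of that flow under stated assumptions, not the patented implementation: every function name is a hypothetical placeholder, and the helpers it calls are sketched in the blocks that follow.

```python
# Hypothetical orchestration of the four claimed steps. Each helper is a
# placeholder for the operations detailed in claims 2-6.

def detect_and_correct(reference_docs, content, entailment_fn, bm25_fn):
    # Step 1 (claim 2): reference-information entity set and encodings.
    ref_entities, ref_encodings = build_reference_entities(reference_docs)

    # Step 2 (claim 3): entity set for the content under test (the
    # encodings would feed the entailment scorer in a fuller version).
    content_entities, content_encodings = extract_content_entities(content)

    # Step 3 (claims 4-5): locate entities not supported by the
    # references; entailment_fn and bm25_fn are placeholder scorers.
    suspicious = locate_suspicious_entities(
        content_entities, ref_entities, entailment_fn, bm25_fn)

    # Step 4 (claim 6): judge and minimally rewrite only the sentences
    # that contain suspicious entities, then return the corrected text.
    return correct_hallucinations(content, suspicious)
```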
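Claims 2 and 3 build the reference and content entity sets the same way: prompt an LLM to extract entities, de-duplicate, and encode with a Sentence-BERT model. A minimal sketch, assuming a hypothetical llm_complete() wrapper for the Qwen-32B calls, an illustrative sentence-transformers checkpoint (the claims require only "a Sentence-BERT model"), and assumed prompt wording and validity check; claim 3's English index mapping and Chinese bidirectional-maximum-matching segmentation are omitted for brevity.

```python
import re
from sentence_transformers import SentenceTransformer

# llm_complete is a hypothetical wrapper around the Qwen-32B model named
# in claims 2 and 3; the prompt wording below is an assumption.
ENTITY_PROMPT = ("List every named entity (person, place, organization, "
                 "date, number, domain term) in the text below, one per "
                 "line:\n\n{text}")

# Illustrative checkpoint; the claims only require a Sentence-BERT model.
_encoder = SentenceTransformer("all-MiniLM-L6-v2")

def _extract_entities(text):
    raw = llm_complete(ENTITY_PROMPT.format(text=text))
    return [line.strip() for line in raw.splitlines() if line.strip()]

def build_reference_entities(reference_docs):
    # Steps 101-104: title + description -> entities -> dedup -> encodings.
    reference_text = "\n".join(
        f"{doc['title']}\n{doc['description']}" for doc in reference_docs)
    entities = list(dict.fromkeys(_extract_entities(reference_text)))
    return entities, _encoder.encode(entities, normalize_embeddings=True)

def extract_content_entities(content):
    # Step 202: clause splitting on sentence-final punctuation.
    sentences = [s for s in re.split(r"(?<=[。！？.!?])\s*", content) if s]
    entities = [e for s in sentences for e in _extract_entities(s)]
    # Step 203: de-duplication plus a trivial validity check (assumed).
    entities = [e for e in dict.fromkeys(entities) if len(e) > 1]
    # Step 204: Sentence-BERT encodings, as for the reference set.
    return entities, _encoder.encode(entities, normalize_embeddings=True)
```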
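Claims 4 and 5 define step 3: the support score is an equally weighted average of an entailment score and a BM25 score, and an entity is suspicious unless it hard-matches a reference entity or its best support score clears a preset threshold. In the sketch below both component scores are assumed pre-normalized to [0, 1], the scorers are placeholder callables, and the threshold value 0.5 is illustrative only.

```python
def support_score(entailment_score, bm25_score, w=0.5):
    # Claim 4: equal-weight average. How the entailment probability is
    # obtained and how BM25 is normalized into [0, 1] is not specified
    # by the claim, so both inputs are assumed pre-normalized.
    return w * entailment_score + (1.0 - w) * bm25_score

def locate_suspicious_entities(content_entities, ref_entities,
                               entailment_fn, bm25_fn, threshold=0.5):
    # Claim 5: an entity is supported if it hard-matches a reference
    # entity or if its best support score against the reference set
    # clears a preset threshold. entailment_fn and bm25_fn are
    # placeholder scorers returning values in [0, 1].
    ref_set = set(ref_entities)
    suspicious = []
    for entity in content_entities:
        if entity in ref_set:  # hard match: supported
            continue
        best = max((support_score(entailment_fn(entity, ref),
                                  bm25_fn(entity, ref))
                    for ref in ref_entities), default=0.0)
        if best < threshold:
            suspicious.append(entity)
    return suspicious
```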
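Claim 6 turns each sentence containing a suspicious entity into a hallucination-correction prompt and splices minimally corrected sentences back into the original text. The prompt wording and the NO_CORRECTION_NEEDED sentinel below are assumptions, as is the llm_complete() wrapper for the Qwen-32B model.

```python
import re

# Assumed prompt; the claim only requires combining the original
# sentence with its suspicious entities (step 401).
CORRECTION_PROMPT = (
    "Sentence: {sentence}\n"
    "Possibly unsupported entities: {entities}\n"
    "If the sentence states hallucinated facts about these entities, "
    "rewrite it with the smallest possible change; otherwise answer "
    "exactly NO_CORRECTION_NEEDED.")

def correct_hallucinations(content, suspicious):
    # Step 401: one prompt per sentence that contains a suspicious entity.
    corrected = content
    for sentence in [s for s in re.split(r"(?<=[。！？.!?])\s*", content) if s]:
        suspects = [e for e in suspicious if e in sentence]
        if not suspects:
            continue
        reply = llm_complete(CORRECTION_PROMPT.format(
            sentence=sentence, entities=", ".join(suspects))).strip()
        # Step 402: splice the corrected sentence back into the text.
        if reply != "NO_CORRECTION_NEEDED":
            corrected = corrected.replace(sentence, reply, 1)
    return corrected
```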

Description

Large-model fine-grained hallucination detection and correction method and product

Technical Field

The invention belongs to the field of detecting and correcting hallucinations in domain-specific knowledge question answering, and in particular relates to a large-model fine-grained hallucination detection and correction method and product.

Background

With the rapid development of natural language processing (NLP), large language models (LLMs) have made significant progress across a variety of tasks. However, unlike conventional task-specific systems, LLMs are trained on large-scale online text; while this gives them excellent language-generation capabilities, the biases in that data are inevitably absorbed. Faced with ambiguous or misleading cues, they may deviate from sound reasoning and even "invent" details to maintain contextual coherence. The resulting "hallucinations" are generated content that is unrelated to the given context or contradicts facts. More importantly, most LLMs lack a built-in mechanism for tracing output back to external evidence, which weakens the reliability of their output in practical applications and is a major obstacle to safe and dependable deployment.

For hallucination detection, the prior art focuses mainly on two classes of methods. The first class verifies generated content by retrieving external facts (e.g., from Wikipedia or a knowledge graph) to determine whether it contains hallucinations. Such methods typically first identify information elements (e.g., entities or concepts) in the generated content, use these elements to generate questions, use the questions to retrieve external knowledge, and then use that knowledge to fact-check the generated content. Although these methods can detect hallucinations to some extent, they perform only presence detection based on the degree of matching of information elements, lacking semantic judgment of the elements' context, which makes detection one-sided. The second class attempts to detect hallucinations by analyzing the internal state or behavior of the model, typically using the model's word-prediction probabilities and the entropy values of the prediction process to capture abnormal behavior in the generated content. While these methods can reveal a model's propensity to hallucinate to some extent, they usually require extensive analysis of the model's structure.

For hallucination correction, factual errors in answers are located and corrected against an external high-confidence information source to finally obtain correct answers. The prior art again focuses on two classes of methods. The first adopts large-model prompt engineering: relying on a large language model's natural-language understanding and generation abilities, erroneous content is diagnosed and corrected through few-shot prompting with worked examples. However, this approach suffers from high model-invocation cost, slow response, and complex prompt design. The second adopts small-model instruction fine-tuning: related studies recast content correction as an instruction-following task and fine-tune a small-scale model to solidify the correction capability.
While the above approaches work well for the factual-correction task on LLM-generated content, factual errors tend to be deeply bound to specific erroneous entities, so correction should be refined to the entity level, for example errors at the level of concept nouns or errors in intra-sentence entity relationships.

Disclosure of Invention

Aiming at defects of the prior art such as one-sided reliance on information-element matching, complicated internal analysis, high correction cost, and difficulty of accurate entity-level localization, the invention provides a large-model fine-grained hallucination detection and correction method and product. In view of the above, the invention provides a method comprising: step 1, constructing reference information from the title and description information of a reference document, extracting entities in the reference information through a large language model, obtaining the encoded representation of each entity, and constructing a reference-information entity set; step 2, extracting entities in the content to be detected through a large language model, obtaining the encoded representation of each entity, and constructing a content entity set; step 3, for each entity in the content entity set, calculating a support score between that entity and the entities in the reference-information entity set