
CN-122021606-A - Multi-dimensional consistency detection method and system in large model


Abstract

The application provides a multi-dimensional consistency detection method and system in a large model. The method comprises: obtaining a text generated by the large model, and processing the text to generate a sentence sequence containing time information; extracting events from the text, combining the events with the time information to construct a time link in which the events occur, and quantifying the time link to generate a time consistency score, wherein the time consistency score is used for evaluating the logical plausibility of the events based on time sequence; forming context structure information according to the sentence sequence and the semantic relations among the events, and quantifying the context structure information to generate a context consistency score, wherein the context consistency score is used for evaluating the logical consistency of the text; and comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index, wherein the time sequence risk index is used for evaluating the degree of time sequence deviation of the text. The method addresses the problem that the output content of a large model can deviate in multiple dimensions.

Inventors

  • SHI YIJIE
  • ZHAO LUJIN
  • QIN SUJUAN

Assignees

  • Beijing University of Posts and Telecommunications (北京邮电大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-19

Claims (10)

  1. A method for multi-dimensional consistency detection in a large model, the method comprising: acquiring a text generated by the large model, and processing the text to generate a sentence sequence containing time information; extracting events from the text, combining the events with the time information to construct a time link in which the events occur, and quantifying the time link to generate a time consistency score, wherein the time consistency score is used for evaluating the logical plausibility of the events based on time sequence; forming context structure information according to the sentence sequence and the semantic relations among a plurality of the events, and quantifying the context structure information to generate a context consistency score, wherein the context consistency score is used for evaluating the logical consistency of the text; and comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index, wherein the time sequence risk index is used for evaluating the degree of time sequence deviation of the text.
  2. The method for multi-dimensional consistency detection in a large model according to claim 1, wherein after generating the time sequence risk index, the method further comprises: setting a time sequence risk index threshold; determining that the text is abnormal in response to the time sequence risk index being greater than the time sequence risk index threshold; and analyzing the quantification process of the time consistency score and the quantification process of the context consistency score to generate an explanation of the abnormality.
  3. The method of multi-dimensional consistency detection in a large model according to claim 1, wherein the processing the text to generate a sentence sequence containing time information comprises: segmenting the text to form the sentence sequence; analyzing the sentences in the sentence sequence to generate structured content for each sentence, wherein the structured content comprises grammatical structure, entity relations, and sentence boundaries; and identifying time information in each sentence according to the structured content, and mapping the time information to a standard timestamp.
  4. The method of multi-dimensional consistency detection in a large model according to claim 1, wherein the extracting events from the text and combining the events with the time information to construct a time link in which the events occur comprises: extracting a plurality of events from the text, wherein each event comprises an action-and-subject structure, a standard timestamp of occurrence, logical relations with other events in the text, and the local context corresponding to the event; and establishing a plurality of time links in which the events occur according to the standard timestamps.
  5. The method of multi-dimensional consistency detection in a large model according to claim 4, wherein said quantifying the time link to generate a time consistency score comprises: in response to the text containing an explicit time sequence description, establishing a time sequence constraint set according to the explicit time sequence description; calculating an event order consistency score from the time link and the time sequence constraint set according to the following formula: [formula not reproduced in the source]; wherein S_order is the event order consistency score, C_order is the time sequence constraint set, e_p and e_q are two events, and I(·) is a first indicator function that equals 1 when the standard timestamps are inconsistent with the time sequence constraint and 0 when they are consistent; acquiring the input question of the large model, and extracting a reference time from the question; calculating an alignment score of the text with the reference time according to the following formulas: [formulas not reproduced in the source]; wherein S_align is the alignment score, N_fact is the number of definite time labels in the text, [symbol not reproduced] is the deviation threshold, t_fact is the standard timestamp, t_q is the reference time, d(t_fact, t_q) is the time difference, and [symbol not reproduced] is a second indicator function that equals 1 when the time difference is greater than the deviation threshold and 0 when the time difference is less than or equal to the deviation threshold; extracting fact information from the text, and calculating a knowledge timeliness score of the fact information according to the following formula: [formula not reproduced in the source]; wherein S_fresh is the knowledge timeliness score, λ is the decay coefficient, and E_facts is the set of fact information; and calculating the time consistency score according to the following formula: [formula not reproduced in the source]; wherein S_temp is the time consistency score, and α, β, and γ are weight coefficients.
  6. The method for multi-dimensional consistency detection in a large model according to claim 1, wherein the forming context structure information according to the sentence sequence and the semantic relations among a plurality of the events comprises: dividing the sentence sequence into a plurality of context windows; calculating the semantic similarity of adjacent context windows, and calculating a context drift degree according to the semantic similarity, wherein the context drift degree represents the degree of abrupt semantic change; extracting entities from the text, locating each entity in the context windows, and searching for contradictory descriptions of the entity; and constructing non-temporal logical relations among a plurality of the events to generate an event inference chain; wherein the context structure information is composed of the context drift degree, the contradictory descriptions, and the event inference chain.
  7. The method of multi-dimensional consistency detection in a large model according to claim 6, wherein said quantifying the context structure information to generate a context consistency score comprises: calculating a local context consistency score according to the following formulas: [formulas not reproduced in the source]; wherein D_ctx(l) is the context drift degree, [symbol not reproduced] is the semantic similarity of the adjacent context windows, S_ctx_local is the local context consistency score, L is the total number of sentences in the sentence sequence, and D_max is the set upper limit of the context drift degree; constructing a topic distribution vector for each context window; calculating a topic stability score according to the following formula: [formula not reproduced in the source]; wherein S_topic is the topic stability score, KL_norm is the divergence between adjacent topic distributions, and p_l is the topic distribution vector; calculating an entity consistency score according to the following formula: [formula not reproduced in the source]; wherein S_entity is the entity consistency score, M is the total number of entities, m is the index of an entity, [symbol not reproduced] is the similarity of the semantic vectors of the entity in two context windows, and p_m is a semantic distribution vector; and calculating the context consistency score according to the following formula: [formula not reproduced in the source]; wherein S_ctx is the context consistency score, and μ_1, μ_2, and μ_3 are weight coefficients.
  8. The method of multi-dimensional consistency detection in a large model according to claim 1, wherein the comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index comprises: calculating a weighted sum of the time consistency score and the context consistency score to obtain a comprehensive consistency score; and processing the comprehensive consistency score with a normalization function to obtain the time sequence risk index.
  9. The method of multi-dimensional consistency detection in a large model according to claim 8, wherein the comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index comprises: introducing a logical consistency score, wherein the logical consistency score is used for evaluating the degree of logical self-consistency of the text; calculating the comprehensive consistency score according to the following formula: [formula not reproduced in the source]; wherein S_total is the comprehensive consistency score, S_temp is the time consistency score, S_ctx is the context consistency score, S_logic is the logical consistency score, and w_1, w_2, and w_3 are weight coefficients; and calculating the time sequence risk index according to the following formula: [formula not reproduced in the source]; wherein R_temporal is the time sequence risk index, and σ is the standard deviation.
  10. A multi-dimensional consistency detection system in a large model, the system comprising: an acquisition module, used for acquiring a text generated by the large model, and processing the text to generate a sentence sequence containing time information; a time consistency generation module, used for extracting events from the text, combining the events with the time information to construct a time link in which the events occur, and quantifying the time link to generate a time consistency score, wherein the time consistency score is used for evaluating the logical plausibility of the events based on time sequence; a context consistency generation module, used for forming context structure information according to the sentence sequence and the semantic relations among the events, and quantifying the context structure information to generate a context consistency score, wherein the context consistency score is used for evaluating the logical consistency of the text; and a comprehensive evaluation module, used for comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index, wherein the time sequence risk index is used for evaluating the degree of time sequence deviation of the text.
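
The formulas of claim 5 are not reproduced in this text, so the following Python sketch shows only one plausible reading of the three component scores and their weighted combination. The functional forms (share of satisfied ordering constraints, share of timestamps within the deviation threshold, exponential decay with coefficient lam), the function names, the day-based time difference, and the default weights are all assumptions made for illustration and should not be read as the patented formulas.

```python
import math
from datetime import datetime


def order_score(timestamps, constraints):
    """Assumed form of S_order: 1 minus the share of violated constraints.

    timestamps: dict mapping an event id to its standard timestamp.
    constraints: list of (p, q) pairs meaning "event p occurs before event q".
    """
    if not constraints:
        return 1.0
    # Indicator is 1 when the timestamps contradict the stated order.
    violated = sum(1 for p, q in constraints if timestamps[p] >= timestamps[q])
    return 1.0 - violated / len(constraints)


def align_score(fact_times, reference, tau_days):
    """Assumed form of S_align: share of timestamps within tau_days of t_q."""
    if not fact_times:
        return 1.0
    far = sum(1 for t in fact_times if abs((t - reference).days) > tau_days)
    return 1.0 - far / len(fact_times)


def fresh_score(fact_times, reference, lam):
    """Assumed form of S_fresh: mean of exp(-lam * d(t_fact, t_q)), d in days."""
    if not fact_times:
        return 1.0
    decays = [math.exp(-lam * abs((t - reference).days)) for t in fact_times]
    return sum(decays) / len(decays)


def temporal_score(s_order, s_align, s_fresh, alpha=0.4, beta=0.3, gamma=0.3):
    """Assumed form of S_temp: weighted sum with hypothetical default weights."""
    return alpha * s_order + beta * s_align + gamma * s_fresh


# Hypothetical example: the text claims A -> B -> C, but C is dated before B.
events = {
    "A": datetime(2020, 1, 1),
    "B": datetime(2021, 6, 1),
    "C": datetime(2019, 3, 1),
}
reference_time = datetime(2021, 6, 1)  # reference time taken from the question
times = list(events.values())
s_temp = temporal_score(
    order_score(events, [("A", "B"), ("B", "C")]),
    align_score(times, reference_time, tau_days=365),
    fresh_score(times, reference_time, lam=0.001),
)
print(f"S_temp = {s_temp:.3f}")
```

Each component lies in [0, 1], so with weights that sum to 1 the combined score also stays in [0, 1], which keeps it comparable with the context consistency score in later steps.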

Description

Multi-dimensional consistency detection method and system in large model

Technical Field

The application relates to the field of computer technology, and in particular to a multi-dimensional consistency detection method and system in a large model.

Background

With the wide application of large models in search-based question answering, news generation, legal consultation, medical assistance, educational tutoring, and similar fields, the authenticity, timeliness, and contextual appropriateness of their output have become important factors affecting their reliability. Large models are typically trained on datasets containing massive amounts of web text and other web content, news stories, encyclopedic knowledge, academic literature, and social media content. Such data is structurally complex and spans a long period of time, and it differs significantly across contexts and time periods, so a model may mix information from different time windows and different contexts when internalizing knowledge.

In practical applications, a large model is expected to generate content that is factually correct, logically self-consistent, and consistent in time sequence. However, owing to limitations in training data, reasoning mechanisms, and memory structure, a large model handling time-related, context-dependent, or cross-paragraph reasoning tasks often references outdated knowledge, repeats old news, misplaces context, confuses the time chain of events, or misunderstands content across paragraphs. These problems not only degrade content quality but may also mislead users, and can cause harm in high-risk scenarios such as news and public opinion, medical diagnosis, and legal consultation.
In addition, as multi-round dialogue and long-text generation capabilities improve, the shortcomings of large models in retaining long-sequence information, associating semantics across spans, and maintaining event chains become more pronounced. Attention decay, context forgetting, or semantic drift often lead to broken inference chains, shifts of semantic focus, or incoherent narrative logic, which systematically affect task results. Therefore, how to identify deviations of large model output in multiple dimensions by technical means, and how to construct a quantifiable and auditable multi-dimensional consistency checking mechanism, has become a key technical problem in the field of artificial intelligence content security.

Disclosure of Invention

Therefore, the application aims to provide a multi-dimensional consistency detection method and system in a large model, which address the problem that the output content of a large model can deviate in multiple dimensions.
To achieve the above object, the present application provides a multi-dimensional consistency detection method in a large model, the method comprising: acquiring a text generated by the large model, and processing the text to generate a sentence sequence containing time information; extracting events from the text, combining the events with the time information to construct a time link in which the events occur, and quantifying the time link to generate a time consistency score, wherein the time consistency score is used for evaluating the logical plausibility of the events based on time sequence; forming context structure information according to the sentence sequence and the semantic relations among a plurality of the events, and quantifying the context structure information to generate a context consistency score, wherein the context consistency score is used for evaluating the logical consistency of the text; and comprehensively calculating the time consistency score and the context consistency score to generate a time sequence risk index, wherein the time sequence risk index is used for evaluating the degree of time sequence deviation of the text.

As a further improvement of an embodiment of the present application, after generating the time sequence risk index, the method further includes: setting a time sequence risk index threshold; determining that the text is abnormal in response to the time sequence risk index being greater than the time sequence risk index threshold; and analyzing the quantification process of the time consistency score and the quantification process of the context consistency score to generate an explanation of the abnormality.
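
The later steps of the method (context drift, weighted combination, normalization, and the threshold check) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the application's own formulas, which are not reproduced in this text: the context drift degree is taken as one minus the cosine similarity of adjacent window vectors, the normalization function is a logistic curve that decreases as the combined score rises, and the names local_context_score, risk_index, assess, steepness, and midpoint are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def local_context_score(window_vectors, d_max=1.0):
    """Assumed S_ctx_local: 1 minus the mean clipped drift, scaled by d_max.

    The drift between adjacent windows is assumed to be 1 - cosine similarity.
    """
    drifts = [1.0 - cosine(a, b)
              for a, b in zip(window_vectors, window_vectors[1:])]
    if not drifts:
        return 1.0
    clipped = [min(max(d, 0.0), d_max) for d in drifts]
    return 1.0 - sum(clipped) / (len(clipped) * d_max)


def risk_index(s_temp, s_ctx, w_temp=0.5, w_ctx=0.5, steepness=8.0, midpoint=0.5):
    """Weighted sum of the two scores, mapped to (0, 1) by a logistic curve.

    Assumption: low consistency should yield high risk, so the curve decreases
    as the combined score increases.
    """
    s_total = w_temp * s_temp + w_ctx * s_ctx
    return 1.0 / (1.0 + math.exp(steepness * (s_total - midpoint)))


def assess(s_temp, s_ctx, threshold=0.5):
    """Flag the text as abnormal when the risk index exceeds the threshold."""
    risk = risk_index(s_temp, s_ctx)
    abnormal = risk > threshold
    explanation = ""
    if abnormal:
        # Simplified explanation: name the weaker of the two component scores.
        weakest = "time consistency" if s_temp <= s_ctx else "context consistency"
        explanation = (f"risk {risk:.2f} exceeds threshold {threshold:.2f}; "
                       f"lowest component: {weakest}")
    return abnormal, risk, explanation


# Hypothetical window embeddings: the third window changes topic abruptly.
windows = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
s_ctx = local_context_score(windows)
print(assess(s_temp=0.1, s_ctx=s_ctx))
```

In practice the window vectors would come from a sentence-embedding model, and the explanation step would trace back to the specific constraints or windows that lowered each score; both are outside the scope of this sketch.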
As a further improvement of an embodiment of the present application, processing the text to generate a sentence sequence containing time information includes: segmenting the text to form the sentence sequence; analyzing the sentences in the sentence sequence to generate structured content for each sentence, wherein the structured content comprises grammatical structure, entity relations, and sentence boundaries; and identifying time information in the s