US-20260127374-A1 - EXPLAINABLE AND EFFICIENT TEXT SUMMARIZATION
Abstract
A computer-implemented, machine learning method for generating explainable text summaries includes extracting a subset of sentences from an input document as an extractive summary and adding context to the extracted sentences to generate a prompt. A fluent summary is generated by using the prompt as input to a generative language model. Source information for a sentence from the fluent summary is determined by mapping the sentence from the fluent summary to a sentence in the extractive summary and the sentence from the extractive summary to a sentence from the input document. A transparent summary view is generated showing the sentence from the fluent summary along with the source information from the extractive summary and the input document for display on a user interface. The method has applications including, but not limited to, medical AI, public safety and other machine learning applications for reliable and explainable document summarization.
Inventors
- Masafumi ENOMOTO
- Kunihiro TAKEOKA
- Kiril Gashteovski
- Carolin Lawrence
Assignees
- NEC CORPORATION
Dates
- Publication Date
- 2026-05-07
- Application Date
- 2026-01-05
Claims (18)
- 1 . A computer-implemented, machine learning method for generating explainable text summaries, the method comprising: extracting a subset of sentences from at least one input document as an extractive summary; adding context to the extracted sentences to generate a prompt; generating a fluent summary by using the prompt as input to a generative language model; determining source information for a sentence from the fluent summary by mapping the sentence from the fluent summary to at least one sentence in the extractive summary and the at least one sentence from the extractive summary to at least one sentence from the at least one input document; and generating a transparent summary view showing the sentence from the fluent summary along with the source information from the extractive summary and the at least one input document for display on a user interface; wherein mapping the sentence from the fluent summary to the at least one sentence in the extractive summary is performed by embedding the sentence from the fluent summary and each respective one of the extracted sentences as a numerical vector using a sentence embedding model, and selecting a number k of the extracted sentences that are nearest neighbors to the sentence from the fluent summary as evidence in the extractive summary; wherein adding the context includes performing co-reference resolution and entity linking against a knowledge graph to rewrite each extracted sentence into a contextualized, stand-alone sentence; removing meaningless words and phrases from each contextualized sentence based on a database of words and phrases previously classified as meaningless; checking whether one or more of the preprocessed sentences (including the contextualized sentences after removal of meaningless words) is a duplicate by semantically comparing sentence embeddings using a similarity threshold, and excluding the one or more of the preprocessed sentences from the prompt based on a determination that the one or 
more of the preprocessed sentences is within the similarity threshold to another one of the preprocessed sentences; and constructing a prompt that comprises a list of the remaining contextualized sentences with the added context concatenated to each respective sentence using a delimiter, together with an instruction to the generative language model to summarize, paraphrase or re-write the extracted sentences.
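The evidence-mapping step recited in claim 1 (embedding the fluent-summary sentence and each extracted sentence as numerical vectors, then selecting the k nearest extracted sentences as evidence) can be sketched as follows. This is an illustrative sketch only: the hashing-based `embed` function is a toy stand-in for whatever trained sentence embedding model an embodiment would actually use.

```python
import hashlib
import math

def embed(sentence: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a sentence embedding model: a hashed bag-of-words
    vector. A real embodiment would use a trained sentence encoder."""
    vec = [0.0] * dim
    for token in sentence.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def nearest_evidence(fluent_sentence: str, extracted_sentences: list[str],
                     k: int = 2) -> list[str]:
    """Return the k extracted sentences nearest to the fluent sentence,
    i.e., the evidence for it in the extractive summary."""
    q = embed(fluent_sentence)
    ranked = sorted(extracted_sentences,
                    key=lambda s: cosine(q, embed(s)),
                    reverse=True)
    return ranked[:k]
```

The same nearest-neighbor selection can then be repeated one level down, mapping each evidence sentence in the extractive summary back to its nearest sentences in the input documents.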
- 2 . The method according to claim 1 , wherein mapping the sentence from the fluent summary to the at least one sentence in the extractive summary and/or mapping the at least one sentence from the extractive summary to the at least one sentence from the at least one input document is performed using a natural language inference model that predicts for the mapping whether a respective one of the sentences is entailed by another one of the sentences.
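Claim 2's alternative mapping via natural language inference can be sketched as below. The `entailment_score` function here is a deliberately simple lexical-overlap stand-in, labeled as an assumption; an actual embodiment would call a trained neural NLI classifier that predicts whether one sentence is entailed by another.

```python
def _tokens(text: str) -> set[str]:
    """Lowercased tokens with surrounding punctuation stripped."""
    return {t.strip(".,;:!?") for t in text.lower().split()}

def entailment_score(premise: str, hypothesis: str) -> float:
    """Stand-in for a trained NLI model's entailment probability.
    Here: the fraction of hypothesis tokens covered by the premise.
    A real system would run a neural NLI classifier instead."""
    p, h = _tokens(premise), _tokens(hypothesis)
    return len(p & h) / len(h) if h else 0.0

def map_by_entailment(summary_sentence: str, source_sentences: list[str],
                      threshold: float = 0.5) -> list[str]:
    """Keep the source sentences predicted to entail the summary sentence."""
    return [s for s in source_sentences
            if entailment_score(s, summary_sentence) >= threshold]
```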
- 3 . The method according to claim 1 , wherein mapping the at least one sentence from the extractive summary to the at least one sentence from the at least one input document is performed by embedding each respective one of the at least one sentence from the at least one input document as a numerical vector using a sentence embedding model, and selecting, as evidence in the input documents, a number k of the at least one sentence from the at least one input document that are nearest neighbors to the number k of the extracted sentences that are in the evidence in the extractive summary.
- 4 . The method according to claim 1 , further comprising removing meaningless words and phrases from the extracted sentences prior to generating the prompt.
- 5 . The method according to claim 4 , wherein the meaningless words and phrases are determined by comparing the extracted sentences to a database containing words and phrases that have been previously classified as meaningless.
- 6 . The method according to claim 1 , further comprising determining the subset of sentences using a neural network that receives the at least one input document and outputs an informativeness score for each sentence contained in the at least one input document.
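The per-sentence informativeness scoring of claim 6 can be sketched as follows. The frequency-based scorer is a hypothetical stand-in; the claim recites a neural network that outputs an informativeness score for each sentence of the input document.

```python
from collections import Counter

def _words(sentence: str) -> list[str]:
    """Lowercased tokens with surrounding punctuation stripped."""
    return [t.strip(".,;:!?") for t in sentence.lower().split()]

def informativeness_scores(sentences: list[str]) -> list[float]:
    """Stand-in scorer: mean corpus-wide word frequency per sentence.
    An embodiment would use a neural network here instead."""
    freq = Counter(w for s in sentences for w in _words(s))
    return [sum(freq[w] for w in _words(s)) / max(len(_words(s)), 1)
            for s in sentences]

def extract_summary(sentences: list[str], n: int = 2) -> list[str]:
    """Select the n highest-scoring sentences, preserving document order."""
    scores = informativeness_scores(sentences)
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]
```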
- 7 . The method according to claim 1 , wherein adding the context to the extracted sentences includes resolving ambiguities in individual ones of the extracted sentences by performing co-reference resolution and entity linking based on the at least one input document.
- 8 . The method according to claim 1 , further comprising checking whether one or more of the extracted sentences is a duplicate by semantically comparing embeddings of the extracted sentences using a similarity threshold, and excluding the one or more of the extracted sentences from the prompt based on a determination that the one or more of the extracted sentences is within the similarity threshold to another one of the extracted sentences.
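The duplicate check of claim 8 can be sketched as below. Token-set Jaccard overlap stands in for the semantic comparison of sentence embeddings; the 0.8 threshold is an illustrative choice, not a value fixed by the claim.

```python
def token_jaccard(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of token sets. An embodiment
    would compare sentence embeddings with cosine similarity instead."""
    sa = set(a.lower().split())
    sb = set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def drop_near_duplicates(sentences: list[str],
                         threshold: float = 0.8) -> list[str]:
    """Keep a sentence only if it is not within the similarity threshold
    of an already-kept sentence; near-duplicates are excluded."""
    kept: list[str] = []
    for s in sentences:
        if all(token_jaccard(s, k) < threshold for k in kept):
            kept.append(s)
    return kept
```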
- 9 . The method according to claim 1 , wherein the prompt comprises a list of the extracted sentences and, for respective ones of the extracted sentences having the added context, the added context is concatenated to the respective extracted sentence, and wherein the prompt further comprises an instruction to the generative language model to summarize, paraphrase or re-write the extracted sentences, which is output as the fluent summary.
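The prompt construction of claim 9 (each extracted sentence with its added context concatenated via a delimiter, preceded by a summarization instruction) can be sketched as follows. The `" ## "` delimiter and the instruction wording are illustrative assumptions, not fixed by the claim.

```python
def build_prompt(sentences: list[str], contexts: list[str],
                 delimiter: str = " ## ") -> str:
    """Build a prompt for a generative language model: an instruction
    followed by the extracted sentences, each with its added context
    concatenated using a delimiter (sentences without context are
    listed as-is)."""
    instruction = ("Summarize, paraphrase or re-write the following "
                   "sentences into a fluent summary:")
    lines = [s + delimiter + c if c else s
             for s, c in zip(sentences, contexts)]
    return instruction + "\n" + "\n".join(lines)
```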
- 10 . The method according to claim 1 , wherein the transparent summary view highlights on the user interface the sentence from the fluent summary as well as the source information including the at least one sentence from the extractive summary and the at least one sentence from the at least one input document.
- 11 . The method according to claim 1 , wherein the at least one input document includes patient data, and wherein the transparent summary view is used to support decision-making in a medical Artificial Intelligence (AI) or automated healthcare use case.
- 12 . The method according to claim 1 , wherein the at least one input document includes a criminal investigation report, and wherein the transparent summary view is used to support decision-making in a public safety use case and/or to activate a forensic tool.
- 13 . A computer system for generating text summaries comprising one or more processors which, alone or in combination, are configured to perform a machine learning method for generating explainable text summaries comprising the following steps: extracting a subset of sentences from at least one input document as an extractive summary; adding context to the extracted sentences to generate a prompt; generating a fluent summary by using the prompt as input to a generative language model; determining source information for a sentence from the fluent summary by mapping the sentence from the fluent summary to at least one sentence in the extractive summary and the at least one sentence from the extractive summary to at least one sentence from the at least one input document; and generating a transparent summary view showing the sentence from the fluent summary along with the source information from the extractive summary and the at least one input document for display on a user interface; wherein mapping the sentence from the fluent summary to the at least one sentence in the extractive summary is performed by embedding the sentence from the fluent summary and each respective one of the extracted sentences as a numerical vector using a sentence embedding model, and selecting a number k of the extracted sentences that are nearest neighbors to the sentence from the fluent summary as evidence in the extractive summary; wherein adding the context includes performing co-reference resolution and entity linking against a knowledge graph to rewrite each extracted sentence into a contextualized, stand-alone sentence; removing meaningless words and phrases from each contextualized sentence based on a database of words and phrases previously classified as meaningless; checking whether one or more of the preprocessed sentences (including the contextualized sentences after removal of meaningless words) is a duplicate by semantically comparing sentence embeddings using a similarity 
threshold, and excluding the one or more of the preprocessed sentences from the prompt based on a determination that the one or more of the preprocessed sentences is within the similarity threshold to another one of the preprocessed sentences; and constructing a prompt that comprises a list of the remaining contextualized sentences with the added context concatenated to each respective sentence using a delimiter, together with an instruction to the generative language model to summarize, paraphrase or re-write the extracted sentences.
- 14 . A tangible, non-transitory computer-readable medium for generating explainable text summaries containing instructions which, upon being executed by one or more hardware processors, provide for execution of a machine learning method comprising the following steps: extracting a subset of sentences from at least one input document as an extractive summary; adding context to the extracted sentences to generate a prompt; generating a fluent summary by using the prompt as input to a generative language model; determining source information for a sentence from the fluent summary by mapping the sentence from the fluent summary to at least one sentence in the extractive summary and the at least one sentence from the extractive summary to at least one sentence from the at least one input document; and generating a transparent summary view showing the sentence from the fluent summary along with the source information from the extractive summary and the at least one input document for display on a user interface; wherein mapping the sentence from the fluent summary to the at least one sentence in the extractive summary is performed by embedding the sentence from the fluent summary and each respective one of the extracted sentences as a numerical vector using a sentence embedding model, and selecting a number k of the extracted sentences that are nearest neighbors to the sentence from the fluent summary as evidence in the extractive summary; wherein adding the context includes performing co-reference resolution and entity linking against a knowledge graph to rewrite each extracted sentence into a contextualized, stand-alone sentence; removing meaningless words and phrases from each contextualized sentence based on a database of words and phrases previously classified as meaningless; checking whether one or more of the preprocessed sentences (including the contextualized sentences after removal of meaningless words) is a duplicate by semantically comparing sentence
embeddings using a similarity threshold, and excluding the one or more of the preprocessed sentences from the prompt based on a determination that the one or more of the preprocessed sentences is within the similarity threshold to another one of the preprocessed sentences; and constructing a prompt that comprises a list of the remaining contextualized sentences with the added context concatenated to each respective sentence using a delimiter, together with an instruction to the generative language model to summarize, paraphrase or re-write the extracted sentences.
- 15 . The method according to claim 1 , wherein the at least one input document comprises patient data for a patient, and wherein the transparent summary view is provided via the user interface to at least one of a medical Artificial Intelligence (AI) system and an automated healthcare system to support a diagnosis or treatment for the patient.
- 16 . The method according to claim 15 , wherein the transparent summary view highlights, in response to a selection of a sentence of the fluent summary by a doctor, corresponding evidence sentences in the extractive summary and in the at least one input document so that the doctor can verify factual consistency before decision making regarding a diagnosis or treatment for the patient.
- 17 . The method according to claim 1 , wherein the at least one input document includes at least one of a criminal investigation report, a suspect report and a citizen report, and wherein the transparent summary view is provided via the user interface to at least one of a police worker, a government worker and another public safety worker to support public safety or forensic analysis.
- 18 . The method according to claim 17 , wherein the transparent summary view highlights, for each sentence of the fluent summary, corresponding evidence sentences in the extractive summary and in the at least one input document so as to provide traceable evidence for decision making in public safety or for operating a forensic tool.
Description
CROSS-REFERENCE TO PRIOR APPLICATION

This application is a continuation of U.S. application Ser. No. 18/374,676, filed on Sep. 29, 2023, which claims priority to U.S. Provisional Application No. 63/522,470, filed on Jun. 22, 2023. The entire disclosures of the above-referenced applications are incorporated herein by reference.

FIELD

The present invention relates to Artificial Intelligence (AI) and machine learning, and, in particular, to a method, system and computer-readable medium for explainable and efficient text summarization.

BACKGROUND

Large Language Models (LLMs), such as ChatGPT, exhibit strong performance on many Natural Language Processing (NLP) tasks, including text summarization. Within the generated summary, however, such generative LLMs can generate information that could be false. In particular, the generated text can contain factually incorrect information, also referred to as hallucinations of fact, that at the same time appears to be stated with confidence, resulting in a lack of trust in, and reliability of, LLM systems, in addition to making their use dangerous in a number of higher-risk scenarios. Moreover, the use of such LLM systems is inefficient in terms of computational resources and compute time.

SUMMARY

In an embodiment, the present invention provides a computer-implemented, machine learning method for generating explainable text summaries. A subset of sentences is extracted from at least one input document as an extractive summary. Context is added to the extracted sentences to generate a prompt. A fluent summary is generated by using the prompt as input to a generative language model. Source information for a sentence from the fluent summary is determined by mapping the sentence from the fluent summary to at least one sentence in the extractive summary and the at least one sentence from the extractive summary to at least one sentence from the at least one input document.
A transparent summary view is generated showing the sentence from the fluent summary along with the source information from the extractive summary and the at least one input document for display on a user interface. The method has applications including, but not limited to, medical AI, public safety and other machine learning applications for reliable and explainable document summarization.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will be described in even greater detail below based on the exemplary figures. The present invention is not limited to the exemplary embodiments. All features described and/or illustrated herein can be used alone or combined in different combinations in embodiments of the present invention. The features and advantages of various embodiments of the present invention will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:

- FIG. 1 schematically illustrates a method and overall system architecture for generating text summaries in accordance with an embodiment of the present invention;
- FIG. 2 schematically illustrates an overall architecture of a preprocessor, and steps performed by the preprocessor, in accordance with an embodiment of the present invention;
- FIG. 3 schematically illustrates an overall architecture of a contextualizer, and steps performed by the contextualizer, in accordance with an embodiment of the present invention;
- FIG. 4 schematically illustrates an overall architecture of an explainer, and steps performed by the explainer, in accordance with an embodiment of the present invention;
- FIG. 5 schematically illustrates potential implementations of a tracer in accordance with an embodiment of the present invention;
- FIG. 6 shows an example of a transparent summary view in accordance with an embodiment of the present invention; and
- FIG. 7 is a block diagram of an exemplary processing system, which can be configured to perform any and all operations disclosed herein.

DETAILED DESCRIPTION

Embodiments of the invention make the use of LLMs more trustworthy, secure, reliable and efficient, in terms of required computational resources and/or compute time, for summarization by modifying the input in a transparent and explainable manner. At the same time, embodiments of the present invention significantly reduce the costs of LLM use by reducing the size of input documents before they are given to the LLM, thereby shortening the input lengths and reducing the computational load, in terms of required computational resources and/or compute time. This, in turn, frees up computational resources for other tasks, such as other incoming queries, and allows an increased number of queries to be processed in a secure, reliable and trustworthy manner. According to existing technology, LLM systems suffer from the technical deficiency of hallucinating facts and can generate wrong information in the summary (i.e., factually incorrec