US-12626071-B2 - Generating summaries of texts using large language models selected based on a minimization of a classification score, processing and reading times and words in a deny list

US 12626071 B2

Abstract

Systems and methods for generating summaries from text using a generative model are disclosed. The system is configured to access an article; identify sections of the article; provide, to one or more generative models, a prompt including instructions to generate a section summary; generate an article summary based on the section summary; determine, from the article summary, a first concept found in the article summary that is missing from the article; determine, using a classifier, for a first sentence included in the article summary, a confidence score; and provide, for presentation at a client device, a document including the article summary.

Inventors

  • Walter Bender
  • Nithi Vivatrat
  • Richard Graves
  • Tomáš Valena
  • David McMinn

Assignees

  • Sorcero, Inc.

Dates

Publication Date
2026-05-12
Application Date
2024-11-27

Claims (20)

  1. A system, comprising: one or more processors coupled to memory, the one or more processors configured to: access an article including a title and a body; identify, from the article, a plurality of sections; provide, to one or more generative pre-training transformer (GPT) models, for each section of the plurality of sections of the article, a prompt including instructions to generate a section summary for the section based on one or more sentences included in the section, wherein providing the prompt to the one or more GPT models comprises the one or more processors to select the one or more GPT models from a plurality of GPT models based on the selected one or more GPT models providing higher metrics than non-selected GPT models, the metrics comprising at least two of: a minimization of a classification score indicating a likelihood that the one or more sentences included in the section correspond to the section summary; a minimization of processing time; a minimization of a score indicating a duration a user associated with a client device is reading; a minimization of a period of time associated with a reading time; and a minimization of words on a deny list included within the section summaries; generate an article summary based on the section summary generated for each section; determine, using the one or more GPT models, from the article summary, a first concept found in the article summary that is missing from the article; determine, using a classifier, for a first sentence included in the article summary, a confidence score indicating a likelihood that the first sentence is not supported by the article; and provide, for presentation at the client device, a document including the article summary, a first indicator corresponding to the first concept and a second indicator corresponding to the first sentence.
  2. The system of claim 1, wherein the classifier is a first classifier, and comprising the one or more processors to: determine, using the first classifier responsive to providing the prompt to generate the section summary for the section, that at least one sentence included in a section summary of a first section of the plurality of sections of the article has a classification score indicating that the at least one sentence belongs to a second section of the plurality of sections of the article; remove the at least one sentence from the section summary of the first section to generate an updated section summary of the first section; and generate the article summary based on the section summary generated for each section and the updated section summary of the first section.
  3. The system of claim 1, comprising the one or more processors to: determine, responsive to determining the confidence score indicating the likelihood that the first sentence is not supported by evidence in the article, a score indicating a reading level for the article summary; and provide, for presentation at the client device, the document including the score.
  4. The system of claim 1, comprising the one or more processors to: determine, responsive to determining the confidence score indicating the likelihood that the first sentence is not supported by evidence in the article, a period of time indicating a duration a user associated with the client device is reading for each of the article summary and the article; and provide, for presentation at the client device, the document including the period of time.
  5. The system of claim 1, wherein providing the document comprises the one or more processors to: identify a deny list of words associated with the article; parse the article summary to identify a first word of the deny list appearing within the article summary; and replace the first word appearing within the article summary with a second word based on an index mapping the deny list of words to an allow list of words.
  6. The system of claim 1, comprising the one or more processors to: determine, responsive to determining a section identifier and a confidence score for each sentence, that a first confidence score for a sentence is above a threshold confidence score; associate the sentence with the section identifier corresponding to the first confidence score; determine that a second confidence score for a second sentence is at or below the threshold confidence score; and associate the second sentence with a second section identifier different than the section identifier associated with the first confidence score.
  7. The system of claim 1, wherein the classifier comprises a first classifier and a second classifier, and comprising the one or more processors to: train the first classifier using a plurality of bodies from a plurality of articles to recognize a plurality of section identifiers; and train the second classifier using a plurality of sentences of a plurality of section summaries and the plurality of bodies to compare the plurality of sentences to the plurality of bodies.
  8. The system of claim 1, comprising the one or more processors to: determine that the article satisfies a format comprising the body and the title; and access the article responsive to the determination that the article satisfies the format.
  9. The system of claim 1, comprising the one or more processors to: determine that a second article does not satisfy a format comprising the body and the title; and provide, for presentation at the client device, an indication of the second article not satisfying the format.
  10. The system of claim 1, comprising the one or more processors to provide, for presentation at the client device, the document comprising a comparison of a first score indicating a reading level associated with the article summary and a second score indicating a reading level associated with the article.
  11. The system of claim 1, wherein to identify, from the article, a plurality of sections, the one or more processors configured to provide, to a GPT model, a prompt to cause the GPT model to output portions of the body of the article in respective sections of the plurality of sections.
  12. The system of claim 1, wherein the classifier is a first classifier, wherein the body includes a plurality of sentences, and wherein to identify, from the article, a plurality of sections, the one or more processors configured to determine, by inputting each sentence of the one or more sentences of the article into the first classifier, a section identifier and a confidence score for each sentence, the confidence score indicating a likelihood of its respective sentence corresponding to its respective section identifier.
  13. The system of claim 1, wherein to generate the article summary, the one or more processors configured to iteratively prompt the one or more GPT models based on a threshold associated with the article summary.
  14. A method, comprising: accessing, by one or more processors coupled to memory, an article including a title and a body; identifying, by the one or more processors, from the article, a plurality of sections; providing, by the one or more processors, to one or more GPT models, for each section of the plurality of sections of the article, a prompt including instructions to generate a section summary for the section based on one or more sentences included in the section, wherein providing the prompt to the one or more GPT models comprises selecting the one or more GPT models from a plurality of GPT models based on the selected one or more GPT models providing higher metrics than non-selected GPT models, the metrics comprising at least two of: a minimization of a classification score indicating a likelihood that the one or more sentences included in the section correspond to the section summary; a minimization of processing time; a minimization of a score indicating a duration a user associated with a client device is reading; a minimization of a period of time associated with a reading time; and a minimization of words on a deny list included within the section summaries; generating, by the one or more processors, an article summary based on the section summary generated for each section; determining, by the one or more processors using the one or more GPT models, from the article summary, a first concept found in the article summary that is missing from the article; determining, by the one or more processors, using a classifier, for a first sentence included in the article summary, a confidence score indicating a likelihood that the first sentence is not supported by the article; and providing, by the one or more processors, for presentation at the client device, a document including the article summary, a first indicator corresponding to the first concept and a second indicator corresponding to the first sentence.
  15. The method of claim 14, wherein the classifier is a first classifier, and the method comprising: determining, by the one or more processors responsive to providing the prompt to generate the section summary, using the first classifier, that at least one sentence included in a section summary of a first section of the plurality of sections of the article has a classification score indicating that the at least one sentence belongs to a second section of the plurality of sections of the article; removing, by the one or more processors, the at least one sentence from the section summary of the first section to generate an updated section summary of the first section; and generating, by the one or more processors, the article summary based on the section summary generated for each section and the updated section summary of the first section.
  16. The method of claim 14, comprising: determining, by the one or more processors, responsive to determining the confidence score indicating the likelihood that the first sentence is not supported by evidence in the article, a score indicating a reading level for the article summary; and providing, by the one or more processors, for presentation at the client device, the document including the score.
  17. The method of claim 14, comprising: determining, by the one or more processors, responsive to determining the confidence score indicating the likelihood that the first sentence is not supported by evidence in the article, a period of time indicating a duration a user associated with the client device is reading for each of the article summary and the article; and providing, by the one or more processors, for presentation at the client device, the document including the period of time.
  18. The method of claim 14, comprising: determining, by the one or more processors, responsive to determining a section identifier and a confidence score for each sentence, that a first confidence score for a sentence is above a threshold confidence score; associating, by the one or more processors, the sentence with the section identifier corresponding to the first confidence score; determining, by the one or more processors, that a second confidence score for a second sentence is at or below the threshold confidence score; and associating, by the one or more processors, the second sentence with a second section identifier different than the section identifier associated with the first confidence score.
  19. The method of claim 14, comprising: determining, by the one or more processors, that the article satisfies a format comprising the body and the title; and accessing, by the one or more processors, the article responsive to the determination that the article satisfies the format.
  20. The method of claim 14, comprising: determining, by the one or more processors, that a second article does not satisfy a format comprising the body and the title; and providing, by the one or more processors, for presentation at the client device, an indication of the second article not satisfying the format.
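
The deny-list substitution recited in claim 5 (replacing denied words via an index mapping the deny list to an allow list) can be sketched as follows. This is a minimal illustration, not the patented implementation; the `DENY_TO_ALLOW` index contents are hypothetical examples.

```python
import re

# Hypothetical deny-list -> allow-list index; the claims specify only that
# such a mapping exists, not its contents.
DENY_TO_ALLOW = {
    "hypertension": "high blood pressure",
    "adverse event": "side effect",
}

def replace_denied_words(summary: str, index: dict) -> str:
    """Parse the summary and replace each deny-list term that appears
    with its allow-list counterpart, per claim 5."""
    for denied, allowed in index.items():
        # Case-insensitive whole-phrase substitution.
        pattern = re.compile(re.escape(denied), re.IGNORECASE)
        summary = pattern.sub(allowed, summary)
    return summary

print(replace_denied_words(
    "Patients with hypertension reported one adverse event.",
    DENY_TO_ALLOW,
))
# → Patients with high blood pressure reported one side effect.
```

A real system would likely also handle inflected forms and preserve capitalization, which this sketch omits.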

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of and priority to U.S. Provisional Patent Application No. 63/603,944, filed on Nov. 29, 2023, the disclosure of which is incorporated herein by reference in its entirety.

BACKGROUND

For regulatory and other purposes, complex texts such as clinical study reports (CSRs) are summarized. Human generation of summaries is time-consuming and requires expertise. Machine generation of such summaries can be computationally expensive, due to processing the text using a large language model. Large language models can be subject to extreme latency and hallucinations when fed a large amount of complex text, such as a CSR. Therefore, it is challenging to appropriately create inputs for a large language model and to validate its outputs when summarizing complex texts.

SUMMARY

The present disclosure relates to one or more systems and methods for summarizing text using a large language model. The data processing system described herein can retrieve an article, determine whether the article includes relevant sections such as an abstract and a title, identify classifications for the article, generate prompts for the article based on sentences in the article, identify acronyms within the article, generate summaries according to the prompts using a large language model, filter the summaries to remove hallucinations, and provide a document including an article summary. The systems and methods can generate plain-language summaries of scientific content, such as CSRs, at a reading level lower than that of the source content, enabling easier comprehension and consequently allowing the summaries to reach a broader segment of the population.

Generation of summaries from complicated texts such as CSRs involves several technological challenges. Large language models are often unable to take large texts as inputs, and even when they can, a large input causes extreme latency.
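
The section-wise summarization flow described above can be sketched as follows. This is a hedged illustration only: `call_gpt_model` is a hypothetical stand-in for an unspecified generative-model API, here stubbed so the sketch runs without external services.

```python
# Sketch of prompting a generative model once per section, then joining
# the section summaries into an article summary.

def call_gpt_model(prompt: str) -> str:
    # Placeholder: a real system would call a hosted large language model.
    # This stub just echoes the section text (the prompt's last line).
    return prompt.splitlines()[-1][:80]

def summarize_article(sections: dict) -> str:
    """Build one prompt per section and join the resulting section
    summaries into a single article summary."""
    section_summaries = []
    for name, text in sections.items():
        prompt = (
            f"Summarize the '{name}' section below in plain language "
            f"at a lower reading level:\n{text}"
        )
        section_summaries.append(call_gpt_model(prompt))
    return " ".join(section_summaries)

summary = summarize_article({
    "Methods": "A randomized, double-blind trial enrolled 200 adults.",
    "Results": "The treatment group improved relative to placebo.",
})
print(summary)
```

Chunking the article by section keeps each prompt small, which is one way to avoid the input-size and latency problems the background describes.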
Furthermore, texts are generally unstructured, which adds complexity to any data extraction or generative artificial intelligence technique. Additionally, large language models can introduce hallucinations. By creating curated inputs from the text, the systems and methods described herein can reduce computational expense, such as time, bandwidth, and processing power, when using a large language model. The systems and methods of this technical solution solve these and other issues by implementing a series of classifications, scores, and prompts to create an input for the large language model. This reduces latency and power consumption while also providing an accurate technical summary of a complex text at a lower reading level. Furthermore, the systems and methods described herein implement a series of filters, both before and after generation of the summary, to remove hallucinations introduced by the large language model, such as concepts not found in the article or conclusions not reached. This technical solution involves aligning content sections of the complex text with the appropriate prompts and validation. In this way, an accurate, fast, and simplified summary of a complex text can be generated by a large language model.

At least one aspect relates to a system. The system can include one or more processors coupled to memory. The one or more processors can be configured to access an article including a title and a body. The one or more processors can identify, from the article, a plurality of sections. The one or more processors can provide, to one or more generative models, for each section of the set of sections of the article, a prompt. The prompt can include instructions to generate a section summary for the section based on the one or more sentences included in the section.
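
The post-generation hallucination filter described above (detecting summary concepts absent from the article) can be sketched with a naive token-overlap heuristic. This is illustrative only: the disclosure uses models and classifiers for this step, not simple set difference.

```python
# Minimal sketch of flagging summary terms the source article never
# mentions, as candidate hallucinations to indicate or remove.

def find_unsupported_terms(summary: str, article: str) -> set:
    """Return summary tokens that do not appear anywhere in the article."""
    article_tokens = set(article.lower().split())
    summary_tokens = set(summary.lower().split())
    return summary_tokens - article_tokens

flags = find_unsupported_terms(
    summary="the drug cured insomnia",
    article="the drug improved sleep quality in adults",
)
print(sorted(flags))
# → ['cured', 'insomnia']
```

A production filter would need to handle synonyms and paraphrase, which is why the claims rely on a trained classifier and confidence scores rather than exact token matching.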
The one or more processors can generate an article summary based on the section summary generated for each section and an updated section summary of a first section. The one or more processors can determine, from the article summary, a first concept found in the article summary that is missing from the article. The one or more processors can determine, using a classifier, for a first sentence included in the article summary, a confidence score indicating a likelihood that the first sentence is not supported by the article. The one or more processors can provide, for presentation at a client device, a document including the article summary, a first indicator corresponding to the first concept, and a second indicator corresponding to the first sentence.

In some embodiments, the classifier can be a first classifier. The one or more processors can determine, using the first classifier, that at least one sentence included in a section summary of a first section of the set of sections of the article has a classification score indicating that the at least one sentence belongs to a second section of the set of sections of the article. The one or more processors can remove the at least one sentence from the section summary of the first section to generate an updated section summary of the first section.
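
Claims 3, 10, and 16 compare reading-level scores for the summary and the source article. The disclosure does not name a specific metric; one common choice is the Flesch-Kincaid grade level, sketched below with a crude syllable heuristic (real systems would use a pronunciation dictionary or a library such as textstat).

```python
import re

def _syllables(word: str) -> int:
    # Crude vowel-group heuristic for syllable counting.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    """Approximate Flesch-Kincaid grade level: one plausible way to
    compute the reading-level scores the claims compare."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(_syllables(w) for w in words)
    return (0.39 * (len(words) / sentences)
            + 11.8 * (syllables / len(words)) - 15.59)

article = ("The randomized investigation demonstrated statistically "
           "significant amelioration.")
summary = "The study showed the drug worked."
print(flesch_kincaid_grade(article), flesch_kincaid_grade(summary))
```

A plain-language summary should score at a lower grade level than its source, which is the comparison the document presented at the client device can display.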