US-12626067-B2 - Grid-cell highlighting based on LLM attention scores

Abstract

A data analytics system uses a grid-based data structure to improve the usability of LLMs in the analysis of large data sets, to synthesize information for use in other generative AI contexts, and to improve a user's ability to interface with an LLM. A grid-based data structure is a data structure or database that stores the results of column prompts applied to sources. The grid-based data structure may store the results in a relational manner. For example, a grid-based data structure may have rows that correspond to sources (e.g., documents, files, or databases) and columns that correspond to prompts. Each cell of the grid-based data structure stores the output of the column prompt applied to a source using an LLM. Thus, each column prompt may be systematically applied to each source to generate information based on the sources in an organized way.
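The grid-based data structure described above can be sketched in code. The following is a minimal, illustrative sketch and not an implementation from the patent; `run_llm` is a hypothetical stand-in for a real LLM call, and the dictionary layout is an assumption chosen for clarity.

```python
def run_llm(prompt: str, source: str) -> str:
    # Placeholder for a real LLM call; a production system would send the
    # prompt and source context to a hosted large language model here.
    return f"answer({prompt!r}, {source!r})"

def build_grid(sources: list[str], column_prompts: list[str]) -> dict:
    """Build a grid-based data structure: rows correspond to sources,
    columns correspond to prompts, and each cell stores the output of
    applying the column prompt to the row's source using an LLM."""
    grid = {}
    for source in sources:          # one row per source
        for prompt in column_prompts:  # one column per prompt
            grid[(source, prompt)] = run_llm(prompt, source)
    return grid

# Hypothetical sources and column prompts for illustration only.
grid = build_grid(["10-K.pdf", "earnings_call.txt"],
                  ["Summarize revenue.", "List named risks."])
```

Each column prompt is applied to each source exactly once, so the grid holds one cell per (source, prompt) pair, mirroring the relational row/column organization described in the abstract.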

Inventors

  • George Sivulka

Assignees

  • Hebbia Inc.

Dates

Publication Date
2026-05-12
Application Date
2024-11-26

Claims (20)

  1. A method comprising: receiving a content item request from a client device, wherein the content item request comprises a text query to generate content; generating a grid-based data structure comprising a set of rows and a set of columns, where each row is associated with a source of a plurality of sources and each column is associated with a column prompt, and wherein the set of rows and set of columns define a set of cells for the grid-based data structure, and wherein each of the set of cells comprises contents from applying a column prompt of a corresponding column to a source of a corresponding row using a large language model; identifying a subset of the set of cells that relate to the text query; generating a content item prompt based on contents of the identified subset of cells and the text query, wherein the content item prompt comprises text instructions for a large language model to generate content item data for a content item based on the text query and the contents of the identified subset of cells; transmitting the content item prompt to the large language model; receiving a response from the large language model comprising the content item data; generating a content item based on the content item data and the content item request; and transmitting the content item to the client device for display to a user.
  2. The method of claim 1, further comprising: receiving a selection of a portion of the content item through a matrix user interface, wherein the selection identifies a portion of the content item; accessing a set of attention scores corresponding to the portion of the content item, wherein the set of attention scores describe the relevance of contents of the set of cells to the portion of the content item; generating a cell score for each cell of the set of cells based on the set of attention scores; and updating a matrix user interface to highlight each of the set of cells based on the generated cell scores.
  3. The method of claim 2, wherein the content item data comprises a plurality of output tokens and wherein generating a cell score for a cell comprises: identifying a set of output tokens corresponding to the selected portion of the content item; and identifying attention scores associated with the identified set of output tokens.
  4. The method of claim 3, wherein the contents of each cell of the subset of cells comprise a set of input tokens that were input to the large language model, and wherein generating a cell score for a cell comprises: identifying, for each input token of the cell, a subset of attention scores associated with the input token, wherein each of the subset of attention scores also corresponds to an output token of the identified set of output tokens; and computing the cell score for the cell based on the identified subsets of attention scores for the input tokens of the cell.
  5. The method of claim 4, wherein computing the cell score comprises: computing an average score of the attention scores in the identified subsets of attention scores.
  6. The method of claim 1, wherein the content item request comprises a type of content item to generate.
  7. The method of claim 6, wherein the type of content item is one of a document, a presentation, slides, a spreadsheet, an email, a chat message, or a memorandum.
  8. The method of claim 1, wherein the received response from the large language model comprises formatting instructions for formatting content item data within the content item.
  9. The method of claim 1, wherein generating the content item comprises: extracting the content item data from the response.
  10. The method of claim 1, wherein the content item is displayed to the user in a matrix user interface.
  11. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations comprising: receiving a content item request from a client device, wherein the content item request comprises a text query to generate content; generating a grid-based data structure comprising a set of rows and a set of columns, where each row is associated with a source of a plurality of sources and each column is associated with a column prompt, and wherein the set of rows and set of columns define a set of cells for the grid-based data structure, and wherein each of the set of cells comprises contents from applying a column prompt of a corresponding column to a source of a corresponding row using a large language model; identifying a subset of the set of cells that relate to the text query; generating a content item prompt based on contents of the identified subset of cells and the text query, wherein the content item prompt comprises text instructions for a large language model to generate content item data for a content item based on the text query and the contents of the identified subset of cells; transmitting the content item prompt to the large language model; receiving a response from the large language model comprising the content item data; generating a content item based on the content item data and the content item request; and transmitting the content item to the client device for display to a user.
  12. The computer-readable medium of claim 11, the operations further comprising: receiving a selection of a portion of the content item through a matrix user interface, wherein the selection identifies a portion of the content item; accessing a set of attention scores corresponding to the portion of the content item, wherein the set of attention scores describe the relevance of contents of the set of cells to the portion of the content item; generating a cell score for each cell of the set of cells based on the set of attention scores; and updating a matrix user interface to highlight each of the set of cells based on the generated cell scores.
  13. The computer-readable medium of claim 12, wherein the content item data comprises a plurality of output tokens and wherein generating a cell score for a cell comprises: identifying a set of output tokens corresponding to the selected portion of the content item; and identifying attention scores associated with the identified set of output tokens.
  14. The computer-readable medium of claim 13, wherein the contents of each cell of the subset of cells comprise a set of input tokens that were input to the large language model, and wherein generating a cell score for a cell comprises: identifying, for each input token of the cell, a subset of attention scores associated with the input token, wherein each of the subset of attention scores also corresponds to an output token of the identified set of output tokens; and computing the cell score for the cell based on the identified subsets of attention scores for the input tokens of the cell.
  15. The computer-readable medium of claim 14, wherein computing the cell score comprises: computing an average score of the attention scores in the identified subsets of attention scores.
  16. The computer-readable medium of claim 11, wherein the content item request comprises a type of content item to generate.
  17. The computer-readable medium of claim 16, wherein the type of content item is one of a document, a presentation, slides, a spreadsheet, an email, a chat message, or a memorandum.
  18. The computer-readable medium of claim 11, wherein the received response from the large language model comprises formatting instructions for formatting content item data within the content item.
  19. The computer-readable medium of claim 11, wherein generating the content item comprises: extracting the content item data from the response.
  20. The computer-readable medium of claim 11, wherein the content item is displayed to the user in a matrix user interface.
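The cell-scoring scheme recited in claims 2 through 5 can be sketched as follows: for a user-selected span of output tokens, a cell's score is the average of the attention scores between those output tokens and the cell's input tokens. This is a hedged illustration only; the attention-matrix layout (`attention[out][in]`), the token index assignments, and the function name are assumptions, not details specified in the claims.

```python
def cell_score(attention, selected_output_tokens, cell_input_tokens):
    """Average attention score between the selected output tokens and a
    cell's input tokens; attention[o][i] is the score from output token
    o to input token i. Returns 0.0 when either token set is empty."""
    scores = [attention[o][i]
              for o in selected_output_tokens
              for i in cell_input_tokens]
    return sum(scores) / len(scores) if scores else 0.0

# Toy attention matrix: 2 output tokens (rows) x 3 input tokens (columns).
attn = [[0.1, 0.7, 0.2],
        [0.3, 0.5, 0.2]]

# Suppose cell A contributed input tokens 0-1 and cell B contributed
# input token 2, and the user selected both output tokens.
score_a = cell_score(attn, [0, 1], [0, 1])  # mean of 0.1, 0.7, 0.3, 0.5
score_b = cell_score(attn, [0, 1], [2])     # mean of 0.2, 0.2
```

A matrix user interface could then highlight each cell with an intensity proportional to its score, surfacing which cells the model attended to when generating the selected portion of the content item.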

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application No. 63/604,124, entitled “Matrix User Interface for LLM-Powered Data Analysis and Generation” and filed Nov. 29, 2023, and U.S. Provisional Patent Application No. 63/563,117, entitled “Matrix User Interface for LLM-Powered Data Analysis and Generation” and filed Mar. 8, 2024, each of which is incorporated by reference.

BACKGROUND

Generative models, including large language models (LLMs), are machine-learning algorithms that leverage significant numbers of parameters to analyze input data (“prompts”) from users and create an appropriate response. Many systems allow users to interact with generative models through a chat user interface wherein a user provides a prompt and the model provides a response. While these interfaces may be effective for one-off prompts, they are ineffective when a user wants to run many prompts across many sources of data. For example, if a user wants an LLM to generate multiple pieces of information from a document, the user generally must create multiple prompts or must craft a long, complicated prompt that requests all the information. Conversely, if a user wants to extract the same information across multiple documents, the user generally must repeatedly create prompts containing the context of each document. Thus, while chat interfaces can be effective in certain circumstances, they are ineffective at allowing users to use generative models to run multiple prompts across multiple sources. Moreover, the output of generative models often decays in proportion to the complexity of the reasoning required to respond to a prompt correctly. Standard chat interfaces, limited to a single dialogue thread, constrain the possible complexity of tasks to a single line of reasoning.
To get AI-powered systems to correctly execute complex “multi-step” reasoning, repeated “single-step” analysis across many prompts and many sources is often necessary. This also builds trust, observability, and explainability into system output. In addition, generative models are limited by their context window, and inference slows substantially as data is added to the current context. For example, most documents have more words than can currently fit in the context window of the best available LLMs, and inference is slow for any long document. This limits the number of sources and prompts that can be analyzed in a single model thread.

SUMMARY

A data analytics system uses a grid-based data structure to improve the usability of LLMs in the analysis of large data sets, to synthesize information for use in other generative AI contexts, and to improve a user's ability to interface with an LLM. A grid-based data structure is a data structure or database that stores the results of column prompts applied to sources. The grid-based data structure may store the results in a relational manner. For example, a grid-based data structure may have rows that correspond to sources (e.g., documents, files, or databases) and columns that correspond to prompts. Each cell of the grid-based data structure stores the output of the column prompt applied to a source using an LLM. Thus, each column prompt may be systematically applied to each source to generate information based on the sources in an organized way. The data analytics system may present a grid-based data structure through a matrix user interface. The matrix user interface displays the grid-based data structure as a matrix in the user interface, where the rows of the matrix correspond to the rows of the grid-based data structure and the columns of the matrix correspond to the columns of the grid-based data structure.
Each cell of the matrix contains text output by an LLM when the prompt of its corresponding column is applied to the source of its corresponding row. A user can use the matrix user interface to add, edit, or remove column prompts or to set parameters for how those prompts are input to an LLM. The user can also use the matrix user interface to upload sources to the data analytics system or to provide locator information from which the data analytics system may retrieve data for the sources.

A matrix user interface provides many improvements to the technical fields of user interfaces and large language models. A matrix user interface allows a user to perform and utilize LLM-based analytics in a way that is practically impossible through traditional approaches, such as through a chat interface. The matrix user interface leverages the unique relational structure of the grid-based data structure to effectively correlate the input data to the LLM with inferences and predictions made by the LLM. Thus, the matrix user interface expands the capabilities of LLMs. Furthermore, as noted above, an LLM's context window generally limits its ability to ana