US-12625845-B2 - Responding to a user query using machine learning

Abstract

A method, apparatus, non-transitory computer readable medium, and system for data processing include obtaining a query relating to a document and identifying metadata for the document based on the query, where the metadata describes a structure including a plurality of portions of the document. Some embodiments include generating, using a machine learning model, a retrieval command based on the query and the metadata, selectively retrieving at least one of the plurality of portions of the document based on the retrieval command, and generating, using the machine learning model, a response to the query based on the at least one of the plurality of portions of the document.
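
As an illustration of the retrieval-command step, the following is a minimal sketch and not the patented implementation. It assumes the command is a single function call written as text, as claim 1 suggests when it describes a prompt containing the query, the metadata, a plurality of executable functions, and an instruction. The names `fetch_section`, `fetch_page`, `build_prompt`, `run_command`, and `stub_model` are hypothetical, and the language model is replaced by a fixed stub.

```python
import re

# Hypothetical document store; a real system would load an actual document.
SECTIONS = {
    "Introduction": "Long documents may exceed a context window.",
    "Methods": "We retrieve sections by title.",
}
PAGES = ["Page one text.", "Page two text."]


def fetch_section(title: str) -> str:
    """Hypothetical executable function: return a section by its title."""
    return SECTIONS[title]


def fetch_page(number: str) -> str:
    """Hypothetical executable function: return a page by 1-based number."""
    return PAGES[int(number) - 1]


# Only functions listed here may be executed on the model's behalf.
FUNCTIONS = {"fetch_section": fetch_section, "fetch_page": fetch_page}


def build_prompt(query: str, metadata: str) -> str:
    """Combine query, metadata, function list, and instruction into one prompt."""
    signatures = "\n".join(f"- {name}(arg)" for name in FUNCTIONS)
    return (
        "Document structure:\n" + metadata + "\n\n"
        "Available functions:\n" + signatures + "\n\n"
        "Question: " + query + "\n"
        'Reply with exactly one call, e.g. fetch_section("Title").'
    )


# Accepts text of the form name("argument") and nothing else.
CALL = re.compile(r'^\s*(\w+)\(\s*"([^"]*)"\s*\)\s*$')


def run_command(command: str) -> str:
    """Parse the generated command and execute it if the function is allowed."""
    match = CALL.match(command)
    if match is None:
        raise ValueError(f"unparseable command: {command!r}")
    name, arg = match.groups()
    if name not in FUNCTIONS:
        raise ValueError(f"unknown function: {name}")
    return FUNCTIONS[name](arg)


def stub_model(prompt: str) -> str:
    """Stand-in for the language model; always requests the Methods section."""
    return 'fetch_section("Methods")'


prompt = build_prompt("How are sections retrieved?", "- Introduction\n- Methods")
command = stub_model(prompt)
print(run_command(command))
```

Checking the generated name against a fixed table before calling anything is a design choice of this sketch: model output is treated as untrusted text rather than evaluated directly.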

Inventors

  • Jon Saad-Falcon
  • Joseph D. Barrow
  • Varun Manjunatha
  • Anusha Prakash
  • Ryan A. Rossi
  • Franck Dernoncourt
  • Alexa F Siu
  • Ani Nenkova
  • Seunghyun Yoon

Assignees

  • ADOBE INC.

Dates

Publication Date
2026-05-12
Application Date
2024-05-17

Claims (13)

  1. A method for data processing, comprising:
       obtaining, from a computing device, a query relating to a document; and
       in response to the obtaining of the query from the computing device:
         identifying, by a metadata component executed on a data processing system, metadata for the document based on the query, wherein the metadata describes a structure of the document including a plurality of portions of the document;
         generating, by a query component executed on the data processing system, a retrieval command prompt, wherein the retrieval command prompt includes the query, the metadata, a plurality of executable functions, and an instruction to generate, based on the query and the metadata, a retrieval command including at least one executable function of the plurality of executable functions and an argument for the at least one executable function;
         generating, by a language generation machine learning model of the data processing system executing an attention mechanism, the retrieval command based on the instruction by computing a first set of attention weights corresponding to the instruction, wherein the retrieval command is based on the first set of attention weights and includes an executable function of the plurality of executable functions and an argument generated by the language generation machine learning model for the executable function;
         selectively retrieving at least one portion of the plurality of portions of the document based on a context window size of the language generation machine learning model of the data processing system executing the attention mechanism by executing the executable function according to the argument for the executable function;
         generating, by the language generation machine learning model of the data processing system executing the attention mechanism, a response to the query based on the retrieved portion of the document by computing a second set of attention weights corresponding to the retrieved portion of the document, wherein the second set of attention weights is different from the first set of attention weights, wherein the retrieval command is based on the second set of attention weights, and wherein the response comprises natural language text; and
         displaying the response to a user via the computing device.
  2. The method of claim 1, wherein the query specifies the portion of the document.
  3. The method of claim 1, wherein the metadata comprises a hierarchical tree of structural elements included in the document.
  4. The method of claim 3, wherein obtaining the metadata comprises: generating the hierarchical tree of structural elements based on text of the document.
  5. The method of claim 1, wherein the language generation machine learning model is trained to generate text in response to a natural language query.
  6. A non-transitory computer readable medium storing instructions that, when executed by a processor, cause the processor to:
       obtain, from a computing device, a query relating to a document; and
       in response to the obtaining of the query from the computing device:
         identify, by a metadata component executed on a data processing system, metadata for the document based on the query, wherein the metadata describes a structure of the document including a plurality of portions of the document;
         generate, by a query component executed on the data processing system, a retrieval command prompt, wherein the retrieval command prompt includes the query, the metadata, a plurality of executable functions, and an instruction to generate, based on the query and the metadata, a retrieval command including at least one executable function of the plurality of executable functions and an argument for the at least one executable function;
         generate, by a language generation machine learning model of the data processing system executing an attention mechanism, the retrieval command based on the instruction by computing a first set of attention weights corresponding to the instruction, wherein the retrieval command is based on the first set of attention weights and includes an executable function of the plurality of executable functions and an argument generated by the language generation machine learning model for the executable function;
         selectively retrieve at least one portion of the plurality of portions of the document based on a context window size of the language generation machine learning model of the data processing system executing the attention mechanism by executing the executable function according to the argument for the executable function;
         generate, by the language generation machine learning model of the data processing system executing the attention mechanism, a response to the query based on the retrieved portion of the document by computing a second set of attention weights corresponding to the retrieved portion of the document and different from the first set of attention weights, wherein the retrieval command is based on the second set of attention weights and wherein the response comprises natural language text; and
         display the response to a user via the computing device.
  7. The non-transitory computer readable medium of claim 6, wherein the query specifies the portion of the document.
  8. The non-transitory computer readable medium of claim 6, wherein the metadata comprises a hierarchical tree of structural elements included in the document.
  9. The non-transitory computer readable medium of claim 8, wherein the instructions further cause the processor to: generate the hierarchical tree of structural elements based on text of the document.
  10. The non-transitory computer readable medium of claim 6, wherein the language generation machine learning model is trained to generate text in response to a natural language query.
  11. A data processing system, comprising:
       a memory; and
       a processing device coupled to the memory, the processing device configured to perform operations comprising:
         obtaining, from a computing device, a query relating to a document; and
         in response to the obtaining of the query from the computing device:
           identifying, by a metadata component executed on a data processing system, metadata for the document based on the query, wherein the metadata describes a structure of the document including a plurality of portions of the document;
           generating, by a query component executed on the data processing system, a retrieval command prompt, wherein the retrieval command prompt includes the query, the metadata, a plurality of executable functions, and an instruction to generate, based on the query and the metadata, a retrieval command including at least one executable function of the plurality of executable functions and an argument for the at least one executable function;
           generating, by a language generation machine learning model of the data processing system executing an attention mechanism, the retrieval command based on the instruction by computing a first set of attention weights corresponding to the instruction, wherein the retrieval command is based on the first set of attention weights and includes an executable function of the plurality of executable functions and an argument generated by the language generation machine learning model for the executable function;
           selectively retrieving at least one portion of the plurality of portions of the document based on a context window size of the language generation machine learning model of the data processing system executing the attention mechanism by executing the executable function according to the argument for the executable function;
           generating, by the language generation machine learning model of the data processing system executing the attention mechanism, a response to the query based on the retrieved portion of the document by computing a second set of attention weights corresponding to the retrieved portion of the document, wherein the second set of attention weights is different from the first set of attention weights, wherein the retrieval command is based on the second set of attention weights, and wherein the response comprises natural language text; and
           displaying the response to a user via the computing device.
  12. The system of claim 11, further comprising: a metadata component configured to identify the metadata for the document based on the query.
  13. The system of claim 13 is not referenced; claim 13 reads: The system of claim 12, wherein the metadata component is further configured to: generate a hierarchical tree of structural elements based on text of the document.
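
Claims 3, 4, 8, 9, and 13 refer to a hierarchical tree of structural elements generated from the text of the document. The following is a minimal sketch of one way such a tree could be built, under the assumption (not stated in the claims) that headings are marked with leading `#` characters. `Node`, `build_tree`, and `outline` are hypothetical names; the rendered outline is the kind of compact metadata that could be placed in a retrieval command prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One structural element: a heading plus the text directly beneath it."""
    title: str
    level: int
    text: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def build_tree(document: str) -> Node:
    """Build a heading tree; heading depth is the number of leading '#'."""
    root = Node(title="ROOT", level=0)
    stack = [root]  # path from the root to the most recent heading
    for line in document.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            level = len(stripped) - len(stripped.lstrip("#"))
            node = Node(title=stripped.lstrip("#").strip(), level=level)
            # Climb until the top of the stack is a shallower heading.
            while stack[-1].level >= level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        elif stripped:
            stack[-1].text.append(stripped)
    return root


def outline(node: Node, depth: int = 0) -> list[str]:
    """Render the tree as indented lines with a word count per element."""
    lines = []
    for child in node.children:
        words = sum(len(t.split()) for t in child.text)
        lines.append(f"{'  ' * depth}- {child.title} ({words} words)")
        lines.extend(outline(child, depth + 1))
    return lines


DOC = """# Introduction
Long documents may exceed a context window.
## Motivation
Retrieval narrows the context.
# Methods
We retrieve sections by title.
"""

tree = build_tree(DOC)
print("\n".join(outline(tree)))
```

The word counts in the outline are an addition of this sketch; they let a later step estimate how much of a context window each element would occupy.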

Description

BACKGROUND

The following relates generally to natural language processing, and more specifically to responding to a user query using machine learning. Natural language processing (NLP) is a field of machine learning that focuses on understanding, interpreting, and generating human language using computers. NLP can include tasks such as text parsing, sentiment analysis, named entity recognition (NER), language translation, text summarization, speech recognition, and language generation.

In some cases, NLP techniques are used to generate a response to a user query. In some cases, the user query relates to a document, and a machine learning language model uses contents of the document as an information context for generating the response.

However, machine learning language models have context window sizes, or a number of words that the machine learning language model is capable of parsing in connection with each other. In some cases, a number of words in a document exceeds a machine learning language model's context window size, and the machine learning language model is then unable to accurately use the document as context for generating a response to a query about the document, leading to a generation of an inaccurate response. There is therefore a need in the art for a data processing system that generates a more accurate response to a query about a document.

SUMMARY

Embodiments of the present disclosure provide a data processing system that identifies metadata for a document based on a query relating to the document. In some cases, the metadata describes a structure including a plurality of portions of the document.
In some cases, the data processing system generates, using a machine learning model, a retrieval command based on the query and the metadata, selectively retrieves at least one of the plurality of portions of the document based on the retrieval command, and generates, using the machine learning model, a response to the query based on the at least one of the plurality of portions of the document. In some cases, using the machine learning model to generate the retrieval command enables a relevant portion of the document to be retrieved without human supervision or intervention, which increases an efficiency of the response generation process.

Furthermore, in some cases, because the response is generated based on the retrieved portion of the document, rather than the entire document, the machine learning model processes a number of words that fit within the context window size of the machine learning model, thereby increasing an accuracy of the response. Still further, in some cases, by generating the response based on portions of the document that are determined to be a most pertinent context for the response, the data processing system avoids using potentially misleading portions of the document as a response context and therefore provides a more accurate response to a user query than conventional data processing systems can provide, regardless of a context window size of the machine learning model.

A method, apparatus, non-transitory computer readable medium, and system for data processing are described.
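
The context-window argument can be made concrete with a small sketch. It assumes, as the Background does, that the window is measured in words; deployed models typically count tokens produced by a tokenizer, so the word count here is only a stand-in. The function `fit_to_window` and its parameters are hypothetical and are not taken from the disclosure.

```python
def fit_to_window(portions, query, window_words, reserved_for_answer=0):
    """Greedily keep retrieved portions that still fit in the word budget.

    portions: list of (title, text) pairs, in order of assumed relevance.
    Returns the titles that were kept and the number of words they use.
    """
    # Words left for document text after the query and the answer allowance.
    budget = window_words - len(query.split()) - reserved_for_answer
    kept, used = [], 0
    for title, text in portions:
        n = len(text.split())
        if used + n > budget:
            continue  # this portion would overflow; try smaller ones after it
        kept.append(title)
        used += n
    return kept, used


portions = [
    ("Intro", "a b c d"),          # 4 words
    ("Methods", "e f g h i j"),    # 6 words
    ("Results", "k l"),            # 2 words
]
kept, used = fit_to_window(portions, "what is x", window_words=10)
print(kept, used)
```

With a 10-word window and a 3-word query, 7 words remain for document text, so the sketch keeps "Intro" and "Results" and skips "Methods". Skipping an oversized portion instead of stopping at it is a choice of this sketch; stopping would preserve the relevance order more strictly.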
One or more aspects of the method, apparatus, non-transitory computer readable medium, and system include obtaining a query relating to a document; identifying metadata for the document based on the query, wherein the metadata describes a structure including a plurality of portions of the document; generating, using a machine learning model, a retrieval command based on the query and the metadata; selectively retrieving at least one of the plurality of portions of the document based on the retrieval command; and generating, using the machine learning model, a response to the query based on the at least one of the plurality of portions of the document.

An apparatus and system for data processing are described. One or more aspects of the apparatus and system include at least one memory; at least one processor executing instructions stored in the at least one memory; a database including a document stored in the at least one memory; and a machine learning model comprising machine learning parameters stored in the at least one memory, the machine learning model trained to generate a retrieval command for retrieving at least one of a plurality of portions of the document based on a query and metadata of the document and to generate a response to the query based on the at least one of the plurality of portions of the document, wherein the metadata describes a structure including a plurality of portions of the document.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an example of a data processing system according to aspects of the present disclosure.
FIG. 2 shows an example of a data processing apparatus according to aspects of the present disclosure.
FIG. 3 shows an example of a transformer according to aspects of the present disclosure.
FIG. 4 shows an example of data flow in a data processing appar