EP-4736057-A1 - COMPOSE ASSISTANT MANAGER FOR AN APPLICATION

Abstract

An application may receive a prompt from a user related to an input for a text field of digital content displayed on a user device. An application may generate context data about the digital content. An application may provide the prompt and the context data to a generative language model. An application may receive a response generated by the generative language model and provide the response as a suggestion for the input for the text field.
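
The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation described in the publication: the names `ContextData`, `build_model_input`, `suggest_for_text_field`, and `fake_model` are hypothetical, the context is reduced to a page title and page text, and the generative language model is replaced by a stand-in function so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContextData:
    """Hypothetical container for context data about the digital content."""
    page_title: str
    page_text: str


def build_model_input(prompt: str, context: ContextData) -> str:
    # Combine the user's prompt with the context data into one model input.
    # The layout of this string is an assumption made for illustration.
    return (
        f"Page title: {context.page_title}\n"
        f"Page content: {context.page_text}\n"
        f"User request: {prompt}"
    )


def suggest_for_text_field(
    prompt: str, context: ContextData, model: Callable[[str], str]
) -> str:
    # Provide the prompt and context to the model, then return its response
    # as the suggestion for the text field.
    return model(build_model_input(prompt, context)).strip()


def fake_model(model_input: str) -> str:
    # Stand-in for a generative language model (assumption): it only echoes
    # the last line of its input so the example is deterministic.
    return "DRAFT for -> " + model_input.splitlines()[-1]


context = ContextData("Trail Shoes X", "Lightweight running shoes.")
suggestion = suggest_for_text_field("write a positive review", context, fake_model)
print(suggestion)
```

Passing the model as a callable keeps the sketch independent of any particular model service; a real application would substitute its own client there.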

Inventors

WONG, JANICE AN-LEI
HU, Mingpu
JABLONSKI, MEGAN MICHAUX
SERVICE, TRAVIS COE
BAIO, Arielle
MEJIA REYES, JUAN BERNARDO
KNIPPSCHILD, CARLOS EDUARDO
CROUSE, Michael Blair
TITOV, DMITRY
DEWITT, JUSTIN ROBERT
BANSAL, TARUN
YU, YOUNG BIN

Assignees

Google LLC

Dates

Publication Date: 20260506
Application Date: 20240826

Claims (20)

1. A method comprising: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data about the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.
2. The method of claim 1, further comprising: detecting an interaction with the text field; and determining, by a model, whether to render a callout affordance, the callout affordance, when selected, configured to render a compose assistant interface for the text field, the compose assistant interface having an input field configured to receive the textual data from the user.
3. The method of claim 2, further comprising: determining whether to render the callout affordance based on signals, the signals including one or more signals about the text field, one or more signals about the digital content, or one or more signals about the user and other users of a compose assistant.
4. The method of any one of claims 1 to 3, further comprising: in response to an amount of the textual data inputted by the user into the text field achieving a threshold level, rendering a callout affordance, the callout affordance, when selected, configured to render a compose assistant interface for the text field, the compose assistant interface having an input field with the textual data.
5. The method of any one of claims 1 to 4, further comprising: receiving a selection to a user interface object with respect to the text field of the digital content; rendering a compose assistant interface for the text field, the compose assistant interface having an input field configured to receive the textual data from the user; and in response to selection of a generate control of the compose assistant interface, transmitting the textual data and the context data to the generative language model.
6. The method of any one of claims 1 to 5, further comprising: receiving a selection of the textual data inputted by the user into the text field; and rendering a compose assistant interface with a control, which when selected, causes transmission of the textual data and the context data to the generative language model.
7. The method of any one of claims 1 to 6, further comprising: in response to an amount of the textual data inputted by the user into the text field achieving a threshold level, transmitting the textual data and the context data; and providing the response as a suggestion in a compose assistant interface.
8. The method of claim 7, further comprising: detecting a cursor position on the suggestion; and providing a preview of the response in the text field.
9. The method of any one of claims 1 to 8, further comprising: inserting the response into the text field.
10. The method of any one of claims 1 to 9, wherein the digital content is a web page, the method further comprising: retrieving first page content of the web page; retrieving second page content of a web page embedded into the web page; and generating the context data to include the first page content and the second page content.
11. The method of any one of claims 1 to 10, wherein the digital content is a web page, the method further comprising: retrieving a document object model (DOM) representation of the web page; extracting a DOM portion from the DOM representation; and generating the context data to include the DOM portion.
12. The method of any one of claims 1 to 11, wherein the digital content is a web page, the method further comprising: retrieving an accessible content structure of the web page; and generating the context data to include the accessible content structure.
13. An apparatus comprising: at least one processor; and a non-transitory computer-readable medium storing executable instructions that cause the at least one processor to execute operations, the operations comprising: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data about the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.
14. The apparatus of claim 13, wherein the operations further comprise: determining, by a model, whether to render a callout affordance based on signals, the signals including one or more signals about the text field, one or more signals about the digital content, or one or more signals about the user and other users of a compose assistant, the callout affordance, when selected, configured to render a compose assistant interface for the text field, the compose assistant interface having an input field configured to receive the textual data from the user.
15. The apparatus of claim 13 or 14, wherein the operations further comprise: in response to an amount of the textual data inputted by the user into the text field achieving a threshold level, rendering a callout affordance, the callout affordance, when selected, configured to render a compose assistant interface for the text field, the compose assistant interface having an input field with the textual data.
16. The apparatus of any one of claims 13 to 15, wherein the operations further comprise: receiving a selection to a user interface object with respect to the text field of the digital content; and rendering a compose assistant interface for the text field, the compose assistant interface having an input field configured to receive the textual data from the user.
17. The apparatus of any one of claims 13 to 16, wherein the digital content is a web page, wherein the operations further comprise: retrieving first page content of the web page; retrieving second page content of a web page embedded into the web page; and generating the context data to include the first page content and the second page content.
18. A non-transitory computer-readable medium storing executable instructions that cause at least one processor to execute operations, the operations comprising: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data for the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.
19. The non-transitory computer-readable medium of claim 18, wherein the operations further comprise: determining, by a model, whether to render a callout affordance, the callout affordance, when selected, configured to render a compose assistant interface for the text field, the compose assistant interface having an input field configured to receive the textual data from the user.
20. The non-transitory computer-readable medium of claim 18 or 19, wherein the digital content is a web page, wherein the operations further comprise: retrieving first page content of the web page; retrieving second page content of a web page embedded into the web page; and generating the context data to include the first page content and the second page content.
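
Two of the steps recited in the claims lend themselves to a short sketch: assembling context data from a web page plus a page embedded in it (claims 10, 17, and 20), and rendering a callout affordance once the typed text reaches a threshold (claims 4 and 15). The Python below is an illustration under assumptions, not the claimed implementation. The function names, the dictionary keys, the `max_chars` truncation, and the 20-character threshold are all hypothetical, and page content is approximated as visible text extracted with the standard-library `html.parser`.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def generate_context_data(main_html: str, embedded_html: list[str],
                          max_chars: int = 2000) -> dict:
    # First page content comes from the top-level page; second page content
    # comes from each embedded page. max_chars is an assumed size cap.
    return {
        "first_page_content": visible_text(main_html)[:max_chars],
        "second_page_content": [visible_text(h)[:max_chars] for h in embedded_html],
    }


def should_render_callout(typed_text: str, threshold_chars: int = 20) -> bool:
    # Assumed rule: show the callout once the typed text reaches the threshold.
    return len(typed_text.strip()) >= threshold_chars


main_page = "<h1>Trail Shoes X</h1><script>var a=1;</script><p>Leave a review</p>"
embedded_page = "<p>4.5 stars from 120 buyers</p>"
context_data = generate_context_data(main_page, [embedded_page])
print(context_data)
print(should_render_callout("these shoes are great for trails"))
```

Claims 2, 3, 14, and 19 instead describe a model deciding whether to render the callout from signals about the text field, the content, and users; the fixed character threshold here stands in only for the simpler threshold variant.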

Description

COMPOSE ASSISTANT MANAGER FOR AN APPLICATION

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority to U.S. Provisional Patent Application No. 63/578,816, filed August 25, 2023, the disclosure of which is incorporated by reference herein in its entirety.

BACKGROUND

[0002] Some web pages include text boxes that obtain text input from a user. Examples may include web pages that enable users to leave reviews about a product, a service, a place, etc., web pages that enable users to leave comments or replies to comments, web pages that enable users to post messages (e.g., web pages for social media websites), and/or web pages that include a survey, etc. A user can use a generative language model to help draft input content for a web page. However, the user may have to be relatively specific in their terminology when drafting their prompt and/or may have to perform multiple iterations with the language model to create a desired review. Further, obtaining contextual data from web content used by generative language models may pose one or more technical challenges relating to security.

SUMMARY

[0003] This disclosure relates to a compose assistant manager for an application (e.g., a browser application) that integrates a generative model (e.g., a language model) for drafting content as input to a text field of digital content (e.g., a web page) that provides one or more technical benefits of maintaining the security of application content (e.g., web pages) and/or reducing the amount of computing resources (e.g., memory, CPU) consumed for generating and inserting generative content into (e.g., directly into) the text field of the digital content. The compose assistant manager may provide reduced overhead to the user when creating prompts and tailor the generated outputs to the context of the digital content.
The compose assistant manager may generate one or more context signals (also referred to as context data) about the digital content (e.g., web page), and the compose assistant manager may transmit textual data received from a user (e.g., also referred to as a prompt or a user-provided prompt) and the context signals to the generative language model, which returns a model response that can be directly inserted into the text field. Put another way, the compose assistant manager assists the user in entering text into a text field provided by a computer system, and does so using technical information, specifically context information about the web page, which could be content from the web page.

[0004] In some aspects, the techniques described herein relate to a method including: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data about the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.

[0005] In some aspects, the techniques described herein relate to an apparatus including: at least one processor; and a non-transitory computer-readable medium storing executable instructions that cause the at least one processor to execute operations, the operations including: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data about the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.

[0006] In some aspects, the techniques described herein relate to a non-transitory computer-readable medium storing executable instructions that cause at least one processor to execute operations, the operations including: receiving textual data from a user related to an input for a text field of digital content displayed on a user device; generating context data for the digital content; providing the textual data and the context data to a generative language model; receiving a response generated by the generative language model; and providing the response as a suggestion for the input for the text field.

[0007] The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] FIG. 1A illustrates an example callout affordance for invoking a compose assistant manager according to an aspect.

[0009] FIG. 1B illustrates an example callout affordance for invoking a compose assistant manager according to an aspect.

[0010] FIG. 1C illustrates a compose assistant interface for receiving a prompt according to an aspect.

[0011] FIG. 1D illustrates a compose assistant interface for displaying a model response according to an as