
US-12619452-B2 - Intelligent automated assistant in a messaging environment


Abstract

Systems and processes for operating an intelligent automated assistant in a messaging environment are provided. In one example process, a graphical user interface (GUI) having a plurality of previous messages between a user of the electronic device and the digital assistant can be displayed on a display. The plurality of previous messages can be presented in a conversational view. User input can be received and in response to receiving the user input, the user input can be displayed as a first message in the GUI. A contextual state of the electronic device corresponding to the displayed user input can be stored. The process can cause an action to be performed in accordance with a user intent derived from the user input. A response based on the action can be displayed as a second message in the GUI.
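The process summarized above can be illustrated with a minimal sketch, assuming a simple in-memory message list. All names here (`ContextualState`, `Message`, `Conversation`, `handle_user_input`, and the `respond` callback) are hypothetical and are not taken from the patent; the sketch only shows a user input being recorded as a first message together with a stored snapshot of device context, followed by a response recorded as a second message.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class ContextualState:
    """Hypothetical snapshot of device state captured when input arrives."""
    timestamp: datetime
    location: Optional[str] = None
    running_apps: Tuple[str, ...] = ()


@dataclass
class Message:
    sender: str                      # "user" or "assistant"
    text: str
    context: Optional[ContextualState] = None


@dataclass
class Conversation:
    """Conversational view: an ordered list of messages."""
    messages: List[Message] = field(default_factory=list)

    def handle_user_input(
        self,
        text: str,
        context: ContextualState,
        respond: Callable[[str, ContextualState], str],
    ) -> str:
        # Show the user input as a message and store the contextual
        # state that corresponds to it.
        self.messages.append(Message("user", text, context))
        # Stand-in for intent derivation and action execution (assumed).
        reply = respond(text, context)
        # Show the response as the next message.
        self.messages.append(Message("assistant", reply))
        return reply
```

A caller would supply its own `respond` function; the sketch deliberately leaves intent derivation and task execution outside its scope.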

Inventors

  • Petr KARASHCHUK
  • Tomas A. VEGA GALVEZ
  • Thomas R. Gruber

Assignees

  • APPLE INC.

Dates

Publication Date: 2026-05-05
Application Date: 2023-07-20

Claims (20)

  1. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device with a display, cause the electronic device to: display, on the display, a first conversational element associated with a previous interaction between a user of the electronic device and a digital assistant implemented on the electronic device, wherein the previous interaction occurs at a first time, and wherein the first conversational element represents an interactive session that includes two or more previous messages between the user and the digital assistant; detect a user input corresponding to a selection of the first conversational element; in response to detecting the user input corresponding to the selection of the first conversational element: retrieving a previous contextual state of the electronic device associated with the first conversational element at the first time when the previous interaction occurred, wherein the previous contextual state includes information related to a state of the electronic device at the first time when a previous user input associated with the first conversational element was received, and wherein each of the two or more previous messages between the user and the digital assistant is associated with the previous contextual state of the electronic device; and after receiving the user input corresponding to the selection of the first conversational element: receive, at a second time after the first time, a user request; and in response to receiving the user request: display a representation of the user request as a second conversational element; and display a response to the user request, wherein the response to the user request is determined based on the user request and the previous contextual state of the electronic device associated with the first conversational element at the first time when the previous interaction occurred that is retrieved in response to the user input corresponding to the selection of the first conversational element.
  2. The non-transitory computer-readable storage medium of claim 1, wherein displaying the response to the user request includes displaying the response as a media object and/or as text.
  3. The non-transitory computer-readable storage medium of claim 2, wherein the media object comprises an image, a video clip, or an audio clip.
  4. The non-transitory computer-readable storage medium of claim 1, wherein the user request includes a media object.
  5. The non-transitory computer-readable storage medium of claim 1, wherein the first conversational element includes a previous message received at the first time.
  6. The non-transitory computer-readable storage medium of claim 1, wherein the previous contextual state of the electronic device associated with the first conversational element includes a location of the electronic device at the first time.
  7. The non-transitory computer-readable storage medium of claim 1, wherein the previous contextual state of the electronic device associated with the first conversational element includes prior dialogue between the user and the digital assistant.
  8. The non-transitory computer-readable storage medium of claim 1, wherein the previous contextual state of the electronic device associated with the first conversational element includes time and date information at the first time.
  9. The non-transitory computer-readable storage medium of claim 1, wherein the previous contextual state of the electronic device associated with the first conversational element includes application information associated with the electronic device at the first time.
  10. The non-transitory computer-readable storage medium of claim 9, wherein the application information includes at least one of: contact information; search history; information indicating which applications are installed on the electronic device at the first time; and information indicating which applications are actively running on the electronic device at the first time.
  11. The non-transitory computer-readable storage medium of claim 1, wherein the response to the user request is determined locally at the electronic device.
  12. The non-transitory computer-readable storage medium of claim 1, wherein the response to the user request is determined by an external electronic device.
  13. The non-transitory computer-readable storage medium of claim 1, wherein displaying the response to the user request includes displaying, in a messaging application user interface, an indication that a portion of the response to the user request is selectable, and wherein the one or more programs comprise further instructions, which when executed by the one or more processors of the electronic device, cause the electronic device to: detect user selection of the indication; and in response to detecting the user selection of the indication, display a first application user interface different from the messaging application user interface.
  14. The non-transitory computer-readable storage medium of claim 1, wherein displaying the response to the user request includes displaying an indication that a more detailed response to the user request is available, and wherein the one or more programs comprise further instructions, which when executed by the one or more processors of the electronic device, cause the electronic device to: detect user selection of the indication; and in response to detecting the user selection of the indication, display expanded results associated with the response to the user request.
  15. The non-transitory computer-readable storage medium of claim 14, wherein: the response to the user request is displayed in a messaging application user interface; and the expanded results are displayed in an application user interface different from the messaging application user interface.
  16. The non-transitory computer-readable storage medium of claim 1, wherein displaying the response to the user request includes displaying a request for additional information.
  17. The non-transitory computer-readable storage medium of claim 16, wherein the request for additional information includes a list of one or more suggestions, and wherein the one or more programs comprise further instructions, which when executed by the one or more processors of the electronic device, cause the electronic device to: detect user selection of a first suggestion of the one or more suggestions; and in response to detecting the user selection of the first suggestion, displaying a representation of the first suggestion as a third conversational element.
  18. The non-transitory computer-readable storage medium of claim 1, wherein: the user request includes an ambiguous term; and the ambiguous term is resolved based on the previous contextual state of the electronic device to determine the response to the user request based on the previous contextual state of the electronic device.
  19. The non-transitory computer-readable storage medium of claim 1, wherein the one or more programs comprise further instructions, which when executed by the one or more processors of the electronic device, cause the electronic device to: concurrently display, on the display: the first conversational element associated with the previous interaction between the user and the digital assistant; and a third conversational element associated with a second previous interaction between the user and the digital assistant, wherein the third conversational element is different from the first conversational element, the second previous interaction is different from the previous interaction, and the second previous interaction occurs at a third time different from the first time; while concurrently displaying the first conversational element and the third conversational element, detect a second user input corresponding to a selection of the third conversational element, wherein a second previous contextual state of the electronic device associated with the third conversational element is selected based on the second user input corresponding to the selection of the third conversational element; and after receiving the second user input corresponding to the selection of the third conversational element: receive, at a fourth time after the third time, a second user request; and in response to receiving the second user request: display a representation of the second user request as a fourth conversational element; and display a second response to the user request, wherein the second response to the second user request is determined based on the second previous contextual state of the electronic device associated with the third conversational element that is selected based on the second user input corresponding to the selection of the third conversational element.
  20. A method, comprising: at an electronic device with a display: displaying, on the display, a first conversational element associated with a previous interaction between a user of the electronic device and a digital assistant implemented on the electronic device, wherein the previous interaction occurs at a first time, and wherein the first conversational element represents an interactive session that includes two or more previous messages between the user and the digital assistant; detecting a user input corresponding to a selection of the first conversational element; in response to detecting the user input corresponding to the selection of the first conversational element: retrieving a previous contextual state of the electronic device associated with the first conversational element at the first time when the previous interaction occurred, wherein the previous contextual state includes information related to a state of the electronic device at the first time when a previous user input associated with the first conversational element was received, and wherein each of the two or more previous messages between the user and the digital assistant is associated with the previous contextual state of the electronic device; and after receiving the user input corresponding to the selection of the first conversational element: receiving, at a second time after the first time, a user request; and in response to receiving the user request: displaying a representation of the user request as a second conversational element; and displaying a response to the user request, wherein the response to the user request is determined based on the user request and the previous contextual state of the electronic device associated with the first conversational element at the first time when the previous interaction occurred that is retrieved in response to the user input corresponding to the selection of the first conversational element.
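
The flow recited in claims 1 and 20 can be illustrated with a minimal sketch under stated assumptions. The names `DeviceContext`, `ConversationalElement`, `AssistantSession`, `select`, and `request` are hypothetical, and the string substitution for the word "there" is only a toy stand-in for resolving an ambiguous term (claim 18); none of this is the claimed implementation. The sketch shows that selecting a previous conversational element retrieves its stored contextual state, and that a later request is answered using that retrieved state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DeviceContext:
    """Hypothetical contextual state saved at the time of a past interaction."""
    location: str
    captured_at: str  # time and date information, kept as text for simplicity


@dataclass(frozen=True)
class ConversationalElement:
    element_id: int
    summary: str
    context: DeviceContext


class AssistantSession:
    def __init__(self, elements: List[ConversationalElement]) -> None:
        self._elements: Dict[int, ConversationalElement] = {
            e.element_id: e for e in elements
        }
        self._active_context: Optional[DeviceContext] = None
        self.transcript: List[Tuple[str, str]] = []

    def select(self, element_id: int) -> DeviceContext:
        # Selecting an element retrieves the context stored with it.
        self._active_context = self._elements[element_id].context
        return self._active_context

    def request(self, text: str) -> str:
        # Show the request as a new conversational element.
        self.transcript.append(("user", text))
        ctx = self._active_context
        if ctx is not None and "there" in text:
            # Toy disambiguation: bind "there" to the retrieved location.
            resolved = text.replace("there", f"in {ctx.location}")
            reply = f"Resolved request: {resolved}"
        elif ctx is None:
            # No retrieved context: ask for additional information.
            reply = "Which place do you mean?"
        else:
            reply = f"Resolved request: {text}"
        self.transcript.append(("assistant", reply))
        return reply
```

With two stored elements, selecting a different element before issuing the same request changes the answer, which mirrors the second selection described in claim 19.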

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 17/949,136, filed on Sep. 20, 2022, entitled “INTELLIGENT AUTOMATED ASSISTANT IN A MESSAGING ENVIRONMENT,” which is a continuation of U.S. application Ser. No. 15/931,384, now U.S. Pat. No. 11,526,368, filed on May 13, 2020, entitled “INTELLIGENT AUTOMATED ASSISTANT IN A MESSAGING ENVIRONMENT,” which is a continuation of U.S. application Ser. No. 15/151,191, now U.S. Pat. No. 10,691,473, filed on May 10, 2016, entitled “INTELLIGENT AUTOMATED ASSISTANT IN A MESSAGING ENVIRONMENT,” which claims priority to U.S. Provisional Application Ser. No. 62/252,311, filed on Nov. 6, 2015, entitled “INTELLIGENT AUTOMATED ASSISTANT IN A MESSAGING ENVIRONMENT.” The entire contents of each of these applications are incorporated herein by reference in their entireties.

FIELD

This relates generally to intelligent automated assistants and, more specifically, to intelligent automated assistants in a messaging environment.

BACKGROUND

Intelligent automated assistants (or digital assistants) can provide a beneficial interface between human users and electronic devices. Such assistants can allow users to interact with devices or systems using natural language in spoken and/or text forms. For example, a user can provide a speech input containing a user request to a digital assistant operating on an electronic device. The digital assistant can interpret the user's intent from the speech input and operationalize the user's intent into tasks. The tasks can then be performed by executing one or more services of the electronic device, and a relevant output responsive to the user request can be returned to the user.

Typically, electronic devices implement a dedicated user interface for interacting with the digital assistant. For example, an electronic device can implement a dedicated voice interface for interacting with the digital assistant. Such dedicated user interfaces can limit the opportunities for interaction, which can limit the widespread adoption and application of digital assistants to benefit people's lives.

SUMMARY

Systems and processes for operating an intelligent automated assistant in a messaging environment are provided. In one example process, a graphical user interface (GUI) having a plurality of previous messages between a user of the electronic device and the digital assistant can be displayed on a display. The plurality of previous messages can be presented in a conversational view. User input can be received and in response to receiving the user input, the user input can be displayed as a first message in the GUI. A contextual state of the electronic device corresponding to the displayed user input can be stored. The process can cause an action to be performed in accordance with a user intent derived from the user input. A response based on the action can be displayed as a second message in the GUI.

In another example process, a GUI having a plurality of previous messages between a user and the digital assistant can be displayed on a display of an electronic device. The plurality of previous messages can be presented in a conversational view. A first user input including a media object can be received. In response to receiving the first user input, the media object can be displayed as a first message in the GUI. A second user input including text can be received. In response to receiving the second user input, the text can be displayed as a second message in the GUI. The process can cause a user intent corresponding to the first user input and the second user input to be determined. A determination of whether the user intent requires extracting text from the media object can be obtained. In response to obtaining a determination that the user intent requires extracting text from the media object: text from the media object can be extracted, a task in accordance with the user intent can be performed using the extracted text, and a response indicative of the user intent being satisfied can be displayed as a third message in the GUI.

In yet another example process, a GUI having a plurality of previous messages between a user of the electronic device and a user of a remote device can be displayed on the display of an electronic device. The plurality of previous messages can be presented in a conversational view. A first user input addressed to the digital assistant can be received from the user of the electronic device. In response to receiving the first user input, the first user input can be displayed as a first message in the GUI. The process can cause an action to be performed in accordance with a user intent derived from the first user input. A response based on the action can be displayed as a second message in the GUI.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating a system and environment for implementing a digital assistant according to various examples.

FIG. 2A is a block diagram illust