US-12619304-B2 - Device control using gaze information
Abstract
The present disclosure generally relates to controlling electronic devices. In some examples, the electronic device uses gaze information to activate a digital assistant. In some examples, the electronic device uses gaze information to identify an external device on which to act. In some examples, the electronic device provides an indication that distinguishes between different speakers.
Inventors
- Sean B. Kelly
- Felipe BACIM DE ARAUJO E SILVA
- Karlin Y. Bark
Assignees
- APPLE INC.
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-09-09
Claims (18)
- 1. An electronic device that is configured to communicate with a display generation component and one or more cameras, the electronic device comprising: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: obtaining, using the one or more cameras, information about a respective user including a respective direction of the respective user relative to the electronic device and whether the electronic device is within a field of view of the respective user; in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a first user and the respective direction is a first direction: displaying, via the display generation component, a first visual indicator corresponding to the first user, wherein the first visual indicator is displayed in a first position corresponding to the first direction of the first user relative to the electronic device; and in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a second user, different from the first user, and the respective direction is a second direction: displaying, via the display generation component, a second visual indicator corresponding to the second user, wherein the second visual indicator is displayed in a second position corresponding to the second direction of the second user relative to the electronic device.
- 2. The electronic device of claim 1, the one or more programs further including instructions for: prior to obtaining information about the respective user via the one or more cameras, receiving registration information; and associating, using the registration information, the first visual indicator with the first user and the second visual indicator with the second user.
- 3. The electronic device of claim 1, the one or more programs further including instructions for: determining that the respective user is the first user or the second user based on one or more of voice recognition, facial recognition, and the respective direction of the respective user.
- 4. The electronic device of claim 1, the one or more programs further including instructions for: receiving an audio user input request from the respective user; in accordance with a determination that the respective user is the first user, responding to the audio user input request using the first visual indicator; and in accordance with a determination that the respective user is the second user, responding to the audio user input request using the second visual indicator.
- 5. The electronic device of claim 4, wherein responding to the audio user input request includes providing an audio indicator corresponding to a user associated with the audio user input request.
- 6. A non-transitory computer-readable storage medium storing one or more programs configured to be executed by one or more processors of an electronic device that is in communication with a display generation component and one or more cameras, the one or more programs including instructions for: obtaining, using the one or more cameras, information about a respective user including a respective direction of the respective user relative to the electronic device and whether the electronic device is within a field of view of the respective user; in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a first user and the respective direction is a first direction: displaying, via the display generation component, a first visual indicator corresponding to the first user, wherein the first visual indicator is displayed in a first position corresponding to the first direction of the first user relative to the electronic device; and in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a second user, different from the first user, and the respective direction is a second direction: displaying, via the display generation component, a second visual indicator corresponding to the second user, wherein the second visual indicator is displayed in a second position corresponding to the second direction of the second user relative to the electronic device.
- 7. The non-transitory computer-readable storage medium of claim 6, the one or more programs further including instructions for: prior to obtaining information about the respective user via the one or more cameras, receiving registration information; and associating, using the registration information, the first visual indicator with the first user and the second visual indicator with the second user.
- 8. The non-transitory computer-readable storage medium of claim 6, the one or more programs further including instructions for: determining that the respective user is the first user or the second user based on one or more of voice recognition, facial recognition, and the respective direction of the respective user.
- 9. The non-transitory computer-readable storage medium of claim 6, the one or more programs further including instructions for: receiving an audio user input request from the respective user; in accordance with a determination that the respective user is the first user, responding to the audio user input request using the first visual indicator; and in accordance with a determination that the respective user is the second user, responding to the audio user input request using the second visual indicator.
- 10. The non-transitory computer-readable storage medium of claim 9, wherein responding to the audio user input request includes providing an audio indicator corresponding to a user associated with the audio user input request.
- 11. A method, comprising: at an electronic device that is in communication with a display generation component and one or more cameras: obtaining, using the one or more cameras, information about a respective user including a respective direction of the respective user relative to the electronic device and whether the electronic device is within a field of view of the respective user; in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a first user and the respective direction is a first direction: displaying, via the display generation component, a first visual indicator corresponding to the first user, wherein the first visual indicator is displayed in a first position corresponding to the first direction of the first user relative to the electronic device; and in accordance with a determination that the electronic device is within the field of view of the respective user and a determination that the respective user is a second user, different from the first user, and the respective direction is a second direction: displaying, via the display generation component, a second visual indicator corresponding to the second user, wherein the second visual indicator is displayed in a second position corresponding to the second direction of the second user relative to the electronic device.
- 12. The method of claim 11, further comprising: prior to obtaining information about the respective user via the one or more cameras, receiving registration information; and associating, using the registration information, the first visual indicator with the first user and the second visual indicator with the second user.
- 13. The method of claim 11, the method further comprising: determining that the respective user is the first user or the second user based on one or more of voice recognition, facial recognition, and the respective direction of the respective user.
- 14. The method of claim 11, further comprising: receiving an audio user input request from the respective user; in accordance with a determination that the respective user is the first user, responding to the audio user input request using the first visual indicator; and in accordance with a determination that the respective user is the second user, responding to the audio user input request using the second visual indicator.
- 15. The method of claim 14, wherein responding to the audio user input request includes providing an audio indicator corresponding to a user associated with the audio user input request.
- 16. The electronic device of claim 1, wherein: the first visual indicator is displayed with a first color corresponding to a preference of the first user; and the second visual indicator is displayed with a second color corresponding to a preference of the second user.
- 17. The non-transitory computer-readable storage medium of claim 6, wherein: the first visual indicator is displayed with a first color corresponding to a preference of the first user; and the second visual indicator is displayed with a second color corresponding to a preference of the second user.
- 18. The method of claim 11, wherein: the first visual indicator is displayed with a first color corresponding to a preference of the first user; and the second visual indicator is displayed with a second color corresponding to a preference of the second user.
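The device, storage-medium, and method claims above all recite the same conditional display logic: detect which registered user has the device in their field of view, then display that user's indicator at a position corresponding to the user's direction relative to the device. The flow can be sketched as follows; all names, the registry structure, and the direction-to-position mapping are illustrative assumptions, not part of the claims:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data model: registration information (claims 2, 7, 12)
# associates each user with a visual indicator, and claims 16-18 add a
# per-user color preference.
@dataclass
class RegisteredUser:
    name: str
    indicator_color: str

def choose_indicator(user_id: str, direction_deg: float,
                     device_in_fov: bool,
                     registry: dict) -> Optional[dict]:
    """Select the visual indicator and display position for a user.

    Mirrors the structure of claim 1: an indicator is displayed only when
    the device is within the user's field of view, and its position
    corresponds to the user's direction relative to the device.
    """
    if not device_in_fov:
        # Claim 1: no indicator unless the device is in the field of view.
        return None
    # In the claims, identity may come from voice recognition, facial
    # recognition, and/or the user's direction (claims 3, 8, 13); here the
    # caller supplies an already-resolved user_id.
    user = registry.get(user_id)
    if user is None:
        return None
    # Map the user's direction (degrees) to a horizontal coordinate on a
    # hypothetical 100-unit-wide display, so the indicator's position
    # corresponds to the user's direction relative to the device.
    x = round((direction_deg % 360) / 360 * 100)
    return {"user": user.name, "color": user.indicator_color, "x": x}
```

With two registered users, the same call yields differently positioned, differently colored indicators depending on which user is detected, which is the distinction the first/second branches of claim 1 draw.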
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. application Ser. No. 18/105,191, entitled “DEVICE CONTROL USING GAZE INFORMATION”, filed on Feb. 2, 2023, which is a continuation of U.S. application Ser. No. 17/087,855, now issued as U.S. Pat. No. 11,619,991, entitled “DEVICE CONTROL USING GAZE INFORMATION”, filed on Nov. 3, 2020, which is a continuation of U.S. application Ser. No. 16/553,622, now issued as U.S. Pat. No. 10,860,096, entitled “DEVICE CONTROL USING GAZE INFORMATION”, filed on Aug. 28, 2019, which claims priority to U.S. Provisional Application Ser. No. 62/739,087, entitled “DEVICE CONTROL USING GAZE INFORMATION”, filed on Sep. 28, 2018, the contents of which are hereby incorporated by reference in their entirety.
FIELD
The present disclosure relates generally to computer user interfaces, and more specifically to techniques for controlling electronic devices using gaze information.
BACKGROUND
Users frequently provide inputs, such as keypresses and voice inputs, to control electronic devices. For example, users activate a device's button or speak a trigger phrase to start an application on the device. Such inputs frequently require the user to be within arm's reach or within microphone range.
Intelligent automated assistants (or digital assistants) can provide a beneficial interface between human users and electronic devices. Such assistants can allow users to interact with devices or systems using natural language in spoken and/or text forms. For example, a user can provide a speech input containing a user request to a digital assistant operating on an electronic device. The digital assistant can interpret the user's intent from the speech input, operationalize the user's intent into a task, and perform the task. In some systems, performing tasks in this manner may be constrained by the manner in which a task is identified.
In some cases, however, a user may be limited to a particular set of commands such that the user cannot readily instruct a digital assistant to perform a task using natural-language speech inputs. Further, in many instances digital assistants fail to adapt based on previous user behavior and in turn lack a desirable optimization of user experience.
BRIEF SUMMARY
Some techniques for controlling electronic devices, however, are generally cumbersome and inefficient. For example, some existing techniques use a complex and time-consuming user interface, which may include multiple key presses or keystrokes. For another example, some existing techniques require the user to be within arm's distance to activate a button of the device. Existing techniques require more time than necessary, wasting user time and device energy. This latter consideration is particularly important in battery-operated devices.
Accordingly, the present technique provides electronic devices with faster, more efficient methods and interfaces for controlling electronic devices. Such methods and interfaces optionally complement or replace other methods for controlling electronic devices. Such methods and interfaces reduce the cognitive burden on a user and produce a more efficient human-machine interface. For battery-operated computing devices, such methods and interfaces conserve power and increase the time between battery charges. Such techniques also allow users to more efficiently interact with electronic devices in environments where the user is not within reaching distance of the electronic device and/or the user is in a noisy environment (e.g., including noise based on audio being produced by the electronic device).
In accordance with some embodiments, a method is provided. The method is performed at an electronic device.
The method comprises: while a digital assistant of the electronic device is not activated: obtaining, using one or more camera sensors, first gaze information; and in accordance with a determination that the first gaze information satisfies a set of one or more activation criteria: activating the digital assistant of the electronic device; and providing an indication that the set of one or more activation criteria has been satisfied.
In accordance with some embodiments, a non-transitory computer-readable storage medium is provided. The medium stores one or more programs configured to be executed by one or more processors of an electronic device. The one or more programs include instructions for: while a digital assistant of the electronic device is not activated: obtaining, using one or more camera sensors, first gaze information; and in accordance with a determination that the first gaze information satisfies a set of one or more activation criteria: activating the digital assistant of the electronic device; and providing an indication that the set of one or more activation criteria has been satisfied.
In accordance with some embodiments, a transitory computer-readable storage medium is provided. The medium stores one or more programs configured to