EP-4736132-A1 - MANAGING AUTOMATIONS USING NATURAL LANGUAGE INPUT
Abstract
This document describes systems and techniques for implementing automations using natural language inputs. A natural language request is received to create a new automation. The systems and techniques determine components associated with the new automation and create the new automation based on the natural language request. Images are received from an image capture device associated with the new automation and analyzed to determine whether at least one image satisfies the components associated with the new automation. The new automation is triggered if at least one received image satisfies the components associated with the new automation.
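The flow summarized in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described by the applicants: the `Automation` and `Detection` types, the keyword-based `parse_request` helper, and the label vocabularies are all hypothetical, and the per-image labels are assumed to come from some upstream image analysis step.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical vocabularies; a real system would likely derive components
# with a language model rather than keyword matching.
KNOWN_OBJECTS = ("dog", "package", "car", "person")
KNOWN_ACTIVITIES = ("digging", "running", "arriving", "leaving")


@dataclass(frozen=True)
class Automation:
    """Components extracted from a request, plus the action to run on trigger."""
    request: str
    obj: Optional[str]
    activity: Optional[str]
    action: Callable[[], None]


@dataclass(frozen=True)
class Detection:
    """Labels assumed to be produced for one image by an upstream analyzer."""
    objects: frozenset
    activities: frozenset


def parse_request(request: str, action: Callable[[], None]) -> Automation:
    """Determine components (object and/or activity) by naive keyword lookup."""
    words = request.lower().replace(",", " ").replace(".", " ").split()
    obj = next((o for o in KNOWN_OBJECTS if o in words), None)
    activity = next((a for a in KNOWN_ACTIVITIES if a in words), None)
    if obj is None and activity is None:
        raise ValueError("no recognizable components in request")
    return Automation(request, obj, activity, action)


def satisfies(automation: Automation, detection: Detection) -> bool:
    """An image satisfies the automation when every stated component is present."""
    if automation.obj is not None and automation.obj not in detection.objects:
        return False
    if automation.activity is not None and automation.activity not in detection.activities:
        return False
    return True


def process_images(automation: Automation, detections: Iterable[Detection]) -> bool:
    """Trigger the automation on the first satisfying image; report whether it fired."""
    for detection in detections:
        if satisfies(automation, detection):
            automation.action()
            return True
    return False
```

For instance, calling `parse_request("Tell me when the dog is digging in the yard", notify)` with a hypothetical `notify` callback yields an automation whose components are the object `dog` and the activity `digging`; `process_images` then runs `notify` on the first image whose labels include both.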
Inventors
- RAMAMURTHI, INDU
- TAI, Ryan Kam Wang
- LI, YUAN
- WANG, JINGBIN
Assignees
- Google LLC
Dates
- Publication Date: 2026-05-06
- Application Date: 2024-01-23
Claims (15)
- 1. A method comprising: receiving a natural language request to create a new automation; determining components associated with the new automation based on the natural language request; creating the new automation based on the determined components; receiving images from an image capture device; analyzing the received images to determine whether at least one image satisfies the components associated with the new automation; and triggering the new automation responsive to at least one received image satisfying the components associated with the new automation.
- 2. The method of claim 1, wherein the components associated with the new automation include at least one of an object, an activity, or a combination of an object and an activity.
- 3. The method of claim 1 or 2, further comprising: encoding the natural language request using a text embedding model; creating features associated with the encoded natural language request; identifying features of the received images from the image capture device; comparing the identified features of the received images to the features associated with the encoded natural language request; and identifying relevant received images based on the comparison.
- 4. The method of claim 3, wherein the features associated with the encoded natural language request and the features of the received images are stored in a common embedding space.
- 5. The method of any one of claims 1 to 4, wherein analyzing the received images is performed by an image processing system.
- 6. The method of any one of claims 1 to 5, further comprising: responsive to triggering the new automation, initiating an activity.
- 7. The method of claim 6, wherein the activity includes at least one of generating an alarm, locking a door, unlocking a door, turning on a smart light, turning off a smart light, turning on a sprinkler, or activating a smart device.
- 8. The method of any one of claims 1 to 7, further comprising: responsive to triggering the new automation, notifying a user of the triggering.
- 9. The method of claim 8, wherein notifying the user includes at least one of communicating a text message to the user, communicating a video segment to the user, or communicating an audible message to the user.
- 10. The method of any one of claims 1 to 9, further comprising summarizing a plurality of events and locating video segments associated with the summarized events using a large language model (LLM).
- 11. The method of claim 10, wherein the LLM is trained based on the results of summarizing events and locating video segments associated with the summarized events.
- 12. The method of any one of claims 1 to 11, wherein analyzing the received images includes at least one of identifying at least one object, classifying at least one object, or identifying at least one activity.
- 13. The method of claim 12, wherein analyzing the received images determines a sequence in multiple images of the received images, the determined sequence identifying the at least one activity based on a movement of the at least one object identified in the sequence of the multiple images.
- 14. The method of any one of claims 1 to 13, wherein the method is implemented by executing instructions stored on one or more non-transitory computer-readable media.
- 15. The method of any one of claims 1 to 13, wherein the method is implemented by an apparatus that includes an image processing system and an automation management system.
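Claims 3 and 4 describe comparing features of the encoded request with features of the received images in a common embedding space. The Python sketch below illustrates that comparison only in outline. Everything in it is an assumption made for illustration: `VOCAB`, `embed_text`, `embed_image`, and the similarity threshold are hypothetical stand-ins, the "image encoder" operates on tags assumed to describe each image, and a real system would presumably use learned text and image encoders rather than count vectors.

```python
import math
from typing import Dict, List, Sequence

# Toy shared vocabulary defining the axes of the common embedding space.
# Purely illustrative; not taken from the source document.
VOCAB = ("dog", "cat", "package", "person", "digging", "running", "door")


def embed_tokens(tokens: Sequence[str]) -> List[float]:
    """Map tokens to a unit-length count vector over VOCAB (stand-in encoder)."""
    counts = [float(sum(1 for t in tokens if t == term)) for term in VOCAB]
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm > 0 else counts


def embed_text(request: str) -> List[float]:
    """Stand-in for the text embedding model applied to the request."""
    return embed_tokens(request.lower().split())


def embed_image(tags: Sequence[str]) -> List[float]:
    """Stand-in for an image encoder; takes tags assumed to describe the image."""
    return embed_tokens([t.lower() for t in tags])


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; inputs are already unit length or all zeros."""
    return sum(x * y for x, y in zip(a, b))


def relevant_images(
    request: str,
    images: Dict[str, Sequence[str]],
    threshold: float = 0.8,
) -> List[str]:
    """Return ids of images whose features lie close to the request's features."""
    query = embed_text(request)
    return [
        image_id
        for image_id, tags in images.items()
        if cosine(query, embed_image(tags)) >= threshold
    ]
```

Because both encoders write into the same vector space, a single similarity measure can rank images against the request; the threshold then decides which images count as relevant. Lowering it admits partial matches, such as an image showing the object without the activity.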
Description
MANAGING AUTOMATIONS USING NATURAL LANGUAGE INPUT
BACKGROUND
[0001] Camera systems provide a variety of benefits by capturing images of objects and activities within the camera’s field of view. The captured images may be surrounding a home, business, or other area depending on the location and orientation of the camera. The captured images may provide security and monitoring activities for a user of the camera system.
[0002] Some existing cameras can detect a few objects in the captured images, such as people, vehicles, or animals. However, these existing cameras are typically limited to identifying this small set of objects. This prevents users from detecting more sophisticated objects through their camera and prevents them from identifying more interesting situations or activities captured by the camera.
SUMMARY
[0003] This document describes systems and techniques for managing automations using natural language input. In some aspects, these systems and techniques receive a phrase spoken by a user and, based on the phrase, identify a particular event, create an automation, monitor images captured by a camera, and generate an appropriate trigger upon detection of the automation based on the images captured by the camera. In response to the trigger, the systems and techniques may initiate an activity (such as an alert) and/or communicate a notification of the trigger to a user. In some situations, the systems and techniques may trigger detection of an automation based on activities or events that are not related to a captured image, such as activities or events associated with devices or systems.
[0004] Allowing a user to define automations using natural language input simplifies the process for the user. Instead of requiring the user to remember specific phrases and exact terms, the user merely speaks in their own language to describe the desired automation. The systems and techniques described herein process the user’s natural language input, determine the user’s desired automation, and create that automation. Thus, the user can quickly and easily set up new automations without having to learn specific phrases or techniques required by the automation system.
[0005] For example, a method comprises receiving a natural language request from a user to create a new automation and determining components associated with the new automation based on the natural language request. The method further comprises creating a new automation based on the determined components and receiving images from an image capture device associated with the new automation. The method further comprises analyzing the received images to determine whether at least one image satisfies the components associated with the new automation. The new automation is triggered if at least one image satisfies the components associated with the new automation.
[0006] In another example, an apparatus includes an image processing system configured to receive images from an image capture device. An automation management system is coupled to the image processing system and configured to receive a natural language request from a user to create a new automation. The automation management system also determines components associated with the new automation based on the natural language request and creates the new automation based on the determined components. The automation management system further analyzes images received by the image processing system to determine whether at least one image satisfies the components associated with the new automation. The automation management system triggers the new automation if at least one received image satisfies the components associated with the new automation.
[0007] This document also describes other methods, configurations, and systems for implementing automations based on natural language statements. Optional features of one aspect, such as the apparatus or method described above, may be combined with other aspects.
[0008] This summary is provided to introduce simplified concepts for implementing automations based on natural language statements, which is further described below in the detailed description and drawings. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The details of one or more aspects for managing automations using natural language input are described in this document with reference to the following drawings. The same numbers are used throughout multiple drawings to reference like features and components.
[0010] FIG. 1 illustrates an example diagram of a computer system in which techniques for managing automations using natural language input can be implemented.
[0011] FIG. 2 illustrates an example process for creating a new automation using a natural language request.
[0012] FIG. 3 illustrates an example process for triggering