US-20260126962-A1 - APPLICATION CREATION ENVIRONMENT USING MULTI-MODALITY INTERFACE OPTIONS

US 20260126962 A1

Abstract

Systems and methods provide for a multi-modal development environment to receive inputs using a variety of different input modalities in different user interfaces (UIs). Multiple user interfaces may be linked within the development environment to maintain state information so that inputs provided to one UI are represented in the other UIs using an appropriate equivalent representation based on the UI modality. Users of the development environment may select a given UI for interaction based on a desired task and then see changes tracked and relayed through the different UIs to verify changes within the development environment. The UIs may also be contextually linked to permit the user to work between both UIs without losing the context due to the switch.

Inventors

  • Daniele Bonadiman
  • Ruhaab Markas
  • Ganesh Kumar Gella
  • Katrin Kirchhoff
  • Sailik Sengupta
  • James Gung
  • Arshit Gupta
  • John Baker
  • Yi-An Lai
  • Sebastien Jean
  • Saab Mansour
  • Santosh Kumar Ameti

Assignees

  • AMAZON TECHNOLOGIES, INC.

Dates

Publication Date
2026-05-07
Application Date
2025-12-30

Claims (20)

  1. A computer-implemented method, comprising: receiving a request corresponding to a first input modality; determining a domain associated with the request; selecting, from a set of actions, one or more actions associated with the domain; generating, for presentation in a first user interface (UI), one or more first representations of the one or more actions; updating, responsive to the presentation in the first UI, a first UI state; determining the first UI state is different from a second UI state; generating, for presentation in a second UI, one or more second representations of the one or more actions corresponding to a second input modality; updating the second UI state; and providing a graphical output of both the first UI and the second UI.
  2. The computer-implemented method of claim 1, further comprising: determining, from the request, one or more keywords or parameters, wherein the domain is determined based, at least in part, on the one or more keywords or parameters.
  3. The computer-implemented method of claim 1, wherein the first UI is a graphical user interface and the second UI is a conversational user interface.
  4. The computer-implemented method of claim 1, wherein the request is associated with a modification category including at least one of a global change, a recommendation request, and an atomic change.
  5. The computer-implemented method of claim 1, further comprising: receiving a second input of the second input modality to modify the second UI; identifying, based at least on the second input and the request, that a context for the second input corresponds to the one or more first representations; and determining one or more second actions to modify the one or more first representations responsive to the second input.
  6. The computer-implemented method of claim 1, wherein the first input modality is a textual input, an interaction with one or more icons, an auditory input, a video input, an image input, a data file input, or a combination thereof.
  7. The computer-implemented method of claim 1, further comprising: providing a builder environment configured to generate an executable application, wherein parameters of the executable application are defined, at least in part, within at least one of the first UI or the second UI.
  8. A system, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the system to: receive a request corresponding to a first input modality; determine a domain associated with the request; select, from a set of actions, one or more actions associated with the domain; generate, for presentation in a first user interface (UI), one or more first representations of the one or more actions; update, responsive to the presentation in the first UI, a first UI state; determine the first UI state is different from a second UI state; generate, for presentation in a second UI, one or more second representations of the one or more actions corresponding to a second input modality; update the second UI state; and provide a graphical output of both the first UI and the second UI.
  9. The system of claim 8, wherein the instructions, when executed by the at least one processor, cause the system to: determine, from the request, one or more keywords or parameters, wherein the domain is determined based, at least in part, on the one or more keywords or parameters.
  10. The system of claim 8, wherein the first UI is a graphical user interface and the second UI is a conversational user interface.
  11. The system of claim 8, wherein the request is associated with a modification category including at least one of a global change, a recommendation request, and an atomic change.
  12. The system of claim 8, wherein the instructions, when executed by the at least one processor, cause the system to: receive a second input of the second input modality to modify the second UI; identify, based at least on the second input and the request, that a context for the second input corresponds to the one or more first representations; and determine one or more second actions to modify the one or more first representations responsive to the second input.
  13. The system of claim 8, wherein the first input modality is a textual input, an interaction with one or more icons, an auditory input, a video input, an image input, a data file input, or a combination thereof.
  14. The system of claim 8, wherein the instructions, when executed by the at least one processor, cause the system to: provide a builder environment configured to generate an executable application, wherein parameters of the executable application are defined, at least in part, within at least one of the first UI or the second UI.
  15. A computer-implemented method, comprising: receiving a request to a first user interface (UI) as a first input modality; selecting, from a set of actions, one or more actions associated with a domain of the request; generating, for presentation in the first UI, one or more first representations of the one or more actions; updating, responsive to the presentation in the first UI, a first UI state; determining the first UI state is different from a second UI state; generating, for presentation in a second UI, one or more second representations of the one or more actions equivalent to the one or more first representations, as a second input modality associated with the second UI; updating, responsive to the presentation in the second UI, the second UI state; and providing a graphical output of both the first UI and the second UI.
  16. The computer-implemented method of claim 15, further comprising: determining, from the request, one or more keywords or parameters, wherein the domain is determined based, at least in part, on the one or more keywords or parameters.
  17. The computer-implemented method of claim 15, wherein the first UI is a graphical user interface and the second UI is a conversational user interface.
  18. The computer-implemented method of claim 15, wherein the request is associated with a modification category including at least one of a global change, a recommendation request, and an atomic change.
  19. The computer-implemented method of claim 15, further comprising: receiving a second input of the second input modality to modify the second UI; identifying, based at least on the second input and the request, that a context for the second input corresponds to the one or more first representations; and determining one or more second actions to modify the one or more first representations responsive to the second input.
  20. The computer-implemented method of claim 15, wherein the first input modality is a textual input, an interaction with one or more icons, an auditory input, a video input, an image input, a data file input, or a combination thereof.
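Read as an algorithm, independent claim 1 amounts to a domain-routing and state-synchronization procedure between two modality-specific interfaces. A minimal sketch in Python follows; all concrete names (`ACTIONS_BY_DOMAIN`, `render_graphical`, `render_conversational`, the example domains and actions) are hypothetical illustrations, not taken from the application.

```python
# Hypothetical sketch of the method of claim 1: route a request to a
# domain, select the domain's actions, render them per modality, and
# reconcile the two UI states when they diverge.

ACTIONS_BY_DOMAIN = {
    "booking": ["collect_dates", "confirm_reservation"],
    "support": ["open_ticket", "suggest_article"],
}

def determine_domain(request: str) -> str:
    # Per claim 2, the domain may be inferred from keywords in the request.
    return "booking" if "reserve" in request.lower() else "support"

def render_graphical(actions):
    # First representations: e.g. clickable/draggable icons in a GUI.
    return [f"[icon:{a}]" for a in actions]

def render_conversational(actions):
    # Second representations: plain-language equivalents for a CUI.
    return [f"I can help you {a.replace('_', ' ')}." for a in actions]

def handle_request(request: str):
    domain = determine_domain(request)
    actions = ACTIONS_BY_DOMAIN[domain]

    first_state = {"items": render_graphical(actions)}
    second_state = {"items": []}

    # Claim 1: on detecting that the first UI state differs from the
    # second, generate equivalent representations for the second
    # modality and update the second UI state.
    if first_state["items"] != second_state["items"]:
        second_state["items"] = render_conversational(actions)

    # Provide a graphical output of both UIs.
    return first_state, second_state
```

The point of the sketch is the reconciliation step: the second UI is not re-derived from the raw input but from the same selected actions, so both interfaces present equivalent views of one underlying choice.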

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation application and claims priority to U.S. patent application Ser. No. 18/477,994, filed Sep. 29, 2023, which is incorporated by reference herein in its entirety.

BACKGROUND

Developers often create customizable interaction environments for users to accomplish specific tasks or help guide certain actions. For example, a developer may offer up a chat bot or other interactive service when a user visits their website or application, which may receive inputs or prompts from the user and then provide suggestions or replies to receive additional information from the user. Interaction environments may need to be highly customizable to specifically support operations associated with the developer, which may lead to systems that are difficult to create and maintain without extensive knowledge, cost, and time. Additionally, ongoing maintenance and support of the services may further increase costs.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments in accordance with the present disclosure will be described with reference to the drawings, in which:

FIG. 1 illustrates an example system for providing a development environment in accordance with various embodiments.
FIGS. 2A-2H illustrate example representations of builder tools for a development environment in accordance with various embodiments.
FIG. 3 illustrates an example environment for generating interaction environments using an interaction manager in accordance with various embodiments.
FIGS. 4A-4D illustrate example representations of a testing and validation environment that can be utilized in accordance with various embodiments.
FIG. 5 illustrates an example process for tracking and updating state information between user interfaces in a builder environment that can be utilized in accordance with various embodiments.
FIG. 6 illustrates an example process for tracking and updating state information between user interfaces in a builder environment that can be utilized in accordance with various embodiments.
FIG. 7 illustrates an example environment in which aspects of various embodiments can be implemented.
FIG. 8 illustrates components of an example data center that can be utilized in accordance with various embodiments.
FIG. 9 illustrates components of an example computing device that can be used to perform aspects of the various embodiments.

DETAILED DESCRIPTION

Embodiments of the present disclosure are directed toward development environments for generating, maintaining, and testing interaction environments using a multi-modal interaction system. Various embodiments may include a development environment to permit a developer (e.g., a client, a user, etc.) to generate a customizable interaction environment, such as a chat bot or other artificial intelligence (AI) service, that provides a number of different interfaces to enable the developer to input information in a variety of different ways. These different interfaces may be updated to represent a common state, where different interfaces are updated responsive to an input or interaction with another, thereby providing developers multiple potential options to build out, modify, and/or test their interaction environments. In at least one embodiment, the interfaces may be configured or particularized to receive input from different modalities, such as a first interface serving as a graphical user interface (GUI) where the developer can click/drag icons, a second interface serving as a conversational user interface (CUI) where the developer can interact with plain language to accomplish a task, a third interface serving as a media user interface (MUI) where the developer can upload media content (e.g., images, video, transcripts, audio, etc.), and/or the like. Furthermore, in at least one embodiment, different interfaces may be combined to accomplish multiple tasks, such as the CUI and MUI functioning as a common interface. As a result, systems and methods may provide a multi-modal development environment to enable one or more developers to generate different interaction environments using different input commands.

Various embodiments address problems associated with designing and maintaining interaction environments, including, as a non-limiting example, enterprise chat bots. These processes are often time-consuming, labor-intensive, and expensive, typically requiring skills related to software engineering, machine learning, and user experience design. Moreover, onboarding to a new chat bot platform is also difficult, as each platform may have its own concepts and best practices. Systems and methods address and overcome these problems, among others, by providing a multi-modal development environment that enables developers to design, build, and test their interaction environments through a fully integrated multimodal interface with a number of different interface options, such as a GUI and a CUI. In at least one embodiment, CUI-based
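The "common state" coupling described above can be pictured as a simple observer pattern: each interface registers against one shared model, any input (regardless of modality) mutates that model once, and the model then pushes an equivalent representation to every linked UI. The following is a minimal, hypothetical sketch of that pattern; the class and method names are illustrative and do not appear in the application.

```python
# Hypothetical sketch of linked UIs sharing one state, so that an
# input to either interface is reflected in both.

class SharedState:
    """Single source of truth observed by every linked UI."""

    def __init__(self):
        self.actions = []
        self.listeners = []

    def attach(self, ui):
        self.listeners.append(ui)

    def apply_input(self, actions):
        # Any modality's input updates the common state once...
        self.actions = list(actions)
        # ...and every linked UI re-renders its own representation,
        # so no interface drifts out of sync with the others.
        for ui in self.listeners:
            ui.render(self.actions)

class GraphicalUI:
    def render(self, actions):
        # Graphical representation: icon widgets.
        self.view = [f"[icon:{a}]" for a in actions]

class ConversationalUI:
    def render(self, actions):
        # Conversational representation: plain-language prompts.
        self.view = [f"Shall we {a.replace('_', ' ')}?" for a in actions]

state = SharedState()
gui, cui = GraphicalUI(), ConversationalUI()
state.attach(gui)
state.attach(cui)

# A drag-and-drop in the GUI and a typed command in the CUI both
# funnel through the same state, keeping the two views equivalent.
state.apply_input(["collect_dates"])
```

Because every mutation flows through `SharedState`, the developer can switch between interfaces mid-task without losing context, which is the behavior the embodiments above describe.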