EP-4738102-A1 - RENDERING GUI COMPONENTS FOR REFINING QUERIES TO GENERATIVE MACHINE LEARNING MODELS

Abstract

A computer-implemented method includes receiving (S1001) an output from a generative machine learning model (300). The output includes a definition (1412) of a graphical user interface (GUI) component (610, 620, 630, 640) configured to receive user input for refining an input (211). The definition (1412) of the GUI component (610, 620, 630, 640) is according to a predefined schema. Based on the definition (1412), executable code (124) for rendering the GUI component (610, 620, 630, 640) is generated (S1002), and the code (124) is caused to be executed to render the GUI component (610, 620, 630, 640). The method provides a mechanism for rendering dynamically-generated GUI components for refining inputs to generative machine learning models.

Inventors

  • WILLIAMS, John Herbert Martin
  • DROSOS, Ian Zachariah
  • SARKAR, Advait
  • WILSON, Nicholas Charles

Assignees

  • Microsoft Technology Licensing LLC

Dates

Publication Date
2026-05-06
Application Date
2025-10-24

Claims (15)

  1. A computer-implemented method comprising: receiving (S1001), from a first generative machine learning model (300), an output comprising a definition (1412) of a graphical user interface, GUI, component (610, 620, 630, 640) configured to receive user input for refining an input (211), the definition of the GUI component (610, 620, 630, 640) being according to a predefined schema; generating (S1002), based on the definition of the GUI component (610, 620, 630, 640), executable code (124) for rendering the GUI component; and causing (S1003) execution of the executable code to render the GUI component (610, 620, 630, 640).
  2. The method of claim 1, comprising: receiving data representative of a user input made to the GUI component (610, 620, 630, 640) to refine the input (211); generating an input for a second generative machine learning model (300) based on the data representative of the user input and the input (211); providing the input (211) to the second generative machine learning model (300), and in response receiving a response (133) based on the input (211); and causing rendering of the response (133).
  3. The method of claim 2, wherein generating the input for the second generative machine learning model (300) comprises retrieving stored text corresponding to the user input and including the text in the input (211), wherein the stored text is included in the received definition of the GUI component (610, 620, 630, 640).
  4. The method of claim 2 or 3, comprising: receiving data representative of a second user input made to the GUI component (610, 620, 630, 640) to refine the input (211); generating a second input for a second generative machine learning model (300) based on the data representative of the second user input and the input (211); providing the second input to the second generative machine learning model (300), and in response receiving a second response comprising a response to the input (211); and causing rendering of the second response.
  5. The method of any of claims 2 to 4, wherein the response comprises one or more of text, images, audio, video or code.
  6. The method of any of claims 2 to 5, comprising: storing a session option (213) representative of a refinement for an input that applies to further queries in a same session; wherein the input for the second generative machine learning model (300) is further based on the session option (213).
  7. The method of claim 6, comprising: receiving data representative of a second user input made to the GUI component to designate the user input for refining the input (211) as a session option (213); and storing the session option (213) based on the data representative of the second user input.
  8. The method of claim 6 or 7, comprising: receiving data representative of a user input including a description (680) of the session option (213); generating an input for the first generative machine learning model (300) comprising the description (680) of the session option (213) and GUI component definition generation instructions, which when processed by the first generative machine learning model (300) cause the first generative machine learning model (300) to generate a definition of a second GUI component (670) according to the predefined schema; receiving, from the first generative machine learning model (300), the definition of the second GUI component (670); generating, based on the definition of the second GUI component (670), executable code for rendering the GUI component; causing execution of the executable code to render the second GUI component (670); receiving data representative of input made to the second GUI component (670); and storing the session option (213) based on the data representative of input made to the second GUI component (670).
  9. The method of any preceding claim, comprising: receiving the input (211); generating an input for the first generative machine learning model (300) comprising the input (211) and GUI component definition generation instructions, which when processed by the first generative machine learning model (300) cause the first generative machine learning model (300) to generate the definition of the GUI component (610, 620, 630, 640) according to the predefined schema; and providing the input (211) to the first generative machine learning model (300).
  10. The method of claim 9 wherein the GUI component definition generation instructions comprise a definition (1412) of the predefined schema.
  11. The method of any preceding claim, wherein the predefined schema defines a data structure for representing a GUI component (610, 620, 630, 640), including one or more of: an appearance of the GUI component, a label for the GUI component, a set of options for the GUI component, and an initial value of the set of options.
  12. The method of any preceding claim, wherein the generating, based on the definition of the GUI component (610, 620, 630, 640), executable code (124) for rendering the GUI component (610, 620, 630, 640) comprises: parsing the definition of the GUI component (610, 620, 630, 640) to determine a component type and attributes of the GUI component (610, 620, 630, 640); retrieving template executable code (143) corresponding to the component type; and generating the executable code (124) based on the template executable code (143) and the attributes of the GUI component (610, 620, 630, 640).
  13. The method of any preceding claim, comprising: sequentially receiving a plurality of definitions of GUI components (610, 620, 630, 640) from the first generative machine learning model (300); upon receipt of a first of the plurality of definitions of GUI components (610, 620, 630, 640): generating, based on the definition of the GUI component (610, 620, 630, 640), executable code (124) for rendering a first GUI component; causing execution of the executable code (124) to render the first GUI component; upon receipt of a second of the plurality of definitions of GUI components (660, 670): generating, based on the definition of the GUI component (610, 620, 630, 640), executable code for rendering a second GUI component (660, 670); and causing execution of the executable code to render the second GUI component (660, 670).
  14. A computer system (1200) comprising a processor (1202) and a memory (1204) storing instructions, the instructions when executed by the processor (1202) causing the system (1200) to: receive (S1001), from a first generative machine learning model (300), an output comprising a definition (1412) of a graphical user interface, GUI, component (610, 620, 630, 640) configured to receive user input for refining an input (211), the definition of the GUI component (610, 620, 630, 640) being according to a predefined schema; generate (S1002), based on the definition (1412) of the GUI component (610, 620, 630, 640), executable code (124) for rendering the GUI component (610, 620, 630, 640); and cause execution of the executable code (124) to render the GUI component (610, 620, 630, 640).
  15. A non-transitory computer-readable medium comprising instructions, which when executed by a processor (1202), cause the processor (1202) to: receive (S1001), from a first generative machine learning model (300), an output comprising a definition (1412) of a graphical user interface, GUI, component (610, 620, 630, 640) configured to receive user input for refining an input (211), the definition of the GUI component (610, 620, 630, 640) being according to a predefined schema; generate (S1002), based on the definition (1412) of the GUI component (610, 620, 630, 640), executable code (124) for rendering the GUI component (610, 620, 630, 640); and cause execution of the executable code (124) to render the GUI component (610, 620, 630, 640).
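The generation step recited in claims 1 and 12, parsing a schema-conformant definition to determine a component type and attributes, retrieving template executable code for that type, and filling the template with the attributes, can be sketched as follows. The JSON field names (`type`, `name`, `label`, `options`) and the HTML templates are illustrative assumptions; the patent's predefined schema and template executable code (143) could take other forms.

```python
import json
from string import Template

# Hypothetical template code keyed by component type; the actual template
# executable code (143) could target any UI framework, not just HTML.
TEMPLATES = {
    "checkbox": Template('<label><input type="checkbox" name="$name"> $label</label>'),
    "radio": Template('<label><input type="radio" name="$name" value="$value"> $label</label>'),
    "textbox": Template('<label>$label <input type="text" name="$name"></label>'),
}

def generate_code(definition_json: str) -> str:
    """Parse a GUI component definition and emit rendering code (claim 12):
    determine the component type and attributes, retrieve the template
    for that type, and substitute the attributes into the template."""
    definition = json.loads(definition_json)
    component_type = definition["type"]      # e.g. "radio"
    template = TEMPLATES[component_type]     # retrieve template code by type
    if component_type == "radio":
        # One input element per entry in the definition's set of options.
        return "\n".join(
            template.substitute(name=definition["name"], value=opt, label=opt)
            for opt in definition["options"]
        )
    return template.substitute(name=definition["name"], label=definition["label"])

definition = '{"type": "radio", "name": "tone", "label": "Tone", "options": ["formal", "casual"]}'
html = generate_code(definition)
```

Because the definition arrives in a predefined schema rather than as free-form model output, the mapping from definition to executable code is deterministic and needs no further model call.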

Description

Background

In a wide range of settings, users are required to provide input queries to generative artificial intelligence (AI) models, such as Large Language Models (LLMs) or image generation models. Standalone interfaces for directly accessing such models to provide input and receive responses are commonplace, and exist in the form of web interfaces and mobile applications. An example is OpenAI®'s ChatGPT interface. Furthermore, interfaces that involve providing input queries to generative AI models are increasingly being incorporated into other applications, in the form of intelligent assistants. For example, many Microsoft® applications include Copilot® functionality.

Summary

According to a first aspect of the disclosure, there is provided a computer-implemented method comprising: receiving, from a first generative machine learning model, an output comprising a definition of a graphical user interface (GUI) component configured to receive user input for refining an input, the definition of the GUI component being according to a predefined schema; generating, based on the definition of the GUI component, executable code for rendering the GUI component; and causing execution of the executable code to render the GUI component.

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Nor is the claimed subject matter limited to implementations that solve any or all of the disadvantages noted herein.

Brief Description of the Drawings

To assist understanding of the present disclosure and to show how embodiments may be put into effect, reference is made by way of example to the accompanying drawings in which:

FIG. 1 is a block diagram of an example system.
FIG. 2 is a block diagram illustrating an example options generator.
FIG. 3 is a schematic representation of the structure of an example prompt.
FIG. 4 is an example schema for defining a GUI component.
FIG. 5 is a block diagram illustrating an example rendering engine.
FIG. 6 is a schematic representation of an example user interface.
FIG. 7 is a block diagram illustrating an example query executor.
FIG. 8 is a schematic representation of another example user interface.
FIG. 9 is a schematic representation of another example user interface.
FIG. 10 is a schematic flowchart of an example method.
FIG. 11 is a block diagram of an example computing system.

Detailed Description

Generative models, when provided with input (e.g. an instruction or query), consume a significant amount of computing resource in generating a response. This is especially the case for large language models and the like, which may be executed on high-performance computing hardware. Typically, such models are hosted in remote environments and accessible over a network connection, for example via an application programming interface (API), so that network resources are consumed by interacting with the model. However, providing suitable input to generative AI models that results in accurate responses may be difficult for users. For example, they may forget to include important instructions or context for getting accurate or informative responses, or they may not know exactly what they are looking for. This causes excessive use of computing resources, and excessive network communication.

To address these difficulties, the disclosure provides a means of dynamically generating graphical user interface (GUI) components based on an input, such as a query or instruction intended for input to a generative model. The GUI components comprise inputs for refining the input, allowing the user to select or provide appropriate inputs (e.g. radio buttons, check boxes, text boxes) to refine the input.
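Claims 9 and 10 describe generating an input for the first model that combines the user's input with GUI component definition generation instructions, which may themselves include a definition of the predefined schema. A minimal sketch of that prompt construction, plus a conformance check on the model's output, follows; the prompt wording, the JSON field names, and the helper names are assumptions for illustration only.

```python
import json

# Hypothetical field set standing in for the predefined schema (FIG. 4),
# which is not reproduced in this excerpt.
COMPONENT_SCHEMA_FIELDS = {"type", "name", "label", "options"}

def build_prompt(user_input: str, schema_description: str) -> str:
    """Combine the input with GUI component definition generation
    instructions (claims 9-10); embedding the schema in the instructions
    lets the model's output be parsed deterministically downstream."""
    return (
        "Generate GUI component definitions for refining the query below.\n"
        "Each definition must be a JSON object with fields conforming to:\n"
        f"{schema_description}\n"
        f"Query: {user_input}"
    )

def validate_definition(raw: str) -> dict:
    """Reject model output that strays outside the predefined schema, so
    only well-formed definitions reach the rendering engine."""
    definition = json.loads(raw)
    unknown = set(definition) - COMPONENT_SCHEMA_FIELDS
    if unknown:
        raise ValueError(f"fields outside schema: {unknown}")
    return definition

prompt = build_prompt("Write a product announcement", "type, name, label, options")
definition = validate_definition('{"type": "checkbox", "name": "cite", "label": "Include sources"}')
```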
The GUI components comprise inputs that are generated based on the input query. In other words, the options provided may be generated from, and specific to, the input query. This increases the accuracy of the responses provided, and thus reduces the computational and network overhead caused by repeated queries.

In examples, a generative AI model is used in the process of generating the GUI components. Particularly, the input is provided to a generative model, along with instructions that cause the generative model to provide an output in the form of definitions of GUI components according to a predefined schema. In one step, this therefore provides both the refinement options for the query and suitable information for rendering the options. This reduces the computational and network resource consumption that would be associated with separately causing the model to generate the refinement options and then subsequently the code for rendering them. These definitions are then provided to a rendering engine, which is configured to interpret the GUI component de
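The refinement round trip of claims 2 and 3, where the received definition carries stored text for each option and the text corresponding to the user's selection is included in the input for the second model, can be sketched as below. The `stored_text` field, the function names, and the stubbed model call are illustrative assumptions, not taken from the patent.

```python
# Hypothetical definition received from the first model: alongside the
# rendering attributes it includes stored text per option (claim 3).
definition = {
    "type": "radio",
    "name": "tone",
    "label": "Tone",
    "options": ["formal", "casual"],
    "stored_text": {
        "formal": "Use a formal, professional tone.",
        "casual": "Use a relaxed, conversational tone.",
    },
}

def refine_input(original_input: str, definition: dict, selected_option: str) -> str:
    """Retrieve the stored text corresponding to the user's selection and
    include it in the input for the second generative model (claims 2-3)."""
    stored = definition["stored_text"][selected_option]
    return f"{original_input}\n{stored}"

def second_model(refined_input: str) -> str:
    """Stand-in for the second generative machine learning model (300);
    a real system would issue an API call here."""
    return f"[response to: {refined_input!r}]"

refined = refine_input("Write a product announcement", definition, "formal")
response = second_model(refined)
```

Because the refinement text is carried inside the definition itself, selecting an option needs no extra model call before the refined input is submitted, which is the resource saving the disclosure emphasizes.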