US-20260127463-A1 - FLOW ORCHESTRATION FOR MODEL-BASED AGENTS

Abstract

A computerized system and method for flow orchestration for language model-based agents are provided. A workflow comprising a plurality of agents configured to execute in a multi-step multi-pass (MSMP) mode is defined. A request for data is received by a generative artificial intelligence (GAI) model. A portion of the requested data is retrieved based on executing a first agent of the plurality of agents of the workflow, e.g., in a first pass of the MSMP mode. The GAI model adjusts the workflow based on the retrieved portion of the requested data. For example, an order of the plurality of agents of the workflow (e.g., which agent is to be executed first, second, and so on) is adjusted based on the retrieved portion of the requested data. The requested data is obtained based on executing the adjusted workflow.
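
The sequence described in the abstract (execute a first agent, adjust the agent order based on the partial result, then execute the adjusted workflow) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the GAI model is replaced by a deterministic stand-in function (`adjust_order`), and the `Workflow` class, the agent names (`lookup`, `listing`, `pricing`), and the reordering rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# An agent receives the data gathered so far and returns new fields to merge in.
Agent = Callable[[dict], dict]


@dataclass
class Workflow:
    agents: dict[str, Agent]  # agent name -> callable
    order: list[str]          # execution order, which may be adjusted


# Hypothetical agents; real agents would call external services or a model.
def lookup(data: dict) -> dict:
    return {"offer_type": "private"}


def listing(data: dict) -> dict:
    return {"listing": f"{data['offer_type']} listing"}


def pricing(data: dict) -> dict:
    return {"price": 100 if data["offer_type"] == "private" else 0}


def adjust_order(workflow: Workflow, partial: dict) -> Workflow:
    """Stand-in for the GAI model (assumption): reorder agents from partial data.

    Hypothetical rule: for a 'private' offer, run 'pricing' before 'listing'.
    """
    order = list(workflow.order)  # copy so the original workflow is untouched
    if partial.get("offer_type") == "private" and {"pricing", "listing"} <= set(order):
        i, j = order.index("pricing"), order.index("listing")
        if i > j:
            order[i], order[j] = order[j], order[i]
    return Workflow(agents=workflow.agents, order=order)


def run(workflow: Workflow, request: str) -> tuple[dict, list[str]]:
    data: dict = {"request": request}
    trace: list[str] = []

    # First pass: execute only the first agent to retrieve a portion of the data.
    first = workflow.order[0]
    data.update(workflow.agents[first](data))
    trace.append(first)

    # Adjust the workflow based on the retrieved portion.
    adjusted = adjust_order(workflow, data)

    # Second pass: execute every agent of the adjusted workflow in order.
    for name in adjusted.order:
        data.update(adjusted.agents[name](data))
        trace.append(name)
    return data, trace


workflow = Workflow(
    agents={"lookup": lookup, "listing": listing, "pricing": pricing},
    order=["lookup", "listing", "pricing"],
)
result, trace = run(workflow, "create a SaaS offer")
```

In this sketch `trace` records which agent ran in which pass, so the effect of the adjustment on the execution order can be inspected directly. Re-running the first agent in the second pass follows the arrangement recited in claim 2; an implementation could equally reuse the first-pass result.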

Inventors

Yan Li
Yu Zhang
Qianyun Chang

Assignees

MICROSOFT TECHNOLOGY LICENSING, LLC

Dates

Publication Date: 20260507
Application Date: 20250226

Claims (20)

1. A system comprises: a processor; and a memory comprising computer program code, the memory and the computer program code configured to cause the processor to: generate a workflow comprising a plurality of agents to be executed in a multi-step multi-pass (MSMP) mode; receive, by a generative artificial intelligence (GAI) model, a request for data; retrieve a portion of the requested data based on executing a first agent of the plurality of agents of the workflow; adjust, by the GAI model, the workflow based on the retrieved portion of the requested data; and obtain the requested data based on executing the adjusted workflow.
2. The system of claim 1, wherein the portion of the requested data is retrieved based on executing the first agent of the plurality of agents of the workflow in a first pass of the MSMP mode, and the requested data is obtained based on executing the first agent of the plurality of agents of the adjusted workflow in a second pass of the MSMP mode.
3. The system of claim 1, wherein the memory and the computer program code are configured to cause the processor to identify the first agent of the plurality of agents of the workflow to retrieve the portion of the requested data.
4. The system of claim 1, wherein adjusting the workflow comprises adjusting an order of the plurality of agents of the workflow based on the retrieved portion of the requested data.
5. The system of claim 1, wherein the memory and the computer program code are configured to cause the processor to retrieve another portion of the requested data based on executing a second agent of the plurality of agents of the workflow, wherein the workflow is adjusted based on the retrieved other portion of the requested data.
6. The system of claim 1, wherein the request for data is received as a voice input or a text input by the GAI model from a user, and the memory and the computer program code are configured to cause the processor to provide the requested data as a voice output or a text output to the user.
7. A computerized method comprising: defining a workflow comprising a plurality of steps; receiving, by a generative artificial intelligence (GAI) model, a request for data; retrieving a portion of the requested data based on executing a first agent associated with a first step of the plurality of steps of the workflow; adjusting, by the GAI model, the workflow based on the retrieved portion of the requested data; and obtaining the requested data based on executing the adjusted workflow.
8. The computerized method of claim 7, wherein the plurality of steps is to be executed in a multi-step multi-pass (MSMP) mode.
9. The computerized method of claim 8, wherein the portion of the requested data is retrieved based on executing the first agent in a first pass of the MSMP mode, wherein the requested data is obtained based on executing the first agent in a second pass of the MSMP mode after the workflow has been adjusted.
10. The computerized method of claim 7, further comprising identifying the first agent associated with the first step of the plurality of steps of the workflow to retrieve the portion of the requested data.
11. The computerized method of claim 7, wherein adjusting the workflow comprises adjusting an order of the plurality of steps of the workflow based on the retrieved portion of the requested data.
12. The computerized method of claim 7, further comprising retrieving another portion of the requested data based on executing a second agent associated with a second step of the workflow, wherein the workflow is adjusted based on the retrieved other portion of the requested data.
13. The computerized method of claim 7, wherein the request for data is received as a voice input or a text input by the GAI model from a user and the requested data is provided to the user as a voice output or a text output.
14. The computerized method of claim 7, further comprising: receiving another request for data; identifying, by the GAI model, the workflow applicable for the other request; retrieving a portion of the requested data based on executing the first agent associated with the first step of the plurality of steps of the identified workflow; and obtaining the requested data based on executing the adjusted workflow without adjusting the identified workflow.
15. A computer storage medium storing computer program code, that upon execution by a processor cause the processor to: receive, by a language model (LM), a request for data from a first user; based on the request, identify a workflow comprising a plurality of steps to be executed in a multi-step multi-pass (MSMP) mode; retrieve a portion of the requested data based on executing a first agent associated with a first step of the plurality of steps of the workflow; adjust, by the LM, the workflow based on the retrieved portion of the requested data; obtain the requested data based on executing the adjusted workflow; and provide the requested data to the first user to initiate an action based on the requested data.
16. The computer storage medium of claim 15, wherein the portion of the requested data is retrieved based on executing the first agent in a first pass of the MSMP mode, wherein the requested data is obtained based on executing the first agent in a second pass of the MSMP mode after the workflow has been adjusted.
17. The computer storage medium of claim 15, wherein the computer program code upon execution causes the processor to identify the first agent associated with the first step of the plurality of steps of the workflow to retrieve the portion of the requested data.
18. The computer storage medium of claim 15, wherein adjusting the workflow comprises adjusting an order of the plurality of steps of the workflow based on the retrieved portion of the requested data.
19. The computer storage medium of claim 15, wherein the computer program code upon execution causes the processor to retrieve another portion of the requested data based on executing a second agent associated with a second step of the workflow, wherein the workflow is adjusted based on the retrieved other portion of the requested data.
20. The computer storage medium of claim 15, wherein the computer program code upon execution causes the processor to: receive another request for data from a second user; identify, by the LM, the workflow applicable for the other request from the second user; retrieve a portion of the requested data based on executing the first agent associated with the first step of the plurality of steps of the identified workflow; obtain the requested data based on executing the adjusted workflow without adjusting the identified workflow; and provide the requested data to the second user to initiate an action based on the requested data.
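
Claims 14 and 20 recite handling a further request by identifying an applicable workflow and executing the adjusted workflow without adjusting it again. One possible reading is sketched below, under stated assumptions: the model's workflow identification is replaced by a keyword match (`identify_workflow`), the adjustment by a fixed rule (`reorder`), and the workflow name `saas_offer`, the agent names, and the module-level registries are hypothetical.

```python
from typing import Callable

Agent = Callable[[dict], dict]

# Hypothetical agents keyed by name; each returns fields to merge into the result.
AGENTS: dict[str, Agent] = {
    "lookup": lambda data: {"offer_type": "private"},
    "listing": lambda data: {"listing": "draft"},
    "pricing": lambda data: {"price": 100},
}

# Stored step order per workflow, plus the set of workflows already adjusted.
WORKFLOWS: dict[str, list[str]] = {"saas_offer": ["lookup", "listing", "pricing"]}
ADJUSTED: set[str] = set()


def identify_workflow(request: str) -> str:
    """Stand-in for the model identifying the applicable workflow (assumption)."""
    if "offer" in request.lower():
        return "saas_offer"
    raise LookupError(f"no workflow for request: {request!r}")


def reorder(order: list[str], partial: dict) -> list[str]:
    """Stand-in adjustment rule (assumption): price private offers before listing."""
    if partial.get("offer_type") != "private" or "pricing" not in order[1:]:
        return list(order)
    rest = [name for name in order[1:] if name != "pricing"]
    return [order[0], "pricing", *rest]


def handle(request: str) -> tuple[dict, bool]:
    """Return the obtained data and whether this call adjusted the workflow."""
    name = identify_workflow(request)
    order = WORKFLOWS[name]
    data: dict = {"request": request}

    # The first agent always runs to retrieve a portion of the requested data.
    data.update(AGENTS[order[0]](data))

    # Adjust only once; later requests reuse the stored, already adjusted order.
    adjusted_now = name not in ADJUSTED
    if adjusted_now:
        order = WORKFLOWS[name] = reorder(order, data)
        ADJUSTED.add(name)

    for agent_name in order:
        data.update(AGENTS[agent_name](data))
    return data, adjusted_now


first_data, first_adjusted = handle("Create a SaaS offer")      # first user
second_data, second_adjusted = handle("Create another offer")   # second user
```

Keeping the adjusted order in a registry is only one way to avoid repeating the adjustment; the claims do not specify where or how the adjusted workflow is stored.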

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Patent Application No. 63/717,838, entitled “FLOW ORCHESTRATION FOR MODEL-BASED AGENTS,” filed on Nov. 7, 2024, the disclosure of which is incorporated herein by reference in its entirety.

BACKGROUND

Artificial intelligence (AI) systems have increasingly been used to streamline and automate complex tasks. For example, generative artificial intelligence (GAI) models have gained prominence due to their ability to process and generate natural language text, code, and other types of content. These models are capable of interpreting user inputs, generating relevant outputs, and facilitating the automation of workflows.

However, existing workflow automation systems typically rely on predefined sequences of tasks executed by individual agents or processes. While such systems can handle predictable and static workflows, they often lack the flexibility to adapt to changes in real-time data or unexpected results. Moreover, GAI models are often deployed in isolation, functioning as standalone tools for generating responses or performing specific tasks without leveraging the generative capabilities of AI to dynamically adjust workflows based on intermediate results. For instance, when a workflow automation system retrieves partial information that necessitates a change in the workflow's structure, traditional approaches struggle to reconfigure the workflow dynamically and require significant manual intervention, resulting in a waste of computing and networking resources.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
A system and method for flow orchestration for language model-based agents are provided. A workflow comprising a plurality of agents to be executed in a multi-step multi-pass (MSMP) mode is defined. A request for data is received by a generative artificial intelligence (GAI) model. A portion of the requested data is retrieved based on executing a first agent of the plurality of agents of the workflow. The GAI model adjusts the workflow based on the retrieved portion of the requested data. The requested data is obtained based on executing the adjusted workflow.

BRIEF DESCRIPTION OF THE DRAWINGS

The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:

FIG. 1 is a block diagram illustrating an example system configured to perform flow orchestration for model-based agents;
FIG. 2 is a block diagram illustrating orchestration of multi-step actions;
FIG. 3 is an example user interface for creating a software as a service (SaaS) offer;
FIG. 4 is a flowchart illustrating an example method for providing flow orchestration for model-based agents; and
FIG. 5 illustrates an example computing apparatus as a functional block diagram.

Corresponding reference characters indicate corresponding parts throughout the drawings. In FIGS. 1 to 5, the systems are illustrated as schematic drawings. The drawings may not be to scale. Any of the figures may be combined into a single example or embodiment.

DETAILED DESCRIPTION

Large language models (LLMs) have shown remarkable capabilities in understanding, generating, and even interacting with human-like text across diverse applications. From customer service chatbots and virtual assistants to automated content generation and code completion, LLM-based agents are being employed in both enterprise and consumer sectors.
However, managing the complex processes and workflows that involve multiple interactions, contextual dependencies, and decision-making steps in real-world applications remains a significant challenge. Current language model agents often face limitations in handling extended tasks that require sequential or multi-step interactions. These limitations arise from their lack of inherent memory capabilities and difficulties in managing context across multiple stages. Additionally, the unstructured nature of natural language and the high dimensionality of LLM outputs further complicate the process of orchestrating workflows that involve condition-based decisions, loops, branching, and other complex flows. Traditional rule-based systems, such as decision trees or finite state machines, have often been used to manage multi-step workflows in automation. However, these approaches are not optimized for the fluid, nuanced, and variable output of LLMs. Language models, by design, rely on probabilistic methods for response generation, which can lead to variability in responses based on input nuances, user history, and/or ongoing context changes. Moreover, the integration of language models with external sys