US-12625732-B2 - Systems and methods for generating and executing function calls using machine learning


Abstract

The methods, systems, and computer networking apparatuses described herein enable language models to receive input (e.g., a query or a request) from a user or application, and without any additional training data or instructions, determine to generate a function call based on the received input from the user and generate the function call based on the determination. In some embodiments, a language model may further access an external tool or application to request an output as a response to the generated function call. The disclosed methods, systems, and networking apparatuses improve the technical field by incorporating language model capabilities within the function calling process, and allowing for function-related information to be provided to a language model via input received in any number of formats or types, including structured or unstructured input, language or non-language input, or any combination thereof.

Inventors

Athyuttam Eleti
Jeffrey Harris
Logan Kilpatrick, III
Andrey Mishchenko

Assignees

OpenAI Opco, LLC

Dates

Publication Date: 2026-05-12
Application Date: 2025-02-27

Claims (20)

1. A method for generating and executing function calls using a machine learning model, comprising: receiving an input, wherein the input comprises a request to perform a computational task within a computing environment; employing a machine learning model to process the received input, wherein the machine learning model is trained to: identify an external application based on the received input; determine, based on the identified external application and by calculating a probability based on a decision boundary and the received input, to generate a function call to perform the computational task, the function call including a function and at least one argument associated with the function; and generate, based on the determination, the function and the at least one argument as an output; and sending the function and the at least one argument to the user.
2. The method of claim 1, further comprising: sending the function call to an external application programming interface (API) to execute the function with the at least one argument; receiving a response from the external API based on the executed function; and sending the received response to the machine learning model, wherein the machine learning model generates an output based on the received response.
3. The method of claim 2, further comprising sending the output to the user.
4. The method of claim 2, further comprising presenting the received response from the external API to the user via a user interface.
5. The method of claim 1, wherein the user is a computing device.
6. The method of claim 5, wherein the computing device is configured to send a plurality of inputs to the machine learning model based on a single instruction.
7. The method of claim 5, wherein the computing device is configured by receiving instructions including an input field that defines whether the function call must be generated.
8. The method of claim 1, wherein the generated function call is converted into an executable code in a programming language suitable for the computing environment in which the computational task or the function is to be executed.
9. The method of claim 8, wherein the programming language is determined by the machine learning model based on instructions provided by the user.
10. The method of claim 8, wherein the executable code and the programming language are determined by accessing a specification associated with the external API.
11. The method of claim 1, wherein the generated function call further comprises logic for dynamically modifying the computational task based on real-time data or changing conditions.
12. The method of claim 1, wherein the function call further comprises metadata specifying an expected response format or output structure resulting from executing the function populated with the at least one argument.
13. The method of claim 1, further comprising verifying a correctness of the generated function call.
14. The method of claim 13, wherein the machine learning model further provides an output that indicates that the generated and sent function call is incorrect.
15. The method of claim 1, wherein the function call further comprises self-updating information for automatically adapting to changes in data sources, APIs, or dependencies.
16. The method of claim 1, wherein the received input comprises at least two requests to perform different computational tasks or functions.
17. The method of claim 1, wherein the function call includes at least one of a script, a code, or a data structure for executing the computational task.
18. The method of claim 1, wherein the function call further includes error handling information.
19. The method of claim 1, wherein the input includes a template function.
20. The method of claim 1, wherein the input indicates a request for the function call without including a template function.
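
For illustration only, the determination recited in claim 1, the input field of claim 7, and the verification of claim 13 might be sketched in Python as follows. Everything in the sketch is an assumption made for the example and is not taken from the claims: the keyword weights, the logistic scoring, the 0.5 decision boundary, the `get_weather` function name, the `force_call` flag, and the `SPEC` table are all hypothetical. A trained language model would replace the toy scorer.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical keyword weights standing in for a trained model's learned signal.
WEATHER_WEIGHTS = {"weather": 2.5, "forecast": 2.0, "temperature": 1.5}
BIAS = -2.0               # assumed offset, not from the patent
DECISION_BOUNDARY = 0.5   # assumed threshold on the probability

# Hypothetical specification of the external function, used for verification.
SPEC = {"get_weather": {"required": {"city": str}}}


@dataclass
class FunctionCall:
    function: str
    arguments: dict = field(default_factory=dict)


def call_probability(text: str) -> float:
    """Logistic score in (0, 1): a toy estimate that a call is warranted."""
    words = text.lower().replace("?", " ").split()
    score = BIAS + sum(WEATHER_WEIGHTS.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-score))


def decide(text: str, force_call: bool = False) -> Optional[FunctionCall]:
    """Return a function call when the probability clears the boundary.

    force_call mimics an input field that requires a call (cf. claim 7).
    """
    if not force_call and call_probability(text) < DECISION_BOUNDARY:
        return None
    # Naive argument extraction: the word after "in" is treated as the city.
    words = text.replace("?", " ").split()
    city = words[words.index("in") + 1] if "in" in words[:-1] else "unknown"
    return FunctionCall("get_weather", {"city": city})


def verify(call: FunctionCall) -> bool:
    """Check the call against SPEC: known function, exact argument names, types."""
    spec = SPEC.get(call.function)
    if spec is None:
        return False
    required = spec["required"]
    return set(call.arguments) == set(required) and all(
        isinstance(call.arguments[name], kind) for name, kind in required.items()
    )
```

Under these assumptions, `decide("What is the weather in Paris?")` yields a `get_weather` call with the argument `city="Paris"`, while `decide("Tell me a joke")` yields no call unless `force_call=True` is passed.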

Description

FIELD OF DISCLOSURE

The disclosure generally relates to systems, methods, and computer networking apparatuses for generating and executing function calls using machine learning.

BACKGROUND

Language Models (LMs) represent a transformative branch of machine learning (ML) and artificial intelligence (AI), leveraging advanced deep learning algorithms to process, understand, and generate human-like natural language. These sophisticated systems are trained on massive datasets, enabling them to recognize intricate patterns, relationships, and structures within textual data. The versatility of LMs extends across numerous applications, including language translation, sentiment analysis, automated dialogue systems, content generation, and more.

At their core, LMs possess the remarkable ability to comprehend and generate complex textual information. They can identify entities and their relationships, perform contextual reasoning, and generate coherent, grammatically accurate text. This makes them invaluable for solving intricate tasks that demand not only linguistic fluency but also contextual understanding and logical inference.

Interaction with LMs typically involves users providing input in the form of prompts, such as questions, instructions, or scenarios. In response, the LM processes the input using its deep neural network-based algorithms and generates tailored outputs. This interaction often unfolds iteratively, with sequences of prompts refining the model's responses to address multifaceted problems or nuanced user requirements.

Moreover, the adaptability of LMs allows for their fine-tuning and specialization in complex domains, such as, e.g., cybersecurity and threat intelligence, scientific research, data analysis, or creative writing. By harnessing their capacity for reasoning, context determination, and advanced computation, LMs are driving innovation across industries, reshaping how tasks are approached and how problems are solved.
SUMMARY

Language models, while useful for certain functions, may be limited in many ways. For example, training a language model is expensive, and coupled with the fact that training data is often out-of-date and/or must be tailored specifically to one of many potential applications, the cost of training continues to rise while the benefits of the training may remain limited. Furthermore, with regard to external data and applications, language models typically can only perform limited, if any, operations relating to the generation and execution of function calls associated with the external data and applications.

There also exists a need for increasing the capabilities and processing power of language models by adding technical solutions that adapt the models so they are able to determine when external information may be useful, and to utilize such external information, when helpful, to perform tasks and/or answer questions based on user queries. Drawbacks of existing solutions include a limited knowledge base based solely on training data received by the model, an inability to determine, based on a user input, if a function call is necessary, and also the inability to generate proper function calls as well as arguments for particular functions before making a function call to an external API.

The disclosed methods, systems, and computer networking apparatuses present technological improvements as solutions to one or more of the technical problems in conventional systems. The methods, systems, and computer networking apparatuses described herein enable language models to receive input (e.g., a query or a request) from a user, and without any additional training data or instructions, access an external tool or application to request an output from that external tool or application as a response to the received input.
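
By way of a non-limiting sketch, the round trip described above (a generated function call is sent to an external tool, and the tool's output is returned so that a final response can be produced) could look roughly like the following Python. The JSON call format, the `get_weather` stub, the `TOOLS` registry, and the `final_answer` helper are hypothetical names introduced for this example. The stub returns canned data where a real system would contact an external API, and `final_answer` stands in for the language model's second pass.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical external tool; returns canned data instead of a network call."""
    canned = {"Paris": {"temp_c": 18, "sky": "cloudy"}}
    return canned.get(city, {"error": f"no data for {city}"})


# Registry mapping function names to callables (an assumed design, not the patent's).
TOOLS = {"get_weather": get_weather}


def execute(call_json: str) -> str:
    """Run a call of the assumed form {"function": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    tool = TOOLS.get(call["function"])
    if tool is None:
        # Unknown functions are reported back rather than raising.
        return json.dumps({"error": "unknown function"})
    return json.dumps(tool(**call["arguments"]))


def final_answer(tool_response: str) -> str:
    """Toy stand-in for the model generating an output from the tool response."""
    data = json.loads(tool_response)
    if "error" in data:
        return f"Sorry, I could not answer: {data['error']}."
    return f"It is {data['temp_c']} degrees C and {data['sky']}."


if __name__ == "__main__":
    generated = '{"function": "get_weather", "arguments": {"city": "Paris"}}'
    print(final_answer(execute(generated)))
```

Returning errors as data, rather than raising, keeps the response in a form that can be passed back to the model in the same way as a successful result.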
In some embodiments, the language models may provide a response to a received input from a user without accessing an external tool or application to first obtain a particular output. Prior to requesting the output from an external tool or application, the language models may be configured to determine to use a function to receive the proper output and to generate the necessary function to be sent to the external tool or application. In some embodiments, the language models may be configured to determine to use a function and/or to provide a function call to the user based on the received input. Based on additional user input or based on documentation or information that is available online, language models may further be enabled to harvest data (e.g., provided by a user or available online) to understand the functions available for interacting with an API associated with the external tool or application. In some embodiments, language models may be enabled to receive a template function as input without performing any further modification (other than providing, e.g., a populated function call) based on the received template function. Language models may thereby take