US-20260127201-A1 - COMPRESSING TOOL PROMPTS VIA RELATIVE INFORMATION ENTROPY

US 20260127201 A1

Abstract

Mechanisms are provided to compress a tool prompt. An original tool prompt is segmented into text chunks. At least one semantic vector representation of the text chunks is generated and a first semantic distribution of the original tool prompt is generated based on the at least one semantic vector representation. A perturbed semantic vector representation is generated by eliminating at least one text chunk from the text chunks, and a second semantic distribution is generated based on the perturbed semantic vector representation. A comparison of the first and second semantic distributions is performed to generate at least one similarity metric. A compressed tool prompt is generated based on the at least one similarity metric by eliminating one or more text chunks that have a similarity metric that is above a threshold similarity value.
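The leave-one-out scheme the abstract describes can be sketched roughly as follows: embed each chunk, drop one chunk at a time, and compare the perturbed aggregate with the original; if they are nearly identical, the dropped chunk carried little information. This is a toy illustration only, not the patented implementation: the bag-of-words "embedding", the cosine comparison standing in for the distribution comparison, and all function names are assumptions of this sketch.

```python
import math
from collections import Counter

def embed(chunk: str) -> Counter:
    # Toy "semantic vector": bag-of-words counts. A real system would
    # use a learned sentence-embedding model instead.
    return Counter(chunk.lower().split())

def mean_vector(vectors):
    # Aggregate chunk vectors into a single summary of the whole prompt.
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def compress(chunks, threshold=0.9):
    # Leave-one-out: drop each chunk and compare the perturbed summary
    # with the original. A very similar perturbed summary means the
    # dropped chunk was redundant, so it is eliminated. Note that this
    # naive independent pass can drop both copies of a duplicated chunk;
    # a production system would re-evaluate after each removal.
    original = mean_vector(embed(c) for c in chunks)
    kept = []
    for i, chunk in enumerate(chunks):
        perturbed = mean_vector(embed(c) for j, c in enumerate(chunks) if j != i)
        if cosine(original, perturbed) < threshold:
            kept.append(chunk)
    return kept
```

For example, `compress(["alpha beta gamma", "alpha beta gamma", "delta epsilon zeta"])` eliminates the duplicated chunk because removing it barely moves the aggregate vector, while the distinct chunk survives.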

Inventors

  • Wen Wang
  • Zhong Fang Yuan
  • Li Juan Gao
  • He Li
  • Tong Liu

Assignees

  • INTERNATIONAL BUSINESS MACHINES CORPORATION

Dates

Publication Date
2026-05-07
Application Date
2024-11-04

Claims (20)

  1. A computer-implemented method comprising: receiving an original tool prompt for a generative machine learning model; segmenting the original tool prompt into multiple text chunks; generating at least one semantic vector representation of the multiple text chunks; generating a first semantic distribution based on the at least one semantic vector representation; generating a perturbed semantic vector representation based on a subset of the multiple text chunks, the subset being generated by eliminating at least one text chunk from the multiple text chunks; generating a second semantic distribution based on the perturbed semantic vector representation; performing a comparison of the first semantic distribution and the second semantic distribution to generate at least one similarity metric; and in response to the at least one similarity metric exceeding a threshold similarity value, generating a compressed tool prompt based on the subset of the multiple text chunks.
  2. The method of claim 1, further comprising storing the compressed tool prompt in a data storage that is accessible to an artificial intelligence agent that communicates with the generative machine learning model.
  3. The method of claim 1, further comprising: adding the compressed tool prompt to a task prompt; inputting the task prompt into a generative machine learning model; and in response to the inputting, receiving a task output from the generative machine learning model.
  4. The method of claim 1, wherein the original tool prompt helps define a function tool and comprises one or more defining elements selected from a group consisting of: a function declaration specifying an identifier of the function tool, a function description that describes what the function tool does, a parameter description that describes parameters used by the function tool, and a return description that describes a type of output to be provided by the function tool in response to the function tool being invoked.
  5. The method of claim 4, wherein the original tool prompt comprises a first number of the defining elements and the compressed tool prompt comprises a second number of the defining elements, the second number being smaller than the first number.
  6. The method of claim 1, further comprising generating an associative tree data structure based on the multiple text chunks and the at least one similarity metric, wherein the at least one similarity metric comprises a plurality of similarity metrics, and wherein connections between nodes of the associative tree data structure comprise corresponding similarity metrics, in the plurality of similarity metrics, specifying a similarity between nodes connected by a corresponding connection.
  7. The method of claim 6, wherein generating the compressed tool prompt comprises pruning the associative tree data structure by removing nodes and paths which have only connections whose corresponding similarity metrics meet a predetermined criterion, to thereby generate a pruned associative tree data structure.
  8. The method of claim 7, wherein generating the compressed tool prompt comprises traversing the pruned associative tree data structure to reconstruct a tool prompt that comprises less textual content than the original tool prompt.
  9. The method of claim 1, wherein the at least one similarity metric is generated by executing at least one of a first algorithm that measures a largest difference between the first semantic distribution and the second semantic distribution, and a second algorithm that measures how much the first semantic distribution and the second semantic distribution agree or differ.
  10. The method of claim 9, wherein the first algorithm is a K-S test algorithm, and the second algorithm is a Jensen-Shannon divergence algorithm.
  11. The method of claim 1, wherein segmenting the original tool prompt into multiple text chunks comprises parsing the original tool prompt and generating text chunks based on an identification of at least one of tags, key words, phrases, or structural elements specific to functional tool descriptions in tool prompts.
  12. The method of claim 1, wherein generating the first semantic distribution comprises processing the at least one semantic vector representation via a Gaussian Mixture Model (GMM), and wherein generating the second semantic distribution comprises processing the perturbed semantic vector representation via the GMM.
  13. The method of claim 1, wherein generating at least one semantic vector representation of the multiple text chunks comprises generating a separate semantic vector representation for each text chunk in the multiple text chunks, and wherein generating the perturbed semantic vector representation comprises generating a separate perturbed semantic vector representation for each text chunk in the multiple text chunks other than the eliminated at least one text chunk.
  14. A computer program product comprising: a computer readable storage medium; and program instructions stored on the computer readable storage medium to perform operations comprising: receiving an original tool prompt for a generative machine learning model; segmenting the original tool prompt into multiple text chunks; generating at least one semantic vector representation of the multiple text chunks; generating a first semantic distribution based on the at least one semantic vector representation; generating a perturbed semantic vector representation based on a subset of the multiple text chunks, the subset being generated by eliminating at least one text chunk from the multiple text chunks; generating a second semantic distribution based on the perturbed semantic vector representation; performing a comparison of the first semantic distribution and the second semantic distribution to generate at least one similarity metric; and generating, in response to the at least one similarity metric exceeding a threshold similarity value, a compressed tool prompt based on the subset of the multiple text chunks.
  15. The computer program product of claim 14, wherein the operations further comprise storing the compressed tool prompt in a data storage that is accessible to an artificial intelligence agent that communicates with the generative machine learning model.
  16. The computer program product of claim 14, wherein the operations further comprise: adding the compressed tool prompt to a task prompt; inputting the task prompt into a generative machine learning model; and in response to the inputting, receiving a task output from the generative machine learning model.
  17. The computer program product of claim 14, wherein the original tool prompt helps define a function tool and comprises one or more defining elements selected from a group consisting of: a function declaration specifying an identifier of the function tool, a function description that describes what the function tool does, a parameter description that describes parameters used by the function tool, and a return description that describes a type of output to be provided by the function tool in response to the function tool being invoked.
  18. The computer program product of claim 17, wherein the original tool prompt comprises a first number of the defining elements and the compressed tool prompt comprises a second number of the defining elements, the second number being smaller than the first number.
  19. The computer program product of claim 14, wherein the operations further comprise generating an associative tree data structure based on the multiple text chunks and the at least one similarity metric, wherein the at least one similarity metric comprises a plurality of similarity metrics, and wherein connections between nodes of the associative tree data structure comprise corresponding similarity metrics, in the plurality of similarity metrics, specifying a similarity between nodes connected by a corresponding connection.
  20. A computer system comprising: a processor set; one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media to cause the processor set to perform operations comprising: receiving an original tool prompt for a generative machine learning model; segmenting the original tool prompt into multiple text chunks; generating at least one semantic vector representation of the multiple text chunks; generating a first semantic distribution based on the at least one semantic vector representation; generating a perturbed semantic vector representation based on a subset of the multiple text chunks, the subset being generated by eliminating at least one text chunk from the multiple text chunks; generating a second semantic distribution based on the perturbed semantic vector representation; performing a comparison of the first semantic distribution and the second semantic distribution to generate at least one similarity metric; and generating, in response to the at least one similarity metric exceeding a threshold similarity value, a compressed tool prompt based on the subset of the multiple text chunks.
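Claims 9 and 10 name two standard distribution-comparison statistics: the two-sample Kolmogorov-Smirnov statistic (the largest difference between distributions) and the Jensen-Shannon divergence (how much they agree or differ). Minimal pure-Python versions are sketched below for illustration; a real implementation would more likely call `scipy.stats.ks_2samp` and `scipy.spatial.distance.jensenshannon`.

```python
import math

def ks_statistic(sample1, sample2):
    # Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    # difference between the two empirical CDFs (claim 9's "largest
    # difference" measure).
    s1, s2 = sorted(sample1), sorted(sample2)
    points = sorted(set(s1) | set(s2))
    cdf = lambda s, x: sum(1 for v in s if v <= x) / len(s)
    return max(abs(cdf(s1, x) - cdf(s2, x)) for x in points)

def js_divergence(p, q):
    # Jensen-Shannon divergence between two discrete distributions
    # (claim 9's "agree or differ" measure): the mean KL divergence of
    # each distribution from their midpoint. 0 means identical; log(2)
    # is the maximum when KL uses the natural logarithm.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

For instance, `js_divergence([1.0, 0.0], [0.0, 1.0])` yields `log(2)`, the maximum disagreement, while identical distributions yield 0, so a chunk whose removal leaves the divergence near 0 is a candidate for elimination.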

Description

BACKGROUND

The present application relates generally to machine learning models, generative machine learning models, large language models (LLMs), artificial intelligence agents (AI agents), which are automated and can perform actions on their environment based on observations, and agentic interaction with LLMs.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described herein in the Detailed Description. This Summary is not intended to identify key factors or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In one illustrative embodiment, a method, in a data processing system, is provided that comprises receiving an original tool prompt and segmenting the original tool prompt into multiple text chunks. The method further comprises generating at least one semantic vector representation of the multiple text chunks. The method also comprises generating a first semantic distribution of the original tool prompt based on the at least one semantic vector representation. In addition, the method comprises generating a perturbed semantic vector representation based on a subset of the multiple text chunks, the subset being generated by eliminating at least one text chunk from the multiple text chunks, and generating a second semantic distribution based on the perturbed semantic vector representation. The method further comprises performing a comparison of the first semantic distribution and the second semantic distribution to generate at least one similarity metric. Moreover, the method comprises, in response to the at least one similarity metric exceeding a threshold similarity value, generating a compressed tool prompt based on the subset of the multiple text chunks.

In other illustrative embodiments, a computer program product comprising a computer useable or readable medium having a computer readable program is provided. The computer readable program, when executed on a computing device, causes the computing device to perform various ones of, and combinations of, the operations outlined above with regard to the method illustrative embodiment.

In yet another illustrative embodiment, a system/apparatus is provided. The system/apparatus may comprise one or more processors and a memory coupled to the one or more processors. The memory may comprise instructions which, when executed by the one or more processors, cause the one or more processors to perform various ones of, and combinations of, the operations outlined above with regard to the method illustrative embodiment.

These and other features and advantages of the present invention will be described in, or will become apparent to those of ordinary skill in the art in view of, the following detailed description of the example embodiments of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention, as well as a preferred mode of use and further objectives and advantages thereof, will best be understood by reference to the following detailed description of illustrative embodiments when read in conjunction with the accompanying drawings, wherein:

  • FIG. 1A is an example diagram of a process of a Large Language Model (LLM) agent in accordance with one illustrative embodiment;
  • FIG. 1B is an example diagram of an LLM task prompt which includes multiple agent tool prompts;
  • FIG. 2 is an example diagram illustrating one of the LLM agent tool prompts shown in FIG. 1B and an overview of the inventive compression process performed in accordance with one illustrative embodiment;
  • FIG. 3 is an example diagram of a distributed data processing system environment in which aspects of the illustrative embodiments may be implemented and at least some of the computer code involved in performing the inventive prompt compression methods may be executed;
  • FIG. 4 is an example block diagram illustrating another distributed computing environment in which the inventive compression methods are carried out, including the primary operational components of an LLM agent tool prompt compressor in accordance with one illustrative embodiment;
  • FIG. 5 is an example diagram illustrating operations for performing vector space modeling of a tool prompt in accordance with one or more illustrative embodiments;
  • FIG. 6 is an example diagram illustrating operations for performing importance assessment of text chunks in accordance with one illustrative embodiment; and
  • FIG. 7 is a flowchart outlining an example operation for compressing an LLM agent tool prompt in accordance with one illustrative embodiment.

DETAILED DESCRIPTION

A Large Language Model (LLM) is a type of artificial intelligence (AI) model that is designed to understand and generate human language. LLMs are trained on vast amounts of text data and use deep learning techniques to perform various natural language processing tasks, such as text generation, translation, and summarization.
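Claims 6 through 8 describe building an associative tree whose edges carry similarity metrics, pruning branches whose removal barely changes the prompt's semantic distribution, and traversing the result to reconstruct a shorter prompt. A rough pure-Python sketch of that prune-and-traverse step follows; the node structure, field names, and the "similarity above threshold means prunable" criterion are all assumptions of this illustration, not the patent's specification.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # One text chunk of the tool prompt. Each child edge carries the
    # similarity metric between the original semantic distribution and
    # the perturbed distribution obtained by dropping that child's subtree.
    text: str
    children: list = field(default_factory=list)  # list of (similarity, Node)

def prune(node: Node, threshold: float) -> Node:
    # Remove subtrees whose edge similarity exceeds the threshold:
    # dropping them leaves the prompt's semantic distribution nearly
    # unchanged, so they carry little information (claim 7).
    node.children = [(s, prune(child, threshold))
                     for s, child in node.children if s <= threshold]
    return node

def reconstruct(node: Node) -> str:
    # Depth-first traversal reassembles the surviving chunks into the
    # compressed tool prompt, which contains less textual content than
    # the original (claim 8).
    parts = [node.text]
    for _, child in node.children:
        parts.append(reconstruct(child))
    return " ".join(parts)
```

As a usage sketch, a root chunk with a highly redundant child (similarity 0.95) and an informative child (similarity 0.4) would, after `prune(root, 0.9)`, keep only the informative branch in the reconstructed prompt.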