US-12619822-B2 - Responsible prompt recommendation
Abstract
An embodiment generates, by analyzing prompts, a prompt template. Each prompt includes a text description of a content to be generated by a model. An embodiment classifies, using a first trained classification model, a variant portion of a prompt into a category in a set of categories. An embodiment selects, from a repository of prompt templates including the prompt template, a selected prompt template having a similarity above a threshold similarity to a first prompt. An embodiment classifies, using the selected prompt template, a variant portion of the first prompt into a first category in the set of categories. An embodiment adjusts, responsive to determining that the first category is designated as a harmful category, the variant portion of the first prompt. An embodiment causes, using the adjusted first prompt, the model to produce a first content.
Inventors
- Vagner Figueredo de Santana
- Marisa Affonso Vasconcelos
- Melina de Vasconcelos Alberio Guerra
- Michael Anthony Feffer
- Sara E Berger
- Tianyu Su
Assignees
- INTERNATIONAL BUSINESS MACHINES CORPORATION
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-01-09
Claims (20)
- 1. A computer-implemented method comprising:
generating, by executing a natural language processing algorithm analyzing a plurality of prompts, a prompt template, wherein each prompt in the plurality of prompts comprises a text description of a content to be generated by a large language model;
segmenting, using the prompt template, each prompt in the plurality of prompts into an invariant portion and a variant portion;
training, using prompt text data, a machine learning model to classify variant portions of prompts, the training generating a first trained classification model;
classifying, using the first trained classification model, a variant portion of a prompt in the plurality of prompts into a category in a set of categories, wherein each category of the set of categories comprises a level of harm and wherein each level corresponds to a different type of adjustment;
selecting, using the first trained classification model, from a repository of prompt templates including the prompt template, a selected prompt template, the selected prompt template having a similarity above a threshold similarity to a first prompt;
classifying, using the first trained classification model using the selected prompt template, a variant portion of the first prompt into a first category in the set of categories, the first category comprising a first level of harm;
automatically adjusting, responsive to determining by the first trained classification model that the first category is designated as a harmful category, the variant portion of the first prompt based on the first level of harm, the adjusting resulting in an adjusted first prompt; and
causing, using the adjusted first prompt, the large language model to produce a first content.
- 2. The computer-implemented method of claim 1, wherein a first category in the set of categories is designated as a responsible category.
- 3. The computer-implemented method of claim 1, wherein a second category in the set of categories is designated as a harmful category.
- 4. The computer-implemented method of claim 1, wherein the adjusting removes the variant portion of the first prompt.
- 5. The computer-implemented method of claim 1, wherein the adjusting replaces the variant portion of the first prompt with a replacement variant portion.
- 6. The computer-implemented method of claim 1, further comprising: selecting, from the repository of prompt templates including the prompt template, a second selected prompt template, the second selected prompt template having a similarity above a threshold similarity to a second prompt; classifying, using the second selected prompt template, a variant portion of the second prompt into a second category in the set of categories; and identifying, to a user, responsive to determining that the second category is designated as a responsible category, the variant portion of the second prompt as a responsible portion.
- 7. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by a processor to cause the processor to perform operations comprising:
generating, by executing a natural language processing algorithm analyzing a plurality of prompts, a prompt template, wherein each prompt in the plurality of prompts comprises a text description of a content to be generated by a large language model;
segmenting, using the prompt template, each prompt in the plurality of prompts into an invariant portion and a variant portion;
training, using prompt text data, a machine learning model to classify variant portions of prompts, the training generating a first trained classification model;
classifying, using the first trained classification model, a variant portion of a prompt in the plurality of prompts into a category in a set of categories, wherein each category of the set of categories comprises a level of harm and wherein each level corresponds to a different type of adjustment;
selecting, using the first trained classification model, from a repository of prompt templates including the prompt template, a selected prompt template, the selected prompt template having a similarity above a threshold similarity to a first prompt;
classifying, using the first trained classification model using the selected prompt template, a variant portion of the first prompt into a first category in the set of categories, the first category comprising a first level of harm;
automatically adjusting, responsive to determining by the first trained classification model that the first category is designated as a harmful category, the variant portion of the first prompt based on the first level of harm, the adjusting resulting in an adjusted first prompt; and
causing, using the adjusted first prompt, the large language model to produce a first content.
- 8. The computer program product of claim 7, wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.
- 9. The computer program product of claim 7, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.
- 10. The computer program product of claim 7, wherein a first category in the set of categories is designated as a responsible category.
- 11. The computer program product of claim 7, wherein a second category in the set of categories is designated as a harmful category.
- 12. The computer program product of claim 7, wherein the adjusting removes the variant portion of the first prompt.
- 13. The computer program product of claim 7, wherein the adjusting replaces the variant portion of the first prompt with a replacement variant portion.
- 14. The computer program product of claim 7, further comprising: selecting, from the repository of prompt templates including the prompt template, a second selected prompt template, the second selected prompt template having a similarity above a threshold similarity to a second prompt; classifying, using the second selected prompt template, a variant portion of the second prompt into a second category in the set of categories; and identifying, to a user, responsive to determining that the second category is designated as a responsible category, the variant portion of the second prompt as a responsible portion.
- 15. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform operations comprising:
generating, by executing a natural language processing algorithm analyzing a plurality of prompts, a prompt template, wherein each prompt in the plurality of prompts comprises a text description of a content to be generated by a large language model;
segmenting, using the prompt template, each prompt in the plurality of prompts into an invariant portion and a variant portion;
training, using prompt text data, a machine learning model to classify variant portions of prompts, the training generating a first trained classification model;
classifying, using the first trained classification model, a variant portion of a prompt in the plurality of prompts into a category in a set of categories, wherein each category of the set of categories comprises a level of harm and wherein each level corresponds to a different type of adjustment;
selecting, using the first trained classification model, from a repository of prompt templates including the prompt template, a selected prompt template, the selected prompt template having a similarity above a threshold similarity to a first prompt;
classifying, using the first trained classification model using the selected prompt template, a variant portion of the first prompt into a first category in the set of categories, the first category comprising a first level of harm;
automatically adjusting, responsive to determining by the first trained classification model that the first category is designated as a harmful category, the variant portion of the first prompt based on the first level of harm, the adjusting resulting in an adjusted first prompt; and
causing, using the adjusted first prompt, the large language model to produce a first content.
- 16. The computer system of claim 15, wherein a first category in the set of categories is designated as a responsible category.
- 17. The computer system of claim 15, wherein a second category in the set of categories is designated as a harmful category.
- 18. The computer system of claim 15, wherein the adjusting removes the variant portion of the first prompt.
- 19. The computer system of claim 15, wherein the adjusting replaces the variant portion of the first prompt with a replacement variant portion.
- 20. The computer system of claim 15, further comprising: selecting, from the repository of prompt templates including the prompt template, a second selected prompt template, the second selected prompt template having a similarity above a threshold similarity to a second prompt; classifying, using the second selected prompt template, a variant portion of the second prompt into a second category in the set of categories; and identifying, to a user, responsive to determining that the second category is designated as a responsible category, the variant portion of the second prompt as a responsible portion.
Description
BACKGROUND

The present invention relates generally to generative artificial intelligence (AI) models. More particularly, the present invention relates to a method, system, and computer program for responsible prompt recommendation.

A generative artificial intelligence (AI) model is a machine learning model that learns the patterns and structure of input training data, such as text, computer source code, audio, still images, or video, and then generates new data with similar characteristics. For example, GPT-3 and GPT-4 are generative AI models that produce text, and DALL-E, Midjourney, and Stable Diffusion are generative AI models that produce still images. (GPT-3 and GPT-4 are registered trademarks of OpenAI OpCo, LLC in the United States and other countries. Midjourney is a registered trademark of MidJourney, Inc. in the United States and other countries. Stable Diffusion is a registered trademark of Stability AI Ltd. in the United States and other countries.)

A prompt is a description of the task that a generative AI model should perform. Typically, a prompt includes text, often in natural language, describing the task that a generative AI model should perform. For example, a prompt to a text generative AI model might be “tell me about cats” or “write me a haiku about cats”, while a prompt to an image generative AI model might be “give me an image of a cat on a bicycle” or “give me a combination of these two images”.
To achieve desired results, most prompts are more complex than these examples, often including strategies such as adding detail to elicit a more relevant answer (e.g., a specific language or style to be used), asking the model to adopt a persona (e.g., a data scientist, a law professor), indicating distinct parts of the prompt (e.g., a field in a template to be filled in with text), specifying the steps required to complete a desired task, providing examples or samples of the desired output, specifying the desired length of the output, providing a reference text for use in generating the desired output, and others. Often, a user employs a sequence of prompts and responses, using trial and error to eventually arrive at what the user considers the best version of the desired model output. Prompt engineering is the process of generating a prompt to a generative AI model.

The illustrative embodiments recognize that prompts producing improved versions of generative AI model output are more desirable than prompts that do not, but that such prompts take time to produce and require knowledge of both generative AI models and a particular subject matter domain. Prompting practices also change over time, as new models are developed and users explore new ways of using existing models, and both responsible and harmful prompting techniques are evolving. Here, responsible prompting refers to prompting that produces an improved result from a generative AI model when the model is used as designed or intended, and harmful prompting refers to prompting that produces a socially harmful result, a result that does not use a model as intended, or an otherwise undesirable result. A new prompt engineer might not have the skills or domain knowledge to instruct a generative AI model to create content in a responsible way and avoid a harmful result.
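One of the strategies above, a field in a template to be filled in with text, underlies the template-based segmentation this disclosure relies on. The following Python sketch illustrates how a prompt might be split into an invariant portion (the template text) and variant portions (the filled-in fields). The {field} placeholder syntax and the regex matching are assumptions made for illustration, not the implementation described in this disclosure.

```python
import re

def template_to_regex(template: str) -> re.Pattern:
    """Turn a template such as 'write me a {form} about {topic}' into a
    regex whose named groups capture the variant portions."""
    # re.split with a capturing group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+?)"
        for i, part in enumerate(parts)
    )
    return re.compile("^" + pattern + "$")

def segment_prompt(template: str, prompt: str):
    """Return the invariant portion (the template itself) and the variant
    portions (the captured fields), or None if the prompt does not fit."""
    match = template_to_regex(template).match(prompt)
    if match is None:
        return None
    return {"invariant": template, "variant": match.groupdict()}

seg = segment_prompt("write me a {form} about {topic}",
                     "write me a haiku about cats")
# seg["variant"] == {"form": "haiku", "topic": "cats"}
```

A real embodiment would derive the templates themselves by analyzing a corpus of prompts with a natural language processing algorithm; here the template is simply given.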
Thus, there is a need to guide prompt engineers in prompt generation, encouraging responsible practices while discouraging harmful ones.

SUMMARY

The illustrative embodiments provide for responsible prompt recommendation. An embodiment includes generating, by analyzing a plurality of prompts, a prompt template, wherein each prompt in the plurality of prompts comprises a text description of a content to be generated by a model. An embodiment includes segmenting, using the prompt template, each prompt in the plurality of prompts into an invariant portion and a variant portion. An embodiment includes classifying, using a first trained classification model, a variant portion of a prompt in the plurality of prompts into a category in a set of categories. An embodiment includes selecting, from a repository of prompt templates including the prompt template, a selected prompt template, the selected prompt template having a similarity above a threshold similarity to a first prompt. An embodiment includes classifying, using the selected prompt template, a variant portion of the first prompt into a first category in the set of categories. An embodiment includes adjusting, responsive to determining that the first category is designated as a harmful category, the variant portion of the first prompt, the adjusting resulting in an adjusted first prompt. An embodiment includes causing, using the adjusted first prompt, the model to produce a first content. Thus, an embodiment provides responsible prompt recommendation. Other embodiments of this aspect include correspondi
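The flow summarized above (select a template by similarity to the incoming prompt, classify its variant portion into a level of harm, and apply a different type of adjustment per level) can be sketched in Python. The similarity measure, the toy harm lexicon, and the adjustment rules below are placeholder assumptions for illustration; an embodiment would use trained classification models, a real template repository, and would finish by submitting the adjusted prompt to the large language model, which is omitted here.

```python
from difflib import SequenceMatcher

# Hypothetical repository and harm lexicon, standing in for trained models.
TEMPLATE_REPOSITORY = [
    "write me a {form} about {topic}",
    "give me an image of {subject}",
]
HARM_LEXICON = {"explosives": 2, "insults": 1}  # toy levels of harm
SIMILARITY_THRESHOLD = 0.5

def similarity(a: str, b: str) -> float:
    """Simple string similarity, standing in for a learned measure."""
    return SequenceMatcher(None, a, b).ratio()

def select_template(prompt: str):
    """Select a template whose similarity to the prompt exceeds the threshold."""
    best = max(TEMPLATE_REPOSITORY, key=lambda t: similarity(t, prompt))
    return best if similarity(best, prompt) > SIMILARITY_THRESHOLD else None

def classify_variant(variant_text: str) -> int:
    """Stand-in for the trained classifier: return a level of harm."""
    words = variant_text.lower().split()
    return max((HARM_LEXICON.get(w, 0) for w in words), default=0)

def adjust_prompt(prompt: str, variant_text: str) -> str:
    """Apply a different type of adjustment for each level of harm."""
    level = classify_variant(variant_text)
    if level >= 2:   # higher harm: remove the variant portion entirely
        return " ".join(prompt.replace(variant_text, "").split())
    if level == 1:   # lower harm: replace the variant portion
        return prompt.replace(variant_text, "a neutral topic")
    return prompt    # responsible: leave the prompt unchanged

adjusted = adjust_prompt("write me a story about explosives", "explosives")
# adjusted == "write me a story about"
```

The per-level dispatch in `adjust_prompt` mirrors the claim language in which each level of harm corresponds to a different type of adjustment (removal versus replacement).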