US-12626108-B2 - Similarity-based generative AI output filtering


Abstract

Methods and systems for generating output content using a generative artificial intelligence (AI) model based on an input. A similarity-assessment layer at the output of the generative AI model determines a similarity measure for the output content vis-à-vis pre-existing items in a repository. The similarity measure is compared to a threshold value and, responsive to the comparison indicating excessive similarity, one or both of the input and the generative AI model are adjusted, and the generative AI model is re-run to generate new output content.
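For illustration only, the generate, assess, compare, adjust, and re-run cycle summarized above might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: `toy_generate`, `max_ratio`, the 0.8 threshold, the `max_attempts` cap, and the choice to adjust the input by incrementing a seed are all hypothetical stand-ins, and the string similarity from Python's `difflib` is merely one convenient way to obtain a score between 0 and 1.

```python
import difflib
from typing import Callable, Sequence, Tuple


def generate_with_similarity_filter(
    generate: Callable[[str, int], str],
    similarity: Callable[[str, Sequence[str]], float],
    repository: Sequence[str],
    prompt: str,
    seed: int = 0,
    threshold: float = 0.8,   # hypothetical threshold value
    max_attempts: int = 5,    # hypothetical safety cap on re-runs
) -> Tuple[str, float, int]:
    """Re-run the generator until the output is not excessively similar."""
    for attempt in range(1, max_attempts + 1):
        output = generate(prompt, seed)
        score = similarity(output, repository)
        if score <= threshold:
            # Similarity is at or below the threshold: accept this output.
            return output, score, attempt
        # Excessive similarity: adjust the input (here, the seed) and re-run.
        seed += 1
    raise RuntimeError("no sufficiently dissimilar output within max_attempts")


def toy_generate(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for a generative AI model."""
    candidates = ["just do it", "simply begin today", "start something new"]
    return candidates[seed % len(candidates)]


def max_ratio(output: str, repository: Sequence[str]) -> float:
    """Hypothetical similarity-assessment layer: best string-match ratio."""
    return max(
        (difflib.SequenceMatcher(None, output, item).ratio() for item in repository),
        default=0.0,
    )


result = generate_with_similarity_filter(
    toy_generate, max_ratio, repository=["just do it"], prompt="a slogan"
)
print(result)
```

In this toy run the first candidate matches the repository item exactly and is rejected, so the loop changes the seed and accepts the second candidate. A real system could equally adjust the prompt or the model itself, as the abstract allows.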

Inventors

Neil Leonard Padgett
Andra Adams

Assignees

SHOPIFY INC.

Dates

Publication Date: 20260512
Application Date: 20230508

Claims (20)

1. A computer-implemented method, comprising: generating output content using a generative artificial intelligence (AI) model based on an input; determining, using a similarity-assessment layer, a similarity measure for the output content with respect to a repository of pre-existing content; comparing the similarity measure to a threshold value; and responsive to the comparing indicating excessive similarity, adjusting one or both of the input and the generative AI model, and re-generating new output content using the generative AI model.
2. The method of claim 1, further comprising repeating the generating, determining, comparing, adjusting, and re-generating until the new output content has a respective similarity measure at or below the threshold value.
3. The method of claim 1, wherein determining the similarity measure includes calculating a distance metric by comparing the output content to items in the repository of pre-existing content.
4. The method of claim 3, wherein comparing includes comparing the output content to each item in the repository of pre-existing content in turn to determine a distance value representing similarity between the output content and that item, and identifying the lowest distance value and the corresponding item as the most similar of the items to the output content.
5. The method of claim 1, wherein the input includes a prompt and wherein adjusting the input includes changing the prompt.
6. The method of claim 5, wherein the prompt includes text and wherein changing the prompt includes changing the text.
7. The method of claim 6, wherein changing the text includes one or more of replacing a word in the text with an alternative word, changing an order of words in the text, or changing a verb tense for one or more words in the text.
8. The method of claim 1, wherein the input includes a seed value, and wherein adjusting the input includes changing the seed value.
9. The method of claim 1, wherein the output content includes a text output and wherein the similarity-assessment layer determines an infringement probability value by comparing the text output to content in one or more databases and finding a match within a threshold value between the text output and a pre-existing text in the one or more databases.
10. The method of claim 1, wherein the output content includes a textual or graphic brand and wherein the similarity-assessment layer determines an infringement probability value by comparing the textual or graphic brand with trademarks in one or more trademark databases and finding a match within a threshold value between the textual or graphic brand and at least one trademark.
11. The method of claim 10, wherein the similarity-assessment layer further receives product or service data associated with the textual or graphic brand, and wherein comparing and finding includes comparing the product or service data with goods and services data associated with trademarks in the one or more trademark databases.
12. The method of claim 1, wherein the similarity-assessment layer includes a machine learning model trained on an intellectual property infringement data set.
13. A computing system, comprising: a processor; and memory coupled to the processor, the memory storing computer-executable instructions that, when executed by the processor, configure the processor to: generate output content using a generative artificial intelligence (AI) model based on an input; determine, using a similarity-assessment layer, a similarity measure for the output content with respect to a repository of pre-existing content; compare the similarity measure to a threshold value; and responsive to the comparing indicating excessive similarity, adjust one or both of the input and the generative AI model, and re-generate new output content using the generative AI model.
14. The computing system of claim 13, wherein the instructions, when executed, are to further cause the processor to repeat the generating, determining, comparing, adjusting, and re-generating until the new output content has a respective similarity measure at or below the threshold value.
15. The computing system of claim 13, wherein the instructions, when executed, are to further cause the processor to determine the similarity measure at least in part by calculating a distance metric by comparing the output content to items in the repository of pre-existing content.
16. The computing system of claim 15, wherein comparing includes comparing the output content to each item in the repository of pre-existing content in turn to determine a distance value representing similarity between the output content and that item, and identifying the lowest distance value and the corresponding item as the most similar of the items to the output content.
17. The computing system of claim 13, wherein the input includes a prompt and wherein the instructions, when executed, are to further cause the processor to adjust the input by changing the prompt.
18. The computing system of claim 17, wherein the prompt includes text and wherein the instructions, when executed, are to further cause the processor to change the prompt by changing the text.
19. The computing system of claim 18, wherein the instructions, when executed, are to further cause the processor to change the text by, at least, one or more of replacing a word in the text with an alternative word, changing an order of words in the text, or changing a verb tense for one or more words in the text.
20. A non-transitory processor-readable medium storing processor-executable instructions that, when executed by a processor, are to cause the processor to: generate output content using a generative artificial intelligence (AI) model based on an input; determine, using a similarity-assessment layer, a similarity measure for the output content with respect to a repository of pre-existing content; compare the similarity measure to a threshold value; and responsive to the comparison indicating excessive similarity, adjust one or both of the input and the generative AI model, and re-generate new output content using the generative AI model.
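
As an illustration only, and not as part of the claims, the item-by-item distance comparison recited in claims 4 and 16 might be sketched as below. The sketch assumes that the output content and every repository item have already been reduced to numeric feature vectors of equal length, and it uses Euclidean distance; neither the vector representation nor the choice of metric is specified in this section. The function name `most_similar_item` and the sample identifiers are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def most_similar_item(
    output_vec: Sequence[float],
    repository: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Compare the output to each repository item in turn and return the
    identifier and distance of the closest (most similar) item."""
    if not repository:
        raise ValueError("repository is empty")
    best_id, best_dist = "", math.inf
    for item_id, item_vec in repository.items():
        # Assumed metric: Euclidean distance between feature vectors.
        dist = math.dist(output_vec, item_vec)
        if dist < best_dist:
            # A lower distance means the item is more similar to the output.
            best_id, best_dist = item_id, dist
    return best_id, best_dist


# Hypothetical two-dimensional feature vectors for three pre-existing items.
repo = {
    "item_a": [0.0, 0.0],
    "item_b": [3.0, 4.0],
    "item_c": [1.0, 1.0],
}
print(most_similar_item([1.0, 1.5], repo))
```

How the lowest distance is then turned into the similarity measure that is compared with the threshold is left open here; one plausible convention is to treat a small distance as high similarity.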

Description

CROSS-REFERENCE TO RELATED APPLICATION

The present application claims the benefit of priority to U.S. Provisional Patent Application No. 63/424,577 filed on Nov. 11, 2022, and to U.S. Provisional Patent Application No. 63/480,135 filed on Jan. 17, 2023, the contents of each of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to generative artificial intelligence (AI) systems.

BACKGROUND

The present disclosure relates to generative artificial intelligence (AI) systems, which may sometimes employ language learning models (LLMs). In the present application, the term generative AI model may be used to describe a machine learning model (MLM).

A trained generative AI model, e.g. an LLM, may respond to an input prompt by generating and producing an output or result. The output or result may be generated by the generative AI model through interpreting the intent and context of the prompt. In some cases, the generative AI model may be implemented with constraints on the acceptable prompts. In some cases, this may include a prompt template. A prompt template may specify that prompts have a certain structure or constrained intents, or that acceptable prompts exclude certain classes of subject matter or intent, such as the production of results or outputs that are violent, pornographic, etc.

Significant advances have been made recently in generative AI models. Different implementations may be trained to create digital art, computer code, conversation text responses, or other types of outputs. Examples include Stable Diffusion by Stability AI Ltd., ChatGPT by OpenAI, DALL-E 2 by OpenAI, and GitHub CoPilot by GitHub and OpenAI. The generative AI models are typically trained using a large data set of example training data.
For instance, in the case of AI for generating images, the training data set may include a database of millions of images tagged with information regarding the contents, style, artist, context, or other data about the image or its manner of creation. The generative AI trained on such a data set is then able to take an input prompt in text form, which may include suggested topics, features, styles or other suggestions, and provide an output image that reflects, at least to some degree, the input prompt.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will be described, by way of example only, with reference to the accompanying figures wherein:

FIG. 1 illustrates, in block diagram form, an example system for similarity-based generative AI output filtering;
FIG. 2 shows, in flowchart form, a simplified method of similarity-based generative AI output filtering;
FIG. 3A shows another example method of similarity-based generative AI output filtering;
FIG. 3B shows a further example method of similarity-based generative AI output filtering;
FIG. 3C shows yet another example method of similarity-based generative AI output filtering;
FIG. 4 shows, in block diagram form, another example system for similarity-based generative AI output filtering;
FIG. 5 shows, in flowchart form, one example method of generating outputs using a generative AI model and the system of FIG. 4;
FIG. 6A is a high-level schematic diagram of an example computing device;
FIG. 6B shows a simplified organization of software components stored in a memory of the computing device of FIG. 6A.
FIG. 7 is a block diagram of a simplified transformer neural network, which may be used in examples of the present disclosure.

Like reference numerals are used in the drawings to denote like elements and features.

DETAILED DESCRIPTION OF EMBODIMENTS

In an aspect, the present application discloses a computer-implemented method that may include generating output content using a generative artificial intelligence (AI) model based on an input; determining, using a similarity-assessment layer, a similarity measure for the output content with respect to a repository of pre-existing content; comparing the similarity measure to a threshold value; and responsive to the comparing indicating excessive similarity, adjusting one or both of the input and the generative AI model, and re-generating new output content using the generative AI model.

In some implementations, the method may also include repeating the generating, determining, comparing, adjusting, and re-generating until the new output content has a respective similarity measure at or below the threshold value. In some implementations, determining the similarity measure may include calculating a distance metric by comparing the output content to items in the repository of pre-existing content. In some cases, comparing includes comparing the output content to each item in the repository of pre-existing content in turn to determine a distance value representing similarity between the output content and that item, and identifying the lowest distance value and the corresponding item as the most similar of the items to the output content. In some implementations, the input incl