KR-20260064141-A - METHOD AND APPARATUS FOR POISONING A DIFFUSION MODEL FOR IMAGE GENERATION

KR 20260064141 A

Abstract

The present invention includes a memory storing at least one instruction for poisoning a diffusion model for image generation with respect to protection data to be protected, and a processor that performs operations according to the instruction. The processor stores a first learning trajectory, which is a trajectory of stepwise weights obtained during iterative fine-tuning of the image generation diffusion model on input target data, and applies noise, updated iteratively and stepwise, to the protection data such that the distance difference between the first learning trajectory and a second learning trajectory, which is a trajectory of weights obtained during fine-tuning of the image generation diffusion model on the protection data, is minimized, thereby inducing the image generation diffusion model to be fine-tuned toward the target.

Inventors

  • 이호준

Assignees

  • 주식회사 엑스퍼티

Dates

Publication Date
2026-05-07
Application Date
2024-10-31

Claims (10)

  1. A server comprising: a memory storing at least one instruction for poisoning a diffusion model for image generation with respect to protection data to be protected; and a processor that performs operations according to the instruction, wherein the processor stores a first learning trajectory, which is a trajectory of stepwise weights obtained during iterative fine-tuning of the image generation diffusion model on input target data, and applies noise, updated iteratively and stepwise, to the protection data such that a distance difference between the first learning trajectory and a second learning trajectory, which is a trajectory of weights obtained during fine-tuning of the image generation diffusion model on the protection data, is minimized, thereby inducing the image generation diffusion model to be fine-tuned.
  2. The server of claim 1, wherein the processor selects an arbitrary i-th weight from the first learning trajectory stored for the target data, applies the i-th weight to the protection data to generate second protection data, computes an (i+k)-th weight obtained by performing k steps of fine-tuning using the second protection data, computes a distance between the (i+k)-th weight of the second protection data and the (i+k)-th weight of the target data, and adjusts the noise for the second protection data stepwise based on the computation result, so as to reduce the distance difference between the second learning trajectory and the first learning trajectory.
  3. The server of claim 2, wherein the distance is computed according to the objective function $\mathcal{L} = \lVert \hat{\theta}_{i+k} - \theta^{*}_{i+k} \rVert_2^2 \,/\, \lVert \theta^{*}_{i} - \theta^{*}_{i+k} \rVert_2^2$, where $\hat{\theta}_{i+k}$ is the (i+k)-th weight of the second protection data, $\theta^{*}_{i+k}$ is the (i+k)-th weight of the target data, and $\theta^{*}_{i}$ is the i-th weight of the target data.
  4. The server of claim 3, wherein reducing the distance difference between the second learning trajectory and the first learning trajectory comprises computing a gradient of the objective function and adjusting the noise stepwise using a sign of the gradient.
  5. The server of claim 4, wherein the noise is limited so as not to exceed a noise budget.
  6. The server of claim 5, wherein, if the noise exceeds the noise budget, the noise is clipped so as to remain within the noise budget.
  7. The server of claim 5, wherein the adjustment of the noise using the sign of the gradient sets the gradient of the noise to -1 for negative values and to +1 for positive values.
  8. The server of claim 1, wherein an initial value of the weight is set to 0.
  9. The server of claim 1, wherein the image generation diffusion model uses LoRA (Low-Rank Adaptation).
  10. A method for poisoning a diffusion model for image generation with respect to protection data to be protected, the method comprising: storing a first learning trajectory, which is a trajectory of stepwise weights obtained during iterative fine-tuning of the image generation diffusion model on input target data; iteratively updating noise stepwise such that a distance difference between the first learning trajectory and a second learning trajectory, which is a trajectory of weights obtained during fine-tuning of the image generation diffusion model on the protection data, is minimized; and applying the updated noise to the protection data to induce the image generation diffusion model to be fine-tuned.
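
Claims 2 through 7 together describe an iterative, PGD-style optimization: start from a stored checkpoint of the target trajectory, fine-tune for k steps on the perturbed protection data, measure the normalized distance to the target trajectory (claim 3), and update the noise by the sign of the gradient under a noise budget (claims 4 through 7). The following is a minimal, hypothetical sketch of that loop; the toy linear model, the `finetune_from` helper, and all hyperparameters (`eps`, `alpha`, `k`) are illustrative assumptions, not details taken from the patent.

```python
import torch

def finetune_from(weights, data, k, lr=1e-2):
    """Run k differentiable gradient steps from `weights` on `data`."""
    w = weights.clone().requires_grad_(True)
    for _ in range(k):
        loss = ((data @ w) ** 2).mean()   # stand-in for the diffusion loss
        (g,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w - lr * g                    # unrolled so gradients reach `data`
    return w

# Hypothetical "first learning trajectory": stepwise weights recorded while
# fine-tuning on the target data (claim 1).
target_traj = [torch.randn(8) for _ in range(11)]

protect = torch.randn(16, 8)           # protection images, flattened stand-in
delta = torch.zeros_like(protect)      # perturbation, initialised to zero
eps, alpha, k = 8 / 255, 1 / 255, 3    # assumed noise budget / step size / k

for _ in range(100):
    i = torch.randint(0, len(target_traj) - k, (1,)).item()
    delta.requires_grad_(True)
    # Claim 2: start from the i-th target weight and fine-tune k steps on
    # the perturbed ("second") protection data.
    w_hat = finetune_from(target_traj[i], protect + delta, k)
    # Claim 3: normalised distance to the (i+k)-th target weight.
    num = (w_hat - target_traj[i + k]).pow(2).sum()
    den = (target_traj[i] - target_traj[i + k]).pow(2).sum()
    loss = num / den
    loss.backward()
    with torch.no_grad():
        # Claims 4 and 7: step the noise by the sign of the gradient.
        delta = delta - alpha * delta.grad.sign()
        # Claims 5 and 6: clip so the noise stays within the budget.
        delta = delta.clamp(-eps, eps).detach()
```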

Description

Method and apparatus for poisoning a diffusion model for image generation

Various embodiments of the present disclosure relate to a method and apparatus for poisoning a diffusion model for image generation.

The rapid advancement of image generation technology based on diffusion models has significantly improved productivity and creativity. However, these advances also raise serious concerns. For example, diffusion models can be trained on image data without proper authorization, and the generated images may infringe the rights of the original creators. Furthermore, misuse of these models can lead to problems such as identity theft, the dissemination of false information, and harassment through the creation of deepfakes.

In response to these concerns, existing data protection research has explored methods such as Untargeted Data Protection (UDP), which intentionally degrades the quality of images generated by models trained on unauthorized data. This approach prevents unauthorized use of data by rendering the image output unusable. However, because UDP focuses solely on degrading output quality, its effectiveness against misuse is limited: it is difficult to prove unauthorized use or to rely on the degraded output as legal evidence. UDP thus provides a form of protection but lacks specificity and traceability.

To address these limitations, Targeted Data Protection (TDP) has been proposed. Unlike UDP, TDP aims to induce a model trained on protected data to generate specific target images. TDP makes misuse easier to prove, because the deliberate alterations embedded in the data help track and identify unauthorized use. TDP can also be customized to protect specific images or data types and adapted to various security requirements. Additionally, because TDP adds intentional changes to the data, those changes serve as evidence of unauthorized use, making legal action easier when necessary.

However, TDP requires the diffusion model to generate different, targeted images even after being trained on the protected data. In other words, it must add a small amount of imperceptible noise to the protected data while simultaneously altering the high-level content or semantics of the generated images. As a result, existing TDP methods achieve only a limited match between the target image and the output of the poisoned model. For example, Mist defined a textural loss as the target loss, which was effective for imprinting distinctive patterns but struggled to alter the actual content.

FIG. 1 is a block diagram of a server for poisoning a diffusion model for image generation according to one embodiment of the present invention. FIG. 2 is a drawing for explaining the concept of the present invention. FIG. 3 is a schematic diagram illustrating the process of inputting a target image into the diffusion model and fine-tuning the diffusion model according to an embodiment of the present invention. FIG. 4 is a schematic diagram illustrating the process of matching the training trajectory of a protection image with the training trajectory of a target image according to one embodiment of the present invention. FIG. 5 is a flowchart of a method for poisoning a diffusion model for image generation according to one embodiment of the present invention.
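
FIG. 3 and FIG. 4 describe the two halves of the scheme: first record the fine-tuning trajectory on the target image, then match the protection image's trajectory to it. Below is a minimal sketch of the first half, recording the stepwise weights; the linear stand-in model, the training loss, and all names are assumptions for illustration only, not the patent's implementation.

```python
import torch

def record_trajectory(target_data, steps=10, lr=1e-2):
    # Initial weight set to zero, per claim 8. A linear model stands in
    # for a LoRA adapter on a diffusion model (claim 9).
    w = torch.zeros(target_data.shape[1], requires_grad=True)
    opt = torch.optim.SGD([w], lr=lr)
    trajectory = [w.detach().clone()]                 # theta*_0
    for _ in range(steps):
        loss = ((target_data @ w - 1.0) ** 2).mean()  # stand-in training loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        trajectory.append(w.detach().clone())         # theta*_1 ... theta*_steps
    return trajectory

# The stored trajectory is then used as the "first learning trajectory"
# that the noise optimization over the protection data tries to match.
target_traj = record_trajectory(torch.randn(16, 8))
```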
The embodiments described in this specification and the configurations illustrated in the drawings are merely preferred examples of the disclosed invention, and various modifications that may replace the embodiments and drawings of this specification may exist at the time of filing this application. The same reference numerals or symbols in the respective drawings denote parts or components that perform substantially the same function.

The suffix "unit" ("~bu") used for components in this specification is assigned or used interchangeably solely for ease of drafting and does not in itself carry a distinct meaning or role. A "unit" includes units realized by hardware, units realized by software, and units realized by both; one unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware.

In this specification, expressions such as "A and/or B" and "at least one of A and B" refer to all possible combinations of the items listed together. Terms including ordinal numbers, such as "first" and "second," may be used to describe various components, but the components are not limited by these terms; such terms are used solely to distinguish one component from another. Furthermore, terms such as "comprising" and "may comprise" are intended to indicate the presence of the features, numbers, steps, actions, components, parts, or combinations thereof described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof.