CN-122023573-A - Method, apparatus, device and storage medium for image generation

Abstract

According to embodiments of the present disclosure, methods, apparatuses, devices, and storage media for image generation are provided. The method includes: determining, using an image generation model and based on a description sample representing an image generation target, a plurality of sample images respectively corresponding to a plurality of time steps and a plurality of noises respectively corresponding to the plurality of sample images; for a target time step among the plurality of time steps, determining a second sample image corresponding to a reference time step based on a first sample image corresponding to the target time step and a first noise corresponding to the first sample image, the reference time step being between the target time step and a previous time step, and the previous time step being the time step immediately before the target time step among the plurality of time steps; and training the image generation model based on at least the second sample image. In this way, sampling speed is improved while the original generation capability of the image generation model is preserved as far as possible, and computation and time costs are reduced.
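
The abstract does not state which rule produces the second sample image from the first sample image and the first noise. One plausible reading, offered only as a sketch, is a DDIM-style deterministic update: recover a clean-image estimate from the first sample image and its noise, then re-noise that estimate to the reference time step. The function names, the toy linear schedule `alpha_bar`, and the parameter `total_steps` are assumptions made for illustration and do not come from the source.

```python
import math


def alpha_bar(step: float, total_steps: int = 1000) -> float:
    """Hypothetical toy schedule: fraction of signal kept at a time step.

    Assumption: linear decay that stays strictly positive at step == total_steps.
    """
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    return 1.0 - step / (total_steps + 1)


def second_sample(first_sample, first_noise, target_step, reference_step,
                  total_steps=1000):
    """DDIM-style deterministic move from target_step to reference_step.

    Assumption: first_sample = sqrt(a_t) * x0 + sqrt(1 - a_t) * first_noise,
    where a_t = alpha_bar(target_step). Images are flat lists of floats here.
    """
    a_t = alpha_bar(target_step, total_steps)
    a_s = alpha_bar(reference_step, total_steps)
    result = []
    for x, eps in zip(first_sample, first_noise):
        # Clean-image estimate implied by the sample and its noise.
        x0 = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        # Re-noise the estimate to the reference time step with the same noise.
        result.append(math.sqrt(a_s) * x0 + math.sqrt(1.0 - a_s) * eps)
    return result
```

Under these assumptions, calling `second_sample` with `reference_step` equal to `target_step` returns the first sample image unchanged (up to rounding), and any reference step between the target and previous time steps reuses the stored noise without another pass through the image generation model.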

Inventors

  • XIA XIN
  • YANG YUHONG
  • XIAO XUEFENG

Assignees

  • 北京字跳网络技术有限公司

Dates

Publication Date
20260512
Application Date
20241107

Claims (12)

  1. A method for image generation, comprising: determining, using an image generation model and based on a description sample representing an image generation target, a plurality of sample images respectively corresponding to a plurality of time steps and a plurality of noises respectively corresponding to the plurality of sample images; determining, for a target time step of the plurality of time steps, a second sample image corresponding to a reference time step, based on a first sample image of the plurality of sample images corresponding to the target time step and a first noise corresponding to the first sample image, the reference time step being between the target time step and a previous time step, and the previous time step being the time step of the plurality of time steps immediately preceding the target time step; and training the image generation model based at least on the second sample image.
  2. The method of claim 1, wherein the reference time step is determined by: dividing a time step interval between the target time step and the previous time step into a plurality of sub-intervals; and selecting a time step in a selected sub-interval of the plurality of sub-intervals as the reference time step.
  3. The method of claim 1, wherein determining the second sample image corresponding to the reference time step comprises: determining the second sample image corresponding to the reference time step according to a target algorithm, based on the first sample image, the first noise, and a parameter related to the reference time step.
  4. The method of claim 1, wherein training the image generation model comprises: acquiring, based on the second sample image, first feedback information for the second sample image using a reference feedback model; acquiring, based on the second sample image, second feedback information for the second sample image using a pseudo feedback model; and updating model parameters of the image generation model based on a difference between the first feedback information and the second feedback information.
  5. The method of claim 4, further comprising: determining a predicted noise of the second sample image using the pseudo feedback model; and training the pseudo feedback model based on a difference between the predicted noise and a ground-truth noise of the first sample image.
  6. The method of claim 4, wherein the first feedback information comprises respective first scores of a plurality of image blocks in the second sample image, and the second feedback information comprises respective second scores of the plurality of image blocks in the second sample image.
  7. The method of claim 1, wherein the description sample is one of a plurality of description samples, the plurality of sample images and the plurality of noises are determined for each of the plurality of description samples, and the method further comprises: storing, in a data set, the plurality of sample images and the plurality of noises respectively determined based on the plurality of description samples; and reading the first sample image and the first noise from the data set before determining the second sample image.
  8. The method of claim 1, wherein a denoising process used in training the image generation model includes a predetermined number of time steps, and the plurality of time steps are a subset of the predetermined number of time steps.
  9. An apparatus for image generation, comprising: a determining module configured to determine, using an image generation model and based on a description sample representing an image generation target, a plurality of sample images respectively corresponding to a plurality of time steps and a plurality of noises respectively corresponding to the plurality of sample images; a second sample image determination module configured to determine, for a target time step of the plurality of time steps, a second sample image corresponding to a reference time step, based on a first sample image of the plurality of sample images corresponding to the target time step and a first noise corresponding to the first sample image, the reference time step being between the target time step and a previous time step, and the previous time step being the time step of the plurality of time steps immediately preceding the target time step; and an image generation model training module configured to train the image generation model based at least on the second sample image.
  10. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, wherein the instructions, when executed by the at least one processing unit, cause the electronic device to perform the method of any one of claims 1 to 8.
  11. A computer-readable storage medium having stored thereon a computer program executable by a processor to implement the method of any one of claims 1 to 8.
  12. A computer program product tangibly stored in a computer storage medium and comprising computer-executable instructions that, when executed by a device, cause the device to perform the method of any one of claims 1 to 8.
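
Claim 2 describes choosing the reference time step by splitting the interval between the target time step and the previous time step into sub-intervals and picking a time step inside one of them. The following is a minimal sketch of that idea. The claims do not say how the sub-interval or the time step within it is chosen, so uniform random selection, continuous (float) time steps, equal-width sub-intervals, and every name below (`pick_reference_step`, `num_subintervals`, `subinterval_index`) are assumptions for illustration only.

```python
import random


def pick_reference_step(target_step, previous_step, num_subintervals=4,
                        subinterval_index=None, rng=None):
    """Pick a reference time step between target_step and previous_step.

    Assumptions (not stated in the claims): sub-intervals have equal width,
    the sub-interval is drawn uniformly at random unless an index is given,
    and the time step inside it is drawn uniformly as a float.
    """
    if num_subintervals < 1:
        raise ValueError("num_subintervals must be at least 1")
    if rng is None:
        rng = random.Random()

    # The order of the two steps is not assumed; work on the sorted interval.
    low, high = sorted((target_step, previous_step))
    width = (high - low) / num_subintervals

    if subinterval_index is None:
        subinterval_index = rng.randrange(num_subintervals)
    if not 0 <= subinterval_index < num_subintervals:
        raise ValueError("subinterval_index out of range")

    # Bounds of the selected sub-interval, then a uniform draw inside it.
    start = low + subinterval_index * width
    return rng.uniform(start, start + width)
```

For example, with a target time step of 750, a previous time step of 1000, five sub-intervals, and `subinterval_index=2`, the returned value falls between 850 and 900. Passing a seeded `random.Random` makes the choice reproducible across training runs.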

Description

Method, apparatus, device and storage medium for image generation

Technical Field

Example embodiments of the present disclosure relate generally to the field of computers and, more particularly, to methods, apparatuses, devices, and computer-readable storage media for image generation.

Background

In the field of computer vision (CV), image generation techniques based on machine learning have developed remarkably and are widely used. For example, many application scenarios, such as social networking, gaming, and image editing, require generating images desired by a user. Machine learning based image generation techniques may be used in such scenarios to enhance the user experience. In some example application scenarios, it is desirable to generate an image that matches the user's input information, such as text description information.

Disclosure of Invention

In a first aspect of the present disclosure, a method of image generation is provided. The method includes: determining, using an image generation model and based on a description sample representing an image generation target, a plurality of sample images respectively corresponding to a plurality of time steps and a plurality of noises respectively corresponding to the plurality of sample images; for a target time step among the plurality of time steps, determining a second sample image corresponding to a reference time step based on a first sample image corresponding to the target time step and a first noise corresponding to the first sample image, the reference time step being between the target time step and a previous time step, and the previous time step being the time step immediately before the target time step among the plurality of time steps; and training the image generation model based on at least the second sample image.

In a second aspect of the present disclosure, an apparatus for image generation is provided.
The apparatus includes: a determination module configured to determine, using an image generation model and based on a description sample representing an image generation target, a plurality of sample images respectively corresponding to a plurality of time steps and a plurality of noises respectively corresponding to the plurality of sample images; a second sample image determination module configured to determine, for a target time step among the plurality of time steps, a second sample image corresponding to a reference time step based on a first sample image corresponding to the target time step and a first noise corresponding to the first sample image, the reference time step being between the target time step and a previous time step, and the previous time step being the time step immediately before the target time step among the plurality of time steps; and an image generation model training module configured to train the image generation model based on at least the second sample image.

In a third aspect of the present disclosure, an electronic device is provided. The electronic device includes at least one processing unit, and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit. The instructions, when executed by the at least one processing unit, cause the electronic device to perform the method of the first aspect.

In a fourth aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium has stored thereon a computer program executable by a processor to implement the method of the first aspect.

In a fifth aspect of the present disclosure, a computer program product is provided. The computer program product is tangibly stored in a computer storage medium and comprises computer-executable instructions which, when executed by a device, cause the device to perform the method of the first aspect.
It should be understood that what is described in this section is not intended to identify key features or essential features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference numerals denote like or similar elements, in which:

FIG. 1 illustrates a schematic diagram of an example environment in which embodiments of the present disclosure may be implemented;
FIG. 2 illustrates a schematic diagram of an example architecture of model training, according to some embodiments of the present disclosure;
FIG. 3 illustrates a schematic diagram of determining a reference time step in accordance with some embodiments of the present disclosure;
FIG. 4 ill