CN-122023567-A - Image processing method, device, electronic equipment and storage medium


Abstract

The method comprises: obtaining an original image; identifying a plurality of target objects in the original image to obtain a target mask; extracting exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and inputting the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image. The first image comprises a plurality of derivative objects whose positions correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position. In essence, the disclosure provides an image generation method that simultaneously retains the exclusive features of a plurality of target objects in the original image in the newly generated image.

Inventors

Wu Cenya
Sun Qichao

Assignees

Beijing Zitiao Network Technology Co., Ltd. (北京字跳网络技术有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2024-11-01

Claims (11)

1. An image processing method, comprising: obtaining an original image, wherein the original image comprises a plurality of target objects; identifying the plurality of target objects in the original image to obtain a target mask, wherein the target mask indicates the positions of the target objects in the original image; extracting exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and inputting the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image, wherein the first image comprises a plurality of derivative objects, the positions of the derivative objects in the first image correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position.
2. The method according to claim 1, wherein, if at least some of the plurality of target objects belong to different categories, before inputting the target mask and the exclusive feature information of the target objects into the image generation model to obtain the first image, the method further comprises: determining target fine-tuning models corresponding to the categories to which the target objects belong, wherein there are at least two target fine-tuning models; and fusing the target fine-tuning models with a base model to obtain the image generation model.
3. The method according to claim 1, further comprising: acquiring style prompt information; and obtaining, based on the original image, first text information describing the image content of the original image and structural information of the original image; wherein inputting the target mask and the exclusive feature information of the target objects into the image generation model to obtain the first image comprises: inputting the target mask, the exclusive feature information of the target objects, the style prompt information, the first text information, and the structural information of the original image into the image generation model to obtain the first image.
4. The method according to claim 3, wherein the structural information of the original image comprises contour information and/or depth information of the original image.
5. The method according to claim 3, further comprising: filtering phrases in the first text information based on a preset filtering rule to obtain second text information; wherein inputting the target mask, the exclusive feature information of the target objects, the style prompt information, the first text information, and the structural information of the original image into the image generation model to obtain the first image comprises: inputting the target mask, the exclusive feature information of the target objects, the style prompt information, the second text information, and the structural information of the original image into the image generation model to obtain the first image.
6. The method according to claim 5, further comprising: acquiring negative prompt information; wherein inputting the target mask, the exclusive feature information of the target objects, the style prompt information, the second text information, and the structural information of the original image into the image generation model to obtain the first image comprises: inputting the target mask, the exclusive feature information of the target objects, the style prompt information, the second text information, the structural information of the original image, and the negative prompt information into the image generation model to obtain the first image.
7. The method according to any one of claims 1 to 6, wherein, after inputting the target mask and the exclusive feature information of the target objects into the image generation model to obtain the first image, the method further comprises: segmenting the target object in the original image to obtain a target object image; and fusing the first image and the target object image based on the target mask to obtain a target image.
8. The method according to claim 7, further comprising: determining an angle and an occluded area of the target object in the original image; and determining, based on the angle and the occluded area of the target object in the original image and a preset judging condition, whether to perform fusion processing on the target object; wherein fusing the first image and the target object image based on the target mask to obtain the target image comprises: if it is determined that fusion processing is to be performed on the target object, fusing the first image and the target object image based on the target mask to obtain the target image.
9. An image processing apparatus, comprising: an acquisition module, configured to acquire an original image, wherein the original image comprises a plurality of target objects; an identification module, configured to identify the plurality of target objects in the original image to obtain a target mask, wherein the target mask indicates the positions of the target objects in the original image; an extraction module, configured to extract exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and a generation module, configured to input the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image, wherein the first image comprises a plurality of derivative objects, the positions of the derivative objects in the first image correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position.
10. An electronic device, comprising: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 8.
11. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 8.
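
The mask-based fusion of claim 7 and the fusion decision of claim 8 can be illustrated with a minimal Python sketch. The claims do not fix the judging condition or the blending rule, so everything specific here is an assumption made for illustration: the function names, the angle and occlusion thresholds, the direction of the comparison, the linear blend with weight `alpha`, and the use of nested lists of grayscale values as images.

```python
def should_fuse(angle_deg, occluded_ratio,
                max_angle_deg=30.0, max_occluded_ratio=0.2):
    """Hypothetical preset judging condition (thresholds are assumptions):
    fuse only objects that are close to frontal and mostly unoccluded."""
    return abs(angle_deg) <= max_angle_deg and occluded_ratio <= max_occluded_ratio


def fuse_by_mask(first_image, object_image, mask, label, alpha=0.5):
    """Blend object_image into first_image wherever mask == label.

    Pixels outside the labelled region are copied from first_image unchanged.
    """
    fused = [row[:] for row in first_image]
    for y, mask_row in enumerate(mask):
        for x, pixel_label in enumerate(mask_row):
            if pixel_label == label:
                fused[y][x] = ((1 - alpha) * first_image[y][x]
                               + alpha * object_image[y][x])
    return fused


def fuse_targets(first_image, object_image, mask, objects, alpha=0.5):
    """objects maps a mask label to (angle_deg, occluded_ratio).

    Only objects that pass should_fuse are blended into the result.
    """
    target = first_image
    for label, (angle_deg, occluded_ratio) in objects.items():
        if should_fuse(angle_deg, occluded_ratio):
            target = fuse_by_mask(target, object_image, mask, label, alpha)
    return target
```

In this sketch an object that fails the condition is simply left as generated in the first image, which is one plausible reading of claim 8.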

Description

Image processing method, device, electronic equipment and storage medium

Technical Field

The disclosure relates to the technical field of artificial intelligence, and in particular to an image processing method, an image processing device, electronic equipment and a storage medium.

Background

With the development of artificial intelligence technology, particularly the application of deep learning in the field of image processing, technology for generating images based on a guidance image has been realized. Current image generation technology still has some limitations. In particular, when the guidance image contains a plurality of target objects, how to make the objects in the generated image highly similar to the target objects in the guidance image remains a problem to be solved.

Disclosure of Invention

In order to solve the above technical problems, or at least partially solve them, the present disclosure provides an image processing method, an apparatus, an electronic device, and a storage medium.
In a first aspect, the present disclosure provides an image processing method, including: obtaining an original image, wherein the original image comprises a plurality of target objects; identifying the plurality of target objects in the original image to obtain a target mask, wherein the target mask indicates the positions of the target objects in the original image; extracting exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and inputting the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image, wherein the first image comprises a plurality of derivative objects, the positions of the derivative objects in the first image correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position.
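
The inputs of the first aspect can be illustrated with a minimal Python sketch: a label mask that records where each target object sits, and a per-object feature paired with the same label, so that a generator could keep each derivative object tied to the position of its source object. The text above does not say which detector, segmentation model, or feature encoder is used, so the bounding-box `Box` type, the mean-intensity "feature", and all function names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Region of one detected target object (hypothetical detector output)."""
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def build_target_mask(width, height, boxes):
    """Return a label mask: 0 is background, k is the k-th target object."""
    mask = [[0] * width for _ in range(height)]
    for label, box in enumerate(boxes, start=1):
        for y in range(max(0, box.top), min(height, box.bottom)):
            for x in range(max(0, box.left), min(width, box.right)):
                mask[y][x] = label
    return mask


def extract_exclusive_features(image, mask):
    """Stand-in extractor (assumption): mean pixel value per labelled region."""
    sums, counts = {}, {}
    for pixel_row, label_row in zip(image, mask):
        for value, label in zip(pixel_row, label_row):
            if label:
                sums[label] = sums.get(label, 0) + value
                counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def build_conditioning(mask, features):
    """Pair each mask region with the feature of the object at that position."""
    labels = sorted({label for row in mask for label in row if label})
    return [{"label": label, "feature": features[label]} for label in labels]
```

Keying both the mask and the features by the same label is one simple way to express the one-to-one positional correspondence; a real system would pass learned embeddings rather than mean intensities.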
In a second aspect, the present disclosure also provides an image processing apparatus, including: an acquisition module, configured to acquire an original image, wherein the original image comprises a plurality of target objects; an identification module, configured to identify the plurality of target objects in the original image to obtain a target mask, wherein the target mask indicates the positions of the target objects in the original image; an extraction module, configured to extract exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and a generation module, configured to input the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image, wherein the first image comprises a plurality of derivative objects, the positions of the derivative objects in the first image correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position.

In a third aspect, the present disclosure also provides an electronic device, including: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image processing method as described above.

In a fourth aspect, the present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image processing method as described above.
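
The four-module apparatus can be pictured as a small composition of pluggable callables. The following Python sketch only shows the data flow between the acquisition, identification, extraction, and generation modules; the class name, field names, and the toy stand-in functions are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ImageProcessingApparatus:
    """Wires the four modules together; each field is a pluggable callable."""
    acquire: Callable[[], Any]            # acquisition module
    identify: Callable[[Any], Any]        # identification module -> target mask
    extract: Callable[[Any, Any], Any]    # extraction module -> feature info
    generate: Callable[[Any, Any], Any]   # generation module -> first image

    def run(self):
        original = self.acquire()
        mask = self.identify(original)
        features = self.extract(original, mask)
        return self.generate(mask, features)


# Toy stand-ins (assumptions): the "image" is a list of object names and the
# "mask" is the list of their positions.
toy = ImageProcessingApparatus(
    acquire=lambda: ["cat", "dog"],
    identify=lambda image: list(range(len(image))),
    extract=lambda image, mask: {i: image[i].upper() for i in mask},
    generate=lambda mask, features: [f"derived-{features[i]}" for i in mask],
)
first_image = toy.run()
```

Keeping each module behind a plain callable makes it easy to swap in a real detector, encoder, or generator without changing the flow.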
Compared with the prior art, the technical solution provided by the embodiments of the present disclosure has the following advantages: The technical solution identifies a plurality of target objects in an original image to obtain a target mask, wherein the target mask indicates the positions of the target objects in the original image; extracts exclusive features of the target objects from the original image to obtain exclusive feature information of the target objects; and inputs the target mask and the exclusive feature information of the target objects into an image generation model to obtain a first image, wherein the first image comprises a plurality of derivative objects, the positions of the derivative objects in the first image correspond one-to-one to the positions of the target objects in the original image, and the derivative object at each position is consistent with the exclusive features of the target object at the same position. The essence of the method is to provide an image generation method capable of simultaneously maintain