CN-121999082-A - Scene-adaptive image generation method, device, equipment and medium


Abstract

The application relates to the technical field of images, and provides a scene-adapted image generation method, device, equipment and medium. The method comprises: obtaining object information of a plurality of first objects in a target image; determining, based on the object information of the plurality of first objects, object information of a second object adapted to the scene in which the plurality of first objects are located; generating a layout image based on the object information of the second object; generating, within the position range of the second object contained in the layout image, a second object that matches the visual characteristics of the target image, taking the visual characteristics of the target image as a reference condition and the position range of the second object in the layout image as a spatial constraint condition; and adding the second object generated in the layout image, which matches the visual characteristics of the target image, to the target image to obtain a target image containing the second object. The image generation method provided by the application can meet the training requirement for complex scene samples, and the generated image has semantic logic consistent with common sense.

Inventors

ZHANG XU

Assignees

北京神州光大科技有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2025-12-29

Claims (10)

1. A scene-adapted image generation method, the method comprising: acquiring object information of a plurality of first objects in a target image; determining, based on the object information of the plurality of first objects, object information of a second object adapted to a scene in which the plurality of first objects are located; generating a layout image based on the object information of the second object, wherein the size of the layout image is the same as that of the target image, and the layout image comprises the outline of the second object and the position range of the second object; generating, within the position range of the second object contained in the layout image, the second object matched with the visual characteristics of the target image, taking the visual characteristics of the target image as a reference condition and the position range of the second object in the layout image as a spatial constraint condition; and adding the second object, which is generated in the layout image and matched with the visual characteristics of the target image, to the target image, so as to obtain the target image containing the second object.
2. The image generation method according to claim 1, wherein the acquiring object information of the plurality of first objects in the target image includes: performing recognition on the target image to obtain object information of each object in the target image; determining a confidence of each object in the target image, wherein the confidence of each object represents the reliability of the object information of that object; and taking the object information of the N objects with the highest confidence as the object information of the first objects, wherein N is an integer greater than 1.
3. The image generation method according to claim 1 or 2, wherein the object information includes at least one of an object category, position information, or an object size.
4. The image generation method according to claim 3, wherein the object information includes position information, and the determining object information of a second object adapted to a scene in which the plurality of first objects are located based on the object information of the plurality of first objects includes: determining spatial layout information of the plurality of first objects based on the position information of the plurality of first objects; and determining position information of the second object based on the spatial layout information of the plurality of first objects.
5. The image generation method according to claim 3, wherein the object information includes an object category, position information, and an object size, and the generating a layout image based on the object information of the second object includes: generating the layout image based on the object category, object size, and position information of the second object.
6. The image generation method according to claim 1, wherein the layout image is a binary mask image.
7. A scene-adapted image generation apparatus, the image generation apparatus comprising: an acquisition module, configured to acquire object information of a plurality of first objects in a target image; a determining module, configured to determine, based on the object information of the plurality of first objects, object information of a second object adapted to a scene in which the plurality of first objects are located; and a generating module, configured to generate a layout image based on the object information of the second object, wherein the size of the layout image is the same as the size of the target image, and the layout image includes a contour of the second object and a position range of the second object; the generating module is further configured to generate, within the position range of the second object included in the layout image, the second object that matches the visual characteristics of the target image, taking the visual characteristics of the target image as a reference condition and the position range of the second object in the layout image as a spatial constraint condition; and the generating module is further configured to add the second object, which is generated in the layout image and matches the visual characteristics of the target image, to the target image, so as to obtain the target image that includes the second object.
8. An electronic device comprising a processor and a memory storing a computer program, characterized in that the processor, when executing the computer program, implements the scene-adapted image generation method of any one of claims 1 to 6.
9. A non-transitory computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the scene-adapted image generation method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the scene-adapted image generation method of any one of claims 1 to 6.
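
As an illustration of claims 1, 5 and 6, the following minimal sketch builds a binary-mask layout image with the same size as the target image and then adds generated pixels to the target only inside the mask. It is not the applicant's implementation. The rectangular position range (standing in for the object outline), the hypothetical helpers `make_layout_mask` and `composite`, and the nested-list image representation are all assumptions made for illustration, and the generative step is replaced by a constant placeholder image.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixels (assumed representation)
Box = Tuple[int, int, int, int]  # (left, top, width, height); hypothetical position format


def make_layout_mask(width: int, height: int, box: Box) -> Image:
    """Return a binary mask of the target image's size: 1 inside the box, 0 elsewhere."""
    left, top, w, h = box
    mask = [[0] * width for _ in range(height)]
    # Clip the box to the image bounds so the mask keeps the target image's size.
    for y in range(max(top, 0), min(top + h, height)):
        for x in range(max(left, 0), min(left + w, width)):
            mask[y][x] = 1
    return mask


def composite(target: Image, generated: Image, mask: Image) -> Image:
    """Copy generated pixels into the target only where the mask is 1."""
    return [
        [g if m else t for t, g, m in zip(t_row, g_row, m_row)]
        for t_row, g_row, m_row in zip(target, generated, mask)
    ]


target = [[10] * 4 for _ in range(3)]     # 4x3 target image
generated = [[99] * 4 for _ in range(3)]  # placeholder for the generator's output
mask = make_layout_mask(4, 3, (1, 1, 2, 1))
result = composite(target, generated, mask)
```

In this sketch the mask acts as the spatial constraint: pixels outside the second object's position range are left exactly as they were in the target image.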

Description

Scene-adaptive image generation method, device, equipment and medium

Technical Field

The present application relates to the field of image technologies, and in particular to a scene-adapted image generation method, apparatus, device, and medium.

Background

Currently, mainstream data augmentation methods fall into two types. The first is simple image-level transformation; common means include rotating, cropping, color jittering, or stitching images. This approach achieves only a limited augmentation effect and struggles to meet a model's training requirement for complex scene samples. The second is copy-and-paste instance embedding, which attempts to enrich sample diversity by embedding new object instances into the original image. Its semantic logic is seriously deficient: because it lacks an understanding of the scene's high-level semantics, the category of a newly added object is often badly inconsistent with the scene context, for example a giraffe randomly embedded in an office scene.

Disclosure of Invention

The embodiments of the present application provide a scene-adapted image generation method that not only meets a model's training requirement for complex scene samples, but also avoids the unreasonable semantic logic of data augmentation techniques in the related art, so that the generated image has semantic logic consistent with common sense.
In a first aspect, an embodiment of the present application provides a scene-adapted image generation method, the method including: obtaining object information of a plurality of first objects in a target image; determining, based on the object information of the plurality of first objects, object information of a second object adapted to the scene in which the plurality of first objects are located; generating a layout image based on the object information of the second object, wherein the size of the layout image is the same as that of the target image, and the layout image comprises the outline of the second object and the position range of the second object; generating, within the position range of the second object contained in the layout image, a second object matched with the visual characteristics of the target image, taking the visual characteristics of the target image as a reference condition and the position range of the second object in the layout image as a spatial constraint condition; and adding the second object, which is generated in the layout image and matches the visual characteristics of the target image, to the target image to obtain the target image containing the second object.

In one embodiment, the acquiring object information of the plurality of first objects in the target image may specifically include: performing recognition on the target image to obtain object information of each object in the target image; determining a confidence of each object in the target image, where the confidence of each object characterizes the reliability of the object information of that object; and taking the object information of the N objects with the highest confidence as the object information of the first objects, where N is an integer greater than 1.

In one embodiment, the object information includes at least one of an object category, location information, or an object size.
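
The confidence-based selection of first objects described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `DetectedObject` record, the `select_first_objects` helper, the box format, and the sample detections are all hypothetical, and the recognition step that would produce the detections is assumed to have already run.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DetectedObject:
    """Object information for one recognized object (hypothetical structure)."""
    category: str
    box: Tuple[int, int, int, int]  # (left, top, width, height); assumed format
    confidence: float               # reliability of this object's information


def select_first_objects(objects: List[DetectedObject], n: int) -> List[DetectedObject]:
    """Keep the n objects with the highest confidence; the text requires n > 1."""
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    # sorted() is stable, so objects with equal confidence keep detection order.
    return sorted(objects, key=lambda o: o.confidence, reverse=True)[:n]


# Made-up detections for an office scene, used only to exercise the helper.
detections = [
    DetectedObject("desk", (40, 120, 200, 90), 0.95),
    DetectedObject("plant", (300, 60, 40, 80), 0.40),
    DetectedObject("chair", (90, 150, 70, 110), 0.90),
    DetectedObject("monitor", (100, 70, 80, 50), 0.85),
]
first_objects = select_first_objects(detections, n=3)
```

Discarding low-confidence detections before inferring the scene keeps unreliable object information from influencing which second object is chosen.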
In one embodiment, the object information includes location information, and determining object information of a second object adapted to a scene in which the plurality of first objects are located based on the object information of the plurality of first objects may include: determining spatial layout information of the plurality of first objects based on the location information of the plurality of first objects; and determining location information of the second object based on the spatial layout information of the plurality of first objects.

In one embodiment, the object information includes an object category, position information, and an object size, and generating the layout image based on the object information of the second object may specifically include: generating the layout image based on the object category, the object size, and the position information of the second object.

In one embodiment, the layout image is a binary mask image.

In a second aspect, an embodiment of the present application provides a scene-adapted image generation apparatus, the image generation apparatus including: an acquisition module, configured to acquire object information of a plurality of first objects in the target image; a determining module, configured to determine, based on the object information of the plurality of first objects, object information of a second object adapted to a scene in which the plurality of first objects are located; and a generating module, configured to generate a lay