EP-4742058-A1 - IMAGE GENERATION METHOD AND DEVICE, INTELLIGENT AGENT, INTELLIGENT AGENT SYSTEM AND STORAGE MEDIUM

Abstract

The invention provides an image generation method and apparatus, an intelligent agent, an intelligent agent system and a storage medium, relates to the field of artificial intelligence technology, in particular to the field of computer vision, deep learning and large language models, and can be applied to an artificial intelligence generated content scene. The specific implementation solution includes: obtaining (101, 201, 301, 501, 601, 701) image generation requirement information; determining (102, 202, 502, 602, 702) a target image generation manner according to the image generation requirement information; querying (103, 305, 503) a first reference image based on the image generation requirement information; and based on the image generation requirement information and the first reference image, generating (104, 204, 306, 605) a target image using the target image generation manner.
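The four steps named in the abstract can be read as a simple pipeline. The following is a minimal sketch of that control flow only, under stated assumptions: the `Requirement` class, the function names, the keyword rules, and the dictionary-based image library are all hypothetical stand-ins introduced for illustration, and the string returned by `generate` takes the place of a real image generation model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Requirement:
    """Image generation requirement information (hypothetical container)."""
    text: str
    user_image: Optional[str] = None  # optional user-supplied reference image


def determine_manner(req: Requirement) -> str:
    """Step 2: choose a generation manner.

    Assumption: a keyword rule stands in for whatever model makes this decision.
    """
    return "modify_subject" if "change" in req.text.lower() else "keep_subject"


def query_reference(req: Requirement, library: Dict[str, str]) -> Optional[str]:
    """Step 3: return the first library image whose subject key appears in the text."""
    lowered = req.text.lower()
    for subject, image_id in library.items():
        if subject in lowered:
            return image_id
    return None


def generate(req: Requirement, manner: str, reference: Optional[str]) -> str:
    """Step 4: placeholder for an image generation model; returns a description string."""
    return f"image[{manner}] from '{req.text}' with reference {reference}"


def run_pipeline(text: str, library: Dict[str, str]) -> str:
    req = Requirement(text=text)               # step 1: obtain requirement information
    manner = determine_manner(req)             # step 2: determine generation manner
    reference = query_reference(req, library)  # step 3: query first reference image
    return generate(req, manner, reference)    # step 4: generate target image


if __name__ == "__main__":
    print(run_pipeline("A panda on the beach", {"panda": "img_001"}))
```

Keeping the query step separate from generation is what would let the reference library be refreshed without retraining the generator, which is the timeliness concern the description raises.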

Inventors

  • WANG, HAIFENG
  • LIU, JIACHEN
  • XIAO, XINYAN
  • DING, ERRUI
  • WANG, JINGDONG
  • WU, HUA
  • WU, TIAN

Assignees

  • Beijing Baidu Netcom Science Technology Co., Ltd.

Dates

Publication Date
2026-05-13
Application Date
2025-06-18

Claims (15)

  1. An image generation method, comprising: obtaining (101, 201, 301, 501, 601, 701) image generation requirement information; determining (102, 202, 502, 602, 702) a target image generation manner according to the image generation requirement information; querying (103, 305, 503) a first reference image based on the image generation requirement information; and generating (104, 204, 306, 605) a target image using the target image generation manner based on the image generation requirement information and the first reference image.
  2. An image generation method, comprising: obtaining (101, 201, 301, 501, 601) image generation requirement information; determining (102, 202, 502, 602) a target image generation manner according to the image generation requirement information; determining (603) whether a reference image query needs to be performed for the image generation requirement information; querying (604, 706) a first reference image based on the image generation requirement information, in a case where the reference image query needs to be performed for the image generation requirement information; and generating (104, 204, 306, 605, 707) a target image using the target image generation manner based on the image generation requirement information and the first reference image.
  3. The method of claim 2, wherein determining (603) whether the reference image query needs to be performed for the image generation requirement information comprises: obtaining (703) requirement text in the image generation requirement information; performing (704) requirement understanding on the requirement text to obtain a requirement type corresponding to the image generation requirement information; and determining (705), based on the requirement type, whether the reference image query needs to be performed for the image generation requirement information.
  4. The method of claim 2 or 3, further comprising: generating (708) the target image using the target image generation manner based on the image generation requirement information, in a case where the reference image query does not need to be performed for the image generation requirement information.
  5. The method of any of claims 1-4, wherein querying (103, 305, 503, 604) the first reference image based on the image generation requirement information comprises: obtaining (203) image main-body information comprised in the image generation requirement information, and performing an image query in a preset image library based on the image main-body information, to obtain the first reference image.
  6. The method of claim 5, wherein performing the image query in the preset image library based on the image main-body information, to obtain the first reference image, comprises: performing the image query in the preset image library based on the image main-body information, to obtain candidate reference images; obtaining an image quality of each candidate reference image, and obtaining a set quality requirement corresponding to the image generation requirement information; and screening the candidate reference images according to the image quality and the set quality requirement to obtain the first reference image.
  7. The method of any of claims 1-6, wherein determining (102, 202, 502, 602) the target image generation manner according to the image generation requirement information comprises: obtaining (302) requirement text in the image generation requirement information; inputting (303) the requirement text into a first large model to perform main-body modification intention detection, to obtain a target main-body modification intention corresponding to the image generation requirement information, wherein the target main-body modification intention is used to indicate whether a main-body in the first reference image needs to be modified; and obtaining (304) the target image generation manner corresponding to the target main-body modification intention based on a mapping relationship between main-body modification intentions and image generation manners.
  8. The method of claim 7, wherein the requirement text corresponds to a text feature, and generating the target image using the target image generation manner based on the image generation requirement information and the first reference image comprises: in response to the target main-body modification intention indicating that the main-body in the first reference image does not need to be modified, performing feature extraction on the first reference image to obtain a first image feature, and obtaining a main-body segmentation image in the first reference image; inputting the first image feature and the text feature into a first image generation model to obtain a background image and main-body layout information; and fusing the main-body segmentation image and the background image according to the main-body layout information to obtain the target image; or in response to the target main-body modification intention indicating that the main-body in the first reference image needs to be modified, performing feature extraction on the first reference image to obtain a second image feature; splicing the text feature and the second image feature in an interleaved manner based on descriptive objects corresponding to sub-features in the text feature and the second image feature, to obtain an image-text interleaved feature; and inputting the image-text interleaved feature into a second image generation model to obtain the target image.
  9. The method of claim 8, wherein the image generation requirement information comprises a second reference image input by a user, and splicing the text feature and the second image feature in the interleaved manner based on the descriptive objects corresponding to the sub-features in the text feature and the second image feature, to obtain the image-text interleaved feature, comprises: performing feature extraction on the second reference image to obtain a third image feature; and splicing the text feature, the second image feature and the third image feature in the interleaved manner based on the descriptive objects corresponding to the sub-features in the text feature, the second image feature and the third image feature, to obtain the image-text interleaved feature.
  10. The method of any of claims 1-9, wherein generating (104, 204, 306, 605) the target image using the target image generation manner based on the image generation requirement information and the first reference image comprises: obtaining (504) requirement text in the image generation requirement information, rewriting and expanding the requirement text to obtain target requirement text, and generating the target image using the target image generation manner based on the target requirement text and the first reference image.
  11. The method of claim 10, wherein rewriting and expanding the requirement text to obtain the target requirement text comprises: obtaining context information of the image generation requirement information; in a case where the image generation requirement information comprises a second reference image input by a user, obtaining descriptive information of the second reference image; and inputting the requirement text and at least one of the context information or the descriptive information into a second large model for rewriting and expanding, to obtain the target requirement text.
  12. An image generation apparatus, comprising: an obtaining module (801), configured to obtain image generation requirement information; a determining module (802), configured to determine a target image generation manner according to the image generation requirement information; a querying module (803), configured to query a first reference image based on the image generation requirement information; and a generating module (804), configured to generate a target image using the target image generation manner based on the image generation requirement information and the first reference image.
  13. An image generation apparatus, comprising: an obtaining module (901), configured to obtain image generation requirement information; a determining module (902), configured to determine a target image generation manner according to the image generation requirement information; a judging module (903), configured to determine whether a reference image query needs to be performed for the image generation requirement information; a querying module (904), configured to query a first reference image based on the image generation requirement information, in a case where the reference image query needs to be performed for the image generation requirement information; and a generating module (905), configured to generate a target image using the target image generation manner based on the image generation requirement information and the first reference image.
  14. An intelligent agent system, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1-11.
  15. A computer-readable storage medium for storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method of any one of claims 1-11.
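
The conditional query of claims 2 to 4 and the quality screening of claim 6 can be illustrated with a short sketch. Everything named here is an assumption made for illustration: the requirement types, the `Candidate` class, the keyword rule that stands in for requirement understanding, the numeric quality score, and the choice to return the highest-scoring candidate that passes the threshold. The claims themselves do not specify any of these details.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical requirement types for which a reference image query is assumed useful.
TYPES_NEEDING_REFERENCE = {"recent_event", "real_entity"}


@dataclass
class Candidate:
    """A candidate reference image with an assumed quality score in [0, 1]."""
    image_id: str
    quality: float


def understand_requirement(text: str) -> str:
    """Map requirement text to a requirement type (claim 3).

    Assumption: a keyword rule replaces the actual requirement understanding step.
    """
    return "recent_event" if "latest" in text.lower() else "generic_scene"


def needs_reference_query(text: str) -> bool:
    """Decide from the requirement type whether to query a reference image."""
    return understand_requirement(text) in TYPES_NEEDING_REFERENCE


def screen_candidates(candidates: List[Candidate], min_quality: float) -> Optional[Candidate]:
    """Keep candidates that meet the set quality requirement (claim 6).

    Assumption: among passing candidates, the highest-quality one is returned.
    """
    passing = [c for c in candidates if c.quality >= min_quality]
    return max(passing, key=lambda c: c.quality) if passing else None


def select_reference(text: str, candidates: List[Candidate], min_quality: float) -> Optional[Candidate]:
    """Return a reference image only when the query is needed; otherwise None (claim 4 path)."""
    if not needs_reference_query(text):
        return None
    return screen_candidates(candidates, min_quality)
```

When `select_reference` returns `None`, a caller would generate from the requirement information alone, which corresponds to the branch described in claim 4.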

Description

TECHNICAL FIELD

The present invention relates to the field of artificial intelligence technology, specifically to the fields of computer vision, deep learning and large language models, and can be applied to artificial intelligence generated content (AIGC) scenarios. In particular, it relates to an image generation method and apparatus, an intelligent agent, an intelligent agent system and a storage medium.

BACKGROUND

The core of artificial intelligence (AI) image generation technology based on artificial intelligence generated content (AIGC) is the use of an AI image generation model to perform text-to-image or image-to-image conversion. However, the image generation effect of this technology depends heavily on the quality and timeliness of the model training data. Specifically, once the AI image generation model has been trained, the image content it generates is limited by the timeliness of that training data. Until the training data is updated, the model cannot obtain the latest content, so its generated images may lag behind.

SUMMARY

The present invention proposes an image generation method and apparatus, an intelligent agent, an intelligent agent system and a storage medium.

According to a first aspect of the present invention, an image generation method is provided, including: obtaining image generation requirement information; determining a target image generation manner according to the image generation requirement information; querying a first reference image based on the image generation requirement information; and generating a target image using the target image generation manner based on the image generation requirement information and the first reference image.

According to a second aspect of the present invention, an image generation method is provided, including: obtaining image generation requirement information; determining a target image generation manner according to the image generation requirement information; determining whether a reference image query needs to be performed for the image generation requirement information; querying a first reference image based on the image generation requirement information, in a case where the reference image query needs to be performed for the image generation requirement information; and generating a target image using the target image generation manner based on the image generation requirement information and the first reference image.

According to a third aspect of the present invention, an image generation apparatus is provided. The apparatus includes an obtaining module, configured to obtain image generation requirement information; a determining module, configured to determine a target image generation manner according to the image generation requirement information; a querying module, configured to query a first reference image based on the image generation requirement information; and a generating module, configured to generate a target image using the target image generation manner based on the image generation requirement information and the first reference image.

According to a fourth aspect of the present invention, an image generation apparatus is provided. The apparatus includes an obtaining module, configured to obtain image generation requirement information; a determining module, configured to determine a target image generation manner according to the image generation requirement information; a judging module, configured to determine whether a reference image query needs to be performed for the image generation requirement information; a querying module, configured to query a first reference image based on the image generation requirement information, in a case where the reference image query needs to be performed for the image generation requirement information; and a generating module, configured to generate a target image using the target image generation manner based on the image generation requirement information and the first reference image.

According to a fifth aspect of the present invention, an intelligent agent is provided, including an inputting module, configured to obtain image generation requirement information; a processing module, configured to determine a target image generation manner according to the image generation requirement information, query a first reference image based on the image generation requirement information, and generate a target image using the target image generation manner based on the image generation requirement information and the first reference image; and an outputting module, configured to output the target image.

According to a sixth aspect of the present invention, an intelligent agent is provided, including an inputting module, configured to obtain image generation requirement information; a processing module, configured to determine a target image generation manner according to the image generation requirement information, determin