JP-7855566-B2 - Information processing device, information processing method, and program
Inventors
- 大▲高▼ 聡
Assignees
- キヤノン株式会社
Dates
- Publication Date
- 20260508
- Application Date
- 20231220
Claims (20)
- An information processing device comprising: an acquisition means for acquiring an input image; an extraction means for extracting objects from the input image; an identification means for identifying a missing portion of an object extracted by the extraction means based on the shape of the object's contour; and a generation means for generating a complementary image in which the identified missing portion of the input image is filled in, wherein, when the extraction means extracts a plurality of objects from the input image and the plurality of objects are parts of the same object, the identification means identifies the region between the plurality of objects constituting the same object as the missing portion, and the generation means generates the complementary image in which the identified missing portion of the input image is filled in.
- The information processing device according to claim 1, wherein, if the contour of the extracted object includes a straight-line portion, the identification means identifies the straight-line portion as the missing portion.
- The information processing device according to claim 2, wherein the identification means determines that a portion of the contour that coincides with an edge of the input image is the straight-line portion.
- The information processing device according to claim 3, wherein the straight-line portion is a portion for which the variation in pixel positions relative to a straight line calculated from the positions of the pixels corresponding to that portion is below a predetermined threshold.
- The information processing device according to claim 2, wherein the identification means sets a region outside the input image and adjacent to the straight-line portion as a complementary region for the input image.
- The information processing device according to claim 1, wherein the identification means determines that the plurality of objects extracted by the extraction means are parts of the same object if the missing portions of the plurality of objects face each other.
- The information processing device according to claim 1, wherein the generation means generates the complementary image when an object cropping process is performed on the input image.
- The information processing device according to claim 1, wherein the generation means generates the complementary image when the missing portion is within a predetermined area.
- The information processing device according to claim 1, wherein the generation means generates a plurality of different complementary images for a single input image.
- The information processing device according to claim 1, wherein the generation means acquires a text prompt and generates the complementary image based on the acquired text prompt.
- The information processing device according to claim 1, wherein the extraction means extracts the object from the complementary image generated by the generation means, the identification means identifies a missing portion of the object extracted from the complementary image, and the generation means generates a re-completed image in which the missing portion identified in the complementary image is filled in.
- The information processing device according to claim 1, further comprising a means for receiving user input, wherein the identification means determines whether or not to generate the complementary image in response to the user input.
- The information processing device according to claim 1, further comprising a means for receiving user input, wherein the identification means sets a complementary region based on the user input.
- The information processing device according to claim 1, wherein the generation means does not generate the complementary image if a plurality of input images are acquired by the acquisition means and another input image is placed over the missing portion.
- The information processing device according to claim 1, wherein the generation means generates a plurality of complementary images with different complementary regions.
- The information processing device according to claim 1, wherein the extraction means extracts objects having predetermined attributes.
- The information processing device according to claim 1, wherein the extraction means obtains user input specifying an object and sets the object to be extracted based on the user input.
- The information processing device according to claim 1, wherein the generation means generates the complementary image using a pre-trained generative model.
- The information processing device according to claim 1, wherein the generated complementary image is displayed on a layout data editing screen.
- The information processing device according to claim 19, wherein the layout data editing screen is a screen for editing layout data related to printed materials or documents.
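The claims above state the straight-line test and the out-of-image complementary region only in words. The following is a minimal Python sketch of one possible reading, not the method of the patent itself. It assumes the contour is given as a list of (x, y) pixel positions, measures "variation" as the RMS perpendicular distance from a total-least-squares line, and extends the complementary region outward by a fixed `margin`. The function names, `margin`, and `min_length` are hypothetical and do not appear in the source.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[int, int]          # (x, y) pixel position
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def line_deviation(points: List[Point]) -> float:
    """RMS perpendicular distance of the points from their best-fit line."""
    n = len(points)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # The smaller eigenvalue of the 2x2 covariance matrix equals the mean
    # squared perpendicular distance to the total-least-squares line.
    smallest = (sxx + syy) / 2 - math.hypot((sxx - syy) / 2, sxy)
    return math.sqrt(max(smallest, 0.0))


def is_straight(points: List[Point], threshold: float) -> bool:
    """True if the variation around the fitted line is below the threshold."""
    return len(points) >= 2 and line_deviation(points) < threshold


def edge_portions(contour: List[Point], width: int, height: int) -> Dict[str, List[Point]]:
    """Group the contour pixels that lie on each border of the input image."""
    sides: Dict[str, List[Point]] = {"left": [], "right": [], "top": [], "bottom": []}
    for x, y in contour:
        if x == 0:
            sides["left"].append((x, y))
        if x == width - 1:
            sides["right"].append((x, y))
        if y == 0:
            sides["top"].append((x, y))
        if y == height - 1:
            sides["bottom"].append((x, y))
    return sides


def missing_regions(contour: List[Point], width: int, height: int,
                    margin: int = 4, min_length: int = 3) -> List[Box]:
    """Boxes outside the image, adjacent to contour portions on the border."""
    regions: List[Box] = []
    for side, pts in edge_portions(contour, width, height).items():
        if len(pts) < min_length:
            continue  # assumed rule: too few border pixels to count as cut off
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        if side == "left":
            regions.append((-margin, min(ys), 0, max(ys) + 1))
        elif side == "right":
            regions.append((width, min(ys), width + margin, max(ys) + 1))
        elif side == "top":
            regions.append((min(xs), -margin, max(xs) + 1, 0))
        else:
            regions.append((min(xs), height, max(xs) + 1, height + margin))
    return regions


# Example: an object cut off by the right border of a 10 x 8 image.
example_contour = [(5, 1), (9, 2), (9, 3), (9, 4), (9, 5), (5, 6), (4, 3)]
example_regions = missing_regions(example_contour, width=10, height=8)
```

Pixels that lie on an image border share one coordinate, so their deviation is always zero. In this sketch `is_straight` is therefore only informative for contour runs inside the image, for example an object that was already cropped before being placed. The returned boxes use input-image coordinates, so negative values or values beyond the width or height denote the area that a generative model would be asked to fill.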
Description
This invention relates to an information processing technique for filling in missing portions of an input image.

In recent years, services have emerged that support the creation of posters and flyers, allowing anyone to easily produce high-quality results by editing layouts chosen from a large number of pre-prepared templates. When a user tries to use an image they own within a selected template, parts of an object in that image may be missing. If the missing parts are essential, the user has to prepare a different image suited to the intended purpose.

Meanwhile, generative AI technology has recently made it possible to generate the necessary content. When a user supplies images or text as an input prompt to a generative model, the model can generate text, images, videos, and so on that have a high probability of fitting the "context" represented by the input prompt. Using this technology, users can easily obtain images in which the missing parts are filled in (complementary images). However, the user must specify which part of the image to fill in.

Patent Document 1 discloses a technology for automatically correcting image data for forming an image of a document based on ground truth data. It compares a scanned image of the document with ground truth data containing multiple lines of different thicknesses and orientations, detects missing parts such as broken lines and changes in line width in the scanned image, and automatically corrects the image data for forming the image of the document.

Patent Document 1: Japanese Patent Publication No. 2008-250046

The drawings show:
- An example of a system configuration for a layout data editing device targeting an image output device in this embodiment.
- An example of the hardware configuration of the image output device in this embodiment.
- An example of the hardware configuration of the client PC and server in this embodiment.
- An example of a functional block for a printing system targeting the image output device in this embodiment.
- An example of image generation processing using a generative model.
- An example of a layout data database in this embodiment.
- An example of the layout data editing screen on a client PC in this embodiment.
- An example of a processing flow for generating complementary images when editing layout data in Embodiment 1.
- An example of image completion processing using a generative model.
- An example of a processing flow for generating complementary images when cropping content in Embodiment 2.
- An example of content placement.
- An example of a processing flow for determining whether or not to generate a complementary image based on the placement position of the content in Embodiment 3.
- An example of a processing flow for generating multiple complementary images in Embodiment 4.
- An example of a processing flow for generating a complementary image using a text prompt in Embodiment 5.
- An example of a processing flow for filling in the missing parts of the generated complementary image in Embodiment 6.
- An example of a processing flow for receiving a user's request to generate a complementary image in Embodiment 7.
- An example where the extracted object is truncated.
- An example of a processing flow in Embodiment 8 where, when the same object is extracted as multiple objects during content cropping, a complementary image is generated with the area between those objects as the complementary region.
- An example of a processing flow for receiving a complementary region from the user to generate a complementary image in Embodiment 9.
- An example of a dialog box that allows the user to set the complementary region for generating the complementary image.
- An example of content overlap.
- An example of a processing flow for determining whether or not to generate a complementary image based on the content overlap state in Embodiment 10.
- An example of a processing flow for generating multiple complementary images with different complementary regions in Embodiment 11.

The following describes preferred embodiments of the present invention in detail with reference to the attached drawings. Note that the following embodiments do not limit the scope of the present invention as defined in the claims, and not all combinations of features described in these embodiments are necessarily essential to the solution of the present invention.

<Embodiment 1>

First, the information processing system according to this embodiment will be described. It is a printing system that involves editing layout data for an image output device. In the printing system, an externally connected PC performs layout data editing and sends print jobs to the image output device. When a print job is generated, the print settings are edited on the PC screen as needed. Figure 1 shows an example of the system configuration in the network environment of this system. As shown in Figure 1, the client PC 102 can connect to the server 104 and im