EP-4738266-A2 - CROSS-DOMAIN CONTENT BLENDING
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for blending content from different domains. A method includes obtaining, from a given content provider, a set of text and a set of images that are designated for combination to create a digital component. A saliency model is applied to an electronic document to identify salient areas in the electronic document. A set of modifications that do not result in the salient areas in the electronic document being overlapped is constructed. Visual characteristics of the electronic document are determined. A request for content to integrate into the electronic document that is provided by a different domain than the digital component is received. Visual modifications are made to the digital component based on (i) the constructed set of modifications and (ii) the determined visual characteristics of the electronic document. The modified digital component is served.
Inventors
- FARRE GUIU, MIQUEL ANGEL
Assignees
- Google LLC
Dates
- Publication Date
- 2026-05-06
- Application Date
- 2023-01-06
Claims (13)
- A method, comprising: obtaining, by one or more processors and from a given content provider, a set of text and a set of images that are designated for combination to create a digital component; determining, by the one or more processors, a target zone in an electronic document available for presentation of the digital component; determining dimensions of the target zone; modifying a shape of the digital component to a target shape that is different from an original shape of the digital component, wherein the target shape is sized to fit within the target zone such that less than all of the target zone is occupied by the digital component having the target shape; selecting, by the one or more processors, fill content to occupy areas of the target zone that are not occupied by the digital component having the target shape, wherein selecting the fill content comprises using a deep learning model to generate content to fill the areas of the target zone not occupied by the digital component; visually modifying, by the one or more processors, the digital component by applying the target shape and inserting the selected fill content into the areas of the target zone that are not occupied by the digital component; and serving, by the one or more processors, the digital component to a client device as visually modified by the one or more processors.
- The method of claim 1, wherein visually modifying the digital component further comprises: determining, by the one or more processors and based on the determined dimensions of the target zone, a scaling factor for the digital component; and adjusting, by the one or more processors, a size of the digital component according to the scaling factor prior to applying the target shape.
- The method of any preceding claim, wherein the set of images comprises a first image having the original shape, and wherein modifying the shape of the digital component to the target shape comprises: identifying, by the one or more processors, a salient area within the first image; and cropping, by the one or more processors, the first image into the target shape such that the salient area is preserved within the target shape.
- The method of any preceding claim, wherein selecting the fill content comprises: determining, by the one or more processors, a set of visual characteristics of native content in the electronic document adjacent to the target zone; and configuring, by the one or more processors, the deep learning model to generate content that matches at least one visual characteristic from among the set of visual characteristics.
- The method of claim 4, wherein the set of visual characteristics comprises at least one of a color, a texture, or a pattern.
- The method of any preceding claim, wherein selecting the fill content comprises generating a visual transition between an edge of the digital component having the target shape and a boundary of the target zone.
- The method of claim 6, wherein generating the visual transition comprises applying a fading effect that transitions from a first color associated with the digital component to a second color associated with native content of the electronic document.
- The method of any preceding claim, wherein the target shape is selected from a set of available shapes based on a context of the electronic document.
- The method of any preceding claim, wherein the digital component comprises video content, the method further comprising: analyzing, by the one or more processors, native content of the electronic document to determine a semantic context; identifying, by the one or more processors, a specific playback location within a duration of the video content that corresponds to the semantic context; and initiating, by the one or more processors, playback of the video content at the identified specific playback location.
- The method of claim 1, further comprising: monitoring, by the one or more processors, a performance metric associated with the digital component as visually modified; and updating, by the one or more processors, the deep learning model based on the performance metric.
- The method of claim 1, wherein determining the target zone comprises applying a saliency model to the electronic document to identify areas that do not contain salient native content.
- A system, comprising: one or more memory devices; and one or more processors configured to access the one or more memory devices and execute instructions that cause the one or more processors to perform operations of the method recited by any one of claims 1-11.
- A computer readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising the method recited by any one of claims 1-11.
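The reshape-and-fill flow of claim 1 can be sketched in a few lines of Python. The names (`Zone`, `Component`, `fit_to_target_shape`) and the fixed 10% margin are illustrative assumptions rather than part of the claimed method, and the deep-learning fill step is reduced to computing the area the generative model would be asked to synthesize:

```python
from dataclasses import dataclass

@dataclass
class Zone:
    width: int
    height: int

@dataclass
class Component:
    width: int
    height: int

def fit_to_target_shape(component: Component, zone: Zone, margin: float = 0.1) -> Component:
    """Scale the component so it fits within the target zone while leaving
    a margin unoccupied (claim 1: less than all of the target zone is
    occupied by the digital component having the target shape)."""
    usable_w = zone.width * (1 - margin)
    usable_h = zone.height * (1 - margin)
    scale = min(usable_w / component.width, usable_h / component.height)
    return Component(int(component.width * scale), int(component.height * scale))

def fill_area(component: Component, zone: Zone) -> int:
    """Area of the target zone not occupied by the reshaped component,
    i.e. the region a generative fill model would be asked to populate."""
    return zone.width * zone.height - component.width * component.height
```

For example, a 400x400 component placed in a 300x250 zone is scaled to 225x225, leaving 24375 square pixels for generated fill content.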
Description
BACKGROUND

This specification relates to data processing and blending content from different domains into a combined visual presentation.

SUMMARY

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of obtaining, by one or more processors and from a given content provider, a set of text and a set of images that are designated for combination to create a digital component; applying, by the one or more processors and to an electronic document, a saliency model configured, using at least one machine learning training technique, to accept image data of an image as input and output locations of salient areas in the image; constructing, by the one or more processors, a set of modifications to the set of text or the set of images that do not result in the salient areas in the electronic document being overlapped by the set of text or the set of images; determining, by the one or more processors, visual characteristics of the electronic document; after performing the obtaining, applying, constructing, and determining: receiving, from a client device, a request for content to integrate into the electronic document that is provided to the client device by a different domain than the digital component; visually modifying, by the one or more processors and after receiving the request for content, at least one of the set of text or the set of images based on a combination of (i) the constructed set of modifications and (ii) the determined visual characteristics of the electronic document; and serving, by the one or more processors and in response to receiving the request for content, the digital component to the client device as visually modified by the one or more processors. Other embodiments of this aspect include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices.
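The acceptability test at the heart of the summary, namely that a modification must not cause the digital component to overlap salient areas, reduces to rectangle intersection once the saliency model has emitted bounding boxes. A minimal sketch, in which the box format and function names are assumptions rather than the specification's API:

```python
def boxes_overlap(a, b):
    """Axis-aligned rectangles given as (x, y, width, height) tuples."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def modification_is_acceptable(component_box, salient_boxes):
    """A candidate modification is kept only if the rendered digital
    component overlaps none of the salient areas reported by the
    saliency model for the electronic document."""
    return not any(boxes_overlap(component_box, s) for s in salient_boxes)
```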
These and other embodiments can each optionally include one or more of the following features. Constructing the set of modifications can include modifying a typography of the set of text according to a given set of modifications; rendering a revised digital component that includes at least one image from among the set of images and at least some text from among the set of text with the modified typography; performing one or more computer vision processes on a presentation of the revised digital component overlaid on the electronic document to determine whether the salient areas of the image are overlapped; and classifying the given set of modifications based on whether the salient areas of the image are overlapped. Visually modifying at least one of the set of text or the set of images based on a combination of (i) the constructed set of modifications and (ii) the determined visual characteristics of the electronic document can include modifying the set of text to match a target typography of the electronic document based on the target typography being classified as acceptable based on the salient areas of the images not being overlapped when modified to the target typography; and serving the digital component to the client device as visually modified by the one or more processors comprises serving the digital component including the target typography of the electronic document. Constructing the set of modifications can include: modifying a shape of the image to a target shape; rendering the revised digital component with the modified shape of the image; performing a set of computer vision processes on the revised digital component; and classifying the revised digital component based on a result of the set of computer vision processes performed on the revised digital component.
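The render-then-classify loop described above can be expressed generically. Here `render_and_check` stands in for the rendering and computer-vision steps and is an assumed interface, not something the specification defines:

```python
def classify_modifications(candidates, render_and_check):
    """Classify each candidate set of modifications as acceptable or
    rejected, depending on whether rendering the revised digital
    component over the electronic document causes any salient area
    to be overlapped (render_and_check returns True on overlap)."""
    return {
        name: "rejected" if render_and_check(mods) else "acceptable"
        for name, mods in candidates.items()
    }
```

In practice `render_and_check` would rasterize the revised component over the page and run the computer vision processes of the description; a stub predicate suffices to show the classification contract.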
Visually modifying at least one of the set of text or the set of images based on a combination of (i) the constructed set of modifications and (ii) the determined visual characteristics of the electronic document can include modifying the shape of the image to the target shape based on the set of computer vision processes indicating that the target shape of the image results in a pre-specified change in distance between the image and the content of the electronic document. Visually modifying at least one of the set of text or the set of images based on a combination of (i) the constructed set of modifications and (ii) the determined visual characteristics of the electronic document can include modifying the shape of the image to the target shape based on one or more characteristics of an intended audience for the digital component. The digital component can include video content. The method can further include determining, based on the content of the electronic document, a context of the electronic document; determining a location within the video content that corresponds to the context of the electronic document; and beginning playback of the video content at the location rather than a beginning of the video content.
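The context-driven playback feature described above, starting a video at the location that matches the document's context rather than at its beginning, can be approximated by keyword matching over labeled segments. The segment-list format and keyword-set interface are assumptions for illustration:

```python
def playback_start(context_keywords, labeled_segments):
    """Return the start time (in seconds) of the first video segment
    whose labels share a keyword with the electronic document's
    context, falling back to the beginning of the video."""
    for start_seconds, labels in labeled_segments:
        if context_keywords & set(labels):
            return start_seconds
    return 0.0
```

For a document about hiking, a video segmented as intro followed by a hiking-boots scene at 12.5 s would begin playback at 12.5 s.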