EP-4204992-B1 - AUTOMATIC GENERATION OF EVENTS USING A MACHINE-LEARNING MODEL

Inventors

BOHL, Kristina
AGAJANIAN, Joe
BERG, Lily
POTETZ, Brian
MOSLEY, Keegan
CHENG, Shinko

Dates

Publication Date: 20260506
Application Date: 20211213

Claims (11)

A computer-implemented method comprising:
    segmenting (902) a library of media associated with a user account into episodes, wherein each episode is associated with a corresponding time period;
    generating (904), using an event machine-learning model, an event signal that indicates a likelihood that an event occurred in each episode, wherein the event machine-learning model is a classifier that receives the media as input, wherein the event machine-learning model includes a plurality of trained models that are each applicable to same input data, wherein the event machine-learning model executes the plurality of the trained models and applies a time threshold with regard to the plurality of the trained models by utilizing outputs of the trained models of the plurality of the trained models that are available within the time threshold, and discarding outputs that are not available within the time threshold;
    generating (906), using a scoring machine-learning model, an event significance score for each episode based on one or more selected from the group of at least a threshold number of media items in a corresponding episode, at least a threshold number of face clusters in the corresponding episode, a quality indicator of the media items in the corresponding episode, at least one face cluster of a threshold rank, or a presence of a rare face cluster;
    determining (908) one or more events from the episodes based on the event signal and a corresponding event significance score exceeding a threshold event significance value; and
    providing (910) a user interface that includes an event item with corresponding media from a particular event of the one or more events;
    wherein the user interface is part of a reverse-chronological grid of media that includes the corresponding media, and the method further comprises:
        responsive to a user selecting the event item from the reverse-chronological grid, replacing the reverse-chronological grid with a display of the corresponding media from the particular event for a predetermined time duration; and
        responsive to completing the display of the corresponding media, displaying the reverse-chronological grid;
    wherein the reverse-chronological grid displays the corresponding media in a reverse-chronological order;
    wherein the user interface includes an option to remove a media item from the event item and, if the media item is removed from the event item, a feedback screen is displayed requesting feedback about why the media item is removed from the event item, wherein the feedback is transmitted to the event machine-learning model for using the feedback to modify parameters of the event machine-learning model.
The method of claim 1, further comprising: receiving a request from a user to remove depictions of a person from the media in the library of media that include depictions of the person; and filtering the depictions of the person from the media, wherein the filtering is performed before the event significance score is generated.
The method of claim 1, wherein the user interface includes: an option to hide or an option to add one or more selected from the group of an additional media item from the one or more events, a date associated with the one or more events, a person or pet depicted in media items from the one or more events, an option to edit the corresponding media from the particular event; or an option to change a title of the event item.
The method of claim 1, wherein: the method further comprises: determining computations to be carried out on individual devices to optimize computations; and implementing the event machine-learning model on multiple devices based on the computations to be carried out on the individual devices; or the method further comprises combining multiple episodes into a single event based on one or more selected from the group of the multiple episodes being associated with respective time periods that all fall within a 24-hour period, the single event being a type of event that occurs over multiple days, celebrations on different days that relate to the single event, the multiple episodes involving a same location, the multiple episodes involving a same set of face clusters,
The method of claim 1, further comprising: generating a confidence score that indicates a likelihood that a corresponding event is a type of event that is accurately recognized; and responsive to the confidence score meeting a threshold confidence value, adding an automatically generated title descriptive of the type of event to the corresponding event.
The method of claim 5, wherein: the method further comprises: responsive to the confidence score failing to meet the threshold confidence value, adding the title to the corresponding event based on a template phrase; or a title machine-learning model receives the corresponding media from the one or more events as input and the title machine-learning model generates a title as output.
The method of claim 1, further comprising: determining the one or more events comprises determining the events such that a number of the events is less than or equal to a predetermined number each month; receiving new media to associate with the library of media; and replacing the particular event of the one or more events with a new event, responsive to the new event being associated with a new event significance score that is higher than the event significance score for the particular event.
The method of claim 1, further comprising generating audio for the corresponding media that is based on a type of the particular event.
A computing device comprising:
    a processor; and
    a memory coupled to the processor, with instructions stored thereon that, when executed by the processor, cause the processor to perform operations comprising:
        segmenting (902) a library of media associated with a user account into episodes, wherein each episode is associated with a corresponding time period;
        generating (904), using an event machine-learning model, an event signal that indicates a likelihood that an event occurred in each episode, wherein the event machine-learning model is a classifier that receives the media as input, wherein the event machine-learning model includes a plurality of trained models that are each applicable to same input data, wherein the event machine-learning model is configured to execute the plurality of the trained models and apply a time threshold with regard to the plurality of the trained models by utilizing outputs of the trained models of the plurality of the trained models that are available within the time threshold, and discarding outputs that are not available within the time threshold;
        generating (906), using a scoring machine-learning model, an event significance score for each episode based on one or more selected from the group of at least a threshold number of media items in a corresponding episode, at least a threshold number of face clusters in the corresponding episode, a quality indicator of the media items in the corresponding episode, at least one face cluster of a threshold rank, or a presence of a rare face cluster;
        determining (908) one or more events from the episodes based on the event signal and a corresponding event significance score exceeding a threshold event significance value; and
        providing (910) a user interface that includes an event item with corresponding media from a particular event of the one or more events;
    wherein the user interface is part of a reverse-chronological grid of media that includes the corresponding media and the operations further comprise:
        responsive to a user selecting the event item from the reverse-chronological grid, replacing the reverse-chronological grid with a display of the corresponding media from the particular event for a predetermined time duration; and
        responsive to completing the display of the corresponding media, displaying the reverse-chronological grid;
    wherein the reverse-chronological grid displays the corresponding media in a reverse-chronological order;
    wherein the user interface includes an option to remove a media item from the event item and, if the media item is removed from the event item, a feedback screen is displayed requesting feedback about why the media item is removed from the event item, wherein the feedback is transmitted to the event machine-learning model for using the feedback to modify parameters of the event machine-learning model.
The computing device of claim 9, wherein the event item is displayed with a size that is based on one or more selected from the group of the event significance score, a number of media items for the particular event, a total number of events in a period of time, an event type,
A non-transitory computer-readable medium with instructions stored thereon that, when executed by one or more computers, cause the one or more computers to perform operations, the operations comprising:
    segmenting (902) a library of media associated with a user account into episodes, wherein each episode is associated with a corresponding time period;
    generating (904), using an event machine-learning model, an event signal that indicates a likelihood that an event occurred in each episode, wherein the event machine-learning model is a classifier that receives the media as input, wherein the event machine-learning model includes a plurality of trained models that are each applicable to same input data, wherein the event machine-learning model executes the plurality of the trained models and applies a time threshold with regard to the plurality of the trained models by utilizing outputs of the trained models of the plurality of the trained models that are available within the time threshold, and discarding outputs that are not available within the time threshold;
    generating (906), using a scoring machine-learning model, an event significance score for each episode based on one or more selected from the group of at least a threshold number of media items in a corresponding episode, at least a threshold number of face clusters in the corresponding episode, a quality indicator of the media items in the corresponding episode, at least one face cluster of a threshold rank, or a presence of a rare face cluster;
    determining (908) one or more events from the episodes based on the event signal and a corresponding event significance score exceeding a threshold event significance value; and
    providing (910) a user interface that includes an event item with corresponding media from a particular event of the one or more events;
    wherein the user interface is part of a reverse-chronological grid of media that includes the corresponding media and the operations further comprise:
        responsive to a user selecting the event item from the reverse-chronological grid, replacing the reverse-chronological grid with a display of the corresponding media from the particular event for a predetermined time duration; and
        responsive to completing the display of the corresponding media, displaying the reverse-chronological grid;
    wherein the reverse-chronological grid displays the corresponding media in a reverse-chronological order;
    wherein the user interface includes an option to remove a media item from the event item and, if the media item is removed from the event item, a feedback screen is displayed requesting feedback about why the media item is removed from the event item, wherein the feedback is transmitted to the event machine-learning model for using the feedback to modify parameters of the event machine-learning model.
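
By way of non-limiting illustration only, steps 902 through 908 recited in the independent claims might be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the names `Episode`, `segment_into_episodes`, `event_signal`, `significance_score`, and `determine_events`, the one-hour gap rule, the averaging of model outputs, the `signal_min` cutoff, and the two stand-in models are all hypothetical and do not appear in the claims. The point of interest is `event_signal`, which runs several models on the same input and uses only the outputs that are available within the time threshold.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Episode:
    """Media items (reduced here to timestamps in seconds) from one time period."""
    timestamps: list
    face_clusters: set = field(default_factory=set)


def segment_into_episodes(timestamps, max_gap_s=3600):
    """Step 902 (assumed rule): start a new episode when the gap exceeds max_gap_s."""
    episodes = []
    for t in sorted(timestamps):
        if episodes and t - episodes[-1].timestamps[-1] <= max_gap_s:
            episodes[-1].timestamps.append(t)
        else:
            episodes.append(Episode([t]))
    return episodes


def event_signal(models, episode, time_threshold_s):
    """Step 904: run every model on the same input; use only timely outputs."""
    pool = ThreadPoolExecutor(max_workers=len(models))
    futures = [pool.submit(model, episode) for model in models]
    done, _late = wait(futures, timeout=time_threshold_s)
    pool.shutdown(wait=False)  # outputs that miss the threshold are discarded
    outputs = [f.result() for f in done if f.exception() is None]
    # Averaging is an assumption; the claims do not say how outputs are combined.
    return mean(outputs) if outputs else 0.0


def significance_score(episode, min_items=3, min_faces=1):
    """Step 906 (toy stand-in for the scoring model): fraction of criteria met."""
    criteria = [
        len(episode.timestamps) >= min_items,
        len(episode.face_clusters) >= min_faces,
    ]
    return sum(criteria) / len(criteria)


def determine_events(episodes, models, time_threshold_s=0.1,
                     signal_min=0.5, significance_min=0.5):
    """Step 908: keep episodes with a strong signal and a score above threshold."""
    return [
        ep for ep in episodes
        if event_signal(models, ep, time_threshold_s) >= signal_min
        and significance_score(ep) > significance_min
    ]


# Hypothetical stand-in models: one answers at once, one misses the threshold.
def fast_model(episode):
    return 0.9


def slow_model(episode):
    time.sleep(0.4)
    return 0.0


episodes = segment_into_episodes([0, 60, 120, 100_000])
episodes[0].face_clusters = {"cluster_a"}
events = determine_events(episodes, [fast_model, slow_model])
```

In this sketch the output of `slow_model` is never read because it is not available within the 0.1-second threshold, so the event signal for each episode rests on `fast_model` alone. The single-item second episode is then rejected by the significance threshold rather than by the event signal.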

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority to U.S. Patent Application No. 17/404,773, filed August 17, 2021 and titled "Automatic Generation of Events Using a Machine-Learning Model," which claims priority both to U.S. Provisional Patent Application No. 63/187,392, filed May 11, 2021 and titled "Automatic Generation of Events Using a Machine-Learning Model," and to U.S. Provisional Patent Application No. 63/189,657, filed May 17, 2021 and titled "Automatic Generation of Events Using a Machine-Learning Model."

BACKGROUND

Users of devices, such as smartphones or other digital cameras, capture and store a large amount of media (e.g., photos and videos) in their libraries. Users access the libraries to view their media to reminisce about various events, such as birthdays, weddings, vacations, trips, etc. However, the libraries often have thousands of images taken over a long period of time that are difficult to organize.

US 2021/103611 A1 describes a method for organization of digital media in which a digital media gallery is organized based on underlying events or occasions by leveraging content tags associated with media.

US 2007/294716 A1 describes a method for detecting a real time event in a sports video, the method including testing a confidence of an online model, calculated in a sports video stream, detecting an event by using an offline model in the sports video stream, when the confidence of the online model does not meet a threshold, training the online model through an event detected by using the offline model, and detecting an event by using the online model in the sports video stream, when the confidence of the online model meets the threshold.

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

SUMMARY

The invention is defined by the independent claims. Dependent claims specify embodiments thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of an example network environment, according to some embodiments described herein.
Figure 2 is a block diagram of an example computing device, according to some embodiments described herein.
Figure 3 illustrates example titles based on confidence scores that indicate a likelihood that a corresponding event is a type of event that is accurately recognized, according to some embodiments.
Figure 4 illustrates an example reverse-chronological grid of media, according to some embodiments.
Figure 5 illustrates an example user interface for removing media items from events, according to some embodiments.
Figure 6 illustrates an example user interface with the option to remove a media item from an event item and provide feedback, according to some embodiments.
Figure 7 illustrates an example user interface with options for editing a title, removing an event, changing a size of the event in a reverse-chronological grid, and changing an importance of the event in the reverse-chronological grid, according to some embodiments.
Figures 8A-8B illustrate an example block diagram that shows different examples of reorganizing of different events based on changes, according to some embodiments.
Figure 9 is a flow diagram illustrating an example method for displaying an event item, according to some embodiments.

DETAILED DESCRIPTION

Network Environment 100

Figure 1 illustrates a block diagram of an example environment 100.
In some embodiments, the environment 100 includes a media server 101, a user device 115a, a user device 115n, and a network 105. Users 125a, 125n may be associated with respective user devices 115a, 115n. In some embodiments, the environment 100 may include other servers or devices not shown in Figure 1 or the media server 101 may not be included. In Figure 1 and the remaining figures, a letter after a reference number, e.g., "115a," represents a reference to the element having that particular reference number. A reference number in the text without a following letter, e.g., "115," represents a general reference to embodiments of the element bearing that reference number.

The media server 101 may include a processor, a memory, and network communication hardware. In some embodiments, the media server 101 is a hardware server. The media server 101 is communicatively coupled to the network 105 via signal line 102. Signal line 102 may be a wired connection, such as Ethernet, coaxial cable, fiber-optic cable, etc., or a wireless connection, such as Wi-Fi®, Bluetooth®, or other wireless technology. In some embodiments, the media server 101 sends and receives data to and from one or more of the user devices 115a, 115n via the network 105. The media server 101 may include a m