US-12621530-B2 - Augmented display from conversational monitoring
Abstract
Systems and methods are provided for generating for display an indication of a segment of media content relevant to a voice communication. This may be accomplished by a media guidance application that monitors a voice communication between users. The media guidance application determines that a first user is describing media content. In response to determining that the first user is describing the media content, the media guidance application retrieves media asset viewing history of the first user. The media guidance application determines, based on metadata of each media asset in the media asset viewing history of the first user and the voice communication, a media asset that the first user is describing. The media guidance application determines, based on metadata of the media asset, a segment of the media asset that the first user is describing. The media guidance application generates, for display, an indication of the segment.
Inventors
- Michael K. McCarty
- Glen E. Roe
Assignees
- ADEIA GUIDES INC.
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2024-10-09
Claims (20)
- 1 . A method comprising: detecting, at a user device, a plurality of words communicated between a first user and a second user; determining, based on comparing the plurality of words with a plurality of keywords indicating that media content is being described, that the first user is describing a media content; retrieving metadata for each of a plurality of media assets that the first user previously consumed at least in part; comparing the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words; determining, based on comparing the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words, a media asset that the first user is describing; and generating, for display at a display device, an indication of the media asset.
- 2 . The method of claim 1 , wherein the media asset relates to an event and wherein the indication comprises information related to the event.
- 3 . The method of claim 1 further comprising: retrieving metadata for each of a plurality of segments of the media asset; and determining, based on the metadata for each of the plurality of segments and the plurality of words, a segment of the media asset that the first user is describing; wherein the indication comprises the segment.
- 4 . The method of claim 1 , wherein the user device is connected to a wireless network, and wherein the display device is connected to the wireless network.
- 5 . The method of claim 1 , wherein the indication of the media asset is automatically displayed at the display device.
- 6 . The method of claim 1 , further comprising: generating, at the user device or the display device, a prompt related to confirming the media asset; and receiving an input from at least one of the first user or the second user confirming that the plurality of words communicated between the first user and the second user describe the media content.
- 7 . The method of claim 1 , wherein comparing the plurality of words with the plurality of keywords comprises: selecting a first word of the plurality of words; determining, based on comparing the first word of the plurality of words with each keyword of the plurality of keywords, whether the first word matches any of the plurality of keywords; and based on determining that the first word matches a keyword from the plurality of keywords, updating a word matching score.
- 8 . The method of claim 7 , wherein determining, based on comparing the plurality of words with the plurality of keywords, that the first user is describing the media content comprises: determining whether the word matching score is greater than a threshold value; and based on determining that the word matching score is greater than the threshold value, determining that the first user is describing the media content.
- 9 . The method of claim 7 , wherein updating the word matching score comprises: retrieving a weight associated with the keyword; and updating the word matching score with the weight associated with the keyword.
- 10 . The method of claim 1 , wherein determining, based on comparing the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words, the media asset that the first user is describing comprises: determining, for each of the plurality of media assets, an amount of words of the plurality of words that match a corresponding media asset; and determining the media asset that the first user is describing based on the determined amount of words for each corresponding media asset.
- 11 . A system comprising: communication circuitry; and control circuitry configured to: detect, at a user device, a plurality of words communicated between a first user and a second user; determine, based on comparing the plurality of words with a plurality of keywords indicating that media content is being described, that the first user is describing a media content; retrieve metadata for each of a plurality of media assets that the first user previously consumed at least in part; compare the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words; determine, based on comparing the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words, a media asset that the first user is describing; and generate, for display at a display device, an indication of the media asset.
- 12 . The system of claim 11 , wherein the media asset relates to an event and wherein the indication comprises information related to the event.
- 13. The system of claim 11, wherein the control circuitry is further configured to: retrieve metadata for each of a plurality of segments of the media asset; and determine, based on the metadata for each of the plurality of segments and the plurality of words, a segment of the media asset that the first user is describing; wherein the indication comprises the segment.
- 14 . The system of claim 11 , wherein the user device is connected to a wireless network, and wherein the display device is connected to the wireless network.
- 15 . The system of claim 11 , wherein the indication of the media asset is automatically displayed at the display device.
- 16 . The system of claim 11 , wherein the control circuitry is further configured to: generate, at the user device or the display device, a prompt related to confirming the media asset; and receive an input from at least one of the first user or the second user confirming that the plurality of words communicated between the first user and the second user describe the media content.
- 17 . The system of claim 11 , wherein the control circuitry is configured, when comparing the plurality of words with the plurality of keywords, to: select a first word of the plurality of words; determine, based on comparing the first word of the plurality of words with each keyword of the plurality of keywords, whether the first word matches any of the plurality of keywords; and based on determining that the first word matches a keyword from the plurality of keywords, update a word matching score.
- 18 . The system of claim 17 , wherein the control circuitry is configured, when determining, based on comparing the plurality of words with the plurality of keywords, that the first user is describing the media content, to: determine whether the word matching score is greater than a threshold value; and based on determining that the word matching score is greater than the threshold value, determine that the first user is describing the media content.
- 19 . The system of claim 17 , wherein the control circuitry is configured, when updating the word matching score, to: retrieve a weight associated with the keyword; and update the word matching score with the weight associated with the keyword.
- 20 . The system of claim 11 , wherein the control circuitry is configured, when determining, based on comparing the metadata of each media asset in the plurality of media assets that the first user has previously consumed at least in part with the plurality of words, the media asset that the first user is describing, to: determine, for each of the plurality of media assets, an amount of words of the plurality of words that match a corresponding media asset; and determine the media asset that the first user is describing based on the determined amount of words for each corresponding media asset.
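The weighted keyword scoring and asset-matching steps recited in claims 1 and 7-10 can be illustrated with a short sketch. This is a minimal, hypothetical implementation: the keyword weights, the threshold value, and the metadata layout are all illustrative assumptions, not details drawn from the patent.

```python
# Hypothetical sketch of the claimed matching pipeline (claims 1 and 7-10).
# KEYWORD_WEIGHTS, THRESHOLD, and the metadata layout are illustrative only.

KEYWORD_WEIGHTS = {"did": 1.0, "see": 1.5, "watch": 1.5, "episode": 2.0}
THRESHOLD = 2.0

def is_describing_media(words):
    """Claims 7-9: accumulate a weighted score for words matching keywords."""
    score = 0.0
    for word in words:
        score += KEYWORD_WEIGHTS.get(word.lower(), 0.0)
    # Claim 8: the user is describing media content if the score
    # exceeds a threshold value.
    return score > THRESHOLD

def best_matching_asset(words, viewing_history):
    """Claim 10: pick the asset whose metadata matches the most spoken words."""
    spoken = {w.lower() for w in words}
    def match_count(asset):
        return len(spoken & {t.lower() for t in asset["metadata"]})
    return max(viewing_history, key=match_count)

history = [
    {"title": "Stanley Cup Final", "metadata": ["hockey", "cup", "Ovechkin"]},
    {"title": "Cooking Show", "metadata": ["recipe", "kitchen"]},
]
words = "did you see Ovechkin lift the cup".split()
if is_describing_media(words):
    asset = best_matching_asset(words, history)
    print(asset["title"])  # → Stanley Cup Final
```

Note that the claims leave the scoring details open; a real implementation might normalize for asset metadata length or weight recent viewing history more heavily.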
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 18/225,983, filed Jul. 25, 2023, which is a continuation of U.S. patent application Ser. No. 17/681,059, filed Feb. 25, 2022 (now U.S. Pat. No. 11,758,230), which is a continuation of U.S. patent application Ser. No. 17/042,322, filed Sep. 28, 2020 (now U.S. Pat. No. 11,297,390), which is a national stage application under 35 U.S.C. § 371 of International Application No. PCT/US2018/039436, filed Jun. 26, 2018. The disclosures of each application are incorporated by reference herein in their entireties.
BACKGROUND
Viewers often want to discuss exciting and interesting media content with friends and colleagues. Viewers may have conversations in person or across distances with friends and family, and may want to share a show, clip, episode, or highlight with another person who has not seen the same content. Even if the other person has seen the same event, a viewer may want to talk through a highlight, or the other person may have watched a different presentation of the event or otherwise missed a segment of the media content the viewer is discussing. If a viewer is forced to use conventional interfaces to select a desired segment, the viewer may lose the moment and excitement, leading to a subpar experience and diminished interest in the conversation. Viewers, therefore, desire mechanisms for sharing media content, and particular segments of that content, with friends and colleagues during conversations related to that content.
SUMMARY
The integration of media into daily life has increased the ability of users to share experiences, particularly experiences related to the enjoyment of media consumption, with others, both in person and across distances. In particular, media systems may provide mechanisms for users to share links to media content, including segments of media content.
For example, a user may share a hyperlink to a YouTube® video, or even to a particular time segment within a YouTube® video. However, while media systems are able to display content when a recipient clicks a link, these systems still fail to overcome problems associated with sharing media content, such as: (i) the amount of available media content can be overwhelming, making it difficult to find a desired segment of content to share; and (ii) interfaces for sharing media content can be confusing and difficult to use, especially when users are engaged in conversations and do not want to focus on controlling their user devices. Accordingly, to overcome the problems created when attempting to share media content with another party, methods and systems are described for generating for display an indication of a segment of media content relevant to a voice communication between two users. Specifically, a media guidance application renders visual indicators to a target user about segments that a source user is trying to discuss with the target user. In contrast to prevailing systems, which rely solely on the source user selecting specific time portions of a segment of a media asset, the media guidance application uses a novel, hybrid approach that relies on keywords of a conversation in combination with the source user's viewing history to select the appropriate portion of a media asset to share with the target user. For example, the media guidance application generates for display an indication of a segment of media content relevant to a voice communication between two users. The media guidance application does this by monitoring a voice communication between a first user (i.e., a source user) at a first communication device (e.g., a smartphone) and a second user (i.e., a target user) at a second communication device (e.g., a smartphone).
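The segment-level half of this hybrid approach, matching conversation keywords against portions of a media asset from the viewing history, can be sketched as follows. The segment fields and keyword metadata below are hypothetical assumptions for illustration; the patent does not prescribe a specific data model.

```python
# Hypothetical sketch of selecting the segment of a matched media asset that
# best fits the conversation. Segment boundaries and keyword tags are
# illustrative assumptions.

def select_segment(words, segments):
    """Return the segment whose keyword metadata overlaps most with the
    words extracted from the conversation."""
    spoken = {w.lower() for w in words}
    def overlap(segment):
        return len(spoken & {k.lower() for k in segment["keywords"]})
    return max(segments, key=overlap)

segments = [
    {"start": "00:00", "end": "55:00", "keywords": ["faceoff", "goal"]},
    {"start": "55:00", "end": "58:30", "keywords": ["Ovechkin", "cup", "trophy"]},
]
words = "did you see Ovechkin lift the cup".split()
segment = select_segment(words, segments)
print(segment["start"], segment["end"])  # → 55:00 58:30
```

An indication of the returned segment (e.g., its start and end times) could then be generated for display at the target user's device.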
For example, the first user may be discussing a recent sporting event (e.g., the Washington Capitals winning the 2018 Stanley Cup Championship) with a second user. The media guidance application may extract from the voice communication words spoken by the first user. For example, the source user may ask the target user, "Did you see Ovi lift the cup?", a reference to the Capitals' captain, Alexander Ovechkin, lifting the championship trophy. In this example, the second user may not have seen that event, and the first user may desire to share a particular segment of the event with the second user. By monitoring a conversation between the two users, the media guidance application can locate the media asset of interest (e.g., a replay of the final game of the 2018 Stanley Cup Championship). The media guidance application may compare words from the users' conversation with keywords that indicate that media content is being described. For example, the media guidance application may extract the words "did you see" from the conversation, which suggests that media content is being described. The media guidance application may be configured with other trigger words that indicate media content is being described. In