
US-12620398-B2 - Electronic device stores tag information of content


Abstract

An electronic device according to an embodiment comprises a memory, a display, and a processor operatively connected to the memory and the display, wherein the processor may be configured to: collect speech data; match the collected speech data with user information related to the collected speech data and store, in the memory, association information between the collected speech data and the user information; when generating content, detect speech data of the content input that is input during generation of the content; and when there is user information matching with the detected speech data in the memory, store the user information matching with the detected speech data of the content as tag information of the content.

Inventors

  • Hwakyung HYUN

Assignees

  • SAMSUNG ELECTRONICS CO., LTD.

Dates

Publication Date
2026-05-05
Application Date
2023-09-15
Priority Date
2021-03-18

Claims (13)

  1. An electronic device comprising: a display; at least one processor; and memory storing instructions that, when executed by the at least one processor individually or collectively, cause the electronic device to: collect speech data, match the collected speech data and user information related to the collected speech data, and store, in the memory, association information between the collected speech data and the user information; when generating content, detect first speech data of the content that is input during generation of the content, obtain the user information matching the detected first speech data of the content from the memory, and store the user information matching the detected first speech data of the content as tag information of the content, wherein the content comprises a video, and wherein the user information corresponds to a speaker in the video having the detected first speech data; and in response to identifying that the user information is stored as the tag information in the content when reproducing the content based on a user input, display the user information stored as the tag information in a section in which the first speech data of the content is output while the content is being reproduced.
  2. The electronic device of claim 1, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: store the collected speech data and the user information related to the collected speech data in a speech pattern database (DB) of the memory; or store, in the speech pattern DB of the memory, index information, the index information enabling detection of the user information related to the collected speech data stored in a contact information DB of the memory and the collected speech data.
  3. The electronic device of claim 1, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: based on a feature vector detected from the collected speech data, determine pattern information of the feature vector; match the determined pattern information of the feature vector and the user information related to the collected speech data; and store the determined pattern information of the feature vector matched with the user information related to the collected speech data in the memory.
  4. The electronic device of claim 1, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: when collecting the collected speech data during a call, identify, based on contact information, information of a counterpart making the call; and determine the identified counterpart information as the user information related to the collected speech data.
  5. The electronic device of claim 1, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: when collecting the collected speech data while performing speech recording, determine contact information, which is input by a user, as the user information related to the collected speech data.
  6. The electronic device of claim 1, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: in response to the collected speech data being received while generating the content, identify, based on a feature vector detected from the collected speech data, pattern information of the feature vector; in response to first pattern information of a first feature vector, having a similarity to the identified pattern information of the feature vector that is equal to or greater than a reference value, existing in the memory, obtain the user information matching the first pattern information of the first feature vector; and store the user information matching the first pattern information of the first feature vector as tag information of a section in the content where the collected speech data is received.
  7. The electronic device of claim 6, wherein the instructions, when executed by the at least one processor individually or collectively, cause the electronic device to: in response to the first pattern information of the first feature vector, having the similarity to the identified pattern information of the feature vector that is equal to or greater than the reference value, not existing in the memory, receive a speech pattern DB from at least one external electronic device related to the electronic device, wherein the received speech pattern DB includes the first pattern information of the first feature vector and the user information matching the first pattern information of the first feature vector.
  8. A method for storing tag information of content in an electronic device, the method comprising: collecting speech data, matching the collected speech data and user information related to the collected speech data, and storing, in a memory of the electronic device, association information between the collected speech data and the user information; when generating content, detecting first speech data of the content that is input during generation of the content, obtaining the user information matching the detected first speech data of the content from the memory, and storing the user information matching the detected first speech data of the content as tag information of the content, wherein the content comprises a video, and wherein the user information corresponds to a speaker in the video having the detected first speech data; and in response to identifying that the user information is stored as the tag information in the content when reproducing the content based on a user input, displaying the user information stored as the tag information in a section in which the first speech data of the content is output while the content is being reproduced.
  9. The method of claim 8, further comprising: storing the collected speech data and the user information related to the collected speech data in a speech pattern database (DB) of the memory; or storing, in the speech pattern DB of the memory, index information, the index information enabling detection of the user information related to the collected speech data stored in a contact information DB of the memory and the collected speech data.
  10. The method of claim 8, wherein the storing in the memory the association information between the collected speech data and the user information further comprises: detecting a feature vector from the collected speech data; based on the detected feature vector, determining pattern information of the feature vector; matching the determined pattern information of the feature vector and the user information related to the collected speech data; and storing the determined pattern information of the feature vector matched to the user information related to the collected speech data in the memory.
  11. The method of claim 8, further comprising: when collecting the collected speech data during a call, identifying, based on contact information, information of a counterpart making the call; and determining the identified counterpart information as the user information related to the collected speech data.
  12. The method of claim 8, further comprising, when collecting the collected speech data while performing speech recording, determining contact information, which is input by a user, as the user information related to the collected speech data.
  13. The method of claim 8, wherein the storing as the tag information of the content comprises: in response to the collected speech data being received while generating the content, detecting a feature vector based on the collected speech data; based on the detected feature vector, identifying pattern information of the feature vector; in response to first pattern information of a first feature vector, having a similarity to the identified pattern information of the feature vector that is equal to or greater than a reference value, existing in the memory, obtaining the user information matching the first pattern information of the first feature vector, and storing the user information matching the first pattern information of the first feature vector as tag information of a section in the content where the collected speech data is received; and in response to the first pattern information of the first feature vector, having the similarity to the identified pattern information of the feature vector that is equal to or greater than the reference value, not existing in the memory, receiving a speech pattern DB from at least one external electronic device related to the electronic device, wherein the received speech pattern DB includes the first pattern information of the first feature vector and the user information matching the first pattern information of the first feature vector.
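The matching flow recited in claims 6 and 13 (extract a feature vector, compare it against stored pattern information, accept a match at or above a reference value, and fall back to a speech pattern DB received from an external device on a local miss) can be sketched as follows. This is a minimal, non-normative illustration: the function names, the cosine-similarity measure, and the 0.85 default threshold are assumptions, since the claims only require "a similarity ... equal to or greater than a reference value".

```python
import math

def cosine_similarity(a, b):
    # Similarity measure between two feature vectors (illustrative choice;
    # any measure satisfying the claims' reference-value test would do).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_speaker(feature_vector, speech_pattern_db, reference_value=0.85,
                  external_dbs=()):
    """Return the user information whose stored pattern information matches
    the detected feature vector, consulting external devices' speech pattern
    DBs when no local pattern meets the reference value (claim 13 fallback)."""
    best_user, best_sim = None, 0.0
    for pattern_info, user_info in speech_pattern_db:
        sim = cosine_similarity(feature_vector, pattern_info)
        if sim >= reference_value and sim > best_sim:
            best_user, best_sim = user_info, sim
    if best_user is None:
        for db in external_dbs:
            result = match_speaker(feature_vector, db, reference_value)
            if result is not None:
                return result
    return best_user

def tag_section(content_tags, user_info, start_ms, end_ms):
    # Store the matched user information as tag information of the section
    # of the content in which the speech data was received.
    content_tags.append({"user": user_info, "start_ms": start_ms, "end_ms": end_ms})
```

A caller would run `match_speaker` on each detected utterance while content is being generated and, on a hit, record the utterance's time span with `tag_section`, so the tag can later be displayed in the section where that speech is output.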

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application, claiming priority under § 365(c), of International Application No. PCT/KR2022/001445 filed on Jan. 27, 2022, which is based on and claims the benefit of Korean patent application number 10-2021-0035143 filed on Mar. 18, 2021, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.

TECHNICAL FIELD

An embodiment relates to an electronic device and a method for storing tag information of content in the electronic device.

BACKGROUND ART

An electronic device has a function of inputting and managing a content tag. A desired tag may be selected, or directly input, and added to content; contents may then be filtered and provided for each tag.

DETAILED DESCRIPTION OF THE INVENTION

Technical Problem

Although there is a function to input and manage tags for contents, such as photos and videos, tags are manually input via a complicated procedure. For example, to add a person tag to a video, a user must first play the video to identify information of a person included in the video, and then enter a detailed information editing menu to save the tag for the person information by a manual input, which costs extra steps and time.

Technical Solution

According to one or more embodiments, an electronic device capable of storing person tag information in content by using speech data, and a method for storing tag information of content in the electronic device, are provided.
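The first step of the solution, associating collected speech data with user information (counterpart contact information when the speech is collected during a call, or user-entered contact information during a speech recording), can be sketched as follows. All class and function names here are illustrative assumptions, not identifiers from the disclosure.

```python
class SpeechPatternDB:
    """Illustrative store of (pattern information, user information) pairs."""

    def __init__(self):
        self.entries = []  # list of (pattern_info, user_info) associations

    def associate(self, pattern_info, user_info):
        # Store association information between the collected speech data's
        # pattern information and the related user information.
        self.entries.append((pattern_info, user_info))

def collect_during_call(db, pattern_info, contact_lookup, caller_number):
    # During a call, identify the counterpart from contact information and
    # use it as the user information related to the collected speech data;
    # fall back to the raw number if no contact entry exists (an assumption).
    user_info = contact_lookup.get(caller_number, caller_number)
    db.associate(pattern_info, user_info)

def collect_during_recording(db, pattern_info, user_entered_contact):
    # While performing speech recording, contact information input by the
    # user serves as the user information for the collected speech data.
    db.associate(pattern_info, user_entered_contact)
```

Once populated, such a DB is what later content generation queries to turn detected speech back into a person tag.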
An electronic device according to an embodiment may include a memory, a display, and a processor operatively connected to the memory and the display, wherein the processor is configured to: collect speech data; match the collected speech data and user information related to the collected speech data; store, in the memory, association information between the collected speech data and the user information; detect, when generating content, speech data of the content that is input during generation of the content; obtain user information matching the detected speech data of the content from the memory; and store the user information matching the detected speech data of the content as tag information of the content.

A method for storing tag information of content in an electronic device according to an embodiment may include: collecting speech data; matching the collected speech data and user information related to the collected speech data; storing, in a memory of the electronic device, association information between the collected speech data and the user information; detecting, when generating content, speech data of the content that is input during generation of the content; obtaining user information matching the detected speech data of the content from the memory; and storing the user information matching the detected speech data of the content as tag information of the content.

Advantageous Effects

According to an embodiment, a person tag may be automatically stored in content. According to an embodiment, content can be classified based on speech data, and therefore utilization can be increased in, for example, video playlists, social networking service (SNS) sharing, and scenario editing and processing.
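The classification effect described above (grouping or filtering contents by their stored person tags, e.g., for playlists or editing) amounts to a simple query over stored tag information. The data shapes below are illustrative assumptions; the disclosure does not prescribe a storage format.

```python
def contents_with_person(library, user_info):
    """Return titles of contents whose tag information includes the given
    person, e.g., to build a per-person video playlist."""
    return [item["title"] for item in library
            if user_info in item.get("tags", [])]

def sections_of_person(content, user_info):
    # Return the sections in which the tagged person's speech is output,
    # usable for displaying the tag during reproduction or for editing
    # out just that person's portions of the video.
    return [(s["start_ms"], s["end_ms"]) for s in content.get("sections", [])
            if s["user"] == user_info]
```

For example, `sections_of_person` gives exactly the spans in which a specific person's speech starts and stops, which is what makes the section-based editing described next straightforward.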
According to an embodiment, in the case of a video in which a person tag is stored based on speech data, it is possible to automatically know when speech of a specific person starts and stops being output, so editing can be facilitated using a section where the speech of the specific person is output. According to an embodiment, a person tag can be stored in real time not only in a video but also in various contents, such as a voice memo enabling speech recording and an image received from a home surveillance camera.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of an electronic device in a network environment according to various embodiments;
FIG. 2 is a block diagram illustrating an electronic device according to an embodiment;
FIG. 3 is a block diagram illustrating a tag management module for storing tag information in content in an electronic device according to an embodiment;
FIG. 4 is a diagram illustrating an operation of identifying pattern information of a feature vector having a similarity equal to or greater than a certain reference value in an electronic device according to an embodiment;
FIG. 5 is a diagram illustrating an operation of sharing a speech pattern database (DB) with an external electronic device in an electronic device according to an embodiment;
FIG. 6A, FIG. 6B, FIG. 6C, and FIG. 6D are diagrams illustrating operations of displaying user information stored as tag information in content in an electronic device according to an embodiment;
FIG. 7A and FIG. 7B are diagrams illustrating operations of displaying user information stored as tag information in content in an electronic device according to an emb