KR-20260062454-A - DEVICE, METHOD AND COMPUTER PROGRAM FOR PROVIDING CUSTOMIZED SERVICE
Abstract
A terminal providing a customized service includes an input unit that receives voice data from a user, an analysis unit that inputs the voice data into an emotion analysis model and outputs state information including the user's emotions and intentions through the emotion analysis model, and a service providing unit that derives one service scenario among a plurality of service scenarios based on the state information and provides a customized service based on the derived service scenario. The emotion analysis model is configured with a hierarchical structure including a Phone Layer, a Word Layer, an Emotion Layer, and an Intent Layer.
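The four-stage pipeline described in the abstract (Phone Layer → Word Layer → Emotion Layer / Intent Layer, then scenario selection) can be sketched in miniature. This is an illustrative stand-in only, assuming toy rule-based layers; the class name, keyword lists, and heuristics below are invented for illustration and are not the patented model.

```python
# Minimal sketch of the hierarchical analysis pipeline from the abstract:
# Phone Layer -> Word Layer -> Emotion Layer / Intent Layer.
# All names and heuristics are illustrative stand-ins, not the claimed model.
from dataclasses import dataclass


@dataclass
class StateInfo:
    emotion: str
    intent: str


class EmotionAnalysisPipeline:
    """Toy hierarchical analyzer: each stage consumes the previous one."""

    def phone_layer(self, transcript: str) -> list:
        # Stand-in: treat individual characters as "phones".
        return list(transcript.replace(" ", ""))

    def word_layer(self, phones: list, transcript: str) -> list:
        # Stand-in: tokenize the transcript into words.
        return transcript.split()

    def emotion_layer(self, words: list) -> str:
        # Stand-in keyword lookup in place of a learned classifier.
        if any(w in {"great", "love"} for w in words):
            return "positive"
        if any(w in {"hate", "awful"} for w in words):
            return "negative"
        return "neutral"

    def intent_layer(self, words: list) -> str:
        # Stand-in: a leading verb marks a command utterance.
        return "command" if words and words[0] in {"turn", "play", "set"} else "statement"

    def analyze(self, transcript: str) -> StateInfo:
        phones = self.phone_layer(transcript)
        words = self.word_layer(phones, transcript)
        return StateInfo(self.emotion_layer(words), self.intent_layer(words))
```

A call such as `EmotionAnalysisPipeline().analyze("turn on the great lights")` yields a `StateInfo` combining an emotion and an intention, mirroring the "state information" the analysis unit outputs.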
Inventors
- 손단영
Assignees
- 주식회사 케이티 (KT Corporation)
Dates
- Publication Date
- 2026-05-07
- Application Date
- 2024-10-29
Claims (20)
- A terminal for providing a customized service, comprising: an input unit configured to receive voice data from a user; an analysis unit configured to input the voice data into an emotion analysis model and to output state information including the user's emotion and intention through the emotion analysis model; and a service providing unit configured to derive one service scenario among a plurality of service scenarios based on the state information and to provide a customized service based on the derived service scenario, wherein the emotion analysis model has a hierarchical structure including a Phone Layer, a Word Layer, an Emotion Layer, and an Intent Layer.
- The terminal of claim 1, wherein the emotion analysis model extracts voice feature information including at least one of tone, pitch, and speech rate from the voice data, and outputs the user's emotion information based on the extracted voice feature information.
- The terminal of claim 1, wherein the emotion analysis model outputs intention information including at least one of an execution command, an execution target, and an execution location based on a sentence included in the voice data.
- The terminal of claim 1, further comprising a learning unit configured to pre-train a basic learning model based on first learning data classified by a plurality of emotions and by the expression intensity of the plurality of emotions.
- The terminal of claim 4, wherein the input unit receives input data classified by the plurality of emotions and by the expression intensity of the plurality of emotions based on a voice file generated from the user's speech according to a speech script, and the learning unit inputs the input data into the basic learning model and outputs initial state information including the user's initial emotion and initial intention through the basic learning model.
- The terminal of claim 5, wherein the service providing unit derives a service scenario related to the initial state information among the plurality of service scenarios, the learning unit generates second learning data based on the initial state information and the service scenario related to the initial state information, and the emotion analysis model is transfer-learned from the basic learning model so as to output the user's emotion and intention, respectively, based on the second learning data.
- The terminal of claim 6, further comprising a display unit configured to display, through a display, the user's initial state information and the service scenario related to the initial state information, wherein the input unit receives from the user a modification of at least one of the initial state information or the service scenario related to the initial state information.
- The terminal of claim 1, wherein the Phone Layer and the Word Layer are pre-trained, and the Emotion Layer and the Intent Layer are learned through transfer learning.
- The terminal of claim 1, wherein the service providing unit controls at least one IoT device based on a service scenario related to the user's state information.
- The terminal of claim 1, wherein the service providing unit provides a conversation service between the user and a bot based on a service scenario related to the user's state information.
- A method for providing a customized service at a terminal, the method comprising: receiving voice data from a user; inputting the voice data into an emotion analysis model and outputting state information including the user's emotion and intention through the emotion analysis model; deriving one service scenario among a plurality of service scenarios based on the state information; and providing a customized service based on the derived service scenario, wherein the emotion analysis model has a hierarchical structure including a Phone Layer, a Word Layer, an Emotion Layer, and an Intent Layer.
- The method of claim 11, wherein the emotion analysis model extracts voice feature information including at least one of tone, pitch, and speech rate from the voice data, and outputs the user's emotion information based on the extracted voice feature information.
- The method of claim 11, wherein the emotion analysis model outputs intention information including at least one of an execution command, an execution target, and an execution location based on a sentence included in the voice data.
- The method of claim 11, further comprising pre-training a basic learning model based on first learning data classified by a plurality of emotions and by the expression intensity of the plurality of emotions for a plurality of speakers.
- The method of claim 14, wherein the receiving of the voice data includes receiving input data classified by the plurality of emotions and by the expression intensity of the plurality of emotions based on a voice file generated from the user's speech according to a speech script, and the pre-training includes inputting the input data into the basic learning model and outputting initial state information including the user's initial emotion and initial intention through the basic learning model.
- The method of claim 15, wherein the providing of the customized service includes deriving a service scenario related to the initial state information among the plurality of preset service scenarios, and the pre-training includes generating second learning data based on the initial state information and the service scenario related to the initial state information, wherein the emotion analysis model is transfer-learned from the basic learning model so as to output the user's emotion and intention, respectively, based on the second learning data.
- The method of claim 11, wherein the Phone Layer and the Word Layer are pre-trained, and the Emotion Layer and the Intent Layer are learned through transfer learning.
- The method of claim 11, wherein the providing of the customized service includes controlling at least one IoT device based on a service scenario related to the user's state information.
- The method of claim 11, wherein the providing of the customized service includes providing a conversation service between the user and a bot based on a service scenario related to the user's state information.
- A computer program stored on a computer-readable storage medium and comprising a sequence of instructions for providing a customized service, wherein, when executed by a computing device, the computer program receives voice data from a user, inputs the voice data into an emotion analysis model, outputs state information including the user's emotion and intention through the emotion analysis model, derives one service scenario among a plurality of service scenarios based on the state information, and provides a customized service based on the derived service scenario, and wherein the emotion analysis model has a hierarchical structure including a Phone Layer, a Word Layer, an Emotion Layer, and an Intent Layer.
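Claims 2 and 12 name three voice features: tone, pitch, and speech rate. The sketch below illustrates crude stand-ins for each, assuming a mono sample list: pitch via a zero-crossing count on a synthetic tone (a real system would use a proper F0 tracker), tone intensity via RMS energy, and speech rate from an externally supplied word count. All thresholds and formulas are invented for the sketch.

```python
# Illustrative extraction of the voice features named in claims 2/12:
# tone (intensity), pitch, and speech rate. These are crude stand-ins,
# not the patent's feature extractors.
import math


def zero_crossing_pitch(samples, sample_rate):
    """Estimate fundamental frequency from sign changes (mono signal)."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)  # two zero crossings per cycle


def rms_energy(samples):
    """Root-mean-square amplitude, a crude proxy for vocal tone/intensity."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def speech_rate(word_count, duration_s):
    """Words per second, given a word count from a separate transcript."""
    return word_count / duration_s


# Synthetic 440 Hz tone, 1 second at 16 kHz, to exercise the estimators.
sr = 16000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
pitch = zero_crossing_pitch(tone, sr)  # close to 440 Hz
```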
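Claims 3 and 13 decompose intention information into an execution command, an execution target, and an execution location. A hypothetical rule-based parse can show what such a triple might look like; the vocabulary lists and stop words below are made up, and a real system would use the learned Intent Layer instead.

```python
# Hypothetical rule-based parse of the intention fields in claims 3/13:
# execution command, execution target, execution location.
KNOWN_COMMANDS = {"play", "stop", "dim", "open"}
KNOWN_LOCATIONS = {"bedroom", "kitchen", "office"}
STOP_WORDS = {"the", "in", "my"}


def parse_intent(sentence):
    """Return {'command', 'target', 'location'} from a simple utterance."""
    words = sentence.lower().strip(".!?").split()
    command = next((w for w in words if w in KNOWN_COMMANDS), None)
    location = next((w for w in words if w in KNOWN_LOCATIONS), None)
    # Target: first word that is neither a command, a location, nor a stop word.
    target = next(
        (w for w in words
         if w not in KNOWN_COMMANDS
         and w not in KNOWN_LOCATIONS
         and w not in STOP_WORDS),
        None,
    )
    return {"command": command, "target": target, "location": location}
```

For example, `parse_intent("Dim the lights in the bedroom.")` separates the verb, the device, and the room into the three intention fields.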
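Claims 8 and 17 split the hierarchy for transfer learning: the Phone and Word layers keep pre-trained weights, while the Emotion and Intent layers are updated on new data. The toy below illustrates only that split; weights are single floats and the "gradient" is a dummy constant, not real training.

```python
# Toy illustration of the transfer-learning split in claims 8/17:
# pre-trained layers are frozen, the upper layers are updated.
layers = {
    "phone":   {"weight": 0.5, "trainable": False},  # pre-trained, frozen
    "word":    {"weight": 0.8, "trainable": False},  # pre-trained, frozen
    "emotion": {"weight": 0.1, "trainable": True},   # learned by transfer
    "intent":  {"weight": 0.1, "trainable": True},   # learned by transfer
}


def training_step(layers, lr=0.05):
    """Apply a dummy gradient of 1.0 to every trainable layer in place."""
    for layer in layers.values():
        if layer["trainable"]:
            layer["weight"] -= lr * 1.0
    return layers


training_step(layers)  # frozen weights are untouched, others shift by lr
```

In a real framework this corresponds to marking the pre-trained parameters as non-trainable (e.g. excluding them from the optimizer) before fine-tuning on the second learning data of claim 6.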
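Claims 9 and 18 drive at least one IoT device from a scenario derived from the user's state information. A minimal mapping can make the data flow concrete; the scenario names, device commands, and lookup tables below are entirely hypothetical.

```python
# Hedged sketch of claims 9/18: map (emotion, intent) state information
# to one service scenario, then to IoT device commands. All names are
# invented for illustration.
SCENARIOS = {
    ("negative", "command"): "calming_lights",
    ("positive", "command"): "party_mode",
    ("neutral", "statement"): "idle",
}

SCENARIO_ACTIONS = {
    "calming_lights": [("light", "dim", 30), ("speaker", "play", "ambient")],
    "party_mode":     [("light", "color", "cycle"), ("speaker", "volume", 80)],
    "idle":           [],
}


def derive_scenario(state):
    """Pick one scenario from the plurality, falling back to 'idle'."""
    return SCENARIOS.get((state["emotion"], state["intent"]), "idle")


def control_devices(scenario):
    """Return the (device, command, value) triples the scenario implies."""
    return SCENARIO_ACTIONS[scenario]


actions = control_devices(derive_scenario({"emotion": "negative", "intent": "command"}))
```

In this sketch, a "negative + command" state selects the calming scenario, which in turn dims the lights and starts ambient audio.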
Description
The present invention relates to a terminal, a method, and a computer program for providing customized services.

An intelligent personal assistant is a software agent that processes tasks requested by the user and provides services specialized for that user. Based on artificial intelligence (AI) engines and voice recognition, intelligent personal assistants enhance user convenience by collecting and providing customized information and by performing various functions, such as schedule management, sending emails, and making restaurant reservations, according to the user's voice commands. Such intelligent personal assistants are primarily provided as customized personal services on smartphones; representative examples include Apple's Siri, Google Now, and Samsung's Bixby. In this regard, the prior art Korean Published Patent No. 10-2016-0071111 discloses a method for providing a personal assistant service in an electronic device.

However, to provide services based on the user's emotions with an intelligent personal assistant, the user's biometric information and lifestyle patterns must be collected from various sensors. This has the disadvantage that various sensors must be installed, and even when the user's emotional information is acquired, it may be inaccurate, leaving personalized services based on it incomplete.

FIG. 1 is a configuration diagram of a terminal according to one embodiment of the present invention. FIG. 2 illustrates first learning data classified by a plurality of emotions and by the expression intensity of the plurality of emotions for a plurality of speakers according to an embodiment of the present invention. FIG. 3 illustrates input data classified by a plurality of emotions and by the expression intensity of the plurality of emotions based on a voice file generated from a user's speech according to a speech script, according to an embodiment of the present invention. FIG. 4 illustrates the process of receiving, from a user, a modification of at least one of the initial state information or a service scenario related to the initial state information, according to an embodiment of the present invention. FIG. 5 illustrates the phonemes of the Korean language according to one embodiment of the present invention. FIG. 6 illustrates the process of outputting state information including a user's emotion and intention through an emotion analysis model configured in a hierarchical structure, according to an embodiment of the present invention. FIG. 7 illustrates the process of controlling at least one IoT device based on a service scenario related to user state information, according to an embodiment of the present invention. FIG. 8 is a flowchart of a method for providing a customized service performed at a terminal according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the attached drawings so that those skilled in the art can easily implement the invention. The present invention may, however, be embodied in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the explanation are omitted for clarity, and similar parts are denoted by similar reference numerals throughout the specification.
Throughout the specification, when a part is described as being "connected" to another part, this includes not only cases where it is "directly connected" but also cases where it is "electrically connected" with other elements interposed between them. Furthermore, when a part is described as "including" a component, this means that, unless specifically stated otherwise, it does not exclude other components but may include additional components; it does not preclude the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In this specification, the term "part" covers a unit realized by hardware, a unit realized by software, and a unit realized using both. One unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware. Some of the operations or functions described as being performed by a terminal or device may instead be performed by a server connected to that terminal or device; likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

An embodiment of the present invention will be described in detail below with reference to the attached drawings. FIG.