US-20260127803-A1 - VIRTUAL PERFORMER EXPRESSION ADJUSTMENT SYSTEM WITH EMOTION AWARENESS AND METHOD THEREOF
Abstract
A virtual performer expression adjustment system with emotion awareness and a method thereof are disclosed. In the system, voice signals and the corresponding emotion messages are first loaded as training data, and the loaded data is input into an artificial intelligence model for training to generate an emotion recognition model. A user voice is received and subjected to feature extraction, standardization and dimensionality reduction processes, and the processed user voice is input into the emotion recognition model to obtain an emotional status. A facial expression generation calculation is then executed to generate facial landmarks based on the emotional status, and face model parameters of a virtual performer are adjusted based on the facial landmarks in real time, for dynamically displaying a facial expression of the virtual performer. Therefore, the technical effect of enhancing the realism and richness of the virtual performer's expressions can be achieved.
Inventors
- Chuan-Cheng Chiu
- Hai-Hong Sha
- Po-Shuo Chiu
Assignees
- SQ Technology (Shanghai) Corporation
- INVENTEC CORPORATION
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-01-14
- Priority Date: 2024-11-07
Claims (10)
- 1. A virtual performer expression adjustment system with emotion awareness, comprising: an emotional voice database, configured to store voice signals and emotion messages, wherein each of the voice signals corresponds to one of the emotion messages; and a computer host, connected to the emotional voice database and comprising: a non-transitory computer-readable storage medium, configured to store computer readable instructions; and a hardware processor, electrically connected to the non-transitory computer-readable storage medium, and configured to execute the computer readable instructions to operate: loading the voice signals and the emotion messages corresponding to the loaded voice signals as training data from the emotional voice database, and inputting the training data into an artificial intelligence model to perform training to generate an emotion recognition model; receiving a user voice, performing feature extraction, standardization and dimensionality reduction processes on the user voice, and inputting the processed user voice into the emotion recognition model to obtain an emotional status; and executing a facial expression generation calculation to generate facial landmarks based on the emotional status, and adjusting face model parameters of a virtual performer based on the facial landmarks in real time to dynamically display a facial expression of the virtual performer.
- 2. The virtual performer expression adjustment system with emotion awareness according to claim 1, wherein the emotion recognition model comprises a convolutional neural network (CNN) and a recurrent neural network (RNN), and during a training process, the emotion recognition model is allowed to receive the emotion messages containing text, images, videos, audio, or a combination thereof as the training data in multimodal forms.
- 3. The virtual performer expression adjustment system with emotion awareness according to claim 1, wherein the facial expression generation calculation is performed by at least one of a generative adversarial network (GAN), deep learning (DL) and reinforcement learning (RL), and is configured to generate a face image corresponding to the emotional status and extract the facial landmarks from the face image.
- 4. The virtual performer expression adjustment system with emotion awareness according to claim 1, wherein the virtual performer has a predefined facial expression model, the facial expression model comprises the face model parameters, and after the facial landmarks are generated, a change of the facial landmarks is smoothed through a filter, a boundary check is executed to remove an unnatural expression, and the processed facial landmarks are mapped to the face model parameters to modify the facial expression model.
- 5. The virtual performer expression adjustment system with emotion awareness according to claim 1, wherein the hardware processor further operates: detecting whether the facial landmarks match an inappropriate expression feature; and when the facial landmarks match the inappropriate expression feature, prohibiting use of the facial landmarks to adjust the face model parameters of the virtual performer, and initializing the face model parameters.
- 6. A virtual performer expression adjustment method with emotion awareness, comprising: connecting an emotional voice database to a computer host, wherein the emotional voice database stores voice signals and emotion messages, and each of the emotion messages corresponds to one of the voice signals; loading the voice signals and the emotion messages corresponding to the loaded voice signals as training data from the emotional voice database, and inputting the training data into an artificial intelligence model for training to generate an emotion recognition model, by the computer host; receiving a user voice, performing feature extraction, standardization and dimensionality reduction processes on the user voice, and inputting the processed user voice into the emotion recognition model to obtain an emotional status, by the computer host; and executing a facial expression generation calculation to generate facial landmarks based on the emotional status, and adjusting face model parameters of a virtual performer based on the facial landmarks in real time for dynamically displaying a facial expression of the virtual performer, by the computer host.
- 7. The virtual performer expression adjustment method with emotion awareness according to claim 6, wherein the emotion recognition model comprises a convolutional neural network (CNN) and a recurrent neural network (RNN), and during a training process, the emotion recognition model is allowed to receive the emotion messages containing text, images, videos, audio, or a combination thereof as the training data in multimodal forms.
- 8. The virtual performer expression adjustment method with emotion awareness according to claim 6, wherein the facial expression generation calculation is performed by at least one of a generative adversarial network (GAN), deep learning (DL) and reinforcement learning (RL), and is configured to generate a face image corresponding to the emotional status and extract the facial landmarks from the face image.
- 9. The virtual performer expression adjustment method with emotion awareness according to claim 6, wherein the virtual performer has a predefined facial expression model, the facial expression model comprises the face model parameters, and after the facial landmarks are generated, a change of the facial landmarks is smoothed through a filter, a boundary check is executed to remove an unnatural expression, and the processed facial landmarks are mapped to the face model parameters to modify the facial expression model.
- 10. The virtual performer expression adjustment method with emotion awareness according to claim 6, further comprising: detecting whether the facial landmarks match an inappropriate expression feature, by the computer host; and when the facial landmarks match the inappropriate expression feature, prohibiting use of the facial landmarks to adjust the face model parameters of the virtual performer and initializing the face model parameters, by the computer host.
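The CNN-plus-RNN emotion recognition model recited in claims 2 and 7 could, for example, take a form such as the following minimal PyTorch sketch. It is not taken from the disclosure: the class name EmotionRecognizer, the layer sizes, the choice of a GRU as the RNN, and the assumption that the input is a sequence of acoustic feature frames are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class EmotionRecognizer(nn.Module):
    """Hypothetical CNN + RNN emotion recognition model over acoustic feature frames."""

    def __init__(self, n_features=13, n_emotions=6):
        super().__init__()
        # 1-D convolutions over the time axis capture local acoustic patterns
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # a GRU (one kind of RNN) models how those patterns evolve over the utterance
        self.rnn = nn.GRU(input_size=64, hidden_size=64, batch_first=True)
        self.classifier = nn.Linear(64, n_emotions)

    def forward(self, x):                 # x: (batch, n_features, time)
        h = self.cnn(x)                   # (batch, 64, time)
        h = h.transpose(1, 2)             # (batch, time, 64) for the GRU
        _, last = self.rnn(h)             # final hidden state: (1, batch, 64)
        return self.classifier(last.squeeze(0))  # emotion logits per utterance

# Training would minimise nn.CrossEntropyLoss() between these logits and the
# emotion labels loaded from the emotional voice database.
```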
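For the facial expression generation calculation of claims 3 and 8, one of the named options is a generative adversarial network. The sketch below shows only the generator half of such a GAN and, as a simplification, has it emit 2-D facial landmarks directly rather than a face image followed by a landmark-extraction step; the names LandmarkGenerator and N_LANDMARKS, and the 68-point landmark convention, are assumptions rather than part of the disclosure.

```python
import torch
import torch.nn as nn

N_EMOTIONS = 6      # assumed number of emotional statuses
N_LANDMARKS = 68    # assumed 68-point facial landmark convention

class LandmarkGenerator(nn.Module):
    """Hypothetical conditional GAN generator: emotion label + noise -> facial landmarks."""

    def __init__(self, noise_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_EMOTIONS + noise_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, N_LANDMARKS * 2),
            nn.Tanh(),                    # landmark coordinates normalised to [-1, 1]
        )

    def forward(self, emotion_onehot, noise):
        x = torch.cat([emotion_onehot, noise], dim=-1)
        return self.net(x).view(-1, N_LANDMARKS, 2)

# Usage: generate landmarks for an emotional status (here index 2, e.g. "happy").
generator = LandmarkGenerator()
emotion = torch.nn.functional.one_hot(torch.tensor([2]), N_EMOTIONS).float()
landmarks = generator(emotion, torch.randn(1, 32))   # shape (1, 68, 2)
```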
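Claims 4 and 9 recite smoothing the change of the facial landmarks through a filter, executing a boundary check, and mapping the result to the face model parameters. A minimal sketch of such a post-processing stage follows, assuming an exponential-moving-average filter, a simple clamp as the boundary check, and a linear landmark-to-parameter mapping; the class name, parameter names and default values are hypothetical.

```python
import numpy as np

class ExpressionPostProcessor:
    """Hypothetical post-processing: smooth landmark changes, run a boundary check,
    and map the result onto face model parameters of the virtual performer."""

    def __init__(self, alpha=0.3, lower=-1.0, upper=1.0):
        self.alpha = alpha              # exponential-smoothing factor
        self.lower, self.upper = lower, upper
        self.state = None               # last smoothed landmarks

    def smooth(self, landmarks):
        # exponential moving average acts as the filter over successive frames
        if self.state is None:
            self.state = landmarks
        else:
            self.state = self.alpha * landmarks + (1 - self.alpha) * self.state
        return self.state

    def boundary_check(self, landmarks):
        # clamp coordinates that drift outside the allowed range (unnatural expressions)
        return np.clip(landmarks, self.lower, self.upper)

    def process(self, landmarks, mapping_matrix):
        smoothed = self.smooth(np.asarray(landmarks, dtype=float))
        checked = self.boundary_check(smoothed)
        # linear mapping from flattened landmark coordinates to face model parameters
        return mapping_matrix @ checked.reshape(-1)
```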
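Claims 5 and 10 add a safeguard: if the generated landmarks match an inappropriate expression feature, they must not be used and the face model parameters are initialized instead. The following sketch assumes the inappropriate features are stored as landmark templates and that a mean absolute difference below a threshold counts as a match; the function names, the 52-parameter neutral pose and the threshold value are illustrative only.

```python
import numpy as np

NEUTRAL_PARAMETERS = np.zeros(52)   # assumed neutral pose of the face model

def matches_inappropriate_feature(landmarks, templates, threshold=0.15):
    """Return True if the landmarks are close to any stored inappropriate template."""
    return any(np.mean(np.abs(landmarks - t)) < threshold for t in templates)

def apply_landmarks(landmarks, templates, mapping_matrix):
    """Map landmarks to face model parameters, unless they are inappropriate."""
    if matches_inappropriate_feature(landmarks, templates):
        # prohibit using these landmarks and re-initialise the face model parameters
        return NEUTRAL_PARAMETERS.copy()
    return mapping_matrix @ landmarks.reshape(-1)
```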
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an expression adjustment system and a method thereof, and more particularly to a virtual performer expression adjustment system with emotion awareness and a method thereof.

2. Description of the Related Art

In recent years, with the rapid development and widespread adoption of virtual technologies, various applications of virtual technologies have emerged rapidly. Among these applications, the economic value of virtual performers has attracted the most attention.

Generally, an existing virtual performer typically obtains motions and even expressions through motion capture technology. This conventional method allows the virtual performer to mimic human motions and expressions in real time, but the equipment required for motion capture is overly complex, demands a high degree of synchronization, and often requires post-processing, so the usability of the conventional method is greatly limited.

In view of this, some companies have proposed expression simulation technologies that pre-simulate various human expressions and apply them directly to virtual performers. However, this method can only vary expressions according to predefined workflows, which may lead to insufficient flexibility and usability of expressions; for example, the simulated expressions may lack richness and appear rigid. Furthermore, this method may fail to synchronize with the voice, causing dissonance and reducing realism; for example, angry tones may be paired with happy expressions. Therefore, this conventional method still fails to effectively resolve the lack of realism and richness in the expressions of virtual performers.

According to the above, what is needed is an improved solution to the problem of insufficient realism and richness in the expressions of virtual performers.

SUMMARY OF THE INVENTION

An objective of the present invention is to disclose a virtual performer expression adjustment system with emotion awareness and a method thereof, to solve the problem of insufficient realism and richness in the expressions of virtual performers.

To achieve the objective, the present invention discloses a virtual performer expression adjustment system with emotion awareness, and the virtual performer expression adjustment system includes an emotional voice database and a computer host. The emotional voice database is configured to store voice signals and emotion messages, wherein each of the voice signals corresponds to one of the emotion messages. The computer host is connected to the emotional voice database and includes a non-transitory computer-readable storage medium and a hardware processor. The non-transitory computer-readable storage medium is configured to store computer readable instructions.
The hardware processor is electrically connected to the non-transitory computer-readable storage medium, and is configured to execute the computer readable instructions to operate: loading the voice signals and the emotion messages corresponding to the loaded voice signals as training data from the emotional voice database, and inputting the training data into an artificial intelligence model to perform training to generate an emotion recognition model; receiving a user voice, performing feature extraction, standardization and dimensionality reduction processes on the user voice, and inputting the processed user voice into the emotion recognition model to obtain an emotional status; and executing a facial expression generation calculation to generate facial landmarks based on the emotional status, and adjusting face model parameters of a virtual performer based on the facial landmarks in real time, to dynamically display a facial expression of the virtual performer.

To achieve the objective, the present invention discloses a virtual performer expression adjustment method with emotion awareness, which includes steps of: connecting an emotional voice database to a computer host, wherein the emotional voice database stores voice signals and emotion messages, and each of the emotion messages corresponds to one of the voice signals; loading the voice signals and the emotion messages corresponding to the loaded voice signals as training data from the emotional voice database, and inputting the training data into an artificial intelligence model for training to generate an emotion recognition model, by the computer host; receiving a user voice, performing feature extraction, standardization and dimensionality reduction processes on the user voice, and inputting the processed user voice into the emotion recognition model to obtain an emotional status, by the computer host; and executing a facial expression generation calculation to generate facial landmarks based on the emotional status, and adjusting face model parameters of a virtual performer based on the facial landmarks in real time, for dynamically displaying a facial expression of the virtual performer, by the computer host. Acc
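As a concrete illustration of the "feature extraction, standardization and dimensionality reduction processes" described above, the following is a minimal sketch assuming MFCC features extracted with librosa and standardization/PCA from scikit-learn; the function name, the 16 kHz sample rate, and the 13/8 dimension choices are assumptions rather than part of the disclosure.

```python
import librosa                                   # assumed for MFCC feature extraction
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

def preprocess_user_voice(path, n_mfcc=13, n_components=8):
    """Feature extraction, standardization and dimensionality reduction of a user voice."""
    signal, sr = librosa.load(path, sr=16000)                     # load the user voice
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    frames = mfcc.T                                               # one row per frame
    frames = StandardScaler().fit_transform(frames)               # standardization
    # dimensionality reduction; assumes the utterance yields at least n_components frames
    return PCA(n_components=n_components).fit_transform(frames)

# The reduced feature matrix would then be fed to the trained emotion recognition
# model (with matching input dimensions) to obtain the emotional status.
```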