US-12627620-B2 - System of generative chatbot in real multi-person response situation and method thereof


Abstract

A system of generative chatbot in a real multi-person response situation and a method thereof are disclosed. In the system, speech signals are sensed and converted into feature vectors and text messages, a timing label and a classification label are embedded into the text messages, and the text messages are stored as a context message, so that a server-end host can determine a timing logic of a multi-person conversation. The context message and the timing logic are transmitted to an artificial intelligence device, which determines a current conversation stage and a topic evolution, predicts a conversation development, and actively generates a response message that is sent to and stored by the server-end host. The server-end host filters the response messages and transmits the filtered response message to a portable device for output. Therefore, the technical effect of improving conversational initiative and response efficiency can be achieved.

Inventors

Chuan-Cheng Chiu
Zhuo-Jia Bian

Assignees

SQ Technology (Shanghai) Corporation
INVENTEC CORPORATION

Dates

Publication Date: 20260512
Application Date: 20231207
Priority Date: 20230915

Claims (10)

1. A system of generative chatbot in a real multi-person response situation, comprising:
  an artificial intelligence device, configured to receive a context message and a timing logic corresponding to the context message through an application programming interface, input the context message and the timing logic to a large language model to generate at least one response message, and transmit the at least one response message through the application programming interface;
  a portable device, comprising:
    at least one sensor, configured to continuously sense speech signals;
    a speaker, configured to output a feedback speech;
    a storage device, configured to store feature vectors and text messages corresponding to the speech signals, wherein each of the text messages comprises a timing label and a classification label; and
    a speech processor, electrically connected to the sensor, the speaker, and the storage device, and configured to:
      convert the sensed speech signals into the feature vectors based on Mel-frequency cepstral coefficients, and classify the speech signals based on the feature vector;
      perform a speech-to-text (STT) process to convert the speech signals into the text messages;
      embed the timing label and the classification label into the text messages corresponding to the speech signals based on a timing relationship and a classification result, and store the text messages to the storage device as the context message; and
      when an on-demand chat message is received, perform a text-to-speech process to convert the on-demand chat message into the feedback speech, and output the feedback speech through the speaker; and
  a server-end host, connected to the artificial intelligence device and the portable device, and comprising:
    a non-transitory computer-readable storage medium configured to store computer readable instructions; and
    a hardware processor, electrically connected to the non-transitory computer-readable storage medium, and configured to execute the computer readable instructions to make the server-end host execute:
      continuously loading the context message from the storage device of the portable device, and determining a timing logic of a multi-person conversation based on the embedded timing label and the embedded classification label, wherein the timing logic comprises the number of people, a timing and a topic of conversation;
      transmitting the context message and the timing logic to the artificial intelligence device, receiving the response message from the artificial intelligence device, and storing the response message into a response list; and
      automatically selecting at least one of response messages in the response list as the on-demand chat message based on a personality parameter, and transmitting the on-demand chat message to the portable device.
2. The system of generative chatbot in real multi-person response situation according to claim 1, wherein the on-demand chat message is the response message randomly filtered out from the response list and matching the personality parameter, and the portable device is permitted to link with the server-end host to set the personality parameter.
3. The system of generative chatbot in real multi-person response situation according to claim 2, wherein the sensor is configured to sense at least one of user's physiological statuses, facial expressions and body movements to generate a user behavior message, the portable device transmits the user behavior message to the server-end host, and the server-end host determines a user's personality to set the personality parameter.
4. The system of generative chatbot in real multi-person response situation according to claim 2, wherein the portable device converts the speech signals of a user into the feature vectors and transmits the feature vectors to the server-end host, the server-end host compares the received feature vector with preset personality feature vectors to determine a personality of the user, and set the personality parameter based on the personality of the user.
5. The system of generative chatbot in real multi-person response situation according to claim 1, wherein the portable device comprises a display device configured to display the on-demand chat message synchronously when the speaker outputs the feedback speech.
6. A method of generative chatbot in a real multi-person response situation, comprising:
  connecting a server-end host to an artificial intelligence device and a portable device, wherein the artificial intelligence device receives a context message and a timing logic corresponding to the context message through an application programming interface, and transmits at least one response message;
  continuously sensing, by the portable device, speech signals through an at least one sensor, and converting the sensed speech signals into feature vectors based on Mel-frequency cepstral coefficients, and classifying the speech signals based on the feature vector;
  performing, by the portable device, a speech-to-text (STT) process to convert the speech signals into text messages;
  embedding, by the portable device, a timing label and a classification label into the text messages corresponding to the speech signals based on a timing relationship and a classification result, and storing the text messages to a storage device of the portable device as the context message;
  continuously loading the context message from the storage device of the portable device, and determining a timing logic of multi-person conversation based on the embedded timing label and the embedded classification label, wherein the timing logic comprises the number of people, a timing and a topic of conversation;
  transmitting, by the server-end host, the context message and the timing logic to the artificial intelligence device to input the context message and the timing logic into a large language model of the artificial intelligence device, to generate, by the artificial intelligence device, the response message, and transmitting, by the artificial intelligence device, the generated response message to the server-end host through the application programming interface;
  storing, by the server-end host, the response message to a response list, automatically selecting at least one of response messages in the response list as an on-demand chat message based on a personality parameter, and transmitting the on-demand chat message to the portable device; and
  when the portable device receives the on-demand chat message, performing, by the portable device, a text-to-speech process to convert the on-demand chat message into a feedback speech, and outputting the feedback speech through a speaker of the portable device.
7. The method of generative chatbot in real multi-person response situation according to claim 6, wherein the on-demand chat message is the response message randomly filtered out from the response list and matching the personality parameter, and the portable device is permitted to link with the server-end host to set the personality parameter.
8. The method of generative chatbot in real multi-person response situation according to claim 7, wherein the sensor is configured to sense at least one of user's physiological statuses, facial expressions and body movements to generate a user behavior message, the portable device transmits the user behavior message to the server-end host, and the server-end host determines a user's personality to set the personality parameter.
9. The method of generative chatbot in real multi-person response situation according to claim 7, wherein the portable device converts the speech signals of a user into the feature vectors and transmits the feature vectors to the server-end host, the server-end host compares the received feature vector with preset personality feature vectors to determine the personality of the user, and set the personality parameter based on the personality of the user.
10. The method of generative chatbot in real multi-person response situation according to claim 6, wherein the portable device comprises a display device configured to display the on-demand chat message synchronously when the speaker outputs the feedback speech.
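
For illustration only, the server-end steps recited in the claims above (deriving a timing logic from the labelled context message, then randomly selecting a response that matches the personality parameter) might be sketched in Python as follows. The dictionary keys `text`, `timing`, `speaker` and `tone`, both function names, and the word-frequency topic heuristic are assumptions made for this sketch; the claims do not specify data layouts or how the topic is determined.

```python
import random
from collections import Counter


def determine_timing_logic(context):
    """Derive number of people, speaking order and a rough topic.

    `context` is a list of dicts with hypothetical keys:
    'text' (transcript), 'timing' (timing label), 'speaker' (classification label).
    """
    ordered = sorted(context, key=lambda m: m["timing"])
    speakers = {m["speaker"] for m in ordered}
    # Placeholder topic heuristic (an assumption): most frequent word longer than 3 characters.
    words = Counter(
        w.lower().strip(".,?!")
        for m in ordered
        for w in m["text"].split()
        if len(w) > 3
    )
    topic = words.most_common(1)[0][0] if words else None
    return {
        "number_of_people": len(speakers),
        "timing": [m["speaker"] for m in ordered],
        "topic": topic,
    }


def select_on_demand(response_list, personality, rng=random):
    """Randomly pick one response whose hypothetical 'tone' tag matches the personality parameter."""
    matching = [r for r in response_list if r["tone"] == personality]
    return rng.choice(matching)["text"] if matching else None
```

In this sketch the response list is assumed to arrive already tagged with a tone; how responses are matched against the personality parameter in practice is not stated in the claims.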

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of Chinese Application Serial No. 202311196405.2, filed Sep. 15, 2023, which is hereby incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a chatbot system and a method thereof, and more particularly to a system of generative chatbot in a real multi-person response situation and a method thereof.

2. Description of the Related Art

In recent years, with the popularity and vigorous development of artificial intelligence, various artificial intelligence applications have sprung up. Among these applications, chatbots attract the most attention.

Generally speaking, a conventional chatbot holds a one-to-one conversation with a user; that is, the conventional chatbot responds only after the user sends a question. However, no conventional chatbot is able to actively give appropriate response suggestions or prompts in a real multi-person response situation. For example, in a multi-person conversation environment, the conventional chatbot is unable to proactively and quickly give users appropriate conversation suggestions. Therefore, the conventional chatbot has problems of poor chat initiative and response efficiency.

In view of the above, what is needed is an improved solution to the conventional problems of poor chat initiative and response efficiency.

SUMMARY OF THE INVENTION

An objective of the present invention is to disclose a system of generative chatbot in a real multi-person response situation and a method thereof, to solve the conventional problems.

In order to achieve the objective, the present invention provides a system of generative chatbot in a real multi-person response situation. The system includes an artificial intelligence device, a portable device and a server-end host.
The artificial intelligence device is configured to receive a context message and a timing logic corresponding to the context message through an application programming interface, input the context message and the timing logic to a large language model to generate at least one response message, and transmit the at least one response message through the application programming interface.

The portable device includes a sensor, a speaker, a storage device and a speech processor. The sensor is configured to continuously sense speech signals. The speaker is configured to output a feedback speech. The storage device is configured to store feature vectors and text messages corresponding to the speech signals, wherein each of the text messages comprises a timing label and a classification label. The speech processor is electrically connected to the sensor, the speaker, and the storage device, and is configured to: convert the sensed speech signals into the feature vectors based on Mel-frequency cepstral coefficients, and classify the speech signals; perform a speech-to-text (STT) process to convert the speech signals into the text messages; embed the timing label and the classification label into the text messages corresponding to the speech signals based on a timing relationship and a classification result, and store the text messages to the storage device as the context message; and, when an on-demand chat message is received, perform a text-to-speech process to convert the on-demand chat message into the feedback speech, and output the feedback speech through the speaker.

The server-end host is connected to the artificial intelligence device and the portable device, and includes a non-transitory computer-readable storage medium and a hardware processor. The non-transitory computer-readable storage medium is configured to store computer readable instructions.
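
As a hedged illustration of the labelling step attributed to the speech processor above, the following Python sketch attaches a timing label and a classification label to transcribed utterances to form a context message. The `TextMessage` class, the `embed_labels` function, the use of an ordinal index as the timing label, and the input tuple layout are all assumptions for this sketch; the text does not define the label formats, and the MFCC extraction, classification and speech-to-text steps are taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextMessage:
    text: str                  # transcript produced by the speech-to-text step
    timing_label: int          # assumed format: ordinal position in the conversation
    classification_label: str  # assumed format: speaker class from the classifier


def embed_labels(utterances):
    """Build a context message from (start_seconds, speaker_class, text) tuples.

    The tuple layout is hypothetical. Utterances are ordered by start time so
    that the timing label reflects their timing relationship.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    return [
        TextMessage(text=text, timing_label=index, classification_label=speaker)
        for index, (_, speaker, text) in enumerate(ordered)
    ]
```

A call such as `embed_labels([(2.5, "speaker_B", "Sounds good"), (0.0, "speaker_A", "Lunch?")])` would yield two labelled messages in chronological order, which a server-end host could then load as the context message.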
The hardware processor is electrically connected to the non-transitory computer-readable storage medium, and configured to execute the computer readable instructions to make the server-end host execute: continuously loading the context message from the storage device of the portable device, and determining a timing logic of a multi-person conversation based on the embedded timing label and the classification label, wherein the timing logic comprises the number of people, a timing and a topic of conversation; transmitting the context message and the timing logic to the artificial intelligence device, receiving the response message from the artificial intelligence device, and storing the response message into a response list; and automatically selecting at least one of the response messages in the response list as the on-demand chat message, and transmitting the on-demand chat message to the portable device.

In order to achieve the objective, the present invention provides a method of generative chatbot in a real multi-person response situation, and the method includes steps of: connecting the server-end host to an artificial intelligenc