EP-4742685-A1 - AVATAR CONTROL METHOD, APPARATUS, AND RELATED DEVICE

Abstract

This application provides a virtual avatar control method, including: obtaining live streaming interaction data of a live streaming room, where the live streaming interaction data includes live streaming bullet comment data and live streamer speech data, and the live streaming room includes a virtual avatar and a live streamer; determining a target to-be-answered question based on the live streaming bullet comment data, where the live streaming bullet comment data is sent by a viewer in the live streaming room; determining target interaction content based on the target to-be-answered question, where the target interaction content includes reply content of the target to-be-answered question; determining target interaction data based on the target interaction content, where the target interaction data is for displaying the target interaction content to the viewer in the live streaming room by using the virtual avatar, and the target interaction data includes target audio data and target video data; and adding the target interaction data to live streaming data of the live streaming room. In this way, interaction with the viewer is implemented by using the virtual avatar, so that a response can be made to the interaction between the viewer and the live streamer in a timely manner, to improve live streaming watching experience. This application further provides a corresponding apparatus and a related device.

Inventors

ZHANG, Huilan
LI, Chengyuan
YANG, Changpeng

Assignees

Huawei Cloud Computing Technologies Co., Ltd.

Dates

Publication Date: 2026-05-13
Application Date: 2024-02-07

Claims (17)

A virtual avatar control method, wherein the method is applied to a virtual avatar control apparatus, the virtual avatar control apparatus is located in at least one data center, and the method comprises: obtaining live streaming interaction data of a live streaming room, wherein the live streaming interaction data comprises live streaming bullet comment data and live streamer speech data, and the live streaming room comprises a virtual avatar and a live streamer; determining a target to-be-answered question based on the live streaming bullet comment data, wherein the live streaming bullet comment data is sent by a viewer in the live streaming room; determining target interaction content based on the target to-be-answered question, wherein the target interaction content comprises reply content of the target to-be-answered question; determining target interaction data based on the target interaction content, wherein the target interaction data is for displaying the target interaction content to the viewer in the live streaming room by using the virtual avatar, and the target interaction data comprises target audio data and target video data; and adding the target interaction data to live streaming data of the live streaming room.
The method according to claim 1, wherein the determining the target to-be-answered question based on the live streaming bullet comment data comprises: determining at least one to-be-answered question based on the live streaming bullet comment data; and determining the target to-be-answered question from the at least one to-be-answered question based on the live streamer speech data, wherein the live streamer speech data does not comprise audio data corresponding to the reply content of the target to-be-answered question.
The method according to claim 1 or 2, wherein before the adding the target interaction data to the live streaming data of the live streaming room, the method further comprises: determining that there is a pause in a live streamer speech in the live streamer speech data.
The method according to any one of claims 1 to 3, wherein the virtual avatar control apparatus comprises a virtual avatar knowledge base, and the determining the target to-be-answered question based on the live streaming bullet comment data comprises: determining a bullet comment intention information set by clustering the live streaming bullet comment data; and determining the target to-be-answered question based on the bullet comment intention information set; and the determining the target interaction content based on the target to-be-answered question comprises: determining the reply content of the target to-be-answered question from the virtual avatar knowledge base based on the target to-be-answered question; and determining the target interaction content based on the reply content.
The method according to claim 4, wherein the method further comprises: updating the virtual avatar knowledge base based on the target to-be-answered question and audio data for answering the target to-be-answered question.
The method according to any one of claims 1 to 5, wherein the determining the target interaction data based on the target interaction content comprises: generating the corresponding target audio data based on the target interaction content; and generating the target video data based on the virtual avatar and the target audio data, wherein the target video data is video data of the virtual avatar producing audio corresponding to the target interaction content.
The method according to claim 6, wherein the generating the corresponding target audio data based on the target interaction content comprises: determining a live streamer emotion category by analyzing the live streamer speech data; and generating, based on the target interaction content, the target audio data that matches the live streamer emotion category.
The method according to any one of claims 1 to 7, wherein the method further comprises: determining live streamer instruction information based on the live streamer speech data; determining instruction interaction data based on the live streamer instruction information, wherein the instruction interaction data is for displaying, by using the virtual avatar to the viewer in the live streaming room, an action completed by the virtual avatar as instructed by the live streamer instruction information; and adding the instruction interaction data to the live streaming data of the live streaming room.
The method according to any one of claims 1 to 8, wherein the method further comprises: determining target subtitle data based on the live streamer speech data; and adding the target subtitle data to the live streaming data of the live streaming room.
The method according to any one of claims 1 to 9, wherein the method further comprises: determining a live streaming room operation instruction by parsing the live streamer speech data; and adjusting the live streaming data of the live streaming room based on the live streaming room operation instruction.
A virtual avatar control apparatus, wherein the virtual avatar control apparatus is located in at least one data center, and the apparatus comprises: an obtaining module, configured to obtain live streaming interaction data of a live streaming room, wherein the live streaming interaction data comprises live streaming bullet comment data and live streamer speech data, and the live streaming room comprises a virtual avatar and a live streamer; a question determining module, configured to determine a target to-be-answered question based on the live streaming bullet comment data, wherein the live streaming bullet comment data is sent by a viewer in the live streaming room; an interaction content determining module, configured to determine target interaction content based on the target to-be-answered question and the live streaming interaction data, wherein the target interaction content comprises reply content of the target to-be-answered question; an interaction data determining module, configured to determine target interaction data based on the target interaction content, wherein the target interaction data is for displaying the target interaction content to the viewer in the live streaming room by using the virtual avatar, and the target interaction data comprises target audio data and target video data; and a live streaming data adjustment module, configured to add the target interaction data to live streaming data of the live streaming room.
The apparatus according to claim 11, wherein the question determining module is specifically configured to: determine at least one to-be-answered question based on the live streaming bullet comment data; and determine the target to-be-answered question from the at least one to-be-answered question based on the live streamer speech data, wherein the live streamer speech data does not comprise audio data corresponding to the reply content of the target to-be-answered question.
The apparatus according to claim 11 or 12, wherein the question determining module is further configured to determine that there is a pause in a live streamer speech in the live streamer speech data.
The apparatus according to any one of claims 11 to 13, wherein the virtual avatar control apparatus comprises a virtual avatar knowledge base; the question determining module is configured to: determine a bullet comment intention information set by clustering the live streaming bullet comment data; and determine the target to-be-answered question based on the bullet comment intention information set; and the interaction content determining module is configured to: determine the reply content of the target to-be-answered question from the virtual avatar knowledge base based on the target to-be-answered question; and determine the target interaction content based on the reply content.
A computing device cluster, wherein the computing device cluster comprises at least one computing device, and each computing device comprises a processor and a memory, wherein the memory is configured to store instructions; and the processor is configured to cause, based on the instructions, the computing device cluster to perform the method according to any one of claims 1 to 10.
A computer-readable storage medium, wherein the computer-readable storage medium stores instructions, and when the instructions are run on a computing device, the computing device is caused to perform the method according to any one of claims 1 to 10.
A computer program product comprising instructions, wherein when the computer program product runs on a computing device, the computing device is caused to perform the method according to any one of claims 1 to 10.
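As an illustration of claims 2 to 4 only, the following Python sketch shows one way the question-selection steps could fit together. Everything in it is an assumption made for illustration and is not taken from the application: word-set Jaccard similarity with greedy grouping stands in for the unspecified clustering of bullet comments, a plain dictionary stands in for the virtual avatar knowledge base, a text transcript stands in for the live streamer speech data, and per-frame energy values stand in for the audio used to detect a pause. All function names and thresholds are hypothetical.

```python
import re
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple


def tokens(text: str) -> FrozenSet[str]:
    """Lowercased word set; a crude, assumed stand-in for intent features."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_comments(comments: Sequence[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass grouping: a comment joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    clusters: List[List[str]] = []
    for comment in comments:
        for cluster in clusters:
            if jaccard(tokens(comment), tokens(cluster[0])) >= threshold:
                cluster.append(comment)
                break
        else:
            clusters.append([comment])
    return clusters


def pick_target_question(
    comments: Sequence[str],
    knowledge_base: Dict[str, str],  # hypothetical: question text -> reply text
    streamer_transcript: str,        # hypothetical: text form of the speech data
    threshold: float = 0.5,
    answered_ratio: float = 0.8,
) -> Optional[Tuple[str, str]]:
    """Return (question, reply) for the most-asked question that has a
    knowledge-base reply the streamer has not already spoken, else None."""
    spoken = tokens(streamer_transcript)
    ranked = sorted(cluster_comments(comments, threshold), key=len, reverse=True)
    for cluster in ranked:
        question = cluster[0]
        best_key = max(
            knowledge_base,
            key=lambda k: jaccard(tokens(k), tokens(question)),
            default=None,
        )
        if best_key is None or jaccard(tokens(best_key), tokens(question)) < threshold:
            continue  # no usable reply in the knowledge base
        reply = knowledge_base[best_key]
        reply_words = tokens(reply)
        if reply_words and len(reply_words & spoken) / len(reply_words) >= answered_ratio:
            continue  # the streamer appears to have answered this already
        return question, reply
    return None


def has_trailing_pause(
    frame_energies: Sequence[float],
    silence_threshold: float = 0.01,
    min_silent_frames: int = 25,
) -> bool:
    """True if the most recent `min_silent_frames` frames are all quiet."""
    if len(frame_energies) < min_silent_frames:
        return False
    return all(e < silence_threshold for e in frame_energies[-min_silent_frames:])
```

Ranking clusters by size is a design choice of this sketch; the claims do not state how the target question is chosen from the intention information set.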

Description

This application claims priority to Chinese Patent Application No. 202310973189.1, filed with the China National Intellectual Property Administration on August 3, 2023 and entitled "VIRTUAL AVATAR CONTROL METHOD, COMPUTING CLUSTER, AND RELATED DEVICE", and further claims priority to Chinese Patent Application No. 202311638844.4, filed with the China National Intellectual Property Administration on December 1, 2023 and entitled "VIRTUAL AVATAR CONTROL METHOD AND APPARATUS, AND RELATED DEVICE", both of which are incorporated herein by reference in their entireties.

TECHNICAL FIELD

This application relates to the field of media technologies, and in particular, to a virtual avatar control method and apparatus, and a related device.

BACKGROUND

With the continuous evolution and development of computer technologies, live streaming is increasingly widely used. In a live streaming scenario, a live streamer may upload a live streaming data stream via a terminal device. The live streaming data stream may be forwarded via a live streaming server (or another device) to a client corresponding to a live streaming viewer. The live streaming data stream may include a video data stream and/or an audio data stream.
In the live streaming scenario, the live streamer needs to display live streaming content to users. In addition, in some scenarios, the live streamer is further responsible for managing a live streaming room, interacting with another live streamer, interacting with viewers, and other tasks. With a heavy workload in live streaming, it may be difficult for the live streamer to balance all aspects.

SUMMARY

In view of this, embodiments of this application provide a virtual avatar control method, to provide a systematic solution to a problem that it is difficult for a live streamer to balance all aspects in a live streaming scenario.
This application further provides a corresponding apparatus, a computing device cluster, a computer-readable storage medium, and a computer program product.
According to a first aspect, this application provides a virtual avatar control method. The virtual avatar control method may be implemented by a virtual avatar control apparatus in a live streaming system, where a virtual avatar may be used to assist in live streaming. The virtual avatar control apparatus may be deployed in at least one data center.
Specifically, during live streaming in a live streaming room, the virtual avatar control apparatus may obtain live streaming interaction data. The live streaming interaction data includes live streaming bullet comment data, that is, bullet comment data sent by a viewer to a live streamer. In addition, the live streaming interaction data further includes live streamer speech data, that is, sound data produced by the live streamer in the live streaming room during the live streaming.
Then, the virtual avatar control apparatus may determine a target to-be-answered question based on the live streaming bullet comment data, and determine target interaction content based on the target to-be-answered question and the live streaming interaction data. The target interaction content includes reply content of the target to-be-answered question. Then, in the virtual avatar control method, target interaction data may be determined based on the target interaction content, and the target interaction data may be added to live streaming data of the live streaming room. The target interaction data is for displaying the target interaction content to the viewer in the live streaming room by using the virtual avatar. When viewing the live streaming data to which the target interaction data is added, the viewer may not only see the live streaming data corresponding to the live streamer, but also see a process in which the virtual avatar displays the target interaction content to the viewer.
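The order of the steps in the first aspect can be sketched in Python as follows. This is a minimal outline under stated assumptions, not the implementation described in the application: the data classes, the `run_avatar_step` function, and all four stub functions are hypothetical, the speech data is represented by a text transcript, and the speech synthesis and avatar rendering steps are replaced by placeholders that only tag bytes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class InteractionData:
    """Hypothetical container for live streaming interaction data."""
    bullet_comments: List[str]
    streamer_speech: str  # transcript stand-in for live streamer speech data


@dataclass
class TargetInteractionData:
    """Target audio data plus target video data of the avatar speaking."""
    audio: bytes
    video: bytes


@dataclass
class LiveStream:
    """Hypothetical stand-in for the live streaming data of a room."""
    segments: List[TargetInteractionData] = field(default_factory=list)


def run_avatar_step(
    data: InteractionData,
    stream: LiveStream,
    select_question: Callable[[InteractionData], Optional[str]],
    find_reply: Callable[[str], str],
    synthesize_audio: Callable[[str], bytes],
    render_avatar_video: Callable[[bytes], bytes],
) -> Optional[TargetInteractionData]:
    """Run one pass: question -> content -> audio/video -> add to stream."""
    question = select_question(data)       # target to-be-answered question
    if question is None:
        return None                        # nothing to answer this pass
    content = find_reply(question)         # target interaction content
    audio = synthesize_audio(content)      # target audio data
    video = render_avatar_video(audio)     # target video data
    result = TargetInteractionData(audio=audio, video=video)
    stream.segments.append(result)         # add to the live streaming data
    return result


# Placeholder stubs, assumed for illustration only.
def stub_select_question(data: InteractionData) -> Optional[str]:
    """Pick the first bullet comment that ends with a question mark."""
    return next((c for c in data.bullet_comments if c.endswith("?")), None)


def stub_find_reply(question: str) -> str:
    return "Reply to: " + question


def stub_synthesize_audio(text: str) -> bytes:
    return text.encode("utf-8")  # a real system would produce speech audio


def stub_render_avatar_video(audio: bytes) -> bytes:
    return b"avatar-frames:" + audio  # a real system would render video


if __name__ == "__main__":
    room_stream = LiveStream()
    room_data = InteractionData(
        bullet_comments=["nice stream", "How long is the warranty?"],
        streamer_speech="today we look at the new phone",
    )
    run_avatar_step(
        room_data, room_stream, stub_select_question, stub_find_reply,
        stub_synthesize_audio, stub_render_avatar_video,
    )
    print(len(room_stream.segments))
```

Passing each stage in as a callable keeps the outline independent of any particular question-selection, speech-synthesis, or rendering technique, none of which is specified at this point in the text.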
During the live streaming, the virtual avatar control apparatus may actively obtain the live streaming interaction data of the live streaming room, determine corresponding target interaction content based on a question asked by using a bullet comment, and display the target interaction content to the viewer in the live streaming room by using the virtual avatar. In this way, the target interaction content that needs to be exchanged may be automatically determined based on the live streaming interaction data, and the target interaction content may be further displayed to the viewer by using the virtual avatar. Automatic interaction with the viewer is implemented by using the virtual avatar, so that a response can be made to the interaction between the viewer and the live streamer in a timely manner, to improve live streaming watching experience. In addition, the virtual avatar is used to assist in the live streaming, so that the live streamer does not need to focus mainly on the interaction with the viewer, but pays more attention to live streaming content, to improve live streaming efficiency of the live streamer. In some possible im