EP-4742106-A1 - QUESTION-ANSWER METHOD AND RELATED APPARATUS
Abstract
A question-answering method is provided, applied to a terminal device. A large language model is deployed on the terminal device. The method includes: The large language model generates answer content according to a first user instruction, and the terminal device broadcasts the answer content; during broadcasting of the answer content, when a second user instruction is detected, the broadcasting is interrupted and a processing instruction is sent to the large language model, where, according to the processing instruction, the large language model interrupts generation of the answer content, or continues the generation of the answer content so that the continuously generated content is cached; the broadcasting of the answer content is interrupted in response to a broadcast interruption request; and unbroadcast content is broadcast when a third user instruction that instructs to continue broadcasting is detected, where the unbroadcast content is obtained by the large language model through generation based on the first user instruction and broadcasted content, or is obtained by reading the cached continuously generated content. In this application, based on the generating-while-broadcasting feature of the large language model, the generation process of the large language model is controlled, so that the answer content can be continuously broadcast after being interrupted.
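The generating-while-broadcasting feature that the abstract relies on can be illustrated with a minimal sketch: a streaming generator stands in for the large language model, and on interruption the broadcaster stops emitting while generation continues into a cache of unbroadcast content. All names here are illustrative, not from the patent.

```python
def generate_tokens(prompt):
    # Stand-in for a streaming large language model: yields the
    # answer one token at a time (a real model streams similarly).
    for word in f"answer to: {prompt}".split():
        yield word

def broadcast_until_interrupt(tokens, interrupt_after=None):
    """Broadcast tokens as they arrive; after an 'interruption',
    keep generating but divert output into a cache so broadcasting
    can resume later from the interruption position."""
    broadcasted, cached = [], []
    for i, tok in enumerate(tokens):
        if interrupt_after is not None and i >= interrupt_after:
            cached.append(tok)        # generation continues; output is cached
        else:
            broadcasted.append(tok)   # normally this would drive TTS output
    return broadcasted, cached

spoken, pending = broadcast_until_interrupt(
    generate_tokens("why is the sky blue"), interrupt_after=2
)
# 'pending' holds the unbroadcast content, ready for a later
# "continue broadcasting" instruction
```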
Inventors
- Chen, Qimeng
- Song, Kaikai
- Dong, Qinghao
- Tan, Binlin
- Xia, Jie
Assignees
- Huawei Technologies Co., Ltd.
Dates
- Publication Date
- 2026-05-13
- Application Date
- 2024-09-03
Claims (20)
- A question-answering method, applied to a terminal device, wherein a large language model is deployed on the terminal device, and the method comprises: broadcasting answer content, wherein the answer content is generated by the large language model according to a first user instruction; during broadcasting of the answer content, when a second user instruction is detected, generating a broadcast interruption request and sending a processing instruction to the large language model, wherein the processing instruction instructs the large language model to interrupt generation of the answer content, or the processing instruction instructs the large language model to continue the generation of the answer content to cache continuously generated content; interrupting the broadcasting of the answer content in response to the broadcast interruption request; and broadcasting unbroadcast content in the answer content when a third user instruction is detected, wherein the third user instruction instructs to continue broadcasting the answer content, and the unbroadcast content is obtained by the large language model through generation based on the first user instruction and broadcasted content, or the unbroadcast content is obtained by reading the cached continuously generated content.
- The method according to claim 1, wherein when the unbroadcast content is obtained by the large language model through generation based on the first user instruction and the broadcasted content, the first user instruction and/or the broadcasted content are/is obtained through caching when the large language model interrupts the generation of the answer content.
- The method according to claim 1 or 2, wherein when the unbroadcast content is obtained by the large language model through generation based on the first user instruction, a context of the first user instruction, and the broadcasted content, the context of the first user instruction is obtained through caching when the large language model interrupts the generation of the answer content.
- The method according to any one of claims 1 to 3, wherein when the second user instruction is detected, generating the broadcast interruption request and sending the processing instruction to the large language model comprises: when the second user instruction is detected, in response to the second user instruction, after determining that a user intention corresponding to the second user instruction meets a preset condition, generating the broadcast interruption request and sending the processing instruction to the large language model.
- The method according to claim 4, wherein the preset condition is a preset trustlist, and after it is determined that the user intention corresponding to the second user instruction is an intention in the preset trustlist, the broadcast interruption request is generated, and the processing instruction is sent to the large language model.
- The method according to claim 4 or 5, wherein determining that the user intention corresponding to the second user instruction meets the preset condition comprises: determining, according to the second user instruction by using a natural language understanding model, the user intention corresponding to the second user instruction; and determining whether the user intention meets the preset condition.
- The method according to any one of claims 1 to 6, wherein the first user instruction and/or the second user instruction and/or the third user instruction are/is obtained by performing voice analysis on audio input by a user, or is obtained by processing, according to a preset processing rule, a text input by the user.
- A question-answering method, applied to a question-answering system, wherein the question-answering system comprises a terminal device and a cloud device, the terminal device is communicatively connected to the cloud device, a large language model is deployed on the cloud device, and the method comprises: sending, by the terminal device, a first user instruction to the cloud device when the first user instruction is detected; generating, by the cloud device, answer content by using the large language model after receiving the first user instruction; receiving and broadcasting, by the terminal device, the answer content, and during broadcasting of the answer content, sending, by the terminal device, a second user instruction to the cloud device when the second user instruction is detected; generating, by the cloud device, a broadcast interruption request after receiving the second user instruction, sending the broadcast interruption request to the terminal device, and sending a processing instruction to the large language model, wherein the processing instruction instructs the large language model to interrupt generation of the answer content, or the processing instruction instructs the large language model to continue the generation of the answer content, so that the cloud device or the terminal device caches continuously generated content; receiving, by the terminal device, the broadcast interruption request, and interrupting the broadcasting of the answer content; sending, by the terminal device, a third user instruction to the cloud device when the third user instruction is detected, wherein the third user instruction instructs to continue broadcasting the answer content; receiving, by the cloud device, the third user instruction, determining unbroadcast content in the answer content that is generated by the large language model based on the first user instruction and broadcasted content, and sending the unbroadcast content to the terminal device; or 
obtaining, by the cloud device or the terminal device, the unbroadcast content in the answer content by reading the cached continuously generated content; and broadcasting, by the terminal device, the unbroadcast content.
- The method according to claim 8, further comprising: caching, by the cloud device or the terminal device, the first user instruction and/or the broadcasted content when the large language model interrupts the generation of the answer content, so that the large language model obtains the unbroadcast content through generation based on the first user instruction and the broadcasted content.
- The method according to claim 8 or 9, further comprising: caching, by the cloud device or the terminal device, a context of the first user instruction when the large language model interrupts the generation of the answer content, so that the large language model obtains the unbroadcast content through generation based on the first user instruction, the context of the first user instruction, and the broadcasted content.
- The method according to any one of claims 8 to 10, wherein generating, by the cloud device, the broadcast interruption request after receiving the second user instruction comprises: receiving, by the cloud device, the second user instruction, and generating the broadcast interruption request after determining that a user intention corresponding to the second user instruction meets a preset condition.
- The method according to claim 11, wherein the preset condition is a preset trustlist, and determining that the user intention corresponding to the second user instruction meets the preset condition comprises: determining that the user intention corresponding to the second user instruction is an intention in the preset trustlist.
- The method according to claim 11 or 12, wherein determining that the user intention corresponding to the second user instruction meets the preset condition comprises: determining, according to the second user instruction by using a natural language understanding model, the user intention corresponding to the second user instruction; and determining whether the user intention meets the preset condition.
- The method according to any one of claims 8 to 13, wherein the first user instruction and/or the second user instruction and/or the third user instruction are/is obtained by performing voice analysis on audio input by a user, or is obtained by processing, according to a preset processing rule, a text input by the user.
- A question-answering method, applied to a cloud device, wherein a large language model is deployed on the cloud device, and the method comprises: in response to a first user instruction, generating answer content by using the large language model; during broadcasting of the answer content, generating a broadcast interruption request in response to a second user instruction, and sending a processing instruction to the large language model, wherein the processing instruction instructs the large language model to interrupt generation of the answer content, or the processing instruction instructs the large language model to continue the generation of the answer content, so that continuously generated content is cached, and the broadcast interruption request is configured to interrupt the broadcasting of the answer content; and in response to a third user instruction, determining unbroadcast content in the answer content that is generated by the large language model based on the first user instruction and broadcasted content, or obtaining the unbroadcast content in the answer content by reading the cached continuously generated content, so that the unbroadcast content is broadcast, wherein the third user instruction instructs to continue broadcasting the answer content.
- The method according to claim 15, further comprising: caching the first user instruction and/or the broadcasted content when the large language model interrupts the generation of the answer content, so that the large language model obtains the unbroadcast content through generation based on the first user instruction and the broadcasted content.
- The method according to claim 15 or 16, further comprising: caching a context of the first user instruction when the large language model interrupts the generation of the answer content, so that the large language model obtains the unbroadcast content through generation based on the first user instruction, the context of the first user instruction, and the broadcasted content.
- The method according to any one of claims 15 to 17, wherein generating the broadcast interruption request in response to the second user instruction comprises: in response to the second user instruction, generating the broadcast interruption request after determining that a user intention corresponding to the second user instruction meets a preset condition.
- The method according to claim 18, wherein the preset condition is a preset trustlist, and determining that the user intention corresponding to the second user instruction meets the preset condition comprises: determining that the user intention corresponding to the second user instruction is an intention in the preset trustlist.
- The method according to claim 18 or 19, wherein determining that the user intention corresponding to the second user instruction meets the preset condition comprises: determining, according to the second user instruction by using a natural language understanding model, the user intention corresponding to the second user instruction; and determining whether the user intention meets the preset condition.
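Claims 4 to 6, 11 to 13, and 18 to 20 gate the broadcast interruption on a user intention, derived by a natural language understanding model, matching a preset trustlist. The following is a minimal sketch of that gating logic; the keyword-matching "NLU" and all names are hypothetical stand-ins for a trained intent classifier.

```python
# Toy stand-in for the natural language understanding model; a real
# system would classify intents with a trained model.
INTENT_KEYWORDS = {
    "interrupt_broadcast": ("stop", "pause", "wait"),
    "continue_broadcast": ("continue", "go on", "resume"),
}

# Preset trustlist of intents allowed to trigger a broadcast
# interruption request (the "preset condition" of the claims).
INTERRUPT_TRUSTLIST = {"interrupt_broadcast"}

def detect_intent(utterance):
    """Map an utterance to an intent label by keyword lookup."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in text for k in keywords):
            return intent
    return "unknown"

def should_interrupt(second_user_instruction):
    """Generate a broadcast interruption request only when the
    detected intent is in the preset trustlist."""
    return detect_intent(second_user_instruction) in INTERRUPT_TRUSTLIST

should_interrupt("please stop for a second")   # True
should_interrupt("what a nice answer")         # False
```

Gating on a trustlist means casual side remarks by the user do not tear down the broadcast; only recognized interruption intents do.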
Description
This application claims priority to Chinese Patent Application No. 202311764553X, filed with the China National Intellectual Property Administration on December 20, 2023 and entitled "QUESTION-ANSWERING METHOD AND RELATED APPARATUS", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
This application relates to the field of artificial intelligence technologies, and in particular, to a question-answering method and a related apparatus.
BACKGROUND
With the development of artificial intelligence technologies, question-answering technologies have become increasingly mature. In an existing question-answering method, broadcasting of current answer content may be interrupted according to a user instruction, to start a new dialog. However, when a user wants to continue broadcasting the interrupted answer content, because only adjacent dialogs are associated in the existing question-answering method, a related instruction of the user for continuing the broadcasting cannot be recognized. In other words, the user has to ask the question again, the answer content is regenerated and broadcast from the beginning, and broadcasting cannot resume at the interruption position, resulting in poor user experience.
SUMMARY
To resolve the foregoing problem, embodiments of this application provide a question-answering method and a related apparatus. Corresponding processing is performed after broadcasting is interrupted, so that answer content can be continuously broadcast after being interrupted. Therefore, the following technical solutions are used in embodiments of this application. According to a first aspect, an embodiment of this application provides a question-answering method. The method is applied to a terminal device, and a large language model is deployed on the terminal device.
The method mainly includes: The large language model generates answer content according to a first user instruction, and the terminal device broadcasts the answer content; during broadcasting of the answer content, the terminal device detects a second user instruction, generates a broadcast interruption request, and sends a processing instruction to the large language model, where, according to the processing instruction, the large language model interrupts generation of the answer content, or continues the generation of the answer content, and the terminal device or the cloud device caches the continuously generated content; the terminal device interrupts the broadcasting of the answer content after receiving the broadcast interruption request; and a third user instruction is detected, where the third user instruction instructs to continue broadcasting the answer content, the terminal device broadcasts unbroadcast content in the answer content, and the unbroadcast content is obtained by the large language model through generation based on the first user instruction and broadcasted content, or the unbroadcast content is obtained by reading the cached continuously generated content. In other words, this embodiment of this application provides a question-answering method based on the generating-while-broadcasting feature of the large language model, so that the answer content can be continuously broadcast after being interrupted. From the perspective of a user, the user sends the first user instruction to obtain the answer content for broadcasting. During broadcasting of the answer content, the user sends the second user instruction, and the broadcasting of the answer content needs to be interrupted to execute the second user instruction or another instruction.
After several user instructions including the second user instruction are executed, the user may send the third user instruction to continue broadcasting the unbroadcast part of the interrupted answer content. In this embodiment of this application, to implement continuous broadcasting, after the second user instruction is received, the broadcast interruption request and the processing instruction are generated. First, the broadcast interruption request may control broadcasting of the current answer content to be interrupted, to respond to another instruction of the user. Second, the processing instruction may control interruption of the generation of the answer content while the broadcasting is interrupted, and record the first user instruction and the broadcasted content, so that when the user instruction for continuing broadcasting appears, the unbroadcast content is determined based on the recorded first user instruction and the broadcasted content; or the generation of the answer content is not interrupted, that is, the generation of the answer content continues, and the unbroadcast content is recorded from the interruption position, so that the unbroadcast content can be directly obtained when the user instruction for continuing broadcasting appears. In this way, the continuous broadcasting of the answer content is implemented.
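The two processing strategies described above can be sketched together: on interruption either stop generation and cache the prompt and broadcasted prefix for later regeneration, or keep generating and cache the unbroadcast tail directly. This is a minimal illustration under stated assumptions; the class, the `toy_model` callable, and all member names are hypothetical, not from the patent.

```python
class AnswerSession:
    """Minimal sketch of the two resume strategies in the description."""

    def __init__(self, model, prompt):
        self.model = model            # callable: (prompt, prefix) -> remaining text
        self.prompt = prompt          # cached first user instruction
        self.broadcasted = ""         # cached broadcasted content
        self.cached_continuation = "" # content generated after interruption

    def interrupt(self, keep_generating):
        if keep_generating:
            # Strategy 2: generation continues; cache the unbroadcast tail.
            self.cached_continuation = self.model(self.prompt, self.broadcasted)
        # Strategy 1 needs no work here: the prompt and broadcasted
        # content are already cached for later regeneration.

    def resume(self):
        # Prefer the cached continuation; otherwise regenerate the
        # unbroadcast content from the prompt and broadcasted prefix.
        if self.cached_continuation:
            return self.cached_continuation
        return self.model(self.prompt, self.broadcasted)

def toy_model(prompt, prefix):
    # Deterministic stand-in for the large language model: returns
    # whatever part of the full answer follows the given prefix.
    full = f"answer to {prompt}"
    return full[len(prefix):]

session = AnswerSession(toy_model, "q1")
session.broadcasted = "answer to"
session.interrupt(keep_generating=False)
session.resume()  # regenerates the remaining " q1"
```

Strategy 1 trades extra generation cost at resume time for zero background work during the interruption; Strategy 2 does the opposite, which matters on a terminal device with limited compute.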