CN-121996747-A - Response generating device and method


Abstract

A response generation device and method. The device stores a language model and a response verification model. The device generates a first response corresponding to a user dialogue based on the user dialogue and the language model. The device determines whether the first response corresponds to a failed verification state based on the user dialogue, a plurality of verification indexes, and the response verification model. In response to determining that the first response corresponds to the failed verification state, the device generates a second response corresponding to the user dialogue. The response generation technique provided by the invention can ensure the safety of the response finally provided to the user.

Inventors

CAI CHENGHAN
Peng Yushao

Assignees

HTC Corporation (宏达国际电子股份有限公司)

Dates

Publication Date: 20260508
Application Date: 20251107
Priority Date: 20241107

Claims (10)

1. A response generation device, comprising: a memory storing a language model and a response verification model; a transceiver interface; and a processor electrically connected to the memory and the transceiver interface and configured to perform the following operations: generating a first response corresponding to a user dialogue based on the user dialogue and the language model; determining whether the first response corresponds to a failed verification state based on the user dialogue, a plurality of verification indexes, and the response verification model; and in response to determining that the first response corresponds to the failed verification state, generating a second response corresponding to the user dialogue.
2. The response generation device of claim 1, wherein the processor further performs the following operations: determining, by the response verification model, a verification result of the first response for each of the plurality of verification indexes; and in response to at least one of the plurality of verification results being a failed state, determining that the first response corresponds to the failed verification state.
3. The response generation device of claim 1, wherein the second response is generated by the following operation: generating the second response corresponding to the user dialogue based on the user dialogue and the language model, wherein the second response is different from the first response.
4. The response generation device of claim 2, wherein the second response is generated by the following operations: in response to determining that the first response corresponds to the failed verification state, generating, by the response verification model, feedback corresponding to the first response based on the verification result for each of the plurality of verification indexes, wherein the feedback indicates at least one of the plurality of verification indexes whose verification result is the failed state; and generating the second response corresponding to the user dialogue based on the user dialogue, the feedback, and the language model.
5. The response generation device of claim 2, wherein the memory further stores a response modification model, and the second response is generated by the following operations: in response to determining that the first response corresponds to the failed verification state, generating, by the response verification model, feedback corresponding to the first response based on the verification result for each of the plurality of verification indexes, wherein the feedback indicates at least one of the plurality of verification indexes whose verification result is the failed state; and generating the second response corresponding to the user dialogue based on the first response, the feedback, and the response modification model.
6. The response generation device of claim 1, wherein the processor further performs the following operations: determining whether the second response corresponds to the failed verification state based on the user dialogue, the plurality of verification indexes, and the response verification model; and in response to determining that the second response corresponds to the failed verification state, generating a third response corresponding to the user dialogue, wherein the third response is different from the second response.
7. The response generation device of claim 1, wherein the processor further performs the following operations: determining whether the second response corresponds to the failed verification state based on the user dialogue, the plurality of verification indexes, and the response verification model; and in response to determining that the second response does not correspond to the failed verification state, setting the second response as a target response corresponding to the user dialogue.
8. The response generation device of claim 1, wherein the memory further stores a verification index comparison table, the verification index comparison table comprising the plurality of verification indexes and a scoring criterion corresponding to each of the plurality of verification indexes, and determining whether the first response corresponds to the failed verification state further comprises: determining whether the first response corresponds to the failed verification state based on the user dialogue, the verification index comparison table, and the response verification model.
9. The response generation device of claim 1, wherein the memory further stores a verification index comparison table, the verification index comparison table comprising the plurality of verification indexes and a scoring criterion corresponding to each of the plurality of verification indexes, and the processor further performs the following operations: generating, based on a text description, a new verification index comparison table and a new scoring criterion corresponding to each of a plurality of new verification indexes to update the verification index comparison table; and determining whether the first response corresponds to the failed verification state based on the new verification index comparison table and the response verification model.
10. A response generation method for an electronic device, wherein the electronic device stores a language model and a response verification model, and the response generation method comprises the steps of: generating a first response corresponding to a user dialogue based on the user dialogue and the language model; determining whether the first response corresponds to a failed verification state based on the user dialogue, a plurality of verification indexes, and the response verification model; and in response to determining that the first response corresponds to the failed verification state, generating a second response corresponding to the user dialogue.
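As an illustration only, the per-index verification of claims 2 and 8 and the feedback of claim 4 might be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `VerificationIndex`, `INDEX_TABLE`, `verify`, `is_failed`, `build_feedback`, and `keyword_judge` are hypothetical, the two example indexes and their criteria are invented for the example, and a toy keyword check stands in for the response verification model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class VerificationIndex:
    """One row of a (hypothetical) verification index comparison table."""
    name: str
    scoring_criterion: str


# Example table; the indexes and criteria are assumptions, not from the source.
INDEX_TABLE: List[VerificationIndex] = [
    VerificationIndex("safety", "Fail if the response gives harmful instructions."),
    VerificationIndex("relevance", "Fail if the response is empty or off-topic."),
]

# A judge returns True when the response passes the given index.
Judge = Callable[[str, str, VerificationIndex], bool]


def verify(dialogue: str, response: str,
           table: List[VerificationIndex], judge: Judge) -> Dict[str, bool]:
    """Produce one verification result per index."""
    return {index.name: judge(dialogue, response, index) for index in table}


def is_failed(results: Dict[str, bool]) -> bool:
    """The response fails verification if at least one index failed."""
    return not all(results.values())


def build_feedback(results: Dict[str, bool],
                   table: List[VerificationIndex]) -> str:
    """List the failed indexes with their scoring criteria."""
    failed = [index for index in table if not results[index.name]]
    return "\n".join(f"{i.name}: {i.scoring_criterion}" for i in failed)


def keyword_judge(dialogue: str, response: str, index: VerificationIndex) -> bool:
    """Toy stand-in for the response verification model (assumption)."""
    if index.name == "safety":
        return "explosive" not in response.lower()
    if index.name == "relevance":
        return bool(response.strip())
    return True
```

In this sketch the feedback string could be passed, together with the user dialogue, back to the language model (claim 4) or, together with the first response, to a separate response modification model (claim 5).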

Description

Response generating device and method

Technical Field

The present disclosure relates to a response generation device and method. More particularly, the present disclosure relates to a response generation device and method capable of actively verifying whether a response generated by a language model is appropriate.

Background

With the recent rise of artificial intelligence and related applications, users can interact with chatbots to obtain various informational responses. However, modern chatbots based on language models may misunderstand the user's intent for various reasons and may generate inappropriate responses that fail to meet the user's needs. In such cases, the poor response can leave the user uncomfortable with the language-model-based chatbot.

In the prior art, chatbots trained on language models may employ precautions at development time to ensure the safety of their responses, for example safety tuning against unsafe inputs during supervised fine-tuning (SFT) and safety training using reinforcement learning from human feedback (RLHF). However, these precautions act only as a first layer, during the training phase; they provide no second layer of verification at run time to ensure that the chatbot's response is appropriate. Accordingly, such language-model-based chatbots remain prone to inappropriate responses when the user inputs certain content (e.g., large language model (LLM) jailbreaking).

In view of the foregoing, there is an urgent need for a response generation technique that can actively verify whether the response generated by the language model is appropriate.

Disclosure of Invention

An object of the present disclosure is to provide a response generation device.
The response generation device comprises a memory, a transceiver interface, and a processor, wherein the processor is electrically connected to the memory and the transceiver interface. The memory stores a language model and a response verification model. The processor generates a first response corresponding to a user dialogue based on the user dialogue and the language model. The processor determines whether the first response corresponds to a failed verification state based on the user dialogue, a plurality of verification indexes, and the response verification model. In response to determining that the first response corresponds to the failed verification state, the processor generates a second response corresponding to the user dialogue.
In one embodiment of the present disclosure, the processor further performs the following operations: determining, by the response verification model, a verification result of the first response for each of the plurality of verification indexes; and, in response to at least one of the plurality of verification results being a failed state, determining that the first response corresponds to the failed verification state.
In one embodiment of the present disclosure, the second response is generated by generating, based on the user dialogue and the language model, the second response corresponding to the user dialogue, wherein the second response is different from the first response.
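The basic flow described above (generate a response, verify it, and generate a different response on failure) can be sketched as a loop. This is a hedged sketch, not the disclosed implementation: `respond`, `toy_generate`, and `toy_verify` are hypothetical names, the two callables stand in for the language model and the response verification model, and the `max_attempts` cap and the list of rejected responses are assumptions added so that the example terminates and each retry differs from earlier ones.

```python
from typing import Callable, List, Optional

# (dialogue, previously rejected responses) -> candidate response
Generate = Callable[[str, List[str]], str]
# (dialogue, candidate response) -> True if the candidate passes verification
Verify = Callable[[str, str], bool]


def respond(dialogue: str, generate: Generate, verify: Verify,
            max_attempts: int = 3) -> Optional[str]:
    """Return the first candidate that passes verification, else None."""
    rejected: List[str] = []
    for _ in range(max_attempts):
        candidate = generate(dialogue, rejected)
        if verify(dialogue, candidate):
            return candidate          # becomes the target response
        rejected.append(candidate)    # failed verification: try again
    return None                       # cap reached (assumed behavior)


def toy_generate(dialogue: str, rejected: List[str]) -> str:
    """Toy stand-in for the language model: skips rejected drafts."""
    canned = ["UNSAFE draft", "Safe answer to: " + dialogue]
    for text in canned:
        if text not in rejected:
            return text
    return canned[-1]


def toy_verify(dialogue: str, response: str) -> bool:
    """Toy stand-in for the response verification model."""
    return not response.startswith("UNSAFE")


if __name__ == "__main__":
    print(respond("hi", toy_generate, toy_verify))
```

With the toy stand-ins, the first draft fails verification and the second is returned. What the device does if no candidate ever passes is not specified in the text above, so returning `None` here is purely an assumption of the sketch.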
In one embodiment of the present disclosure, the second response is generated by the following operations: in response to determining that the first response corresponds to the failed verification state, generating, by the response verification model, feedback corresponding to the first response based on the verification result for each of the plurality of verification indexes, wherein the feedback indicates at least one of the plurality of verification indexes whose verification result is the failed state; and generating the second response corresponding to the user dialogue based on the user dialogue, the feedback, and the language model.
In one embodiment of the present disclosure, the memory further stores a response modification model, and the second response is generated by the following operations: in response to determining that the first response corresponds to the failed verification state, generating, by the response verification model, feedback corresponding to the first response based on the verification result for each of the plurality of verification indexes, wherein the feedback indicates at least one of the plurality of verification indexes whose verification result is the failed state; and generating the second response corresponding to the user dialogue based on the first response, the feedback, and the response modification model.
In one embodiment of the present disclosure, the processor further performs the following operations: determining whether the second response corresponds to the failed verification state based on the user dialogue, the plurality of verification indexes, and the response verification model; and generating a third response