CN-122021658-A - Dialogue processing method and device and electronic equipment

Abstract

Embodiments of the application provide a dialogue processing method, a dialogue processing device, and an electronic device. The method comprises: obtaining dialogue input content sent by an object; obtaining, from an error operation library, a target error operation class matched with the dialogue input content, wherein the target error operation class refers to a set of error operations to be avoided when responding to the dialogue input content; generating at least one reinforcement countermeasure context according to the target error operation class, wherein the at least one reinforcement countermeasure context is used to instruct a model to iteratively correct errors in an initial response result for the dialogue input content; and iteratively correcting, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result. The application addresses the AI hallucination problem that large models exhibit when the number of tokens is limited.

Inventors

XIAO BAOLIANG
YE JINGTAO
ZHAO XIANYONG

Assignees

湖南快乐阳光互动娱乐传媒有限公司

Dates

Publication Date: 20260512
Application Date: 20260122

Claims (10)

1. A dialogue processing method, comprising: acquiring dialogue input content sent by an object, and acquiring, from an error operation library, a target error operation class matched with the dialogue input content, wherein the target error operation class refers to a set of error operations generated by a model in the process of responding to the dialogue input content; generating at least one reinforcement countermeasure context according to the target error operation class, wherein the at least one reinforcement countermeasure context is used to instruct the model to iteratively correct errors in an initial response result for the dialogue input content; and iteratively correcting, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result, and feeding back the target response result to the object.
2. The method of claim 1, wherein a plurality of error operation classes are stored in the error operation library, each of the plurality of error operation classes being stored in the error operation library as a first pinyin string; and wherein acquiring the target error operation class matched with the dialogue input content from the error operation library comprises: converting the dialogue input content into a second pinyin string, and determining a semantic similarity between the second pinyin string and the first pinyin string of each error operation class; selecting, from the error operation library, at least one reference error operation class whose semantic similarity is greater than a preset matching degree threshold; determining a content relevance between the second pinyin string and the first pinyin string of each reference error operation class; and taking the reference error operation class with the maximum content relevance among the at least one reference error operation class as the target error operation class matched with the dialogue input content.
3. The method of claim 2, wherein determining the semantic similarity between the second pinyin string and the first pinyin string of each error operation class comprises: determining, as the semantic similarity between the second pinyin string and the first pinyin string of each error operation class, the ratio of the total length of at least one character segment in the second pinyin string to the length of the second pinyin string, wherein the at least one character segment refers to the longest contiguous character sequence in the second pinyin string whose characters are identical to those of the first pinyin string of that error operation class.
4. The method of claim 2, wherein determining the content relevance between the second pinyin string and the first pinyin string of each reference error operation class comprises: determining, as the content relevance between the second pinyin string and the first pinyin string of each reference error operation class, the ratio of the total length of at least one character segment in the second pinyin string to the length of the first pinyin string of that reference error operation class, wherein the at least one character segment refers to the longest contiguous character sequence in the second pinyin string whose characters are identical to those of the first pinyin string of that reference error operation class.
5. The method of claim 1, wherein the target error operation class comprises a plurality of error operation items, each of the plurality of error operation items referring to an error operation that occurs while the model responds to the dialogue input content; and wherein generating at least one reinforcement countermeasure context according to the target error operation class comprises: integrating attribute information of at least one error operation item in the target error operation class to obtain a reinforcement countermeasure context, wherein the attribute information of each error operation item comprises at least one of an error operation identification, an error operation prompt word, and an error operation description; the error operation identification of each error operation item is used to uniquely identify that error operation item; the error operation prompt word of each error operation item is used to indicate the type of that error operation item; and the error operation description of each error operation item is used to describe the error operation corresponding to that error operation item.
6. The method of claim 5, wherein each of the at least one reinforcement countermeasure context includes at least one error operation identification, and wherein iteratively correcting, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result comprises: in response to obtaining the initial response result, sequentially selecting one reinforcement countermeasure context from the at least one reinforcement countermeasure context as a current reinforcement countermeasure context, and performing the following processing operations until every one of the at least one reinforcement countermeasure context has been selected: inputting the current reinforcement countermeasure context into the model, and acquiring a countermeasure response result corresponding to the current reinforcement countermeasure context, wherein the countermeasure response result corresponding to the current reinforcement countermeasure context carries an error operation identification from the current reinforcement countermeasure context; wherein the reinforcement countermeasure context following the current reinforcement countermeasure context is selected according to the error operation identification in the countermeasure response result corresponding to the current reinforcement countermeasure context; and wherein the target response result is the countermeasure response result corresponding to the last selected reinforcement countermeasure context among the at least one reinforcement countermeasure context.
7. The method of claim 6, wherein the attribute information of each error operation item further includes a number of repeated countermeasures, the number of repeated countermeasures of each error operation item being an upper limit on the number of reinforcement countermeasures performed based on a reinforcement countermeasure context that includes that error operation item, and wherein the method further comprises: in response to the number of repeated countermeasures in the current reinforcement countermeasure context being greater than 1, continuing to perform the step of inputting the current reinforcement countermeasure context into the model until the number of reinforcement countermeasures performed based on the current reinforcement countermeasure context equals the number of repeated countermeasures in the current reinforcement countermeasure context, and then stopping the reinforcement countermeasures based on the current reinforcement countermeasure context.
8. The method of claim 7, wherein the method further comprises: in response to the number of repeated countermeasures in the current reinforcement countermeasure context being greater than 1, monitoring a plurality of countermeasure response results obtained by performing multiple reinforcement countermeasures based on the current reinforcement countermeasure context; and in response to the plurality of countermeasure response results obtained by performing the multiple reinforcement countermeasures based on the current reinforcement countermeasure context remaining consistent with one another, deleting the error operation item contained in the current reinforcement countermeasure context from the error operation library.
9. A dialogue processing device, comprising: a content matching module, configured to acquire dialogue input content sent by an object and to acquire, from an error operation library, a target error operation class matched with the dialogue input content, wherein the target error operation class refers to a set of error operations generated by a model in the process of responding to the dialogue input content; a context generation module, configured to generate at least one reinforcement countermeasure context according to the target error operation class, wherein the at least one reinforcement countermeasure context is used to instruct the model to iteratively correct errors in an initial response result for the dialogue input content; and an iteration correction module, configured to iteratively correct, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result, and to feed back the target response result to the object.
10. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 8.
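
As an illustration only, the matching recited in claims 2 to 4 might be sketched as follows. The sketch assumes that the dialogue input has already been converted to a pinyin string, and it reads "at least one character segment" as the set of contiguous matching blocks reported by Python's difflib.SequenceMatcher; the claims do not name any particular algorithm, so this reading is an assumption. The function names, the 0.6 threshold, and the example library are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def matched_length(query: str, stored: str) -> int:
    """Total length of the contiguous segments shared by both strings.

    Assumption: SequenceMatcher's matching blocks stand in for the
    "longest contiguous character sequences" recited in claims 3 and 4.
    """
    matcher = SequenceMatcher(None, query, stored, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def semantic_similarity(query: str, stored: str) -> float:
    """Claim 3: matched length divided by the length of the query string."""
    return matched_length(query, stored) / len(query) if query else 0.0


def content_relevance(query: str, stored: str) -> float:
    """Claim 4: matched length divided by the length of the stored string."""
    return matched_length(query, stored) / len(stored) if stored else 0.0


def match_error_class(
    query_pinyin: str,
    library: Dict[str, str],
    threshold: float = 0.6,  # hypothetical preset matching degree threshold
) -> Optional[str]:
    """Claim 2: filter by semantic similarity, then pick the most relevant class."""
    candidates = [
        name
        for name, stored in library.items()
        if semantic_similarity(query_pinyin, stored) > threshold
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda name: content_relevance(query_pinyin, library[name]),
    )


# Hypothetical library: class name -> first pinyin string.
EXAMPLE_LIBRARY = {
    "weather": "tianqiyubao",
    "weather_long": "tianqiyubaochaxunfuwu",
    "music": "yinyuebofang",
}
```

In this sketch, the query "tianqiyubao" passes the similarity threshold for both "weather" and "weather_long", and the second ratio then prefers "weather" because the match covers the whole stored string. This shows why the two ratios use different denominators: the first checks how much of the query is covered, the second how much of the stored class is covered.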

Description

Dialogue processing method and device and electronic equipment

Technical Field

The embodiments of the application relate to the field of artificial intelligence, and in particular to a dialogue processing method, a dialogue processing device, and an electronic device.

Background

Modern large AI (artificial intelligence) models have large-scale parameters and powerful generation capabilities, but the number of tokens they can process in a single inference pass is limited. The context that can be retained across a continuous dialogue is therefore quite limited. Combined with deviations in model quality and defects in the training data, this makes AI-generated content prone to fabrication, known as AI hallucination: during generation, the AI makes up information that is plausible but actually untrue, which greatly reduces the credibility and practicality of the generated content.

Disclosure of Invention

The embodiments of the application provide a dialogue processing method, a dialogue processing device, and an electronic device, which at least solve the problem in the related art of AI hallucination produced by a large model under a limited number of tokens.
According to one aspect of the embodiments of the application, a dialogue processing method is provided. The method comprises: obtaining dialogue input content sent by an object; obtaining, from an error operation library, a target error operation class matched with the dialogue input content, wherein the target error operation class refers to a set of error operations generated by a model in the process of responding to the dialogue input content; generating at least one reinforcement countermeasure context according to the target error operation class, wherein the at least one reinforcement countermeasure context is used to instruct the model to iteratively correct errors in an initial response result for the dialogue input content; and iteratively correcting, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result, and feeding the target response result back to the object.
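The correction loop of this aspect, together with the per-item repeat limit of claim 7, might be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: the model is any callable that maps a prompt string to a response string, contexts are visited in list order (claim 6 selects the next context from the identifier carried in the response, which is simplified away here), and the ErrorItem fields, the prompt wording, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ErrorItem:
    item_id: str           # error operation identification
    prompt_word: str       # error operation prompt word (type of error)
    description: str       # error operation description
    repeat_limit: int = 1  # number of repeated countermeasures (claim 7)


def build_context(item: ErrorItem) -> str:
    """Fold one item's attributes into a reinforcement countermeasure context.

    Hypothetical format; the source does not specify the context layout.
    """
    return f"[{item.item_id}] ({item.prompt_word}) Avoid: {item.description}"


def iterative_correct(
    user_input: str,
    items: List[ErrorItem],
    model: Callable[[str], str],
) -> str:
    """Get an initial response, then revise it once per context pass."""
    result = model(user_input)  # initial response result
    for item in items:
        context = build_context(item)
        # Repeat the same context up to its limit, at least once.
        for _ in range(max(1, item.repeat_limit)):
            prompt = (
                f"{context}\n"
                f"Question: {user_input}\n"
                f"Previous answer: {result}\n"
                "Revise the previous answer so that it avoids the error above."
            )
            result = model(prompt)  # countermeasure response result
    return result  # result of the last selected context
```

Passing the model as a plain callable keeps the sketch independent of any particular model API; a stub function that records its prompts is enough to exercise the loop.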
According to another aspect of the embodiments of the application, a dialogue processing device is provided, comprising a content matching module, a context generation module, and an iteration correction module. The content matching module is configured to acquire dialogue input content sent by an object and to acquire, from an error operation library, a target error operation class matched with the dialogue input content, wherein the target error operation class refers to a set of error operations generated by a model in the process of responding to the dialogue input content. The context generation module is configured to generate at least one reinforcement countermeasure context according to the target error operation class, wherein the at least one reinforcement countermeasure context is used to instruct the model to iteratively correct errors in an initial response result for the dialogue input content. The iteration correction module is configured to iteratively correct, by the model, the initial response result using the at least one reinforcement countermeasure context to obtain a target response result, and to feed the target response result back to the object.
According to a further aspect of the embodiments of the application, a computer-readable storage medium is also provided, in which a computer program is stored, wherein the computer program is configured to perform, when executed by a processor, the steps of any of the method embodiments described above.
According to yet another aspect of the embodiments of the application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps of any of the method embodiments described above.
According to a further aspect of the embodiments of the application, an electronic device is also provided, comprising a memory in which a computer program is stored and a processor configured to perform, by means of the computer program, the steps of any of the method embodiments described above.
According to the application, an error operation library is constructed. The error operation library comprises a plurality of error operation classes, each of which refers to a set of error operations generated by a model in a specific scenario. Dialogue input content sent by an object is matched with the plurality of error operation classes in the error operati