CN-122021874-A - Agent interaction and evaluation method, system and related device

Abstract

The application discloses an agent interaction and evaluation method, system, and related device. The method comprises: obtaining query content output by a target agent over a plurality of rounds and reply content output by a reference agent according to behavior labels, to obtain a dialogue set, wherein each behavior label corresponds to a respective dialogue mode, and the behavior label of a single round is either determined based on the dialogue content of at least some historical rounds or set to the behavior label preset for a target question included in the query content of the current round; and traversing each round in the dialogue set, obtaining the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-round score for the query content of the current round, wherein the single-round scores of all rounds are used to evaluate the target agent. This scheme can improve the authenticity of agent interaction and the accuracy of interaction-effect evaluation.

Inventors

LIU DONG
GAO JIANQING
SONG SHIDE
PENG QIYU
ZHU XIAOYU
ZHU QUN
HU JIAXUE
HE ZHIYANG
ZHAO JINGHE
LIU CONG

Assignees

讯飞医疗科技股份有限公司 (Xunfei Healthcare Technology Co., Ltd.)

Dates

Publication Date: 20260512
Application Date: 20251222

Claims (11)

1. An agent interaction and evaluation method, comprising: acquiring query content output by a target agent over a plurality of rounds and reply content output by a reference agent according to behavior labels, to obtain a dialogue set, wherein each behavior label corresponds to a respective dialogue mode, and the behavior label of a single round is either determined based on the dialogue content of at least some historical rounds or set to a behavior label preset for a target question included in the query content of the current round; and traversing each round in the dialogue set, acquiring the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-round score for the query content of the current round, wherein the single-round scores of all rounds are used to evaluate the target agent.
2. The agent interaction and evaluation method according to claim 1, wherein: the reference agent is capable of using the dialogue content of at least some historical rounds to estimate the dialogue mode of the current round, selecting the behavior label of a single round accordingly, and outputting the reply content of the current round based on the behavior label and the dialogue content of at least some historical rounds; and the target agent is capable of outputting the query content of the next round based on the dialogue content of at least some historical rounds, the query content comprising a target question selected from a plurality of candidate questions, or of choosing to terminate the dialogue based on the dialogue content of at least some historical rounds, wherein at least some of the candidate questions are preset with behavior labels.
3. The agent interaction and evaluation method according to claim 2, wherein at least some of the behavior labels include corresponding sub-labels for adjusting the proportion of reply content output in a single round that matches the candidate questions; and the sub-labels are selected by the reference agent based on the dialogue content of at least some historical rounds, or the behavior labels preset for at least some of the candidate questions are configured with the sub-labels.
4. The agent interaction and evaluation method according to claim 2, wherein the candidate questions that are preset with behavior labels, and the behavior labels preset for those candidate questions, are adjusted by a target object during the evaluation process.
5. The agent interaction and evaluation method according to claim 1, wherein acquiring the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent to obtain the single-round score for the query content of the current round, comprises: acquiring the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round along a plurality of scoring dimensions based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-dimension score for each scoring dimension, wherein each scoring dimension corresponds to preset scoring key points; and obtaining the single-round score for the query content of the current round based on the single-dimension scores of all scoring dimensions.
6. The agent interaction and evaluation method according to claim 1, wherein each dialogue set matches a respective dialogue behavior path, more than a preset number or a preset proportion of the rounds within the same dialogue behavior path have the same behavior label, each dialogue behavior path includes a plurality of dialogue sets, and the reference agent is configured with content to be recalled; and after traversing each round in the dialogue set, acquiring the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent to obtain the single-round score for the query content of the current round, the single-round scores of all rounds being used to evaluate the target agent, the method further comprises: determining an overall reward for each token in the query content of the dialogue set based on the multiple rounds of query content in the dialogue set and the recall rate of the multiple rounds of reply content relative to the content to be recalled, and determining a single-round reward for each token in the query content of the corresponding round based on the single-round score of the query content of each round in the dialogue set; and determining a target reward for each token based on the overall reward and the single-round reward of that token, and adjusting the target agent using the target rewards, wherein the adjusted target agent is used for dialogue with the reference agent or with a target object.
7. The agent interaction and evaluation method according to claim 6, wherein determining the overall reward for each token in the query content of the dialogue set based on the multiple rounds of query content in the dialogue set and the recall rate of the multiple rounds of reply content relative to the content to be recalled comprises: determining a dialogue reward for each dialogue set under the dialogue behavior path based on the total number of dialogue rounds in that dialogue set, the content redundancy rate of the multiple rounds of query content, and the recall rate of the multiple rounds of reply content relative to the content to be recalled; determining, based on the dialogue reward of each dialogue set under the dialogue behavior path, dialogue comparison rewards corresponding to all the dialogue sets under the dialogue behavior path; and determining the overall reward for each token in the query content of the dialogue set based on the dialogue reward and the dialogue comparison reward of the dialogue set.
8. The agent interaction and evaluation method according to any one of claims 1-7, wherein the target agent comprises a doctor agent and the reference agent comprises a patient agent.
9. An agent interaction and evaluation system, comprising: an acquisition module, configured to acquire query content output by a target agent over a plurality of rounds and reply content output by a reference agent according to behavior labels, to obtain a dialogue set, wherein each behavior label corresponds to a respective dialogue mode, and the behavior label of a single round is either determined based on the dialogue content of at least some historical rounds or set to a behavior label preset for a target question included in the query content of the current round; and an execution module, configured to traverse each round in the dialogue set, acquire the behavior intent corresponding to the behavior label of the previous round, and evaluate the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-round score for the query content of the current round, wherein the single-round scores of all rounds are used to evaluate the target agent.
10. An electronic device comprising a memory and a processor coupled to each other, wherein the memory stores program data and the processor invokes the program data to perform the method of any of claims 1-8.
11. A computer readable storage medium having stored thereon program data, which when executed by a processor, implements the method of any of claims 1-8.
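
The reward structure recited in claims 6 and 7 can be illustrated with a minimal Python sketch. The claims name the inputs (total rounds, content redundancy rate, recall rate, single-round scores) but this record does not give formulas, so every formula, weight, and function name below is an assumption made purely for illustration: a linear dialogue reward, a group-standardised comparison reward, a plain sum for the overall reward, and a weighted blend for the per-token target reward.

```python
from statistics import mean, pstdev
from typing import List


def dialogue_reward(total_rounds: int, redundancy_rate: float, recall_rate: float,
                    max_rounds: int = 10) -> float:
    """Assumed form: reward recall, penalise redundant queries and long dialogues."""
    length_penalty = min(total_rounds, max_rounds) / max_rounds
    return recall_rate - 0.5 * redundancy_rate - 0.1 * length_penalty


def comparison_rewards(rewards: List[float]) -> List[float]:
    """Assumed form: standardise each dialogue reward against its path's group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


def token_target_rewards(overall: float, round_scores: List[float],
                         tokens_per_round: List[int],
                         alpha: float = 0.5) -> List[List[float]]:
    """Give every query token in round k the value alpha*overall + (1-alpha)*score_k."""
    return [[alpha * overall + (1 - alpha) * score] * n
            for score, n in zip(round_scores, tokens_per_round)]


# Two dialogue sets collected under the same (hypothetical) dialogue behavior path.
path_rewards = [dialogue_reward(5, 0.2, 0.8), dialogue_reward(8, 0.4, 0.6)]
relative = comparison_rewards(path_rewards)
overall_first = path_rewards[0] + relative[0]  # assumed combination: plain sum
print(token_target_rewards(overall_first, [1.0, 0.5], [3, 2]))
```

In this sketch all tokens of one round share the same target reward, so the single-round score differentiates rounds within a dialogue set while the overall reward differentiates dialogue sets within a path.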

Description

Agent interaction and evaluation method, system and related device

Technical Field

The application relates to the technical field of artificial intelligence, and in particular to an agent interaction and evaluation method, an agent interaction and evaluation system, and a related device.

Background

With the development of artificial intelligence, agents are widely used. Training an agent requires a large number of interactive dialogue samples, but interactive dialogues are difficult to collect in real scenarios. Typically, two agents each play a role in a dialogue and conduct question-and-answer exchanges to collect data. However, the interaction between the agents is difficult to control effectively, so highly realistic data that matches the actual scenario cannot be collected. In addition, the agent's performance is evaluated using customized rules, so the evaluation process is disconnected from the interaction data and the evaluation accuracy is low. In view of this, how to improve the authenticity of agent interaction and the accuracy of interaction-effect evaluation has become an urgent problem.

Disclosure of Invention

The main technical problem addressed by the application is to provide an agent interaction and evaluation method, an agent interaction and evaluation system, and a related device that can improve the authenticity of agent interaction and the accuracy of interaction-effect evaluation.
To solve the above technical problem, a first aspect of the application provides an agent interaction and evaluation method, comprising: obtaining query content output by a target agent over a plurality of rounds and reply content output by a reference agent according to behavior labels, to obtain a dialogue set, wherein each behavior label corresponds to a respective dialogue mode, and the behavior label of a single round is either determined based on the dialogue content of at least some historical rounds or set to a behavior label preset for a target question included in the query content of the current round; and traversing each round in the dialogue set, obtaining the behavior intent corresponding to the behavior label of the previous round, and evaluating the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-round score for the query content of the current round, wherein the single-round scores of all rounds are used to evaluate the target agent.
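The traversal described in the first aspect can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the label names, the label-to-intent table, the `Round` structure, the `toy_judge` scoring rule, and the use of a plain mean to aggregate single-round scores are all assumptions introduced here. The text states only that each round's query is scored using the dialogue history and the behavior intent of the previous round's label, and that the single-round scores of all rounds are used to evaluate the target agent.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Optional

# Hypothetical label-to-intent table; label names and wording are illustrative.
BEHAVIOR_INTENTS: Dict[str, str] = {
    "cooperative": "answers the question directly and completely",
    "evasive": "withholds details until asked a focused follow-up",
}


@dataclass
class Round:
    query: str           # query content output by the target agent
    reply: str           # reply content output by the reference agent
    behavior_label: str  # label the reference agent followed in this round


# A judge receives (history, intent of the previous round's label, current query).
Judge = Callable[[List[Round], Optional[str], str], float]


def score_rounds(dialogue_set: List[Round], judge: Judge) -> List[float]:
    """Traverse the dialogue set and return one single-round score per round."""
    scores = []
    for i, current in enumerate(dialogue_set):
        history = dialogue_set[:i]
        # The first round has no previous round, so there is no intent to look up.
        prev_intent = (BEHAVIOR_INTENTS.get(history[-1].behavior_label)
                       if history else None)
        scores.append(judge(history, prev_intent, current.query))
    return scores


def evaluate_agent(single_round_scores: List[float]) -> float:
    """Assumed aggregation: the plain mean of all single-round scores."""
    return mean(single_round_scores)


def toy_judge(history: List[Round], prev_intent: Optional[str], query: str) -> float:
    """Stand-in for a real evaluator (for example, a rubric-driven model)."""
    if prev_intent is None:
        return 1.0
    if "withholds" in prev_intent:
        # After an evasive reply, reward a follow-up question.
        return 1.0 if query.strip().endswith("?") else 0.0
    return 0.5


dialogue_set = [
    Round("What brings you in today?", "I just feel unwell.", "evasive"),
    Round("Where exactly does it hurt?", "My lower back.", "cooperative"),
    Round("Please get some rest.", "OK.", "cooperative"),
]
scores = score_rounds(dialogue_set, toy_judge)
print(scores, evaluate_agent(scores))
```

Passing the judge in as a callable keeps the traversal independent of how a single round is actually scored, so the toy rule can be swapped for a multi-dimension scorer without changing the loop.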
To solve the above technical problem, a second aspect of the application provides an agent interaction and evaluation system, comprising an acquisition module and an execution module. The acquisition module is configured to acquire query content output by a target agent over a plurality of rounds and reply content output by a reference agent according to behavior labels, to obtain a dialogue set, wherein each behavior label corresponds to a respective dialogue mode, and the behavior label of a single round is either determined based on the dialogue content of at least some historical rounds or set to a behavior label preset for a target question included in the query content of the current round. The execution module is configured to traverse each round in the dialogue set, acquire the behavior intent corresponding to the behavior label of the previous round, and evaluate the query content of the current round based on the dialogue content of at least some historical rounds and the behavior intent, to obtain a single-round score for the query content of the current round, wherein the single-round scores of all rounds are used to evaluate the target agent.
To solve the above technical problem, a third aspect of the application provides an electronic device comprising a memory and a processor coupled to each other, wherein the memory stores program data and the processor invokes the program data to perform the method of the first aspect.
To solve the above technical problem, a fourth aspect of the application provides a computer-readable storage medium having stored thereon program data which, when executed by a processor, implements the method of the first aspect.
The beneficial effects of the application are as follows. Unlike the prior art, the application acquires a dialogue set produced when the target agent and the reference agent converse over a plurality of rounds, the dialogue set comprising the query content output by the target agent in each round and the reply content output by the reference agent according to a behavior label. Each behavior label corresponds to a respective dialogue mode, the behavior label of a single round is determined based on dialogue contents of at least