CN-121786431-B - Man-machine collaborative calibration method, system, equipment and storage medium


Abstract

The application discloses a human-machine collaborative calibration method, system, device and storage medium, which aim to solve the problem that noisy data produced by manual review contaminates an artificial intelligence model. The method comprises: establishing a dynamic credibility model for each worker and calculating a current credibility score; when the human judgment and the machine judgment differ, routing the resulting difference case, based on the credibility score, to either a direct-adoption path or an arbitration path; performing weighted training of the artificial intelligence model with training weights mapped from the credibility scores; building a personal error semantic vector library from correction cases confirmed to be erroneous; and performing similarity matching when a new task is processed, so that early warnings can be pushed in real time. The application can suppress noisy data, relieve the expert-resource bottleneck, and provide accurate, real-time coaching of personnel.

Inventors

WANG JIANBING
CHEN SHUFEN
TIAN YUANYUAN
ZUO KEKUANG
HU LIPING

Assignees

上海浩宜信息科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260305

Claims (10)

1. A human-machine collaborative calibration method, applied to a system comprising an artificial intelligence model and a terminal, the terminal comprising at least one worker terminal, the method comprising: issuing a new task to the worker terminal; establishing, based on a plurality of preset evaluation dimensions, a dynamic credibility model for a worker so as to calculate a current credibility score of the worker from the worker's historical working data, wherein a worker entering the system for the first time is assigned a preset initial credibility score as the current credibility score; when the worker corrects a judgment result of the artificial intelligence model, performing routing processing on the resulting difference case based on the current credibility score of the worker, wherein the routing processing comprises assigning the difference case to a preset processing path according to a result of comparing the current credibility score with at least one preset credibility threshold, the processing paths comprising directly adopting the correction result of the worker and submitting the difference case for arbitration; using training data that was generated by manual correction and confirmed to be correct through the routing processing to train the artificial intelligence model, wherein the current credibility score of the worker who generated the training data is mapped, through a preset monotonically increasing nonlinear weight mapping function, to a training weight corresponding to the training data, and the training weight is used to perform weighted training of the artificial intelligence model; taking correction cases generated by the worker and confirmed to be erroneous through the routing processing as personal historical error cases of the worker, and establishing a semantic vector library of the personal historical error cases for the worker; and when the worker processes the new task, calculating in real time the semantic similarity between the content of the new task and the personal historical error cases in the semantic vector library, and pushing information on the relevant historical error cases to the worker terminal when the semantic similarity exceeds a preset threshold.
2. The human-computer collaborative calibration method according to claim 1, wherein the plurality of evaluation dimensions includes at least two of a working accuracy of the worker during a predetermined time period, a standard deviation of a historical working accuracy of the worker, and a reward score obtained by the worker for correcting a high value case defined by a predetermined rule, and the pushing related historical error case information includes pushing non-blocking prompt information on the worker terminal, the prompt information including a correct correction result and/or an arbitration conclusion corresponding to the historical error case.
3. The human-machine collaborative calibration method according to claim 1, wherein the terminal further comprises at least one expert terminal, the at least one preset credibility threshold comprises a preset high credibility threshold and a preset low credibility threshold, and the routing processing comprises, according to the results of comparing the current credibility score of the worker with the high credibility threshold and with the low credibility threshold: directly adopting the correction result of the worker when the current credibility score of the worker is higher than the high credibility threshold; or submitting the difference case for arbitration when the current credibility score of the worker is higher than or equal to the low credibility threshold and lower than or equal to the high credibility threshold; or starting a layered arbitration flow when the current credibility score of the worker is lower than the low credibility threshold, wherein the layered arbitration flow first sends the difference case to workers whose current credibility scores are higher than the high credibility threshold for cross verification, submits the difference case to the expert terminal for arbitration when the cross verification results diverge, and confirms the correction result of the difference case according to the agreed result when the cross verification results agree.
4. The human-machine collaborative calibration method according to any one of claims 1 to 3, wherein the nonlinear weight mapping function is at least one of a Sigmoid function, a variant of a Sigmoid function, and a piecewise linear function.
5. A human-machine collaborative calibration system comprising an artificial intelligence model and a terminal, the terminal comprising at least one worker terminal, the system further comprising: a task issuing module, configured to issue a new task to the worker terminal; a credibility evaluation module, configured to establish, based on a plurality of preset evaluation dimensions, a dynamic credibility model for a worker so as to calculate a current credibility score of the worker from the worker's historical working data, wherein the credibility evaluation module is further configured to assign a preset initial credibility score as the current credibility score to a worker entering the system for the first time; an arbitration scheduling module, configured to perform, when the worker corrects a judgment result of the artificial intelligence model, routing processing on the resulting difference case based on the current credibility score of the worker, wherein the routing processing comprises assigning the difference case to a preset processing path according to a result of comparing the current credibility score with at least one preset credibility threshold, the processing paths comprising directly adopting the correction result of the worker and submitting the difference case for arbitration; a weighted training module, configured to use training data that was generated by manual correction and confirmed to be correct through the routing processing of the arbitration scheduling module to train the artificial intelligence model, wherein the current credibility score of the worker who generated the training data is mapped, through a preset monotonically increasing nonlinear weight mapping function, to a training weight corresponding to the training data, and the training weight is used to perform weighted training of the artificial intelligence model; and a real-time intervention module, configured to take correction cases generated by the worker and confirmed to be erroneous through the routing processing of the arbitration scheduling module as personal historical error cases of the worker, to establish a semantic vector library of the personal historical error cases for the worker, to calculate in real time, when the worker processes the new task issued by the task issuing module, the semantic similarity between the content of the new task and the personal historical error cases in the semantic vector library, and to push information on the relevant historical error cases to the worker terminal when the semantic similarity exceeds a preset threshold.
6. The human-computer collaborative calibration system according to claim 5, wherein the plurality of evaluation dimensions comprises at least two of a working accuracy of the worker during a predetermined time period, a standard deviation of a historical working accuracy of the worker, and a reward score obtained by the worker for correcting a high value case defined by a predetermined rule, and the real-time intervention module is specifically configured to push a non-blocking prompt message on the worker terminal, wherein the prompt message comprises a correct correction result and/or an arbitration conclusion corresponding to the historical error case.
7. The human-machine collaborative calibration system according to claim 5, wherein the terminal further comprises at least one expert terminal, the at least one preset credibility threshold comprises a preset high credibility threshold and a preset low credibility threshold, and the routing processing comprises, according to the results of comparing the current credibility score of the worker with the high credibility threshold and with the low credibility threshold: directly adopting the correction result of the worker when the current credibility score of the worker is higher than the high credibility threshold; or submitting the difference case for arbitration when the current credibility score of the worker is higher than or equal to the low credibility threshold and lower than or equal to the high credibility threshold; or starting a layered arbitration flow when the current credibility score of the worker is lower than the low credibility threshold, wherein the layered arbitration flow first sends the difference case to workers whose current credibility scores are higher than the high credibility threshold for cross verification, submits the difference case to the expert terminal for arbitration when the cross verification results diverge, and confirms the correction result of the difference case according to the agreed result when the cross verification results agree.
8. The human-machine collaborative calibration system according to any one of claims 5 to 7, wherein the nonlinear weight mapping function is at least one of a Sigmoid function, a variant of a Sigmoid function, and a piecewise linear function.
9. An electronic device comprising a processor and a memory, the memory having stored therein a computer program which, when executed by the processor, implements the human-machine co-calibration method of any one of claims 1 to 4.
10. A computer readable storage medium, having stored thereon a computer program which, when executed by a processor, implements a human-machine co-calibration method according to any of claims 1 to 4.
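
The two-threshold routing, the layered arbitration flow and the Sigmoid weight mapping recited in the claims can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the applicant's implementation: the names `Path`, `route`, `layered_arbitration` and `training_weight`, the threshold values 0.4 and 0.8, the Sigmoid midpoint and steepness, and the choice to escalate to the expert when no cross-verifier is available are all hypothetical and do not come from the claims.

```python
import math
from enum import Enum


class Path(Enum):
    ADOPT = "adopt the worker's correction directly"
    ARBITRATE = "submit the difference case for arbitration"
    LAYERED = "start the layered arbitration flow"


def route(score, low=0.4, high=0.8):
    """Pick a processing path by comparing a credibility score with two thresholds.

    The default thresholds are hypothetical; the claims only require a low
    and a high credibility threshold.
    """
    if score > high:
        return Path.ADOPT
    if score >= low:  # low <= score <= high
        return Path.ARBITRATE
    return Path.LAYERED


def layered_arbitration(reviewer_labels, ask_expert):
    """Cross-verify first; escalate to the expert only when reviewers diverge.

    reviewer_labels: results from workers whose scores exceed the high threshold.
    ask_expert: callable returning the expert's arbitration result.
    Assumption: an empty reviewer list is also escalated to the expert.
    """
    if reviewer_labels and len(set(reviewer_labels)) == 1:
        return reviewer_labels[0]  # cross verification agreed
    return ask_expert()            # cross verification diverged


def training_weight(score, midpoint=0.5, steepness=10.0):
    """Sigmoid mapping from credibility score to training weight.

    Monotonically increasing and nonlinear; midpoint and steepness are
    hypothetical parameters.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (score - midpoint)))
```

With these assumed parameters, a score of 0.9 maps to a weight of about 0.98 and a score of 0.1 to about 0.02, so a correction from a low-credibility worker contributes far less to the weighted training than one from a high-credibility worker.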

Description

Man-machine collaborative calibration method, system, equipment and storage medium

Technical Field

The present application relates to the field of human-machine collaboration technologies, and in particular to a human-machine collaborative calibration method, system, device, and storage medium.

Background

In human-machine collaborative systems such as modern contact centers and content auditing platforms, a hybrid working mode of "preliminary judgment by artificial intelligence plus manual review and confirmation" is commonly adopted. In this mode, optimization of the artificial intelligence model typically relies on treating all manually reviewed or corrected data indiscriminately as "standard answers" and using it for model fine-tuning. This approach rests on the unrealistic assumption that every manual correction is correct. In practice, correction results inevitably contain errors because auditors differ in business ability, proficiency, fatigue state and other factors. This erroneous "noise data" is treated the same as high-quality data and systematically contaminates the model during training, causing the model to learn erroneous discrimination patterns. In the long term this impairs model performance and forms a vicious circle in which the model misleads people and people in turn mislead the model.
To ensure data quality, one improvement is to submit every "disputed case", in which the model's judgment and the manual judgment disagree, to a small number of domain experts for final arbitration. However, this rapidly exhausts expert resources and creates a process bottleneck in high-volume scenarios.
In addition, to improve manual review capability, the prior art relies on periodic examinations, random spot checks and reviews, and similar means to deliver uniform, general training. This approach is lagging and coarse: it cannot identify, or intervene in real time on, the capability blind spots of an individual auditor at the level of understanding specific semantics, so similar errors recur and personnel improve slowly.
The prior art therefore still has clear shortcomings in how to filter noise data from manual auditing effectively, how to use limited expert resources efficiently, and how to improve the capability of auditors accurately and in real time.

Disclosure of Invention

The application aims to solve the technical problems in the prior art that noise data arising from the uneven quality of manual review contaminates the artificial intelligence model, that expert resources easily become a bottleneck, and that traditional ways of improving personnel capability lack accuracy and timeliness.
To solve the above technical problems, the application provides a human-machine collaborative calibration method applied to a system comprising an artificial intelligence model and a terminal, the terminal comprising at least one worker terminal. The method comprises: issuing a new task to the worker terminal; establishing, based on a plurality of preset evaluation dimensions, a dynamic credibility model for a worker so as to calculate a current credibility score from the worker's historical working data, wherein a worker entering the system for the first time is assigned a preset initial credibility score as the current credibility score; when the worker corrects a judgment result of the artificial intelligence model, performing routing processing on the resulting difference case based on the current credibility score of the worker, wherein the routing processing comprises assigning the difference case to a preset processing path according to a result of comparing the current credibility score with at least one preset credibility threshold, the processing paths comprising directly adopting the correction result of the worker and submitting the difference case for arbitration; using training data that was generated by manual correction and confirmed to be correct through the routing processing to train the artificial intelligence model, wherein the current credibility score of the worker who generated the training data is mapped to a corresponding training weight through a preset monotonically increasing nonlinear weight mapping function, and the training weight is used to perform weighted training of the artificial intelligence model; taking correction cases generated by the worker and confirmed to be erroneous through the routing processing as personal historical error cases of the worker, and establishing a semantic vector library of the personal historical error cases for the worker; and, when the worker processes the new task, calculating in real time the semantic similarity between the content of the new task and the personal historical error cases in the semantic vector library, and pushing information on the relevant historical error cases to the worker terminal when the semantic similarity exceeds a preset threshold.
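The final step of the method, matching a new task against a worker's personal error library, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the text does not specify the embedding model or the similarity measure, so cosine similarity over precomputed vectors is assumed here, and the names `ErrorCase`, `cosine` and `matching_cases` as well as the threshold of 0.85 are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ErrorCase:
    """One personal historical error case (hypothetical record layout)."""
    case_id: str
    vector: list          # semantic vector, assumed precomputed by some embedding model
    correct_result: str   # correct correction result or arbitration conclusion


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_cases(task_vector, library, threshold=0.85):
    """Return the error cases whose similarity to the task exceeds the threshold.

    Results are ordered from most to least similar. The threshold value is
    hypothetical; the text only requires a preset threshold.
    """
    scored = [(cosine(task_vector, case.vector), case) for case in library]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [case for similarity, case in scored if similarity > threshold]


# Usage: a two-case library for one worker, with toy 2-dimensional vectors.
library = [
    ErrorCase("E1", [1.0, 0.1], "label should have been 'complaint'"),
    ErrorCase("E2", [0.0, 1.0], "label should have been 'inquiry'"),
]
hits = matching_cases([1.0, 0.0], library)
```

In this toy example only case `E1` is returned, and its stored correct result is what would be pushed to the worker terminal as a prompt. A real deployment would likely replace the linear scan with a vector index once the library grows.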
Optionally, the plurality of evaluation dimensions comprise at least two of working accuracy of the staff member in a preset time period, standard deviation of historical working accuracy of the staff member and reward points obtained by the staff member due to correction of high-value cases defined by preset ru