KR-102963962-B1 - RISK-ADAPTIVE DUAL-PATH COUNSELING QUALITY EVALUATION SYSTEM AND METHOD


Abstract

The present invention is a risk-adaptive dual-path counseling quality evaluation system and method developed to improve on inefficient existing consultation quality management. The system aims to maximize cost efficiency and evaluation accuracy at the same time by evaluating and verifying, in real time, the consultation quality of human agents and AI chatbots/callbots. Using a dynamic gate function that takes consultation risk (R) and system load (L) into account, the invention branches evaluation into a Lite Route and an LLM Route. This reduces LLM call costs by 70% while maintaining high evaluation accuracy. Furthermore, it saves time and cost through automated evaluation using metrics dedicated to AI bots, and continuously improves the accuracy of the evaluation model through human feedback-based reinforcement learning (RL-HF). Consequently, it achieves an analysis rate of over 99% of all consultations and cost savings of over 80%, providing a consultation quality management solution that can be commercialized immediately across various industries.

Inventors

서영승
임관택
김준수
김경재

Assignees

심랩스 주식회사
비비디글로벌(주)

Dates

Publication Date: 20260513
Application Date: 20250827

Claims (20)

A risk-adaptive dual-path counseling quality evaluation system comprising: a consultation data receiving unit that receives consultation conversation data from a real-time voice stream or text log; a risk score calculation unit that calculates, for the received consultation conversation data, a risk score (R) including multiple sub-indicators based on predefined rules and a machine learning model; an adaptive gate processing unit that monitors the computational resource usage of the system to measure the system load (L), calculates a dynamic gate function G(R, L) = w_r · R + w_l · L (where w_r is the risk weight and w_l is the load weight) using the risk score (R) and the system load (L) as inputs, and selects a precision evaluation path using a large language model (LLM) if the value of the gate function G(R, L) is greater than or equal to a threshold value (θ), and a statistics-based lightweight evaluation path if it is less than the threshold value (θ); a quality score calculation unit that calculates a consultation quality score according to the selected evaluation path; and a quality report generation unit that generates a quality report reflecting the calculated consultation quality score.
The system of claim 1, wherein the risk score calculation unit is configured to calculate, as a sub-indicator of the risk score (R), at least one of a sentiment score of the consultation conversation content, a script deviation score relative to standard response procedures, and a compliance violation score.
The system of claim 1, wherein the adaptive gate processing unit is configured to periodically and automatically adjust the weights and the threshold value of the dynamic gate function to optimize predefined operational goals (cost, latency, accuracy).
The system of claim 1, wherein the quality score calculation unit includes: a lightweight evaluation unit that calculates a quality score based on statistics when, according to the determination of the adaptive gate processing unit, the value of the gate function G(R, L) is less than the threshold value (θ); and an LLM evaluation unit that calculates a quality score using a large language model (LLM) when the value of the gate function G(R, L) is greater than or equal to the threshold value (θ).
The system of claim 4, wherein the LLM evaluation unit includes a Retrieval-Augmented Generation (RAG) module that extracts supplementary information related to the context of the consultation conversation data from a domain knowledge graph or vector database and injects it into a prompt.
The system of claim 1, wherein the quality score calculation unit is configured to additionally evaluate the quality of a handoff section when a handoff section occurs in which the party conducting the consultation conversation switches between a human counselor and an AI bot.
The system of claim 1, further comprising a Bot-QA analysis unit that calculates, for a consultation conversation generated by an AI bot, at least one bot-specific quality indicator selected from the group consisting of response delay time, occurrence of conversation failure (fallback), natural language understanding (NLU) reliability, and persona consistency, and provides it to the quality score calculation unit.
The system of claim 1, further comprising a model tuning unit that receives, through an interactive user interface (UI) linked with the quality report generation unit, feedback in which a quality manager directly specifies whether a specific utterance section of a consultation conversation is a 'script deviation', 'compliance violation', or 'best consultation' section, and that updates the machine learning model of the risk score calculation unit and the large language model (LLM) of the quality score calculation unit by performing human feedback-based reinforcement learning (RL-HF) using the received feedback as a reward signal.
The system of claim 1, wherein the models used in the risk score calculation unit and the quality score calculation unit are multilingual models trained using an Adapter-Fusion method to enable processing of multiple languages.
The system of claim 1, further comprising a federated learning management unit that, when the system is executed on a plurality of distributed nodes, manages federated learning in which the learning results of each node are merged at a central server to update a global model, which is then redistributed to each node.
A risk-adaptive dual-path counseling quality evaluation method performed by a risk-adaptive dual-path counseling quality evaluation system, the method comprising: a step in which a consultation data receiving unit of the system collects consultation conversation data; a step in which a risk score calculation unit of the system calculates, for the collected consultation conversation data, a risk score (R) including a plurality of sub-indicators based on predefined rules and a machine learning model; a step in which an adaptive gate processing unit of the system monitors the computational resource usage of the system to measure the system load (L), calculates a dynamic gate function G(R, L) = w_r · R + w_l · L (where w_r is the risk weight and w_l is the load weight) using the risk score (R) and the system load (L) as inputs, and selects a precision evaluation path using a large language model (LLM) if the value of the gate function G(R, L) is greater than or equal to a threshold value (θ), and a statistics-based lightweight evaluation path if it is less than the threshold value (θ); a step in which a quality score calculation unit of the system calculates a consultation quality score according to the selected evaluation path; and a step in which a quality report generation unit of the system generates a quality report reflecting the calculated consultation quality score.
The method of claim 11, wherein the sub-indicators of the risk score (R) calculated in the step of calculating the risk score (R) include at least one of a sentiment score indicating the intensity of positive or negative sentiment in the consultation conversation content, a script deviation score indicating the difference from a set standard response procedure, and a compliance violation score indicating the possibility of violating laws or internal regulations.
The method of claim 11, wherein, when the consultation conversation data is generated by an AI bot, at least one bot-specific quality indicator selected from the group consisting of response latency, occurrence of a conversation failure (fallback), natural language understanding (NLU) reliability, and persona consistency is additionally calculated and integrated into the quality score in the step of calculating the consultation quality score.
The method of claim 11, wherein the step of calculating the consultation quality score executes a lightweight evaluation route (Lite Route) that calculates a quality score based on statistical features or rules when the value of the dynamic gate function G(R, L) is less than the preset threshold value (θ), and executes an LLM evaluation route (LLM Route) that calculates a quality score using a large language model (LLM) when the value of the gate function G(R, L) is greater than or equal to the threshold value (θ).
The method of claim 14, wherein the LLM evaluation route uses a Retrieval-Augmented Generation (RAG) module to extract supplementary information related to the context of the consultation conversation data from a domain knowledge graph or vector database, and inputs a prompt containing the extracted supplementary information into the large language model to calculate the quality score.
The method of claim 14, wherein the weights (w_r, w_l) and the threshold value (θ) of the dynamic gate function are periodically and automatically adjusted to optimize predefined operational goals (cost, latency, accuracy).
The method of claim 11, wherein, when a handoff section occurs in which the party conducting the consultation conversation switches between a human counselor and an AI bot, the step of calculating the consultation quality score calculates the consultation quality score by additionally evaluating the accuracy of information transfer and the naturalness of the conversation in the handoff section.
The method of claim 11, further comprising, after the step of generating the quality report, a step of receiving from a quality manager, as a reward signal, feedback on the quality report specifying whether a specific utterance section of the consultation conversation is a 'script deviation', 'compliance violation', or 'best consultation' section, and updating, through human feedback-based reinforcement learning (RL-HF), the machine learning model used in the step of calculating the risk score (R) and the large language model used in the step of calculating the consultation quality score.
The method of claim 18, wherein the feedback from the quality manager is received through an interactive user interface (UI) for correcting correct/incorrect labels on evaluation items of the quality report or designating best consultation sections.
The method of claim 11, wherein the machine learning model and the large language model used in the step of calculating the risk score (R) and the step of calculating the consultation quality score are multilingual models trained in an Adapter-Fusion manner to enable processing of multiple languages.
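The gate-based routing recited in claims 1, 4, 11, and 14 can be illustrated with a minimal Python sketch. The claims specify only the form G(R, L) = w_r · R + w_l · L and the comparison against θ. The weight and threshold values, the negative sign of w_l, the use of the maximum to combine sub-indicators into R, and all class and function names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    """Gate parameters. All default values are hypothetical."""
    w_r: float = 1.0     # risk weight
    w_l: float = -0.5    # load weight; assumed negative so high load favors the lite path
    theta: float = 0.4   # routing threshold


def risk_score(sentiment: float, script_deviation: float, compliance: float) -> float:
    """Combine sub-indicators (each assumed to lie in [0, 1]) into R.

    Taking the maximum is an assumption; the claims do not say how
    the sub-indicators are combined.
    """
    return max(sentiment, script_deviation, compliance)


def gate_value(r: float, load: float, cfg: GateConfig) -> float:
    """Dynamic gate function G(R, L) = w_r * R + w_l * L."""
    return cfg.w_r * r + cfg.w_l * load


def select_route(r: float, load: float, cfg: GateConfig) -> str:
    """Return 'llm' when G(R, L) >= theta, otherwise 'lite'."""
    return "llm" if gate_value(r, load, cfg) >= cfg.theta else "lite"


if __name__ == "__main__":
    cfg = GateConfig()
    r = risk_score(sentiment=0.2, script_deviation=0.9, compliance=0.1)
    print(select_route(r, load=0.2, cfg=cfg))    # high risk, low load
    print(select_route(0.3, load=0.8, cfg=cfg))  # low risk, high load
```

With these assumed parameters, a high-risk conversation under light load is sent to the LLM route, while a low-risk conversation under heavy load stays on the lite route. Periodic tuning as in claims 3 and 16 would amount to replacing the GateConfig values over time.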

Description

Risk-Adaptive Dual-Path Counseling Quality Evaluation System and Method
The present invention relates to a technology for evaluating and verifying the quality of an artificial intelligence-based counseling system, and more specifically, to a risk-adaptive dual-path counseling quality evaluation system and method that analyzes voice and text counseling data in real time to automatically inspect and improve the counseling quality of human counselors, and that verifies and evaluates whether AI chatbots and callbots carry out counseling conversations with people properly.
With the recent advancement of artificial intelligence technology, the adoption of AI consultation systems, such as AI chatbots and callbots, is rapidly spreading across industries. While these AI consultation systems offer various benefits, such as increased customer service efficiency and reduced labor costs, they are also causing new problems.
Traditional call center Quality Assurance (QA) relies on listening to samples, resulting in low analysis rates and significant latency, which makes it difficult to accurately diagnose the overall quality of consultations. Purely rule-based QA approaches are limited in handling complex consultation situations, such as mixed emotions or script deviations. Approaches using Large Language Models (LLMs), on the other hand, can provide high accuracy but are inefficient to apply to all consultations because of the massive computational resources and high costs involved. The prior art has lacked an integrated structure capable of dynamically minimizing LLM usage while also evaluating bot-specific metrics. This hinders real-time quality evaluation of all consultations and makes it difficult to evaluate human agent and AI bot consultations using integrated standards.
In particular, bot-specific quality issues such as response delay, fallback, NLU reliability, and persona consistency are difficult to measure accurately using existing human agent evaluation metrics. Furthermore, the absence of an automated learning mechanism that continuously improves the accuracy of evaluation models reduces the efficiency of quality management.
Against this backdrop, there is a growing need for a system and method that evaluates all consultation data in real time to minimize quality deviations, minimizes LLM call costs by taking system load into account, evaluates human consultations and AI bot consultations against standards appropriate to each while presenting them in a single integrated quality report, and automatically improves the accuracy of the evaluation model through continuous learning.
To solve these problems, the present invention applies a combination of a risk-adaptive dual-path pipeline, Retrieval-Augmented Generation (RAG), a domain knowledge graph (KG), quality indicators for AI bots (Bot-QA), and continuous learning based on human feedback-based reinforcement learning (RL-HF).
FIG. 1 is an overall configuration diagram of a risk-adaptive dual-path counseling quality evaluation system according to one embodiment of the present invention.
FIG. 2 is an internal configuration diagram of a risk-adaptive dual-path counseling quality evaluation system according to one embodiment of the present invention.
FIG. 3 is an internal configuration diagram of a processor according to one embodiment of the present invention.
FIG. 4 is an internal configuration diagram of a quality score calculation unit according to one embodiment of the present invention.
FIG. 5 is a flowchart showing the entire process of a risk-adaptive dual-path counseling quality evaluation method according to one embodiment of the present invention.
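As a rough illustration of the bot-specific quality issues named in the background above, the following Python sketch aggregates response delay, fallback occurrence, and NLU confidence over the turns of a bot conversation. The BotTurn record, its field names, and the use of simple averages are assumptions and are not taken from the specification. Persona consistency is left out because it would require a model that the text does not describe.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BotTurn:
    """One bot reply in a conversation (hypothetical record layout)."""
    latency_ms: float      # time from user utterance to bot reply
    is_fallback: bool      # True if the bot failed to handle the turn
    nlu_confidence: float  # intent-classifier confidence, assumed in [0, 1]


def bot_qa_indicators(turns: List[BotTurn]) -> Dict[str, float]:
    """Aggregate per-turn measurements into conversation-level indicators."""
    if not turns:
        raise ValueError("at least one bot turn is required")
    n = len(turns)
    return {
        "mean_latency_ms": sum(t.latency_ms for t in turns) / n,
        "fallback_rate": sum(1 for t in turns if t.is_fallback) / n,
        "mean_nlu_confidence": sum(t.nlu_confidence for t in turns) / n,
    }


if __name__ == "__main__":
    turns = [
        BotTurn(latency_ms=200.0, is_fallback=False, nlu_confidence=0.75),
        BotTurn(latency_ms=400.0, is_fallback=True, nlu_confidence=0.25),
    ]
    print(bot_qa_indicators(turns))
```

Indicators of this kind could be passed to a quality score calculation alongside the human-agent metrics; how they are weighted is not specified in the text.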
The advantages and features of the present invention, and the methods for achieving them, will become clear by referring to the embodiments described below in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments described in this specification are provided to ensure that the disclosure of the invention is complete and to fully inform those skilled in the art of the scope of the invention, and the present invention is defined only by the scope of the claims. Accordingly, in some embodiments, well-known components, well-known operations, and well-known techniques are not specifically described, to avoid the present invention being interpreted ambiguously. Additionally, throughout the specification, the same reference numerals refer to the same components, and the terms used in this specification are for describing embodiments and are not intended to limit the invention. In this specification, the singular form includes the plural form unless specifically stated otherwise in the text, and components and operations referred to as 'comprises' (or 'comprising') do not exclude the presence or addition of o