KR-102963961-B1 - SYSTEM AND METHOD FOR CALCULATING QUALITY INDEX OF MULTI-AGENT-BASED AI COUNSELING SERVICE


Abstract

The present invention relates to a multi-agent-based AI consultation service quality index calculation system and method that quantitatively evaluates and officially certifies the service quality of AI chatbots and callbots in order to address quality management problems of AI consultation systems. According to the present invention, a scenario analysis unit receives AI consultation scenario data and extracts error items such as dead-end nodes and loop structures, and a rule verification unit determines whether predefined UX/VUX rules are violated. Subsequently, a multi-agent evaluation unit evaluates the scenarios and rule violation items in parallel using LLM-based agents with separate roles for diagnosis, question answering (QA), feedback, and reporting. Additionally, an external knowledge verification unit verifies the consistency of responses with external knowledge using a RAG-based QA module. A CSQI calculation unit calculates the CSQI by applying weights to these multiple quality indicators, and finally, a certification and improvement guidance unit issues a quality certificate or provides improvement recommendations based on the CSQI. Compared to manual QA, the present invention significantly reduces the time and cost required for evaluation and expands the scope of analysis. It also reduces customer churn and complaints by eliminating dead-ends and UX errors in advance.
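
As a rough illustration of the weighted calculation described above, the following Python sketch combines several quality indicators into a score from 0 to 100 and then chooses between a certificate and an improvement report. The indicator names follow those listed in the claims. The numeric rates, the weights, the conversion of error rates into scores via 1 - rate, the threshold of 70, the treatment of a score equal to the threshold, and the function names csqi and decide are all assumptions made for illustration and are not taken from the patent.

```python
from typing import Dict


def csqi(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-indicator scores, normalized to the range 0..100.

    Assumes every score is already scaled to [0, 1], where 1 is best.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same indicators")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(scores[name] * weights[name] for name in scores)
    # Dividing by the total weight keeps the result in 0..100 even if the
    # weights do not sum exactly to 1.
    return 100.0 * weighted / total_weight


def decide(csqi_score: float, threshold: float = 70.0) -> str:
    """Hypothetical decision rule: certificate at or above the threshold."""
    return "certificate" if csqi_score >= threshold else "improvement_report"


# Hypothetical measurements: three error rates (lower is better) ...
rates = {"dead_end_rate": 0.05, "loop_rate": 0.10, "ux_violation_rate": 0.20}
# ... converted to scores where higher is better (an assumption of this sketch).
scores = {name: 1.0 - rate for name, rate in rates.items()}
# The response consistency index is assumed to be "higher is better" already.
scores["response_consistency"] = 0.90

# Hypothetical weight table for one industry sector.
weights = {
    "dead_end_rate": 0.3,
    "loop_rate": 0.2,
    "ux_violation_rate": 0.2,
    "response_consistency": 0.3,
}

result = csqi(scores, weights)
outcome = decide(result)
```

In this sketch the weights are a fixed table; the learned, dynamically adjusted weights described in the claims would replace the weights dictionary without changing the summation.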

Inventors

서영승
임관택
김준수
김경재

Assignees

심랩스 주식회사
비비디글로벌(주)

Dates

Publication Date: 20260513
Application Date: 20250827

Claims (18)

A multi-agent-based AI consultation service quality index calculation system, comprising: a scenario analysis unit that receives AI consultation scenario data from a user, converts the scenario data into a state-transition-based graph structure to detect dead-end and loop structures, identifies the scenario structure, and extracts error items; a rule verification unit that determines whether the AI consultation scenario violates predefined UX/VUX rules; a multi-agent evaluation unit comprising a diagnostic agent, a question-and-answer (QA) agent, a feedback agent, and a reporting agent, in which these role-separated LLM (Large Language Model)-based agents evaluate the scenario and the rule violation items in parallel; an external knowledge verification unit that retrieves relevant information from an external knowledge database using an external RAG (Retrieval-Augmented Generation)-based QA module and verifies the consistency between the retrieved information and the evaluation result; a CSQI calculation unit that calculates the CSQI (Customer Service Quality Index) by applying, to multiple quality indicators, weights that are learned by machine learning and dynamically adjusted; and a certification and improvement guidance unit that issues a quality certificate or provides improvement recommendations according to the CSQI.
The system of claim 1, wherein the scenario analysis unit: upon receiving the AI consultation scenario data, parses the metadata of the data to identify the scenario ID, version, language setting, and creation date; converts the parsed scenario data into a state-transition-based graph structure, mapping consultation response sentences, user intents, branching conditions, and transition probability values to each node; and, while traversing the converted graph, determines a path that breaks off without reaching a termination node to be a dead-end node, determines a structure in which the same node is called repeatedly in succession to be a loop, and registers the determined dead-end and loop nodes as error items in a database.
The system of claim 1, wherein the rule verification unit: loads multiple predefined UX/VUX rule sets from local or cloud storage; automatically filters the applicable rules from those rule sets based on industry sector, service characteristics, and the client's security requirements and importance; applies each filtered rule sequentially to the scenario graph to determine whether a violation has occurred; and classifies the determination results by violation type and stores them in a rule violation report data structure.
The system of claim 1, wherein: the multi-agent evaluation unit includes multiple role-separated LLM (Large Language Model)-based agents that respectively perform the roles of diagnosis, question answering (QA), feedback, and reporting; each agent loads a role-specific prompt template and evaluation rubric and evaluates the input scenario and rule violation items in parallel; the evaluation results include a response appropriateness score, a contextual consistency score, a UX/VUX compliance rate, and an indication of whether improvement is needed; and the evaluation results are integrated to generate an internal evaluation result dataset.
The system of claim 1, wherein: the external knowledge verification unit performs internal scenario-based evaluation and RAG (Retrieval-Augmented Generation)-based external knowledge verification selectively or in parallel; the external knowledge verification retrieves the latest information and reference materials related to the evaluation target from external knowledge databases, APIs, and document repositories, extracts key sentences and keywords from the retrieved documents, and compares them with the responses of the multi-agent evaluation unit; and the external knowledge verification unit calculates a consistency score and the reasons for any discrepancies from the comparison and transmits the results to the CSQI calculation unit.
The system of claim 1, wherein the CSQI calculation unit: calculates scores for multiple quality indicators such as dead-end rate, loop rate, UX/VUX violation rate, and response consistency index; depending on industry sector, service characteristics, and the client's security requirements and importance, either refers to a predefined weight table or automatically adjusts the weights using a machine learning algorithm based on historical evaluation data; multiplies each quality indicator score by its adjusted weight and sums the products; and normalizes the result into a CSQI score in the range of 0 to 100.
The system of claim 6, wherein the CSQI calculation unit learns from past evaluation log data and CSQI score changes using a machine learning algorithm to automatically correct and update the weight value of each quality indicator and the UX/VUX rule verification criteria of the rule verification unit.
The system of claim 1, wherein the certification and improvement guidance unit: if the calculated CSQI is above the threshold score, determines the quality grade according to a grade mapping table and automatically generates a quality certificate containing an electronic signature corresponding to the grade; if the CSQI is below the threshold score, generates an improvement report containing an improvement recommendation message for each error item; and provides the certificate or improvement report via API, email, or dashboard UI.
A method for calculating a multi-agent-based AI consultation service quality index, performed by a multi-agent-based AI consultation service quality index calculation system, the method comprising: a step in which a scenario analysis unit of the system converts an AI consultation scenario into a state-transition-based graph structure and detects dead-end and loop structures in the converted graph to extract error items; a step in which a rule verification unit of the system determines whether the AI consultation scenario violates predefined UX/VUX rules; a step in which a multi-agent evaluation unit of the system executes a plurality of role-separated LLM (Large Language Model)-based agents in parallel to evaluate the scenario and the rule violation items; a step in which an external knowledge verification unit of the system verifies consistency with the evaluation result by referring to an external knowledge database; a step in which a CSQI calculation unit of the system calculates the CSQI (Customer Service Quality Index) by applying, to a plurality of quality indicators, weights that are learned by machine learning and dynamically adjusted; and a step in which a certification and improvement guidance unit of the system issues a quality certificate or provides improvement recommendations according to the CSQI.
The method of claim 9, wherein the step of extracting the error items comprises: receiving AI consultation scenario data; converting the data into a state-transition-based graph structure; and detecting dead-end nodes and loop structures in the converted graph structure and registering them as error items.
The method of claim 9, wherein the step of determining whether a rule has been violated comprises: referring to multiple predefined UX/VUX rule sets; and automatically loading and applying a rule set according to industry sector, service type, and user characteristics.
The method of claim 9, wherein the step of performing the evaluation comprises: calling multiple role-separated agents including a diagnostic agent, a question-and-answer (QA) agent, a feedback agent, and a reporting agent; having each agent independently generate results based on predefined prompts or evaluation criteria; and collecting the results in parallel.
The method of claim 9, wherein the verification step performs an internal scenario-based evaluation and an external RAG (Retrieval-Augmented Generation)-based evaluation selectively or in parallel, and the external evaluation retrieves relevant information from an external knowledge database and verifies the consistency between the retrieved information and the agent evaluation results.
The method of claim 9, wherein the step of calculating the CSQI comprises: calculating individual scores by multiplying each quality indicator by a weight; and summing and normalizing the calculated scores to generate a CSQI score between 0 and 100.
The method of claim 14, wherein the weights are dynamically adjusted according to industry sector, service characteristics, customer security requirements, and importance.
The method of claim 14, wherein the weight value of each quality indicator and the UX/VUX rule verification criteria are automatically corrected and updated by learning from past evaluation log data and CSQI score changes using a machine learning algorithm.
The method of claim 9, wherein the step of issuing the quality certificate comprises: assigning a predefined quality grade if the CSQI is above the threshold score; generating an electronic certificate corresponding to that quality grade; and generating a report including an improvement recommendation message for each error item if the CSQI is below the threshold score.
A computer-readable recording medium storing a computer program for executing the method according to any one of claims 9 to 17.
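
The dead-end and loop detection recited in the claims above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the adjacency-list representation, the node names, the explicit set of termination nodes, and the helper names are hypothetical. The sketch also flags every node that lies on any cycle, which is a broader reading of "loop" than a node called repeatedly in succession.

```python
from typing import Dict, List, Set

# Hypothetical representation: node id -> ids of the nodes it can transition to.
Graph = Dict[str, List[str]]


def all_nodes(graph: Graph) -> Set[str]:
    """Every node that appears as a source or as a transition target."""
    return set(graph) | {n for targets in graph.values() for n in targets}


def find_dead_ends(graph: Graph, terminals: Set[str]) -> Set[str]:
    """Nodes with no outgoing transition that are not termination nodes."""
    return {
        n for n in all_nodes(graph)
        if not graph.get(n) and n not in terminals
    }


def reachable_from(graph: Graph, start: str) -> Set[str]:
    """Nodes reachable from start by following at least one transition."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def find_loop_nodes(graph: Graph) -> Set[str]:
    """Nodes that can reach themselves again, i.e. nodes on some cycle."""
    return {n for n in all_nodes(graph) if n in reachable_from(graph, n)}


# Hypothetical consultation scenario.
scenario: Graph = {
    "greet": ["ask_intent"],
    "ask_intent": ["billing", "retry"],
    "retry": ["ask_intent"],            # retry sends the user back: a loop
    "billing": ["goodbye", "lookup_fail"],
    "lookup_fail": [],                  # no way forward and not a terminal
    "goodbye": [],                      # proper termination node
}

dead_ends = find_dead_ends(scenario, terminals={"goodbye"})
loops = find_loop_nodes(scenario)
error_items = sorted(dead_ends | loops)
```

Checking reachability from every node costs roughly O(V·(V+E)), which is acceptable for a sketch; a production system would more likely use a single strongly-connected-components pass. Whether a given cycle counts as an error (for example, a bounded retry prompt) is a policy decision that this sketch does not attempt to make.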

Description

System and Method for Calculating Quality Index of Multi-Agent Based AI Counseling Service

The present invention relates to quality evaluation and certification technology for AI (artificial intelligence)-based consultation systems, and more specifically, to a system and method that quantitatively evaluates the service quality of AI chatbots and callbots in an offline or black-box manner, calculates a Comprehensive Quality Index (CSQI), and automatically issues a graded quality certificate according to the result.

With the recent advancement of artificial intelligence technology, the adoption of AI consultation systems, such as AI chatbots and callbots, is rapidly spreading across various industries. While these AI consultation systems offer benefits such as increased customer service efficiency and reduced labor costs, they are also giving rise to new problems. Customer complaints are rising due to scenario errors, dead-ends, and violations of User Experience (UX) or Voice User Experience (VUX) rules. In particular, "dead-end" phenomena, where customers fail to obtain the desired information or conversations are interrupted, can lead directly to customer churn and are emerging as a serious issue in the quality management of AI consultation services.

Current quality assurance (QA) methods for AI consultation systems rely largely on manual sample testing, which makes it difficult to verify the entire system and provides no credible quality grading scheme. Because AI consultation systems involve large-scale, complex scenarios, manual testing is inefficient at detecting and eliminating potential errors in advance, which can lower customer satisfaction and erode corporate credibility.
Furthermore, conventional systems designed for real-time conversation quality correction are limited in their ability to provide quantitative diagnostics of the entire scenario and formal quality certification after operation. Against this backdrop, there is a growing need for a system and method capable of resolving the quality degradation issues that arise from the proliferation of AI consultation systems, objectively evaluating overall system quality, and calculating and certifying reliable quality indices.

To address these problems, the present invention differs from existing systems focused on real-time conversation quality correction in that it quantitatively diagnoses the entire scenario after operation and provides official quality certification.

FIG. 1 is an overall configuration diagram of a multi-agent-based AI consultation service quality index calculation system according to one embodiment of the present invention.
FIG. 2 is an internal configuration diagram of a multi-agent-based AI consultation service quality index calculation system according to one embodiment of the present invention.
FIG. 3 is an internal configuration diagram of a processor according to one embodiment of the present invention.
FIG. 4 is a flowchart showing a specific operation method of a scenario analysis unit according to one embodiment of the present invention.
FIG. 5 is a flowchart showing a specific operation method of a rule verification unit according to one embodiment of the present invention.
FIG. 6 is a flowchart showing a specific operation method of a multi-agent evaluation unit according to one embodiment of the present invention.
FIG. 7 is a flowchart showing a specific operation method of an external knowledge verification unit according to one embodiment of the present invention.
FIG. 8 is a flowchart showing a specific operation method of a CSQI calculation unit according to one embodiment of the present invention.
FIG. 9 is a flowchart showing a specific operation method of the certification and improvement guidance unit according to one embodiment of the present invention.
FIG. 10 is a flowchart showing the overall process of a method for calculating a multi-agent-based AI consultation service quality index according to one embodiment of the present invention.

The advantages and features of the present invention, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments described in this specification are provided to make the disclosure of the invention complete and to fully inform those skilled in the art of the scope of the invention, and the present invention is defined only by the scope of the claims. Accordingly, in some embodiments, well-known components, well-known operations, and well-known techniques are not described in detail, to avoid obscuring the present invention. Additionally, throughout the specification, the same reference numerals refer to the same