
US-20260127183-A1 - Systems and Methods for Grounded Generation in Network Troubleshooting AI Bots

US 20260127183 A1

Abstract

Systems and methods improve accuracy for retrieving relevant content from a domain knowledgebase. A network device obtains, in response to a query, an initial population of retrieval strategies for retrieving relevant content from the domain knowledgebase. The network device retrieves first documents from the domain knowledgebase based on each retrieval strategy and selects first top-performing strategies from the initial population based on scoring of the retrieved first documents. The network device generates an updated population including the first top-performing strategies and mutant retrieval strategies derived from the selected first top-performing strategies and retrieves second documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies. The network device selects second top-performing strategies from the updated population based on a scoring of the retrieved second documents and provides one of the second top-performing strategies to a large language model (LLM) for responding to the query.

Inventors

  • Gopalakrishnan SANKARANARAYANAN
  • Satish Babu SEELA

Assignees

  • VERIZON PATENT AND LICENSING INC.

Dates

Publication Date
2026-05-07
Application Date
2024-10-11

Claims (20)

  1. A method comprising: obtaining, by a computing device and in response to a query, an initial population of retrieval strategies for retrieving relevant content from a domain knowledgebase; retrieving, by the computing device, first documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; selecting, by the computing device, first top-performing strategies from the initial population based on scoring of the retrieved first documents; generating, by the computing device, an updated population including the first top-performing strategies and mutant retrieval strategies derived from the selected first top-performing strategies; retrieving, by the computing device, second documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; selecting, by the computing device, second top-performing strategies from the updated population based on a scoring of the retrieved second documents; and providing, by the computing device, one of the second top-performing strategies to a large language model (LLM) for responding to the query.
  2. The method of claim 1, wherein selecting the first top-performing strategies includes: generating a fitness score for each retrieval strategy based on the retrieved first documents; and selecting a set of strategies with the highest fitness scores.
  3. The method of claim 2, wherein each fitness score includes a mean reciprocal rank (MRR) and a normalized discounted cumulative gain (NDCG) for the first documents associated with one of the retrieval strategies.
  4. The method of claim 1, wherein selecting the second top-performing strategies includes: generating fitness scores for each mutant retrieval strategy based on the retrieved second documents; and selecting a set of strategies from the updated population with the highest fitness scores.
  5. The method of claim 1, wherein generating the updated population includes: combining parameters of the first selected top-performing strategies to create crossover strategies.
  6. The method of claim 5, wherein combining the parameters further includes: introducing a single-point crossover or a two-point crossover between two of the first selected top-performing strategies.
  7. The method of claim 5, wherein generating the updated population further includes: introducing mutations to terms in the crossover strategies.
  8. The method of claim 1, further comprising: using a highest scored strategy of the second top-performing strategies to generate a response to the query; collecting first candidate model feedback for the highest scored strategy and user feedback from the response to the query; and providing the highest scored strategy, the user feedback, and the first candidate model feedback to a dynamic evaluation set.
  9. The method of claim 8, further comprising: collecting second candidate model feedback for a next-highest scored strategy of the second top-performing strategies; and providing the next-highest scored strategy and the second candidate model feedback to a dynamic evaluation set.
  10. A network device, comprising: one or more processors to: obtain, in response to a query, an initial population of retrieval strategies for retrieving relevant content from a domain knowledgebase; retrieve first documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; select first top-performing strategies from the initial population based on scoring of the retrieved first documents; generate an updated population including the first top-performing strategies and mutant retrieval strategies derived from the selected first top-performing strategies; retrieve second documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; select second top-performing strategies from the updated population based on a scoring of the retrieved second documents; and provide one of the second top-performing strategies to a large language model (LLM) for responding to the query.
  11. The network device of claim 10, wherein, when selecting the first top-performing strategies, the one or more processors are further to: generate a fitness score for each retrieval strategy based on relevance of the retrieved first documents.
  12. The network device of claim 11, wherein, when generating the fitness score, the one or more processors are further to: calculate a mean reciprocal rank (MRR) and a normalized discounted cumulative gain (NDCG) for a portion of the first documents associated with each retrieval strategy.
  13. The network device of claim 10, wherein, when selecting the second top-performing strategies, the one or more processors are further to: generate fitness scores for each mutant retrieval strategy based on the retrieved second documents; and select a set of strategies from the updated population with the highest fitness scores.
  14. The network device of claim 10, wherein, when generating the updated population, the one or more processors are further to: combine parameters of the first selected top-performing strategies to create crossover strategies; and introduce mutations to one or more terms in the crossover strategies.
  15. The network device of claim 10, wherein the one or more processors are further to: use a highest scored strategy of the second top-performing strategies to generate a response to the query; collect first candidate model feedback for the highest scored strategy and user feedback from the response to the query; and provide the highest scored strategy, the user feedback, and the first candidate model feedback to a dynamic evaluation set.
  16. The network device of claim 15, wherein the one or more processors are further to: collect second candidate model feedback for a next-highest scored strategy of the second top-performing strategies; and provide the next-highest scored strategy and the second candidate model feedback to a dynamic evaluation set.
  17. A non-transitory, computer-readable storage medium storing instructions executable by a processor of a computing device for: obtaining, in response to a query, an initial population of retrieval strategies for retrieving relevant content from a domain knowledgebase; retrieving first documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; selecting first top-performing strategies from the initial population based on scoring of the retrieved first documents; generating an updated population including the first top-performing strategies and mutant retrieval strategies derived from the selected first top-performing strategies; retrieving second documents from the domain knowledgebase based on each retrieval strategy of the retrieval strategies; selecting second top-performing strategies from the updated population based on a scoring of the retrieved second documents; and providing one of the second top-performing strategies to a large language model (LLM) for responding to the query.
  18. The non-transitory, computer-readable storage medium of claim 17, wherein the instructions for selecting the first top-performing strategies further include instructions for: generating a mean reciprocal rank (MRR) and a normalized discounted cumulative gain (NDCG) for each retrieval strategy based on the retrieved first documents.
  19. The non-transitory, computer-readable storage medium of claim 17, wherein the instructions for selecting the second top-performing strategies further include instructions for: generating a mean reciprocal rank (MRR) and a normalized discounted cumulative gain (NDCG) for each retrieval strategy based on the retrieved second documents; and selecting a set of strategies with the highest weighted MRR and NDCG scores.
  20. The non-transitory, computer-readable storage medium of claim 17, wherein the instructions for generating the updated population further include instructions for: introducing a single-point crossover or a two-point crossover between two of the first selected top-performing strategies to form crossover strategies; and introducing mutations to parameters in the crossover strategies.
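The claims above describe an evolutionary loop: score every retrieval strategy by the documents it retrieves, keep the top performers, and breed a new generation through crossover and mutation. The sketch below is a hypothetical illustration of that loop, not the patented implementation: the encoding of a strategy as a list of numeric parameters, the `retrieve`/`score` callables, and the Gaussian mutation are all assumptions made for the example.

```python
import random

def evolve_strategies(query, retrieve, score, population,
                      generations=2, top_k=4, mutation_rate=0.1):
    """Evolve retrieval strategies for one query.

    retrieve(query, strategy) -> retrieved documents (assumed callable)
    score(documents) -> fitness value, higher is better (assumed callable)
    Each strategy is assumed to be a list of numeric parameters.
    """
    for _ in range(generations):
        # Evaluate every strategy by scoring the documents it retrieves.
        ranked = sorted(population,
                        key=lambda s: score(retrieve(query, s)),
                        reverse=True)
        # Keep the top-performing strategies (elitism).
        survivors = ranked[:top_k]
        # Breed mutant strategies until the population size is restored.
        children = []
        while len(survivors) + len(children) < len(population):
            a, b = random.sample(survivors, 2)
            # Single-point crossover of the two parents' parameter lists.
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            # Point mutations: perturb individual parameters occasionally.
            child = [p + random.gauss(0, 1)
                     if random.random() < mutation_rate else p
                     for p in child]
            children.append(child)
        population = survivors + children
    # Highest-scored strategy of the final generation, e.g. for the LLM step.
    return population[0]
```

In a real system the parameters might encode things like chunk size, number of retrieved chunks, or keyword weights; because survivors are carried over unchanged, the best fitness never decreases across generations.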

Description

BACKGROUND

Service providers, organizations, or other types of entities or businesses may provide customer service centers, call centers, help desks, tools, and/or other kinds of platforms, for example, to enable a user to resolve an issue or problem. A significant volume of contact with or use of these resources may relate to known and repetitive issues. As such, these resources may not be optimally utilized. For example, the user may be able to solve a given problem with a knowledge article or through a guided self-help measure.

Generative artificial intelligence (AI) systems perform tasks and generate new content based on user input applied to large datasets. Applying generative AI to enterprise applications with guardrails is a common problem faced across enterprises today. Large Language Models (LLMs) are trained on various sources of world knowledge, and appropriate dataset selection can significantly affect the performance and utility of generative AI models.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating concepts described herein; FIG. 2 is a diagram illustrating an exemplary environment 200 in which an exemplary embodiment of the retrieval optimization system may be implemented; FIG. 3 is a diagram illustrating the retrieval optimization system within the context of a Retrieval-Augmented Generation (RAG) environment, according to an implementation; FIG. 4 is a process flow for a genetic algorithm variant that may be applied in the retrieval optimization system to perform optimized augmentation and strategy selection; FIG. 5 is a block diagram illustrating implementation of an optimized augmentation and strategy selection process; FIG. 6 is a diagram illustrating exemplary components of a device according to an implementation described herein; FIG. 7 is a flow diagram illustrating an exemplary process for providing optimized content retrieval for AI bots, according to an implementation described herein; and FIG. 8 illustrates a use case of a retrieval augmentation system for a customer support chatbot, according to an implementation.

DETAILED DESCRIPTION

The following detailed description refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. Also, the following detailed description does not limit the invention.

Retrieval-Augmented Generation (RAG) is a guided method to leverage specific knowledge sources of an enterprise along with the power of Large Language Models (LLMs) to create Artificial Intelligence (AI) systems/bots. As part of the RAG process, retrieving the relevant content from a domain knowledgebase is a critical component, and retrieval accuracy plays a crucial role in the final result of the RAG pattern leveraging LLMs. A majority of generative AI applications use techniques such as vector databases, where data is stored in a chunked format and retrieved as a set of chunks, or graph formats, where data is retrieved using entities and relationship methods. Today there are multiple techniques to retrieve data for use with RAG. In some cases, advanced techniques such as BM25 (Best Matching 25) are used to improve retrieval accuracy. The evaluation of retrieval accuracy from a given corpus is heavily dependent on statistical nearest-distance algorithms or search algorithms. These advanced RAG techniques for retrieval robustness rely on LLM calls, which are expensive operations, or on reranking, which is compute-intensive. A hybrid approach is needed that is cost effective and uses minimal compute resources.
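The claims quantify retrieval accuracy with two standard ranking metrics, mean reciprocal rank (MRR) and normalized discounted cumulative gain (NDCG). For a single query's ranked list of relevance grades, these can be computed as in the following sketch; the function names and the equal weighting in `fitness` are assumptions for illustration (the claims only say the scores may be weighted), and MRR is strictly a mean over many queries, reducing to the reciprocal rank for one query.

```python
import math

def mrr(relevances):
    """Reciprocal rank of the first relevant document (0 if none found)."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def ndcg(relevances):
    """Normalized discounted cumulative gain for one ranked list."""
    dcg = sum(rel / math.log2(rank + 1)
              for rank, rel in enumerate(relevances, start=1))
    ideal = sum(rel / math.log2(rank + 1)
                for rank, rel in enumerate(sorted(relevances, reverse=True),
                                           start=1))
    return dcg / ideal if ideal > 0 else 0.0

def fitness(relevances, w_mrr=0.5, w_ndcg=0.5):
    """Weighted blend of MRR and NDCG (equal weights are an assumed choice)."""
    return w_mrr * mrr(relevances) + w_ndcg * ndcg(relevances)
```

A strategy that retrieves a relevant document at rank 1 in an ideal ordering scores a fitness of 1.0; pushing relevant documents lower in the ranking lowers both components.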
Systems and methods described herein provide a retrieval optimization system to improve the overall robustness of content retrieval, complementing existing retrieval sources such as vector databases and graph databases. According to an embodiment, the retrieval optimization system may be implemented in a service desk or product support environment and/or context that uses generative AI applications. According to other exemplary embodiments, the retrieval optimization system may be implemented in other types of environments and/or contexts to which content recommendation may be suited. Implementations described herein provide an extension of a genetic algorithm for chromosome analysis: the techniques used to select candidates from a complex pool may be applied to retrieval problems encountered in generative AI applications.

FIG. 1 illustrates concepts described herein. As shown in FIG. 1, a conventional RAG process includes retrieval from a domain knowledge data set, followed by augmentation and generation. Typically, in the retrieval process, a user query is converted to a vector representation and matched against vector databases from the domain knowledge data set. Next, the RAG model augments the user input by adding the relevant retrieved data in context and passes the augmented input to an LLM. The LLM co