US-12619663-B1 - Follow-up question recommender for AI assistant
Abstract
One or more follow-up response recommendations relating to a question-answer event processed by an artificial intelligence assistant are generated. The one or more follow-up response recommendations identify questions from a predefined response library. A weighted average of embeddings is generated for each question-answer conversation of historical question-answer events collected via telemetry. A historical question-answer conversation includes a chronologically ordered sequence of question-answer events. A self-supervised weighted embedding dataset is generated and includes the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor question from the predefined response library. Topic weights of a follow-up question recommender are tuned by training the follow-up question recommender using the self-supervised weighted embedding dataset. The follow-up question recommender executes in an inference mode on the question-answer event to generate the one or more follow-up response recommendations of questions from the predefined response library.
Inventors
- Laurent BOUÉ
- Swarnim Narayan
- RAVI PRASAD KONDAPALLI
- Vijay Srinivas Agneeswaran
- Naveen Panwar
Assignees
- MICROSOFT TECHNOLOGY LICENSING, LLC
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-02-19
Claims (20)
- 1. A method of generating one or more response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more response recommendations identify responses from a predefined response library, the method comprising: generating a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical question-answer conversation includes a chronologically ordered sequence of question-answer events; generating a self-supervised weighted embedding dataset including the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor response from the predefined response library; tuning topic weights of a response recommender by training the response recommender using the self-supervised weighted embedding dataset; executing the response recommender in an inference mode on the question-answer event to generate the one or more response recommendations of responses from the predefined response library; and displaying the one or more response recommendations in a user interface.
- 2. The method of claim 1, wherein the responses include questions, the one or more response recommendations include one or more follow-up question recommendations, the at least one corresponding semantic nearest neighbor response includes at least one corresponding semantic nearest neighbor question, the predefined response library includes a predefined question library, and the response recommender includes a follow-up question recommender.
- 3. The method of claim 2, wherein generating the weighted average of embeddings comprises: weighting each event embedding with a corresponding topic weight to yield a weighted event embedding for each historical question-answer event of the historical question-answer conversation, wherein the weighted average of embeddings includes a summation of weighted event embeddings corresponding to each historical question-answer event of the historical question-answer conversation.
- 4. The method of claim 3, further comprising: generating a historical question embedding for each historical question-answer event of the historical question-answer conversation using an embedding model; classifying each historical question-answer event of a historical question-answer conversation according to the predefined artificial intelligence assistant topics; and annotating each predefined artificial intelligence assistant topic corresponding to each historical question-answer event of the historical question-answer conversation with a weight to yield a topic weight for each historical question-answer event of the historical question-answer conversation.
- 5. The method of claim 4, wherein the at least one corresponding semantic nearest neighbor question is selected by calculating a semantic distance between each predefined question embedding of the predefined response library and a question embedding from the historical question-answer conversation and selecting a predefined question embedding having a closest semantic distance from the question embedding.
- 6. The method of claim 2, wherein tuning weights of the follow-up question recommender comprises: tuning the topic weights as parameters of the follow-up question recommender via backpropagation concurrently with tuning other parameters of the follow-up question recommender via hyperparameter optimization techniques.
- 7. The method of claim 1, wherein the one or more response recommendations include answers, the predefined response library includes a semantic cache of answers, and the response recommender processes the semantic cache to select an answer responsive to the question-answer event.
- 8. A computing system for generating one or more response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more response recommendations identify questions from a predefined response library, the computing system comprising: memory; one or more hardware processors; a weighted average generator storable in the memory, executable by the one or more hardware processors, and configured to generate a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical question-answer conversation includes a chronologically ordered sequence of question-answer events and to generate a self-supervised weighted embedding dataset including the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor question from the predefined response library; and a recommendation model trainer storable in the memory, executable by the one or more hardware processors, and configured to tune topic weights of a response recommender by training the response recommender using the self-supervised weighted embedding dataset, wherein the artificial intelligence assistant executes the response recommender in an inference mode on the question-answer event to generate the one or more response recommendations of questions from the predefined response library, and to display the one or more response recommendations in a user interface.
- 9. The computing system of claim 8, wherein the one or more response recommendations include follow-up questions, the predefined response library includes a predefined question library, and the response recommender includes a follow-up question recommender.
- 10. The computing system of claim 9, wherein the weighted average generator is further configured to weight each event embedding with a corresponding topic weight to yield a weighted event embedding for each historical question-answer event of the historical question-answer conversation, wherein the weighted average of embeddings includes a summation of weighted event embeddings corresponding to each historical question-answer event of the historical question-answer conversation.
- 11. The computing system of claim 10, further comprising: an embedding model storable in the memory, executable by the one or more hardware processors, and configured to generate a historical question embedding for each historical question-answer event of the historical question-answer conversation using an embedding model; and a tuned topic model classifier storable in the memory, executable by the one or more hardware processors, and configured to classify each historical question-answer event of a historical question-answer conversation according to the predefined artificial intelligence assistant topics, wherein each predefined artificial intelligence assistant topic corresponding to each historical question-answer event of the historical question-answer conversation is annotated with a weight to yield a topic weight for each historical question-answer event of the historical question-answer conversation.
- 12. The computing system of claim 11, wherein the at least one corresponding semantic nearest neighbor question is selected by calculating a semantic distance between each predefined question embedding of the predefined response library and a question embedding from the historical question-answer conversation and selecting a predefined question embedding having a closest semantic distance from the question embedding.
- 13. The computing system of claim 9, wherein the recommendation model trainer is further configured to tune weights of the follow-up question recommender by tuning the topic weights as parameters of the follow-up question recommender via backpropagation concurrently with tuning other parameters of the follow-up question recommender via hyperparameter optimization techniques.
- 14. The computing system of claim 8, wherein the one or more response recommendations include answers, the predefined response library includes a semantic cache of answers, and the response recommender processes the semantic cache to select an answer responsive to the question-answer event.
- 15. One or more tangible processor-readable storage media embodied with instructions for executing on one or more processors and circuits of a computing device a process for generating one or more response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more response recommendations identify responses from a predefined response library, the process comprising: generating a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical question-answer conversation includes a chronologically ordered sequence of question-answer events; generating a self-supervised weighted embedding dataset including the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor response from the predefined response library; tuning topic weights of a response recommender by training the response recommender using the self-supervised weighted embedding dataset; executing the response recommender in an inference mode on the question-answer event to generate the one or more response recommendations of responses from the predefined response library; and displaying the one or more response recommendations in a user interface.
- 16. The one or more tangible processor-readable storage media of claim 15, wherein generating the weighted average of embeddings comprises: weighting each event embedding with a corresponding topic weight to yield a weighted event embedding for each historical question-answer event of the historical question-answer conversation, wherein the weighted average of embeddings includes a summation of weighted event embeddings corresponding to each historical question-answer event of the historical question-answer conversation.
- 17. The one or more tangible processor-readable storage media of claim 16, further comprising: generating a historical question embedding for each historical question-answer event of the historical question-answer conversation using an embedding model; classifying each historical question-answer event of a historical question-answer conversation according to the predefined artificial intelligence assistant topics; and annotating each predefined artificial intelligence assistant topic corresponding to each historical question-answer event of the historical question-answer conversation with a weight to yield a topic weight for each historical question-answer event of the historical question-answer conversation.
- 18. The one or more tangible processor-readable storage media of claim 17, wherein the responses include questions, the one or more response recommendations include one or more follow-up question recommendations, the at least one corresponding semantic nearest neighbor response includes at least one corresponding semantic nearest neighbor question, the predefined response library includes a predefined question library, the response recommender includes a follow-up question recommender, and the at least one corresponding semantic nearest neighbor question is selected by calculating a semantic distance between each predefined question embedding of the predefined response library and a question embedding from the historical question-answer conversation and selecting a predefined question embedding having a closest semantic distance from the question embedding.
- 19. The one or more tangible processor-readable storage media of claim 15, wherein tuning weights of the response recommender comprises: tuning the topic weights as parameters of the response recommender via backpropagation concurrently with tuning other parameters of the response recommender via hyperparameter optimization techniques.
- 20. The one or more tangible processor-readable storage media of claim 15, wherein the one or more response recommendations include answers, the predefined response library includes a semantic cache of answers, and the response recommender processes the semantic cache to select an answer responsive to the question-answer event.
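The dataset-construction steps recited in the claims (topic-weighted averaging of event embeddings, followed by selection of a semantic nearest neighbor from the predefined library) can be illustrated with a minimal Python sketch. Several details here are assumptions made only for illustration and are not taken from the claims: the `(embedding, topic)` event layout, normalization of the weighted sum by the total weight, cosine distance as the semantic distance, and the use of the last question of a conversation as the target whose nearest library question becomes the label. All function and variable names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Event = Tuple[Vector, str]  # (question embedding, topic label); hypothetical layout


def weighted_average(events: Sequence[Event], topic_weights: Dict[str, float]) -> Vector:
    """Sum each event embedding scaled by its topic weight, then normalize."""
    total = [0.0] * len(events[0][0])
    weight_sum = 0.0
    for embedding, topic in events:
        w = topic_weights[topic]
        weight_sum += w
        for i, x in enumerate(embedding):
            total[i] += w * x
    # Assumption: divide by the weight sum (positive weights) to get an average.
    return [x / weight_sum for x in total]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Assumed semantic distance: 1 - cosine similarity (non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_library_question(query: Vector, library: Sequence[Vector]) -> int:
    """Index of the library embedding with the smallest distance to the query."""
    return min(range(len(library)), key=lambda j: cosine_distance(query, library[j]))


def build_dataset(
    conversations: Sequence[Sequence[Event]],
    library: Sequence[Vector],
    topic_weights: Dict[str, float],
) -> List[Tuple[Vector, int]]:
    """Pair each conversation's weighted context with a nearest-neighbor label."""
    dataset = []
    for events in conversations:
        if len(events) < 2:
            continue  # need at least one context event and one target event
        context = events[:-1]
        target_embedding = events[-1][0]  # assumption: last question is the target
        label = nearest_library_question(target_embedding, library)
        dataset.append((weighted_average(context, topic_weights), label))
    return dataset


# Toy data: two topics, a three-question library, one three-event conversation.
TOPIC_WEIGHTS = {"billing": 2.0, "setup": 1.0}
LIBRARY = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
CONVERSATIONS = [[([1.0, 0.0], "billing"), ([0.0, 1.0], "setup"), ([0.1, 0.9], "setup")]]
DATASET = build_dataset(CONVERSATIONS, LIBRARY, TOPIC_WEIGHTS)
```

No manual labels are needed in this sketch: the label for each example comes from the conversation itself, which is the sense in which the dataset is self-supervised.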
Description
BACKGROUND
Modern artificial intelligence assistants (AI assistants) can provide a conversational session with a user, generally simulating human-to-human conversations. For example, if a user asks an AI assistant a question (e.g., “What was the name of the first U.S. President?”), the AI assistant may answer, “The name of the first U.S. President is George Washington” and then recommend three follow-up questions (e.g., “When was George Washington born?”, “What were George Washington's major accomplishments?”, and “Who succeeded George Washington as President?”) to guide the user toward a deeper investigation following from the initial question. The more relevant and natural the follow-up questions provided by the AI assistant, the more engaged and satisfied the user will be, especially when the user is unsure about what question to ask next or about how to ask a better question to get a better result. However, developing an AI assistant that actually recommends relevant follow-up questions is challenging on many levels, including, without limitation, training, accuracy, and scaling with memory and storage consumption.
SUMMARY
In some aspects, the techniques described herein relate to a method of generating one or more response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more response recommendations identify questions from a predefined response library, the method including: generating a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical question-answer conversation includes a chronologically ordered sequence of question-answer events; generating a self-supervised weighted embedding dataset including the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor question from the predefined response library; tuning topic weights of a question recommender by training the question recommender using the self-supervised weighted embedding dataset; executing the question recommender in an inference mode on the question-answer event to generate the one or more response recommendations of questions from the predefined response library; and displaying the one or more response recommendations in a user interface.
In some aspects, the techniques described herein relate to a computing system for generating one or more response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more response recommendations identify questions from a predefined response library, the computing system including: memory; one or more hardware processors; a weighted average generator storable in the memory, executable by the one or more hardware processors, and configured to generate a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical question-answer conversation includes a chronologically ordered sequence of question-answer events and to generate a self-supervised weighted embedding dataset including the weighted average of embeddings for each historical question-answer conversation and at least one corresponding semantic nearest neighbor question from the predefined response library; and a recommendation model trainer storable in the memory, executable by the one or more hardware processors, and configured to tune topic weights of a question recommender by training the question recommender using the self-supervised weighted embedding dataset, wherein the artificial intelligence assistant executes the question recommender in an inference mode on the question-answer event to generate the one or more response recommendations of questions from the predefined response library, and to display the one or more response recommendations in a user interface. 
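The role of the recommendation model trainer, which tunes topic weights by training the recommender on the self-supervised dataset, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the described implementation: the recommender is modeled as a softmax classifier that scores each library question by a dot product with an unnormalized topic-weighted sum of event embeddings, the topic weights are the only trainable parameters, and the gradient is written out by hand instead of using an automatic-differentiation framework. The concurrent hyperparameter optimization of other parameters mentioned in the claims is not shown. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Event = Tuple[Vector, str]          # (question embedding, topic label)
Example = Tuple[List[Event], int]   # (context events, index of target library question)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def context_vector(events: Sequence[Event], weights: Dict[str, float]) -> Vector:
    """Unnormalized topic-weighted sum of event embeddings (an assumption)."""
    c = [0.0] * len(events[0][0])
    for embedding, topic in events:
        for i, x in enumerate(embedding):
            c[i] += weights[topic] * x
    return c


def loss_and_grads(example: Example, library: Sequence[Vector], weights: Dict[str, float]):
    """Cross-entropy loss and its gradient with respect to each topic weight."""
    events, label = example
    c = context_vector(events, weights)
    probs = softmax([dot(c, q) for q in library])
    loss = -math.log(probs[label])
    # dL/dc = sum_j (p_j - 1[j == label]) * q_j
    grad_c = [0.0] * len(c)
    for j, q in enumerate(library):
        coeff = probs[j] - (1.0 if j == label else 0.0)
        for i, x in enumerate(q):
            grad_c[i] += coeff * x
    # Chain rule: dL/dw_t = sum over events with topic t of (e_i . dL/dc)
    grads = {t: 0.0 for t in weights}
    for embedding, topic in events:
        grads[topic] += dot(embedding, grad_c)
    return loss, grads


def train(examples: Sequence[Example], library: Sequence[Vector],
          initial_weights: Dict[str, float], lr: float = 0.5, epochs: int = 50):
    """Full-batch gradient descent on the topic weights; returns weights and loss history."""
    weights = dict(initial_weights)
    history = []
    for _ in range(epochs):
        total_loss = 0.0
        accum = {t: 0.0 for t in weights}
        for example in examples:
            loss, grads = loss_and_grads(example, library, weights)
            total_loss += loss
            for t, g in grads.items():
                accum[t] += g
        for t in weights:
            weights[t] -= lr * accum[t] / len(examples)
        history.append(total_loss / len(examples))
    return weights, history


# Toy run: the target matches the "billing" event, so its weight should grow.
LIBRARY = [[1.0, 0.0], [0.0, 1.0]]
EXAMPLES = [([([1.0, 0.0], "billing"), ([0.0, 1.0], "setup")], 0)]
TUNED, HISTORY = train(EXAMPLES, LIBRARY, {"billing": 1.0, "setup": 1.0})
```

In this toy run, the weight of the topic whose events point toward the target question increases while the other decreases, which is the behavior the tuning step is meant to produce. A practical system would more likely rely on an automatic-differentiation library than on hand-derived gradients.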
In some aspects, the techniques described herein relate to one or more tangible processor-readable storage media embodied with instructions for executing on one or more processors and circuits of a computing device a process for generating one or more follow-up response recommendations relating to a question-answer event processed by an artificial intelligence assistant, wherein the artificial intelligence assistant supports predefined artificial intelligence assistant topics and the one or more follow-up response recommendations identify questions from a predefined response library, the process including: generating a weighted average of embeddings for each question-answer conversation of historical question-answer events collected via telemetry, wherein a historical questi