US-20260127179-A1 - MULTI-CHANNEL SEARCH AND AGGREGATED SCORING TECHNIQUES FOR COMPLEX SEARCH DOMAINS
Abstract
Various embodiments of the present disclosure provide query processing techniques for resolving queries in a complex search domain to improve upon traditional search resolutions within such domains. The techniques may include generating a keyword and an embedding representation for an agnostic search query. The keyword representation may be compared against source text attributes within one or more domain channels to generate a plurality of keyword similarity scores between the search query and features within a search domain. The embedding representation may be compared against source embedding attributes within the one or more domain channels to generate a plurality of embedding similarity scores between the search query and the features within the search domain. The keyword and embedding similarity scores may be aggregated to generate aggregated similarity scores for identifying an intermediate query resolution for the search query. The intermediate query resolution may be leveraged to resolve the query.
Inventors
- Yizhao NI
- Cem Unsal
- Harsh M. Maheshwari
- Ramin Anushiravani
- Nicholas Paul GRAMSTAD
- Ayush Tomar
Assignees
- OPTUM, INC.
Dates
- Publication Date
- 20260507
- Application Date
- 20260105
Claims (20)
- 1 . A computer-implemented method comprising: receiving, by one or more processors, a search query; generating, by the one or more processors and based on the search query, a plurality of similarity scores associated with a plurality of domain channels, wherein: (i) a first similarity score of the plurality of similarity scores corresponds to a first channel-specific feature of a plurality of first channel-specific features from a first domain channel of the plurality of domain channels, (ii) a second similarity score of the plurality of similarity scores corresponds to a second channel-specific feature of a plurality of second channel-specific features from a second domain channel of the plurality of domain channels, and (iii) the first channel-specific feature and the second channel-specific feature are associated with one or more query result data objects of a plurality of query result data objects; generating, by the one or more processors, an intermediate query resolution for the search query based on the plurality of similarity scores, wherein the intermediate query resolution comprises at least one of the first channel-specific feature or the second channel-specific feature; and providing, by the one or more processors, data indicative of at least one of the one or more query result data objects of the plurality of query result data objects based on the intermediate query resolution.
- 2 . The computer-implemented method of claim 1 , wherein the plurality of first channel-specific features corresponds to a first topic type, and the plurality of second channel-specific features corresponds to a second topic type different from the first topic type.
- 3 . The computer-implemented method of claim 2 , wherein the plurality of domain channels further comprises: (i) a third domain channel comprising a plurality of third channel-specific features corresponding to a third topic type; (ii) a fourth domain channel comprising a plurality of fourth channel-specific features corresponding to a fourth topic type; and (iii) a fifth domain channel comprising a plurality of fifth channel-specific features corresponding to a fifth topic type.
- 4 . The computer-implemented method of claim 1 , wherein the first domain channel and the second domain channel comprise divided information verticals within a domain knowledge datastore.
- 5 . The computer-implemented method of claim 1 , wherein the search query comprises a natural language text sequence, and a similarity score of the plurality of similarity scores is based on at least one of a keyword representation of the natural language text sequence or an embedding representation for the natural language text sequence.
- 6 . The computer-implemented method of claim 5 , wherein the first similarity score is based on the at least one of the keyword representation or the embedding representation for the natural language text sequence and at least one of a source text attribute or a source embedding attribute for the first channel-specific feature.
- 7 . The computer-implemented method of claim 6 , wherein the first similarity score comprises a weighted combination of (i) a keyword similarity score based on the keyword representation and the source text attribute and (ii) an embedding similarity score based on the embedding representation and the source embedding attribute.
- 8 . The computer-implemented method of claim 5 , further comprising: generating, using a language model, an expanded query for the search query; and generating at least one of the keyword representation or the embedding representation for the search query based on the expanded query.
- 9 . The computer-implemented method of claim 8 , wherein the language model is trained based on the plurality of first channel-specific features and the plurality of second channel-specific features.
- 10 . The computer-implemented method of claim 8 , wherein the language model is trained based on the first domain channel of the plurality of domain channels.
- 11 . A system comprising: one or more processors; and one or more memories storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving a search query; generating, based on the search query, a plurality of similarity scores associated with a plurality of domain channels, wherein: (i) a first similarity score of the plurality of similarity scores corresponds to a first channel-specific feature of a plurality of first channel-specific features from a first domain channel of the plurality of domain channels, (ii) a second similarity score of the plurality of similarity scores corresponds to a second channel-specific feature of a plurality of second channel-specific features from a second domain channel of the plurality of domain channels, and (iii) the first channel-specific feature and the second channel-specific feature are associated with one or more query result data objects of a plurality of query result data objects; generating an intermediate query resolution for the search query based on the plurality of similarity scores, wherein the intermediate query resolution comprises at least one of the first channel-specific feature or the second channel-specific feature; and providing data indicative of at least one of the one or more query result data objects of the plurality of query result data objects based on the intermediate query resolution.
- 12 . The system of claim 11 , wherein the plurality of first channel-specific features corresponds to a first topic type, and the plurality of second channel-specific features corresponds to a second topic type different from the first topic type.
- 13 . The system of claim 12 , wherein the plurality of domain channels further comprises: (i) a third domain channel comprising a plurality of third channel-specific features corresponding to a third topic type; (ii) a fourth domain channel comprising a plurality of fourth channel-specific features corresponding to a fourth topic type; and (iii) a fifth domain channel comprising a plurality of fifth channel-specific features corresponding to a fifth topic type.
- 14 . The system of claim 11 , wherein the first domain channel and the second domain channel comprise divided information verticals within a domain knowledge datastore.
- 15 . The system of claim 11 , wherein the search query comprises a natural language text sequence, and a similarity score of the plurality of similarity scores is based on at least one of a keyword representation of the natural language text sequence or an embedding representation for the natural language text sequence.
- 16 . The system of claim 15 , wherein the first similarity score is based on the at least one of the keyword representation or the embedding representation for the natural language text sequence and at least one of a source text attribute or a source embedding attribute for the first channel-specific feature.
- 17 . The system of claim 16 , wherein the first similarity score comprises a weighted combination of (i) a keyword similarity score based on the keyword representation and the source text attribute and (ii) an embedding similarity score based on the embedding representation and the source embedding attribute.
- 18 . One or more non-transitory computer-readable media storing processor-executable instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: receiving a search query; generating, based on the search query, a plurality of similarity scores associated with a plurality of domain channels, wherein: (i) a first similarity score of the plurality of similarity scores corresponds to a first channel-specific feature of a plurality of first channel-specific features from a first domain channel of the plurality of domain channels, (ii) a second similarity score of the plurality of similarity scores corresponds to a second channel-specific feature of a plurality of second channel-specific features from a second domain channel of the plurality of domain channels, and (iii) the first channel-specific feature and the second channel-specific feature are associated with one or more query result data objects of a plurality of query result data objects; generating an intermediate query resolution for the search query based on the plurality of similarity scores, wherein the intermediate query resolution comprises at least one of the first channel-specific feature or the second channel-specific feature; and providing data indicative of at least one of the one or more query result data objects of the plurality of query result data objects based on the intermediate query resolution.
- 19 . The one or more non-transitory computer-readable media of claim 18 , wherein the operations further comprise: generating, using a language model, an expanded query for the search query; and generating at least one of the keyword representation or the embedding representation for the search query based on the expanded query.
- 20 . The one or more non-transitory computer-readable media of claim 19 , wherein the language model is trained based on the plurality of first channel-specific features and the plurality of second channel-specific features.
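The flow recited in claims 1 and 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `ChannelFeature`, `aggregated_score`, `intermediate_resolution`, and `resolve` are hypothetical, the per-feature scores are made-up example values, and the choice of weights that sum to 1 and of a simple top-k cut are assumptions that the claims do not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelFeature:
    channel: str            # domain channel, e.g. "condition" or "specialty"
    name: str               # channel-specific feature within that channel
    keyword_score: float    # keyword similarity between query and feature
    embedding_score: float  # embedding similarity between query and feature
    result_ids: tuple       # query result data objects tied to this feature


def aggregated_score(feature, keyword_weight=0.4):
    """Weighted combination of the two similarity scores (cf. claim 7).

    Assumption: the two weights sum to 1.
    """
    return (keyword_weight * feature.keyword_score
            + (1.0 - keyword_weight) * feature.embedding_score)


def intermediate_resolution(features, top_k=2, keyword_weight=0.4):
    """Pick the best-scoring features across all channels (assumed top-k rule)."""
    ranked = sorted(features,
                    key=lambda f: aggregated_score(f, keyword_weight),
                    reverse=True)
    return ranked[:top_k]


def resolve(features, top_k=2, keyword_weight=0.4):
    """Map the intermediate resolution to result objects, without duplicates."""
    seen, results = set(), []
    for feature in intermediate_resolution(features, top_k, keyword_weight):
        for result_id in feature.result_ids:
            if result_id not in seen:
                seen.add(result_id)
                results.append(result_id)
    return results


# Illustrative scores for the query "my kid's stomach hurts all the time".
FEATURES = [
    ChannelFeature("condition", "stomach pain", 0.5, 0.9, ("provider-1", "provider-2")),
    ChannelFeature("condition", "ankle sprain", 0.0, 0.1, ("provider-3",)),
    ChannelFeature("specialty", "gastroenterology", 0.0, 0.8, ("provider-2", "provider-4")),
    ChannelFeature("specialty", "orthopedics", 0.0, 0.2, ("provider-3",)),
]

if __name__ == "__main__":
    for feature in intermediate_resolution(FEATURES):
        print(feature.channel, feature.name, round(aggregated_score(feature), 2))
    print(resolve(FEATURES))
```

In this sketch the intermediate query resolution holds features from two different channels, and the returned result objects are the ones associated with those features, which mirrors the two-stage structure of claim 1.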
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. application Ser. No. 18/390,940, entitled “MULTI-CHANNEL SEARCH AND AGGREGATED SCORING TECHNIQUES FOR COMPLEX SEARCH DOMAINS,” filed Dec. 20, 2023, which claims the benefit of U.S. Provisional Application No. 63/578,461, entitled “MULTI-MODAL, MULTI-CHANNEL SEARCH,” filed Aug. 24, 2023, the entire contents of which are hereby incorporated by reference.
BACKGROUND
Various embodiments of the present disclosure address technical challenges related to search query resolutions generally and, more specifically, to generating comprehensive query resolutions for complex search domains. Traditionally, query resolutions are retrieved using basic keyword searching techniques on limited search result features, such as provider names, addresses, specialties, and/or the like for a clinical domain. In some cases, these searches may be augmented with primitive structured filters, such as provider spoken languages, distances, and/or the like, to narrow down returned results. In such cases, a user is required to fill in a lengthy form (e.g., one or more input fields, etc.) to complete a search query, and very often, the search is not effective because the user lacks sufficient knowledge of a particular search domain (e.g., a user may not know what clinical specialty is needed for a condition, etc.).
By way of example, in a clinical domain, a user's child may experience stomach pain for one week, causing the user to look for a provider to treat the condition. However, the user may not understand the condition or provider specialties well enough to search for or recognize a correct provider. As such, the user may enter a search query indicative (e.g., including identifiers, such as International Classification of Diseases (ICD) codes, textual descriptions of a condition, etc.) of the condition, such as the natural language text sequence “my kid's stomach hurts all the time” and constrain the results to providers within 50 miles of the user's home. Such a search query may return null results due to a lack of keyword matches between provider features and the keywords “stomach” and “hurts.” As shown by the example, traditional searching techniques are limited to users with sufficient knowledge of a search domain. Various embodiments of the present disclosure make important contributions to traditional search query resolution techniques by addressing this technical challenge, among others.
BRIEF SUMMARY
Various embodiments of the present disclosure provide multi-modal and multi-channel search solutions to intelligently enhance search queries and aggregate multi-channel results for a search query that enable comprehensive query resolutions for generic search results. Using some of the techniques of the present disclosure, a search query may be transformed into multiple complementary representations, such as a keyword and embedding representation, to measure a syntactic and semantic similarity between the search query and features across multiple domain channels of a search domain. These measures may be aggregated to identify multi-channel features that correspond to the search query, which may be used to identify and augment a final search resolution. In this way, some of the techniques of the present disclosure provide searching capabilities with deeper semantic and contextual understanding of search queries beyond literal words (e.g., interpreting “stomach hurts” as “stomach pain” or, more generally, “upper abdominal pain”).
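The gap between literal keyword matching and semantic matching described above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the disclosure does not name a keyword measure or an embedding model, so this example uses Jaccard overlap for the keyword side, cosine similarity for the embedding side, and a hand-built, hypothetical table of three-dimensional word vectors (`TOY_VECTORS`) in place of a trained model.

```python
import math
import re

# Hypothetical stopword list for this example only.
STOPWORDS = {"my", "the", "all", "a", "of", "s"}

# Hypothetical toy "embeddings", chosen by hand so that related words point
# in similar directions. A real system would use a trained embedding model.
TOY_VECTORS = {
    "stomach": (1.0, 0.0, 0.0),
    "abdominal": (0.9, 0.1, 0.0),
    "hurts": (0.0, 1.0, 0.0),
    "pain": (0.1, 0.9, 0.0),
    "ankle": (0.0, 0.0, 1.0),
    "sprain": (0.0, 0.3, 0.9),
}


def keyword_representation(text):
    """Lowercased word set with stopwords removed."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in STOPWORDS}


def embedding_representation(text):
    """Mean of the toy vectors of the known words in the text."""
    vectors = [TOY_VECTORS[t] for t in keyword_representation(text) if t in TOY_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def keyword_similarity(query, feature_text):
    """Jaccard overlap between the two keyword sets (assumed measure)."""
    a, b = keyword_representation(query), keyword_representation(feature_text)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def embedding_similarity(query, feature_text):
    """Cosine similarity between the two embedding representations."""
    u, v = embedding_representation(query), embedding_representation(feature_text)
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


QUERY = "my kid's stomach hurts all the time"

if __name__ == "__main__":
    for feature_text in ("upper abdominal pain", "ankle sprain"):
        print(feature_text,
              round(keyword_similarity(QUERY, feature_text), 2),
              round(embedding_similarity(QUERY, feature_text), 2))
```

With these toy vectors, the query shares no keywords with “upper abdominal pain” yet is close to it in the embedding space, while “ankle sprain” scores low on both measures. That is the behavior the two complementary representations are meant to capture.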
In addition, or alternatively, the search capability may match query intents with proper query resolutions (e.g., matching condition “stomach pain” with specialty “gastroenterology”), while adopting user preferences (e.g., location, spoken languages) and aggregating and disseminating results from multiple sources (e.g., from provider profiles, claim data, specialty information) in an efficient manner.
In some embodiments, a computer-implemented method includes generating, by one or more processors, a keyword representation and an embedding representation for a search query; generating, by the one or more processors, a plurality of keyword similarity scores between the keyword representation and a plurality of source text attributes from a domain channel; generating, by the one or more processors, a plurality of embedding similarity scores between the embedding representation and a plurality of source embedding attributes from the domain channel; generating, by the one or more processors, an intermediate query resolution for the search query based on a plurality of aggregated similarity scores each including a weighted combination of (i) a keyword similarity score of the plurality of keyword similarity scores and (ii) an embedding similarity score of the plurality of embedding similarity scores; and providing, by the one or more processors, data indicative of a query resolution based on the intermediate query resolution.
In some