US-12619905-B2 - Flexible embedding systems and methods for real-time comparisons

Abstract

Systems and methods for selecting items from a pool of items based on comparisons of composite embeddings are disclosed. A first initial embedding is obtained from a first database for each item in a pool of items and a second initial embedding is obtained from a second database for each item in the pool of items. The first initial embedding is generated using a first embedding model and the second initial embedding is generated using a second embedding model. A first composite embedding comprising the first initial embedding and the second initial embedding is generated for each item in the pool of items, and the first composite embedding for each item in the pool of items is compared to a first anchor embedding, wherein the first anchor embedding comprises a first initial anchor embedding generated using the first embedding model and a second initial anchor embedding generated using the second embedding model.
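
As a rough illustration of the comparison described in the abstract, the sketch below concatenates two initial embeddings per item into a composite embedding and scores each item against a composite anchor built in the same order. The vectors, the item names, the dictionaries standing in for the two databases, and the helpers `concat` and `cosine` are all hypothetical. Cosine similarity is used here only because the claims name it as one comparison; the abstract itself does not require it.

```python
import math

def concat(first, second):
    """Build a composite embedding by appending `second` to `first`."""
    return list(first) + list(second)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical stand-ins for the first and second databases of initial embeddings.
first_db = {"item_a": [1.0, 0.0], "item_b": [0.0, 1.0]}
second_db = {"item_a": [0.5, 0.5, 0.0], "item_b": [0.0, 0.5, 0.5]}

# One composite embedding per item in the pool.
composites = {item: concat(first_db[item], second_db[item]) for item in first_db}

# The anchor is assembled from the same two models, in the same order,
# so that it has the same dimension and layout as each composite embedding.
anchor = concat([1.0, 0.0], [0.5, 0.5, 0.0])

scores = {item: cosine(vec, anchor) for item, vec in composites.items()}
best = max(scores, key=scores.get)
```

Because the item composites and the anchor share one layout, a single similarity pass over vectors of one dimension replaces separate comparisons per embedding model.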

Inventors

  • Yichuan Niu
  • Adrian Sonjong Yi
  • Peng Yang
  • Valeriy Pelyushenko
  • Haibo Yan
  • Vivek Kumar
  • Jayanth Korlimarla
  • Rajesh Garigipati

Assignees

  • WALMART APOLLO, LLC

Dates

Publication Date: 2026-05-05
Application Date: 2021-01-30

Claims (20)

  1. A system, comprising: a processor; and a non-transitory memory storing instructions that when executed, cause the processor to: generate in real-time a first composite item embedding for each item in a pool of items by: obtaining a first initial item embedding from a first database for each item in a pool of items, wherein the first initial item embedding is generated using a first embedding model and characterizes item-specific features of the corresponding item; obtaining a second initial item embedding from a second database for each item in the pool of items, wherein the second initial item embedding is generated using a second embedding model different than the first embedding model, and wherein the second initial item embedding characterizes category features for a category of the corresponding item; and generating a first composite item embedding by combining the first initial item embedding and the second initial item embedding; receive in real-time a search query provided by a respective user; generate in real-time a first anchor embedding based on the search query by: generating, by the first embedding model, a first initial anchor embedding based on a target string input; generating, by the second embedding model, a second initial anchor embedding based on the search query; and generating the first anchor embedding by combining the first initial anchor embedding and the second initial anchor embedding; determine a set of top N common input targets from a pool of hot strings distinct from the pool of items, the pool of hot strings defining frequently used targets by users and being updated periodically through a network interface; generate in real-time a hot string anchor embedding based on the top N common input targets, wherein the top N common input targets are the most frequent strings received by the system; simultaneously compare the first anchor embedding to the first composite item embedding for each item in the pool of items; simultaneously compare the first anchor embedding to the hot string anchor embedding; responsive to the first anchor embedding and the hot string anchor embedding being different, generate a similarity score for each item in the pool of items determined by the comparison of the first composite item embedding for each item in the pool of items to the first anchor embedding; generate a set of candidate items based, at least in part, on the similarity score; and generate a user interface including one or more interface elements representative of at least one candidate item in the set of candidate items for display to the respective user.
  2. The system of claim 1, wherein the first composite item embedding is generated by concatenating the second initial item embedding to the first initial item embedding.
  3. The system of claim 1, wherein the processor is configured to read instructions to apply a weighting factor to the first initial item embedding and the second initial item embedding for generating the first composite item embedding prior to comparing the first composite item embedding for each item in the pool of items to the first anchor embedding, wherein the first composite item embedding for each item in the pool of items is compared to the first anchor embedding by a cosine similarity.
  4. The system of claim 1, wherein the processor is configured to read instructions to: obtain a third initial item embedding from a third database for each item in the pool of items; generate a second composite item embedding for each item in the pool of items comprising the first initial item embedding and the third initial item embedding; generate, by the third embedding model, a third initial anchor embedding based on the search query from a user; generate a second anchor embedding based on the search query comprising the first initial anchor embedding and the third initial anchor embedding; and compare the second composite item embedding for each item in the pool of items to the second anchor embedding.
  5. The system of claim 4, wherein the third initial item embedding and the third initial anchor embedding are generated by a third embedding model different than the first and second embedding models.
  6. The system of claim 5, wherein the second embedding model is generated by training a first model type using a first training dataset and the third embedding model is generated by training the first model type using a second training dataset.
  7. The system of claim 1, wherein the first initial item embedding and the second initial item embedding are normalized prior to generating the first composite item embedding.
  8. The system of claim 1, wherein the processor is configured to apply a set of weighting factors to the first composite item embedding.
  9. The system of claim 8, wherein the weighting factors are applied as a dot product.
  10. A non-transitory computer readable medium having instructions stored thereon, wherein the instructions, when executed by a processor cause a device to perform operations comprising: generating in real-time a first composite item embedding for each item in a pool of items by: obtaining a first initial item embedding from a first database for each item in a pool of items, wherein the first initial item embedding is generated using a first embedding model and characterizes item-specific features of the corresponding item; obtaining a second initial item embedding from a second database for each item in the pool of items, wherein the second initial item embedding is generated using a second embedding model different than the first embedding model, and wherein the second initial item embedding characterizes category features for a category of the corresponding item; and generating a first composite item embedding by combining the first initial item embedding and the second initial item embedding; receiving in real-time a search query provided by a respective user; generating in real-time a first anchor embedding based on the search query by: generating, by the first embedding model, a first initial anchor embedding based on a target string input; generating, by the second embedding model, a second initial anchor embedding based on the search query; and generating the first anchor embedding by combining the first initial anchor embedding and the second initial anchor embedding; determining a set of top N common input targets from a pool of hot strings distinct from the pool of items, the pool of hot strings defining frequently used targets by users and being updated periodically through a network interface; generating in real-time a hot string anchor embedding based on the top N common input targets, wherein the top N common input targets are the most common strings received; simultaneously comparing the first anchor embedding to the first composite item embedding for each item in the pool of items; simultaneously comparing the first anchor embedding to the hot string anchor embedding; when the first anchor embedding and the hot string anchor embedding do not match, generating a similarity score for each item in the pool of items determined by the comparison of the first composite item embedding for each item in the pool of items to the first anchor embedding; generating a set of candidate items based, at least in part, on the similarity score; and generating a user interface including one or more interface elements representative of at least one candidate item in the set of candidate items for display to the respective user.
  11. The non-transitory computer readable medium of claim 10, wherein the processor causes a device to perform operations comprising applying a weighting factor to the first initial item embedding and the second initial item embedding for generating the first composite item embedding prior to comparing the first composite item embedding for each item in the pool of items to the first anchor embedding, wherein the first composite item embedding for each item in the pool of items is compared to the first anchor embedding by a cosine similarity.
  12. The non-transitory computer readable medium of claim 10, wherein the processor causes a device to perform operations comprising: obtaining a third initial item embedding for each item in the pool of items; generating a second composite item embedding for each item in the pool of items by concatenating the first initial item embedding and the third initial item embedding; generating, by the third embedding model, a third initial anchor embedding based on the target string input; generating a second anchor embedding based on the search query comprising the first initial anchor embedding and the third initial anchor embedding; and comparing the second composite item embedding for each item in the pool of items to the second anchor embedding.
  13. The non-transitory computer readable medium of claim 12, wherein the third initial item embedding and the third initial anchor embedding are generated by a third embedding model different than the first and second embedding models.
  14. The non-transitory computer readable medium of claim 10, wherein the first initial item embedding and the second initial item embedding are normalized prior to generating the first composite item embedding.
  15. The non-transitory computer readable medium of claim 10, wherein the processor is configured to apply a set of weighting factors to the first composite item embedding.
  16. The non-transitory computer readable medium of claim 15, wherein the weighting factors are applied as a dot product.
  17. The non-transitory computer readable medium of claim 10, wherein the processor causes a device to perform operations comprising: generating the first anchor embedding by concatenating the second initial anchor embedding to the first initial anchor embedding prior to comparing the first composite item embedding for each item in the pool of items to the first anchor embedding.
  18. A method, comprising: generating in real-time a first initial item embedding for each item in a pool of items using a first embedding model, wherein each first initial item embedding characterizes item-specific features of the corresponding item; generating in real-time a second initial item embedding for each item in the pool of items using a second embedding model, wherein each second initial item embedding characterizes category features for a category of the corresponding item; generating in real-time a third initial item embedding for each item in the pool of items using a third embedding model; receiving a first set of embedding parameters, wherein the first set of embedding parameters identify the first initial item embedding and the second initial item embedding, a first weighting factor, and a second weighting factor; generating in real-time a first composite item embedding for each item in the pool of items, wherein each first item embedding is generated at least in part by: normalizing the first initial item embedding and the second initial item embedding; applying the first weighting factor to the first initial item embedding and the second weighting factor to the second initial item embedding; and concatenating the first initial item embedding and the second initial item embedding; receiving in real-time a first search query provided by a respective user; generating in real-time a first anchor embedding for the first search query by concatenating a first initial anchor embedding and a second initial anchor embedding, wherein the first initial anchor embedding is generated using the first embedding model and the second initial anchor embedding is generated using the second embedding model, and wherein the first weighting factor is applied to the first initial anchor embedding and the second weighting factor is applied to the second initial anchor embedding; determining a set of top N common input targets from a pool of hot strings distinct from the pool of items, the pool of hot strings defining frequently used targets by users and being updated periodically through a network interface; generating in real-time a hot string anchor embedding based on the top N common input targets, wherein the top N common input targets are the most common strings received; simultaneously comparing the first anchor embedding to the first composite item embedding for each item in the pool of items; simultaneously comparing the first anchor embedding to the hot string anchor embedding; when the first anchor embedding and the hot string anchor embedding match, loading the first anchor embedding based on the first search query from a first database; when the first anchor embedding and the hot string anchor embedding do not match, generating a similarity score for each item in the pool of items determined by the comparison of the first composite item embedding for each item in the pool of items to the first anchor embedding; generating a set of candidate items based, at least in part, on the similarity score; and generating a user interface including one or more interface elements representative of at least one candidate item in the set of candidate items for display to the respective user.
  19. The method of claim 18, wherein the first composite item embedding for each item in the pool of items is compared to the first anchor embedding by a cosine similarity.
  20. The method of claim 18, comprising: receiving a second set of embedding parameters, wherein the embedding parameters identify the first initial item embedding and the third initial item embedding, a third weighting factor, and a fourth weighting factor; generating a second composite item embedding for each item in the pool of items, wherein each second composite item embedding is generated at least in part by: normalizing the first initial item embedding and the third initial item embedding; applying the third weighting factor to the first initial item embedding and the fourth weighting factor to the third initial item embedding; and concatenating the first initial item embedding and the third initial item embedding; receiving a second search query; generating a second anchor embedding for the second search query by concatenating a third initial anchor embedding and a fourth initial anchor embedding, wherein the third initial anchor embedding is generated using the first embedding model and the fourth initial anchor embedding is generated using the third embedding model, and wherein the third weighting factor is applied to the third initial anchor embedding and the fourth weighting factor is applied to the fourth initial anchor embedding; and comparing the second composite item embedding for each item in the pool of items to the second anchor embedding for the second search query.
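
The normalize, weight, and concatenate steps recited in claims 18 and 20, followed by the cosine comparison of claim 19, can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names (`l2_normalize`, `composite`, `rank_items`), the choice of L2 normalization, the toy vectors, and the `sku_*` identifiers are hypothetical, and the hot-string comparison is deliberately left out.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization; zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def composite(first, second, w1, w2):
    """Normalize each initial embedding, scale it by its weighting factor, then concatenate."""
    weighted_first = [w1 * x for x in l2_normalize(first)]
    weighted_second = [w2 * x for x in l2_normalize(second)]
    return weighted_first + weighted_second

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_items(items, anchor_parts, w1, w2, top_n):
    """Score every item against the anchor and return the top_n (item_id, score) pairs.

    items: {item_id: (first_initial_embedding, second_initial_embedding)}
    anchor_parts: (first_initial_anchor_embedding, second_initial_anchor_embedding)
    """
    # The same weighting factors are applied to the anchor as to the items.
    anchor = composite(anchor_parts[0], anchor_parts[1], w1, w2)
    scored = [
        (cosine(composite(first, second, w1, w2), anchor), item_id)
        for item_id, (first, second) in items.items()
    ]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_n]]

# Hypothetical pool: item-specific embedding (2-d) and category embedding (3-d).
items = {
    "sku_1": ([3.0, 4.0], [1.0, 0.0, 0.0]),
    "sku_2": ([4.0, 3.0], [0.0, 1.0, 0.0]),
    "sku_3": ([0.0, 5.0], [1.0, 0.0, 0.0]),
}
anchor_parts = ([3.0, 4.0], [0.0, 1.0, 0.0])

candidates = rank_items(items, anchor_parts, w1=2.0, w2=1.0, top_n=2)
```

Under these assumptions, with nonzero unit-normalized parts and the same weights on both sides, the cosine similarity of two composites equals (w1² c1 + w2² c2) / (w1² + w2²), where c1 and c2 are the per-model cosine similarities. The weighting factors therefore control how much each embedding model contributes to the single score.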

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The application is related to United States Patent Application No. 17/163,383, filed concurrently with the present application, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

This application relates generally to composite embedding and, more particularly, to similarity relevance scoring using composite embeddings.

BACKGROUND

Computer-implemented models, such as machine learning models, that convert data elements into embeddings (e.g., vector embeddings) allow individual embeddings to be provided as input to other automated processes. In some instances, individual embeddings may be used to compare two or more items in a set of items. One known class of model embedding includes word embeddings that convert (or map) letters, words, phrases, etc. to vectors of real numbers. One example of such embeddings is the “word2vec” embeddings.

Current systems utilize individual vector embeddings for various tasks, such as machine learning or other automated tasks. Individual embeddings may reflect certain information related to items being compared while omitting other information. Applying multiple similarity comparisons based on multiple embeddings of different dimensions increases processing time and complexity of a system.

SUMMARY

In various embodiments, a system is disclosed. The system includes a non-transitory memory having instructions stored thereon and a processor configured to read the instructions. The processor is configured to obtain a first initial embedding from a first database and obtain a second initial embedding from a second database for each item in the pool of items. The first initial embedding is generated using a first embedding model and the second initial embedding is generated using a second embedding model different than the first embedding model. The processor is further configured to generate a first composite embedding for each item in the pool of items comprising the first initial embedding and the second initial embedding and compare the first composite embedding for each item in the pool of items to a first anchor embedding. The first anchor embedding comprises a first initial anchor embedding generated using the first embedding model and a second initial anchor embedding generated using the second embedding model.

In various embodiments, a non-transitory computer readable medium having instructions stored thereon is disclosed. The instructions, when executed by a processor, cause a device to perform operations including obtaining a first initial embedding for each item in a pool of items, obtaining a second initial embedding for each item in the pool of items, generating a first item embedding for each item in the pool of items by concatenating the first initial embedding and the second initial embedding, and comparing the first item embedding for each item in the pool of items to a first anchor embedding. The first initial embedding is generated using a first embedding model, the second initial embedding is generated using a second embedding model different than the first embedding model, and the first anchor embedding comprises a first initial anchor embedding generated using the first embedding model and a second initial anchor embedding generated using the second embedding model.

In various embodiments, a computer-implemented method is disclosed. The computer-implemented method includes the steps of generating a first initial embedding for each item in a pool of items using a first embedding model, generating a second initial embedding for each item in the pool of items using a second embedding model, generating a third initial embedding for each item in the pool of items using a third embedding model, and receiving a first set of embedding parameters. The embedding parameters identify the first initial embedding and the second initial embedding, a first weighting factor, and a second weighting factor. The method further includes generating a first item embedding for each item in the pool of items. Each first item embedding is generated at least in part by: normalizing the first initial embedding and the second initial embedding, applying the first weighting factor to the first initial embedding and the second weighting factor to the second initial embedding, and concatenating the first initial embedding and the second initial embedding. The method further includes receiving a first target string and generating an anchor embedding for the first target string by concatenating a first initial anchor embedding and a second initial anchor embedding. The first initial anchor embedding is generated using the first embedding model and the second initial anchor embedding is generated using the second embedding model. The first weighting factor is applied to the first initial anchor embedding and the second weighting factor is applied to the second initial anchor embedding. The method further includes comparing the item embe