US-12619826-B2 - Finding coherent inferences across domains
Abstract
Disclosed implementations relate to using mutual constraint satisfaction to sample from different stochastic processes and identify coherent inferences across domains. In some implementations, a first domain representation of a semantic concept may be used to conditionally sample a first set of candidate second domain representations of the semantic concept from a first stochastic process. Based on second domain representation(s) of the first set, candidate third domain representations of the semantic concept may be conditionally sampled from a second stochastic process. Based on candidate third domain representation(s), a second set of candidate second domain representations of the semantic concept may be conditionally sampled from a third stochastic process. Pairs of candidate second domain representations sampled across the first and second sets may be evaluated. Based on the evaluation, second domain representation(s) of the semantic concept are selected, e.g., as input for a downstream computer process.
Inventors
- Garrett Raymond Honke
- David Andre
- Alberto Camacho Martinez
- Irhum Shafkat
Assignees
- X DEVELOPMENT LLC
Dates
- Publication Date: 2026-05-05
- Application Date: 2022-10-31
Claims (16)
- 1 . A method implemented using one or more processors and comprising: receiving, as input from a user, a first domain representation of a semantic concept; based on the first domain representation of the semantic concept, conditionally sampling a first set of candidate programming language domain representations of the semantic concept from a first stochastic process that translates from the first domain to the programming language domain, wherein the programming language domain comprises a second domain; based on one or more programming language domain representations of the first set, conditionally sampling candidate third domain representations of the semantic concept from a second stochastic process that translates from the programming language domain to the third domain; based on one or more of the candidate third domain representations of the semantic concept, conditionally sampling a second set of candidate programming language domain representations of the semantic concept from a third stochastic process that translates from the third domain to the programming language domain; evaluating pairs of candidate programming language domain representations sampled across the first and second sets, wherein the evaluating comprises: executing or interpreting a first candidate instance of source code sampled from the first set to generate a first output, executing or interpreting a second candidate instance of source code sampled from the second set to generate a second output, and comparing the first and second outputs; and based on the evaluation, selecting one or more of the programming language domain representations of the semantic concept as input for a downstream computer process.
- 2 . The method of claim 1 , wherein the evaluation comprises: determining similarity measures between a first plurality of reduced-dimensionality embeddings generated from the candidate programming language domain representations of the first set and a second plurality of reduced-dimensionality embeddings generated from the candidate programming language domain representations of the second set; and based on the similarity measures, selecting the one or more programming language domain representations of the semantic concept as the input for the downstream computer process.
- 3 . The method of claim 1 , wherein the first stochastic process comprises an attention-based transformer machine learning model, and the first set of candidate programming language domain representations of the semantic concept comprises multiple candidate token sequences generated by performing beam searching during application of the attention-based transformer machine learning model.
- 4 . The method of claim 1 , wherein the evaluation further comprises comparing the first output to the first domain representation of the semantic concept.
- 5 . The method of claim 1 , wherein the third domain comprises natural language, and the candidate third domain representations of the semantic concept comprise natural language descriptions of respective pieces of source code in the first set.
- 6 . The method of claim 1 , wherein the first domain representation of the semantic concept comprises one or more images that depict a base visual state and a successor visual state.
- 7 . The method of claim 1 , further comprising applying a contrastive model to compare one or more candidate programming language domain representations sampled from the first set with one or more of the candidate third domain representations of the semantic concept, wherein the selecting is further based on output generated from applying the contrastive model.
- 8 . The method of claim 1 , wherein at least one of the first, second, and third stochastic processes comprises an attention-based transformer machine learning model.
- 9 . A system comprising one or more processors and memory storing instructions that, in response to execution by the one or more processors, cause the one or more processors to: receive, as input from a user, a first domain representation of a semantic concept; based on the first domain representation of the semantic concept, conditionally sample a first set of candidate programming language domain representations of the semantic concept from a first stochastic process that translates from the first domain to the programming language domain, wherein the programming language domain comprises a second domain; based on one or more programming language domain representations of the first set, conditionally sample candidate third domain representations of the semantic concept from a second stochastic process that translates from the programming language domain to the third domain; based on one or more of the candidate third domain representations of the semantic concept, conditionally sample a second set of candidate programming language domain representations of the semantic concept from a third stochastic process that translates from the third domain to the programming language domain; evaluate pairs of candidate programming language domain representations sampled across the first and second sets, wherein the instructions to evaluate comprise instructions to: execute or interpret a first candidate instance of source code sampled from the first set to generate a first output, execute or interpret a second candidate instance of source code sampled from the second set to generate a second output, and compare the first and second outputs; and based on the evaluation, select one or more of the programming language domain representations of the semantic concept as input for a downstream computer process.
- 10 . The system of claim 9 , wherein the instructions to evaluate comprise instructions to: determine similarity measures between a first plurality of reduced-dimensionality embeddings generated from the candidate programming language domain representations of the first set and a second plurality of reduced-dimensionality embeddings generated from the candidate programming language domain representations of the second set; and based on the similarity measures, select the one or more programming language domain representations of the semantic concept as the input for the downstream computer process.
- 11 . The system of claim 9 , wherein the first stochastic process comprises an attention-based transformer machine learning model, and the first set of candidate programming language domain representations of the semantic concept comprises multiple candidate token sequences generated by performing beam searching during application of the attention-based transformer machine learning model.
- 12 . The system of claim 9 , wherein the evaluation further comprises comparing the first output to the first domain representation of the semantic concept.
- 13 . The system of claim 9 , wherein the third domain comprises natural language, and the candidate third domain representations of the semantic concept comprise natural language descriptions of respective pieces of source code in the first set.
- 14 . The system of claim 9 , wherein the first domain representation of the semantic concept comprises one or more images that depict a base visual state and a successor visual state.
- 15 . The system of claim 9 , wherein the instructions further cause the one or more processors to apply a contrastive model to compare one or more candidate programming language domain representations sampled from the first set with one or more of the candidate third domain representations of the semantic concept, and wherein the selection is further based on output generated from applying the contrastive model.
- 16 . At least one non-transitory computer-readable medium comprising instructions that, when executed by one or more processors, cause the one or more processors to: receive, as input from a user, a first domain representation of a semantic concept; based on the first domain representation of the semantic concept, conditionally sample a first set of candidate programming language domain representations of the semantic concept from a first stochastic process that translates from the first domain to the programming language domain, wherein the programming language domain comprises a second domain; based on one or more programming language domain representations of the first set, conditionally sample candidate third domain representations of the semantic concept from a second stochastic process that translates from the programming language domain to the third domain; based on one or more of the candidate third domain representations of the semantic concept, conditionally sample a second set of candidate programming language domain representations of the semantic concept from a third stochastic process that translates from the third domain to the programming language domain; evaluate pairs of candidate programming language domain representations sampled across the first and second sets, wherein the instructions to evaluate comprise instructions to: execute or interpret a first candidate instance of source code sampled from the first set to generate a first output, execute or interpret a second candidate instance of source code sampled from the second set to generate a second output, and compare the first and second outputs; and based on the evaluation, select one or more of the programming language domain representations of the semantic concept as input for a downstream computer process.
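The three-hop sampling loop and execution-based evaluation recited in claims 1, 9, and 16 can be illustrated with a toy sketch. The sampler functions below are hypothetical stand-ins for the three stochastic processes (which the disclosure implements as trained models, e.g., attention-based transformers, sampled conditionally); the canned candidate lists and the example task of summing 1 through 10 are invented purely for illustration.

```python
# Hypothetical stand-ins for the three stochastic processes of claim 1.
def sample_first_set(first_domain_repr):
    # First domain (e.g., a natural language request) -> candidate source code.
    return [
        "result = sum(range(1, 11))",
        "result = sum(i for i in range(1, 11))",
        "result = 10 * 11 // 2",
    ]

def sample_third_domain(code_candidates):
    # Programming language -> third domain (e.g., natural language descriptions).
    return [f"computes the sum of the integers 1 through 10: {c}"
            for c in code_candidates]

def sample_second_set(third_domain_reprs):
    # Third domain -> a second, independently sampled set of candidate code.
    return ["result = 55", "result = sum([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"]

def run(code):
    # Execute/interpret one candidate instance of source code; return its output.
    scope = {}
    exec(code, scope)
    return scope["result"]

def select_coherent(first_set, second_set):
    # Evaluate pairs sampled across the two sets: a pair is coherent when
    # executing both candidates yields the same output.
    return [a for a in first_set
            if any(run(a) == run(b) for b in second_set)]

first_set = sample_first_set("add the numbers from 1 to 10")
second_set = sample_second_set(sample_third_domain(first_set))
selected = select_coherent(first_set, second_set)
```

Because every candidate in this toy example evaluates to 55, all first-set candidates survive the pairwise comparison; with real model samples, only candidates whose executed outputs agree across the two independently sampled sets would be selected for the downstream process.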
Description
BACKGROUND

Multiple stochastic processes may be used to process a representation of a semantic concept (e.g., an environment, a state of a world) in one domain or modality and generate corresponding representations (e.g., “inferences” or “translations”) in other domains or modalities. In many cases, each stochastic process may generate multiple different candidate inferences within a respective domain. The inference having the highest probability or confidence may be selected for use in downstream processing. However, the highest-probability inferences generated by two different stochastic processes in two different domains may not be coherent with each other; the inferences may diverge semantically and/or functionally, for instance. Consequently, it may not be clear which inference, if either, is most suitable for downstream processing.

SUMMARY

Implementations are described herein for finding a coherent, joint interpretation of a semantic concept across multiple domains and/or modalities, such as visual, natural language, and executable programming language domains, to name a few. More particularly, but not exclusively, techniques described herein relate to using mutual constraint satisfaction to sample from an ensemble of different stochastic processes, especially those applying trained machine learning models, and to identify inferences that, irrespective of their individual probabilities or confidences, are coherent across different domains and therefore may comprise high-quality representations of a given semantic concept. Implementations described herein may facilitate an overall (e.g., cross-modality) interpretation of a semantic concept that has the greatest joint probability, or an interpretation that is not ruled out by any individual stochastic process. The identified inference(s) may then be provided for downstream processing, such as responding to a user's request, or as part of a larger processing pipeline.
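A minimal sketch of why per-process confidence alone can mislead, and how a joint score identifies the coherent interpretation. The interpretation names and probability values below are invented for illustration and do not come from the disclosure.

```python
import math

# Hypothetical probabilities that three stochastic processes each assign to
# three candidate interpretations of the same semantic concept (one list per
# interpretation, one entry per process).
probs = {
    "interp_1": [0.6, 0.1, 0.5],
    "interp_2": [0.3, 0.4, 0.4],
    "interp_3": [0.1, 0.5, 0.1],
}

def most_coherent(candidates):
    # Joint probability of an interpretation = product of the probabilities the
    # individual processes assign to it (summed in log space for stability).
    return max(candidates, key=lambda k: sum(math.log(p) for p in candidates[k]))
```

Here interp_1 carries the single highest per-process probability (0.6), yet interp_2 maximizes the joint probability (0.3 × 0.4 × 0.4 = 0.048 versus 0.6 × 0.1 × 0.5 = 0.03), mirroring the observation above that the highest-probability inference from any one process need not be the most coherent overall.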
In some implementations, a method may be implemented using one or more processors and may include: receiving, as input from a user, a first domain representation of a semantic concept; based on the first domain representation of the semantic concept, conditionally sampling a first set of candidate second domain representations of the semantic concept from a first stochastic process that translates from the first domain to the second domain; based on one or more second domain representations of the first set, conditionally sampling candidate third domain representations of the semantic concept from a second stochastic process that translates from the second domain to the third domain; based on one or more of the candidate third domain representations of the semantic concept, conditionally sampling a second set of candidate second domain representations of the semantic concept from a third stochastic process that translates from the third domain to the second domain; evaluating pairs of candidate second domain representations sampled across the first and second sets; and based on the evaluation, selecting one or more of the second domain representations of the semantic concept as input for a downstream computer process. In various implementations, the evaluation may include: determining similarity measures between a first plurality of reduced-dimensionality embeddings generated from the candidate second domain representations of the first set and a second plurality of reduced-dimensionality embeddings generated from the candidate second domain representations of the second set; and based on the similarity measures, selecting the one or more second domain representations of the semantic concept as the input for the downstream computer process. 
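The similarity-measure evaluation described above can be sketched with cosine similarity over toy reduced-dimensionality embeddings. The vectors, candidate names, and the 0.9 threshold below are assumptions for illustration only, not values from the disclosure.

```python
import math

def cosine(u, v):
    # Cosine similarity between two reduced-dimensionality embeddings.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings of candidate second-domain representations.
first_set = {"cand_a": [0.9, 0.1, 0.0], "cand_b": [0.1, 0.9, 0.2]}
second_set = [[0.88, 0.12, 0.05], [0.0, 0.2, 0.98]]

def select_candidates(first, second, threshold=0.9):
    # Retain a first-set candidate when its best match in the second,
    # independently sampled set clears the similarity threshold.
    return [name for name, emb in first.items()
            if max(cosine(emb, other) for other in second) >= threshold]
```

Here cand_a closely matches an embedding from the second set and is selected, while cand_b has no close counterpart across the two sets and is filtered out as incoherent.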
In various implementations, the first stochastic process may include an attention-based transformer machine learning model, and the first set of candidate second domain representations of the semantic concept comprises multiple candidate token sequences generated by performing beam searching during application of the attention-based transformer machine learning model. In various implementations, the second domain may include a programming language domain, and the candidate second domain representations in the first and second sets comprise candidate instances of source code. In various implementations, the evaluation may include: executing or interpreting a first candidate instance of source code sampled from the first set to generate a first output; executing or interpreting a second candidate instance of source code sampled from the second set to generate a second output; and comparing the first and second outputs. In various implementations, the evaluation further includes comparing the first output to the first domain representation of the semantic concept. In various implementations, the third domain may be natural language, and the candidate third domain representations of the semantic concept may include natural language descriptions of