US-12625981-B2 - Privacy preserving cross-domain machine learning

Abstract

This document describes a secure machine learning platform. In some aspects, a method includes transmitting, by an application and to the machine learning platform, a set of data including a user profile, one or more characteristics of a digital component, contextual signals, a model identifier, and data indicating a type of event. The application receives a request, generated based on computer-readable instructions, to upload a user profile of a user of the client device to a machine learning platform. The computer-readable instructions initiate the request in response to detecting an occurrence of the event with the digital component. In response to the request, the application can obtain a user profile request data element that includes a model identifier for a machine learning model and one or more characteristics of at least one of the digital component or the first content page.
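The upload flow summarized above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the names `UploadPayload`, `classify_event`, and `build_payload`, the field layout, and the use of seconds for the time frame are all assumptions introduced here. It shows how an application might label an event as an interaction or a non-interaction within a specified time frame and then assemble the set of data to be transmitted.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional


@dataclass(frozen=True)
class UploadPayload:
    """Hypothetical container for the set of data sent to the platform."""
    user_profile: List[float]
    characteristics: Dict[str, str]
    contextual_signals: Dict[str, str]
    model_id: str
    event_type: str  # "interaction" or "non_interaction"


def classify_event(presented_at: float,
                   interaction_at: Optional[float],
                   window_s: float) -> str:
    """Label the event based on whether an interaction fell inside the window.

    Times are assumed to be seconds on a common clock; interaction_at is
    None when no interaction was observed.
    """
    if interaction_at is not None and 0 <= interaction_at - presented_at <= window_s:
        return "interaction"
    return "non_interaction"


def build_payload(user_profile: List[float],
                  characteristics: Dict[str, str],
                  contextual_signals: Dict[str, str],
                  model_id: str,
                  presented_at: float,
                  interaction_at: Optional[float],
                  window_s: float) -> dict:
    """Assemble the data set as a plain dict ready for serialization."""
    event_type = classify_event(presented_at, interaction_at, window_s)
    payload = UploadPayload(user_profile, characteristics,
                            contextual_signals, model_id, event_type)
    return asdict(payload)
```

For example, `classify_event(0.0, None, 30.0)` yields the non-interaction label, because no interaction was observed before the assumed 30-second window elapsed.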

Inventors

  • Yijian Bai
  • Gang Wang

Assignees

  • GOOGLE LLC

Dates

Publication Date
2026-05-12
Application Date
2021-03-19

Claims (20)

  1. A computer-implemented method comprising: receiving, by a client device, a first content page comprising a digital component comprising computer-readable instructions for initiating uploads of user profiles for use in training machine learning models; presenting, by the client device, the digital component with the first content page; receiving, by an application running on the client device, a request generated based on the computer-readable instructions to upload a user profile of a user of the client device to a machine learning platform, wherein the computer-readable instructions initiate the request in response to detecting of an occurrence of an event related to interaction or non-interaction with the digital component presented with the first content page within a specified time frame, wherein the event is an interaction event if a user interaction with the digital component is detected within the specified time frame, and wherein the event is a non-interaction event if user interaction with the digital component is not detected within the specified time frame; and in response to receiving the request: obtaining, by the application, a user profile request data element comprising a model identifier for a machine learning model and one or more characteristics of at least one of the digital component or the first content page; obtaining, by the application, a user profile for a user of the client device; obtaining, by the application and for use in training the machine learning model, contextual signals that were provided to one or more content platforms to enable the one or more content platforms to select digital components for presentation with the first content page; and transmitting, by the application and to the machine learning platform, a set of data comprising the user profile, the one or more characteristics, the contextual signals, the model identifier, and data indicating whether the event is the interaction event or the non-interaction event.
  2. The computer-implemented method of claim 1, wherein the user profile request data element comprises a token received from a content platform that provided the digital component, the token comprising (i) a set of content comprising the model identifier, the data indicating the one or more characteristics, a domain of the content platform, and (ii) a digital signature of the set of content generated using an encryption key of the content platform.
  3. The computer-implemented method of claim 2, further comprising verifying, by the application, the digital signature prior to transmitting the set of data to the machine learning platform.
  4. The computer-implemented method of claim 1, wherein the event comprises an interaction event, the method further comprising, in response to detecting the occurrence of the interaction event, storing, at the client device, the contextual signals, the one or more characteristics of the digital component, and the user profile.
  5. The computer-implemented method of claim 4, further comprising: in response to detecting the occurrence of the interaction event, accessing, by the client device, a second content page provided by a second content provider different from a first content provider that provided the first content page, wherein the second content page comprises a tag comprising computer-readable code; receiving, from the tag, a request for the contextual signals, the one or more characteristics of the digital component and the user profile; encrypting, by the application, the contextual signals, the one or more characteristics of the digital component and the user profile; and transmitting, to a content platform that provided the digital component, the encrypted contextual signals, the encrypted one or more characteristics of the digital component, and the encrypted user profile.
  6. The computer-implemented method of claim 5, further comprising: detecting, by the computer-readable code of the tag, a conversion event; and transmitting, by the computer-readable code of the tag, a conversion notification for the conversion event to the content platform.
  7. The computer-implemented method of claim 1, further comprising: for each of one or more digital components: sending, by the application, an inference request for the digital component to the machine learning platform, wherein the inference request comprises one or more of the user profile, the contextual signals, or characteristics of the first content page; receiving, from the machine learning platform, a predicted performance measure for the digital component, wherein the predicted performance measure is based on the user profile and one or more trained machine learning models trained by the machine learning platform; determining, based on the predicted performance, a selection value for the digital component; and selecting a given digital component for display at the client device based at least on the selection value for each of the one or more digital components.
  8. The computer-implemented method of claim 7, wherein the inference request for the digital component to the machine learning platform further comprises the one or more characteristics of the digital component, the characteristics of the first content page and the contextual signals.
  9. The computer-implemented method of claim 7, wherein the predicted performance comprises one of a predicted user interaction rate for the digital component, a predicted conversion rate, or a predicted conversion value for the digital component.
  10. The computer-implemented method of claim 7, wherein the predicted performance is based on a performance of the digital component for k nearest neighbor profiles, that are determined, based on the one or more machine learning models, to be k most similar user profiles to the user profile for the user of the client device.
  11. The computer-implemented method of claim 1, further comprising: receiving, from a first multi-party computation (MPC) computer of the machine learning platform, a first secret share of an inference result for a first digital component; receiving, from each of one or more second MPC computers of the machine learning platform, a second secret share of the inference result for the digital component; determining, based on the first secret share and each second secret share, a predicted performance measure for the digital component represented by the inference result; selecting the digital component for display at the client device based on the predicted performance measure; and displaying the digital component.
  12. The computer-implemented method of claim 5, wherein the machine learning platform comprises two or more multi-party computation (MPC) computers that use a secure MPC process to train a machine learning model to predict a performance measure of the digital component using the encrypted contextual signals, the encrypted one or more characteristics of the digital component, the encrypted user profile and data received from client devices of one or more additional users.
  13. The computer-implemented method of claim 12, wherein the two or more MPC computers train the machine learning model without accessing the encrypted contextual signals, the encrypted one or more characteristics of the digital component, or the encrypted user profile in cleartext.
  14. The computer-implemented method of claim 1, wherein obtaining, by the application, a user profile for a user of the client device comprises selecting, by the application, the user profile based at least in part on the model identifier.
  15. The computer-implemented method of claim 1, wherein: obtaining the user profile comprises generating, by the application, a first secret share of the user profile and a second secret share of the user profile; and transmitting the set of data comprises transmitting the first secret share to a first computing system of the machine learning platform and transmitting the second secret share to a second computing system.
  16. A system comprising: one or more processors; and one or more memories having stored thereon computer readable instructions configured to cause the one or more processors to perform operations comprising: receiving, by a client device, a first content page comprising a digital component comprising computer-readable instructions for initiating uploads of user profiles for use in training machine learning models; presenting, by the client device, the digital component with the first content page; receiving, by an application running on the client device, a request generated based on the computer-readable instructions to upload a user profile of a user of the client device to a machine learning platform, wherein the computer-readable instructions initiate the request in response to detecting of an occurrence of an event related to interaction or non-interaction with the digital component presented with the first content page within a specified time frame, wherein the event is an interaction event if a user interaction with the digital component is detected within the specified time frame, and wherein the event is a non-interaction event if user interaction with the digital component is not detected within the specified time frame; and in response to receiving the request: obtaining, by the application, a user profile request data element comprising a model identifier for a machine learning model and one or more characteristics of at least one of the digital component or the first content page; obtaining, by the application, a user profile for a user of the client device; obtaining, by the application and for use in training the machine learning model, contextual signals that were provided to one or more content platforms to enable the one or more content platforms to select digital components for presentation with the first content page; and transmitting, by the application and to the machine learning platform, a set of data comprising the user profile, the one or more characteristics, the contextual signals, the model identifier, and data indicating whether the event is the interaction event or the non-interaction event.
  17. The system of claim 16, wherein the user profile request data element comprises a token received from a content platform that provided the digital component, the token comprising (i) a set of content comprising the model identifier, the data indicating the one or more characteristics, a domain of the content platform, and (ii) a digital signature of the set of content generated using an encryption key of the content platform.
  18. The system of claim 16, wherein the event comprises an interaction event and wherein the operations comprise, in response to detecting the occurrence of the interaction event, storing, at the client device, the contextual signals, the one or more characteristics of the digital component, and the user profile.
  19. The system of claim 18, wherein the operations comprise: in response to detecting the occurrence of the interaction event, accessing, by the client device, a second content page provided by a second content provider different from a first content provider that provided the first content page, wherein the second content page comprises a tag comprising computer-readable code; receiving, from the tag, a request for the contextual signals, the one or more characteristics of the digital component and the user profile; encrypting, by the application, the contextual signals, the one or more characteristics of the digital component and the user profile; and transmitting, to a content platform that provided the digital component, the encrypted contextual signals, the encrypted one or more characteristics of the digital component, and the encrypted user profile.
  20. A non-transitory computer readable medium storing instructions that, when executed by one or more data processing apparatuses, cause the one or more data processing apparatuses to perform operations comprising: receiving, by a client device, a first content page comprising a digital component comprising computer-readable instructions for initiating uploads of user profiles for use in training machine learning models; presenting, by the client device, the digital component with the first content page; receiving, by an application running on the client device, a request generated based on the computer-readable instructions to upload a user profile of a user of the client device to a machine learning platform, wherein the computer-readable instructions initiate the request in response to detecting of an occurrence of an event related to interaction or non-interaction with the digital component presented with the first content page within a specified time frame, wherein the event is an interaction event if a user interaction with the digital component is detected within the specified time frame, and wherein the event is a non-interaction event if user interaction with the digital component is not detected within the specified time frame; and in response to receiving the request: obtaining, by the application, a user profile request data element comprising a model identifier for a machine learning model and one or more characteristics of at least one of the digital component or the first content page; obtaining, by the application, a user profile for a user of the client device; obtaining, by the application and for use in training the machine learning model, contextual signals that were provided to one or more content platforms to enable the one or more content platforms to select digital components for presentation with the first content page; and transmitting, by the application and to the machine learning platform, a set of data comprising the user profile, the one or more characteristics, the contextual signals, the model identifier, and data indicating whether the event is the interaction event or the non-interaction event.
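Claims 11 and 15 refer to secret shares of a user profile and of an inference result. The claims do not say which secret sharing scheme is used, so the following is a sketch under stated assumptions: additive secret sharing over a prime field, with a hypothetical modulus `PRIME`, a hypothetical fixed-point factor `SCALE`, and illustrative function names. It shows how a value could be split so that no single share reveals it, and how the client could recombine the shares it receives from the MPC computers.

```python
import secrets
from typing import List, Sequence

PRIME = 2**61 - 1  # hypothetical field modulus (assumption, not from the source)
SCALE = 10**6      # hypothetical fixed-point scale for fractional values


def encode(x: float) -> int:
    """Map a non-negative real value to a field element (fixed point)."""
    return round(x * SCALE) % PRIME


def decode(v: int) -> float:
    """Inverse of encode for non-negative values below PRIME / SCALE."""
    return v / SCALE


def split(value: int, n: int = 2) -> List[int]:
    """Split a field element into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: Sequence[int]) -> int:
    """Recombine additive shares, e.g. one share received per MPC computer."""
    return sum(shares) % PRIME


def split_profile(profile: Sequence[float], n: int = 2) -> List[List[int]]:
    """Return n share vectors, one per computing system."""
    per_element = [split(encode(x), n) for x in profile]
    return [[elem[i] for elem in per_element] for i in range(n)]


def reconstruct_profile(share_vectors: Sequence[Sequence[int]]) -> List[float]:
    """Recombine share vectors element by element."""
    return [decode(reconstruct(col)) for col in zip(*share_vectors)]
```

In this sketch each share on its own is a uniformly random field element, so a single computing system holding one share learns nothing about the underlying value; only the sum of all shares recovers it.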

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a National Stage Application under 35 U.S.C. § 371 and claims the benefit of International Application No. PCT/US2021/023102, filed Mar. 19, 2021. The disclosure of the foregoing application is incorporated herein by reference in its entirety.

TECHNICAL FIELD

This specification relates to a privacy preserving machine learning platform that trains and uses machine learning models using secure multi-party computation.

BACKGROUND

Some machine learning models are trained based on data collected from multiple sources, e.g., across multiple websites and/or native applications. However, this data can include private or sensitive data that should not be shared or allowed to leak to other parties.

SUMMARY

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the operations of receiving, by a client device, a first content page including a digital component that includes computer-readable instructions; receiving, by an application running on the client device, a request generated based on the computer-readable instructions to upload a user profile of a user of the client device to a machine learning platform, where the computer-readable instructions initiate the request in response to detecting an occurrence of an event related to interaction or non-interaction with the digital component; in response to receiving the request: obtaining, by the application, a user profile request data element including a model identifier for a machine learning model and one or more characteristics of at least one of the digital component or the first content page; obtaining, by the application, a user profile for a user of the client device; obtaining, by the application, contextual signals provided to one or more content platforms for use in training the machine learning model; and transmitting, by the application and to the machine learning platform, a set of data including the user profile, the one or more characteristics, the contextual signals, the model identifier, and data indicating whether the event is an interaction event or a non-interaction event. Other implementations of this aspect include corresponding apparatus, systems, and computer programs, configured to perform the aspects of the methods, encoded on computer storage devices.

These and other implementations can each optionally include one or more of the following features. Some aspects include verifying, by the application, the digital signature prior to transmitting the set of data to the machine learning platform.

Some aspects include accessing, in response to detecting the occurrence of the interaction event, by the client device, a second content page provided by a second content provider different from a first content provider that provided the first content page, where the second content page includes a tag that includes computer-readable code; receiving, from the tag, a request for the contextual signals, the one or more characteristics of the digital component and the user profile; encrypting, by the application, the contextual signals, the one or more characteristics of the digital component and the user profile; and transmitting, to a content platform that provided the digital component, the encrypted contextual signals, the encrypted one or more characteristics of the digital component, and the encrypted user profile.

Some aspects include detecting, by the computer-readable code of the tag, a conversion event and transmitting, by the computer-readable code of the tag, a conversion notification for the conversion event to the content platform.
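The verify-before-transmit step mentioned above, in which the application checks the digital signature on the token described in claims 2 and 3, can be sketched as follows. The source describes a digital signature generated with a key of the content platform, which would typically be an asymmetric signature. Python's standard library has no asymmetric signature primitive, so this sketch swaps in HMAC-SHA256 with a shared key purely to illustrate the control flow. The token layout, the canonical JSON encoding, and all function names are assumptions introduced here.

```python
import hashlib
import hmac
import json
from typing import Callable


def canonical(content: dict) -> bytes:
    """Serialize the token content deterministically (assumed encoding)."""
    return json.dumps(content, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_token(content: dict, key: bytes) -> dict:
    """Stand-in for the content platform signing the set of content.

    HMAC-SHA256 replaces the asymmetric signature the source implies.
    """
    tag = hmac.new(key, canonical(content), hashlib.sha256).hexdigest()
    return {"content": content, "signature": tag}


def verify_token(token: dict, key: bytes) -> bool:
    """Recompute the tag over the content and compare in constant time."""
    expected = hmac.new(key, canonical(token["content"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["signature"])


def upload_if_valid(token: dict, key: bytes, payload: dict,
                    send: Callable[[dict], None]) -> bool:
    """Transmit the payload only when the token verifies."""
    if not verify_token(token, key):
        return False
    send(payload)
    return True
```

A token whose content was altered after signing, for example by changing the model identifier, fails `verify_token`, so `upload_if_valid` returns `False` and nothing is sent.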
Some aspects include, for each of one or more digital components: sending, by the application, an inference request for the digital component to the machine learning platform, where the inference request includes one or more of the user profile, the contextual signals, or characteristics of the current content page; receiving, from the machine learning platform, a predicted performance measure for the digital component, where the predicted performance measure is based on the user profile and one or more trained machine learning models trained by the machine learning platform; determining, based on the predicted performance, a selection value for the digital component; and selecting a given digital component for display at the client device based at least on the selection value for each of the one or more digital components.

Some aspects include receiving, from a first multi-party computation (MPC) computer of the machine learning platform, a first secret share of an inference result for a first digital component; receiving, from each of one or more second MPC computers of the machine learning platform, a second secret share of the inference result for the digital component; determining, based on the first secret share and each second secret share, a predicted performance measure for the digita