US-12619782-B2 - Systems and methods for computing with private healthcare data

Abstract

Techniques are provided for computing with private healthcare data. The techniques include a de-identification method including receiving a text sequence; providing the text sequence to a plurality of entity tagging models, each of the plurality of entity tagging models being trained to tag one or more portions of the text sequence having a corresponding entity type; tagging one or more entities in the text sequence using the plurality of entity tagging models; and obfuscating each entity among the one or more tagged entities by replacing the entity with a surrogate, the surrogate being selected based on one or more attributes of the entity and maintaining characteristics similar to the entity being replaced.
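The de-identification flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the regex-based taggers stand in for trained entity tagging models, and the names `Entity`, `regex_tagger`, `TAGGERS`, `SURROGATE_NAMES`, `surrogate_for`, and `deidentify` are all hypothetical. Each tagger handles one entity type, and each surrogate is chosen from the entity's own attributes so that it keeps a similar shape (a titled name stays a titled name, a date stays a valid date, a phone number keeps its layout).

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, List


@dataclass(frozen=True)
class Entity:
    start: int
    end: int
    text: str
    kind: str


def regex_tagger(kind: str, pattern: str) -> Callable[[str], List[Entity]]:
    """Build a toy tagger for one entity type (assumption: stands in for a trained model)."""
    compiled = re.compile(pattern)

    def tag(text: str) -> List[Entity]:
        return [Entity(m.start(), m.end(), m.group(), kind)
                for m in compiled.finditer(text)]

    return tag


# One tagger per entity type; the patterns are illustrative only.
TAGGERS = [
    regex_tagger("NAME", r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+"),
    regex_tagger("DATE", r"\b\d{4}-\d{2}-\d{2}\b"),
    regex_tagger("PHONE", r"\b\d{3}-\d{3}-\d{4}\b"),
]

SURROGATE_NAMES = ["Avery", "Jordan", "Morgan", "Riley"]  # hypothetical surrogate pool


def surrogate_for(entity: Entity) -> str:
    """Pick a replacement that resembles the entity it replaces."""
    if entity.kind == "NAME":
        # Keep the title, swap the surname for one from the pool.
        title, _, surname = entity.text.partition(" ")
        pick = SURROGATE_NAMES[sum(map(ord, surname)) % len(SURROGATE_NAMES)]
        return f"{title} {pick}"
    if entity.kind == "DATE":
        # Shift by a fixed offset so the result is still a valid ISO date.
        return (date.fromisoformat(entity.text) + timedelta(days=30)).isoformat()
    # PHONE: keep the punctuation layout, change every digit.
    return re.sub(r"\d", lambda m: str((int(m.group()) + 3) % 10), entity.text)


def deidentify(text: str) -> str:
    """Tag entities with every tagger, then replace each with its surrogate."""
    entities = [e for tagger in TAGGERS for e in tagger(text)]
    # Replace right to left so earlier character offsets stay valid.
    for e in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:e.start] + surrogate_for(e) + text[e.end:]
    return text
```

Calling `deidentify("Dr. Smith saw the patient on 2021-03-04; call 555-123-4567.")` returns a sentence with the same structure but a different name, date, and phone number. The sketch does not resolve overlapping spans from different taggers, and its fixed offsets are deterministic for readability; a real system would need both overlap handling and unpredictable surrogates.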

Inventors

  • Murali Aravamudan
  • Rakesh Barve
  • Karthik Murugadoss
  • Sankar Ardhanari
  • Ajit Rajasekharan
  • Venkataramanan Soundararajan
  • Samir Awasthi
  • Tyler Wagner
  • Shamim Naqvi
  • Akash Anand

Assignees

  • nference, inc.

Dates

Publication Date: 2026-05-05
Application Date: 2024-08-15

Claims (20)

  1. A method comprising: forming a first secure enclave within a first enterprise, wherein the first secure enclave is available to one or more processors for running one or more application computing processes in isolation from one or more unauthorized computing processes running on the one or more processors; and pre-provisioning software within the first secure enclave, wherein the pre-provisioned software is configured to execute instructions of the one or more application computing processes on the one or more processors by: receiving input data in an encrypted form; loading the input data into the first secure enclave; running one or more programs for analyzing the input data using the first secure enclave; decrypting the input data using one or more cryptographic keys; executing the one or more application computing processes based on the decrypted input data to generate output data, wherein executing the one or more application computing processes based on the decrypted input data comprises de-identifying the input data by removing personally identifying information from the input data; and outputting the de-identified input data to a second enterprise.
  2. A system comprising: a non-transitory memory; and one or more hardware processors configured to read instructions from the non-transitory memory that, when executed, cause the one or more hardware processors to perform operations comprising: forming a first secure enclave within a first enterprise, wherein the first secure enclave is available to one or more processors for running one or more application computing processes in isolation from one or more unauthorized computing processes running on the one or more processors; and pre-provisioning software within the first secure enclave, wherein the pre-provisioned software is configured to execute instructions of the one or more application computing processes on the one or more processors by: receiving input data in an encrypted form; loading the input data into the first secure enclave; running one or more programs for analyzing the input data using the first secure enclave; decrypting the input data using one or more cryptographic keys; executing the one or more application computing processes based on the decrypted input data to generate output data, wherein executing the one or more application computing processes based on the decrypted input data comprises de-identifying the input data by removing personally identifying information from the input data; and outputting the de-identified input data to a second enterprise.
  3. The method of claim 1, wherein executing the one or more application computing processes based on the decrypted input data further comprises processing the input data as a function of specific data requests from the second enterprise.
  4. The method of claim 1, further comprising: receiving, by the one or more processors, from a data provider, the input data; and receiving, by the one or more processors, from an instruction provider corresponding to an entity other than the data provider, the one or more programs for analyzing the input data.
  5. The method of claim 4, further comprising providing a decryption key associated with the input data or the instructions of the one or more application computing processes inside the first secure enclave to the data provider or the instruction provider, respectively.
  6. The method of claim 1, wherein the first secure enclave comprises a policy manager, wherein the policy manager is configured to implement a security policy.
  7. The method of claim 1, wherein the pre-provisioned software is further configured to execute the instructions of the one or more application computing processes on the one or more processors by: connecting the first secure enclave to a web browser; and causing contents of the first secure enclave to be displayed as a web page.
  8. The method of claim 1, wherein outputting the de-identified input data to the second enterprise comprises outputting the de-identified input data to a second secure enclave of the second enterprise.
  9. The method of claim 1, wherein de-identifying input data by removing personally identifying information from the input data comprises replacing the personally identifying information with random data.
  10. The method of claim 1, further comprising determining a de-identification probability of the de-identified input data.
  11. The method of claim 1, wherein outputting the de-identified input data to a second enterprise comprises providing de-identified input data to a secure data layer, wherein the secure data layer is accessible to the second enterprise.
  12. The system of claim 2, wherein executing the one or more application computing processes based on the decrypted input data further comprises processing the input data as a function of specific data requests from the second enterprise.
  13. The system of claim 2, wherein the instructions from the non-transitory memory are further configured to cause the one or more hardware processors to perform operations of: receiving from a data provider, the input data; and receiving from an instruction provider corresponding to an entity other than the data provider, the one or more programs for analyzing the input data.
  14. The system of claim 13, wherein the instructions from the non-transitory memory are further configured to cause the one or more hardware processors to perform operations of providing a decryption key associated with the input data or the instructions of the one or more application computing processes inside the first secure enclave to the data provider or the instruction provider, respectively.
  15. The system of claim 2, wherein the first secure enclave comprises a policy manager, wherein the policy manager is configured to implement a security policy.
  16. The system of claim 2, wherein the pre-provisioned software is further configured to execute the instructions of the one or more application computing processes on the one or more processors by: connecting the first secure enclave to a web browser; and causing contents of the first secure enclave to be displayed as a web page.
  17. The system of claim 2, wherein outputting the de-identified input data to the second enterprise comprises outputting the de-identified input data to a second secure enclave of the second enterprise.
  18. The system of claim 2, wherein de-identifying input data by removing personally identifying information from the input data comprises replacing the personally identifying information with random data.
  19. The system of claim 2, wherein the instructions from the non-transitory memory are further configured to cause the one or more hardware processors to perform operations of determining a de-identification probability of the de-identified input data.
  20. The system of claim 2, wherein outputting the de-identified input data to a second enterprise comprises providing de-identified input data to a secure data layer, wherein the secure data layer is accessible to the second enterprise.
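
The data flow recited in claims 1 and 9 (encrypted input, decryption inside the enclave, replacement of personally identifying information with random data, release of only the de-identified output) can be pictured with the toy Python sketch below. Everything in it is an assumption made for illustration: `ToyEnclave` is an ordinary class, not a hardware-isolated enclave; `keystream_xor` is a deliberately simple stand-in for real cryptography and is not secure; and the only identifier handled is an SSN-like pattern.

```python
import hashlib
import re
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream.

    NOT secure. It only stands in for whatever real encryption is used.
    Applying it twice with the same key restores the original bytes.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def _random_like(match: re.Match) -> str:
    """Replace every digit with a random digit, keeping the punctuation layout."""
    return re.sub(r"\d", lambda _: str(secrets.randbelow(10)), match.group())


class ToyEnclave:
    """Hypothetical stand-in for a secure enclave with a pre-provisioned key."""

    def __init__(self, key: bytes) -> None:
        self._key = key  # decryption key held only inside the "enclave"

    def run(self, encrypted_input: bytes) -> str:
        # Decrypt inside the enclave; the plaintext is never returned as-is.
        plaintext = keystream_xor(self._key, encrypted_input).decode("utf-8")
        # De-identify: swap SSN-like identifiers for random data.
        return re.sub(r"\b\d{3}-\d{2}-\d{4}\b", _random_like, plaintext)


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    record = "Patient SSN 123-45-6789, glucose 110 mg/dL"
    sealed = keystream_xor(key, record.encode("utf-8"))  # data provider side
    print(ToyEnclave(key).run(sealed))  # what the second enterprise receives
```

In this sketch the clinical value (`glucose 110 mg/dL`) passes through unchanged while the identifier is randomized on every run. Attestation, key exchange, and the policy manager mentioned in the dependent claims are outside its scope.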

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 18/381,873, filed on Oct. 19, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which is a continuation of U.S. patent application Ser. No. 17/975,489, filed on Oct. 27, 2022, now U.S. Pat. No. 11,829,514, issued on Nov. 28, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which is a continuation of U.S. patent application Ser. No. 17/192,564, filed Mar. 4, 2021, now U.S. Pat. No. 11,487,902, issued on Nov. 1, 2022, and entitled “Systems and Methods for Computing with Private Healthcare Data”, which claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/128,542, filed Dec. 21, 2020, and entitled “Systems and Methods for Computing with Private Healthcare Data”, and U.S. Provisional Application No. 63/109,769, filed Nov. 4, 2020, and entitled “Systems and Methods for Computing with Private Healthcare Data”, and U.S. patent application Ser. No. 17/192,564 is also a continuation-in-part of U.S. Non-provisional application Ser. No. 16/908,520, filed Jun. 22, 2020, now U.S. Pat. No. 11,545,242, issued on Jan. 3, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which claims priority to U.S. Provisional Application No. 62/865,030, filed Jun. 21, 2019, and entitled “Systems and Methods for Selective Information Masking in Text”, and U.S. Provisional Application No. 62/962,146, filed Jan. 16, 2020, and entitled “Systems and Methods for Retrieving Information Responsive to a Query”, and U.S. Provisional Application No. 62/984,989, filed Mar. 4, 2020, and entitled “Systems and Methods for Selective Information Masking in Text”, and U.S. Provisional Application No. 62/985,003, filed Mar. 4, 2020, and entitled “Pipelined Federated Architecture for Computing with Private Healthcare Data”, and U.S. Provisional Application No. 63/012,738, filed Apr. 20, 2020, and entitled “Systems and Methods for Augmented Curation and Temporal Discrimination of Health Records”, each of which is incorporated by reference herein in its entirety.

This application is also a continuation-in-part of U.S. patent application Ser. No. 18/384,264, filed on Oct. 26, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which is a continuation of U.S. Non-provisional application Ser. No. 17/935,852, filed Sep. 27, 2022, now U.S. Pat. No. 11,848,082, issued on Dec. 19, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which is a continuation of U.S. Non-provisional application Ser. No. 16/908,520, filed Jun. 22, 2020, now U.S. Pat. No. 11,545,242, issued on Jan. 3, 2023, and entitled “Systems and Methods for Computing with Private Healthcare Data,” which claims priority to U.S. Provisional Application No. 63/012,738, entitled “Systems And Methods For Augmented Curation And Temporal Discrimination Of Health Records,” filed Apr. 20, 2020, U.S. Provisional Application No. 62/984,989, entitled “Systems And Methods For Selective Information Masking In Text,” filed Mar. 4, 2020, U.S. Provisional Application No. 62/985,003, entitled “Pipelined Federated Architecture For Computing With Private Healthcare Data,” filed Mar. 4, 2020, U.S. Provisional Application No. 62/962,146, entitled “Systems And Methods For Retrieving Information Responsive To A Query,” filed Jan. 16, 2020, and U.S. Provisional Application No. 62/865,030, entitled “Systems And Methods For Selective Information Masking In Text,” filed Jun. 21, 2019; this application is also a continuation of U.S. Non-Provisional patent application Ser. No. 17/985,777, filed on Nov. 11, 2022, entitled “SYSTEMS AND METHODS FOR ADAPTIVELY IMPROVING THE PERFORMANCE OF LOCKED MACHINE LEARNING PROGRAMS”, which is a continuation-in-part of Ser. No. 16/908,520, filed on Jun. 22, 2020, now U.S. Pat. No. 11,545,242, issued on Jan. 3, 2023, entitled “SYSTEMS AND METHODS FOR COMPUTING WITH PRIVATE HEALTHCARE DATA”, each of which is incorporated by reference herein in its entirety.

FIELD OF THE INVENTION

The present invention relates generally to processing data that may contain information deemed private by consumers or regulations.

BACKGROUND

Hospitals, healthcare providers and care givers collect large amounts of data from patients. It is a necessary part of the processes by which healthcare is provided to members of the public. Typically, a patient provides data to the care giver as a part of receiving treatment for his/her ailments. This data is stored by the care giver and may be used later, inter alia, for research purposes. In another typical scenario, data may be collected from consumers via one or more devices, e.g., pulse oximeter, glucose monitor, smart watch, fitness bracelet, etc. In such use cases, the collected data is often used to analyze a patient's health in a continuous manner or over a period of time. Con