US-12626169-B2 - Bayesian causal relationship network models for healthcare diagnosis and treatment based on patient data

Abstract

Systems, methods, and computer-readable medium are provided for healthcare analysis. Data corresponding to a plurality of patients is received. The data is parsed to generate normalized data for a plurality of variables, with normalized data generated for more than one variable for each patient. A causal relationship network model is generated relating the plurality of variables based on the generated normalized data using a Bayesian network algorithm. The causal relationship network model includes variables related to a plurality of medical conditions or medical drugs. In another aspect, a selection of a medical condition or drug is received. A sub-network is determined from a causal relationship network model. The sub-network includes one or more variables associated with the selected medical condition or drug. One or more predictors for the selected medical condition or drug are identified.
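
The parsing step in the abstract, which turns raw patient data into normalized data for more than one variable per patient, could look roughly like the following Python sketch. The record layout, field names, and the binary present/absent encoding are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical raw records: (patient_id, field, value). The field names and
# values are illustrative assumptions, not taken from the patent.
RAW_RECORDS = [
    ("p1", "diagnosis", " Diabetes "),
    ("p1", "drug", "METFORMIN"),
    ("p2", "diagnosis", "hypertension"),
    ("p2", "drug", "lisinopril"),
    ("p2", "diagnosis", "diabetes"),
]

def normalize(records):
    """Return (variables, table) where table[patient][variable] is 0 or 1."""
    seen = defaultdict(set)
    for patient, field, value in records:
        # Canonicalize each entry into a variable name such as "diagnosis:diabetes".
        seen[patient].add(f"{field}:{value.strip().lower()}")
    variables = sorted(set().union(*seen.values()))
    table = {p: {v: int(v in vs) for v in variables} for p, vs in seen.items()}
    return variables, table

variables, table = normalize(RAW_RECORDS)
```

Every patient row carries a value for every variable, so no variable has to be pre-selected as relevant to a particular condition before the network is built.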

Inventors

  • Niven Rajin Narain
  • Viatcheslav R. Akmaev
  • Vijetha VEMULAPALLI

Assignees

  • BPGBIO, INC.

Dates

Publication Date
2026-05-12
Application Date
2023-06-16

Claims (19)

  1. A computer-implemented method for generating a causal relationship network model based on patient data, the method comprising: accessing or obtaining normalized data for a plurality of variables, the normalized data corresponding to a plurality of patients and including diagnostic information and/or treatment information for each patient, wherein, for each patient, the normalized data includes more than one variable; and generating a causal relationship network model relating the plurality of variables based on the normalized data for the plurality of patients using a programmed computing system including storage holding network model building code and a plurality of processors configured to execute the network model building code, the generating including creating and evolving an ensemble of probabilistic networks based on the normalized data for the plurality of patients, the causal relationship network model including variables related to a plurality of medical conditions, and the ensemble of probabilistic networks created and evolved in parallel on the plurality of processors.
  2. The method of claim 1, wherein the causal relationship network model includes relationships indicating one or more predictors for each of the plurality of medical conditions.
  3. The method of claim 1, wherein the accessed or obtained normalized data is not pre-selected as being relevant to one or more of the plurality of medical conditions.
  4. The method of claim 1, wherein the plurality of patients includes a first subset of patients each having data indicating a diagnosis of a medical condition in the patient, and includes a second subset of patients each having data that does not indicate a diagnosis of the medical condition in the patient.
  5. The method of claim 1, further comprising: receiving updated or additional data corresponding to one or more of the plurality of patients or corresponding to one or more additional patients; and updating the causal relationship network model based on the updated or additional data.
  6. The method of claim 1, wherein the causal relationship network model is generated based solely on the normalized data.
  7. The method of claim 1, further comprising: determining a sub-network from the causal relationship network model, one or more variables in the sub-network associated with a selected medical condition or with a selected drug; and probing relationships in the sub-network to determine one or more predictors for the selected medical condition or for the selected drug.
  8. The method of claim 7, wherein the one or more predictors for the selected medical condition indicate a medical condition co-occurring with the selected medical condition.
  9. The method of claim 7, wherein the extent of the sub-network is determined based on the one or more variables associated with the selected medical condition or with the selected drug and the strength of the relationships between the one or more variables and other variables in the causal relationship network model.
  10. The method of claim 7, wherein the sub-network includes the one or more variables associated with the selected medical condition or with the selected drug, a first set of additional variables each having a first degree relationship with the one or more variables, and a second set of additional variables each having second degree relationship with the one or more variables.
  11. The method of claim 7, wherein at least one of the one or more predictors is newly identified as a predictor for the medical condition or is newly identified as a predictor relevant to the selected drug.
  12. The method of claim 7, further comprising: displaying the one or more predictors in a user interface, the displaying including a graphical representation of the one or more variables, the one or more predictors, and relationships among the one or more variables and the one or more predictors.
  13. The method of claim 7, further comprising displaying a graphical representation of the sub-network in a user interface.
  14. The method of claim 7, further comprising ranking the one or more predictors based on strength of relationships between the one or more variables and the one or more predictors.
  15. The method of claim 7, wherein the one or more predictors relevant to the selected drug indicates a drug administered in conjunction with the selected drug.
  16. The method of claim 7, wherein the one or more predictors indicate an adverse drug interaction between the selected drug and one or more other drugs.
  17. The method of claim 1, wherein the causal relationship network model is generated based on between 50 variables and 1,000,000 variables.
  18. The method of claim 1, wherein the normalized data includes information from patient electronic health records.
  19. The method of claim 1, wherein generating the causal relationship network model relating the plurality of variables based on the normalized data for the plurality of patients comprises: creating a list of network fragments, each network fragment including two or more variables connected by one or more relationships, and determining a probabilistic score associated with each network fragment based on the normalized data; creating an ensemble of trial networks that is an ensemble of Bayesian networks, each trial network constructed from a different subset of the list of network fragments; and globally optimizing the ensemble of trial networks by evolving each trial network through local transformations in parallel using the plurality of processors to produce a consensus causal relationship network model.
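
Claim 19 outlines three steps: scoring network fragments, building an ensemble of trial networks from subsets of those fragments, and evolving each trial network through local transformations in parallel to reach a consensus. The Python sketch below walks through that flow on toy data under heavy simplifying assumptions, none of which come from the patent: fragments are single directed edges, the fragment score is pairwise mutual information minus a fixed penalty (not a full Bayesian network score), evolution is greedy hill climbing, threads stand in for the plurality of processors, and the consensus is a majority vote on undirected links. All function and variable names are hypothetical.

```python
import math
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations

def mutual_information(rows, a, b):
    """Empirical mutual information (in nats) between columns a and b."""
    n = len(rows)
    joint = Counter((r[a], r[b]) for r in rows)
    pa = Counter(r[a] for r in rows)
    pb = Counter(r[b] for r in rows)
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in joint.items())

def build_fragments(rows, penalty=0.1):
    """Score every directed pair; the penalty (an assumption) discourages weak edges."""
    names = sorted(rows[0])
    return {(a, b): mutual_information(rows, a, b) - penalty
            for a, b in permutations(names, 2)}

def is_acyclic(edges):
    """Detect directed cycles by repeatedly removing nodes with no parents."""
    edges = set(edges)
    nodes = {v for e in edges for v in e}
    while nodes:
        roots = {v for v in nodes if not any(c == v for _, c in edges)}
        if not roots:
            return False
        nodes -= roots
        edges = {(p, c) for p, c in edges if p not in roots}
    return True

def evolve(job):
    """Greedy local search: toggle one fragment at a time, keep improvements."""
    fragments, seed, steps = job
    rng = random.Random(seed)
    candidates = list(fragments)
    # Each trial network starts from a random subset drawn with its own seed.
    network = set()
    for edge in rng.sample(candidates, len(candidates) // 2):
        if is_acyclic(network | {edge}):
            network.add(edge)
    score = lambda net: sum(fragments[e] for e in net)
    for _ in range(steps):
        edge = rng.choice(candidates)
        trial = network ^ {edge}  # local transformation: add or remove one edge
        if is_acyclic(trial) and score(trial) > score(network):
            network = trial
    return network

def consensus(rows, n_networks=8, steps=200):
    fragments = build_fragments(rows)
    jobs = [(fragments, seed, steps) for seed in range(n_networks)]
    # Threads are a stand-in here; a process pool would be the closer
    # analogue of multiple processors for CPU-bound work.
    with ThreadPoolExecutor() as pool:
        networks = list(pool.map(evolve, jobs))
    votes = Counter(frozenset(e) for net in networks for e in net)
    # Keep undirected links present in at least half of the trial networks.
    return {tuple(sorted(pair)) for pair, k in votes.items() if k * 2 >= n_networks}

# Toy data: the two first variables always agree; the third is independent.
rows = [{"dx_diabetes": a, "rx_metformin": a, "dx_fracture": c}
        for a in (0, 1) for c in (0, 1)] * 5
links = consensus(rows)
```

Because mutual information is symmetric, this sketch cannot orient edges, which is why the vote is taken over undirected links. A score that conditions each variable on its full parent set would be needed to recover direction.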

Description

RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 16/592,069, filed Oct. 3, 2019, which, in turn, is a continuation of U.S. patent application Ser. No. 14/851,846, filed Sep. 11, 2015, which, in turn, claims benefit of and priority to U.S. Provisional Patent Application No. 62/049,148 filed on Sep. 11, 2014, the entire disclosure of each application is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present disclosure relates generally to systems and methods for data analysis, in particular, for using healthcare data to generate a causal relationship network model.

BACKGROUND

Many systems analyze data to gain insights into various aspects of healthcare. Insights can be gained by determining relationships among the data. Conventional methods predetermine a few relevant variables to extract from healthcare data for processing and analysis. Based on the few pre-selected variables, relationships are established between various factors such as medical drug, disease, symptoms, etc. Preselecting the variables to focus on limits the ability to discover new or unknown relationships. Preselecting the variables also limits the ability to discover other relevant variables. For example, if the variables are preselected when considering analysis of diabetes, one would be limited to those variables and not realize that the data analysis supports another variable relevant to diabetes that was previously unknown to the healthcare community.

SUMMARY

In one aspect, the invention relates to a computer-implemented method for generating a causal relationship network model based on patient data.
The method includes receiving data corresponding to a plurality of patients, where the data includes diagnostic information and/or treatment information for each patient, parsing the data to generate normalized data for a plurality of variables, wherein, for each patient, the normalized data is generated for more than one variable, generating a causal relationship network model relating the plurality of variables based on the generated normalized data using a Bayesian network algorithm, the causal relationship network model includes variables related to a plurality of medical conditions, and the causal relationship network generated using a programmed computing system including storage holding network model building code and one or more processors configured to execute the network model building code. In certain embodiments, the causal relationship network model includes relationships indicating one or more predictors for each of the plurality of medical conditions. In certain embodiments, the data received is not pre-selected as being relevant to one or more of the plurality of medical conditions. In some embodiments, the method further includes receiving additional data corresponding to one or more additional patients, and updating the causal relationship network model based on the additional data. In certain embodiments, the causal relationship network model is generated based solely on the generated normalized data. In some embodiments, the method further includes determining a sub-network from the causal relationship network model, one or more variables in the sub-network associated with a selected medical condition, and probing relationships in the sub-network to determine one or more predictors for the selected medical condition. In certain embodiments, the one or more predictors for the selected medical condition indicate a medical condition co-occurring with the selected medical condition. 
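
The sub-network determination and predictor probing described above can be pictured with a small Python sketch. It assumes a finished network stored as undirected links with a numeric strength. The variable names, strengths, two-hop limit, and 0.3 strength threshold are made up for illustration, and ranking only first-degree neighbors is one possible reading of "probing", not the patent's stated procedure.

```python
from collections import deque

# Hypothetical network: undirected links with a strength in [0, 1].
NETWORK = {
    ("dx_diabetes", "rx_metformin"): 0.9,
    ("dx_diabetes", "dx_neuropathy"): 0.6,
    ("dx_neuropathy", "rx_gabapentin"): 0.5,
    ("rx_gabapentin", "dx_fall"): 0.4,
    ("dx_diabetes", "lab_ldl"): 0.1,
}

def neighbors(network, node, min_strength):
    """Yield (neighbor, strength) for links at or above the strength threshold."""
    for (a, b), w in network.items():
        if w >= min_strength and node in (a, b):
            yield (b if node == a else a), w

def sub_network(network, selected, max_degree=2, min_strength=0.3):
    """Breadth-first expansion up to max_degree hops over sufficiently strong links."""
    degree = {selected: 0}
    queue = deque([selected])
    while queue:
        node = queue.popleft()
        if degree[node] == max_degree:
            continue  # do not expand beyond the outermost allowed degree
        for other, _ in neighbors(network, node, min_strength):
            if other not in degree:
                degree[other] = degree[node] + 1
                queue.append(other)
    return degree

def rank_predictors(network, selected, min_strength=0.3):
    """Rank first-degree neighbors of the selected variable by link strength."""
    return sorted(neighbors(network, selected, min_strength),
                  key=lambda item: item[1], reverse=True)

degrees = sub_network(NETWORK, "dx_diabetes")
ranked = rank_predictors(NETWORK, "dx_diabetes")
```

In this toy network the weak `lab_ldl` link falls below the threshold and `dx_fall` sits three hops away, so neither enters the sub-network for `dx_diabetes`.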
In certain embodiments, the extent of the sub-network is determined based on the one or more variables associated with the selected medical condition and the strength of the relationships between the one or more variables and other variables in the causal relationship network model. In certain embodiments, the sub-network includes the one or more variables associated with the selected medical condition, a first set of additional variables each having a first degree relationship with the one or more variables, and a second set of additional variables each having second degree relationship with the one or more variables. In some embodiments, at least one of the one or more predictors is previously unknown. In some embodiments, at least one of the one or more predictors is newly identified as a predictor for the medical condition. In certain embodiments, the number of predictors is less than the number of variables. In some embodiments, the method further includes displaying the one or more predictors in a user interface, the displaying including a graphical representation of the one or more variables, the one or more predictors, and relationships among the one or more variables and the one or more predictors. In some embodiments, the method further includes displaying a graphical representation of the sub-network in a user interface. In some embodiments, the method further includes ra