2020 Theses Doctoral
Machine Learning Methods for Causal Inference with Observational Biomedical Data
Causal inference -- the process of drawing a conclusion about the impact of an exposure on an outcome -- is foundational to biomedicine, where it is used to guide intervention. The current gold-standard approach for causal inference is randomized experimentation, such as randomized controlled trials (RCTs). Yet randomized experiments, including RCTs, often enforce strict eligibility criteria that limit how well the resulting causal knowledge generalizes to the real world. Observational data, such as the electronic health record (EHR), is often regarded as a more representative source from which to generate causal knowledge. However, observational data is non-randomized, and causal estimates from this source are therefore susceptible to bias from confounders. This weakness complicates two central tasks of causal inference: the replication or evaluation of existing causal knowledge and the generation of new causal knowledge. In this dissertation I (i) assess the feasibility of using observational data to replicate existing causal knowledge and (ii) present new methods for generating causal knowledge from observational data, with a focus on two causal tasks: comparing an outcome between two cohorts and estimating the attributable risks of exposures in a causal system.
Files
- Averitt_columbia_0054D_16037.pdf (application/pdf, 3.01 MB)
More About This Work
- Academic Units: Biomedical Informatics
- Thesis Advisors: Perotte, Adler J.
- Degree: Ph.D., Columbia University
- Published Here: July 28, 2020