
Estimating Individual Treatment Effects Using Emerging Methods from Machine Learning and Multiple Imputation

Park, Sangbaek

This dissertation used synthetic datasets, semi-synthetic datasets, and a real-world dataset from an educational intervention to compare the performance of 15 machine learning and multiple imputation methods for estimating the individual treatment effect (ITE). In addition, it examined the performance of five evaluation metrics that can be used to identify the best ITE estimation method when conducting research with real-world data.

Among the ITE estimation methods analyzed, the S-learner, the Bayesian Causal Forest (BCF), the Causal Forest, and the X-learner exhibited the best performance. In general, the meta-learners with BART and the tree-based direct estimation methods performed better than the representation learning methods and the multiple imputation methods. As for the evaluation metrics, $\tau_{\text{risk}_R}$ and the Switch Doubly Robust MSE (SDR-MSE) were the most effective at identifying the best ITE estimation method when the true treatment effect was unknown.
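To make the S-learner idea concrete, the following is a minimal sketch and not the dissertation's implementation. An S-learner fits a single outcome model on the covariates together with the treatment indicator, then estimates each ITE as the difference between the model's predictions with treatment set to 1 and to 0. The sketch assumes a linear base learner with treatment-by-covariate interactions in place of BART, and a small simulated dataset. All function names and the data-generating process are hypothetical choices made for illustration.

```python
import numpy as np


def fit_linear(features, outcome):
    """Ordinary least squares with an intercept; returns the coefficients."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


def predict_linear(coef, features):
    """Predict outcomes from fitted coefficients."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def s_learner_features(x, t):
    """Single feature matrix: covariates, treatment, and their interactions."""
    t = t.reshape(-1, 1)
    return np.column_stack([x, t, x * t])


def s_learner_ite(x, t, y):
    """Fit one model on (x, t), then contrast predictions at t=1 and t=0."""
    coef = fit_linear(s_learner_features(x, t), y)
    n = len(x)
    y_treated = predict_linear(coef, s_learner_features(x, np.ones(n)))
    y_control = predict_linear(coef, s_learner_features(x, np.zeros(n)))
    return y_treated - y_control


def simulate(n=2000, seed=0):
    """Hypothetical synthetic data with a known, covariate-dependent effect."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    t = rng.integers(0, 2, size=n).astype(float)  # randomized treatment
    true_ite = 1.0 + 0.5 * x[:, 0]
    y = x[:, 1] + t * true_ite + rng.normal(scale=0.1, size=n)
    return x, t, y, true_ite


def rmse_against_truth(true_ite, estimated_ite):
    """Root mean squared error of ITE estimates; needs the true effects."""
    return float(np.sqrt(np.mean((true_ite - estimated_ite) ** 2)))


if __name__ == "__main__":
    x, t, y, true_ite = simulate()
    estimated_ite = s_learner_ite(x, t, y)
    print("RMSE vs. true ITE:", rmse_against_truth(true_ite, estimated_ite))
```

The error computed at the end requires the true effects, so it is only available for synthetic data. With real-world data the true effect is unobserved, which is why evaluation metrics that do not depend on it are needed.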

This dissertation contributes to a small but growing body of research on ITE estimation, which is gaining popularity in various fields because of its potential for tailoring interventions to the needs of individuals and for targeting programs at those who would benefit the most from them.

Files

  • Park_columbia_0054D_18480.pdf (application/pdf, 4.53 MB)

More About This Work

Academic Units: Measurement and Evaluation
Thesis Advisors: Keller, Bryan Sean
Degree: Ph.D., Columbia University
Published Here: September 4, 2024