Theses Doctoral

Machine Learning Methods for Intensive Longitudinal Data and Causal Inference in Multi-Study, Multi-Outcome Settings

Kim, Soohyun

This dissertation examines the challenges and opportunities of analyzing distinct sources of mental health data in the age of precision medicine and big data. The focus lies on two areas: leveraging real-time Ecological Momentary Assessment (EMA) data to understand individual-level variations in mental disorders, especially depression; and the integration of data from randomized clinical trials (RCTs) to assess treatment efficacy, with an application to schizophrenia. The multifaceted and heterogeneous nature of mental disorders calls for nuanced and personalized assessment methods.

In the first part of this dissertation, we propose a machine learning method, the Heterogeneous-Dynamics Restricted Boltzmann Machine (HDRBM), to examine symptom-level variations beyond the traditional one-size-fits-all summary scores and to learn heterogeneous group dynamics. We demonstrate the effectiveness of our approach on simulated and real-world EMA data sets. We show that by incorporating covariates, HDRBM can improve accuracy and interpretability, explore the underlying drivers of the group dynamics of participants, and serve as a generative model for EMA studies.

In the second part of the dissertation, we present the challenges of integrating multiple RCTs in mental health research and propose data fusion as a means of integrating individual patient data across similar studies to enhance statistical power.

The dissertation introduces novel estimators tailored for multi-study, multi-outcome fused data sets, aiming to optimize health outcomes for each treatment. The method also addresses the use of similar trials with different outcome follow-up measurements, serving as proxies for unobserved outcomes. An application is provided on the efficacy of cognitive remediation (CR) therapy using the NIMH Database of Cognitive Training and Remediation Studies (DCTRS), emphasizing the importance of leveraging surrogate outcomes in clinical trials.

Files

This item is currently under embargo. It will be available starting 2025-02-21.

More About This Work

Academic Units
Biostatistics
Thesis Advisors
Wang, Yuanjia
Miles, Caleb H.
Degree
Ph.D., Columbia University
Published Here
February 28, 2024