Theses Doctoral

Multi-level Latent Variable Models for Integrating Multiple Phenotypes for Mental Disorders

Zhao, Yinjun

The overarching goal of this dissertation is to integrate heterogeneous data for the estimation of disease coheritability and subtyping.

Chapter 2 focuses on the significance and estimation of heritability and coheritability, which quantify, respectively, the proportion of phenotypic variation attributable to genetic factors and the genetic correlations between different traits. To this end, we develop robust statistical methods based on estimating equations that account for familial correlations and address the computational challenges posed by large pedigrees and extensive datasets. Our methods are evaluated through simulations, which demonstrate satisfactory consistency and robust inference properties. Compared with simpler methods that analyze each trait separately, our approaches achieve greater power through joint analysis of multiple traits. An application to heritability and coheritability in electronic health record (EHR) data reveals substantial genetic correlations between mental disorders and metabolic/endocrine measurements, suggesting shared genetic influences that warrant further investigation. These findings have implications for understanding the etiology, diagnosis, and treatment of these conditions.

Chapters 3 and 4 address patient subtyping for personalized mental health care, which is particularly relevant given the substantial variability observed in mental disorders. Chapter 3 develops methods for subtyping patients with mental disorders using multiple data modalities and variational inference. We propose latent mixture models inspired by item response theory to handle both binary and continuous data. We also introduce Black Box Variational Inference (BBVI) algorithms to overcome the challenges of numerical integration in nonlinear models. Our numerical experiments validate the proposed methods, demonstrating that variance-controlling techniques improve convergence speed and reduce iteration variance. However, the proposed algorithm encounters limitations with latent mixture models containing binary modalities, because of the approximations used for the non-conjugate posterior distributions that result from the non-exponential-family likelihood function.

Chapter 4 investigates multi-modal integration techniques for subtyping patients using data from the Adolescent Brain Cognitive Development (ABCD) study. We introduce a Bayesian hierarchical joint model with latent variables and use Pólya-Gamma augmentation for posterior approximation, which enables efficient Gibbs sampling and accurate estimation of model parameters. Extensive simulations confirm the consistency of the estimators and the prediction accuracy of our method. Applying these methods to patient clustering in the ABCD study provides information for identifying potential clinical subtypes in mental health. Such subtypes can inform the development of targeted psychological and educational interventions, ultimately improving mental health outcomes.

Keywords: latent mixture model, integrative analysis, coheritability, multi-modality, disease subtyping, variational inference, Pólya-Gamma

Files

  • Zhao_columbia_0054D_18714.pdf (application/pdf, 3.02 MB)

More About This Work

Academic Units
Biostatistics
Thesis Advisors
Wang, Yuanjia
Liu, Ying
Degree
Ph.D., Columbia University
Published Here
August 14, 2024