Theses Doctoral

Statistical Methods for Modeling Biomarkers of Neuropsychiatric Diseases

Sun, Ming

Due to the lack of a gold-standard objective marker, the current practice for diagnosing neuropsychiatric disorders is mostly based on clinical symptoms, which may occur only in the late stage of the disease. Clinical diagnosis is also subject to high variance due to between- and within-subject variability of patient symptomatology and between-clinician variability. Effectively modeling disease course and making early predictions using biomarkers and subtle clinical signs are critical and challenging, both for improving diagnostic accuracy and for designing preventive clinical trials for neurological disorders. Leveraging the domain knowledge that certain biological characteristics (e.g., causal genetic mutation, cognitive reserve) are part of the disease mechanism, we first propose a nonlinear model with random inflection points depending on subject-specific characteristics to jointly estimate the trajectories of the biomarkers. The model scales different biomarkers into comparable progression curves with a temporal order based on the mean inflection point. It also assesses how subject-specific characteristics affect the dynamic trajectory of different markers, which offers information for designing preventive therapeutics and personalized disease management strategies. We use the EM algorithm for estimation and conduct extensive simulation studies. The method is applied to biomarkers in the neuroimaging, cognitive, and motor domains of Huntington’s disease.
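As a rough illustration of the first model's idea, the sketch below assumes a logistic progression curve whose inflection point shifts with subject-specific covariates and a random effect. The logistic form, the function names, the parameter names, and all numeric values are assumptions made for illustration; the abstract does not specify the curve family, and the sketch does not include the EM estimation step.

```python
import math

def inflection_point(mean_inflection, covariate_effects, covariates, random_effect=0.0):
    """Subject-specific inflection age (hypothetical form):
    population mean + covariate-driven shift + subject random effect."""
    shift = sum(b * x for b, x in zip(covariate_effects, covariates))
    return mean_inflection + shift + random_effect

def biomarker_trajectory(age, inflection, rate, lower=0.0, upper=1.0):
    """Assumed logistic progression curve, rescaled to [lower, upper].
    The curve crosses its midpoint exactly at the inflection age."""
    return lower + (upper - lower) / (1.0 + math.exp(-rate * (age - inflection)))

def temporal_order(mean_inflections):
    """Order biomarkers from earliest to latest mean inflection point."""
    return sorted(mean_inflections, key=mean_inflections.get)

# Illustrative values only; these are not estimates from the thesis.
subject_inflection = inflection_point(
    mean_inflection=45.0,
    covariate_effects=[-0.8],   # hypothetical effect of one covariate
    covariates=[5.0],           # hypothetical covariate value
    random_effect=1.0,
)
midpoint_value = biomarker_trajectory(subject_inflection, subject_inflection, rate=0.3)
order = temporal_order({"marker_a": 47.0, "marker_b": 38.0, "marker_c": 44.0})
```

Placing all markers on a common 0-to-1 scale is what makes their inflection points comparable, so sorting the mean inflection points gives one plausible reading of the temporal order described above.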
Under the same nonlinear random effects model framework, we propose a second model inspired by neural mass models. Biomarkers are modeled as the average manifestation of the functioning status of neuronal ensembles. A latent liability score is shared across biomarkers to pool information. We use the EM algorithm for maximum likelihood estimation, with a normal approximation to facilitate numerical integration. The results show that some neuroimaging biomarkers are early signs of the onset of Huntington’s disease. Finally, we develop an online tool that provides personalized predictions of biomarker trajectories given the medical history and baseline measurements.
The third model uses a dynamical system based on differential equations to model the evolution of biomarkers. The dynamical system is useful not only for characterizing the temporal patterns of the biomarkers but also for revealing the interactions among them. We propose a semiparametric dynamical system based on multi-index models. For estimation and inference, we consider a two-step procedure based on the integral equations from the proposed model. The algorithm iterates between estimating the link function through splines and estimating the index parameters, allowing for regularization to achieve sparsity. We prove model identifiability and derive the asymptotic properties of the model parameters. A benefit of the model and the estimation approach is that they pool information from multiple subjects to construct the network of biomarkers and provide inference. We demonstrate the empirical improvement over competing approaches on simulated gene expression data from the third DREAM challenge. Applied to electroencephalogram (EEG) data, the method reveals different effective connectivity of brain networks for patients with alcohol dependence under different cognitive tasks.
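To make the integral-equation idea concrete, here is a deliberately simplified sketch. It assumes a single biomarker following the linear system dx/dt = a·x, rewrites it as x(t) − x(t0) = a·∫ x(s) ds, approximates the integral with the trapezoid rule, and recovers a by least squares. This is not the semiparametric multi-index model of the thesis: there is no spline link function, no index parameter, and no regularization, and all names and values are hypothetical.

```python
import math

def cumulative_trapezoid(times, values):
    """Running trapezoid-rule integral of values over times, starting at 0."""
    integral = [0.0]
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        integral.append(integral[-1] + 0.5 * step * (values[i] + values[i - 1]))
    return integral

def estimate_rate(times, values):
    """Least-squares estimate of a in dx/dt = a*x using the integral form
    x(t) - x(t0) = a * integral of x from t0 to t (regression without intercept)."""
    z = cumulative_trapezoid(times, values)      # regressor: integrated trajectory
    y = [v - values[0] for v in values]          # response: change from baseline
    numerator = sum(zi * yi for zi, yi in zip(z, y))
    denominator = sum(zi * zi for zi in z)
    return numerator / denominator

# Noise-free synthetic trajectory with a true rate of -0.5 (illustrative only).
times = [0.01 * i for i in range(201)]
values = [math.exp(-0.5 * t) for t in times]
a_hat = estimate_rate(times, values)
```

Working with the integrated trajectory avoids estimating derivatives from the observed data, which is the usual motivation for integral-based fitting of differential equation models.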



More About This Work

Academic Units
Thesis Advisors
Wang, Yuanjia
Ph.D., Columbia University
Published Here
October 16, 2018