Theses Doctoral

Event History Analysis in Multivariate Longitudinal Data

Yuan, Chaoyu

This thesis studies event history analysis in multivariate longitudinal observational databases (LODs) and its application in postmarketing surveillance to identify and measure the relationship between health outcome events and drug exposures. LODs contain repeated measurements on each individual whose healthcare information is recorded electronically. Novel statistical methods are being developed to handle the challenges arising from the scale and complexity of postmarketing surveillance LODs. In particular, the self-controlled case series (SCCS) method has two major features: (1) it uses only individuals with at least one event for analysis and inference, and (2) it uses each individual as his/her own control, effectively requiring a person to switch treatments during the observation period. Although this method handles heterogeneity and bias, it discards individuals without events and therefore does not take full advantage of the observational databases. As a result, the SCCS method may lead to a substantial loss of efficiency.
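To make the two SCCS features described above concrete, here is a minimal sketch rather than the implementation used in the thesis. It assumes a single binary exposure, a baseline rate that is constant within each person, and toy data invented for illustration. The function names `sccs_log_likelihood` and `fit_sccs` are hypothetical.

```python
import math


def sccs_log_likelihood(beta, cases):
    """Conditional log-likelihood for one binary exposure.

    Each case is a list of intervals (length_in_days, exposed, n_events).
    Conditioning on a person's total event count cancels any
    person-specific baseline rate, so only within-person contrasts remain.
    """
    total = 0.0
    for intervals in cases:
        # Exposure-weighted person-time over this person's own observation period.
        denom = sum(length * math.exp(beta * exposed)
                    for length, exposed, _ in intervals)
        for length, exposed, n_events in intervals:
            if n_events:
                total += n_events * (math.log(length) + beta * exposed
                                     - math.log(denom))
    return total


def fit_sccs(cases, lo=-5.0, hi=5.0, tol=1e-8):
    """Maximize the (concave) log-likelihood by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sccs_log_likelihood(c, cases) > sccs_log_likelihood(d, cases):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2.0


# Toy data (assumed): 10 cases, each with 300 unexposed and 65 exposed days.
# Six have their single event while exposed, four while unexposed.
cases = ([[(300.0, 0, 0), (65.0, 1, 1)]] * 6
         + [[(300.0, 0, 1), (65.0, 1, 0)]] * 4)

beta_hat = fit_sccs(cases)
print("estimated relative incidence:", math.exp(beta_hat))
```

In this balanced toy setting the estimate has the closed form (6 × 300) / (4 × 65), which gives a quick check on the search. A person whose exposure never changes contributes a term that does not depend on `beta`, which is why the method effectively requires a switch in treatment during the observation period.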

We propose a multivariate proportional intensity modeling approach with random effects for multivariate LODs. The proposed method can explain the heterogeneity and eliminate bias in LODs. It also handles multiple types of events and makes full use of the observational databases. In the first part of this thesis, we present the multivariate proportional intensity model with correlated frailty. We explore the correlation structure between multiple types of clinical events and drug exposures. We introduce a multivariate Gaussian frailty to incorporate the within-subject heterogeneity, i.e. hidden confounding factors. For parameter estimation, we adopt a Bayesian approach, using Markov chain Monte Carlo to draw a series of samples from the posterior distribution based on the full likelihood. We compare the new method with the SCCS method and some frailty models through simulation studies.
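As a rough illustration of how a correlated Gaussian frailty links several event types, the sketch below simulates event counts for two event types. It is not the thesis model or its estimation procedure: it assumes constant baseline intensities, a single binary exposure fixed over follow-up, and parameter values chosen arbitrarily. All names (`draw_correlated_frailty`, `draw_poisson`, `simulate_subject`, `pearson`) are hypothetical.

```python
import math
import random


def draw_correlated_frailty(rng, sd1, sd2, rho):
    """Draw one bivariate Gaussian frailty (b1, b2) with correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b1 = sd1 * z1
    b2 = sd2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return b1, b2


def draw_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_subject(rng, exposed, beta, baseline, follow_up, sd, rho):
    """Event counts for two event types, given one subject's frailty.

    With a constant baseline, the intensity for type j is
    baseline[j] * exp(beta[j] * exposed + b[j]), so the count over
    follow-up is Poisson conditional on the frailty b.
    """
    b = draw_correlated_frailty(rng, sd[0], sd[1], rho)
    counts = tuple(
        draw_poisson(rng, baseline[j] * follow_up
                     * math.exp(beta[j] * exposed + b[j]))
        for j in range(2)
    )
    return counts, b


def pearson(xs, ys):
    """Sample Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
rho = 0.6  # assumed frailty correlation
results = [
    simulate_subject(rng, exposed=i % 2, beta=(0.7, 0.3), baseline=(0.5, 0.5),
                     follow_up=2.0, sd=(0.5, 0.5), rho=rho)
    for i in range(5000)
]
counts = [c for c, _ in results]
frailties = [b for _, b in results]

frailty_corr = pearson([b[0] for b in frailties], [b[1] for b in frailties])
count_corr = pearson([c[0] for c in counts], [c[1] for c in counts])
print("frailty correlation:", frailty_corr)
print("event-count correlation:", count_corr)
```

The two counts are independent given the exposure and the frailty, yet they are positively correlated marginally, through the shared exposure and the correlated frailty. Unlike the SCCS setting, subjects with zero events still enter the simulated data and would contribute to a likelihood built on this model.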

We apply the proposed model to an electronic health record (EHR) dataset and identify event types as defined in the Observational Medical Outcomes Partnership (OMOP) project. We show that the proposed method outperforms the existing methods in terms of common metrics, such as receiver operating characteristic (ROC) metrics. Finally, we extend the proposed correlated frailty model to include a dynamic random effect. We establish a general asymptotic theory for the nonparametric maximum likelihood estimators in terms of identifiability, consistency, asymptotic normality and asymptotic efficiency. We illustrate the proposed method in detail using the clinical event myocardial infarction (MI) and treatment with angiotensin-converting-enzyme (ACE) inhibitors, showing the dynamic effect of unobserved heterogeneity.


  • Yuan_columbia_0054D_16854.pdf application/pdf 567 KB

More About This Work

Academic Units
Thesis Advisors
Ying, Zhiliang
Ph.D., Columbia University
Published Here
October 20, 2021