Academic Commons

Clustering Algorithm for Zero-Inflated Data

Zero-inflated data are common in biomedical research. In cluster analysis, heuristic
approaches do not provide inferential properties for the resulting clusters, while existing
model-based approaches work only when the data follow a mixture of multivariate normal
distributions. In this dissertation, I developed two new model-based clustering algorithms:
the multivariate zero-inflated log-normal and the multivariate zero-inflated Poisson
clustering algorithms. I then applied these methods to questionnaire data and compared the
resulting clusters with those derived under the assumption of a multivariate normal
distribution. Associations between the clustering results and clinical outcomes were also
investigated.
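
For readers unfamiliar with zero inflation, the following is a minimal sketch of the standard univariate zero-inflated Poisson probability mass function, in which a count is a structural zero with probability pi and otherwise follows a Poisson distribution with mean lam. The function name `zip_pmf` and the parameter names `pi` and `lam` are illustrative assumptions; this is the textbook univariate definition and not necessarily the multivariate formulation developed in the dissertation.

```python
import math


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """Textbook univariate zero-inflated Poisson pmf (illustrative only).

    pi  : probability of a structural (excess) zero, 0 <= pi <= 1
    lam : mean of the Poisson component, lam > 0
    """
    if k < 0:
        return 0.0
    # Ordinary Poisson probability of observing k
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero can come from either the structural-zero part or the Poisson part
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson part
    return (1.0 - pi) * poisson
```

Under this definition the probability of a zero is always at least as large as under a plain Poisson model with the same mean parameter, which is why a model that ignores zero inflation can fit such data poorly.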


More About This Work

Academic Units: Biostatistics
Thesis Advisors: Cheng, Bin; Cheung, Ying Kuen
Degree: Dr.P.H., Mailman School of Public Health, Columbia University
Published Here: June 8, 2020