Theses Doctoral

Computational Analysis of Biomolecular Data for Medical Applications from Bulk to Single-cell

Zhu, Kaiyi

Over the last two decades, high-throughput technologies have driven the continuous generation of diverse biomolecular data, including genomic, epigenomic, transcriptomic, and other omics data. These advances have revolutionized medical research. This dissertation presents a collection of computational analyses and tools that draw on different types of biomolecular data, with particular applications to human diseases: 1) a cascade ensemble model, based on the Dirichlet process mixture model, for reconstructing tumor subclonality from tumor DNA sequencing data; 2) a meta-analysis of gene expression and DNA methylation data from prefrontal cortex samples of patients with neuropsychiatric disorders, which indicates a stress-related epigenetic mechanism; 3) 2DImpute, an imputation algorithm designed to alleviate the sparsity problem in single-cell RNA-sequencing data; and 4) a pan-cancer transformation from adipose-derived stromal cells to metastasis-associated fibroblasts, revealed by single-cell analysis.



More About This Work

Academic Units
Electrical Engineering
Thesis Advisors
Anastassiou, Dimitris
Degree
Ph.D., Columbia University
Published Here
September 8, 2020