2025 Theses Doctoral
Probabilistic and Optimization Methods for Biomedical and Single-Cell Studies
In this thesis, we explore probabilistic and optimization methods to address critical challenges in biomedical and single-cell studies. The applications range from group testing strategies to computational approaches for single-cell data analysis, showing the utility of these methods across scales and disciplines.
First, we develop a novel probabilistic framework for one-stage noisy group testing, a technique that efficiently identifies infected individuals within large populations. Motivated by practical needs during the COVID-19 pandemic, we propose a pooling design guided by maximizing pool entropy and a maximum-likelihood recovery algorithm. Our findings reveal the interplay between pooling parameters and randomness in infection vectors, offering a robust and adaptable group testing strategy.
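To make the idea of entropy-guided pooling concrete, here is a minimal sketch under simplifying assumptions that are not taken from the thesis: infections are independent with a known prevalence, tests are noiseless, and all pools have the same size. The function names `binary_entropy`, `pool_entropy`, and `best_pool_size` are hypothetical. Under these assumptions a pool outcome is most informative when the pool is equally likely to test positive or negative.

```python
import math


def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) outcome."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def pool_entropy(pool_size, prevalence):
    """Entropy of one pool's outcome, assuming independent infections
    and a noiseless test (both are assumptions of this sketch)."""
    # The pool is negative only if every member is uninfected.
    prob_negative = (1.0 - prevalence) ** pool_size
    return binary_entropy(prob_negative)


def best_pool_size(prevalence, max_size=64):
    """Pool size in 1..max_size whose outcome entropy is largest."""
    return max(range(1, max_size + 1),
               key=lambda k: pool_entropy(k, prevalence))


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.10):
        k = best_pool_size(p, max_size=128)
        print(f"prevalence={p:.2f} pool_size={k} "
              f"entropy={pool_entropy(k, p):.4f} bits")
```

With noisy tests, the outcome probability would also depend on the false-positive and false-negative rates, which this sketch leaves out.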
Second, we introduce CellStitch, a 3D cell segmentation method leveraging optimal transport to overcome the challenges of segmenting anisotropic microscopy images. Unlike most existing segmentation approaches, CellStitch circumvents the need for large 3D training datasets by aligning cellular correspondences across imaging layers. Benchmarking on diverse plant microscopy datasets demonstrates that CellStitch outperforms state-of-the-art segmentation methods, particularly on images with high anisotropy, enabling accurate analysis of 3D cellular structures.
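The idea of aligning cells across adjacent imaging layers can be illustrated with a toy example. The sketch below is not the CellStitch implementation: it swaps general optimal transport for a one-to-one assignment (the special case in which every cell carries unit mass), uses 1 minus intersection-over-union as an assumed cost, and searches matchings exhaustively, which is only practical for a handful of cells. The names `label_pixels`, `best_matching`, `stitch`, and the `min_iou` threshold are hypothetical.

```python
from itertools import permutations


def label_pixels(mask):
    """Map each nonzero label of a 2D mask to its set of (row, col) pixels."""
    pixels = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label != 0:
                pixels.setdefault(label, set()).add((r, c))
    return pixels


def iou(a, b):
    """Intersection over union of two non-empty pixel sets."""
    return len(a & b) / len(a | b)


def best_matching(cost, rows, cols):
    """Exhaustively find the one-to-one matching with the lowest total cost."""
    if len(rows) > len(cols):
        flipped_cost = {(c, r): v for (r, c), v in cost.items()}
        return [(r, c) for c, r in best_matching(flipped_cost, cols, rows)]
    best = min(permutations(cols, len(rows)),
               key=lambda perm: sum(cost[r, c] for r, c in zip(rows, perm)))
    return list(zip(rows, best))


def stitch(upper, lower, min_iou=0.25):
    """Return {lower_label: upper_label} for cells judged to be the same."""
    up, lo = label_pixels(upper), label_pixels(lower)
    cost = {(l, u): 1.0 - iou(lo[l], up[u]) for l in lo for u in up}
    pairs = best_matching(cost, sorted(lo), sorted(up))
    # Drop matched pairs that barely overlap; those cells keep their own label.
    return {l: u for l, u in pairs if 1.0 - cost[l, u] >= min_iou}


if __name__ == "__main__":
    upper = [[1, 1, 0, 2],
             [1, 1, 0, 2]]
    lower = [[5, 5, 0, 7],
             [5, 0, 0, 7]]
    print(stitch(upper, lower))
```

Because the matching uses only 2D masks from neighbouring slices, a sketch like this needs no 3D training data, which is the property the abstract emphasises.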
Finally, we present RefCM, an algorithm that automates cell-type annotation in single-cell RNA sequencing data using Wasserstein distances between cell populations. RefCM measures the similarity between gene expression distributions and solves an integer program to align query clusters with reference annotations. Our method achieves superior accuracy in cross-technology, cross-tissue, and cross-species mappings, addressing a critical need for robust annotation in single-cell studies and broadening the applicability of scRNA-seq technologies.
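As a rough illustration of distribution-based annotation, the sketch below compares clusters by the 1D Wasserstein distance of a single gene's expression values and gives each query cluster the label of its nearest reference population. This is a deliberate simplification, not RefCM itself: it uses one gene instead of full expression profiles, assumes equal sample sizes, and replaces the integer program with a per-cluster nearest-reference rule. The names `wasserstein_1d`, `annotate`, and `max_distance`, and all data values, are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D samples.

    For equal-size empirical distributions this is the mean absolute
    difference between the sorted values.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("this sketch assumes non-empty, equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def annotate(query, reference, max_distance=float("inf")):
    """Label each query cluster with its nearest reference cell type.

    Clusters farther than max_distance from every reference get None.
    """
    labels = {}
    for cluster, values in query.items():
        name, dist = min(
            ((ref, wasserstein_1d(values, ref_values))
             for ref, ref_values in reference.items()),
            key=lambda pair: pair[1],
        )
        labels[cluster] = name if dist <= max_distance else None
    return labels


if __name__ == "__main__":
    # Made-up expression values for one marker gene, for illustration only.
    reference = {"T cell": [0.1, 0.2, 0.3, 0.4],
                 "B cell": [2.0, 2.1, 2.2, 2.3]}
    query = {"c0": [2.2, 1.9, 2.1, 2.4],
             "c1": [0.0, 0.3, 0.2, 0.5]}
    print(annotate(query, reference))
```

An integer program, as described in the abstract, can impose constraints on the mapping as a whole, which the independent nearest-reference rule used here cannot.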
Together, these contributions demonstrate the power of probabilistic modeling and optimization methods in advancing the efficiency and accuracy of biomedical and single-cell analyses, providing tools to tackle challenges in diverse biological contexts.
Files
- Liu_columbia_0054D_19113.pdf (application/pdf, 1.33 MB)
More About This Work
- Academic Units: Computer Science
- Thesis Advisors: Blumberg, Andrew J.
- Degree: Ph.D., Columbia University
- Published Here: May 7, 2025