2023 Theses Doctoral
A Novel Analytical Framework for Regulatory Network Analysis of Single-Cell Transcriptomic Data
While single-cell RNA sequencing provides a remarkable window on pathophysiologic tissue biology and heterogeneity, its high gene-dropout rate and low signal-to-noise ratio challenge quantitative analyses and mechanistic understanding. This thesis addresses this issue by developing PISCES, a pipeline for regulatory network-based single-cell analysis of mammalian tissues. PISCES accurately estimates the mechanistic contribution of regulatory and signaling proteins to cell state implementation and maintenance based on the expression of their lineage-specific transcriptional targets, inferring protein activity for a putative set of transcriptional regulators and cell-state markers. Experimental validation assays, including technical analysis via downsampling of high-depth data and biological analysis by assessing concordance with CITE-Seq-based measurements, show a significant improvement in the ability to identify rare subpopulations and to elucidate key lineage markers compared to gene expression analysis.
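The idea of scoring a regulator from the expression of its targets can be illustrated with a minimal sketch. This is not the PISCES implementation, which the abstract does not describe in detail. It is a simplified stand-in that takes a signed, weighted average of z-scored target expression. The function name `activity_scores`, the regulon format, and the toy regulator `TF_A` are all hypothetical.

```python
import numpy as np

def activity_scores(expr, regulons):
    """Toy target-based activity score (illustrative only, not PISCES).

    expr     : genes x cells array of normalized expression.
    regulons : dict mapping a regulator name to a list of
               (gene_index, weight) pairs; the weight's sign encodes
               activation (+) or repression (-). Hypothetical format.
    Returns a dict mapping each regulator to a per-cell score array.
    """
    # z-score each gene across cells so that highly expressed genes
    # do not dominate the score
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes contribute zero
    z = (expr - mean) / std

    scores = {}
    for regulator, targets in regulons.items():
        idx = np.array([gene for gene, _ in targets])
        w = np.array([weight for _, weight in targets], dtype=float)
        # signed weighted sum over targets, scaled by the weight norm
        scores[regulator] = (w @ z[idx]) / np.sqrt((w ** 2).sum())
    return scores

# Toy data: 4 genes x 4 cells; TF_A activates genes 0 and 1, represses gene 2
expr = np.array([[5, 5, 0, 0],
                 [4, 4, 1, 1],
                 [0, 0, 3, 3],
                 [2, 2, 2, 2]], dtype=float)
scores = activity_scores(expr, {"TF_A": [(0, 1.0), (1, 1.0), (2, -1.0)]})
```

Because the score pools evidence across many targets, a regulator can still receive a clear score in a cell where its own transcript was lost to dropout, which is the intuition behind measuring activity rather than expression.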
The improved ability to identify biologically meaningful cellular subpopulations makes PISCES an ideal tool to deconvolute heterogeneity in a wide variety of biological contexts. A systematic analysis of single-cell gene expression profiles in the Human Protein Atlas (HPA) by PISCES generated tissue-specific clustering and master regulator analyses across 26 human tissues, as well as a publicly available repository of ready-to-use regulatory networks specific to cell lineages in each tissue. This resource will allow researchers to access the algorithmic advantages of PISCES without requiring prohibitively expensive or technically challenging computational resources.
Additionally, PISCES is able to unravel the heterogeneous stromal environment of Pancreatic Ductal Adenocarcinoma, a malignancy defined by a large and complicated stromal compartment. This analysis reveals several novel candidate subpopulations, including a fibroblast subtype that has never been observed in humans, a potential pro-metastatic population of endothelial cells, and a population of immune-suppressing stellate cells.
PISCES is also able to deconvolute more continuous forms of heterogeneity, as demonstrated by an analysis of epithelial cells in the developing murine lung. Here, PISCES is able to computationally reconstruct a developmental trajectory between Sox9+ distal cells and Sox2+ proximal cells, which is then leveraged to identify several novel markers of the critical intermediate population. Subsequent analysis suggests that these transition zone cells may share programs similar to those seen in injury repair and identifies a candidate therapeutic target that can drive cells into or out of this transition state.
Finally, protein activity measured by PISCES is used to refine faulty experimental labels through differential density analysis. This analysis led to the development of a machine learning classifier that accurately predicted increased degrees of stemness in experimentally transduced populations. Additionally, the density analysis paradigm has been extended to unsupervised settings, allowing for the detection of stable cellular populations and transitory trajectories.
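The abstract does not spell out how differential density analysis works. One plausible reading, sketched below purely as an assumption, is to estimate the density of each experimental label group in a low-dimensional activity space and reassign cells that sit where the other group is denser. The function names, the Gaussian kernel, and the `bandwidth` parameter are illustrative choices, not the thesis method.

```python
import numpy as np

def gaussian_kde(points, queries, bandwidth):
    """Mean isotropic Gaussian kernel between each query and all points."""
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    dim = points.shape[1]
    norm = (2 * np.pi * bandwidth ** 2) ** (dim / 2)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1) / norm

def refine_labels(coords, labels, bandwidth=1.0):
    """Relabel cells by comparing the densities of the two label groups.

    coords : cells x dims array (e.g. a reduced activity space; assumed).
    labels : array of 0/1 experimental labels, possibly noisy.
    Returns (refined_labels, log_density_ratio).
    """
    eps = 1e-12  # avoids log(0) far from either group
    dens0 = gaussian_kde(coords[labels == 0], coords, bandwidth)
    dens1 = gaussian_kde(coords[labels == 1], coords, bandwidth)
    log_ratio = np.log(dens1 + eps) - np.log(dens0 + eps)
    # a cell takes label 1 only where group 1 is the denser group
    return (log_ratio > 0).astype(int), log_ratio

# Toy data: two well-separated clusters with one mislabeled cell
rng = np.random.default_rng(0)
coords = np.vstack([rng.normal(0, 0.5, (20, 2)),
                    rng.normal(10, 0.5, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)
labels[0] = 1  # simulate a faulty experimental label
refined, log_ratio = refine_labels(coords, labels)
```

In this toy setting the mislabeled cell sits inside the label-0 cluster, so the label-0 density dominates there and the cell is reassigned. The continuous `log_ratio` could also serve as a soft training signal for a downstream classifier.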
- Vlahos_columbia_0054D_17692.pdf application/pdf 6.94 MB
More About This Work
- Academic Units: Cellular, Molecular and Biomedical Studies
- Thesis Advisors: Califano, Andrea
- Degree: Ph.D., Columbia University
- Published Here: March 29, 2023