Theses Doctoral

The development and application of high-throughput tools for functional genomics

Sheng, Jiemin

The study of cell physiology and functional genomics has seen an explosion of interest stemming from the development and commercialization of DNA sequencing technologies that allow upwards of several billion molecules to be probed simultaneously. However, of the three most abundant biomolecules in the cell (DNA, RNA, and protein), it is the dynamic quantities of RNA and protein that dictate much of the phenotypic variation observed from tissue to tissue, organ to organ, and cell to cell. Though much work has been done to measure RNA quantities in cells, and even to model their temporal dynamics from a single time-point measurement, the focus of this thesis is the development of methods to measure proteins within cells and to draw conclusions about their physiological implications for the larger organism. In the work outlined here, we couple protein measurements to DNA readouts, which allows us to leverage commercial sequencing platforms to determine phenotypic outcomes through different methodologies. This thesis proceeds in two parts.

Chapter 2 describes the development of our method, Quantum Barcoding 2 (QBC2), which uses DNA-barcoded antibodies to simultaneously quantify the expression of dozens of proteins on single cells. Through head-to-head comparisons with flow cytometry, the traditional diagnostic gold standard, we demonstrate that QBC2 can accurately distinguish cell types and readily capture rare phenotypes that are otherwise too costly or labor-intensive to probe using traditional methods.

Chapter 3 discusses a deep mutational scanning (DMS) study, conceived and developed during the COVID-19 pandemic, that provides a detailed understanding of the 3CL protease of SARS-CoV-2, one of the critical components of the viral replication machinery. The same technique is applied to DNAJB6 to evaluate its ability to function as a chaperone protein. By combining comprehensive mutagenesis with methods for probing gene function en masse, we were able to evaluate the fitness effects of all amino acid substitutions within the 3CL protease and a large portion of DNAJB6, giving us valuable insight into their mechanisms of action.

As a whole, this thesis presents a multi-faceted view of how new tools can be developed to measure protein expression and function, with the potential to generalize to other currently unexplored modalities.



More About This Work

Academic Units: Cellular, Molecular and Biomedical Studies
Thesis Advisors: Chavez, Alejandro
Degree: Ph.D., Columbia University
Published Here: May 25, 2022