Academic Commons

Theses Doctoral

Accurate and Sensitive Quantification of Protein-DNA Binding Affinity

Rastogi, Chaitanya

Transcription factors control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in transcription factor binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here we developed a versatile maximum likelihood framework, named No Read Left Behind (NRLB), that fits a biophysical model of protein-DNA recognition to all in vitro selected DNA binding sites across the full affinity range. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. The model captures the specificity of p53 tetrameric binding sites and discovers multiple binding modes in a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.

Files

This item is currently under embargo. It will be available starting 2019-08-22.

More Information

Academic Units: Applied Physics and Applied Mathematics
Thesis Advisors: Bussemaker, Harmen J.; Bienstock, Daniel
Degree: Ph.D., Columbia University