2025 Theses Doctoral
Enhancer RNAs: a Source of Novel, Rapidly-Evolving Proteins
Enhancer RNAs (eRNAs) are a family of long noncoding RNAs (lncRNAs) transcribed from enhancer sites by RNA polymerase II (RNAPII) as part of the process of enhancer activation. eRNAs are typically processed by the Integrator complex and incorporated into the enhancer looping machinery formed by the Mediator and Cohesin complexes. However, some eRNAs escape this processing step and are instead transcribed further to produce longer, polyadenylated RNAs. Poly(A)+ eRNAs are usually targeted for degradation by the RNA exosome via the Mtr4-containing PAXT complex, but under some conditions they can be exported to the cytoplasm. In the latter case, as shown in Chapter 2 of this thesis, eRNAs that contain open reading frames (ORFs), of which there are a surprising number, can be translated to produce functional proteins. eRNAs can gain ORFs through de novo gene birth, giving rise to novel genes at sites capable of fulfilling the roles of both canonical coding genes and enhancers.
This dissertation is divided into three main sections, with an additional fourth section outlining future research directions. In the first section, I review the current research on eRNAs and other lncRNAs, including their origins, processing, and functions. I also review current research on proteins encoded by canonically noncoding RNAs, and the process and implications of de novo gene birth as it relates to these topics.
In the second section, we aimed to identify translated eRNAs in the human genome by comparing ribosome profiling data against a database of transcribed enhancers. Using these results, we selected ten large eRNA ORFs and investigated the functions of the proteins they encode, including their subcellular localization and protein interactomes. Finally, we investigated the homologs of these eRNA ORF sequences in other species to determine their level of evolutionary conservation relative to that observed in canonical protein-coding genes. Our findings in the second section provide evidence for novel, highly basic, arginine-rich proteins that are encoded by eRNAs and are capable of interacting with DNA and RNA, either directly or through interactions with other associated proteins. We also present evidence that the ORFs encoding these proteins appeared relatively recently in human evolution, with most being primate-specific and exhibiting mutation rates consistent with purifying selection of coding sequences across their homologs in great apes.
In the third section, we present additional analyses of published proteomics and RNA-seq datasets that provide evidence for expression of ORF-containing eRNAs in differentiating stem cells, which we confirmed using qRT-PCR of whole-cell RNA samples isolated from these differentiating cells. Several eRNAs identified in the second section are expressed during the early stages of human embryonic stem cell differentiation into the three germ layers and subsequent mature cell types. This expression coincides with a decrease in Mtr4 protein levels detected in proteomics data from the same differentiations. Together, these findings indicate that ORF-containing eRNAs, and the Mtr4-depleted conditions that permit their expression, are present in stem cells during differentiation, making stem cell differentiation a promising subject for future research on the roles of eRNA-encoded proteins in normal cellular function.
Files
This item is currently under embargo. It will be available starting 2029-12-19.
More About This Work
- Academic Units: Biological Sciences
- Thesis Advisors: Manley, James L.
- Degree: Ph.D., Columbia University
- Published Here: January 15, 2025