2013 Theses Doctoral
Probabilistic Reconstruction and Comparative Systems Biology of Microbial Metabolism
With the number of sequenced microbial species soon to be in the tens of thousands, we are in a unique position to investigate microbial function, ecology, and evolution on a large scale. In this dissertation I first describe the use of hundreds of in silico models of bacterial metabolic networks to study the long-term evolution of growth and gene-essentiality phenotypes.
The results show that, over billions of years of evolution, the conservation of bacterial phenotypic properties drops by a similar fraction per unit time, that is, it follows an exponential decay. The analysis provides a framework for generating and testing hypotheses about the phenotypic evolution of different microbial groups, and for comparative analyses based on the phenotypic properties of species. Mapping genome sequences to phenotypic predictions, as in the analysis just described, relies critically on accurate functional annotations.
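To make the phrase "drops by a similar fraction per unit time" concrete, here is a minimal Python sketch of an exponential decay and of estimating its rate by a log-linear least-squares fit. The function names, the rate, and the time points are illustrative assumptions; none of the numbers come from the thesis.

```python
import math

def conservation(t, rate, c0=1.0):
    """Conserved fraction after divergence time t under exponential decay.

    `rate` and `c0` are hypothetical parameters, not values from the thesis.
    """
    return c0 * math.exp(-rate * t)

def fit_rate(times, values):
    """Estimate (rate, c0) by ordinary least squares on log(value) vs. time."""
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_l - slope * mean_t
    return -slope, math.exp(intercept)

# Synthetic, noise-free example (arbitrary time units, made-up parameters).
times = [0.5, 1.0, 2.0, 3.0]
values = [conservation(t, rate=0.5, c0=0.9) for t in times]
est_rate, est_c0 = fit_rate(times, values)
print(est_rate, est_c0)
```

Under this model the ratio between the conserved fractions at times t + 1 and t is the same for every t, which is what a constant fractional drop per unit time means.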
In this context, I next describe GLOBUS, a probabilistic method for genome-wide biochemical annotations. GLOBUS uses Gibbs sampling to calculate probabilities for each possible assignment of genes to metabolic functions based on sequence information and both local and global genomic context data. Several important functional predictions made by GLOBUS were experimentally validated in Bacillus subtilis, and hundreds more were obtained across other species. Complementary to the automated annotation method, I also describe the manual reconstruction and constraint-based analysis of the metabolic network of the malaria parasite Plasmodium falciparum. After careful reconciliation of the model with available biochemical and phenotypic data, the high-quality reconstruction allowed the prediction and in vivo validation of a novel potential antimalarial target. The model was also used to contextualize different types of genome-scale data such as gene expression and metabolomics measurements.
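As a rough illustration of how Gibbs sampling can turn sequence and context evidence into assignment probabilities, the toy Python sketch below resamples one gene at a time, conditional on the current functions of its context partners. It is not the GLOBUS model: the scoring rule, the `coupling` parameter, and all gene and function names are assumptions made up for this example.

```python
import math
import random
from collections import Counter

def gibbs_annotate(seq_weight, gene_links, func_links, coupling=1.0,
                   sweeps=2000, burn_in=200, seed=0):
    """Toy Gibbs sampler over gene-to-function assignments (hypothetical model).

    seq_weight: {gene: {function: sequence-based weight}}
    gene_links: {gene: [context-linked genes]}, assumed symmetric
    func_links: set of frozenset({f1, f2}) pairs adjacent in the network
    """
    rng = random.Random(seed)
    genes = sorted(seq_weight)
    # Start from the best sequence-based guess for each gene.
    state = {g: max(seq_weight[g], key=seq_weight[g].get) for g in genes}
    counts = {g: Counter() for g in genes}
    for sweep in range(sweeps):
        for g in genes:
            funcs = list(seq_weight[g])
            weights = []
            for f in funcs:
                # Context partners whose current function is adjacent to f.
                support = sum(
                    1 for h in gene_links.get(g, ())
                    if frozenset((f, state[h])) in func_links
                )
                weights.append(seq_weight[g][f] * math.exp(coupling * support))
            # Draw from the conditional distribution given all other genes.
            state[g] = rng.choices(funcs, weights=weights)[0]
        if sweep >= burn_in:
            for g in genes:
                counts[g][state[g]] += 1
    kept = sweeps - burn_in
    return {g: {f: counts[g][f] / kept for f in seq_weight[g]} for g in genes}

# Made-up inputs: geneB is ambiguous by sequence alone, but its context
# partner geneA most likely performs F1, which is adjacent to F2.
seq_weight = {"geneA": {"F1": 0.9, "F2": 0.1},
              "geneB": {"F2": 0.5, "F3": 0.5}}
gene_links = {"geneA": ["geneB"], "geneB": ["geneA"]}
func_links = {frozenset(("F1", "F2"))}
probs = gibbs_annotate(seq_weight, gene_links, func_links, coupling=2.0)
print(probs)
```

In this toy setting the context term shifts geneB's marginal probability toward F2, even though sequence evidence alone cannot distinguish F2 from F3.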
Finally, I present two projects related to population genetics aspects of sequence and genome evolution. The first project addresses the question of why highly expressed proteins evolve slowly, showing that, at least for Escherichia coli, this is more likely to be a consequence of selection for translational efficiency than selection to avoid misfolded protein toxicity. The second project investigates genetic robustness mediated by gene duplicates in the context of large natural microbial populations. The analysis shows that, under these conditions, the ability of duplicated yeast genes to effectively compensate for the loss of their paralogs is not a monotonic function of their sequence divergence.
- PlataCaviedes_columbia_0054D_11653.pdf (6.44 MB)
More About This Work
- Academic Units: Cellular, Molecular and Biomedical Studies
- Thesis Advisors: Vitkup, Dennis
- Degree: Ph.D., Columbia University
- Published Here: October 18, 2013