Academic Commons


Large-scale experimental studies show unexpected amino acid effects on protein expression and solubility in vivo in E. coli

Hunt, John F.; Tong, Saichiu; Bracic, Ana; Luff, Jon; Naumov, Victor; Acton, Thomas; Manor, Philip; Xiao, Rong; Rost, Burkhard; Montelione, Gaetano; Everett, John; Price, W. Nicholson; Handelman, Samuel

The biochemical and physical factors controlling protein expression level and solubility in vivo remain incompletely characterized. To gain insight into the primary sequence features influencing these outcomes, we performed statistical analyses of results from the high-throughput protein-production pipeline of the Northeast Structural Genomics Consortium. Proteins expressed in E. coli and consistently purified were scored independently for expression and solubility levels. These parameters nonetheless show a very strong positive correlation. We used logistic regressions to determine whether they are systematically influenced by fractional amino acid composition or several bulk sequence parameters including hydrophobicity, sidechain entropy, electrostatic charge, and predicted backbone disorder. Decreasing hydrophobicity correlates with higher expression and solubility levels, but this correlation apparently derives solely from the beneficial effect of three charged amino acids, at least for bacterial proteins. In fact, the three most hydrophobic residues showed very different correlations with solubility level. Leu showed the strongest negative correlation among amino acids, while Ile showed a slightly positive correlation in most data segments. Several other amino acids also had unexpected effects. Notably, Arg correlated with decreased expression and, most surprisingly, solubility of bacterial proteins, an effect only partially attributable to rare codons. However, rare codons did significantly reduce expression despite use of a codon-enhanced strain. Additional analyses suggest that positively but not negatively charged amino acids may reduce translation efficiency in E. coli irrespective of codon usage. While some observed effects may reflect indirect evolutionary correlations, others may reflect basic physicochemical phenomena. 
We used these results to construct and validate predictors of expression and solubility levels and overall protein usability, and we propose new strategies to be explored for engineering improved protein expression and solubility.
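As a rough illustration of the sequence features named in the abstract, the sketch below computes fractional amino acid composition, a mean hydrophobicity score, and a net charged-residue fraction for a single sequence. It is not the authors' pipeline. The function names are hypothetical, the Kyte-Doolittle hydropathy scale is an assumption (the abstract does not say which hydrophobicity scale was used), and the charge measure simply counts Lys and Arg as positive and Asp and Glu as negative.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed scale; not stated in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standard_residues(seq):
    """Uppercase the sequence and keep only the 20 standard residues."""
    residues = [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return residues


def fractional_composition(seq):
    """Fraction of the sequence made up by each of the 20 amino acids."""
    residues = _standard_residues(seq)
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def mean_hydrophobicity(seq):
    """Average hydropathy per residue on the assumed Kyte-Doolittle scale."""
    residues = _standard_residues(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def net_charge_fraction(seq):
    """(Lys + Arg - Asp - Glu) as a fraction of sequence length."""
    comp = fractional_composition(seq)
    return comp["K"] + comp["R"] - comp["D"] - comp["E"]
```

Per-protein values like these could serve as predictor variables in a logistic regression against scored expression or solubility levels, which is the kind of analysis the abstract describes.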

Files

  • 3b7c0ed56b2d6bf9c5c0d27771d08387.zip (binary/octet-stream, 4.19 MB)

Also Published In

Title
Microbial Informatics and Experimentation

More About This Work

Academic Units
Biological Sciences
Biochemistry and Molecular Biophysics
Publisher
BioMed Central
Published Here
September 8, 2014