Bajaj Kanika, Madhusudhan M S, Adkar Bharat V, Chakrabarti Purbani, Ramakrishnan C, Sali Andrej, Varadarajan Raghavan
Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India.
PLoS Comput Biol. 2007 Dec;3(12):e241. doi: 10.1371/journal.pcbi.0030241.
When incorporated into a polypeptide chain, proline (Pro) differs from all other naturally occurring amino acid residues in two important respects: its phi dihedral angle is constrained to values close to -65 degrees, and it lacks an amide hydrogen. Consequently, mutations that introduce Pro can significantly affect protein stability. In the present work, we describe a procedure to predict the effect of Pro introduction on protein thermodynamic stability. Seventy-seven of the 97 non-Pro amino acid residues in the model protein, CcdB, were individually mutated to Pro, and the in vivo activity of each mutant was characterized. A decision tree that classifies each mutation as perturbing or nonperturbing was created by correlating stereochemical properties of the mutants with the activity data. The stereochemical properties, including the main chain dihedral angle phi and main chain amide hydrogen bonds (H-bonds), were determined from 3D models of the mutant proteins built using MODELLER. We assessed the performance of the decision tree on a large dataset of 163 single-site Pro mutations of T4 lysozyme, 74 nonsynonymous single-nucleotide polymorphisms (nsSNPs), and 52 other Pro substitutions from the literature. The overall accuracy of the algorithm was 81% for CcdB, 77% for lysozyme, 76% for the nsSNPs, and 71% for the other Pro substitution data. The accuracy of Pro scanning mutagenesis for secondary structure assignment was also assessed and found to be at best 69%. Our prediction procedure will be useful in annotating uncharacterized nsSNPs of disease-associated proteins and for protein engineering and design.
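As a rough illustration of the kind of rule-based classification described above, the following minimal Python sketch labels a candidate Pro substitution from two features named in the abstract: the main chain phi angle at the site and whether the main chain amide hydrogen takes part in an H-bond. The abstract does not give the tree's branches or cutoffs, so the `Site` type, the `classify` function, the order of the tests, and the `PHI_TOLERANCE` value are all hypothetical and are not the published decision tree.

```python
from dataclasses import dataclass

PRO_PHI = -65.0        # degrees; the preferred Pro phi value stated in the abstract
PHI_TOLERANCE = 25.0   # ASSUMPTION: illustrative window, not a cutoff from the paper


@dataclass
class Site:
    """Hypothetical feature record for one candidate Pro substitution."""
    phi: float            # main chain phi at the site, in degrees
    amide_h_bonded: bool  # True if the main chain N-H donates an H-bond


def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify(site: Site) -> str:
    """Toy two-level rule (an assumption, not the authors' tree)."""
    # Pro constrains phi, so a site far from -65 degrees is treated as strained.
    if angular_difference(site.phi, PRO_PHI) > PHI_TOLERANCE:
        return "perturbing"
    # Pro has no amide hydrogen, so an existing main chain N-H bond would be lost.
    if site.amide_h_bonded:
        return "perturbing"
    return "nonperturbing"


if __name__ == "__main__":
    for site in (Site(-60.0, False), Site(60.0, False), Site(-65.0, True)):
        print(site, classify(site))
```

In practice the feature values would come from a structural model of the protein, and any cutoffs would need to be fitted to experimental activity data rather than chosen by hand as they are here.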