Department of Physics and Astronomy, University of Denver, Denver, Colorado 80208, USA and Molecular and Cellular Biophysics, University of Denver, Denver, Colorado 80208, USA.
J Chem Phys. 2020 Apr 30;152(16):161102. doi: 10.1063/5.0004619.
Intrinsically disordered proteins (IDPs), unlike folded proteins, lack a unique folded structure and rapidly interconvert among ensembles of disordered states. However, they have specific conformational properties when averaged over these ensembles. It is critical to develop a theoretical formalism to predict these ensemble-averaged conformational properties, which are encoded in the IDP sequence (the specific order in which amino acids/residues are linked). We present a general heteropolymer theory that analytically computes the ensemble-averaged distance profiles (⟨R_ij^2⟩) between any two monomers i and j (amino acids for IDPs) as a function of the sequence. Information-rich distance profiles provide a more detailed description of an IDP than typical metrics such as scaling exponents, radius of gyration, or end-to-end distance. This generalized formalism supersedes homopolymer-like models and models that are built only on amino acid composition and ignore sequence details. The prediction of these distance profiles for highly charged polyampholytes and naturally occurring IDPs unmasks salient features that are hidden in the sequence. Moreover, the model reveals strategies to modulate the entire distance map, achieving local or global swelling or compaction through subtle modifications (such as phosphorylation, a biologically relevant process) at specific hotspots in the sequence. Sequence-specific distance profiles and their modulation have been benchmarked against all-atom simulations. Our new formalism also predicts residue-pair-specific coil-globule transitions. The analytical nature of the theory will facilitate the design of new sequences that achieve specific target distance profiles, with broad applications in synthetic biology and polymer science.
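To make the notion of a distance profile concrete, the following minimal sketch computes the ensemble-averaged map ⟨R_ij^2⟩ for an ideal, freely jointed chain, the sequence-independent homopolymer baseline that a sequence-specific theory is meant to go beyond. It is not the heteropolymer theory of the paper: the function names, the bond length of 1, and the chain and ensemble sizes are all illustrative assumptions. For such a chain the analytic result is ⟨R_ij^2⟩ = |i - j| b^2, which the sampled ensemble should reproduce within statistical error.

```python
import numpy as np


def ideal_chain_distance_map(n_monomers, bond_length=1.0):
    """Analytic <R_ij^2> = |i - j| * b^2 for a freely jointed chain.

    Illustrative homopolymer baseline only; it carries no sequence information.
    """
    idx = np.arange(n_monomers)
    return np.abs(idx[:, None] - idx[None, :]) * bond_length**2


def sampled_distance_map(n_monomers, n_chains=2000, bond_length=1.0, seed=0):
    """Monte Carlo estimate of <R_ij^2> averaged over random-walk chains."""
    rng = np.random.default_rng(seed)
    # Isotropic bond vectors of fixed length, shape (chains, bonds, 3).
    bonds = rng.normal(size=(n_chains, n_monomers - 1, 3))
    bonds *= bond_length / np.linalg.norm(bonds, axis=-1, keepdims=True)
    # Monomer positions: first monomer at the origin, then cumulative bonds.
    origin = np.zeros((n_chains, 1, 3))
    positions = np.concatenate([origin, np.cumsum(bonds, axis=1)], axis=1)
    # Pairwise separation vectors for every (i, j), shape (chains, N, N, 3).
    diff = positions[:, :, None, :] - positions[:, None, :, :]
    # Squared distances, averaged over the ensemble of chains.
    return (diff**2).sum(axis=-1).mean(axis=0)


if __name__ == "__main__":
    n = 20
    analytic = ideal_chain_distance_map(n)
    sampled = sampled_distance_map(n)
    # Compare the end-to-end entry of the map (i = 0, j = n - 1).
    print("analytic <R^2> end-to-end:", analytic[0, -1])
    print("sampled  <R^2> end-to-end:", round(float(sampled[0, -1]), 2))
```

In this picture, a sequence-dependent theory would replace the uniform |i - j| b^2 entries with values that swell or compact for particular residue pairs, which is the kind of information the full distance map exposes and a single number such as the radius of gyration does not.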