Amarasinghe Piyumi R, Allison Lloyd, Morton Craig J, Stuckey Peter J, Garcia de la Banda Maria, Lesk Arthur M, Konagurthu Arun S
Department of Data Science and Artificial Intelligence, Monash University, Clayton, VIC 3800, Australia.
Biomedical Manufacturing Program, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Clayton, VIC 3168, Australia.
Proc Natl Acad Sci U S A. 2025 Jan 7;122(1):e2416301121. doi: 10.1073/pnas.2416301121. Epub 2025 Jan 2.
Because structural biology and drug discovery depend on high-quality protein structures, assessment tools are essential. We describe a new method for validating amino-acid conformations: "PhiSiCal (φψχal) Checkup." Twenty new joint probability distributions, in the form of statistical mixture models, explain the empirical distributions of the dihedral angles [Formula: see text] of the canonical amino acids in experimental protein structures. Marginal and conditional probability distributions for subsets of dihedral angles are derived from these joint mixture models. Together, these distributions are used to rapidly measure the information-theoretic "favorability" of any proposed experimental protein structure. The inferred statistical models and measures overcome several shortcomings of, and improve on, the current state of the art in amino-acid conformation verification. We compare the method experimentally against current protein conformation verification software. In a number of examples, we pick up outliers that are invisible to current methods. As part of verification, we also calculate the sensitivity of favorability to small changes in a proposed structure, accounting for the precision of the coordinates. In some cases, a near neighbor of a proposed amino-acid conformation may be either less or more favorable. This raises the question of whether the current reliance on fixed "thresholds" for validation is sound. PhiSiCal-Checkup is freely available for online and offline (open-source) use from https://lcb.infotech.monash.edu.au/phisical/checkup.
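To make the ideas of a joint mixture model, its marginals and conditionals, an information-theoretic score, and sensitivity to small coordinate changes concrete, here is a minimal Python sketch. It is not the PhiSiCal-Checkup implementation. It assumes a toy two-component mixture over only two angles (phi, psi), with each component a product of independent von Mises distributions. The parameter values, the function names (`joint_density`, `marginal_phi`, `conditional_psi`, `surprisal_bits`, `sensitivity`), the 1-degree angular precision, and the 5-degree perturbation are all hypothetical and chosen purely for illustration.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) via its power series (adequate for moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2  # ratio of consecutive series terms
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution on the circle (angles in radians)."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


# ASSUMPTION: toy parameters, not values inferred from experimental structures.
# Each component is (weight, (mu_phi, kappa_phi), (mu_psi, kappa_psi)).
TOY_MIXTURE = [
    (0.6, (math.radians(-63.0), 20.0), (math.radians(-43.0), 20.0)),   # helix-like
    (0.4, (math.radians(-120.0), 8.0), (math.radians(135.0), 8.0)),    # strand-like
]


def joint_density(phi, psi, mixture=TOY_MIXTURE):
    """Joint mixture density p(phi, psi)."""
    return sum(w * von_mises_pdf(phi, *p) * von_mises_pdf(psi, *s)
               for w, p, s in mixture)


def marginal_phi(phi, mixture=TOY_MIXTURE):
    """Marginal p(phi): psi integrates out of each product component exactly."""
    return sum(w * von_mises_pdf(phi, *p) for w, p, _ in mixture)


def conditional_psi(psi, phi, mixture=TOY_MIXTURE):
    """Conditional p(psi | phi) = p(phi, psi) / p(phi)."""
    return joint_density(phi, psi, mixture) / marginal_phi(phi, mixture)


def surprisal_bits(phi, psi, eps=math.radians(1.0)):
    """Bits needed to state (phi, psi) to precision eps; lower means more favorable."""
    return -math.log2(joint_density(phi, psi) * eps * eps)


def sensitivity(phi, psi, delta=math.radians(5.0)):
    """Range of the score over a 3x3 grid of small perturbations of (phi, psi)."""
    shifts = (-delta, 0.0, delta)
    values = [surprisal_bits(phi + a, psi + b) for a in shifts for b in shifts]
    return min(values), max(values)


if __name__ == "__main__":
    helix = (math.radians(-63.0), math.radians(-43.0))
    odd = (math.radians(60.0), math.radians(-120.0))
    for name, (phi, psi) in (("helix-like", helix), ("unusual", odd)):
        lo, hi = sensitivity(phi, psi)
        print(f"{name}: {surprisal_bits(phi, psi):.2f} bits, "
              f"nearby range {lo:.2f} to {hi:.2f} bits")
```

Under these toy parameters, a conformation near a component mean costs fewer bits than one far from both components, and the range returned by `sensitivity` shows how much the score can move within a small neighborhood, which is the kind of variation that a single fixed threshold does not capture.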