Zhang Chenwei, Condon Anne, Dao Duc Khanh
Department of Computer Science, University of British Columbia, ICICS/CS Building 201-2366 Main Mall, Vancouver BC V6T 1Z4, Canada.
Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver BC V6T 1Z2, Canada.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf322.
Advances in deep learning (DL) have recently led to new methods for the automated construction of atomic models of proteins from single-particle cryogenic electron microscopy (cryo-EM) density maps. We conduct a comprehensive survey of these methods, distinguishing between direct model-building approaches that use only density maps and indirect ones that integrate sequence-to-structure predictions from AlphaFold. To evaluate them more precisely, we refine existing standard metrics and benchmark a subset of representative DL methods against traditional physics-based approaches using 50 cryo-EM density maps at varying resolutions. Our findings demonstrate that, overall, DL-based methods outperform traditional physics-based methods. Our benchmark also shows the benefit of integrating AlphaFold, which improved the completeness and accuracy of the models, although its dependence on available sequence information and limited training data may restrict its use.