Institute for Systems Biology, Seattle, WA 98109, USA.
Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, CA 92093, USA.
Nucleic Acids Res. 2023 Jan 6;51(D1):D1539-D1548. doi: 10.1093/nar/gkac1040.
Mass spectrometry (MS) is by far the most widely used experimental approach in high-throughput proteomics. The ProteomeXchange (PX) consortium of proteomics resources (http://www.proteomexchange.org) was originally set up to standardize the submission and dissemination of public MS proteomics data. It is now 10 years since the initial data workflow was implemented. In this manuscript, we describe the main developments in PX since the previous update manuscript was published in Nucleic Acids Research in 2020. The six members of the Consortium are PRIDE, PeptideAtlas (including PASSEL), MassIVE, jPOST, iProX and Panorama Public. We report the current data submission statistics, which show that the number of datasets submitted to PX resources has continued to increase every year. As of June 2022, more than 34 233 datasets had been submitted to PX resources, 20 062 (58.6%) of them in the last three years alone. We also report the development of Universal Spectrum Identifiers and improvements in the capture of experimental metadata annotations. In parallel, we highlight that re-use of public datasets continues to increase, enabling connections between PX resources and other popular bioinformatics resources, as well as novel research and new data resources. Finally, we summarise the current state of the art in data management practices for sensitive human (clinical) proteomics data.