Institute for Systems Biology, Seattle, Washington 98008, United States.
Stoller Biomarker Discovery Centre, University of Manchester, Manchester M13 9PL, U.K.
J Proteome Res. 2019 Dec 6;18(12):4262-4272. doi: 10.1021/acs.jproteome.9b00205. Epub 2019 Jul 22.
Spectral matching sequence database search engines commonly used in mass spectrometry-based proteomics experiments excel at identifying peptide sequence ions, including possible sequence ions carrying post-translational modifications (PTMs), but most do not provide confidence metrics for the exact localization of those PTMs when several possible sites are available. Localization is absolutely required for downstream molecular cell biology analysis of PTM function in vitro and in vivo. Therefore, we developed PTMProphet, a free and open-source software tool integrated into the Trans-Proteomic Pipeline, which reanalyzes identified spectra from any search engine for which pepXML output is available and provides localization confidence, enabling appropriate further characterization of biological events. Localization of any type of mass modification (e.g., phosphorylation) is supported. PTMProphet applies Bayesian mixture models to compute probabilities for each site/peptide spectrum match where a PTM has been identified. These probabilities can be combined to compute a global false localization rate at any threshold to guide downstream analysis. We describe the PTMProphet tool and its underlying algorithms, and demonstrate its performance on ground-truth synthetic peptide reference data sets (one previously published small data set and one new larger data set) and on a previously published phosphoenriched data set where the correct sites of modification are unknown. Data have been deposited to ProteomeXchange with identifier PXD013210.
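As an illustration of how per-site probabilities can be combined into a global false localization rate, the sketch below uses the standard estimate for calibrated probabilities: among sites accepted at a threshold, the expected fraction of wrong localizations is the mean of (1 - p). This is not PTMProphet's code; the function name `false_localization_rate` and the example probabilities are hypothetical, and the estimate assumes the probabilities are well calibrated.

```python
from typing import Iterable


def false_localization_rate(site_probs: Iterable[float], threshold: float) -> float:
    """Estimate the false localization rate among sites with probability >= threshold.

    Assumes each probability is a calibrated estimate that the site is
    correctly localized, so (1 - p) is the expected error for that site.
    """
    accepted = [p for p in site_probs if p >= threshold]
    if not accepted:
        # No sites pass the threshold, so there is nothing to mislocalize.
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


# Hypothetical localization probabilities, for illustration only.
probs = [0.99, 0.95, 0.80, 0.60, 0.51]
flr = false_localization_rate(probs, 0.9)
```

Lowering the threshold accepts more sites at the cost of a higher estimated rate, which is the trade-off a threshold choice has to balance.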