INGEBI-CONICET Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor Torres", Vuelta de Obligado 2490, 1428, CABA, Argentina.
Department of Chemical Engineering, Stanford University, 443 Via Ortega, Stanford, CA, 94305, USA.
BMC Plant Biol. 2021 Dec 14;21(1):592. doi: 10.1186/s12870-021-03377-9.
Proteins are the workforce of the cell, and their phosphorylation status tailors specific responses efficiently. One of the main challenges of phosphoproteomic approaches is to deconvolute, from a list of phosphoproteins, the biological processes that specifically respond to an experimental query. Comparing the frequency distribution of GO (Gene Ontology) terms in a given phosphoproteome set with that observed in the genome reference set (GenRS) is the most widely used tool to infer biological significance. This comparison, however, assumes that the GO term distributions of the phosphoproteome and the genome are identical. This hypothesis has not been tested, owing to the lack of a comprehensive phosphoproteome database.
In this study, we test this hypothesis by constructing three phosphoproteome databases for Arabidopsis thaliana: one based on experimental data (ExpRS), another based on in silico prediction of phosphorylated proteins (PredRS), and a third that is the union of both (UnRS). Our results show that all three phosphoproteome reference sets are enriched by default in several GO terms compared to GenRS, indicating that the GO term distribution in the phosphoproteomes does not match that of the genome. Moreover, these differences overshadow the identification of GO terms that are specifically enriched in a particular condition. To overcome this limitation, we propose an additional comparison of the sample of interest with UnRS to uncover GO terms specifically enriched in a particular phosphoproteome experiment. Using this strategy, we found that mRNA splicing and cytoplasmic microtubule compounds are important processes specifically enriched in the phosphoproteome of dark-grown Arabidopsis seedlings.
This study provides a novel strategy to uncover specific GO terms in Arabidopsis phosphoproteome data, and the strategy could be applied to any other organism. We also highlight the importance of specific phosphorylation pathways that operate during the development of dark-grown Arabidopsis.
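The two-reference comparison described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric over-representation test, which is a common choice for GO enrichment but is not confirmed here as the test the authors used. The gene identifiers, set sizes, and function names (`hypergeom_sf`, `enrichment_p`) are hypothetical toy values chosen only to show how a term that looks enriched against the genome can stop looking enriched against a phosphoproteome reference set.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(k, 0), upper + 1)
    )
    return hits / total


def enrichment_p(sample, reference, term_genes):
    """One-sided p-value for over-representation of one GO term.

    Genes outside the reference set are ignored, so the same sample
    can be scored against different backgrounds.
    """
    reference = set(reference)
    sample = set(sample) & reference
    term_genes = set(term_genes) & reference
    k = len(sample & term_genes)
    return hypergeom_sf(k, len(reference), len(term_genes), len(sample))


# Hypothetical toy data (not from the study).
genome = {f"g{i}" for i in range(100)}              # stands in for GenRS
phospho_ref = {f"g{i}" for i in range(40)}          # stands in for UnRS
# A GO term carried by 20 genes, 16 of them phosphoproteins.
term = {f"g{i}" for i in range(16)} | {f"g{i}" for i in range(40, 44)}
# An experimental phosphoproteome of 10 proteins, 4 carrying the term.
sample = {f"g{i}" for i in range(4)} | {f"g{i}" for i in range(16, 22)}

p_genome = enrichment_p(sample, genome, term)
p_unrs = enrichment_p(sample, phospho_ref, term)
print(f"vs genome reference:          p = {p_genome:.3f}")
print(f"vs phosphoproteome reference: p = {p_unrs:.3f}")
```

In this toy case the term covers 20% of the genome but 40% of the phosphoproteome reference, so a sample with 4 of 10 annotated proteins appears over-represented against the genome yet matches the expectation against the phosphoproteome background. A real analysis would repeat the test for every GO term and correct for multiple testing.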