ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia.
Sciex, 2 Gilda Court, Mulgrave, VIC, Australia.
Nat Commun. 2020 Jul 30;11(1):3793. doi: 10.1038/s41467-020-17641-3.
Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data-independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with ~5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.