An MSstats workflow for detecting differentially abundant proteins in large-scale data-independent acquisition mass spectrometry experiments with FragPipe processing.

Affiliations

Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.

Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA.

Publication information

Nat Protoc. 2024 Oct;19(10):2915-2938. doi: 10.1038/s41596-024-01000-3. Epub 2024 May 20.

Abstract

Technological advances in mass spectrometry and proteomics have made it possible to perform larger-scale and more-complex experiments. The volume and complexity of the resulting data create major challenges for downstream analysis. In particular, next-generation data-independent acquisition (DIA) experiments enable wider proteome coverage than more traditional targeted approaches but require computational workflows that can manage much larger datasets and identify peptide sequences from complex and overlapping spectral features. Data-processing tools such as FragPipe, DIA-NN and Spectronaut have undergone substantial improvements to process spectral features in a reasonable time. Statistical analysis tools are needed to draw meaningful comparisons between experimental samples, but these tools were also originally designed with smaller datasets in mind. This protocol describes an updated version of MSstats that has been adapted to be compatible with large-scale DIA experiments. A very large DIA experiment, processed with FragPipe, is used as an example to demonstrate different MSstats workflows. The choice of workflow depends on the user's computational resources. For datasets that are too large to fit into a standard computer's memory, we demonstrate the use of MSstatsBig, a companion R package to MSstats. The protocol also highlights key decisions that have a major effect on both the results and the processing time of the analysis. The MSstats processing can be expected to take 1-3 h depending on the usage of MSstatsBig. The protocol can be run in the point-and-click graphical user interface MSstatsShiny or implemented with minimal coding expertise in R.
