Batch alignment via retention orders for preprocessing large-scale multi-batch LC-MS experiments.

Affiliation

Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Průmyslova 595, Vestec 252 50, Czech Republic.

Publication information

Bioinformatics. 2022 Aug 2;38(15):3759-3767. doi: 10.1093/bioinformatics/btac407.

Abstract

MOTIVATION

Meticulous selection of chromatographic peak detection parameters and algorithms is a crucial step in preprocessing liquid chromatography-mass spectrometry (LC-MS) data. However, as mass-to-charge ratio and retention time shifts are larger between batches than within batches, finding apt parameters for all samples of a large-scale multi-batch experiment with the aim of minimizing information loss becomes a challenging task. Preprocessing independent batches individually can curtail said problems but requires a method for aligning and combining them for further downstream analysis.

RESULTS

We present two methods for aligning and combining individually preprocessed batches in multi-batch LC-MS experiments. Our developed methods were tested on six sets of simulated and six sets of real datasets. Furthermore, by estimating the probabilities of peak insertion, deletion and swap between batches in authentic datasets, we demonstrate that retention order swaps are not rare in untargeted LC-MS data.
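To make the notion of a retention order swap concrete, here is a minimal Python sketch that counts feature pairs whose elution order differs between two batches. It is not the kmersAlignment or rtcorrectedAlignment algorithm, whose details the abstract does not give. The function name, the feature IDs and the retention times are all hypothetical, and the sketch assumes that features have already been matched across batches by a shared ID.

```python
from itertools import combinations


def count_order_swaps(rt_batch_a, rt_batch_b):
    """Count feature pairs whose elution order differs between two batches.

    Both arguments map a hypothetical feature ID to its retention time.
    Only features present in both batches are compared; features found in
    a single batch (insertions or deletions) are ignored here.
    """
    shared = sorted(set(rt_batch_a) & set(rt_batch_b))
    swaps = 0
    for f, g in combinations(shared, 2):
        # A pair is swapped when the sign of the retention time
        # difference flips between the two batches.
        diff_a = rt_batch_a[f] - rt_batch_a[g]
        diff_b = rt_batch_b[f] - rt_batch_b[g]
        if diff_a * diff_b < 0:
            swaps += 1
    return swaps, len(shared)


# Illustrative values only (retention times in minutes, invented for the example).
batch_a = {"F1": 1.20, "F2": 2.35, "F3": 2.50, "F4": 4.10}
batch_b = {"F1": 1.31, "F2": 2.62, "F3": 2.55, "F4": 4.25, "F5": 5.00}

swaps, n_shared = count_order_swaps(batch_a, batch_b)
print(swaps, n_shared)
```

In this toy input, F2 and F3 elute in opposite order in the two batches, and F5 appears only in the second batch, so it is left out of the comparison. The pairwise loop is quadratic in the number of shared features, which is fine for illustration but would need a faster inversion count for full untargeted feature lists.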

AVAILABILITY AND IMPLEMENTATION

kmersAlignment and rtcorrectedAlignment algorithms are made available as an R package with raw data at https://metabocombiner.img.cas.cz.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
