
Split Pool Ligation-based Single-cell Transcriptome sequencing (SPLiT-seq) data processing pipeline comparison.

Affiliations

Department of Cell Biology, Erasmus University Medical Center Rotterdam (Erasmus MC), Wytemaweg 80, Rotterdam, 3015CN, The Netherlands.

Center for Biomics, Erasmus University Medical Center Rotterdam (Erasmus MC), Rotterdam, The Netherlands.

Publication information

BMC Genomics. 2024 Apr 12;25(1):361. doi: 10.1186/s12864-024-10285-3.

Abstract

BACKGROUND

Single-cell sequencing techniques are revolutionizing every field of biology by providing the ability to measure the abundance of biological molecules at single-cell resolution. Although single-cell sequencing approaches have been developed for several molecular modalities, single-cell transcriptome sequencing is the most prevalent and widely applied technique. SPLiT-seq (split-pool ligation-based transcriptome sequencing) is one of these single-cell transcriptome techniques. It applies a unique combinatorial-barcoding approach in which cells are repeatedly split and pooled across multi-well plates containing barcodes. This approach required the development of dedicated computational tools to preprocess the data and extract the count matrices. Here we compare eight bioinformatic pipelines (alevin-fry splitp, LR-splitpipe, SCSit, splitpipe, splitpipeline, SPLiTseq-demultiplex, STARsolo and zUMIs) that have been developed to process SPLiT-seq data. We provide an overview of the tools, their computational performance, their functionality and their impact on downstream processing of the single-cell data, all of which vary greatly depending on the tool used.
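To make the combinatorial-barcoding idea concrete, the following is a minimal sketch of how the number of distinguishable barcode combinations grows with each split-pool round, and how the per-round well barcodes can be joined into one cell label. The three rounds of 96 wells, the function names and the example barcode strings are assumptions chosen for illustration; the abstract does not specify them.

```python
def barcode_space(wells_per_round):
    """Return the number of distinct barcode combinations across all rounds."""
    total = 1
    for wells in wells_per_round:
        total *= wells
    return total


def cell_label(round_barcodes):
    """Join the well barcodes a cell collected in each round into one label."""
    return "_".join(round_barcodes)


# Hypothetical plate layout: three split-pool rounds of 96 wells each.
ROUNDS = [96, 96, 96]
n_combinations = barcode_space(ROUNDS)

# Hypothetical barcode sequences, one per round.
example_label = cell_label(["AACGTGAT", "CATCAAGT", "ACAGCAGA"])
print(n_combinations, example_label)
```

A preprocessing pipeline has to recover exactly this kind of combined label from each read before it can build a per-cell count matrix.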

RESULTS

We show that STARsolo, splitpipe and alevin-fry splitp can all handle large amounts of data within a reasonable time. In contrast, the other five pipelines are slow when handling large datasets. With a smaller dataset, the cell barcode results are similar across pipelines, with the exception of SPLiTseq-demultiplex and splitpipeline. LR-splitpipe, which was originally designed for processing long-read sequencing data, is the slowest of all pipelines. Alevin-fry produced different downstream results that are difficult to interpret. STARsolo functions nearly identically to splitpipe, and the two produce highly similar results. However, STARsolo lacks a function to collapse random hexamer reads, so some additional coding is required.
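The "additional coding" for collapsing random hexamer reads can be sketched roughly as follows. This is an illustrative assumption, not the authors' code: it supposes that each reverse-transcription well carries two barcodes (one on the oligo-dT primer and one on the random hexamer primer), that the first element of the cell-barcode tuple is that reverse-transcription barcode, and that a lookup table `hexamer_to_oligodt` (a hypothetical name) pairs the two barcodes of each well. Counts from both barcodes of a well are then summed into a single cell.

```python
from collections import Counter


def collapse_hexamer_counts(counts, hexamer_to_oligodt):
    """Merge counts from random-hexamer barcodes into their oligo-dT partner.

    counts: dict mapping (cell_barcode, gene) -> count, where cell_barcode is
        a tuple whose first element is assumed to be the reverse-transcription
        well barcode.
    hexamer_to_oligodt: dict mapping each random-hexamer barcode to the
        oligo-dT barcode of the same well (hypothetical lookup table).
    """
    collapsed = Counter()
    for (cell, gene), n in counts.items():
        rt_barcode, *later_rounds = cell
        # Barcodes not in the table (already oligo-dT) are left unchanged.
        rt_barcode = hexamer_to_oligodt.get(rt_barcode, rt_barcode)
        collapsed[((rt_barcode, *later_rounds), gene)] += n
    return dict(collapsed)


# Toy input: "R1" and "T1" are the hexamer and oligo-dT barcodes of one well.
toy_counts = {
    (("R1", "B2", "B3"), "GeneA"): 2,
    (("T1", "B2", "B3"), "GeneA"): 3,
    (("T1", "B2", "B3"), "GeneB"): 1,
}
merged = collapse_hexamer_counts(toy_counts, {"R1": "T1"})
print(merged)
```

The sketch only sums counts per gene; whether UMIs should be deduplicated before or after merging the two barcodes is a separate design choice that it does not address.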

CONCLUSION

Our comprehensive comparative analysis helps users select the most suitable tool for efficient SPLiT-seq data processing and details the specific prerequisites of each pipeline. Of the available pipelines, we recommend splitpipe or STARsolo for SPLiT-seq data analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd6e/11010347/50f2df735ca2/12864_2024_10285_Fig1_HTML.jpg
