Omics Pipe: a community-based framework for reproducible multi-omics data analysis.

Author information

Fisch Kathleen M, Meißner Tobias, Gioia Louis, Ducom Jean-Christophe, Carland Tristan M, Loguercio Salvatore, Su Andrew I

Affiliations

Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA; The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA; and Department of Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA.

Publication information

Bioinformatics. 2015 Jun 1;31(11):1724-8. doi: 10.1093/bioinformatics/btv061. Epub 2015 Jan 30.

Abstract

MOTIVATION

Omics Pipe (http://sulab.scripps.edu/omicspipe) is a computational framework that automates multi-omics data analysis pipelines on high-performance compute clusters and in the cloud. It supports published best-practice pipelines for RNA-seq, miRNA-seq, Exome-seq, Whole-Genome sequencing and ChIP-seq analyses, as well as automatic processing of data from The Cancer Genome Atlas (TCGA). Omics Pipe provides researchers with a tool for reproducible, open-source and extensible next-generation sequencing analysis. The goal of Omics Pipe is to democratize next-generation sequencing analysis by dramatically increasing the accessibility and reproducibility of best-practice computational pipelines, which will enable researchers to generate biologically meaningful and interpretable results.

RESULTS

Using Omics Pipe, we analyzed 100 TCGA breast invasive carcinoma paired tumor-normal datasets based on the latest UCSC hg19 RefSeq annotation. Omics Pipe automatically downloaded and processed the desired TCGA samples on a high-throughput compute cluster to produce a results report for each sample. We aggregated the individual sample results and compared them to the analysis in the original publications. This comparison revealed high overlap between the analyses, as well as novel findings due to the use of updated annotations and methods.
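
The aggregation-and-comparison step described above can be illustrated with a minimal sketch. This is not Omics Pipe's own code. The sample IDs, placeholder gene names, recurrence threshold and function names are all hypothetical, chosen only to show how per-sample gene lists might be pooled and then compared with a previously published list.

```python
from collections import Counter


def aggregate_samples(per_sample_genes, min_samples=2):
    """Return the genes reported in at least `min_samples` samples.

    `per_sample_genes` maps a sample ID to the list of genes flagged in
    that sample's report (a hypothetical input format).
    """
    counts = Counter()
    for genes in per_sample_genes.values():
        counts.update(set(genes))  # count each gene at most once per sample
    return {gene for gene, n in counts.items() if n >= min_samples}


def compare_gene_sets(ours, published):
    """Summarize the overlap between our gene set and a published one."""
    shared = ours & published
    union = ours | published
    return {
        "shared": shared,              # found by both analyses
        "novel": ours - published,     # found only in the new analysis
        "missed": published - ours,    # found only in the published analysis
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical toy data; placeholder names, not real results.
per_sample = {
    "sample_A": ["GENE1", "GENE2", "GENE3"],
    "sample_B": ["GENE1", "GENE3", "GENE4"],
    "sample_C": ["GENE1", "GENE4", "GENE5"],
}
published = {"GENE1", "GENE3", "GENE6"}

recurrent = aggregate_samples(per_sample, min_samples=2)
report = compare_gene_sets(recurrent, published)
```

In this toy input, the "shared" set plays the role of the overlap with the original publications, and the "novel" set plays the role of findings that could arise from updated annotations and methods.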

AVAILABILITY AND IMPLEMENTATION

Source code for Omics Pipe is freely available on the web (https://bitbucket.org/sulab/omics_pipe). Omics Pipe is distributed as a standalone Python package for installation (https://pypi.python.org/pypi/omics_pipe) and as an Amazon Machine Image in Amazon Web Services Elastic Compute Cloud that contains all necessary third-party software dependencies and databases (https://pythonhosted.org/omics_pipe/AWS_installation.html).

Similar articles

2
svist4get: a simple visualization tool for genomic tracks from sequencing experiments.
BMC Bioinformatics. 2019 Mar 6;20(1):113. doi: 10.1186/s12859-019-2706-8.
3
NGS-pipe: a flexible, easily extendable and highly configurable framework for NGS analysis.
Bioinformatics. 2018 Jan 1;34(1):107-108. doi: 10.1093/bioinformatics/btx540.
4
HTSeq--a Python framework to work with high-throughput sequencing data.
Bioinformatics. 2015 Jan 15;31(2):166-9. doi: 10.1093/bioinformatics/btu638. Epub 2014 Sep 25.
5
SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision.
Bioinformatics. 2014 Sep 15;30(18):2652-3. doi: 10.1093/bioinformatics/btu343. Epub 2014 May 19.
7
TCGA Expedition: A Data Acquisition and Management System for TCGA Data.
PLoS One. 2016 Oct 27;11(10):e0165395. doi: 10.1371/journal.pone.0165395. eCollection 2016.

Cited by

2
Unlocking the future of complex human diseases prediction: multi-omics risk score breakthrough.
Front Bioinform. 2024 Dec 16;4:1510352. doi: 10.3389/fbinf.2024.1510352. eCollection 2024.
4
Role of Network Pharmacology in Prediction of Mechanism of Neuroprotective Compounds.
Methods Mol Biol. 2024;2761:159-179. doi: 10.1007/978-1-0716-3662-6_13.
5
OmicsSuite: a customized and pipelined suite for analysis and visualization of multi-omics big data.
Hortic Res. 2023 Sep 28;10(11):uhad195. doi: 10.1093/hr/uhad195. eCollection 2023 Nov.
6
MXP: Modular eXpandable framework for building bioinformatics Pipelines.
J Bioinform Syst Biol. 2023;6(3):178-182. doi: 10.26502/jbsb.5107058. Epub 2023 Aug 7.
7
iCOMIC: a graphical interface-driven bioinformatics pipeline for analyzing cancer omics data.
NAR Genom Bioinform. 2022 Jul 25;4(3):lqac053. doi: 10.1093/nargab/lqac053. eCollection 2022 Sep.
8
Network Pharmacology Approach for Medicinal Plants: Review and Assessment.
Pharmaceuticals (Basel). 2022 May 4;15(5):572. doi: 10.3390/ph15050572.
9
A guide for the diagnosis of rare and undiagnosed disease: beyond the exome.
Genome Med. 2022 Feb 28;14(1):23. doi: 10.1186/s13073-022-01026-w.
10
Application and Challenge of 3rd Generation Sequencing for Clinical Bacterial Studies.
Int J Mol Sci. 2022 Jan 26;23(3):1395. doi: 10.3390/ijms23031395.
