Fisch Kathleen M, Meißner Tobias, Gioia Louis, Ducom Jean-Christophe, Carland Tristan M, Loguercio Salvatore, Su Andrew I
Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA and Department of Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA.
Bioinformatics. 2015 Jun 1;31(11):1724-8. doi: 10.1093/bioinformatics/btv061. Epub 2015 Jan 30.
Omics Pipe (http://sulab.scripps.edu/omicspipe) is a computational framework that automates multi-omics data analysis pipelines on high-performance compute clusters and in the cloud. It supports published best-practice pipelines for RNA-seq, miRNA-seq, Exome-seq, whole-genome sequencing and ChIP-seq analyses, as well as automatic processing of data from The Cancer Genome Atlas (TCGA). Omics Pipe provides researchers with a tool for reproducible, open-source and extensible next-generation sequencing analysis. Its goal is to democratize next-generation sequencing analysis by making best-practice computational pipelines more accessible and reproducible, which will enable researchers to generate biologically meaningful and interpretable results.
Using Omics Pipe, we analyzed 100 TCGA breast invasive carcinoma paired tumor-normal datasets based on the latest UCSC hg19 RefSeq annotation. Omics Pipe automatically downloaded and processed the selected TCGA samples on a high-throughput compute cluster and produced a results report for each sample. We aggregated the individual sample results and compared them to the analyses in the original publications. This comparison revealed high overlap between the analyses, as well as novel findings attributable to the updated annotations and methods.
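The aggregation and comparison step described above can be illustrated with a minimal Python sketch. It is not taken from Omics Pipe: the function names, the `min_fraction` threshold, the Jaccard overlap measure and the placeholder gene identifiers are all assumptions introduced here for illustration, and the abstract does not state how the authors actually aggregated or compared results.

```python
from collections import Counter


def aggregate_samples(per_sample_genes, min_fraction=0.5):
    """Return genes reported in at least `min_fraction` of the samples.

    `per_sample_genes` maps a sample ID to the genes flagged in that
    sample's report (hypothetical input format, assumed for this sketch).
    """
    # Count each gene at most once per sample.
    counts = Counter(
        gene for genes in per_sample_genes.values() for gene in set(genes)
    )
    threshold = min_fraction * len(per_sample_genes)
    return {gene for gene, n in counts.items() if n >= threshold}


def compare_to_published(aggregated, published):
    """Summarize overlap between an aggregated gene set and a published one."""
    shared = aggregated & published
    union = aggregated | published
    return {
        "shared": shared,                  # found by both analyses
        "novel": aggregated - published,   # only in the new analysis
        "missed": published - aggregated,  # only in the published analysis
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real results.
    per_sample = {
        "sample_A": ["GENE1", "GENE2", "GENE3"],
        "sample_B": ["GENE1", "GENE3", "GENE4"],
        "sample_C": ["GENE1", "GENE2"],
    }
    published = {"GENE1", "GENE2", "GENE5"}
    aggregated = aggregate_samples(per_sample, min_fraction=0.5)
    print(compare_to_published(aggregated, published))
```

In this sketch, genes that appear only in the aggregated set correspond to the "novel findings" category, while the shared set and Jaccard index give one simple way to quantify "high overlap".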
Source code for Omics Pipe is freely available on the web (https://bitbucket.org/sulab/omics_pipe). Omics Pipe is distributed as a standalone Python package for installation (https://pypi.python.org/pypi/omics_pipe) and as an Amazon Machine Image in Amazon Web Services Elastic Compute Cloud that contains all necessary third-party software dependencies and databases (https://pythonhosted.org/omics_pipe/AWS_installation.html).