Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada.
Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada.
Nat Protoc. 2019 Feb;14(2):482-517. doi: 10.1038/s41596-018-0103-9.
Pathway enrichment analysis helps researchers gain mechanistic insight into gene lists generated from genome-scale (omics) experiments. This method identifies biological pathways that are enriched in a gene list more than would be expected by chance. We explain the procedures of pathway enrichment analysis and present a practical step-by-step guide to help interpret gene lists resulting from RNA-seq and genome-sequencing experiments. The protocol comprises three major steps: definition of a gene list from omics data, determination of statistically enriched pathways, and visualization and interpretation of the results. We describe how to use this protocol with published examples of differentially expressed genes and mutated cancer genes; however, the principles can be applied to diverse types of omics data. The protocol describes innovative visualization techniques, provides comprehensive background and troubleshooting guidelines, and uses freely available and frequently updated software, including g:Profiler, Gene Set Enrichment Analysis (GSEA), Cytoscape and EnrichmentMap. The complete protocol can be performed in ~4.5 h and is designed for use by biologists with no prior bioinformatics training.
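The idea of a pathway being enriched "more than would be expected by chance" can be illustrated with a one-sided hypergeometric (over-representation) test. The following is a minimal sketch under stated assumptions, not the implementation used by any of the tools named in the abstract: the function name `hypergeom_enrichment_p`, the parameter names, and the toy counts are all hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(background_size, pathway_size, list_size, overlap):
    """Return P(X >= overlap) for X ~ Hypergeometric.

    background_size: number of genes in the background set (assumed universe)
    pathway_size:    number of background genes annotated to the pathway
    list_size:       number of genes in the experimental gene list
    overlap:         number of list genes that are annotated to the pathway
    """
    max_overlap = min(pathway_size, list_size)
    # Number of ways to draw list_size genes from the background.
    total = comb(background_size, list_size)
    # Sum the probabilities of seeing `overlap` or more pathway genes.
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 background genes, 4 in the pathway,
# a gene list of 3 genes, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(p_value)
```

In a real analysis this test would be repeated for every pathway in a database, so the resulting p-values would need a multiple-testing correction before interpretation.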