School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA.
Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, China.
Sci Rep. 2018 Jul 18;8(1):10872. doi: 10.1038/s41598-018-28948-z.
The biological interpretation of gene lists that share a property of interest, such as up- or down-regulation in a particular experiment, is typically accomplished using gene ontology enrichment analysis tools. Given a list of genes, a gene ontology (GO) enrichment analysis may return hundreds of statistically significant GO results in a "flat" list, which can be challenging to summarize. It can also be difficult to keep pace with rapidly expanding biological knowledge, which often results in daily changes to any of the more than 47,000 GO terms that describe that knowledge. GOATOOLS, a Python-based library, makes it more efficient to stay current with the latest ontologies and annotations, to perform gene ontology enrichment analyses that identify over- and under-represented terms, and to organize results for greater clarity and easier interpretation using a novel GOATOOLS GO grouping method. We performed functional analyses on both stochastic simulation data and real data from a published RNA-seq study to compare the enrichment results from GOATOOLS with those of two other popular tools: DAVID and GOstats. GOATOOLS is freely available through GitHub: https://github.com/tanghaibao/goatools .
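To make concrete what "over- and under-represented terms" means, the following is a minimal, standard-library sketch of a per-term two-sided Fisher's exact test on toy data. It is not the GOATOOLS API or implementation. The function names (`hypergeom_pmf`, `fisher_two_sided`, `enrichment`), the gene names, and the term identifiers `TERM:A` and `TERM:B` are hypothetical and chosen only for illustration. Multiple-testing correction and propagation of annotations up the ontology, which a real analysis would need, are left out.

```python
from math import comb


def hypergeom_pmf(k, pop_size, pop_hits, study_size):
    """Probability of exactly k annotated genes in a study of study_size genes
    drawn without replacement from pop_size genes, pop_hits of them annotated."""
    return comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k) / comb(pop_size, study_size)


def fisher_two_sided(study_hits, study_size, pop_hits, pop_size):
    """Two-sided Fisher's exact test: sum the probabilities of all outcomes
    that are no more likely than the observed one."""
    lo = max(0, study_size - (pop_size - pop_hits))
    hi = min(study_size, pop_hits)
    p_obs = hypergeom_pmf(study_hits, pop_size, pop_hits, study_size)
    total = 0.0
    for k in range(lo, hi + 1):
        p_k = hypergeom_pmf(k, pop_size, pop_hits, study_size)
        if p_k <= p_obs * (1 + 1e-7):  # small tolerance for float ties
            total += p_k
    return min(total, 1.0)


def enrichment(study, population, term2genes):
    """Return {term: (direction, p_value)} for every term in term2genes.
    direction is "over" when the study fraction is at least the population fraction,
    otherwise "under"."""
    study = set(study) & set(population)
    results = {}
    for term, genes in term2genes.items():
        pop_hits = len(set(genes) & set(population))
        study_hits = len(set(genes) & study)
        p_value = fisher_two_sided(study_hits, len(study), pop_hits, len(population))
        study_frac = study_hits / len(study)
        pop_frac = pop_hits / len(population)
        direction = "over" if study_frac >= pop_frac else "under"
        results[term] = (direction, p_value)
    return results


# Toy data (hypothetical): 20 background genes, 5 study genes, 2 terms.
population = [f"g{i}" for i in range(20)]
study = ["g0", "g1", "g2", "g3", "g10"]
term2genes = {
    "TERM:A": [f"g{i}" for i in range(0, 5)],   # 5 of 20 background genes
    "TERM:B": [f"g{i}" for i in range(5, 20)],  # 15 of 20 background genes
}

results = enrichment(study, population, term2genes)
for term, (direction, p_value) in sorted(results.items()):
    print(f"{term}\t{direction}\t{p_value:.4g}")
```

In this toy case, four of the five study genes carry `TERM:A` against five of twenty in the background, so the term is reported as over-represented. `TERM:B` covers most of the background but only one study gene, so it is reported as under-represented. A flat list of such per-term results is the kind of output that the grouping step described in the abstract is meant to organize.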