PANTHER: Making genome-scale phylogenetics accessible to all.

Author information

Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA.

Publication information

Protein Sci. 2022 Jan;31(1):8-22. doi: 10.1002/pro.4218. Epub 2021 Nov 25.

Abstract

Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetics analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
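The "enrichment analysis" task mentioned above can be illustrated with a one-sided hypergeometric over-representation test. This is a minimal sketch of the general statistical idea, not PANTHER's own implementation; the function name and all numbers in the usage example are hypothetical.

```python
from math import comb


def overrepresentation_p(n_genome, n_annotated, n_list, n_hits):
    """Upper-tail hypergeometric p-value (hypothetical helper, not a PANTHER API).

    n_genome:    number of genes in the reference set
    n_annotated: reference genes carrying the annotation of interest
    n_list:      number of genes in the uploaded list
    n_hits:      genes in the uploaded list carrying the annotation

    Returns P(X >= n_hits) when n_list genes are drawn at random
    without replacement from the reference set.
    """
    max_hits = min(n_annotated, n_list)
    total = comb(n_genome, n_list)
    # Sum the probability mass for every outcome at least as extreme as n_hits.
    tail = sum(
        comb(n_annotated, k) * comb(n_genome - n_annotated, n_list - k)
        for k in range(n_hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up numbers: 20,000 reference genes, 400 with the annotation,
    # a list of 150 genes of which 12 carry the annotation.
    p = overrepresentation_p(20000, 400, 150, 12)
    print(f"over-representation p-value: {p:.3g}")
```

In practice such a test is repeated for every annotation term, so the resulting p-values would need a multiple-testing correction before interpretation.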
