Statistical approaches to detecting and analyzing tandem repeats in genomic sequences.

Affiliations

Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW) , Wädenswil , Switzerland.

Department of Biosystems Science and Engineering, ETH Zürich , Basel , Switzerland ; Department of Computer Science, ETH Zürich , Zürich , Switzerland.

Publication information

Front Bioeng Biotechnol. 2015 Mar 17;3:31. doi: 10.3389/fbioe.2015.00031. eCollection 2015.

Abstract

Tandem repeats (TRs) are frequently observed in genomes across all domains of life. Evidence suggests that some TRs are crucial for proteins with fundamental biological functions and can be associated with virulence, resistance, and infectious or neurodegenerative diseases. Genome-scale systematic studies of TRs have the potential to unveil core mechanisms governing TR evolution and the roles of TRs in shaping genomes. However, TR-related studies are often non-trivial because TR regions are heterogeneous and sometimes fast-evolving. In this review, we discuss these intricacies and their consequences. We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods explicitly rely on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The work discussed focuses on protein TRs, yet it is generally applicable to nucleic acid TRs, which share similar features.
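To make the notion of "adjacent repeat units" concrete, the following is a minimal sketch that scans a sequence for runs of exactly identical, back-to-back units. It is an illustration only and not the statistical or profile-based methods the review describes: real TR units diverge over time, so exact matching misses most evolutionarily related repeats. The function name `find_exact_tandem_repeats` and its parameters are hypothetical, chosen for this example.

```python
def find_exact_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=2):
    """Return (start, unit, copies) for runs of exactly repeated adjacent units.

    Hypothetical helper for illustration: exact matching only, no
    substitutions or indels, and no significance testing.
    """
    hits = []
    n = len(seq)
    for unit_len in range(min_unit, max_unit + 1):
        i = 0
        # Stop once fewer than min_copies full units fit in the remainder.
        while i + unit_len * min_copies <= n:
            unit = seq[i:i + unit_len]
            copies = 1
            # Extend the run while the next window equals the unit.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                hits.append((i, unit, copies))
                i += copies * unit_len  # jump past the whole run
            else:
                i += 1
    return hits


if __name__ == "__main__":
    # Toy nucleotide sequence containing three adjacent "GCA" units.
    print(find_exact_tandem_repeats("GGCAGCAGCAGTT", min_unit=3, max_unit=3, min_copies=3))
```

A sketch like this only reports candidate regions. Deciding whether imperfect units plausibly stem from a common ancestor is the role of the significance tests and sequence profiles named in the abstract.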
