Intron-exon structures of eukaryotic model organisms.

Author information

Deutsch M, Long M

Affiliation

Department of Ecology and Evolution, The University of Chicago, 1101 East 57th Street, Chicago, IL 60637, USA.

Publication information

Nucleic Acids Res. 1999 Aug 1;27(15):3219-28. doi: 10.1093/nar/27.15.3219.

Abstract

To investigate the distribution of intron-exon structures of eukaryotic genes, we have constructed a general exon database comprising all available intron-containing genes and exon databases from 10 eukaryotic model organisms: Homo sapiens, Mus musculus, Gallus gallus, Rattus norvegicus, Arabidopsis thaliana, Zea mays, Schizosaccharomyces pombe, Aspergillus, Caenorhabditis elegans and Drosophila. We purged redundant genes to avoid the possible bias brought about by redundancy in the databases. After discarding those questionable introns that do not contain correct splice sites, the final database contained 17 102 introns, 21 019 exons and 2903 independent or quasi-independent genes. On average, a eukaryotic gene contains 3.7 introns per kb protein coding region. The exon distribution peaks around 30-40 residues and most introns are 40-125 nt long. The variable intron-exon structures of the 10 model organisms reveal two interesting statistical phenomena, which cast light on some previous speculations. (i) Genome size seems to be correlated with total intron length per gene. For example, invertebrate introns are smaller than those of human genes, while yeast introns are shorter than invertebrate introns. However, this correlation is weak, suggesting that other factors besides genome size may also affect intron size. (ii) Introns smaller than 50 nt are significantly less frequent than longer introns, possibly resulting from a minimum intron size requirement for intron splicing.
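
The filtering and density steps described in the abstract can be illustrated with a minimal sketch. It assumes that a "correct splice site" means the canonical GT...AG boundary and that intron density is pooled over all genes; the abstract does not specify either point, and the `Gene` class and function names are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    cds_length_nt: int   # total protein-coding length in nucleotides
    introns: list[str]   # intron sequences, written 5' to 3'


def has_canonical_splice_sites(intron: str) -> bool:
    """Assumed criterion: the intron starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def introns_per_kb(genes: list[Gene]) -> float:
    """Retained introns per 1000 nt of coding sequence, pooled over genes."""
    retained = sum(
        sum(1 for intron in gene.introns if has_canonical_splice_sites(intron))
        for gene in genes
    )
    total_cds = sum(gene.cds_length_nt for gene in genes)
    return 1000 * retained / total_cds
```

Under these assumptions, introns lacking GT...AG boundaries are dropped before counting, so the density reflects only the retained introns, mirroring the order of operations in the abstract.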
