Christine A. Orengo, Janet M. Thornton
Department of Biochemistry and Molecular Biology, University College, London WC1E 6BT, United Kingdom.
Annu Rev Biochem. 2005;74:867-900. doi: 10.1146/annurev.biochem.74.082803.133029.
We can now assign about two thirds of the sequences from completed genomes to as few as 1400 domain families for which structures are known, which allows more ancient evolutionary relationships to be established. About 200 of these domain families are common to all kingdoms of life and account for nearly 50% of domain structure annotations in the genomes. Some of these domain families have been duplicated very extensively within a genome and combined with different domain partners, giving rise to different multidomain proteins. The ways in which these domain combinations evolve tend to be specific to the organism, so that fewer than 15% of the protein families found within a genome appear to be common to all kingdoms of life. Recent analyses of completed genomes, exploiting the structural data, have revealed the extent to which duplication of these domains and modification of their functions can expand the functional repertoire of the organism, contributing to increasing complexity.