Kanehisa Minoru, Sato Yoko, Kawashima Masayuki, Furumichi Miho, Tanabe Mao
Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
Healthcare Solutions Department, Fujitsu Kyushu Systems Ltd., Hakata-ku, Fukuoka 812-0007, Japan
Nucleic Acids Res. 2016 Jan 4;44(D1):D457-62. doi: 10.1093/nar/gkv1070. Epub 2015 Oct 17.
KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database. The KEGG pathway maps, BRITE hierarchies and KEGG modules are developed as networks of KO nodes, representing high-level functions of the cell and the organism. Currently, more than 4000 complete genomes are annotated with KOs in the KEGG GENES database, which can be used as a reference data set for KO assignment and subsequent reconstruction of KEGG pathways and other molecular networks. As an annotation resource, the following improvements have been made. First, each KO record is re-examined and associated with protein sequence data used in experiments of functional characterization. Second, the GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes. Third, new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database. As a resource for translational bioinformatics, various data sets are created for antimicrobial resistance and drug interaction networks.
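The abstract describes using KO assignments as the basis for reconstructing KEGG pathways. As a rough illustration only, the sketch below assumes that an annotation run yields tab-separated lines of gene identifier and KO identifier (genes without an assignment have a single column), and that a pathway can be approximated as a plain set of KO identifiers. The function names, the identifiers (`KX0001`, `mapX`, and so on) and the coverage fraction are all hypothetical placeholders, not KEGG data and not the method KEGG itself uses to map KOs onto pathway maps.

```python
def parse_ko_assignments(text):
    """Parse 'gene<TAB>KO' lines into a dict; unannotated genes are skipped.

    The two-column tab-separated layout is an assumption made for this sketch.
    """
    assignments = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) >= 2 and fields[1]:
            assignments[fields[0]] = fields[1]
    return assignments


def pathway_coverage(assignments, pathway_to_kos):
    """For each pathway, list the KOs found in the genome and the fraction covered.

    Treating a pathway as a flat set of KOs is a simplification for illustration.
    """
    found = set(assignments.values())
    coverage = {}
    for pathway, kos in pathway_to_kos.items():
        present = sorted(found & kos)
        coverage[pathway] = (present, len(present) / len(kos))
    return coverage


# Hypothetical placeholder data, not taken from KEGG.
example_output = "gene1\tKX0001\ngene2\ngene3\tKX0002\ngene4\tKX0001\n"
example_pathways = {
    "mapX": {"KX0001", "KX0002", "KX0003"},
    "mapY": {"KX0004"},
}

assignments = parse_ko_assignments(example_output)
coverage = pathway_coverage(assignments, example_pathways)
for pathway, (present, fraction) in sorted(coverage.items()):
    print(pathway, present, round(fraction, 2))
```

Several genes may share one KO (here `gene1` and `gene4`), which is why coverage is computed over the set of distinct KOs rather than over genes.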