Peng Jiajie, Zhang Xuanshuo, Hui Weiwei, Lu Junya, Li Qianqian, Liu Shuhui, Shang Xuequn
School of Computer Science, Northwestern Polytechnical University, Xi'an, China.
Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, China.
BMC Syst Biol. 2018 Mar 19;12(Suppl 2):18. doi: 10.1186/s12918-018-0539-0.
Gene Ontology (GO) is one of the most popular bioinformatics resources. In the past decade, Gene Ontology-based gene semantic similarity has been effectively used to model gene-to-gene interactions in multiple research areas. However, most existing semantic similarity approaches rely only on GO annotations and structure, or incorporate only local interactions in the co-functional network. Because the GO topology and gene annotations are incomplete, this may lead to inaccurate GO-based similarity scores.
We present NETSIM2, a new network-based method for measuring GO-based gene functional similarity. It captures the global structure of the co-functional network with a random walk with restart (RWR)-based method, and it selects only the significant term pairs to reduce noise. Evaluation tests on Enzyme Commission (EC) number-based gene groups of yeast and Arabidopsis show that NETSIM2 can enhance the accuracy of Gene Ontology-based gene functional similarity.
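The random walk with restart mentioned above can be illustrated with a minimal sketch. This is a generic RWR on a toy network, not the NETSIM2 implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, the convergence tolerance, and the four-gene path network are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Generic RWR sketch: steady-state visiting probabilities from one seed node.

    adj     -- symmetric adjacency matrix (list of lists) of a hypothetical
               co-functional network; entries are non-negative edge weights
    seed    -- index of the gene where the walk starts and restarts
    restart -- probability of jumping back to the seed at each step (assumed value)
    """
    n = len(adj)
    # Column-normalize so each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        # p_next = (1 - r) * W * p + r * p_start
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy network (assumption): four genes linked in a path 0 - 1 - 2 - 3.
toy_network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_network, seed=0)
```

In this toy case the scores fall off with distance from the seed gene, yet even the farthest gene receives a nonzero score. That is the sense in which an RWR reflects global rather than purely local network structure.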
Using NETSIM2 as an example, we found that the accuracy of semantic similarity measures can be significantly improved by effectively incorporating the global gene-to-gene interactions in the co-functional network, especially for species whose gene annotations in GO are far from complete.