Wei Pi-Jing, Jin Huai-Wan, Gao Zhen, Su Yansen, Zheng Chun-Hou
Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institute of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei 230601, Anhui, China.
School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei 230601, Anhui, China.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf232.
Reconstructing high-resolution gene regulatory networks (GRNs) from single-cell RNA sequencing data provides an opportunity to gain insight into disease pathogenesis. Many current GRN reconstruction methods are based on graph neural networks and achieve excellent inference performance by extracting network structure features. However, most of these methods fail to fully exploit the directional characteristics of the network, or ignore them altogether, when extracting structural features. To address this, a novel framework called GAEDGRN is proposed, based on a gravity-inspired graph autoencoder (GIGAE), to infer potential causal relationships between genes. GIGAE captures the complex directed network topology of a GRN. Additionally, because the latent vectors generated by the graph autoencoder are unevenly distributed, a random walk-based method is used to regularize the latent vectors learned by the encoder. Furthermore, considering that some genes in a GRN usually have a significant impact on biological functions, GAEDGRN includes a method for calculating gene importance scores and focuses on genes with high importance during GRN reconstruction. Experimental results across seven cell types and three GRN types show that GAEDGRN achieves high accuracy and strong robustness. Moreover, a case study on human embryonic stem cells demonstrates that GAEDGRN can help identify important genes.
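The abstract does not spell out how a gravity-inspired decoder scores directed edges, so the following is only a minimal sketch of the general idea and not GAEDGRN's actual implementation. It assumes the commonly described formulation in which the probability of an edge from node i to node j is sigmoid(m_j - lam * log ||z_i - z_j||^2), where z is a node embedding and m_j is a learned scalar "mass" of the target node. The function name `gravity_decoder`, the parameters `lam` and `eps`, and the toy values are all hypothetical.

```python
import math

def gravity_decoder(z, mass, lam=1.0, eps=1e-9):
    """Return an n x n matrix of directed-edge probabilities (illustrative sketch).

    z    : list of n embedding vectors (lists of floats)
    mass : list of n scalar "mass" values, one per node
    lam  : weight on the log-distance term (hypothetical default)
    eps  : small constant that keeps log() finite for identical embeddings
    """
    n = len(z)
    probs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are left at 0 in this sketch
            dist_sq = sum((a - b) ** 2 for a, b in zip(z[i], z[j]))
            # The score depends on the mass of the *target* node j,
            # which is what makes the decoder asymmetric.
            logit = mass[j] - lam * math.log(dist_sq + eps)
            probs[i][j] = 1.0 / (1.0 + math.exp(-logit))
    return probs

# Toy example: node 0 has a large mass, node 1 a small one.
z = [[0.0, 0.0], [1.0, 0.0]]
mass = [2.0, -2.0]
p = gravity_decoder(z, mass)
```

In this toy case the distance term is the same in both directions, so the difference between `p[1][0]` and `p[0][1]` comes entirely from the target masses: the edge pointing at the high-mass node receives the higher probability. That asymmetry is the property that lets such a decoder represent edge direction, which a plain inner-product decoder cannot do.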