Hong Danni, Lin Hongli, Liu Lifang, Shu Muya, Dai Jianwu, Lu Falong, Tong Mengsha, Huang Jialiang
State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China.
State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China.
Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac508.
Many enhancers exist as clusters in the genome and control cell identity and disease genes; however, the underlying mechanism remains largely unknown. Here, we introduce an algorithm, eNet, to build enhancer networks by integrating single-cell chromatin accessibility and gene expression profiles. The complexity of enhancer networks is assessed by two metrics: the number of enhancers and the frequency of predicted enhancer interactions (PEIs) based on chromatin co-accessibility. We apply the eNet algorithm to a human blood dataset and find that cell identity and disease genes tend to be regulated by complex enhancer networks. The network hub enhancers (enhancers with frequent PEIs) are the most functionally important. Compared with super-enhancers, enhancer networks show better performance in predicting cell identity and disease genes. eNet is robust and widely applicable across datasets from various human and mouse tissues. Thus, we propose a model of enhancer networks containing three modes: Simple, Multiple and Complex, which are distinguished by their complexity in regulating gene expression. Taken together, our work provides an unsupervised approach to simultaneously identify key cell identity and disease genes and explore the underlying regulatory relationships among enhancers in single cells.
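To make the two complexity metrics concrete, the following is a minimal Python sketch, not the eNet implementation. It assumes that the enhancers linked to one gene are already known, that co-accessibility is approximated by the Pearson correlation of per-cell accessibility, and that a PEI is any enhancer pair whose correlation exceeds a cutoff. The function names, the cutoffs, the definition of PEI frequency as edges per enhancer, and the rule that maps the metrics onto the Simple, Multiple and Complex modes are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-accessibility signal
    return cov / sqrt(var_x * var_y)


def enhancer_network(accessibility, cutoff=0.5):
    """Build a toy enhancer network for one gene.

    accessibility: dict mapping enhancer id -> per-cell accessibility values.
    cutoff: hypothetical correlation threshold above which a pair is a PEI.
    """
    # Every enhancer pair whose co-accessibility exceeds the cutoff is an edge.
    edges = [
        (a, b)
        for a, b in combinations(sorted(accessibility), 2)
        if pearson(accessibility[a], accessibility[b]) > cutoff
    ]
    # Degree = number of PEIs per enhancer; high-degree nodes are hub candidates.
    degree = {enhancer: 0 for enhancer in accessibility}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    size = len(accessibility)  # metric 1: number of enhancers
    # Metric 2 (assumed definition): PEI edges per enhancer.
    pei_frequency = len(edges) / size if size else 0.0
    return {"size": size, "edges": edges, "degree": degree,
            "pei_frequency": pei_frequency}


def classify_mode(size, pei_frequency, size_cutoff=3, freq_cutoff=1.0):
    """Map the two metrics onto a mode using hypothetical cutoffs."""
    if size < size_cutoff:
        return "Simple"
    if pei_frequency < freq_cutoff:
        return "Multiple"
    return "Complex"


# Toy data: four enhancers measured in six cells (values invented for the demo).
toy = {
    "e1": [1, 0, 1, 0, 1, 0],
    "e2": [1, 0, 1, 0, 1, 0],
    "e3": [1, 0, 1, 0, 1, 1],
    "e4": [0, 1, 0, 1, 0, 1],
}
net = enhancer_network(toy)
mode = classify_mode(net["size"], net["pei_frequency"])
```

In this toy case e1, e2 and e3 are mutually co-accessible while e4 is anti-correlated with the rest, so the network has four enhancers and three PEIs. Changing `cutoff`, `size_cutoff` or `freq_cutoff` changes which mode is assigned, which is why these values should be read as placeholders rather than as settings from the paper.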