Dai Wenfeng, Wang Yanhong, Yan Shuai, Yu Qingzhi, Cheng Xiang
School of Information Engineering, Jingdezhen Ceramics University, Jingdezhen, Jiangxi, 333403, China.
Sci Rep. 2025 Aug 19;15(1):30326. doi: 10.1038/s41598-025-16098-y.
Reliable prediction of drug-target interaction (DTI) is essential for accelerating drug discovery, yet it remains hindered by data imbalance, limited interpretability, and neglect of protein dynamics. Here, we present GHCDTI, a heterogeneous graph neural network framework designed to overcome these challenges through three synergistic innovations. First, cross-view contrastive learning with adaptive positive sampling improves generalization under extreme class imbalance (positive/negative ratio < 1:100). Second, heterogeneous data fusion integrates molecular graphs, protein structure graphs, and bioactivity data via cross-graph attention, enabling interpretable residue-level insights. Third, multi-scale wavelet feature extraction captures both conserved and dynamic structural features by decomposing protein conformations into frequency components. GHCDTI achieves state-of-the-art performance on benchmark datasets (AUC: 0.966 ± 0.016; AUPR: 0.888 ± 0.018) and processes 1,512 proteins and 708 drugs in under two minutes, highlighting its potential for scalable virtual screening and drug repositioning. These results demonstrate GHCDTI's ability to identify novel drug-target pairs effectively, providing a practical tool for accelerating drug discovery and improving biomedical knowledge integration.
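To make the first of the three components more concrete, the following is a minimal sketch of a cross-view contrastive objective in the InfoNCE style. It is not GHCDTI's implementation: the abstract does not specify the loss, the temperature, or how adaptive positive sampling selects positives, so the function names (`cosine`, `info_nce`), the temperature value, and the rule that a node's only positive is its own embedding in the other view are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over nodes embedded in two views.

    Assumption (hypothetical, not from the source): node i in view_a has
    exactly one positive, node i in view_b; every other node in view_b
    acts as a negative.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarity of node i to every node in the other view.
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / n


# Two toy nodes whose embeddings agree across views ...
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# ... versus the same nodes with the second view swapped.
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is lower when the two views agree on each node than when they are swapped, which is the behaviour a contrastive objective is meant to reward. Under heavy class imbalance, the choice of which pairs count as positives matters a great deal; that choice is what the abstract's adaptive positive sampling addresses, and it is deliberately left out of this sketch.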