Lim Sangsoo, Kim Youngkuk, Gu Jeonghyeon, Lee Sunho, Shin Wonseok, Kim Sun
Bioinformatics Institute, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea.
Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea.
iScience. 2022 Dec 26;26(1):105677. doi: 10.1016/j.isci.2022.105677. eCollection 2023 Jan 20.
Drug-induced liver injury (DILI) is the main cause of drug failure in clinical trials. Characterizing toxic compounds in terms of chemical structure is important because compounds can be metabolized to toxic substances in the liver. Traditional machine learning approaches have had limited success in predicting DILI, and emerging deep graph neural network (GNN) models are not yet powerful enough to predict it. In this study, we developed a completely different approach, supervised subgraph mining (SSM), a strategy that mines explicit subgraph features by iteratively updating individual graph transitions to maximize DILI fidelity. Our method outperformed previous methods, including state-of-the-art GNN tools, in classifying DILI on two different datasets: DILIst and the TDC benchmark. We also combined the subgraph features using SMARTS-based frequent structural pattern matching and associated them with the drugs' ATC codes.