Daood Nada J, Russo Daniel P, Chung Elena, Qin Xuebin, Zhu Hao
Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States.
Center for Biomedical Informatics and Genomics, Tulane University School of Medicine, New Orleans, Louisiana 70112, United States.
Environ Health (Wash). 2024 May 28;2(7):474-485. doi: 10.1021/envhealth.4c00026. eCollection 2024 Jul 19.
Computational modeling has emerged as a time-saving and cost-effective alternative to traditional animal testing for assessing the potential hazards of chemicals. However, few computational modeling studies of immunotoxicity have been reported, and few models are available for predicting toxicants, owing to the lack of training data and the complex mechanisms of immunotoxicity. In this study, we employed a data-driven quantitative structure-activity relationship (QSAR) modeling workflow to substantially expand the limited training data by revealing multiple targets involved in immunotoxicity. To this end, a probe data set of 6,341 chemicals was obtained from a high-throughput screening (HTS) assay testing for activation of the aryl hydrocarbon receptor (AhR) signaling pathway, a key event leading to immunotoxicity. Searching this probe data set against PubChem yielded 3,183 assays with testing results for varying proportions of these 6,341 compounds. Of these, 100 assays were selected for QSAR model development based on their correlations to AhR agonism. For each assay, twelve individual QSAR models were built using combinations of four machine-learning algorithms and three molecular fingerprints. Five-fold cross-validation of the resulting models showed good predictivity (average correct classification rate, CCR = 0.73). A total of 20 assays were further selected based on QSAR model performance, and their QSAR models showed good predictivity of potential immunotoxicants among external chemicals. This study provides a computational modeling strategy that can utilize large public toxicity data sets to model immunotoxicity and other toxicity endpoints that have limited training data and complicated toxicity mechanisms.
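As an illustration only, the model grid and the evaluation metric described above can be sketched in a few lines of Python. The abstract gives only the counts (four algorithms, three fingerprints), so the names below are hypothetical placeholders. The sketch also assumes the common definition of CCR as the mean of sensitivity and specificity, which the abstract does not spell out.

```python
from itertools import product

# Hypothetical placeholder names: the abstract states only that four
# machine-learning algorithms and three molecular fingerprints were combined.
ALGORITHMS = ["algorithm_1", "algorithm_2", "algorithm_3", "algorithm_4"]
FINGERPRINTS = ["fingerprint_1", "fingerprint_2", "fingerprint_3"]


def model_grid(algorithms, fingerprints):
    """Return every (algorithm, fingerprint) pairing, one per QSAR model."""
    return list(product(algorithms, fingerprints))


def ccr(y_true, y_pred):
    """Correct classification rate for binary labels (1 = active, 0 = inactive).

    Assumed definition: the mean of sensitivity and specificity.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("CCR needs at least one active and one inactive label")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    grid = model_grid(ALGORITHMS, FINGERPRINTS)
    print(len(grid), "models per assay")
    print("CCR on a toy example:", ccr([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because CCR weights actives and inactives equally, it is less sensitive than plain accuracy to the class imbalance typical of HTS assay data, where inactive compounds usually outnumber active ones.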