National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA.
Exp Biol Med (Maywood). 2023 Nov;248(21):1952-1973. doi: 10.1177/15353702231209421. Epub 2023 Dec 6.
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the most frequently modeled toxicity endpoints in predictive toxicology. Datasets are known to affect model performance, so the quality of the datasets used to develop machine learning and deep learning toxicity prediction models is vital to the performance of those models. Different toxicity assignments for the same chemicals have been observed among different datasets of the same type of toxicity, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.