Sahibzada Kashif Iqbal, Shahid Shumaila, Akhter Mohsina, Faisal Muhammad, Abd El Rahman Reham A, Imran Muhammad, Lv Yangyong, Wei Dongqing, Hu Yuansen
College of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China.
Department of Health Professional Technologies, Faculty of Allied Health Sciences, The University of Lahore, Lahore 54570, Pakistan.
Toxins (Basel). 2025 Apr 1;17(4):171. doi: 10.3390/toxins17040171.
Accurate prediction of enzymes with environmental detoxification functions is crucial, not only for a better understanding of bioremediation strategies but also for alleviating environmental pollution. In the present study, we introduce a novel machine learning model that classifies enzymes by their toxin degradation ability. Two datasets were used: enzymes that catalyze toxin degradation served as the positive dataset, and non-toxin-degrading enzymes served as the negative dataset. Multiple classifiers were compared to find the best model, and a Random Forest (RF) classifier was selected for its strong performance. To enhance accuracy, we combined RF with a Deep Neural Network (DNN), forming an ensemble model that integrates both techniques. This combination achieved 95% precision, surpassing the individual models. Our ensemble model not only delivers high prediction accuracy but also reliably differentiates toxin-degrading enzymes from non-degrading ones. This study highlights the power of combining classical machine learning with deep learning to improve prediction. Our model represents a significant step in enzyme classification and serves as a valuable resource for environmental biotechnology, food nutrition, and health applications.
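The abstract does not state how the RF and DNN outputs are combined, so the following is only a minimal sketch under the assumption of soft voting, that is, a weighted average of the two models' predicted probabilities for the toxin-degrading class. The function names, the weight `w_rf`, and the 0.5 threshold are hypothetical and not taken from the paper. The sketch also shows how precision, the metric reported above, is computed from true and predicted labels.

```python
from typing import List, Sequence


def soft_vote(p_rf: Sequence[float], p_dnn: Sequence[float],
              w_rf: float = 0.5) -> List[float]:
    """Weighted average of two models' positive-class probabilities.

    Assumption: the ensemble uses soft voting; the paper may use a
    different combination rule.
    """
    if len(p_rf) != len(p_dnn):
        raise ValueError("probability lists must have the same length")
    if not 0.0 <= w_rf <= 1.0:
        raise ValueError("w_rf must lie in [0, 1]")
    return [w_rf * a + (1.0 - w_rf) * b for a, b in zip(p_rf, p_dnn)]


def predict_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label 1 (toxin-degrading) when the probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision = TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

In practice `p_rf` and `p_dnn` would come from the trained classifiers' probability outputs on the same held-out enzymes, and `w_rf` could be tuned on a validation split.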