Department of Information Technology, Abbottabad University of Science and Technology, Havelian 22500, Abbottabad, Pakistan.
Department of Computer Science and I.T, Network Systems and Security Research Group, University of Malakand, Chakdara 18800, Khyber Pakhtunkhwa, Pakistan.
Comput Intell Neurosci. 2023 Mar 14;2023:9266889. doi: 10.1155/2023/9266889. eCollection 2023.
To diagnose an illness, doctors typically conduct a physical exam and review the patient's medical history, then order diagnostic tests and procedures to determine the underlying cause of the symptoms. Chronic kidney disease (CKD) is a leading cause of death, with a rapidly increasing number of patients and 1.7 million deaths annually. While various diagnostic methods are available, this study uses machine learning because of its high accuracy. We build the proposed model with a hybrid technique and use Pearson correlation for feature selection. In the first step, the best models are selected on the basis of a critical analysis of the literature. In the second step, these models are combined into the proposed hybrid model: Gaussian Naïve Bayes, gradient boosting, and decision tree classifiers serve as base classifiers, and a random forest classifier serves as the meta-classifier. The objective of this study is to evaluate machine learning classification techniques and identify the best-performing classifier in terms of accuracy. The hybrid approach addresses overfitting and achieves the highest accuracy. The study also highlights some of the challenges that limit performance. We critically review the existing machine learning classification techniques, evaluate them in terms of accuracy, and present a comprehensive analytical evaluation of the related work in tabular form. In the implementation, we use the top four models and build the hybrid model on the UCI chronic kidney disease dataset for prediction. Gradient boosting achieves around 99% accuracy, random forest 98%, and the decision tree classifier 96%, while the proposed hybrid model performs best, reaching 100% accuracy on the same dataset.
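The hybrid design described in the abstract (Pearson correlation for feature selection, three base classifiers, and a random forest meta-classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the hybrid corresponds to stacking as provided by scikit-learn's `StackingClassifier`, and it uses a synthetic dataset from `make_classification` as a stand-in for the UCI chronic kidney disease data. The helper `select_by_pearson`, the 0.1 correlation threshold, the train/test split, and all hyperparameters are illustrative assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def select_by_pearson(X: pd.DataFrame, y: pd.Series, threshold: float = 0.1) -> list:
    """Hypothetical helper: keep features whose absolute Pearson
    correlation with the label exceeds `threshold`."""
    corr = X.corrwith(y).abs()  # Pearson is the pandas default method
    selected = corr[corr > threshold].index.tolist()
    # Fall back to all features if the threshold filters everything out.
    return selected if selected else X.columns.tolist()


# Synthetic stand-in data (assumption): 400 rows, 24 numeric features.
X_arr, y_arr = make_classification(
    n_samples=400, n_features=24, n_informative=8, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])
y = pd.Series(y_arr, name="label")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Select features on the training split only, to avoid leaking test labels.
selected = select_by_pearson(X_train, y_train, threshold=0.1)

# Base classifiers feed their cross-validated predictions to the meta-classifier.
hybrid = StackingClassifier(
    estimators=[
        ("gnb", GaussianNB()),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    cv=5,
)
hybrid.fit(X_train[selected], y_train)
hybrid_accuracy = hybrid.score(X_test[selected], y_test)
print(f"selected {len(selected)} features, held-out accuracy = {hybrid_accuracy:.3f}")
```

Because the meta-classifier is trained on out-of-fold predictions of the base classifiers (`cv=5`), the stacked model does not simply memorize the base models' training-set outputs. The accuracy printed here reflects the synthetic data only and says nothing about the figures reported in the abstract.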
Some of the main machine learning algorithms used to predict the occurrence of CKD are Naïve Bayes, decision tree, K-nearest neighbor, random forest, support vector machine, LDA, gradient boosting (GB), and neural networks. In this study, we apply GB, Gaussian Naïve Bayes, and decision tree, along with random forest, to the same set of features and compare their accuracy scores.
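Comparing classifiers on the same set of features can be sketched as follows. This is an illustrative example under stated assumptions, not the study's code: the data are synthetic (`make_classification`), and the split ratio, random seeds, and default hyperparameters are arbitrary choices. Only the four classifier families are taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CKD feature matrix (assumption).
X, y = make_classification(
    n_samples=400, n_features=24, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Every model sees exactly the same training and test features.
models = {
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "gaussian_naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Report the models from highest to lowest held-out accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22s} {acc:.3f}")
```

A single train/test split gives a noisy estimate on a dataset of a few hundred rows; repeating the comparison with cross-validation (for example `sklearn.model_selection.cross_val_score`) would give a more stable ranking.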