Machine Learning Algorithm-Based Prediction of Diabetes Among Female Population Using PIMA Dataset.

Authors

Ahmed Afshan, Khan Jalaluddin, Arsalan Mohd, Ahmed Kahksha, Shahat Abdelaaty A, Alhalmi Abdulsalam, Naaz Sameena

Affiliations

Microbial & Pharmaceutical Biotechnology Laboratory, Department of Pharmacognosy & Phytochemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, Delhi 110062, India.

Department of Computer Science and Engineering, St. Andrews Institute of Technology & Management (SAITM), Gurugram 122506, India.

Publication information

Healthcare (Basel). 2024 Dec 29;13(1):37. doi: 10.3390/healthcare13010037.

Abstract

Background: Diabetes is a metabolic disorder characterized by elevated blood sugar levels. Early detection can help individuals manage the disorder and delay its progression. Machine learning (ML) methods play an important role in forecasting the progression and diagnosis of various medical conditions with improved accuracy. Although they cannot replace physicians in predicting and diagnosing disease, they can be of great help in identifying hidden patterns in disease results and outcomes.

Methods: In this research, we retrieved the PIMA dataset from the Kaggle repository. The dataset was then processed, and principal component analysis (PCA), a heatmap, and scatter plots were applied for exploratory data analysis (EDA) to visualize the relationships between the features in the dataset. Four ML algorithms, Random Forest (RF), Decision Tree (DT), Naïve Bayes (NB), and Logistic Regression (LR), were implemented on Rattle using Python to predict diabetes among the female population.

Results: RF outperformed the other models (DT, NB, and LR), with an accuracy of 80%, a precision of 82%, an error rate of 20%, and a sensitivity of 88%.

Conclusions: Diabetes is a common problem worldwide. ML-based prediction models can help predict diabetes earlier, before the condition worsens.
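The four metrics reported in the abstract (accuracy, precision, error rate, and sensitivity) are all derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The names `ConfusionCounts`, `confusion_counts`, and `summarize` are hypothetical, the labels are made-up toy values rather than PIMA data, and treating "diabetic" (label 1) as the positive class is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # actual 1, predicted 1
    fp: int  # actual 0, predicted 1
    fn: int  # actual 1, predicted 0
    tn: int  # actual 0, predicted 0


def confusion_counts(y_true, y_pred):
    """Tally a binary confusion matrix; label 1 is assumed to mean 'diabetic'."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def summarize(c):
    """Compute the four metrics named in the abstract from the counts."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "error_rate": _ratio(c.fp + c.fn, total),  # complement of accuracy
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # also called recall
    }


# Toy labels for illustration only (hypothetical, not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = summarize(confusion_counts(y_true, y_pred))
```

Because error rate is the complement of accuracy, the reported 80% accuracy and 20% error rate carry the same information; precision and sensitivity add detail on how the errors split between false positives and false negatives.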

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eee0/11719687/2a656f9fb25a/healthcare-13-00037-g001.jpg
