Khokhar Pir Bakhsh, Gravino Carmine, Palomba Fabio
Department of Informatics, University of Salerno, Via Giovanni Paolo II, 132, Fisciano, 84084 Salerno, Italy.
Artif Intell Med. 2025 Jun;164:103132. doi: 10.1016/j.artmed.2025.103132. Epub 2025 Apr 15.
Diabetes mellitus (DM), a prevalent metabolic disorder, has significant global health implications. Machine learning (ML) has substantially improved the ability to predict diabetes early and to manage it, offering new avenues to mitigate its impact. This systematic review examines 53 articles on ML applications for diabetes prediction, focusing on datasets, algorithms, training methods, and evaluation metrics. The datasets covered include the Singapore National Diabetic Retinopathy Screening Program, REPLACE-BG, the National Health and Nutrition Examination Survey (NHANES), and the Pima Indians Diabetes Database (PIDD); the review highlights their distinguishing features and their challenges, such as class imbalance. It assesses the performance of ML algorithms, including Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Logistic Regression, and XGBoost, in predicting diabetes outcomes across multiple datasets. It also explores explainable AI (XAI) methods such as Grad-CAM, SHAP, and LIME, which improve the transparency and clinical interpretability of AI models for assessing diabetes risk and detecting diabetic retinopathy. Techniques such as cross-validation, data augmentation, and feature selection are discussed in terms of their influence on model generalizability and robustness. Evaluation approaches are also presented, including k-fold cross-validation, external validation, and performance indicators such as accuracy, area under the curve (AUC), sensitivity, and specificity. The findings highlight the usefulness of ML in addressing the challenges of diabetes prediction, the value of drawing on diverse data types, and the need for models that are both explainable and clinically relevant.
This study has significant implications for healthcare professionals, policymakers, technology developers, patients, and researchers, and it advocates interdisciplinary collaboration and attention to ethical considerations when implementing ML-based diabetes prediction models. By consolidating existing knowledge, this systematic literature review (SLR) outlines future research directions aimed at improving diagnostic accuracy, patient care, and healthcare efficiency through advanced ML applications. It contributes to ongoing efforts to use artificial intelligence for better prediction of diabetes, with the ultimate aim of reducing the global burden of this widespread disease.
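The performance indicators named in the abstract (accuracy, area under the curve, sensitivity, and specificity) can be illustrated with a minimal sketch. It is not taken from the review or from any of the 53 studies: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions chosen for illustration, and the AUC is computed by the pairwise-ranking definition rather than by any particular library. The toy data is deliberately imbalanced (3 positives out of 10) to show why accuracy alone can be misleading on datasets with class imbalance.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = diabetic, 0 = non-diabetic."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true: List[int], y_score: List[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")  # AUC is undefined when one class is absent
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: List[int], y_score: List[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and AUC from predicted scores.

    The threshold of 0.5 is an illustrative default, not a clinical recommendation.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),  # true negative rate
        "auc": roc_auc(y_true, y_score),
    }


# Hypothetical, imbalanced toy data: 3 positive cases out of 10.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.8, 0.4, 0.1, 0.9]

metrics = classification_metrics(toy_labels, toy_scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the model misses one of the three positive cases, so sensitivity is noticeably lower than accuracy. Gaps of this kind are one reason to report sensitivity and specificity alongside accuracy when classes are imbalanced.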