Artificial Intelligence & Data Analytics Lab, CCIS, Prince Sultan University, Riyadh 11586, Saudi Arabia.
School of Systems and Technology, University of Management and Technology, Lahore, Pakistan.
Technol Health Care. 2024;32(6):3847-3870. doi: 10.3233/THC-230313.
Pneumonia is a dangerous disease that kills millions of children and elderly patients worldwide every year. Pneumonia is typically detected from chest X-rays by expert radiologists. Chest X-ray imaging is relatively inexpensive and is the modality most often used to diagnose pneumonia. However, chest X-ray-based diagnosis depends on expert radiologists and is time-consuming and laborious. Moreover, COVID-19 and pneumonia have similar symptoms, which leads to false positives. Machine learning-based solutions have been proposed for the automatic prediction of pneumonia from chest X-rays; however, such approaches lack robustness and high accuracy due to data imbalance and generalization errors. This study focuses on improving the performance of machine learning models by addressing data imbalance through data augmentation. In contrast to traditional machine learning models, which require hand-crafted features, this study uses transfer learning for automatic feature extraction with Xception and VGG-16, and uses the extracted features to train classifiers such as support vector machine, logistic regression, K-nearest neighbor, stochastic gradient descent, extra tree classifier, and gradient boosting machine. Experiments cover both hand-crafted features and transfer learning-based feature extraction for pneumonia detection. A performance comparison using Xception and VGG-16 features suggests that transfer learning-based features tend to outperform hand-crafted features, and that an accuracy of 99.23% can be obtained for pneumonia detection using chest X-rays.
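The abstract does not say which augmentations were applied or how the classes were balanced. As a rough illustration only, the idea of addressing class imbalance through augmentation could be sketched as below. The transforms (a horizontal flip and a one-pixel shift), the function names, and the toy 2x2 "images" are all assumptions made for this sketch, not details taken from the study.

```python
import random
from collections import Counter


def hflip(img):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in img]


def shift_right(img, pixels=1, fill=0):
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in img]


# Assumed transforms; the study's actual augmentation set is not given.
AUGMENTATIONS = [hflip, shift_right]


def balance_by_augmentation(images, labels, seed=0):
    """Add augmented copies of under-represented classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_images, out_labels = list(images), list(labels)
    for cls, n in counts.items():
        # Only original images of this class are used as augmentation sources.
        pool = [img for img, y in zip(images, labels) if y == cls]
        for _ in range(target - n):
            transform = rng.choice(AUGMENTATIONS)
            out_images.append(transform(rng.choice(pool)))
            out_labels.append(cls)
    return out_images, out_labels


# Toy data: three samples of one class, one of the other (labels are hypothetical).
images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 0], [1, 2]], [[3, 3], [4, 4]]]
labels = ["pneumonia", "pneumonia", "pneumonia", "normal"]
bal_images, bal_labels = balance_by_augmentation(images, labels)
```

In practice, balancing of this kind is usually applied to the training split only, so that augmented copies of an image do not leak into the evaluation data.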