Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Karnataka, India.
Haematology and Clinical Pathology Lab, Kasturba Medical College, Manipal Academy of Higher Education, Karnataka, India.
Technol Health Care. 2024;32(4):2431-2444. doi: 10.3233/THC-231207.
Anaemia is a common blood disorder worldwide. It can be characterised by either an insufficient red blood cell (RBC) count or an insufficient oxygen-carrying capacity. The disorder affects quality of life. If anaemia is detected at an early stage, appropriate care can be taken to prevent further harm.
This study proposes a machine learning approach to identify anaemia from clinical markers, with the aim of supporting clinical practice.
The models are built on a dataset of 364 samples and 12 blood test attributes. The developed algorithm is expected to provide clinicians with decision support based on blood markers. Each model is trained and then validated using several performance metrics.
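As an illustration of the validation step, the sketch below computes four common classification metrics from true and predicted labels. It is a minimal example under stated assumptions, not the study's evaluation code: the abstract reports accuracy but does not list the other metrics used, so precision, recall and F1 are assumed here, and the function name `binary_metrics` and the label vectors are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as the positive class
    (assumed here to mean "anaemic").
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels for eight held-out samples (1 = anaemic, 0 = not anaemic).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting recall alongside accuracy is a common choice for screening tasks, because a missed positive case and a false alarm usually carry different clinical costs.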
The accuracies obtained by random forest, K-nearest neighbour, support vector machine, Naive Bayes, XGBoost, and CatBoost are 97%, 98%, 95%, 95%, 98%, and 97%, respectively. Four explainers, SHapley Additive exPlanations (SHAP), QLattice, Eli5, and local interpretable model-agnostic explanations (LIME), are explored for interpreting the model predictions.
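To make the idea behind SHAP concrete, the sketch below computes exact Shapley values by enumerating every feature coalition for a small toy scoring function. It illustrates the underlying attribution principle only. It does not use the SHAP library, which relies on faster approximations, and it is not the study's model. The function `toy_risk`, the feature names, and all numeric values are hypothetical; the abstract does not name the 12 attributes. Features outside a coalition are replaced by a baseline value, which is one common convention and is assumed here.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one instance by full coalition enumeration.

    Features outside a coalition take their baseline value (an assumed
    convention). The cost grows as 2**n, so this suits only a few features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Evaluate the model with coalition features taken from the instance
        # and all remaining features taken from the baseline.
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical risk score with one interaction term; not a fitted model."""
    return (-0.30 * x["haemoglobin"]
            - 0.02 * x["mcv"]
            + 0.01 * x["haemoglobin"] * x["rbc_count"])


# Illustrative values only; they are not taken from the study's dataset.
instance = {"haemoglobin": 9.5, "mcv": 72.0, "rbc_count": 3.6}
baseline = {"haemoglobin": 13.5, "mcv": 88.0, "rbc_count": 4.7}

phi = shapley_values(toy_risk, instance, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.4f}")
```

The contributions sum to the difference between the score for the instance and the score for the baseline, which is the property that lets a prediction be decomposed feature by feature.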
The study provides insight into the potential of machine learning algorithms for classifying anaemia and may help in the development of automated, accurate diagnostic tools.