Institute of Electronics, Information Engineering and Telecommunications, National Research Council of Italy, Milan.
School of Health Care and Social Work, Seinäjoki University of Applied Sciences, Finland.
Am J Audiol. 2022 Sep 21;31(3S):961-979. doi: 10.1044/2022_AJA-21-00194. Epub 2022 Jul 25.
The aim of this study was to analyze the performance of multivariate machine learning (ML) models applied to a speech-in-noise hearing screening test and investigate the contribution of the measured features toward hearing loss detection using explainability techniques.
Seven different ML techniques, including transparent (i.e., decision tree and logistic regression) and opaque (e.g., random forest) models, were trained and evaluated on a data set including 215 tested ears (99 with hearing loss of mild degree or higher and 116 with no hearing loss). Post hoc explainability techniques were applied to highlight the role of each feature in predicting hearing loss.
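As an illustration of what a post hoc explainability technique can look like, the sketch below implements permutation importance in plain Python. The abstract does not name the specific technique the authors used, so this is one common option shown under that assumption, not a reconstruction of the study's method. The toy classifier, the two feature columns (age and total test time, named after features mentioned in the abstract), and all data values are hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn.

    A larger drop means the model relies more heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: it looks only at age (column 0).
def toy_model(row):
    return 1 if row[0] >= 60 else 0


# Hypothetical rows of [age in years, total test time in seconds].
X = [[45, 130], [72, 128], [50, 140], [68, 135], [39, 125], [80, 138]]
y = [0, 1, 0, 1, 0, 1]  # 1 = hearing loss, 0 = no hearing loss

importances = permutation_importance(toy_model, X, y)
print(dict(zip(["age", "total_test_time"], importances)))
```

Because the toy model ignores the second column, shuffling it never changes a prediction and its importance is exactly zero, while shuffling age degrades accuracy. With a real model such as a random forest, the same procedure would be applied to held-out data.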
Random forest (accuracy = .85, sensitivity = .86, specificity = .85, precision = .84) performed, on average, better than decision tree (accuracy = .82, sensitivity = .84, specificity = .80, precision = .79). Support vector machine, logistic regression, and gradient boosting performed similarly to random forest. According to post hoc explainability analysis of the random forest models, the features most relevant to predicting hearing loss were age, number and percentage of correct responses, and average reaction time, whereas total test time was the least relevant.
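For readers less familiar with the four reported metrics, the following minimal sketch shows how each is derived from the counts of a binary confusion matrix, with hearing loss as the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute standard screening metrics from confusion-matrix counts.

    tp: ears with hearing loss correctly flagged
    fn: ears with hearing loss that were missed
    tn: ears without hearing loss correctly passed
    fp: ears without hearing loss incorrectly flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = screening_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity depend only on the true class of each ear, whereas precision also depends on how many ears in the sample have hearing loss, which is worth keeping in mind when comparing results across data sets with different prevalence.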
This study demonstrates that a multivariate approach can help detect hearing loss with satisfactory performance. Further research on a larger sample, using more complex ML algorithms and explainability techniques, is needed to fully investigate the role of input features (including additional features such as risk factors and individual responses to low-/high-frequency stimuli) in predicting hearing loss.