Department of Neurosurgery, Itoigawa General Hospital, Niigata, Japan.
Department of Neurology, Saitama Neuropsychiatric Institute, Saitama, Japan.
Cephalalgia. 2023 May;43(5):3331024231156925. doi: 10.1177/03331024231156925.
Misdiagnosis of headache disorders is a serious problem. We therefore developed an artificial intelligence-based headache diagnosis model using a large questionnaire database from a specialized headache hospital.
Phase 1: We developed an artificial intelligence model based on a retrospective investigation of 4000 patients (2800 in the training dataset and 1200 in the test dataset) diagnosed by headache specialists. Phase 2: The model's efficacy and accuracy were validated. Five non-headache specialists first diagnosed headaches in 50 patients and then re-diagnosed the same patients with the aid of artificial intelligence. The diagnosis made by headache specialists served as the ground truth. We evaluated the diagnostic performance of non-specialists, and their concordance rates with headache specialists, with and without artificial intelligence.
Phase 1: On the test dataset, the model's macro-average accuracy, sensitivity (recall), specificity, precision, and F-value were 76.25%, 56.26%, 92.16%, 61.24%, and 56.88%, respectively. Phase 2: Without artificial intelligence, the five non-specialists diagnosed headaches with an overall accuracy of 46% and a kappa of 0.212 against the ground truth. With artificial intelligence, these values improved significantly to 83.20% and 0.678, respectively. The other diagnostic indexes also improved.
Artificial intelligence improved the diagnostic performance of non-specialists. Because the model is limited by its reliance on data from a single center and by its low diagnostic accuracy for secondary headaches, further data collection and validation are needed.
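For readers unfamiliar with the indexes reported above, the following is a minimal sketch of how macro-averaged sensitivity, specificity, precision, and F-value, together with Cohen's kappa, can be computed from a multiclass confusion matrix. It is not the authors' code. The function names, the one-vs-rest treatment of each class, and the toy 3-class matrix are assumptions made for illustration; the abstract does not state how the macro-average accuracy was defined, so the sketch reports plain overall accuracy instead.

```python
from typing import Dict, List

Matrix = List[List[int]]  # cm[i][j] = cases with true class i predicted as class j


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def macro_metrics(cm: Matrix) -> Dict[str, float]:
    """Overall accuracy plus macro-averaged one-vs-rest metrics (assumed definitions)."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    sens, spec, prec, f_val = [], [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, true class otherwise
        tn = total - tp - fn - fp
        s = _safe_div(tp, tp + fn)
        p = _safe_div(tp, tp + fp)
        sens.append(s)
        spec.append(_safe_div(tn, tn + fp))
        prec.append(p)
        f_val.append(_safe_div(2 * p * s, p + s))
    return {
        "accuracy": _safe_div(sum(cm[i][i] for i in range(k)), total),
        "sensitivity": sum(sens) / k,
        "specificity": sum(spec) / k,
        "precision": sum(prec) / k,
        "f_value": sum(f_val) / k,
    }


def cohen_kappa(cm: Matrix) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    observed = _safe_div(sum(cm[i][i] for i in range(k)), total)
    expected = _safe_div(
        sum(sum(cm[i]) * sum(cm[r][i] for r in range(k)) for i in range(k)),
        total * total,
    )
    return _safe_div(observed - expected, 1 - expected)


# Hypothetical 3-class example (NOT data from the study).
toy_cm = [[8, 1, 1], [2, 6, 2], [0, 2, 8]]
toy_metrics = macro_metrics(toy_cm)
toy_kappa = cohen_kappa(toy_cm)
print({name: round(value, 3) for name, value in toy_metrics.items()})
print("kappa:", round(toy_kappa, 3))
```

In the Phase 2 setting, the rows of such a matrix would hold the headache specialists' diagnoses and the columns the non-specialists' diagnoses, with one matrix for each condition (with and without artificial intelligence).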