Laboratory of Respiratory Diseases and Thoracic Surgery, Department of Chronic Diseases Metabolism and Ageing, KU Leuven, Leuven, Belgium.
Clinical Department of Respiratory Diseases, University Hospitals Leuven, Leuven, Belgium.
Eur Respir J. 2023 May 18;61(5). doi: 10.1183/13993003.01720-2022. Print 2023 May.
Few studies have investigated the collaborative potential between artificial intelligence (AI) and pulmonologists for diagnosing pulmonary disease. We hypothesised that collaboration between a pulmonologist and AI with explanations (explainable AI (XAI)) is superior to a pulmonologist working without support in the diagnostic interpretation of pulmonary function tests (PFTs).
The study was conducted in two phases: a monocentre study (phase 1) and a multicentre intervention study (phase 2). Each phase utilised two different sets of 24 PFT reports from patients with a clinically validated gold standard diagnosis. Each PFT was interpreted without (control) and with XAI's suggestions (intervention). Pulmonologists provided a differential diagnosis consisting of a preferential diagnosis and, optionally, up to three additional diagnoses. The primary end-point compared the accuracy of preferential and additional diagnoses between control and intervention. Secondary end-points were the number of diagnoses in the differential diagnosis, diagnostic confidence and inter-rater agreement. We also analysed how XAI influenced pulmonologists' decisions.
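As an illustration only, the two accuracy measures in the primary end-point could be computed as in the sketch below. It assumes that a preferential diagnosis counts as correct when it equals the gold standard, and that a differential diagnosis counts as correct when the gold standard appears anywhere in the list (preferential plus up to three additional diagnoses). That scoring rule, the function names and the example labels are hypothetical and are not taken from the study.

```python
from typing import Sequence

MAX_ADDITIONAL = 3  # study design: up to three additional diagnoses


def preferential_accuracy(differentials: Sequence[Sequence[str]],
                          gold: Sequence[str]) -> float:
    """Fraction of reports whose first-listed (preferential) diagnosis
    equals the gold standard. Assumes every list has at least one entry."""
    hits = sum(1 for d, g in zip(differentials, gold) if d[0] == g)
    return hits / len(gold)


def differential_accuracy(differentials: Sequence[Sequence[str]],
                          gold: Sequence[str]) -> float:
    """Fraction of reports whose gold standard appears anywhere in the
    differential (assumed scoring rule, not confirmed by the source)."""
    hits = sum(1 for d, g in zip(differentials, gold)
               if g in d[:1 + MAX_ADDITIONAL])
    return hits / len(gold)


# Hypothetical readings of four reports by one reader (made-up labels).
gold = ["COPD", "asthma", "ILD", "normal"]
control = [["asthma", "COPD"], ["asthma"], ["COPD"], ["normal"]]
intervention = [["COPD"], ["asthma"], ["COPD", "ILD"], ["normal"]]

for name, readings in (("control", control), ("intervention", intervention)):
    print(name,
          preferential_accuracy(readings, gold),
          differential_accuracy(readings, gold))
```

In the study, such per-reader accuracies would be averaged over pulmonologists and compared between control and intervention; the statistical test used is not specified in this abstract.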
In phase 1 (n=16 pulmonologists), mean preferential and differential diagnostic accuracy significantly increased by 10.4% and 9.4%, respectively, between control and intervention (p<0.001). Improvements were somewhat lower but highly significant (p<0.0001) in phase 2 (5.4% and 8.7%, respectively; n=62 pulmonologists). In both phases, the number of diagnoses in the differential diagnosis did not decrease, but diagnostic confidence and inter-rater agreement significantly increased during intervention. Pulmonologists updated their decisions with XAI's feedback and consistently improved on their baseline performance when the AI provided correct predictions.
A collaboration between a pulmonologist and XAI is better at interpreting PFTs than either individual pulmonologists reading without XAI support or XAI alone.