University Institute for Population Health, King's College London, London, United Kingdom.
Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan.
J Med Internet Res. 2020 Dec 30;22(12):e22422. doi: 10.2196/22422.
Performing systematic reviews is a time-consuming and resource-intensive process.
We investigated whether a machine learning system could perform systematic reviews more efficiently.
We assessed all systematic reviews and meta-analyses of interventional randomized controlled trials cited in recent clinical guidelines from the American Diabetes Association, American College of Cardiology, American Heart Association (2 guidelines), and American Stroke Association. After reproducing the primary screening data set according to the published search strategy of each review, we extracted correct articles (those actually reviewed) and incorrect articles (those not reviewed) from the data set. These 2 sets of articles were used to train a neural network-based artificial intelligence engine (Concept Encoder, Fronteo Inc). The primary endpoint was work saved over sampling at 95% recall (WSS@95%).
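To make the primary endpoint concrete, the sketch below computes WSS at a given recall level from a ranked list of screening labels. It assumes the commonly used definition, WSS@R = (N - k)/N - (1 - R), where N is the number of candidate articles and k is the number that must be screened, in ranked order, to find a fraction R of the correct articles. The abstract does not state the exact formula the study used, so the definition, the function name `wss_at_recall`, and the toy data are all illustrative assumptions.

```python
import math


def wss_at_recall(ranked_labels, recall=0.95):
    """Work saved over sampling at the given recall level.

    ranked_labels: list of 1 (correct article) or 0 (incorrect article),
    ordered from the highest to the lowest model score.
    Assumed definition: WSS@R = (N - k) / N - (1 - R), where k is the number
    of articles screened before the target recall is reached.
    """
    n = len(ranked_labels)
    total_correct = sum(ranked_labels)
    if n == 0 or total_correct == 0:
        raise ValueError("need at least one article and one correct article")

    # Number of correct articles that must be found to reach the target recall.
    needed = math.ceil(recall * total_correct)

    found = 0
    for screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            return (n - screened) / n - (1.0 - recall)

    # Unreachable for valid 0/1 labels, kept as a guard.
    raise RuntimeError("target recall was never reached")


# Hypothetical toy ranking: 10 candidates, 2 correct, both ranked first.
toy_ranking = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(wss_at_recall(toy_ranking))
```

Under this definition, a ranking no better than random ordering gives a WSS near 0, and higher values mean that a larger share of the candidate articles can be left unscreened while still reaching the target recall.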
Among 145 candidate reviews of randomized controlled trials, 8 reviews fulfilled the inclusion criteria. For these 8 reviews, the machine learning system significantly reduced the literature screening workload, by at least 6-fold compared with manual screening, based on WSS@95%. When machine learning was initiated using 2 correct articles that were randomly selected by a researcher, a 10-fold reduction in workload compared with manual screening was achieved based on the WSS@95% value, with high sensitivity for eligible studies. The area under the receiver operating characteristic curve increased dramatically every time the algorithm learned a correct article.
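For readers less familiar with the second metric, the following sketch computes the area under the receiver operating characteristic curve from model scores and screening labels. It uses the standard pairwise interpretation: the probability that a randomly chosen correct article scores higher than a randomly chosen incorrect one, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model relevance scores, one per article.
    labels: 1 for a correct article, 0 for an incorrect article.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one correct and one incorrect article")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # correct article ranked above incorrect one
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four articles, two of them correct.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
print(roc_auc(example_scores, example_labels))  # 0.75
```

The pairwise loop is quadratic in the number of articles, which is fine for illustration. A rank-based implementation would be preferable for large screening sets.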
Concept Encoder achieved a 10-fold reduction of the screening workload for systematic review after learning from 2 randomly selected studies on the target topic. However, few meta-analyses of randomized controlled trials were included. Concept Encoder could facilitate the acquisition of evidence for clinical guidelines.