Rathbone John, Hoffmann Tammy, Glasziou Paul
Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia.
Syst Rev. 2015 Jun 15;4:80. doi: 10.1186/s13643-015-0067-6.
Citation screening is time-consuming and inefficient. We sought to evaluate the performance of Abstrackr, a semi-automated online tool for predictive title and abstract screening.
Four systematic reviews (aHUS, dietary fibre, ECHO, rituximab) were used to evaluate Abstrackr. Citations from electronic searches of biomedical databases were imported into Abstrackr, and titles and abstracts were screened and included or excluded according to the entry criteria. Screening continued until Abstrackr predicted and classified the remaining unscreened citations as relevant or irrelevant. These predictions were checked for accuracy against the original review decisions. Sensitivity analyses were performed to assess the effect of including case reports in the aHUS dataset during screening and the effect of using a larger, imbalanced dataset for the ECHO review. The performance of Abstrackr was calculated from the number of relevant studies missed, the workload saving, the false negative rate, and the precision with which the algorithm predicted relevant studies for inclusion (i.e. for further full-text inspection).
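The performance measures named above can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions here are assumptions: precision and false negative rate follow their standard definitions, and workload saving is taken to be the share of all citations that the tool predicted as irrelevant and that therefore needed no manual screening. The class and function names, and the counts in the example, are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical tally for one review after predictions are checked."""
    manually_screened: int    # citations screened by hand before prediction
    predicted_relevant: int   # unscreened citations predicted relevant
    predicted_irrelevant: int # unscreened citations predicted irrelevant
    true_positives: int       # predicted relevant and truly relevant
    false_negatives: int      # predicted irrelevant but truly relevant


def precision(c: ScreeningCounts) -> float:
    """Share of predicted-relevant citations that were truly relevant."""
    if c.predicted_relevant == 0:
        return 0.0
    return c.true_positives / c.predicted_relevant


def false_negative_rate(c: ScreeningCounts) -> float:
    """Share of truly relevant unscreened citations that were missed."""
    relevant = c.true_positives + c.false_negatives
    if relevant == 0:
        return 0.0
    return c.false_negatives / relevant


def workload_saving(c: ScreeningCounts) -> float:
    """Assumed definition: predicted-irrelevant citations over all citations."""
    total = c.manually_screened + c.predicted_relevant + c.predicted_irrelevant
    if total == 0:
        return 0.0
    return c.predicted_irrelevant / total


# Illustrative counts only; these are not figures from the four reviews.
example = ScreeningCounts(
    manually_screened=300,
    predicted_relevant=100,
    predicted_irrelevant=600,
    true_positives=20,
    false_negatives=1,
)

print(f"precision: {precision(example):.1%}")
print(f"false negative rate: {false_negative_rate(example):.1%}")
print(f"workload saving: {workload_saving(example):.1%}")
```

Under these assumed definitions, a review can show a high workload saving while still missing a relevant citation, which is why the number of missed studies is tracked separately from the rates.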
Of the unscreened citations, Abstrackr's prediction algorithm correctly identified all relevant citations for the rituximab and dietary fibre reviews. However, one relevant citation in each of the aHUS and ECHO reviews was incorrectly predicted as not relevant. The workload saving achieved with Abstrackr varied with the complexity and size of the reviews (9% rituximab, 40% dietary fibre, 67% aHUS, and 57% ECHO). The proportion of citations predicted as relevant, and therefore warranting further full-text inspection (i.e. the precision of the prediction), ranged from 16% (aHUS) to 45% (rituximab) and was affected by the complexity of the reviews. The false negative rate ranged from 2.4 to 21.7%. The sensitivity analysis on the aHUS dataset increased the precision from 16 to 25% and increased the workload saving by 10%, but also increased the number of relevant studies missed. The sensitivity analysis with the larger ECHO dataset increased the workload saving (80%) but reduced the precision (6.8%) and increased the number of missed citations.
Semi-automated title and abstract screening with Abstrackr has the potential to save time and reduce research waste.