Fuller-Tyszkiewicz Matthew, Jones Allan, Vasa Rajesh, Macdonald Jacqui A, Deane Camille, Samuel Delyth, Evans-Whipp Tracy, Olsson Craig A
School of Psychology, Faculty of Health, Deakin University, Geelong, Australia.
SEED Centre for Lifespan Research, Deakin University, 221 Burwood Highway, Burwood, Melbourne, VIC, Australia.
Clin Child Fam Psychol Rev. 2025 Apr 18. doi: 10.1007/s10567-025-00519-5.
Systematic and meta-analytic reviews provide gold-standard evidence but are static and quickly become outdated. Here we provide performance data on a new software platform, LitQuest, that uses artificial intelligence technologies to (1) accelerate screening of titles and abstracts from library literature searches, and (2) enable living systematic reviews by maintaining a saved AI algorithm for updated searches. Performance testing was based on LitQuest data from seven systematic reviews. LitQuest efficiency was estimated as the proportion (%) of the total yield of an initial literature search (titles/abstracts) that required human screening before reaching the built-in stop threshold. LitQuest algorithm performance was measured as work saved over sampling (WSS) at a given recall. LitQuest accuracy was estimated as the proportion of incorrectly classified papers in the rejected pool, as determined by two independent human raters. On average, around 36% of the total yield of a literature search required human screening before the stop point was reached; this proportion ranged from 22% to 53%, depending on the complexity of language structure across papers included in specific reviews. Accuracy was 99% at an interrater reliability of 95%, and 0% of titles/abstracts in the rejected pool were incorrectly assigned. Findings suggest that LitQuest can be a cost-effective and time-efficient solution for supporting living systematic reviews, particularly in rapidly developing areas of science. Further development of LitQuest is planned, including facilitated full-text data extraction and community-of-practice access to living systematic review findings.
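For readers unfamiliar with the WSS metric, the sketch below shows how work saved over sampling at a given recall level is conventionally computed in the screening-automation literature. This is a minimal illustration only: the function name, the 10,000-record yield, and the 95% recall level are illustrative assumptions, not LitQuest's internals or its reported evaluation parameters.

    # Minimal sketch of "work saved over sampling" (WSS), the standard
    # screening-performance metric referenced above. Names and example
    # figures are illustrative assumptions, not LitQuest internals.

    def wss_at_recall(total_yield: int, screened: int, recall: float) -> float:
        """WSS@R = (records left unscreened / total yield) - (1 - R).

        The fraction of a search yield a reviewer avoids screening,
        over and above what screening in random order would save at
        the same recall level R.
        """
        return (total_yield - screened) / total_yield - (1.0 - recall)

    # Hypothetical review: 36% of a 10,000-record yield screened before
    # the stop threshold, evaluated at an assumed 95% recall.
    print(f"WSS@95%: {wss_at_recall(10_000, 3_600, 0.95):.2%}")  # -> 59.00%

Under these assumed figures, stopping after screening 36% of the yield saves 59 percentage points of screening work relative to random-order screening at 95% recall.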