Department of Biomedical Data Science, Stanford, California, USA.
Stanford Children's Health, Palo Alto, California, USA.
J Am Med Inform Assoc. 2023 Aug 18;30(9):1532-1542. doi: 10.1093/jamia/ocad114.
Healthcare institutions are establishing frameworks to govern and promote the implementation of accurate, actionable, and reliable machine learning models that integrate with clinical workflow. Such governance frameworks require an accompanying technical framework to deploy models in a resource-efficient, safe, and high-quality manner. Here we present DEPLOYR, a technical framework for enabling real-time deployment and monitoring of researcher-created models into a widely used electronic medical record system.
We discuss core functionality and design decisions, including mechanisms to trigger inference based on actions within electronic medical record software, modules that collect real-time data to make inferences, mechanisms that close the loop by displaying inferences back to end-users within their workflow, monitoring modules that track the performance of deployed models over time, silent deployment capabilities, and mechanisms to prospectively evaluate a deployed model's impact.
We demonstrate the use of DEPLOYR by silently deploying and prospectively evaluating 12 machine learning models trained using electronic medical record data that predict laboratory diagnostic results, triggered by clinician button-clicks in Stanford Health Care's electronic medical record.
Our study highlights both the need for and the feasibility of such silent deployment, because prospectively measured performance varies from retrospective estimates. When possible, we recommend using performance measures estimated prospectively during silent trials to make final go decisions for model deployment.
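As a rough illustration of this recommendation, the sketch below compares a retrospective performance estimate with one measured during a silent trial and bases the go decision on the prospective value only. It is not DEPLOYR code: the function names (`auroc`, `go_decision`), the choice of AUROC as the metric, the 0.80 threshold, and the toy data are all assumptions made for the example.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def go_decision(prospective_auroc: float, min_auroc: float) -> bool:
    """Hypothetical rule: deploy only if silent-trial AUROC meets the bar."""
    return prospective_auroc >= min_auroc


# Toy data (invented for illustration, not results from the study).
retro_labels = [0, 0, 1, 1]
retro_scores = [0.1, 0.2, 0.8, 0.9]
silent_labels = [0, 0, 1, 1]
silent_scores = [0.1, 0.6, 0.4, 0.9]

retro_auroc = auroc(retro_labels, retro_scores)
silent_auroc = auroc(silent_labels, silent_scores)

MIN_AUROC = 0.80  # assumed threshold; a real one would be set by governance
print(f"retrospective AUROC: {retro_auroc:.2f}")
print(f"silent-trial AUROC:  {silent_auroc:.2f}")
print(f"go: {go_decision(silent_auroc, MIN_AUROC)}")
```

In this toy case the retrospective estimate would clear the assumed threshold while the silent-trial estimate would not, which is the kind of gap a silent deployment is meant to surface before a model reaches end-users.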
Machine learning applications in healthcare are extensively researched, but successful translations to the bedside are rare. By describing DEPLOYR, we aim to inform machine learning deployment best practices and help bridge the model implementation gap.