Department of Cancer Biology, Wake Forest University School of Medicine, Winston Salem, NC.
Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA.
JCO Clin Cancer Inform. 2021 Jan;5:1-11. doi: 10.1200/CCI.20.00060.
Building well-performing machine learning (ML) models in health care has always been challenging because of data-sharing concerns, yet ML approaches often require larger training samples than a single institution can provide. This paper explores several federated learning implementations, applying them both in a simulated environment and in an actual deployment using electronic health record data from two academic medical centers on a Microsoft Azure Cloud Databricks platform.
Using two separate cloud tenants, ML models were created, trained, and exchanged between institutions via a GitHub repository. Federated learning processes were applied to both artificial neural network (ANN) and logistic regression (LR) models on horizontal data sets that varied in size and availability. Incremental and cyclic federated learning models were tested in both simulated and real environments.
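As a rough illustration of the cyclic process described above, the sketch below passes a single neural network between two locally held data sets, retraining it at each stop. The model class, hyperparameters, and the train_cyclic helper are illustrative assumptions rather than the authors' actual implementation, and the GitHub exchange step is represented only by a comment.

```python
# Illustrative sketch of cyclic federated learning between two institutions
# (assumed names and settings; not the authors' code). Model weights, never
# raw data, move between sites; in the paper this exchange happens through
# a shared GitHub repository.
from sklearn.neural_network import MLPClassifier

def train_cyclic(local_datasets, n_cycles=5):
    """Pass one ANN around the participating institutions for several cycles."""
    model = MLPClassifier(hidden_layer_sizes=(16,), warm_start=True,
                          max_iter=50, random_state=0)
    for _ in range(n_cycles):
        for X, y in local_datasets:      # each (X, y) never leaves its institution
            model.fit(X, y)              # warm_start=True continues from current weights
            # here the updated weights would be committed and pulled via the shared repo
    return model
```

Under the same assumptions, the incremental variant would correspond to a single pass (n_cycles=1), with each institution training the model further before handing it on.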
The cyclically trained ANN showed a 3% increase in performance, a significant improvement across most attempts (P < .05). Single-weight neural network models showed improvement in some cases, whereas LR models showed little improvement after the federated learning processes. The specific process that improved performance differed with the ML model and with how federated learning was implemented. Moreover, we confirmed that the order of the institutions during training influenced the overall performance gain.
Unlike previous studies, our work demonstrates the implementation and effectiveness of federated learning processes beyond simulation. In addition, we identified several federated learning models that achieved statistically significant performance improvements. More work is needed to achieve effective federated learning in biomedicine while preserving the security and privacy of the data.