Nolte Daniel, Bazgir Omid, Ghosh Souparno, Pal Ranadip
Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX 79409, USA.
Genentech, South San Francisco, CA 94080, USA.
Bioinform Adv. 2023 Mar 22;3(1):vbad036. doi: 10.1093/bioadv/vbad036. eCollection 2023.
Predictive learning from medical data poses additional challenges because of concerns over the privacy and security of personal data. Federated learning, which is designed to preserve a high level of privacy, is emerging as an attractive way to generate cross-silo predictions in medical scenarios. However, the impact of severe population-level heterogeneity on federated learners is not well explored. In this article, we propose a methodology to detect the presence of population heterogeneity in federated settings, and we propose a solution to handle such heterogeneity by developing a federated version of Deep Regression Forests. Additionally, we demonstrate that the recently conceptualized REpresentation of Features as Images with NEighborhood Dependencies (REFINED) CNN framework can be combined with the proposed Federated Deep Regression Forests to improve performance over existing approaches.
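For readers unfamiliar with the cross-silo setting, the following is a minimal sketch of generic sample-size-weighted parameter averaging (FedAvg-style), the usual baseline aggregation step in federated learning. It is not the authors' Federated Deep Regression Forests method, which the abstract does not specify in detail. The function name `federated_average`, the `Weights` type and the toy values are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client parameters, weighting each client by its sample count.

    Only model parameters and sample counts reach the server; the raw
    records stay inside each silo.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's parameters.
    first_weights, _ = client_updates[0]
    averaged = {name: [0.0] * len(vals) for name, vals in first_weights.items()}

    for weights, n in client_updates:
        share = n / total  # this client's fraction of all samples
        for name, vals in weights.items():
            for i, v in enumerate(vals):
                averaged[name][i] += share * v
    return averaged


# Toy example: two silos holding 1 and 3 samples respectively.
updates = [({"w": [1.0, 2.0]}, 1), ({"w": [3.0, 4.0]}, 3)]
global_weights = federated_average(updates)
```

Because the average is weighted by sample count, a large silo dominates the global model. This is one reason strong population-level heterogeneity across silos can degrade a federated learner.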
The Python source code for reproducing the main results is available on GitHub: https://github.com/DanielNolte/FederatedDeepRegressionForests.
Supplementary data are available at Bioinformatics Advances online.