Que Jialan, Jiang Xiaoqian, Ohno-Machado Lucila
University of California, La Jolla, CA, USA.
AMIA Annu Symp Proc. 2012;2012:1350-9. Epub 2012 Nov 3.
A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate its parameters from a centralized repository of data. In biomedicine, however, patient data are sometimes stored in the local repositories of the institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers who want to learn from distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as the model learned from the combined data. We also provide a free web service (http://privacy.ucsd.edu:8080/ppsvm/) that lets multiple participants collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner.
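The claim that a model built from integrated intermediary results can equal the model built from pooled data can be illustrated with a minimal sketch. It is not the DPP-SVM protocol itself, which the abstract does not specify. The sketch assumes, purely for illustration, that each site holds different feature columns for the same patients, that a linear kernel is used, and that each site sends only its local Gram matrix to the server. The names `local_gram`, `combine`, `site_a`, and `site_b` are hypothetical. Because the SVM dual problem depends on the features only through the Gram matrix, a server that recovers the pooled Gram matrix (and has the labels) could in principle train the same model as on combined data.

```python
# Illustrative sketch only: assumes vertically partitioned data and a linear kernel.
# None of these names come from the DPP-SVM paper or its web service.

def local_gram(rows):
    """Linear-kernel Gram matrix computed at one site from its own columns."""
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]

def combine(grams):
    """Server step: element-wise sum of the per-site Gram matrices."""
    n = len(grams[0])
    return [[sum(g[i][j] for g in grams) for j in range(n)] for i in range(n)]

# Three patients; site A holds two features, site B holds one (toy values).
site_a = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
site_b = [[0.5], [2.0], [1.0]]

# What the server obtains without ever seeing raw feature values.
global_gram = combine([local_gram(site_a), local_gram(site_b)])

# What a centralized repository would compute from the pooled columns.
pooled = [a + b for a, b in zip(site_a, site_b)]
centralized_gram = local_gram(pooled)

print(global_gram == centralized_gram)
```

Whether a Gram matrix counts as "privacy-insensitive" in a given setting is a separate question that this sketch does not address.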