MolCFL: A personalized and privacy-preserving drug discovery framework based on generative clustered federated learning.

Affiliations

Inner Mongolia University, College of Computer Science, Hohhot, 010000, China.

Publication information

J Biomed Inform. 2024 Sep;157:104712. doi: 10.1016/j.jbi.2024.104712. Epub 2024 Aug 23.

Abstract

In the current era of rapidly developing large models, the traditional drug development process is undergoing a profound transformation. The vast demand for data and the heavy consumption of computational resources are making independent drug discovery increasingly difficult. Integrating federated learning into drug discovery offers a solution that both protects privacy and shares computational power. However, differences in the data held by pharmaceutical institutions and the diversity of drug design objectives exacerbate data heterogeneity, so the single consensus model of traditional federated learning cannot meet the personalized needs of all parties. In this study, we introduce and evaluate a drug discovery framework, MolCFL, built on a generative adversarial network (GAN) that uses a multi-layer perceptron (MLP) as the generator and a graph convolutional network (GCN) as the discriminator. By learning the graph structure of molecules, it generates new molecules in a highly personalized manner, and it optimizes the learning process through clustered federated learning, which groups compound data with high similarity. MolCFL not only enhances the model's ability to protect privacy but also significantly improves the efficiency and personalization of molecular design, and it outperforms traditional models on non-independent and identically distributed (non-IID) data. Experimental results on two benchmark datasets show that the generated molecules achieve over 90% Uniqueness and close to 100% Novelty. MolCFL improves the quality and efficiency of drug molecule design and, through its highly customized clustered federated learning environment, promotes collaboration and specialization in the drug discovery process while ensuring data privacy. These features make MolCFL a powerful tool for addressing the challenges of modern drug research and development.
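
The abstract reports Uniqueness and Novelty scores without defining them. As a rough illustration only, the sketch below assumes the definitions commonly used in molecular generation benchmarks: Uniqueness as the fraction of generated molecules that are distinct, and Novelty as the fraction of distinct generated molecules that do not appear in the training set. The function names and toy data are hypothetical, the strings stand in for SMILES that are assumed to be already canonicalized and valid, and the paper's own evaluation may differ.

```python
from typing import Iterable


def uniqueness(generated: Iterable[str]) -> float:
    """Fraction of generated molecules that are distinct (assumed definition)."""
    molecules = list(generated)
    if not molecules:
        return 0.0
    return len(set(molecules)) / len(molecules)


def novelty(generated: Iterable[str], training: Iterable[str]) -> float:
    """Fraction of distinct generated molecules absent from the training set
    (assumed definition)."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(training)) / len(distinct)


# Toy data: plain strings standing in for canonical SMILES (hypothetical).
training_set = ["CCO", "CCN", "c1ccccc1"]
generated_set = ["CCO", "CCC", "CCC", "CC(=O)O"]

# 3 distinct molecules out of 4 generated.
print(f"Uniqueness: {uniqueness(generated_set):.2%}")
# 2 of the 3 distinct molecules are not in the training set.
print(f"Novelty:    {novelty(generated_set, training_set):.2%}")
```

In a real pipeline the strings would first be parsed and canonicalized with a cheminformatics toolkit, since two different SMILES strings can describe the same molecule and would otherwise inflate both scores.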
