Bhat Suhaas, Palepu Kalyan, Hong Lauren, Mao Joey, Ye Tianzheng, Iyer Rema, Zhao Lin, Chen Tianlai, Vincoff Sophia, Watson Rio, Wang Tian, Srijay Divya, Kavirayuni Venkata Srikar, Kholina Kseniia, Goel Shrey, Vure Pranay, Deshpande Aniruddha J, Soderling Scott H, DeLisa Matthew P, Chatterjee Pranam
Department of Biomedical Engineering, Duke University.
Department of Cell Biology, Duke University.
bioRxiv. 2024 Jul 22:2023.06.26.546591. doi: 10.1101/2023.06.26.546591.
Designing binders to target undruggable proteins presents a formidable challenge in drug discovery, requiring innovative approaches to overcome the lack of putative binding sites. Recently, generative models have been trained to design binding proteins from the three-dimensional structures of target proteins but, as a result, struggle to design binders to disordered or conformationally unstable targets. In this work, we provide a generalizable algorithmic framework to design short, target-binding linear peptides, requiring only the amino acid sequence of the target protein. To do this, we propose a process to generate naturalistic peptide candidates through Gaussian perturbation of the peptidic latent space of the ESM-2 protein language model, and subsequently screen these novel linear sequences for target-selective interaction activity via a CLIP-based contrastive learning architecture. By integrating these generative and discriminative steps, we create a Peptide Prioritization via CLIP (PepPrCLIP) pipeline and validate highly ranked, target-specific peptides experimentally, both as inhibitory peptides and as fusions to E3 ubiquitin ligase domains, demonstrating functionally potent binding and degradation of conformationally diverse protein targets. Overall, our design strategy provides a modular toolkit for designing short linear binding peptides to any target protein without relying on stable and ordered tertiary structure, enabling generation of programmable modulators to undruggable and disordered proteins such as transcription factors and fusion oncoproteins.
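The generate-then-screen idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. Random vectors stand in for ESM-2 embeddings, and plain cosine similarity stands in for the trained CLIP-style scoring model. The function names, the embedding dimension, and the noise scale `sigma` are all assumptions made for illustration. Decoding perturbed embeddings back into peptide sequences, which the actual pipeline would require, is omitted.

```python
import numpy as np


def perturb_embedding(seed, n_candidates, sigma, rng):
    """Sample candidates by adding isotropic Gaussian noise to a seed embedding.

    `seed` is a placeholder for a peptide embedding; in this sketch it is a
    random vector, not an actual ESM-2 output.
    """
    noise = rng.normal(loc=0.0, scale=sigma, size=(n_candidates, seed.shape[0]))
    return seed[None, :] + noise


def rank_by_cosine(target, candidates):
    """Score each candidate by cosine similarity to the target embedding.

    Returns (order, scores), where `order` lists candidate indices from the
    highest score to the lowest. A trained contrastive model would first
    project peptides and targets into a shared space; that step is skipped.
    """
    t = target / np.linalg.norm(target)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = c @ t
    order = np.argsort(-scores)
    return order, scores


rng = np.random.default_rng(0)
dim = 16                                  # toy dimension (assumption)
seed_peptide = rng.normal(size=dim)       # placeholder peptide embedding
target_embedding = rng.normal(size=dim)   # placeholder target embedding

candidates = perturb_embedding(seed_peptide, n_candidates=100, sigma=0.5, rng=rng)
order, scores = rank_by_cosine(target_embedding, candidates)
top5 = order[:5]                          # indices of the five best-scoring candidates
```

A larger `sigma` moves candidates further from the seed embedding, and a smaller one keeps them closer. The value used here is arbitrary.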