Otero-Carrasco Belén, Nevado Paloma Tejera, Muñoz Rafael Artiñano, Ferreiro Gema Díaz, Pérez Aurora Pérez, Caraça-Valente Hernández Juan Pedro, Rodríguez-González Alejandro
Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain.
ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, Spain.
PLoS One. 2025 May 7;20(5):e0322546. doi: 10.1371/journal.pone.0322546. eCollection 2025.
Proteins are fundamental biomolecules composed of one or more chains of amino acids. They are essential for all living organisms, contributing to various biological functions and regulatory processes. Alterations in protein structures and functions are closely linked to diseases, emphasizing the need for in-depth study. A thorough understanding of these associations is crucial for developing targeted and more effective therapeutic strategies.
Computational analyses of biomedical data facilitate the identification of specific patterns in proteins associated with diseases, providing novel insights into their biological roles. This study introduces a computational approach designed to detect relevant sequence patterns within proteins. These patterns, characterized by specific amino acid arrangements, can be critical for protein functionality. The proposed methodology was applied to proteins targeted by drugs used in lung cancer treatment, a disease that remains the leading cause of cancer-related mortality worldwide. Given that non-small cell lung cancer represents 85-90% of all lung cancer cases, it was selected as the primary focus of this study.
Significant sequence patterns were identified, establishing connections between drug-target proteins and proteins associated with lung cancer. Based on these findings, a novel computational framework was developed to extend this pattern-based analysis to proteins linked to other diseases. By employing this approach, relationships between lung cancer drug-target proteins and proteins associated with four additional cancer types were uncovered. These associations, characterized by shared amino acid sequence features, suggest potential opportunities for drug repurposing. Furthermore, validation through an extensive literature review confirmed biological links between lung cancer drug-target proteins and proteins related to other malignancies, reinforcing the potential of this methodology for identifying new therapeutic applications.