Deep learning analyses of splicing variants identify the link of PCP4 with amyotrophic lateral sclerosis.

Author information

Tang Xuelin, Chen Yan, Ren Yongfei, Yang Wanli, Yu Wendi, Zhou Yu, Guo Jingyan, Hu Jiali, Chen Xi, Gu Yuqi, Wang Chuyi, Dong Yi, Yang Hong, Sato Christine, He Ji, Fan Dongsheng, You Linya, Zinman Lorne, Rogaeva Ekaterina, Chen Yelin, Zhang Ming

Affiliations

State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Clinical Center for Brain and Spinal Cord Research, School of Medicine, Tongji University, Shanghai 200331, China.

The First Rehabilitation Hospital of Shanghai, School of Medicine, Tongji University, Shanghai 200090, China.

Publication information

Brain. 2025 Jul 7;148(7):2331-2347. doi: 10.1093/brain/awaf025.

Abstract

Amyotrophic lateral sclerosis (ALS) is a severe motor neuron disease, with most sporadic cases lacking clear genetic causes. Abnormal pre-mRNA splicing is a fundamental mechanism in neurodegenerative diseases. For example, TAR DNA-binding protein 43 (TDP-43) loss of function causes widespread RNA mis-splicing events in ALS. Additionally, splicing mutations are major contributors to neurological disorders. However, the role of intronic variants driving RNA mis-splicing in ALS remains poorly understood. To address this, we developed Spliformer to predict RNA splicing. Spliformer is a transformer-based deep learning model trained and tested on splicing events from the GENCODE database, in addition to RNA-sequencing data from blood and CNS tissues. We benchmarked Spliformer against SpliceAI and Pangolin using testing datasets and paired whole-genome sequencing with RNA-sequencing data. We also developed the Spliformer-motif model to identify splicing regulatory motifs. We analysed the ClinVar dataset to assess the link between splicing variants and disease pathogenicity. Additionally, we analysed whole-genome sequencing data of ALS patients and controls to identify common intronic splicing variants linked to ALS risk or disease phenotypes. We also profiled rare intronic splicing variants in ALS patients to identify known or novel ALS-associated genes. Minigene assays were used to validate candidate splicing variants. Finally, we measured spine density in neurons with a specific gene knockdown or those expressing a TDP-43 disease-causing mutant. Spliformer accurately predicts the probability that a nucleotide within a pre-mRNA sequence is a splice donor, an acceptor or neither. Spliformer outperformed SpliceAI and Pangolin in both speed and accuracy on tested splicing events and/or paired whole-genome sequencing/RNA-sequencing data. Spliformer-motif successfully identified canonical and novel splicing regulatory motifs. In the ClinVar dataset, splicing variants are strongly associated with disease pathogenicity. Genome-wide analyses of common intronic splicing variants nominated one variant linked to ALS progression. Deep learning analyses of whole-genome sequencing data from 1370 ALS patients revealed rare splicing variants in reported ALS genes (such as PTPRN2 and CFAP410, validated through minigene assays and RNA sequencing) and TDP-43 loss-of-function-related RNA mis-splicing genes (such as PTPRD). Further genetic analysis and minigene assays nominated PCP4 and TMEM63A as ALS-associated genes. Functional assays demonstrated that PCP4 is crucial for maintaining spine density and can rescue spine loss in neurons expressing a disease-causing TDP-43 mutant. In summary, we developed Spliformer and Spliformer-motif, which accurately predict and interpret pre-mRNA splicing. Our findings highlight an intronic genetic mechanism driving RNA mis-splicing in ALS and nominate PCP4 as an ALS-associated gene.
