Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan.
Ono Pharmaceutical Co., Ltd., Ibaraki, Japan.
Genome Biol. 2021 Jan 4;22(1):9. doi: 10.1186/s13059-020-02240-8.
Long-read sequencing of full-length cDNAs enables the detection of structures of aberrant splicing isoforms in cancer cells. These isoforms are occasionally translated, presented by HLA molecules, and recognized as neoantigens. This study uses a long-read sequencer (MinION) to construct a comprehensive catalog of aberrant splicing isoforms in non-small-cell lung cancers, from which novel isoforms and potential neoantigens are identified.
Full-length cDNA sequencing is performed on 22 cell lines, and a total of 2021 novel splicing isoforms are identified. The protein expression of some of these isoforms is then validated by proteome analysis. Ablations of a nonsense-mediated mRNA decay (NMD) factor, UPF1, and a splicing factor, SF3B1, are found to increase the proportion of aberrant transcripts. NetMHC evaluation of the binding affinities to each type of HLA molecule reveals that some of the isoforms potentially generate neoantigen candidates. We also identify aberrant splicing isoforms in seven non-small-cell lung cancer specimens. An enzyme-linked immunosorbent spot (ELISpot) assay indicates that approximately half of the peptide candidates have the potential to activate T cell responses through their interaction with HLA molecules. Finally, we estimate the number of isoforms in The Cancer Genome Atlas (TCGA) datasets by referring to the constructed catalog and find that disruption of NMD factors is significantly correlated with the number of splicing isoforms found in the TCGA-Lung Adenocarcinoma data collection.
Our results indicate that long-read sequencing of full-length cDNAs is essential for the precise identification of aberrant transcript structures in cancer cells.
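As an illustration of how neoantigen candidates might be enumerated from an aberrant isoform before binding-affinity evaluation, the following is a minimal sketch and not the authors' pipeline. It assumes the isoform has already been translated in silico, and it simply lists peptides of typical HLA class I lengths that occur in the isoform but in no reference protein. The function name `candidate_peptides`, its parameters, and the toy sequences are hypothetical; the sketch does not call NetMHC, which would score such peptides in a separate step.

```python
def candidate_peptides(isoform_protein, reference_proteins, lengths=(8, 9, 10, 11)):
    """Return peptides present in the isoform but absent from every reference protein.

    isoform_protein: amino acid string translated from an aberrant isoform (assumed input).
    reference_proteins: iterable of amino acid strings for the normal proteome (assumed input).
    lengths: peptide lengths to enumerate; 8 to 11 is a common range for HLA class I.
    """
    reference_proteins = list(reference_proteins)
    candidates = []
    seen = set()
    for k in lengths:
        # Slide a window of length k over the isoform protein.
        for i in range(len(isoform_protein) - k + 1):
            peptide = isoform_protein[i:i + k]
            if peptide in seen:
                continue
            seen.add(peptide)
            # Keep only peptides that no reference protein contains.
            if not any(peptide in ref for ref in reference_proteins):
                candidates.append(peptide)
    return candidates


# Toy example: two residues differ between the reference and the isoform.
reference = "MKTAYIAKQRQISFVK"
isoform = "MKTAYIAKWWQISFVK"
novel_8mers = candidate_peptides(isoform, [reference], lengths=(8,))
```

In this toy case every returned 8-mer overlaps the altered residues, while the unchanged N-terminal 8-mer is excluded. A real analysis would then pass the surviving peptides, together with the sample's HLA types, to a binding predictor.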