Center for Computational Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
Nat Commun. 2020 Jan 3;11(1):89. doi: 10.1038/s41467-019-13779-x.
RNA sequencing experiments generate large amounts of information about the expression levels of genes. Although they are mainly used for quantifying expression, they also contain other biologically important information, such as copy number variants (CNVs). Here, we present CaSpER, a signal processing approach for the identification, visualization, and integrative analysis of focal and large-scale CNV events at multiscale resolution using either bulk or single-cell RNA sequencing data. CaSpER integrates multiscale smoothing of the expression signal and the allelic shift signal for CNV calling. The allelic shift signal measures loss of heterozygosity (LOH), which is valuable for CNV identification. CaSpER employs an efficient methodology for generating a genome-wide B-allele frequency (BAF) signal profile from the reads and uses it to correct CNV calls. CaSpER increases the utility of RNA sequencing datasets and complements other tools for complete characterization and visualization of the genomic and transcriptomic landscape of single-cell and bulk RNA sequencing data.
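To make the two signals named in the abstract concrete, the following is a minimal sketch, not CaSpER's actual implementation or API. It assumes per-site reference and alternate read counts are already available, takes the allelic shift to be the absolute deviation of the BAF from 0.5, and uses a repeated sliding median as a stand-in for multiscale smoothing. The function names, the `min_depth` threshold, and the window size are all hypothetical choices made for illustration.

```python
from statistics import median


def baf_shift(ref_counts, alt_counts, min_depth=10):
    """Return |BAF - 0.5| per site, skipping sites below min_depth reads.

    min_depth is an illustrative threshold, not a value from the paper.
    """
    shifts = []
    for ref, alt in zip(ref_counts, alt_counts):
        depth = ref + alt
        if depth < min_depth:
            continue  # too few reads to estimate an allele frequency
        baf = alt / depth  # B-allele frequency at this site
        shifts.append(abs(baf - 0.5))  # 0 = balanced, 0.5 = one allele only
    return shifts


def median_smooth(signal, window=3):
    """Sliding median filter; windows are truncated at the edges."""
    half = window // 2
    return [
        median(signal[max(0, i - half): i + half + 1])
        for i in range(len(signal))
    ]


def multiscale_smooth(signal, scales=3, window=3):
    """Apply the median filter repeatedly and keep the result at each scale."""
    results = []
    current = list(signal)
    for _ in range(scales):
        current = median_smooth(current, window)
        results.append(current)
    return results
```

Under these assumptions, a sustained shift away from 0 across neighbouring sites would suggest LOH in that region, while an isolated spike is flattened by the median filter at the first scale.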