Sun Wei, Jin Chong, Gelfond Jonathan A, Chen Ming-Hui, Ibrahim Joseph G
Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington.
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.
Biometrics. 2020 Sep;76(3):983-994. doi: 10.1111/biom.13198. Epub 2019 Dec 27.
Many computational methods have been developed to discern intratumor heterogeneity (ITH) using DNA sequence data from bulk tumor samples. These methods share the assumption that two mutations arise from the same subclone if they have similar mutant allele frequencies (MAFs), so it is difficult or impossible for them to distinguish two subclones with similar MAFs. Single-cell DNA sequencing (scDNA-seq) data can be very informative for ITH inference. However, due to the difficulty of DNA amplification, scDNA-seq data are often very noisy. A promising new study design is to collect both bulk and single-cell DNA-seq data and analyze them jointly to mitigate the limitations of each data type. To address the analytic challenges of this new study design, we propose a computational method named BaSiC (Bulk tumor and Single Cell) that discerns ITH by jointly analyzing DNA-seq data from bulk tumor and single cells. We demonstrate that BaSiC performs comparably to or better than methods that use either data type alone. We further evaluate BaSiC using bulk tumor and single-cell DNA-seq data from a breast cancer patient and several leukemia patients.