Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
Nat Commun. 2020 Sep 2;11(1):4301. doi: 10.1038/s41467-020-17967-y.
Copy-number aberrations (CNAs) and whole-genome duplications (WGDs) are frequent somatic mutations in cancer, but their quantification from DNA sequencing of bulk tumor samples is challenging. Standard methods for CNA inference analyze tumor samples individually; however, DNA sequencing of multiple samples from a cancer patient has recently become more common. We introduce HATCHet (Holistic Allele-specific Tumor Copy-number Heterogeneity), an algorithm that infers allele- and clone-specific CNAs and WGDs jointly across multiple tumor samples from the same patient. We show that HATCHet outperforms current state-of-the-art methods on multi-sample DNA sequencing data that we simulate using MASCoTE (Multiple Allele-specific Simulation of Copy-number Tumor Evolution). Applying HATCHet to 84 tumor samples from 14 prostate and pancreas cancer patients, we identify subclonal CNAs and WGDs that are more plausible than those reported in previously published analyses and more consistent with somatic single-nucleotide variants (SNVs) and small indels in the same samples.