Peterson Thomas A, Gauran Iris Ivy M, Park Junyong, Park DoHwan, Kann Maricel G
Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America.
University of California, San Francisco, Institute for Computational Health Science, San Francisco, California, United States of America.
PLoS Comput Biol. 2017 Apr 20;13(4):e1005428. doi: 10.1371/journal.pcbi.1005428. eCollection 2017 Apr.
The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or a few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are 'gene-centric' in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new 'domain-centric' method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains in the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families, altering pathways known to be involved in cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots' unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods.
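The contrast between gene-centric and domain-centric counting can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, domain identifier, coordinates, and the function `domain_column_counts` are all hypothetical, and the sketch assumes an ungapped alignment so that a domain column is simply the offset from the domain start, whereas real protein domain models allow insertions and deletions. No statistical test is applied here; the sketch only shows how mutations that are each rare within a single gene can accumulate at one position of a shared domain.

```python
from collections import Counter

# Hypothetical domain instances: (gene, domain_id, protein_start, protein_end).
# Assumption: ungapped alignment, so domain column = position - start + 1.
DOMAIN_INSTANCES = [
    ("GENE_A", "DomainX", 10, 59),
    ("GENE_B", "DomainX", 100, 149),
    ("GENE_C", "DomainX", 5, 54),
]

# Hypothetical somatic mutations: (gene, protein_position).
MUTATIONS = [
    ("GENE_A", 21),
    ("GENE_B", 111),
    ("GENE_C", 16),
    ("GENE_A", 40),
    ("GENE_B", 300),  # falls outside every domain instance, so it is ignored
]


def domain_column_counts(instances, mutations):
    """Count mutations per (domain_id, domain column), pooled across genes."""
    counts = Counter()
    for gene, pos in mutations:
        for inst_gene, domain_id, start, end in instances:
            if gene == inst_gene and start <= pos <= end:
                counts[(domain_id, pos - start + 1)] += 1
    return counts


# Gene-centric view: each (gene, position) pair is seen only once.
gene_counts = Counter(MUTATIONS)

# Domain-centric view: positions from different genes are pooled by column.
counts = domain_column_counts(DOMAIN_INSTANCES, MUTATIONS)
```

In this toy data every gene-level position is mutated exactly once, yet three of the mutations map to the same column of the hypothetical `DomainX`, which is the kind of signal a domain-level analysis is designed to surface.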