Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU.

Affiliations

Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA, 24060, USA.

Department of Computer Science, Virginia Tech, Blacksburg, VA, 24060, USA.

Publication information

Sci Rep. 2020 Feb 6;10(1):2022. doi: 10.1038/s41598-020-58785-y.

Abstract

Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of greater than two hits, we used a compressed binary matrix representation, and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation was on average an estimated 12,144 times faster than the original integer matrix based CPU implementation, for the 3-hit algorithm, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples in the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy.
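The abstract names two ideas that a small example can make concrete: a compressed binary (bit-packed) gene-by-sample matrix, and a greedy weighted set cover over gene combinations. The following is a minimal CPU sketch in Python, not the authors' GPU implementation. The function names, the toy data, the scoring function `(alpha * TP + TN) / (N_tumor + N_normal)`, and the weight `alpha = 0.1` are all illustrative assumptions that do not appear in the abstract.

```python
from itertools import combinations


def pack_rows(matrix):
    """Pack each gene's 0/1 row into one integer (bit j = sample j)."""
    packed = []
    for row in matrix:
        bits = 0
        for j, value in enumerate(row):
            if value:
                bits |= 1 << j
        packed.append(bits)
    return packed


def popcount(x):
    """Number of set bits, i.e. number of samples in the mask."""
    return bin(x).count("1")


def greedy_cover(tumor_matrix, normal_matrix, hits=3, alpha=0.1):
    """Greedily pick gene combinations until no uncovered tumor sample
    can be covered. Scoring and alpha are assumptions for illustration."""
    n_tumor = len(tumor_matrix[0])
    n_normal = len(normal_matrix[0])
    tumor = pack_rows(tumor_matrix)
    normal = pack_rows(normal_matrix)

    uncovered = (1 << n_tumor) - 1  # all tumor samples start uncovered
    chosen = []
    while uncovered:
        best, best_mask, best_score = None, 0, -1.0
        for combo in combinations(range(len(tumor)), hits):
            t_mask = uncovered
            n_mask = (1 << n_normal) - 1
            for gene in combo:
                # One bitwise AND intersects the samples for a whole row.
                t_mask &= tumor[gene]
                n_mask &= normal[gene]
            if t_mask == 0:
                continue  # covers no remaining tumor sample
            tp = popcount(t_mask)             # tumor samples with all hits
            tn = n_normal - popcount(n_mask)  # normal samples lacking them
            score = (alpha * tp + tn) / (n_tumor + n_normal)
            if score > best_score:
                best, best_mask, best_score = combo, t_mask, score
        if best is None:
            break  # remaining samples cannot be covered by any combination
        chosen.append(best)
        uncovered &= ~best_mask
    return chosen


# Toy data: rows are genes, columns are samples (1 = gene mutated).
TUMOR = [[1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1], [0, 0, 1, 1]]
NORMAL = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
print(greedy_cover(TUMOR, NORMAL, hits=3))
```

The inner loop enumerates every combination of `hits` genes, so the work grows rapidly with the number of hits, which is the scaling problem the abstract describes. Packing rows into bits reduces the per-combination cost to a few AND and popcount operations, and each combination is scored independently of the others, which is what makes the search amenable to parallel execution on a GPU.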

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd70/7005272/3ee455373e50/41598_2020_58785_Fig1_HTML.jpg
