Reliability of algorithmic somatic copy number alteration detection from targeted capture data.

Affiliation

Molecular Health GmbH, Kurfürsten-Anlage 21, 69115 Heidelberg, Germany.

Publication information

Bioinformatics. 2017 Sep 15;33(18):2791-2798. doi: 10.1093/bioinformatics/btx284.

Abstract

MOTIVATION

Whole exome and gene panel sequencing are increasingly used for oncological diagnostics. To investigate the accuracy of somatic copy number alteration (SCNA) detection algorithms on simulated and clinical tumor samples, the precision and sensitivity of four SCNA callers were measured using 50 simulated whole exome and 50 simulated targeted gene panel datasets, and using 119 TCGA tumor samples for which SNP array data were available.

RESULTS

On synthetic exome and panel data, VarScan2 mostly called false positives, whereas Control-FREEC was precise (>90% correct calls) at the cost of low sensitivity (<40% detected). ONCOCNV was slightly less precise on gene panel data, with similarly low sensitivity. This could be explained by low sensitivity for amplifications and high precision for deletions. Surprisingly, these results were not strongly affected by moderate tumor impurities; only contaminations with more than 60% non-cancerous cells resulted in strongly declining precision and sensitivity. On the 119 clinical samples, Control-FREEC and CNVkit called 71.8% and 94%, respectively, of the SCNAs found by the SNP arrays, but with a considerable number of false positives (precision 29% and 4.9%, respectively).
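To make the trade-off between the reported sensitivity and precision concrete, the following minimal sketch converts the two percentages into an implied number of false positive calls. It assumes that precision and sensitivity are computed over the same set of events, which the abstract does not state, and it uses a hypothetical cohort of 100 array-confirmed SCNAs. The function names are illustrative and are not taken from the authors' scripts.

```python
def precision(tp: float, fp: float) -> float:
    """Fraction of all calls that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp: float, fn: float) -> float:
    """Fraction of true events that are detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def implied_false_positives(true_events: float, sens: float, prec: float) -> float:
    """False positives implied by a given sensitivity and precision.

    Assumption: both metrics refer to the same set of true events.
    """
    tp = true_events * sens              # true events that were detected
    return tp * (1.0 - prec) / prec      # rearranged from prec = TP / (TP + FP)


# Hypothetical cohort of 100 array-confirmed SCNAs (not a figure from the paper).
fp_cnvkit = implied_false_positives(100, 0.94, 0.049)
fp_freec = implied_false_positives(100, 0.718, 0.29)
print(round(fp_cnvkit), round(fp_freec))
```

Under these assumptions, a caller with 94% sensitivity and 4.9% precision would produce roughly 1,800 false positive calls per 100 true SCNAs, compared with roughly 180 for a caller with 71.8% sensitivity and 29% precision.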

DISCUSSION

Whole exome and targeted gene panel methods by design limit the precision of SCNA callers, making them prone to false positives. SCNA calls cannot easily be integrated in clinical pipelines that use data from targeted capture-based sequencing. If used at all, they need to be cross-validated using orthogonal methods.

AVAILABILITY AND IMPLEMENTATION

Scripts are provided as supplementary information.

CONTACT

gunther.jansen@molecularhealth.com.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79ce/5870863/5321eb3501df/btx284f1.jpg
