Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
Sci Rep. 2021 Oct 26;11(1):21072. doi: 10.1038/s41598-021-00626-7.
Although whole exome sequencing (WES) is the gold standard for measuring tumor mutational burden (TMB), the development of gene-targeted panels enables cost-effective TMB estimation. With the growing number of panels in clinical trials, a statistical method is needed to evaluate and compare the performance of different panels effectively. The mainstream method uses the R-squared value to measure the correlation between panel-based TMB and WES-based TMB. However, because the TMB distribution of a dataset is long-tailed, the R-squared value usually overestimates the performance of a panel. Herein, we propose the angular distance, a metric that quantifies the extent of the estimation bias. Our extensive in silico analysis indicates that the R-squared value plateaus once the panel size reaches 0.5 Mb and therefore does not adequately characterize the performance of the panels. In contrast, the angular distance remains sensitive to changes in panel size even when the panel size reaches 6 Mb. In particular, R-squared values differ widely between the hypermutation-included dataset and the non-hypermutation dataset across many cancer types, whereas the angular distances are highly consistent. Therefore, the angular distance is more objective and logical than the R-squared value for evaluating the accuracy of TMB estimation for gene-targeted panels.
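The contrast between the two metrics can be illustrated with a small simulation. The sketch below is not the authors' implementation: the abstract does not give the formula for angular distance, so the code assumes it is the angle (in degrees) between a least-squares line through the origin, fitted to panel-based TMB against WES-based TMB, and the identity line y = x. The function names, the log-normal TMB distribution, and the 1.5-fold bias are all hypothetical choices made for illustration.

```python
import numpy as np


def r_squared(wes_tmb, panel_tmb):
    """Squared Pearson correlation between WES-based and panel-based TMB."""
    r = np.corrcoef(wes_tmb, panel_tmb)[0, 1]
    return r ** 2


def angular_distance_deg(wes_tmb, panel_tmb):
    """Angle in degrees between the fitted line and the identity line.

    ASSUMPTION: the fitted line is a least-squares line through the origin.
    The abstract does not define the metric, so this is one plausible reading.
    """
    wes = np.asarray(wes_tmb, dtype=float)
    panel = np.asarray(panel_tmb, dtype=float)
    slope = np.dot(wes, panel) / np.dot(wes, wes)
    return abs(np.degrees(np.arctan(slope)) - 45.0)


# Hypothetical long-tailed cohort: a few hypermutated samples dominate the range.
rng = np.random.default_rng(0)
wes = rng.lognormal(mean=1.0, sigma=1.2, size=2000)

# Hypothetical panel that systematically overestimates TMB by a factor of 1.5.
panel = 1.5 * wes + rng.normal(0.0, 1.0, size=wes.size)

print(f"R-squared:        {r_squared(wes, panel):.3f}")
print(f"Angular distance: {angular_distance_deg(wes, panel):.2f} degrees")
```

Under these assumptions the R-squared value stays close to 1 despite the systematic overestimate, because correlation is insensitive to the slope of the relationship, while the angular distance is clearly non-zero and would shrink toward 0 only as the fitted slope approaches 1.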