Liu Guojun, Zhang Junying
School of Computer Science and Technology, Xidian University, Xi'an, China.
Front Genet. 2021 Jun 28;12:699510. doi: 10.3389/fgene.2021.699510. eCollection 2021.
Next-generation sequencing offers a wealth of data for detecting copy number variations (CNVs) at high resolution. However, correctly detecting CNVs of different lengths remains challenging, and new detection tools are needed to meet this demand. In this work, we propose a new method, called CBCNV, for detecting CNVs of different lengths from whole-genome sequencing data. CBCNV uses a clustering algorithm to partition the read depth segment profile and assigns an abnormal score to each read depth segment. Based on the abnormal score profile, CBCNV then applies Tukey's fences to call CNVs. The performance of the proposed method is evaluated on simulated data sets and compared with that of several existing methods. The experimental results show that CBCNV performs better than these methods. The proposed method is further tested on real data sets, where the results are consistent with those from the simulations. The proposed method can therefore be expected to become a routine tool for analyzing CNVs in tumor-normal matched samples.
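The thresholding step can be illustrated with a minimal Python sketch of Tukey's fences applied to a list of per-segment abnormal scores. This is not the CBCNV implementation. The function names, the fence multiplier k = 1.5, the linearly interpolated quartiles, and the choice to flag scores beyond either fence are all assumptions made for illustration. The abstract does not specify how the scores are computed or which fence CBCNV uses.

```python
from typing import List, Sequence, Tuple


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted, non-empty sequence."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def tukey_fences(scores: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is the conventional choice and an assumption here.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    q1 = quantile(ordered, 0.25)
    q3 = quantile(ordered, 0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outlier_segments(scores: Sequence[float], k: float = 1.5) -> List[int]:
    """Indices of segments whose (hypothetical) abnormal score lies outside the fences."""
    lower, upper = tukey_fences(scores, k)
    return [i for i, s in enumerate(scores) if s < lower or s > upper]
```

For example, calling `flag_outlier_segments` on the scores `[1, 2, 3, 4, 5, 6, 7, 8, 100]` gives quartiles of 3 and 7, fences of -3 and 13, and flags only the last segment. Because the fences are derived from quartiles, a small number of extreme scores has little effect on the threshold itself.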