Effective normalization for copy number variation in Hi-C data.

Affiliations

Institut Curie, PSL Research University, Paris, F-75005, France.

INSERM, U900, Paris, F-75005, France.

Publication information

BMC Bioinformatics. 2018 Sep 6;19(1):313. doi: 10.1186/s12859-018-2256-5.

Abstract

BACKGROUND

Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation capture data such as Hi-C pose particular challenges. Although several methods have been proposed, the most widely used type of Hi-C normalization casts the estimation of unwanted effects as a matrix balancing problem, relying on the assumption that all genomic regions interact equally with each other.
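To make the matrix balancing idea concrete, the following is a minimal sketch of a generic, damped iterative balancing of a symmetric contact matrix under the equal-visibility assumption. It is not the authors' implementation and does not implement LOIC or CAIC; the function name `balance`, its parameters, and the toy matrix are assumptions made for illustration only.

```python
import numpy as np

def balance(counts, n_iter=200, tol=1e-8):
    """Rescale a symmetric contact matrix so that all row sums become equal.

    Hypothetical helper: finds a bias vector b such that
    counts[i, j] / (b[i] * b[j]) has (nearly) identical row sums.
    """
    counts = np.asarray(counts, dtype=float)
    bias = np.ones(counts.shape[0])
    for _ in range(n_iter):
        normed = counts / np.outer(bias, bias)
        row_sums = normed.sum(axis=1)
        # Relative deviation of each bin's visibility from the mean.
        update = row_sums / row_sums.mean()
        # Square-root (damped) step; the bias is applied to rows and columns.
        bias *= np.sqrt(update)
        if np.abs(update - 1.0).max() < tol:
            break
    return counts / np.outer(bias, bias), bias

# Toy data (assumed): a symmetric positive matrix distorted by per-bin biases.
rng = np.random.default_rng(0)
base = rng.random((5, 5)) + 1.0
base = base + base.T
true_bias = np.array([0.5, 1.0, 2.0, 1.5, 0.8])
counts = base * np.outer(true_bias, true_bias)

normed, bias = balance(counts)
print(normed.sum(axis=1))  # row sums should be approximately equal
```

Because this kind of balancing forces every bin to the same total, any genuine difference in visibility between regions, such as one caused by a copy-number change, is treated as a bias to be removed.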

RESULTS

To explore the effect of copy-number variations on Hi-C data normalization, we first propose a simulation model that predicts the effects of large copy-number changes on a diploid Hi-C contact map. We then show that the standard approaches relying on equal visibility fail to correct for unwanted effects in the presence of copy-number variations. We thus propose a simple extension to matrix balancing methods that models these effects. Our approach can either retain the copy-number variation effects (LOIC) or remove them (CAIC). We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes.

CONCLUSIONS

Taken together, our results highlight the importance of using dedicated methods for the analysis of Hi-C cancer data. Both CAIC and LOIC methods perform well on simulated and real Hi-C data sets, each fulfilling different needs.
