Effective normalization for copy number variation in Hi-C data.

Affiliations

Institut Curie, PSL Research University, Paris, F-75005, France.

INSERM, U900, Paris, F-75005, France.

Publication information

BMC Bioinformatics. 2018 Sep 6;19(1):313. doi: 10.1186/s12859-018-2256-5.

DOI:10.1186/s12859-018-2256-5

PMID:30189838

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6127909/

Abstract

BACKGROUND

Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation capture data such as Hi-C have particular challenges. Although several methods have been proposed, the most widely used type of normalization of Hi-C data usually casts estimation of unwanted effects as a matrix balancing problem, relying on the assumption that all genomic regions interact equally with each other.
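To make the matrix balancing idea concrete, the following is a minimal sketch, assuming a small dense symmetric contact matrix. It rescales rows and columns until every bin has the same total number of contacts, which is the equal-visibility assumption described above. The function name `balance`, the toy distance-decay signal, and the random per-bin biases are all illustrative assumptions. The damped square-root update is chosen here for numerical stability and is not necessarily the exact update used by any particular Hi-C tool.

```python
import numpy as np


def balance(counts, n_iter=500, tol=1e-10):
    """Rescale a symmetric contact matrix so that all row sums become equal.

    Returns the balanced matrix and the estimated per-bin bias vector, such
    that balanced * outer(bias, bias) reproduces the input matrix.
    """
    mat = np.array(counts, dtype=float)
    bias = np.ones(mat.shape[0])
    for _ in range(n_iter):
        row_sums = mat.sum(axis=1)
        keep = row_sums > 0  # leave empty bins untouched
        if not keep.any():
            break
        scale = np.ones_like(row_sums)
        # Damped (square-root) update towards the mean coverage.
        scale[keep] = np.sqrt(row_sums[keep] / row_sums[keep].mean())
        mat /= np.outer(scale, scale)
        bias *= scale
        if np.abs(scale - 1.0).max() < tol:
            break
    return mat, bias


# Toy data (hypothetical): a distance-decay signal distorted by per-bin biases.
rng = np.random.default_rng(0)
n = 6
idx = np.arange(n)
signal = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
true_bias = rng.uniform(0.5, 2.0, size=n)
observed = signal * np.outer(true_bias, true_bias)

balanced, est_bias = balance(observed)
print(balanced.sum(axis=1))  # row sums are equal after balancing
```

Because the procedure forces equal coverage on every bin, any real difference in coverage between regions, such as one caused by a copy-number gain or loss, is treated as a bias and absorbed into the per-bin factors.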

RESULTS

To explore the effect of copy-number variations on Hi-C data normalization, we first propose a simulation model that predicts the effects of large copy-number changes on a diploid Hi-C contact map. We then show that the standard approaches relying on equal visibility fail to correct for unwanted effects in the presence of copy-number variations. We thus propose a simple extension to matrix balancing methods that models these effects. Our approach can either retain the copy-number variation effects (LOIC) or remove them (CAIC). We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes.

CONCLUSIONS

Taken together, our results highlight the importance of using dedicated methods for the analysis of Hi-C cancer data. Both CAIC and LOIC methods perform well on simulated and real Hi-C data sets, each fulfilling different needs.
