School of Public Health; University of California; Berkeley, CA USA.
Epigenetics. 2013 Nov;8(11):1141-52. doi: 10.4161/epi.26037. Epub 2013 Aug 19.
Analysis of epigenetic mechanisms, particularly DNA methylation, is of increasing interest for epidemiologic studies examining disease etiology and impacts of environmental exposures. The Infinium HumanMethylation450 BeadChip® (450K), which interrogates over 480,000 CpG sites and is relatively cost-effective, has become a popular tool to characterize the DNA methylome. For large-scale studies, minimizing technical variability and potential bias is paramount. The goal of this paper was to evaluate the performance of several existing and novel color channel normalizations designed to reduce technical variability and batch effects in 450K analysis of samples from a large population study. Comparative assessment of 10 normalization procedures included the GenomeStudio® Illumina procedure, the lumi smooth quantile approach, and the newly proposed All Sample Mean Normalization (ASMN). We also examined the performance of normalizations in combination with correction for the two types of Infinium chemistry utilized on the 450K array. We observed that the performance of the GenomeStudio® normalization procedure was highly variable and dependent on the quality of the first sample analyzed in an experiment, which is used as a reference in this procedure. While the lumi normalization was able to decrease batch variability, it increased variation among technical replicates, potentially reducing biologically meaningful findings. The proposed ASMN procedure performed consistently well, both at reducing batch effects and improving replicate comparability. In summary, the ASMN procedure can improve existing color channel normalization, especially for large epidemiologic studies, and can be successfully implemented to enhance a 450K DNA methylation data pipeline.
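The abstract does not spell out how either normalization is computed, so the following is only an illustrative sketch of the contrast it draws between a first-sample reference and an all-sample mean reference. It assumes that each sample is rescaled, within one color channel, by the ratio of a reference control intensity to that sample's own mean normalization-control intensity. The function names, the toy numbers, and the scaling rule itself are assumptions for illustration, not the authors' published procedure.

```python
import numpy as np


def channel_scale_factors(control_means, reference=None):
    """Per-sample scale factors for one color channel.

    control_means: one mean control-probe intensity per sample (assumed input).
    reference: intensity every sample is scaled toward; if None, the mean
    across all samples is used (the all-sample-mean idea, as assumed here).
    """
    control_means = np.asarray(control_means, dtype=float)
    if reference is None:
        reference = control_means.mean()
    return reference / control_means


def normalize_channel(intensities, control_means, reference=None):
    """Scale a probes x samples intensity matrix column by column."""
    factors = channel_scale_factors(control_means, reference)
    # Broadcasting multiplies each sample (column) by its own factor.
    return np.asarray(intensities, dtype=float) * factors


# Hypothetical toy data: 2 probes x 3 samples in the green channel.
green_controls = np.array([1000.0, 2000.0, 500.0])
green = np.array([[100.0, 200.0, 50.0],
                  [300.0, 600.0, 150.0]])

# Reference taken from the first sample only.
first_sample = normalize_channel(green, green_controls,
                                 reference=green_controls[0])

# Reference taken as the mean over all samples.
all_sample = normalize_channel(green, green_controls)
```

Under this assumed scheme, a first-sample reference ties every scale factor to one sample's control intensity, so a poor first sample shifts the whole experiment, whereas an all-sample mean spreads that dependence across the full set.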