Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina 27708, USA.
Genome Res. 2011 Oct;21(10):1757-67. doi: 10.1101/gr.121541.111. Epub 2011 Jul 12.
The human body contains thousands of unique cell types, each with specialized functions. Cell identity is governed in large part by gene transcription programs, which are determined by regulatory elements encoded in DNA. To identify regulatory elements active in seven cell lines representative of diverse human cell types, we used DNase-seq and FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements) to map "open chromatin." Over 870,000 DNaseI or FAIRE sites, which correspond tightly to nucleosome-depleted regions, were identified across the seven cell lines, covering nearly 9% of the genome. The combination of DNaseI and FAIRE is more effective than either assay alone in identifying likely regulatory elements, as judged by coincidence with transcription factor binding locations determined in the same cells. Open chromatin common to all seven cell types tended to be at or near transcription start sites and to be coincident with CTCF binding sites, while open chromatin sites found in only one cell type were typically located away from transcription start sites and contained DNA motifs recognized by regulators of cell-type identity. We show that open chromatin regions bound by CTCF are potent insulators. We identified clusters of open regulatory elements (COREs) that were physically near each other and whose appearance was coordinated among one or more cell types. Gene expression and RNA Pol II binding data support the hypothesis that COREs control gene activity required for the maintenance of cell-type identity. This publicly available atlas of regulatory elements may prove valuable in identifying noncoding DNA sequence variants that are causally linked to human disease.
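The abstract reports the union of DNaseI and FAIRE sites and the share of the genome that union covers. A minimal sketch of that kind of calculation follows, assuming sites are half-open `(start, end)` intervals on a single chromosome. The function names, the toy coordinates, and the 10,000 bp "genome" are all hypothetical illustrations, not the authors' pipeline or data.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into a sorted union."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: List[Interval], genome_length: int) -> float:
    """Fraction of the genome covered by the union of the intervals."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


# Toy, made-up site calls for two assays (illustration only).
dnase_sites = [(100, 250), (400, 520), (900, 1000)]
faire_sites = [(200, 300), (450, 480), (1500, 1600)]

# "DNaseI or FAIRE" sites: the union of both call sets.
open_chromatin = merge_intervals(dnase_sites + faire_sites)
fraction = covered_fraction(dnase_sites + faire_sites, genome_length=10_000)
```

Merging before summing matters because a region called by both assays would otherwise be counted twice. In practice this step is usually done per chromosome with a dedicated interval tool rather than hand-written code.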