A hidden Markov model for analyzing ChIP-chip experiments on genome tiling arrays and its application to p53 binding sequences.

Author information

Li Wei, Meyer Clifford A, Liu X Shirley

Affiliation

Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02115, USA.

Publication information

Bioinformatics. 2005 Jun;21 Suppl 1:i274-82. doi: 10.1093/bioinformatics/bti1046.

Abstract

MOTIVATION

Transcription factors (TFs) regulate gene expression by recognizing and binding to specific regulatory regions on the genome, which in higher eukaryotes can lie far from the regulated genes. Recently, Affymetrix developed high-density oligonucleotide arrays that tile all the non-repetitive sequences of the human genome at 35 bp resolution. This new array platform allows for the unbiased mapping of in vivo TF binding sequences (TFBSs) using chromatin immunoprecipitation followed by microarray experiments (ChIP-chip). The massive datasets generated from these experiments pose great challenges for data analysis.

RESULTS

We developed a fast, scalable and sensitive method to extract TFBSs from ChIP-chip experiments on genome tiling arrays. Our method takes advantage of tiling array data from many experiments to normalize and model the behavior of each individual probe, and identifies TFBSs using a hidden Markov model (HMM). When applied to the data of p53 ChIP-chip experiments from an earlier study, our method discovered many new high-confidence p53 targets, including all the regions verified by quantitative PCR. Using the de novo motif-finding algorithm MDscan, we also recovered the p53 motif from our HMM-identified p53 target regions. Furthermore, we found substantial p53 motif enrichment in these regions compared with both the genomic background and the TFBSs identified earlier. Several of the newly identified p53 TFBSs are in the promoter regions of known genes or are associated with previously characterized p53-responsive genes.
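
The general idea of standardizing each probe against its own behavior in other experiments and then segmenting the probe series with a two-state HMM can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the emission model, transition probabilities or parameter values, so the Gaussian emissions, the defaults (enriched_mean, sd, p_stay, p_enriched_start) and the function names below are all assumptions chosen for illustration only.

```python
import math


def standardize_probe(value, reference_values):
    """Z-score one probe's ChIP value against the same probe's values
    in other experiments (hypothetical per-probe model, sample variance)."""
    n = len(reference_values)
    mean = sum(reference_values) / n
    var = sum((v - mean) ** 2 for v in reference_values) / (n - 1)
    return (value - mean) / math.sqrt(var)


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(scores, enriched_mean=2.0, sd=1.0,
                      p_stay=0.9, p_enriched_start=0.1):
    """Most likely state path for probes ordered along the genome.

    States: 0 = background, 1 = ChIP-enriched. All parameter defaults
    are illustrative assumptions, not values from the paper.
    """
    means = (0.0, enriched_mean)
    log_trans = [[math.log(p_stay), math.log(1 - p_stay)],
                 [math.log(1 - p_stay), math.log(p_stay)]]
    log_start = [math.log(1 - p_enriched_start), math.log(p_enriched_start)]

    # best[s] holds the log probability of the best path ending in state s.
    best = [log_start[s] + log_normal_pdf(scores[0], means[s], sd) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at probe t + 1
    for x in scores[1:]:
        pointers, new_best = [], []
        for s in (0, 1):
            prev = max((0, 1), key=lambda r: best[r] + log_trans[r][s])
            pointers.append(prev)
            new_best.append(best[prev] + log_trans[prev][s]
                            + log_normal_pdf(x, means[s], sd))
        back.append(pointers)
        best = new_best

    # Trace back from the best final state.
    state = max((0, 1), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Made-up standardized probe scores, for demonstration only.
    demo_scores = [0.1, -0.3, 0.2, 3.1, 2.8, 3.4, 0.0, -0.2]
    print(viterbi_two_state(demo_scores))
```

In this sketch, runs of consecutive probes assigned state 1 would be reported as candidate binding regions. The transition probability p_stay controls how strongly neighboring probes are encouraged to share a state, which is what lets an HMM favor contiguous regions over isolated high-scoring probes.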

SUPPLEMENTARY INFORMATION

Available at http://genome.dfci.harvard.edu/~xsliu/HMMTiling/index.html.
