De Donato Marcos, Hussain Tanveer, Rodulfo Hectorina, Peters Sunday O, Imumorin Ikhide G, Thomas Bolaji N
Animal Genetics and Genomics Laboratory, Office of International Programs, College of Agriculture and Life Sciences, Cornell University, Ithaca, NY, USA.
Escuela de Bioingenierias, Tecnologico de Monterrey, Campus Querétaro, Santiago de Querétaro, Mexico.
Evol Bioinform Online. 2017 Jun 16;13:1176934317715238. doi: 10.1177/1176934317715238. eCollection 2017.
KCNQ1OT1 is located in the genomic region with the highest number of imprinted genes, but the mechanisms controlling the genes under its influence have not been fully elucidated. We therefore conducted a comparative analysis of the KCNQ1/KCNQ1OT1-CDKN1C region to study its conservation across the best-assembled eutherian mammalian genomes sequenced to date and to analyze elements that may be implicated in the control of genomic imprinting in this region. In human, mouse, cattle, and dog, these regions show a higher number of genes and CpG islands (detected using cpgplot from EMBOSS) but a lower number of repetitive elements (detected by RepeatMasker, including short and long interspersed nuclear elements) than the whole chromosomes on which they lie. The KCNQ1OT1-CDKN1C region contains the highest number of conserved noncoding sequences (CNS) among mammals, where we found 16 regions containing about 38 different highly conserved repetitive elements (using mVista), such as the LINE1 elements L1M4, L1MB7, HAL1, L1M4a, and L1Med and the LTR element MLT1H. Among these elements, we found 74 CNS showing high sequence identity (>70%) between human, cattle, and mouse, from which we identified 13 motifs (using Multiple Em for Motif Elicitation/Motif Alignment and Search Tool) with a significant probability of occurrence; the 3 most frequent motifs were used to search for transcription factor-binding sites. We detected binding sites for several transcription factors (using the JASPAR suite) from the SOX, FOX, and GATA families. A phylogenetic analysis of these CNS from human, marmoset, mouse, rat, cattle, dog, horse, and elephant shows branches with high levels of support and very similar phylogenetic relationships among these groups, confirming previous reports.
Our results suggest that functional DNA elements identified by comparative genomics in a region densely populated with imprinted mammalian genes may be related to the regulation of imprinted gene expression.
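The CpG island counts mentioned in the abstract rest on two simple sequence statistics: GC content and the ratio of observed to expected CpG dinucleotides. The sketch below illustrates how those statistics can be computed for a single stretch of DNA. It is not the cpgplot implementation, and the function names and thresholds (minimum length 200 bp, GC content above 50%, observed/expected ratio above 0.6) are assumptions taken from commonly used CpG island criteria rather than settings reported by the authors.

```python
def cpg_stats(seq):
    """Return (GC percent, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")          # observed CpG dinucleotides
    gc_percent = 100.0 * (c + g) / n
    expected = c * g / n           # expected CpG count if C and G were independent
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_percent, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=50.0, min_oe=0.6):
    """Apply the assumed thresholds to one candidate sequence (hypothetical helper)."""
    gc_percent, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_percent > min_gc and obs_exp > min_oe


if __name__ == "__main__":
    candidate = "CG" * 100         # toy sequence, not real genomic data
    print(cpg_stats(candidate))
    print(looks_like_cpg_island(candidate))
```

A real analysis would slide a window along the region and merge adjacent windows that pass the thresholds; this example only scores one sequence at a time.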