Hon Chung-Chau, Ramilowski Jordan A, Harshbarger Jayson, Bertin Nicolas, Rackham Owen J L, Gough Julian, Denisenko Elena, Schmeier Sebastian, Poulsen Thomas M, Severin Jessica, Lizio Marina, Kawaji Hideya, Kasukawa Takeya, Itoh Masayoshi, Burroughs A Maxwell, Noma Shohei, Djebali Sarah, Alam Tanvir, Medvedeva Yulia A, Testa Alison C, Lipovich Leonard, Yip Chi-Wai, Abugessaisa Imad, Mendez Mickaël, Hasegawa Akira, Tang Dave, Lassmann Timo, Heutink Peter, Babina Magda, Wells Christine A, Kojima Soichi, Nakamura Yukio, Suzuki Harukazu, Daub Carsten O, de Hoon Michiel J L, Arner Erik, Hayashizaki Yoshihide, Carninci Piero, Forrest Alistair R R
RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan.
RIKEN Omics Science Center (OSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan.
Nature. 2017 Mar 9;543(7644):199-204. doi: 10.1038/nature21374. Epub 2017 Mar 1.
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.