Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, China.
Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA, USA.
BMC Genomics. 2019 Nov 15;20(1):864. doi: 10.1186/s12864-019-6245-5.
Several studies have mined short-read RNA sequencing datasets to identify long non-coding RNAs (lncRNAs), and others have focused on the function of individual lncRNAs in abiotic stress response. However, our understanding of the complement, function and origin of lncRNAs, and especially of transposon-derived lncRNAs (TE-lncRNAs), in response to abiotic stress is still in its infancy.
We utilized a dataset of 127 RNA sequencing samples that included total RNA datasets and PacBio fl-cDNA data to discover lncRNAs in maize. Overall, we identified 23,309 candidate lncRNAs from polyA+ and total RNA samples, with a strong discovery bias within total RNA. The majority (65%) of the 23,309 lncRNAs had sequence similarity to transposable elements (TEs). Most had similarity to long-terminal-repeat retrotransposons from the Copia and Gypsy superfamilies, reflecting a high proportion of these elements in the genome. However, DNA transposons were enriched for lncRNAs relative to their genomic representation by ~2-fold. By assessing the fraction of lncRNAs that respond to abiotic stresses like heat, cold, salt and drought, we identified 1077 differentially expressed lncRNA transcripts, including 509 TE-lncRNAs. In general, the expression of these lncRNAs was significantly correlated with that of their nearest gene. By inferring co-expression networks across our large dataset, we found that 39 lncRNAs are major hubs in co-expression networks that respond to abiotic stress, and 18 appear to be derived from TEs.
Our results show that lncRNAs are enriched in total RNA samples, that most (65%) are derived from TEs, that at least 1077 are differentially expressed during abiotic stress, and that 39 are hubs in co-expression networks, including a small number that are evolutionarily conserved. These results suggest that lncRNAs, including TE-lncRNAs, may play key regulatory roles in moderating responses to abiotic stress.
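As a reading aid, the proportions implied by the reported counts can be worked out with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the variable names and the helper `share` are illustrative, and the TE-lncRNA count is only approximate because the 65% figure is reported as a rounded percentage.

```python
# Counts taken from the abstract; nothing here is new data.
TOTAL_LNCRNAS = 23_309   # candidate lncRNAs identified
TE_FRACTION = 0.65       # reported as a rounded percentage
DE_LNCRNAS = 1_077       # differentially expressed under abiotic stress
DE_TE_LNCRNAS = 509      # differentially expressed TE-lncRNAs
HUB_LNCRNAS = 39         # hubs in stress-responsive co-expression networks
HUB_TE_LNCRNAS = 18      # hubs that appear to be TE-derived


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Approximate only: derived from the rounded 65% figure.
approx_te_lncrnas = round(TOTAL_LNCRNAS * TE_FRACTION)

de_share = share(DE_LNCRNAS, TOTAL_LNCRNAS)       # DE lncRNAs among all candidates
de_te_share = share(DE_TE_LNCRNAS, DE_LNCRNAS)    # TE-derived among DE lncRNAs
hub_te_share = share(HUB_TE_LNCRNAS, HUB_LNCRNAS) # TE-derived among hub lncRNAs

print(f"~{approx_te_lncrnas} TE-lncRNAs (approximate)")
print(f"{de_share}% of candidate lncRNAs are differentially expressed")
print(f"{de_te_share}% of differentially expressed lncRNAs are TE-lncRNAs")
print(f"{hub_te_share}% of hub lncRNAs appear TE-derived")
```

On these figures, TE-derived transcripts make up a smaller share of the stress-responsive and hub lncRNAs (a little under half) than of the full candidate set (65%). The abstract itself does not test whether that difference is significant.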