Nikhil Shinde, Mohideen Habeeb Shaikh, Sella Raja Natesan
Membrane Protein Interaction Lab, Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, 603203, India.
Entomoinformatics Lab, Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, 603203, India.
J Mol Evol. 2024 Dec;92(6):720-743. doi: 10.1007/s00239-024-10198-5. Epub 2024 Sep 11.
Sorghum (Sorghum bicolor (L.) Moench) is a multipurpose crop grown for food, fodder, and bioenergy production. Its cultivated varieties, along with their wild counterparts, contribute to the core genetic pool. Despite the availability of several re-sequenced sorghum genomes, a variable portion of sorghum genomes is not reported during reference genome assembly and annotation. The present analysis used 223 publicly available RNA-seq datasets from seven sweet sorghum cultivars to construct a superTranscriptome. This approach yielded 45,864 Representative Transcript Assemblies (RTAs) that showed Presence/Absence Variation (PAV) across 15 published sorghum genomes. We found that 301 superTranscripts were exclusive to sweet sorghum, including 58 de novo genes encoding core and linker histones, zinc finger domains, glucosyl transferases, cellulose synthase, etc. The superTranscriptome added 2,802 new protein-coding genes to the Sweet Sorghum Reference Genome (SSRG), of which 559 code for different transcription factors (TFs). Our analysis revealed that MULE-like transposases were abundant in the sweet sorghum genome and could play a hidden role in the evolution of sweet sorghum. We observed large deletions in the D locus and terminal deletions in four other NAC-encoding loci in the SSRG compared to its wild progenitor (353), suggesting that non-functional NAC genes contributed to trait development in sweet sorghum. Moreover, superTranscript-based methods for Differential Exon Usage (DEU) and Differential Gene Expression (DGE) analyses were more accurate than those based on the SSRG. This study demonstrates that the superTranscriptome can enhance our understanding of fundamental sorghum mechanisms, improve genome annotations, and potentially even replace the reference genome.