Salem Mohamed, Paneru Bam, Al-Tobasei Rafet, Abdouni Fatima, Thorgaard Gary H, Rexroad Caird E, Yao Jianbo
Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, 37132, United States of America.
School of Biological Sciences and Center for Reproductive Biology, Washington State University, Pullman, Washington 99164, United States of America.
PLoS One. 2015 Mar 20;10(3):e0121778. doi: 10.1371/journal.pone.0121778. eCollection 2015.
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although that work contributed a wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not fully addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler, yielding 474,524 contigs > 500 base pairs. Of these, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000-32,000 genes (35-71% of the identified genes) accounted for the basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome.
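The abstract does not state the criteria used to call a gene housekeeping or tissue-specific, nor how transcriptome complexity was measured. The following is only a minimal sketch of one common way such quantities could be computed from a per-tissue expression table. All names (`top_gene_share`, `classify_genes`, `atlas`), the expression threshold, the rule "expressed in every tissue" versus "expressed in exactly one tissue", and the toy numbers are assumptions made for illustration, not the authors' method or data.

```python
def top_gene_share(expression, top_n):
    """Fraction of a tissue's total expression carried by its top_n genes.

    A high share suggests a low-complexity transcriptome (a few genes
    dominate); a low share suggests a more complex one.
    """
    values = sorted(expression.values(), reverse=True)
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(values[:top_n]) / total


def classify_genes(atlas, threshold=1.0):
    """Split genes into housekeeping and tissue-specific sets.

    atlas maps tissue name -> {gene: expression value}.
    Assumed rule (hypothetical): a gene counts as expressed in a tissue when
    its value is >= threshold; housekeeping means expressed in every tissue,
    tissue-specific means expressed in exactly one tissue.
    """
    tissues = list(atlas)
    genes = set().union(*(atlas[t].keys() for t in tissues))
    housekeeping = set()
    tissue_specific = {}
    for gene in genes:
        expressed_in = [t for t in tissues if atlas[t].get(gene, 0.0) >= threshold]
        if len(expressed_in) == len(tissues):
            housekeeping.add(gene)
        elif len(expressed_in) == 1:
            tissue_specific[gene] = expressed_in[0]
    return housekeeping, tissue_specific


# Toy expression values, invented purely for illustration.
atlas = {
    "white_muscle": {"geneA": 900.0, "geneB": 50.0, "geneC": 50.0, "geneD": 0.0},
    "brain": {"geneA": 30.0, "geneB": 25.0, "geneC": 25.0, "geneD": 20.0},
}

muscle_share = top_gene_share(atlas["white_muscle"], top_n=1)
brain_share = top_gene_share(atlas["brain"], top_n=1)
housekeeping, tissue_specific = classify_genes(atlas)
```

In this toy table a single gene carries most of the white muscle signal while the brain signal is spread more evenly, which mirrors the kind of contrast the abstract describes between low-complexity and high-complexity tissues.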