Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data.

Affiliation

Key Laboratory of Crop Gene Resources and Germplasm Enhancement, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Zhongguancun, Beijing, People's Republic of China.

Publication information

BMC Genomics. 2012 Aug 14;13:392. doi: 10.1186/1471-2164-13-392.

DOI:10.1186/1471-2164-13-392

PMID:22891638

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3485621/

Abstract

BACKGROUND

Rapid advances in next-generation sequencing methods have provided new opportunities for transcriptome sequencing (RNA-Seq). The unprecedented sequencing depth provided by RNA-Seq makes it a powerful and cost-efficient method for transcriptome study, and it has been widely used in both model and non-model organisms to identify and quantify RNA. For non-model organisms lacking well-defined genomes, de novo assembly is typically required for downstream RNA-Seq analyses, including SNP discovery and identification of genes differentially expressed between phenotypes. Although RNA-Seq has been successfully used to sequence many non-model organisms, the results of de novo assembly from short reads can still be improved by using recent bioinformatic developments.

RESULTS

In this study, we used 212.6 million paired-end reads, totaling 16.2 Gb, to assemble the hexaploid wheat transcriptome. We used two state-of-the-art assemblers, Trinity and Trans-ABySS, which use the single and multiple k-mer methods, respectively, and divided the de novo assembly process into four steps: pre-assembly, merging of different samples, removal of redundancy, and scaffolding. We documented each step in detail, along with its influence on assembly performance, to gain insight into transcriptome assembly from short reads. After optimization, the assembled transcripts were comparable to Sanger-derived ESTs in both continuity and accuracy. We also provide a considerable amount of new wheat transcript data to the community.

CONCLUSIONS

It is feasible to assemble the hexaploid wheat transcriptome from short reads. Special attention should be paid to dealing with multiple samples to balance the spectrum of expression levels and redundancy. To obtain an accurate overview of RNA profiling, removal of redundancy may be crucial in de novo assembly.
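As a rough illustration of why redundancy removal matters after merging assemblies from multiple samples, the sketch below drops any transcript that is exactly contained, on either strand, in a longer one, and reports N50 as a simple continuity measure. This is a toy under stated assumptions, not the authors' procedure: the abstract does not say which tool or similarity threshold was used, and real pipelines typically cluster by sequence similarity rather than exact containment. The function names and the example sequences are hypothetical.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def remove_redundancy(transcripts: Dict[str, str]) -> Dict[str, str]:
    """Keep only transcripts not exactly contained in a longer kept one.

    Containment is checked on both strands. This is a simplification:
    it ignores mismatches, partial overlaps, and homoeologous copies.
    """
    kept: Dict[str, str] = {}
    # Visit longest sequences first so shorter ones are compared against them.
    ordered = sorted(transcripts.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        seq = seq.upper()
        rc = reverse_complement(seq)
        if any(seq in other or rc in other for other in kept.values()):
            continue  # redundant: already represented by a longer transcript
        kept[name] = seq
    return kept


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    # Hypothetical contigs merged from three per-sample assemblies.
    merged = {
        "s1_c1": "ATGGCGTACGTTAGC",
        "s2_c1": "GCGTACGTT",      # fragment of s1_c1
        "s2_c2": "GCTAACGTA",      # reverse-complement fragment of s1_c1
        "s3_c1": "TTTTCCCCGGGG",   # unrelated transcript
    }
    nonredundant = remove_redundancy(merged)
    print("kept:", sorted(nonredundant))
    print("N50 before:", n50([len(s) for s in merged.values()]))
    print("N50 after:", n50([len(s) for s in nonredundant.values()]))
```

In this toy case the two fragments inflate the transcript count and lower the N50 of the merged set, which is the kind of distortion the conclusions warn about when multiple samples are combined.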
