Evaluation of viral genome assembly and diversity estimation in deep metagenomes.

Authors

Aguirre de Cárcer Daniel, Angly Florent E, Alcamí Antonio

Affiliation

Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Autónoma de Madrid, Madrid, Spain.

Publication

BMC Genomics. 2014 Nov 18;15(1):989. doi: 10.1186/1471-2164-15-989.

Abstract

BACKGROUND

Viruses have unique properties, such as small genomes and regions of high sequence similarity, whose effects on metagenomic assemblies have not yet been characterized. This study uses diverse in silico simulated viromes to evaluate how extensively genomes can be assembled using different sequencing platforms and assemblers. It also investigates the suitability of different methods for estimating viral diversity in metagenomes.

RESULTS

We created in silico metagenomes mimicking various platforms at different sequencing depths. The CLC assembler performed worse than IDBA_UD and CAMERA, which are metagenome-specific. Up to a saturation point, Illumina platforms proved more capable than 454 of reconstructing large portions of viral genomes. Read length was an important factor in limiting chimericity, while scaffolding only marginally improved contig length and accuracy. The genome length of the various viruses in the metagenomes did not significantly affect genome reconstruction, but the co-existence of highly similar genomes was detrimental. When evaluating diversity estimation tools, we found that PHACCS results were more accurate than those from CatchAll and clustering, both of which yielded estimates orders of magnitude above the expected values.
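To make the notion of "sequencing depth" in a simulated virome concrete, the following is a minimal sketch of how expected per-genome coverage follows from read count, read length, and community composition. It is an illustration only, not the simulation procedure used in the study. The names `Genome` and `expected_coverage`, the toy community, and the assumption that reads are drawn in proportion to each genome's share of total DNA are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Genome:
    name: str
    length: int       # genome length in bp
    abundance: float  # relative abundance of genome copies in the community


def expected_coverage(genomes, n_reads, read_length):
    """Return the expected mean fold-coverage of each genome.

    Assumption (not from the paper): reads are sampled in proportion to
    each genome's share of total DNA, i.e. abundance * length.
    """
    total_dna = sum(g.abundance * g.length for g in genomes)
    coverage = {}
    for g in genomes:
        # Fraction of all reads expected to originate from this genome.
        read_share = g.abundance * g.length / total_dna
        # Sequenced bases from this genome divided by its length.
        coverage[g.name] = n_reads * read_share * read_length / g.length
    return coverage


if __name__ == "__main__":
    # Hypothetical two-member community with equal copy numbers.
    community = [
        Genome("phage_A", length=40_000, abundance=0.5),
        Genome("phage_B", length=5_000, abundance=0.5),
    ]
    for name, cov in expected_coverage(community, n_reads=45_000, read_length=100).items():
        print(f"{name}: {cov:.1f}x")
```

Under this sampling assumption, expected coverage depends on copy-number abundance and total sequencing effort but not on genome length, so both toy genomes receive about 100x coverage.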

CONCLUSIONS

Assemblers designed specifically for the analysis of metagenomes should be used to facilitate the creation of high-quality long contigs. Even at high coverage, researchers should not always expect to obtain complete genomes, because reconstruction may be hindered by co-existing species bearing highly similar genomic regions. Further development of metagenomics-oriented assemblers may help bypass these limitations in future studies. Meanwhile, the lack of fully reconstructed communities keeps methods for estimating viral diversity relevant. While none of the three methods tested had absolute precision, only PHACCS was deemed suitable for comparative studies.
