A draft human pangenome reference.

Affiliations

Department of Genetics, Yale University School of Medicine, New Haven, CT, USA.

Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA.

Publication information

Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10.

Abstract

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
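
The headline figures above can be put in proportion with a little arithmetic. The following is a minimal sketch that uses only the rounded numbers quoted in the abstract; the variable names are illustrative and are not taken from the paper, and the derived ratios are approximations rather than results reported by the authors.

```python
# Rounded figures as quoted in the abstract (not exact values from the paper).
added_bp = 119_000_000   # euchromatic polymorphic sequence added relative to GRCh38
sv_bp = 90_000_000       # "roughly 90 million" of those base pairs come from structural variation

# Approximate share of the added sequence attributable to structural variation.
sv_fraction = sv_bp / added_bp

error_reduction = 0.34   # small variant discovery errors reduced by 34%
sv_increase = 1.04       # structural variants detected per haplotype increased by 104%

# A 34% reduction leaves about 0.66x the errors of the GRCh38-based workflow.
remaining_error_ratio = 1 - error_reduction

# A 104% increase corresponds to about 2.04x as many structural variants per haplotype.
sv_detection_ratio = 1 + sv_increase

print(f"Share of added sequence from structural variation: {sv_fraction:.0%}")
print(f"Small variant errors relative to GRCh38 workflow: {remaining_error_ratio:.2f}x")
print(f"Structural variants per haplotype relative to GRCh38 workflow: {sv_detection_ratio:.2f}x")
```

Read this way, about three quarters of the added sequence is attributed to structural variation, and a 104% increase means roughly double the number of structural variants detected per haplotype.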

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4ab/10172123/33aaca4d3c91/41586_2023_5896_Fig1_HTML.jpg
