Liftoff: accurate mapping of gene annotations.

Authors

Shumate Alaina, Salzberg Steven L

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.

Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD 21211, USA.

Publication information

Bioinformatics. 2021 Jul 19;37(12):1639-1643. doi: 10.1093/bioinformatics/btaa1016.

Abstract

MOTIVATION

Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of these genomes, annotation of gene features and other functional elements is essential; however, for most species, only the reference genome is well-annotated.

RESULTS

One strategy to annotate new or improved genome assemblies is to map or 'lift over' the genes from a previously annotated reference genome. Here, we describe Liftoff, a new genome annotation lift-over tool capable of mapping genes between two assemblies of the same or closely related species. Liftoff aligns genes from a reference genome to a target genome and finds the mapping that maximizes sequence identity while preserving the structure of each exon, transcript and gene. We show that Liftoff can accurately map 99.9% of genes between two versions of the human reference genome with an average sequence identity >99.9%. We also show that Liftoff can map genes across species by successfully lifting over 98.3% of human protein-coding genes to a chimpanzee genome assembly with 98.2% sequence identity.
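
The selection criterion described above (maximize sequence identity while preserving exon structure) can be illustrated with a toy sketch. This is not Liftoff's actual implementation: the `ExonHit` record, the helper names and the simplified structure check (plus-strand only, exons in reference order and non-overlapping) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ExonHit:
    """Hypothetical record: one reference exon placed on the target genome."""
    exon_index: int       # position of the exon within the reference transcript
    start: int            # 0-based start on the target sequence
    end: int              # exclusive end on the target sequence
    matches: int          # number of identical aligned bases
    aligned_length: int   # number of aligned columns


def preserves_structure(hits: Sequence[ExonHit]) -> bool:
    """Toy check: exons keep their reference order and do not overlap."""
    ordered = sorted(hits, key=lambda h: h.exon_index)
    return all(a.end <= b.start for a, b in zip(ordered, ordered[1:]))


def identity(hits: Sequence[ExonHit]) -> float:
    """Fraction of aligned columns that are identical, pooled over all exons."""
    total = sum(h.aligned_length for h in hits)
    return sum(h.matches for h in hits) / total if total else 0.0


def best_mapping(candidates: Sequence[List[ExonHit]]) -> Optional[List[ExonHit]]:
    """Return the structure-preserving candidate with the highest identity."""
    valid = [c for c in candidates if preserves_structure(c)]
    return max(valid, key=identity, default=None)


# Candidate A keeps exon order; candidate B matches better but swaps the exons.
candidate_a = [ExonHit(0, 1000, 1100, 98, 100), ExonHit(1, 1500, 1600, 95, 100)]
candidate_b = [ExonHit(0, 5500, 5600, 100, 100), ExonHit(1, 5000, 5100, 100, 100)]
chosen = best_mapping([candidate_a, candidate_b])
```

In this example candidate B is rejected despite its higher identity because its exons appear in the wrong order on the target, so `chosen` is candidate A.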

AVAILABILITY AND IMPLEMENTATION

Liftoff can be installed via bioconda and PyPI. In addition, the source code for Liftoff is available at https://github.com/agshumate/Liftoff.
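
As a rough sketch of how an installed copy might be driven from a script, the snippet below assembles a basic command line without executing it. The `-g` (reference annotation) and `-o` (output file) options and the target-then-reference argument order follow the usage shown in the project README; the file names and the `build_liftoff_command` helper are hypothetical, and options should be confirmed with `liftoff -h` for the installed version.

```python
import shutil
from typing import List

# Installation, as stated above, is via bioconda or PyPI, for example:
#   conda install -c bioconda liftoff
#   pip install Liftoff
# (package names as given in the project README; verify before use)


def build_liftoff_command(reference_gff: str, target_fasta: str,
                          reference_fasta: str, output_gff: str) -> List[str]:
    """Assemble a basic Liftoff invocation (hypothetical helper)."""
    return ["liftoff", "-g", reference_gff, "-o", output_gff,
            target_fasta, reference_fasta]


# Placeholder file names, not real data.
cmd = build_liftoff_command("reference.gff3", "target.fa",
                            "reference.fa", "target_lifted.gff3")

if shutil.which("liftoff") is None:
    print("liftoff not found on PATH; install it first")
else:
    print("would run:", " ".join(cmd))
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the input files exist.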

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
