HTSeq--a Python framework to work with high-throughput sequencing data.

Authors

Simon Anders, Paul Theodor Pyl, Wolfgang Huber

Affiliation

Genome Biology Unit, European Molecular Biology Laboratory, 69111 Heidelberg, Germany.

Publication

Bioinformatics. 2015 Jan 15;31(2):166-9. doi: 10.1093/bioinformatics/btu638. Epub 2014 Sep 25.

Abstract

MOTIVATION

A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.

RESULTS

We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
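The counting step described above can be illustrated with a minimal, library-free sketch. It does not use HTSeq and is not the htseq-count implementation. The toy gene table, the read tuples, the function names, the "no_feature" and "ambiguous" labels, and the rule that a read overlapping more than one gene is set aside as ambiguous are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy gene model: name -> (chromosome, start, end),
# using 0-based, half-open coordinates.
GENES = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 150, 300),
    "geneC": ("chr2", 0, 50),
}


def overlapping_genes(chrom, start, end, genes=GENES):
    """Return the set of gene names whose interval overlaps [start, end) on chrom."""
    return {
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and g_start < end and start < g_end
    }


def count_reads(reads, genes=GENES):
    """Count reads per gene; reads hitting zero or several genes go to special bins."""
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = overlapping_genes(chrom, start, end, genes)
        if not hits:
            counts["no_feature"] += 1   # read overlaps no gene
        elif len(hits) > 1:
            counts["ambiguous"] += 1    # read overlaps several genes (assumed rule)
        else:
            counts[next(iter(hits))] += 1
    return counts


# Hypothetical aligned reads as (chromosome, start, end) tuples.
READS = [
    ("chr1", 110, 140),  # inside geneA only
    ("chr1", 160, 190),  # inside both geneA and geneB
    ("chr1", 250, 280),  # inside geneB only
    ("chr1", 400, 430),  # outside every gene
    ("chr2", 10, 40),    # inside geneC
]

if __name__ == "__main__":
    for name, n in sorted(count_reads(READS).items()):
        print(name, n)
```

The linear scan over all genes keeps the sketch short. A coordinate-indexed structure of the kind the abstract mentions would avoid scanning every gene for every read.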

AVAILABILITY AND IMPLEMENTATION

HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
