The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.

Affiliation

Plant Pathology, SCRI, Invergowrie, Dundee DD2 5DA, UK.

Publication information

Nucleic Acids Res. 2010 Apr;38(6):1767-71. doi: 10.1093/nar/gkp1137. Epub 2009 Dec 16.

Abstract

FASTQ has emerged as a common file format for sharing sequencing read data combining both the sequence and an associated per base quality score, despite lacking any formal definition to date, and existing in at least three incompatible variants. This article defines the FASTQ format, covering the original Sanger standard, the Solexa/Illumina variants and conversion between them, based on publicly available information such as the MAQ documentation and conventions recently agreed by the Open Bioinformatics Foundation projects Biopython, BioPerl, BioRuby, BioJava and EMBOSS. Being an open access publication, it is hoped that this description, with the example files provided as Supplementary Data, will serve in future as a reference for this important file format.
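
The conversion between variants mentioned above can be illustrated with a minimal sketch. The abstract itself does not state the encodings, so the following rests on the widely documented conventions: Sanger FASTQ stores PHRED scores at ASCII offset 33, Solexa FASTQ stores Solexa scores at offset 64, and Illumina 1.3+ FASTQ stores PHRED scores at offset 64. The function names are hypothetical and chosen only for illustration; this is not the reference implementation from any of the named projects.

```python
import math

# Assumed offsets from the commonly documented conventions (not given in the abstract).
SANGER_OFFSET = 33    # Sanger variant: PHRED scores, ASCII offset 33
SOLEXA_OFFSET = 64    # Solexa variant: Solexa scores, ASCII offset 64
ILLUMINA_OFFSET = 64  # Illumina 1.3+ variant: PHRED scores, ASCII offset 64


def solexa_to_phred(q_solexa: int) -> int:
    """Map a Solexa score to the nearest integer PHRED score."""
    return round(10 * math.log10(10 ** (q_solexa / 10) + 1))


def phred_to_solexa(q_phred: int) -> int:
    """Map a PHRED score to the nearest integer Solexa score (floor of -5)."""
    if q_phred == 0:
        # log10(0) is undefined; use the lowest Solexa score by convention.
        return -5
    return max(-5, round(10 * math.log10(10 ** (q_phred / 10) - 1)))


def illumina_to_sanger(quality: str) -> str:
    """Re-encode an Illumina 1.3+ quality string as Sanger (offset shift only)."""
    return "".join(chr(ord(c) - ILLUMINA_OFFSET + SANGER_OFFSET) for c in quality)


def solexa_to_sanger(quality: str) -> str:
    """Re-encode a Solexa quality string as Sanger (score mapping plus offset shift)."""
    return "".join(
        chr(solexa_to_phred(ord(c) - SOLEXA_OFFSET) + SANGER_OFFSET)
        for c in quality
    )
```

Converting from Illumina 1.3+ to Sanger only shifts the ASCII offset, whereas converting from Solexa also remaps the score itself. The two scales nearly coincide at high quality but diverge at low quality, which is why the variants cannot be treated as interchangeable.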

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0992/2847217/c81bf3b7c984/gkp1137f1.jpg
