Raeisi Dehkordi Siavash, Luebeck Jens, Bafna Vineet
Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA.
Bioinformatics & Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA.
Patterns (N Y). 2021 May 3;2(5):100248. doi: 10.1016/j.patter.2021.100248. eCollection 2021 May 14.
Optical mapping (OM) provides single-molecule readouts of fluorescently labeled sequence motifs on long fragments of DNA, resolved to nucleotide-level coordinates. With the advent of microfluidic technologies for analysis of DNA molecules, it is possible to inexpensively generate long OM data ( kbp) at high coverage. In addition to scaffolding for assembly, OM data can be aligned to a reference genome to identify genomic structural variants. We introduce FaNDOM (Fast Nested Distance Seeding of Optical Maps), an optical map alignment tool that greatly reduces the search space of the alignment process. On four benchmark human datasets, FaNDOM was significantly (4-14×) faster than competing tools while maintaining comparable sensitivity and specificity. We used FaNDOM to map variants in three cancer cell lines and identified many biologically interesting structural variants, including deletions, duplications, gene fusions, and gene-disrupting rearrangements. FaNDOM is publicly available at https://github.com/jluebeck/FaNDOM.