Assembly of long error-prone reads using de Bruijn graphs

Proc Natl Acad Sci U S A. 2016 Dec 27;113(52):E8396-E8405. doi: 10.1073/pnas.1604560113. Epub 2016 Dec 12.

Abstract

The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.
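For readers unfamiliar with the data structure named in the abstract, the following is a minimal sketch of the classical (textbook) de Bruijn graph built from exact k-mers. It is not the generalized construction used by ABruijn, which the abstract does not detail. The function name `de_bruijn_graph`, the toy reads, and the choice k = 4 are illustrative assumptions.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Textbook de Bruijn graph (an illustrative sketch, not ABruijn itself).

    Nodes are (k-1)-mers; each distinct k-mer found in the reads contributes
    one edge from its (k-1)-prefix to its (k-1)-suffix.
    """
    # Collect every distinct k-mer occurring in any read.
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    # Connect the prefix node of each k-mer to its suffix node.
    graph = defaultdict(set)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].add(kmer[1:])

    # Return plain sorted lists so the result is deterministic.
    return {node: sorted(successors) for node, successors in graph.items()}


# Hypothetical error-free toy reads that overlap on "GTAC".
reads = ["ACGTAC", "GTACGG"]
graph = de_bruijn_graph(reads, k=4)
for node, successors in graph.items():
    print(node, "->", ", ".join(successors))
```

In this exact-k-mer construction, a single substitution error in a read can alter up to k of its k-mers, which is one reason the approach has usually been associated with short, accurate reads.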

Keywords: de Bruijn graph; genome assembly; single-molecule sequencing.

MeSH terms

  • Algorithms
  • Benchmarking
  • Escherichia coli / genetics
  • Genomics
  • High-Throughput Nucleotide Sequencing / methods*
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*
  • Software
  • Xanthomonas / genetics