ABPSE: DNA Aligner Based on Bit-level Parallelism and the Seed and Extend Strategy
Keywords: DNA, bioinformatics, Myers, seed-and-extend, FM index
DNA alignment is a key step in assembling genomes from the millions of short reads produced by massively parallel sequencing machines. It is usually performed with algorithms of high space and time complexity, which take hours to deliver results and require tens of GB of RAM. This has prompted the search for new algorithms and strategies that shorten runtimes while keeping the memory footprint small. In this article, we present ABPSE, a new DNA aligner that combines the Ferragina and Manzini algorithm (the FM index) with the Myers algorithm by means of the seed-and-extend strategy. In the seeding stage, the FM index allows rapid identification of the reference regions with a high probability of alignment. In the extension stage, the Myers algorithm refines the alignment using bit-vector operations, which compute several cells of the dynamic programming matrix simultaneously. The results show 96.1% of reads correctly aligned, a speedup of 2.45x relative to BWA-SW, and a memory footprint of only 7.6 GB when aligning against the entire human genome.
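The seed-and-extend combination described in the abstract can be illustrated with a minimal Python sketch. It is not ABPSE's implementation: the function names (`build_fm_index`, `fm_locate`, `myers_best`, `seed_and_extend`), the choice of the read prefix as the seed, and the window size are all assumptions made for illustration. The index is built from a naively sorted suffix array, which is only practical for toy inputs, and the bit vectors are held in Python integers rather than machine words.

```python
def build_fm_index(ref):
    """Toy FM index: suffix array, C table and Occ table (naive construction)."""
    text = ref + "$"  # sentinel, sorts before A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    c_table, total = {}, 0  # c_table[ch]: count of characters smaller than ch
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}  # occ[ch][i]: ch in bwt[:i]
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def fm_locate(index, seed):
    """Backward search: sorted start positions of exact occurrences of seed."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)
    for ch in reversed(seed):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


def myers_best(pattern, text):
    """Myers bit-vector matching: (lowest edit distance, end index in text)."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    mask, high = (1 << m) - 1, 1 << (m - 1)
    peq = {}  # per-character bitmask of pattern positions
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    pv, mv, score = mask, 0, m  # vertical +1 / -1 deltas and current bottom cell
    best = (m, -1)
    for j, ch in enumerate(text):
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask
        mh = pv & xh
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        if score < best[0]:
            best = (score, j)
    return best


def seed_and_extend(read, ref, index, seed_len, max_errors):
    """Seed with the read prefix, then verify each candidate window with Myers."""
    best = None
    for pos in fm_locate(index, read[:seed_len]):
        window = ref[pos:pos + len(read) + max_errors]
        score, _ = myers_best(read, window)
        if score <= max_errors and (best is None or score < best[1]):
            best = (pos, score)
    return best  # (reference position of the seed, edit distance) or None
```

Each column of the dynamic programming matrix is updated with a fixed number of bitwise operations on `pv` and `mv`, which is what lets the extension stage process many cells at once. A real aligner would typically try several seeds per read and both strands; this sketch uses a single forward-strand seed to stay short.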
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Upon acceptance of an article in the RMIB, corresponding authors will be asked to complete and sign the copyright and journal publishing agreement, which authorizes the RMIB to publish the document in any media without limitations and at no cost. Authors may reuse parts of the paper in other documents and reproduce part or all of it for their personal use, provided that a bibliographic reference to the RMIB is included. However, written permission from the Publisher is required for resale or distribution outside the corresponding author's institution and for all other derivative works, including compilations and translations.