Low-Rank Methods for Parallelizing Dynamic Programming Algorithms

Authors:
Saeed Maleki, University of Illinois at Urbana-Champaign
Madanlal Musuvathi, Microsoft Research
Todd Mytkowicz, Microsoft Research

ACM Transactions on Parallel Computing, Volume 2, Issue 4, Article No. 26, pp. 1–32. https://doi.org/10.1145/2884065

Published: 24 February 2016

Abstract

This article proposes efficient parallel methods for an important class of dynamic programming problems that includes Viterbi, Needleman-Wunsch, Smith-Waterman, and Longest Common Subsequence. In dynamic programming, subproblems that do not depend on each other, and thus can be computed in parallel, form stages or wavefronts. The methods presented in this article provide additional parallelism, allowing multiple stages to be computed in parallel despite the dependencies among them. The correctness and performance of the algorithm rely on rank convergence properties of matrix multiplication in the tropical semiring, in which plus is the multiplicative operation and max is the additive operation.

This article demonstrates the efficiency of the parallel algorithm by showing significant speedups on a variety of important dynamic programming problems. In particular, the parallel Viterbi decoder is up to 24× faster (with 64 processors) than a highly optimized commercial baseline.
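
The tropical-semiring product that the abstract refers to can be illustrated with a small sketch. This is not the authors' implementation: the function names `tropical_matvec` and `are_parallel` and the example matrix are assumptions made for illustration, and all weights are assumed to be finite integers. The sketch shows the property behind rank convergence: when a stage's matrix has tropical rank 1 (every entry has the form `c[i] + r[j]`), any two input vectors produce outputs that differ only by a constant offset, so the choice of starting vector does not change which entries are optimal relative to one another.

```python
def tropical_matvec(A, v):
    """Tropical (max, +) matrix-vector product: out[i] = max_j (A[i][j] + v[j])."""
    return [max(a + x for a, x in zip(row, v)) for row in A]


def are_parallel(u, w):
    """True if u and w differ by one constant offset (assumes finite entries)."""
    return len({a - b for a, b in zip(u, w)}) == 1


# Hypothetical rank-1 tropical matrix: A[i][j] = c[i] + r[j].
c = [0, 2, 5]
r = [1, 0, 3]
A = [[ci + rj for rj in r] for ci in c]

# Two unrelated input vectors fed to the same stage.
u = tropical_matvec(A, [0, 0, 0])    # [3, 5, 8]
w = tropical_matvec(A, [7, -4, 1])   # [8, 10, 13]

print(are_parallel(u, w))  # True: the outputs differ by a constant offset of 5
```

Because the two outputs agree up to an offset, a processor that starts a later stage from an arbitrary vector can still recover the same relative solution once the accumulated product has converged to rank 1, which is the kind of cross-stage parallelism the abstract describes.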

Index Terms

Low-Rank Methods for Parallelizing Dynamic Programming Algorithms
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent algorithms
2. Theory of computation
  1. Design and analysis of algorithms
    1. Parallel algorithms

Published in
ACM Transactions on Parallel Computing, Volume 2, Issue 4
Special Issue on PPoPP 2014
March 2016
202 pages
ISSN: 2329-4949
EISSN: 2329-4957
DOI: 10.1145/2888415
Editor: Phillip B. Gibbons, Carnegie Mellon University, Pittsburgh, USA
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 February 2016
- Accepted: 1 December 2015
- Revised: 1 November 2015
- Received: 1 January 2015
Author Tags
Needleman-Wunsch
Parallelism
dynamic programming
longest common subsequence
tropical semiring
Qualifiers
- research-article
- Research
- Refereed