Efficient parallelization using rank convergence in dynamic programming algorithms

Published: 22 September 2016
Skip Abstract Section

Abstract

This paper proposes an efficient parallel algorithm for an important class of dynamic programming problems that includes Viterbi, Needleman-Wunsch, Smith-Waterman, and Longest Common Subsequence. In dynamic programming, the subproblems that do not depend on each other, and thus can be computed in parallel, form stages, or wavefronts. The algorithm presented in this paper provides additional parallelism, allowing multiple stages to be computed in parallel despite dependences among them. The correctness and the performance of the algorithm rely on rank convergence properties of matrix multiplication in the tropical semiring, formed with plus as the multiplicative operation and max as the additive operation.
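
As a concrete illustration of the primitive the abstract refers to, the sketch below (not taken from the paper; the function name and the small example matrices are invented for illustration) multiplies matrices in the tropical semiring, where "addition" is max and "multiplication" is ordinary plus. A dynamic programming stage update can be expressed as a matrix-vector product in this semiring, and composing per-stage matrices with this product is what lets separate processors work on different stages; it is the rank-convergence behavior of such products that, per the abstract, the algorithm's correctness and performance rest on.

    // Minimal C++ sketch of a max-plus ("tropical") matrix product:
    // C[i][j] = max over k of (A[i][k] + B[k][j]).
    #include <vector>
    #include <algorithm>
    #include <limits>
    #include <cstdio>

    using Matrix = std::vector<std::vector<double>>;
    const double NEG_INF = -std::numeric_limits<double>::infinity();  // semiring "zero"

    Matrix maxPlusMultiply(const Matrix& A, const Matrix& B) {
        size_t n = A.size(), p = B.size(), m = B[0].size();
        Matrix C(n, std::vector<double>(m, NEG_INF));
        for (size_t i = 0; i < n; ++i)
            for (size_t k = 0; k < p; ++k)
                for (size_t j = 0; j < m; ++j)
                    C[i][j] = std::max(C[i][j], A[i][k] + B[k][j]);
        return C;
    }

    int main() {
        // Two hypothetical 2x2 stage matrices; their tropical product gives the
        // combined transition across both stages in a single matrix.
        Matrix A = {{0.0, -1.0}, {-2.0, 0.5}};
        Matrix B = {{0.3, NEG_INF}, {-0.7, 1.0}};
        Matrix AB = maxPlusMultiply(A, B);
        for (const auto& row : AB) {
            for (double v : row) std::printf("%8.2f", v);
            std::printf("\n");
        }
        return 0;
    }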

This paper demonstrates the efficiency of the parallel algorithm by showing significant speedups on a variety of important dynamic programming problems. In particular, the parallel Viterbi decoder is up to 24× faster (with 64 processors) than a highly optimized commercial baseline.

Published in

Communications of the ACM, Volume 59, Issue 10 (October 2016), 85 pages
ISSN: 0001-0782; EISSN: 1557-7317
DOI: 10.1145/3001840
Editor: Moshe Y. Vardi
Copyright © 2016 ACM
Publisher: Association for Computing Machinery, New York, NY, United States