skip to main content
10.1145/1531743.1531755acmconferencesArticle/Chapter ViewAbstractPublication PagescfConference Proceedingsconference-collections
research-article

Quantitative analysis of sequence alignment applications on multiprocessor architectures

Published:18 May 2009Publication History

ABSTRACT

The exponential growth of databases that contains biological information (such as protein and DNA data) demands great efforts to improve the performance of computational platforms. In this work we investigate how bioinformatics applications benefit from parallel architectures that combine different alternatives to exploit coarse- and fine-grain parallelism. As a case of analysis we study the performance behavior of the Ssearch application that implements the Smith-Waterman algorithm, which is a dynamic programing approach that explores the similarity between a pair of sequences. The inherent large parallelism of the algorithm makes it ideal for architectures supporting multiple dimensions of parallelism (TLP, DLP and ILP). We study how this algorithm can take advantage of different parallel machines like the SGI Altix, IBM Power6, Cell BE machines and MareNostrum. Our results show that a share memory architecture like the PowerPC 970MP of Marenostrum can surpass a heterogeneous machine like the current Cell BE. Our quantitative analysis includes not only a study of scalability of the performance in terms of speedup, but also includes the analysis of bottlenecks in the execution of the application. This analysis is carried out through the study of the execution phases that the application presents.

References

  1. Dna data bank of japan. http://www.ddbj.nig.ac.jp/.Google ScholarGoogle Scholar
  2. Embl database. http://www.ebi.ac.uk/embl/.Google ScholarGoogle Scholar
  3. Marenostrum architecture. http://www.bsc.es/plantillaA.php?cat\_id=5.Google ScholarGoogle Scholar
  4. Swissprot protein database. http://www.expasy.org/sprot/.Google ScholarGoogle Scholar
  5. Bioinformatics market study for washington technology center, June 2003. www.altabiomedical.com.Google ScholarGoogle Scholar
  6. S. F. Altschul, T.L, A. Schffer, J. Zhang, Z. Zhang, W. Miller, and D. Lipman. Gapped blast and psi-blast: a new generation of protein database serach programs. Nucleic acids research, 25:3389--3402, 1997.Google ScholarGoogle Scholar
  7. A. Boukerche, A. C. M. A. de Melo, M. Ayala-Rincon, and T. M. Santana. Parallel strategies for local biological sequence alignment in a cluster of workstations. In IPDPS '05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 15. IEEE Computer Society, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Deshpande, D. Richards, and W. Pearson. A platform for biological sequence comparison on parallel computers. Comput. Appl. Biosci., 1991.Google ScholarGoogle ScholarCross RefCross Ref
  9. J. Henikoff, S. Henikoff and S. Pietrokovski. Blocks: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics, 15, 1999.Google ScholarGoogle Scholar
  10. R. Hughey. Parallel hardware for sequence comparison and alignment. Proceedings of Int. Conf. Application Specific-Array Processors. IEEE Computer Society, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Isaza, F. Sanchez, G. Gaydadjiev, A. Ramirez, and M. Valero. Preliminary analysis of the cell be processor limitations for sequence alignment applications. SAMOS 2008: Proceedings of the 8th international workshop on Embedded Computer Systems, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, and D. Shippy. Introduction to the cell multiprocessor. IBM Systems Journal, 49(4/5):589--604, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. H. Q. Le, W. J. Starke, J. S. Fields, F. P. O'Connell, D. Q. Nguyen, B. J. Ronchetti, W. M. Sauer, E. M. Schwarz, and M. T. Vaden. Ibm power6 microarchitecture. IBM J. Res. Dev., 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. W. Liu and B. Schmidt. Parallel design for computational biology and scientific computing applications. IEEE International Conference on Cluster Computing (CLUSTER 03), 2003.Google ScholarGoogle Scholar
  15. S. A. Manavski and G. Valle. Cuda compatible gpu cards as efficient hardwarer accelerator for smith-waterman sequence alignment. BMC Bioinformatics, 9, 2008.Google ScholarGoogle Scholar
  16. P. L. Miller, P. M. Nadkarni, and W. R. Pearson. Comparing machine-independent versus machine-specific parallelization of a software platform for biological sequence comparison. Comput. Appl. Biosci., 1992.Google ScholarGoogle ScholarCross RefCross Ref
  17. W. R. Pearson. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the smith-waterman and FASTA algorithms. Genomics, 1991.Google ScholarGoogle Scholar
  18. V. Pillet, J. Labarta, T. Cortes, and S. Girona. PARAVER: A tool to visualise and analyze parallel code. Proceeding of WoTUG-18: Transputer and occam Developments, 1995.Google ScholarGoogle Scholar
  19. T. Rognes. Rapid and sensitive methods for protein sequence comparison and database searching. Phd Thesis, Institue of Medical Microbiology. University of Oslo, 2000.Google ScholarGoogle Scholar
  20. S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W. mei W. Hwu. Optimization principles and application performance evaluation of a multithreaded gpu using cuda. In PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. V. Sachdeva, M. Kistler, E. Speight, Tzeng, and T.H.K. Exploring the viability of the cell broadband engine for bioinformatics applications. Proc. of the 6th Workshop on High Performance Computational Biology, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  22. F. Sanchez, E. Salami, A. Ramirez, and M. Valero. Performance analysis of sequence alignment applications. Proceedings of the IEEE International Symposium on Workload Characterization. IISWC 2006., 2006.Google ScholarGoogle ScholarCross RefCross Ref
  23. E. Shpaer, M. Robinson, D. Yee, J. Candlin, R. Mines, and T. Hunkapiller. Sensitivity and selectivity in protein similarity searches: A comparison of smith-waterman in hardware to blast and fasta. Genomics, 38:179--191, 1996.Google ScholarGoogle ScholarCross RefCross Ref
  24. T. F. Smith and M. S. Waterman. Identification of common molecular subsequences. Journal of Mol. Biology, 1981.Google ScholarGoogle ScholarCross RefCross Ref
  25. G. Tan, N. Sun, and G. R. Gao. A parallel dynamic programming algorithm on a multi-core architecture. In SPAA '07: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Quantitative analysis of sequence alignment applications on multiprocessor architectures

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      CF '09: Proceedings of the 6th ACM conference on Computing frontiers
      May 2009
      238 pages
      ISBN:9781605584133
      DOI:10.1145/1531743

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 18 May 2009

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      CF '09 Paper Acceptance Rate26of113submissions,23%Overall Acceptance Rate240of680submissions,35%

      Upcoming Conference

      CF '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader