Abstract
The BLAST sequence alignment program is a central application in bioinformatics. The de facto standard version, NCBI BLAST, uses complex heuristics that make it challenging to achieve both high performance and exact agreement with its results. We propose a system that uses novel FPGA-based filters to reduce the input database by over 99.97% without loss of sensitivity. There are several contributions. The first is the design of the filters themselves, which perform two-hit seeding, exhaustive ungapped alignment, and exhaustive gapped alignment, respectively. The second is the coupling of the filters, especially the two-hit seeding and the ungapped alignment. The third is pipelining the filters in a single design, including maintaining load balance as the data are reduced by orders of magnitude at each stage. The fourth is the optimization required to maintain operating frequency for the resulting complex design. Finally, there is system integration in both hardware (the Convey HC1-EX) and software (NCBI BLASTP). We present results for various usage scenarios and find complete agreement and a speedup of nearly 5x over a fully parallel implementation of the reference code on a contemporaneous CPU. We believe that the resulting system is the leading per-socket-accelerated NCBI BLAST.
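To make the first filter stage concrete, the following is a minimal software sketch of a two-hit seeding test. It is not the FPGA design described above. It simplifies by counting only exact word matches, whereas BLASTP also scores neighborhood words against a substitution matrix. The function names `two_hit_seeds` and `passes_filter`, the word size `w`, and the diagonal `window` are illustrative assumptions, not values taken from the paper.

```python
from collections import defaultdict


def two_hit_seeds(query, subject, w=3, window=40):
    """Return (query_pos, subject_pos) of each hit that completes a two-hit pair.

    Simplification (assumption): a "hit" is an exact w-mer match only.
    """
    # Index every w-mer of the query by its start positions.
    index = defaultdict(list)
    for q in range(len(query) - w + 1):
        index[query[q:q + w]].append(q)

    last_hit = {}  # diagonal -> subject position of the most recent hit
    seeds = []
    for s in range(len(subject) - w + 1):
        for q in index.get(subject[s:s + w], ()):
            diag = s - q
            prev = last_hit.get(diag)
            if prev is not None:
                dist = s - prev
                if dist < w:
                    # Overlaps the previous hit on this diagonal: skip it
                    # and keep the earlier hit as the reference point.
                    continue
                if dist <= window:
                    # Two non-overlapping hits close together on one diagonal.
                    seeds.append((q, s))
            last_hit[diag] = s
    return seeds


def passes_filter(query, subject, **kwargs):
    """A subject sequence survives the filter if it yields at least one seed."""
    return bool(two_hit_seeds(query, subject, **kwargs))
```

In a filtering pipeline of this kind, only subject sequences for which `passes_filter` returns true would be forwarded to the more expensive ungapped and gapped alignment stages, which is how each stage can shrink the data seen by the next.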
Index Terms
- NCBI BLASTP on High-Performance Reconfigurable Computing Systems