ABSTRACT
Large-scale simulations and computational modeling using molecular dynamics (MD) continue to make significant impacts in the field of biology. It is well known that simulating biological events at native time and length scales requires computing power several orders of magnitude beyond today's commonly available systems. Supercomputers such as the IBM Blue Gene/L and Cray XT3 will soon make tens to hundreds of teraFLOP/s of computing power available by utilizing thousands of processors. The popular MD algorithms and applications, however, were not initially designed to run on thousands of processors. In this paper, we present detailed investigations of the performance issues that are crucial for improving the scalability of MD-related algorithms and applications on massively parallel processing (MPP) architectures. Because the characteristics of biological input problems vary, we study two prototypical biological complexes simulated with the MD algorithm: one in explicit solvent and one in implicit solvent. In particular, we study the AMBER application, which supports a variety of such input problems. For the explicit solvent problem, we focused on the particle mesh Ewald (PME) method for calculating the electrostatic energy, and for the implicit solvent model, we targeted the Generalized Born (GB) calculation. We uncovered and subsequently modified a limitation in AMBER that restricted scaling beyond 128 processors. We collected performance data for experiments on up to 2048 Blue Gene/L and XT3 processors and found that scaling is largely limited by the underlying algorithmic characteristics and by the implementation of the algorithms. Furthermore, we found that the input problem size of a biological system is constrained by the memory available per node. In conclusion, our results indicate that MD codes can significantly benefit from current-generation architectures with relatively modest optimization efforts.
Nevertheless, the key to enabling scientific breakthroughs lies in exploiting the full potential of these new architectures.
Index Terms
- Performance characterization of molecular dynamics techniques for biomolecular simulations