ABSTRACT
Grid workflows are emerging as practical programming models for solving large e-science problems on the Grid. However, it is typically assumed that workflow components either read from and write to conventional files, which are copied from one execution stage to the next, or are tightly coupled through IPC libraries such as MPI or distributed streaming. More flexible communication can be achieved by overloading conventional READ and WRITE operations with advanced IO mechanisms such as sockets, streams and pipes, as is done in the GriddLeS environment. Such flexibility allows temporally dependent components to be pipelined or, conversely, tightly coupled computations to be delayed, depending on current resource availability and network connectivity. However, it also makes the workflow harder to schedule, because the communication mode may not be decided until run time. In this paper, we propose a new scheduling model that leverages this communication flexibility to generate dynamic runtime schedules. The scheduler not only allocates components to distributed Grid resources but also specifies the inter-component communication mechanism (socket, pipe, etc.). The current model is implemented as a dynamic workflow scheduling tool called GridRod, which harnesses Nimrod/G's [1] Grid services and GriddLeS [2] web services.
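The idea of a scheduler that chooses both a resource and a communication mechanism can be illustrated with a minimal sketch. It is not GridRod's algorithm: the `Placement` type, the discrete `start_slot` field, and the decision rule (stream when producer and consumer run at the same time, use files otherwise) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical scheduling decision for one workflow component."""
    host: str        # resource the component is allocated to
    start_slot: int  # assumed discrete time slot in which it starts


def choose_channel(producer: Placement, consumer: Placement) -> str:
    """Pick a communication mechanism for one producer -> consumer edge."""
    same_host = producer.host == consumer.host
    concurrent = producer.start_slot == consumer.start_slot
    if concurrent:
        # Both components run together, so data can be streamed (pipelined).
        return "pipe" if same_host else "socket"
    # The consumer is delayed, so data goes through a conventional file,
    # which must be copied when the two components sit on different hosts.
    return "file" if same_host else "file-copy"


def schedule_edges(placements, edges):
    """Map every (producer, consumer) edge to its chosen mechanism."""
    return {
        (src, dst): choose_channel(placements[src], placements[dst])
        for src, dst in edges
    }


placements = {
    "A": Placement("host1", 0),
    "B": Placement("host1", 0),
    "C": Placement("host2", 0),
    "D": Placement("host2", 1),
    "E": Placement("host3", 2),
}
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("A", "E")]
plan = schedule_edges(placements, edges)
```

In this sketch the mechanism is derived entirely from the placement, so rescheduling a component at run time changes its channels without touching the component itself, which is the kind of flexibility that overloaded READ and WRITE operations make possible.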
- Abramson, D., et al., High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?, in International Parallel and Distributed Processing Symposium. 2000.
- Abramson, D. and J. Kommineni, A Flexible IO Scheme for Grid Workflows, in IPDPS-04. 2004: New Mexico.
- Ilkay Altintas, A. B., Kim Baldridge, Wibke Sudholt, Mark Miller, Celine Amoreira, Yohann Potier and Bertram Ludaescher. A Framework for the Design and Reuse of Grid Workflows. in Intl. Workshop on Scientific Applications on Grid Computing (SAG'04). 2005: Springer.
- The Taverna Project. Available from: http://taverna.sourceforge.net.
- The Genie Project. Available from: http://www.genie.ac.uk.
- Anthony Mayer, S. M., Nathalie Furmento, Jeremy Cohen, Murtaza Gulamali, Laurie Young, Ali Afzal, Steven Newhouse and John Darlington. ICENI: An Integrated Grid Middleware to Support E-Science. in Workshop on Component Models and Systems for Grid Applications. 2004. Saint Malo, France: Springer US.
- K. Seymour, H. N., S. Matsuoka, J. Dongarra, C. Lee, and H. Casanova, GridRPC: A remote procedure call API for grid computing, in ICL Technical Report ICL-UT-02-06. June 2002, Innovative Computing Laboratory, Department of Computer Science, University of Tennessee: Baltimore, MD, USA.
- The VGrADS Project. Available from: http://vgrads.rice.edu/.
- The Kepler Project. Available from: http://keplerproject.org/.
- Lee, E. A. and D. G. Messerschmitt. Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing. in IEEE Transactions on Computers. Jan 1987.
- Thomas L. Adam, K. M. C., J. R. Dickson. A comparison of list schedules for parallel processing systems. in Communications of the ACM. 1974: ACM Press New York, NY, USA.
- Lee, E. A. and D. G. Messerschmitt. Synchronous Data Flow. in Proceedings of the IEEE. 1987.
- Kahn, G. The Semantics of a Simple Language for Parallel Programming. in Proceedings of IFIP Congress. 1974: North Holland Publishing Company.
- Gibbons, P. B. and Y. Matias. New sampling-based summary statistics for improving approximate query answers. in Proceedings of the 1998 ACM SIGMOD international conference on Management of data. 1998. Seattle, Washington, United States.
- Charu C. Aggarwal, J. H., Jianyong Wang, Philip S. Yu. A Framework for Clustering Evolving Data Streams. in Proceedings of the 29th VLDB conference. 2003.
- Liang Chen, K. Reddy, and G. Agrawal. GATES: a grid-based middleware for processing distributed data streams. in Proceedings of IEEE Conference on High Performance Distributed Computing. 4-6 June 2004: IEEE Computer Society Press.
- Kwok, Y.-K. and I. Ahmad. Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors. in ACM Computing Surveys. 1999.
- Anthony Mayer, S. M., Nathalie Furmento, William Lee, Steven Newhouse, John Darlington. ICENI Dataflow and Workflow: Composition and Scheduling in Space and Time. in Proceedings of the Workshop on Component Models and Systems for Grid Applications. 2003. Saint Malo, France: SpringerLink.
- Ron Oldfield, D. K., Applications of Parallel I/O. Oct 1996.
- Abramson, D., Kommineni, J., McGregor, J. and Katzfey, J. An Atmospheric Sciences Workflow and its Implementation with Web Services. in The International Conference on Computational Sciences. June 6-9, 2004. Krakow, Poland.
- Jette, M. A. Performance Characteristics of Gang Scheduling in Multiprogrammed Environments. in Proceedings of the 1997 ACM/IEEE conference on Supercomputing. Nov 1997.
- Luiz Meyer, Mike Wilde, Marta Mattoso, Ian Foster. Planning spatial workflows to optimize grid performance. in Distributed systems and grid computing (DSGC). 2006: ACM Press New York, NY, USA.
- Casanova, H., et al. Heuristics for Scheduling Parameter Sweep Applications in Grid Environments. in Proceedings of the 9th Heterogeneous Computing Workshop (HCW00). 2000.
- Casanova, H., et al., The AppLeS Parameter Sweep Template: User-Level Middleware for the Grid, in Proceedings of the Super Computing Conference (SC'2000). 2001.
- Kommineni, J. and D. Abramson. GriddLeS Enhancements and Building Virtual Applications for the GRID with Legacy Components. in European grid conference. 2005. Amsterdam: Springer.
- Stiles, J. R., et al., Monte Carlo simulation of neuromuscular transmitter release using MCell, a general simulator of cellular physiological processes. Computational Neuroscience, 1998: p. 279-284.
- Foster, I. and C. Kesselman, Globus: A Meta-computing Infrastructure Toolkit. International Journal of Supercomputer Applications, 1997. 11(2): p. 115-128.
- Hategan, M., et al., GridAnt - A Client Controllable Grid Workflow System. 2003, Argonne National Laboratory.
- Sarkar, V., Partitioning and Scheduling Parallel Programs for Multiprocessors. 1989: Paperback. 215.
- S. Cheng, J. Stankovic and K. Ramamritham. Dynamic Scheduling of Groups of Tasks with Precedence Constraints in Distributed Hard Real-Time Systems. in Real-Time Symposium. December 1986.
- H. Topcuoglu, S. H., and M. Y. Wu. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing. in IEEE Trans. Parallel and Distributed Systems. 2002.
- Andrea C. Arpaci-Dusseau, D. E. C., Alan M. Mainwaring. Scheduling with Implicit Information in Distributed Systems. in Joint Conference Measurement and Modeling Computer Systems. 1998. Madison, Wisconsin.
- Python xml.dom.
- DAGMan (Directed Acyclic Graph Manager).
- Garg, A. and A. Rusu. Straight-Line Drawings of Binary Trees with Linear Area and Arbitrary Aspect Ratio. in Proceedings Graph Drawing. 2002. Irvine, CA, USA.
- Rich Wolski, N. T. S., Jim Hayes. The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing. in Future Generation Computer Systems. 1998.
Index Terms
- GridRod: a dynamic runtime scheduler for grid workflows