ABSTRACT
This paper provides a blueprint for constructing collaborative and distributed knowledge discovery systems within Grid-based computing environments. The need for such systems is driven by the demand for sharing knowledge, information and computing resources, whether within a single large distributed organisation or within complex Virtual Organisations (VOs) created to tackle specific projects. The proposed architecture is built on top of a resource federation management layer and is composed of a set of different resources. We show how this architecture behaves during the design and deployment of a typical knowledge discovery in databases (KDD) process, how it enables the high-performance execution of complex and distributed data mining tasks, and how it provides a community of e-scientists with the means to collaborate and to retrieve and reuse KDD algorithms, discovery processes and knowledge in a visual analytical environment.
Index Terms
- Discovery net: towards a grid of knowledge discovery