In modern computing, a program is usually distributed among several processes. The fundamental challenge in developing reliable distributed programs is to support the cooperation of the processes required to execute a common task, even when some of these processes fail. Guerraoui and Rodrigues present an introductory description of fundamental reliable distributed programming abstractions, as well as algorithms that implement these abstractions. The authors follow an incremental approach, first introducing basic abstractions in simple distributed environments before moving to more sophisticated abstractions and more challenging environments. Each core chapter is devoted to one specific class of abstractions, covering reliable delivery, shared memory, consensus, and various forms of agreement. The textbook comes with a companion set of running examples implemented in Java, which students can use to better understand how reliable distributed programming abstractions can be implemented and used in practice. Combined, the chapters deliver a full course on reliable distributed programming. The book can also be used as a complete reference on the basic elements required to build reliable distributed applications.
Cited By
- Park S, Kelly T and Shen K. Failure-atomic msync(), Proceedings of the 8th ACM European Conference on Computer Systems, (225-238).
- Esposito C, Cotroneo D and Russo S (2013). Survey: On reliability in publish/subscribe services, Computer Networks: The International Journal of Computer and Telecommunications Networking, 57:5, (1318-1343), Online publication date: 1-Apr-2013.