The ensemble system
Publisher:
  • Cornell University
  • PO Box 250, 124 Roberts Place Ithaca, NY
  • United States
ISBN: 978-0-591-69952-4
Order Number: AAI9818467
Pages: 147
Abstract

Ensemble is a group communication system that demonstrably achieves a wide range of goals. It is a general-purpose communication system intended for constructing reliable distributed applications; it is a flexible framework for carrying out research in groupware protocols; it is a large-scale, system-style implementation built in a state-of-the-art programming language; and it is also a mathematical object designed to be amenable to formal analysis and manipulation. Thus, Ensemble straddles a number of disciplines of computer science ranging from systems architectures to formal methods. The principal advances described in this thesis are the creation of the Ensemble system and the demonstration that it exhibits the properties just mentioned.

The thesis begins by presenting the Ensemble architecture, as well as background in group communication. We describe the various components of the architecture, give examples of their interactions, and compare this architecture with that of other layered communication systems.

The Ensemble protocols make heavy use of layered micro-protocols. We describe optimization techniques that greatly reduce the performance overheads introduced by layering and show how the architecture facilitates these optimizations. In addition, we show how to formalize these optimizations in type theory and implement them using the Nuprl theorem prover.

Ensemble is implemented in a dialect of the ML programming language. We describe how the use of ML impacted the system, and present a wide range of comparisons between Ensemble and a similar system implemented in C.
