MIT 6 852 - Specifying and Using a Partitionable Group Communication Service

Unformatted text preview:

Specifying and Using a PartitionableGroup Communication ServiceALAN FEKETEUniversity of SydneyNANCY LYNCHMassachusetts Institute of TechnologyandALEX SHVARTSMANUniversity of Connecticut and Massachusetts Institute of TechnologyGroup communication services are becoming accepted as effective building blocks for theconstruction of fault-tolerant distributed applications. Many specifications for group commu-nication services have been proposed. However, there is still no agreement about what thesespecifications should say, especially in cases where the services are partitionable, i.e., wherecommunication failures may lead to simultaneous creation of groups with disjoint member-ships, such that each group is unaware of the existence of any other group. In this paper, wepresent a new, succinct specification for a view-oriented partitionable group communicationservice. The service associates each message with a particular view of the group membership.All send and receive events for a message occur within the associated view. The serviceprovides a total order on the messages within each view, and each processor receives a prefixof this order. Our specification separates safety requirements from performance and fault-tolerance requirements. The safety requirements are expressed by an abstract, global statemachine. To present the performance and fault-tolerance requirements, we include failure-status input actions in the specification; we then give properties saying that consensus on theA preliminary version of this work appeared in the Proceedings of the 16th ACM Symposiumon Principles of Distributed Computing, 1997. This research was supported by the followingcontracts: ARPA F19628-95-C-0118, AFOSR-ONR F49620-94-1-0199, U.S. Department ofTransportation: DTRS95G-0001-YR.8, NSF9225124-CCR, and by an Australian ResearchCouncil grant. The work of A. Shvartsman was supported in part by the NSF Career AwardCCR-9984778, NSF grant CCR-9988304, and a grant from AFOSR.Authors’ addresses: A. Fekete, Basser Department of Computer Science, University of Sydney,Madsen Building F09, Sidney, NSW 2006, Australia; email: [email protected]; N. Lynch,Laboratory for Computer Science, Massachusetts Institute of Technology, 545 TechnologySquare, NE43-365, Cambridge, MA 02139; email: [email protected]; A. Shvartsman,Department of Computer Science and Engineering, U-155, University of Connecticut, Storrs,CT 06269; Massachusetts Institute of Technology, Laboratory for Computer Science, NE43–316, Cambridge, MA 02139; email: [email protected] to make digital/hard copy of part or all of this work for personal or classroom useis granted without fee provided that the copies are not made or distributed for profit orcommercial advantage, the copyright notice, the title of the publication, and its date appear,and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, torepublish, to post on servers, or to redistribute to lists, requires prior specific permissionand/or a fee.© 2001 ACM 0734-2071/01/0500–0171 $5.00ACM Transactions on Computer Systems, Vol. 19, No. 2, May 2001, Pages 171–216.view and timely message delivery are guaranteed in an execution provided that the executionstabilizes to a situation in which the failure-status stops changing and corresponds to aconsistently partitioned system. Because consensus is not required in every execution, thespecification is not subject to the existing impossibility results for partitionable systems. Ourspecification has a simple implementation, based on the membership algorithm of Cristianand Schmuck. We show the utility of the specification by constructing an ordered-broadcastapplication, using an algorithm (based on algorithms of Amir, Dolev, Keidar, and others) thatreconciles information derived from different instantiations of the group. The applicationmanages the view-change activity to build a shared sequence of messages, i.e., the per-viewtotal orders of the group service are combined to give a universal total order. We prove thecorrectness and analyze the performance and fault-tolerance of the resulting application.Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distrib-uted Systems; D.4.5 [Operating Systems]: Reliability—Fault-tolerance; D.2.4 [SoftwareEngineering]: Software/Program Verification—Correctness proofsGeneral Terms: Algorithms, Design, Performance, VerificationAdditional Key Words and Phrases: Group communication protocols, message-passing proto-cols, conditional performance analysis, total-order broadcast, composable building blocks,service specification, ordered broadcast, distributed algorithms1. INTRODUCTION1.1 BackgroundIn the development of practical distributed systems, considerable effort isdevoted to making distributed applications robust in the face of typicalprocessor and communication failures. Constructing such systems is diffi-cult, however, because of the complexities of the applications and of thefault-prone distributed settings in which they run. To aid in this construc-tion, some computing environments include general-purpose buildingblocks that provide powerful distributed computation services.Among the most important examples of building blocks are group com-munication services. Group communication services enable processes lo-cated at different nodes of a distributed network to operate collectively as agroup; the processes do this by using a group communication service tomulticast messages to all members of the group. Different group communi-cation services offer different guarantees about the order and reliability ofmessage delivery. Examples are found in Isis [Birman and van Renesse1994], Transis [Dolev and Malki 1996], Totem [Moser et al. 1996], Newtop[Ezhilchelvan et al. 1995], Relacs [Babaoglu et al. 1995a], Horus [vanRenesse et al. 1996], and Ensemble [van Renesse et al. 1998].Solutions based on the group communications approach have been devel-oped for real-world problems. For example, Isis and Isis-based systems areproviding reliable multicast communication for the New York Stock Ex-change where timely and consistent data must be delivered and filtered atmultiple trading floor locations, for the Swiss Electronic Bourse where the“trading floor” has been completely replaced a distributed system where thetraders and member banks participate in all activities electronically, and172 • A. Fekete et al.ACM Transactions on


View Full Document

MIT 6 852 - Specifying and Using a Partitionable Group Communication Service

Download Specifying and Using a Partitionable Group Communication Service
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Specifying and Using a Partitionable Group Communication Service and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Specifying and Using a Partitionable Group Communication Service 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?