UH COSC 6360 - TOTEM- A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM

TOTEM: A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM
L. E. Moser, P. M. Melliar-Smith, D. A. Agarwal, R. K. Budhia, C. A. Lingley-Papadopoulos
University of California, Santa Barbara

Outline
• Introduction
• Totem services
• Single-ring protocol
• Multiple-ring protocol
• Process group interface
• Services provided by Totem
• Agreed Delivery
• Safe Delivery
• Why Lamport’s causal order?
• Example
• Delivery guarantees
• Extended virtual synchrony (I), (II), (III)
• Ordering of messages
• The single-ring protocol (I), (II)
• Local membership protocol (I), (II)
• The multiple-ring protocol (I), (II)
• Message delivery (I), (II)
• Example (I), (II), (III)
• Related issues

INTRODUCTION
• Totem provides reliable, totally ordered multicasting of messages over LANs
• Intended for complex applications with critical requirements for
  – fault tolerance
  – real-time performance
• Exploits the hardware broadcast of most LANs

TOTEM SERVICES
• Built as a hierarchy of protocols (top to bottom):
  – Application layer
  – Process group interface
  – Multiple-ring protocol
  – Single-ring protocol
  – Physical medium

Single-ring protocol
• Built on top of a best-effort multicast service, using UDP to exploit the hardware broadcasts of the LAN
• Converts these multicasts into reliable, totally ordered delivery of messages on a single LAN
• Also provides fault-detection, recovery and configuration change services

Multiple-ring protocol
• Uses information from the process group interface above it
• Provides total ordering of messages as well as network topology maintenance services

Process group interface
• Delivers messages to the application processes in the appropriate process groups
• Provides process group membership services

Services provided by Totem
• Two reliable, totally ordered message delivery services:
  – Agreed delivery
  – Safe delivery
• Both services deliver messages in a single system-wide total order that respects Lamport’s causal order

Agreed Delivery
• Guarantees that a processor will not deliver a message before it has delivered all prior messages that:
  – Have been issued by processors in the current configuration and
  – Have time-stamps within the duration of that configuration
• All processes receive all messages in the order they were sent

Safe Delivery
• Further guarantees that a processor will not deliver a message unless all processors in its configuration have received it (everyone or nobody).
• All processes receive all messages in the same order at the same time

Why Lamport’s causal order?
• Otherwise processes that belong to two or more groups could receive messages from different groups in different orders
  – A and B are both in groups G and H
  – A receives m from group G, then m’ from group H, and finally m’’ from group G
  – B could receive m’ from group H, then m from group G, and finally m’’ from group G

Example
• Group G sends messages m and m’’ to A and B
• Group H sends message m’ to A and B
• Both A and B will receive m and m’’ in the same order
• Without total ordering, A could receive m’ before m’’ and B could receive m’’ before m’

Delivery guarantees
• Extended virtual synchrony ensures that these guarantees are honored within every configuration
  – When a fault occurs, Totem forms a transitional configuration with a reduced membership
  – Message order is guaranteed even in the presence of network partitions

Extended virtual synchrony (I)
• We want to ensure that
  – Messages are received in the same order by all processes
  – All processes share the same view of the process group to which they belong

Extended virtual synchrony (II)
• The virtual synchrony model (K. Birman, ISIS) orders group membership changes along with the regular messages
• Ensures that failures do not result in
  – Incomplete delivery of multicast messages
  – Holes in the causal delivery order
• Problems remain if the network can partition

Extended virtual synchrony (III)
• The extended virtual synchrony model (Totem) extends the virtual synchrony model to systems in which
  – Processes can fail and recover
  – The network can partition and remerge
• Guarantees that the same message sent to processes in two or more components of a partitioned network will be in a consistent order in all these components

Ordering of messages
• Messages are born-ordered:
  – Each message includes a time-stamp
  – The relative order of messages is determined by the messages themselves, as created by their senders

The single-ring protocol (I)
• Uses a circulating token containing, among other fields:
  – A seq field with the sequence number of the last message that was sent
  – An aru field with the sequence number of the last message that has been received by all processors
• Only the processor that holds the token can send a message

The single-ring protocol (II)
• The aru field is used to implement safe delivery:
  – Tells processors which messages have been received by every processor in the ring
• The token also provides information about the aggregate message backlog of the processors on the ring
  – Results in a fairer bandwidth allocation among processors than FDDI

Local membership protocol (I)
• Part of the single-ring protocol
• Allows
  – Inclusion of new or recovering processors
  – Deletion of faulty processors

Local membership protocol (II)
• Ensures:
  – Consensus among all members of a configuration about the configuration membership
  – Termination: each configuration will be installed on every processor within a bounded time or not at all

The multiple-ring protocol (I)
• Operates over several LANs linked by gateways
  – Each LAN is organized as a virtual token ring and managed by the single-ring protocol
• Offers the same services and the same guarantees as the single-ring protocol
[Figure: three rings labeled Ring A, Ring B, and Ring C]

The multiple-ring protocol (II)
• Uses Lamport’s timestamps and delivers messages in timestamp order
• When a gateway forwards a message from one ring to another, it gives the message a new sequence number for the new ring
• Processor faults and network partitions are detected by the single-ring protocol

Message delivery (I)
• Each processor maintains one recv_msgs list of messages received but not yet delivered for each ring from which it can receive messages

  Ring   Messages
  B      mB1, mB2
  C      mC1
  A      mA1, mA2, mA3

Message delivery (II)
• A processor will deliver a message as an agreed message as soon as
  – The message has the lowest time stamp of all the messages in its recv_msgs lists
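
The delivery rule on the Message delivery slides can be illustrated with a small Python sketch. It is not Totem's implementation. It models only the one condition stated above: among all messages waiting in the per-ring recv_msgs lists, the one with the lowest timestamp is delivered next. The preview ends before any further conditions are listed, so none are modeled. The names Message, next_agreed and drain, and the timestamp values attached to mA1 ... mC1, are hypothetical and chosen only for illustration. The sketch also assumes that each per-ring list is already sorted by timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(order=True)
class Message:
    # Only the timestamp takes part in comparisons.
    timestamp: int                      # Lamport timestamp (hypothetical values below)
    ring: str = field(compare=False)    # ring the message was received from
    name: str = field(compare=False)    # label such as "mB1"


def next_agreed(recv_msgs: Dict[str, List[Message]]) -> Optional[Message]:
    """Remove and return the pending message with the lowest timestamp.

    Assumption: every per-ring list is sorted by timestamp, so only the
    head of each list has to be inspected.
    """
    heads = [msgs[0] for msgs in recv_msgs.values() if msgs]
    if not heads:
        return None                     # nothing is waiting on any ring
    lowest = min(heads)
    recv_msgs[lowest.ring].pop(0)
    return lowest


def drain(recv_msgs: Dict[str, List[Message]]) -> List[str]:
    """Deliver every pending message and return the names in delivery order."""
    order = []
    while True:
        msg = next_agreed(recv_msgs)
        if msg is None:
            return order
        order.append(msg.name)


# The lists from the slide, with made-up timestamps.
recv_msgs = {
    "B": [Message(3, "B", "mB1"), Message(7, "B", "mB2")],
    "C": [Message(5, "C", "mC1")],
    "A": [Message(2, "A", "mA1"), Message(4, "A", "mA2"), Message(9, "A", "mA3")],
}
print(drain(recv_msgs))
```

With these made-up timestamps the messages from the three rings are interleaved into one sequence ordered by timestamp, regardless of which ring each message arrived on.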

