Computer Science 425
Distributed Systems (Fall 2009)
Lecture 23
Replication Control
Reading: Section 15.1-15.4.1
Klara Nahrstedt

Acknowledgement
• The slides during this semester are based on ideas and material from the following sources:
– Slides prepared by Professors M. Harandi, J. Hou, I. Gupta, N. Vaidya, Y-Ch. Hu, S. Mitra.
– Slides from Professor S. Ghosh’s course at University of Iowa.

Administrative
• MP3 posted
– Deadline December 7 (Monday) – pre-competition
» Top five groups will be selected for final demonstration on Tuesday, December 8
– Demonstration signup sheets for Monday, 12/7, will be made available
– Main demonstration in front of the Qualcomm representative will be on Tuesday, December 8 afternoon - details will be announced.
• HW4 posted November 10, 2009
– Deadline December 1, 2009 (Tuesday)

Plan for Today
• Replication
• Review of View Concept and Group Communication
• Passive Replication
• Active Replication
• Gossiping Architecture

Replication
Enhancing services by replicating data
• Load Balancing
– Example: Workload is shared between the servers by binding all the server IP addresses to the service’s DNS name.
A DNS lookup of the site returns one of the servers’ IP addresses, in round-robin fashion.
• Fault Tolerance
– Under the fail-stop model, if up to f of f+1 servers crash, at least one remains to supply the service.
• Increased Availability
– Service may not be available when servers fail or when the network is partitioned.
– P: probability that one server fails; 1 – P = availability of service. E.g., P = 5% => service is available 95% of the time.
– P^n: probability that all n servers fail (assuming independent failures); 1 – P^n = availability of service. E.g., P = 5%, n = 3 => service is available 99.9875% of the time.

Basic Mode of Replication
• Replication Transparency
– User/client need not know that multiple physical copies of data exist.
• Replication Consistency
– Data is consistent on all of the replicas (or is in the process of becoming consistent).
[Figure labels: Client, Front End, RM, Service, server, Replica Manager]

Replication Management (5 Steps)
• Request Communication: Requests can be made to a single Replica Manager (RM) or to multiple RMs.
• Coordination: The RMs decide whether the request is to be applied, and the order of requests.
– FIFO ordering: If a FE issues r then r’, then any correct RM handles r and then r’.
– Causal ordering: If the issue of r “happened before” the issue of r’, then any correct RM handles r and then r’.
– Total ordering: If a correct RM handles r and then r’, then any correct RM that handles r’ handles r before it.
• Execution: The RMs execute the request tentatively.

Replication Management
• Agreement: The RMs attempt to reach consensus on the effect of the request.
– E.g., two-phase commit through a coordinator
– If this succeeds, the effect of the request is made permanent
• Response: One or more RMs respond to the front end.
– In the case of the fail-stop model, the Front End (FE) returns the first response to arrive.

Group Communication - Review
• “Member” = process (e.g., RM)
• Static Groups: group membership is pre-defined
• Dynamic Groups: members may join and leave, as necessary
[Figure labels: Group Send, Address Expansion, Multicast Comm., Membership Management, Leave, Fail, Join, Group]

Group Views - Review
• A group membership service maintains group views, which are lists of current group members.
– This is NOT a list maintained by one member, but…
– Each member maintains its own local view
• A view Vp(g) is process p’s understanding of its group (list of members)
– Example: Vp,0(g) = {p}, Vp,1(g) = {p, q}, Vp,2(g) = {p, q, r}, Vp,3(g) = {p, r}
• A new group view is disseminated, throughout the group, whenever a member joins or leaves.
– A member that detects the failure of another member reliably multicasts a “view change” message (requires causal-total ordering for multicasts)

Group Views - Review
• An event is said to occur in a view Vp,i(g) if the event occurs at p, and at the time of event occurrence, p has delivered Vp,i(g) but has not yet delivered Vp,i+1(g).
• Messages sent out in a view i need to be delivered in that view at all members in the group (“What happens in the View, stays in the View”)
• Requirements for view delivery
– Order: If p delivers Vi(g) and then Vi+1(g), then no other process q delivers Vi+1(g) before Vi(g).
– Integrity: If p delivers Vi(g), then p is in Vi(g).
– Non-triviality: If process q joins a group and becomes reachable from process p, then eventually q will always be present in the views that are delivered at p.

View Synchronous Communication - Review
• View Synchronous Communication = Group Membership Service + Reliable multicast
• The following guarantees are provided for multicast messages:
– Integrity: If p delivered message m, p will not deliver m again. Also p ∈ group(m).
– Validity: Correct processes always deliver all messages.
That is, if p delivers message m in view V(g), and some process q ∈ V(g) does not deliver m in view V(g), then the next view V’(g) delivered at p will not include q.
– Agreement: Correct processes deliver the same set of messages in any view.
If p delivers m in V, and then delivers V’, then all processes in V ∩ V’ deliver m in view V.
• All View Delivery conditions (Order, Integrity and Non-triviality conditions, from last slide) are satisfied
• “What happens in the View, stays in the View”

Example: View Synchronous Communication
[Figure: four scenarios with processes p, q, r changing from view V(p,q,r) to view V(q,r); two are marked Allowed and two are marked Not Allowed.]

FAULT-TOLERANT SERVICES

Back to Replication
[Figure truncated in source: Client, Front …]
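The availability estimate from the Replication slide (1 – P^n) can be checked with a few lines of Python. This is only a sketch: the function name `availability` is hypothetical, and it assumes every replica fails independently with the same probability P, which the slide implies but does not state.

```python
def availability(p_fail: float, n: int) -> float:
    """Probability that at least one of n replicas is up.

    Assumption (not stated on the slide): replicas fail independently,
    each with the same probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The service is unavailable only when all n replicas are down at once.
    return 1.0 - p_fail ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n={n}: {availability(0.05, n):.4%}")
```

With P = 5%, one server gives 95% availability and three servers give 99.9875%, since 0.05^3 = 0.000125.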
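The Agreement guarantee of view synchronous communication (if p delivers m in V and then delivers V’, then all processes in V ∩ V’ deliver m in V) can be tested on recorded delivery traces. The sketch below is illustrative only: the trace encoding (a list whose items are either a `frozenset` view or a message id string) and the names `deliveries_by_view` and `satisfies_agreement` are assumptions, not part of the lecture, and the two traces are hypothetical scenarios rather than a reproduction of the slide's figure.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

View = FrozenSet[str]
Event = Union[View, str]  # assumed encoding: a view delivery or a message id


def deliveries_by_view(trace: List[Event]) -> List[Tuple[View, Set[str]]]:
    """Group a process's trace into (view, messages delivered in that view)."""
    result: List[Tuple[View, Set[str]]] = []
    for event in trace:
        if isinstance(event, frozenset):
            result.append((event, set()))
        else:
            if not result:
                raise ValueError("message delivered before any view")
            result[-1][1].add(event)
    return result


def satisfies_agreement(traces: Dict[str, List[Event]]) -> bool:
    """Check: if p delivers m in V and then delivers V', every process in
    V ∩ V' also delivers m in V."""
    per_proc = {p: deliveries_by_view(t) for p, t in traces.items()}
    for p, views in per_proc.items():
        # Look at each pair of consecutive views (V, V') delivered at p.
        for (v, msgs), (v_next, _) in zip(views, views[1:]):
            for q in v & v_next:
                # Messages q delivered while in view V (empty if none recorded).
                q_msgs = next(
                    (m for view, m in per_proc.get(q, []) if view == v), set()
                )
                if not msgs <= q_msgs:
                    return False
    return True


V1 = frozenset({"p", "q", "r"})
V2 = frozenset({"q", "r"})

# Hypothetical trace: q and r both deliver m before the view change.
same_view = {"p": [V1], "q": [V1, "m", V2], "r": [V1, "m", V2]}
# Hypothetical trace: q delivers m in V1, but r delivers it only in V2.
split_view = {"p": [V1], "q": [V1, "m", V2], "r": [V1, V2, "m"]}

if __name__ == "__main__":
    print(satisfies_agreement(same_view))
    print(satisfies_agreement(split_view))
```

The first trace passes because q and r agree on what was delivered in V(p,q,r); the second fails because m straddles the view change.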