U of I CS 425 - Replication Control II

Computer Science 425 Distributed Systems
CS 425 / CSE 424 / ECE 428
Fall 2010
© 2010, I. Gupta, K. Nahrstedt, S. Mitra, N. Vaidya, M. T. Harandi, J. Hou

Indranil Gupta (Indy)
October 28, 2010
Lecture 20: Replication Control II
Reading: Chapter 15 (relevant parts)

Lecture outline:
•Transactions on Replicated Data
•One Copy Serialization
•Two Phase Commit Protocol For Transactions on Replicated Objects
•Primary Copy Replication
•Read One/Write All Replication
•Available Copies Replication
•Available Copies Approach
•The Impact of RM Failure
•Local Validation (using Our Example)
•Network Partition
•Dealing with Network Partitions
•Quorum Approaches
•Static Quorums
•Voting with Static Quorums
•Optimistic Quorum Approaches
•View-based Quorum
•View-based Quorum - details
•Example: View-based Quorum
•Example: View-based Quorum (cont’d)
•Summary
•Optional Slides
•Quorum Consensus Examples

Transactions on Replicated Data

[Figure: two clients with front ends run transactions T and U, issuing getBalance(A) and deposit(B,3) to replica managers that hold copies of A and B.]

One Copy Serialization

•In a non-replicated system, transactions appear to be performed one at a time in some order. This is achieved by ensuring a serially equivalent interleaving of transaction operations.
•One-copy serializability: The effect of transactions performed by clients on replicated objects should be the same as if they had been performed one at a time on a single set of objects (i.e., 1 replica per object).
–Equivalent to combining serial equivalence + replication transparency/consistency

Two Phase Commit Protocol For Transactions on Replicated Objects

Two level nested 2PC
•In the first phase, the coordinator sends the canCommit? command to the participants, each of which then passes it on to the other RMs involved (e.g., by using view synchronous communication) and collects their replies before replying to the coordinator.
•In the second phase, the coordinator sends the doCommit or doAbort request, which is passed on to the members of the groups of RMs.

Primary Copy Replication

•For now, assume no crashes/failures
•All the client requests are directed to a single primary RM.
•Concurrency control is applied at the primary.
•To commit a transaction, the primary communicates with the backup RMs and replies to the client.
•View synchronous communication → one-copy serializability
•Disadvantage? Performance is low since the primary RM is a bottleneck.

Read One/Write All Replication

•An FE (client front end) may communicate with any RM.
•Every write operation must be performed at all of the RMs
–Each contacted RM sets a write lock on the object.
•A read operation can be performed at any single RM
–A contacted RM sets a read lock on the object.
•Consider pairs of conflicting operations of different transactions on the same object.
–Any pair of write operations will require locks at all of the RMs → not allowed
–A read operation and a write operation will require conflicting locks at some RM → not allowed
•One-copy serializability is achieved.
•Disadvantage? Failures block the system (esp. writes).

Available Copies Replication

•A client’s read request on an object can be performed by any RM, but a client’s update request must be performed across all available (i.e., non-faulty) RMs in the group.
•As long as the set of available RMs does not change, local concurrency control achieves one-copy serializability in the same way as in read-one/write-all replication.
•May not be true if RMs fail and recover during conflicting transactions.

Available Copies Approach

[Figure: transaction T performs getBalance(A); deposit(B,3) and transaction U performs getBalance(B); deposit(A,3), each through a client + front end. Replica managers X and Y hold A; replica managers M, N and P hold B.]

The Impact of RM Failure

•Assume that (i) RM X fails just after T has performed getBalance; and (ii) RM N fails just after U has performed getBalance. Both failures occur before any of the deposit() operations.
•Subsequently, T’s deposit will be performed at RMs M and P, and U’s deposit will be performed at RM Y.
•The concurrency control on A at RM X does not prevent transaction U from updating A at RM Y.
•Solution: Must also serialize RM crashes and recoveries with respect to entire transactions.

Local Validation (using Our Example)

•From T’s perspective,
–T has read from an object at X → X must have failed after T’s operation.
–T observes the failure of N when it attempts to update the object B → N’s failure must be before T.
–Thus: N fails → T reads object A at X; T writes object B at M and P → T commits → X fails.
•From U’s perspective,
–Thus: X fails → U reads object B at N; U writes object A at Y → U commits → N fails.
•At the time T tries to commit,
–it first checks if N is still unavailable and if X, M and P are still available. Only then can T commit.
–It then checks if the failure order is consistent with that of other transactions (T cannot commit if U has committed)
–If T commits, U’s validation will fail because N has already failed.
•Can be combined with 2PC.
•Caveat: Local validation may not work if partitions occur in the network

Network Partition

[Figure: a network partition separates the replica managers holding copies of B; transactions T and U, each through a client + front end, issue withdraw(B, 4) and deposit(B,3) on opposite sides of the partition.]

Dealing with Network Partitions

•During a partition, pairs of conflicting transactions may have been allowed to execute in different partitions. The only choice is to take corrective action after the network has recovered
–Assumption: Partitions heal eventually
•Abort one of the transactions after the partition has healed
•Basic idea: allow operations to continue in partitions, but finalize and commit transactions only after partitions have healed
•But to optimize performance, better to avoid executing operations that will eventually lead to aborts…how?
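
As a rough illustration of the first local validation check from the Local Validation (using Our Example) slide, the sketch below models T and U from the available copies example. The names Transaction, used, observed_failed, validate and both_can_validate are hypothetical and chosen only for this sketch. It covers only the availability check at commit time. It does not model the second check (consistency of the failure order with other committed transactions), RM recovery, locking, or 2PC.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Transaction:
    """Hypothetical record of what one transaction saw at the RMs."""
    name: str
    used: set = field(default_factory=set)             # RMs it read from or wrote to
    observed_failed: set = field(default_factory=set)  # RMs it saw as failed

    def validate(self, available):
        """Availability check at commit time: every RM the transaction used
        must still be up, and every RM it observed as failed must still be down."""
        return self.used <= available and not (self.observed_failed & available)


ALL_RMS = {"X", "Y", "M", "N", "P"}

# T reads A at X, then writes B at M and P after observing that N has failed.
T = Transaction("T", used={"X", "M", "P"}, observed_failed={"N"})
# U reads B at N, then writes A at Y after observing that X has failed.
U = Transaction("U", used={"N", "Y"}, observed_failed={"X"})


def both_can_validate():
    """Return True if some set of available RMs lets T and U both validate."""
    rms = sorted(ALL_RMS)
    for k in range(len(rms) + 1):
        for subset in combinations(rms, k):
            available = set(subset)
            if T.validate(available) and U.validate(available):
                return True
    return False
```

Under these assumptions, no set of available RMs lets both transactions pass: T needs X up and N down, while U needs N up and X down. With only N failed, T.validate({"X", "Y", "M", "P"}) succeeds and U.validate on the same set fails, which matches the slide's remark that if T commits, U's validation fails because N has already failed. If both X and N are down at commit time, neither transaction validates.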
