U of I CS 525 - Advanced Distributed Systems - D57676

Home> Schools> University of Illinois> Computer Science (CS) > CS 525> Advanced Distributed Systems

U of I CS 525 - Advanced Distributed Systems

Course Cs 525- Advanced Graphics Processor Programming

Pages 41

Download Save

Unformatted text preview:

Slide 1AgendaI. Two Different System ModelsII. Logical ClocksExampleLamport TimestampsSpot the MistakeCorrected Example: Lamport Logical TimeSlide 9III. Global Snapshot AlgorithmChandy and Lamport Snapshot AlgorithmSnapshot ExampleIV. Give it a thoughtWhat is Consensus?Why is Consensus ImportantLet’s Try to Solve Consensus!Consensus in a Synchronous SystemWhy does the Algorithm Work?Consensus in an Asynchronous SystemRecallSlide 21Slide 22Slide 23Lemma 1Easier Consensus ProblemSlide 26What the FLP Proof ShowsLemma 2Slide 29What we’ll ShowLemma 3Slide 32Slide 33Slide 34Slide 35Slide 36Slide 37Slide 38Putting it all TogetherSummaryNext Week Onwards1CS 525 Advanced Distributed SystemsSpring 20101Indranil Gupta (Indy)Lecture 6Distributed Systems FundamentalsFebruary 4, 2010All Slides © IG2AgendaI. Synchronous versus Asynchronous systemsII. Lamport TimestampsIII. Global SnapshotsIV. Impossibility of Consensus proof3I. Two Different System Models•Synchronous Distributed System Each message is received within bounded time Drift of each process’ local clock has a known bound Each step in a process takes lb < time < ubEx:A collection of processors connected by a communication bus, e.g., a Cray supercomputer or a multicore machine•Asynchronous Distributed System No bounds on process execution The drift rate of a clock is arbitrary  No bounds on message transmission delaysEx:The Internet is an asynchronous distributed system, so are ad-hoc and sensor networksThis is a more general (and thus challenging) model than the synchronous system model. A protocol for an asynchronous system will also work for a synchronous system (though not vice-versa)It would be impossible to accurately synchronize the clocks of two communicating processes in an asynchronous system4II. Logical Clocks But is accurate (or approximate) clock sync. even required?Wouldn’t a logical ordering among events at processes suffice?Lamport’s happens-before () among events: On the same process: a  b, if time(a) < time(b)  If p1 sends m to p2: send(m)  receive(m) If a  b and b  c then a  c Lamport’s logical timestamps preserve causality: All processes use a local counter (logical clock) with initial value of zeroJust before each event, the local counter is incremented by 1 and assigned to the event as its timestamp A send (message) event carries its timestamp  For a receive (message) event, the counter is updated by max(receiver’s-local-counter, message-timestamp) + 15Examplep1p2p3a bc de fm1m2Physicaltime6Lamport Timestampsabc de fm1m2213 451p1p2p3Physical timeLogical Time• Logical timestamps preserve causality of events, i.e., a  b ==> TS(a) < TS(b) • Can be used instead of physical timestamps7Spot the Mistake Host 1Host 2Host 3Host 412233545364510700001243647nClock ValueMessagetimestampPhysical Time8Corrected Example: Lamport Logical Time Host 1Host 2Host 3Host 412233545768910700001243687nClock ValueMessagetimestampPhysical Time9logically concurrent events Host 1Host 2Host 3Host 412233545768910700001243687nClock ValueMessagetimestampPhysical Time•a  b ==> TS(a) < TS(b) but not the other way around!•Logical time does not account for out-of-band messagesCorrected Example: Lamport Logical Time10III. Global Snapshot Algorithm  Can you capture (record) the states of all processes and communication channels at exactly 10:04:50 am? Is it necessary to take such an exact snapshot? Chandy and Lamport snapshot algorithm: records a logical (or causal) snapshot of the system.System Model: No failures, all messages arrive intact, exactly once, eventually Communication channels are unidirectional and FIFO-ordered There is a communication path between every process pair11Chandy and Lamport Snapshot Algorithm 1. Marker (token message) sending rule for initiator process P0 After P0 has recorded its state• for each outgoing channel C, send a marker on C 2. Marker receiving rule for a process Pk : On receipt of a marker over channel C if this is first marker being received at Pk-record Pk’s state-record the state of C as “empty”-turn on recording of messages over all other incoming channels-for each outgoing channel C, send a marker on C  else // messages were already being recorded on channel C-turn off recording messages only on channel C, and mark state of C as = all the messages recorded over C (since recording was turned on, until now)Protocol terminates when every process has received a marker from every other process12Snapshot Example P1P2P3e10e20e23e30e13abMe11,2M1- P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels C21 and C31e21,2,3MM2- P2 receives Marker over C12, records its state (S2), sets state(C12) = {} sends Marker to P1 & P3; turns on recording for channel C32e143- P1 receives Marker over C21, sets state(C21) = {a}e32,3,4MM4- P3 receives Marker over C13, records its state (S3), sets state(C13) = {} sends Marker to P1 & P2; turns on recording for channel C23e245- P2 receives Marker over C32, sets state(C32) = {b}e316- P3 receives Marker over C23, sets state(C23) = {}e137- P1 receives Marker over C31, sets state(C31) = {}Consistent CutConsistent Cut =time-cut across processors and channels so no event to the right of the cut “happens-before” an event that is left of the cut13IV. Give it a thoughtHave you ever wondered why distributed server vendors always only offer solutions that promise five-9’s reliability, seven-9’s reliability, but never 100% reliable?The fault does not lie with Microsoft Corp. or Apple Inc. or CiscoThe fault lies in the impossibility of consensus14What is Consensus?•N processes•Each process p has –input variable xp : initially either 0 or 1–output variable yp : initially b•Consensus problem: design a protocol so that at the end, either:1. all processes set their output variables to 0 2. Or all processes set their output variables to 1–Also, there is at least one initial state that leads to each outcome above (non-triviality)15Why is Consensus Important•Many problems in distributed systems are equivalent to (or harder than) consensus!–Agreement (harder than consensus, since it can be used to solve consensus)–Leader election (select exactly one leader, and every alive process knows about it)–Failure Detection•Consensus using leader election Choose 0 or 1 based on the

View Full Document


School:
Email:
New Password:
Confirm Password:

U of I CS 525 - Advanced Distributed Systems

Sign up for free to view:

Please select your school