Global Predicate Detection and Event OrderingOur ProblemTo compute predicatesover the state of a distributed applicationModelMessage passingNo failuresTwo possible timing assumptions:1. Synchronous System2. Asynchronous SystemNo upper bound on message delivery timeNo bound on relative process speedsNo centralized clockClock SynchronizationExternal Clock Synchronization:keep processor clock within some maximum deviation from an external time source.•can exchange of info about timing events of different systems•can take actions at real-time deadlines•synchronization within 0.1 msInternal Clock Synchronization:keep processor clocks within some maximum deviation from each other.• can measure duration of distributed activities that start on one process and terminate on another• can totally order events that occur on a distributed systemSynchronizion clocks:Take 1Assume an upper bound max and a lower bound min on message delivery timeGuarantee that processes stay synchronized within max - minSynchronizion clocks:Take 1Assume an upper bound max and a lower bound min on message delivery timeGuarantee that processes stay synchronized within max - minTime (ms)93.174.484.22% of messagesProblem:5000 message run(IBM Almaden)Clock Synchronization:Take 2No upper bound on message delivery time......but lower bound min on message delivery timeUse timeout maxp to detect process failuresslaves send messages to masterMaster averages slaves value; computes fault-tolerant averagePrecision: 4 maxp - minProbabilistic Clock Synchronization (Cristian)Master-Slave architectureMaster is connected to external time sourceSlaves read master’s clock and adjust their ownHow accurately can a slave read the master’s clock?The IdeaClock accuracy depends on message roundtrip timeif roundtrip is small, master and slave cannot have drifted by much!Since no upper bound on message delivery, no certainty of accurate enough reading...… but very accurate reading can be achieved by repeated attemptsAsynchronous systemsWeakest possible assumptionscfr. “finite progress axiom”Weak assumptions less vulnerabilitiesAsynchronous ! slow“Interesting” model wrt failures (ah ah ah!) ≡Client-ServerProcesses exchange messages using Remote Procedure Call (RPC)A client requests a service by sending the server a message. The client blocks while waiting for a responsescClient-ServerProcesses exchange messages using Remote Procedure Call (RPC)The server computes the response (possibly asking other servers) and returns it to the clientA client requests a service by sending the server a message. The client blocks while waiting for a responses#!?%!cDeadlock!p2p1p3GoalDesign a protocol by which a processor can determine whether a global predicate (say, deadlock) holdsDraw arrow from to if has received a request but has not responded yetWait-For GraphspipjpjDraw arrow from to if has received a request but has not responded yetCycle in WFG deadlockDeadlock cycle in WFGWait-For Graphs⇒ ♦⇒ ·pipjpjThe protocol sends a message to On receipt of ’s message, replies with its state and wait-for infop1. . . p3p0p0piAn executionp1p1p2p2p3p3An executionp1p1p2p2p3p3An executionGhost Deadlock!p2p2p1p1p3p3Houston,we have a problem...Asynchronous systemno centralized clock, etc. etc.Synchrony useful tocoordinate actionsorder eventsMmmmhhh...Events and HistoriesProcesses execute sequences of eventsEvents can be of 3 types: local, send, and receive is the i-th event of process pThe local history of process p is the sequence of events executed by process p : prefix that contains first k events : initial, empty sequenceThe history H is the set hphkph0peiphp0∪ hp1∪ . . . hpn−1NOTE: In H, local histories are interpreted as sets, rather than sequences, of events Ordering eventsObservation 1: Events in a local history are totally orderedObservation 2: For every message , precedes timepitimepitimemreceive(m)send(m)mpjHappened-before(Lamport[1978])A binary relation defined over events1. if and , then2. if and , then3. if and then →eki, eli∈ hik < leki→ eliei= send(m)ej= receive(m)ei→ eje → e!e!→ e!!e → e!!Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3H and impose a partial order→Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3H and impose a partial order→Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3H and impose a partial order→Space-Time diagramsA graphic representation of a distributed executiontimep1p2p3p1p2p3H and impose a partial order→Runs andConsistent RunsA run is a total ordering of the events in H that is consistent with the local histories of the processorsEx: is a runA run is consistent if the total order imposed in the run is an extension of the partial order induced byA single distributed computation may correspond to several consistent runs!h1, h2, . . . , hn→CutsA cut C is a subset of the global history of Hp1p2p3C = hc11∪ hc22∪ . . . hcnnA cut C is a subset of the global history of HThe frontier of C is the set of events Cutsp1p2p3C = hc11∪ hc22∪ . . . hcnnec11, ec22, . . . ecnnGlobal states and cutsThe global state of a distributed computation is an n-tuple of local statesTo each cut corresponds a global state Σ = (σ1, . . . σn)(σc11, . . . σcnn)(c1. . . cn)Consistent cuts and consistent global statesA cut is consistent ifA consistent global state is one corresponding to a consistent cut ∀ei, ej: ej∈ C ∧ ei→ ej⇒ ei∈ CWhat seesp0p1p2p3What seesNot a consistent global state: the cut contains the event corresponding to the receipt of the last message by but not the corresponding send eventp0p1p2p3p3Our taskDevelop a protocol by which a processor can build a consistent global stateInformally, we want to be able to take a snapshot of the computationNot obvious in an asynchronous system...Our approachDevelop a simple
View Full Document