CS514: Intermediate Course in Operating SystemsRecapSo far?This and upcoming lectures?What time is it?But what does time “mean”?Lamport’s approachDrawing time-line pictures:Slide 9Slide 10Slide 11Slide 12Slide 13Happens before “relation”Logical clocksRules for managing logical clocksTime-line with LT annotationsSlide 18Can we do better?Vector clocksTime-line with VT annotationsRules for comparison of VTsSlide 23Vector time and happens beforeIntroducing “wall clock time”Synchronizing clocksSlide 27Slide 28Consequences?Thought questionSlide 31Real systems?Clock synchronizationFor next timeCS514: Intermediate Course in Operating SystemsProfessor Ken BirmanVivek Vishnumurthy: TARecapWe’ve started a process of isolating questions that arise in big systemsTease out an abstract issueTreat it separate from the original messy contextTry and understand what can and cannot be done, and how to solve when something can be doneSo far?Naming and discoveryPerformance through threading, events, ultimately clustersNotions of consistency that can and cannot be solvedCan’t achieve common knowledgeBut if we accept certain very limited risks can often achieve reasonable goalsThis and upcoming lectures?We’ll focus on concepts relating to timeTime as it can be “used” in systemsSystems that present behaviors best understood in terms of temporal models (notably the transactional model)Event ordering used to ensure consistency in distributed systems (multicasts that update replicated data or program state)What time is it?In distributed system we need practical ways to deal with timeE.g. we may need to agree that update A occurred before update BOr offer a “lease” on a resource that expires at time 10:10.0150 Or guarantee that a time critical event will reach all interested parties within 100msBut what does time “mean”?Time on a global clock?E.g. with GPS receiver… or on a machine’s local clockBut was it set accurately?And could it drift, e.g. run fast or slow?What about faults, like stuck bits?… or could try to agree on timeLamport’s approachLeslie Lamport suggested that we should reduce time to its basicsTime lets a system ask “Which came first: event A or event B?”In effect: time is a means of labeling events so that…If A happened before B, TIME(A) < TIME(B)If TIME(A) < TIME(B), A happened before BDrawing time-line pictures:pmsndp(m)qrcvq(m) delivq(m)DDrawing time-line pictures:A, B, C and D are “events”. Could be anything meaningful to the applicationSo are snd(m) and rcv(m) and deliv(m)What ordering claims are meaningful?pmACBsndp(m)qrcvq(m) delivq(m)DDrawing time-line pictures:A happens before B, and C before D“Local ordering” at a single processWrite and pqmACBrcvq(m) delivq(m)sndp(m)BApDCqDDrawing time-line pictures:sndp(m) also happens before rcvq(m)“Distributed ordering” introduced by a messageWritepqmACBrcvq(m) delivq(m)sndp(m))m(rcv)m(sndqMpDDrawing time-line pictures:A happens before DTransitivity: A happens before sndp(m), which happens before rcvq(m), which happens before DpqmDACBrcvq(m) delivq(m)sndp(m)Drawing time-line pictures:B and D are concurrentLooks like B happens first, but D has no way to know. No information flowed…pqmDACBrcvq(m) delivq(m)sndp(m)Happens before “relation”We’ll say that “A happens before B”, written AB, if1. APB according to the local ordering, or2. A is a snd and B is a rcv and AMB, or3. A and B are related under the transitive closure of rules (1) and (2)So far, this is just a mathematical notation, not a “systems tool”Logical clocksA simple tool that can capture parts of the happens before relationFirst version: uses just a single integerDesigned for big (64-bit or more) countersEach process p maintains LTp, a local counterA message m will carry LTmRules for managing logical clocksWhen an event happens at a process p it increments LTp. Any event that matters to pNormally, also snd and rcv events (since we want receive to occur “after” the matching send)When p sends m, setLTm = LTpWhen q receives m, setLTq = max(LTq, LTm)+1Time-line with LT annotationsLT(A) = 1, LT(sndp(m)) = 2, LT(m) = 2LT(rcvq(m))=max(1,2)+1=3, etc…pqmDACBrcvq(m) delivq(m)sndp(m)LTq0 0 0 1 1 1 1 3 3 3 4 5 5LTp0 1 1 2 2 2 2 2 2 3 3 3 3Logical clocksIf A happens before B, AB,then LT(A)<LT(B)But converse might not be true:If LT(A)<LT(B) can’t be sure that AB This is because processes that don’t communicate still assign timestamps and hence events will “seem” to have an orderCan we do better?One option is to use vector clocksHere we treat timestamps as a listOne counter for each processRules for managing vector times differ from what did with logical clocksVector clocksClock is a vector: e.g. VT(A)=[1, 0]We’ll just assign p index 0 and q index 1Vector clocks require either agreement on the numbering, or that the actual process id’s be included with the vectorRules for managing vector clockWhen event happens at p, increment VTp[indexp]Normally, also increment for snd and rcv events When sending a message, set VT(m)=VTpWhen receiving, set VTq=max(VTq, VT(m))Time-line with VT annotationspqmDACBrcvq(m) delivq(m)sndp(m)VTq0 00 00 00 10 10 10 12 22 22 2232 32 4VTp0 01 01 02 02 02 02 02 02 03 0303 03 0VT(m)=[2,0]Could also be [1,0] if we decide not to increment the clock on a snd event. Decision depends on how the timestamps will be used.Rules for comparison of VTsWe’ll say that VTA ≤ VTB ifI, VTA[i] ≤ VTB[i]And we’ll say that VTA < VTB ifVTA ≤ VTB but VTA ≠ VTBThat is, for some i, VTA[i] < VTB[i]Examples?[2,4] ≤ [2,4][1,3] < [7,3][1,3] is “incomparable” to [3,1]Time-line with VT annotationsVT(A)=[1,0]. VT(D)=[2,4]. So VT(A)<VT(D)VT(B)=[3,0]. So VT(B) and VT(D) are incomparablepqmDACBrcvq(m) delivq(m)sndp(m)VTq0 00 00 00 10 10 10 12 22 22 2232 32 4VTp0 01 01 02 02 02 02 02 02 03 0303 03 0VT(m)=[2,0]Vector time and happens beforeIf AB, then VT(A)<VT(B)Write a chain of events from A to BStep by step the vector clocks get largerIf VT(A)<VT(B) then ABTwo cases: if A and B both happen at same process p, trivialIf A happens at p and B at q, can trace the path
View Full Document