DOC PREVIEW
TAMU CSCE 410 - distributed_coordination

This preview shows page 1-2-3-4 out of 12 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CPSC-410/611: Operating Systems Distr Coord 1 Distributed Coordination!• What makes a system distributed?!• Time in a distributed system!• How do we determine the global state of a distributed system?!• Event ordering!• Mutual exclusion!• Reading: Silberschatz, Chapter 18, Sections 1,2,6!Distr. Systems: Fundamental Characteristics!1. Multiple processors (wlog: assume one process per processor)!2. No shared memory!3. No common clock!4. Communication delays are not constant!5. Message ordering may not be maintained by the underlying communication infrastructure !CPSC-410/611: Operating Systems Distr Coord 2 Effects of Lack of Common Clock!Example 1 : Distributed make utility (e.g. pmake)!• make goes through all target files and determines (based on timestamps) which targets need to be “(re)compiled”!• Example:!main : main.o cc -o main main.o main.o : main.c cc -c main.c 2144 2144 2145 2146 2148 2147 2146 2145 2143 2142 Time according!to local clock!Time according!to local clock!Computer on !which compiler!runs!Computer on !which editor!runs!main.o created main.c created Effects of Lack of Common Clock!• Example 2 : Distributed Checkpointing!• “At 3pm everybody writes its state to stable storage.”!• Centralized system:!• Distributed System: rriiing! rriiing! rriiing!CPSC-410/611: Operating Systems Distr Coord 3 Distributed Checkpointing (2)!rriiing! rriiing! “transfer $100” Sb=$100 3:00 Sa=$100 3:00 3:01 2:59 rriiing! rriiing! “transfer $100” Sb=$0 3:00 Sa=$0 3:00 2:59 3:01 Consistent vs. Non-Consistent Global States!inconsistent global state (why?) consistent global stateCPSC-410/611: Operating Systems Distr Coord 4 Distributed Snapshot Algorithm!• Process P starts algorithm:!– saves state SP!– sends out marker messages to all other processes!• Upon receipt of a marker message (from process Q), process P proceeds as follows (atomically: no messages sent/received in the meantime):!– 1. Saves local state SP.!– 2. Records state of incoming channel from Q to P as empty.!– 3. Forward marker message on all outgoing channels.!• At any time after saving its state, when P receives a marker from a process R:!– Save state SCRP as sequence of messages received from R since P saved local state SP to when it received marker from R.!Comments!• Any process can start algorithm. Even multiple processes can start it concurrently.!• Algorithm will terminate if message delivery time is finite.!• Algorithm is fully distributed.!• Once algorithm has terminated, consistent global state can be collected.!• Relies on ordered, reliable message delivery.!CPSC-410/611: Operating Systems Distr Coord 5 Event Ordering!• Absence of central time means: no notion of happened-when (no total ordering of events)!• But can generate a happened-before notion (partial ordering of events)!• Happened-Before relation:!1. Pi A B Event A happened-before Event B. (A -> B) 2. Pi A Event A happened-before Event B. (A -> B) Pj B message 3. Pi A Event A happened-before Event C. (A -> C) (transitivity) Pj B message C Concurrent Events!• What when no happened-before relation exists between two events?!Pi A Events X and Y are concurrent. Pj B C D X Y ?CPSC-410/611: Operating Systems Distr Coord 6 Happened-Before Ordering: Implementation!• Define a Logical Clock LCi at each Process Pi.!• Used to timestamp each event:!– Each event on Pi is timestamped with current value of logical clock LCi .!– After each event, increment LCi.!– Timestamp each outgoing message at Pi with value of LCi.!– When receiving a message with timestamp t at process Pj, set LCj to max(t, LCj )+1.!Pi Pj LCj LCi 0 1 2 3 4 0 1 2 msg(1) 201 201 msg(200) 160 200 Application to Distributed Checkpointing!“At logical-clock time 5000 write state !to stable storage!”!4999 5000 5001 4890 4891 4892 5002 msg(A,4891) msg(B,5002) 5003 + 5002 Receiving Msg B!would be inconsistent.!So, checkpoint first,!and then receive!!CPSC-410/611: Operating Systems Distr Coord 7 Distributed Mutual Exclusion!• Reminder: Mutual exclusion in shared-memory systems:!bool lock; /* init to FALSE */ while (TRUE) { while (TestAndSet(lock)) no_op; critical section; lock = FALSE; remainder section; } D.M.E.: Centralized Approach!1. Send request message to coordinator to enter C.S.!2. If C.S. is free, the coordinator sends a reply message. Otherwise it queues request and delays sending reply message until C.S. becomes free.!3. When leaving C.S., send a release message to inform coordinator.!• Characteristics:!– ensures mutual exclusion!– service is fair!– small number of messages required!– fully dependent on coordinator!coordinator!P1!P2!P3!1!2!3!CPSC-410/611: Operating Systems Distr Coord 8 D.M.E.: Fully Distributed Approach!• Basic idea: Before entering C.S., ask and wait until you get permission from everybody else.!request(Pi,TS)!reply!Pi!• Upon receipt of a message request(Pj, TSj) at node Pi:!• if Pi does not want to enter C.S., immediately send a reply to Pj.!• if Pi is in C.S., defer reply to Pj. !• if Pi is trying to enter C.S., compare TSi with TSj. If TSi > TSj (i.e. “Pj asked first”), send reply to Pj; otherwise defer reply.!Fully Distributed Approach:Example!• P1 and P3 want to enter C.S.!P1!P2!P3!req(P1,10)!req(P1,10)!req(P3,4)!req(P3,4)!reply!reply!reply!Enter C.S.!reply!Enter C.S.!CPSC-410/611: Operating Systems Distr Coord 9 D.M.E. Fully Distributed Approach!• The Good:!– ensures mutual exclusion!– deadlock free!– starvation free!– number of messages per critical section: 2(n-1)!• The Bad:!– The processes need to know identity of all other processes involved (join & leave protocols needed)!• The Ugly:!– One failed process brings the whole scheme down!!D.M.E.: Token-Passing Approach!• Token is passed from process to process (in logical ring)!• Only processes owning a token can enter C.S.!• After leaving the C.S., token is forwarded!Pi!token!• Characteristics: • mutual exclusion guaranteed • no starvation • number of messages per C.S. varies • Problems: • Process failure (new logical ring must be constructed) • Loss of token (new token must be generated)CPSC-410/611: Operating


View Full Document

TAMU CSCE 410 - distributed_coordination

Documents in this Course
Load more
Download distributed_coordination
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view distributed_coordination and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view distributed_coordination 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?