CS162 Operating Systems and Systems Programming
Lecture 23: Network Communication Abstractions / Remote Procedure Call
April 28, 2008
Prof. Anthony D. Joseph
http://inst.eecs.berkeley.edu/~cs162
4/28/08, Joseph, CS162, UCB Spring 2008

Review: Reliable Networking
  - Layering: building complex services from simpler ones
  - Datagram: an independent, self-contained network message whose arrival, arrival time, and content are not guaranteed
  - Performance metrics
      - Overhead: CPU time to put packet on wire
      - Throughput: maximum number of bytes per second
      - Latency: time until first bit of packet arrives at receiver
  - Arbitrary-sized messages: fragment into multiple packets; reassemble at destination
  - Ordered messages: use sequence numbers and reorder at destination
  - Reliable messages: use acknowledgements
      - Want a window larger than 1 in order to increase throughput

Goals for Today
  - Distributed Decision Making
      - Two-phase commit / Byzantine Commit
  - Remote Procedure Call
  - Examples of Distributed File Systems
  Note: Some slides and/or pictures in the following are adapted from slides 2005 Silberschatz, Galvin, and Gagne. Many slides generated from my lecture notes by Kubiatowicz.

Review: Using Acknowledgements
  [Figure: hosts A and B exchanging packets and acks, with a timeout at the sender]
  - How to ensure transmission of packets?
      - Detect garbling at receiver via checksum; discard if bad
      - Receiver acknowledges (by sending "ack") when packet received properly at destination
      - Timeout at sender: if no ack, retransmit
  - Some questions:
      - If the sender doesn't get an ack, does that mean the receiver didn't get the original message? No
      - What if the ack gets dropped? Or if the message gets delayed?
          - Sender doesn't get ack, retransmits. Receiver gets message twice, acks each.

General's Paradox
  - Constraints of problem:
      - Two generals, on separate mountains
      - Can only communicate via messengers
      - Messengers can be captured
  - Problem: need to coordinate attack
      - If they attack at different times, they all die
      - If they attack at same time, they win
  - Named after Custer, who died at Little Big Horn because he arrived a couple of days too early
  - Can messages over an unreliable network be used to guarantee two entities do something simultaneously?
      - Remarkably, "no", even if all messages get through
      - No way to be sure last message gets through

Two-Phase Commit
  - Since we can't solve the General's Paradox (i.e. simultaneous action), let's solve a related problem
      - Distributed transaction: two machines agree to do something, or not do it, atomically
  - Two-Phase Commit protocol does this
      - Use a persistent, stable log on each machine to keep track of whether commit has happened
          - If a machine crashes, when it wakes up it first checks its log to recover state of world at time of crash
  - Prepare Phase:
      - The global coordinator requests that all participants promise to commit or rollback the transaction
      - Participants record promise in log, then acknowledge
      - If anyone votes to abort, coordinator writes "Abort" in its log and tells everyone to abort; each records "Abort" in log
  - Commit Phase:
      - After all participants respond that they are prepared, the coordinator writes "Commit" to its log
      - Then asks all nodes to commit; they respond with ack
      - After receiving acks, coordinator writes "Got Commit" to log
  - Log can be used to complete this process such that all machines either commit or don't commit

Two-Phase Commit Example
  - Simple example: A = WellsFargo Bank, B = Bank of America
  - Phase 1: Prepare Phase
      - A writes "Begin transaction" to log
        A -> B: OK to transfer funds to me?
      - Not enough funds:
        B -> A: transaction aborted; A writes "Abort" to log
      - Enough funds:
        B: Write new account balance and promise to commit to log
        B -> A: OK, I can commit
  - Phase 2: A can decide for both whether they will commit
      - A: write new account balance to log
      - Write "Commit" to log
      - Send message to B that commit occurred; wait for ack
      - Write "Got Commit" to log
  - What if B crashes at beginning?
      - Wakes up, does nothing; A will timeout, abort, and retry
  - What if A crashes at beginning of phase 2?
      - Wakes up, sees that there is a transaction in progress; sends "Abort" to B
  - What if B crashes at beginning of phase 2?
      - B comes back up, looks at log; when A sends it "Commit" message, it will say, "oh, ok, commit"

Distributed Decision Making Discussion
  - Why is distributed decision making desirable?
      - Fault tolerance
      - A group of machines can come to a decision even if one or more of them fail during the process
          - Simple failure mode called "failstop" (different modes later)
      - After decision made, result recorded in multiple places
  - Undesirable feature of Two-Phase Commit: blocking
      - One machine can be stalled until another site recovers:
          - Site B writes "prepared to commit" record to its log, sends a "yes" vote to the coordinator (site A), and crashes
          - Site A crashes
          - Site B wakes up, checks its log, and realizes that it has voted "yes" on the update. It sends a message to site A asking what happened. At this point, B cannot decide to abort, because the update may have committed
          - B is blocked until A comes back
      - A blocked site holds resources (locks on updated items, pages pinned in memory, etc.) until it learns the fate of the update
  - Alternative: there are alternatives such as "Three-Phase Commit" which don't have this blocking problem
  - What happens if one or more of the nodes is malicious?
      - Malicious: attempting to compromise the decision making

Byzantine General's Problem
  [Figure: a general and lieutenants exchanging Attack and Retreat orders]
  - IC1: All loyal lieutenants obey the same order
  - IC2: If the commanding general is loyal, then all loyal lieutenants obey the order he sends

Byzantine General's Problem (con't)
  - Impossibility results:
      - Cannot solve Byzantine General's Problem with n=3 because one malicious player can mess up things
  - Original algorithm has number of messages exponential in n
  - Newer algorithms have message complexity ...
      - One from MIT, for instance (Castro and Liskov, 1999)
  - Use of BFT (Byzantine Fault Tolerance) algorithm
      - Allow multiple machines to make a coordinated decision even if some subset of them (< n/3) are malicious

Administrivia
  - Project 4 design deadline is Thu 5/1 at 11:59pm
  - Code deadline is Wed 5/14
  - ... May 21st

Remote Procedure Call
  - Raw messaging is a bit too low-level for programming
      - Must wrap up information into message at source
      - Must decide what to do with message at destination
      - May need to sit and wait for multiple messages to arrive
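
The two-phase commit exchange from the bank example in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lecture's implementation: the names `Participant` and `two_phase_commit` are hypothetical, in-memory lists stand in for the persistent stable logs, the coordinator is modeled as a separate function rather than as bank A itself, and crashes, timeouts, and recovery from the log are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    """Hypothetical participant: an account balance plus an append-only log."""
    name: str
    balance: int
    log: List[str] = field(default_factory=list)  # stand-in for a stable log
    pending: Optional[int] = None  # balance promised during the prepare phase

    def prepare(self, delta: int) -> bool:
        """Phase 1: vote yes only if the change keeps the balance non-negative."""
        if self.balance + delta < 0:
            self.log.append("Abort")
            return False
        self.pending = self.balance + delta
        self.log.append(f"Prepared balance={self.pending}")  # record the promise
        return True

    def commit(self) -> str:
        """Phase 2: apply the promised balance, log it, and acknowledge."""
        self.balance = self.pending
        self.pending = None
        self.log.append("Commit")
        return "ack"

    def abort(self) -> None:
        """Drop any promise and record the abort (once)."""
        self.pending = None
        if self.log[-1:] != ["Abort"]:
            self.log.append("Abort")


def two_phase_commit(coordinator_log: List[str],
                     participants: List[Participant],
                     deltas: List[int]) -> bool:
    """Run one transaction; return True if it committed, False if it aborted."""
    coordinator_log.append("Begin transaction")

    # Prepare phase: ask every participant to promise to commit.
    votes = [p.prepare(d) for p, d in zip(participants, deltas)]
    if not all(votes):
        coordinator_log.append("Abort")
        for p in participants:
            p.abort()
        return False

    # Commit phase: the decision is logged first, then announced.
    coordinator_log.append("Commit")
    acks = [p.commit() for p in participants]
    if all(a == "ack" for a in acks):
        coordinator_log.append("Got Commit")
    return True
```

For example, with `a = Participant("A", 100)` and `b = Participant("B", 50)`, calling `two_phase_commit([], [a, b], [30, -30])` moves 30 from B to A, while a request to move more than B holds makes B vote no and leaves both balances unchanged. Writing "Commit" to the coordinator's log before telling anyone is the step that lets a recovering machine learn the outcome, which is the part a real implementation would need durable storage for.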

