CS162 Operating Systems and Systems Programming
Lecture 24
Distributed File Systems

November 23, 2005
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs162

Review: Network Communication
- TCP: reliable byte stream between two processes on different machines over the Internet (read, write, flush)
- Socket: an abstraction of a network I/O queue
  - Embodies one side of a communication channel
  - Same interface regardless of location of other end
  - Could be local machine (called "UNIX socket") or remote machine (called "network socket")
[Figure: the client's socket requests a connection to the server socket; the server creates a new socket for the connection]
- Two-phase commit: distributed decision making
  - First, make sure everyone guarantees that they will commit if asked (prepare)
  - Next, ask everyone to commit

11/23/05 Kubiatowicz CS162 UCB Fall 2005 Lec 24.2

Review: Distributed Applications
[Figure: Send and Receive connected through the Network]
- Message abstraction: send/receive messages
  - Already atomic: no receiver gets a portion of a message, and two receivers cannot get the same message
- Interface:
  - Mailbox (mbox): temporary holding area for messages
    - Includes both destination location and queue
  - Send(message, mbox)
    - Send message to remote mailbox identified by mbox
  - Receive(buffer, mbox)
    - Wait until mbox has a message, copy it into buffer, and return
    - If threads are sleeping on this mbox, wake up one of them

Review: Byzantine General's Problem
[Figure: a General and three Lieutenants exchanging "Attack" and "Retreat" messages; one participant is marked Malicious]
- Byzantine General's Problem (n players):
  - One General
  - n-1 Lieutenants
  - Some number of these (f < n/3) can be insane or malicious
- The commanding general must send an order to his n-1 lieutenants such that:
  - IC1: All loyal lieutenants obey the same order
  - IC2: If the commanding general is loyal, then all loyal lieutenants obey the order he sends

Review: Byzantine General's Problem (con't)
- Impossibility results:
  - Cannot solve Byzantine General's Problem with n=3
    because one malicious player can mess up things
[Figure: two three-player scenarios (one General, two Lieutenants each) with conflicting "Attack" and "Retreat" messages]
  - With f faults, need n > 3f to solve problem
- Various algorithms exist to solve problem
  - Original algorithm has number of messages exponential in n
  - Newer algorithms have message complexity O(n^2)
    - One from MIT, for instance (Castro and Liskov, 1999)
- Use of BFT (Byzantine Fault Tolerance) algorithm
  - Allow multiple machines to make a coordinated decision even if some subset of them (< n/3) are malicious
[Figure: Request -> Distributed Decision]

Review: Remote Procedure Call
- Raw messaging is a bit too low-level for programming
  - Must wrap up information into message at source
  - Must decide what to do with message at destination
  - May need to sit and wait for multiple messages to arrive
- Better option: Remote Procedure Call (RPC)
  - Calls a procedure on a remote machine
  - Client calls: remoteFileSystem->Read("rutabaga");
  - Translated automatically into call on server: fileSys->Read("rutabaga");
- Implementation:
  - Request-response message passing (under covers)
  - "Stub" provides glue on client/server
    - Client stub is responsible for "marshalling" arguments and "unmarshalling" the return values
    - Server-side stub is responsible for "unmarshalling" arguments and "marshalling" the return values
- Marshalling involves (depending on system):
  - Converting values to a canonical form, serializing objects, copying arguments passed by reference, etc.

Goals for Today
- Finish RPC
- Examples of Distributed File Systems
- Cache Coherence Protocols
Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne

RPC Information Flow
[Figure: RPC information flow. On Machine A, the Client (caller) calls the Client Stub, which bundles args and sends them through the Packet Handler over the Network to mbox1 on Machine B. There the Packet Handler delivers to the Server Stub, which receives, unbundles args, and calls the Server (callee). On return, the Server Stub bundles ret vals and sends them back over the Network to mbox2 on Machine A, where the Client Stub receives, unbundles ret vals, and returns to the caller.]
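
The RPC information flow (client stub, two mailboxes, server stub) can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's implementation: the mailboxes are in-process queues rather than network endpoints, JSON stands in for the canonical marshalling format, and the names `read_file`, `PROCEDURES`, `server_stub`, and `client_stub_read` are hypothetical.

```python
import json
import queue
import threading

# Hypothetical in-process stand-ins for the two mailboxes in the figure.
mbox1 = queue.Queue()  # server's request mailbox
mbox2 = queue.Queue()  # client's return mailbox


def read_file(name):
    """The actual procedure on the server (callee); returns made-up contents."""
    return "contents of " + name


# Maps the procedure name carried in the request message to the real procedure.
PROCEDURES = {"Read": read_file}


def server_stub():
    """Receive one request, unbundle args, call the procedure, bundle the result."""
    request = json.loads(mbox1.get())                 # receive + unbundle args
    result = PROCEDURES[request["proc"]](*request["args"])
    mbox2.put(json.dumps({"ret": result}))            # bundle ret vals + send


def client_stub_read(name):
    """Looks like a local call to the caller; really sends a message and waits."""
    mbox1.put(json.dumps({"proc": "Read", "args": [name]}))  # bundle args + send
    reply = json.loads(mbox2.get())                   # receive + unbundle ret vals
    return reply["ret"]


server = threading.Thread(target=server_stub)
server.start()
answer = client_stub_read("rutabaga")  # the caller never touches a mailbox
server.join()
print(answer)
```

The caller only sees `client_stub_read("rutabaga")`; all message handling is hidden in the two stubs, which is the point of the stub layer.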

RPC Details
- Equivalence with regular procedure call
  - Parameters <-> Request message
  - Result <-> Reply message
  - Name of procedure: passed in request message
  - Return address: mbox2 (client return mail box)
- Stub generator: compiler that generates stubs
  - Input: interface definitions in an "interface definition language (IDL)"
    - Contains, among other things, types of arguments/return
  - Output: stub code in the appropriate source language
    - Code for client to pack message, send it off, wait for result, unpack result, and return to caller
    - Code for server to unpack message, call procedure, pack results, send them off
- Cross-platform issues:
  - What if client/server machines are different architectures or in different languages?
    - Convert everything to/from some canonical form
    - Tag every item with an indication of how it is encoded (avoids unnecessary conversions)

RPC Details (continued)
- How does client know which mbox to send to?
  - Need to translate name of remote service into network endpoint (remote machine, port, possibly other info)
  - Binding: the process of converting a user-visible name into a network endpoint
    - This is another word for "naming" at network level
    - Static: fixed at compile time
    - Dynamic: performed at runtime
- Dynamic binding
  - Most RPC systems use dynamic binding via name service
    - Name service provides dynamic translation of service -> mbox
  - Why dynamic binding?
    - Access control: check who is permitted to access service
    - Fail-over: if server fails, use a different one
- What if there are multiple servers?
  - Could give flexibility at binding time
    - Choose unloaded server for each new client
  - Could provide same mbox (router-level redirect)
    - Choose unloaded server for each new request
    - Only works if no state carried from one call to next
- What if multiple clients?
  - Pass pointer to client-specific return mbox in request

Problems with RPC
- Non-atomic failures
  - Different failure modes in
    distributed system than on a single machine
  - Consider many different types of failures
    - User-level bug causes address space to crash
    - Machine failure, kernel bug causes all processes on same machine to fail
    - Some machine is compromised by malicious party
  - Before RPC: whole system would crash/die
  - After RPC: one machine crashes/compromised while others keep working
  - Can easily result in inconsistent view of the world
    - Did my cached data get written back or not?
    - Did server do what I requested or not?
  - Answer: Distributed
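
The question "Did server do what I requested or not?" can be made concrete with a small simulation. This is a hedged sketch under stated assumptions: the `Server` class, the `handle_deposit` operation, and the `crash_point` parameter are all hypothetical, and a missing reply is modeled simply as `None` rather than a real network timeout.

```python
class Server:
    """Hypothetical server holding one piece of state."""

    def __init__(self):
        self.balance = 0

    def handle_deposit(self, amount, crash_point):
        # crash_point is an assumption of this sketch: "before", "after", or None.
        if crash_point == "before":
            return None          # crashed before doing the work: no reply
        self.balance += amount   # the requested work happens
        if crash_point == "after":
            return None          # crashed after the work, before replying
        return "ok"


def client_call(server, amount, crash_point):
    """The client only observes whether a reply arrived."""
    reply = server.handle_deposit(amount, crash_point)
    return "ok" if reply == "ok" else "timeout"


outcomes = {}
for crash_point in (None, "before", "after"):
    s = Server()
    outcomes[crash_point] = (client_call(s, 10, crash_point), s.balance)
    print(crash_point, outcomes[crash_point])
```

In both crash cases the client sees the same "timeout", yet the server state differs, so the client cannot tell from the missing reply alone whether the request was carried out.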