Duke CPS 196 - Peer-to-Peer and Large-Scale Distributed Systems

Peer-to-Peer and Large-Scale Distributed Systems
Jeff Chase, Duke University

Note
• For CPS 196, Spring 2006, I skimmed a tutorial giving a broad view of the area. It is by Joe Hellerstein at Berkeley and is available at:
  – db.cs.berkeley.edu/jmh/talks/vldb04-p2ptut-final.ppt
• I also used some of the following slides on DHTs, all of which are adapted more or less intact from presentations graciously provided by Sean Rhea. They pertain to his award paper on Bamboo at USENIX 2005.

What's a DHT?
• Distributed Hash Table
  – Peer-to-peer algorithm offering a put/get interface
  – Associative map for peer-to-peer applications
• More generally, it provides lookup functionality
  – Map application-provided hash values to nodes
  – (Just as local hash tables map hashes to memory locations.)
  – Put/get is then constructed above lookup
• Many proposed applications
  – File sharing, end-system multicast, aggregation trees

How DHTs Work
[Figure: peers issue put(k1,v1) and get(k1), which are routed across a set of nodes, each holding key/value pairs.]
How do we ensure the put and the get find the same machine?

Step 1: Partition Key Space
• Each node in the DHT will store some (k,v) pairs
• Given a key space K, e.g. [0, 2^160):
  – Choose an identifier for each node, id_i ∈ K, uniformly at random
  – A pair (k,v) is stored at the node whose identifier is closest to k

Step 2: Build Overlay Network
• Each node has two sets of neighbors
• Immediate neighbors in the key space
  – Important for correctness
• Long-hop neighbors
  – Allow puts/gets in O(log n) hops

Step 3: Route Puts/Gets Through the Overlay
• Route greedily, always making progress toward the key

How Does Lookup Work?
[Figure: a lookup routed from the source through nodes with IDs matching prefixes 0…, 10…, 110…, 111…, with the response returned to the source.]
• Assign IDs to nodes
  – Map hash values to the node with the closest ID
• Leaf set is successors and predecessors
  – All that's needed for correctness
• Routing table matches successively longer prefixes
  – Allows efficient lookups

How Bad is Churn in Real Systems?

  Authors   Systems Observed     Session Time
  GDS03     Kazaa                50% < 2.4 minutes
  BSV03     Overnet              50% < 60 minutes
  SW02      FastTrack            50% < 1 minute
  CLL02     Gnutella, Napster    31% < 10 minutes
  SGG02     Gnutella, Napster    50% < 60 minutes

[Figure: timeline distinguishing a node's session time (one arrive/depart interval) from its lifetime (spanning multiple sessions).]
An hour is an incredibly short MTTF!

Note on CPS 196, Spring 2006
• We did not cover any of the following material on managing DHTs under churn.

Routing Around Failures
• Under churn, neighbors may have failed
• To detect failures, acknowledge each hop
• If we don't receive an ACK, resend through a different neighbor

Computing Good Timeouts
• Must compute timeouts carefully
  – If too long, put/get latency increases
  – If too short, we get a message explosion
• Chord errs on the side of caution
  – Very stable, but gives long lookup latencies

Calculating Good Timeouts
• Use TCP-style timers
  – Keep a past history of latencies
  – Use this to compute timeouts for new requests
• Works fine for recursive lookups
  – Only talk to neighbors, so the history is small and current
• In iterative lookups, the source directs the entire lookup
  – Must potentially have a good timeout for any node
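To make the TCP-style timer idea concrete, here is a minimal Python sketch of the standard SRTT/RTTVAR estimator a node might keep for each neighbor. It is an illustration only: the class name, constants, and RTT samples are assumptions for this sketch, not Bamboo's actual code.

```python
# Minimal sketch (illustrative, not Bamboo's actual code) of a TCP-style timer:
# keep a smoothed RTT and RTT variance per neighbor, and derive the timeout for
# the next request from them, as in TCP's standard RTO estimator.
class NeighborTimer:
    ALPHA, BETA, K = 0.125, 0.25, 4.0   # standard TCP RTO constants

    def __init__(self, initial_timeout: float = 1.0):
        self.srtt = None          # smoothed round-trip time (seconds)
        self.rttvar = None        # smoothed RTT variance
        self.timeout = initial_timeout

    def observe(self, rtt: float) -> None:
        """Fold a measured per-hop RTT into the estimate."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.timeout = self.srtt + self.K * self.rttvar

# Recursive lookups only ever wait on direct neighbors, so each node keeps a
# small, current history like this per neighbor; an iterative source would need
# such an estimate for any node it might contact, which is much harder.
timer = NeighborTimer()
for sample in (0.020, 0.025, 0.018, 0.120):   # hypothetical RTT samples (seconds)
    timer.observe(sample)
print(round(timer.timeout, 3))                # timeout to use for the next request
```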
Recovering From Failures
• Can't route around failures forever
  – Will eventually run out of neighbors
• Must also find new nodes as they join
  – Especially important if they're our immediate predecessors or successors, since a new node changes the range of keys we are responsible for
• Obvious algorithm: reactive recovery
  – When a node stops sending acknowledgements, notify other neighbors of potential replacements ("B failed, use D"; "B failed, use A")
  – Similar techniques for the arrival of new nodes

The Problem with Reactive Recovery
• What if B is alive, but the network is congested?
  – C still perceives a failure due to dropped ACKs
  – C starts recovery, further congesting the network
  – More ACKs are likely to be dropped
  – Creates a positive feedback cycle
• This was the problem with Pastry
  – Combined with poor congestion control, it caused the network to partition under heavy churn

Periodic Recovery
• Every period, each node sends its neighbor list to each of its neighbors ("my neighbors are A, B, D, and E")
  – Breaks the feedback loop
  – Converges in a logarithmic number of periods
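As a rough illustration of periodic recovery, the sketch below has every node push its current neighbor list to its neighbors once per period, and each receiver merge the advertised IDs, keeping only live nodes closest to its own ID. All names (Node, merge, LEAF_SET_SIZE) and the leaf-set size are hypothetical choices for this sketch, not taken from Bamboo.

```python
# Minimal sketch (hypothetical names, not Bamboo's code) of periodic recovery
# for the leaf set: once per period, every live node pushes its neighbor list
# to its neighbors; each receiver merges the list, keeping only live node IDs
# closest to its own on the circular key space.
KEY_SPACE = 2 ** 160
LEAF_SET_SIZE = 4

def ring_distance(a: int, b: int) -> int:
    """Distance between two identifiers on the circular key space."""
    d = abs(a - b)
    return min(d, KEY_SPACE - d)

class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.neighbors = set()     # leaf-set node IDs
        self.alive = True

    def merge(self, candidate_ids, is_alive):
        """Fold advertised IDs into our leaf set, dropping dead or distant nodes."""
        candidates = {n for n in self.neighbors | set(candidate_ids)
                      if n != self.node_id and is_alive(n)}
        self.neighbors = set(sorted(
            candidates, key=lambda n: ring_distance(n, self.node_id))[:LEAF_SET_SIZE])

def periodic_recovery_round(nodes):
    """One period: every live node sends its neighbor list to each neighbor."""
    by_id = {n.node_id: n for n in nodes}
    def is_alive(i):
        return i in by_id and by_id[i].alive
    for node in nodes:
        if not node.alive:
            continue
        advert = set(node.neighbors) | {node.node_id}    # "my neighbors are ..."
        for nid in list(node.neighbors):
            if is_alive(nid):
                by_id[nid].merge(advert, is_alive)
```

Because each node sends exactly one message per neighbor per period whether or not ACKs are being dropped, recovery traffic stays at a fixed rate under congestion, which is what breaks the positive feedback loop of reactive recovery.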

