UH COSC 6360 - LOOKING UP DATA IN P2P SYSTEMS

Unformatted text preview:

LOOKING UP DATA IN P2P SYSTEMSKey IdeaINTRODUCTIONThe lookup problemPrevious solutions (I)Previous solutions (II)Previous solutions (III)Previous solutions (IV)DISTRIBUTED HASH TABLESMain design issuesCANNeighborsA two-dimensional example (I)A two-dimensional example (II)A path from (0.25, 0.3) to (0.8, 0.8)LookupDynamic behaviorFault-toleranceCHORDExampleFinger tablePartial exampleSlide 23Slide 24PASTRYPastry NodesSlide 27Prefix RoutingRouting table for node 1023Routing request for node 1230At node 1233TAPESTRYCONCLUSIONSSummary of costsLOOKING UP DATAIN P2P SYSTEMSHari Balakrishnan M. Frans Kaashoek David Karger Robert MorrisIon StoicaMIT LCSKey Idea•Survey paper•Discusses how to access data in a P2P system•Covers four solutions–CAN–Chord–Pastry–TapestryINTRODUCTION•P2P systems are popular due to–Low startup cost– High scalability at very low cost–Use of resources that would otherwise remain unused–Potential for greater robustness•Fully decentralized and distributedThe lookup problem•How do we locate data in large P2P systems?•One solution–Distributed hash tables (DHT)Previous solutions (I)•Centralized database–Napster•Not scalable •Vulnerable to attacks on databasePrevious solutions (II)•Broadcasting–Customers broadcast their requests to their neighbors, which forward them to their own neighbors and so on–Gnutella–Does not scale either•Broadcast messages consume too much bandwidthPrevious solutions (III)•Internet DNS–Organizes network nodes into an hierarchy–All searches start at top of hierarchy•Propagate down–Used by KaZaA, Grokster and others–Nodes higher in the tree do much more work than lower nodes–Solution vulnerable to loss of root node(s)Previous solutions (IV)•Freenet–Forwards queries from node to node until requested data are found–Emphasis is on anonymity •Not performance•Unpopular documents may become inaccessible –Nobody cares!DISTRIBUTED HASH TABLES•Implements primitive lookup(key)–Produces a path going from a node no to the node holding key•Big tradeoff is between–Keeping paths short–Minimizing state information kept by nodesMain design issues•Mapping keys to nodes in a balanced way–Use a hash function•Forwarding a lookup for a key to appropriate node–Find at each step a node closer to the node holding the key•Building routing tables–Each node should have a successorCAN•Uses a d-dimensional key space–Partitioned into hyper-rectangles•"Zones"–Each node manages a zone•Responsible for all keys in zoneNeighbors•Each node keeps track of addresses of all its neighbors –Routing table•Neighbors are defined as nodes sharing a (d-1) dimensional hyper-plane–Contacts with fewer dimensions in common do not countA two-dimensional example (I)A two-dimensional example (II)X(0, 0; 0.5, 0.5)(0, 0)(1, 0)(1, 1)(0, 1)In reality the state space wraps X(0, 0.5; 0.5, 1)X(0.5, 0.5; 1, 1)X(0.5, 0.25; 0.75, 0.5)X(0.5, 0; 0.75, 0.25)X(0.75, 0;1, 0.5)A path from (0.25, 0.3) to (0.8, 0.8)X(0, 0; 0.5, 0.5)(0, 0)(1, 0)(1, 1)(0, 1)In reality the state space wraps X(0, 0.5; 0.5, 1)X(0.5, 0.5; 1, 1)X(0.5, 0.25; 0.75, 0.5)X(0.5, 0; 0.75, 0.25)X(0.75, 0;1, 0.5)Lookup•Routing tries to approximate the straight path between current zone and zone holding the key•Various optimizations attempt to reduce lookup latencyDynamic behavior•When a node joins the network –It picks random point in space–Find node managing the zone–Splits with it current zone•When a node departs–Zones are merged•More complex processFault-tolerance•When a node fails neighbor with smallest zone takes over–Multiple failures may cause too many nodes to handle multiple zonesCHORD•Assigns ID's to keys and nodes in the same address space•ID's are organized in a ring–ID 0 follows the highest ID•Each node is responsible for all keys that immediately precede it in the key spaceExampleN 4K 6N 12N 20N 24K1K 10K 15Finger table•Each node keeps a table containing IP addresses of nodes– Halfway around in the key space–Quarter-of-the-way around– …•Table has log N entries–Allows O(log N) searchesPartial exampleN 4N 12N 20N 24Fault-tolerance•Each node has a successor list –Contains IP addresses of next r successors•Guarantees routing progress as long as all r successors are not downDynamic behavior•New node n learns its place in the Chord ring by asking any extant node to do a lookup(n)•Must also–Update successor list of its predecessor–Create its own successor listPASTRYPASTRY•Scalable, self-organizing, routing and object Scalable, self-organizing, routing and object location infrastructurelocation infrastructure•Each node has a node IDEach node has a node ID–IDs are uniformly distributed in the ID spaceIDs are uniformly distributed in the ID space•Includes a Includes a proximity metricproximity metric to measure to measure distances between pairs of ID'sdistances between pairs of ID'sPastry NodesPastry Nodes•Each node maintains three sets of nodesEach node maintains three sets of nodes–Leaf setLeaf set•Closest nodes in terms of node ID'sClosest nodes in terms of node ID's•Same function as Chord's successor listSame function as Chord's successor list–Nodes in routing tableNodes in routing table•Prefix routing (big idea)Prefix routing (big idea)–Neighborhood setNeighborhood set•Closest nodes in terms of proximity metricClosest nodes in terms of proximity metricDynamic behaviorDynamic behavior•Pastry is self-organizingPastry is self-organizing– Nodes come and goNodes come and go– Includes a seed discovery protocolIncludes a seed discovery protocolPrefix RoutingPrefix Routing•At each step, a node forwards an incoming At each step, a node forwards an incoming request to a node whose node id has largest request to a node whose node id has largest common prefix with common prefix with •Destination ID: Destination ID: 12301230•Node ID: Node ID: 1023023•Next Hop: Next Hop: 1212----Routing table for node 10230221 2230 31201130 1233 13021003 1013 10321020 1022No common prefixOne common digitTwo common digitsThree common digitsRouting request for node 1230 0221 2230 31201130122313021003 1013 10321020 1022No common prefixOne common digitTwo common digitsThree common digitsRequest is always send to a node having at least one more common prefix digit. Here it's node 1223At node 12330221 2230 31201030 1130 13021201 1211


View Full Document

UH COSC 6360 - LOOKING UP DATA IN P2P SYSTEMS

Documents in this Course
Load more
Download LOOKING UP DATA IN P2P SYSTEMS
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view LOOKING UP DATA IN P2P SYSTEMS and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view LOOKING UP DATA IN P2P SYSTEMS 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?