EE 122 Peer to Peer P2P Networks Ion Stoica November 27 2002 How Did it Start A killer application Naptser Free music over the Internet Key idea share the storage and bandwidth of individual home users Internet istoica cs berkeley edu 2 Model Each user stores a subset of files Each user has access can download files from all users in the system istoica cs berkeley edu 3 Main Challenge Find where a particular file is stored E F D E A C B istoica cs berkeley edu 4 Other Challenges Scale up to hundred of thousands or millions of machines Dynamicity machines can come and go any time istoica cs berkeley edu 5 Napster Assume a centralized index system that maps files songs to machines that are alive How to find a file song Query the index system return a machine that stores the required file Ideally this is the closest least loaded machine ftp the file Advantages Simplicity easy to implement sophisticated search engines on top of the index system Disadvantages Robustness scalability istoica cs berkeley edu 6 Napster Example m5 E m6 F E E E m5 m1 m2 m3 m4 m5 m6 m4 C A m1 D A B C D E F B m3 m2 istoica cs berkeley edu 7 Gnutella Distribute file location Idea multicast the request Hot to find a file Send request to all neighbors Neighbors recursively multicast the request Eventually a machine that has the file receives the request and it sends back the answer Advantages Totally decentralized highly robust Disadvantages Not scalable the entire network can be swamped with request to alleviate this problem each request has a TTL istoica cs berkeley edu 8 Gnutella Example Assume m1 s neighbors are m2 and m3 m3 s neighbors are m4 and m5 m5 E m6 F E E D E m4 E E C A m1 B m3 m2 istoica cs berkeley edu 9 FastTrack Use the concept of supernode A combination between Napster and Gnutella When a user joins the network it joins a supernode A supernode acts like Napster server for all users connected to it Queries are brodcasted amongst supernodes like Gnutella istoica cs berkeley edu 10 Other Solutions to the Location Problem Goal make sure that an item file identified is always found Abstraction a distributed hash table data structure insert id item item query id Note item can be anything a data object document file pointer to a file Proposals also called structured p2p networks CAN ACIRI Berkeley Chord MIT Berkeley Pastry Rice Tapestry Berkeley istoica cs berkeley edu 11 Content Addressable Network CAN Associate to each node and item a unique id in an d dimensional space Properties Routing table size O d Guarantee that a file is found in at most d n1 d steps where n is the total number of nodes istoica cs berkeley edu 12 CAN Example Two Dimensional Space Space divided between nodes All nodes cover the entire space Each node covers either a square or a rectangular area of ratios 1 2 or 2 1 Example Assume space size 8 x 8 Node n1 1 2 first node that joins cover the entire space 7 6 5 4 3 n1 2 1 0 0 istoica cs berkeley edu 1 2 3 4 5 6 13 7 CAN Example Two Dimensional Space Node n2 4 2 joins space is divided between n1 and n2 7 6 5 4 3 n2 n1 2 1 0 0 istoica cs berkeley edu 1 2 3 4 5 6 14 7 CAN Example Two Dimensional Space Node n2 4 2 joins space is divided between n1 and n2 7 6 n3 5 4 3 n2 n1 2 1 0 0 istoica cs berkeley edu 1 2 3 4 5 6 15 7 CAN Example Two Dimensional Space Nodes n4 5 5 and n5 6 6 join 7 6 n5 n4 n3 5 4 3 n2 n1 2 1 0 0 istoica cs berkeley edu 1 2 3 4 5 6 16 7 CAN Example Two Dimensional Space Nodes n1 1 2 n2 4 2 n3 3 5 n4 5 5 n5 6 6 Items f1 2 3 f2 5 1 f3 2 1 f4 7 5 7 6 n5 n4 n3 5 f4 4 f1 3 n2 n1 2 f3 1 f2 0 0 istoica cs berkeley edu 1 2 3 4 5 6 17 7 CAN Example Two Dimensional Space Each item is stored by the node who owns its mapping in the space 7 6 n5 n4 n3 5 f4 4 f1 3 n2 n1 2 f3 1 f2 0 0 istoica cs berkeley edu 1 2 3 4 5 6 18 7 CAN Query Example Each node knows its neighbors in the d space Forward query to the neighbor that is closest to the query id 7 Example assume n1 queries f4 6 n5 n4 n3 5 f4 4 f1 3 n2 n1 2 f3 1 f2 0 0 istoica cs berkeley edu 1 2 3 4 5 6 19 7 Chord Associate to each node and item a unique id in an uni dimensional space Properties Routing table size O log N where N is the total number of nodes Guarantees that a file is found in O log N steps istoica cs berkeley edu 20 Data Structure Assume identifier space is 0 2m Each node maintains Finger table Entry i in the finger table of n is the first node that succeeds or equals n 2i Predecessor node An item identified by id is stored on the succesor node of id istoica cs berkeley edu 21 Chord Example Assume an identifier space 0 8 Node n1 1 joins all entries in its finger table are initialized to itself Succ Table i id 2i succ 0 2 1 1 3 1 2 5 1 0 1 7 6 2 5 istoica cs berkeley edu 4 3 22 Chord Example Node n2 3 joins Succ Table i id 2i succ 0 2 2 1 3 1 2 5 1 0 1 7 6 2 Succ Table 5 istoica cs berkeley edu 4 3 i id 2i succ 0 3 1 1 4 1 2 6 1 23 Chord Example Succ Table Nodes n3 0 n4 6 join i id 2i succ 0 1 1 1 2 2 2 4 6 Succ Table i id 2i succ 0 2 2 1 3 6 2 5 6 0 1 7 Succ Table i id 2i succ 0 7 0 1 0 0 2 2 2 6 2 Succ Table 5 istoica cs berkeley edu 4 3 i id 2i succ 0 3 6 1 4 6 2 6 6 24 Chord Examples Succ Table Nodes n1 1 n2 3 n3 0 n4 6 Items f1 7 f2 2 i id 2i succ 0 1 1 1 2 2 2 4 6 0 Succ Table i id 2i succ 0 7 0 1 0 0 2 2 2 Succ Table Items i id 2i succ 1 0 2 2 1 3 6 2 5 6 1 7 6 Items 7 2 Succ Table 5 …
View Full Document