Peer-to-Peer
Jeff Pang
15-441 Spring 2004

Intro
• Quickly grown in popularity
  – Dozens or hundreds of file sharing applications
  – 35 million American adults use P2P networks -- 29% of all Internet users in US!
  – Audio/Video transfer now dominates traffic on the Internet
• But what is P2P?
  – Searching or location? -- DNS, Google!
  – Computers “Peering”? -- Server Clusters, IRC Networks, Internet Routing!
  – Clients with no servers? -- Doom, Quake!

Intro (2)
• Fundamental difference: Take advantage of resources at the edges of the network
• What’s changed:
  – End-host resources have increased dramatically
  – Broadband connectivity now common
• What hasn’t:
  – Deploying infrastructure still expensive

Overview
• Centralized Database
  – Napster
• Query Flooding
  – Gnutella
• Intelligent Query Flooding
  – KaZaA
• Swarming
  – BitTorrent
• Unstructured Overlay Routing
  – Freenet
• Structured Overlay Routing
  – Distributed Hash Tables

The Lookup Problem
[Figure: nodes N1 through N6 connected across the Internet. A publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”)?]

The Lookup Problem (2)
• Common Primitives:
  – Join: how do I begin participating?
  – Publish: how do I advertise my file?
  – Search: how do I find a file?
  – Fetch: how do I retrieve a file?

Napster: History
• In 1999, S. Fanning launches Napster
• Peaked at 1.5 million simultaneous users
• Jul 2001, Napster shuts down

Napster: Overview
• Centralized Database:
  – Join: on startup, client contacts central server
  – Publish: reports list of files to central server
  – Search: query the server => return someone that stores the requested file
  – Fetch: get the file directly from peer

Napster: Publish
[Figure: the peer at 123.2.21.23 announces “I have X, Y, and Z!”; its Publish message makes the server record insert(X, 123.2.21.23), ...]

Napster: Search
[Figure: a client asks “Where is file A?”; the Query goes to the server, the Reply is search(A) --> 123.2.0.18, and the client then Fetches the file from 123.2.0.18.]

Napster: Discussion
• Pros:
  – Simple
  – Search scope is O(1)
  – Controllable (pro or con?)
• Cons:
  – Server maintains O(N) State
  – Server does all processing
  – Single point of failure

Gnutella: History
• In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella
• Soon many other clients: Bearshare, Morpheus, LimeWire, etc.
• In 2001, many protocol enhancements including “ultrapeers”

Gnutella: Overview
• Query Flooding:
  – Join: on startup, client contacts a few other nodes; these become its “neighbors”
  – Publish: no need
  – Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender.
  – Fetch: get the file directly from peer

Gnutella: Search
[Figure: a node asks “Where is file A?”; the Query spreads from neighbor to neighbor, and two nodes answer “I have file A.” with a Reply.]

Gnutella: Discussion
• Pros:
  – Fully de-centralized
  – Search cost distributed
• Cons:
  – Search scope is O(N)
  – Search time is O(???)
  – Nodes leave often, network unstable

Aside: Search Time?

Aside: All Peers Equal?
[Figure: peers with unequal access links: three on 56 kbps modems, four on 1.5 Mbps DSL, one on a 10 Mbps LAN.]

Aside: Network Resilience
[Figure: three panels: Partial Topology, Random 30% die, Targeted 4% die. From Saroiu et al., MMCN 2002.]

KaZaA: History
• In 2001, KaZaA created by Dutch company Kazaa BV
• Single network called FastTrack used by other clients as well: Morpheus, giFT, etc.
• Eventually protocol changed so other clients could no longer talk to it
• Most popular file sharing network today with >10 million users (number varies)

KaZaA: Overview
• “Smart” Query Flooding:
  – Join: on startup, client contacts a “supernode” ... may at some point become one itself
  – Publish: send list of files to supernode
  – Search: send query to supernode, supernodes flood query amongst themselves.
  – Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers

KaZaA: Network Design
[Figure: network topology with the “Super Nodes” labeled.]

KaZaA: File Insert
[Figure: the peer at 123.2.21.23 announces “I have X!”; its Publish message makes the supernode record insert(X, 123.2.21.23), ...]

KaZaA: File Search
[Figure: a client asks “Where is file A?”; the Query is flooded among supernodes, and the Replies are search(A) --> 123.2.0.18 and search(A) --> 123.2.22.50.]

KaZaA: Fetching
• More than one node may have the requested file...
• How to tell?
  – Must be able to distinguish identical files
  – Not necessarily same filename
  – Same filename not necessarily same file...
• Use Hash of file
  – KaZaA uses UUHash: fast, but not secure
  – Alternatives: MD5, SHA-1
• How to fetch?
  – Get bytes [0..1000] from A, [1001..2000] from B
  – Alternative: Erasure Codes

KaZaA: Discussion
• Pros:
  – Tries to take into account node heterogeneity:
    • Bandwidth
    • Host Computational Resources
    • Host Availability (?)
  – Rumored to take into account network locality
• Cons:
  – Mechanisms easy to circumvent
  – Still no real guarantees on search scope or search time

BitTorrent: History
• In 2002, B. Cohen debuted BitTorrent
• Key
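Sketch: Gnutella-style query flooding

The Gnutella search primitive from the overview slide (ask neighbors, who ask their neighbors, and so on) can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the Gnutella wire protocol: the function name flood_search, the dictionary-based topology, the hop limit ttl, and the seen set used to suppress duplicate queries are all choices made here for illustration. The second return value counts how many nodes had to handle the query, which is the "search scope" the discussion slide describes as O(N).

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Flood a query outward from `start`, breadth first.

    neighbors: dict mapping node name -> list of neighbor names (hypothetical topology)
    files:     dict mapping node name -> set of file names stored there
    ttl:       assumed hop limit; a query is not forwarded once it reaches 0

    Returns (hits, contacted): the nodes that hold `wanted`, in the order the
    query reached them, and the number of nodes other than `start` that
    received the query.
    """
    seen = {start}                 # nodes that already received this query
    queue = deque([(start, ttl)])  # (node, hops the query may still travel)
    hits = []
    while queue:
        node, hops_left = queue.popleft()
        # Every node except the asker checks its own store and "replies".
        if node != start and wanted in files.get(node, set()):
            hits.append(node)
        if hops_left == 0:
            continue               # hop limit reached: do not forward
        for peer in neighbors.get(node, []):
            if peer not in seen:   # forward each query to a node only once
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return hits, len(seen) - 1


if __name__ == "__main__":
    # Hypothetical six-node overlay; N5 and N6 both store file "A".
    neighbors = {
        "N1": ["N2", "N3"],
        "N2": ["N1", "N4"],
        "N3": ["N1", "N4", "N5"],
        "N4": ["N2", "N3", "N6"],
        "N5": ["N3"],
        "N6": ["N4"],
    }
    files = {"N5": {"A"}, "N6": {"A"}}
    for ttl in (1, 2, 3):
        print(ttl, flood_search(neighbors, files, "N1", "A", ttl))
```

Raising the hop limit finds more copies of the file but also makes more nodes process the query, which is the trade-off behind the "search scope" and "search time" entries in the Gnutella discussion.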
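Sketch: fetching byte ranges from several peers and checking the hash

The KaZaA fetching slide combines two ideas: identify a file by a hash of its contents, and pull different byte ranges from different peers. The following is a minimal sketch of that combination, assuming peers are plain in-memory byte strings rather than network hosts. The helper names (sha1_hex, split_ranges, fetch_range, fetch_from_peers), the round-robin assignment of ranges to peers, and the default range size are hypothetical; SHA-1 is used because the slide lists it as an alternative, not because KaZaA uses it (the slide says KaZaA uses UUHash).

```python
import hashlib


def sha1_hex(data):
    """Content hash used to tell identical files apart, whatever their names."""
    return hashlib.sha1(data).hexdigest()


def split_ranges(size, chunk):
    """Return inclusive (start, end) byte ranges that cover 0..size-1."""
    return [(start, min(start + chunk, size) - 1) for start in range(0, size, chunk)]


def fetch_range(peer_data, start, end):
    """Stand-in for a network request: bytes start..end (inclusive) from one peer."""
    return peer_data[start:end + 1]


def fetch_from_peers(peers, size, expected_hash, chunk=1001):
    """Fetch consecutive ranges from the peers in round-robin order, then verify.

    peers: dict mapping peer name -> the bytes that peer claims to hold
    Raises ValueError if the reassembled file does not match expected_hash.
    """
    names = sorted(peers)
    parts = []
    for i, (start, end) in enumerate(split_ranges(size, chunk)):
        peer = names[i % len(names)]  # e.g. first range from A, second from B
        parts.append(fetch_range(peers[peer], start, end))
    data = b"".join(parts)
    if sha1_hex(data) != expected_hash:
        raise ValueError("hash mismatch: at least one peer sent different content")
    return data


if __name__ == "__main__":
    original = bytes(range(256)) * 10  # 2560 bytes of sample data
    copy = fetch_from_peers({"A": original, "B": original},
                            len(original), sha1_hex(original))
    print(len(copy), copy == original)
```

Verifying only the whole-file hash, as done here, detects a bad peer but does not say which range was wrong; that limitation is a property of this sketch, not a claim about any real client.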