1Peer-to-PeerProtocols and SystemsTA: David Murray15-441 Spring 20064/19/20062P2P - Outline• What is P2P?• P2P System Types– 1) File-sharing– 2) File distribution– 3) Streaming• Uses & Challenges3Problem: Scalability• Hundreds of clients => 1 server–OK• Thousands of clients => 1 server– Maybe OK• Millions/billions of clients => 1 server– What happens?...45Solution:Distribute the cost among the end users6Three Classes of P2P Systems• 1) File-sharing– (old) Napster (centralized)– Gnutella (flooding)– KaZaA (intelligent flooding)– DHTs/Chord (structured overlay routing)• 2) File distribution–BitTorrent• 3) Streaming– End System Multicast (a.k.a. Overlay Multicast)21) P2P File-sharing Systems81) P2P File-sharing Systems• Centralized Database– (old) Napster• Query Flooding– Gnutella• Intelligent Query Flooding– KaZaA• Structured Overlay Routing– Distributed Hash Tables9File searchingInternetN1N2N3N6N5N4PublisherKey=“title”Value=MP3 data…ClientLookup(“title”)?10File searching• Needles vs. Haystacks– Searching for top 40, or an obscure punk track from 1981 that nobody’s heard of?• Search expressiveness– Whole word? Regular expressions? File names? Attributes? Whole-text search?• (e.g., p2p gnutella or p2p google?)11File-sharing: Framework• Common Primitives:– Join: how do I begin participating?– Publish: how do I advertise my file?– Search: how do I find a file?– Fetch: how do I retrieve a file?12P2P File-sharing Systems• Centralized Database– (old) Napster• Query Flooding– Gnutella• Intelligent Query Flooding– KaZaA• Structured Overlay Routing– Distributed Hash Tables313(old) Napster: History• 1999: Sean Fanning launches Napster• Peaked at 1.5 million simultaneous users• Jul 2001: Napster shuts down• [2003: Napster’s name reused for an online music service (no relation)]14(old) Napster: Overview• Centralized Database:– Join: on startup, client contacts central server– Publish: reports list of files to central server– Search: query the server => return someone that stores the requested file– Fetch: get the file directly from peer15Napster: PublishI have lots of TV show theme song files!Publishinsert(XenaThemeSong.mp3, 123.2.21.23)Insert(HerculesThemeSong.mp3, 123.2.21.23)...123.2.21.2316Napster: Search“Where is the XenaTheme song?”Replysearch(Xena)FetchQuery17Napster: Discussion•Pros:–Simple– Search scope is O(1)– Controllable (pro or con?)• Cons:– Server maintains O(N) State– Server does all processing– Single point of failure18P2P File-sharing Systems• Centralized Database– (old) Napster• Query Flooding– Gnutella• Intelligent Query Flooding– KaZaA• Structured Overlay Routing– Distributed Hash Tables419Gnutella: History• In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella• Soon many other clients: Bearshare, Morpheus, LimeWire, etc.• In 2001, many protocol enhancements including “ultrapeers”20Gnutella: Overview• Query Flooding:– Join: on startup, client contacts a few other nodes; these become its “neighbors”– Publish: (no need)– Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender.• TTL limits propagation– Fetch: get the file directly from peer21“I have XenaEpisode1.mpg”“I haveXenaEpisode1.mpg”Gnutella: Search“Where isXenaEpisode1.mpg?”QueryReply22Gnutella: Discussion•Pros:– Fully de-centralized– Search cost distributed– Processing @ each node permits powerful search semantics• Cons:– Search scope is O(N)– Search time is O(???)– Nodes leave often, network unstable• TTL-limited search works well for haystacks.– For scalability, does NOT search every node. May have to re-issue query later23P2P File-sharing Systems• Centralized Database– (old) Napster• Query Flooding– Gnutella• Intelligent Query Flooding– KaZaA• Structured Overlay Routing– Distributed Hash Tables24KaZaA: History• In 2001, KaZaA created by Dutch company Kazaa BV• Single network called FastTrack used by other clients as well: Morpheus, giFT, etc.• Eventually protocol changed so other clients could no longer talk to it525KaZaA: Overview• “Smart” Query Flooding:– Join: on startup, client contacts a “supernode” ... may at some point become one itself– Publish: send list of files to supernode– Search: send query to supernode, supernodesflood query amongst themselves.– Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers26KaZaA: Network Design“Super Nodes”27KaZaA: File InsertI have lots of files!Publishinsert(Xena.mp3,123.2.21.23)Insert(Hercules.mp3,123.2.21.23)123.2.21.2328KaZaA: File SearchWhere is Xena.mp3?Querysearch(Xena.mp3)-->123.2.0.18Repliessearch(Xena.mp3)-->123.2.22.50123.2.0.18123.2.22.5029KaZaA: Fetching• More than one node may have requested file...• How to tell?– Must be able to distinguish identical files– Not necessarily same filename– Same filename not necessarily same file...• Use Hash of file– KaZaA uses its own “UUHash”: fast, but not secure– Alternatives: MD5, SHA-1• How to fetch?– Get bytes [0..1000] from A, [1001...2000] from B– Alternative: Erasure Codes30KaZaA: Discussion•Pros:– Tries to take into account node heterogeneity:• Bandwidth• Host Computational Resources• Host Availability (?)– Rumored to take into account network locality•Cons:– Mechanisms easy to circumvent– Still no real guarantees on search scope or search time• Similar behavior to Gnutella, but better.631Stability and Superpeers• Why supernodes?– Query consolidation• Many connected nodes may have only a few files• Propagating a query to a sub-node would take more b/w than answering it yourself– Caching effect• Requires network stability• Supernode selection is time-based– How long you’ve been on is a good predictor of how long you’ll be around.32P2P File-sharing Systems• Centralized Database– (old) Napster• Query Flooding– Gnutella• Intelligent Query Flooding– KaZaA• Structured Overlay Routing– Distributed Hash Tables33Distributed Hash Tables• Academic answer to P2P•Goals– Guaranteed lookup success– Provable bounds on search time– Provable scalability• Makes some things harder– Fuzzy queries / full-text search / etc.• Read-write, not read-only• Hot Topic in networking since introduction
View Full Document