Slide Number 1AcknowledgementAdministrative Plan for Today Some QuestionsCurrent State of the ArtPeer-to-Peer SystemsP2P Systems ClassificationMetrics for Search/Insertion/JoinNapster Brief HistoryNapster StructureNapster OperationsSlide Number 14Napster SearchNapster: peer-to-peer file sharing with a centralized, replicated indexProblemsGnutellaGnutellaHow do I search for my Beatles file?Slide Number 21Slide Number 22Gnutella SearchSlide Number 24Gnutella SearchAvoiding Excessive trafficDealing with FirewallsSlide Number 28Dealing with FirewallsPing-Pong (Membership Service)Summary of Control Messages (Ping/Pong/Query/Hit Routing)After receiving QueryHit messagesGnutella SummarySummaryLecture 12Peer-to-Peer systems(Search Capabilities in Distributed Systems)Sections 10.1, 10.2, plus Paper“The Gnutella Protocol Specification v0.4”CS 425/ECE 428/CSE 424Distributed Systems(Fall 2009)Acknowledgement• The slides during this semester are based on ideas and material from the following sources: – Slides prepared by Professors M. Harandi, J. Hou, I. Gupta, N. Vaidya, Y-Ch. Hu, S. Mitra. – Slides from Professor S. Gosh’s course at University o Iowa.Administrative • HW 2 posted September 22, Tuesday– Deadline, October 6 (Tuesday), 2pm (at the beginning of the class)• Midterm on October 13 (Tuesday)– Exam will take place in class– You are allowed one cheat-sheet (one side only)– Exam will include all topics covered in HW1-HW2, plus P2P material (Lectures 1-13)Plan for Today• Introduction to P2P • Napster• Gnuttella • Fast-TrackPeer to peer systems(Next few lectures)D.S. TheoryTwo Angles of Distributed SystemsSome Questions• Why do people get together?– to share information – to share and exchange resources they have• books, class notes, experiences, music cd’s• How can computers help people – find information– find resources– exchange and share resources• Need Search Capabilities in Distributed Systems!Current State of the Art• Existing technologies: The Web!– Search engines– Forums: chat rooms, blogs, ebay– Online business• But, the web is heavy weight if you want specific resources: say a Beatles’ song “PennyLane”• A Google search will give you their bio, lyrics, chords, articles on them, and then perhaps the mp3• But you want only the song, nothing else!Peer-to-Peer Systems• If you can find a peer who wouldn’t mind exchanging her Beatles songs for your Miles Davis recordings, that would be great! • Napster: a light weight solution (lighter than the Web)P2P Systems Classification• Hybrid– Centralized index, but P2P file storage and transfer• Pure– Functionality completely distributed• Super-peer or Hierarchical – A “pure network of “hybrid” clustersMetrics for Search/Insertion/Join• Cost (aggregate)– Number of messages or interactions– Bandwidth – Processing poswer• Quality of Results– Number of results– Satisfaction (true if # results >= X, false otherwise)– Time to satisfactionNapster Brief History• [6/99] Shawn Fanning (freshman Northeastern U.) releases Napster online music service• [12/99] RIAA (Recording Industry Association of America) sues Napster, asking $100K per download• [3/00] 25% UWisc traffic Napster, many universities ban it• [00] 60M users• [2/01] US Federal Appeals Court: users violating copyright laws, Napster is abetting this• [9/01] Napster decides to run paid service, pay % to songwriters and music companies• [Today] Napster protocol is open, people free to develop opennap clients and servers http://opennap.sourceforge.net• [Today] eDonkey, BitTorrent, …Napster StructureSSSPPPPPPClient machines (“Peers”)napster.com Servers(Index Servers)Store their ownfilesStore a directory, i.e., filenames with peer pointers Filename Info aboutPennyLane.mp3 Beatles, @128.84.92.23:1006…..Napster Operations• Client– Connect to a Napster server (with well-known public address)– Upload list of music files that you want to share (names only, not the files themselves!)• Server maintains list of <filename, ip_address, portnum> tuples• Search Protocol from a client:– Send server keywords to search with– (Server searches its directory with the keywords)– Server returns a list of matching hosts –• <ip_address, portnum> tuples to client– Client pings each host in the list to find transfer rates – Client fetches file from best host• All communication uses TCP/IPNapster SearchSSSPPPPPPPeersnapster.com Servers(Index Servers)Store their ownfilesStore peer pointers for all files3. Response1. Query2. All servers search their lists (ternary tree algo.)4. ping candidates5. download from best hostNapster: peer-to-peer file sharing with a centralized, replicated indexNapst er ser verIndex1. File location2. List of peersrequestof f ering the f ilepeer s3. File request4. File deliv ered5. Index update Napst er ser verIndexProblems• Centralized server a source of congestion• Centralized server single point of failure• No security: plaintext messages and passwds• napster.com responsible for abetting users’ copyright violation– “Indirect infringement”Gnutella• Unstructured Peer-to-Peer System (in terms of search capabilities) - File sharing network• Eliminates the servers• Client machines search and retrieve amongst themselves• Clients act as servers too, called servents• [3/00] release by AOL, immediately withdrawn, but 88K users by 3/03•Original design underwent several modifications; we’ll look at the initial versionhttp://www.limewire.comGnutellaPPPPPPServents (“Peers”)PConnected in an overlay graphStore their ownfilesAlso store “peer pointers”a peer pointerHow do I search for my Beatles file?• Gnutella routes different messages within the overlay graph• Gnutella protocol has 5 main message types– Query – search for a file– QueryHit - response to query– Ping - discover hosts on network ; to probe network for other peers– Pong (reply to ping, contains address of another peer)– Push - download request for firewalled servents (used to initiate file transfer)• We’ll go into the message structure and protocol now(note: all fields except IP address are in little-endian format)Descriptor ID Payload descriptor TTL Hops Payload lengthDescriptor HeaderType of payload0x00 Ping0x01 Pong0x40 Push0x80 Query0x81 QueryhitDecremented at each hop,Message dropped when ttl=0ttl_initial
View Full Document