EE 122: Overlay Networks and P2P Networks
Ion Stoica
TAs: Junda Liu, DK Moon, David Zats
http://inst.eecs.berkeley.edu/~ee122/fa09
(Materials with thanks to Vern Paxson, Jennifer Rexford, and colleagues at UC Berkeley)

Announcements
- No class Wednesday. Happy Thanksgiving!
- Homework 3 grades available by Wednesday
- Homework 4 due on Wednesday, December 2

Overlay Networks: Motivations
- Changes in the network happen very slowly
- Why?
  - The Internet is a shared infrastructure; changes need consensus (IETF)
  - Many proposals require changing a large number of routers (e.g., IP Multicast, QoS); otherwise end users won’t benefit
- Proposed changes that haven’t happened yet on a large scale:
  - More addresses (IPv6, ‘91)
  - Security (IPsec, ‘93)
  - Multicast (IP multicast, ‘90)

Motivations (cont’d)
- One size does not fit all
- Applications need different levels of
  - Reliability
  - Performance (latency)
  - Security
  - Access control (e.g., who is allowed to join a multicast group)
  - …

Goals
- Make it easy to deploy new functionalities in the network, to accelerate the pace of innovation
- Allow users to customize their service

Solution
- Deploy processing in the network
- Have packets processed as they traverse the network
[Figure: an overlay network (over IP) drawn above the IP network; label: AS-1]

Overview
- Resilient Overlay Network (RON)
- Overlay Multicast
- Peer-to-peer systems

Resilient Overlay Network (RON)
- Premise: overlay networks can increase the performance and reliability of routing
- Install N computers at different Internet locations
- Each computer acts as an overlay network router
  - Between each pair of overlay routers is an IP tunnel (logical link)
  - Logical overlay topology is all-to-all (N^2)
- Computers actively measure each logical link in real time for
  - Packet loss rate, latency, throughput, etc.
- Route overlay network traffic based on measured characteristics

Example
- Default IP path determined by BGP & OSPF
- Reroute traffic using the red alternative overlay network path to avoid the congestion point
[Figure: hosts at Berkeley, MIT, and UCLA; label: "Acts as overlay router"]
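
The RON idea above (measure every logical link, then route on the measurements) can be sketched roughly as follows. This is a minimal illustration and not RON's actual algorithm: it assumes latency is the only metric, it considers only the direct path and detours through a single intermediate overlay node, and the latency numbers are invented. The names `latency_ms`, `link_latency`, and `best_overlay_path` are hypothetical, as is the choice of UCLA as the intermediate overlay router.

```python
# Hypothetical measured latencies (ms) on the logical links between overlay
# nodes. The values are invented for illustration; a real overlay would
# refresh them continuously from active probes.
latency_ms = {
    ("Berkeley", "MIT"): 120.0,   # direct IP path, assumed to be congested
    ("Berkeley", "UCLA"): 10.0,
    ("UCLA", "MIT"): 40.0,
}


def link_latency(a, b):
    """Look up the measured latency of the logical link a-b (symmetric)."""
    if (a, b) in latency_ms:
        return latency_ms[(a, b)]
    return latency_ms[(b, a)]


def best_overlay_path(src, dst, nodes):
    """Return (path, latency) for the cheapest of: the direct link, or a
    detour through exactly one intermediate overlay node."""
    best_path = [src, dst]
    best_cost = link_latency(src, dst)
    for hop in nodes:
        if hop in (src, dst):
            continue
        cost = link_latency(src, hop) + link_latency(hop, dst)
        if cost < best_cost:
            best_path, best_cost = [src, hop, dst], cost
    return best_path, best_cost


if __name__ == "__main__":
    nodes = ["Berkeley", "MIT", "UCLA"]
    print(best_overlay_path("Berkeley", "MIT", nodes))
```

With these assumed measurements the detour through UCLA (10 + 40 ms) beats the direct link (120 ms), so traffic is rerouted around the congested default path.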
Overview
- Resilient Overlay Network (RON)
- Overlay multicast
- Peer-to-peer systems

IP Multicast Problems
- Twenty years of research, still not widely deployed
- Poor scalability
  - Routers need to maintain per-group or even per-group and per-sender state!
  - Multicast addresses cannot be aggregated
- Supporting higher-level functionality is difficult
  - IP Multicast: best-effort multi-point delivery service
  - Reliability and congestion control for IP Multicast are complicated
- No support for access control
  - Nor any restriction on who can send, so it is easy to mount Denial of Service (DoS) attacks!

Overlay Approach
- Provide IP multicast functionality above the IP layer: application-level multicast
- Challenge: do this efficiently
- Projects:
  - Narada
  - Overcast
  - Scattercast
  - Yoid
  - Coolstreaming (Roxbeam)
  - Rawflow

Narada [Yang-hua et al., 2000]
- Source-specific trees
- Involves only end hosts
- Small group sizes: <= hundreds of nodes
- Typical application: chat

Narada: End System Multicast
[Figure: physical topology (Stanford: Stan1, Stan2; Berkeley: Berk1, Berk2; Gatech; CMU) and the overlay tree built over the same end hosts]

Properties
- Easier to deploy than IP Multicast
  - Don’t have to modify every router on the path
- Easier to implement reliability than IP Multicast
  - Use hop-by-hop retransmissions
- But
  - Consumes more bandwidth than IP Multicast
  - Typically has higher latency than IP Multicast
  - Harder to scale
- Optimization: use IP Multicast where available

Overview
- Resilient Overlay Network (RON)
- Overlay multicast
- Peer-to-peer systems

How Did it Start?
- A killer application: Napster
  - Free music over the Internet
- Key idea: share the storage and bandwidth of individual (home) users
[Figure label: Internet]

Model
- Each user stores a subset of files
- Each user has access to (can download) files from all users in the system

Main Challenge
- Find where a particular file is stored
  - Note: the problem is similar to finding a particular page in web caching (what are the differences?)
[Figure: files A, B, C, D, E, F spread across machines; a query "E?"]
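
To make the lookup problem concrete, here is a minimal sketch of the model above: each machine holds a subset of the files, and without any further structure, finding a file means asking machines one by one. The machine names, the file placement, and the `locate_by_probing` helper are assumptions made for illustration only.

```python
# Hypothetical placement: each machine (user) stores a subset of the files.
stored_files = {
    "m1": {"A"},
    "m2": {"B"},
    "m3": {"C"},
    "m4": {"D"},
    "m5": {"E"},
    "m6": {"F"},
}


def locate_by_probing(filename, machines):
    """Naive lookup: ask machines in turn until one has the file.

    Returns (machine, probes), or (None, probes) if no machine has it.
    In the worst case this contacts every machine in the system.
    """
    probes = 0
    for name, files in machines.items():
        probes += 1
        if filename in files:
            return name, probes
    return None, probes


if __name__ == "__main__":
    print(locate_by_probing("E", stored_files))  # stored on m5
    print(locate_by_probing("Z", stored_files))  # not stored anywhere
```

The number of probes grows linearly with the number of machines, which is why locating a file is the main challenge once the system grows large.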
Other Challenges
- Scale: up to hundreds of thousands or millions of machines
- Dynamicity: machines can come and go at any time

Napster
- Assume a centralized index system that maps files (songs) to machines that are alive
- How to find a file (song)
  - Query the index system, which returns a machine that stores the required file
    - Ideally this is the closest/least-loaded machine
  - ftp the file
- Advantages:
  - Simplicity; easy to implement sophisticated search engines on top of the index system
- Disadvantages:
  - Robustness, scalability (?)

Napster: Example
[Figure: machines m1-m6 store files A-F; the central index maps m1: A, m2: B, m3: C, m4: D, m5: E, m6: F. A query "E?" to the index returns m5; the client then asks m5 for E and receives it.]

Gnutella
- Distribute file location
- Idea: broadcast the request
- How to find a file?
  - Send the request to all neighbors
  - Neighbors recursively multicast the request
  - Eventually a machine that has the file receives the request, and it sends back the answer
- Advantages:
  - Totally decentralized, highly robust
- Disadvantages:
  - Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)

Gnutella: Example
- Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5; …
[Figure: machines m1-m6 store files A-F; the query "E?" is forwarded from neighbor to neighbor until it reaches the machine holding E, which returns E]

Two-Level Hierarchy
- Current Gnutella implementation, KaZaa
- Leaf nodes are connected to a small number of ultrapeers (supernodes)
- Query
  - A leaf sends a query to its ultrapeers
  - If the ultrapeers don’t know the answer, they flood the query to other ultrapeers
- More scalable:
  - Flooding only among ultrapeers
[Figure: Oct 2003 crawl on Gnutella, showing ultrapeer nodes and leaf nodes]

Skype
- Peer-to-peer Internet telephony
- Two-level hierarchy like KaZaa
  - Ultrapeers used mainly to route traffic between NATed end-hosts (see next slide)…
  - … plus a login server to
    - authenticate users
    - ensure that names are unique across