Announcements EE 122 Overlay Networks and p2p Networks Ion Stoica TAs Junda Liu DK Moon David Zats No class Wednesday Happy Thanksgiving Homework 3 grades available by Wednesday Homework 4 due on Wednesday December 2 http inst eecs berkeley edu ee122 fa09 Materials with thanks to Vern Paxson Jennifer Rexford and colleagues at UC Berkeley 1 Overlay Networks Motivations Changes in the network happen very slowly Why Internet network is a shared infrastructure need to achieve consensus IETF Many of proposals require to change a large number of routers e g IP Multicast QoS otherwise end users won t benefit Proposed changes that haven t happened yet on large scale More Addresses IPv6 91 Security IPSEC 93 Multicast IP multicast 90 3 2 Motivations cont d One size does not fit all Applications need different levels of Reliability Performance latency Security Access control e g who is allowed to join a multicast group 4 1 Goals Solution Make it easy to deploy new functionalities in the network accelerate the pace of innovation Allow users to customize their service Deploy processing in the network Have packets processed as they traverse the network IP AS 1 Overlay Network over IP AS 1 5 Overview 6 Resilient Overlay Network RON Resilient Overlay Network RON Premise overlay networks can increase performance and reliability of routing Overlay Multicast Install N computers at different Internet locations Peer to peer systems Each computer acts as an overlay network router Computers actively measure each logical link in real time for 7 Between each overlay router is an IP tunnel logical link Logical overlay topology is all to all N2 Packet loss rate latency throughput etc Route overlay network traffic based on measured characteristics 8 2 Example Overview MIT Resilient Overlay Network RON Overlay multicast Peer to peer systems Berkeley Default IP path determined by BGP OSPF UCLA Reroute traffic using red alternative overlay network path avoid congestion point Acts as overlay router 9 IP Multicast Problems Twenty years of research still not widely deployed Poor scalability Supporting higher level functionality is difficult Routers need to maintain per group or even per group and per sender state Multicast addresses cannot be aggregated IP Multicast best effort multi point delivery service Reliability congestion control for IP Multicast complicated No support for access control Nor restriction on who can send easy to mount Denial of 11 Service Dos attacks 10 Overlay Approach Provide IP multicast functionality above the IP layer application level multicast Challenge do this efficiently Projects Narada Overcast Scattercast Yoid Coolstreaming Roxbeam Rawflow 12 3 Narada Yang hua et al 2000 Narada End System Multicast Gatech Stanford Stan1 Source Speific Trees Stan2 Involves only end hosts CMU Berk1 Small group sizes hundreds of nodes Overlay Tree Typical application chat Berk2 Berkeley Gatech Stan 1 Stan2 CMU 13 Properties Use hop by hop retransmissions But Don t have to modify every router on path Easier to implement reliability than IP Multicast Berk2 Overview Easier to deploy than IP Multicast Berk1 14 Resilient Overlay Network RON Overlay multicast Peer to peer systems Consume more bandwidth than IP Multicast Typically has higher latency than IP Multicast Harder to scale Optimization use IP Multicast where available 15 16 4 How Did it Start A killer application Naptser Model Free music over the Internet Key idea share the storage and bandwidth of individual home users Each user stores a subset of files Each user has access can download files from all users in the system Internet 17 18 Main Challenge Other Challenges Find where a particular file is stored Note problem similar to finding a particular page in web caching what are the differences E F Scale up to hundred of thousands or millions of machines Dynamicity machines can come and go any time D E C A B 19 20 5 Napster Assume a centralized index system that maps files songs to machines that are alive How to find a file song E m5 m1 m2 m3 m4 m5 m6 D A B C D E F m4 C A Simplicity easy to implement sophisticated search engines on top of the index system B m1 m3 m2 21 22 Robustness scalability Gnutella Example Send request to all neighbors Neighbors recursively multicast the request Eventually a machine that has the file receives the request and it sends back the answer Assume m1 s neighbors are m2 and m3 m3 s neighbors are m4 and m5 m5 E m6 F E E Advantages E ftp the file Distribute file location Idea broadcast the request How to find a file E Ideally this is the closest least loaded machine Gnutella F Disadvantages E m6 Advantages m5 Query the index system return a machine that stores the required file Napster Example D E m4 E Totally decentralized highly robust E Disadvantages Not scalable the entire network can be swamped with requests to alleviate this problem each request has a TTL 23 C A m1 B m3 m2 24 6 Two Level Hierarchy Current Gnutella implementation KaZaa Leaf nodes are connected to a small number of ultrapeers suppernodes Query A leaf sends query to its ultrapeers If ultrapeers don t know the answer they flood the query to other ultrapeers More scalable Skype Oct 2003 Crawl on Gnutella Peer to peer Internet Telephony Two level hierarchy like KaZaa Ultrapeer nodes Leaf nodes Flooding only among ultrapeers Split each file into pieces 256 KB each and each piece into sub pieces 16 KB each The loader loads one piece at a time Within one piece the loader can load up to five subpieces in parallel authenticate users ensure that names are unique across network Messages exchanged to login server Data traffic A 26 Note probable protocol Skype protocol is not published BitTorrent 2 2 Allow fast downloads even when sources have low connectivity How does it work B Ultrapeers used mainly to route traffic between NATed end hosts see next slide plus a login server to 25 BitTorrent 1 2 login server Download consists of three phases Start get a piece as soon as possible Select rarest piece next End avoid getting stuck with a slow source when downloading the last sub pieces 27 Select a random piece Middle spread all pieces as soon as possible Request in parallel the same sub piece Cancel slowest downloads once a sub piece has been received 28 For details see http bittorrent com bittorrentecon pdf 7 Content Addressable Network CAN Distributed Hash Tables Problem Given an ID map to a host Challenges Scalability hundreds of thousands or millions of
View Full Document