UW-Madison CS 739 - An Analysis of Internet Content Delivery Systems


An Analysis of Internet Content Delivery Systems
Himani Apte
CS 739: Distributed Systems
University of Wisconsin, Madison
January 20, 2006
Spring 2006

1 Overview

The paper analyzes four content delivery systems drawn from three areas: the client-server World Wide Web, content delivery networks, and peer-to-peer file-sharing systems.

The important features of peer-to-peer systems are symmetry among the peers (since they behave as both servers and clients) and scalability up to as many as millions of machines. Dynamic membership, operation over a wide-area network, and heterogeneity of the participating systems in terms of their bandwidth, connectivity and performance are some of the distinguishing characteristics of peer-to-peer systems. Such systems tend to be application specific. Some peer-to-peer systems have a hierarchy among the members; for instance, some peers in Kazaa are supernodes and maintain indexes of the content available at nearby peers.

2 Problem Statement

The paper examines the traffic flows of content delivery systems, focusing largely on web versus peer-to-peer traffic, and specifically on four delivery systems: HTTP web traffic, Akamai, Kazaa, and Gnutella.

3 Methodology

The methodology employed was passive network monitoring of all traffic entering and leaving through the border routers between the University of Washington (UW) and the rest of the Internet. The TCP flows were reconstructed at the monitoring hosts to extract the information needed to categorize them into HTTP and non-HTTP traffic. The HTTP traffic is further divided into WWW, Kazaa and Gnutella based on the destination ports.
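
The port-based categorization described above can be illustrated with a minimal sketch. The port numbers below are the commonly cited defaults for each protocol and are assumptions here, since this summary does not list the ports the study actually used. The names `HttpFlow`, `classify`, `bytes_by_category`, and the `akamai_servers` address set are hypothetical and introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed default ports (not stated in this summary).
WWW_PORTS = {80, 8080, 443}
KAZAA_PORTS = {1214}
GNUTELLA_PORTS = {6346, 6347}


@dataclass(frozen=True)
class HttpFlow:
    """One reconstructed HTTP flow (hypothetical record layout)."""
    server_ip: str
    dst_port: int
    num_bytes: int


def classify(flow, akamai_servers):
    """Label a flow as Akamai, Kazaa, Gnutella, WWW, or other."""
    # Akamai is recognized by who serves the content, not by port.
    if flow.server_ip in akamai_servers:
        return "Akamai"
    if flow.dst_port in KAZAA_PORTS:
        return "Kazaa"
    if flow.dst_port in GNUTELLA_PORTS:
        return "Gnutella"
    if flow.dst_port in WWW_PORTS:
        return "WWW"
    return "other"


def bytes_by_category(flows, akamai_servers):
    """Sum the bytes transferred per category."""
    totals = Counter()
    for flow in flows:
        totals[classify(flow, akamai_servers)] += flow.num_bytes
    return totals
```

Checking the Akamai server set before the port tables is a design choice of this sketch: Akamai content is served over ordinary web ports, so a port check alone would label it WWW.
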
Akamai traffic is identified based on whether it is served by an Akamai server.

This methodology misses the internal traffic within the local network of the University, for instance file-sharing traffic among Kazaa users within the University network. The analysis also does not take into account the non-HTTP TCP traffic (43% of the total TCP traffic) or the non-TCP traffic (3% of the total network traffic). Other peer-to-peer systems such as BitTorrent and Napster have also been excluded from the study.

4 Observations

This section summarizes the observations made in class about the analysis presented in the paper.

4.1 Data characteristics

The HTTP trace summary statistics (presented in Table 1) are unavailable for outbound Akamai traffic, as there are no Akamai servers hosted within UW. Kazaa has the highest outbound traffic in terms of net bytes transferred, in spite of a much smaller server and client population within UW.

The total TCP bandwidth consumed by HTTP transfers for the different content delivery systems (presented in Figure 1) shows a typical diurnal pattern. WWW traffic peaks during daylight hours, whereas Kazaa traffic peaks late at night.

The HTTP trace was collected in May-June over a nine-day period. The trace could vary significantly depending on when it was collected, as the network behavior of university students may be very different during summer break than during final exams week. A longer data sample would also be desirable.

Analysis of the UW client and server TCP bandwidth (presented in Figure 2) indicates that Kazaa peers within UW act as servers much more than the web servers at the university do. A possible reason could be the high connectivity of the UW Kazaa peers.

4.2 Content delivery characteristics

Most bytes are transferred in video objects, although most requests are for GIF and JPEG images.
The median object size for WWW is 2 KB, while that of peer-to-peer systems is 4 MB.

The top bandwidth-consuming UW clients (Figure 7) and UW servers (Figure 10) are the Kazaa peers. Hence caching will be most beneficial for the Kazaa file-sharing system. The causes of large bandwidth consumption are large objects and very popular objects (which result in a large number of requests). As a result, a small number of clients consume a large amount of bandwidth in peer-to-peer systems.

The Kazaa and Gnutella servers are not perfectly load-balanced in spite of the scalability of peer-to-peer systems. A possible cause is the existence of highly popular content on a single server, or the availability of large objects on only a few peers. However, it is incorrect to draw conclusions about whether or not the Kazaa and Gnutella servers are load-balanced from the data available, as the trace does not include internal file-sharing traffic within the UW network.

4.3 Role of caching

To study the role of caching in CDNs and P2P systems, the authors simulated infinite-capacity caches. The three causes of cache misses (popularly known as the three C's) are cold, capacity and conflict misses. With an infinitely large cache, the only misses that can occur are cold misses. As a result, the cache byte miss rate is simply the ratio of unique bytes accessed to total bytes accessed.

The ideal byte hit rate (Figure 14) for outbound Kazaa traffic was found to stabilize at 85%, while that for inbound traffic did not stabilize by the end of the trace. The cache byte hit rate as a function of population size is presented in Figure 15. On the one hand, increasing the number of Kazaa clients may increase the number of requests, thereby lowering the cache hit rate.
On the other hand, this leads to the complementary effect of caching in the numerous clients, thereby improving the cache hit rate.

Although the preliminary investigation presented in the paper suggests that caching would have a large effect on a wide-scale P2P system, potentially reducing wide-area bandwidth demands dramatically, it may not actually be feasible to employ caches in P2P systems due to legal issues pertaining to the content distributed in such a system. Also, the paper does not give an insight into a realistic size of cache that would be sufficient to obtain improvements in bandwidth usage.

5 Conclusions

The paper presents a quantification of the domination of P2P systems in modern-day Internet traffic.

Although the global characteristics are not easily seen by looking at a small part of the network (UW, in this case), it still makes interesting revelations about the network traffic flows. In the future, we would expect WWW traffic to show similar characteristics and an even larger P2P

