CMU INI 14740 - liang2005

The KaZaA Overlay: A Measurement Study

Jian Liang
Department of Computer and Information Science,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: [email protected]

Kumar
Department of Electrical and Computer Engineering,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: [email protected]

W. Ross
Department of Computer and Information Science,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: [email protected]

15, 2004

Abstract

Both in terms of number of participating users and in traffic volume, KaZaA is one of the most important applications in the Internet today. Nevertheless, because KaZaA is proprietary and uses encryption, little is understood about KaZaA’s overlay structure and dynamics, its messaging protocol, and its index management. We have built two measurement apparatus - the KaZaA Sniffing Platform and the KaZaA Probing Tool - to unravel many of the mysteries behind KaZaA. We deploy the apparatus to study KaZaA’s overlay structure and dynamics, its neighbor selection, its use of dynamic port numbers to circumvent firewalls, and its index management. Although this study does not fully solve the KaZaA puzzle, it nevertheless leads to a coherent description of KaZaA and its overlay. Furthermore, we leverage the measurement results to set forth a number of key principles for the design of a successful unstructured P2P overlay. The measurement results and resulting design principles in this paper should be useful for future architects of P2P overlay networks as well as for engineers managing ISPs.

1 Introduction

On a typical day, KaZaA has more than 3 million active users sharing over 5,000 terabytes of content. On the University of Washington campus network in June 2002, KaZaA consumed approximately 37% of all TCP traffic, which was more than twice the Web traffic on the same campus at the same time [8]. With over 3 million satisfied users, KaZaA is significantly more popular than Napster or Gnutella ever was.
Sandvine estimates that in the US 76% of P2P file sharing traffic is KaZaA/FastTrack traffic and only 8% is Gnutella traffic [23]. Clearly, both in terms of number of participating users and in traffic volume, KaZaA is one of the most important applications ever carried by the Internet. In fact, it can be argued that KaZaA has been so successful that any new proposal for a P2P file sharing system should be compared with the KaZaA benchmark. However, largely because KaZaA is a proprietary protocol which encrypts its signalling messages, little has been known to date about the specifics of KaZaA’s overlay, the maintenance of the overlay, and the KaZaA signalling protocol.

In this paper we undertake a comprehensive measurement study of KaZaA’s overlay structure and dynamics, its neighbor selection, its use of dynamic port numbers to circumvent firewalls, and its index management. Although this study does not fully solve the KaZaA puzzle, it nevertheless leads to a coherent description of KaZaA and its overlay, while providing many new insights about the details of KaZaA.

To unravel the mysteries of the KaZaA overlay, we developed two measurement apparatus: the KaZaA Sniffing Platform and the KaZaA Probing Tool. The KaZaA Sniffing Platform is a set of KaZaA nodes that are forced to interconnect in a controlled manner with one another, while one node is also connected to hundreds of platform-external KaZaA nodes. The KaZaA Sniffing Platform collects KaZaA signalling traffic, from which we can draw conclusions about the structure and dynamics of the KaZaA overlay. The KaZaA Probing Tool establishes a TCP connection with any supplied KaZaA node, handshakes with that node, and sends and receives arbitrary encrypted KaZaA messages with the node. It is used for analyzing node availabilities and KaZaA neighbor selection.
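
The first step of such a probe, opening a TCP connection to a supplied node, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tool: the function names probe_node and availability are hypothetical, the KaZaA handshake is proprietary and encrypted and is therefore omitted, and treating availability as the fraction of successful probes is an assumption made here for the example.

```python
import socket
from typing import Sequence


def probe_node(ip: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (ip, port) can be established.

    Hypothetical helper: only the TCP connection is attempted. The
    application-level KaZaA handshake is proprietary and is not modeled.
    """
    try:
        # create_connection completes the TCP three-way handshake or raises.
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def availability(results: Sequence[bool]) -> float:
    """Fraction of probes that succeeded (assumed definition of availability)."""
    return sum(results) / len(results) if results else 0.0
```

Repeating probe_node against the same node at intervals and passing the outcomes to availability gives a rough reachability estimate for that node.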
Both of these apparatus consume limited resources. One of the contributions of this paper is to show how it is possible to obtain extensive overlay information of a large-scale overlay application with a low-cost measurement infrastructure.

We use these tools to obtain insight into the following questions:

• It is well-known that the KaZaA overlay is organized in a two-tier hierarchy consisting of Super Nodes (SNs) in the upper tier and Ordinary Nodes (ONs) in the lower tier. But how many children ONs does a typical SN support? What fraction of the peers in KaZaA are SNs? Are the SNs densely interconnected or sparsely interconnected?
• How long are ON-to-SN connections in the overlay? How long are SN-to-SN connections in the overlay? What is the typical lifetime of a SN?
• How does an ON discover candidate SNs for parenting? Once it has a set of candidate SNs, how does it choose a particular parent among them? In choosing the parent, does it take locality or SN workload into account?
• By allowing peers (ONs and SNs) to select their own server port numbers, KaZaA is more difficult to block with firewalls and NATs. How does KaZaA manage the server port numbers? What fraction of KaZaA nodes are behind NATs?
• What are the characteristics of the protocol that peers use to establish overlay links among themselves?
• How is the file index (relating each file copy to an IP address and port number) organized among the SNs?

In addition to providing novel insights into a remarkably successful P2P system, we leverage our measurement results to set forth a number of key principles for the design of an unstructured P2P overlay.
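
To make the two-tier terminology in the questions above concrete, the following Python sketch models one Super Node that tracks its children ONs, its SN neighbors, and a file index mapping each file name to the (IP address, port) endpoints holding a copy. It is a toy model, not KaZaA's actual data structure: the class and method names are hypothetical, and it assumes each SN indexes only the files of its own children, which the text at this point leaves as an open question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, server port number)


@dataclass
class SuperNode:
    """Hypothetical model of an upper-tier node in a two-tier overlay."""
    endpoint: Endpoint
    children: Set[Endpoint] = field(default_factory=set)   # attached ONs
    neighbors: Set[Endpoint] = field(default_factory=set)  # SN-to-SN links
    index: Dict[str, Set[Endpoint]] = field(default_factory=dict)

    def attach(self, on: Endpoint, shared_files: List[str]) -> None:
        """Register a child ON and index each file copy it shares."""
        self.children.add(on)
        for name in shared_files:
            self.index.setdefault(name, set()).add(on)

    def detach(self, on: Endpoint) -> None:
        """Remove a child ON and drop its copies from the index."""
        self.children.discard(on)
        for name in list(self.index):
            self.index[name].discard(on)
            if not self.index[name]:
                del self.index[name]  # no remaining copies of this file

    def lookup(self, name: str) -> Set[Endpoint]:
        """Return the endpoints known to hold a copy of the named file."""
        return set(self.index.get(name, set()))
```

With this model, the number of children per SN is len(sn.children) and the SN-to-SN degree is len(sn.neighbors), the kinds of quantities the first question asks about.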
As we’ll discuss in Section 5, these principles include distributed design, exploiting heterogeneity, load balancing, locality, connection shuffling, and firewall/NAT circumvention.

This paper should not only be of interest to P2P designers, but also to engineers at upper- and lower-tier ISPs, who are interested in acquiring a thorough understanding of P2P overlays and traffic. Because P2P file sharing systems can generate vast quantities of traffic, networking engineers, who dimension the network and introduce content distribution devices such as caches, need a basic understanding of how major P2P file sharing systems operate. Although there has been recent work in analyzing the file-sharing workload in KaZaA [8] and [18], to our knowledge we are the first to undertake a comprehensive study of a hierarchical unstructured overlay for a P2P system.

The paper focuses on the KaZaA overlay network and index management. It addresses neither KaZaA’s downloading protocol (for example, KaZaA’s parallel downloading and request queuing) nor its incentive scheme for

