DOC PREVIEW
MTU CS 6461 - Efficient Content Location Using Interest Based Locality

This preview shows page 1-2-3-4 out of 11 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Efficient Content Location Using Interest-BasedLocality in Peer-to-Peer SystemsKunwadee Sripanidkulchai Bruce Maggs Hui ZhangCarnegie Mellon University, Pittsburgh, PA 15213{kunwadee,bmm,hzhang}@cs.cmu.eduAbstract— Locating content in decentralized peer-to-peer sys-tems is a challenging problem. Gnutella, a popular file-sharingapplication, relies on flooding queries to all peers. Althoughflooding is simple and robust, it is not scalable. In this paper, weexplore how to retain the simplicity of Gnutella, while addressingits inherent weakness: scalability. We propose a content locationsolution in which peers loosely organize themselves into aninterest-based structure on top of the existing Gnutella network.Our approach exploits a simple, yet powerful principle calledinterest-based locality, which posits that if a peer has a particularpiece of content that one is interested in, it is very likely thatit will have other items that one is interested in as well. Whenusing our algorithm, called interest-based shortcuts,asignificantamount of flooding can be avoided, making Gnutella a morecompetitive solution. In addition, shortcuts are modular and canbe used to improve the performance of other content locationmechanisms including distributed hash table schemes.We demonstrate the existence of interest-based locality in fivediverse traces of content distribution applications, two of whichare traces of popular peer-to-peer file-sharing applications. Sim-ulation results show that interest-based shortcuts often resolvequeries quickly in one peer-to-peer hop, while reducing the totalload in the system by a factor of 3 to 7.I. INTRODUCTIONEnsuring the availability of content on the Internet isexpensive. Publishers who want high availability have fewoptions. They can use premium content hosting services, buildand manage their own content distribution infrastructures, orcontract with Content Delivery Networks [1]. All of theseoptions are prohibitively expensive for an average Internetuser who wants to share a hundred megabytes of digitalphotographs with his friends. A low cost solution is to pub-lish the content from one’s own desktop into a peer-to-peercontent distribution system like Gnutella [2]. While peers aredownloading content, they can also create and make availablereplicas to increase content availability. As the system grows,the supply of resources scales with demand. There are enoughresources, even during flash crowds when many people accessthe same content simultaneously.There are many challenges in providing peer-to-peer contentdistribution systems. In this paper, we address one fundamentalchallenge: what is the appropriate strategy for locating contentThis research was sponsored by DARPA under contract number F30602-99-1-0518, and by NSF under grant numbers Career Award NCR-9624979 ANI-9730105, ITR Award ANI-0085920, and ANI-9814929. Additional supportwas provided by Intel. Views and conclusions contained in this documentare those of the authors and should not be interpreted as representing theofficial policies, either expressed or implied, of DARPA, NSF, Intel, or theU.S. government.given that content may be continuously replicated at manylocations in the peer-to-peer system? If content cannot belocated efficiently, there is little hope for using peer-to-peertechnology for content distribution.There are two classes of solutions currently proposed fordecentralized peer-to-peer content location. Unstructured con-tent location, used by Gnutella, relies on flooding queries toall peers. Peers organize into an overlay. To find content, apeer sends a query to its neighbors on the overlay. In turn, theneighbors forward the query on to all of their neighbors untilthe query has traveled a certain radius. While this solution issimple and robust even when peers join and leave the system,it does not scale. Another class of protocols based on theDistributed Hash Table (DHT) abstraction [3] [4] [5] [6] andmotivated by Plaxton et al. [7] have been proposed to addressscalability. In these protocols, peers organize into a well-defined structure that is used for routing queries. AlthoughDHTs are elegant and scalable, their performance under thedynamic conditions common for peer-to-peer systems is un-known [8].Our design philosophy departs from existing work in that weseek to retain the simple, robust, and fully decentralized natureof Gnutella, while improving scalability, its major weakness.We identify a powerful principle: if a peer has a particularpiece of content that one is interested in, then it is likely thatit will have other pieces of content that one is also interested in.These peers exhibit interest-based locality. We propose a self-organizing protocol, interest-based shortcuts, that efficientlyexploits interest-based locality for content location. Peers thatshare similar interests create shortcuts to one another. Peersthen use these shortcuts to locate content. When shortcutsfail, peers resort to using the underlying Gnutella overlay.Shortcuts provide a loose structure on top of Gnutella’sunstructured overlay. Although we use Gnutella as the primaryexample in this paper, shortcuts are also compatible with manyother content location mechanisms, such as DHTs and hybridcentralized-decentralized architectures such as Kazaa [9].We compare the performance of Gnutella with and withoutshortcuts using five traces collected at different locations anddifferent times. We show that shortcuts significantly improvethe performance of content location for Gnutella by providinglarge decreases in the amount of load in the system and thetime to locate content. We find that simple algorithms aresufficient to capture interest-based relationships. Moreover, weprovide some intuition on the factors that contribute to thedegree of locality observed in our traces.0-7803-7753-2/03/$17.00 (C) 2003 IEEE 2166ACDBDAC EA? B? C?0/33/32/30/3D EFFGHFig. 1. Peers that share interests.In Section II, we describe the design of interest-basedshortcuts. In Section III, we present our evaluation metrics andsimulation methodology. We present our results in Section IVand explore the potential and limitations of shortcuts inSection V. Section VI takes an in-depth look at factors thatcontribute to interest-based locality. In Section VII, we lookat the effectiveness of shortcuts for a DHT-based protocol,Chord [5]. We discuss the implications of our results inSection VIII, and related work in Section IX.II. INTEREST-BASED


View Full Document

MTU CS 6461 - Efficient Content Location Using Interest Based Locality

Documents in this Course
Tapestry

Tapestry

13 pages

Load more
Download Efficient Content Location Using Interest Based Locality
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Efficient Content Location Using Interest Based Locality and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Efficient Content Location Using Interest Based Locality 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?