MTU CS 6461 - Building a Large and Efficient Hybrid Peer to Peer Internet Caching System



Building a Large and Efficient Hybrid Peer-to-Peer Internet Caching System

Li Xiao, Member, IEEE, Xiaodong Zhang, Senior Member, IEEE, Artur Andrzejak, and Songqing Chen, Student Member, IEEE

Abstract: Proxy hit ratios tend to decrease as the demand and supply of Web contents are becoming more diverse. By case studies, we quantitatively confirm this trend and observe significant document duplications among a proxy and its client browsers’ caches. One reason behind this trend is that the client/server Web caching model does not support direct resource sharing among clients, causing the Web contents and the network bandwidths among clients to be relatively underutilized. To address these limits and improve Web caching performance, we have extensively enhanced and deployed our browsers-aware framework, a peer-to-peer Web caching management scheme. We make the browsers and their proxy share the contents to exploit the neglected but rich data locality in browsers and reduce document duplications among the proxy and browsers’ caches to effectively utilize the Web contents and network bandwidth among clients. The objective of our scheme is to improve the scalability of proxy-based caching both in the number of connected clients and in the diversity of Web documents. In this paper, we show that building such a caching system with considerations of sharing contents among clients, minimizing document duplications, and achieving data integrity and communication anonymity is not only feasible but also highly effective.

Index Terms: Internet systems, peer-to-peer systems, proxy caching, browser caching, data integrity, communication anonymity.

1 INTRODUCTION

A proxy-browser system is a commonly used client/server infrastructure for Web caching, where a group of networked clients connects to a proxy cache server and each client has a browser cache.
A standard Web caching model built on a proxy-browser system has the following data flows: Upon a Web request of a client, the browser first checks if the requested document exists in the local browser cache. If so, the request will be served by its own browser cache. Otherwise, the request will be sent to the proxy cache. If the requested document is not found in the proxy cache, the proxy server will immediately send the request to its cooperative caches, if any, or to an upper level proxy cache or to the Web server, without considering if the document exists in other browsers’ caches.

This model has two features that prevent it from effectively utilizing the rapid improvement in Internet technologies and from adapting, in a timely manner, to the changes in the supply and demand of Web contents. First, with a significant increase of memory and disk capacity in workstations and PCs and with the improvement of Web browser caching capability, users are able to enlarge browser cache size for faster access to more cached documents and to retain the documents in an organized manner for a longer period of time. Furthermore, studies have shown that one reason for proxy cache hit ratio decline is that more requests are absorbed by local browsers (e.g., [1]), so there exist some documents that are already replaced in the proxy cache but still retained in one or more browser caches. This is due to the fact that the request rates to the proxy and to browsers are different, so replacement in the proxy and in the browsers proceeds at different paces. However, the browser caches are not shared among the clients and the available locality and bandwidth among browsers are underutilized in Web caching.
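The data flow of the standard model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `BrowserCache`, `ProxyCache`, and `request`, the URL, and the two clients are hypothetical, cooperative and upper-level caches are collapsed into a single origin lookup, and proxy replacement is simulated by removing the entry by hand.

```python
from typing import Dict, Tuple


class BrowserCache:
    """Private cache of one client; other clients cannot see it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.docs: Dict[str, str] = {}


class ProxyCache:
    """Shared proxy that goes upstream on a miss (hypothetical model)."""

    def __init__(self, origin: Dict[str, str]) -> None:
        self.origin = origin          # stands in for the Web server
        self.docs: Dict[str, str] = {}
        self.origin_fetches = 0

    def get(self, url: str) -> Tuple[str, str]:
        if url in self.docs:
            return self.docs[url], "proxy"
        # Miss: fetch upstream without asking any other browser cache.
        self.origin_fetches += 1
        body = self.origin[url]
        self.docs[url] = body
        return body, "origin"


def request(browser: BrowserCache, proxy: ProxyCache, url: str) -> str:
    """Serve one request and return where the document came from."""
    if url in browser.docs:
        return "browser"
    body, source = proxy.get(url)
    browser.docs[url] = body  # document now held by both proxy and browser
    return source


origin = {"/news.html": "<html>news</html>"}
proxy = ProxyCache(origin)
alice, bob = BrowserCache("alice"), BrowserCache("bob")

first = request(alice, proxy, "/news.html")   # misses everywhere
again = request(alice, proxy, "/news.html")   # served by alice's own cache
proxy.docs.pop("/news.html")                  # simulate replacement in the proxy
second = request(bob, proxy, "/news.html")    # alice still holds a copy, unused
copies = sum("/news.html" in c.docs for c in (alice, bob, proxy))
print(first, again, second, proxy.origin_fetches, copies)
# prints: origin browser origin 2 3
```

In this toy run the second client triggers another upstream fetch even though a peer browser still holds the document, and the same document ends up stored three times, which are the two effects the introduction attributes to the standard model.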
When a requested document misses in a local browser cache and the proxy cache, it may still have been cached in other browser caches.

Second, with the rapid increase of Web servers and the huge growth of Web client populations in both numbers and types, the requested Web contents have become, and will continue to become, more diverse, causing a decrease of proxy hit ratios. Meanwhile, existing proxy-browser systems cause a large amount of document duplications. The amount of document duplications between the proxy and browser caches is generally very large because the requested document is always cached in both the proxy and a requesting client browser. It is also highly possible to generate a large amount of document duplication among browsers for the following reason: When multiple clients request some popular documents cached in the proxy, each requesting client will duplicate these documents in its local browser cache. Envisioning the rapid advancement of networking technology, we argue that the duplication issue can seriously limit potential benefits to be gained from the current structure of Web caching systems. Here are the reasons:

1. High-speed networking technology will soon close the speed gap between local and remote accesses. Therefore, file sharing and transferring among clients will become easy and a common practice.
2. Data duplications will cause additional overhead, such as global data invalidations and broadcasting. Minimizing the number of owners for a data document also strengthens security and privacy protections.
3. Unnecessary data duplications over the Internet can widely waste storage space. Both the additional operation and space overheads will certainly limit the scalability of Internet performance.

The capability of the proxy cache will be limited as the number of clients and document types increase. Adding more and more additional space to a proxy cache might temporarily increase the proxy hit ratio, but is not a remedy against decreasing efficiency.

Peer-to-Peer (P2P) computing is an emerging distributed computing technology that enables direct resource sharing of both computing services and data files among a group

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 16, NO. 6, JUNE 2004, page 754

- L. Xiao is with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824. E-mail: [email protected].
- X. Zhang and S. Chen are with the Department of Computer Science, College of William and Mary, Williamsburg, VA 23187. E-mail: {zhang, sqchen}@cs.wm.edu.
- A. Andrzejak is with the Division of Computer Science, Zuse-Institute Berlin, Takustr. 7, D-14195 Berlin, Germany. E-mail: [email protected]

Manuscript received 29 June 2003; revised 22 Nov. 2003; accepted 25 Nov. 2003.
For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TKDE-0105-0603.
1041-4347/04/$20.00 © 2004 IEEE. Published by the IEEE Computer Society.

