Duke CPS 212 - Cooperative Web Caching

Unformatted text preview:

1Cooperative Web CachingCooperative Web CachingCooperative CachingCooperative CachingPrevious work has shown that hit rate increases with population size [Duska et al. 97, Breslau et al. 98]However, single proxy caches have practical limits• Load, network topology, organizational constraintsOne technique to scale the client population is to have proxy caches cooperate.• Content sharingLocal hit vs. remote hit• Problem: how to locate objects and route requests?• Lots of history here[Exodus: Franklin], [xFS: Dahlin], [GMS: Feeley&Voelker]Cooperative Web Proxy CachingCooperative Web Proxy CachingSharing and/or coordination of cache state among multiple Web proxy cache nodesEffectiveness of proxy cooperation depends on:♦Inter-proxy communication distance♦Size of client population served♦Proxy utilization and load balanceClientsClientsProxyClientsInternet[Source: Geoff Voelker]Resolve misses through the parent. Hierarchical CachesHierarchical CachesINTERNETclientsorigin Web site(e.g., U.S. Congress)clientsclientsIdea: place caches at exchange or switching points in the network, and cache at each level of the hierarchy.upstreamdownstreamContentContent--Sharing Among PeersSharing Among PeersINTERNETclientsclientsclientsIdea: Since siblings are “close” in the network, allow them to share their cache contents directly.HarvestHarvest--Style ICP HierarchiesStyle ICP HierarchiesINTERNETclientquery (probe)query responseobject requestobject responseExamplesHarvest [Schwartz96]Squid (NLANR)NetApp NetCacheIdea: multicast probes within each “family”: pick first hit response or wait for all miss responses.2Issues for Cache HierarchiesIssues for Cache Hierarchies• With ICP: query traffic within “families” (size n)Inter-sibling ICP traffic (and aggregate overhead) is quadratic with n.Query-handling overhead grows linearly with n.• miss latencyObject passes through every cache from origin to client: deeper hierarchies scale better, but impose higher latencies.•storageA recently-fetched object is replicated at every level of the tree.• effectivenessInterior cache benefits are limited by capacity if objects are not likely to live there long (e.g., LRU).Hashing: Cache Array Routing Protocol (CARP)Hashing: Cache Array Routing Protocol (CARP)INTERNET“GET www.hotsite.com”Microsoft ISA Proxy Serverg-pv-zq-ua-fAdvantages1. single-hop request resolution2. no redundant caching of objects3. allows client-side implementation4. no new cache-cache protocols5. reconfigurablehashfunctionIssues for CARPIssues for CARP• no way to exploit network locality at each levele.g., relies on local browser caches to absorb repeats• load balancing• hash can be balanced and/or weighted with a load factor reflecting the capacity/power of each server• must rebalance on server failuresReassigns (1/n)th of cached URLs for array size n.URLs from failed server are evenly distributed among the remaining n-1servers.• miss penalty and cost to compute the hashIn CARP, hash cost is linear in n: hash with each node and pick the “winner”.DirectoryDirectory--based: Summary Cache for ICPbased: Summary Cache for ICPIdea: each caching server replicates the cache directory (“summary”) of each of its peers (e.g., siblings).[Cao et. al. Sigcomm98]• Query a peer only if its local summary indicates a hit.• To reduce storage overhead for summaries, implement the summaries compactly using Bloom Filters.May yield false hits (e.g., 1%), but not false misses.Each summary is three orders of magnitude smaller than the cache itself, and can be updated by multicasting just the flipped bits.A SummaryA Summary--ICP HierarchyICP HierarchyINTERNETclientqueryquery responseobject requestobject responseSummary caches at each level of the hierarchyreduce inter-sibling miss queries by 95+%.hitmisse.g., Squid configuredto use cache digestsIssues for DirectoryIssues for Directory--Based CachesBased Caches• Servers update their summaries lazily.Update when “new” entries exceed some threshold percentage.Update delays may yield false hits and/or false misses.• Other ways to reduce directory size?Vicinity cache [Gadde/Chase/Rabinovich98]Subsetting by popularity [Gadde/Chase/Rabinovich97]• What are the limits to scalability?If we grow the number of peers?If we grow the cache sizes?3On the Scale and Performance....On the Scale and Performance....[Wolman/Voelker/.../Levy99] is a key paper in this area over the last few years.• first negative result in SOSP (?)• illustrates tools for evaluating wide-area systemssimulation and analytical modeling• illustrates fundamental limits of cachingbenefits dictated by reference patterns and object rate of changeforget about capacity, and assume ideal cooperation• ties together previous work in the fieldwide-area cooperative caching strategiesanalytical models for Web workloads• best tracesUW Trace CharacteristicsUW Trace CharacteristicsTrace UWDuration 7 daysHTTP objects 18.4 millionHTTP requests 82.8 millionAvg. requests/sec 137Total Bytes 677 GBServers 244,211Clients 22,984[Source: Geoff Voelker]A MultiA Multi--Organization TraceOrganization TraceUniversity of Washington (UW) is a large and diverse client populationApproximately 50K peopleUW client population contains 200 independent campus organizationsMuseums of Art and Natural HistorySchools of Medicine, Dentistry, NursingDepartments of Computer Science, History, and MusicA trace of UW is effectively a simultaneous trace of 200 diverseclient organizations• Key: Tagged clients according to their organization in trace[Source: Geoff Voelker]Cooperation Across OrganizationsCooperation Across OrganizationsTreat each UW organization as an independent “company”Evaluate cooperative caching among these organizationsHow much Web document reuse is there among these organizations? • Place a proxy cache in front of each organization. • What is the benefit of cooperative caching among these 200 proxies?[Source: Geoff Voelker]Ideal Hit Rates for UW proxiesIdeal Hit Rates for UW proxiesIdeal hit rate - infinite storage, ignorecacheability, expirationsAverage ideal localhit rate: 43%[Source: Geoff Voelker]Ideal Hit Rates for UW proxiesIdeal Hit Rates for UW proxiesIdeal hit rate - infinite storage, ignorecacheability, expirationsAverage ideal localhit rate: 43%Explore benefits of perfect cooperation rather than a particular algorithmAverage ideal hit rate increases from 43% to 69% with cooperative


View Full Document

Duke CPS 212 - Cooperative Web Caching

Download Cooperative Web Caching
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Cooperative Web Caching and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Cooperative Web Caching 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?