About Me
Joshua Silver
- 4th-year CS major, graduating in May
- Specialization: Databases
- Interests:
  - The business side of computing … and no, not IT
  - How companies can use technology to improve and enable their business
  - Think Enterprise Web 2.0, mobile strategies, viral promotion on the internet, the Netflix recommendation engine, e-commerce, etc.
  - Startups!

Sleepers & Workaholics
Caching Strategies in Mobile Computing
Authors: Dr. Daniel Barbará and Dr. Tomasz Imielinski
Presented by: Joshua Silver, Fall 2008

Sleepers & Workaholics
Caching Strategies in Mobile Computing
Dr. Daniel Barbará
- Professor at George Mason University
- Several patents associated with mobile caching
Dr. Tomasz Imielinski
- Professor at Rutgers University
- Senior VP: Search Technology at Ask.com

The Big Picture Problem
- Wireless devices have limited bandwidth, limited storage, and limited battery life
- To save power, devices go offline
- Mobile devices appear randomly in new cells
- Makes data caching difficult, since the server can’t track client caches

Then and Now
- Paper written in 1994
- Device, bandwidth, and battery limitations are different
- Essential problem still exists

With an explosion of wireless devices, the problem is even greater
Source: CTIA - The Wireless Association.
http://www.infoplease.com/ipa/A0933563.html
- 24 Million in 1994
- >240 Million in 2008
… and that doesn’t even take into account proprietary handheld units (like UPS driver delivery computers, Amazon Kindles, grocery store handheld scanners, etc.)

Why Caching is Important
Conserve:
1. Computational resources
2. Battery life
3. Network bandwidth
Can’t store the entire dataset on a handheld:
- US maps on a GPS unit
- Delivery routes for UPS drivers
- Contact list on a BlackBerry

Traditional Strategies Fail
In a traditional client-server model, the server:
- keeps track of client caches
- pushes only the changes / sends cache invalidation messages
BUT … the server lacks knowledge of:
- Which units are in its cell
- Which units are powered ON
Quintessential problem:
- Client caches in a mobile environment cannot be tracked by a server

The Solution
Purpose: "…to propose a taxonomy of different cache invalidation strategies and study the impact of clients' disconnection times on their performance."
Sleepers & Workaholics proposes a few solutions and evaluates their effectiveness with mathematical rigor

Evaluation Criteria
Complicated math! … The paper’s appendices have details.
Essentially:
- Define two types of mobile units:
  - Sleepers (offline/off all the time)
  - Workaholics (never go offline)
- Almost all real-world devices fall in between
How do you compare?
- Normalize by defining a “hit ratio”, since it affects overall throughput
[Hit-ratio formula garbled in extraction; surviving terms: H, X, "valid cache hits", "total data size"]

Strategies to Evaluate
Proposed strategies:
- Timestamps (TS)
- Amnesic Terminals (AT) (remembers only part, like amnesia)
- Signatures (SIG)
Control strategy:
- No Cache (NC)

Timestamps
- Each cache entry has a timestamp
- Synchronous, history-based, uncompressed in nature
SERVER:
- Communicates with clients every n seconds (and retries until successfully connected)
- Sends a list of items and their associated timestamps (to accommodate potential delay in transmission)
CLIENT:
For each item in cache:
- If the entry is in the report received from the server, purge it from the cache
- If NOT in the report, simply update its timestamp to the current time

Amnesic Terminals
- Each cache entry has an identifier
- ALSO synchronous, history-based, uncompressed in nature
SERVER:
- Notifies clients of the identifiers of items changed since the last invalidation report.
CLIENT:
For each item in cache:
- If in report, purge from cache
- If NOT in report, do nothing
- ALSO, if enough time has elapsed, drop the WHOLE cache and rebuild completely.

Signatures
- Checksums are calculated over the value of the data to form a signature
- Since the mobile unit does not have the entire database, an algorithm is needed to compute a partial checksum (see the appendix)
- Signatures are combined using XOR
- Synchronous, state-based, compressed reports
SERVER:
- Broadcasts the set of combined signatures
CLIENT:
- An item in the cache is declared invalid if it belongs to “too many” unmatching signatures (suspected of being out of date)

No Cache
- There is no cache
SERVER:
- Responds to direct queries from the client with the appropriate information
CLIENT:
- Queries the database directly any time an item is needed

Conclusions on Effectiveness
Strategy depends on circumstances:
- Signatures is best for long sleepers, when the disconnection period is long and difficult to predict
- Timestamps is best for query-intensive scenarios, when the rate of queries is greater than the rate of updates, provided that units are not workaholics
- Amnesic Terminals is best for workaholics, units that are awake most of the time

Still not satisfied … how can we improve effectiveness?
Only 2 options:
1. Update less often, or
2. Send less info

Relax the Consistency of the Cache
- Depending on the data type, data may not need to be exact (e.g., stocks, weather)
- Allow values to vary by a set tolerance (like 0.05% for stock prices, or weather reports outdated by up to 2 hours)
- Makes shorter invalidation reports possible

How Do We Decide to Update?
- Consider cached copies to be quasi-copies
- Each quasi-copy has a coherency condition attached to it
Coherency conditions:
- Delay condition: updated based on time
- Arithmetic condition: updated based on the difference between the data and the quasi-copy

Criticism
Which resources are
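
Sketch: client-side report handling
The client rules on the Timestamps and Amnesic Terminals slides can be sketched roughly as below. This is only an illustration of the slide text, not the paper's exact algorithm: the names ClientCache, apply_ts_report, apply_at_report, and the max_gap threshold (standing in for "enough time has elapsed") are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ClientCache:
    """Hypothetical client cache: item id -> (value, timestamp)."""
    entries: dict = field(default_factory=dict)
    last_report_time: float = 0.0

    def put(self, item_id, value, now):
        self.entries[item_id] = (value, now)


def apply_ts_report(cache, report, now):
    """Timestamps (TS) client rule, following the slide.

    `report` maps item id -> server timestamp (assumed shape).
    """
    for item_id in list(cache.entries):
        value, _ = cache.entries[item_id]
        if item_id in report:
            # Item is listed in the report: purge it.
            del cache.entries[item_id]
        else:
            # Item is not listed: refresh its timestamp to the current time.
            cache.entries[item_id] = (value, now)
    cache.last_report_time = now


def apply_at_report(cache, changed_ids, now, max_gap):
    """Amnesic Terminals (AT) client rule, following the slide.

    `changed_ids` holds identifiers changed since the last report.
    `max_gap` is an assumed threshold for "enough time has elapsed".
    """
    if now - cache.last_report_time > max_gap:
        # Disconnected too long: drop the whole cache and rebuild later.
        cache.entries.clear()
    else:
        for item_id in changed_ids:
            # Purge reported items; unreported items are left untouched.
            cache.entries.pop(item_id, None)
    cache.last_report_time = now


if __name__ == "__main__":
    cache = ClientCache()
    cache.put("route-17", "Main St", now=0.0)
    cache.put("route-42", "Oak Ave", now=0.0)
    apply_ts_report(cache, {"route-17": 5.0}, now=10.0)
    print(sorted(cache.entries))
```

The two functions differ mainly in what the client must remember: TS keeps a timestamp per entry, while AT only needs the time of the last report it heard.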