Unformatted text preview:

AbstractIntroductionAlgorithm TaxonomySummaryClient-Server Cache Consistency Maintenance Algorithms SurveyDavid KrasnopolskyCornell UniversityAbstractThis paper surveys several cache consistency maintenance algorithms. Four design families are explored: Server-Based Two Phase Locking (S2PL), No-Wait Locking (NWL), Callback Locking (CBL), Optimistic Two Phase Locking (O2PL). Each algorithm is presented in depth along with a comparison to its peers.IntroductionClient data caching is an optimization technique designed to enhance performance and scalability in a page/object server DBMS. Caching takes advantage of temporal locality, references are clustered in time, and spatial locality, references are clustered in space. Note caching introduces several copies of the same data across several sites, this introduces a problem of data and view integrity. The problem of keeping data consistent across all views is referred to as “cache consistency maintenance”. Protocols needed to perform cache consistency maintenance are closely related to transaction processing and concurrency control. Data can be cached within transaction boundaries (intra-transaction caching) and across transaction processing (inter-transaction caching). Intra-transaction caching assumes all copies of its data areinvalid at the start of transaction execution, thus on first access all pages must be fetched into the transaction’s buffer pool. However as pointed out in [Wang91] intra-transaction caching can not benefit from inter-transaction reference locality. The rest of the paper will focus on inter-transaction caching algorithms. Caching of objects as opposed to pages is of interest in comparing OODBMS’s vs. RDBMS’s however our discussion is relevant for both purposes, thus we will make no distinction between caching units be it pages or objects. Note the benefits of caching may be offset due to higher overhead, communication and processing costs.Algorithm TaxonomyAccording to [Fran96] most cache consistency maintenance algorithms can be divided into two categories: detection based and avoidance-based. Detection-based schemes allow for stale data copies to reside in a client’s cache. Transactions must explicitly check the validity of the data they access before they are allowed to commit. The server is responsible for making such a checking algorithm possible. The name detection-based implies that stale data must be explicitly detected, furthermore any transaction accessing stale data is not allowed to commit. Avoidance-based methods avoid accessing stale data, its protocols make such access impossible. These protocols are based on the read one/write all (ROWA) approach to replication management.A ROWA type protocol ensures that every copy of an updated item has the same value when the transaction commits. The ROWA approach is specialized to take advantage of server services, allowing the server to eliminate any unreachable data pages. The main advantage of detection-based algorithms is their simplicity, because the client-server interactions involve a single client-server pair. Therefore the amount of code required at the client is significantly less then in ROWA based approaches. The need for distributed global deadlock detection is also eliminated. These advantages come at a price; detection-based methods are highly dependent on client-server communication as well as the servers themselves.This survey will study four families of algorithms: Server-Based Two Phase Locking (S2PL) and No-Wait Locking (NWL) which are detection-based, Callback Locking (CBL) and Optimistic Two Phase Locking (O2PL) which are avoidance-based.Server-Based Two Phase Locking (S2PL)The family of S2PL algorithms is detection-based. Their main feature is the ability to validate previously cached pages synchronously during a transaction’s initial page access. As described in [Fran97] these algorithms are based on a primary copy approach to replication management. Before any transaction is allowed to commit it must access the primary copy, which in this case is residing on the server. For reads, the client’s data copy is verified to have the updated values. For writes, the new value must be reflected in the primary copy. One of the simplest algorithms in this family, studied by Franklin and Carey, is Caching 2PL (C2PL). C2PL maintains a “check-on-access” philosophy, disallowing inter-transactional lock caching. All cached pages are tagged with a log sequence number (LSN), which represents the current version of the page. Any page access for which the transaction does not have a lock results in a lock request being sent to the server. If there is a cached copy of the page present in the client’s buffer pool, the LSN of the cached page is included in the lock request message. If other transactions hold conflicting locks on the requested page the server blocks the request until all conflicting locks are released. If a read-lock is granted to a transaction the server determines ifthe cached copy of the page is up-to-date by comparing LSNs. If the page is out-of-date the server piggybacks the up-to-date page copy on the lock request response. Note all locks are held at the server, furthermore the server uses strict 2PL (all locks are held until a transaction commits or aborts). The centralized architecture allows the server to perform deadlock detection, conflicts are resolved by aborting the youngest transaction.It is immediately evident C2PL is a pessimistic algorithm, which leads to 2 communication messages for every new page access. This places a heavy emphasis on the network making it a possible bottleneck.No-Wait Locking (NWL)In S2PL algorithms a transaction is blocked while waiting for a response from the server to confirm/deny cached object validity. An alternate view is to optimistically assume that all objects in the cache are valid and all requested locks will be granted, this allows the application toexecute without waiting for a server to respond. If the cached page is valid and the requested lock can be granted the server does not send any response until a commit message is received, an example of asynchronous communication. If the cached page is invalid or deadlock is detected, the transaction is aborted by the server and the client must restart it. It is important to note, the client must receive a response from the server before it can commit. In NWL transactions can


View Full Document

CORNELL CS 632 - Study Notes

Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?