Unformatted text preview:

Achieving Robustness in Distributed Database Systems DEREK L EAGER AND KENNETH C SEVCIK University of Toronto The problem of concurrency control in distributed database systems in which site and communication link failures may occur is considered The possible range of failures is not restricted in particular failures may induce an arbitrary network partitioning It is desirable to attain a high level of robustness in such a system that is these failures should have only a small impact on system operation A level of robustness termed maximal partial operability is identified Under our models of concurrency control and robustness this robustness level is the highest level attainable without significantly degrading performance A basis for the implementation of maximal partial operability is presented To illustrate ita use it is applied to a distributed locking concurrency control method and to a method that utilizes timestamps When no failures are present the robustness modifications for these methods induce no significant additional overhead Categories and Subject Descriptors H 2 2 Database Management Physical Design H 2 4 patabase Management Systems distributed systems H 2 7 Database Management Database Administration General Terms Performance Additional tioning Reliability Key Words and Phrases Concurrency control serializability robustness network parti 1 INTRODUCTION Developments in technology have made practical the interconnection of a large number of computer systems to form a computer network The problem of distributing a database among the different computer systems or sites to form a distributed database system is an active research area One topic that has received much attention is the design of concurrency control methods which permit multiple users to access and modify a distributed database concurrently A concurrency control method views a database as a collection of entities In a centralized database the value of each entity is recorded but once while in a distributed database the value of an entity may be recorded in copies at multiple sites The state of a distributed database is given by the values of all of its entity copies A distributed database is in a consistent state if 1 all of the copies of Authors address Computer Systems Research Group University of Toronto Toronto Canada M5SlA4 Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage the ACM copyright notice and the title of the publication and its date appear and notice is given that copying is by permission of the Association for Computing Machinery To copy otherwise or to republish requires a fee and or specific permission 0 1983 ACM 0730 0301 83 0900 0354 00 75 ACM Transactions on Database Systems Vol 8 No 3 September 1983 Pages 354 381 Achieving Robustness in Distributed Database Systems 355 each entity have the same value the entity value and 2 the entity values satisfy a set of assertions or consistency constraints Thomas has called these two requirements mutual consistency and internal consistency respectively ZS The purpose of the concurrency control component of a database system is to ensure that user transactions against the database see consistent database states Additional complexity is added if the distributed database system is required to be robust A robust system permits some query and update activity preserving database consistency even when system components fail Robustness is desirable in a distributed system owing to the large number of components these systems contain and the improbability that all of these components will always be operative The fewer the restrictions on transaction processing that are imposed when failures occur the higher the level of robustness against failures is said to be This paper is concerned with modifying known concurrency control methods to attain a high level of robustness against arbitrary site and communication link failures without significantly impacting system operation in the absence of failures There are two major problems that are encountered when this modification is attempted The first concerns the management of updates to replicated entities In general it is desirable to allow a transaction to update an entity even though some of the entity copies may be unavailable owing to failures However since transactions must see consistent states it is necessary to place some restrictions on when this is allowed Also the entity update must eventually be applied or accounted for at those copies that do not initially receive it The second major problem concerns the maintenance of the concurrency control method machinery in the presence of failures A concurrency control method must often impose access restrictions on entity copies to prevent the observation of inconsistent states Failures may prevent these restrictions from being removed thus impeding transaction processing This paper presents a methodology that can be used in solving these problems In Section 2 a model of concurrency control is introduced Section 3 discusses the two major consequences of site and communication link failures network partitioning and dangling precommits On the basis of the characteristics of these consequences the highest attainable robustness level that does not significantly degrade system performance in the absence of failures is identified A basis for the implementation of this robustness level is presented in Section 4 The paper concludes with two applications of the basis one to a distributed locking method and one to a concurrency control method that utilizes timestamps 1 l Related Work Numerous concurrency control methods appear in the literature for descriptions of many of these methods see the surveys of Bernstein and Goodman 3 and Hsiao and OZSU 15 However relatively few of these methods include a careful treatment of failures Even when such a treatment has been attempted the resulting method has attained a lower level of robustness than one would prefer The method proposed by Montgomery 20 uses transaction preanalysis to develop a hierarchical locking scheme By allowing an entity copy to have several values polyvalues it can be guaranteed that an entity copy will not be made ACM Transactions on Database Systems Vol 8 No 3 September 1983 356 l D L Eager and K C Sevcik locally inaccessible by any site or communication link failure other than a failure of the site


View Full Document

UW-Madison CS 354 - Achieving Robustness in Distributed Database Systems

Loading Unlocking...
Login

Join to view Achieving Robustness in Distributed Database Systems and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Achieving Robustness in Distributed Database Systems and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?