Unformatted text preview:

Early Stopping in Byzantine Agreement DANNY DOLEV Hebrew University, Jerusalem, Israel, and IBM Research, Almaden Research Center, San Jose, California RUEDIGER REISCHUK Institut fuer Theoretische Informatik, Technische Hochschule Darmstadt, Darmstadt, West Germany AND H. RAYMOND STRONG IBM Research, Almaden Research Center, San Jose, Caltfornia Abstract. Two different kinds of Byzantine Agreement for distributed systems with processor faults are defined and compared. The first is required when coordinated actions may be performed by each participant at different times. This kind of agreement is called Eventual Byzantine Agreement (EBA). The second is needed for coordinated actions that must be performed by all participants at the same time. This kind is called Simultaneous Byzantine Agreement (SBA). This paper deals with the number of rounds of message exchange required to reach Byzantine Agreement of either kind (BA). If an algorithm allows its participants to reach Byzantine agreement in every execution in which at most t participants are faulty, then the algorithm is said to tolerate t faults. It is well known that any BA algorithm that tolerates t faults (with t < n - 1 where n denotes the total number of processors) must run at least t + 1 rounds in some execution. However, it might be supposed that in executions where the number f of actual faults is small compared to t, the number of rounds could be correspondingly small. A corollary of our first result states that (when t < n - 1) any algorithm for SBA must run t + 1 rounds in some execution where there are no faults. For EBA (with t < n - 1), a lower bound of min(t + 1, f + 2) rounds is proved. Finally, an algorithm for EBA is presented that achieves the lower bound, provided that t is on the order of the square root of the total number of processors. Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems- distributed applications; distributed databases; network operating systems; C.4 [Performance of Sys- tems]: reliability, availability, and serviceability; F. 1.2 [Computation by Abstract Devices]: Modes of Computation-parallelism; H.2.4 [Database Management]: Systems-distributed systems General Terms: Algorithms, Reliability, Theory, Verification Additional Key Words and Phrases: Agreement problem, asynchronous system, Byzantine Generals problem, commit problem, consensus problem, distributed computing, early stopping, fault tolerance, reliability Some parts of the paper are extended and improved versions of parts of papers that appeared in the Proceedings of the 22nd IEEE Symposium on Foundations of Computer Science [8] and in the Proceedings of the 2nd International Symposium on Distributed Data Bases [ 111. Authors’ addresses: D. Dolev, Hebrew University, 9 1904 Jerusalem, Israel; R. Reischuk, Institut fuer Theoretische Informatik Technische Hochschule Dannstadt, 6100 Darmstadt, West Germany; H. R. Strong, IBM Research, Almaden Research Center, 650 Harry Road, San Jose, CA 95 120. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 0 1990 ACM 0004-541 l/90/1000-0720 $01.50 JOUmal of the Association for Computing Machinery. Vol. 37, No. 4, Octrokr 1990, pp. 7X-741.Early Stopping in Byzantine Agreement 721 1. Introduction In this paper, we discuss two closely-related types of agreement that can be reached in a distributed system in the presence of undetected processor faults. One type is called Simultaneous Byzantine Agreement (SBA) and the other Eventual Byzantine Agreement (EBA). Corresponding to these two types of agreement are two distinct problems in coordination among multiple processors in a distributed system. One problem is synchronization: Processors may be required to perform some action at the same time, immediately after reaching agreement on that action [ 181. The other is consistency as required, for example, in the atomic commitment of a distributed database transaction. The participants in the transaction commit pro- tocol must agree on whether or not the transaction is to be committed. In this case, it is enough to know that the choice will eventually be the choice of all other parties to the agreement [lo]. (Note that the atomic commit problem that implies BA is the problem of nonblocking atomic commit with guaranteed communication. In this problem, correct participants must reach a decision in spite of the failure of any other participants or coordinators and without waiting for the recovery of others.) It is the purpose of this paper to explore the difference between these two problems and the consequent differences in requirements for their solution. Because SBA implies EBA within our model, EBA can always be reached as early as SBA. We show that EBA can often be reached earlier than SBA. The context for our study is a network of n processors that are able to conduct synchronized rounds of information exchange, each round consisting of message transmission, message receipt and processing. In the following, n will always denote the number of processors. We assume that the network is completely connected and that only processors can fail. The reader may be interested in exploring models with weaker assumptions in the following related references: [2], [4], [lo], [17], and [23]. The reader might also like to explore earlier related work in [5], [6], [7], [ 141, [ 161, and [ 191. In the Byzantine fault case, no assumption is made about the behavior of faulty processors. During an execution of an algorithm, a processor is said to be correct if it follows the specifications of the algorithm; otherwise, it is said to be faulty. We assume that the agreement to


View Full Document

MIT 6 897 - Research Paper

Download Research Paper
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Research Paper and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Research Paper 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?