DOC PREVIEW
MIT 6 826 - Consensus

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Handout 26 1Massachusetts Institute of TechnologyDepartment of Electrical Engineering and Computer Science6.826 — Principles of Computer SystemsHandout 26 April 28, 1997____________________________________________________________________________ConsensusConsensus (sometimes called ‘reliable broadcast’ or ‘atomic broadcast’) is a fundamental buildingblock for distributed systems. Informally, we say that several processes achieve consensus if theyall agree on some value. Three obvious applications are:Distributed transactions, where all the processes need to agree on whether a transactioncommits or aborts. Each transaction needs a new consensus on its outcome.Membership, where a set of processes cooperating to provide a highly available service needto agree on which processes are currently functioning as members of the set. Every time aprocess fails or starts working again there must be a new consensus.Electing a leader of a group of processes.State machinesThere is a much more general way to use consensus, as the mechanism for implementing a highlyavailable state machine. The way to get availability is to have either perfect components orredundancy. The simplest form of redundancy is replication: have several copies of eachcomponent, and make all the non-faulty components do the same thing. Since any computationcan be expressed as a state machine, a replicated state machine can make any computation highlyavailable.Recall the basic idea of a replicated state machine:If the transition relation is deterministic (in other words, is a function from (state, input) to(new state, output)), then several copies of the state machine that start in the same state andsee the same sequence of inputs will do the same thing, that is, end up in the same state andproduce the same outputs.So if several processes are implementing the same state machine and achieve consensus on thevalues and order of the inputs, they will do the same thing. In this way it’s possible to replicate anarbitrary computation and thus make it highly available. Of course we can make the order a partof the value of the input by defining some total order on the set of possible inputs.1 We havealready seen one application of this replicated state machine idea, in the implementation oftransactions; there the replication takes the form of redoing a sequence of actions which isremembered in a log.In many applications the inputs are requests from clients to the replicated service. Typicallydifferent clients generate their requests independently, so it’s necessary to agree not only on whatthe requests are, but also on the order in which to serve them. The simplest way to do this is to 1 This approach was first proposed in a classic paper by Leslie Lamport: Time, clocks, and the ordering of events ina distributed system, Comm. ACM 21, 7, July 1978, pp 558-565. This paper is better known for its analysis of thepartial ordering of events in a distributed system, which is too bad.number them with consecutive integers, starting at 1. This is what is done in ‘primary copy’replication, since it’s easy for one place (the primary) to assign consecutive numbers.The literature contains many other schemes for achieving consensus on the order of requests whentheir total order is not derived from consecutive integers. These schemes label each input withsome label from a totally ordered set (for instance, (client UID, timestamp) pairs) and then devisesome way to be certain that you have seen all the inputs that can ever exist with labels smallerthan a given value. They are complicated.2The section on leases at the end of this handout explains practical methods for minimizing thenumber of times you need to use consensus in implementing a reliable state machine.Specification for consensusHere is the specification for consensus. There is an outcome variable initialized to nil, and anaction Allow(v) that can be invoked any number of times. There is also an action Outcome toread the outcome variable; it must return either nil or a v which was the argument of someAllow action, and it must always return the same v. More precisely, we have two requirements:Agreement: Every non-nil result of Outcome is the same.Validity: A non-nil outcome equals some allowed value.Validity means that the outcome can’t be any arbitrary value, but must be a value that wasallowed. Consensus is reached by choosing some allowed value and assigning it to outcome. Thisspec makes the choice on the fly as the allowed values arrive.MODULE Consensus [V] EXPORT Allow, Outcome =% data Value to agree onVAR outcome : (V + Null) := nilAPROC Allow(v) = << outcome = nil => outcome := v [] SKIP >>APROC Outcome() -> (V+Null) = << RET outcome [] RET nil >>END ConsensusNote that Outcome is allowed to return nil even after the choice has been made. This reflects thefact that in an implementation with several replicas, Outcome is often implemented by talking tojust one of the replicas, and that replica may not yet have learned about the choice.If only one Allow action occurs, there’s no need to choose a v, and the implementation’s onlyproblem is to ensure termination. An algorithm that does so is said to implement ‘reliable’ or‘atomic’ broadcast; there is only one sender, and either everyone or no one gets the message. Thesingle Allow might not set outcome, which corresponds to failure of the sender of the broadcastmessage; in this case no one gets the message.Here is a slightly more complicated, but perhaps more intuitive, spec. It accumulates the allowedvalues and then chooses one of them in the internal action Agree. 2 For details, see F. Schneider, Implementing fault-tolerant services using the state-machine approach: A tutorial,ACM Computing Surveys 22 (Dec 1990). This paper is reprinted in the book Distributed Systems, 2nd edition, ed. S.Mullender, Addison-Wesley, 1993, pp 169-197.Handout 26 2MODULE LateConsensus [V] EXPORT Allow, Outcome =VAR outcome : (V + Null) := nilallowed : SET V := {}APROC Allow(v) = << allowed := allowed + {v} >>APROC Outcome() -> (V+Null) = << RET outcome [] RET nil >>% Only outcome is visibleAPROC Agree() = << VAR v :IN allowed | outcome = nil => outcome := v >>END LateConsensusIt should be fairly clear that LateConsensus implements Consensus. An abstraction function toprove this, however, requires a prophecy variable, because Consensus decides on the outcome (inthe Allow action) before LateConsensus does (in the Agree


View Full Document

MIT 6 826 - Consensus

Documents in this Course
Consensus

Consensus

10 pages

Load more
Download Consensus
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Consensus and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Consensus 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?