DOC PREVIEW
CORNELL CS 614 - Virtual synchrony

This preview shows page 1-2-3 out of 9 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Virtual synchronyKen BirmanVirtual Synchrony Goal: Simplifies distributed systems development by introducing emulating a simplified world – a synchronous one Features of the virtual synchrony model Process groups with state transfer, automated fault detection and membership reporting Ordered reliable multicast, in several flavors Fault-tolerance, replication tools layered on top Extremely good performanceProcess groups Offered as a new and fundamental programming abstraction Just a set of application processes that cooperate for some purpose Could replicate data, coordinate handling of incoming requests or events, perform parallel tasks, or have a shared perspective on some sort of “fact” about the system Can create many of them** Within limits... Many systems only had limited scalabilityWhy “virtual” synchrony? What would a synchronous execution look like? In what ways is a “virtual” synchrony execution not the same thing?A synchronous executionpqrstu With truesynchrony executions run in genuine lock-step. Virtual Synchrony at a glanceWith virtualsynchrony executions only look “lock step” to the applicationpqrstu2Virtual Synchrony at a glancepqrstuWe use the weakest (least ordered, hence fastest) form of communication possibleChances to “weaken” ordering Suppose that any conflicting updates are synchronized using some form of locking Multicast sender will have mutual exclusion Hence simply because we used locks, cbcastdelivers conflicting updates in order they were performed! If our system ever does see concurrent multicasts… they must not have conflicted. So it won’t matter if cbcast delivers them in different orders at different recipients!Causally ordered updates Each thread corresponds to a different lock In effect: red “events” never conflict with green ones!prst1234512In general? Replace “safe” (dynamic uniformity) with a standard multicast when possible Replace abcast with cbcast Replace cbcast with fbcast Unless replies are needed, don’t wait for replies to a multicastWhy “virtual” synchrony? The user writes code as it will experience a purely synchronous execution Simplifies the developer’s task – very few cases to worry about, and all group members see the same thing at the same “time” But the actual execution is rather concurrent and asynchronous Maximizes performance Reduces risk that lock-step execution will trigger correlated failuresWhy groups? Other concurrent work, such as Lamport’sstate machines, treat the entire program as a deterministic entity and replicate it But a group replicates state at the “abstract data type” level Each group can correspond to one object This is a good fit with modern styles of application development3Correlated failures Perhaps surprisingly, experiments showed that virtual synchrony makes these less likely! Recall that many programs are buggy Often these are Heisenbugs (order sensitive) With lock-step execution each group member sees group events in identical order So all die in unison With virtual synchrony orders differ So an order-sensitive bug might only kill one group member!Programming with groups Many systems just have one group E.g. replicated bank servers Cluster mimics one highly reliable server But we can also use groups at finer granularity E.g. to replicate a shared data structure Now one process might belong to many groups A further reason that different processes might see different inputs and event ordersEmbedding groups into “tools” We can design a groups API: pg_join(), pg_leave(), cbcast()… But we can also use groups to build other higher level mechanisms Distributed algorithms, like snapshot Fault-tolerant request execution Publish-subscribeDistributed algorithms Processes that might participate join an appropriate group Now the group view gives a simple leader election rule Everyone sees the same members, in the same order, ranked by when they joined Leader can be, e.g., the “oldest” processDistributed algorithms A group can easily solve consensus Leader multicasts: “what’s your input”? All reply: “Mine is 0. Mine is 1” Initiator picks the most common value and multicasts that: the “decision value” If the leader fails, the new leader just restarts the algorithm Puzzle: Does FLP apply here?Distributed algorithms A group can easily do consistent snapshot algorithm Either use cbcast throughout system, or build the algorithm over gbcast Two phases: Start snapshot: a first cbcast Finished: a second cbcast, collect process states and channel logs4Distributed algorithms: Summary Leader election Consensus and other forms of agreement like voting Snapshots, hence deadlock detection, auditing, load balancingMore tools: fault-tolerance Suppose that we want to offer clients “fault-tolerant request execution” We can replace a traditional service with a group of members Each request is assigned to a primary (ideally, spread the work around) and a backup Primary sends a “cc” of the response to the request to the backup Backup keeps a copy of the request and steps in only if the primary crashes before replying Sometimes called “coordinator/cohort” just to distinguish from “primary/backup”Coordinator-cohortpqrstuQ assigned as coordinator for t’s request…. But p takes over if q fails Coordinator-cohortpqrstuP picked to perform u’s request. Q stands by until it sees request completion message Parallel processingpqrstP and Q split a task. P performs part 1 of 2; Q performs part 2 of 2. Such as searching a large database… they agree on the initial state in which the request was received Parallel processingpqrstIn this example, r is the cohort and both p and q function as coordinators. If either fails, r can step in and take over its role….5Parallel processingpqrst…. As it did in this case, when q fails Publish / Subscribe Goal is to support a simple API: Publish(“topic”, message) Subscribe(“topic”, event_hander) We can just create a group for each topic Publish multicasts to the group Subscribers are the membersScalability warnings! Many existing group communication systems don’t scale incredibly well E.g. JGroups, Isis, Horus, Ensemble, Spread Group sizes


View Full Document

CORNELL CS 614 - Virtual synchrony

Documents in this Course
Load more
Download Virtual synchrony
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Virtual synchrony and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Virtual synchrony 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?