CS514: Intermediate Course in Operating Systems
Professor Ken Birman
Krzys Ostrowski: TA

Contents: Virtual Synchrony · Why "virtual" synchrony? · A synchronous execution · Virtual Synchrony at a glance · Slide 6 · Chances to "weaken" ordering · Causally ordered updates · In general? · Slide 10 · Correlated failures · Programming with groups · Embedding groups into "tools" · Distributed algorithms · Slides 15-16 · Distributed algorithms: Summary · More tools: fault-tolerance · Publish / Subscribe · Scalability warnings! · Publish / Subscribe issue? · Other "toolkit" ideas · Other similar ideas · Existing toolkits: challenges · Preserving order · The tradeoff · Solution used in Horus · Other toolkit "issues" · Features of major virtual synchrony platforms · Slides 30-31 · Horus/JGroups/Ensemble protocol stacks · JGroups (part of JBoss) · Spread Toolkit · Summary?

Virtual Synchrony
- A powerful programming model, called virtual synchrony
- It offers:
  - Process groups with state transfer, automated fault detection, and membership reporting
  - Ordered reliable multicast, in several flavors
  - Extremely good performance

Why "virtual" synchrony?
- What would a synchronous execution look like?
- In what ways is a "virtual" synchrony execution not the same thing?

A synchronous execution
- [Figure: processes p, q, r, s, t, u exchanging multicasts in lock-step rounds]
- With true synchrony, executions run in genuine lock-step.

Virtual Synchrony at a glance
- With virtual synchrony, executions only look "lock step" to the application
- [Figure: processes p, q, r, s, t, u]

Virtual Synchrony at a glance
- [Figure: processes p, q, r, s, t, u]
- We use the weakest (hence fastest) form of communication possible

Chances to "weaken" ordering
- Suppose that any conflicting updates are synchronized using some form of locking
- The multicast sender will have mutual exclusion
- Hence, simply because we used locks, cbcast delivers conflicting updates in the order they were performed!
- If our system ever does see concurrent multicasts… they must not have conflicted. So it won't matter if cbcast delivers them in different orders at different recipients!

Causally ordered updates
- Each thread corresponds to a different lock
- In effect: red "events" never conflict with green ones!
- [Figure: processes p, r, s, t with numbered update events in two colors]

In general?
- Replace "safe" (dynamic uniformity) with a standard multicast when possible
- Replace abcast with cbcast
- Replace cbcast with fbcast
- Unless replies are needed, don't wait for replies to a multicast

Why "virtual" synchrony?
- The user sees what looks like a synchronous execution
  - Simplifies the developer's task
- But the actual execution is rather concurrent and asynchronous
  - Maximizes performance
  - Reduces the risk that lock-step execution will trigger correlated failures

Correlated failures
- Why do we claim that virtual synchrony makes these less likely?
- Recall that many programs are buggy
  - Often these are Heisenbugs (order sensitive)
- With lock-step execution, each group member sees group events in identical order
  - So all die in unison
- With virtual synchrony, orders differ
  - So an order-sensitive bug might only kill one group member!

Programming with groups
- Many systems just have one group
  - E.g. replicated bank servers
  - The cluster mimics one highly reliable server
- But we can also use groups at finer granularity
  - E.g. to replicate a shared data structure
- Now one process might belong to many groups
  - A further reason that different processes might see different inputs and event orders

Embedding groups into "tools"
- We can design a groups API: pg_join(), pg_leave(), cbcast()…
- But we can also use groups to build other, higher-level mechanisms:
  - Distributed algorithms, like snapshot
  - Fault-tolerant request execution
  - Publish-subscribe

Distributed algorithms
- Processes that might participate join an appropriate group
- Now the group view gives a simple leader election rule
  - Everyone sees the same members, in the same order, ranked by when they joined
  - The leader can be, e.g., the "oldest" process

Distributed algorithms
- A group can easily solve consensus
- The leader multicasts: "what's your input?"
- All reply: "Mine is 0." "Mine is 1."
- The initiator picks the most common value and multicasts that: the "decision value"
- If the leader fails, the new leader just restarts the algorithm
- Puzzle: Does FLP apply here?

Distributed algorithms
- A group can easily run a consistent snapshot algorithm
- Either use cbcast throughout the system, or build the algorithm over gbcast
- Two phases:
  - Start snapshot: a first cbcast
  - Finished: a second cbcast; collect process states and channel logs

Distributed algorithms: Summary
- Leader election
- Consensus and other forms of agreement, like voting
- Snapshots, hence deadlock detection, auditing, load balancing

More tools: fault-tolerance
- Suppose that we want to offer clients "fault-tolerant request execution"
- We can replace a traditional service with a group of members
- Each request is assigned to a primary (ideally, spread the work around) and a backup
- The primary sends a "cc" of its response to the backup
- The backup keeps a copy of the request and steps in only if the primary crashes before replying
- Sometimes called "coordinator/cohort," just to distinguish it from "primary/backup"

Publish / Subscribe
- The goal is to support a simple API:
  - Publish("topic", message)
  - Subscribe("topic", event_handler)
- We can just create a group for each topic
  - Publish multicasts to the group
  - Subscribers are the members

Scalability warnings!
- Many existing group communication systems don't scale especially well
  - E.g. JGroups, Ensemble, Spread
- Group sizes are limited to perhaps 50-75 members
- And individual processes are limited to joining perhaps 50-75 groups (Spread: see next slide)
- Overheads soar as these sizes increase
  - Each group runs protocols oblivious of the others, and this creates huge inefficiency

Publish / Subscribe issue?
- We could have thousands of topics!
  - Too many to map topics directly to groups
- Instead, map topics to a smaller set of groups
  - The Spread system calls these "lightweight" groups
  - The mapping will result in inaccuracies… so filter incoming messages, discarding any not actually destined to the receiving process
- Cornell's new QuickSilver system will instead directly support immense numbers of groups

Other "toolkit" ideas
- We could embed group communication into a framework in a "transparent" way
  - Example: the CORBA fault-tolerance specification does lock-step replication of deterministic components
- The client simply can't see failures
- But
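The "lightweight groups" idea from the Publish / Subscribe slides can be sketched as a small simulation: many topics are hashed onto a few underlying transport groups, and each receiver filters out messages for topics it never subscribed to. The names here (Broker, Process, topic_to_group) are illustrative, not any toolkit's real API, and the in-process Broker merely stands in for a real multicast layer.

```python
import hashlib
from collections import defaultdict

N_GROUPS = 8  # illustrative: far fewer transport groups than topics

def topic_to_group(topic: str) -> int:
    """Hash a topic name onto one of a small, fixed set of groups."""
    return hashlib.md5(topic.encode()).digest()[0] % N_GROUPS

class Process:
    """A receiver that joins coarse transport groups but filters by topic."""
    def __init__(self):
        self.subscriptions = set()  # topics this process actually wants
        self.delivered = []

    def on_multicast(self, topic, message):
        # The transport group is coarser than the topic, so discard any
        # message not actually destined to this receiving process.
        if topic in self.subscriptions:
            self.delivered.append((topic, message))

class Broker:
    """Stand-in for the group-communication layer: one member set per group."""
    def __init__(self):
        self.members = defaultdict(set)  # group id -> set of processes

    def subscribe(self, proc, topic):
        proc.subscriptions.add(topic)
        self.members[topic_to_group(topic)].add(proc)

    def publish(self, topic, message):
        # Multicast to every member of the underlying transport group;
        # receivers filter, so collisions only cost wasted deliveries.
        for proc in self.members[topic_to_group(topic)]:
            proc.on_multicast(topic, message)
```

Even if two topics collide in the same transport group, a subscriber to only one of them delivers only its own messages; the cost of the inaccuracy is extra network traffic, not incorrect delivery.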
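The leader-election and consensus slides can likewise be sketched as a toy simulation: the group view ranks members by join order, so everyone agrees that the "oldest" member leads; the leader gathers inputs and multicasts the most common value, and if it fails the next-oldest member simply restarts the algorithm. GroupView, Member, and run_consensus are hypothetical names for this sketch, not a real toolkit API.

```python
from collections import Counter

class GroupView:
    """Members listed in the order they joined; every process is delivered
    the same view, so all processes agree on the ranking."""
    def __init__(self, members):
        self.members = list(members)

    def leader(self):
        # Simple election rule: the "oldest" member is the leader.
        return self.members[0]

    def drop(self, failed):
        # Membership change: a new view without the failed member.
        return GroupView(m for m in self.members if m is not failed)

class Member:
    def __init__(self, name, proposal):
        self.name = name
        self.proposal = proposal  # this member's input to consensus

def run_consensus(view):
    """Leader multicasts "what's your input?", collects the replies, and
    multicasts the most common value as the "decision value"."""
    replies = [m.proposal for m in view.members]
    return Counter(replies).most_common(1)[0][0]
```

If the leader crashes mid-run, the surviving members receive a new view, agree on the new oldest member, and rerun `run_consensus` on the new view, which is exactly the restart rule on the slide.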