CORNELL CS 514 - CS514 Lecture 10

CS514: Intermediate Course in Operating Systems
Professor Ken Birman
Ben Atkin: TA
Lecture 10: Sept. 26

Slide titles in this lecture:
Agreement on Membership
Architecture
Contrast dynamic with static model
Consistency options
Obstacles to progress
Usual response to FLP: Chandra/Toueg
Towards an Alternative
Commit protocol from last lecture
Suppose this is a partitioning failure
Primary partition concept
Ricciardi: Group Membership Protocol
GMP protocol itself
GMS majority requirement
GMS in Action
What if system has thousands of processes?
Uses of membership?
Replicated data within groups
Replicated data
Some Initial Assumptions
Process group model
Process groups with joins, failures
State transfer
Outline of treatment
Atomic delivery
Additional properties
Uniform and non-uniform delivery
Stronger properties cost more
Conceptual cost graph
Implementing multicast primitives
Failures?
Multicast by “flooding”
Garbage collection issue
“Lazy” flooding and garbage collection
“Lazy” flooding
“Lazy” flooding, delayed phases
Lazy scheme continued
Garbage collection with inaccurate failure detections
Exploiting a failure detector
Now our lazy scheme works!
Failure Detectors
Cutting Channels to Failed Processes
Dynamic uniformity
Summary

Agreement on Membership
• Recall our approach:
  – Detecting failure is a lost cause.
    • Too many things can mimic failure
    • To be accurate, we would end up waiting for a process to recover
  – Substitute agreement on membership
    • Now we can drop a process because it isn’t fast enough
    • This can seem “arbitrary”, e.g. A kills B…
    • GMS implements this service for everyone else

Architecture
• Membership Agreement, “join/leave” and “P seems to be unresponsive”
• 3PC-like protocols use membership changes instead of failure notification
• Applications use replicated data for high availability

Architecture
[Figure: application processes A, B, C, D; GMS processes X, Y, Z. Events: join, leave, join, “A seems to have failed”. Membership views: {A}, {A,B,D}, {A,D}, {A,D,C}, {D,C}.]

Contrast dynamic with static model
• Static model: fixed set of processes “tied” to resources
  – Processes may be unreachable (while failed or partitioned away) but later recover
  – Think: “cluster of PCs”
• Dynamic model: changing set of processes launched while the system runs; some fail or terminate
  – Failed processes never recover (a partitioned process may reconnect, but uses a new pid)
  – A process can still own a physical resource, allowing us to emulate a static model

Consistency options
• Could require that the system always be consistent with actions taken at a process, even if that process fails immediately after taking the action
  – This property is needed in systems that take external actions, like advising an air traffic controller
  – May not be needed in high availability systems
• Alternative is to require that the operational part of the system remain continuously self-consistent

Obstacles to progress
• Fischer, Lynch and Paterson (FLP) result: proof that agreement protocols cannot be both externally consistent and live in asynchronous environments
• Suggests that the choice between internal consistency and external consistency is a fundamental one!
• Can show that this result also applies to dynamic membership problems

Usual response to FLP: Chandra/Toueg
• Consider the system as having a failure detector that provides input to the basic system itself
• Agreement protocols within the system are considered safe and live if they satisfy their properties and are live whenever the failure detector is live
• Babaoglu: expresses a similar result in terms of reachability of processes: protocols are live during periods of reachability

Towards an Alternative
• In this lecture, focus on systems with self-defined membership
• Idea is that if p can’t talk to q, it will initiate a membership change that removes q from p’s system “membership view”
• Illustrated on next slide

Commit protocol from last lecture
[Figure: commit protocol run. Message labels: “ok to commit?”, “ok”, “decision unknown!”, “vote unknown!”, “ok”.]

Suppose this is a partitioning failure
[Figure: the same commit protocol run. Message labels: “ok to commit?”, “ok”, “decision unknown!”, “vote unknown!”, “ok”.]
• Do these processes actually need to be consistent with the others?

Primary partition concept
• Idea is to identify the notion of “the system” with a unique component of the partitioned system
• Call this distinguished component the “primary” partition of the system as a whole.
  – Primary partition can speak with authority for the system as a whole
  – Non-primary partitions have weaker consistency guarantees and limited ability to initiate new actions

Ricciardi: Group Membership Protocol
• For use in a group membership service (usually just a few processes that run on behalf of the whole system)
• Tracks its own membership; its members use this to maintain the membership list for the whole system
• All users of the service see subsequences of a single system-wide group membership history
• GMS also tracks the primary partition

GMP protocol itself
• Used only to track membership of the “core” GMS
• Designates one GMS member as the coordinator
• Switches between 2PC and 3PC
  – 2PC if the coordinator didn’t fail and other members failed or are joining
  – 3PC if the coordinator failed and some other member is taking over as new coordinator
• Question: how to avoid “logical partitioning”?

GMS majority requirement
• To move from system “view” i to view i+1, GMS requires explicit acknowledgement by a majority of the processes in view i
• Can’t get a majority: causes GMS to lose its primaryness information
• Dahlia Malkhi has extended GMP to support partitioning and remerging; a similar idea is used by Yair Amir and others in the Totem system

GMS in Action
[Figure: timelines for processes p0, p1, ..., p5.]
• p0 is the initial coordinator. p1 and p2 join, then p3...p5 join. But p0 fails during the join protocol, and later so does p3. Notice the use of majority consent to avoid partitioning!

GMS in Action
[Figure: timelines for processes p0, p1, ..., p5. Phases: 2-phase commit… 3-phase… 2-phase. Annotations: P0 is coordinator… P1 takes over… P1 is new coordinator.]

What if system has thousands of processes?
• Idea is to build a GMS subsystem that runs on just a few nodes
• GMS members track themselves
• Other processes ask to be admitted to the system or for faulty processes to be excluded
• GMS treats overall system membership as a form of replicated data that it manages and reports to its “listeners”

Uses of membership?
• If we rewire TCP and RPC to use membership changes as the trigger for breaking connections, can eliminate split-brain
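
The majority rule from the “GMS majority requirement” slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GMP protocol itself: the names View, has_majority and next_view are hypothetical, acknowledgements are modeled as a plain set of process names, and the 2PC/3PC message rounds are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """One GMS membership view: a view number and its member set (hypothetical type)."""
    number: int
    members: frozenset


def has_majority(view, acks):
    """True if strictly more than half of the members of `view` acknowledged."""
    # Acknowledgements from processes outside the view are ignored.
    return len(set(acks) & view.members) > len(view.members) // 2


def next_view(view, proposed_members, acks):
    """Return view i+1 if a majority of view i acknowledged, else None.

    None models the GMS being unable to get a majority and so losing
    its claim to be the primary partition.
    """
    if not has_majority(view, acks):
        return None
    return View(view.number + 1, frozenset(proposed_members))


# A five-member view splits into a three-member side and a two-member side.
v0 = View(0, frozenset({"p0", "p1", "p2", "p3", "p4"}))
majority_side = next_view(v0, {"p0", "p1", "p2"}, acks={"p0", "p1", "p2"})
minority_side = next_view(v0, {"p3", "p4"}, acks={"p3", "p4"})
```

Because any two majorities of view i share at least one process, at most one side of a partition can install view i+1. In this sketch the three-member side advances and the two-member side gets None, which is the sense in which majority consent avoids “logical partitioning”.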

