CORNELL CS 5410 - Building a System Management Service - D1149052

Home> Schools> Cornell University> Computer Science (CS) > CS 5410> Building a System Management Service

CORNELL CS 5410 - Building a System Management Service

Course Cs 5410- Intermediate Computer Systems

Pages 36

Download Save

Unformatted text preview:

iKen BirmanCornell University. CS5410 Fall 2008. Last week looked at timey In effect, we asked “can we build a time service for a data center”?Rhd liyReached two conclusionsy One focused on event orderingyThe other was a true synchronized clockyThe other was a true synchronized clocky This week, we’ll use some of the ideas from the time service to build a powerful system management servicep y gHear and obey. The primary is Oracledown. I have spoken!!!Track membershipAn all‐seeing eye.primaryAn allseeing eye.crashbackupy Clients obey ity If the oracle errswe “do as it says” anyhowwe do as it says anyhowy This eliminated ourfear of inconsistency.fear of inconsistency.Using the Oracle to manage aUsing the Oracle to manage a systemy For many purposes, Oracle can “publish decrees”y “Failure” and “Recovery” don’t need to be the only casesy For exampley “Engines at warp‐factor two!”“R j t iit t”y“Reject non‐priority requests”y “Map biscuit.cs.cornell.edu to 128.57.43.1241”y Imagine this as an append‐only logP failedUsing the Oracle to manage a systemy If we give the records “names” (like file paths) we can treat the log as a set of logs//bi i ll d / idy/process‐status/biscuit.cs.cornell.edu/pid12345y /parameters/peoplesoft/run‐slow=truey/locks/printqueuey/locks/printqueueyThus one log can “look” like many logsyThus one log can look like many logsy Clients append to logsyAnd they also “subscribe” to see reports as changes occury p gMany roles for Oraclesy Track membership of a complex systemy Which applications are up? Which are down?yWhere are service instance s running? (“GMS” function)y Use it as “input” for group applications, TCP failure sensing load‐balancing etcsensing, loadbalancing, etc.y Lock managementyParameter and status trackingParameter and status trackingy Assignment of roles, keysyDNS functionalityDNS functionalityScalabilityy Clearly, not everything can run through one servery It won’t be fast enoughy Solutions?y Only use the Oracle “when necessary” (will see more on this later)this later)y Spread the role over multiple serversy One Oracle “node” could be handled by, say, three serversy yy And we could also structure the nodes as a hierarchy, with different parts of our log owned by different nodesyRequires “consensus” on log append operationsyRequires consensus on log append operationsConsensus problemy A classic (and well understood) distributed computing problem, arises in a few variant forms (agreement, atomic broadcast leader election locking)atomic broadcast, leader election, locking)yCore question:yCore question:y A set of processes have inputs vi ∈ {0,1}yProtocol is started (by some sort of trigge r)Protocol is started (by some sort of trigge r)y Objective: all decide v, for some v in the input sety Example solution: “vote” and take the majority valueConsensus with failuresy The so‐called FLP (Fischer, Lynch and Patterson) result proves that any consensus protocol capable of tolerating even a single failure must have nontolerating even a single failure must have non‐terminating runs (in which no decision is reached)y Proof is for an asynchronous execution; flavor similar to that of the pumping lemma in language theoryppg gg yy Caveat: the run in question is of probability zeroAside: FLP Proofy The actual proof isn’t particularly intuitivey They show that any fault‐tolerant consensus protocol h ifiit th t it f l bi l t tthas infinite runs that consist of purely bivalent statesyThe intuition is that delayed messages can force a yThe intuition is that delayed messages can force a consensus protocol to “reconfigure”y The implicit issue is that consensus requires a unique p q qleader to reaches the decision on behalf of the system.y FLP forc es repeated transient message delaysh l h ld f l fy These isolate the leader, forcing selection of a new leader, and thus delaying the decision indefinitelyAside: “Impossibility”y A perhaps‐surprising insight is that for theory community, “impossible” doesn’t mean “can’t be done”I l l iibl hi b yIn normal language, an impossible thing can never be done. It is impossible for a person to fly (except on TV)y In the formal definitions used for FLP, impossible means can’t always be done. If there is even one run in which d ’ hd “bl” dddecisions aren’t reached, it is “impossible” to decide.yIn fact as a practical matter consensus can always be yIn fact, as a practical matter, consensus can always be reached as long as a majority of our system is operationalConsensus is impossible.Consensus is impossible. But why do we care?y The core issue is that so many problems are equivalent to consensusBill i bh iyBasically, any consistent behavioryFLP k it hd t b i bt tyFLP makes it hard to be rigorous about correctnessy We can prove partial but not total correctnessyFor the theory community this is frustrating –it is yFor the theory community, this is frustrating it is “impossible” to solve consensus or equivalent problemsy At best we talk about progress in models with OraclesConsensus‐like behaviory We’ll require that our log behave in a manner indistinguishable from a non‐replicated, non‐faulty single instance running on some accessible serversingle instance running on some accessible serveryBut we’ll implement the log using a group of yBut we ll implement the log using a group of components that run a simple state‐machine append protocolpy This abstraction matches the “Paxos” protocoly But the protocol we’ll look at is older and was developed i h Ii f “ i ”in the Isis system for “group view management”Group communicationy We want the Oracle itself to be a tree, nodes of which are groups of serversI f li hi yIn fact we can generalize this concepty The general version is a group of processesy supported by some form of management servicey… supported by some form of management serviceyTurtles all the wa y down again?yTurtles all the wa y down, again?y At the core we’ll have a “root” groupGroup Communication illustrationpqqrsstuy Te rminology: group create, view, join with state transfer, multicast, client‐to‐group communicationuy “Dynamic” membership model: processes come & goRecipe for a group communicationRecipe for a group communication systemy

View Full Document


School:
Email:
New Password:
Confirm Password:

CORNELL CS 5410 - Building a System Management Service

Sign up for free to view:

Please select your school