Dependability Lessons from Internetty Systems: An Overview
Stanford University CS 444A / UC Berkeley CS 294-4
Recovery-Oriented Computing, Autumn 01
Armando Fox, [email protected]
© 2001 Stanford

Concepts Overview
- Trading consistency for availability: Harvest, yield, and the DQ principle; TACT
- Runtime fault containment: virtualization and its uses
- Orthogonal mechanisms: timeouts, end-to-end checks, statistical detection of performance failures
- State management, hard and soft state
- Revealed truths: end-to-end argument (Saltzer), software pitfalls (Leveson), and their application to dependability
- Many, many supplementary readings about these topics

Consistency/Availability Tradeoff: CAP
- CAP principle (this formulation due to Brewer): In a networked/distributed storage system, you can have any 2 of consistency, high availability, partition resilience.
  - Internet systems favor A and P over C
  - Databases favor C and A over P
  - Surely other examples
- Generalization: can you trade some of one for more of another? (hint: yes)

Consistency/Availability: Harvest/Yield
- Yield: probability of completing a query
- Harvest: (application-specific) fidelity of the answer
  - Fraction of data represented?
  - Precision?
  - Semantic proximity?
- Harvest/yield questions:
  - When can we trade harvest for yield to improve availability?
  - How to measure harvest “threshold” below which response is not useful?
- Application decomposition to improve “degradation tolerance” (and therefore availability)

Generalization: TACT (Yu & Vahdat)
- Model: distributed database using anti-entropy to approach consistency
- “Conit” captures app-specific consistency unit (think: ADU of consistency)
  - Airline reservation: all seats on 1 flight
  - Newsgroup: all articles in 1 group
- Bounds on 3 kinds of inconsistency
  - Numerical error (value is inaccurate)
  - Order error (write(s) may be missing, or arrive out-of-order)
  - Staleness (value may be out-of-date)
- “Consistency cost” of operations can be characterized in terms of conits, and bounds on inconsistency enforced

TACT-like example: TranSend
- Early stab at lossy on-the-fly Web image compression, extensively parameterized (per user, device, etc.)
- Harvest: “semantic fidelity” of what you get
  - Worst case: the original image
  - Intermediate case: “close” image that has been previously computed and cached
  - Metrics for semantic fidelity?
- Trade harvest for yield/throughput
- TACT-like, though TACT didn’t exist then
[Figure: images labeled “original”, “desired”, “delivered”]

Another special case: DQ Principle
- Model: read-mostly database striped across many machines
- Idea: Data/Query x Queries/Sec = Data/Sec
- Goal: design system so that D/Q or Q/S are tunable
  - Then you can decide how partial failure affects users
  - In practice, Internet systems constraint is offered load of Q/S, so failures affect D/Q for each user
- Can use some replication of most common data to mitigate effects of reducing D/Q

Fault Containment
- Uses of software based fault isolation and VM technology
  - Protecting the “real” hardware (now will also be used for ASP’s)
  - Hypervisor-based F/T
- Orthogonal mechanisms for fault containment
  - …and enforcing your assumptions
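
Worked sketch: DQ principle and harvest/yield
The DQ relation from the slides (Data/Query x Queries/Sec = Data/Sec) can be illustrated with a small calculation. The sketch below is only a toy model under stated assumptions, not anything taken from the course materials: the names Cluster, dq_per_node, keep_yield, and keep_harvest are hypothetical, each node is assumed to contribute an equal share of data/sec, and only the capacity bound is modeled (which data is actually reachable after a failure is ignored). It shows the two choices a designer has when nodes fail: keep answering every query with less data per query (harvest drops), or keep full answers and turn some queries away (yield drops).

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical read-mostly database striped evenly across nodes."""
    nodes_total: int
    nodes_up: int
    dq_per_node: float  # assumed data units per second one node can deliver

    def dq_capacity(self) -> float:
        # Total Data/Sec shrinks in proportion to the number of nodes lost.
        return self.nodes_up * self.dq_per_node


def keep_yield(cluster: Cluster, offered_qps: float, full_data_per_query: float):
    """Serve every offered query; let Data/Query (harvest) absorb the loss."""
    if offered_qps <= 0 or full_data_per_query <= 0:
        raise ValueError("offered_qps and full_data_per_query must be positive")
    data_per_query = min(full_data_per_query, cluster.dq_capacity() / offered_qps)
    harvest = data_per_query / full_data_per_query
    return harvest, 1.0  # (harvest, yield)


def keep_harvest(cluster: Cluster, offered_qps: float, full_data_per_query: float):
    """Answer with full data; let Queries/Sec (yield) absorb the loss."""
    if offered_qps <= 0 or full_data_per_query <= 0:
        raise ValueError("offered_qps and full_data_per_query must be positive")
    served_qps = min(offered_qps, cluster.dq_capacity() / full_data_per_query)
    return 1.0, served_qps / offered_qps  # (harvest, yield)


if __name__ == "__main__":
    # Illustrative numbers only: 2 of 10 nodes are down.
    degraded = Cluster(nodes_total=10, nodes_up=8, dq_per_node=100.0)
    print("keep yield:  ", keep_yield(degraded, offered_qps=100.0, full_data_per_query=10.0))
    print("keep harvest:", keep_harvest(degraded, offered_qps=100.0, full_data_per_query=10.0))
```

Because offered load in Q/S is usually outside the operator's control for Internet systems, the keep_yield policy corresponds to the case the slides describe, where failures show up as reduced D/Q for each user.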