D3S: Debugging Deployed Distributed Systems
Xuezheng Liu, Zhenyu Guo, Xi Wang, Feibo Chen, Xiaochen Lian, Jian Tang, Ming Wu, M. Frans Kaashoek, Zheng Zhang
NSDI 2008
Presented by: Pooja Agarwal
CS 525 Class Presentation, UIUC

Distributed Debugging
• How do we generally debug a program?
  ▫ Mostly iterative
• Reproducing bugs is hard in distributed systems
  ▫ Large-scale systems
  ▫ Network/machine failures
• Example: distributed reader-writer locks
  ▫ Lock mode: exclusive, shared
  ▫ Invariant: only one client can hold a lock in the exclusive mode

Each step in distributed debugging is a challenge
• Step 1: What to record? How much to record?
  ▫ States to record change over time; risk of recording too little or too much
• Step 2: How to record?
  ▫ Log-based vs. online monitoring
• Step 3: How to order records?
  ▫ Problem of globally consistent snapshots
• Step 4: How to verify?
  ▫ Design efficient predicates; single vs. multiple verifiers
• Processes/nodes under debug can fail
  ▫ Need to approximate globally consistent snapshots
• Debugger nodes themselves can fail
  ▫ Need to keep running with few false positives and false negatives

Current Approaches
• Log Analysis
• Large-Scale Parallel Applications
• Model Checking
• Online Monitoring
• Replay-based Predicate Checking

D3S: Debugging Deployed Distributed Systems
• A simple model for writing distributed predicates
• Programmers can change what is being checked on the fly
• Run-time checking that scales to large systems
• Failure-tolerant consistent snapshots for predicate checking
• Evaluation with five real-world applications

D3S Workflow
[Figure: D3S workflow. Predicates (States + Logic) and Symbol Info are split into State Exposers (SE) and Checking Logic (CL). SEs are dynamically injected into the App processes; CL runs on the Verifiers. Verifier output: violation reports with the sequence of states, e.g. Conflict.]

Writing Predicates
    // Computation graph
    V0: exposer { ( client: ClientID, lock: LockID, mode: LockMode ) }
    V1: V0 { ( conflict: LockID ) } as final
    after (ClientNode::OnLockAcquired) addtuple ($0->m_NodeID, $1, $2)
    after (ClientNode::OnLockReleased) deltuple ($0->m_NodeID, $1, $2)

    // Source code from example app
    class ClientNode {
        ClientID m_NodeID;
        void OnLockAcquired( LockID, LockMode );
        void OnLockReleased( LockID, LockMode );
    };

    // C++ code for predicate
    class LockVerifier : public vertex<V1> {
        // verify predicate in the required snapshots, output conflicts
        virtual void Execute( const V0::Collection & snapshot );
        // map states to key space
        static Key Mapping( const V0::tuple & t );
    };

[Figure: computation graph V0 -> V1. V0 outputs tuples of (C, L, M); V1 outputs Conflict (L).]
• State exposure (V0)
  ▫ Reuse of application code
  ▫ Binary instrumentation
• Checking (V1)
  ▫ Wait for snapshot to complete
  ▫ Mapping()
  ▫ More complex computation graphs

Partitioning and Parallelism in D3S
[Figure: exposed states {C1,L0,E},{C1,L4,S}; {C2,L1,E},{C2,L4,S}; {C8,L4,S} are mapped onto the key space (L0, L1, L4). One verifier checks L0~L3, another checks L4~L7.]
• Dynamic assignment of key spaces to verifiers by a central master
• Pipelining
• Fault tolerance

Further Optimizations
• Buffering of exposed states at V0
  ▫ Handles verifier failures
• Incremental checking [ExecuteChange()]
  ▫ Increases efficiency
• Sampling the key space or timestamps
  ▫ Reduces overhead

Global Snapshots
• Predicates are defined over a finite number of consecutive snapshots.
• Use of a Lamport logical clock at each node
• Liveness issues

Consistent Snapshots
[Figure: timeline for nodes A and B and the Checker.
  Exposed states: { (A, L0, S) }, ts=2; { (B, L1, E) }, ts=6; { }, ts=10; ts=12; { (A, L1, E) }, ts=16
  Messages to the Checker: SA(2), SB(6), SA(10), SA(16)
  Checker: M(2)={A,B}, SB(2)=??; M(6)={A,B}, SA(6)=??; M(10)={A,B}, SA(6)=SA(2), check(6); Failure Detected, SB(10)=SB(6), check(10); M(16)={A}, check(16)]
• Assumptions: reliable network, and messages received in FIFO order
• Membership: external service or built-in heartbeats
• Snapshot is correct as long as membership is correct
• When no state is being exposed, an app node should report its timestamp periodically

Experiments
• Three major expectations
  ▫ Help applications find bugs
  ▫ Predicates need to be simple to write
  ▫ Checking overhead needs to be low

Case Study: PacificA
• Predicate
  ▫ There is at most one primary replica in each group of replica nodes
• Deployment
  ▫ 8 machines
  ▫ Test scenario: database app with random I/O
  ▫ Randomly crash & restart processes
  ▫ D3S <Slice_identifier, MachineID, Primary/Secondary>
• Debugging
  ▫ 3 checkers, partitioned by replica groups
  ▫ Time to trigger violation: several hours

PacificA: Architecture & Bug Trace
[Figure: Meta Server; Slice server (Sid=2,S; Sid=1,S); Slice server (Sid=2,S; Sid=1,P); Slice server (P). Verifier catches violation. Report: timestamp, node, event seq.]
• Coordinator crashed and forgot the previous answer
• Must write to disk synchronously!

Results
Table 1: Results for 5 applications
[Table contents not recovered; the applications are grouped as Data center App and Wide Area App.]

Performance
• Each thread (client) sends 1,000 requests
• Overhead is less than 8%, and in most cases less than 4%
• I/O overhead < 0.5%
• Overhead in Chord and Paxos is negligible, and in BitTorrent and web search is < 2%

Discussion
• Can D3S be used for large-scale applications using different collaborative systems?
  ▫ How to build predicates across various systems?
  ▫ Which system to check in event of faults?
• How easy is it to use D3S?
  ▫ One needs to know how the application
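
Sketch: the lock-conflict predicate
The reader-writer lock invariant from the Distributed Debugging and Writing Predicates slides can be illustrated with a small standalone check. This is a minimal sketch under stated assumptions, not the D3S API: LockTuple, FindConflicts, and the concrete types chosen for ClientID and LockID are hypothetical names introduced here. It treats a lock as conflicting when one client holds it exclusively while any other client holds it in either mode, which is one reading of the invariant on the slides.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types named on the slides.
using ClientID = std::string;
using LockID = int;
enum class LockMode { Shared, Exclusive };

// One exposed state tuple (C, L, M).
struct LockTuple {
    ClientID client;
    LockID lock;
    LockMode mode;
};

// Returns the locks that violate the invariant within one snapshot.
// Assumption: a lock conflicts if some client holds it exclusively
// while at least one other client also holds it (in any mode).
std::set<LockID> FindConflicts(const std::vector<LockTuple>& snapshot) {
    struct Holders {
        std::set<ClientID> all;        // every client holding the lock
        std::set<ClientID> exclusive;  // clients holding it exclusively
    };
    std::map<LockID, Holders> byLock;
    for (const LockTuple& t : snapshot) {
        Holders& h = byLock[t.lock];
        h.all.insert(t.client);
        if (t.mode == LockMode::Exclusive) {
            h.exclusive.insert(t.client);
        }
    }

    std::set<LockID> conflicts;
    for (const auto& entry : byLock) {
        const Holders& h = entry.second;
        if (!h.exclusive.empty() && h.all.size() > 1) {
            conflicts.insert(entry.first);
        }
    }
    return conflicts;
}
```

Applied to the states shown on the partitioning slide ({C1,L0,E}, {C1,L4,S}, {C2,L1,E}, {C2,L4,S}, {C8,L4,S}), this check reports no conflict, since L4 is held only in shared mode. Grouping by lock ID first mirrors the role of Mapping(): all tuples for one lock must reach the same verifier for the check to be sound.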
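
Sketch: deciding when a snapshot can be checked
The Consistent Snapshots slide suggests that the checker can evaluate a snapshot at timestamp t once every node in the membership M(t) has reported a timestamp at or beyond t, filling in a node's state at t with its most recent earlier report (as in SA(6)=SA(2)). The following is a minimal sketch of that bookkeeping, assuming FIFO delivery as stated on the slide. SnapshotChecker, Report, Heartbeat, StateAt, and Ready are hypothetical names, states are modeled as opaque strings, and a detected failure is modeled simply by the caller dropping the node from the membership set.

```cpp
#include <map>
#include <set>
#include <string>

using NodeID = std::string;
using State = std::string;      // hypothetical: opaque exposed state
using Timestamp = unsigned long;

class SnapshotChecker {
public:
    // Node exposed a new state at logical time ts.
    void Report(const NodeID& node, Timestamp ts, const State& state) {
        history_[node][ts] = state;
        Heartbeat(node, ts);
    }

    // Node had nothing to expose but reports its current timestamp.
    void Heartbeat(const NodeID& node, Timestamp ts) {
        Timestamp& last = lastSeen_[node];
        if (ts > last) last = ts;
    }

    // S_node(t): latest state reported at or before t, or nullptr
    // if the node has reported nothing that early.
    const State* StateAt(const NodeID& node, Timestamp t) const {
        auto n = history_.find(node);
        if (n == history_.end()) return nullptr;
        auto it = n->second.upper_bound(t);  // first entry with ts > t
        if (it == n->second.begin()) return nullptr;
        --it;
        return &it->second;
    }

    // The snapshot at t is complete once every member has been heard
    // from at a timestamp >= t (later reports cannot change it, given
    // FIFO delivery).
    bool Ready(const std::set<NodeID>& members, Timestamp t) const {
        for (const NodeID& m : members) {
            auto it = lastSeen_.find(m);
            if (it == lastSeen_.end() || it->second < t) return false;
        }
        return true;
    }

private:
    std::map<NodeID, std::map<Timestamp, State>> history_;
    std::map<NodeID, Timestamp> lastSeen_;
};
```

Replaying the slide's trace: after A reports at ts=2 and B at ts=6, the snapshot at 6 is not ready until A reports again at ts=10, at which point StateAt("A", 6) returns A's state from ts=2. The snapshot at 10 stays blocked on B until B is removed from the membership, after which B's last state from ts=6 is still available for the check.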