W4118 Operating Systems Instructor: Junfeng YangGoals Identify patterns of concurrency errors (so you can avoid them in your code) Learn techniques to detect concurrency errors(so you can apply these techniques to your (so you can apply these techniques to your code)1Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection2Concurrency error classification Deadlock: a situation wherein two or more processes are never able to proceed because each is waiting for the others to do something Key: circular waitRace condition: a timing dependent error Race condition: a timing dependent error involving shared state Data race: concurrent accesses to a shared variable and at least one access is a write Atomicity bugs: code does not enforce the atomicityprogrammers intended for a group of memory accesses Order bugs: code does not enforce the orderprogrammers intended for a group of memory accesses3Synchronization is hard. Why? Complex interactions: too many thread schedule (exponential to program size) Global complexity, can’t divide-and-conquer Synchronization cross-cuts abstraction boundaries Local correctness may not yield global correctness. Local correctness may not yield global correctness. i.e., properly synchronized modules don’t compose We’ll see a few error examples next4Example 1: good + bad badwithdraw() // no synchronization-- *balance;deposit() // properly sycnrhonizedlock();++ balance;unlock(); Result: race between deposit() and withdraw()5Example 2: good + good badint sum(Account *a1, Account *a2){return balance(a1) + balance(a2)int balance(Account *acnt){int b;void withdraw(Account *acnt){ lock(acnt->guard);-- acnt->balance;unlock(acnt->guard);}void deposit(Account *acnt){ lock(acnt->guard);++ acnt->balance;unlock(acnt->guard);} Compose single-account operations to operations on two accounts deposit(), withdraw() and balance() are properly synchronized sum() and transfer()? Racereturn balance(a1) + balance(a2)}void transfer(Account *a1, Account *a2){ withdraw(a1); deposit(a2);}int b;lock(acnt->guard);b = acnt->balance;unlock(acnt->guard);return b;}6Example 3: good + good deadlockint sum(Account *a1, Account *a2){ int s;lock(a1->guard);lock(a2->guard);s = a1->balance;s += a2->balance;unlock(a2->guard);unlock(a1->guard);T1:sum(a1, a2)T2:sum(a2, a1) 2ndattempt: use locks in sum() One sum() call, correct Two concurrent sum() calls? Deadlockunlock(a1->guard);return s}7Example 4: monitors don’t compose as wellMonitor M1 {cond_t cv;foo() {// releases monitor lockwait(cv); }bar() {Monitor M2 {f1() {M1.foo();}f2() {M1.bar();}};’ Usually bad to hold lock (in this case Monitor lock) across abstraction boundarybar() {signal(cv);}};’T1:M2.f1();T2:M2.f2();8Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection9Deadlock detection Root cause of deadlock: circular wait Detecting deadlock manually: system halts Can run debugger and see the wait cycleDetecting deadlock automatically: resource Detecting deadlock automatically: resource allocation graph Detecting potential deadlocks automatically: lock order10Resource allocation graph Nodes Locks (resources) Threads (processes) Edges Assignment edge: lock->thread•Removed on unlock()a1->guardT1: sum(a1,a2)T2: sum(a2,a1)•Removed on unlock() Request edge: thread->lock• Converted to assignment edges on lock() return Cycles deadlock Problem: can we detect potentialdeadlocks before we run into them?a2->guardResource allocation graph for example 3 deadlock11Detecting potential deadlocks Can deduce lock order: the order in which locks are acquired For each lock acquired, order with locks held Cycles in lock order potential deadlockT1:sum(a1, a2) // locks heldT2:sum(a1, a2) // locks helda1->guarda2->guardsum(a1, a2) // locks heldlock(a1->guard) // {}lock(a2->guard) // {a1->guard}sum(a1, a2) // locks heldlock(a2->guard) // {}lock(a1->guard) // {a2->guard}Cycle Potential deadlock!12Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection13Race detection We will look at only data race detection Techniques exist to detect atomicity and orderbugs, but we won’t discuss them in this classTwo approaches to data race detectionTwo approaches to data race detection Happens-before Lockset (Eraser’s algorithm)14Happens-before definition Event A happens-before event B if B follows A in the same thread A inT1, and B inT2, and a synchronization event C such that• A happens in T1•C is after A in T1 and before B in T2•C is after A in T1 and before B in T2• B in T215Happens-before race detection Tools before eraser are based on happens-before Sketch Monitor all data accesses and synch operationsWatch forWatch for• Access of v in thread T1• Access of v in thread T2• No synchronization operation between the accesses• One of the accesses is write16Problems with happens-before Problem I: expensive Requires per thread• List of accesses to shared data• List of synch operationsProblem II: false negativesProblem II: false negatives Happens-before looks for actual data races (moment in time when multiple threads access shared data w/o synchronization) Ignores programmer intention; the synchronization op between accesses may happen to be thereT1:++ ylock(m)unlock(m)T2:lock(m);unlock(m);++ y;17Eraser: a different approach Idea: check invariants Violations of invariants likely data races Invariant: the locking discipline Assume: accesses to shared variables are protected by locksby locks Every access is protected by at least one lock Any access unprotected by a lock an error Problem: how to find out what lock protects a variable? Linkage between locks and variables undeclared18Lockset algorithm: infer the locks Intuition: it must be one of the locks held at the time of access C(v): a set of candidate locks for protecting vInitialize C(v)to the set of all locksInitialize C(v)to the set of all locks On access to v by thread t, refine C(v) C(v) = C(v) ^ locks_held(t) If C(v) = {}, report error Question: is locks_held(t) per thread? Sounds good! But …19Implementing eraser Binary
View Full Document