Columbia COMS W4118 - Concurrency Error - D2193062

Home> Schools> Columbia University> (COMS) > COMS W4118> Concurrency Error

DOC PREVIEW

Columbia COMS W4118 - Concurrency Error

School name Columbia University

Course Coms W4118- Operating Systems I

Pages 29

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

W4118 Operating Systems Instructor: Junfeng YangGoals Identify patterns of concurrency errors (so you can avoid them in your code) Learn techniques to detect concurrency errors(so you can apply these techniques to your (so you can apply these techniques to your code)1Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection2Concurrency error classification Deadlock: a situation wherein two or more processes are never able to proceed because each is waiting for the others to do something Key: circular waitRace condition: a timing dependent error Race condition: a timing dependent error involving shared state Data race: concurrent accesses to a shared variable and at least one access is a write Atomicity bugs: code does not enforce the atomicityprogrammers intended for a group of memory accesses Order bugs: code does not enforce the orderprogrammers intended for a group of memory accesses3Synchronization is hard. Why? Complex interactions: too many thread schedule (exponential to program size) Global complexity, can’t divide-and-conquer Synchronization cross-cuts abstraction boundaries Local correctness may not yield global correctness. Local correctness may not yield global correctness. i.e., properly synchronized modules don’t compose We’ll see a few error examples next4Example 1: good + bad  badwithdraw() // no synchronization-- *balance;deposit() // properly sycnrhonizedlock();++ balance;unlock(); Result: race between deposit() and withdraw()5Example 2: good + good  badint sum(Account *a1, Account *a2){return balance(a1) + balance(a2)int balance(Account *acnt){int b;void withdraw(Account *acnt){ lock(acnt->guard);-- acnt->balance;unlock(acnt->guard);}void deposit(Account *acnt){ lock(acnt->guard);++ acnt->balance;unlock(acnt->guard);} Compose single-account operations to operations on two accounts deposit(), withdraw() and balance() are properly synchronized sum() and transfer()? Racereturn balance(a1) + balance(a2)}void transfer(Account *a1, Account *a2){ withdraw(a1); deposit(a2);}int b;lock(acnt->guard);b = acnt->balance;unlock(acnt->guard);return b;}6Example 3: good + good  deadlockint sum(Account *a1, Account *a2){ int s;lock(a1->guard);lock(a2->guard);s = a1->balance;s += a2->balance;unlock(a2->guard);unlock(a1->guard);T1:sum(a1, a2)T2:sum(a2, a1) 2ndattempt: use locks in sum() One sum() call, correct Two concurrent sum() calls? Deadlockunlock(a1->guard);return s}7Example 4: monitors don’t compose as wellMonitor M1 {cond_t cv;foo() {// releases monitor lockwait(cv); }bar() {Monitor M2 {f1() {M1.foo();}f2() {M1.bar();}};’ Usually bad to hold lock (in this case Monitor lock) across abstraction boundarybar() {signal(cv);}};’T1:M2.f1();T2:M2.f2();8Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection9Deadlock detection Root cause of deadlock: circular wait Detecting deadlock manually: system halts Can run debugger and see the wait cycleDetecting deadlock automatically: resource Detecting deadlock automatically: resource allocation graph Detecting potential deadlocks automatically: lock order10Resource allocation graph Nodes Locks (resources) Threads (processes) Edges Assignment edge: lock->thread•Removed on unlock()a1->guardT1: sum(a1,a2)T2: sum(a2,a1)•Removed on unlock() Request edge: thread->lock• Converted to assignment edges on lock() return Cycles  deadlock Problem: can we detect potentialdeadlocks before we run into them?a2->guardResource allocation graph for example 3 deadlock11Detecting potential deadlocks Can deduce lock order: the order in which locks are acquired For each lock acquired, order with locks held Cycles in lock order  potential deadlockT1:sum(a1, a2) // locks heldT2:sum(a1, a2) // locks helda1->guarda2->guardsum(a1, a2) // locks heldlock(a1->guard) // {}lock(a2->guard) // {a1->guard}sum(a1, a2) // locks heldlock(a2->guard) // {}lock(a1->guard) // {a2->guard}Cycle  Potential deadlock!12Outline Concurrency error patterns Concurrency error detection Deadlock detection Data race detection13Race detection We will look at only data race detection Techniques exist to detect atomicity and orderbugs, but we won’t discuss them in this classTwo approaches to data race detectionTwo approaches to data race detection Happens-before Lockset (Eraser’s algorithm)14Happens-before definition Event A happens-before event B if B follows A in the same thread A inT1, and B inT2, and a synchronization event C such that• A happens in T1•C is after A in T1 and before B in T2•C is after A in T1 and before B in T2• B in T215Happens-before race detection Tools before eraser are based on happens-before Sketch Monitor all data accesses and synch operationsWatch forWatch for• Access of v in thread T1• Access of v in thread T2• No synchronization operation between the accesses• One of the accesses is write16Problems with happens-before Problem I: expensive Requires per thread• List of accesses to shared data• List of synch operationsProblem II: false negativesProblem II: false negatives Happens-before looks for actual data races (moment in time when multiple threads access shared data w/o synchronization) Ignores programmer intention; the synchronization op between accesses may happen to be thereT1:++ ylock(m)unlock(m)T2:lock(m);unlock(m);++ y;17Eraser: a different approach Idea: check invariants Violations of invariants  likely data races Invariant: the locking discipline Assume: accesses to shared variables are protected by locksby locks Every access is protected by at least one lock Any access unprotected by a lock  an error Problem: how to find out what lock protects a variable? Linkage between locks and variables undeclared18Lockset algorithm: infer the locks Intuition: it must be one of the locks held at the time of access C(v): a set of candidate locks for protecting vInitialize C(v)to the set of all locksInitialize C(v)to the set of all locks On access to v by thread t, refine C(v) C(v) = C(v) ^ locks_held(t) If C(v) = {}, report error Question: is locks_held(t) per thread? Sounds good! But …19Implementing eraser Binary

View Full Document