MIT OpenCourseWare
http://ocw.mit.edu

6.033 Computer System Engineering, Spring 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms

6.033 Lecture 6: Client/server on one computer

Intro
  how to implement c/s on one computer
    valuable in itself
    involves concurrency, an independently interesting topic
    DP1 is all about concurrency
  what do we want? [diagram: X client, X server, NO KERNEL YET]
    client wants to send e.g. an image to the server
    goal: arms-length, so the X server is not vulnerable to client bugs
  idea: let the kernel manage the interaction
    client/server interact with the trusted kernel, not with each other
    [diagram: X client, X server, kernel]
  let's focus on one-way flow (can use two for RPC)
  buffer memory in the kernel
    each entry holds a message pointer
    send(m) to add a msg to the buffer
    receive(m) to read a msg out of the buffer
    finite buffer, so send() may have to wait
    buffer may be empty, so receive() may have to wait
  why does the buffer have multiple entries?
    sender / receiver rates may vary around an average
    let the sender accumulate a backlog when it's faster
    so the receiver has input when the sender is slower
    very much like a UNIX pipe

Problem: concurrency
  some data structure inside the kernel; send() and receive() modify it
  what if send() and receive() are active at the same time? they may interfere
  concurrency is a giant problem, will come up many times
  let's start simple:
    each program gets its own CPU
    there is only one memory system
    [diagram: two CPUs, one memory]
  system calls run on both CPUs!
    i.e. if program A calls send(), send() runs on A's CPU
  send() and receive() interact via the single shared memory system

Data structure: "bounded buffer" [diagram: BUFFER[5], IN, OUT]
  each array entry: pointer to a message buffer
  IN: number of messages put into the BB
  OUT: number of messages read out of the BB
  IN mod 5 is the next slot for send() to write
  OUT mod 5 is the next slot for receive() to look at
  example: in = 28, out = 26
    two messages waiting, in slots 1 and 2
    in > out => BB not empty
    in - out < 5 => not full

send() code slide
  p is a "port": points to an instance of a BB, so we can have many of them
    e.g. one per client/server pair, or one per UNIX pipe
  loop to wait until there's room ("busy-wait")
  write the slot
  increment the input count

receive() code slide
  loop to wait until there have been more sends than receives
  if there's a message, copy it out
  increment p.out AFTER copying the msg
    since p.out++ may signal send() to overwrite the slot,
    if send() is waiting for space

I believe this simple BB code works
  [show slide with both]
  even if send() and receive() are called at the same time
  concurrency rarely works out this well!

Assumptions for simple BB send/recv
  1. one sender, one receiver
  2. each has its own CPU (otherwise the busy-wait loop prevents the other from running)
  3. in and out don't overflow
  4. CPUs perform memory reads and writes in the order shown

oops! this code probably won't work as shown!
  the compiler might keep in/out in registers, and not see the other CPU's changes
  the CPU might increment p.in before writing buffer[]
  I will assume memory reads and writes happen in program order

Suppose we want multiple senders
  e.g. so many clients can send msgs to the X server
  would our send() work?

Concurrent send()
  A: send(p, m1)    B: send(p, m2)
  what do we *want* to happen? what would be the most useful behavior?
  goal: two msgs in the buffer, in == 2
  we don't care about the order

Example problem w/ concurrent send()
  A and B on different CPUs, at the same time, on the same p:
    A              B
    r in, out      r in, out
    w buf[0]       w buf[0]
    r in=0         r in=0
    w in=1         w in=1
  result: in = 1, one message in the bounded buffer, and one was lost!

This kind of bug is called a "race"
  once A puts data in the buffer, it has to hurry to finish with the increment of p.in!
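[The send()/receive() slides are not reproduced in these notes. The following is a minimal C sketch consistent with the description above -- the struct layout and the names bb, send, receive are assumptions, not the lecture's exact code. It depends on the same four assumptions: one sender, one receiver, no overflow of in/out, and memory reads/writes in program order (volatile merely discourages the compiler from caching in/out in registers; it does not order the CPU's memory operations). With two concurrent senders, the unlocked read/increment/write of p->in at the end of send() is exactly the race traced above.]

    #define N 5   /* buffer slots, matching the BUFFER[5] diagram */

    struct bb {
        void *buf[N];           /* each entry: pointer to a message */
        volatile unsigned in;   /* number of messages ever put in   */
        volatile unsigned out;  /* number of messages ever read out */
    };

    /* add a message; busy-waits while the buffer is full */
    void send(struct bb *p, void *msg)
    {
        while (p->in - p->out == N)
            ;                        /* busy-wait: no room */
        p->buf[p->in % N] = msg;     /* write the slot */
        p->in = p->in + 1;           /* then increment the input count */
    }

    /* read a message out; busy-waits while the buffer is empty */
    void *receive(struct bb *p)
    {
        void *msg;
        while (p->in == p->out)
            ;                        /* busy-wait: nothing to read */
        msg = p->buf[p->out % N];    /* copy the msg out first...        */
        p->out = p->out + 1;         /* ...then increment, since out++   */
        return msg;                  /* lets send() overwrite the slot   */
    }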
Other races in this code
  suppose only one slot is left
  A and B may both observe in - out < N
  and put *two* items in the buffer, overwriting the oldest entry

Races are a serious problem
  easy mistake to make -- send() looks perfectly reasonable!
  hard to find: depends on timing, may arise only infrequently
    e.g. Therac-25, where only experienced operators typed fast enough to trigger the bug

How to fix send()'s races?
  the original code assumed no one else was messing with p.in &c
  only one CPU at a time in send() == isolated execution
  can we restore that isolation?

Locks
  a lock is a data type with two operations:
    acquire(l)
    s1
    s2
    release(l)
  the lock contains state: locked or unlocked
  if you try to acquire a locked lock, acquire() waits until it's released
  if two acquire()s try to get a lock at the same time,
    one will succeed, the other will wait

How to fix send() with locking? [locking send() slide]
  associate a lock with each BB
  acquire before using the BB
  release only after done using the BB
  high-level view: no interleaving of multiple send()s
    only one send() at a time will be executing the guts of send()
    likely to be correct if the single-sender send() was correct
  (a C sketch of such a send() appears at the end of these notes)

Does it matter how send() uses the lock?
  move the acquire to after the IF? [slide]
  (then two senders could again both pass the full-check before either updates in --
   the check and the update must be inside the same isolated region)

Why a separate lock per bounded buffer?
  rather than e.g. all BBs sharing one lock
  that would allow only one BB to be active at a time
  but it's OK if send()s on different BBs are concurrent
  lock-per-BB improves performance / parallelism

Deadlock
  a big program can have thousands of locks, some per module
  once you have more than one lock in your system,
    you have to worry about deadlock
  deadlock: two CPUs each hold a lock, each waiting for the other to release
  example: implementing a file system
    need to ensure two programs don't modify a directory at the same time
    have a lock per directory
      create(d, name):
        acquire d.lock
        if name exists: error
        else create dir entry for name
        release d.lock
    what about moving a file from one dir to another, like mv?
      move(d1, name, d2):
        acquire d1.lock
        acquire d2.lock
        delete name from d1
        add name to d2
        release d2.lock
        release d1.lock
    what is the problem here?
    (concurrent move(d1, n1, d2) and move(d2, n2, d1): each acquires its first lock,
     then waits forever for the other's)

Avoiding deadlock
  look for all the places where multiple locks are held
  make sure, for every such place, the locks are acquired in the same order
  then there can be no locking cycles, and no deadlock
  for move(): sort the directories by i-number; lock the lower i-number first:
    if d1.inum < d2.inum:
      acquire d1.lock
      acquire d2.lock
    else:
      acquire d2.lock
      acquire d1.lock
  this can be painful: it requires global reasoning
    acquire l1
    print("...")
    does print() acquire a lock? could it deadlock with l1?
  the good news: deadlocks are not subtle once they occur

Lock granularity
  how to decide how many locks to have, and what each should protect?
  a spectrum, coarse vs. fine
  coarse: just one lock, or one lock per module
    e.g. one lock for the whole file system
    more likely to be correct
    but CPUs may wait/spin for the lock, wasting CPU time
      "serial execution": performance no better than one CPU
  fine: split the data into many pieces, each with a separate lock
    different CPUs can use different data, and thus different locks
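[The locking send() slide referenced above is also not reproduced. Here is a C sketch of the fix it describes, assuming a pthread mutex as the lock -- the pthread choice and all names are mine, not the lecture's. The lock is released and re-acquired inside the busy-wait loop: if send() held it while waiting for space, receive() could never acquire it to drain the buffer. receive() would need to bracket its own buffer accesses with the same lock. Acquiring before the full-check, rather than after the IF, is what makes the check and the updates one isolated action -- the answer to the "move acquire after the IF?" question above.]

    #include <pthread.h>

    #define N 5

    struct bb {
        void *buf[N];
        unsigned in, out;
        pthread_mutex_t lock;   /* one lock per bounded buffer */
    };

    /* add a message; safe with multiple concurrent senders */
    void send(struct bb *p, void *msg)
    {
        pthread_mutex_lock(&p->lock);        /* acquire before using the BB */
        while (p->in - p->out == N) {
            /* full: release so receive() can drain the buffer,
               then re-check with the lock held again */
            pthread_mutex_unlock(&p->lock);
            pthread_mutex_lock(&p->lock);
        }
        p->buf[p->in % N] = msg;             /* the full-check and these two  */
        p->in = p->in + 1;                   /* updates are now one isolated  */
        pthread_mutex_unlock(&p->lock);      /* action; release only when done */
    }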

