DOC PREVIEW
Berkeley COMPSCI 258 - Background for Debate on Memory Consistency Models

This preview shows page 1-2 out of 5 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS258 S991NOW Handout Page 1Background for Debate on MemoryConsistency ModelsCS 258, Spring 99David E. CullerComputer Science DivisionU.C. Berkeley4/21/99 CS258 S99 2Memory Consistency Model• for a SAS specifies constraints on the order inwhich memory operations (to the same ordifferent locations) can appear to execute withrespect to one another,• enabling programmers to reason about thebehavior and correctness of their programs.• fewer possible reorderings => more intuitive• more possible reorderings => allows for moreperformance optimization– ‘fast but wrong’ ?4/21/99 CS258 S99 3Multiprogrammed Uniprocessor Mem.Model• A MP system is sequentially consistent if the result of anyexecution is the same as if the operations of all theprocessors were executed in some sequential, and theoperations of each individual processor appear in thissequence in the order specified by its program (Lamport)P1P2 PnMemorysequential processorsissuing memory referencesas per program orderswitch is randomlyset after each memoryreferencelike ‘linearizability’ indatabase literature4/21/99 CS258 S99 4Reasoning with Sequential Consistency• program order: (a) → (b) and (c) → (d)“preceeds”• claim: (x,y) == (1,0) cannot occur– x == 1 => (b) → (c)– y == 0 => (d) → (a)– thus, (a) → (b) → (c) → (d) → (a)– so (a) → (a)initial: A, flag, x, y == 0p1 p2(a) A := 1; (c) x := flag;(b) flag := 1; (d) y := A4/21/99 CS258 S99 5Then again, . . .• Many variables are not used to effect the flow ofcontrol, but only to shared data– synchronizing variables– non-synchronizing variablesinitial: A, flag, x, y == 0p1 p2(a) A := 1; (c) x := flag; B := 3.1415 C := 2.78(b) flag := 1; (d) y := A+B+C4/21/99 CS258 S99 6Lamport’s Requirement for SC• Each processor issues memory requests in the orderspecified by its program.• Memory requests from all processors issued to anindividual memory module are serviced from a single FIFOqueue. Issuing a memory request consists of entering therequest on this queue.• How much overlap is possible?– non-memory operations ?– memory operations ?• Assumes stores execute atomically– newly written value becomes visible to all processors at the sametime» inserted into FIFO queue– not so with caches and general interconnectCS258 S992NOW Handout Page 24/21/99 CS258 S99 7Requirements for SC (Dubois &Scheurich)• Each processor issues memory requests in the orderspecified by the program.• After a store operation is issued, the issuing processorshould wait for the store to complete before issuing itsnext operation.• After a load operation is issued, the issuing processorshould wait for the load to complete, and for the storewhose value is being returned by the load to complete,before issuing its next operation.• the last point ensures that stores appear atomic to loads– note, in an invalidation-based protocol, if a processor has a copy of ablock in the dirty state, then a store to the block can completeimmediately, since no other processor could access an older value4/21/99 CS258 S99 8Architecture Implications• need write completion for atomicity and accessordering– w/o caches, ack writes– w/ caches, ack all invalidates• atomicity– delay access to new value till all inv. are acked• access ordering– delay each access till previous completesPMCAPMCA° ° °4/21/99 CS258 S99 9Summary of Sequential Consistency• Maintain order between shared access in eachthread– reads or writes wait for previous reads or writes to completeREAD WRITE WRITEREADREAD WRITE READ WRITE4/21/99 CS258 S99 10Do we really need SC?• Programmer needs a model to reason with– not a different model for each machine=> Define “correct” as same results as sequential consistency• Many programs execute correctly even without “strong”ordering• explicit synch operations order key accessesinitial: A, flag, x, y == 0p1 p2 A := 1; B := 3.1415 unlock (L) lock (L) ... = A; ... = B;4/21/99 CS258 S99 11Does SC eliminate synchronization?• No, still need critical sections, barriers, events– insert element into a doubly-linked list– generation of independent portions of an array• only ensures interleaving semantics of individual memoryoperations4/21/99 CS258 S99 12Is SC hardware enough?• No, Compiler can violate ordering constraints– Register allocation to eliminate memory accesses– Common subexpression elimination– Instruction reordering– Software Pipelining• Unfortunately, programming languages and compilers arelargely oblivious to memory consistency models– languages that take a clear stand, such as HPF too restrictiveP1 P2 P1 P2B=0 A=0 r1=0 r2=0A=1 B=1 A=1 B=1u=B v=A u=r1 v=r2B=r1 A=r2(u,v)=(0,0) disallowed under SC may occur hereCS258 S993NOW Handout Page 34/21/99 CS258 S99 13What orderings are essential?• Stores to A and B must complete before unlock• Loads to A and B must be performed after lockinitial: A, flag, x, y == 0p1 p2 A := 1; B := 3.1415 unlock (L) lock (L) ... = A; ... = B;4/21/99 CS258 S99 14How do we exploit this?• Difficult to automatically determine orders thatare not necessary• Relaxed Models:– hardware centric: specify orders maintained (or not) byhardware– software centric: specify methodology for writing “safe”programs• All reasonable consistency models retainprogram order as seen from each processor– i.e., dependence order– purely sequential code should not break!4/21/99 CS258 S99 15Hardware Centric Models• Processor Consistency (Goodman 89)• Total Store Ordering (Sindhu 90)• Partial Store Ordering (Sindhu 90)• Causal Memory (Hutto 90)• Weak Ordering (Dubois 86)READ WRITE WRITEREADREAD WRITE READ WRITEREAD WRITE WRITEREADREAD WRITE READ WRITE4/21/99 CS258 S99 16Properly Synchronized Programs• All synchronization operations explicitlyidentified• All data accesses ordered thoughsynchronizations– no data races!=> Compiler generated programs from structuredhigh-level parallel languages=> Structured programming in explicit thread code4/21/99 CS258 S99 17Complete Relaxed Consistency Model• System specification– what program orders among mem operations are preserved– what mechanisms are provided to enforce order explicitly,when desired• Programmer’s interface– what program annotations are available– what ‘rules’ must be followed to maintain the illusion of SC• Translation


View Full Document

Berkeley COMPSCI 258 - Background for Debate on Memory Consistency Models

Documents in this Course
Load more
Download Background for Debate on Memory Consistency Models
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Background for Debate on Memory Consistency Models and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Background for Debate on Memory Consistency Models 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?