CS6810 School of Computing University of Utah

Consistency & TM

Today’s topics:
    Consistency models
        the “when” of the CC-NUMA game
    Transactional Memory
        an alternative to lock-based synchronization
    additional reading: paper from HPCA 2006 on class web page

Consistency

• For DSM systems
    cache coherence
        » ensures multiple processors see a consistent memory view
        » does not answer “how consistent”
            • e.g. when are things truly consistent?
    consistency models
        » there are several … but first a little detour
• Programs share variables
    problem redefined
        » when does a write in some processor become visible to a read in another processor?
        » OR what properties must be enforced between reads and writes on different processors?

Consider

• Code sequences on 2 different processors
• If A and B are cached on both processors
    problem – inherent race*
        » P1 can read 2 possible values written by P2
            • and vice versa
        » what if the correct last write invalidate from P2 is not seen by P1 before the read?
            • non-deterministic program
• Question: should this behavior be allowed, and if so under what conditions?

P1: A = 0; (cycle 1) … A = 1; (cycle n) if (B == 0) … (cycle n+1)
P2: B = 0; (cycle 1) …
B = 1; (cycle n) if (A == 0) … (cycle n+1)

Enter Consistency Models

• Sequential consistency
    result of any execution must be the same
        » if memory accesses by each processor are kept in order
        » and the accesses of all processors are arbitrarily interleaved
            • note in previous code segments this won’t be the case
                – only way out here is for the programmer to synchronize the order
                – locks
    simplest implementation
        » sequence ALL memory transactions
        » for DSM this means
            • synchronizing all memory transactions at a global atomicity point
                – intractable due to performance impediments
    fence instructions (slightly better)
        » system-wide flush of all pending memory references
    pros and cons
        » + programmer sees a simple deterministic model
            • complicated - same as hazard solutions for pipelines: WAx (WAR/WAW) and RAW
        » - slow – hard to swallow in a parallel world

Program Synchronization

• Programmer must specify order that matters
    locks, barriers, whatever
    data-race-free behavior
    other non-determinacy is accepted
        » exception: concurrent writer problem
            • such as CC-NUMA/DSM write-invalidate protocol
    obvious problems
        » additional complexity pushed onto the programmer
            • more heinous for fine-grain locks
        » synchronization = serialization
            • defeats performance advantage of parallelism
• Relaxing consistency
    hardware allows some/most memory operations to happen out of order
        » several variants
        » programmer still has to control orderings that matter
            • critical sections, locks, …

Book Terminology

• Attempt to be compatible with your textbook
• X→Y implies X must happen before Y
    candidate values for X & Y are READS and WRITES (R, W)
    hence options
        » R→R
        » R→W
        » W→R
        » W→W
• Relaxing the R→R constraint
    this is essentially sequential consistency
        » although your book doesn’t view it this way
    think about it
        » reordering reads doesn’t change RAW or WAx hazards
        » the problem happens when you promote either reads or writes over a write!

Relaxed Consistency Models

• TSO – total store ordering
    relax W→R
    retain write ordering but allow reads to be reordered
        » note this changes RAW hazard behavior
        » fix: programmer synchronization
    benefit
        » write buffering works
        » lots of programs just want the latest value
    a.k.a. “processor consistency”
        » from a single-processor viewpoint
        » reordering reads doesn’t change anything
• PSO – partial store order
    relax W→W ordering (WAW hazards don’t matter)
        » note this does NOT mean to the same location
            • requires total memory disambiguation
                – easy at main memory where physical addresses are used
            • independent writes can be reordered

Relaxed Consistency II

• Weak ordering & release consistency
    relaxing R→W and R→R ordering
        » meaning – don’t care about WAR or RAW hazards
    reality
        » if threads rarely interact
            • reduced synchronization, improved parallelism
            • memory system can respond out of order
            • big performance gain
• Problem w/ relaxed consistency
    programmer needs to know what the compiler/hardware supports
        » may be able to specify the consistency model
        » bottom line
            • programmer needs to explicitly synchronize the things that matter
                – in a concurrent world this can’t be avoided anyway

What’s the Point?
• Modes in today’s hardware allow various consistency models
    more relaxed is potentially faster
        » but programmer needs to know what to explicitly synchronize
        » and this depends on the mode
    vocabulary
        » changes a bit by vendor
        » previous terminology is the most common

Enter TM

• Transactional Memory
    original idea from TK@MIT
        » take a database idea and apply it to the shared memory problem
    note that there are lots of variants
        » idea today is a shallow dive into the space
    basic idea
        » program: transaction consists of an atomic block
            • read stuff
            • do something
            • write stuff
        » if nothing else interacts w/ stuff then all is well
            • otherwise abort and don’t do anything destructive
                – e.g. write to memory
        » similar to svn
            • e.g. version management but with all-or-nothing success idea

What Changes?

• New TM model simplifies programming
    lock-unlock replaces the
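
Example sketch (TM basic idea): the read / do something / write / abort-on-interference steps from the "Enter TM" slide can be approximated in software as below. This is only a minimal optimistic sketch under stated assumptions, not the hardware scheme from the lecture or the HPCA paper. The names TVar, Transaction, Abort and atomically are hypothetical, writes are buffered until commit, and conflicts are detected by comparing per-variable version counters.

```python
import threading


class TVar:
    """A shared memory word plus a version counter (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Abort(Exception):
    """Raised at commit time when another transaction interfered."""


# Assumption: one global lock serializes only the short commit step.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.read_versions = {}  # TVar -> version seen at first read
        self.write_buffer = {}   # TVar -> pending value, invisible to others

    def read(self, tvar):
        if tvar in self.write_buffer:          # read our own pending write
            return self.write_buffer[tvar]
        self.read_versions.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        self.write_buffer[tvar] = value        # nothing destructive yet

    def commit(self):
        with _commit_lock:
            # Validate: everything we read must be unchanged since we read it.
            for tvar, seen in self.read_versions.items():
                if tvar.version != seen:
                    raise Abort()
            # All or nothing: publish every buffered write.
            for tvar, value in self.write_buffer.items():
                tvar.value = value
                tvar.version += 1


def atomically(body, max_retries=1000):
    """Run body(tx) as a transaction, retrying from scratch on abort."""
    for _ in range(max_retries):
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Abort:
            continue
    raise RuntimeError("transaction kept aborting")
```

A caller would write something like `atomically(lambda tx: tx.write(x, tx.read(x) + 1))` where it would otherwise bracket the update with lock and unlock. An aborted attempt leaves memory untouched because its writes never left the buffer.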
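
Example sketch (consistency models): returning to the two-processor code sequence from the "Consider" slide, the difference between sequential consistency and a TSO-style relaxation of W→R can be checked by brute force. The sketch below is a toy model, not a description of any real machine. It assumes A and B start at 0 and that each processor has a FIFO store buffer it can read its own pending writes from. The function names sc_outcomes and tso_outcomes are hypothetical. Each outcome is the pair (value of B read by P1, value of A read by P2).

```python
from itertools import combinations

# The two code sequences from the slide: write 0, write 1, then read the other.
P1 = [("w", "A", 0), ("w", "A", 1), ("r", "B")]
P2 = [("w", "B", 0), ("w", "B", 1), ("r", "A")]


def sc_outcomes(p1, p2):
    """Outcomes over every interleaving that keeps each program in order."""
    outcomes = set()
    total = len(p1) + len(p2)
    for slots in combinations(range(total), len(p1)):  # positions used by P1
        mem = {"A": 0, "B": 0}  # assumed initial contents
        reads = [None, None]
        idx = [0, 0]
        for pos in range(total):
            k = 0 if pos in slots else 1
            op = (p1, p2)[k][idx[k]]
            idx[k] += 1
            if op[0] == "w":
                mem[op[1]] = op[2]       # write is globally visible at once
            else:
                reads[k] = mem[op[1]]
        outcomes.add(tuple(reads))
    return outcomes


def tso_outcomes(p1, p2):
    """Outcomes when writes sit in per-processor FIFO store buffers."""
    progs = (p1, p2)
    outcomes, seen = set(), set()
    # State: program counters, store buffers, memory, values read so far.
    stack = [((0, 0), ((), ()), (("A", 0), ("B", 0)), (None, None))]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, reads = state
        if pcs == (len(p1), len(p2)):
            outcomes.add(reads)
            continue
        for k in (0, 1):
            if bufs[k]:  # option 1: drain processor k's oldest buffered write
                addr, val = bufs[k][0]
                new_mem = tuple(sorted({**dict(mem), addr: val}.items()))
                new_bufs = tuple(bufs[x][1:] if x == k else bufs[x] for x in (0, 1))
                stack.append((pcs, new_bufs, new_mem, reads))
            if pcs[k] < len(progs[k]):  # option 2: run k's next instruction
                op = progs[k][pcs[k]]
                new_pcs = tuple(pcs[x] + (x == k) for x in (0, 1))
                if op[0] == "w":
                    new_bufs = tuple(
                        bufs[x] + ((op[1], op[2]),) if x == k else bufs[x]
                        for x in (0, 1)
                    )
                    stack.append((new_pcs, new_bufs, mem, reads))
                else:
                    # A read checks the processor's own buffer first, then memory.
                    pending = [v for a, v in bufs[k] if a == op[1]]
                    val = pending[-1] if pending else dict(mem)[op[1]]
                    new_reads = tuple(val if x == k else reads[x] for x in (0, 1))
                    stack.append((new_pcs, bufs, mem, new_reads))
    return outcomes
```

In this model `sc_outcomes(P1, P2)` never contains (0, 0), so at most one of the two `if` tests can succeed, while `tso_outcomes(P1, P2)` does contain (0, 0) because each read can pass the other processor's still-buffered write. That is the case the programmer has to rule out with explicit synchronization.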