Shared Memory Multiprocessors

  - Symmetric Multiprocessors (SMPs):
      - Symmetric access to all of main memory from any processor.
  - Currently dominate the high-end server market:
      - Building blocks for larger systems; arriving to desktop.
  - Attractive as high-throughput servers and for parallel programs:
      - Fine-grain resource sharing.
      - Uniform access via loads/stores.
      - Automatic data movement and coherent replication in caches.
  - Normal uniprocessor mechanisms used to access data (reads and writes).
      - Key is extension of memory hierarchy to support multiple processors.

EECC756 - Shaaban, lec 11, Spring 2002, 4-25-2002

Supporting Programming Models

  [Figure: layers of abstraction. Labels: programming models (message passing, shared address space, multiprogramming); compilation or library; communication abstraction; user/system boundary; operating systems support; hardware/software boundary; communication hardware; physical communication medium.]

  - Address translation and protection in hardware (hardware SAS).
  - Message passing using shared memory buffers:
      - Can offer very high performance since no OS involvement is necessary.
  - The focus here is on supporting a coherent shared address space.

Shared Memory Multiprocessors Variations

  [Figure: four organizations. (a) Shared cache: P1..Pn, switch, interleaved first-level cache, interleaved main memory. (b) Bus-based shared memory: P1..Pn, bus, Mem, I/O devices. (c) Dancehall: P1..Pn, interconnection network, Mem. (d) Distributed memory: P1..Pn, Mem, interconnection network.]

Caches And Cache Coherence In Shared Memory Multiprocessors

  - Caches play a key role in all shared memory multiprocessor system variations:
      - Reduce average data access time.
      - Reduce bandwidth demands placed on the shared interconnect.
  - Private processor caches create a problem:
      - Copies of a variable can be present in multiple caches.
      - A write by one processor may not become visible to others:
          - Processors keep accessing the stale value in their private caches.
      - The same problem arises from process migration and I/O activity.
      - This is the cache coherence problem.
      - Software and/or hardware actions are needed to ensure write visibility to all processors, thus maintaining cache coherence.

Data Sharing, Process Migration Cache Coherence Problems

  See handout: Figure 7.12 in Advanced Computer Architecture: Parallelism, Scalability, Programmability, Kai Hwang.

I/O Operation Cache Inconsistency

  See handout: Figure 7.13 in Advanced Computer Architecture: Parallelism, Scalability, Programmability, Kai Hwang.

Shared cache multiprocessor systems:
  - Low-latency sharing and prefetching across processors.
  - Sharing of working sets.
  - No coherence problem (and hence no false sharing either).
  - But high bandwidth needs and negative interference (e.g. conflicts).
  - Hit and miss latency increased due to intervening switch and cache size.
  - Used in the mid 80s to connect a few processors on a board (Encore, Sequent).
  - Today: promising for multiprocessor-on-a-chip for small-scale systems or nodes.

Dancehall:
  - Not a popular design: resources are uniformly costly to access for all processors.

Distributed memory:
  - Most popular design to build scalable systems (i.e. MPPs).

A Coherent Memory System: Intuition

  - Reading a location should return the latest value written (by any process).
  - Easy to achieve in uniprocessors:
      - Except for I/O: coherence between I/O devices and processors.
      - Infrequent, so software solutions work: uncacheable memory, uncacheable operations, flush pages, pass I/O data through caches.
  - The same should hold when processes run on different processors:
      - E.g. as if the processes were interleaved on a uniprocessor.
  - Coherence problem much more critical in multiprocessors:
      - Pervasive.
      - Performance-critical.
      - Must be treated as a basic hardware design issue.

Example Cache Coherence Problem

  [Figure: processors P1, P2 and P3 with private caches, memory holding u = 5, and I/O devices; events numbered 1 to 5 operate on u, including a write of u = 7.]

  - Processors see different values for u after event 3.
  - With write-back caches, a value updated in a cache may not have been written back to memory:
      - Even processes accessing main memory may see a very stale value.
  - Unacceptable for correct program execution.

Basic Definitions

  Extend definitions in uniprocessors to multiprocessors:
  - Memory operation: a single read (load), write (store) or read-modify-write access to a memory location.
      - Assumed to execute atomically w.r.t. each other.
  - Issue: a memory operation issues when it leaves the processor's internal environment and is presented to the memory system (cache, buffer, ...).
  - Perform: the operation appears to have taken place, as far as the processor can tell from other memory operations it issues.
      - A write performs w.r.t. the processor when a subsequent read by the processor returns the value of that write or a later write.
      - A read performs w.r.t. the processor when subsequent writes issued by the processor cannot affect the value returned by the read.
  - In multiprocessors, the definitions stay the same, but replace "the processor" by "a processor".
      - Also, complete: perform with respect to all processors.
      - Still need to make sense of order in operations from different processes.

Shared Memory Access Primitives

  - A load by processor Pi is performed with respect to processor Pk at a point in time when the issuing of a store to the same location by Pk cannot affect the value returned by the load.
  - A store by Pi is considered performed with respect to Pk at a point in time when a load from the same address by Pk returns the value written by this store.
  - A load is globally performed if it is performed with respect to all processors and if the store that is the source of the returned value has been performed with respect to all processors.

Formal Definition of Coherence

  - Results of a program: values returned by its read operations.
  - A memory system is coherent if the results of any execution of a program are such that, for each location, it is possible to construct a hypothetical serial order of all operations to the location that is consistent with the results of the execution and in which:
      1. operations issued by any particular process occur in the order issued by that process, and
      2. the value returned by a read is the value written by the last write to that location in the serial order.
  - Two necessary features:
      - Write propagation: the value written must become visible to others.
      - Write serialization: writes to a location are seen in the same order by all; if one processor sees w1 after w2, another processor should not see w2 ...
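
The formal definition of coherence above can be made concrete with a small checker. The following is only a sketch under stated assumptions, not part of the original slides: it looks at the operations on a single location, encodes each operation as a hypothetical ("R", value) or ("W", value) tuple, and searches for a serial order that respects each process's issue order (condition 1) and makes every read return the most recent write (condition 2). The function name has_serial_order and the trace encoding are illustrative choices.

```python
def has_serial_order(histories, initial_value):
    """Return True if some serial order of all operations on ONE location
    (a) keeps each process's operations in issue order, and
    (b) makes every read return the value of the last preceding write.

    histories: dict mapping a process name to its list of operations,
               each ("R", value_read) or ("W", value_written).
               (Hypothetical encoding chosen for this sketch.)
    """
    procs = list(histories)
    start = (tuple(0 for _ in procs), initial_value)
    seen = set()
    stack = [start]
    while stack:
        pos, value = stack.pop()
        if (pos, value) in seen:
            continue
        seen.add((pos, value))
        # Every process has had all of its operations placed: order found.
        if all(pos[i] == len(histories[p]) for i, p in enumerate(procs)):
            return True
        # Try to place the next pending operation of each process.
        for i, p in enumerate(procs):
            if pos[i] == len(histories[p]):
                continue
            kind, v = histories[p][pos[i]]
            nxt = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
            if kind == "W":
                stack.append((nxt, v))        # a write becomes the latest value
            elif v == value:
                stack.append((nxt, value))    # a read must match the latest write
    return False


# Two writes, w1 (value 1) and w2 (value 2), observed by two reader processes.
same_order = {
    "P1": [("W", 1)],
    "P2": [("W", 2)],
    "P3": [("R", 1), ("R", 2)],
    "P4": [("R", 1), ("R", 2)],
}
opposite_order = {
    "P1": [("W", 1)],
    "P2": [("W", 2)],
    "P3": [("R", 1), ("R", 2)],
    "P4": [("R", 2), ("R", 1)],
}

print(has_serial_order(same_order, 0))      # True
print(has_serial_order(opposite_order, 0))  # False
```

In the second trace P3 and P4 observe the two writes in opposite orders, so no single serial order can explain both sets of read results. This is the situation that write serialization rules out.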


RIT EECC 756 - Shared Memory Multiprocessors
