UT CS 378 - Shared-Memory Architectures


Shared-memory Architectures
Adapted from a lecture by Ian Watson, University of Manchester

Outline
Shared-memory Architectures
Overview
Bus-based Shared Memory Organization
Organization
Problem of Memory Coherence
Example
Bus Snooping
Snooping Protocols
Snooping Protocols (continued)
Update or Invalidate?
Update or Invalidate? (continued)
Implementation Issues
MESI Protocol (1)
MESI Protocol (2)
MESI Protocol (3)
MESI Protocol (4)
MESI Local Read Hit
MESI Local Read Miss (1)
MESI Local Read Miss (2)
MESI Local Read Miss (3)
MESI Local Write Hit (1)
MESI Local Write Hit (2)
MESI Local Write Miss (1)
MESI Local Write Miss (2)
MESI Local Write Miss (3)
MESI Local Write Miss (4)
Putting it all together
MESI – locally initiated accesses
MESI – remotely initiated accesses
MESI notes
Directory Schemes
Basic Scheme (Censier & Feautrier)
Key Issues

Overview
• We have talked about shared-memory programming with threads, locks, and condition variables in the context of a single processor.
• Now let us look at how such programs can be run on a multiprocessor.
• Two architectures:
  – Bus-based shared-memory machines (small-scale)
  – Directory-based shared-memory machines (large-scale)

Bus-based Shared Memory Organization
Basic picture is simple:
[Figure: three CPUs, each with its own cache, connected by a shared bus to shared memory]

Organization
• Bus is usually a simple physical connection (wires)
• Bus bandwidth limits the number of CPUs
• Could be multiple memory elements
• For now, assume that each CPU has only a single level of cache

Problem of Memory Coherence
• Assume just single-level caches and main memory
• Processor writes to a location in its cache
• Other caches may hold shared copies; these will be out of date
• Updating main memory alone is not enough

Example
[Figure: CPUs 1, 2 and 3, each with its own cache, connected by a shared bus to shared memory holding X: 24]
• Processor 1 reads X: obtains 24 from memory and caches it
• Processor 2 reads X: obtains 24 from memory and caches it
• Processor 1 writes 32 to X: its locally cached copy is updated
• Processor 3 reads X: what value should it get?
  – Memory and processor 2 think it is 24
  – Processor 1 thinks it is 32
• Notice that having write-through caches is not good enough

Bus Snooping
• A scheme where every CPU knows who has a copy of its cached data is far too complex.
• So each CPU (cache system) ‘snoops’ (i.e. watches continually) for write activity concerned with data addresses which it has cached.
• This assumes a bus structure which is ‘global’, i.e. all communication can be seen by all.
• More scalable solution: ‘directory based’ coherence schemes

Snooping Protocols
• Write Invalidate
  – CPU wanting to write to an address grabs a bus cycle and sends a ‘write invalidate’ message
  – All snooping caches invalidate their copy of the appropriate cache line
  – CPU writes to its cached copy (assume for now that it also writes through to memory)
  – Any shared read in other CPUs will now miss in cache and re-fetch the new data.

Snooping Protocols (continued)
• Write Update
  – CPU wanting to write grabs a bus cycle and broadcasts the new data as it updates its own copy
  – All snooping caches update their copy
• Note that in both schemes, the problem of simultaneous writes is taken care of by bus arbitration: only one CPU can use the bus at any one time.

Update or Invalidate?
• Update looks the simplest, most obvious and fastest, but:
  – Multiple writes to the same word (no intervening read) need only one invalidate message but would require an update for each
  – Multiple writes within the same (usual) multi-word cache block require only one invalidate but would require multiple updates.

Update or Invalidate? (continued)
• Due to both spatial and temporal locality, the previous cases occur often.
• Bus bandwidth is a precious commodity in shared-memory multiprocessors
• Experience has shown that invalidate protocols use significantly less bandwidth.
• Will consider implementation details only of invalidate.

Implementation Issues
• In both schemes, knowing that a cached value is not shared (no copy in another cache) lets the writer avoid sending any messages.
• The invalidate description assumed that a cache value update was written through to memory. If we used a ‘copy back’ scheme, other processors could re-fetch the old value on a cache miss.
• We need a protocol to handle all this.

MESI Protocol (1)
• A practical multiprocessor invalidate protocol which attempts to minimize bus usage.
• Allows usage of a ‘write back’ scheme, i.e. main memory is not updated until a ‘dirty’ cache line is displaced
• Extension of the usual cache tags, i.e. the invalid tag and ‘dirty’ tag in a normal write-back cache.

MESI Protocol (2)
Any cache line can be in one of 4 states (2 bits)
• Modified: cache line has been modified, is different from main memory, and is the only cached copy (multiprocessor ‘dirty’)
• Exclusive: cache line is the same as main memory and is the only cached copy
• Shared: same as main memory, but copies may exist in other caches
• Invalid: line data is not valid (as in a simple cache)

MESI Protocol (3)
• Cache line changes state as a function of memory access events.
• Event may be either
  – Due to local processor activity (i.e. cache access)
  – Due to bus activity, as a result of snooping
• Cache line has its own state affected only if the address matches

MESI Protocol (4)
• Operation can be described informally by looking at action in the local processor
  – Read Hit
  – Read Miss
  – Write Hit
  – Write Miss
• More formally by a state transition diagram

MESI Local Read Hit
• Line must be in one of M, E, S
• This must be the correct local value (if M, it must have been modified locally)
• Simply return value
• No state change

MESI Local Read Miss (1)
• No other copy in caches
  – Processor makes bus request to memory
  – Value read to local cache, marked E
• One cache has E copy
  – Processor makes bus request to memory
  – Snooping cache puts copy value on the bus
  – Memory access is abandoned
  – Local processor caches value
  – Both lines set to S

MESI Local Read Miss (2)
• Several caches have S copy
  – Processor makes bus request to memory
  – One cache puts copy value on the bus (arbitrated)
  – Memory access is abandoned
  – Local processor caches value
  – Local copy set to S
  – Other copies remain S

MESI Local Read Miss (3)
• One cache has M copy
  – Processor makes bus request to memory
  – Snooping cache puts copy value on the bus
  – Memory access is abandoned
  – Local processor caches value
  – Local copy tagged S
  – Source (M) value copied back to memory
  – Source value M -> S

MESI Local Write Hit (1)
Line must be one of M, E, S
• M
  – line is exclusive and already ‘dirty’
  – Update local cache value
  – no state
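
The Example slide (X: 24 in shared memory, three processors) can be replayed with a small simulation. This is only a sketch under stated assumptions: the `Cache` class, the `run_example` function and the `write_through` flag are hypothetical names introduced here, and the caches deliberately implement no coherence protocol at all.

```python
class Cache:
    """A private cache with NO coherence protocol (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory, a dict of address -> value
        self.lines = {}        # locally cached copies

    def read(self, addr):
        # On a miss, fetch the value from main memory and cache it.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value, write_through=False):
        # Always update the local copy; optionally also update main memory.
        self.lines[addr] = value
        if write_through:
            self.memory[addr] = value


def run_example(write_through):
    memory = {"X": 24}
    p1, p2, p3 = Cache(memory), Cache(memory), Cache(memory)
    p1.read("X")                       # Processor 1 reads X and caches 24
    p2.read("X")                       # Processor 2 reads X and caches 24
    p1.write("X", 32, write_through)   # Processor 1 writes 32 to X
    return {
        "P1": p1.read("X"),
        "P2": p2.read("X"),
        "P3": p3.read("X"),
        "memory": memory["X"],
    }


print(run_example(write_through=False))
print(run_example(write_through=True))
```

Without write-through, processors 2 and 3 and memory all still see 24. With write-through, memory and processor 3 see 32, but processor 2 keeps reading its stale cached 24, which is why the slide notes that write-through caches alone are not good enough.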

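The bandwidth argument from the "Update or Invalidate?" slides can be made concrete by counting bus messages for a short access trace. The model below is a rough sketch, not a measurement: `count_bus_messages`, the trace format and `words_per_block=4` are assumptions made for illustration, and only miss fetches and coherence broadcasts are counted (write-through traffic to memory is ignored because it is the same for both schemes).

```python
def count_bus_messages(trace, protocol, words_per_block=4):
    """Count bus messages for a trace of (cpu, op, word_address) tuples.

    protocol is "invalidate" or "update". A write by a CPU that holds
    the only cached copy of a block sends no message in either scheme.
    """
    holders = {}   # block number -> set of CPUs holding a valid copy
    messages = 0
    for cpu, op, addr in trace:
        block = addr // words_per_block
        sharers = holders.setdefault(block, set())
        if op == "read":
            if cpu not in sharers:
                messages += 1          # read miss: fetch the block over the bus
                sharers.add(cpu)
        else:
            if sharers - {cpu}:
                messages += 1          # one invalidate or one update broadcast
                if protocol == "invalidate":
                    sharers.clear()    # other caches drop their copies
            sharers.add(cpu)
    return messages


# CPUs 0 and 1 both read the block, then CPU 0 writes each of its four
# words once and rewrites word 0 three more times.
trace = [(0, "read", 0), (1, "read", 0)]
trace += [(0, "write", a) for a in (0, 1, 2, 3)]
trace += [(0, "write", 0)] * 3

print(count_bus_messages(trace, "invalidate"))   # 2 fetches + 1 invalidate = 3
print(count_bus_messages(trace, "update"))       # 2 fetches + 7 updates = 9
```

In this trace the single invalidate removes CPU 1's copy, so every later write is silent, whereas under update CPU 1 stays a sharer and each of the seven writes costs a broadcast.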

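The local read hit and read miss cases listed in the MESI slides can be summarised as one transition function over the states of a single line in every cache. This is a minimal sketch assuming the behaviour exactly as described on those slides; the names `State` and `local_read` and the returned tuple layout are hypothetical, and write accesses are not modelled.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def local_read(states, reader):
    """Apply a local read by cache `reader` to one line.

    states holds the line's State in each cache. Returns a tuple
    (new_states, source_of_data, copied_back_to_memory).
    """
    new = list(states)
    if states[reader] is not State.I:
        # Read hit: line is M, E or S; return the value, no state change.
        return new, "local cache", False

    owners = [i for i, s in enumerate(states)
              if i != reader and s is not State.I]
    if not owners:
        # Read miss, no other cached copy: fetch from memory, mark E.
        new[reader] = State.E
        return new, "memory", False

    # Read miss, another cache has the line: it supplies the value and
    # the memory access is abandoned. An M copy is also copied back.
    copied_back = any(states[i] is State.M for i in owners)
    for i in owners:
        new[i] = State.S       # E -> S, M -> S, S stays S
    new[reader] = State.S
    return new, "another cache", copied_back


I, E, S, M = State.I, State.E, State.S, State.M
print(local_read([I, I, I], 0))   # no other copy
print(local_read([I, E, I], 0))   # one cache has an E copy
print(local_read([I, M, I], 0))   # one cache has an M copy
```

Each call corresponds to one bullet group in the read miss slides: the first leaves the reader in E, the second and third leave both the reader and the supplying cache in S, and only the third reports a copy-back to memory.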