U of U CS 7810 - Shared L2 Caches through OS-Level Page Allocation


Managing Distributed, Shared L2 Caches through OS-Level Page Allocation
Sangyeun Cho, Lei Jin (MICRO 2006)

Claims
  - Manage L2 cache through OS-level page allocation
  - Flexible without complex hardware support
  - Dynamically control data placement and cache sharing

Example Chip and Tile (Core)

L2 Cache Allocation
  - Traditionally S = A mod N
  - Proposed change to S = PPN mod N
  - Allows the OS to choose the virtual-to-physical mapping (the PPN chooses the slice)

Line Granularity

Page Granularity

Congruence Group
  - CGi = {phys page (PPN=j) | pmap(j) = i}
  - Used to map a physical page to a core
  - Convenient to use modulo-N on PPN for pmap

Caching Schemes
  - Private caching
      - OS allocates private pages for Pi running on core i from CGi
  - Shared caching
      - Pages allocated from all congruence groups {CGi} (0 ≤ i ≤ N-1)
      - Round robin or random

Hybrid Caching Scheme
  - Partition {CGi} into K groups (K<N)
  - Allocate pages from that group for a core within that group
  - Allows sharing within a group

OS Modifications
  - N free lists instead of a single free list
  - Depends on the cache scheme
  - Must consider existing data mappings
  - Makes allocation more complex

Page Spreading
  - When the local L2 slice is too small for the working set
  - Need to consider data proximity to reduce the number of network hops
  - Also must consider cache pressure
      - number of accessed pages/cache size

Data Proximity

Bloom Filter Monitor
  - Keeps track of pages accessed
  - Low overhead
  - 512-kB cache slice
  - 8-kB page
  - 512-byte filter
  - <0.5% false positive

Virtual Multicore!

Simulator Setup
  - SimpleScalar
  - 16 tiles (4x4 mesh) (2 cycle hop)
  - Single issue
  - 16kB L1 I/D caches (1 cycle)
  - 512kB L2 cache slice (8 cycles)
  - 2GB main memory (300 cycles)

Results

Parallel Workloads

How to Kill Cache Coherence
  - Goal: Reduce overhead of cache coherence
  - Also some of the messiness
  - Granularity issue
  - OS independent
  - Lower storage overhead and
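
The slice-selection rules (S = A mod N versus S = PPN mod N), the congruence groups, and the private, shared, and hybrid caching schemes from the slides can be illustrated with a small sketch. This is not the authors' implementation: the class `CongruenceAllocator`, its method names, the 64-byte line size, and the assumption that K divides N evenly are all hypothetical choices made for illustration. Only the 8-kB page size, pmap(j) = j mod N, and the idea of N free lists come from the slides.

```python
from collections import deque

PAGE_SHIFT = 13  # 8-kB pages, as listed on the Bloom Filter Monitor slide
LINE_SHIFT = 6   # ASSUMPTION: 64-byte cache lines (not stated in the slides)


def slice_line_granularity(addr, n_slices):
    """Traditional mapping S = A mod N, taking A as the cache-line address."""
    return (addr >> LINE_SHIFT) % n_slices


def slice_page_granularity(addr, n_slices):
    """Proposed mapping S = PPN mod N: the physical page number picks the slice."""
    return (addr >> PAGE_SHIFT) % n_slices


class CongruenceAllocator:
    """Hypothetical page allocator keeping N free lists, one per congruence group."""

    def __init__(self, n_slices, n_pages):
        self.n = n_slices
        self.free = [deque() for _ in range(n_slices)]
        for ppn in range(n_pages):
            # pmap(j) = j mod N places physical page j in congruence group CG_i
            self.free[ppn % n_slices].append(ppn)
        self._shared_rr = 0   # round-robin cursor for shared caching
        self._hybrid_rr = {}  # per-cluster round-robin cursors

    def alloc_private(self, core):
        """Private caching: core i only receives pages from CG_i."""
        return self.free[core].popleft()

    def alloc_shared(self):
        """Shared caching: rotate round-robin over all congruence groups."""
        group = self._shared_rr
        self._shared_rr = (self._shared_rr + 1) % self.n
        return self.free[group].popleft()

    def alloc_hybrid(self, core, k):
        """Hybrid caching: rotate only over the groups in the core's cluster.

        ASSUMPTION: K divides N and clusters are contiguous ranges of cores.
        """
        size = self.n // k
        base = (core // size) * size
        idx = self._hybrid_rr.get(base, 0)
        self._hybrid_rr[base] = (idx + 1) % size
        return self.free[base + idx].popleft()
```

With 16 slices, two addresses 64 bytes apart land on different slices under line granularity but on the same slice under page granularity, which is what lets the OS steer data placement by choosing the PPN. The sketch omits the handling of an exhausted free list, which is where the page-spreading policy would come in.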
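
The Bloom filter monitor and the cache-pressure ratio (number of accessed pages / cache size) can be sketched in the same spirit. The slides give only the sizes (512-kB slice, 8-kB page, 512-byte filter); the number of hash functions, the use of SHA-256 to derive bit positions, and the names `PageBloomFilter` and `cache_pressure` are assumptions for illustration. A hardware monitor would use much cheaper hash functions.

```python
import hashlib


class PageBloomFilter:
    """Hypothetical software model of a per-slice page-access monitor."""

    def __init__(self, n_bits=512 * 8, n_hashes=4):
        # 512-byte filter from the slides; n_hashes=4 is an ASSUMPTION
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)
        self.distinct = 0  # pages counted as new (may undercount on false positives)

    def _positions(self, ppn):
        # Derive n_hashes bit positions from one SHA-256 digest (illustrative only)
        digest = hashlib.sha256(ppn.to_bytes(8, "little")).digest()
        for i in range(self.n_hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "little") % self.n_bits

    def access(self, ppn):
        """Record an access; return True if the page was not seen before."""
        new = False
        for pos in self._positions(ppn):
            byte, bit = divmod(pos, 8)
            if not (self.bits[byte] >> bit) & 1:
                self.bits[byte] |= 1 << bit
                new = True
        if new:
            self.distinct += 1
        return new


def cache_pressure(monitor, slice_bytes=512 * 1024, page_bytes=8 * 1024):
    """Accessed pages divided by the slice capacity measured in pages."""
    return monitor.distinct / (slice_bytes // page_bytes)
```

A 512-kB slice holds 64 pages of 8 kB, so a pressure value above 1.0 in this model would suggest that the working set no longer fits in the local slice and that page spreading is worth considering.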

