U of U CS 7810 - Shared L2 Caches through OS-Level Page Allocation - D2525252


School: University of Utah

Course: CS 7810 - Advanced Computer Architecture


Managing Distributed, Shared L2 Caches through OS-Level Page Allocation
Sangyeun Cho, Lei Jin (MICRO 2006)

Claims
  Manage the L2 cache through OS-level page allocation
  Flexible without complex hardware support
  Dynamically control data placement and cache sharing

Example Chip and Tile (Core)

L2 Cache Allocation
  Traditionally S = A mod N
  Proposed change to S = PPN mod N
  Allows the OS to choose the virtual-to-physical mapping (the PPN chooses the slice)
  Line granularity
  Page granularity

Congruence Group
  CG_i = {physical page (PPN = j) | pmap(j) = i}
  Used to map a physical page to a core
  Convenient to use modulo-N on the PPN for pmap

Caching Schemes
  Private caching
    OS allocates private pages for P_i running on core i from CG_i
  Shared caching
    Pages allocated from all congruence groups {CG_i} (0 ≤ i ≤ N-1)
    Round robin or random

Hybrid Caching Scheme
  Partition {CG_i} into K groups (K < N)
  Allocate pages from that group for a core within that group
  Allows sharing within a group

OS Modifications
  N free lists instead of a single free list
  Depends on the cache scheme
  Must consider existing data mappings
  Makes allocation more complex

Page Spreading
  When the local L2 slice is too small for the working set
  Need to consider data proximity to reduce the number of network hops
  Also must consider cache pressure
    number of accessed pages / cache size

Data Proximity

Bloom Filter Monitor
  Keeps track of pages accessed
  Low overhead
  512-kB cache slice
  8-kB page
  512-byte filter
  <0.5% false positive

Virtual Multicore!

Simulator Setup
  SimpleScalar
  16 tiles (4x4 mesh) (2-cycle hop)
  Single issue
  16-kB L1 I/D caches (1 cycle)
  512-kB L2 cache slice (8 cycles)
  2-GB main memory (300 cycles)

Results

Parallel Workloads

How to Kill Cache Coherence
  Goal: Reduce overhead of cache coherence
  Also some of the messiness
  Granularity issue
  OS independent
  Lower storage overhead and
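To make the "Congruence Group", "Caching Schemes", and "OS Modifications" slides concrete, here is a minimal Python sketch of a page allocator that keeps one free list per congruence group and uses pmap(PPN) = PPN mod N. The class name `CongruenceAllocator`, its method names, the round-robin cursor, and the assumption that K divides N evenly in the hybrid scheme are all illustrative choices, not details taken from the slides or the paper.

```python
from collections import deque


class CongruenceAllocator:
    """Toy allocator: one free list per congruence group (hypothetical names)."""

    def __init__(self, num_pages, num_slices):
        self.n = num_slices
        # N free lists instead of a single free list.
        self.free = [deque() for _ in range(num_slices)]
        for ppn in range(num_pages):
            self.free[self.pmap(ppn)].append(ppn)
        self._next = 0  # round-robin cursor (an assumption of this sketch)

    def pmap(self, ppn):
        # Modulo-N mapping from physical page number to L2 slice.
        return ppn % self.n

    def _take(self, groups):
        # Return a free page from the first non-empty group in `groups`.
        for g in groups:
            if self.free[g]:
                return self.free[g].popleft()
        raise MemoryError("no free page in the requested congruence groups")

    def alloc_private(self, core):
        # Private caching: pages for core i come only from CG_i.
        return self._take([core])

    def alloc_shared(self):
        # Shared caching: rotate over all congruence groups.
        start = self._next
        self._next = (self._next + 1) % self.n
        return self._take([(start + j) % self.n for j in range(self.n)])

    def alloc_hybrid(self, core, k):
        # Hybrid caching: split the N groups into K contiguous clusters
        # (assumes K divides N) and rotate within the core's own cluster.
        size = self.n // k
        base = (core // size) * size
        start = self._next
        self._next = (self._next + 1) % self.n
        return self._take([base + (start + j) % size for j in range(size)])
```

With 16 slices, `alloc_private(3)` only ever returns pages whose PPN is congruent to 3 mod 16, while `alloc_hybrid(5, 4)` returns pages that map to slices 4 through 7. A real OS would also have to account for existing data mappings, which this sketch ignores.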
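The "Bloom Filter Monitor" and "Page Spreading" slides can be illustrated with a small Python sketch that tracks accessed pages in a 512-byte filter and derives a cache-pressure estimate as accessed pages relative to slice capacity. The slides give only the sizes (512-kB slice, 8-kB page, 512-byte filter); the use of two hash functions, the BLAKE2b hash, the class name `PageBloomFilter`, and the exact pressure formula are assumptions made for illustration.

```python
import hashlib


class PageBloomFilter:
    """Toy Bloom filter over physical page numbers (hypothetical design)."""

    def __init__(self, num_bytes=512, num_hashes=2):
        self.m = num_bytes * 8          # filter size in bits
        self.k = num_hashes             # assumed; not given in the slides
        self.bits = bytearray(num_bytes)
        self.distinct = 0               # approximate count of distinct pages

    def _positions(self, ppn):
        # Derive k bit positions by salting the hash input with the index.
        for i in range(self.k):
            d = hashlib.blake2b(f"{i}:{ppn}".encode(), digest_size=8).digest()
            yield int.from_bytes(d, "big") % self.m

    def maybe_accessed(self, ppn):
        # True may be a false positive; False is always correct.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(ppn))

    def record(self, ppn):
        # Count the page only if the filter has not (apparently) seen it.
        if not self.maybe_accessed(ppn):
            self.distinct += 1
        for p in self._positions(ppn):
            self.bits[p // 8] |= 1 << (p % 8)


def cache_pressure(accessed_pages, page_bytes=8 * 1024, slice_bytes=512 * 1024):
    # Assumed reading of "number of accessed pages / cache size":
    # footprint of the accessed pages relative to one L2 slice.
    return accessed_pages * page_bytes / slice_bytes
```

Under these assumptions a slice holds 64 pages, so a pressure value above 1.0 would suggest the working set no longer fits in the local slice and page spreading might be worth considering.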
