DOC PREVIEW
CMU CS 15740 - Lecture

This preview shows page 1-2-3-4-5-6 out of 18 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Greg SteffanArchitectural Support for TLDS on a SMT Processor 11A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonA Scalable Approach to A Scalable Approach to ThreadThread--Level SpeculationLevel SpeculationJ. Gregory J. Gregory SteffanSteffan, Christopher B. , Christopher B. ColohanColohan, , Antonia Antonia ZhaiZhai, and Todd C. , and Todd C. MowryMowryComputer Science DepartmentComputer Science DepartmentCarnegie Mellon UniversityCarnegie Mellon University(Appeared in ISCA 2000.)(Appeared in ISCA 2000.)2A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonMultithreaded Machines Are EverywhereMultithreaded Machines Are Everywhere)How can we use them? Parallelism!CPCCPCCPCShared MemorySUN MAJC,IBM Power4ALPHA 21464 Dual Pentium SGI OriginThreadsCPCCPCShared MemoryCCPCP3A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonAutomatic ParallelizationAutomatic ParallelizationProving independence of threads is hard:Proving independence of threads is hard:––complex control flowcomplex control flow––complex data structurescomplex data structures––pointers, pointers, pointerspointers, pointers, pointers––runrun--time inputstime inputsHow can we make the compilerHow can we make the compiler’’s job feasible?s job feasible?)Thread-Level Speculation (TLS)4A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonExampleExamplewhile (...){x = hash[index1];…hash[index2] = y;...}Time= hash[3]…hash[10] =…Processor= hash[19]…hash[21] =…= hash[33]…hash[30] =…= hash[10]…hash[25] =…Greg SteffanArchitectural Support for TLDS on a SMT Processor 25A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonExample of ThreadExample of Thread--Level SpeculationLevel SpeculationTime= hash[3]…hash[10] =…Epoch 1= hash[19]…hash[21] =…Epoch 2= hash[33]…hash[30] =…Epoch 3= hash[10]…hash[25] =…Epoch 4Processor ProcessorProcessorProcessor6A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonExample of ThreadExample of Thread--Level SpeculationLevel SpeculationTime= hash[3]…hash[10] =…Epoch 1= hash[19]…hash[21] =…Epoch 2= hash[33]…hash[30] =…Epoch 3= hash[10]…hash[25] =…Epoch 4Processor ProcessorProcessorProcessorViolation!7A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonExample of ThreadExample of Thread--Level SpeculationLevel SpeculationTime= hash[3]…hash[10] =…commit?Epoch 1= hash[19]…hash[21] =…commit?Epoch 2= hash[33]…hash[30] =…commit?Epoch 3= hash[10]…hash[25] =…commit?Epoch 4Processor ProcessorProcessorProcessorViolation!99988A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonExample of ThreadExample of Thread--Level SpeculationLevel SpeculationTime= hash[3]…hash[10] =…commit?Epoch 1= hash[19]…hash[21] =…commit?Epoch 2= hash[33]…hash[30] =…commit?Epoch 3= hash[10]…hash[25] =…commit?Epoch 4Processor ProcessorProcessorProcessorViolation!9998= hash[10]…hash[25] =…commit?Epoch 4Retry9Greg SteffanArchitectural Support for TLDS on a SMT Processor 39A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonGoals of Our ApproachGoals of Our Approach1) Handle arbitrary memory accesses1) Handle arbitrary memory accesses––i.e. not just array referencesi.e. not just array references2) Preserve performance of non2) Preserve performance of non--speculative workloadsspeculative workloads––keep hardware support minimal and simplekeep hardware support minimal and simple3) Apply to any scale of multithreaded architecture3) Apply to any scale of multithreaded architecture––CMPsCMPs, SMT processors, more traditional MPs, SMT processors, more traditional MPs)effective, simple, and scalable TLS10A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonOverview of Our ApproachOverview of Our ApproachSystem requirements:System requirements:1) Detect data dependence violations1) Detect data dependence violations••extend invalidationextend invalidation--based cache coherencebased cache coherence2) Buffer speculative modifications2) Buffer speculative modifications••use the caches as speculative buffersuse the caches as speculative buffers)coherence already works at a variety of scales)hence our scheme is also scalable11A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonRelated SchemesRelated Schemes••Wisconsin (Wisconsin (MultiscalarMultiscalar, Trace Processor), Trace Processor)••Stanford (Hydra)Stanford (Hydra)••U.P. U.P. CatalunyaCatalunya(Speculative Multithreading)(Speculative Multithreading)••Intel/U. Portland (Dynamic Multithreading)Intel/U. Portland (Dynamic Multithreading)••Illinois at U.C. (IIllinois at U.C. (I--ACOMA)ACOMA))our approach seamlessly scales both up and down12A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonOutlineOutline)Details of our ApproachDetails of our Approach––life cycle of an epochlife cycle of an epoch––speculative coherence speculative coherence ––what happens at commit timewhat happens at commit time––forwarding data between epochsforwarding data between epochs••PerformancePerformance••ConclusionsConclusionsGreg SteffanArchitectural Support for TLDS on a SMT Processor 413A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonLife Cycle of an EpochLife Cycle of an EpochSpawnedBecomesSpeculativeCommit?InitSpeculativeWorkWait to beHomefree?Slow Commit:Fast Commit:Complete,Pass HomefreeTime9999999914A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonEpoch NumbersEpoch NumbersRepresent a partial orderingRepresent a partial ordering––signedsigned--compare sequence numbers if compare sequence numbers if TIDsTIDsmatchmatch••allows for wrapallows for wrap--aroundaround––otherwise the epochs are unorderedotherwise the epochs are unordered••from independent programs from independent programs ••from independent chains of speculation within one from independent chains of speculation within one programprogramThread Identifier (TID) Sequence Number15A Scalable Approach to Thread-Level Speculation SteffanCarnegie MellonSpeculative Thread ModelSpeculative Thread ModelRoundRound--robin schedule of epochs to processorsrobin schedule of epochs to processors––not a requirement of our scheme, just for conveniencenot a requirement of our scheme, just for convenienceEach epoch spawns the next


View Full Document

CMU CS 15740 - Lecture

Documents in this Course
leecture

leecture

17 pages

Lecture

Lecture

9 pages

Lecture

Lecture

36 pages

Lecture

Lecture

9 pages

Lecture

Lecture

13 pages

lecture

lecture

25 pages

lect17

lect17

7 pages

Lecture

Lecture

65 pages

Lecture

Lecture

28 pages

lect07

lect07

24 pages

lect07

lect07

12 pages

lect03

lect03

3 pages

lecture

lecture

11 pages

lecture

lecture

20 pages

lecture

lecture

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

10 pages

Lecture

Lecture

22 pages

Lecture

Lecture

28 pages

lecture

lecture

63 pages

lecture

lecture

13 pages

Lecture

Lecture

36 pages

Lecture

Lecture

18 pages

Lecture

Lecture

17 pages

Lecture

Lecture

12 pages

lecture

lecture

34 pages

lecture

lecture

47 pages

lecture

lecture

7 pages

Lecture

Lecture

18 pages

Lecture

Lecture

7 pages

Lecture

Lecture

21 pages

Lecture

Lecture

10 pages

Lecture

Lecture

39 pages

Lecture

Lecture

11 pages

lect04

lect04

40 pages

Load more
Download Lecture
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?