CMU CS 15745 - Lecture

15-745 © Seth Copen Goldstein 2005-7

15-745 Topic: Exotic Architectures
• Doug Burger et al., "Scaling to the End of Silicon with EDGE Architectures", IEEE Computer, July 2004.
• Jan Hoogerbrugge et al., "Software pipelining for transport-triggered architectures", MICRO 24 (1991).
• Steve Swanson et al., "WaveScalar", MICRO-36, December 2003.

It's not about computing after all!
• What is the fundamental operation in a computer?
  – It is not the add, the multiply, the xor, etc.
  – It is the move
• Typical (read: x86, etc.) architectures don't ALLOW this to be expressed!
• All three papers share a common goal: represent the data movement involved in computation explicitly
BTW: Really bad slide, why?

What is Exotic?
• ISA
  – An abstraction provided by the computer designer
    E.g., no change in programs when
    • transistor shrinks by a factor of 2 or even 10!
    • start using aluminum to transmit info (and then copper!)
    • voltage changes by a factor of 5!
    • change from micro-coded engine to RISC core!
    • 10x registers introduced (internal ones)
  – Limits what can be expressed
    • no "move"s
    • what else?

One view on compiler/Arch Divide
[Figure: ILP Architectures. Rows: sequential (superscalar), dependence (dataflow), independence (EPIC), independence (VLIW). Stages after Application and Frontend: Determine Dependencies, Determine Independencies, Bind Function Units, Bind Datapaths & Execute. Each row is split between compilation time (software) and run time (hardware).]
Slides Adapted/From: J.Takala/TUT

VLIW
[Figure: VLIW CPU block diagram. Instruction Memory, Instruction Fetch, Instruction Decode, Register File, Bypassing Network, FU-1 to FU-5, Data Memory.]
• Scaling Drawbacks?
  – Bypass complexity
  – Register file complexity
  – Register file design restricts FU flexibility
  – Operation encoding format restricts FU flexibility

Transport-Triggered Arch
• Only 1 instruction: MOVE
• Don't specify operations, specify register movement
[Figure: VLIW and TTA block diagrams side by side. Both contain Instruction Memory, Instruction Fetch, Instruction Decode, Bypassing Network, FU-1 to FU-5, Register File, and Data Memory.]

TTA Datapath
[Figure: TTA datapath. Integer ALU, Float ALU, Boolean RF, Float RF, Integer RF, Load/Store Unit, Immediate Unit, Instruction Unit, Instruction Memory, and Data Memory, attached through sockets.]
J.Takala/TUT Berkeley – Finland Day, Oct.18, 2002

Function Units
• Operands written to operand registers (O)
• Operation performed when last operand written to trigger register (T)
• Pipeline synchronized with control bits (C)
• Standard interface
  – FU_ready
  – Result_ready
  – Global_lock
[Figure: function unit pipeline. Operand register O, trigger register T with optional shadow register, result register R, logic stages, control bits C.]

ILP Architectures
[Figure: same compiler/hardware divide as above with an added row, independence (TTA). Stages: Determine Dependencies, Determine Independencies, Bind Function Units, Bind Datapaths, Execute. Each row is split between compilation time and run time.]

TTA Characteristics: HW
• Modular
  – Can be constructed with standard building blocks
• Very flexible and scalable
  – FU functionality can be arbitrary
  – Supports user-defined Special Function Units (SFU)
• Lower complexity
  – Reduction in # of register ports
  – Reduced bypass complexity
  – Reduction in bypass connectivity
  – Reduced register pressure
  – Trivial decoding (implies long instructions)

TTA Characteristics: SW
• Traditional operation-triggered instruction:
    mul r1,r2,r3;
• Transport-triggered instruction:
    r1→mul.o; r2→mul.t; mul.r→r3;
  or
    r1→mul.o, r2→mul.t; mul.r→r3;
• Reminiscent of dataflow and time-stationary coding

TTA Specific Optimizations
• TTA allows extra scheduling optimizations
• E.g., software bypassing
  – Bypassing can eliminate the need for RF access
  – However, more difficult to schedule!
• Example:
    r1 → add.o, r2 → add.t;
    add.r → r3;
    r3 → sub.o, r4 → sub.t;
    sub.r → r5;
• Translates to:
    r1 → add.o, r2 → add.t;
    add.r → sub.o, r4 → sub.t;
    sub.r → r5;

Registers aren't everything
• TRIPS
  – operand-based dataflow architecture
• WaveScalar
  – (operand-based?) dataflow architecture
  – Makes memory dependencies explicit
• Pegasus
  – dataflow, operand/wires explicit
  – Memory dependencies explicit
• All three
  – basic unit is a hyperblock

TRIPS

TRIPS: Program Representation

TRIPS: Compiling

WaveScalar

WaveScalar: Memory Dependencies

SP on TTA
• Extends Lam's modulo scheduling to TTA
• Recall:
  – d(u,v):
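
Aside on the "TTA Specific Optimizations" slide: the software-bypassing rewrite shown there can be sketched in Python as follows. This is a minimal illustration, not the scheduler from the cited papers. The helper names (lower, is_reg, software_bypass), the (source, destination) tuple encoding of moves, and the live_out parameter are all assumptions made for this example. It also ignores timing: it assumes an FU result stays readable on its .r port until that FU is triggered again.

```python
def lower(op, src1, src2, dst):
    """Lower an operation-triggered `op src1, src2, dst` into three transports:
    operand move, trigger move, result move (as on the TTA SW slide)."""
    return [(src1, f"{op}.o"), (src2, f"{op}.t"), (f"{op}.r", dst)]


def is_reg(name):
    # Hypothetical naming convention: FU ports contain a dot ("add.o"),
    # anything else is treated as a register-file register.
    return "." not in name


def software_bypass(moves, live_out=()):
    """Forward FU results straight to their consumers, then drop register
    writes that nobody reads any more. `moves` is a list of (src, dst)."""
    # Pass 1: replace reads of a register by reads of the FU result port
    # whose value that register currently holds.
    holds = {}  # register -> FU result port it mirrors
    forwarded = []
    for src, dst in moves:
        src = holds.get(src, src)
        forwarded.append((src, dst))
        if dst.endswith(".t"):
            # Triggering an FU overwrites its result port: stop forwarding it.
            stale = dst[:-2] + ".r"
            holds = {reg: port for reg, port in holds.items() if port != stale}
        elif is_reg(dst):
            if src.endswith(".r"):
                holds[dst] = src
            else:
                holds.pop(dst, None)

    # Pass 2: backward liveness scan; a register write is dead if the
    # register is neither read later nor listed in live_out.
    live = set(live_out)
    kept = []
    for src, dst in reversed(forwarded):
        if is_reg(dst) and dst not in live:
            continue  # dead register write, the bypass made it unnecessary
        if is_reg(dst):
            live.discard(dst)
        if is_reg(src):
            live.add(src)
        kept.append((src, dst))
    kept.reverse()
    return kept


# The slide's example: add r1,r2,r3 followed by sub r3,r4,r5.
moves = lower("add", "r1", "r2", "r3") + lower("sub", "r3", "r4", "r5")
bypassed = software_bypass(moves, live_out={"r5"})
for src, dst in bypassed:
    print(f"{src} -> {dst}")
```

With only r5 live afterwards, the write to r3 disappears and add.r feeds sub.o directly, which is the move sequence on the slide. If r3 were also live, the add.r → r3 move would be kept alongside the bypass.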

