U of I CS 433 - Computer System Organization - D1990953

Home> Schools> University of Illinois> Computer Science (CS) > CS 433> Computer System Organization

DOC PREVIEW

U of I CS 433 - Computer System Organization

School name University of Illinois

Course Cs 433- Computer System Organization

Pages 41

This preview shows page 1-2-3-19-20-39-40-41 out of 41 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 41 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CS433: Computer System OrganizationDependencesSlide 3Elements Required for DependenceStatic Exploitation of ILPMultiple-Issue versus VLIWMultiple Issue versus VLIWBasic Pipeline SchedulingSlide 9Loop UnrollingLoop Unrolling: Register Renaming And Normalization of Induction VariablesLoop Unrolling: Scheduling After Register RenamingLoop Unrolling: CleanupRegister Allocation and ILPSo-Called “Phase Ordering” Problem: Schedule First, Using Virtual RegistersSo-Called “Phase Ordering” Problem: Allocate FirstQuestionLoops and Dependence DistanceExamples of DependenceSlide 20Slide 21Slide 22Slide 23Slide 24Slide 25What if we write it this way:Software PipeliningAnti Dependences and VLIW / SuperscalarSlide 29Software Pipelining and Loop ReversalLoop ReversalSlide 32SW Pipelining and Flow and Anti DependencesSW Pipelining Changes Flow Dependences into Anti DependencesFlow and Anti DependencesPrologue and EpilogueSlide 37Dependence ConstraintsDependence Constraints for Software PipeliningSlide 40Slide 41CS433: Computer System OrganizationLuddy HarrisonStatic Exploitation ofInstruction Level ParallelismDependencesFrom an earlier statement to a later statement (in program execution order).This is a time ordering, not a visual or textual ordering.Three types:Flow:A == AAnti:= AA =Output:A =A =DependencesFlow:Earlier statement writes same storage location that later statement readsStorage location may be register, memory, anything that holds state.Anti:Earlier statement reads same storage location that later statement writesOutput:Earlier statement writes same storage location that later statement writesElements Required for DependenceStatement instancesTwo instructions / statements or two instances of a single instruction / statementDynamic instances thereof – statements and instructions may be executed many timesTimeearlier statement and later statementordered by program execution orderLocationthe two statements must access the same locationthis may be a register or memory or any other kind of stateAccess (Read and/or Write)one of the accesses must be a write (modification)Static Exploitation of ILPStatic exploitation requires eitherA programmer who writes assembly language, orA compiler organize the instructions of the program such that there is sufficient parallelism between themWe will consider compiler techniques for the most partI will mention when an assembly programmer might do better or differentlyMultiple-Issue versus VLIWWhat’s the difference?Instructions in a VLIW “line” must not depend on one anotherIt is permitted however to use as a destination a register than another instruction in the same line is using as a source. The source is read before it is overwritten as a destinationThis is a very important source of additional parallelism in VLIW machines!In a multiple-issue-capable CPU (but not a VLIW), if adjacent instructions are dependent, a stall occurs between the instructions, but the program still works.Multiple Issue versus VLIW add R0, R1, R2 add R3, R4, R5 add R1, R0, R6 add R4, R3, R7What does this program do if run on a multiple issue processor that can run up to four instructions in parallel?What does it do if run as a single instruction line on a 4-wide VLIW?Basic Pipeline Schedulingload R3, [R1 + 12]add R3, R3, 4load R0, [R1 + 16]add R0, R0, 4load R3, [R1 + 12]load R0, [R1 + 16]add R3, R3, 4add R0, R0, 4The colors represent dependent instructions.Assume there is a one-cycle stall between a load and an instruction that uses the load’s result. We reorder the instructions to eliminate these stalls.Here we don’t need to change the instructions, but that is unusual. Usually we are not so lucky.Basic Pipeline Schedulingload R3, [R1 + 12]add R0, R0, R3load R3, [R1 + 16]add R0, R0, R3load R3, [R1 + 12]add R0, R0, R3load R4, [R1 + 16]add R0, R0, R4load R3, [R1 + 12]load R4, [R1 + 16]add R0, R0, R3add R0, R0, R4We would like to do the same transformation as before, but R3 is used as the destination for both loads.We rename R3 to R4 in the second two instructions. This breaks the anti-dependence.renamereorderLoop Unrolling move R2, 0 move R3, 100L: load R0, [R1 + 16] add R2, R2, R0 add R1, R1, 24 sub R3, R3, 1 bge L move R2, 0 move R3, 100L: load R0, [R1 + 16] add R2, R2, R0 add R1, R1, 24 sub R3, R3, 1 load R0, [R1 + 16] add R2, R2, R0 add R1, R1, 24 sub R3, R3, 1 bge LcopycopyIs this transformation always legal?What are the required conditions?Loop Unrolling: Register Renaming And Normalization of Induction Variables move R2, 0 move R3, 100L: load R0, [R1 + 16] add R2, R2, R0 add R1, R1, 24 sub R3, R3, 1 load R0, [R1 + 16] add R2, R2, R0 add R1, R1, 24 sub R3, R3, 1 bge L move R2, 0 move R3, 100L: load R0, [R1 + 16] add R2, R2, R0 add R5, R1, 24 sub R3, R3, 1 load R4, [R1 + 40] add R2, R2, R4 add R1, R5, 24 sub R3, R3, 1 bge LLoop Unrolling: Scheduling After Register Renaming move R2, 0 move R3, 100L: load R0, [R1 + 16] add R2, R2, R0 add R5, R1, 24 sub R3, R3, 1 load R4, [R1 + 40] add R2, R2, R4 add R1, R5, 24 sub R3, R3, 1 bge L move R2, 0 move R3, 100L: load R0, [R1 + 16] load R4, [R1 + 40] add R2, R2, R0 add R2, R2, R4 add R5, R1, 24 add R1, R5, 24 sub R3, R3, 1 sub R3, R3, 1 bge LLoop Unrolling: Cleanup move R2, 0 move R3, 100L: load R0, [R1 + 16] load R4, [R1 + 40] add R2, R2, R0 add R2, R2, R4 add R1, R1, 24 add R1, R1, 24 sub R3, R3, 1 sub R3, R3, 1 bge L move R2, 0 move R3, 100L: load R0, [R1 + 16] load R4, [R1 + 40] add R2, R2, R0 add R2, R2, R4 add R1, R1, 48 sub R3, R3, 2 bge LRegister Allocation and ILPCan you imagine a world in which assigning seats on an airplane and scheduling people on flights were considered two different activities?“First we schedule everybody who wants to fly to Memphis on March 19th on flight 459 at 10AM. Then, we assign them seats.”Or, “First we assign everybody a seat. Next we look to see what day and time they want to fly.”This what we do with registers and instruction schedules (roughly speaking)So-Called “Phase Ordering” Problem: Schedule First, Using Virtual Registersload V0add V1, V0load V2add V1, V2load V3add V1, V4load V5add V1, V6load V7add V1, V7load V8add V1, V8load V0load V2load V3load V4load V5load V6add V1,

View Full Document