ISU CPRE 583 - Reconfigurable Computing - D1936552

School name Iowa State University

Course Cpre 583- Reconfig Comptg Sys

CprE / ComS 583
Reconfigurable Computing
Prof. Joseph Zambreno
Department of Electrical and Computer Engineering
Iowa State University
Lecture #24 – Reconfigurable Coprocessors
CprE 583 – Reconfigurable Computing, November 14, 2006

Quick Points
• Unresolved course issues
  • Gigantic red bug
  • Ghost inside Microsoft PowerPoint
• This Thursday, project status updates
  • 10 minute presentations per group + questions
  • Combination of Adobe Breeze and calling in to teleconference
  • More details later today

Recap – DP-FPGA
• Break FPGA into datapath and control sections
• Save storage for LUTs and connection transistors
• Key issue is grain size
• Cherepacha/Lewis – U. Toronto

Recap – RaPiD
• Segmented linear architecture
• All RAMs and ALUs are pipelined
• Bus connectors also contain registers

Recap – Matrix
• Two inputs from adjacent blocks
• Local memory for instructions, data

Recap – RAW Tile
• Full functionality in each tile
• Static router located for near-neighbor communication

Outline
• Recap
• Reconfigurable Coprocessors
  • Motivation
  • Compute Models
  • Architecture
  • Examples

Overview
• Processors efficient at sequential codes, regular arithmetic operations
• FPGA efficient at fine-grained parallelism, unusual bit-level operations
• Tight-coupling important: allows sharing of data/control
• Efficiency is an issue:
  • Context-switches
  • Memory coherency
  • Synchronization

Compute Models
• I/O pre/post processing
• Application specific operation
• Reconfigurable Co-processors
  • Coarse-grained
  • Mostly independent
• Reconfigurable Functional Unit
  • Tightly integrated with processor pipeline
  • Register file sharing becomes an issue

Instruction Augmentation
[Figure: input bits a31 a30 ... a0 mapped to output bits b31 ... b0; swap bit positions]
• Processor can only describe a small number of basic computations in a cycle
  • I bits -> 2^I operations
• Many operations could be performed on 2 W-bit words
• ALU implementations restrict execution of some simple operations
  • e.g. bit reversal

Instruction Augmentation (cont.)
• Provide a way to augment the processor instruction set for an application
• Avoid mismatch between hardware/software
• Fit augmented instructions into data and control stream
• Create a functional unit for augmented instructions
• Compiler techniques to identify/use new functional unit

“First” Instruction Augmentation
• PRISM
  • Processor Reconfiguration through Instruction Set Metamorphosis
• PRISM-I
  • 68010 (10MHz) + XC3090
  • can reconfigure FPGA in one second!
  • 50-75 clocks for operations

PRISM-1 Results
[Figure: results not recoverable from the extracted text]

PRISM Architecture
• FPGA on bus
• Access as memory mapped peripheral
• Explicit context management
• Some software discipline for use
• …not much of an “architecture” presented to user

PRISC
• Architecture:
  • couple into register file as “superscalar” functional unit
  • flow-through array (no state)

PRISC (cont.)
• All compiled
• Working from MIPS binary
• <200 4LUTs ?
  • 64x3
• 200MHz MIPS base
• See [RazSmi94A] for more details

Chimaera
• Start from Prisc idea.
  • Integrate as a functional unit
  • No state
  • RFU Ops (like expfu)
  • Stall processor on instruction miss
• Add
  • Multiple instructions at a time
  • More than 2 inputs possible
• [HauFry97A]

Chimaera Architecture
• Live copy of register file values feed into array
• Each row of array may compute from register of intermediates
• Tag on array to indicate RFUOP

Chimaera Architecture (cont.)
• Array can operate on values as soon as placed in register file
• When RFUOP matches
  • Stall until result ready
  • Drive result from matching row

Chimaera Timing
• If R1 presented late then stall
  • Might be helped by instruction reordering
• Physical implementation an issue
• Relies on considerable processor interaction for support

Chimaera Speedup
• Three Spec92 benchmarks
  • Compress 1.11 speedup
  • Eqntott 1.8
  • Life 2.06
• Small arrays with limited state
• Small speedup
• Perhaps focus on global router rather than local optimization

Garp
• Integrate as coprocessor
  • Similar bandwidth to processor as functional unit
  • Own access to memory
• Support multi-cycle operation
  • Allow state
  • Cycle counter to track operation
• Configuration cache, path to memory

Garp (cont.)
• ISA – coprocessor operations
  • Issue gaconfig to make particular configuration present
  • Explicitly move data to/from array
  • Processor suspension during coproc operation
  • Use cycle counter to track progress
• Array may directly access memory
  • Processor and array share memory
  • Exploits streaming data operations
  • Cache/MMU maintains data consistency

Garp Instructions
• Interlock indicates if processor waits for array to count to zero
• Last three instructions useful for context swap
• Processor decode hardware augmented to recognize new instructions

Garp Array
• Row-oriented logic
• Dedicated path for processor/memory
• Processor does not have to be involved in array-memory path
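
The Instruction Augmentation slide names bit reversal as a simple operation that a conventional ALU handles poorly. The sketch below illustrates that point in software and is not taken from the lecture: the function names `reverse_bits32_naive` and `reverse_bits32` are hypothetical, and a 32-bit word is assumed to match the a31..a0 / b31..b0 figure. The first version moves one bit per loop iteration; the second uses the common shift-and-mask approach, which still needs five dependent stages of ALU work.

```python
MASK32 = 0xFFFFFFFF  # assumed word width: 32 bits, as in the a31..a0 figure


def reverse_bits32_naive(a: int) -> int:
    """Reverse bit positions one bit at a time: b[31 - i] = a[i]."""
    a &= MASK32
    b = 0
    for i in range(32):
        bit = (a >> i) & 1          # extract bit i of the input
        b |= bit << (31 - i)        # place it at the mirrored position
    return b


def reverse_bits32(a: int) -> int:
    """Reverse bit positions with five shift-and-mask stages."""
    a &= MASK32
    # Swap adjacent bits, then bit pairs, nibbles, bytes, and half-words.
    a = ((a >> 1) & 0x55555555) | ((a & 0x55555555) << 1)
    a = ((a >> 2) & 0x33333333) | ((a & 0x33333333) << 2)
    a = ((a >> 4) & 0x0F0F0F0F) | ((a & 0x0F0F0F0F) << 4)
    a = ((a >> 8) & 0x00FF00FF) | ((a & 0x00FF00FF) << 8)
    a = (a >> 16) | (a << 16)
    return a & MASK32


if __name__ == "__main__":
    word = 0x00000001
    print(hex(reverse_bits32(word)))
```

Even the faster version costs several shifts, masks, and ORs per stage on a fixed instruction set. In hardware the same function is only a permutation of wires, which is why it makes a natural candidate for an augmented instruction in a reconfigurable functional unit.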
