ISU CPRE 583 - Lect-23


CprE / ComS 583 Reconfigurable Computing
Prof. Joseph Zambreno
Department of Electrical and Computer Engineering
Iowa State University
Lecture #23 – Function Unit Architectures
CprE 583 – Reconfigurable Computing, November 9, 2006

Slide titles: CprE / ComS 583 Reconfigurable Computing; Quick Points; Allowable Schedules; Sequentialization; Multicontext Scheduling; Signal Retiming; Slide 7; Full ASCII → Hex Circuit; Multicontext Version; ASCII → Hex Example; ASCII → Hex Example (cont.); General Throughput Mapping; Benchmark Set; Area v. Throughput; Area v. Throughput (cont.); Reconfiguration for Fault Tolerance; Column Based Reconfiguration; Slide 18; Slide 19; Summary; Outline; Coarse-grained Architectures; DP-FPGA; Configuration Sharing; Two-dimensional Layout; DP-FPGA Technology Mapping; RaPiD; RaPiD Datapath; RaPiD Control Path; FIR Filter Example; FIR Filter Example (cont.); MATRIX; Basic Functional Unit; MATRIX Interconnect; Functional Unit Inputs; Slide 36; Chess; Chess Interconnect; Chess Basic Block; Reconfigurable Architecture Workstation; RAW Tile; RAW Datapath; Raw Compiler; Slide 44

Quick Points (Lect-23.2)
• HW #3, #4 graded and returned
• Next week Thursday, project status updates
• 10 minute presentations per group + questions
• Upload to WebCT by the previous evening
• Expected that you’ve made some progress!

Allowable Schedules (Lect-23.3)
• Active LUTs (NA) = 3

Sequentialization (Lect-23.4)
• Adding time slots
• More sequential (more latency)
• Adds slack
• Allows better balance
• L=4, NA=2 (4 or 3 contexts)

Multicontext Scheduling (Lect-23.5)
• “Retiming” for multicontext
• Goal: minimize peak resource requirements
• NP-complete
• List schedule, anneal
• How do we accommodate intermediate data?
• Effects?

Signal Retiming (Lect-23.6)
• Non-pipelined
• Hold value on LUT output (wire) from production through consumption
• Wastes wire and switches by occupying them
• For the entire critical path delay L
• Not just for the 1/L'th of the cycle it takes to cross a wire segment
• How will it show up in multicontext?

Signal Retiming (Lect-23.7)
• Multicontext equivalent
• Need a LUT to hold the value for each intermediate context

Full ASCII → Hex Circuit (Lect-23.8)
• Logically three levels of dependence
• Single context: 21 LUTs @ 880Kλ² = 18.5Mλ²

Multicontext Version (Lect-23.9)
• Three contexts: 12 LUTs @ 1040Kλ² = 12.5Mλ²
• Pipelining needed for dependent paths

ASCII → Hex Example (Lect-23.10)
• All retiming on wires (active outputs)
• Saturation based on inputs to largest stage
• With enough contexts only one LUT needed
• Increased LUT area due to additional stored configuration information
• Eventually additional interconnect savings taken up by LUT configuration overhead
• Ideal: perfect scheduling spread + no retime overhead

ASCII → Hex Example (cont.) (Lect-23.11)
• @ depth=4, c=6: 5.5Mλ² (compare 18.5Mλ²)

General Throughput Mapping (Lect-23.12)
• If only want to achieve limited throughput
• Target: produce a new result every t cycles
• Spatially pipeline every t stages
• cycle = t
• Retime to minimize register requirements
• Multicontext evaluation within a spatial stage
• Retime (list schedule) to minimize resource usage
• Map for depth (i) and contexts (c)

Benchmark Set (Lect-23.13)
• 23 MCNC circuits
• Area mapped with SIS and Chortle

Area v. Throughput (Lect-23.14)
[Figure not included in the text preview]

Area v. Throughput (cont.) (Lect-23.15)
[Figure not included in the text preview]

Reconfiguration for Fault Tolerance (Lect-23.16)
• Embedded systems require high reliability in the presence of transient or permanent faults
• FPGAs contain substantial redundancy
• Possible to dynamically “configure around” problem areas
• Numerous on-line and off-line solutions

Column Based Reconfiguration (Lect-23.17)
• Huang and McCluskey
• Assume that each FPGA column is equivalent in terms of logic and routing
• Preserve empty columns for future use
• Somewhat wasteful
• Precompile and compress differences in bitstreams

Column Based Reconfiguration (Lect-23.18)
• Create multiple copies of the same design with different unused columns
• Only requires different inter-block connections
• Can lead to an unreasonable configuration count

Column Based Reconfiguration (Lect-23.19)
• Determining differences and compressing the results leads to “reasonable” overhead
• Scalability and fault diagnosis are issues

Summary (Lect-23.20)
• In many cases cannot profitably reuse logic at device cycle rate
• Cycles, no data parallelism
• Low throughput, unstructured
• Dissimilar data dependent computations
• These cases benefit from having more than one instruction/operation per active element
• Economical retiming becomes important here to achieve active LUT reduction
• For c=[4,8], I=[4,6] automatically mapped designs are 1/2 to 1/3 single context size

Outline (Lect-23.21)
• Continuation
• Function Unit Architectures
• Motivation
• Various architectures
• Device trends

Coarse-grained Architectures (Lect-23.22)
• DP-FPGA
  • LUT-based
  • LUTs share configuration bits
• RaPiD
  • Specialized ALUs, multipliers
  • 1D pipeline
• MATRIX
  • 2-D array of ALUs
• Chess
  • Augmented, pipelined matrix
• Raw
  • Full RISC core as basic block
  • Static scheduling used for communication

DP-FPGA (Lect-23.23)
• Break FPGA into datapath and control sections
• Save storage for LUTs and connection transistors
• Key issue is grain size
• Cherepacha/Lewis – U. Toronto

Configuration Sharing (Lect-23.24)
• MC = LUT SRAM bits
• CE = connection block pass transistors
• Set MC = 2-3CE
[Figure residue: area table in terms of N, MC, CE, and A(N); LUT diagram with inputs A0, B0, C0 and A1, B1, C1 and outputs Y0, Y1]
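The area figures quoted on the ASCII to Hex slides (21 LUTs at 880K per LUT for the single-context circuit, 12 LUTs at 1040K per LUT for the three-context version, and 5.5M at depth=4, c=6) can be checked with a little arithmetic. The sketch below is an illustration added for that purpose, not part of the lecture. It assumes the areas are in units of λ², and the names `total_area`, `SINGLE_CONTEXT`, `THREE_CONTEXT`, and `BEST_REPORTED_AREA` are hypothetical.

```python
# LUT counts and per-LUT areas as quoted on the slides.
# Assumption: areas are in units of lambda^2.
SINGLE_CONTEXT = {"luts": 21, "area_per_lut": 880_000}
THREE_CONTEXT = {"luts": 12, "area_per_lut": 1_040_000}
BEST_REPORTED_AREA = 5_500_000  # depth=4, c=6 figure quoted on the slides


def total_area(design):
    """Total area is the number of LUTs times the area of one LUT."""
    return design["luts"] * design["area_per_lut"]


single = total_area(SINGLE_CONTEXT)
three = total_area(THREE_CONTEXT)

print(f"single context: {single / 1e6:.2f}M")
print(f"three contexts: {three / 1e6:.2f}M "
      f"({three / single:.2f} of single context)")
print(f"depth=4, c=6:   {BEST_REPORTED_AREA / 1e6:.2f}M "
      f"({BEST_REPORTED_AREA / single:.2f} of single context)")
```

The products come to 18.48M and 12.48M, which the slides round to 18.5M and 12.5M. Each multicontext LUT is larger because it stores more configuration, but fewer LUTs are needed, so the three-context version is about two thirds of the single-context area, and the depth=4, c=6 mapping is a little under one third.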

