1CprE / ComS 583Reconfigurable ComputingProf. Joseph ZambrenoDepartment of Electrical and Computer EngineeringIowa State UniversityLecture #12 – Other Spatial StylesCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.2Quick Points• HW #3 coming out today• Due Tuesday, October 17 (midnight)• Systolic computing structures• Systolic mapping• Logic partitioning• FPGA synthesisPriority: 1 Breathing, Eating, etc.Priority: 14“Desperate Housewives”Priority: 6Night out in CampustownPriority: 45OtherWork………Priority: 74 CprE 583HomeworkCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.3Project Proposals• Due Sunday, 10/8 at midnight• Purpose – to provide a background and overview of the project• Goal – allow me to understand what you are intending to do• Project topic:• Perform an in-depth exploration of some area of reconfigurable computing• Whatever topic you choose, you must include a strong experimental element in your project• Work in groups of 2+ (3 if very lofty proposal)CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.4Project Proposals (cont.)• Suggested structure [3-4 pages, IEEE conf. format]• Introduction – what is the context for this work? What problem are you trying to address? Why is it interesting/challenging?• Prior work – what is the related work? How does your work differ from these? (5-10 references)• Approach – how are you going to tackle the problem? What tools and methodologies do you intend on using? What experiments do you intend on running?• Expected results – what do you expect the outcome of your project to be? What are the deliverables? How do you intend on presenting your results?• Milestones – what is your expected progress schedule? Provide a weekly / bi-weekly basisCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.5Systolic Architectures• Goal – general methodology for mapping computations into hardware (spatial computing) structures• Composition:• Simple compute cells (e.g. add, sub, max, min)• Regular interconnect pattern• Pipelined communication between cells• I/O at boundariesxx+ xminx cCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.6Example – Finite Impulse Response• A Finite Impulse Response (FIR) filter is a type of digital filter• Finite – response to an impulse eventually settles to zero• Requires no feedback∑=−+−++⋅=⋅++⋅+⋅=kjjiikikiiixwxwxwxwy111121 Lfor (i=1; i<=n; i++)for (j=1; j <=k; j++)y[i] += w[j] * x[i+j-1];2CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.7Finite Impulse Response (cont.)• Sequential• Memory bandwidth per output – 2k+1• O(k) cycles per output• O(1) hardwarex+xiw1x+w2x+w3x+w4yi• Systolic• Memory bandwidth per output – 2• O(1) cycles per output• O(k) hardwareCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.8Example – Matrix-Vector Product 2121212222111211⎥⎥⎥⎥⎦⎤⎢⎢⎢⎢⎣⎡=⎥⎥⎥⎥⎦⎤⎢⎢⎢⎢⎣⎡⎥⎥⎥⎥⎦⎤⎢⎢⎢⎢⎣⎡nnnnnnnnyyyxxxaaaaaaaaaMMLMMMLLfor (i=1; i<=n; i++) for (j=1; j<=n; j++)y[i] += a[i][j] * x[j];CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.9Matrix-Vector Product (cont.)t = 3t = 2t = 1t = 4a31a21a11a41a22a12–a23a13––a23–––a14x1x2x3x4y2y3y4y1t = n+1t = n+2t = n+3t = nxn…––––CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.10Outline• Project Proposals• Recap• Non-Numeric Systolic Examples• Systolic Loop Transformations• Data dependencies• Iteration spaces• Example transformations• Reading – Cellular Automata• Reading – Bit-Serial ArchitecturesCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.11Example – Relational Database• Relation is a collection of tuples that all have the same attributes • Tuple is a fixed number of objects• Represented in a tabletuple # Name School Age QB Rating0D. Carr Fresno State 27 113.61 P. Rivers NC State 24 107.42D. McNabb Syracuse 29 105.33 C. Pennington Marshall 30 103.44R. Grossman Florida 26 100.9CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.12• Intersection: A ∩ B – all records in both relation A and B• Must compare all |A| x |B| tuples• Compare via sequence compare• Or along row or column to get inclusion bitvectorDatabase OperationsB1A1T11A2T21A3T31B2T12T22T32B3T13T23T33A1A2A3B3B2B1T12T213CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.13Database Operations (cont.)• Tuple Comparison• Problem – tuples are long, comparison time might limit computation rate• Strategy – perform comparison in pipelined manner by fields• Stagger fields• Arrange to compute field i on cycle after i-1 • Cell: tout= tin and ainxnor binTrueB[j,1]B[j,2]B[j,3]B[j,4]A[i,1]A[i,2]A[i,3]A[i,4]CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.14Database Intersectionb51a21TrueTrueTrueTrueTrueTrueTrueb52b43a13b44b42a22b34a14b41a31b33a23b31a41b23a23b21a51b13a43b32a32b24a24b22a42b14a34a52 a44T12T21T22CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.15Database Intersection (cont.)b51a21TrueTrueTrueTrueTrueTrueTrueb52b43a13b44b42a22b34a14b41a31b33a23b31a41b23a23b21a51b13a43b32a32b24a24b22a42b14a34a52 a44ORFALSEORORORORORORT3T2T1CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.16Database Operations (cont.)• Unique: remove duplicate elements in multirelation A• Intersect A with A• Union: A U B – one copy of all tuples in A and B• Concatenate A and B• Use Unique to remove duplicates• Projection: collapse A by removing select fields of every tuple• Sample fields in A’• Use Unique to remove duplicatesCprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.17Database Join• Join: AJCA,CBB – where columns CAin Aintersect columns CBin B, concatenate tuple Aiand Bj• Match CAof A with CBof B• Keep all Ti,j• Filter i,j for which Ti,j= true• Construct join from matched pairs• Claim: Typically, | AJCA,CBB | << | A | | B |CprE 583 – Reconfigurable ComputingSeptember 28, 2006 Lect-12.18Database Summary• Input database – O(n) data• Operations require O(n2) data• O(n) if sorted first• O(n log(n)) to sort• Systolic implementation – works on O(n) processing elements in O(n) time• Typical database [KunLoh80A]:• 1500 bit tuples• 10,000 records in a relation• ~1 4-LUT
View Full Document