ISU CPRE 583 - Reconfigurable Computing

This preview shows page 1-2 out of 7 pages.



CprE / ComS 583 - Reconfigurable Computing
Prof. Joseph Zambreno
Department of Electrical and Computer Engineering
Iowa State University
Lecture #6 - Modern FPGA Devices
September 7, 2006

Quick Points
• HW #2 coming out over the weekend
    • Due Thursday, September 21 (12:00pm)
    • LUT mapping
    • Comparing FPGA devices
    • Synthesizing arithmetic operators
[Figure: effort level from assigned date to due date; labels: Standard, Preferred, CprE 583]

Recap
• Hard-wired carry logic support
[Figure: carry logic in the Altera FLEX 8000 and Xilinx XCV4000]

Recap (cont.)
• Square-root carry select adders
[Figure: 32-bit square-root carry-select adder; blocks cover bits 3-0, 8-4, 14-9, 21-15, 29-22, and 31-30, with delays labeled t4 through t10]

Recap (cont.)
• If one operand is constant:
    • More speed?
    • Less hardware?
[Figure: chain of half adders (HA) and full adders (FA) on inputs A0-A3 producing S0-S3 and carry C3, before and after simplification for a constant operand]

Recap (cont.)
• Carry save multiplication
[Figure: carry-save multiplier array with inputs X0-X3 and Y0-Y3 and outputs Z0, Z1, Z2, ...]

Recap (cont.)
• If one operand is constant:
    • Can greatly reduce the number of adders
    • Removes all AND gates
[Figure: the carry-save array simplified for the constant Y0=0, Y1=1, Y2=0, Y3=1]

LUT-Based Constant Multipliers
• Constants can be changed in the LUTs to program new multipliers

              10101011      Constant
    x         NNNNNNNN      N0-N7
          AAAAAAAAAAAA      N * 1011 (LSN)
    + BBBBBBBBBBBB          N * 1010 (MSN)
      SSSSSSSSSSSSSSSS      Product

[Figure: two banks of twelve 4-LUTs, each fed from N0-N7, produce A0-A11 and B4-B15; an adder combines them into S0-S15]

Outline
• Recap
• More Multiplication
• Handling Fractional Values
    • Fixed Point
    • Floating Point
• Some Modern FPGA Devices
    • Xilinx - XC5200, Virtex (-II / -II Pro / -4 / -5), Spartan (-II / -3)
    • *Altera - FLEX 10K, APEX (20K / II), ACEX 1K, Cyclone (II), Stratix (GX / II / II GX)

Partial Product Generation
• AND gates in multiplication are wasteful
• Option 1 - use cascade logic
• Option 2 - break into smaller (2x2) multipliers

      42 =     101010    Multiplicand
    x 11 =   x   1011    Multiplier
                 0110    (10x11)
               0110      (10x11)
             0110        (10x11)
               0100      (10x10)
             0100        (10x10)
         + 0100          (10x10)
     462 = 0111001110    Product

Representation Compression
• Multiplication can be simplified if the representation is compressed
• Standard - binary representation {0,1} x 2^n
• Canonical Signed Digit (CSD) representation {-1,0,1} x 2^n
• To encode CSD:
    • Set C = (B + (B<<1))
    • Calculate -2C = 2*(C>>1)
    • D_i = B_i + C_i - 2C_(i+1), where C_(i+1) is the carry-out of B_i + C_i
• Example: B = 61d = 0111101b
    C = 0111101b + 01111010b = 010110111b
    -2C_(i+1) = 2222101
    D = 1000201 = 1000(-1)01
• For any n-bit number, there can only be about n/2 nonzero digits in a CSD representation (at most every other digit is nonzero)

Booth Encoding
• Variation on CSD encoding: E_j = -2B_i + B_(i-1) + B_(i-2)
• Select a group of 3 digits, add the two least significant digits, and then subtract 2x the most significant bit
• E_j is {-2,-1,0,1,2} x 2^(2n)
• Example:
    • B = 61d = 0111101b = 0001111010b (with padding)
    • E = 010(-1)1
• Reduces the number of partial products for multiplication by 1/2
• Can automatically handle negative numbers

Fractional Arithmetic
• Many important computations require fractional components
• Fractional arithmetic often ignored in FPGA literature
    • Complex standards (ex. IEEE special cases)
    • Resource intensive and slow
• Why not just extend the binary representation past the binary point?

Fixed-Point Representation
• Separate value into Integer (I) and Fractional remainder (F)
• F bits represent {0,1} x 2^-n
• How large to make I and F depends on application
    • Ex: Q16.16 is 16 bits of integer [-2^15, 2^15) with 16 bits of fraction, in increments of 2^-16 or 0.0000152587890625
    • Ex: Q1.127 is a normalized value [-1,1) with 127 bits of fraction, in increments of 2^-127 or 5.8774717541114375398436826861112e-39
[Figure: word layout with integer field I followed by fraction field F]

Fixed-Point Arithmetic
• Addition, subtraction the same (Q4.4 example):

       3.6250        0011.1010
     + 2.8125      + 0010.1101
       6.4375        0110.0111

• Multiplication requires realignment:

       3.6250        0011.1010
     x 2.8125      x 0010.1101
                         .00111010
                       00.111010
                      001.11010
                    00111.010
   10.1953125        1010.00110010

Fixed-Point Issues
• Overflow/underflow
• Quantization Errors
    • After rounding down, the previous example gives 3.625 x 2.8125 = 10.1875 (0.08% error)
    • In Q4.4, 2 divided by 3 = 0.625 (6.25% error)
• Scaling
• Dynamic range needed for some applications

IEEE 754 Floating Point
• Single precision: V = (-1)^S x 2^(E-127) x (1.F)
• Double precision: V = (-1)^S x 2^(E-1023) x (1.F)
• Special conditions - not a number (NaN), +-0, +-infinity
• Gradual underflow
[Figure: field layouts. Single precision: S (1 bit), E (8 bits), F (23 bits). Double precision: S (1 bit), E (11 bits), F (52 bits)]

Floating Point FPGA Hardware
• Xilinx XCV4085
• Addition
    • Single-precision - 587 4-LUTs
    • Double-precision - 1334 4-LUTs
• Multiplication
    • Single-precision - 1661 4-LUTs
    • Double-precision - 4381 4-LUTs
• Division
    • Single-precision - 1583 4-LUTs
    • Double-precision - 4910 4-LUTs
• For double-precision, can only fit any two of three units on a single device!
• See [Und04] for details

Capacity Trends
[Figure: Xilinx Device Complexity timeline; year axis ticks: 1985, 1987, 1991, 1995, 1998, 1999, 2000, 2002, 2003, 2004, 2006]

    Device           Clock      Gates
    XC2000           50 MHz     1K
    XC3000           85 MHz     7.5K
    XC4000           100 MHz    250K
    XC5200           50 MHz     23K
    Spartan          80 MHz     40K
    Spartan-II       200 MHz    200K
    Spartan-3        326 MHz    5M
    Virtex           200 MHz    1M
    Virtex-E         240 MHz    4M
    Virtex-II        450 MHz    8M
    Virtex-II Pro    450 MHz    8M*
    Virtex-4         500 MHz    16M*
    Virtex-5         550 MHz    24M*
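The CSD encoding steps on the Representation Compression slide can be sketched in Python. This is a minimal sketch rather than the lecture's own code: it uses the common "3B" formulation, which is assumed here to correspond to the slide's C = B + (B<<1) step, and the function name `to_csd` is hypothetical.

```python
def to_csd(b: int) -> list[int]:
    """Return the CSD digits of a non-negative integer, least significant first.

    Each digit is -1, 0, or 1, and no two adjacent digits are nonzero.
    """
    if b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    c = b + (b << 1)  # C = B + (B << 1), i.e. 3B (assumed reading of the slide)
    digits = []
    i = 1  # result digit i-1 is bit i of C minus bit i of B
    while (c >> i) != 0:
        c_bit = (c >> i) & 1
        b_bit = (b >> i) & 1
        digits.append(c_bit - b_bit)
        i += 1
    return digits


if __name__ == "__main__":
    # Print most significant digit first to match the slide's notation.
    print(list(reversed(to_csd(61))))  # [1, 0, 0, 0, -1, 0, 1]
```

For B = 61 this reproduces the slide's result 1000(-1)01, that is 64 - 4 + 1.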

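The Booth Encoding slide's grouping rule (take three bits, add the two least significant, subtract twice the most significant) can be sketched as follows. This is an illustrative sketch under stated assumptions: groups overlap by one bit, a 0 is appended below the LSB, and the input fits in `n_bits` as a two's-complement value. The name `booth_radix4` and its parameters are hypothetical.

```python
def booth_radix4(b: int, n_bits: int) -> list[int]:
    """Radix-4 Booth digits of an n_bits-wide two's-complement value.

    Digits are returned least significant first; digit j has weight 4**j.
    """
    if n_bits % 2:
        n_bits += 1  # pad to an even width; Python ints sign-extend for free
    padded = b << 1  # append the implicit 0 below the least significant bit
    digits = []
    for j in range(n_bits // 2):
        group = (padded >> (2 * j)) & 0b111  # three bits, overlapping by one
        hi, mid, lo = (group >> 2) & 1, (group >> 1) & 1, group & 1
        digits.append(-2 * hi + mid + lo)
    return digits


if __name__ == "__main__":
    # Most significant digit first, as on the slide.
    print(list(reversed(booth_radix4(61, 9))))  # [0, 1, 0, -1, 1]
```

With B = 61 and a 9-bit width this yields 0 1 0 (-1) 1, matching the slide's example, and the same routine handles negative inputs because the top group carries the sign.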

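The Q4.4 examples on the Fixed-Point Arithmetic and Fixed-Point Issues slides can be checked with a short sketch. The helper names below are hypothetical, values are held as plain Python integers scaled by 2^4, results are rounded down, and overflow past the 8-bit word is deliberately not modeled.

```python
import math

FRAC_BITS = 4  # Q4.4: 4 integer bits, 4 fraction bits


def to_q44(x: float) -> int:
    """Quantize a real value to a raw Q4.4 integer, rounding down."""
    return math.floor(x * (1 << FRAC_BITS))


def from_q44(raw: int) -> float:
    """Convert a raw Q4.4 integer back to a real value."""
    return raw / (1 << FRAC_BITS)


def q44_add(a: int, b: int) -> int:
    """Addition needs no realignment: both operands share the same scale."""
    return a + b


def q44_mul(a: int, b: int) -> int:
    """The raw product is Q8.8; shift right to realign to Q4.4 (rounds down)."""
    return (a * b) >> FRAC_BITS


def q44_div(a: int, b: int) -> int:
    """Pre-shift the dividend so the quotient keeps 4 fraction bits."""
    return (a << FRAC_BITS) // b


if __name__ == "__main__":
    x, y = to_q44(3.625), to_q44(2.8125)
    print(from_q44(q44_add(x, y)))                  # 6.4375
    print(from_q44(q44_mul(x, y)))                  # 10.1875
    print(from_q44(q44_div(to_q44(2), to_q44(3))))  # 0.625
```

The multiply result shows the quantization error from the slide: the exact product 10.1953125 loses its low four fraction bits when realigned to Q4.4.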