Bit-serial On-line Computing!1"Damián Sánchez-Moreno!Digit-serial On-line Computing!2"Damián Sánchez-Moreno!3!To reduce area!!thereby to increase computing capability per unit of area!!!thereby increasing the throughput!To reduce power dissipation!Approaches!Fixed-point arithmetic!Digit-serial arithmetic!Reconfigurable Computing!4!Concepts!Interpretation!Errors!Operations!Modes!Redundant Number Systems!Operations!Why fixed-point?!Why digit-serial?!!!!Target Platforms!Conclusions"Smaller area than Floating-point (FP)!Less power dissipation than FP!Often, faster than FP!Fixed resolution* !Fits the target applications!Drawbacks!• Smaller or limited dynamic range!• Algorithms are harder to design!5!Example ! !!B = 2!p = 3!emax = 2, emin = -1!p.pp x Be !Max = 1.11 x 22 = 7!Min = 0.00 x 2e = 0!6!Discontinuities!7!€ f (x) =ln(1− x)xFixed-point word notation !• (S/I/F) !(1/3/2) !• Qi.f ! !Q3.2!8!w = i +f!xi-1 xi-2 … x0 x-1 x-2 … … x-f!ri-1 ri-2 … r0 ri-1 ri-1 … … r-f!Precision – equals the wordlength!Resolution – smallest representable value!Range – distance between the largest number and the smallest!Dynamic Range – ratio of the largest to the smallest quantity!9!w = i +f!xi-1 xi-2 … x0 x-1 x-2 … … x-f!ri-1 ri-2 … r0 ri-1 ri-1 … … r-f!Block Floating Point!>>16$k1#Input#10!€ xd= xkk=− fi −1∑⋅ rk€ xd= −ri −1xi −1+ xkk=− fi −2∑⋅ rkBinary!Decimal!Notes!Integer!w = i!Unsigned!101001!41!(0/6/0) or Q6.0!Signed!101001!-23!(1/5/0) or Q6.0!Fractional!f > 0!Unsigned!101.001! 5.125!(0/3/3) or Q3.3!Signed!101.001!-2.875!(1/2/3) or Q3.3!Calculation formulas (r = radix or base)!• Unsigned!• Signed!11!Quantization!• A/D Conversions !• Representing real values!• Truncation and Rounding!Over)low$Under)low!Solution: Scaling!12!13!Addition!!Qa.b + Qc.d = Qa+1.b iff a = c and b = d!Multiplication!!Qa.b * Qc.d = Qa+c.b+d!Scaling – arithmetical shifts!Rounding!• Truncation – biased error!• Alternative!Reduce area by reusing resources!• thus, increasing computing capability per area!Reduce the number of I/O lines!• reducing power dissipation, area and routing resources!Delay linearly proportional to wordlength!Drawbacks!• speed!• control logic!14!Digit-serial arithmetic comes in two flavors:!• Least-Significant-Digit-First (LSDF)!• Most-Significant-Digit-First (MSDF)!o a.k.a. Digit-pipelined arithmetic!o Introduced 1975!Digit-serial fits streaming applications (SIMD)!15!16!LSDF allows some operations to be overlapped!+"+Timing diagram serial operation vs. pipelining!δ"" +17!MSDF allows all operations to be overlapped!• Higher throughput!-""(1) Serial operation (2) Bit-serial pipeline (3) On-line pipeline !/""-"/"" -"/"Digit set for Generalized signed digit,!We need ! ! bits to represent every digit of a GSD word!18!€ S ≡ {-β,..,−1, 0,1,..,α}€ log2S Digit set for Generalized signed digit,!We need ! ! bits to represent every digit of a GSD word!Binary Signed Digit (BSD), radix-2 (r), 2 bits !19!€ S ≡ {-β,..,−1, 0,1,..,α}€ log2S € S ≡ {1 ,0,1}Digit set for Generalized signed digit,!We need ! ! bits to represent every digit of a GSD word!Binary Signed Digit (BSD), radix-2 (r), 2 bits !The encoding follows the formula !A 4-digits BSD word can be written !20!€ S ≡ {-β,..,−1, 0,1,..,α}€ log2S € S ≡ {1 ,0,1}€ xi= xi+− xi−€ 1 1 01BSD≡ 01,10,00,01BDigit set for Generalized signed digit,!We need ! ! bits to represent every digit of a GSD word!Binary Signed Digit (BSD), radix-2 (r), 2 bits !The encoding follows the formula !A 4-digits BSD word can be written !The redundancy index!21!€ S ≡ {-β,..,−1, 0,1,..,α}€ log2S € S ≡ {1 ,0,1}€ xi= xi+− xi−€ 1 1 01BSD≡ 01,10,00,01B€ ρ=α+β+ 1− r22!Carry-free vs. Limited-carry!Example – Addition ai + bi = pi = wi + r·ti+1 = si !23!Carry-free vs. Limited-carry!Example – Addition ai + bi = pi = wi + r·ti+1 = si ! 9 9 9 9 ai + 3 4 7 5 bi 12 13 16 14 pi 2 3 6 4 wi 1 1 1 1 0 ti+1 1 3 4 7 4 si24!Carry-free vs. Limited-carry!Example – Addition ai + bi = pi = wi + r·ti+1 = si ! 9 9 9 9 ai + 3 4 7 5 bi 12 13 16 14 pi 2 3 6 4 wi 1 1 1 1 0 ti+1 1 3 4 7 4 si 9 9 9 9 ai + 0 0 0 1 bi 9 9 9 10 pi 9 9 9 0 wi 0 0 0 1 0 ti+1 1 0 0 0 0 si25!Carry-free vs. Limited-carry!Example – Addition ai + bi = pi = wi + r·ti+1 = si ! 9 9 9 9 ai + 0 0 0 1 bi 9 9 9 10 pi 1 1 1 0 wi 1 1 1 1 0 ti+1 1 0 0 0 0 si 9 9 9 9 ai + 0 0 0 1 bi 9 9 9 10 pi 9 9 9 0 wi 0 0 0 1 0 ti+1 1 0 0 0 0 siAs integer!As Fixed-point!Overflow?!!26!€ 0.1 1 01BSD≡ x = 0.3125€ x = skrkk=0i −1∑€ 1 1 01BSD≡ x = 1⋅ 23+ 1⋅ 22+ 1⋅ 21+ 1⋅ 20= 5€ [1111,..,1111]BSD= [−15, 15]10€ x = skrkk=−1− f∑€ [−ρ(r −1− r− n),ρ(r −1− r− n)]BSD= [−0.9375,0.9375]10€ 0.1111BSD⋅ 0.1111BSD= 1.01100011BSD= −0.8789062510Method 1!1. Bound error to resolution!2. Set recurrence!3. Define residual W!4. Define recurrence for W!5. Set selection functions F, G!Method 2!Adapting a bit-serial solution!27! Ercegovac, M. D. et. al. “Digital Arithmetic”, Morgan Kaufmann, 2004"X;Y!F; G!W!Z!zj!xj+1+δ yj+1+δ !Figure 2. Serial algorithm model!Digit-parallel!Digit-serial!28!LSDF addition! LSDF Multiplication!a!b !cout!sout!yi!n-1 … 0!Parallel Adder!xi!zj !Cin!…!…!€ x3x2x1x0€ y3y2y1y0x3y3 x3y2 x3y1 x3y0!x2y3 x2y2 x2y1 x2y0!x1y3 x1y2 x1y1 x1y0!x0y3 x0y2 x0y1 x0y0! z7 z6 z5 z4 z3 z2 z1 z0! Nibouche, O. “New Architectures for Serial-Serial Multiplication”, 2001.!29!MSDF addition!x+!x-!y+!y-!z-!z+!On-line BSD Adder! Ercegovac, M. D. et. al. “Digital Arithmetic”, Morgan Kaufmann, 2004"On-line delay or initial delay: δ!• First output at δ+1!Initiation interval: ι !!€ xi= xi+− xi−30!MSDF Multiplication!€ x3x2x1x0€ y3y2y1y0x3y3 x3y2 x3y1 x3y0!x2y3 x2y2 x2y1 x2y0!x1y3 x1y2 x1y1 x1y0!x0y3 x0y2 x0y1 x0y0! z8. z7 z6 z5 z4 z3 z2 z1 z0!yi!0 … n-1!BSD Parallel
View Full Document