2/12/03 ©UCB Spring 2003 CS152 / Kubiatowicz Lec6.1

CS152
Computer Architecture and Engineering
Lecture 6
Multiply, Divide, Shift
February 12, 2003
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/

Review: Elements of the Design Process
° Divide and Conquer (e.g., ALU)
  • Formulate a solution in terms of simpler components.
  • Design each of the components (subproblems)
° Generate and Test (e.g., ALU)
  • Given a collection of building blocks, look for ways of putting them together that meet the requirement
° Successive Refinement (e.g., multiplier, divider)
  • Solve "most" of the problem (i.e., ignore some constraints or special cases), examine and correct shortcomings.
° Formulate High-Level Alternatives (e.g., shifter)
  • Articulate many strategies to "keep in mind" while pursuing any one approach.
° Work on the Things you Know How to Do
  • The unknown will become "obvious" as you make progress.

Review: ALU Design
° Bit-slice plus extra on the two ends
° Overflow means number too large for the representation
° Carry-look ahead and other adder tricks
[Figure: 32-bit ALU built from bit slices ALU0 through ALU31. Slice i takes a_i, b_i and cin and produces co and s_i. Inputs A (32 bits), B (32 bits) and M (4 bits); outputs S (32 bits) and Ovflw. Labels: "C/L to produce select, comp, c-in" and "signed-arith and cin xor co".]

Review: Carry Look Ahead (Design trick: peek)
  A  B  C-out
  0  0  0     "kill"
  0  1  C-in  "propagate"
  1  0  C-in  "propagate"
  1  1  1     "generate"
G = A and B
P = A xor B
C0 = Cin
C1 = G0 + C0 x P0
C2 = G1 + G0 x P1 + C0 x P0 x P1
C3 = G2 + G1 x P2 + G0 x P1 x P2 + C0 x P0 x P1 x P2
C4 = . . .
[Figure: four adder slices with inputs A0 B0 through A3 B3, each producing S, G and P; the block also produces group G and P outputs.]

Review: Design Trick: Guess (or "Precompute")
° Two n-bit adders in series: CP(2n) = 2*CP(n)
° Carry-select adder: CP(2n) = CP(n) + CP(mux)
[Figure: carry-select adder. One n-bit adder handles the low half; two n-bit adders compute the high half for carry-in 1 and carry-in 0, and a mux picks the result and Cout.]

Review: Carry Skip Adder: reduce worst case delay
[Figure: two 4-bit ripple adders in series (inputs A, B; output S), each producing propagate signals P0 P1 P2 P3.]
° Just speed up the slowest case for each block
° Exercise: optimal design uses variable block sizes

MIPS arithmetic instructions
  Instruction         Example           Meaning                        Comments
  add                 add $1,$2,$3      $1 = $2 + $3                   3 operands; exception possible
  subtract            sub $1,$2,$3      $1 = $2 - $3                   3 operands; exception possible
  add immediate       addi $1,$2,100    $1 = $2 + 100                  + constant; exception possible
  add unsigned        addu $1,$2,$3     $1 = $2 + $3                   3 operands; no exceptions
  subtract unsigned   subu $1,$2,$3     $1 = $2 - $3                   3 operands; no exceptions
  add imm. unsign.    addiu $1,$2,100   $1 = $2 + 100                  + constant; no exceptions
  multiply            mult $2,$3        Hi, Lo = $2 x $3               64-bit signed product
  multiply unsigned   multu $2,$3       Hi, Lo = $2 x $3               64-bit unsigned product
  divide              div $2,$3         Lo = $2 ÷ $3, Hi = $2 mod $3   Lo = quotient, Hi = remainder
  divide unsigned     divu $2,$3        Lo = $2 ÷ $3, Hi = $2 mod $3   Unsigned quotient & remainder
  move from Hi        mfhi $1           $1 = Hi                        Used to get copy of Hi
  move from Lo        mflo $1           $1 = Lo                        Used to get copy of Lo

MULTIPLY (unsigned)
° Paper and pencil example (unsigned):
    Multiplicand        1000
    Multiplier        x 1001
                        ----
                        1000
                       0000
                      0000
                     1000
                    --------
    Product         01001000
° m bits x n bits = m+n bit product
° Binary makes it easy:
  • 0 => place 0 (0 x multiplicand)
  • 1 => place a copy (1 x multiplicand)
° 4 versions of multiply hardware & algorithm:
  • successive refinement

Unsigned Combinational Multiplier
° Stage i accumulates A * 2^i if Bi == 1
° Q: How much hardware for 32 bit multiplier? Critical path?
[Figure: 4 x 4 array multiplier. Each of four rows takes A3 A2 A1 A0 gated by one multiplier bit B0 through B3; zeros are fed into the first row; outputs are product bits P7 through P0.]

How does it work?
° At each stage shift A left (x 2)
° Use next bit of B to determine whether to add in shifted multiplicand
° Accumulate 2n bit partial product at each stage
[Figure: the same 4 x 4 array multiplier, with zeros filling the positions vacated by each shifted row.]

Carry Save addition of 4 integers
° Adding:
          A2 A1 A0
        + B2 B1 B0
        + C2 C1 C0
        + D2 D1 D0
    --------------
    S4 S3 S2 S1 S0
° Add Columns first, then rows!
° Full Adder = 3=>2 element
° Can be used to reduce critical path of multiply
° Example: 53 bit multiply (for floating point):
  • At least 53 levels with naïve technique
  • Only 9 with Carry save addition!
[Figure: array of 3=>2 carry save adders (inputs I1, I2, I3; outputs S0, S1) combining the bits of A, B, C and D into S4 through S0.]

Unsigned shift-add multiplier (version 1)
° 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit multiplier reg
[Figure: datapath with Multiplicand register (64 bits, Shift Left), 64-bit ALU, Product register (64 bits, Write), Multiplier register (32 bits, Shift Right) and Control.]
Multiplier = datapath + control

Multiply Algorithm Version 1
Start
1. Test Multiplier0
   • Multiplier0 = 1: 1a. Add multiplicand to product & place the result in Product register
   • Multiplier0 = 0: go straight to step 2
2. Shift the Multiplicand register left 1 bit.
3. Shift the Multiplier register right 1 bit.
32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, Done

  Step   Product     Multiplier   Multiplicand
         0000 0000   0011         0000 0010
  1:     0000 0010   0011         0000 0010
  2:     0000 0010   0011         0000 0100
  3:     0000 0010   0001         0000 0100
  1:     0000 0110   0001         0000 0100
  2:     0000 0110   0001         0000 1000
  3:     0000 0110   0000         0000 1000
         0000 0110   0000         0000 1000

Observations on Multiply Version 1
° 1 clock per step => about 100 clocks per multiply (3 steps x 32 repetitions)
  • Ratio of multiply to add 5:1 to 100:1
° 1/2 bits in multiplicand always 0 => 64-bit adder is wasted
° 0's inserted in right of multiplicand as shifted => least significant bits of product never changed once formed
° Instead of shifting multiplicand to left, shift product to right?

MULTIPLY HARDWARE Version 2
° 32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
[Figure: datapath with Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, Shift Right, Write), Multiplier register (32 bits, Shift Right) and Control.]

How to think of this?
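As a rough illustration of the two shift-add schemes above, here is a minimal Python sketch. The function names `multiply_v1` and `multiply_v2` and the word-size parameter `n` are hypothetical, chosen only for this example. `multiply_v1` follows the Version 1 flowchart step by step. `multiply_v2` is an assumption about how the Version 2 hardware would be driven: the slides list only its registers, so the add-into-upper-half then shift-right sequence, and the handling of the adder's carry-out, are inferred from the observation that the product can shift right instead of the multiplicand shifting left.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """Version 1: 2n-bit multiplicand and ALU; multiplicand shifts left."""
    wide_mask = (1 << (2 * n)) - 1
    mcand = multiplicand & ((1 << n) - 1)   # starts in the right half of a 2n-bit register
    mplier = multiplier & ((1 << n) - 1)
    product = 0
    for _ in range(n):                      # n repetitions (32 in the slides)
        if mplier & 1:                      # step 1: test Multiplier0
            product = (product + mcand) & wide_mask   # step 1a: add multiplicand to product
        mcand = (mcand << 1) & wide_mask    # step 2: shift multiplicand left 1 bit
        mplier >>= 1                        # step 3: shift multiplier right 1 bit
    return product


def multiply_v2(multiplicand, multiplier, n=32):
    """Version 2 (assumed sequencing): n-bit multiplicand and ALU; product shifts right."""
    half_mask = (1 << n) - 1
    mcand = multiplicand & half_mask
    mplier = multiplier & half_mask
    product = 0
    for _ in range(n):
        if mplier & 1:
            # n-bit add into the upper half of the product register.
            # Assumption: the adder's carry-out is kept and shifted in below.
            upper = (product >> n) + mcand
            product = (upper << n) | (product & half_mask)
        product >>= 1                       # shift product right 1 bit
        mplier >>= 1                        # shift multiplier right 1 bit
    return product


if __name__ == "__main__":
    # 4-bit trace from the Version 1 slide: 0010 x 0011
    print(format(multiply_v1(0b0010, 0b0011, n=4), "08b"))
    # Paper-and-pencil example: 1000 x 1001
    print(format(multiply_v2(0b1000, 0b1001, n=4), "08b"))
```

With `n=4`, the first call walks through the same register values as the trace table for 0010 x 0011, and the second reproduces the paper-and-pencil product of 1000 x 1001. Both functions return the same result for any pair of n-bit unsigned operands; the difference is only in which register shifts and how wide the adder must be.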