DOC PREVIEW
MIT 6 111 - Multipliers & Pipelining

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Multipliers & Pipelining • Combinational multiplier • Two’s complement multiplier • Smaller multipliers, faster multipliers • Latency & Throughput • Pipelining to increase throughput • Retiming 6.111 Fall 2008 1 Lecture 9 Lab #3 due tonight, report next Tuesday, no LPSets this weekA0 A1 A2 A3 B0 B1 B2 B3 A0B0 A1B0 A2B0 A3B0 A0B1 A1B1 A2B1 A3B1 A0B2 A1B2 A2B2 A3B2 A0B3 A1B3 A2B3 A3B3 x + ABi called a “partial product” Multiplying N-bit number by M-bit number gives (N+M)-bit result Easy part: forming partial products (just an AND gate since BI is either 0 or 1) Hard part: adding M N-bit partial products Unsigned Multiplication 6.111 Fall 2008 2 Lecture 9Combinational Multiplier (unsigned) X3 X2 X1 X0 * Y3 Y2 Y1 Y0 -------------------- X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X2Y1 X1Y1 X0Y1 + X3Y2 X2Y2 X1Y2 X0Y2 + X3Y3 X2Y3 X1Y3 X0Y3 ----------------------------------------- Z7 Z6 Z5 Z4 Z3 Z2 Z1 Z0 HA x3 FA x2 FA x1 FA x2 FA x1 HA x0 FA x1 HA x0 HA x0 FA x3 FA x2 FA x3 x3 x2 x1 x0 z0 z1 z2 z3 z4 z5 z6 z7 y3 y2 y1 y0  Propagation delay ~2N multiplicand multiplier Partial products, one for each bit in multiplier (each bit needs just one AND gate) 6.111 Fall 2008 3 Lecture 9Combinational Multiplier (signed!) X3 X2 X1 X0 * Y3 Y2 Y1 Y0 -------------------- X3Y0 X3Y0 X3Y0 X3Y0 X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X3Y1 X3Y1 X3Y1 X2Y1 X1Y1 X0Y1 + X3Y2 X3Y2 X3Y2 X2Y2 X1Y2 X0Y2 - X3Y3 X3Y3 X2Y3 X1Y3 X0Y3 ----------------------------------------- Z7 Z6 Z5 Z4 Z3 Z2 Z1 Z0 x3 FA x2 FA x1 FA x2 FA x1 FA x0 FA x1 HA x0 HA x0 FA x3 FA x2 FA x3 x3 x2 x1 x0 z0 z1 z2 z3 z4 z5 z6 z7 y3 y2 y1 y0 FA FA FA FA FA FA FA 1 NB: There are tricks we can use to eliminate the extra circuitry we added… 6.111 Fall 2008 4 Lecture 92’s Complement Multiplication X3 X2 X1 X0 * Y3 Y2 Y1 Y0 -------------------- X3Y0 X3Y0 X3Y0 X3Y0 X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X3Y1 X3Y1 X3Y1 X2Y1 X1Y1 X0Y1 + X3Y2 X3Y2 X3Y2 X2Y2 X1Y2 X0Y2 - X3Y3 X3Y3 X2Y3 X1Y3 X0Y3 ----------------------------------------- Z7 Z6 Z5 Z4 Z3 Z2 Z1 Z0 X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X2Y1 X1Y1 X0Y1 + X2Y2 X1Y2 X0Y2 + X3Y3 X2Y3 X1Y3 X0Y3 + 1 1 Step 1: two’s complement operands so high order bit is –2N-1. Must sign extend partial products and subtract the last one Step 2: don’t want all those extra additions, so add a carefully chosen constant, remembering to subtract it at the end. Convert subtraction into add of (complement + 1). Step 3: add the ones to the partial products and propagate the carries. All the sign extension bits go away! Step 4: finish computing the constants… Result: multiplying 2’s complement operands takes just about same amount of hardware as multiplying unsigned operands! X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X2Y1 X1Y1 X0Y1 + X2Y2 X1Y2 X0Y2 + X3Y3 X2Y3 X1Y3 X0Y3 + 1 - 1 1 1 1 X3Y0 X3Y0 X3Y0 X3Y0 X3Y0 X2Y0 X1Y0 X0Y0 + 1 + X3Y1 X3Y1 X3Y1 X3Y1 X2Y1 X1Y1 X0Y1 + 1 + X3Y2 X3Y2 X3Y2 X2Y2 X1Y2 X0Y2 + 1 + X3Y3 X3Y3 X2Y3 X1Y3 X0Y3 + 1 + 1 - 1 1 1 1 –B = ~B + 1 (Baugh-Wooley) 6.111 Fall 2008 5 Lecture 92’s Complement Multiplication FA x3 FA x2 FA x1 FA x2 FA x1 HA x0 FA x1 HA x0 HA x0 FA x3 FA x2 FA x3 HA 1 1 x3 x2 x1 x0 z0 z1 z2 z3 z4 z5 z6 z7 y3 y2 y1 y0 6.111 Fall 2008 6 Lecture 9Multiplication in Verilog You can use the “*” operator to multiply two numbers: wire [9:0] a,b; wire [19:0] result = a*b; // unsigned multiplication! If you want Verilog to treat your operands as signed two’s complement numbers, add the keyword signed to your wire or reg declaration: wire signed [9:0] a,b; wire signed [19:0] result = a*b; // signed multiplication! Remember: unlike addition and subtraction, you need different circuitry if your multiplication operands are signed vs. unsigned. Same is true of the >>> (arithmetic right shift) operator. To get signed operations all operands must be signed. To make a signed constant: 10’sh37C 6.111 Fall 2008 7 Lecture 9Multiplication on the FPGA 6.111 Fall 2008 Lecture 9 8 In the XC2V6000: 6 columns of mults, 24 in each column = 144 mults Hardware multiplier block: two 18-bit twos complement (signed) operands tPD ≈ 10nsSequential Multiplier Assume the multiplicand (A) has N bits and the multiplier (B) has M bits. If we only want to invest in a single N-bit adder, we can build a sequential circuit that processes a single partial product at a time and then cycle the circuit M times: A P B + SN NC N xN N N+1 SN-1…S0 Init: P←0, load A and B Repeat M times { P ← P + (BLSB==1 ? A : 0) shift P/B right one bit } Done: (N+M)-bit result in P/B M bits LSB 1 6.111 Fall 2008 9 Lecture 9Bit-Serial Multiplication P FA C 0 A B Init: P = 0; Load A,B Repeat M times { Repeat N times { shift A,P: Amsb = Alsb Pmsb = Plsb + Alsb*Blsb + C/0 } shift P,B: Pmsb = C, Bmsb = Plsb } (N+M)-bit result in P/B 6.111 Fall 2008 10 Lecture 9Useful building block: Carry-Save Adder Last stage is still a carry-propagate adder (CPA) Good for pipelining: delay through each partial product (except the last) is just tPD,AND + tPD,FA. No carry propagation time! CSA 6.111 Fall 2008 11 Lecture 9Wallace Tree Multiplier CSA CSA CSA CSA ... CSA CSA CSA CPA O(log1.5M) Higher fan-in adders can be used to further reduce delays for large M. Wallace Tree: Combine groups of three bits at a time This is called a 3:2 counter by multiplier hackers: counts number of 1’s on the 3 inputs, outputs 2-bit result. 4:2 compressors and 5:3 counters are popular building blocks. 6.111 Fall 2008 12 Lecture 9Multiplication by a constant • If one of the operands is a constant, make it the multiplier (B in the earlier examples). For each “1” bit in the constant we get a partial product (PP) – may be noticeably fewer PPs than in the general case. – For example, in general multiplying two 4-bit operands generates four PPs (requiring 3 rows of full adders). If the multiplier is say, 12 (4’b1100), then there are only two PPs: 8*A+4*A (requiring only 1 row of full adders). – But lots of “1”s means lots


View Full Document

MIT 6 111 - Multipliers & Pipelining

Documents in this Course
Verilog

Verilog

21 pages

Video

Video

28 pages

Bass Hero

Bass Hero

17 pages

Deep 3D

Deep 3D

12 pages

SERPENT

SERPENT

8 pages

Vertex

Vertex

92 pages

Vertex

Vertex

4 pages

Snapshot

Snapshot

15 pages

Memories

Memories

42 pages

Deep3D

Deep3D

60 pages

Design

Design

2 pages

Frogger

Frogger

11 pages

SkiFree

SkiFree

81 pages

Vertex

Vertex

10 pages

EXPRESS

EXPRESS

2 pages

Labyrinth

Labyrinth

81 pages

Load more
Download Multipliers & Pipelining
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Multipliers & Pipelining and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Multipliers & Pipelining 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?