EE141 - Fall 2007
Digital Integrated Circuits
Lecture 14
Project - Adders
EE141 / EECS141, Lecture #14

Administrative Stuff
- One more lab left, will be scheduled around week 12
- Midterm 2 next Tuesday: 105 Northgate, starts at 6:30pm sharp
- "Re-review" Mon. 9am in 550 Cory lounge
- Extra office hours Mon. 2-3pm

Class Material
- Last lecture: Logical Effort
- Today's lecture: Adders
- Reading: Chapter 11

Midterm 1
- Mean: 25.3, Standard Dev: 9.85
- Median: 24.5
- Max: 48, Min: 9
[Figure: histogram of Midterm 1 scores; axis label: Number]

EE141 Spring 08 Project
- Designing an FIR Filter
[Figure: FIR filter. x(n) passes through three delay elements D to give x(n-1), x(n-2), x(n-3); the four taps are multiplied by a0, a1, a2, a3 and added to form y(n)]

Avoiding multipliers
- Multipliers are expensive
- Very often, coefficients are fixed
- And multipliers can be replaced by add/shift units
- What does a fixed shift cost?
[Figure: x(n-i)*0.375 built from a >>2 shift, a >>1 shift, and an adder]

Project Outline
- Phase 1: Analysis of straightforward implementation
  - Predefined adder and register cells
  - Identifying and analyzing critical timing paths
- Phase 2: High-level optimization
- Phase 3: Implementing and analyzing your optimized design

Project Basics
- 2 students / group
- Reporting:
  - Phase 1 and 2: reports
  - Phase 3: Poster and interview
- Be Creative!

Logical Effort - Review
[Figure: normalized delay D versus electrical effort FO = Cout/Cin, with the parasitic delay and effort delay marked. Two lines are plotted:]
- LE = 1, P = 1: D = FO + 1
- LE = 4/3, P = 2: D = (4/3)FO + 2
Dgate = LE·FO + P = Effort Delay + Parasitic Delay

Multistage Networks
$Delay = \sum_{i=1}^{N} (p_i + LE_i \cdot f_i)$
Effective fanout: $EF_i = LE_i f_i$
Path electrical fanout: $F = C_{out}/C_{in}$
Path logical effort: $\Pi LE = LE_1 LE_2 \ldots LE_N$
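The gate-delay relation Dgate = LE·FO + P and the multistage sum of p_i + LE_i·f_i can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and labeling the (LE = 1, P = 1) line as an inverter and the (LE = 4/3, P = 2) line as a 2-input NAND is an assumption, since the extracted slide text does not name the gates.

```python
from fractions import Fraction

def gate_delay(le, fo, p):
    """Normalized gate delay: effort delay (LE * FO) plus parasitic delay P."""
    return le * fo + p

def path_delay(stages):
    """Sum p_i + LE_i * f_i over a list of (LE_i, f_i, p_i) tuples."""
    return sum(gate_delay(le, f, p) for le, f, p in stages)

# (LE, P) pairs taken from the review slide; the gate names are assumed.
INVERTER = (Fraction(1), Fraction(1))
NAND2 = (Fraction(4, 3), Fraction(2))

if __name__ == "__main__":
    # Tabulate both delay lines over the fanout range shown on the plot.
    for fo in range(1, 6):
        d_inv = gate_delay(INVERTER[0], fo, INVERTER[1])
        d_nand = gate_delay(NAND2[0], fo, NAND2[1])
        print(f"FO={fo}: D(LE=1)={d_inv}, D(LE=4/3)={d_nand}")
```

Exact fractions are used so that the 4/3 logical effort does not pick up floating-point rounding.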
Branching effort: $\Pi B = b_1 b_2 \ldots b_N$
Path effort: $PE = \Pi LE \cdot \Pi B \cdot F$
Path delay: $D = \sum d_i = \sum p_i + \sum EF_i$

Optimum Effort per Stage
When each stage bears the same effort:
$EF^N = PE$, so $EF = \sqrt[N]{PE}$
Effective fanouts: $LE_1 f_1 = LE_2 f_2 = \ldots = LE_N f_N$
Minimum path delay:
$\hat{D} = \sum_{i=1}^{N} (LE_i f_i + p_i) = N \cdot PE^{1/N} + \sum_{i=1}^{N} p_i$

Optimal Number of Stages
For a given load, and given input capacitance of the first gate, find optimal number of stages and optimal sizing
$D = N \cdot PE^{1/N} + \sum p_i$
Remember: we can always add inverters to the end of the chain
The 'best effective fanout' $EF = PE^{1/\hat{N}}$ is still around 4 (3.6 with γ = 1)

Add Branching Effort
Branching effort: $b = \frac{C_{on-path} + C_{off-path}}{C_{on-path}}$

Branching Example 1
[Figure: path with capacitances 5, 15, and 90; the first stage drives two branches of 15 each]
LE = 1
FO = 90/5 = 18
PE = 18 (wrong!)
SE1 = (g)(15+15)/5 = 6
SE2 = 90/15 = 6
PE = 36, not 18!
Introduce new kind of effort to account for branching:
• Branching Effort: $b = \frac{C_{on-path} + C_{off-path}}{C_{on-path}}$
• Path Branching Effort: $B = \Pi b_i$
Now we can compute the path effort:
• Path Effort: PE = ΠLE·FO·B

Branching Example 2
Select gate sizes y and z to minimize delay from A to B
Logical Effort: LE = (4/3)^3
Electrical Effort: FO = Cout/Cin = 9
Branching Effort: B = 2•3 = 6
Path Effort: PE = ΠLE·FO·B = 128
Best Stage Effort: SE = PE^(1/3) ≈ 5
Delay: D = 3•5 + 3•2 = 21
Work backward for sizes:
z = 9C•(4/3)/5 = 2.4C
y = 3z•(4/3)/5 = 1.9C

Method of Logical Effort
- Compute the path effort: PE = (ΠLE)BF
- Find the best number of stages N ~ log4 PE
- Compute the effective fanout/stage EF = PE^(1/N)
- Sketch the path with this number of stages
- Work from either end, find sizes: Cin = Cout*LE/EF
Reference: Sutherland, Sproull, Harris, "Logical Effort," Morgan-Kaufmann 1999.

Adders

An Intel Microprocessor
[Figure: block diagram of an integer execution unit; labels legible at this point: g64, 9-1 mux, 5-1 mux]
[Figure labels: 2-1 mux, ck1, CARRYGEN, SUMGEN, LU (Logical Unit), SUMSEL, s0, s1, g64, sum, sumb, node1, REG, to Cache; scale bar 1000 um]
Itanium has 6 64-bit integer execution units like this

Bit-Sliced Design
[Figure: bit slices Bit 0 to Bit 3, each containing register, adder, shifter, and multiplexer; control lines run across the slices; Data-In and Data-Out]
Tile identical processing elements

Bit-Sliced Datapath
[Figure: bit slices 0, 1, 2, ... 63. Data comes from register files / cache / bypass and passes through multiplexers, shifter, adder stage 1, wiring, adder stage 2, wiring, adder stage 3, and sum select, then goes to register files / cache; loopback buses run alongside]

Itanium Integer Datapath
[Figure] Fetzer, Orton, ISSCC'02

Data Paths Are Thermal Hogs
[Figure]

Full-Adder
[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout; a "(kill)" annotation also appears]

The Binary Adder
[Figure]

Express Sum and Carry as a function of P, G, K
Define 3 new variables which ONLY depend on A, B:
Generate (G) = AB
Propagate (P) = A ⊕ B
Kill (K) = $\bar{A}\,\bar{B}$
Can also derive expressions for S and Co based on K and P
Note that we will sometimes use an alternate definition for Propagate (P) = A + B

Simplest Adder: Ripple-Carry
[Figure: four full adders (FA) in a chain; inputs A0..A3, B0..B3, Ci,0; outputs S0..S3; carries Co,0 (= Ci,1), Co,1, Co,2, Co,3]
Worst case delay linear with the number of bits
td = O(N)
tadder = (N-1)tcarry + tsum
Goal: Make the fastest possible carry path circuit

Complementary Static CMOS Full Adder: "Direct" Implementation
[Figure: transistor schematic with inputs A, B, Ci, internal node X, supply VDD, and outputs S, Co]
28 Transistors

Complementary Static CMOS Full Adder
[Figure]
28 Transistors
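The generate/propagate/kill definitions and the ripple-carry structure from the preceding slides can be sketched behaviorally. This is a minimal illustration, not the circuit from the lecture: the expressions S = P ⊕ Ci and Co = G + P·Ci are the standard identities (the slide's own equations were not recovered from the extraction), bit lists are assumed to be LSB first, and all function names are hypothetical.

```python
def pgk(a, b):
    """Generate, propagate (XOR definition), and kill for one bit position."""
    g = a & b
    p = a ^ b
    k = (1 - a) & (1 - b)
    return g, p, k

def full_adder(a, b, ci):
    """One-bit full adder written in terms of G and P."""
    g, p, _ = pgk(a, b)
    s = p ^ ci            # sum
    co = g | (p & ci)     # carry out: generated here, or propagated from ci
    return s, co

def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two equal-length LSB-first bit lists; the carry ripples bit to bit."""
    sums, carry = [], ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry

def to_bits(x, n):
    """Integer to an n-entry LSB-first bit list."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def adder_delay(n, t_carry, t_sum):
    """Worst-case ripple-carry delay: (N-1) * tcarry + tsum."""
    return (n - 1) * t_carry + t_sum

if __name__ == "__main__":
    bits, cout = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(bits), cout)
```

The loop in `ripple_carry_add` makes the O(N) dependence visible: each bit's carry must be known before the next bit can finish, which is why the slides single out the carry path for optimization.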
Inversion Property
[Figure: two equivalent full-adder (FA) blocks with inputs A, B, Ci and outputs S, Co; the second has all inputs and outputs inverted]

Minimize Critical Path by Reducing Inverting Stages
[Figure: four-bit ripple-carry chain of FA cells alternating between even cells and odd cells; inputs A0..A3, B0..B3, Ci,0; outputs S0..S3; carries Co,0, Co,1, Co,2, Co,3]
Exploit Inversion
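The inversion property (inverting every input of a full adder inverts both outputs) and its use in an alternating even/odd chain can be checked exhaustively. The sketch below is a behavioral model under stated assumptions: `inverting_cell` is a hypothetical cell that delivers inverted S and Co because its output inverters have been removed, and the even/odd wiring follows the usual reading of the slide figure rather than a recovered schematic.

```python
from itertools import product

def full_adder(a, b, ci):
    """Reference full adder: returns (S, Co)."""
    return a ^ b ^ ci, (a & b) | (ci & (a ^ b))

def inversion_property_holds():
    """Check FA(not a, not b, not ci) == (not S, not Co) for all 8 input cases."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        if full_adder(1 - a, 1 - b, 1 - ci) != (1 - s, 1 - co):
            return False
    return True

def inverting_cell(a, b, ci):
    """Hypothetical cell without output inverters: returns (not S, not Co)."""
    s, co = full_adder(a, b, ci)
    return 1 - s, 1 - co

def alternating_ripple_add(a_bits, b_bits, ci=0):
    """Ripple-carry add (LSB-first bit lists) with no inverter in the carry chain."""
    sums, carry, carry_is_inverted = [], ci, False
    for a, b in zip(a_bits, b_bits):
        if not carry_is_inverted:
            # Even cell: true inputs in, inverted sum and carry out.
            s_n, carry = inverting_cell(a, b, carry)
            sums.append(1 - s_n)   # sum is re-inverted off the carry path
        else:
            # Odd cell: inverted A, B and inverted carry in, true outputs out.
            s, carry = inverting_cell(1 - a, 1 - b, carry)
            sums.append(s)
        carry_is_inverted = not carry_is_inverted
    cout = 1 - carry if carry_is_inverted else carry
    return sums, cout

def to_bits(x, n):
    """Integer to an n-entry LSB-first bit list."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

if __name__ == "__main__":
    print(inversion_property_holds())
    bits, cout = alternating_ripple_add(to_bits(9, 4), to_bits(5, 4))
    print(from_bits(bits), cout)
```

In this model the only inversions left are on the A, B inputs of odd cells and the sum outputs of even cells, both of which sit outside the carry-to-carry path.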