Where are we?
• Subsystem Design
  • Registers and Register Files
  • Adders and ALUs
  • Simple ripple carry addition
  • Transistor schematics
  • Faster addition
  • Logic generation
  • How it fits into the datapath
• Data Path Design
  • Block-diagram style data path description

Bit Slice Design
• Tile identical processing elements
[Figure: bit slices Bit 3, Bit 2, Bit 1, Bit 0; blocks Register, Adder, Shifter, Multiplexer; signals Control, Data-In, Data-Out]

Layout Reality

Bit Slice Plan
• Recall planning a DFF to make a register
• Inputs on top in M2
• Outputs on bottom in M2
• Clock and Clock-bar routed horizontally in M1
[Figure: register cells with horizontal lines Vdd, Cb, C, Vss and per-bit pins D0/Q0/Qb0, D1/Q1/Qb1, D2/Q2/Qb2]

Bit Slice Plan
• Now extend this to a register file
• D inputs go to all cells
• Can select one register for writing by controlling the clock
• Q outputs go all the way through the register file
• Each cell can drive Q from an enabled inverter
• Now you can select one register for reading by selecting which cell is driving its output
[Figure: register-file cells with Cb, C, and En lines and per-bit pins D0/Q0, D1/Q1, D2/Q2]

Bit Slice Plan
[Figure: register file with repeated Cb, C, and En lines; bit columns D0/Q0, D1/Q1, D2/Q2]

Multi-Port Register
[Figure: register cell with read enables Re0 and Re1]

Bit Slice Design
• Where are power lines?
• Basic Comb scheme
[Figure: bit-slice diagram as above]

Chip-Wide View of Power
• Power Routing is a global chip-wide issue
• Here’s another approach
• Note the Vdd and Gnd pads
• Global rings with combs for regions of the chip

Core power routing

Chip-Wide View of Power
• Another view of the same issue
• Watch out for routing blockages!

A Tweak on the Scheme
• Same basic scheme
• But with no internal jumpers
• Jumpers are restricted to outer loops

Adders Etc.
• Check out Chapter 10 in your text

Basic Addition: Full Adder
[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout; table with "kill" entries]

Boolean Equations
[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout]

A Direct Implementation
• Fig 10.3 in your text…
• 32 transistors

Use the Factored Equations
• Fully static, complex gate implementation
• 28 Transistors
[Figure: transistor schematic with inputs A, B, Ci, internal node X, and outputs S, Co]

Getting Rid of Inverters
• Can improve performance by removing inverters from carry chain
• Exploit Inversion Property
• Note: need 2 different types of cells
[Figure: chain of four FA’ cells with inputs A0 to A3, B0 to B3, Ci,0, outputs S0 to S3, carries Co,0 to Co,3; cells labeled Even Cell and Odd Cell]

A Better Static Gate
• Combine gates and reuse subterms
• Sometimes called a “mirror adder”

Mirror Adder Considerations
• Feed the Carry-In to the inner inputs so the internal capacitance is already discharged
• Make all transistors whose gates are connected to Cin and carry logic minimum size; this minimizes branching effort on the critical path (carry out)
• Determine gate widths by Logical Effort: reduce effort from C to CoutB at the expense of Sum
• Use relatively large transistors on the critical path so that stray wiring cap is a small fraction of overall cap

Adder Layout
• Examples from Weste and Eshraghian
• “Standard Cell” vs. “Datapath”
• Definitely worth looking at carefully

Datapath Layout
• A little tricky to figure out
• You may not want to use this exact layout, but it might give you ideas
• Start by identifying vdd and gnd paths
• Think about rotating it counterclockwise
• Think about a taller circuit that matches the bit-pitch of your register…

Example Datapath Layout

Addition and Subtraction
• Remember back to your logic design class
• Add the two’s complement to subtract
• Take two’s complement by inverting all the bits and adding one
• Use the carry-in to add one
• Use an XOR to invert or not

  A  B  Out
  0  0  0
  0  1  1
  1  0  1
  1  1  0

Two’s Complement Add/Sub

Aside: XOR Gates
• Slightly tricky gate, ~AB + A~B
• Lots of different schematics…

Another XOR gate
• Not too bad if you already have A, ~A, B, ~B floating around
• If not, you’ll need a couple of inverters too…
[Figure: XOR and XNOR schematics built from A, ~A, B, ~B]

Yet Another XOR Gate
• DCVSL (section 6.2.3 in your text)
• Differential Cascode Voltage Switch Logic
• Make sure that the combinational pull-down networks are complementary
[Figure: DCVSL gate with differential inputs, pull-down networks PDN1 and PDN2, outputs Out and ~Out]

DCVSL XOR/XNOR
• Generates both XOR/XNOR
• Still static, but might be slower than others
[Figure: DCVSL XOR/XNOR with inputs A, ~A, B, ~B and outputs Out, ~Out]

Another DCVSL Example
• Pull-down stacks must be complementary
[Figure: DCVSL gate with inputs A, B, C, D, E and their complements, outputs Out and ~Out]

DCVSL Large XOR
• Four-input XOR, aka odd parity
[Figure: DCVSL gate with inputs A, B, C, D and their complements, outputs Out and ~Out]

Transmission Gate XOR
• Tiny, clever circuit
• If A is high, N1 and P1 act like an inverter
• If A is low, B is passed to the output through the transmission gate

Transmission Gate Adder
• Another Version
[Figure: transmission gate adder with inputs A, B, Ci, propagate signal P, outputs S and Co; sections labeled Setup, Sum Generation, Carry Generation]

Yet Another Version

An Example Layout…
• Not the same style we’re used to seeing…

More Pass Transistors
• Complementary Pass Transistor Logic (CPL)
• Slightly faster, but more area
[Figure: CPL adder with inputs A, B, C and outputs S, Cout]

Speeding Up Addition
• It all comes back to the carry circuit
• Ripple carry delay goes from low-order to high-order bit
• This determines the speed of the addition
• Many, many ways to speed up the carry calculation
• Section 10.2.2 in your text

Carry Lookahead
• Key is that the generate (G) and propagate (P) signals depend ONLY on A and B, not on the carry-in
• Catch is that the gates have large fan-in
• Sum = P ⊕ C(i-1)

Carry Lookahead
• Restated: Ci = Gi + Pi C(i-1)
  C0 = G0 + P0 Cin
  C1 = G1 + P1 C0 = G1 + P1 (G0 + P0 Cin) = G1 + P1 G0 + P1 P0 Cin
  C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 Cin
  C3 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 Cin
• Or C3 = G3 + P3 (G2 + P2 (G1 + P1 (G0 + P0 Cin)))

Carry Lookahead
• The C equations get larger with each stage
• Usually do lookahead in small blocks (i.e., 4 bits) and then combine in a tree
[Figure: Carry Lookahead Logic block fed by inputs A0,B0 through AN-1,BN-1, producing Ci,0/P0 through Ci,N-1/PN-1]

Fast Carry Lookahead Logic
• Pseudo-nMOS
• Uses lots of current!

Another Version
[Figure: carry chain between VDD and ground with P0 to P3, G0 to G3, carry-in Ci,0 and carry-out Co,3]

Another View
[Figure: adder drawn as PG stages; inputs A1 to A4, B1 to B4, Cin; signals P0 to P4, G0 to G4, group generates G0:0 to G3:0; outputs S1 to S4; step 1: Bitwise PG]
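
The carry lookahead equations above can be checked with a small behavioral sketch. This is only an illustration in Python, not a circuit description: the function names (`bitwise_pg`, `lookahead_carries`, `ripple_carries`, `cla_add`) are hypothetical, and it assumes the usual definitions G = A AND B and P = A XOR B, which the slides use but do not spell out. It computes each carry directly from G, P, and Cin, and compares the result with the rippled recurrence Ci = Gi + Pi C(i-1).

```python
from typing import List, Tuple


def bitwise_pg(a_bits: List[int], b_bits: List[int]) -> Tuple[List[int], List[int]]:
    """Per-bit generate and propagate (assumed: G = A AND B, P = A XOR B)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    return g, p


def ripple_carries(g: List[int], p: List[int], cin: int) -> List[int]:
    """Carries from the recurrence Ci = Gi + Pi C(i-1), one stage at a time."""
    carries = []
    c = cin
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        carries.append(c)
    return carries


def lookahead_carries(g: List[int], p: List[int], cin: int) -> List[int]:
    """Carries from the flattened equations, e.g.
    C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 Cin.
    Each Ci uses only G, P, and Cin, never an earlier carry."""
    carries = []
    for i in range(len(g)):
        # Term that carries Cin through: Pi ... P0 Cin
        c = cin
        for j in range(i + 1):
            c &= p[j]
        # Terms Gk P(k+1) ... Pi for k = 0..i
        for k in range(i + 1):
            term = g[k]
            for j in range(k + 1, i + 1):
                term &= p[j]
            c |= term
        carries.append(c)
    return carries


def cla_add(a: int, b: int, cin: int = 0, width: int = 4) -> Tuple[int, int]:
    """Add two width-bit numbers; returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g, p = bitwise_pg(a_bits, b_bits)
    c = lookahead_carries(g, p, cin)
    carry_in = [cin] + c[:-1]          # carry into each bit position
    s_bits = [pi ^ ci for pi, ci in zip(p, carry_in)]  # Sum = P XOR carry-in
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, c[-1]


print(cla_add(9, 7))  # (0, 1): 9 + 7 = 16 overflows four bits
```

The nested loops in `lookahead_carries` grow with the bit position, which mirrors the point that the C equations get larger with each stage and that the gates need large fan-in. That is why the slides suggest small blocks (such as 4 bits) combined in a tree.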