L15: VLSI Integration and Performance Transformations
6.111 Spring 2006, Introductory Digital Systems Laboratory

• Layout 101
• Custom Design/Layout
• The ASIC Approach
• Standard Cell Example
• Standard Cell Layout Methodology
• Verilog to ASIC Layout (the push button approach)
• Macro Modules
• Clock Distribution
• The Power Supply Wires are Not Ideal!
• Analog Circuits: Clock Frequency Multiplication (Phase Locked Loop)
• Scan Testing
• Behavioral Transformations
• Fixed-Coefficient Multiplication
• Transform: Canonical Signed Digits (CSD)
• Algebraic Transformations
• Transforms for Efficient Resource Utilization
• Retiming Example: FIR Filter
• Pipelining, Just Another Transformation (Pipelining = Adding Delays + Retiming)
• The Power of Transforms: Lookahead
• Key Concern in Modern VLSI: Variations!
• Trends: “Chip in a Day” (Matlab/Simulink to Silicon…)

Acknowledgement: Materials in this lecture are courtesy of the following sources and are used with permission.
• J. Rabaey, A. Chandrakasan, B. Nikolic. Digital Integrated Circuits: A Design Perspective. Prentice Hall/Pearson, 2003.
• Curt Schurgers

Layout 101

[Figure: three views of the same gate: layout, circuit representation, and 3-D cross-section. Layout labels: VDD, GND, IN, OUT, metal, poly, n-type well, p-type substrate, n+ diff, p+ diff, metal/pdiff contact, contact from metal to ndiff, Ln, Wn, Lp, Wp. Circuit labels: IN, OUT, VDD, and S/G/D terminals. Used with permission.]
• Follow simple design rules (contract between process and circuit designers)

[Figure: cross-sections of an N-channel MOSFET and a P-channel MOSFET, with n+, p+, n, and SiO2 regions labeled. Figure by MIT OpenCourseWare.]

Custom Design/Layout

• Hand crafting the layout to achieve maximum clock rates (> 1 GHz)
• Exploits regularity in datapath structure to optimize interconnects

[Figure: die photograph of the Itanium integer datapath, illustrating bit-slice design methodology. Labels: bit slice 0, bit slice 1, bit slice 2, ..., bit slice 63; adder stage 1, wiring, adder stage 2, wiring, adder stage 3; sum select; shifter; multiplexers; loopback bus; from register files / cache / bypass; to register files / cache. Schematic labels: 9-1 mux, 5-1 mux, 2-1 mux, ck1, CARRYGEN, SUMGEN + LU (LU: Logical Unit), SUMSEL, REG, node1, to cache, 1000 um. Courtesy Intel, as reprinted in Rabaey, et al. "Digital Integrated Circuits".]

• Itanium has 6 integer execution units like this

The ASIC Approach

Most common design approach for designs up to 500 MHz clock rates.

[Figure: ASIC design flow. Flow stages: Verilog (or VHDL), Logic Synthesis, Floorplanning, Placement, Routing, Tape-out. Verification steps: Pre-Layout Simulation, Circuit Extraction, Post-Layout Simulation. Abstraction levels: Behavioral, Structural, Physical. Annotations: Design Capture, Design Iteration.]

Standard Cell Example

• Each library cell (FF, NAND, NOR, INV, etc.)
  and the variations on size (strength of the gate) is fully characterized across temperature, loading, etc.

[Figure: characterization data for a 3-input NAND cell (from ST Microelectronics), showing the power supply line (VDD), the ground supply line (GND), and a table of delay in ns, where C = load capacitance and T = input rise/fall time.]

Standard Cell Layout Methodology

[Figure: standard-cell layouts in a 2-level metal technology and in a current-day technology. Annotation: cell structure hidden under interconnect layers.]

• With limited interconnect layers, dedicated routing channels between rows of standard cells are needed
• Width of the cell allowed to vary to accommodate complexity
• Interconnect plays a significant role in speed of a digital circuit

Verilog to ASIC Layout (the push button approach)

    module adder64 (a, b, sum);
      input [63:0] a, b;
      output [63:0] sum;
      assign sum = a + b;
    endmodule

[Figure: the resulting layout after synthesis, after placement, and after routing.]

Macro Modules

[Figure: 256×32 (or 8192 bit) SRAM generated by hard-macro module generator.]

• Generate highly regular structures (entire memories, multipliers, etc.)
  with a few lines of code
• Verilog models for memories automatically generated based on size

Clock Distribution

For a 1 GHz clock, the skew budget is 100 ps.

Variations along different paths arise from:
• Device: VT, W/L, etc.
• Environment: VDD, °C
• Interconnect: dielectric thickness variation

[Figure: IBM clock routing. Image removed due to copyright restrictions.]
[Figure: clock skew, with D/Q registers shown. Image removed due to copyright restrictions.]

The Power Supply Wires are Not Ideal!

[Figure: a driver and a receiver between pads, connected to the VDD grid and to the ground grid. Labels: Rd, Cd, Ccoup, Cint. Used with permission.]

• The IR-drop problem causes the internal power supply voltage to be less than the external source

Analog Circuits: Clock Frequency Multiplication (Phase Locked Loop)

• VCO produces high frequency square wave
• Divider divides down VCO frequency
• PFD compares phase of ref and div
• Loop filter extracts phase error information

Used widely in digital systems for clock synthesis (a standard IP block in most ASIC flows).

[Figure: PLL block diagram with up/down signals. Courtesy Michael Perrott. Used with permission.]

Scan Testing

[Figure: registers with 0/1 input muxes chained from shift in to shift out, sharing CLK and a Scan/Shift control.]

Idea: have a mode in which all registers are chained into one giant shift register which can be loaded/read out bit-serially. Test the remaining (combinational) logic by:
(1) in “test” mode, shift in new values for all register bits, thus setting up the inputs to the combinational logic
(2) clock the circuit once in “normal” mode, latching the outputs of the combinational logic back into the registers
(3) in “test” mode, shift out the values of all register bits and compare against expected results.

[Figure labels: Clk, ScanShift, Primary. Used with permission.]
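
The three scan-test steps can be sketched in Verilog. This is only an illustration under stated assumptions, not the circuit from the slides: the module name scan_reg, its port names, the 4-bit width, the test pattern, and the incrementer standing in for the combinational logic are all hypothetical choices made for this example.

```verilog
// Hypothetical scan register: WIDTH flip-flops, each with a 2:1 input mux.
// scan_shift = 1 ("test" mode): bits shift serially from scan_in toward scan_out.
// scan_shift = 0 ("normal" mode): registers latch the combinational outputs d.
// Assumes WIDTH >= 2.
module scan_reg #(parameter WIDTH = 4) (
  input  wire             clk,
  input  wire             scan_shift,
  input  wire             scan_in,
  input  wire [WIDTH-1:0] d,
  output reg  [WIDTH-1:0] q,
  output wire             scan_out
);
  assign scan_out = q[WIDTH-1];

  always @(posedge clk) begin
    if (scan_shift)
      q <= {q[WIDTH-2:0], scan_in};  // chain: scan_in -> q[0] -> ... -> q[WIDTH-1]
    else
      q <= d;                        // normal operation
  end
endmodule

// Self-checking testbench that walks through steps (1) to (3).
module scan_tb;
  reg clk = 0, scan_shift = 0, scan_in = 0;
  reg  [3:0] pattern = 4'b0101;     // assumed test vector
  reg  [3:0] captured;
  wire [3:0] q;
  wire       scan_out;
  wire [3:0] d = q + 4'd1;          // stand-in combinational logic (an incrementer)
  integer i;

  scan_reg #(.WIDTH(4)) dut (
    .clk(clk), .scan_shift(scan_shift), .scan_in(scan_in),
    .d(d), .q(q), .scan_out(scan_out)
  );

  // One full clock cycle: rising edge after 5 time units, falling edge 5 later.
  task tick;
    begin
      #5 clk = 1;
      #5 clk = 0;
    end
  endtask

  initial begin
    // (1) test mode: shift the pattern in, MSB first
    scan_shift = 1;
    for (i = 3; i >= 0; i = i - 1) begin
      scan_in = pattern[i];
      tick;
    end

    // (2) normal mode: clock once to latch the combinational outputs
    scan_shift = 0;
    tick;

    // (3) test mode: shift the result out, MSB first, and compare
    scan_shift = 1;
    for (i = 3; i >= 0; i = i - 1) begin
      captured[i] = scan_out;
      tick;
    end

    if (captured === pattern + 4'd1)
      $display("PASS: captured %b", captured);
    else
      $display("FAIL: captured %b", captured);
    $finish;
  end
endmodule
```

In this sketch the only hardware added to each register bit is the input mux selected by scan_shift, which matches the 0/1 muxes in the slide's figure. A real design would normally have the scan chain inserted by the synthesis tools rather than written by hand.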