CS152 Computer Architecture and Engineering
Lecture 4: Cost and Design
September 12, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
9/12/01, UCB Fall 2001, CS152 / Kubiatowicz, Lec4.1

Review: Performance and Technology Trends
[Figure: performance (log scale) vs. year, 1965 to 2000, with curves for supercomputers, mainframes, minicomputers, and microprocessors]
- Technology power: 1.2 x 1.2 x 1.2 = 1.7 x / year
  - Feature size: shrinks 10% / yr, so switching speed improves 1.2 / yr
  - Density: improves 1.2x / yr
  - Die area: 1.2x / yr
- RISC lesson is to keep the ISA as simple as possible:
  - Shorter design cycle => fully exploit the advancing technology (~3 yr)
  - Advanced branch prediction and pipeline techniques
  - Bigger and more sophisticated on-chip caches

Review: General C/L Cell Delay Model
[Figure: combinational logic cell with inputs A, B, ..., X driving load Cout; plot of delay vs. Cout with intercept "internal delay", slope "delay per unit load", and Ccritical marked]
- A combinational cell (symbol) is fully specified by:
  - functional (input -> output) behavior: truth table, logic equation, VHDL
  - load factor of each input
  - critical propagation delay from each input to each output for each transition:
    THL(A, o) = Fixed Internal Delay + Load-dependent delay x load
- Linear model composes

Review: Characterize a Gate
- Input capacitance for each input
- For each input-to-output path:
  - For each output transition type (H->L, L->H, H->Z, L->Z, etc.):
    - Internal delay (ns)
    - Load-dependent delay (ns / fF)
- Example: 2-input NAND gate
  - For A and B: Input Load (I.L.) = 61 fF
  - For either A -> Out or B -> Out:
    Tlh = 0.5 ns, Tlhf = 0.0021 ns / fF
    Thl = 0.1 ns, Thlf = 0.0020 ns / fF
[Figure: delay A -> Out for Out: Low -> High, plotted against Cout; intercept 0.5 ns, slope 0.0021 ns / fF]

Review: More complicated gates (2 to 1 MUX)
Y = (A and !S) or (B and S)
[Figure: 2 x 1 Mux built from an inverter and three NAND gates. S drives the inverter, whose output (Wire 0) feeds Gate 1 together with A; Gate 2 takes B and S; Wire 1 and Wire 2 connect Gate 1 and Gate 2 to Gate 3, which drives Y]
- Three components: input load, load-dependent delay, internal delays (one for each input path / output transition)
- Input Load: A = 61 fF, B = 61 fF, S = 111 fF
- Load-Dependent Delay:
  TAYlhf = 0.0021 ns / fF, TBYlhf = 0.0021 ns / fF, TSYlhf = 0.0021 ns / fF
  TAYhlf = 0.0020 ns / fF, TBYhlf = 0.0020 ns / fF, TSYhlf = 0.0020 ns / fF
- Internal Delay: TAYlh = 0.844 ns, TBYlh = 0.844 ns
- Fun exercises: TAYhl, TBYhl, TSYlh, TSYhl
- How do we compute these numbers?

2 to 1 MUX: Input Load and Load-Dependent Delay
- Input Load (I.L.):
  - A, B: I.L.(NAND) = 61 fF
  - S: I.L.(INV) + I.L.(NAND) = 50 fF + 61 fF = 111 fF
- Load-Dependent Delay (L.D.D.): same as Gate 3
  TAYlhf = 0.0021 ns / fF, TBYlhf = 0.0021 ns / fF, TSYlhf = 0.0021 ns / fF
  TAYhlf = 0.0020 ns / fF, TBYhlf = 0.0020 ns / fF, TSYhlf = 0.0020 ns / fF

2 to 1 MUX: Internal Delay Calculation
- Internal Delay (I.D.):
  - A to Y: I.D. G1 + (Wire 1 C + G3 Input C) x L.D.D. G1 + I.D. G3
  - B to Y: I.D. G2 + (Wire 2 C + G3 Input C) x L.D.D. G2 + I.D. G3
  - S to Y (worst case): I.D. Inv + (Wire 0 C + G1 Input C) x L.D.D. Inv + Internal Delay A to Y
- We can approximate the effect of "Wire 1 C" by assuming Wire 1 has the same C as all the gate C attached to it.
- Specific example:
  TAYlh = TPhl G1 + (2.0 x 61 fF) x TPhlf G1 + TPlh G3
        = 0.1 ns + 122 fF x 0.0020 ns / fF + 0.5 ns
        = 0.844 ns

CS152 Logic Elements
- NAND2, NAND3, NAND4
- NOR2, NOR3, NOR4
- INV1x (normal inverter)
- INV4x (inverter with large output drive)
- XOR2
- XNOR2
- PWR: source of 1's
- GND: source of 0's
- fast MUXes
- D flip-flop, negative-edge triggered

Storage Element's Timing Model
[Figure: timing diagram of Clk, D, and Q showing the setup and hold windows around the clock edge, "don't care" regions on D, and the clock-to-Q interval during which Q is unknown]
- Setup time: input must be stable BEFORE the trigger clock edge
- Hold time: input must REMAIN stable after the trigger clock edge
- Clock-to-Q time:
  - Output cannot change instantaneously at the trigger clock edge
  - Similar to delay in logic gates, two components: internal clock-to-Q and load-dependent clock-to-Q
- Typical for class: 1 ns setup, 0.5 ns hold

Clocking Methodology
[Figure: registers clocked by Clk on either side of a block of combination logic]
- All storage elements are clocked by the same clock edge
- The combination logic block's:
  - Inputs are updated at each clock tick
  - All outputs MUST be stable before the next clock tick

Critical Path and Cycle Time
- Critical path: the slowest path between any two storage devices
- Cycle time is a function of the critical path and must be greater than:
  Clock-to-Q + Longest Path through Combination Logic + Setup

Clock Skew's Effect on Cycle Time
[Figure: input register clocked by Clk1, combination logic, output register clocked by Clk2; waveforms show the clock skew between Clk1 and Clk2]
- The worst-case scenario for cycle time consideration:
  - The input register sees CLK1
  - The output register sees CLK2
- Cycle Time - Clock Skew >= CLK-to-Q + Longest Delay + Setup
  => Cycle Time >= CLK-to-Q + Longest Delay + Setup + Clock Skew

Tricks to Reduce Cycle Time
- Reduce the number of gate levels
[Figure: the same function of A, B, C, D drawn with fewer levels of logic]
- Use esoteric / dynamic timing methods
- Pay attention to loading:
  - One gate driving many gates is a bad idea
  - Avoid using a small gate to drive a long wire
  - Use multiple stages to drive a large load
[Figure: INV4x stages driving a large load Clarge]

How to Avoid Hold Time Violation?
[Figure: registers clocked by Clk on either side of a block of combination logic]
- Hold time requirement: input to register must NOT change immediately after the clock tick
- This is usually easy to meet in the "edge trigger" clocking scheme
- Hold time of most FFs is 0 ns
- CLK-to-Q + Shortest Delay Path must be greater than Hold Time

Clock Skew's Effect on Hold Time
[Figure: input register clocked by Clk1, combination logic, output register clocked by Clk2; waveforms show the clock skew between Clk1 and Clk2]
- The worst-case scenario for hold time consideration:
  - The input register sees CLK2
  - The output register sees …

Integrated Circuit Costs
- Die cost = Wafer cost / (Dies per wafer x Die yield)
- Dies per wafer = pi x (Wafer diam / 2)^2 / Die Area - pi x Wafer diam / sqrt(2 x Die Area) - Test dies
                 ~ Wafer Area / Die Area
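The linear delay model and the 2 to 1 MUX internal delay calculation can be checked with a short script. This is only a sketch under stated assumptions: the NAND figures (61 fF input load, 0.5 ns and 0.1 ns internal delays, 0.0021 and 0.0020 ns/fF load-dependent delays) come from the slides, while the names `Transition`, `NAND2_LH`, `NAND2_HL`, and `mux_a_to_y_lh` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One output transition of a cell in the linear delay model."""
    internal_ns: float  # fixed internal delay, in ns
    per_ff_ns: float    # load-dependent delay, in ns per fF

    def delay(self, load_ff: float) -> float:
        # delay = fixed internal delay + load-dependent delay x load
        return self.internal_ns + self.per_ff_ns * load_ff


# 2-input NAND figures quoted in the slides.
NAND2_INPUT_LOAD_FF = 61.0
NAND2_LH = Transition(internal_ns=0.5, per_ff_ns=0.0021)  # output low -> high
NAND2_HL = Transition(internal_ns=0.1, per_ff_ns=0.0020)  # output high -> low


def mux_a_to_y_lh() -> float:
    """Internal delay TAYlh of the MUX: A rises, G1 falls, G3 rises."""
    # Slide approximation: Wire 1 has the same capacitance as the gate
    # input attached to it, so G1 sees 2.0 x 61 fF.
    g1_load_ff = 2.0 * NAND2_INPUT_LOAD_FF
    # G1 is charged with its loaded high -> low delay; G3 contributes only
    # its internal low -> high delay, because the load on Y is accounted
    # for separately by the MUX's own load-dependent delay.
    return NAND2_HL.delay(g1_load_ff) + NAND2_LH.internal_ns


if __name__ == "__main__":
    print(round(mux_a_to_y_lh(), 3))  # 0.844
```

Splitting each transition into an internal term and a per-fF term is what lets the model compose: the MUX as a whole is again described by one internal delay per path plus the load-dependent delay of its output gate.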
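The cycle-time and hold-time rules from the slides reduce to two inequalities, sketched below. The function names and the numbers in the usage lines are hypothetical, chosen only to exercise the formulas; the 1 ns setup and 0.5 ns hold values are the "typical for class" figures from the slides. Clock skew is applied only to the cycle-time bound, because the source text for the skew-adjusted hold condition is cut off.

```python
def min_cycle_time(clk_to_q: float, longest_delay: float,
                   setup: float, skew: float = 0.0) -> float:
    """Lower bound on cycle time (all arguments in ns).

    Cycle Time >= CLK-to-Q + Longest Delay + Setup + Clock Skew
    """
    return clk_to_q + longest_delay + setup + skew


def hold_ok(clk_to_q: float, shortest_delay: float, hold: float) -> bool:
    """True when CLK-to-Q + Shortest Delay Path is greater than Hold Time."""
    return clk_to_q + shortest_delay > hold


if __name__ == "__main__":
    # Hypothetical register and logic delays; setup and hold are the
    # class-typical values from the slides.
    print(min_cycle_time(clk_to_q=0.5, longest_delay=3.0, setup=1.0, skew=0.25))
    print(hold_ok(clk_to_q=0.5, shortest_delay=0.25, hold=0.5))
```

The two checks look at opposite ends of the logic: the cycle-time bound is set by the longest combinational path, while the hold check depends only on the shortest one.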