CS152 Computer Architecture and Engineering
Lecture 4: Performance and The Design Process

Review: Combinational Elements+DeMorgan Equivalence
Review: recall General C/L Cell Delay Model
Storage Element's Timing Model
Critical Path & Cycle Time
The Design Process
Design Process (cont.)
Design Refinement
Design as Search
Measurement and Evaluation
Performance: Two notions of "performance"
What is Time?
How to Measure Time?
Measuring Time using Clock Cycles
Performance Calculation
How Calculate the 3 Components?
Example: Calculating CPI by averaging separately
What Programs Measure for Comparison?
Amdahl's Law: The Law of Diminishing Returns
Administrivia
Computers in the Real World
Problem: Design a "fast" ALU for the MIPS ISA
MIPS ALU requirements
MIPS arithmetic instruction format
Design Trick: divide & conquer
Refined Requirements
Behavioral Representation: verilog
Design Decisions
Refined Diagram: bit-slice ALU
7-to-2 Combinational Logic
Seven plus a MUX ?
Additional operations
Revised Diagram
Overflow
Overflow Detection
Overflow Detection Logic
More Revised Diagram
But What about Performance?
Carry Look Ahead (Design trick: peek)
Plumbing as Carry Lookahead Analogy
Cascaded Carry Look-ahead (16-bit): Abstraction
2nd level Carry, Propagate as Plumbing
Design Trick: Guess (or "Precompute")
Carry Skip Adder: reduce worst case delay
Additional MIPS ALU requirements
Elements of the Design Process
Summary of the Design Process
Why should you keep a design notebook?
Why do we keep it on-line?
How should you do it?
On-line Notebook Example
1st page of On-line notebook (Index + Wed. 9/6/95)
2nd page of On-line notebook (Thursday 9/7/95)
3rd page of On-line notebook (Monday 9/11/95)
4th page of On-line notebook (9/11/95 contd)
5th page of On-line notebook (9/11/95 contd)
Added benefit: cool post-design statistics
Lecture Summary

February 4, 2004
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
2/4/04 ©UCB Spring 2004 CS152 / Kubiatowicz Lec4.2

Review: Combinational Elements+DeMorgan Equivalence

NAND Gate: $Out = \overline{A \cdot B} = \overline{A} + \overline{B}$

 A  B | ~A ~B | Out
 0  0 |  1  1 |  1
 0  1 |  1  0 |  1
 1  0 |  0  1 |  1
 1  1 |  0  0 |  0

NOR Gate: $Out = \overline{A + B} = \overline{A} \cdot \overline{B}$

 A  B | ~A ~B | Out
 0  0 |  1  1 |  1
 0  1 |  1  0 |  0
 1  0 |  0  1 |  0
 1  1 |  0  0 |  0

In each table, the ~A ~B columns show the DeMorgan's Theorem equivalent: a NAND is an OR of the inverted inputs, and a NOR is an AND of the inverted inputs.

Wire: $Out = In$

 In | Out
 0  |  0
 1  |  1

Inverter: $Out = \overline{In}$

 In | Out
 0  |  1
 1  |  0

Review: recall General C/L Cell Delay Model

° Combinational Cell (symbol) is fully specified by:
  • functional (input -> output) behavior
    - truth-table, logic equation, VHDL
  • load factor of each input
  • critical propagation delay from each input to each output for each transition
    - T_HL(A, o) = Fixed Internal Delay + Load-dependent-delay x load
° Linear model composes

[Figure: combinational logic cell with inputs A, B, ..., X and output Vout driving load Cout; plot of delay (Va -> Vout) versus Cout, labeled with internal delay, delay per unit load, and Ccritical]

Storage Element's Timing Model

° Setup Time: Input must be stable BEFORE trigger clock edge
° Hold Time: Input must REMAIN stable after trigger clock edge
° Clock-to-Q time:
  • Output cannot change instantaneously at the trigger clock edge
  • Similar to delay in logic gates, two components:
    - Internal Clock-to-Q
    - Load dependent Clock-to-Q

[Figure: timing diagram of a D/Q storage element showing Clk, D (don't care outside the setup/hold window), and Q (unknown during clock-to-Q), with Setup, Hold, and Clock-to-Q intervals marked]

Critical Path & Cycle Time

° Critical path: the slowest path between any two storage devices
° Cycle time is a function of the critical path
° must be greater than: Clock-to-Q + Longest Path through Combinational Logic + Setup

[Figure: storage elements separated by combinational logic, driven by a common Clk]

The Design Process

"To Design Is To Represent"

Design activity yields description/representation of an object
-- Traditional craftsman does not distinguish between the conceptualization and the artifact
-- Separation comes about because of complexity
-- The concept is captured in one or more representation languages
-- This process IS design

Design Begins With Requirements
-- Functional Capabilities: what it will do
-- Performance Characteristics: Speed, Power, Area, Cost, . . .

Design Process (cont.)

Design Finishes As Assembly
-- Design understood in terms of components and how they have been assembled
-- Top-down decomposition of complex functions (behaviors) into more primitive functions
-- Bottom-up composition of primitive building blocks into more complex assemblies

[Figure: component hierarchy with CPU at the top, then Datapath and Control, then ALU, Regs, and Shifter, down to Nand Gate]

Design is a "creative process," not a simple method

Design Refinement

Informal System Requirement
Initial Specification
Intermediate Specification
Final Architectural Description
Intermediate Specification of Implementation
Final Internal Specification
Physical Implementation

(Each step is a refinement: increasing level of detail.)

Design as Search

Design involves educated guesses and verification
-- Given the goals, how should these be prioritized?
-- Given alternative design pieces, which should be selected?
-- Given design space of components & assemblies, which part will yield the best solution?

Feasible (good) choices vs. Optimal choices

[Figure: search tree from Problem A to Strategy 1 and Strategy 2, then SubProb 1, SubProb 2, SubProb 3, down to building blocks BB1, BB2, BB3, ..., BBn]

Measurement and Evaluation

Architecture is an iterative process
-- searching the space of possible designs
-- at all levels of computer systems

[Figure: Design/Analysis loop in which Creativity feeds ideas into Cost/Performance Analysis, which sorts them into Good Ideas, Mediocre Ideas, and Bad Ideas]

Performance: Two notions of "performance"

 Plane              Speed      DC to Paris   Passengers   Throughput (pmph)
 Boeing 747         610 mph    6.5 hours     470          286,700
 BAC/Sud Concorde   1350 mph   3 hours       132          178,200

Which has higher performance?

° Time to do the task (Execution Time)
  – execution time, response time, latency
° Tasks per day, hour, week, sec, ns, ... (Performance)
  – throughput, bandwidth

Response time and throughput often are in opposition

What is Time?

° Straightforward definition of time:
  • Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ...
  • "real time", "response time" or "elapsed time"
° Alternative: just time processor (CPU) is working only on your
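
As a worked illustration of two calculations from the slides above, the following minimal Python sketch recomputes the throughput column of the Boeing 747 / Concorde table (passengers times speed, in passenger-miles per hour) and evaluates the cycle-time lower bound Clock-to-Q + longest combinational path + Setup. The names `Plane`, `throughput_pmph`, and `min_cycle_time` are hypothetical helpers introduced here, and the nanosecond delays passed to `min_cycle_time` are made-up values, not figures from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    name: str
    speed_mph: float    # speed from the table
    trip_hours: float   # DC-to-Paris time from the table
    passengers: int     # passenger capacity from the table


def throughput_pmph(plane: Plane) -> float:
    """Passenger-miles per hour: passengers carried times speed."""
    return plane.passengers * plane.speed_mph


def min_cycle_time(clk_to_q: float, longest_logic_path: float, setup: float) -> float:
    """Lower bound on the clock period: Clock-to-Q + longest logic path + Setup."""
    return clk_to_q + longest_logic_path + setup


# Values copied from the plane comparison table.
PLANES = [
    Plane("Boeing 747", 610, 6.5, 470),
    Plane("Concorde", 1350, 3.0, 132),
]

# Response time: the shortest trip wins. Throughput: the most pmph wins.
fastest = min(PLANES, key=lambda p: p.trip_hours)
highest_throughput = max(PLANES, key=throughput_pmph)

if __name__ == "__main__":
    for p in PLANES:
        print(f"{p.name}: {p.trip_hours} h, {throughput_pmph(p):,.0f} pmph")
    print("lowest response time:", fastest.name)
    print("highest throughput:", highest_throughput.name)
    # Hypothetical delays in ns (assumed for illustration only).
    print("min cycle time (ns):", min_cycle_time(1.0, 7.0, 0.5))
```

The two planes each win on one metric, which is the point of the slide: the answer to "which has higher performance?" depends on whether response time or throughput is being measured.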