Berkeley COMPSCI 152 - Lecture 3 Performance, Technology & Delay Modeling


Slide titles:
  CS152 Computer Architecture and Engineering Lecture 3 Performance, Technology & Delay Modeling
  Outline of Today’s Lecture
  Summary: Instruction set design (MIPS)
  Summary: Salient features of MIPS I
  Details of the MIPS instruction set
  Calls: Why Are Stacks So Great?
  Memory Stacks
  Call-Return Linkage: Stack Frames
  MIPS: Software conventions for Registers
  MIPS / GCC Calling Conventions
  Delayed Branches
  Branch & Pipelines
  Performance
  Two notions of “performance”
  Definitions
  Example
  Basis of Evaluation
  SPEC95
  Metrics of performance
  Aspects of CPU Performance
  Slide 21
  CPI
  Amdahl's Law
  Example (RISC processor)
  Evaluating Instruction Sets?
  Administrative Matters
  Performance and Technology Trends
  Range of Design Styles
  Basic Technology: CMOS
  Basic Components: CMOS Inverter
  Basic Components: CMOS Logic Gates
  Gate Comparison
  Ideal versus Reality
  Fluid Timing Model
  Series Connection
  Review: Calculating Delays
  Review: General C/L Cell Delay Model
  Characterize a Gate
  A Specific Example: 2 to 1 MUX
  2 to 1 MUX: Internal Delay Calculation
  2 to 1 MUX: Internal Delay Calculation (continue)
  Abstraction: 2 to 1 MUX
  Break (5 Minutes)
  CS152 Logic Elements
  Storage Element’s Timing Model
  Clocking Methodology
  Critical Path & Cycle Time
  Clock Skew’s Effect on Cycle Time
  Tricks to Reduce Cycle Time
  How to Avoid Hold Time Violation?
  Clock Skew’s Effect on Hold Time
  Summary
  To Get More Information

1/27/99 ©UCB Spring 1999 CS152 / Kubiatowicz Lec3.1

CS152
Computer Architecture and Engineering
Lecture 3
Performance, Technology & Delay Modeling
Jan 27, 1999
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/

Outline of Today’s Lecture
° Review: Finish ISA/MIPS details (10 minutes)
° Performance and Technology (15 minutes)
° Administrative Matters and Questions (2 minutes)
° Delay Modeling and Gate Characterization (20 minutes)
° Questions and Break (5 minutes)
° Clocking Methodologies and Timing Considerations (25 minutes)
Summary: Instruction set design (MIPS)
° Use general purpose registers with a load-store architecture: YES
° Provide at least 16 general purpose registers plus separate floating-point registers: 31 GPR & 32 FPR
° Support basic addressing modes: displacement (with address offset of 12 to 16 bits), immediate (size 8 to 16 bits), and register deferred: YES: 16 bit immediate, displacement (disp=0 => register deferred)
° All addressing modes apply to all data transfer instructions: YES
° Use fixed instruction encoding if interested in performance and use variable instruction encoding if interested in code size: Fixed
° Support these data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit IEEE 754 floating point numbers: YES
° Support most common instructions, since they will dominate: load, store, add, subtract, move register-register, and, shift, compare equal, compare not equal, branch (with a PC-relative address at least 8-bits long), jump, call, and return: YES, 16b relative address
° Aim for a minimalist instruction set: YES

Summary: Salient features of MIPS I
• 32-bit fixed format inst (3 formats)
• 32 32-bit GPR (R0 contains zero) and 32 FP registers (+ HI LO)
  – partitioned by software convention
• 3-address, reg-reg arithmetic instr.
• Single address mode for load/store: base+displacement
  – no indirection, scaled
• 16-bit immediate plus LUI
• Simple branch conditions
  – compare against zero or two registers for =, ≠
  – no integer condition codes
• Support for 8bit, 16bit, and 32bit integers
• Support for 32bit and 64bit floating point.

Details of the MIPS instruction set
° Register zero always has the value zero (even if you try to write it)
° Branch/jump and link put the return address PC+4 into the link register (R31)
° All instructions change all 32 bits of the destination register (including lui, lb, lh) and all read all 32 bits of sources (add, sub, and, or, …)
° Immediate arithmetic and logical instructions are extended as follows:
  • logical immediate ops are zero extended to 32 bits
  • arithmetic immediate ops are sign extended to 32 bits (including addiu)
° The data loaded by the byte and halfword load instructions are extended as follows:
  • lbu, lhu are zero extended
  • lb, lh are sign extended
° Overflow can occur in these arithmetic and logical instructions:
  • add, sub, addi
  • it cannot occur in addu, subu, addiu, and, or, xor, nor, shifts, mult, multu, div, divu

Calls: Why Are Stacks So Great?
Stacking of Subroutine Calls & Returns and Environments:
  A:  CALL B
  B:  CALL C
      RET
  C:  RET
  Stack contents over time: A | A B | A B C | A B | A
Some machines provide a memory stack as part of the architecture (e.g., VAX)
Sometimes stacks are implemented via software convention (e.g., MIPS)

Memory Stacks
Useful for stacked environments/subroutine call & return even if operand stack not part of architecture
Stacks that Grow Up vs. Stacks that Grow Down:
[Figure: memory addresses running from 0 (little) to inf. (big), holding items a, b, c; one stack grows up, the other grows down; SP points at either the next empty or the last full location]
How is empty stack represented?
Consider case of stack growing down (MIPS)
Last Full
  POP:  Read from Mem(SP); Increment SP
  PUSH: Decrement SP; Write to Mem(SP)
Next Empty
  POP:  Increment SP; Read from Mem(SP)
  PUSH: Write to Mem(SP); Decrement SP

Call-Return Linkage: Stack Frames
[Figure: stack frame between FP and SP, drawn from High Mem to Low Mem: ARGS; Callee Save Registers (old FP, RA); Local Variables. Annotations: reference args and local variables at fixed (positive) offset from FP; the SP end grows and shrinks during expression evaluation]
° Many variations on stacks possible (up/down, last pushed / next)
° Compilers normally keep scalar variables in registers, not memory!

MIPS: Software conventions for Registers
   0  zero  constant 0
   1  at    reserved for assembler
   2  v0    expression evaluation &
   3  v1    function results
   4  a0    arguments
   5  a1
   6  a2
   7  a3
   8  t0    temporary: caller saves
  . . .     (callee can clobber)
  15  t7
  16  s0    callee saves
  . . .     (callee must save)
  23  s7
  24  t8    temporary (cont’d)
  25  t9
  26  k0    reserved for OS kernel
  27  k1
  28  gp    Pointer to global area
  29  sp    Stack pointer
  30  fp    frame pointer
  31  ra    Return Address (HW)

MIPS / GCC Calling Conventions
[Figure labels: FP, SP]
fact:
    addiu $sp, $sp, -32
    sw $ra, 20($sp)
    sw $fp,
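
The two SP conventions from the Memory Stacks slide (SP pointing at the last full slot versus the next empty slot, for a stack that grows down) can be sketched in a few lines of Python. This is a minimal illustration and not part of the lecture: the class name `DownwardStack`, the `last_full` flag, and the use of list indices in place of byte addresses are all assumptions made for the example.

```python
class DownwardStack:
    """Toy model of a memory stack that grows toward lower addresses.

    Hypothetical helper for illustration: addresses are list indices,
    so "increment SP" means +1 slot rather than +4 bytes.
    """

    def __init__(self, size, last_full=True):
        self.mem = [0] * size
        self.size = size
        self.last_full = last_full
        # Last full:  empty stack has SP one slot past the highest address.
        # Next empty: empty stack has SP at the highest address itself.
        self.sp = size if last_full else size - 1

    def push(self, value):
        if self.last_full:
            if self.sp == 0:
                raise OverflowError("stack full")
            self.sp -= 1               # PUSH: decrement SP ...
            self.mem[self.sp] = value  # ... then write to Mem(SP)
        else:
            if self.sp < 0:
                raise OverflowError("stack full")
            self.mem[self.sp] = value  # PUSH: write to Mem(SP) ...
            self.sp -= 1               # ... then decrement SP

    def pop(self):
        if self.last_full:
            if self.sp == self.size:
                raise IndexError("stack empty")
            value = self.mem[self.sp]  # POP: read from Mem(SP) ...
            self.sp += 1               # ... then increment SP
        else:
            if self.sp == self.size - 1:
                raise IndexError("stack empty")
            self.sp += 1               # POP: increment SP ...
            value = self.mem[self.sp]  # ... then read from Mem(SP)
        return value
```

Both conventions return values in last-in, first-out order; they differ only in where SP rests between operations, which is why the empty-stack value of SP differs by one slot.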

