Berkeley COMPSCI 152 - Lecture 3 Performance, Technology & Delay Modeling

CS152 Computer Architecture and Engineering
Lecture 3: Performance, Technology & Delay Modeling

Slide titles in this lecture:
• Review: Salient features of MIPS I
• Review: MIPS Addressing Modes/Instruction Formats
• Review: When does MIPS sign extend?
• Review: Details of the MIPS instruction set
• Calls: Why Are Stacks So Great?
• Memory Stacks
• Call-Return Linkage: Stack Frames
• MIPS: Software conventions for Registers
• MIPS / GCC Calling Conventions
• Delayed Branches
• Branch & Pipelines
• Performance
• Two notions of “performance”
• Definitions
• Example
• Basis of Evaluation
• SPEC95
• Metrics of performance
• Aspects of CPU Performance
• CPI
• Amdahl's Law
• Example (RISC processor)
• Evaluating Instruction Sets?
• Administrative Matters
• Finite State Machines: Implementation as Combinational logic + Latch
• Performance and Technology Trends
• Range of Design Styles
• Basic Technology: CMOS
• Basic Components: CMOS Inverter
• Basic Components: CMOS Logic Gates
• Gate Comparison
• Ideal versus Reality
• Fluid Timing Model
• Series Connection
• Review: Calculating Delays
• Review: General C/L Cell Delay Model
• Characterize a Gate
• A Specific Example: 2 to 1 MUX
• 2 to 1 MUX: Internal Delay Calculation
• 2 to 1 MUX: Internal Delay Calculation (continue)
• Abstraction: 2 to 1 MUX
• CS152 Logic Elements
• Storage Element’s Timing Model
• Clocking Methodology
• Critical Path & Cycle Time
• Clock Skew’s Effect on Cycle Time
• Tricks to Reduce Cycle Time
• How to Avoid Hold Time Violation?
• Clock Skew’s Effect on Hold Time
• Summary

CS152
Computer Architecture and Engineering
Lecture 3
Performance, Technology & Delay Modeling
September 5, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
9/5/01 ©UCB Fall 2001, CS152 / Kubiatowicz

Review: Salient features of MIPS I

• 32-bit fixed format inst (3 formats)
• 32 32-bit GPR (R0 contains zero) and 32 FP registers (+ HI LO)
  – partitioned by software convention
• 3-address, reg-reg arithmetic instr.
• Single address mode for load/store: base+displacement
  – no indirection, scaled
• 16-bit immediate plus LUI
• Simple branch conditions
  – compare against zero or two registers for =, ≠
  – no integer condition codes
• Support for 8bit, 16bit, and 32bit integers
• Support for 32bit and 64bit floating point.

Review: MIPS Addressing Modes/Instruction Formats

[Figure: instruction field layout for each addressing mode]
  Register (direct):  op rs rt rd     operand is a register
  Immediate:          op rs rt immed  operand is the immed field
  Base+index:         op rs rt immed  register + immed addresses Memory
  PC-relative:        op rs rt immed  PC + immed addresses Memory

• All instructions 32 bits wide

Review: When does MIPS sign extend?

• When value is sign extended, copy upper bit to full value.
  Examples of sign extending 8 bits to 16 bits:
    00001010 -> 00000000 00001010
    10001100 -> 11111111 10001100
• When is an immediate value sign extended?
  – Arithmetic instructions (add, sub, etc.) sign extend immediates even for the unsigned versions of the instructions!
  – Logical instructions do not sign extend
      addi $r2, $r3, -1 has 0xFFFF in immediate field and will extend to 0xFFFFFFFF before adding
      andi $r2, $r3, -1 has 0xFFFF in immediate field and will extend to 0x0000FFFF before anding
  – Kinda weird to put negative numbers in logical instructions

Review: Details of the MIPS instruction set

• Register zero always has the value zero (even if you try to write it)
• Branch/jump and link put the return addr. PC+4 into the link register (R31), also called “ra”
• All instructions change all 32 bits of the destination register (including lui, lb, lh) and all read all 32 bits of sources (add, and, …)
• The difference between signed and unsigned versions:
  – For add and subtract: signed causes exception on overflow
    » No difference in sign-extension behavior!
  – For multiply and divide, distinguishes type of operation
• Thus, overflow can occur in these arithmetic and logical instructions:
  – add, sub, addi
  – it cannot occur in addu, subu, addiu, and, or, xor, nor, shifts, mult, multu, div, divu
• Immediate arithmetic and logical instructions are extended as follows:
  – logical immediate ops are zero extended to 32 bits
  – arithmetic immediate ops are sign extended to 32 bits (including addiu)
• The data loaded by the instructions lb and lh are extended as follows:
  – lbu, lhu are zero extended
  – lb, lh are sign extended

Calls: Why Are Stacks So Great?

Stacking of Subroutine Calls & Returns and Environments:

  A: CALL B
  B: CALL C
  C: RET
     RET

  Stack contents over time: A, then A B, then A B C, then A B, then A

Some machines provide a memory stack as part of the architecture (e.g., VAX)
Sometimes stacks are implemented via software convention (e.g., MIPS)

Memory Stacks

Useful for stacked environments/subroutine call & return even if operand stack not part of architecture

Stacks that Grow Up vs. Stacks that Grow Down:

[Figure: memory addresses from 0 (Little) to inf. (Big); a stack holding a, b, c drawn once growing up and once growing down; SP points either at the Next Empty or at the Last Full location]

How is empty stack represented?

Big -> Little: Last Full
  POP:  Read from Mem(SP); Increment SP
  PUSH: Decrement SP; Write to Mem(SP)

Big -> Little: Next Empty
  POP:  Increment SP; Read from Mem(SP)
  PUSH: Write to Mem(SP); Decrement SP

Call-Return Linkage: Stack Frames

[Figure: stack frame layout from High Mem to Low Mem: ARGS (FP points here), Callee Save Registers (old FP, RA), Local Variables, then the region ending at SP]
  – Reference args and local variables at fixed (positive) offset from FP
  – The region at SP grows and shrinks during expression evaluation

• Many variations on stacks possible (up/down, last pushed / next )
• Compilers normally keep scalar variables in registers, not memory!

MIPS: Software conventions for Registers

   0  zero  constant 0
   1  at    reserved for assembler
   2  v0    expression evaluation &
   3  v1    function results
   4  a0    arguments
   5  a1
   6  a2
   7  a3
   8  t0    temporary: caller saves
  . . .     (callee can clobber)
  15  t7
  16  s0    callee saves
  . . .     (callee must save)
  23  s7
  24  t8    temporary (cont’d)
  25  t9
  26  k0    reserved for OS kernel
  27  k1
  28  gp    Pointer to global area
  29  sp    Stack pointer
  30  fp    frame pointer
  31  ra    Return Address (HW)

MIPS / GCC Calling Conventions

  fact:
      addiu $sp, $sp, -32
      sw    $ra, 20($sp)
      sw    $fp, 16($sp)
      addiu $fp, $sp, 32
      . . .
      sw    $a0, 0($fp)
      ...
      lw    $ra, 20($sp)
      lw    $fp, 16($sp)
      addiu $sp, $sp, 32
      jr    $ra

[Figure: the stack at three points in the call (before the prologue, inside the body, after the epilogue), showing where FP and SP point and where ra and old FP are saved; low address at the bottom]

First four arguments passed in registers
Result passed in $v0/$v1
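
The two push/pop orderings on the Memory Stacks slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DownStack`, the `last_full` flag, and the word-addressed dictionary standing in for memory are all hypothetical and not part of the lecture. Both variants grow from big addresses toward little ones.

```python
class DownStack:
    """Stack growing toward lower addresses (word-addressed, illustrative only)."""

    def __init__(self, top, last_full=True):
        self.mem = {}               # hypothetical memory: address -> value
        self.base = top             # SP value of an empty stack
        self.sp = top
        self.last_full = last_full  # True: SP points at last full slot

    def is_empty(self):
        # In both conventions the stack is empty when SP is back at its start.
        return self.sp == self.base

    def push(self, value):
        if self.last_full:
            self.sp -= 1                # Decrement SP
            self.mem[self.sp] = value   # Write to Mem(SP)
        else:
            self.mem[self.sp] = value   # Write to Mem(SP)
            self.sp -= 1                # Decrement SP

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        if self.last_full:
            value = self.mem[self.sp]   # Read from Mem(SP)
            self.sp += 1                # Increment SP
        else:
            self.sp += 1                # Increment SP
            value = self.mem[self.sp]   # Read from Mem(SP)
        return value
```

With `top=100`, the first push lands at address 99 under Last Full but at address 100 under Next Empty, even though SP ends at 99 in both cases. That difference is why the empty-stack representation has to be agreed on by convention.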

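The sign-extension rule from the "When does MIPS sign extend?" slide can be checked with a small Python sketch. The helper names `sign_extend` and `zero_extend` are assumptions made for illustration; they model only the bit manipulation, not any real assembler or simulator API.

```python
def sign_extend(value, from_bits, to_bits):
    """Copy the upper bit of a from_bits-wide field into all higher bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        # Set every bit between from_bits and to_bits.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def zero_extend(value, from_bits):
    """Keep the low from_bits bits; all higher bits become zero."""
    return value & ((1 << from_bits) - 1)


# The slide's 8-bit to 16-bit examples.
assert sign_extend(0b00001010, 8, 16) == 0b0000000000001010
assert sign_extend(0b10001100, 8, 16) == 0b1111111110001100

# An immediate of -1 is encoded as 0xFFFF in the 16-bit field.
imm = -1 & 0xFFFF
addi_operand = sign_extend(imm, 16, 32)   # arithmetic: sign extended
andi_operand = zero_extend(imm, 16)       # logical: zero extended
```

Here `addi_operand` is 0xFFFFFFFF and `andi_operand` is 0x0000FFFF, matching the addi and andi cases on the slide.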

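The signed versus unsigned distinction for add and subtract (same result bits, but only the signed form raises an exception on overflow) can be sketched as follows. The function names mirror the MIPS mnemonics purely for readability, and using Python's `OverflowError` to stand in for the hardware exception is an assumption of this sketch.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def addu(a, b):
    """Unsigned add: wraps modulo 2**32 and never signals overflow."""
    return (a + b) & MASK32


def add(a, b):
    """Signed add: same result bits as addu, but signals overflow."""
    a &= MASK32
    b &= MASK32
    result = (a + b) & MASK32
    same_sign_inputs = ((a ^ b) & SIGN32) == 0
    sign_flipped = ((a ^ result) & SIGN32) != 0
    # Two's-complement overflow: operands agree in sign, result does not.
    if same_sign_inputs and sign_flipped:
        raise OverflowError("signed 32-bit overflow")
    return result
```

For example, `addu(0x7FFFFFFF, 1)` wraps to 0x80000000, while `add(0x7FFFFFFF, 1)` raises. Adding 0xFFFFFFFF (that is, -1) and 1 gives 0 in both forms, with no overflow.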