Berkeley COMPSCI 61C - Lecture Notes

CS 61C L29 Intel & Review (1) A Carle, Summer 2005 © UCB
inst.eecs.berkeley.edu/~cs61c/su05

CS61C: Machine Structures
Lecture #29: Intel & Summary
2005-08-10
Andy Carle

Review
• Benchmarks
  • Attempt to predict performance
  • Updated every few years
  • Measure everything from simulation of desktop graphics programs to battery life
• Megahertz Myth
  • MHz ≠ performance; it's just one factor

MIPS is example of RISC
• RISC = Reduced Instruction Set Computer
  • Term coined at Berkeley, ideas pioneered by IBM, Berkeley, Stanford
• RISC characteristics:
  • Load-store architecture
  • Fixed-length instructions (typically 32 bits)
  • Three-address architecture
• RISC examples: MIPS, SPARC, IBM/Motorola PowerPC, Compaq Alpha, ARM, SH4, HP-PA, ...

MIPS vs. 80386
                    MIPS                  80386
Address:            32-bit                32-bit
Page size:          4KB                   4KB
Data alignment:     Data aligned          Data unaligned
Destination reg:    Left                  Right
                    add $rd,$rs1,$rs2     add %rs1,%rs2,%rd
Regs:               $0, $1, ..., $31      %r0, %r1, ..., %r7
Reg = 0:            $0                    (n.a.)
Return address:     $31                   (n.a.)

MIPS vs. Intel 80x86
• MIPS: “Three-address architecture”
  • Arithmetic-logic instructions specify all 3 operands
      add $s0,$s1,$s2 # s0=s1+s2
  • Benefit: fewer instructions ⇒ performance
• x86: “Two-address architecture”
  • Only 2 operands, so the destination is also one of the sources
      add $s1,$s0 # s0=s0+s1
  • Often true in C statements: c += b;
  • Benefit: smaller instructions ⇒ smaller code

MIPS vs. Intel 80x86
• MIPS: “load-store architecture”
  • Only Load/Store access memory; all other operations are register-register; e.g.,
      lw $t0, 12($gp)
      add $s0,$s0,$t0 # s0=s0+Mem[12+gp]
  • Benefit: simpler hardware ⇒ easier to pipeline, higher performance
• x86: “register-memory architecture”
  • All operations can have an operand in memory; the other operand is a register; e.g.,
      add 12(%gp),%s0 # s0=s0+Mem[12+gp]
  • Benefit: fewer instructions ⇒ smaller code

MIPS vs. Intel 80x86
• MIPS: “fixed-length instructions”
  • All instructions same size, e.g., 4 bytes
  • Simple hardware ⇒ performance
  • Branches can be multiples of 4 bytes
• x86: “variable-length instructions”
  • Instructions are a multiple of bytes: 1 to 17 ⇒ small code size (30% smaller?)
  • More recent performance benefit: better instruction cache hit rates
  • Instructions can include 8- or 32-bit immediates

Unusual features of 80x86
• 8 32-bit registers have names; 16-bit 8086 names with “e” prefix:
  • eax, ecx, edx, ebx, esp, ebp, esi, edi
  • 80x86 word is 16 bits, double word is 32 bits
• PC is called eip (instruction pointer)
• leal (load effective address)
  • Calculate address like a load, but load the address into the register, not the data
  • Load 32-bit address:
      leal -4000000(%ebp),%esi # esi = ebp - 4000000

Instructions: MIPS vs. 80x86
MIPS                80x86
addu, addiu         addl
subu                subl
and, or, xor        andl, orl, xorl
sll, srl, sra       sall, shrl, sarl
lw                  movl mem, reg
sw                  movl reg, mem
mov                 movl reg, reg
li                  movl imm, reg
lui                 n.a.

80386 addressing (ALU instructions too)
• base reg + offset (like MIPS)
    movl -8000044(%ebp), %eax
• base reg + index reg (2 regs form addr.)
    movl (%eax,%ebx),%edi # edi = Mem[ebx + eax]
• scaled reg + index (shift one reg left by 0 to 3 bits, i.e., scale by 1, 2, 4, or 8)
    movl (%eax,%edx,4),%ebx # ebx = Mem[edx*4 + eax]
• scaled reg + index + offset
    movl 12(%eax,%edx,4),%ebx # ebx = Mem[edx*4 + eax + 12]

Branches in 80x86
• Rather than compare registers, x86 uses special 1-bit registers called “condition codes” that are set as a side effect of ALU operations
  • S - Sign Bit
  • Z - Zero (result is all 0)
  • C - Carry Out
  • P - Parity: set to 1 if even number of ones in rightmost 8 bits of operation
• Conditional branch instructions then use condition flags for all comparisons: <, <=, >, >=, ==, !=

Branch: MIPS vs. 80x86
MIPS                80x86
beq                 (cmpl;) je     (if previous operation set condition code, then cmpl unnecessary)
bne                 (cmpl;) jne
slt; beq            (cmpl;) jl
slt; bne            (cmpl;) jge
jal                 call
jr $31              ret

While in C/Assembly: 80x86
C:
    while (save[i]==k) i = i + j;
    (i,j,k: %edx,%esi,%ebx)
x86:
           leal -400(%ebp),%eax
    .Loop: cmpl %ebx,(%eax,%edx,4)
           jne .Exit
           addl %esi,%edx
           jmp .Loop
    .Exit:
Note: cmpl replaces sll, add, lw in loop

Unusual features of 80x86
• Memory stack is part of instruction set
  • call places return address onto stack, decrements esp (esp-=4; Mem[esp]=return address)
  • push places value onto stack, decrements esp
  • pop gets value from stack, increments esp
• incl, decl (increment, decrement)
    incl %edx # edx = edx + 1
  • Benefit: smaller instructions ⇒ smaller code

Outline
• Intro to x86
• Microarchitecture

Intel Internals
• Hardware below instruction set called "microarchitecture"
• Pentium Pro, Pentium II, Pentium III all based on same microarchitecture (1994)
  • Improved clock rate, increased cache size
• Pentium 4 has new microarchitecture

Pentium, Pentium Pro, Pentium 4 Pipeline
• Pentium (P5) = 5 stages
• Pentium Pro, II, III (P6) = 10 stages
“Pentium 4 (Partially) Previewed,” Microprocessor Report, 8/28/00

Dynamic Scheduling in Pentium Pro, II, III
• PPro doesn’t pipeline 80x86 instructions
• PPro decode unit translates the Intel instructions into 72-bit "micro-operations" (~ MIPS instructions)
• Takes 1 clock cycle to determine length of 80x86 instructions + 2 more to create the micro-operations
• Most instructions translate to 1 to 4 micro-operations
• 10 stage pipeline for micro-operations

Dynamic Scheduling
Consider:
    lw $t0 0($t0) # might miss in mem
    add $s1 $s1 $s1 # will be stalled in
    add $s2
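The 80386 addressing modes listed in the notes (base + offset, base + index, scaled index, scaled index + offset) all reduce to one address formula, which leal exposes directly. The Python sketch below is only an illustration of that arithmetic, not of real hardware: the function name effective_address and the register values are hypothetical, and it assumes 32-bit wraparound and the usual scale factors of 1, 2, 4, and 8.

```python
def effective_address(base=0, index=0, scale=1, offset=0):
    """Model offset(base,index,scale) as base + index*scale + offset (mod 2**32)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & 0xFFFFFFFF


# Hypothetical register contents, chosen only for illustration.
eax, edx, ebp = 0x1000, 3, 0x7FFF0000

addr_scaled = effective_address(base=eax, index=edx, scale=4)                 # (%eax,%edx,4)
addr_scaled_off = effective_address(base=eax, index=edx, scale=4, offset=12)  # 12(%eax,%edx,4)
esi = effective_address(base=ebp, offset=-4000000)                            # leal -4000000(%ebp),%esi
```

A movl with one of these operands would then read memory at the computed address, while leal would simply place the address itself in the destination register.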

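To make the condition codes from the "Branches in 80x86" slide concrete, here is a small Python model of how S, Z, C, and P could be derived from a 32-bit add or compare. It is a sketch under stated assumptions: the helper names (add32, cmp32, _flags) are hypothetical, only the four flags named in the notes are modeled, and the carry flag after a compare is treated as an unsigned borrow.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def _flags(result, carry):
    """Derive the four flags named in the notes from a 32-bit result."""
    return {
        "S": (result >> 31) & 1,                           # sign bit of the result
        "Z": int(result == 0),                             # result is all zeros
        "C": int(carry),                                   # carry out (or borrow)
        "P": int(bin(result & 0xFF).count("1") % 2 == 0),  # even parity, low 8 bits
    }


def add32(a, b):
    """Return (result, flags) for a 32-bit add."""
    a, b = a & MASK, b & MASK
    total = a + b
    return total & MASK, _flags(total & MASK, total > MASK)


def cmp32(a, b):
    """Return flags for a - b; the difference itself is discarded, as in a compare."""
    a, b = a & MASK, b & MASK
    return _flags((a - b) & MASK, a < b)


wrapped, wrapped_flags = add32(0xFFFFFFFF, 1)  # wraps around to 0
je_taken = cmp32(5, 5)["Z"] == 1               # a je after this compare would branch
```

Because add32 already sets the flags, a conditional branch placed right after it would not need a separate compare, which is the point of the "(cmpl;)" notation in the branch table.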

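The stack instructions from the "Unusual features of 80x86" slide can be modeled in a few lines. This Python sketch assumes the standard x86 convention that the stack grows toward lower addresses (push and call decrement esp by 4, pop and ret increment it). The class name Stack32, the dictionary-backed memory, and all addresses are hypothetical and exist only for illustration.

```python
class Stack32:
    """Toy model of a 32-bit stack addressed through esp."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}  # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                           # stack grows downward
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, return_address, target):
        self.push(return_address)               # save where to resume
        return target                           # new eip

    def ret(self):
        return self.pop()                       # resume at the saved address


stack = Stack32(esp=0x1000)
stack.push(7)
eip = stack.call(return_address=0x400005, target=0x500000)
esp_inside_call = stack.esp   # two 4-byte slots below the starting esp
eip = stack.ret()             # back to the saved return address
```

MIPS keeps the return address in register $31 instead, so a leaf procedure there needs no memory access to return.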