inst.eecs.berkeley.edu/~cs61c/su05
CS61C: Machine Structures
Lecture 29: Intel & Summary
2005-08-10
Andy Carle
CS 61C L29 Intel & Review, A. Carle, Summer 2005, UCB

Review
  - Benchmarks
      - Attempt to predict performance
      - Updated every few years
      - Measure everything from simulation of desktop graphics programs to battery life
  - Megahertz Myth
      - MHz is not performance; it's just one factor

MIPS is example of RISC
  - RISC = Reduced Instruction Set Computer
  - Term coined at Berkeley; ideas pioneered by IBM, Berkeley, Stanford
  - RISC characteristics:
      - Load-store architecture
      - Fixed-length instructions (typically 32 bits)
      - Three-address architecture
  - RISC examples: MIPS, SPARC, IBM/Motorola PowerPC, Compaq Alpha, ARM, SH4, HP-PA

MIPS vs. 80386
                    MIPS                    80386
  Address           32-bit                  32-bit
  Page size         4KB                     4KB
  Data              aligned                 unaligned
  Destination reg   Left                    Right
                    add $rd,$rs1,$rs2       add %rs1,%rs2,%rd
  Regs              $0, $1, ..., $31        %r0, %r1, ..., %r7
  Reg 0             0                       n.a.
  Return address    $31                     n.a.

MIPS vs. Intel 80x86
  - MIPS: "Three-address architecture"
      - Arithmetic-logic instructions specify all 3 operands
          add $s0,$s1,$s2    # s0 = s1 + s2
      - Benefit: fewer instructions => performance
  - x86: "Two-address architecture"
      - Only 2 operands, so the destination is also one of the sources
          add $s1,$s0        # s0 = s0 + s1
      - Often true in C statements: c += b;
      - Benefit: smaller instructions => smaller code

MIPS vs. Intel 80x86
  - MIPS: "load-store architecture"
      - Only Load/Store access memory; the rest are register-register operations
      - e.g., lw  $t0,12($gp)
              add $s0,$s0,$t0    # s0 = s0 + Mem[12+gp]
      - Benefit: simpler hardware => easier to pipeline, higher performance
  - x86: "register-memory architecture"
      - All operations can have an operand in memory; the other operand is a register
      - e.g., add 12(%gp),%s0    # s0 = s0 + Mem[12+gp]
      - Benefit: fewer instructions => smaller code

MIPS vs. Intel 80x86
  - MIPS: "fixed-length instructions"
      - All instructions same size, e.g., 4 bytes
      - Simple hardware => performance
      - Branches can be multiples of 4 bytes
  - x86: "variable-length instructions"
      - Instructions are a multiple of bytes: 1 to 17
      - => small code size (30% smaller)
      - More recent performance benefit: better instruction cache hit rates
      - Instructions can include 8- or 32-bit immediates

Unusual features of 80x86
  - 8 32-bit registers have names; 16-bit 8086 names with "e" prefix:
      eax, ecx, edx, ebx, esp, ebp, esi, edi
  - 80x86 word is 16 bits, double word is 32 bits
  - PC is called eip (instruction pointer)
  - leal (load effective address)
      - Calculate address like a load, but load the address into the register, not the data
      - Load 32-bit address:
          leal 4000000(%ebp),%esi    # esi = ebp + 4000000

Instructions: MIPS vs. 80x86
  MIPS              80x86
  addu, addiu       addl
  subu              subl
  and, or, xor      andl, orl, xorl
  sll, srl, sra     sall, shrl, sarl
  lw                movl mem,reg
  sw                movl reg,mem
  mov               movl reg,reg
  li                movl imm,reg
  lui               n.a.

80386 addressing (ALU instructions too)
  - base reg + offset (like MIPS)
      movl 8000044(%ebp),%eax
  - base reg + index reg (2 regs form addr.)
      movl (%eax,%ebx),%edi       # edi = Mem[ebx + eax]
  - scaled reg + index (shift one reg by 1, 2)
      movl (%eax,%edx,4),%ebx     # ebx = Mem[edx*4 + eax]
  - scaled reg + index + offset
      movl 12(%eax,%edx,4),%ebx   # ebx = Mem[edx*4 + eax + 12]

Branches in 80x86
  - Rather than compare registers, x86 uses special 1-bit registers called "condition codes" that are set as a side effect of ALU operations
      - S: Sign Bit
      - Z: Zero (result is all 0)
      - C: Carry Out
      - P: Parity (set to 1 if even number of ones in rightmost 8 bits of operation)
  - Conditional branch instructions then use condition flags for all comparisons

Branch: MIPS vs. 80x86
  MIPS              80x86
  beq               (cmpl;) je
  bne               (cmpl;) jne
  slt; beq          (cmpl;) jge
  slt; bne          (cmpl;) jl
  jal               call
  jr $31            ret
  If the previous operation set the condition code, then cmpl is unnecessary.

While in C/Assembly: 80x86
  C:
      while (save[i] == k)
          i = i + j;
      (i, j, k => %edx, %esi, %ebx)
  x86:
            leal 400(%ebp),%eax
      Loop: cmpl %ebx,(%eax,%edx,4)
            jne  Exit
            addl %esi,%edx
            jmp  Loop
      Exit:
  Note: cmpl replaces sll, add, lw in loop

Unusual features of 80x86
  - Memory stack is part of the instruction set
      - call places return address onto stack, decrements esp (esp -= 4; Mem[esp] = eip + 6)
      - push places value onto stack, decrements esp
      - pop gets value from stack, increments esp
  - incl, decl (increment, decrement)
      incl %edx    # edx = edx + 1
      - Benefit: smaller instructions => smaller code

Outline
  - Intro to x86
  - Microarchitecture

Intel Internals
  - Hardware below the instruction set is called "microarchitecture"
  - Pentium Pro, Pentium II, Pentium III all based on the same microarchitecture (1994)
      - Improved clock rate, increased cache size
  - Pentium 4 has a new microarchitecture

Pentium, Pentium Pro, Pentium 4 Pipeline
  - Pentium (P5): 5 stages
  - Pentium Pro, II, III (P6): 10 stages
  Source: "Pentium 4 (Partially) Previewed," Microprocessor Report, 8/28/00

Dynamic Scheduling in Pentium Pro, II, III
  - PPro doesn't pipeline 80x86 instructions
  - PPro decode unit translates the Intel instructions into 72-bit "micro-operations" (~ MIPS instructions)
  - Takes 1 clock cycle to determine length of 80x86 instructions + 2 more to create the micro-operations
  - Most instructions translate to 1 to 4 micro-operations
  - 10-stage pipeline for micro-operations

Dynamic Scheduling
  - Consider:
      lw  $t0,0($t0)      # might miss in mem
      add $s1,$s1,$s1     # will be stalled in pipe
      add $s2,$s1,$s1     # waiting for lw
  - Solutions:
      - Compiler (STATIC) reordering (loops)
      - Hardware (DYNAMIC) reordering

Hardware support for reordering
  - Out-of-Order execution (OOO): allow an instruction to execute before prior instructions have executed
  - Speculation across branches
  - When an instruction is no longer speculative, write results (instruction commit)
  - Watch out for hazards

Hardware for OOO execution
  - Need HW buffer for results of uncommitted instructions: reorder buffer
      - Reorder buffer can be operand source
      - Once operand commits, result is found in register
      - Discard results on mispredicted branches or on exceptions
  - Fetch/issue in order, execute OOO, commit in order
  [Diagram labels: IF, Issue, Regs, Reorder Buffer, Res Stations, Adder]

Dynamic Scheduling in Pentium Pro
  - Max. instructions issued/clock: 3
  - Max. instr. complete exec./clock: 5
  - Max
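
The 80386 addressing modes and the leal instruction listed earlier all reduce to one address calculation: base + index*scale + offset. The Python sketch below models that calculation only. The function names (effective_address, movl_load, leal), the dictionary used as memory, and the restriction of scale to 1, 2, 4, or 8 are assumptions made for illustration and are not taken from the lecture.

```python
# Hypothetical model of 80386 address calculation (illustration only).

def effective_address(base=0, index=0, scale=1, offset=0):
    """Return base + index*scale + offset, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & 0xFFFFFFFF


def movl_load(mem, base=0, index=0, scale=1, offset=0):
    """Model 'movl offset(base,index,scale),reg': read the value at the address."""
    return mem[effective_address(base, index, scale, offset)]


def leal(base=0, index=0, scale=1, offset=0):
    """Model 'leal': same calculation, but return the address, not the data."""
    return effective_address(base, index, scale, offset)


if __name__ == "__main__":
    eax, edx = 0x1000, 3
    mem = {0x1018: 42}  # toy memory: address -> 32-bit value
    # movl 12(%eax,%edx,4),%ebx   # ebx = Mem[edx*4 + eax + 12]
    ebx = movl_load(mem, base=eax, index=edx, scale=4, offset=12)
    # leal 12(%eax,%edx,4),%esi   # esi = edx*4 + eax + 12
    esi = leal(base=eax, index=edx, scale=4, offset=12)
    print(ebx, hex(esi))
```

With the same operands, movl_load touches memory while leal only returns the computed address, which is the distinction the leal slide draws.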



Berkeley COMPSCI 61C - Lecture Notes
