CS61C: Machine Structures
Lecture 23: Pentium III, IV and other PC buzzwords
November 22, 2000
David Patterson
http://www-inst.eecs.berkeley.edu/~cs61c/
CS61C L23 x86, UC Regents

Review (1/2)
  One way to define clock cycles:
    Clock Cycles for program
      = Instructions for a program (called "Instruction Count")
      x Average Clock cycles Per Instruction (abbreviated "CPI")
  CPU execution time for program
      = Instruction Count x CPI x Clock Cycle Time

Review (2/2)
  Latency v. Throughput
  Performance doesn't depend on any single factor: need to know Instruction Count, Clocks Per Instruction and Clock Rate to get valid estimations
  User Time: time user needs to wait for program to execute; depends heavily on how OS switches between tasks
  CPU Time: time spent executing a single program; depends solely on design of processor (datapath, pipelining effectiveness, caches, etc.)

Outline
  Intel 80x86 (Pentium) Instruction Set, History
  Administrivia
  Computers in the News
  Pentium III v. Pentium 4 v. Athlon
  Typical PC
  Typical Mac
  Conclusion

Intel History: ISA evolved since 1978
  8086: 16-bit, all internal registers 16 bits wide; no general purpose registers ('78)
  8087: adds 60 Fl. Pt. instructions (Prof. Kahan); adds 80-bit-wide stack, but no registers ('80)
  80286: adds elaborate protection model ('82)
  80386: 32-bit; converts 8 16-bit registers into 8 32-bit general purpose registers; new addressing modes; adds paging ('85)
  80486, Pentium, Pentium II: adds 4 instructions
  MMX: adds 57 instructions for multimedia ('97)
  Pentium III: adds 70 instructions for multimedia ('99)
  Pentium 4: adds 144 instructions for multimedia ('00)

MIPS vs. 80386
                    MIPS                  80386
  Address           32-bit                32-bit
  Page size         4KB                   4KB
  Data              aligned               unaligned
  Destination reg   Left                  Right
                    add $rd,$rs1,$rs2     add %rs1,%rs2,%rd
  Regs              $0, $1, ..., $31      %r0, %r1, ..., %r7
  Reg = 0           $0                    n.a.
  Return address    $31                   n.a.

MIPS vs. Intel 80x86
  MIPS: "load-store architecture"
    Only Load/Store access memory; rest of the operations are register-register; e.g.,
      lw  $t0, 12($gp)
      add $s0, $s0, $t0     # s0 = s0 + Mem[12 + gp]
    Benefit: simpler hardware, easier to pipeline, higher performance
  x86: "register-memory architecture"
    All operations can have an operand in memory; other operand is a register; e.g.,
      add 12(%gp), %s0      # s0 = s0 + Mem[12 + gp]
    Benefit: fewer instructions, smaller code

MIPS vs. Intel 80x86
  MIPS: "Three-address architecture"
    Arithmetic-logic instructions specify all 3 operands:
      add $s0, $s1, $s2     # s0 = s1 + s2
    Benefit: fewer instructions, performance
  x86: "Two-address architecture"
    Only 2 operands, so the destination is also one of the sources:
      add $s1, $s0          # s0 = s0 + s1
    Often true in C statements: c += b;
    Benefit: smaller instructions, smaller code

MIPS vs. Intel 80x86
  MIPS: "fixed-length instructions"
    All instructions same size, e.g., 4 bytes
    Simple hardware, performance
    Branches can be multiples of 4 bytes
  x86: "variable-length instructions"
    Instructions are a multiple of bytes: 1 to 17
    Small code size (30% smaller)
    More Recent Performance Benefit: better instruction cache hit rates
    Instructions can include 8- or 32-bit immediates

MIPS is example of RISC
  RISC = Reduced Instruction Set Computer
    Term coined at Berkeley; ideas pioneered by IBM, Berkeley, Stanford
  RISC characteristics:
    Load-store architecture
    Fixed-length instructions (typically 32 bits)
    Three-address architecture
  RISC examples: MIPS, SPARC, IBM/Motorola PowerPC, Compaq Alpha, ARM, SH4, HP-PA

Unusual features of 80x86
  8 32-bit registers have names; 16-bit 8086 names with "e" prefix:
    eax, ecx, edx, ebx, esp, ebp, esi, edi
  80x86 word is 16 bits, double word is 32 bits
  PC is called eip (instruction pointer)
  leal (load effective address)
    Calculate address like a load, but load address into register, not data
    Load 32-bit address:
      leal 4000000(%ebp), %esi     # esi = ebp + 4000000

Instructions: MIPS vs. 80x86
  MIPS             80x86
  addu, addiu      addl
  subu             subl
  and, or, xor     andl, orl, xorl
  sll, srl, sra    sall, shrl, sarl
  lw               movl mem, reg
  sw               movl reg, mem
  mov              movl reg, reg
  li               movl imm, reg
  lui              n.a.

80386 addressing (ALU instructions too)
  base reg + offset (like MIPS)
    movl 8000044(%ebp), %eax
  base reg + index reg (2 regs form addr.)
    movl (%eax,%ebx), %edi        # edi = Mem[ebx + eax]
  scaled reg + index (shift one reg by 1, 2)
    movl (%eax,%edx,4), %ebx      # ebx = Mem[edx*4 + eax]
  scaled reg + index + offset
    movl 12(%eax,%edx,4), %ebx    # ebx = Mem[edx*4 + eax + 12]

Branch in 80x86
  Rather than compare registers, x86 uses special 1-bit registers called "condition codes" that are set as a side effect of ALU operations
    S: Sign Bit
    Z: Zero (result is all 0)
    C: Carry Out
    P: Parity, set to 1 if even number of ones in rightmost 8 bits of operation
  Conditional Branch instructions then use condition flags for all comparisons

Branch: MIPS vs. 80x86
  MIPS        80x86
  beq         (cmpl;) je     if previous operation set condition code, then cmpl unnecessary
  bne         (cmpl;) jne
  slt; beq    (cmpl;) jl
  slt; bne    (cmpl;) jge
  jal         call
  jr $31      ret

While in C/Assembly: 80x86
  C:
    while (save[i] == k)
        i = i + j;
    (i, j, k in %edx, %esi, %ebx)
  x86:
          leal 400(%ebp), %eax
    Loop: cmpl %ebx, (%eax,%edx,4)
          jne  Exit
          addl %esi, %edx
          jmp  Loop
    Exit:
  Note: cmpl replaces sll, add, lw in loop

Administrivia: Rest of 61C
  Rest of 61C slower pace: no more homeworks, projects, labs
  W 11/24     X86, PC buzzwords, and 61C; RAID Lab
  W 11/29     Review: Pipelines; Feedback lab
  F 12/1      Review: Caches/TLB/VM; Section 7.5
  M 12/4      Deadline to correct your grade record
  W 12/6      Review: Interrupts (A.7); Feedback lab
  F 12/8      61C Summary / Your Cal heritage / HKN Course Evaluation
  Sun 12/10   Final Review, 2PM (155 Dwinelle)
  Tues 12/12  Final (5PM, 1 Pimentel)

Computers in the News
  Need More CPU Speed
  Henry Norr, November 20, 2000, S.F. Chronicle
  Stand by to duck and cover: you're about to be barraged by a new wave of clockspeed and performance claims from the leading makers of PC processors. Today's release of the Pentium 4, running at up to 1.5 GHz, will put Intel back …
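The 80386 addressing forms listed earlier (base reg, index reg, scale, offset) all reduce to one address calculation, and leal differs from movl only in whether memory is read afterwards. A minimal Python sketch of that idea follows. The function names, the dictionary used as memory, and the register values are assumptions made up for illustration, not part of the lecture.

```python
MASK32 = 0xFFFFFFFF  # 80386 addresses are 32 bits wide


def effective_address(base=0, index=0, scale=1, offset=0):
    """Return (base + index*scale + offset), truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & MASK32


def leal(**operands):
    """Model leal: produce the address itself, with no memory access."""
    return effective_address(**operands)


def movl_load(memory, **operands):
    """Model movl mem,reg: read the value stored at the effective address."""
    return memory[effective_address(**operands)]


# Hypothetical register values and memory contents, for illustration only.
eax, edx = 0x1000, 3
memory = {0x1000 + 3 * 4 + 12: 42}

# Both lines use the operand 12(%eax,%edx,4) from the addressing slide.
addr = leal(base=eax, index=edx, scale=4, offset=12)
value = movl_load(memory, base=eax, index=edx, scale=4, offset=12)
```

Here addr holds the computed address and value holds the word found there, which mirrors the slide's point that leal calculates an address like a load but keeps the address rather than the data.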
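Returning to the review formula, CPU execution time = Instruction Count x CPI x Clock Cycle Time, the following Python sketch works through the arithmetic and shows why no single factor decides performance. The two machines and all of their numbers are hypothetical, chosen only to make the point; they are not measurements of any real processor.

```python
def clock_cycles(instruction_count, cpi):
    """Clock cycles for a program = Instruction Count x CPI."""
    return instruction_count * cpi


def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = Instruction Count x CPI x Clock Cycle Time."""
    clock_cycle_time = 1.0 / clock_rate_hz  # seconds per cycle
    return clock_cycles(instruction_count, cpi) * clock_cycle_time


# Hypothetical machines running the same 1-billion-instruction program.
INSTRUCTIONS = 1_000_000_000
time_a = cpu_time_seconds(INSTRUCTIONS, cpi=1.0, clock_rate_hz=1.0e9)  # 1.0 GHz
time_b = cpu_time_seconds(INSTRUCTIONS, cpi=2.0, clock_rate_hz=1.5e9)  # 1.5 GHz

# Machine B has the higher clock rate but the longer CPU time,
# because its CPI is twice as large.
b_is_slower = time_b > time_a
```

Under these assumed numbers, machine A finishes in about 1.0 s and machine B in about 1.33 s, so quoting clock rate alone would give the wrong ranking.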