CS61C L23 x86 © UC Regents

CS61C - Machine Structures
Lecture 23 - Pentium III, IV and other PC buzzwords
November 22, 2000
David Patterson
http://www-inst.eecs.berkeley.edu/~cs61c/

Review (1/2)
° One way to define clock cycles:
  Clock Cycles for program = Instructions for a program (called "Instruction Count") x Average Clock cycles Per Instruction (abbreviated "CPI")
° CPU execution time for program = Instruction Count x CPI x Clock Cycle Time

Review (2/2)
° Latency v. Throughput
° Performance doesn't depend on any single factor: need to know Instruction Count, Clocks Per Instruction and Clock Rate to get valid estimations
° User Time: time user needs to wait for program to execute: depends heavily on how OS switches between tasks
° CPU Time: time spent executing a single program: depends solely on design of processor (datapath, pipelining effectiveness, caches, etc.)

Outline
° Intel 80x86 (Pentium) Instruction Set, History
° Administrivia
° Computers in the News
° Pentium III v. Pentium 4 v. Athlon
° Typical PC
° Typical Mac
° Conclusion

Intel History: ISA evolved since 1978
° 8086: 16-bit, all internal registers 16 bits wide; no general purpose registers; '78
° 8087: + 60 Fl. Pt. instructions, (Prof. Kahan) adds 80-bit-wide stack, but no registers; '80
° 80286: adds elaborate protection model; '82
° 80386: 32-bit; converts 8 16-bit registers into 8 32-bit general purpose registers; new addressing modes; adds paging; '85
° 80486, Pentium, Pentium II: + 4 instructions
° MMX: + 57 instructions for multimedia; '97
° Pentium III: + 70 instructions for multimedia; '99
° Pentium 4: + 144 instructions for multimedia; '00

MIPS vs. 80386

                    MIPS                    80386
° Address:          32-bit                  32-bit
° Page size:        4KB                     4KB
° Alignment:        Data aligned            Data unaligned
° Destination reg:  Left                    Right
                    • add $rd,$rs1,$rs2     • add %rs1,%rs2,%rd
° Regs:             $0, $1, ..., $31        %r0, %r1, ..., %r7
° Reg = 0:          $0                      (n.a.)
° Return address:   $31                     (n.a.)

MIPS vs. Intel 80x86
° MIPS: "Three-address architecture"
  • Arithmetic-logic instructions specify all 3 operands
      add $s0,$s1,$s2   # s0=s1+s2
  • Benefit: fewer instructions ⇒ performance
° x86: "Two-address architecture"
  • Only 2 operands, so the destination is also one of the sources
      add $s1,$s0       # s0=s0+s1
  • Often true in C statements: c += b;
  • Benefit: smaller instructions ⇒ smaller code

MIPS vs. Intel 80x86
° MIPS: "load-store architecture"
  • Only Load/Store access memory; rest of operations are register-register; e.g.,
      lw  $t0, 12($gp)
      add $s0,$s0,$t0   # s0=s0+Mem[12+gp]
  • Benefit: simpler hardware ⇒ easier to pipeline, higher performance
° x86: "register-memory architecture"
  • All operations can have an operand in memory; other operand is a register; e.g.,
      add 12(%gp),%s0   # s0=s0+Mem[12+gp]
  • Benefit: fewer instructions ⇒ smaller code

MIPS vs. Intel 80x86
° MIPS: "fixed-length instructions"
  • All instructions same size, e.g., 4 bytes
  • simple hardware ⇒ performance
  • branch offsets can be in multiples of 4 bytes
° x86: "variable-length instructions"
  • Instructions are a multiple of bytes: 1 to 17; ⇒ small code size (30% smaller?)
  • More Recent Performance Benefit: better instruction cache hit rates
  • Instructions can include 8- or 32-bit immediates

MIPS is example of RISC
° RISC = Reduced Instruction Set Computer
  • Term coined at Berkeley, ideas pioneered by IBM, Berkeley, Stanford
° RISC characteristics:
  • Load-store architecture
  • Fixed-length instructions (typically 32 bits)
  • Three-address architecture
° RISC examples: MIPS, SPARC, IBM/Motorola PowerPC, Compaq Alpha, ARM, SH4, HP-PA, ...

Unusual features of 80x86
° 8 32-bit Registers have names; 16-bit 8086 names with "e" prefix:
  • eax, ecx, edx, ebx, esp, ebp, esi, edi
  • 80x86 word is 16 bits, double word is 32 bits
° PC is called eip (instruction pointer)
° leal (load effective address)
  • Calculate address like a load, but load address into register, not data
  • Load 32-bit address:
      leal -4000000(%ebp),%esi   # esi = ebp - 4000000

Instructions: MIPS vs. 80x86

  MIPS                80x86
° addu, addiu         addl
° subu                subl
° and, or, xor        andl, orl, xorl
° sll, srl, sra       sall, shrl, sarl
° lw                  movl mem, reg
° sw                  movl reg, mem
° mov                 movl reg, reg
° li                  movl imm, reg
° lui                 n.a.

80386 addressing (ALU instructions too)
° base reg + offset (like MIPS)
  • movl -8000044(%ebp), %eax
° base reg + index reg (2 regs form addr.)
  • movl (%eax,%ebx),%edi       # edi = Mem[ebx + eax]
° scaled reg + index (shift one reg by 1,2)
  • movl (%eax,%edx,4),%ebx     # ebx = Mem[edx*4 + eax]
° scaled reg + index + offset
  • movl 12(%eax,%edx,4),%ebx   # ebx = Mem[edx*4 + eax + 12]

Branch in 80x86
° Rather than compare registers, x86 uses special 1-bit registers called "condition codes" that are set as a side-effect of ALU operations
  • S - Sign Bit
  • Z - Zero (result is all 0)
  • C - Carry Out
  • P - Parity: set to 1 if even number of ones in rightmost 8 bits of operation
° Conditional Branch instructions then use condition flags for all comparisons: <, <=, >, >=, ==, !=

Branch: MIPS vs. 80x86

  MIPS                80x86
° beq                 (cmpl;) je
° bne                 (cmpl;) jne
° slt; beq            (cmpl;) jl
° slt; bne            (cmpl;) jge
° jal                 call
° jr $31              ret

  If the previous operation set the condition code, then cmpl is unnecessary.

While in C/Assembly: 80x86

C:
      while (save[i]==k)
          i = i + j;

x86:  (i,j,k: %edx,%esi,%ebx)

             leal -400(%ebp),%eax
      .Loop: cmpl %ebx,(%eax,%edx,4)
             jne .Exit
             addl %esi,%edx
             jmp .Loop
      .Exit:

Note: cmpl replaces sll, add, lw in loop

Administrivia: Rest of 61C
• Rest of 61C slower pace
• no more homeworks, projects, labs

W 11/24     X86, PC buzzwords and 61C; RAID Lab
W 11/29     Review: Pipelines; Feedback "lab"
F 12/1      Review: Caches/TLB/VM; Section 7.5
M 12/4      Deadline to correct your grade record
W 12/6      Review: Interrupts (A.7); Feedback lab
F 12/8      61C Summary / Your Cal heritage / HKN Course Evaluation
Sun 12/10   Final Review, 2PM (155 Dwinelle)
Tues 12/12  Final (5PM 1 Pimentel)

Computers in the News
° Need More CPU Speed? Henry Norr, November 20, 2000, S.F. Chronicle
  "Stand by to duck and cover -- you're about to be barraged by a new wave of clock-speed and performance claims from the leading makers of PC processors. Today's release of the Pentium 4,
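
Worked example for Review (1/2): the sketch below is one way to turn the formula CPU execution time = Instruction Count x CPI x Clock Cycle Time into C. The function name cpu_time_seconds and the sample numbers in the note that follows are made up for illustration; they are not from the lecture. Clock Cycle Time is taken to be 1 / Clock Rate.

```c
#include <assert.h>

/* CPU time = Instruction Count x CPI x Clock Cycle Time,
   with Clock Cycle Time = 1 / Clock Rate.
   cpu_time_seconds is a hypothetical helper name, not from the slides. */
double cpu_time_seconds(double instruction_count, double cpi,
                        double clock_rate_hz)
{
    assert(clock_rate_hz > 0.0);                    /* clock rate must be positive */
    double clock_cycles = instruction_count * cpi;  /* clock cycles for the program */
    double cycle_time   = 1.0 / clock_rate_hz;      /* seconds per clock cycle */
    return clock_cycles * cycle_time;
}
```

Under these assumptions, a program of 1 billion instructions with a CPI of 2 on a 1 GHz clock, cpu_time_seconds(1e9, 2.0, 1e9), comes out at about 2 seconds. Halving any one of the three factors halves the result, which is why no single factor determines performance.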
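
Worked example for the 80386 addressing and while-loop slides: every addressing form listed there is a special case of base + index*scale + offset. The sketch below models that arithmetic and uses it to mimic the loop built around cmpl %ebx,(%eax,%edx,4). The names effective_address and run_while are hypothetical, memory is modeled as a plain C array of 32-bit words instead of real machine memory, and the loop assumes some element eventually differs from k. The scale check accepts 1, 2, 4 and 8, the factors the x86 encoding allows.

```c
#include <assert.h>
#include <stdint.h>

/* Address arithmetic for offset(base,index,scale):
   base + index*scale + offset, wrapping at 32 bits as on the 80386.
   effective_address is a hypothetical name used only for illustration. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t offset)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)offset;
}

/* Mimics: while (save[i] == k) i = i + j;
   save_base stands in for %eax (a byte offset into mem);
   i, j, k stand in for %edx, %esi, %ebx.
   Assumes a non-matching element exists, otherwise the loop never ends. */
uint32_t run_while(const int32_t *mem, uint32_t save_base,
                   uint32_t i, uint32_t j, int32_t k)
{
    for (;;) {
        /* cmpl %ebx,(%eax,%edx,4): compare k with Mem[eax + edx*4] */
        uint32_t addr = effective_address(save_base, i, 4, 0);
        if (mem[addr / 4] != k)   /* jne .Exit */
            break;
        i += j;                   /* addl %esi,%edx */
    }                             /* jmp .Loop */
    return i;
}
```

With the array {7, 7, 7, 3, ...}, k = 7, j = 1 and i starting at 0, run_while stops with i equal to 3. The single call to effective_address inside the loop corresponds to the work that MIPS spreads over sll, add and lw.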