Page 1Instruction Set ComparisonsCS 740Sept. 20, 2002Topics• Code Examples–Procedure Linkage– Sparse Matrix Code• Instruction Sets–Alpha–PowerPC–VAX–x86CS 740 F’02–2–Procedure ExamplesProcedure linkage• Passing of control and data• stack managementControl• Register tests• Condition codes• loop countersint rfact(int n){if (n <= 1)return 1;return n*rfact(n-1);}Code ExamplePage 2CS 740 F’02–3–Alpha rfactRegisters• $16 Argument n• $9 Saved n– Callee save• $0 Return ValueStack Frame• 16 bytes• $9• $26 return PC rfact:ldgp $29,0($27) # setup gprfact..ng:lda $30,-16($30) # $sp -= 16.frame $30,16,$26,0stq $26,0($30) # save return addrstq $9,8($30) # save $9.mask 0x4000200,-16.prologue 1bis $16,$16,$9 # $9 = ncmple $9,1,$1 # if (n <= 1) thenbne $1,$80 # branch to $80subq $9,1,$16 # $16 = n - 1bsr $26,rfact..ng # recursive callmulq $9,$0,$0 # $0 = n*rfact(n-1)br $31,$81 # branch to epilogue.align 4$80:bis $31,1,$0 # return val = 1$81:ldq $26,0($30) # restore retrn addrldq $9,8($30) # restore $9addq $30,16,$30 # $sp += 16ret $31,($26),1save $9save $26$sp + 8$sp + 0CS 740 F’02–4–VAXPinnacle of CISC• Maximize instruction density• Provide instructions closely matched to typical program operationsInstruction format• OP, arg1, arg2, …– Each argument has arbitrary specifier– Accessing operands may have side effectsCondition Codes• Set by arithmetic and comparison instructions• Basis for successive branchesProcedure Linkage• Direct implementation of stack disciplinePage 3CS 740 F’02–5–VAX Registers• R0–R11 General purpose– R2–R7 Callee save in example code– Use pair to hold double• R12 AP Argument pointer– Stack region holding procedure arguments• R13 FP Frame pointer– Base of current stack frame• R14 SP Stack pointer– Top of stack• R15 PC Program counter– Used to access data in code• N C V Z Condition codes– Information about last operation result– Negative, Carry, 2’s OVF, ZeroArgsArg CountSavedCallerStateLocalVarsTempSpaceCurrentFramePreviousFrameAPFPSPVaries with push/popAddressCS 740 F’02–6–VAX Operand SpecifiersFormsNotation Value Side Eff UseRi ri General purpose register$v v Immediate data(Ri) M[ri] Memory referencev(Ri) M[ri+v] Mem. ref. with displacementA[Ri] M[a+ri*d] Array indexing–Ais specifier denoting address a– d is size of datum(Ri)+ M[ri] Ri += d Stepping pointer forward-(Ri) M[ri-d] Ri –= d Stepping pointer backExamples• Push srcmove src, –(SP)• Pop destmove (SP)+, destPage 4CS 740 F’02–7–VAX Procedure LinkageCaller• Push Arguments–Relative to SP• Execute CALLS narg, proc– narg denotes number of argument words– proc starts with mask denoting registers to be saved on stackCALLS Instruction• Creates stack frame– Saved registers, PC, FP, AP, mask, PSW• Sets AP, FP, SPCallee• Compute return value in R0• Execute ret– Undoes effect of CALLSArg1Argn•••SavedStatenargAPSPArg1Argn•••SPCALLS nargFPproc: mask1st instr.CS 740 F’02–8–VAX rfactRegisters• r6 saved n• r0 return valueStack Frame• save r6Note• Destination argument last_rfact:.word 0x40 # Save register r6movl 4(ap),r6 # r6 <- ncmpl r6,$1 # if n > 1jgtr L1 # then goto L1movl $1,r0 # r0 <- 1ret # returnL1:pushab -1(r6) # push n-1 calls $1,_rfact # call recursivelymull2 r6,r0 # return result * nretn1Temp SpaceCurrentFrameAPFPSPr6Saved CallerStatePreviousFramePage 5CS 740 F’02–9–x86• Pre-history: 8080• 8-bit microprocessor• Accumulator machine• 8086/8088• 16-bit machine• Added registers, but many implicitExtended Accumulator Machine• Chosen by IBM• 16-bit address space• 8087• Floating point processor• Extended stack machine• 80286• 24-bit address space• 80386• 32-bit machine• Almost a general purpose register machine• Still have 8086 compatibility mode• 80486• MMX• P-III• Added SSE• P4• Added SSE2• Now floating point looks like a general register machine tooCS 740 F’02–10–x86 – Memory Architecture• Segmented architecture• CS, DS, ES, SS• Addressing Modes• Absolute#0x45612789• Register indirectr• Base+displacementr+0x01234567• Indexedr+s• Indexed+displacementr+s+0x01234567• Base + scaled indexr+(s<<scale)• Base + disp + scaled indexr+0x01234567+(s<<scale)Page 6CS 740 F’02–11–X86 - Registers• Very few registers. Lots of meaning attached to each• Accumulator: AX (AH,AL) -> EAX• Count, loop: CX• Data, Mult, Div: DX• Base: BX• Stack pointer: SP• Base pointer: BP (fp)• Source & Dest index regs: SI, DI• 8 Floating point• Condition codes, PCCS 740 F’02–12–X86 - Instructions• Standard ones you would expect, and …• Call, Callf• Ret, retf• Loop: if (--CX) goto PC+0x??• Push, Pop• MOVS• Prefixes: repeat, lock, …• Postfixes: address specifiers• Can vary in length from 1 byte to 17Page 7CS 740 F’02–13–X86 - rfact_rfact:pushl %ebp ! save old stack basemovl %esp,%ebp ! point BP to argssubl $20,%esp ! get some more stack spacepushl %ebx ! save regmovl 8(%ebp),%ebx ! Get argumentcmpl $1,%ebxjle L3leal -1(%ebx),%eax ! TRICKY!addl $-12,%esp ! allocate space for argspushl %eax ! save argcall _rfact ! make callimull %eax,%ebx ! multiply resultmovl %ebx,%eax ! put back in accumulatorjmp L5.align 4L3:movl $1,%eaxL5:movl -24(%ebp),%ebx ! restore regmovl %ebp,%esp ! restore sppopl %ebp ! restore fpretCS 740 F’02–14–Sparse Matrix CodeTask• Multiply sparse matrix times dense vector–Matrix has many zero entries– Save space and time by keeping only nonzero entries– Common applicationCompressed Sparse Row Representation[(0,3.5) (1,0.9) (3,2.2)][(1,4.1) (3,1.9)][(0,4.6) (2,0.7) (3,2.7) (5,3.0)][(2,2.9)][(2,1.2) (4,2.8)][(5,3.4)]3.50.9 2.24.1 1.94.6 0.72.7 3.02.91.2 2.83.4Page 8CS 740 F’02–15–CSR EncodingParameters• nrow Number of rows (and columns)• nentries Number of nonzero matrix entriesVal• List of nonzero values (nentries)Cindex• List of column indices (nentries)Rstart• List of starting positions for each row (nrow+1)typedef struct {int nrow;int nentries; double *val;int *cindex;int *rstart;} csr_rec, *csr_ptr;CS 740 F’02–16–CSR ExampleParameters• nrow = 6• nentries = 13Val[3.5, 0.9, 2.2, 4.1, 1.9, 4.6, 0.7, 2.7, 3.0, 2.9, 1.2, 2.8, 3.4]Cindex[0, 1, 3, 1, 3, 0, 2, 3, 5, 2, 2, 4, 5]Rstart[0, 3, 5, 9, 10, 12, 13][(0,3.5) (1,0.9) (3,2.2)][(1,4.1)
View Full Document