DOC PREVIEW
CMU CS 15740 - Instruction Set Comparisons

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Page 1Instruction Set ComparisonsCS 740Sept. 20, 2002Topics• Code Examples–Procedure Linkage– Sparse Matrix Code• Instruction Sets–Alpha–PowerPC–VAX–x86CS 740 F’02–2–Procedure ExamplesProcedure linkage• Passing of control and data• stack managementControl• Register tests• Condition codes• loop countersint rfact(int n){if (n <= 1)return 1;return n*rfact(n-1);}Code ExamplePage 2CS 740 F’02–3–Alpha rfactRegisters• $16 Argument n• $9 Saved n– Callee save• $0 Return ValueStack Frame• 16 bytes• $9• $26 return PC rfact:ldgp $29,0($27) # setup gprfact..ng:lda $30,-16($30) # $sp -= 16.frame $30,16,$26,0stq $26,0($30) # save return addrstq $9,8($30) # save $9.mask 0x4000200,-16.prologue 1bis $16,$16,$9 # $9 = ncmple $9,1,$1 # if (n <= 1) thenbne $1,$80 # branch to $80subq $9,1,$16 # $16 = n - 1bsr $26,rfact..ng # recursive callmulq $9,$0,$0 # $0 = n*rfact(n-1)br $31,$81 # branch to epilogue.align 4$80:bis $31,1,$0 # return val = 1$81:ldq $26,0($30) # restore retrn addrldq $9,8($30) # restore $9addq $30,16,$30 # $sp += 16ret $31,($26),1save $9save $26$sp + 8$sp + 0CS 740 F’02–4–VAXPinnacle of CISC• Maximize instruction density• Provide instructions closely matched to typical program operationsInstruction format• OP, arg1, arg2, …– Each argument has arbitrary specifier– Accessing operands may have side effectsCondition Codes• Set by arithmetic and comparison instructions• Basis for successive branchesProcedure Linkage• Direct implementation of stack disciplinePage 3CS 740 F’02–5–VAX Registers• R0–R11 General purpose– R2–R7 Callee save in example code– Use pair to hold double• R12 AP Argument pointer– Stack region holding procedure arguments• R13 FP Frame pointer– Base of current stack frame• R14 SP Stack pointer– Top of stack• R15 PC Program counter– Used to access data in code• N C V Z Condition codes– Information about last operation result– Negative, Carry, 2’s OVF, ZeroArgsArg CountSavedCallerStateLocalVarsTempSpaceCurrentFramePreviousFrameAPFPSPVaries with push/popAddressCS 740 F’02–6–VAX Operand SpecifiersFormsNotation Value Side Eff UseRi ri General purpose register$v v Immediate data(Ri) M[ri] Memory referencev(Ri) M[ri+v] Mem. ref. with displacementA[Ri] M[a+ri*d] Array indexing–Ais specifier denoting address a– d is size of datum(Ri)+ M[ri] Ri += d Stepping pointer forward-(Ri) M[ri-d] Ri –= d Stepping pointer backExamples• Push srcmove src, –(SP)• Pop destmove (SP)+, destPage 4CS 740 F’02–7–VAX Procedure LinkageCaller• Push Arguments–Relative to SP• Execute CALLS narg, proc– narg denotes number of argument words– proc starts with mask denoting registers to be saved on stackCALLS Instruction• Creates stack frame– Saved registers, PC, FP, AP, mask, PSW• Sets AP, FP, SPCallee• Compute return value in R0• Execute ret– Undoes effect of CALLSArg1Argn•••SavedStatenargAPSPArg1Argn•••SPCALLS nargFPproc: mask1st instr.CS 740 F’02–8–VAX rfactRegisters• r6 saved n• r0 return valueStack Frame• save r6Note• Destination argument last_rfact:.word 0x40 # Save register r6movl 4(ap),r6 # r6 <- ncmpl r6,$1 # if n > 1jgtr L1 # then goto L1movl $1,r0 # r0 <- 1ret # returnL1:pushab -1(r6) # push n-1 calls $1,_rfact # call recursivelymull2 r6,r0 # return result * nretn1Temp SpaceCurrentFrameAPFPSPr6Saved CallerStatePreviousFramePage 5CS 740 F’02–9–x86• Pre-history: 8080• 8-bit microprocessor• Accumulator machine• 8086/8088• 16-bit machine• Added registers, but many implicitExtended Accumulator Machine• Chosen by IBM• 16-bit address space• 8087• Floating point processor• Extended stack machine• 80286• 24-bit address space• 80386• 32-bit machine• Almost a general purpose register machine• Still have 8086 compatibility mode• 80486• MMX• P-III• Added SSE• P4• Added SSE2• Now floating point looks like a general register machine tooCS 740 F’02–10–x86 – Memory Architecture• Segmented architecture• CS, DS, ES, SS• Addressing Modes• Absolute#0x45612789• Register indirectr• Base+displacementr+0x01234567• Indexedr+s• Indexed+displacementr+s+0x01234567• Base + scaled indexr+(s<<scale)• Base + disp + scaled indexr+0x01234567+(s<<scale)Page 6CS 740 F’02–11–X86 - Registers• Very few registers. Lots of meaning attached to each• Accumulator: AX (AH,AL) -> EAX• Count, loop: CX• Data, Mult, Div: DX• Base: BX• Stack pointer: SP• Base pointer: BP (fp)• Source & Dest index regs: SI, DI• 8 Floating point• Condition codes, PCCS 740 F’02–12–X86 - Instructions• Standard ones you would expect, and …• Call, Callf• Ret, retf• Loop: if (--CX) goto PC+0x??• Push, Pop• MOVS• Prefixes: repeat, lock, …• Postfixes: address specifiers• Can vary in length from 1 byte to 17Page 7CS 740 F’02–13–X86 - rfact_rfact:pushl %ebp ! save old stack basemovl %esp,%ebp ! point BP to argssubl $20,%esp ! get some more stack spacepushl %ebx ! save regmovl 8(%ebp),%ebx ! Get argumentcmpl $1,%ebxjle L3leal -1(%ebx),%eax ! TRICKY!addl $-12,%esp ! allocate space for argspushl %eax ! save argcall _rfact ! make callimull %eax,%ebx ! multiply resultmovl %ebx,%eax ! put back in accumulatorjmp L5.align 4L3:movl $1,%eaxL5:movl -24(%ebp),%ebx ! restore regmovl %ebp,%esp ! restore sppopl %ebp ! restore fpretCS 740 F’02–14–Sparse Matrix CodeTask• Multiply sparse matrix times dense vector–Matrix has many zero entries– Save space and time by keeping only nonzero entries– Common applicationCompressed Sparse Row Representation[(0,3.5) (1,0.9) (3,2.2)][(1,4.1) (3,1.9)][(0,4.6) (2,0.7) (3,2.7) (5,3.0)][(2,2.9)][(2,1.2) (4,2.8)][(5,3.4)]3.50.9 2.24.1 1.94.6 0.72.7 3.02.91.2 2.83.4Page 8CS 740 F’02–15–CSR EncodingParameters• nrow Number of rows (and columns)• nentries Number of nonzero matrix entriesVal• List of nonzero values (nentries)Cindex• List of column indices (nentries)Rstart• List of starting positions for each row (nrow+1)typedef struct {int nrow;int nentries; double *val;int *cindex;int *rstart;} csr_rec, *csr_ptr;CS 740 F’02–16–CSR ExampleParameters• nrow = 6• nentries = 13Val[3.5, 0.9, 2.2, 4.1, 1.9, 4.6, 0.7, 2.7, 3.0, 2.9, 1.2, 2.8, 3.4]Cindex[0, 1, 3, 1, 3, 0, 2, 3, 5, 2, 2, 4, 5]Rstart[0, 3, 5, 9, 10, 12, 13][(0,3.5) (1,0.9) (3,2.2)][(1,4.1)


View Full Document

CMU CS 15740 - Instruction Set Comparisons

Documents in this Course
leecture

leecture

17 pages

Lecture

Lecture

9 pages

Lecture

Lecture

36 pages

Lecture

Lecture

9 pages

Lecture

Lecture

13 pages

lecture

lecture

25 pages

lect17

lect17

7 pages

Lecture

Lecture

65 pages

Lecture

Lecture

28 pages

lect07

lect07

24 pages

lect07

lect07

12 pages

lect03

lect03

3 pages

lecture

lecture

11 pages

lecture

lecture

20 pages

lecture

lecture

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

10 pages

Lecture

Lecture

22 pages

Lecture

Lecture

28 pages

Lecture

Lecture

18 pages

lecture

lecture

63 pages

lecture

lecture

13 pages

Lecture

Lecture

36 pages

Lecture

Lecture

18 pages

Lecture

Lecture

17 pages

Lecture

Lecture

12 pages

lecture

lecture

34 pages

lecture

lecture

47 pages

lecture

lecture

7 pages

Lecture

Lecture

18 pages

Lecture

Lecture

7 pages

Lecture

Lecture

21 pages

Lecture

Lecture

10 pages

Lecture

Lecture

39 pages

Lecture

Lecture

11 pages

lect04

lect04

40 pages

Load more
Download Instruction Set Comparisons
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Instruction Set Comparisons and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Instruction Set Comparisons 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?