GT CS 4803 - CS 4803 MIPS
School name Georgia Tech
Pages 22

This preview shows page 1-2-21-22 out of 22 pages.


Spring 2010 Prof. Hyesoon Kim

• MIPS (Microprocessor without Interlocked Pipeline Stages)
• MIPS Computer Systems Inc.
• MIPS architecture usages
• 1990’s
  – R2000, R3000, R4000, Motorola 68000 family
• Playstation, Playstation 2, Sony PSP handheld, Nintendo 64 console
• Android
http://en.wikipedia.org/wiki/MIPS_architecture

• MIPS R4000 CPU core
• Floating point and vector floating point co-processors
• 3D-CG extended instruction sets
• Graphics
  – 3D curved surface and other 3D functionality
  – Hardware clipping, compressed texture handling
• R4300 (embedded version)
  – Nintendo-64
http://www.digitaltrends.com/gaming/sony-announces-playstation-portable-specs/

• Started from 32-bit
• Later 64-bit
• 16-bit compression version (similar to ARM Thumb)
• SIMD additions: 64-bit floating point
http://www.spiritus-temporis.com/mips-architecture/

• Conditionally move one CPU general register to another
• Limited form of predicated execution
  – Difference between fully predicated execution and conditional move?

• 32-bit fixed-format instructions (3 formats)
• 31 32-bit GPRs (R0 contains zero) and 32 FP registers (and HI, LO)
• Partitioned by software convention
• 3-address, reg-reg arithmetic instructions
• Single address mode for load/store: base + displacement
• Simple branch conditions
• Compare one register against zero, or two registers for equality
• No condition codes for integer operations

The Mips R4000 Processor, Mirapuri, S.; Woodacre, M.; Vasseghi, N.; IEEE Micro, Volume 12, Issue 2, 1992, pp. 10-22

[Figure from The Mips R4000 Processor, Mirapuri, Woodacre, Vasseghi, ’92. Legend: P-cache: primary cache; S-cache: secondary cache]

• Q: Why is the tag check stage at the end of the load access?
• A: The cache is virtually indexed, physically tagged (VIPT)
[Figure: VIPT cache lookup. Labels: Virtual Address, Cache Tags, Cache Data, TLB, Physical Address, Physical Tag, "=" comparison, Hit?]

R2000 load has a delay slot
  LW ra ---
  Addi ra rb rc
  Addi ra rb rc
Good idea? Bad idea?
R4000 does not have load delay slots.
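To make the "Good idea? Bad idea?" question about the R2000 load delay slot more concrete, here is a minimal Python sketch of the visible behaviour: with a one-instruction load delay slot, the instruction right after a load still reads the old register value. The tuple instruction encoding, the `run` function, and the register and memory contents are hypothetical and chosen only for illustration. The `load_delay_slot=False` case simply makes the loaded value visible immediately, which stands in for a design without an exposed delay slot; it is not a timing model of either processor.

```python
def run(program, regs, memory, load_delay_slot=True):
    """Execute a tiny instruction list and return the final registers.

    Hypothetical encoding used only in this sketch:
      ("lw",  dest, addr)        dest <- memory[addr]
      ("add", dest, src1, src2)  dest <- src1 + src2
    With load_delay_slot=True, a loaded value is written back only after
    the next instruction has read its operands.
    """
    regs = dict(regs)
    pending = None  # (dest, value) of a load that is not yet visible
    for instr in program:
        op, dest = instr[0], instr[1]
        if op == "lw":
            value = memory[instr[2]]
        elif op == "add":
            # Operands are read before any pending load becomes visible.
            value = regs[instr[2]] + regs[instr[3]]
        else:
            raise ValueError(f"unknown op: {op}")
        # The previous load (if any) becomes visible only now.
        if pending is not None:
            regs[pending[0]] = pending[1]
            pending = None
        if op == "lw" and load_delay_slot:
            pending = (dest, value)
        else:
            regs[dest] = value
    if pending is not None:
        regs[pending[0]] = pending[1]
    return regs


program = [
    ("lw", "ra", 0x100),        # ra <- memory[0x100]
    ("add", "rb", "ra", "rc"),  # sits in the load delay slot and reads ra
]
regs = {"ra": 1, "rb": 0, "rc": 10}
memory = {0x100: 42}

with_slot = run(program, regs, memory, load_delay_slot=True)
without_slot = run(program, regs, memory, load_delay_slot=False)
print(with_slot["rb"])     # 11: old ra (1) + rc (10)
print(without_slot["rb"])  # 52: loaded ra (42) + rc (10)
```

The same two-instruction program gives different results depending on whether the delay slot is exposed, which is one way to see why exposing it ties software to a particular pipeline.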
See old ra value (before load)

• 2-cycle delay loads
• Data is not available until the end of DS
• Only the DF/DS/TC/WB stages make progress for load instructions (the IS/RF/EX pipeline stages stall)

• 2-level cache hierarchy
• Different line sizes
  – Pros? Cons?
• Inclusive cache
• Primary cache: initial design 8 KB → 32 KB
  – Direct-mapped, VIPT
  – 16 or 32 B software-programmable line size
• Secondary cache
  – 128-bit, up to 4 MB

[Figure: five-stage pipeline (FE, ID, EX, MEM, WB) over cycles 1-6, with the branch at 0x800 and its target at 0x900. Instructions shown: mul r2, r3, r4 (labelled 0x900 target); sub r1, r2, r3; add r4, r2, r3; br target. Caption: always two cycles of pipeline bubble.]

Change the rule! Always execute the next two instructions after a branch.

[Figure: the same code and pipeline with the two instructions after the branch always executed; fetch addresses 0x800, 0x804, 0x808, 0x900, 0x904, 0x908 over cycles 1-7. Caption: no pipeline bubble!!]

• N-cycle delay slot
• The compiler fills the delay slot with useful instructions
• Different options:
  – Fill the slot from before the branch instruction
    • Restriction: the branch must not depend on the result of the filled instruction
  – Fill the slot from the target of the branch instruction
    • Restriction: should be OK to execute the instruction even if the branch is not taken
  – Fill the slot from the fall-through of the branch
    • Restriction: should be OK to execute the instruction even if the branch is taken

Still: cancelling or nullifying instructions
• Branch:
  – Execute the instructions in the delay slot
• Branch likely
  – Do not execute the instructions in the delay slot if the branch is not taken
• Do not use branch likely!
  – It won’t be supported in the future

• Many DSP architectures, older RISC: MIPS, PA-RISC, SPARC
• Delayed branches are architecturally visible
  – Advantage:
    • Better performance
  – Disadvantage:
    • What if the implementation changes?
    • Deeper pipeline -> more branch delays?
    • Interrupts/exceptions?
      – Where to go back?
    • Combining with a branch predictor?

• Later designs are based on R10K
• Out-of-order superscalar processor
• ROB, 32 in-flight instructions
• 4-instruction
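Returning to the branch delay slot discussion above, the following minimal Python sketch contrasts the two fetch behaviours from the pipeline figures: squashing the two instructions fetched after a taken branch (two bubbles) versus always executing them (a two-instruction delay slot). The `fetch_trace` function and the program layout are hypothetical. In particular, the placement of add and sub at 0x804 and 0x808 is an assumption, since the figure's exact layout is not recoverable from the text. The sketch also assumes the branch is always taken and that no branch sits inside a delay slot.

```python
DELAY_SLOTS = 2  # the example assumes the branch redirects fetch two slots later


def fetch_trace(program, start, delayed_branch, cycles):
    """Return what the fetch stage delivers in each cycle.

    program maps address -> (text, branch_target or None).
    delayed_branch=False: instructions fetched after a taken branch are
    squashed and show up as "bubble".
    delayed_branch=True: those instructions are always executed.
    """
    trace = []
    pc = start
    countdown = None  # fetch slots left before the PC is redirected
    target = None
    for _ in range(cycles):
        text, branch_target = program.get(pc, ("nop", None))
        if countdown is not None and not delayed_branch:
            trace.append("bubble")  # wrong-path fetch is thrown away
        else:
            trace.append(text)
        if countdown is not None:
            countdown -= 1
            if countdown == 0:
                pc, countdown = target, None
                continue
        elif branch_target is not None:
            # Branch assumed taken; a branch inside a delay slot is ignored.
            countdown, target = DELAY_SLOTS, branch_target
        pc += 4
    return trace


# Assumed layout, loosely following the figure's addresses.
program = {
    0x800: ("br target", 0x900),
    0x804: ("add r4, r2, r3", None),
    0x808: ("sub r1, r2, r3", None),
    0x900: ("mul r2, r3, r4", None),
}

squashed = fetch_trace(program, 0x800, delayed_branch=False, cycles=4)
delayed = fetch_trace(program, 0x800, delayed_branch=True, cycles=4)
print(squashed)  # ['br target', 'bubble', 'bubble', 'mul r2, r3, r4']
print(delayed)   # ['br target', 'add r4, r2, r3', 'sub r1, r2, r3', 'mul r2, r3, r4']
```

In the delayed case the two slots do useful work only if the compiler found instructions that are safe to execute whether or not the branch is taken, which is exactly the set of restrictions listed for the slot-filling options.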

