CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Instruction Level Parallelism: The Datapath
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/fa10
Spring 2011, Lecture 20 (4/1/11)

You Are Here!

Software:
  - Parallel Requests: assigned to computer, e.g., Search "Katz"
  - Parallel Threads: assigned to core, e.g., Lookup, Ads
  - Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions (Today's Lecture)
  - Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
  - Hardware descriptions: all gates functioning in parallel at same time
Goal: Harness Parallelism & Achieve High Performance

[Figure: hardware levels. Warehouse Scale Computer, Smart Phone, Computer, Core, Memory (Cache), Input/Output, Instruction Unit(s), Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3, Main Memory, Logic Gates]

Agenda

  - Pipelined Execution
  - Administrivia
  - Pipelined Datapath
  - Pipeline Hazards
  - Technology Break
  - Pipelining and Instruction Set Design
  - Summary

Review: Single Cycle Processor

Five steps to design a processor:
  1. Analyze instruction set -> datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements: re-examine for pipelining
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
       - Formulate Logic Equations
       - Design Circuits

[Figure: Processor (Control, Datapath), Memory, Input, Output]

Review: RISC Design Principles

"A simpler core is a faster core." Reduction in the number and complexity of instructions in the ISA simplifies pipelined implementation.

Common RISC strategies:
  - Fixed instruction length, generally a single word: simplifies process of fetching instructions from memory
  - Simplified addressing modes: simplifies process of fetching operands from memory
  - Fewer and simpler instructions in the instruction set: simplifies process of executing instructions
  - Simplified memory access: only load and store instructions access memory
  - Let the compiler do it: use a good compiler to break complex high-level language statements into a number of simple assembly language statements

Single Cycle Performance

Assume time for actions are 100ps for register read or write, 200ps for other events.
Clock rate is: ___

  Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
  lw        200ps        100ps          200ps   200ps          100ps           800ps
  sw        200ps        100ps          200ps   200ps                          700ps
  R-format  200ps        100ps          200ps                  100ps           600ps
  beq       200ps        100ps          200ps                                  500ps

  - What can we do to improve clock rate?
  - Will this improve performance as well?
  - Want increased clock rate to mean faster programs
(Student Roulette)

Pipeline Analogy: Doing Laundry

Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, fold, and put away.
  - Washer takes 30 minutes
  - Dryer takes 30 minutes
  - "Folder" takes 30 minutes
  - "Stasher" takes 30 minutes to put clothes into drawers

Sequential Laundry

[Figure: timeline from 6 PM to 2 AM; tasks A, B, C, D in task order, each made of four 30-minute blocks run back to back]

Sequential laundry takes 8 hours for 4 loads.

Pipelined Laundry

[Figure: timeline from 6 PM; tasks A, B, C, D in task order, overlapping, with a new load starting every 30 minutes]

Pipelined laundry takes 3.5 hours for 4 loads!

Pipelining Lessons (1/2)

  - Pipelining doesn't help latency of single task, it helps throughput of entire workload
  - Multiple tasks operating simultaneously using different resources
  - Potential speedup = Number pipe stages
  - Time to fill pipeline and time to drain it reduces speedup: 2.3X v. 4X in this example

Pipelining Lessons (2/2)

  - Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?
  - Pipeline rate limited by slowest pipeline stage
  - Unbalanced lengths of pipe stages reduces speedup

Administrivia

  - Project 4: Pipelined Cycle Processor in Logicsim
  - Due: Part 1 (datapath) due 4/10, Part 2 due 4/17
  - Face-to-face grading: sign up for timeslot last week
  - Extra Credit: Fastest Version of Project 3
  - Due 4/24 23:59:59
  - Final Review: TBD
  - Final: Mon May 9, 11AM-2PM (TBD)

Review: Single Cycle Datapath

[Figure: single cycle datapath. Instruction fetch unit; instruction fields op (bits 31-26), rs (25-21), rt (20-16), rd, immediate (15-0); RegFile with ports Rw, Ra, Rb and buses busA, busB, busW; Extender for imm16; ALU with zero output; Data Memory with WrEn, Adr, Data In; control signals nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg; clk]

Steps in Executing MIPS

  1) IF: Instruction Fetch, Increment PC
  2) ID: Instruction Decode, Read Registers
  3) EX: Mem-ref: Calculate Address; Arith-log: Perform Operation
  4) Mem: Load: Read Data from Memory; Store: Write Data to Memory
  5) WB: Write Data Back to Register

Redrawn Single Cycle Datapath

[Figure: PC, +4, instruction memory, registers (rs, rt, rd), imm, ALU, Data memory; stages 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back]

Pipelined Datapath

[Figure: the same datapath with a register between each pair of stages]

  - Add registers between stages
  - Hold information produced in previous cycle

More Detailed Pipeline

[Figure only]

IF for Load, Store
ID for Load, Store
EX for Load
MEM for Load
…
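The laundry numbers on the slides (8 hours sequential, 3.5 hours pipelined, 2.3X speedup, and the question about a 20-minute washer and stasher) can be checked with a small simulation. This is a sketch under stated assumptions, not part of the lecture: the function names are made up for illustration, and it assumes a finished load can wait between stages without blocking the stage it just left.

```python
def sequential_time(stage_minutes, loads):
    """Minutes needed when each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_time(stage_minutes, loads):
    """Minutes needed when loads overlap, with at most one load in each stage.

    Assumption (not from the slides): a load may wait between stages,
    so a slow stage delays later loads but never blocks an earlier stage.
    """
    # finish[s] is the time at which the most recent load left stage s.
    finish = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load left the previous stage
        for s, minutes in enumerate(stage_minutes):
            # A load starts a stage once it is ready AND the stage is free.
            start = max(ready, finish[s])
            finish[s] = start + minutes
            ready = finish[s]
    return finish[-1]


if __name__ == "__main__":
    balanced = [30, 30, 30, 30]    # washer, dryer, folder, stasher
    unbalanced = [20, 30, 30, 20]  # new washer and stasher take 20 minutes

    seq = sequential_time(balanced, 4)
    pipe = pipelined_time(balanced, 4)
    print(seq / 60, pipe / 60, round(seq / pipe, 1))  # 8.0 3.5 2.3
    print(pipelined_time(unbalanced, 4))              # 190
```

Under these assumptions the faster washer and stasher only cut the four-load total from 210 to 190 minutes. A new load still finishes every 30 minutes, because the dryer and folder remain the slowest stages.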
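The Single Cycle Performance table can be reproduced from its two stated assumptions (100ps for a register read or write, 200ps for other events). The sketch below is illustrative only: the dictionary and function names are hypothetical, and the pipelined clock period is an assumption that applies the "rate limited by slowest stage" lesson to the five stages while ignoring the delay of the registers added between them.

```python
# Per-event latencies in picoseconds, as assumed on the slide.
STAGE_PS = {
    "fetch": 200,
    "reg_read": 100,
    "alu": 200,
    "mem": 200,
    "reg_write": 100,
}

# Events each instruction class uses, read off the table.
USES = {
    "lw": ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "mem"],
    "R-format": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}


def instruction_time_ps(name):
    """Total latency of one instruction class (the table's last column)."""
    return sum(STAGE_PS[event] for event in USES[name])


def single_cycle_period_ps():
    """A single-cycle clock must be long enough for the slowest instruction."""
    return max(instruction_time_ps(name) for name in USES)


def pipelined_period_ps():
    """Assumption: the pipelined clock is set by the slowest single stage."""
    return max(STAGE_PS.values())


def clock_rate_ghz(period_ps):
    """Convert a clock period in picoseconds to a rate in GHz."""
    return 1000.0 / period_ps


if __name__ == "__main__":
    for name in USES:
        print(name, instruction_time_ps(name), "ps")
    print(clock_rate_ghz(single_cycle_period_ps()), "GHz single cycle")
    print(clock_rate_ghz(pipelined_period_ps()), "GHz pipelined (assumed)")
```

The single-cycle period comes out at 800ps because lw is the slowest instruction, so every instruction pays for the longest path even when it skips a stage.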
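To see how the five execution steps overlap once registers separate the stages, the following sketch builds a cycle-by-cycle chart for a short instruction sequence. It is an idealized illustration with hypothetical function names: it assumes one instruction enters the pipeline per cycle and that there are no hazards or stalls, which the lecture treats separately.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return one row per instruction; column c holds the stage active in cycle c + 1.

    Assumption: an ideal pipeline, so instruction i enters IF in cycle i + 1
    and advances one stage per cycle with no stalls.
    """
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(row)
    return rows


def format_diagram(instructions):
    """Render the diagram as aligned plain text, one line per instruction."""
    rows = pipeline_diagram(instructions)
    width = max(len(name) for name in instructions)
    lines = []
    for name, row in zip(instructions, rows):
        cells = " ".join(f"{cell:<3}" for cell in row)
        lines.append(f"{name:<{width}}  {cells}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_diagram(["lw", "sw", "add"]))
```

With three instructions the chart spans 3 + 5 - 1 = 7 cycles: the first four cycles fill the pipeline, mirroring the fill and drain time noted in the laundry lessons.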


Berkeley COMPSCI 61C - Lecture Notes
