CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Single-Cycle MIPS CPU
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11
Spring 2011, Lecture #18 (3/30/11)

You Are Here!
Harness parallelism and achieve high performance, from software down to hardware:
- Parallel Requests: assigned to computer, e.g., search "Katz"
- Parallel Threads: assigned to core, e.g., lookup, ads
- Parallel Instructions: more than 1 instruction at one time, e.g., 5 pipelined instructions
- Parallel Data: more than 1 data item at one time, e.g., add of 4 pairs of words
- Hardware descriptions: all gates functioning in parallel at same time
[Figure: Warehouse Scale Computer and Smart Phone; Computer with Cores, Memory (Cache), Input/Output, Main Memory; Core with Instruction Unit(s) and Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3; Logic Gates. "Today" marks the level covered in this lecture.]

Levels of Representation/Interpretation
High Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  Compiler
Assembly Language Program (e.g., MIPS):
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  Assembler
Machine Language Program (MIPS):
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions.

Review
- Clocks tell us when D flip-flops change
  - Setup and hold times important
- We pipeline long-delay CL for faster clock
- Finite State Machines extremely useful
  - Can implement FSM with register + logic
- Use muxes to select among inputs
  - S input bits selects 2^S inputs
  - Each input can be n bits wide, independent of S
  - Can implement muxes hierarchically

Agenda
- MIPS-lite Datapath
- Administrivia
- CPU Timing
- MIPS-lite Control
- Datapath Control
- Technology Break
- Control Implementation

The MIPS-lite Subset
- ADDU and SUBU
    addu rd, rs, rt
    subu rd, rs, rt

    Field   op      rs      rt      rd      shamt   funct
    Bits    31-26   25-21   20-16   15-11   10-6    5-0
    Width   6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- OR Immediate
    ori rt, rs, imm16
- LOAD and STORE Word
    lw rt, rs, imm16
    sw rt, rs, imm16
- BRANCH
    beq rs, rt, imm16

  ori, lw, sw, and beq all use the same format:

    Field   op      rs      rt      immediate
    Bits    31-26   25-21   20-16   15-0
    Width   6 bits  5 bits  5 bits  16 bits

Processor Design Process
Five steps to design a processor:
- Step 1: Analyze instruction set to determine datapath requirements
- Step 2: Select set of datapath components and establish clocking methodology
- Step 3: Assemble datapath components that meet the requirements
- Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
- Step 5: Assemble the control logic

Register Transfer Language (RTL)
- RTL gives the meaning of the instructions
- All start by fetching the instruction:
    {op, rs, rt, rd, shamt, funct} ← MEM[PC]
    {op, rs, rt, Imm16} ← MEM[PC]

    Inst    Register Transfers
    ADDU    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
    SUBU    R[rd] ← R[rs] - R[rt]; PC ← PC + 4
    ORI     R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
    LOAD    R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
    STORE   MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
    BEQ     if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00)
            else PC ← PC + 4

Step 1: Requirements of the Instruction Set
- Memory (MEM)
  - Instructions and data (will use one for each: really caches)
- Registers (R: 32 x 32)
  - Read rs
  - Read rt
  - Write rt or rd
- PC
- Extender (sign/zero extend)
- Add/Sub/OR unit for operation on register(s) or extended immediate
- Add 4 (plus maybe extended immediate) to PC
- Compare if registers equal?

Generic Steps of Datapath
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: PC and +4 adder, instruction memory, registers (rd, rs, rt), imm, ALU, data memory, muxes.]

Step 2: Components of the Datapath
- Combinational Elements
- Storage Elements
  - Clocking Methodology
- Building Blocks
  - Adder: 32-bit inputs A and B, CarryIn; outputs 32-bit Sum and CarryOut
  - Multiplexer (MUX): 32-bit inputs A and B, Select; 32-bit output Y
  - ALU: 32-bit inputs A and B, OP; 32-bit Result

ALU Needs for MIPS-lite + Rest of MIPS
- Addition, subtraction, logical OR, ==:
    ADDU    R[rd] = R[rs] + R[rt]
    SUBU    R[rd] = R[rs] - R[rt]
    ORI     R[rt] = R[rs] | zero_ext(Imm16)
    BEQ     if (R[rs] == R[rt])
- Test to see if output == 0 for any ALU operation gives == test. How?
- P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)
- ALU from Appendix C, section C.5

Storage Element: Idealized Memory
- Memory (idealized)
  - One input bus: Data In
  - One output bus: Data Out
- Memory word is found by its address
- Clock input (CLK)
  - CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block: Address valid ⇒ Data Out valid after "access time"
[Figure: memory block with Address, 32-bit Data In, 32-bit DataOut, Write Enable, Clk.]

32 x 32-bit Registers
- Register is selected by:
  - RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
- RA or RB valid ⇒ busA or busB valid after "access time"
[Figure: register file with 5-bit RW, RA, RB; 32-bit busW, busA, busB; Write Enable, Clk.]

Next Address Logic
- Sequential code: PC ← PC + 4
- Branch and Jump: PC ← "something else"
[Figure: PC, Next Address Logic, Address, Instruction Memory, 32-bit Instruction Word.]

Administrivia
- Project 3, Part 2 due Sunday 4/3
  - Thread-Level Parallelism and OpenMP
- Project 4, Part 1 due Sunday 4/10
  - Design a 16-bit pipelined computer in Logisim
- Last homework due Sunday 4/10
- Project 4, Part 2 due Sunday 4/17

Step 3: Add & Subtract
- R[rd] ← R[rs] op R[rt], e.g., addu rd, rs, rt
  - Ra, Rb, and Rw come from instruction's Rs, Rt, and Rd fields
  - ALUctr and RegWr: control logic after decoding the instruction

    Field   op      rs      rt      rd      shamt   funct
    Bits    31-26   25-21   20-16   15-11   10-6    5-0
    Width   6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

[Figure: 32 x 32-bit Registers with Rw, Ra, Rb, clk; busA and busB feed the ALU (ALUctr); Result returns on busW.]
…
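The register transfers for the six MIPS-lite instructions (ADDU, SUBU, ORI, LOAD, STORE, BEQ) can be tried out in software. What follows is a minimal Python sketch of that RTL, not the lecture's hardware datapath. The helper names (`sign_ext`, `decode`, `r_type`, `i_type`, `step`) and the use of dictionaries for memory are assumptions made for illustration. The opcode and funct values are the standard MIPS32 encodings and do not appear on the slides. Keeping register 0 at zero is also standard MIPS behavior rather than something the slides state.

```python
MASK32 = 0xFFFFFFFF  # keep register values within 32 bits


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a signed Python int."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def decode(word):
    """Split a 32-bit instruction word into the fields named on the slides."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
        "imm16": word & 0xFFFF,        # bits 15-0
    }


def r_type(rs, rt, rd, funct):
    """Hypothetical helper: assemble an R-format word (op = 0, shamt = 0)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct


def i_type(op, rs, rt, imm16):
    """Hypothetical helper: assemble an I-format word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


def step(pc, R, MEM, IMEM):
    """Execute one instruction and return the next PC.

    R is a list of 32 register values; MEM and IMEM are dicts keyed by
    byte address (separate data and instruction memories, as on the slides).
    """
    f = decode(IMEM[pc])                      # all start by fetching MEM[PC]
    op, rs, rt, rd = f["op"], f["rs"], f["rt"], f["rd"]
    next_pc = (pc + 4) & MASK32               # sequential code: PC <- PC + 4

    if op == 0x00 and f["funct"] == 0x21:     # ADDU
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif op == 0x00 and f["funct"] == 0x23:   # SUBU
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif op == 0x0D:                          # ORI: zero-extended immediate
        R[rt] = R[rs] | f["imm16"]
    elif op == 0x23:                          # LOAD (lw)
        addr = (R[rs] + sign_ext(f["imm16"])) & MASK32
        R[rt] = MEM.get(addr, 0)              # unwritten memory reads as 0 (assumption)
    elif op == 0x2B:                          # STORE (sw)
        addr = (R[rs] + sign_ext(f["imm16"])) & MASK32
        MEM[addr] = R[rt]
    elif op == 0x04:                          # BEQ
        if R[rs] == R[rt]:
            # PC <- PC + 4 + (sign_ext(Imm16) || 00)
            next_pc = (pc + 4 + (sign_ext(f["imm16"]) << 2)) & MASK32
    else:
        raise ValueError("not a MIPS-lite instruction")

    R[0] = 0                                  # register 0 stays zero (standard MIPS)
    return next_pc
```

For instance, loading `IMEM` with `i_type(0x0D, 0, 1, 5)` at address 0, `i_type(0x0D, 0, 2, 7)` at address 4, and `r_type(1, 2, 3, 0x21)` at address 8, then calling `step` three times starting from `pc = 0`, should leave 12 in `R[3]`. Every branch of `step` corresponds to one row of the register-transfer table, which is the same case analysis the control logic has to perform in hardware.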