Berkeley COMPSCI 150 - Lecture 9 - CPU Microarchitecture



EECS150 - Digital Design
Lecture 9 - CPU Microarchitecture
Feb 15, 2011
John Wawrzynek
Spring 2011, EECS150 - Lec09-cpu

Watson: Jeopardy-playing Computer
Watson is made up of a cluster of ninety IBM Power 750 servers (plus additional I/O, network and cluster controller nodes in 10 racks) for a total of 2880 POWER7 processor cores and 16 Terabytes of RAM. Each Power 750 server uses a 3.5 GHz POWER7 eight-core processor, with four threads per core, and it still takes ~15 seconds per question.
Each core can do 8 double-precision FLOPs per cycle, so the total is 2880 × 3.5 GHz × 8 > 80 TFLOPS.

Processor Microarchitecture Introduction
Microarchitecture: how to implement an architecture in hardware.
• Good examples of how to put principles of digital design into practice.
• Introduction to the final project.

MIPS Processor Architecture
• For now we consider a subset of MIPS instructions:
  - R-type instructions: and, or, add, sub, slt
  - Memory instructions: lw, sw
  - Branch instructions: beq
• Later we'll add addi and j

MIPS Microarchitecture Organization
Datapath + Controller + External Memory
[Figure: Controller]

How to Design a Processor: step-by-step
1. Analyze the instruction set architecture (ISA) ⇒ datapath requirements
  - the meaning of each instruction is given by the data transfers (register transfers)
  - the datapath must include a storage element for the ISA registers
  - the datapath must support each data transfer
2. Select a set of datapath components and establish a clocking methodology
3. Assemble a datapath meeting the requirements
4. Analyze the implementation of each instruction to determine the setting of control points that effects the data transfer
5. Assemble the control logic

Review: The MIPS Instruction
The three instruction formats are R-type, I-type, and J-type (bit ranges are given as high-low):
R-type:  op (31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | rd (15-11, 5 bits) | shamt (10-6, 5 bits) | funct (5-0, 6 bits)
I-type:  op (31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | address/immediate (15-0, 16 bits)
J-type:  op (31-26, 6 bits) | target address (25-0, 26 bits)
The different fields are:
• op: operation ("opcode") of the instruction
• rs, rt, rd: the source and destination register specifiers
• shamt: shift amount
• funct: selects the variant of the operation in the "op" field
• address / immediate: address offset or immediate value
• target address: target address of the jump instruction

Subset for Lecture
add, sub, or, slt (R-type format: op, rs, rt, rd, shamt, funct)
• addu rd,rs,rt
• subu rd,rs,rt
lw, sw (I-type format: op, rs, rt, immediate)
• lw rt,rs,imm16
• sw rt,rs,imm16
beq (I-type format: op, rs, rt, immediate)
• beq rs,rt,imm16

Register Transfer Descriptions
All start with instruction fetch:
  {op, rs, rt, rd, shamt, funct} ← IMEM[ PC ]  OR
  {op, rs, rt, Imm16} ← IMEM[ PC ]
THEN
inst   Register Transfers
add    R[rd] ← R[rs] + R[rt];  PC ← PC + 4
sub    R[rd] ← R[rs] - R[rt];  PC ← PC + 4
or     R[rd] ← R[rs] | R[rt];  PC ← PC + 4
slt    R[rd] ← (R[rs] < R[rt]) ? 1 : 0;  PC ← PC + 4
lw     R[rt] ← DMEM[ R[rs] + sign_ext(Imm16) ];  PC ← PC + 4
sw     DMEM[ R[rs] + sign_ext(Imm16) ] ← R[rt];  PC ← PC + 4
beq    if ( R[rs] == R[rt] ) then PC ← PC + 4 + {sign_ext(Imm16), 00}
       else PC ← PC + 4

Microarchitecture
Multiple implementations for a single architecture:
- Single-cycle
  • Each instruction executes in a single clock cycle.
- Multicycle
  • Each instruction is broken up into a series of shorter steps with one step per clock cycle.
- Pipelined (variant on "multicycle")
  • Each instruction is broken up into a series of steps with one step per clock cycle.
  • Multiple instructions execute at once.

CPU clocking (1/2)
• Single-cycle CPU: All stages of an instruction are completed within one long clock cycle.
  - The clock cycle is made sufficiently long to allow each instruction to complete all stages without interruption and within one cycle.
Stages: 1. Instruction Fetch, 2. Decode / Register Read, 3. Execute, 4. Memory, 5. Reg. Write

CPU clocking (2/2)
• Multiple-cycle CPU: Only one stage of an instruction per clock cycle.
  - The clock is made as long as the slowest stage.
Several significant advantages over single-cycle execution: unused stages in a particular instruction can be skipped, OR instructions can be pipelined (overlapped).
Stages: 1. Instruction Fetch, 2. Decode / Register Read, 3. Execute, 4. Memory, 5. Reg. Write

MIPS State Elements
• The state elements determine everything about the execution status of a processor:
  - PC register
  - 32 registers
  - Memory
Note: for these state elements, the clock is used for write but not for read (asynchronous read, synchronous write).

Single-Cycle Datapath: lw
First consider executing lw: R[rt] ← DMEM[ R[rs] + sign_ext(Imm16) ]
• STEP 1 (fetch): Fetch the instruction
• STEP 2 (register read): Read source operands from the register file
• STEP 3 (immediate): Sign-extend the immediate
• STEP 4 (address): Compute the memory address
• STEP 5 (memory read): Read data from memory and write it back to the register file
• STEP 6 (PC increment): Determine the address of the next instruction: PC ← PC + 4

Single-Cycle Datapath: sw
• Write the data in rt to memory
DMEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]

Single-Cycle Datapath: R-type instructions
• Read from rs and rt
• Write ALUResult to the register file
• Write to rd (instead of rt)
R[rd] ← R[rs] op R[rt]

Single-Cycle Datapath: beq
• Determine whether the values in rs and rt are equal
• Calculate the branch target address: BTA = (sign-extended immediate << 2) + (PC+4)
if ( R[rs] == R[rt] ) then PC
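The register transfer descriptions for the lecture subset (add, sub, or, slt, lw, sw, beq) can be sketched as a small behavioral model. This is only an illustrative Python sketch, not the course's hardware design: the function names (step, sign_ext16, to_signed32), the dictionary form of a decoded instruction, and the use of a dict for DMEM are all assumptions made for the example. It also assumes 32-bit wraparound arithmetic and a signed comparison for slt, and it does not model MIPS register 0 being hardwired to zero.

```python
MASK32 = 0xFFFFFFFF  # registers and the PC are modeled as 32-bit values


def sign_ext16(imm16):
    """Sign-extend a 16-bit immediate to a Python int (may be negative)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def step(pc, regs, dmem, inst):
    """Apply one instruction's register transfers and return the next PC.

    inst is a hypothetical pre-decoded instruction, for example
    {"op": "lw", "rs": 1, "rt": 2, "imm16": 8}. regs is a list of 32
    integers; dmem is a dict mapping byte addresses to 32-bit words.
    """
    op, rs, rt = inst["op"], inst["rs"], inst["rt"]
    imm = sign_ext16(inst.get("imm16", 0))
    next_pc = (pc + 4) & MASK32  # default transfer: PC <- PC + 4

    if op in ("add", "sub", "or", "slt"):
        a, b = regs[rs], regs[rt]
        if op == "add":
            result = (a + b) & MASK32
        elif op == "sub":
            result = (a - b) & MASK32
        elif op == "or":
            result = (a | b) & MASK32
        else:  # slt, assumed to compare signed values
            result = 1 if to_signed32(a) < to_signed32(b) else 0
        regs[inst["rd"]] = result  # R[rd] <- R[rs] op R[rt]
    elif op == "lw":
        addr = (regs[rs] + imm) & MASK32
        regs[rt] = dmem.get(addr, 0)  # R[rt] <- DMEM[R[rs] + sign_ext(Imm16)]
    elif op == "sw":
        addr = (regs[rs] + imm) & MASK32
        dmem[addr] = regs[rt]  # DMEM[R[rs] + sign_ext(Imm16)] <- R[rt]
    elif op == "beq":
        if regs[rs] == regs[rt]:
            # {sign_ext(Imm16), 00} is the sign-extended immediate shifted left by 2
            next_pc = (pc + 4 + (imm << 2)) & MASK32
    else:
        raise ValueError("instruction not in the lecture subset: " + op)
    return next_pc
```

Calling step repeatedly with the returned PC walks through a program one instruction at a time, which mirrors the single-cycle view where every transfer for an instruction completes before the next instruction starts.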

