Berkeley COMPSCI 252 - Lecture 4: Introduction to Advanced Pipelining

Slide titles:
Lecture 4: Introduction to Advanced Pipelining
Review: Control Flow and Exceptions
Exception/Interrupt classifications
A related classification: Synchronous vs. Asynchronous
Recap: Device Interrupt (Say, arrival of network message)
Interrupt controller hardware and mask levels
What about interrupt overhead? SPARC (and RISC I) had register windows
Supervisor State
Entry into Supervisor Mode
Precise Interrupts/Exceptions
Precise interrupt point requires multiple PCs to describe in presence of delayed branches
Why are precise interrupts desirable?
Precise Exceptions in simple 5-stage pipeline:
Approximations to precise interrupts
How to achieve precise interrupts when instructions executing in arbitrary order?
Review: Summary of Pipelining Basics
Case Study: MIPS R4000 (200 MHz)
Case Study: MIPS R4000
MIPS R4000 Floating Point
MIPS FP Pipe Stages
R4000 Performance
Advanced Pipelining and Instruction Level Parallelism (ILP)
FP Loop: Where are the Hazards?
FP Loop Showing Stalls
Revised FP Loop Minimizing Stalls
Unroll Loop Four Times (straightforward way)
Unrolled Loop That Minimizes Stalls
Compiler Perspectives on Code Movement
Where are the data dependencies?
Slide 30
Where are the name dependencies?
Slide 33
When Safe to Unroll Loop?
Does a loop-carried dependence mean there is no parallelism???
HW Schemes: Instruction Parallelism Can we get CPI closer to 1?
Scoreboard: a bookkeeping technique
Scoreboard Implications
Four Stages of Scoreboard Control
Slide 40
Three Parts of the Scoreboard
Scoreboard Example
Detailed Scoreboard Pipeline Control
Scoreboard Example: Cycles 1 through 7, 8a (First half of clock cycle), 8b (Second half of clock cycle), 9 through 22, 61, 62
Review: Scoreboard Example: Cycle 62
CDC 6600 Scoreboard
Summary

JDK.F98 Slide 1
Lecture 4: Introduction to Advanced Pipelining
Prof. John Kubiatowicz
Computer Science 252
Fall 1998

JDK.F98 Slide 2
Review: Control Flow and Exceptions
• RISC vs CISC was about virtualizing the CPU interface, not simple vs complex instructions
• Control flow is the biggest problem for computer architects. This is getting worse:
  – Modern computer languages such as C++ and Java use many smaller procedure calls (method invocations)
  – Networked devices need to respond quickly to many external events.
• Talked about CRISP method of merging multiple instructions together in on-chip cache
  – This was actually a limited form of recompilation for on-chip VLIW. We will see this in greater detail later
• Interrupts vs Polling: two sides of a coin
  – Interrupts ensure predictable handling of devices (can be guaranteed to happen by OS)
  – Polling has lower overhead if device events are frequent
  – Interrupts have lower overhead if device events are infrequent

JDK.F98 Slide 3
Exception/Interrupt classifications
• Exceptions: relevant to the current process
  – Faults, arithmetic traps, and synchronous traps
  – Invoke software on behalf of the currently executing process
• Interrupts: caused by asynchronous, outside events
  – I/O devices requiring service (disk, network)
  – Clock interrupts (real-time scheduling)
• Machine Checks: caused by serious hardware failure
  – Not always restartable
  – Indicate that bad things have happened.
    » Non-recoverable ECC error
    » Machine room fire
    » Power outage

JDK.F98 Slide 4
A related classification: Synchronous vs. Asynchronous
• Synchronous: means related to the instruction stream, i.e. during the execution of an instruction
  – Must stop an instruction that is currently executing
  – Page fault on load or store instruction
  – Arithmetic exception
  – Software Trap Instructions
• Asynchronous: means unrelated to the instruction stream, i.e. caused by an outside event.
  – Does not have to disrupt instructions that are already executing
  – Interrupts are asynchronous
  – Machine checks are asynchronous
• SemiSynchronous (or high-availability interrupts):
  – Caused by external event but may have to disrupt current instructions in order to guarantee service

JDK.F98 Slide 5
Recap: Device Interrupt (Say, arrival of network message)

Interrupted instruction stream:
    add  r1,r2,r3
    subi r4,r1,#4
    slli r4,r4,#2
    Hiccup(!)
    lw   r2,0(r4)
    lw   r3,4(r4)
    add  r2,r2,r3
    sw   8(r4),r2

Interrupt handler:
    Raise priority
    Reenable All Ints
    Save registers
    lw   r1,20(r0)
    lw   r2,0(r1)
    addi r3,r0,#5
    sw   0(r1),r3
    Restore registers
    Clear current Int
    Disable All Ints
    Restore priority
    RTE

Diagram annotations:
    Network Interrupt: PC saved, Disable All Ints, Supervisor Mode
    Return: Restore PC, User Mode
    Could be interrupted by disk
Note that priority must be raised to avoid recursive interrupts!

JDK.F98 Slide 6
Interrupt controller hardware and mask levels
• Interrupt disable mask may be a multi-bit word accessed through some special memory address
• Operating system constructs a hierarchy of masks that reflects some form of interrupt priority.
• For instance:

    Priority  Example
    0         Software interrupts
    2         Network interrupts
    4         Sound card
    5         Disk interrupt
    6         Real-time clock

  – This reflects an order of urgency among interrupts
  – For instance, this ordering says that disk events can interrupt the interrupt handlers for network interrupts.

JDK.F98 Slide 7
What about interrupt overhead? SPARC (and RISC I) had register windows
• On interrupt or procedure call, simply switch to a different set of registers
• Really saves on interrupt overhead
  – Interrupts can happen at any point in the execution, so compiler cannot help with knowledge of live registers.
  – Conservative handlers must save all registers
  – Short handlers might be able to save only a few, but this analysis is complicated
• Not as big a deal with procedure calls
  – Original statement by Patterson was that Berkeley didn't have a compiler team, so they used a hardware solution
  – Good compilers can allocate registers across procedure boundaries
  – Good compilers know what registers are live at any one time

JDK.F98 Slide 8
Supervisor State
• Typically, processors have some amount of state that user programs are not allowed to touch.
  – Page mapping hardware/TLB
    » TLB prevents one user from
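
The mask hierarchy described on the interrupt-controller slide can be sketched in a few lines of Python. This is only an illustrative model, not how any particular controller works: the priority numbers are taken from the slide's example table, while the bit encoding (bit p set means priority p is enabled) and the names `PRIORITY`, `mask_for`, and `can_interrupt` are assumptions made for this sketch.

```python
# Priority levels copied from the slide's example table.
PRIORITY = {
    "software": 0,
    "network": 2,
    "sound_card": 4,
    "disk": 5,
    "real_time_clock": 6,
}


def mask_for(level: int) -> int:
    """Return an enable mask for a handler running at `level`.

    Assumed encoding (not from the slides): bit p is 1 when interrupts
    of priority p are enabled. Only strictly higher priorities stay
    enabled, so a handler cannot be re-entered by its own interrupt.
    """
    mask = 0
    for p in PRIORITY.values():
        if p > level:
            mask |= 1 << p
    return mask


def can_interrupt(incoming: str, running: str) -> bool:
    """True if `incoming` may preempt the handler for `running`."""
    enabled = mask_for(PRIORITY[running])
    return bool(enabled & (1 << PRIORITY[incoming]))


if __name__ == "__main__":
    # Disk outranks network, so it may preempt the network handler.
    print(can_interrupt("disk", "network"))
    # A second network interrupt is masked while the first is handled.
    print(can_interrupt("network", "network"))
```

Under this model, raising priority on handler entry amounts to loading `mask_for(level)` before re-enabling interrupts, which matches the slide's remarks that disk events can interrupt the network handler and that priority must be raised to avoid recursive interrupts.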

