CSUN COMP 546 - Superscalar Processors

This preview shows page 1-2-3-4 out of 12 pages.


Superscalar Processors

Superscalar Processor
- Multiple independent instruction pipelines, each with multiple stages
- Instruction-Level Parallelism
- determine dependencies between nearby instructions
    o input of one instruction depends upon the output of a preceding instruction
- locate nearby independent instructions
    o issue & complete instructions in an order different than specified in the code stream
- uses branch prediction methods rather than delayed branches
- RISC or CISC

Super Pipelined Processor
- Pipeline stages can be segmented into n distinct non-overlapping parts, each of which can execute in 1/n of a clock cycle

Limitations of Instruction-Level Parallelism
- True Data Dependency (Flow Dependency, Read-After-Write [RAW] Dependency)
    o Second instruction requires data produced by first instruction
- Procedural Dependency
    o Branch Instruction – instructions following a branch, whether taken or not taken, have a procedural dependency on the branch
    o Variable-Length Instructions
        ▪ Partial decoding is required prior to the fetching of the subsequent instruction, i.e., computing PC value
- Resource Conflicts
    o Memory, cache, buses, register-file ports, functional units access
- Output Dependency
    o See “In-Order Issue with Out-of-Order Completion” below
- Anti-dependency
    o See “Out-of-Order Issue with Out-of-Order Completion” below

Instruction-Level Parallelism
- Instructions in sequence are independent ⇒ instructions can be executed in parallel by overlapping
- Degree of Instruction-Level Parallelism depends upon
    o Frequencies of True Data Dependencies & Procedural Dependencies in the code, which depend upon the Instruction Set Architecture & the application
    o Operational Latency – the time until the result of an instruction is available for use in a subsequent instruction, i.e., execution completion time

Machine Parallelism
- Number of instructions that can be fetched and executed simultaneously, i.e., number of parallel pipelines
- Sophistication, i.e., speed, of the mechanisms that the processor uses to locate independent instructions

Instruction Issue Policy
- Instruction Issue
    o process of initiating instruction execution in the processor’s functional units
    o occurs when an instruction moves from the decode stage to the first execute stage of the pipeline
- Instruction Issue Policy
    o protocol used to issue instructions
    o looking ahead of the current point of execution to locate instructions that can be placed into the pipeline & executed
- Types of Orderings
    o Order in which instructions are fetched
    o Order in which instructions are executed
    o Order in which instructions update register contents & memory locations
- Instruction Issue Policy Categories
    o In-Order Issue with In-Order Completion
        ▪ superscalar pipeline specifications (example)
            ▬ fetch/decode two instructions per time period
            ▬ integer arithmetic functional units (2)
            ▬ floating-point arithmetic functional unit (1)
            ▬ write-back pipeline stages (2)
        ▪ code fragment (six instructions) constraints
            ▬ I1 requires two cycles to complete
            ▬ I3 & I4 conflict for the same functional unit
            ▬ I5 depends upon a value produced by I4
            ▬ I5 & I6 conflict for a functional unit
        ▪ Procedure
            ▬ fetch next two instructions into the decode stage
            ▬ issuing of instructions must wait until current instructions have passed the decode pipeline stages
            ▬ conflict for a functional unit ⇒ issuing of instructions must temporarily halt
            ▬ functional unit requires more than one cycle to generate a result ⇒ issuing of instructions must temporarily halt
          (figure 14.4 page 531)
    o In-Order Issue with Out-of-Order Completion
        ▪ Procedure
            ▬ fetch next two instructions into the decode stage
            ▬ number of instructions in execute stages ≤ maximum degree of machine parallelism across all functional units
            ▬ instruction issuing stalled by
                ▪ resource conflict
                ▪ data dependency
                ▪ procedural dependency
            ▬ output dependency, i.e., write-after-write (WAW) dependency
            ▬ code fragment (four instructions)
                I1: R3 ← R3 op R5
                I2: R4 ← R3 + 1
                I3: R3 ← R5 + 1
                I4: R7 ← R3 op R4
            ▬ Constraints
                ▪ I1 must execute before I2: I1 produces R3 contents required by I2 (RAW)
                ▪ I3 must execute before I4: I3 produces R3 contents required by I4 (RAW)
                ▪ I3 must complete after I1: I3 must write R3 after I1 writes R3 (WAW)
            ▬ conflict for a functional unit ⇒ issuing of instructions must temporarily halt
            ▬ functional unit requires more than one cycle to generate a result ⇒ issuing of instructions must temporarily halt
            ▬ issuing an instruction must stall if its result might later be overwritten by an instruction issued earlier which takes longer to complete (WAW)
        ▪ Instruction Issue Logic
            - more complex
            - interrupts & exceptions handling
                o resumption – some instructions which logically follow the interrupted instruction may have already completed and perhaps must not be executed again
    o Out-of-Order Issue with Out-of-Order Completion
        ▪ In-Order Issue – decode instructions up to the point of dependency; cannot look ahead of the dependency to subsequent instructions, independent of those in the pipeline, that could be usefully introduced into the pipeline
        ▪ Out-of-Order Issue
            ▬ decouple the decode stages from the execute stages of the pipeline
            ▬ Instruction Window Buffer
                ▪ decoded instructions are placed in the Instruction Window Buffer
                ▪ Instruction Window Buffer full ⇒ instruction fetching & decoding must temporarily halt
                ▪ functional unit becomes available ⇒ instruction from the Instruction Window Buffer may be issued provided
                    - it needs the particular functional unit available
                    - no conflicts or dependencies block the instruction
            ▬ processor has lookahead capability allowing it to identify independent instructions that can be brought into the execute stage
        ▪ Procedure
            ▬ fetch next two instructions into the decode stage
            ▬ during each cycle, subject to the buffer size, two instructions move from the decode stage to the Instruction Window Buffer
            ▬ (∀ instruction) ∈ Instruction Window Buffer ⇒ processor has sufficient information to decide when it can be issued
            ▬ more instructions are available for issuing ⇒ reducing probability of pipeline stage stall
            ▬ number of instructions in execute stages ≤ maximum degree of machine parallelism across all functional units
            ▬ instruction issuing stalled by
                ▪ resource conflict
                ▪ data dependency
                ▪ procedural dependency
                ▪ antidependency, write-after-read dependency (WAR), i.e., second
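
The dependency types named in these notes can be checked mechanically. The following is a minimal Python sketch, not part of the original notes, that classifies the register dependencies in the four-instruction fragment (I1 to I4) above. It assumes each instruction writes one destination register and reads a set of source registers; the tuple layout and the name classify_dependencies are illustrative choices. RAW is the true data dependency, WAW the output dependency, and WAR (write-after-read) the antidependency.

```python
# Each instruction: (name, destination register, set of source registers).
# Mirrors the four-instruction fragment in the notes:
#   I1: R3 <- R3 op R5    I2: R4 <- R3 + 1
#   I3: R3 <- R5 + 1      I4: R7 <- R3 op R4
FRAGMENT = [
    ("I1", "R3", {"R3", "R5"}),
    ("I2", "R4", {"R3"}),
    ("I3", "R3", {"R5"}),
    ("I4", "R7", {"R3", "R4"}),
]


def classify_dependencies(program):
    """Return (earlier, later, kind, register) tuples for a straight-line program.

    A pair is reported only when no instruction between the two writes the
    register in question, so each dependency links nearest neighbours.
    """
    deps = []
    for i, (name_i, dest_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            name_j, dest_j, srcs_j = program[j]
            between = program[i + 1:j]
            # RAW (true data dependency): j reads the value that i wrote.
            if dest_i in srcs_j and all(d != dest_i for _, d, _ in between):
                deps.append((name_i, name_j, "RAW", dest_i))
            # WAW (output dependency): i and j write the same register.
            if dest_i == dest_j and all(d != dest_i for _, d, _ in between):
                deps.append((name_i, name_j, "WAW", dest_i))
            # WAR (antidependency): j overwrites a register that i reads.
            if dest_j in srcs_i and all(d != dest_j for _, d, _ in between):
                deps.append((name_i, name_j, "WAR", dest_j))
    return deps


if __name__ == "__main__":
    for earlier, later, kind, reg in classify_dependencies(FRAGMENT):
        print(f"{earlier} -> {later}: {kind} on {reg}")
```

Run on the fragment, the sketch reports the three constraints listed in the notes (I1 to I2 RAW, I3 to I4 RAW, I1 to I3 WAW, all on R3). It also reports a RAW from I2 to I4 on R4 and antidependencies from I1 and I2 to I3 on R3, which follow from the same register usage. I1 to I4 is not reported as RAW, because I3 overwrites R3 in between.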

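The difference between in-order issue and out-of-order issue from an Instruction Window Buffer can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not the notes' own algorithm: operands are read at issue time, there is no register renaming (so WAW and WAR block issue conservatively), and the names Instr, select_issuable, the "int" unit label, and the extra instruction J1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    unit: str        # kind of functional unit required (label is hypothetical)
    dest: str        # destination register
    srcs: frozenset  # source registers


def select_issuable(window, free_units, pending_dests=(), lookahead=True):
    """Choose the instructions in the window that may issue this cycle.

    window        -- decoded, unissued instructions in program order
    free_units    -- dict: unit kind -> number of free units
    pending_dests -- registers whose results are still being computed
    lookahead     -- True: out-of-order issue; False: stop at the first
                     blocked instruction (in-order issue)
    """
    free = dict(free_units)         # copy so the caller's dict is untouched
    unwritten = set(pending_dests)  # results not yet available
    unread = set()                  # registers that older, unissued instructions still read
    issued = []
    for ins in window:
        raw = any(s in unwritten for s in ins.srcs)  # needs a pending result
        waw = ins.dest in unwritten                  # would reorder two writes
        war = ins.dest in unread                     # would clobber an unread operand
        if not (raw or waw or war) and free.get(ins.unit, 0) > 0:
            free[ins.unit] -= 1
            issued.append(ins.name)
        else:
            if not lookahead:
                break               # in-order issue cannot skip past a blocked instruction
            unread.update(ins.srcs)
        unwritten.add(ins.dest)     # younger instructions must wait for this result
    return issued


# The four-instruction fragment from the notes, plus one hypothetical
# independent instruction J1: R8 <- R5 + 1.
WINDOW = [
    Instr("I1", "int", "R3", frozenset({"R3", "R5"})),
    Instr("I2", "int", "R4", frozenset({"R3"})),
    Instr("I3", "int", "R3", frozenset({"R5"})),
    Instr("I4", "int", "R7", frozenset({"R3", "R4"})),
    Instr("J1", "int", "R8", frozenset({"R5"})),
]

if __name__ == "__main__":
    print(select_issuable(WINDOW, {"int": 2}, lookahead=True))
    print(select_issuable(WINDOW, {"int": 2}, lookahead=False))
```

With two free integer units, the lookahead case issues I1 and J1 in the same cycle, while the in-order case issues only I1 because I2 is blocked by its RAW dependency on R3. This is the sense in which a larger window reduces the probability of a pipeline stall.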
