MIT 6.893 - Decoupled Program Control for Energy-Efficient Performance

The CAE Architecture: Decoupled Program Control for Energy-Efficient Performance
Ronny Krashinsky and Michael Sung

- Change in project direction from original proposal
  - Initial idea of energy-efficient program control used a lot of existing ideas already
  - New idea of combining existing work in decoupled architectures to simplify program control and issue logic for high-performance microprocessors
- Basic idea is that program control is inherently separate from other types of instructions (access and execute). We propose to decouple the program control from the rest of the instruction stream(s) to uncover ILP

Motivation for Decoupled Architectures

- Increased performance by exploiting fine-grained parallelism between access and execute functions
  - Decoupling access from execution allows the access processor to run ahead or “slip” w.r.t. the execution processor: dynamic reordering
  - Memory latency toleration
  - Dynamic loop unrolling (exposing ILP between loop iterations, hiding functional unit latencies by overlapping executions of different iterations)
- Simplified issue/decode logic
  - Much simpler than complex superscalar architectures (IW, ROB, bypass)
- Scalability: direct consequence of simplified logic
  - For superscalar processors, need to increase IW, which does not scale (Palacharla/Agarwal papers)
  - For decoupled machines, simply lengthen queues to allow more “slip”

Prior Work

- Separation of access and execute functions
  - IBM 360/370, CDC 6600, CDC 7600, CRAY-1
- Explicit partition of access and computation functions
  - J. Smith, PIPE (compile time splitting)
  - G. Tyson, MISC (Multiple Inst. Stream Computer), descendant of PIPE
  - A. Pleszkun, SMA (Structured Memory Access) architecture
  - J. Smith, Astronautics Corporation’s ZS-1 (splits fixed-point/addressing from floating-point operations)
  - A. Wulf, WM architecture
  - IBM’s FOM (FORTRAN Oriented Machine)
- Shares similarities with trace processors
- Technically, superscalar machines have a slightly decoupled nature

Decoupled Program Control: CAE Architecture

- Hierarchy of decoupling: 3 levels of decoupling (Control, Access, and Execute). Control flow is most elastic, providing ample instructions for both access and execute pipelines

[Figure: block diagram with components labeled MEM, A$, C$, E$, AP, CP, EP]

Decoupled Program Control Benefits

- Program control flow can be easily determined without waiting
- Provides “out-of-order” execution without complexity
- Inherits the memory latency toleration of DAE architectures
- Simplified issue logic
  - Can be implemented with small structures/queues
- Allows non-speculative instruction prefetching
  - Because of prefetching, we can shrink data structures like caches, potentially reducing critical paths as well as reducing power
  - Still provides performance advantage for streaming applications, etc., where lack of locality mitigates performance advantages of caches

Issues with Decoupled Program Control

- Deadlocking with queues
  - Queue size determines how much slip can occur
  - Draining queues or explicit queue manipulation (push/pop) instructions
- Performance issues from feedback of values from execution/access pipelines that program control depends on
  - Dependencies limit how far the program pipeline can slip in front of the access and execute pipelines, etc.
  - Likewise, feedback dependencies from execute to access pipelines
- Complexities in queue interactions (correctness, verification, ease of programming)
  - Basically an issue of how to synchronize instruction streams correctly

Progress and Roadmap

- Completed
  - Literature search of decoupled architectures
  - Initial ISA exploration and microarchitectural development for proposed CAE architecture
- Roadmap
  - Classifying control flow and dependencies in applications
  - ISA development for each instruction stream (control, access, execute)
  - Complete architectural specification of CAE architecture (TRS)
  - Implementation of RTL-level simulator (SyCHOSys)
  - Simulation and performance analysis
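
The relationship between queue size and slip noted in the slides can be illustrated with a toy cycle-level model. This is only a sketch, not the CAE microarchitecture: the function `run_decoupled`, its parameters, the fixed memory latency, the one-load-per-cycle issue rate, and the single load queue between an access side and an execute side are all assumptions made for illustration.

```python
from collections import deque


def run_decoupled(num_loads, mem_latency, queue_depth):
    """Toy model of an access/execute pair joined by a bounded load queue.

    Assumptions (hypothetical, not from the slides): the access side may
    issue one load per cycle while a queue entry is free, every load
    returns after `mem_latency` cycles, and the execute side consumes at
    most one queue entry per cycle, in order.

    Returns (total_cycles, max_slip), where slip is the number of loads
    the access side has issued ahead of the execute side.
    """
    if queue_depth < 1:
        raise ValueError("queue_depth must be at least 1")

    queue = deque()  # cycle at which each outstanding load's data arrives
    issued = 0
    consumed = 0
    max_slip = 0
    cycle = 0

    while consumed < num_loads:
        # Execute side: consume the head entry once its data has arrived.
        if queue and queue[0] <= cycle:
            queue.popleft()
            consumed += 1

        # Access side: run ahead while the queue has a free entry.
        if issued < num_loads and len(queue) < queue_depth:
            queue.append(cycle + mem_latency)
            issued += 1

        max_slip = max(max_slip, issued - consumed)
        cycle += 1

    return cycle, max_slip


if __name__ == "__main__":
    # Sweep the queue depth for a fixed workload and memory latency.
    for depth in (1, 2, 4, 8, 16):
        cycles, slip = run_decoupled(num_loads=100, mem_latency=10,
                                     queue_depth=depth)
        print(f"depth={depth:2d} cycles={cycles:5d} max_slip={slip}")
```

In this model a depth-1 queue serializes every load behind the full memory latency, while a queue at least as deep as the latency lets the access side stay far enough ahead that the execute side never stalls after the first load returns. Lengthening the queue beyond that point gives no further benefit here, because the assumed issue rate, not the queue, becomes the limit.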

