MIT 6 893 - DECOUPLED PROGRAM CONTROL



The CAE Architecture: Decoupled Program Control for Complexity-Effective Performance

Ronny Krashinsky and Mike Sung
6.893 Project Report (checkpoint 1)
MIT Laboratory for Computer Science, Cambridge, MA 02139

Abstract

Superscalar architectures will not scale into the next era of computer architecture. Their design is based on structures with a high degree of connectivity that will not be available in future chips in which a clock cycle covers a tiny fraction of the area. Their performance is based on reckless speculation that will not be tolerated in future complexity-effective designs. The processor of the future is composed of many decoupled elements working independently but in collaboration. As went the supercomputers, so will the superprocessors.

A promising next-generation architecture has been demonstrated in decoupled access/execute machines. These processors have split apart the memory access and execution portions of a program, and thus have immediately exposed a large amount of ILP. By allowing these streams to slip relative to each other, these machines enjoy the benefits of out-of-order execution and memory latency hiding with very little overhead. Additionally, the queues which connect these decoupled elements together provide the benefits of register renaming without the complexity required in superscalar architectures.

This work presents decoupled control flow, the next step which will enable processors of the future to reach new levels of performance. In a decoupled control/access/execute (CAE) machine, a control processor runs ahead and feeds directives to the memory access processor and the main execution processor; the directives are in the form of commands to execute basic blocks. The execution engine is then responsible for processing streams of valid instructions and data values, obtained without the overhead of speculation. This is a fundamental departure from the model in which an execution engine must actively fetch instructions and data values, or speculate to hide latency. As a result, new levels of performance are obtainable.

1 Introduction
2 CAE Architecture (TRS)
2.1 Queue Communication
2.2 Control Processor
2.3 Access Processor
2.4 Execute Processor
2.4.1 Caches/Queues for Streaming Instructions
2.4.2 Fast Streaming Engines
3 CAE Programming
4 CAE Performance
4.1 Livermore Loops
4.2 Streaming Media
5 CAE Analysis
5.1 Complexity
5.2 Comparison to Superscalar
5.3 Comparison to DSPs
6 CAE Extendibility
6.1 Tiled CAE processors
7 Conclusion
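
The control/access/execute split described in the abstract can be illustrated with a toy queue model. The following is a minimal sketch, not the design from the report: the names `MEMORY`, `BLOCKS`, `control_processor`, `access_processor`, `execute_processor`, and `run` are all hypothetical, and the three processors run one after another here, which corresponds to the extreme case where the control and access streams have slipped completely ahead of execution.

```python
from collections import deque

# Hypothetical data memory and basic blocks (illustrative only, not from the report).
MEMORY = {0: 3, 1: 4, 2: 10, 3: 20}

BLOCKS = {
    "body": {"loads": [0, 1], "op": lambda a, b: a * b},
    "tail": {"loads": [2, 3], "op": lambda a, b: a + b},
}


def control_processor(trip_count):
    """Resolve control flow up front and emit one directive per basic block."""
    directives = deque()
    for _ in range(trip_count):
        directives.append("body")
    directives.append("tail")
    return directives


def access_processor(directives, load_queue):
    """Perform each block's loads and push the values into the load queue."""
    for name in directives:
        for addr in BLOCKS[name]["loads"]:
            load_queue.append(MEMORY[addr])


def execute_processor(directives, load_queue):
    """Consume directives and queued operands in order; no fetching, no speculation."""
    results = []
    for name in directives:
        block = BLOCKS[name]
        operands = [load_queue.popleft() for _ in block["loads"]]
        results.append(block["op"](*operands))
    return results


def run(trip_count):
    directives = control_processor(trip_count)
    load_queue = deque()
    # Both consumers receive the same directive stream from the control processor.
    access_processor(list(directives), load_queue)
    return execute_processor(list(directives), load_queue)


print(run(2))  # [12, 12, 30]
```

Because the execute side only pops operands that the access side has already queued, it never issues a memory request itself. In this simplified model the load queue also plays the role the abstract attributes to register renaming: each loaded value gets its own queue slot instead of a named register.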

