MIT 6 375 - Runahead Processor


Group 1: 6.375 Final Project
Runahead Processor
Finale Doshi and Ravi Palakodety
May 16, 2006

1 Introduction

Data cache misses can severely affect processor throughput if the processor stalls until valid data becomes available. A runahead processor attempts to minimize the effects of data cache misses by prefetching data needed by the instructions that follow a cache miss. The processor continues to execute instructions after the cache miss, using invalid data, until the requested data becomes available [1][2]. These prefetches are likely to be accurate, as the instructions would have been executed anyway (assuming no branches). The longer the processor is in runahead mode, the more runahead instructions execute and the more prefetches are issued, reducing future cache misses.

While entering runahead mode is fairly straightforward (the processor continues to execute instructions normally), we must take care in restoring the processor’s state once the original miss returns. The following steps lead us through entering and exiting runahead, pointing out the key factors that ensure that we exit runahead correctly:

1. A load or store instruction causes a data cache miss. Runahead execution begins.
2. The processor checkpoints the current state by making a backup copy of the register file and program counter.
3. The processor continues to execute instructions using an invalid value for the pending data cache request.
4. Future loads and stores may cause data cache misses; the cache also prefetches these misses.
5. Writes to the data cache do not occur during runahead execution. Writes to the register file that depend on an invalid value are marked invalid in the register file. Computations that depend on invalid register entries are also marked invalid. Loads and stores that depend on invalid addresses are not prefetched.
6. Runahead execution proceeds until the data for the original data cache miss is fetched from memory.
7. We copy the backup register file into the real register file, and proceed from the checkpointed instruction.

In this project, we implemented a runahead processor and analyzed the effects of memory latencies and inter-module fifo lengths on its performance. We also explored variants on leaving runahead and caching stores.

2 High Level Design

Figure 1 shows a high-level cloud-diagram of the runahead processor. The three main processor rules (pc-gen, exec, and writeback) perform essentially the same function as our familiar three-stage processor: pc-gen updates the program counter and requests the next instruction; exec decodes the instruction, performs ALU operations, and sends requests to the data cache; and writeback writes ALU ops and data memory responses into the register file.

Some of exec’s and writeback’s operations are tailored to the runahead processor. For example, exec will not send load requests with invalid addresses to the data cache. The writeback rule is responsible for notifying the processor of when to enter runahead, and the check-response-q rule notifies the processor when to exit runahead. To reduce clutter, we do not show the stall and discard rules for handling read-after-write hazards and clearing the pcQ after taken branches.

The stop-, start-, and stall-runahead rules are mutually exclusive with the three normal ‘processing’ rules (and indeed with each other). Whenever the mode changes, these rules ensure that the processor’s state is correctly backed up and restored. The stall-runahead rule is responsible for stalling the processor if a runahead branch depends on an invalid predicate (and, as we do design exploration, for any other situation where the processor must stall due to invalid data).

Within the cache, all rules are mutually exclusive. The main rule sets the mode for each rule to fire. Responses from the main memory are treated as most urgent; the refill-resp rule takes data from main memory and updates the cache. The access rule responds to requests from the processor and sends requests to the main memory. The refill-req rule fires only when we have a collision with a dirty cache-line, and the original data needs to be stored back to main memory before the current request is made.

[Figure 1: Cloud Diagram of Runahead Processor. Labeled blocks include the nonblocking data cache, the blocking instruction cache, the main memory arbiter and main memory (standard memory interface), the register file and its backup, and the branch predictor table.]

3 Testing Strategy

Our first goal was to ensure correctness of both the processor and the cache. In addition to asm-tests and the benchmarks, we wrote a test of load-store scenarios on valid addresses (we do not perform loads and stores on invalid addresses). Listed below are the tests and their expected (and observed) outcomes.

• Load-a : hit. Do not enter runahead; requests are handled normally.
• Load-a : miss. If the requested cache-line is valid and dirty, then the value currently stored in that cache-line is written to memory. Enter runahead and return ‘req-missed.’
• Load-a, Load-a : both miss. Enter runahead after the first load but do not initiate the second load request. Both loads return to the processor (in runahead mode) as ‘req-missed.’
• Load-a, Load-b : both miss; b writes to same cache-line as a. Enter runahead after the first load. In the baseline, we only initiate one request per cache-line. This avoids the possibility of the first load being overwritten before the processor switches out of runahead mode, which would lead to an infinite loop. The second load is therefore not prefetched. (Optimizations may adjust the timing to allow the second load to be prefetched for future operations.)
• Store-a, Load-a : store misses. Enter runahead on the store and send a prefetch request for that address; load does not generate an additional prefetch. An optimization will store the value of the data into the store cache for future runahead loads.
• Load-a, Store-a, Load-a : first load misses. Enter runahead on the first load; no additional prefetches. If optimized, store the data in the store cache. The second load takes the data from the store cache.
• Store-a, Store-a, Load-a : first store misses. Enter runahead on the first store; no additional prefetches. If optimized, store the value of the first
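
The entry and exit steps listed in the introduction can be sketched as a small software model. This is an illustrative Python toy, not the authors' hardware design. The names ToyRunahead, INV, step, and exit_runahead are hypothetical. Each address stands for one cache-line, and every outstanding prefetch is assumed to have completed by the time runahead ends.

```python
INV = None  # stands in for the "invalid" mark on a register value


class ToyRunahead:
    """Toy model of runahead entry/exit; every name here is hypothetical."""

    def __init__(self, regs, memory, cached):
        self.regs = dict(regs)      # register file
        self.memory = dict(memory)  # backing store: address -> value
        self.cached = set(cached)   # addresses resident in the data cache
        self.pc = 0
        self.in_runahead = False
        self.backup = None          # checkpoint: (register file copy, pc)
        self.prefetches = []        # addresses requested from memory

    def _miss(self, addr):
        # Steps 1-2: the first miss checkpoints the state and starts runahead.
        if not self.in_runahead:
            self.backup = (dict(self.regs), self.pc)
            self.in_runahead = True
        # Step 4: later misses are prefetched, one request per address.
        if addr not in self.prefetches:
            self.prefetches.append(addr)

    def step(self, instr):
        op = instr[0]
        if op == "add":
            _, rd, ra, rb = instr
            a, b = self.regs[ra], self.regs[rb]
            # Step 5: a result that depends on an invalid value is invalid.
            self.regs[rd] = INV if a is INV or b is INV else a + b
        elif op == "load":
            _, rd, raddr = instr
            addr = self.regs[raddr]
            if addr is INV:
                self.regs[rd] = INV          # invalid address: no prefetch
            elif addr in self.cached:
                self.regs[rd] = self.memory[addr]
            else:
                self._miss(addr)
                self.regs[rd] = INV          # step 3: continue with invalid data
        elif op == "store":
            _, rsrc, raddr = instr
            addr = self.regs[raddr]
            if addr is not INV and addr not in self.cached:
                self._miss(addr)
            if not self.in_runahead:         # step 5: no writes during runahead
                self.memory[addr] = self.regs[rsrc]
        self.pc += 1

    def run(self, program):
        while self.pc < len(program):
            self.step(program[self.pc])

    def exit_runahead(self):
        # Steps 6-7: restore the checkpoint and resume at the missing instruction.
        regs, pc = self.backup
        self.regs, self.pc = dict(regs), pc
        self.cached.update(self.prefetches)  # assumption: all prefetches returned
        self.prefetches = []
        self.backup = None
        self.in_runahead = False


program = [
    ("load", "r4", "r1"),        # misses: enter runahead, r4 becomes invalid
    ("add", "r5", "r4", "r3"),   # depends on r4, so r5 becomes invalid
    ("load", "r6", "r2"),        # valid address: prefetched
    ("load", "r7", "r5"),        # invalid address: not prefetched
    ("store", "r3", "r2"),       # suppressed while in runahead
]
cpu = ToyRunahead(
    regs={"r1": 100, "r2": 200, "r3": 5, "r4": 0, "r5": 0, "r6": 0, "r7": 0},
    memory={100: 7, 200: 9, 12: 42},
    cached={12},
)
cpu.run(program)        # runahead pass: collects prefetches only
cpu.exit_runahead()     # original miss returns; restore the checkpoint
cpu.run(program)        # normal pass: the prefetched loads now hit
```

In this toy run, the first pass through the program changes no architectural state; its only lasting effect is the set of prefetched addresses. A real design would leave runahead when the original miss returns, after a memory latency that this sketch does not model.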

