Berkeley COMPSCI 152 - CS 152 Final Project

CS 152 Final Project
Professor Kubiatowicz
Superscalar, Branch Prediction

The Moron

John Gibson (cs152-jgibson)
John Truong (cs152-jstruong)
Albert Wang (cs152-albrtaco)
Timothy Wong (cs152-timwong)

CS 152 – Section 101

Contents: Final Project, Schematics, Verilog Modules

Final Project

I. Abstract

The goal of this project is to construct a working superscalar processor with branch prediction. The memory module from Lab 6 needed to be reworked so that it functioned properly. In this phase we decided to emphasize robustness and functionality (i.e., a working processor) rather than speed, so the memory was scaled back to a direct-mapped, write-through cache. Although the superscalar architecture itself was straightforward, the primary complication lay in increasing the number of ports in the register file and cache to support two pipelines. We introduced "striping" in the cache to handle this situation with relatively few stalls.

II. Division of Labor

Datapath Enhancement – This part involved updating the datapath to include two pipelines, adding additional forwarding logic, and updating the memory/register modules to support dual input/output where necessary.
Initial revision: Albert, John G.

Cache, Striping, Dual Issue – This part involved writing a direct-mapped, write-through cache with bursting, striping instructions within the cache, and adapting the cache to read two instructions at once.
Initial revision: John G., Tim
Testing: John G.
Branch Predictor – This part involved writing the branch predictor.
Initial revision: John T.
Testing: John T., Tim

Distributor – This part consisted of a VHDL component that distributes the instructions between the two pipelines based on dependencies and other constraints.
Initial revision: Tim
Testing: Albert

Forwarding/Hazards – This part involved updating the forwarding and hazard units to support the two pipelines.
Initial revision: Tim
Testing: Tim, Albert

Integration – Integration primarily involved updating the top-level modules to support the new modules introduced by superscalar execution.
Integration: Everybody

Overall Testing – Testing was done on each element as we implemented it, followed by thorough testing of the datapath after integration of each component, as well as verification that the design ran correctly on the board.
Testing: Everybody

III. Detailed Strategy

Sections:
0: Superscalar
1: Stall Arbiter
2: Dual Issue
3: Memory Subsystem
4: Instruction Distribution
5: Forwarding
6: Hazards
7: Branch Prediction

Section 0: Superscalar
superscalar.sch

Because our 5-stage pipelined processor was already working reasonably well, extending it to a superscalar architecture was relatively straightforward. The two pipelines are referred to as the EVEN and ODD pipelines. Alternatively, the control signals distinguish between the two pipelines as Pipeline 1 (EVEN) and Pipeline 2 (ODD). (This is slightly confusing; however, we were able to keep the names straight among ourselves, and decided that going back to change all the names would be tedious and could introduce annoying bugs if we were not careful.)

Each pipeline maintains its own copies of the instructions, PCs, and control signals it processes. The goal of this is to isolate each pipeline as much as possible in order to simplify debugging and minimize complexity.

With this project, we had the opportunity to apply many of the lessons we learned from Lab 6's non-functioning cache.
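The distributor's dependency check itself is not shown in this preview. As a rough illustration only, a dual-issue distributor of the kind described above typically refuses to pair the second instruction with the first when it has a read-after-write dependence on it. A minimal Python sketch under that assumption (the tuple encoding and the helper name are hypothetical, not from the report; the real component is written in VHDL):

```python
# Hypothetical sketch of a dual-issue distributor's dependency check.
# An "instruction" here is just (dest_reg, src_regs); the real design
# operates on MIPS machine words inside a VHDL component.

def can_dual_issue(first, second):
    """Return True if both instructions may issue in the same cycle.

    Only a read-after-write (RAW) dependence is modeled: the second
    instruction must not read a register the first one writes. The
    real distributor also enforces other constraints, such as which
    pipeline may handle branch and memory instructions.
    """
    first_dest, _ = first
    _, second_srcs = second
    if first_dest is not None and first_dest in second_srcs:
        return False  # RAW hazard: issue only the first, hold the second
    return True

# add r3, r1, r2  followed by  add r4, r3, r5  -> RAW dependence on r3
print(can_dual_issue((3, (1, 2)), (4, (3, 5))))  # False
# independent instructions may issue together
print(can_dual_issue((3, (1, 2)), (6, (4, 5))))  # True
```

When the check fails, only the first instruction issues and the second waits a cycle, which matches the "bubble" behavior described later for dependent instructions in the two decode stages.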
Most notably, we kept the "Keep It Simple, Stupid" motto in mind throughout the design process. Because we wanted to reduce the complexity of our design, we decided to limit the functionality of the pipelines. For instance, all branch and jump instructions must be processed in the EVEN pipeline, whereas all memory instructions must be processed in the ODD pipeline. We also kept an invariant that the earlier instruction must always be in the EVEN pipeline. The rationale behind this decision was that we wanted to keep the pipelines "synched" so that the forwarding, hazard, and prediction mechanisms would be easier to design and test. Although this invariant inevitably increases our CPI, our goal was to have a working processor first and then add additional "features." Keeping this in mind, we tried to design our processor so that it would be easy to integrate optimizations later.

Restricting the pipelines reduced the number of corner cases we had to worry about. The "branch pipeline" was intentionally set as the EVEN pipeline (the earlier one) so that branch delay slots could be handled more cleanly. Since branch and jump instructions are always sent to the EVEN pipeline with their delay slots in the ODD pipeline, our distributor does not have to keep state to remember that a delay-slot instruction has to be fetched.

Restricting the pipelines also reduced the complexity of forwarding between the pipelines, because data never needs to be forwarded to the memory stage of the EVEN pipeline, nor to the decode stage of the ODD pipeline.

Section 1: Stall Arbiter
stallarbiter.v

When multiple components request a stall or a bubble, the stall arbiter decides which stall takes precedence. Until the final project, stalls had been handled in an ad hoc manner with several simple logic gates and latched signals.
While it was easy to use the ad hoc system for Lab 5 (only the hazard unit could stall, so no arbitration was necessary), we began to see stalling issues in Lab 6 when we created two additional components (the data and instruction caches) that needed to stall the processor. With a few more gates, however, we were able to retain our old stalling system. Unfortunately, this system became inadequate during the development of Lab 7, when we created three new stalling signals that needed to be handled. The first is bubble, which is asserted when the instruction in the decode stage of the ODD pipeline is dependent upon the instruction in the decode stage of the EVEN pipeline. The second and third signals are jumpflush and branchflush, which are asserted when a jump is detected or a bad guess is made by the branch predictor. This proved to be far too many signals to handle with simple logic gates, so a new module was created to give precedence among the various signals:

1. Data Cache Stall – Freezes the entire pipeline.
2. Hazard Stall – Freezes the fetch and decode stages, inserts bubbles into the execute stage.
3. Instruction
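The preview cuts off mid-list, so the full precedence order is not visible. Purely as an illustration, an arbiter of this shape can be modeled in a few lines of Python: scan the requests in priority order and grant the first one asserted. The signal names follow the report, but the ordering and actions beyond the first two entries are assumptions, and the real module is written in Verilog (stallarbiter.v).

```python
# Illustrative model of a priority stall arbiter. Only the first two
# priority entries and their actions come from the report; the rest of
# the ordering and the action strings are assumptions for illustration.

PRIORITY = ["dcache_stall", "hazard_stall", "icache_stall",
            "bubble", "jumpflush", "branchflush"]

ACTIONS = {
    "dcache_stall": "freeze entire pipeline",
    "hazard_stall": "freeze fetch+decode, bubble execute",
    "icache_stall": "freeze entire pipeline",          # assumption
    "bubble": "hold ODD decode, bubble ODD execute",   # assumption
    "jumpflush": "flush instructions after the jump",  # assumption
    "branchflush": "flush the mispredicted path",      # assumption
}

def arbitrate(requests):
    """Grant the highest-priority asserted request, if any."""
    for signal in PRIORITY:
        if requests.get(signal):
            return signal, ACTIONS[signal]
    return None, "no stall"

# A hazard stall outranks a branch flush in this model:
print(arbitrate({"hazard_stall": True, "branchflush": True}))
```

The point of centralizing this in one module, as the report describes, is that each requester simply asserts its signal and the arbiter alone decides what the pipeline actually does that cycle.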

