USC CSCI 570 - Homework1.S14(1) (1)

This preview shows page 1 out of 3 pages.



University of Southern California
Department of Electrical Engineering
EE557 Spring 2K14
Instructor: Michel Dubois and Murali Annavaram
Section: 30667, 30820D, 30716D, 30455D and 30823D

Homework #1. Due: Tuesday, February 11, 5pm
TOTAL SCORE: /150

Problem 1 (20pts)
Problem 1.1 in the book, with the following modifications:
The time taken by each floating-point instruction can be reduced by a factor of 15 with the new hardware. The loads and stores can be sped up by a factor of 3 over the base machine.
a. Unchanged
b. Unchanged
c. The speedup is 30% (or 1.3)
d. In the original workload, fractions Ffp and Fls are 20% and 40%, respectively.

Problem 2 (20pts)
Problem 1.3 in the book, but using the following tables.

Execution times of three programs
Machines           Program1   Program2   Program3
Base machine       17sec      10msec     5sec
Base + FP units    16sec      2msec      2sec
Base + cache       10sec      9msec      3sec

Normalized execution times of three programs
Machines           Program1   Program2   Program3
Base machine       1          1          1
Base + FP units    0.94       0.2        0.4
Base + cache       0.59       0.9        0.6

Problem 3 (20pts)
Problem 1.5 in the book, but using the following table.

Instructions          Frequency   Cycles
Arithmetic/logic      25%         1
Loads                 40%         2
Stores                5%          1
Branches (Untaken)    15%         2
Branches (Taken)      5%          4
Miscellaneous         10%         1

Problem 4 (30pts)
Problem 2.1 in the book, with the following modifications:
Consider the following designs for the problem.
A. 6-stage pipeline clocked at 2f
B. Single cycle CPU clocked at f
C. 5-way multiprocessor in which each processor is the single CPU clocked at 2f
D. 5-way multiprocessor in which each processor is a 6-stage pipeline clocked at 2f
Compare these four designs with the base machine, which is the 5-stage pipeline clocked at f.

Problem 5 (20pts)
Problem 3.1 in the book, considering all memory addresses are 16 bits.

Problem 6 (20pts)
Problem 3.3 in the book, but compile the following code (A[0] is stored at memory address 1000). The code computes the first 100 elements of the Fibonacci series.
    A[0] := 0
    A[1] := 1
    for( i := 2; i<100; i++ )
        A[i] := A[i-1] + A[i-2]

Problem 7 (20pts)
A combination of two enhancements is considered to boost the performance of a chip multiprocessor. The enhancements are: 1) adding more cores or 2) adding more shared level 2 cache. The base chip has 2 cores and 8 L2 cache banks. L2 cache can be added by adding cache banks, and each cache bank uses three times the area of a core. Here is what we also know from all kinds of sources:
1) 70% of the workload can be fully parallelized; the rest cannot.
2) The core stall time due to L2 misses accounts for 10% of each core's execution time in the base configuration.
3) It is suspected that the amount of shared L2 cache per core should remain constant in order to keep the same miss rate.
4) Simulations have also determined that the miss rate of L2 decreases as the square root of its size per core. A conjecture is that the stall time in each core will also decrease as the square root of L2 size per core.
The company that pays your paycheck has acquired a new technology to build large microchips, so that the next-generation chips will have four times the area of current chips to dedicate to cores and L2 caches. Given what you know, what kind of best "first cut" design would you propose? A design is characterized by (# of cores, # of L2 cache banks). These numbers can be any integer. The design should be contained in the new chip. Estimate the speedup of your best design that takes advantage of the new chip real
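
The area constraint stated in Problem 7 can be checked mechanically. What follows is only a minimal sketch, assuming that area is measured in units of one core (so one L2 bank costs 3 units) and that the base chip's area for cores and L2 is exactly 2 cores plus 8 banks. All function and constant names are hypothetical, and the sketch neither chooses a design nor estimates its speedup.

```python
# Area bookkeeping for Problem 7, under the assumptions stated above.
CORE_AREA = 1                 # assumed unit: the area of one core
BANK_AREA = 3 * CORE_AREA     # each L2 bank uses three times the area of a core
BASE_CORES, BASE_BANKS = 2, 8 # base chip configuration from the problem
AREA_SCALE = 4                # new chips have four times the area


def chip_area(cores: int, banks: int) -> int:
    """Area, in core units, used by a (cores, banks) design."""
    return cores * CORE_AREA + banks * BANK_AREA


BASE_AREA = chip_area(BASE_CORES, BASE_BANKS)  # 2 + 8 * 3 = 26 core units
NEW_AREA = AREA_SCALE * BASE_AREA              # 4 * 26 = 104 core units


def fits(cores: int, banks: int) -> bool:
    """True if the design is contained in the new chip's area budget."""
    return cores >= 1 and banks >= 0 and chip_area(cores, banks) <= NEW_AREA


def max_banks(cores: int) -> int:
    """Largest number of L2 banks that still fits next to `cores` cores."""
    return (NEW_AREA - cores * CORE_AREA) // BANK_AREA


def banks_per_core(cores: int, banks: int) -> float:
    """L2 banks per core, the quantity statements 3) and 4) refer to."""
    return banks / cores


if __name__ == "__main__":
    for cores in (2, 4, 8, 16, 32):
        banks = max_banks(cores)
        print(cores, banks, round(banks_per_core(cores, banks), 2))
```

Under these assumptions, a candidate such as (8, 32) uses the full budget and keeps the base ratio of 4 banks per core, while any design with more cores must give up L2 per core. Weighing that trade-off against statements 1) to 4) is the modeling step the problem asks for.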

