DOC PREVIEW
Berkeley COMPSCI 152 - QUIZ 5 SOLUTIONS

This preview shows page 1-2 out of 5 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

QUIZ 5 SOLUTIONS CS152 – Spring 2008 Krste Asanovic May 6, 2008Problem Q5.1 How is vertical waste caused by long latency instructions reduced? How is horizontal waste caused by wide issue widths reduced? Where and by what is most parallelism extracted (e.g., programmer, compiler, hardware)? Name a limitation or disadvantage as compared to a simple in-order execution RISC machine Out-of-Order Execution VLIW Vector Multithreading Simultaneous MultithreadingProblem Q5.2: Predication and VLIW 15 Points Problem M5.2.A l.s f1, 0(r1) ; f1 = *r1 seq.s r5, f10, f1 ; r5 = (f10==f1) cmpnez p1, r5 ; p1 = (r5!=0) (p1) add.s f2, f1, f11 ; if (p1) f2 = f1+f11 (!p1) add.s f2, f1, f12 ; if(!p1) f2 = f1+f12 s.s f2, 0(r2) ; *r2 = f2Problem Q5.3: Multithreaded architectures Problem Q5.3.A 4, largest latency for any instruction is 4 Problem Q5.3.B 2/12 = 0.17 flops/cycle (two flops per loop, on average we complete a loop every 12 cycles) Problem Q5.3.C Yes, we can hide the latency of the floating point instructions by moving the add instructions in between floating point and store instructions – we’d only need 3 threads. Moving the third load up to follow the second load would further reduce thread requirement to only 2.Problem Q5.4: Vectorization 15 Points State whether each of the following loops could be successfully vectorized and explain your answer. In all cases, you should assume that arrays A, B, C do not overlap in memory. for (i=0; i<N; i++) B[i] = A[i] + C; Yes. C was supposed to be considered a scalar value. Scalars can be added to vectors by adding the value to each element of the vector. These additions are all independent. for (i=1; i<N; i++) B[i] = A[i] + B[i-1]; No. To vectorize this, we would have to create vectors out of arrays A and B, and then operate on all elements in parallel. However, the value assigned to each element of B is dependent on the value assigned to the previous elements, i.e. there is a dependency. Chaining does not solve this problem because it works only between consecutive vector instructions, not between the elements of a single vector instruction. for (i=0; i<N-1; i++) B[i] = A[i] + B[i+1]; Yes. Could be done by creating a vector from the elements of B from 1 to N, adding this to a vector created from the elements of A from 0 to N-1, and then writing back the result. In other words, reads and writes to B do not have to be interleaved so there are no dependencies limiting vectorization. for (i=0; i<N; i++) C[i] = A[B[i]]; Yes, by using a gather operations. See lecture slides. for (i=0; i<N; i++) if(C[i] != 0) B[i] = A[i] + D; Yes, by using flags/vector masks. See lecture


View Full Document

Berkeley COMPSCI 152 - QUIZ 5 SOLUTIONS

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Quiz

Quiz

12 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Download QUIZ 5 SOLUTIONS
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view QUIZ 5 SOLUTIONS and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view QUIZ 5 SOLUTIONS 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?