Berkeley COMPSCI C267 - Homework - D1690848

School: University of California, Berkeley

Course: COMPSCI C267 - Applications of Parallel Computers

Pages: 4

02/11/2009 CS267 Lecture 7

• Notes on Homework 1
• Summary of SSE intrinsics
• Example: multiplying 2x2 matrices
• Other Issues

Notes on Homework 1

• Must write SIMD code to get past 50% of peak!

Summary of SSE intrinsics

Vector data type:
• __m128d
Load and store operations:
• _mm_load_pd
• _mm_store_pd
• _mm_loadu_pd
• _mm_storeu_pd
Load and broadcast across vector:
• _mm_load1_pd
Arithmetic:
• _mm_add_pd
• _mm_mul_pd

Example: multiplying 2x2 matrices

    __m128d c1 = _mm_loadu_pd( C+0*lda );  //load unaligned block in C
    __m128d c2 = _mm_loadu_pd( C+1*lda );
    for( int i = 0; i < 2; i++ )
    {
        __m128d a  = _mm_load_pd( A+i*lda );      //load aligned i-th column of A
        __m128d b1 = _mm_load1_pd( B+i+0*lda );   //load i-th row of B
        __m128d b2 = _mm_load1_pd( B+i+1*lda );
        c1 = _mm_add_pd( c1, _mm_mul_pd( a, b1 ) );  //rank-1 update
        c2 = _mm_add_pd( c2, _mm_mul_pd( a, b2 ) );
    }
    _mm_storeu_pd( C+0*lda, c1 );  //store unaligned block in C
    _mm_storeu_pd( C+1*lda, c2 );

Other Issues

• Checking efficiency of the compiler helps
  • Use -S option to see the generated assembly code
  • Inner loop should consist mostly of ADDPD and MULPD ops
  • ADDSD and MULSD imply scalar computations
• Consider using another compiler
  • Options are PGI, PathScale and GNU
  • I found it easier to do with GNU compiler
• Look through Goto and van de Geijn’s
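
The 2x2 example from the slides can be wrapped into a function that compiles on its own. What follows is a minimal sketch, not the lecture's reference code. It assumes column-major storage with leading dimension lda, which is what the C+j*lda indexing in the slide suggests. The function name matmul2x2_sse is hypothetical. Unlike the slide, it loads A with _mm_loadu_pd so that it makes no assumption about 16-byte alignment.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128d and the _mm_*_pd intrinsics */

/* Hypothetical helper: C += A * B for 2x2 blocks.
 * Assumption: all three matrices are column-major with leading
 * dimension lda, so element (r, c) lives at index r + c*lda. */
static void matmul2x2_sse(int lda, const double *A, const double *B, double *C)
{
    assert(lda >= 2);  /* each column must hold at least two doubles */

    /* Each register holds one column (two doubles) of the C block. */
    __m128d c1 = _mm_loadu_pd(C + 0 * lda);
    __m128d c2 = _mm_loadu_pd(C + 1 * lda);

    for (int i = 0; i < 2; i++) {
        /* i-th column of A; unaligned load, so no alignment requirement. */
        __m128d a = _mm_loadu_pd(A + i * lda);

        /* Broadcast B(i,0) and B(i,1), the two entries of B's i-th row. */
        __m128d b1 = _mm_load1_pd(B + i + 0 * lda);
        __m128d b2 = _mm_load1_pd(B + i + 1 * lda);

        /* Rank-1 update: C(:,j) += A(:,i) * B(i,j) for j = 0, 1. */
        c1 = _mm_add_pd(c1, _mm_mul_pd(a, b1));
        c2 = _mm_add_pd(c2, _mm_mul_pd(a, b2));
    }

    _mm_storeu_pd(C + 0 * lda, c1);
    _mm_storeu_pd(C + 1 * lda, c2);
}
```

As a quick check, with lda = 2, A = {1, 2, 3, 4}, B = {5, 6, 7, 8} and C initially zero, the call matmul2x2_sse(2, A, B, C) should leave C = {23, 34, 31, 46} in column-major order. Compiling a file like this with the -S option mentioned in the slides and searching the output for addpd and mulpd is one way to confirm that the loop was vectorized rather than lowered to the scalar addsd and mulsd forms.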
