DOC PREVIEW
Harvey Mudd CS 105 - CS 105 Lab 5: Code Optimization

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 105Lab 5: Code OptimizationSee Calendar for Dates1 IntroductionThis assignment deals with optimizing memory intensive code. Image processing offers many examples offunctions that can benefit from optimization. In this lab, we will consider two image processing operations:rotate, which rotates an image counter-clockwise by 90◦, and smooth, which “smooths” or “blurs” animage.For this lab, we will consider an image to be represented as a two-dimensional matrix M, where Mi,jdenotes the value of (i, j)th pixel of M. Pixel values are triples of red, green, and blue (RGB) values. Wewill only consider square images. Let N denote the number of rows (or columns) of an image. Rows andcolumns are numbered, in C-style, from 0 to N − 1.Given this representation, the rotate operation can be implemented quite simply as the combination ofthe following two matrix operations:• Transpose: For each (i, j) pair, Mi,jand Mj,iare interchanged.• Exchange rows: Row i is exchanged with row N − 1 − i.This combination is illustrated in Figure 1.The smooth operation is implemented by replacing every pixel value with the average of all the pixelsaround it (in a maximum of 3 × 3 window centered at that pixel). Consider Figure 2. The values of pixelsM2[1][1] and M2[N-1][N-1] are given below:M2[1][1] =P2i=0P2j=0M1[i][j]9M2[N − 1][N − 1] =PN−1i=N−2PN−1j=N−2M1[i][j]41Rotate by 90 (counter−clockwise)TransposeExchangeRowsjiijij(0,0)(0,0)(0,0)Figure 1: Rotation of an image by 90◦counterclockwisesmoothM1[1][1]M1[N−1][N−1]M2[1][1]M2[N−1][N−1]Figure 2: Smoothing an image22 LogisticsYou are to work in a group of two people in solving the problems for this assignment. The only “hand-in”will be electronic. Any clarifications and revisions to the assignment will be posted on the lab Web page orcourse email.3 Hand Out InstructionsThe materials for this lab are on the course web page.Start by copying perflab-handout.tar to a protected directory in which you plan to do your work.Then give the command: tar xvf perflab-handout.tar. This will cause a number of files to beunpacked into the directory. The only file you will be modifying and handing in is kernels.c. Thedriver.c program is a driver program that allows you to evaluate the performance of your solutions. Usethe command make driver to generate the driver code and run it with the command ./driver.Looking at the file kernels.c you’ll notice a C structure, team, into which you should insert the re-quested identifying information about the two individuals comprising your programming team. Do thisright away so you don’t forget.4 Implementation OverviewData StructuresThe core data structure deals with image representation. A pixel is a struct as shown below:typedef struct {unsigned short red; /*R value*/unsigned short green; /*G value*/unsigned short blue; /*B value*/} pixel;As can be seen, RGB values have 16-bit representations (“16-bit color”). An image I is represented as a one-dimensional array of pixels, where the (i, j)th pixel is I[RIDX(i,j,n)]. Here n is the dimension of the imagematrix, and RIDX is a macro defined as follows:#define RIDX(i,j,n) ((i)*(n)+(j))See the file defs.h for this code.RotateThe following C function computes the result of rotating the source image src by 90◦and stores the result in desti-nation image dst. dim is the dimension of the image.3void naive_rotate(int dim, pixel*src, pixel*dst) {int i, j;for(i=0; i < dim; i++)for(j=0; j < dim; j++)dst[RIDX(dim-1-j,i,dim)] = src[RIDX(i,j,dim)];return;}The above code scans the rows of the source image matrix, copying to the columns of the destination image matrix.Your task is to rewrite this code to make it run as fast as possible using techniques like code motion, loop unrollingand blocking.See the file kernels.c for this code.SmoothThe smoothing function takes as input a source image src and returns the smoothed result in the destination imagedst. Here is part of an implementation:void naive_smooth(int dim, pixel*src, pixel*dst) {int i, j;for(i=0; i < dim; i++)for(j=0; j < dim; j++)dst[RIDX(i,j,dim)] = avg(dim, i, j, src); /*Smooth the (i,j)th pixel*/return;}The function avg returns the average of all the pixels around the (i,j)th pixel. Your task is to optimize smooth(and avg) to run as fast as possible. (Note: The function avg is a local function and you can get rid of it altogether toimplement smooth in some other way.)This code (and an implementation of avg) is in the file kernels.c.Performance measuresOur main performance measure is CPE or Cycles per Element. If a function takes C cycles to run for an image ofsize N × N, the CPE value is C/N2. Table 1 summarizes the performance of the naive implementations shownabove and compares it against an optimized implementation. Performance is shown for 5 different values of N . Allmeasurements were made on Wilkes, which is a Pentium III Xeon machine.The ratios (speedups) of the optimized implementation over the naive one will constitute a score of your implementa-tion. To summarize the overall effect over different values of N , we will compute the geometric mean of the resultsfor these 5 values. That is, if the measured speedups for N = {32, 64, 128, 256, 512} are R32, R64, R128, R256, andR512then we compute the overall performance asR =5pR32× R64× R128× R256× R5124Test case 1 2 3 4 5Method N 64 128 256 512 1024 Geom. MeanNaive rotate (CPE) 22.1 21.6 27.6 79.8 220.9Optimized rotate (CPE) 8.0 8.6 14.8 22.1 25.3Speedup (naive/opt) 2.8 2.5 1.9 3.6 8.7 3.1Method N 32 64 128 256 512 Geom. MeanNaive smooth (CPE) 524 525 527 522 523Optimized smooth (CPE) 41.5 41.6 41.2 53.5 56.4Speedup (naive/opt) / 12.6 12.6 12.8 9.8 9.3 11.3Table 1: CPEs and Ratios for Optimized vs. Naive ImplementationsAssumptionsTo make life easier, you can assume that N is a multiple of 32. Your code must run correctly for all such values of N,but we will measure its performance only for the 5 values shown in Table 1.5 InfrastructureWe have provided support code to help you test the correctness of your implementations and measure their perfor-mance. This section describes how to use this infrastructure. The exact details of each part of the assignment isdescribed in the following section.Note: The only source file you will be modifying is kernels.c.VersioningYou will be writing many versions of the rotate and smooth routines. To help you compare the performance ofall the different versions you’ve written, we provide a way


View Full Document

Harvey Mudd CS 105 - CS 105 Lab 5: Code Optimization

Documents in this Course
Processes

Processes

25 pages

Processes

Processes

27 pages

Load more
Download CS 105 Lab 5: Code Optimization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS 105 Lab 5: Code Optimization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS 105 Lab 5: Code Optimization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?