Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
Department of Mechanical Engineering

6.050J/2.110J Information and Entropy, Spring 2005

Issued: February 7, 2005    Problem Set 3    Due: February 11, 2005

Laboratory Experiment

This exercise demonstrates some of the basic ideas behind image compression. For simplicity the images you will work with in this exercise have no color. Each pixel is represented by a number: 0 if the pixel is black, 1 if the pixel is white, or a number between 0 and 1 for shades of gray. Thus a picture is represented by a matrix (a two-dimensional array) of numbers.

Usually pixel values are represented by a small number of bits, often 8 bits, in which case only a finite number of shades can be represented. For the purpose of this exercise you may assume that the shades are represented by real numbers, so any shade can be represented.

Most modern video compression algorithms use a form of the Discrete Cosine Transformation (DCT). The original image is broken up into blocks, in our case 8 pixels high and 8 pixels wide. For each block, the DCT is applied to the matrix of pixel values. The result is another matrix of the same size, with values (DCT coefficients) that are numbers, not necessarily in the range from 0 to 1. One of these coefficients is (to within a scale factor) the average value of all 64 pixels. Other coefficients describe the variation of shade across the block.

The DCT is reversible, in the sense that the original image can be calculated exactly from the matrix of coefficients. No information is lost, but in turn no compression has occurred because the DCT coefficients require as many bits as the original image. What irreversible compression algorithms such as JPEG do is discard small DCT coefficients, thereby saving on the number of bits needed. When it is time to render the image, the inverse DCT transformation is applied, and the result, which is not quite the same as the original image, is displayed.
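To make the reversible and irreversible steps concrete, here is a minimal sketch on a single 8 x 8 block. It is not part of the original handout. The matrix T is a hand-built orthonormal DCT-II matrix, used here as an assumption so that the example runs without any toolbox; the block X and the value of cutoff are hypothetical and chosen only for illustration.

```matlab
% Sketch: 2-D DCT of one 8x8 block with a hand-built orthonormal DCT-II
% matrix T (an assumption, so that no toolbox function is needed).
N = 8;
[k, n] = ndgrid(0:N-1, 0:N-1);            % k = frequency index, n = pixel index
T = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
T(1, :) = T(1, :) / sqrt(2);              % rescale first row so that T*T' = I

% Hypothetical block: a smooth horizontal ramp of gray shades in [0, 1].
X = repmat(linspace(0.2, 0.8, N), N, 1);

C = T * X * T';                           % forward 2-D DCT of the block
Xback = T' * C * T;                       % inverse 2-D DCT

dcOverMean = C(1, 1) / mean(X(:));        % upper-left coefficient vs. average
roundTripError = max(abs(Xback(:) - X(:)));   % round-off only: DCT is reversible

% Irreversible step: discard coefficients below a hypothetical cutoff.
cutoff = 0.05;
Cq = C;
Cq(abs(Cq) < cutoff) = 0;
Xq = T' * Cq * T;                         % approximate reconstruction
numKept = nnz(Cq);                        % coefficients that must be stored
meanSqErr = mean((Xq(:) - X(:)).^2);      % error introduced by discarding
```

With this normalization the upper-left coefficient is 8 times the block average, which is the scale factor mentioned above. Because the ramp is smooth, the coefficients that survive the cutoff should all sit in the top row, toward the upper left.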
If the choice of which coefficients to discard is done well, the changes in the image as perceived by the human eye are minimal.

In MATLAB, type dctdemo to view a demonstration of the DCT. Click on "Info" for detailed information. Here's a brief description of what is going on.

1. The original image you choose is divided into 8 x 8 blocks of pixels.

2. DCT is applied to each block. The resulting matrix of coefficients has at the upper left the coefficient that gives the average. Coefficients down and to the right measure how rapidly the pixel values change, i.e., whether the block has pixels with high spatial frequencies. For example, the coefficient at the lower right is large only if the block has pixels in something like a checkerboard pattern. If the image is relatively smooth over the block in question, only a few coefficients toward the upper left will be significant.

3. Compression occurs when small values of DCT coefficients are set to zero. You can do this by moving the horizontal slider to block out certain values and then clicking "Apply". MATLAB will render the resulting compressed image.

4. The error image shows the result of subtracting the original pixel values from the reconstructed values.

Play with this demonstration long enough to observe the trade-off between compression (e.g., the percentage of coefficients discarded) and visual fidelity.

Problem 1: Is it Over-Compressed or is it Modern Art?

Write your own video image compressor in MATLAB!⁴ This will be as simple as possible, so don't sweat. Write all your MATLAB commands in ps3p1.m.

a. Start off by loading the vertigo image by typing the following in MATLAB.

       load imdemos vertigo;      % Load vertigo matrix from image demos library
       vertigo=double(vertigo);   % Convert the matrix to double precision
       colormap('gray');          % Grayscale for any pictures you want to display
                                  % (This creates a blank window that we'll use soon)
       imshow(vertigo,[0 255]);   % Display vertigo in the window

b. Perform a 2D DCT operation on 8 × 8 blocks of the image matrix. You might find the commands blkproc and dct2 useful. Use the first form of blkproc explained in its help page (help blkproc), where FUN is 'dct2' (put single quotes around dct2). To keep help information from scrolling off the screen, type more on at the MATLAB prompt.

c. Set to zero all coefficients with absolute value less than 10. An example of how to do this can be found in the help of dct2. We will later want to vary this cutoff value and count the number of non-zero values. After this step, many codecs apply run length and variable length encoding to transmit or store the image efficiently for later use.

d. Perform a 2D Inverse DCT operation on each of the 8 × 8 blocks of the image matrix. The values of the resulting matrix will be in floating point, but you should round the elements to the nearest integer since image pixel values are integers. Use the blkproc and idct2 commands similarly to the way you performed the forward DCT in Part b. Then use the round command to complete the reconstruction of the image.

e. Determine the mean squared error between the original and reconstructed image. The definition of mean squared error to use is mean2((x - y).^2). Values will appear larger than in the demo because our error definition does no normalization. The answer you should receive is 10.2970.

Note that you can see your new image with the imshow command the same way you displayed the original image. If you want a new window for the image, so it doesn't just paint over the previous one, type figure at the MATLAB prompt.

Now compose a graph, having along the x-axis the number of non-zero values of the matrix after the cutoff procedure (before you did the inverse DCT), and the mean squared error (after you did the inverse DCT and rounding) on the y-axis. The number of non-zero values is the number of bytes required to store the image (ignoring overhead), so smaller means more compression.
Range the cutoff value from 0 to 100 in increments of 4. The use of nnz and plot will do the trick. It will be easiest to use them if you create a vector with the numbers of non-zero values and another vector with the mean squared errors.

Write in ps3diary the largest byte size for which you can detect the difference between the original image and the reconstructed image by eye. Include any comments in the diary file. Be sure the script in ps3p1.m is executable in MATLAB.

⁴ The 2.110/6.050 staff would like to thank Joe Huang, the Spring 2000 class TA, for
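The cutoff sweep described in Problem 1 could be organized along the following lines. This is only a sketch under stated assumptions, not the staff's solution. It assumes that blkproc, dct2, idct2 and mean2 from the Image Processing Toolbox and the imdemos data are available, as the problem text implies (newer MATLAB releases provide blockproc in place of blkproc, with a different calling convention), and that the image dimensions are multiples of 8 so that no block padding is involved. The variable names are hypothetical.

```matlab
% Sketch only: relies on the toolbox functions and data named in the problem.
load imdemos vertigo;                        % Load vertigo matrix, as in Part a
vertigo = double(vertigo);

coeffs = blkproc(vertigo, [8 8], 'dct2');    % Part b: DCT of each 8x8 block

cutoffs = 0:4:100;                           % cutoff values to try
numNonzero = zeros(size(cutoffs));           % coefficients kept for each cutoff
meanSqErr = zeros(size(cutoffs));            % mean squared error for each cutoff

for i = 1:numel(cutoffs)
    kept = coeffs;
    kept(abs(kept) < cutoffs(i)) = 0;        % Part c: zero small coefficients
    numNonzero(i) = nnz(kept);               % count before the inverse DCT
    recon = round(blkproc(kept, [8 8], 'idct2'));   % Part d: inverse DCT, round
    meanSqErr(i) = mean2((vertigo - recon).^2);     % Part e: mean squared error
end

figure;                                      % new window for the graph
plot(numNonzero, meanSqErr, 'o-');
xlabel('Number of non-zero DCT coefficients');
ylabel('Mean squared error');
```

Keeping the forward DCT outside the loop avoids recomputing it for every cutoff, since only the thresholding and the reconstruction depend on the cutoff value.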

