Princeton COS 424 - Homework #3

COS 424
Homework #3
Due Tuesday, April 3, 2007

See the course website for important information about collaboration, late policies, grading policies, as well as where and when to turn in assignments.

Problem 1

Consider the Gaussian kernel (a.k.a. Gaussian radial basis function kernel) given by

    K_c(x, z) = exp(−c ||x − z||^2)

where x and z are points in R^n, and c is a positive constant. This problem considers how SVMs behave with this kernel as c becomes large.

(Note: There is an important difference between the way that SVMs were presented in class and the way they are presented in the reading that was posted online. In class, we assumed that the hyperplane we are seeking must pass through the origin. In the reading, on the other hand, the hyperplane is not required to pass through the origin. This makes the math slightly more complicated. For this problem, you should follow the development that was given in class.)

a. For fixed x and z, let us define

    K_∞(x, z) = lim_{c→∞} K_c(x, z)

to be the limit of K_c(x, z) as c becomes very large. As a function of x and z, what is the value of K_∞(x, z)?

b. Consider a fixed training set (x_1, y_1), ..., (x_m, y_m). You can assume that all of the x_i's are distinct. In the limit as c becomes extremely large, what happens to the Lagrange multipliers α_i computed by SVMs? Although there are some technical issues that we are ignoring, you can assume that it is okay to answer this question by computing the Lagrange multipliers that would be found if we simply used the kernel K_∞ defined in the last part.

c. Suppose that c is finite but sufficiently large that we are close to the limiting behavior determined in the last parts of this problem. How will the classifier found by SVMs using this Gaussian kernel classify a new test point x?
How do these predictions relate to the nearest neighbor algorithm?

Problem 2

Prove that the k-means algorithm will terminate after a finite number of steps. (Hint: recall that k-means is a coordinate descent algorithm.)

Problem 3

[Figure: three graphical models, each over the variables X, Y, and Z, labeled (1), (2), and (3).]

For the three graphical models above, determine whether the following independence statements necessarily hold and prove your answers.

a. X ⊥⊥ Z

b. X ⊥⊥ Z | Y

In the R problems for this assignment, you will need to install the cluster and pixmap packages. Additional packages can easily be installed with the install.packages command. Once a package is installed, use the library command to load it into memory. You will also need the files stubs.R, helper.R, and the data. These can be found on the assignments page, http://www.cs.princeton.edu/courses/archive/spr07/cos424/assignments.html.

In all of these problems, please hand in the code that was used to generate the results. In addition, please hand in PDF files of all plots. Hand in files using the naming convention problem-4b.pdf and problem-4b.R.

Problem 4

Implement the k-means and k-medoids algorithms in R by filling in the functions in stubs.R. Initialize the centers/medoids by choosing k data points at random, and assigning the k means/medoids to those points.

(Note: R implements the k-means algorithm in kmeans and the k-medoids algorithm in pam, both as part of the cluster package. You may not use either of these functions in this problem. You may use them in later problems.)

a. Run your algorithms with 10 random restarts using k = 3 on the data in fake-data.dat, which is a 21 × 2 matrix that represents 21 two-dimensional data points. For each algorithm, plot the points color-coded by cluster assignment, plot the cluster centers or medoids, and report the final objective value.

b. One can use the silhouette statistic to choose k in the k-means algorithm.
For the ith data point, the silhouette statistic is

    (b(i) − a(i)) / max(b(i), a(i)),

where a(i) is the average distance to the other points in the cluster assigned to the ith data point, and b(i) is the average distance to the points in the cluster closest to the cluster assigned to the ith data point. One can choose an "optimal" k by maximizing the per-data average silhouette statistic across various values of k. (This statistic is efficiently implemented in the function silhouette.)

Learn how to use the R function silhouette by reading the help file. Apply the silhouette statistic to your clusterings from the previous question. What is the natural value of k? In general, why does the silhouette statistic make sense?

In the following problems, you will be clustering real data sets. You may use kmeans, which is likely to be faster than your implementation because it calls fast Fortran and C code. This function has several parameters to play with; type help(kmeans) to learn how to use it. We suggest changing iter.max to 100 or more.

Problem 5

The file faces.dat is a 471 × 361 matrix, where each row is a grey-scale image of a face. Load it using data <- matrix(scan("faces.dat"), nrow=471, ncol=361). You can use plot.face(x) to plot one of the rows of the matrix.

Cluster the faces by running k-means for k = {1, 2, 4, 8, 16, 32, 64, 128, 256}, each time choosing the clustering that gives the best objective of 10 random restarts. Note that this is not vector quantization: you are clustering a collection of images, rather than the pixels of a single image.

a. Plot the objective as a function of iteration for different values of k (on the same plot). What can be gleaned from this plot?

b. Plot the converged objective as a function of k.

c. Plot the k means for k = {4, 8, 16}, each on one plot. Compare the different clusterings. What has been captured in each?

d. For your choice of k, compare the k-means and k-medoids clusterings of the faces.
Is there a difference in this case?

Problem 6

The function load.album.cover(filename) returns a 57600 × 3 matrix, where each row is a pixel in an image of an album cover downloaded from Amazon. Load the PNM files for the Beatles albums "Let It Be," "Sergeant Pepper's Lonely Hearts Club Band," and "The White Album." (All three of these albums are great.)

Run the k-means algorithm (i.e., vector quantization) on each of these matrices for different values of k. Make the following plots.

a. Plot the final objective as a function of k for each album cover (on the same plot). Comment on what this plot demonstrates.

b. For fixed k, plot the objective as a function of iteration for each album cover (on the same plot). Comment on what this plot demonstrates.

c. For "Let It Be," plot the vector-quantized image for 9 different values of k (on one plot). Use the function plot.album.cover(data).

Problem 7 In this
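The k-means-plus-silhouette workflow described in Problems 4b and 5 can be sketched as follows. This is only an illustrative sketch, not a solution to the assignment: it assumes the faces.dat file and 471 × 361 shape given in Problem 5, and it uses the kmeans and silhouette functions (from the cluster package), which the assignment permits in these later problems.

```r
# Sketch: choose k by maximizing the per-data average silhouette width.
# Assumes faces.dat (471 x 361) from Problem 5 is in the working directory.
library(cluster)

data <- matrix(scan("faces.dat"), nrow = 471, ncol = 361)
d    <- dist(data)  # pairwise Euclidean distances, computed once

ks      <- 2:10
avg.sil <- sapply(ks, function(k) {
  km  <- kmeans(data, centers = k, nstart = 10, iter.max = 100)
  sil <- silhouette(km$cluster, d)   # one row per data point
  mean(sil[, "sil_width"])           # per-data average silhouette
})

# A natural choice of k maximizes the average silhouette width.
plot(ks, avg.sil, type = "b", xlab = "k", ylab = "average silhouette width")
ks[which.max(avg.sil)]
```

With your own stubs.R implementations from Problem 4, the kmeans call above would be replaced by your function's output: silhouette only needs an integer vector of cluster assignments and a dist object.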

