Problem Set 3
CS 6375
Due 4/6/2022 by 11:59pm

Note: all answers should be accompanied by explanations for full credit. Late homeworks will not be accepted.

Problem 1: VC Dimension (25 pts)

1. Consider a binary classification problem for data points in R^3, with a hypothesis space consisting of axis-aligned 3-d boxes such that any point in the box is labeled with a + and any point outside the box is labeled with a −. What is the VC dimension of this hypothesis space? Prove it. Can you generalize your argument to axis-aligned boxes in R^d?

Problem 2: Spherical Hypotheses (25 pts)

Given training data of the form (x^(1), y^(1)), ..., (x^(M), y^(M)), where x^(m) ∈ R^n and y^(m) ∈ {−1, +1}, consider the hypothesis space of n-dimensional spheres: each element of the hypothesis space is parameterized by a center c ∈ R^n and a radius r ≥ 0, such that all points within distance r of the center c are classified as +1 and the remaining points are classified with a −1.

1. Assuming that the training data can be correctly classified under the spherical hypothesis space, describe an optimization problem whose solution is a spherical hypothesis that is a max-margin perfect classifier.

2. Using the method of Lagrange multipliers, construct the dual of your optimization problem.

Problem 3: Medical Diagnostics (50 pts)

For this problem you will use the data set provided with this problem set. The data has been divided into two pieces: heart_train.data and heart_test.data. These data sets were generated using the UCI SPECT heart data set; follow the link for information about the format of the data. Note that the class label is the first column in the data set.

1. Suppose that the hypothesis space consists of all decision trees with exactly three attribute splits (repetition along the same path is allowed) for this data set.

   (a) Run the adaBoost algorithm for five rounds to train a classifier for this data set. Draw the 5 selected trees in the order that they occur, and report the ε_t and α_t generated by adaBoost for each.

   (b) Run the adaBoost algorithm for 10 rounds of boosting. Plot the accuracy on the training and test sets versus iteration number.

2. Now suppose that the hypothesis space consists of only height-1 decision trees for this data set.

   (a) Use coordinate descent to minimize the exponential loss function for this hypothesis space over the training set. You can use any initialization and iteration order that you would like, other than the one selected by adaBoost. What is the optimal value of α that you arrived at? What is the corresponding value of the exponential loss on the training set?

   (b) What is the accuracy of the resulting classifier on the test data?

   (c) What is the accuracy of adaBoost after 20 rounds for this hypothesis space on the test data? How does the α learned by adaBoost compare to the one learned by gradient descent?

   (d) Use bagging with 20 bootstrap samples to produce an average classifier for this data set. How does it compare to the previous classifiers in terms of accuracy on the test set?

   (e) Which of these 3 methods should be preferred for this data set, and why?
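As an orientation for Problem 3, the adaBoost loop over height-1 trees (stumps) can be sketched as below. This is a hedged sketch, not the graded solution: the tiny ±1 toy data set stands in for heart_train.data (whose loading is not shown), and stumps replace the three-split trees of part 1. In each round it reports the weighted error ε_t and weight α_t = ½ ln((1−ε_t)/ε_t) that the problem asks you to record.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    # Height-1 decision tree: a single split on one feature.
    return polarity if x[feature] <= threshold else -polarity

def best_stump(X, y, w):
    # Exhaustive search for the stump minimizing the weighted error eps_t.
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (+1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, f, t, pol) != yi)
                if err < best_err:
                    best_err, best = err, (f, t, pol)
    return best, best_err

def adaboost(X, y, rounds):
    m = len(X)
    w = [1.0 / m] * m                  # start from the uniform distribution
    ensemble = []                      # list of (alpha_t, stump_t)
    for _ in range(rounds):
        stump, eps = best_stump(X, y, w)
        eps = max(eps, 1e-12)          # guard against a perfect stump
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, stump))
        # Reweight: misclassified points up, correct ones down, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, *stump))
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, *s) for a, s in ensemble)
    return 1 if score >= 0 else -1

# Toy stand-in for the SPECT data: label is +1 iff both features are >= 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 1], [1, 2], [0, 2]]
y = [-1, -1, -1, 1, 1, 1, 1, -1]
model = adaboost(X, y, rounds=3)
train_acc = sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

On this toy set the first stump misclassifies one of eight points, so ε_1 = 1/8 and α_1 = ½ ln 7; no single stump separates the data, but the three-round ensemble does, which is the effect the accuracy-versus-iteration plot in part 1(b) is meant to show.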

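For part 2(d), bagging replaces adaBoost's reweighting with bootstrap resampling and an unweighted majority vote. A hedged sketch under the same assumptions as before (the one-feature toy data and the stump learner are illustrative stand-ins, not the SPECT files):

```python
import random

def train_stump(X, y):
    # Unweighted stump learner: the split with the fewest training errors wins.
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (+1, -1):
                err = sum((pol if x[f] <= t else -pol) != yi
                          for x, yi in zip(X, y))
                if err < best_err:
                    best_err, best = err, (f, t, pol)
    return best

def bag(X, y, n_bags, seed=0):
    # Train one stump per bootstrap sample (draw m points with replacement).
    rng = random.Random(seed)
    m = len(X)
    stumps = []
    for _ in range(n_bags):
        idx = [rng.randrange(m) for _ in range(m)]
        stumps.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def bagged_predict(stumps, x):
    # Unweighted majority vote over the bagged stumps.
    votes = sum(pol if x[f] <= t else -pol for f, t, pol in stumps)
    return 1 if votes >= 0 else -1

# One-feature toy data: label is +1 iff the feature equals 1.
X = [[0], [0], [0], [1], [1], [1]]
y = [-1, -1, -1, 1, 1, 1]
stumps = bag(X, y, n_bags=20)
acc = sum(bagged_predict(stumps, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

In the actual exercise the 20 bootstrap samples are drawn from heart_train.data and the vote is evaluated on heart_test.data; comparing that test accuracy against the boosted and coordinate-descent classifiers is what part 2(e) asks you to weigh.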