SVMs and Kernels
10-701 / 15-781 recitation, 10/22/09
Ekaterina Spriggs

From a linear classifier to ...
Classify as positive if $\sum_i w_i x_i + b \ge 0$, and as negative if $\sum_i w_i x_i + b < 0$.
(One of the most famous slides you will ever see.)

Maximum margin
Maximum possible separation between positive and negative training examples:
$\min_{w,b} \; w \cdot w$  subject to  $(w \cdot x_j + b)\, y_j \ge 1 \;\; \forall j$
(One of the most famous slides you will ever see.)

Number of support vectors (SVMs)
$\min_{w,b} \; w \cdot w$  subject to  $(w \cdot x_j + b)\, y_j \ge 1 \;\; \forall j$
At most $m + 1$ support vectors, where $m$ is the dimension of the input vectors, except for degenerate cases.

Multi-class SVM example
Rule: one $(w^{(y)}, b^{(y)})$ per class; the correct class must beat every other class by a margin:
$w^{(y_j)} \cdot x_j + b^{(y_j)} \ge w^{(y)} \cdot x_j + b^{(y)} + 1 \;\;\; \forall y \ne y_j, \; \forall j$

Kernels
$K(x, z) = x^T z$; some kernels correspond to feature spaces with infinitely many dimensions.
The complexity of the optimization problem remains dependent only on the dimensionality of the input space, not on the dimensionality of the feature space.

Finding the margin by hand

How many SVs now? But this is 2-D data.
$w = \sum_i \alpha_i y_i x_i, \qquad b = y_k - w \cdot x_k$
The worst case is the VC dimension.

VC dimension
For a given algorithm, the VC dimension is the size of the largest set of points that the algorithm can shatter.
For a linear classifier in $m$ dimensions, the VC dimension is $m + 1$, so the worst case for the number of SVs is $m + 1$.

How many SVs now? Dimensionality.
$w = \sum_i \alpha_i y_i x_i, \qquad b = y_k - w \cdot x_k$ for any $k$ where $\alpha_k > 0$.
$w$ depends only on the $\alpha$'s, not on the dimensionality of $x$.

Quiz
$w \cdot x_j + b \ge 1$ for $y_j = +1$, and $w \cdot x_j + b \le -1$ for $y_j = -1$. Why $+1$ and $-1$?

Quiz: can we apply a kernel to any algorithm?
SVM: $\min_{w,b} \; w \cdot w$  subject to  $(w \cdot x_j + b)\, y_j \ge 1 \;\; \forall j$
Linear regression: $w^* = \arg\min_w \sum_j \big( y_j - \sum_i w_i h_i(x_j) \big)^2$
Decision trees? Boosting?

Quiz: computing the b's
$b = y_k - w \cdot x_k$ for any $k$ where $\alpha_k > 0$. Which $k$ do we choose?

k-NN and the homework problem
Cross-validation error, training error, testing error.

Questions?
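To make the "Multi-class SVM example" slides concrete, here is a minimal Python sketch of the decision rule that goes with the constraint above, assuming the usual argmax rule (one $(w^{(y)}, b^{(y)})$ per class, predict the class with the largest score); the weights and the test point are made-up numbers, not from the slides.

```python
# Sketch of a multi-class linear SVM decision rule: one weight vector and bias
# per class, predict the class whose score w^(y) . x + b^(y) is largest.
# The numbers below are illustrative only.
import numpy as np

W = np.array([[ 1.0,  0.0],    # w for class 0
              [-1.0,  1.0],    # w for class 1
              [ 0.0, -1.0]])   # w for class 2
b = np.array([0.0, 0.5, -0.5])

def predict(x):
    # argmax over classes of the per-class linear score
    return int(np.argmax(W @ x + b))

print(predict(np.array([2.0, 0.5])))   # -> 0
```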
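The "Kernels" slides note that the cost of the optimization depends on the input dimension, not the feature-space dimension. A minimal sketch of why, using the quadratic kernel $K(x, z) = (x^T z)^2$ as an illustrative example (chosen here, not necessarily the kernel on the slide): the kernel value equals a dot product of explicit feature maps, but computing it never requires building those features.

```python
# The quadratic kernel on 2-D inputs equals an explicit dot product in a
# 3-D feature space, yet evaluating K only touches the 2-D input space.
import numpy as np

def phi(x):
    """Explicit feature map for the 2-D quadratic kernel (illustrative)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def K(x, z):
    """Kernel evaluation: just a dot product in the input space, then squared."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print(K(x, z))          # 1.0
print(phi(x) @ phi(z))  # 1.0 -- same value, computed via the feature map
```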
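For the "How many SVs now?" and "Computing the b's" slides, here is a hedged sketch on made-up 2-D data, assuming scikit-learn and NumPy are available: it fits a (nearly) hard-margin linear SVM, counts the support vectors, and checks the slide formulas $w = \sum_i \alpha_i y_i x_i$ and $b = y_k - w \cdot x_k$ for a support vector $k$ with $\alpha_k > 0$.

```python
# Sketch: verify w = sum_i alpha_i y_i x_i and b = y_k - w . x_k on toy data.
# scikit-learn's SVC with a large C approximates the hard-margin problem
# min w.w  s.t.  (w . x_j + b) y_j >= 1.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) + 2.0,      # +1 class
               rng.randn(20, 2) - 2.0])     # -1 class
y = np.array([1] * 20 + [-1] * 20)

clf = SVC(kernel="linear", C=1e6).fit(X, y)

# dual_coef_ stores alpha_i * y_i for the support vectors only.
alpha_y = clf.dual_coef_.ravel()
sv = clf.support_vectors_

w_from_alphas = alpha_y @ sv                       # w = sum_i alpha_i y_i x_i
k = 0                                              # any support vector index
b_from_sv = y[clf.support_[k]] - w_from_alphas @ sv[k]

# With separable data the SV count is small (at most m + 1 = 3 in 2-D,
# barring degenerate configurations), and the two ways of getting w and b
# should agree up to numerical tolerance.
print("number of support vectors:", len(sv))
print("w from alphas:", w_from_alphas, " sklearn:", clf.coef_.ravel())
print("b from one SV:", b_from_sv, " sklearn:", clf.intercept_[0])
```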