SVM-Example2-solutions

Question 1

The training data for an SVM consists of 5 points: (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), where:

    y1 = −1, y2 = 1, y3 = −1, y4 = 1, y5 = 1.

The values of the feature vectors x1, . . . , x5 are not known explicitly, but their Gram matrix is known:

    G =  |  9   0  −3   0   0 |
         |  0   0   0   0   0 |
         | −3   0   1   0   0 |
         |  0   0   0   4  −2 |
         |  0   0   0  −2   1 |

(Element i, j of the Gram matrix is the dot product of xi and xj.) Let α1, . . . , α5 be the Lagrange multipliers associated with this data (αi is associated with (xi, yi)).

Part A

a. Using the linear kernel, what (dual) optimization problem needs to be solved in terms of the αi in order to determine their values?

Answer

The dual objective is Σi αi − (1/2) Σi Σj αi αj yi yj Gij; since most entries of G are zero, only a few quadratic terms survive.

Maximize

    L(α1, α2, α3, α4, α5) = α1 + α2 + α3 + α4 + α5
        − (1/2)(9α1² − 6α1α3 + α3² + 4α4² − 4α4α5 + α5²)

Subject to:

    −α1 + α2 − α3 + α4 + α5 = 0
    α1 ≥ 0, α2 ≥ 0, α3 ≥ 0, α4 ≥ 0, α5 ≥ 0.

b. Using the polynomial kernel of order 2, and soft margins specified by the parameter C = 10, what (dual) optimization problem needs to be solved in terms of the αi in order to determine their values?

Answer

The polynomial kernel of order 2 is K(xi, xj) = (1 + xi · xj)², so applying it entrywise to G gives the new Gram matrix:

    K =  | 100   1   4   1   1 |
         |   1   1   1   1   1 |
         |   4   1   4   1   1 |
         |   1   1   1  25   1 |
         |   1   1   1   1   4 |

Maximize

    L(α1, α2, α3, α4, α5) = α1 + α2 + α3 + α4 + α5
        − (1/2)(100α1² − 2α1α2 + 8α1α3 − 2α1α4 − 2α1α5
                + α2² − 2α2α3 + 2α2α4 + 2α2α5
                + 4α3² − 2α3α4 − 2α3α5
                + 25α4² + 2α4α5 + 4α5²)

Subject to:

    −α1 + α2 − α3 + α4 + α5 = 0
    0 ≤ α1 ≤ 10, 0 ≤ α2 ≤ 10, 0 ≤ α3 ≤ 10, 0 ≤ α4 ≤ 10, 0 ≤ α5 ≤ 10.

Part B

The table below shows the results obtained for the 4 cases involving linear and second-order polynomial kernels, with two different values of C.

    Case  kernel                 C    α1        α2        α3        α4        α5
    1     linear                 ∞    6.6·10⁷   9.4·10⁷   2.0·10⁸   5.6·10⁷   1.1·10⁸
    2     2nd-order polynomial   ∞    0         0.666     0.666     0         0
    3     linear                 10   3.6       4.8       10        2.9       5.8
    4     2nd-order polynomial   10   0         0.666     0.666     0         0

a. Select one of these cases and show that the SVM correctly classifies the entire training data. Show and explain your computations.

Answer

I use Case 2 (Case 4 is identical). Only α2 and α3 are nonzero, so x2 and x3 are the support vectors, and the decision function is f(x) = α2 y2 K(x2, x) + α3 y3 K(x3, x) + b.
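Before working through the computation by hand, the kernel arithmetic from Part A can be cross-checked numerically. A minimal numpy sketch (not part of the original solution; it only verifies the second-order Gram matrix and two of the dual's quadratic coefficients):

```python
import numpy as np

# Linear Gram matrix G and labels from the problem statement.
G = np.array([
    [ 9, 0, -3,  0,  0],
    [ 0, 0,  0,  0,  0],
    [-3, 0,  1,  0,  0],
    [ 0, 0,  0,  4, -2],
    [ 0, 0,  0, -2,  1],
], dtype=float)
y = np.array([-1.0, 1.0, -1.0, 1.0, 1.0])

# Polynomial kernel of order 2 applied entrywise: K_ij = (1 + x_i . x_j)^2.
K = (1 + G) ** 2

# Entrywise check against the matrix stated in Part A(b).
expected = np.array([
    [100, 1, 4,  1, 1],
    [  1, 1, 1,  1, 1],
    [  4, 1, 4,  1, 1],
    [  1, 1, 1, 25, 1],
    [  1, 1, 1,  1, 4],
], dtype=float)
assert np.array_equal(K, expected)

# Quadratic part of the dual: (1/2) * alpha^T Q alpha with Q_ij = y_i y_j K_ij.
Q = np.outer(y, y) * K
# e.g. the alpha1*alpha3 coefficient inside the (1/2)(...) is Q13 + Q31 = 8:
print(Q[0, 2] + Q[2, 0])  # -> 8.0
```

The same Q built from G instead of K reproduces the linear dual of Part A(a).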
From the first support vector (x2, y2), which lies exactly on the margin, b = y2 − (α2 y2 K(x2, x2) + α3 y3 K(x3, x2)):

    b = 1 − (0.666 · 1 · 1 + 0.666 · (−1) · 1) = 1

Evaluating f(xi) = 1 + (0.666 · 1 · K(x2, xi) + 0.666 · (−1) · K(x3, xi)) for each point:

    For Point 1: 1 + (0.666 · 1 · 1 + 0.666 · (−1) · 4) = −0.998 < 0  (correct: y1 = −1)
    For Point 2: 1 + (0.666 · 1 · 1 + 0.666 · (−1) · 1) = 1 > 0       (correct: y2 = 1)
    For Point 3: 1 + (0.666 · 1 · 1 + 0.666 · (−1) · 4) = −0.998 < 0  (correct: y3 = −1)
    For Point 4: 1 + (0.666 · 1 · 1 + 0.666 · (−1) · 1) = 1 > 0       (correct: y4 = 1)
    For Point 5: 1 + (0.666 · 1 · 1 + 0.666 · (−1) · 1) = 1 > 0       (correct: y5 = 1)

The sign of f(xi) agrees with yi for every point, so the SVM classifies the entire training set correctly.

b. Select one of these cases and show that the SVM does not correctly classify the entire training data. Show and explain your computations.

Answer

For Case 1 the answer is not clean: the data is not linearly separable, so with hard margins (C = ∞) the dual is unbounded and the αi diverge toward infinity. This case should not be used.

For Case 3 the answer is clean. The second support vector (x2, y2) is on a hard margin (0 < α2 = 4.8 < C = 10). Its dot product with all vectors is 0 (row 2 of G is all zeros). Therefore:

    b = 1 − (0 + 0 + 0 + 0 + 0) = 1

Point 3 is on a soft margin (α3 = 10 = C), and therefore may not be correctly classified. Only x1 and x3 have nonzero dot products with x3, so:

    For Point 3: 1 + (3.6 · (−1) · (−3) + 10 · (−1) · 1) = 1 + (10.8 − 10) = 1.8 > 0

but y3 = −1, so Point 3 is misclassified.
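The hand computations above can be cross-checked numerically. A minimal numpy sketch (not part of the original solution; it reuses the αi from the table and the bias b = 1 derived above):

```python
import numpy as np

y = np.array([-1.0, 1.0, -1.0, 1.0, 1.0])

# Linear Gram matrix from the problem statement; the second-order
# polynomial Gram matrix is (1 + G)^2 entrywise.
G = np.array([
    [ 9, 0, -3,  0,  0],
    [ 0, 0,  0,  0,  0],
    [-3, 0,  1,  0,  0],
    [ 0, 0,  0,  4, -2],
    [ 0, 0,  0, -2,  1],
], dtype=float)
K = (1 + G) ** 2

def decision_values(gram, alpha, b):
    """f(x_j) = sum_i alpha_i y_i gram[i, j] + b, for every training point j."""
    return gram.T @ (alpha * y) + b

# Case 2 (same alphas as Case 4), with b = 1 computed above:
f2 = decision_values(K, np.array([0, 0.666, 0.666, 0, 0]), b=1.0)
print(np.sign(f2) == y)  # all True: every point classified correctly

# Case 3, with b = 1 computed above:
f3 = decision_values(G, np.array([3.6, 4.8, 10, 2.9, 5.8]), b=1.0)
print(np.sign(f3) == y)  # False at index 2 only: Point 3 is misclassified
print(f3[2])             # ~1.8 > 0, although y3 = -1
```

Since the Gram matrices are symmetric, `gram.T @ (alpha * y)` evaluates all five decision values at once; a sign that disagrees with yi flags a misclassified point.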