1Nov 23rd, 2001Copyright © 2001, 2003, Andrew W. MooreSupport Vector Machines Andrew W. MooreProfessorSchool of Computer ScienceCarnegie Mellon Universitywww.cs.cmu.edu/[email protected] to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received. Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 2Linear Classifiersf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)How would you classify this data?2Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 3Linear Classifiersf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)How would you classify this data?Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 4Linear Classifiersf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)How would you classify this data?3Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 5Linear Classifiersf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)How would you classify this data?Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 6Linear Classifiersf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)Any of these would be fine....but which is best?4Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 7Classifier Marginf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)Define the marginof a linear classifier as the width that the boundary could be increased by before hitting a datapoint.Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 8Maximum Marginf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)The maximum margin linear classifier is the linear classifier with the, um, maximum margin.This is the simplest kind of SVM (Called an LSVM)Linear SVM5Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 9Maximum Marginf xαyestdenotes +1denotes -1f(x,w,b) = sign(w. x - b)The maximum margin linear classifier is the linear classifier with the, um, maximum margin.This is the simplest kind of SVM (Called an LSVM)Support Vectors are those datapoints that the margin pushes up againstLinear SVMCopyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 10Why Maximum Margin?denotes +1denotes -1f(x,w,b) = sign(w. x - b)The maximum margin linear classifier is the linear classifier with the, um, maximum margin.This is the simplest kind of SVM (Called an LSVM)Support Vectors are those datapoints that the margin pushes up against1. Intuitively this feels safest. 2. If we’ve made a small error in the location of the boundary (it’s been jolted in its perpendicular direction) this gives us least chance of causing a misclassification.3. LOOCV is easy since the model is immune to removal of any non-support-vector datapoints.4. There’s some theory (using VC dimension) that is related to (but not the same as) the proposition that this is a good thing.5. Empirically it works very very well.6Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 11Specifying a line and margin• How do we represent this mathematically?•…in minput dimensions?Plus-PlaneMinus-PlaneClassifier Boundary“Predict Class = +1”zone“Predict Class = -1”zoneCopyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 12Plus-PlaneMinus-PlaneClassifier Boundary“Predict Class = +1”zone“Predict Class = -1”zone7Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 13Specifying a line and margin• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }Plus-PlaneMinus-PlaneClassifier Boundary“Predict Class = +1”zone“Predict Class = -1”zone-1 < w . x + b < 1ifUniverse explodesw . x + b <= -1if-1w . x + b >= 1if+1Classify as..wx+b=1wx+b=0wx+b=-1Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 14Computing the margin width• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }Claim: The vector w is perpendicular to the Plus Plane. Why?“Predict Class = +1”zone“Predict Class = -1”zonewx+b=1wx+b=0wx+b=-1M =Margin WidthHow do we compute Min terms of wand b?8Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 15Computing the margin width• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }Claim: The vector w is perpendicular to the Plus Plane. Why?“Predict Class = +1”zone“Predict Class = -1”zonewx+b=1wx+b=0wx+b=-1M =Margin WidthHow do we compute Min terms of wand b?Let u and v be two vectors on the Plus Plane. What is w . ( u – v ) ?And so of course the vector w is also perpendicular to the Minus PlaneCopyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 16Computing the margin width• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }• The vector w is perpendicular to the Plus Plane•Let x-be any point on the minus plane•Let x+be the closest plus-plane-point to x-.“Predict Class = +1”zone“Predict Class = -1”zonewx+b=1wx+b=0wx+b=-1M =Margin WidthHow do we compute Min terms of wand b?x-x+Any location in ¨m: not necessarily a datapointAny location in Rm: not necessarily a datapoint9Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 17Computing the margin width• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }• The vector w is perpendicular to the Plus Plane•Let x-be any point on the minus plane•Let x+be the closest plus-plane-point to x-.• Claim: x+= x-+ λ wfor some value of λ. Why?“Predict Class = +1”zone“Predict Class = -1”zonewx+b=1wx+b=0wx+b=-1M =Margin WidthHow do we compute Min terms of wand b?x-x+Copyright © 2001, 2003, Andrew W. Moore Support Vector Machines: Slide 18Computing the margin width• Plus-plane = { x : w . x + b = +1 }• Minus-plane = { x : w . x + b = -1 }• The vector w is perpendicular to the Plus Plane•Let x-be any point on the minus plane•Let x+be the closest plus-plane-point to x-.• Claim: x+= x-+ λ wfor some
View Full Document