Purdue CS 59000 - Lecture notes

CS 59000 Statistical Machine Learning
Lecture 18
Yuan (Alan) Qi, Purdue CS

Outline
- Review of Gaussian process classification
- Support vector machines for the linearly separable case
- Support vector machines for overlapping class distributions

Gaussian Processes for Classification
Likelihood: p(t_n | a_n) = σ(a_n)^{t_n} (1 - σ(a_n))^{1 - t_n}, where σ is the logistic sigmoid.
GP prior: p(a_N) = N(a_N | 0, C_N).
Covariance function: C(x_n, x_m) = k(x_n, x_m) + ν δ_{nm}.

Predictive Distribution
p(t_{N+1} = 1 | t_N) = ∫ σ(a_{N+1}) p(a_{N+1} | t_N) da_{N+1}.
This integral has no analytical solution. Ways to approximate it:
- Laplace's method
- Variational Bayes
- Expectation propagation

Laplace's Method for GP Classification
A second-order Taylor expansion of ln p(a_N | t_N) around its mode a* yields a Gaussian approximation q(a_N) = N(a_N | a*, H^{-1}), where H = -∇∇ ln p(a_N | t_N) evaluated at a*.

Support Vector Machines
Support vector machines are motivated by statistical learning theory.

Maximum Margin Classifiers
Margin: the smallest distance between the decision boundary and any of the samples.

Maximizing the Margin
The distance from a point x_n to the boundary y(x) = w^T φ(x) + b is t_n y(x_n) / ||w||. Since scaling w and b together does not change this ratio, we set t_n (w^T φ(x_n) + b) = 1 for the point(s) closest to the boundary. Data points for which the equality holds are said to be active constraints; for the remainder the constraints are inactive.

Optimization Problem
Quadratic programming:
  minimize (1/2) ||w||^2
  subject to t_n (w^T φ(x_n) + b) >= 1, n = 1, ..., N.

Lagrange Multipliers
To maximize f(x) subject to g(x) = 0, introduce a multiplier λ and require ∇f + λ∇g = 0 at a stationary point. Geometrically, the gradient of the constraint, ∇g, is orthogonal to the constraint surface, so at the constrained optimum ∇f must be parallel (or antiparallel) to ∇g.

Lagrange Multipliers with Inequality Constraints (g(x) >= 0)
- Inactive constraint (g(x) > 0): the constraint plays no role in the optimization, so the stationarity condition is ∇f(x) = 0; equivalently, λ = 0.
- Active constraint (g(x) = 0): the sign of the Lagrange multiplier is positive. Why? At a constrained maximum on the boundary, ∇f must point out of the feasible region (otherwise f could be increased by moving into the interior where g > 0), so ∇f = -λ∇g with λ > 0.

Karush-Kuhn-Tucker (KKT) Conditions
  λ >= 0,  g(x) >= 0,  λ g(x) = 0.

Lagrange Function for SVM
For the quadratic program above (subject to t_n y(x_n) >= 1), the Lagrange function is
  L(w, b, a) = (1/2) ||w||^2 - Σ_n a_n { t_n (w^T φ(x_n) + b) - 1 },  with a_n >= 0.

Dual Variables
Setting the derivatives of L with respect to w and b to zero gives
  w = Σ_n a_n t_n φ(x_n),   Σ_n a_n t_n = 0.

Dual Problem
Maximize
  L~(a) = Σ_n a_n - (1/2) Σ_n Σ_m a_n a_m t_n t_m k(x_n, x_m)
subject to a_n >= 0 and Σ_n a_n t_n = 0, where k(x_n, x_m) = φ(x_n)^T φ(x_m).

Prediction
  y(x) = Σ_n a_n t_n k(x, x_n) + b.

KKT Conditions, Support Vectors, and Bias
The KKT conditions require a_n >= 0, t_n y(x_n) - 1 >= 0, and a_n (t_n y(x_n) - 1) = 0. Hence for every data point either a_n = 0 or t_n y(x_n) = 1. The corresponding data points in the latter case are known as support vectors.
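The dual problem, the KKT conditions, and the support-vector structure above can be checked numerically on a tiny separable example. A minimal sketch, assuming NumPy and SciPy are available; the two-point data set is illustrative, chosen so the exact max-margin answer w = (1, 0), b = 0 is known in advance:

```python
import numpy as np
from scipy.optimize import minimize

# Two separable points: x1 = (1, 0) with t1 = +1, x2 = (-1, 0) with t2 = -1.
# The max-margin solution is known exactly: w = (1, 0), b = 0.
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
t = np.array([1.0, -1.0])
K = X @ X.T  # linear kernel k(x_n, x_m) = x_n . x_m

def neg_dual(a):
    # Negative of L~(a) = sum_n a_n - (1/2) sum_{n,m} a_n a_m t_n t_m k(x_n, x_m)
    return -(a.sum() - 0.5 * (a * t) @ K @ (a * t))

res = minimize(neg_dual, x0=np.zeros(2), method='SLSQP',
               bounds=[(0.0, None)] * 2,                       # a_n >= 0
               constraints={'type': 'eq', 'fun': lambda a: a @ t})  # sum_n a_n t_n = 0
a = res.x

w = (a * t) @ X                                # w = sum_n a_n t_n x_n
sv = a > 1e-6                                  # KKT: a_n > 0  =>  support vector
b = float(np.mean(t[sv] - ((a * t) @ K)[sv]))  # bias averaged over support vectors
print(w, b)  # w is approximately (1, 0), b approximately 0; margin 1/||w|| = 1
```

Both points sit exactly on the margin here, so both are support vectors with a_n > 0, consistent with the complementary-slackness condition a_n (t_n y(x_n) - 1) = 0.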
Then we can solve the bias term by averaging over the set S of support vectors:
  b = (1/N_S) Σ_{n∈S} ( t_n - Σ_{m∈S} a_m t_m k(x_n, x_m) ).

Computational Complexity
Quadratic programming: the primal has as many variables as the feature-space dimension, while the dual has one variable per data point. When the dimension is smaller than the number of data points, solving the dual problem is more costly; however, the dual representation allows the use of kernels.

Example: SVM Classification
(Figure in the original slides; not preserved in this preview.)

Classification for Overlapping Class Distributions
Soft margin: allow some data points to be misclassified or to lie inside the margin.

New Cost Function
To maximize the margin while softly penalizing points that lie on the wrong side of the margin (not decision) boundary, we introduce slack variables ξ_n >= 0 and minimize
  C Σ_n ξ_n + (1/2) ||w||^2
subject to t_n y(x_n) >= 1 - ξ_n.
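The soft-margin construction sketched above is equivalent to the unconstrained hinge-loss objective (1/2)||w||^2 + C Σ_n max(0, 1 - t_n y(x_n)), since at the optimum ξ_n = max(0, 1 - t_n y(x_n)). A minimal subgradient-descent sketch on synthetic overlapping data; the blob locations, learning rate, and iteration count are illustrative choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two overlapping Gaussian blobs (illustrative data).
X = np.vstack([rng.normal(loc=[2.0, 2.0], size=(50, 2)),
               rng.normal(loc=[-2.0, -2.0], size=(50, 2))])
t = np.array([1.0] * 50 + [-1.0] * 50)

C, lr = 1.0, 0.01
w, b = np.zeros(2), 0.0
for _ in range(2000):
    margins = t * (X @ w + b)
    viol = margins < 1.0          # points with slack xi_n = 1 - t_n y(x_n) > 0
    # Subgradient of (1/2)||w||^2 + C * sum_n max(0, 1 - t_n y(x_n))
    grad_w = w - C * (t[viol][:, None] * X[viol]).sum(axis=0)
    grad_b = -C * t[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

acc = float(np.mean(np.sign(X @ w + b) == t))  # training accuracy
```

Larger C penalizes margin violations more heavily (approaching the hard-margin limit as C grows), while smaller C trades training errors for a wider margin.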

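The remark that the dual representation allows the use of kernels is worth making concrete: training and prediction depend on the data only through k(x_n, x_m), so the linear kernel can be swapped for a nonlinear one without ever evaluating the feature map φ. A minimal sketch on the XOR problem, assuming NumPy and SciPy; the RBF kernel width and data are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

# XOR labels: not linearly separable in input space, but separable with an RBF kernel.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])

def rbf(A, B, gamma=1.0):
    # k(x, x') = exp(-gamma * ||x - x'||^2); gamma = 1 is an arbitrary choice here
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)

def neg_dual(a):
    return -(a.sum() - 0.5 * (a * t) @ K @ (a * t))

res = minimize(neg_dual, x0=np.zeros(4), method='SLSQP',
               bounds=[(0.0, None)] * 4,
               constraints={'type': 'eq', 'fun': lambda a: a @ t})
a = res.x
sv = a > 1e-6
b = float(np.mean(t[sv] - ((a * t) @ K)[sv]))

def predict(Xnew):
    # y(x) = sign( sum_n a_n t_n k(x, x_n) + b ) -- phi is never computed explicitly
    return np.sign(rbf(Xnew, X) @ (a * t) + b)

print(predict(X))  # all four XOR points are classified correctly
```

By the symmetry of the XOR configuration, all four points end up as support vectors.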
