CORNELL CS 472 - Study Notes - D2777223

Home> Schools> Cornell University> Computer Science (CS) > CS 472> Study Notes

DOC PREVIEW

CORNELL CS 472 - Study Notes

School name Cornell University

Course Cs 472- Foundations of Artif Intllgnce

Pages 2

This preview shows page 1 out of 2 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 2 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 2 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

1Foundations of Artificial IntelligenceSupport Vector Machines and KernelsCS472 – Fall 2007Thorsten JoachimsOutline• Transform a linear learner into a non-linear learner• Kernels can make high-dimensional spaces tractable• Kernels can make non-vectorial data tractableNon-Linear ProblemsProblem:• some tasks have non-linear structure• no hyperplane is sufficiently accurateHow can SVMs learn non-linear classification rules?ÎExtending the Hypothesis SpaceIdea: add more featuresÎ Learn linear rule in feature space.Example:Î The separating hyperplane in feature space is degreetwo polynomial in input space.Example• Input Space: (2 attributes)• Feature Space:(6 attributes) Dual (Batch) Perceptron Algorithm2Dual SVM Optimization Problem• Primal Optimization Problem• Dual Optimization Problem• Theorem: If w* is the solution of the Primal and α* is the solution of the Dual, thenKernelsProblem: Very many Parameters! Polynomials of degree p over N attributes in input space lead to attributes in feature space!Solution: [Boser et al.] The dual OP depends only on inner products => Kernel FunctionsExample: For calculating computes inner product in feature space.Î no need to represent feature space explicitly.SVM with KernelTraining:Classification:New hypotheses spaces through new Kernels:• Linear:• Polynomial:• Radial Basis Function:• Sigmoid:Examples of KernelsPolynomial Radial Basis FunctionKernels for Non-Vectorial Data• Applications with Non-Vectorial Input Data Æ classify non-vectorial objects– Protein classification (x is string of amino acids)– Drug activity prediction (x is molecule structure)– Information extraction (x is sentence of words)–Etc.• Applications with Non-Vectorial Output DataÆ predict non-vectorial objects– Natural Language Parsing (y is parse tree)– Noun-Phrase Co-reference Resolution (y is clustering)– Search engines (y is ranking)Î Kernels can compute inner products efficiently!Properties of SVMs with Kernels• Expressiveness– Can represent any boolean function (for appropriate choice of kernel)– Can represent any sufficiently “smooth” function to arbitrary accuracy (for appropriate choice of kernel)• Computational– Objective function has no local optima (only one global)– Independent of dimensionality of feature space• Design decisions– Kernel type and parameters– Value of

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1 out of 2 pages.

CORNELL CS 472 - Study Notes

Sign up for free to view:

Please select your school