CS 59000 Statistical Machine Learning
Lecture 12
Yuan (Alan) Qi

Outline
• Review of Laplace approximation, BIC, and Bayesian logistic regression
• Kernel methods
• Kernel ridge regression
• Kernel construction
• Kernel principal component analysis

Laplace Approximation for the Posterior
• Gaussian approximation around the mode of the posterior (a small sketch follows this review section).

Evidence Approximation

Bayesian Information Criterion
• An approximation of the Laplace approximation.
• A more accurate evidence approximation may be needed.

Bayesian Logistic Regression
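As a pointer back to the review material, here is a minimal sketch (not lecture code) of the Laplace approximation for Bayesian logistic regression: the posterior over the weights w is approximated by a Gaussian centered at the MAP estimate, with covariance given by the inverse Hessian of the negative log posterior at that mode. The Gaussian prior N(0, alpha^{-1} I) and all names are illustrative assumptions.

```python
import numpy as np

def laplace_logistic_regression(Phi, t, alpha=1.0, iters=50):
    """Laplace approximation q(w) = N(w | w_map, S_N) for logistic regression.

    Phi : (N, D) design matrix, t : (N,) binary targets in {0, 1},
    prior: w ~ N(0, alpha^{-1} I).
    """
    N, D = Phi.shape
    w = np.zeros(D)
    for _ in range(iters):                        # Newton (IRLS) iterations to find the mode
        y = 1.0 / (1.0 + np.exp(-Phi @ w))        # sigmoid predictions
        grad = Phi.T @ (y - t) + alpha * w        # gradient of the negative log posterior
        R = y * (1.0 - y)                         # Bernoulli variances
        H = Phi.T @ (Phi * R[:, None]) + alpha * np.eye(D)   # Hessian at w
        w = w - np.linalg.solve(H, grad)          # Newton step
    S_N = np.linalg.inv(H)                        # posterior covariance = inverse Hessian at the mode
    return w, S_N
```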


Kernel Methods
• Predictions are linear combinations of a kernel function evaluated at the training data points.
• A kernel function corresponds to an inner product of feature-space mappings: k(x, x') = phi(x)^T phi(x').
• Linear kernel: k(x, x') = x^T x'.
• Stationary kernels: kernels that depend only on the difference x - x'.

Fast Evaluation of Inner Products of Feature Mappings by Kernel Functions
• Computing the inner product explicitly needs six feature values and 3 x 3 = 9 multiplications.
• Evaluating the kernel function directly needs 2 multiplications and a squaring.

Kernel Trick
1. Reformulate an algorithm such that the input vector enters only in the form of inner products.
2. Replace each input x by its feature mapping phi(x).
3. Replace the inner product by a kernel function: k(x, x') = phi(x)^T phi(x').
Examples: kernel PCA, kernel Fisher discriminant, support vector machines. A small numerical check of step 3 is sketched below.
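As a concrete check of the counting argument above, here is a small example (not from the slides, names illustrative) showing that the quadratic kernel k(x, z) = (x^T z)^2 on two-dimensional inputs equals the inner product of the explicit feature maps phi(x) = (x1^2, sqrt(2) x1 x2, x2^2), so evaluating the kernel avoids forming the feature vectors at all.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the quadratic kernel on 2-D inputs."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

def k_quad(x, z):
    """Quadratic kernel: one inner product, then a squaring."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

# Both routes give the same value, but the kernel never builds phi(x) or phi(z).
print(phi(x) @ phi(z))   # 16.0
print(k_quad(x, z))      # 16.0
```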

Dual Representation for Ridge Regression
• Rewrite the ridge regression solution in terms of dual variables, one per training point.

Kernel Ridge Regression
• Using the kernel trick, the cost function depends on the inputs only through the Gram matrix.
• Equivalent cost function over the dual variables; minimizing over the dual variables gives the kernel ridge regression solution, sketched below.
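A minimal numerical sketch of kernel ridge regression, using the standard dual solution a = (K + lambda I)^{-1} t and prediction y(x) = k(x)^T a, i.e. a linear combination of kernel evaluations at the training points. The Gaussian kernel, the toy data, and all names are illustrative choices, not the lecture's.

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_ridge_fit(X, t, lam=0.1, sigma=1.0):
    """Dual variables a = (K + lam * I)^{-1} t."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), t)

def kernel_ridge_predict(X_train, a, X_new, sigma=1.0):
    """Prediction y(x) = k(x)^T a for each new point x."""
    return rbf_kernel(X_new, X_train, sigma) @ a

# Toy usage: noisy sine data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)
a = kernel_ridge_fit(X, t, lam=0.1, sigma=0.7)
y = kernel_ridge_predict(X, a, np.array([[0.0], [1.5]]), sigma=0.7)
```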

Constructing Kernel Functions

Example: Gaussian kernel
• Consider the Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)).
• Why is it a valid kernel? Expanding the squared norm gives k(x, x') = exp(-x^T x / (2 sigma^2)) exp(x^T x' / sigma^2) exp(-x'^T x' / (2 sigma^2)); the middle factor is the exponential of a valid kernel and the outer factors only rescale the feature map, so the product is a valid kernel.
• Generalization: the inner products x^T x' can themselves be replaced by another valid kernel.

Combining Generative and Discriminative Models by Kernels
• Since each modeling approach has distinct advantages, how can they be combined?
• Use generative models to construct kernels.
• Use these kernels in discriminative approaches.

Measuring Probability Similarity by Kernels
• Simple inner product: k(x, x') = p(x) p(x').
• For a mixture distribution, sum over the mixture components; for infinite mixture models, integrate over a continuous latent variable; for models with latent variables (e.g., hidden Markov models), sum over the latent sequences.
• A small sketch of a mixture-based kernel follows.
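Below is an illustrative sketch, not from the lecture, of building a kernel from a generative mixture model: k(x, x') = sum_k p(x | k) p(x' | k) p(k), which is the inner product of the vectors with components sqrt(p(k)) p(x | k) and is therefore a valid kernel. The check confirms that the resulting Gram matrix is (numerically) positive semi-definite; the mixture parameters are made up for the example.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Illustrative two-component univariate Gaussian mixture p(x) = sum_k p(k) p(x | k).
weights = np.array([0.3, 0.7])
means   = np.array([-1.0, 2.0])
sigmas  = np.array([0.5, 1.0])

def mixture_kernel(x, x_prime):
    """k(x, x') = sum_k p(x | k) p(x' | k) p(k): a kernel built from a generative model."""
    return np.sum(weights * gauss_pdf(x, means, sigmas) * gauss_pdf(x_prime, means, sigmas))

# Check positive semi-definiteness of the Gram matrix on a random sample.
xs = np.random.default_rng(0).uniform(-4, 4, size=30)
K = np.array([[mixture_kernel(a, b) for b in xs] for a in xs])
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True: eigenvalues are nonnegative up to round-off
```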
Fisher Kernels
• Fisher score: g(theta, x), the gradient of ln p(x | theta) with respect to the parameters theta.
• Fisher information matrix: F = E_x[ g(theta, x) g(theta, x)^T ].
• Fisher kernel: k(x, x') = g(theta, x)^T F^{-1} g(theta, x').
• Sample average: in practice F is approximated by the sample average of g g^T over the training data. A small sketch follows.
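As a toy illustration (not from the lecture), consider a univariate Gaussian generative model N(mu, sigma^2) with free parameter mu: the Fisher score is d/d mu ln p(x | mu) = (x - mu) / sigma^2, the Fisher information is approximated by the sample average of the squared score, and the Fisher kernel weights the score product by its inverse. The model choice and all names are illustrative assumptions.

```python
import numpy as np

def fisher_kernel_gaussian_mean(X, x, x_prime, mu=0.0, sigma=1.0):
    """Fisher kernel for the model p(x | mu) = N(x | mu, sigma^2).

    score(x) = d/d mu ln p(x | mu) = (x - mu) / sigma^2
    F        ~ (1/N) sum_n score(x_n)^2      (sample-average Fisher information)
    k(x, x') = score(x) * F^{-1} * score(x')
    """
    score = lambda z: (z - mu) / sigma ** 2
    F = np.mean(score(X) ** 2)
    return score(x) * score(x_prime) / F

X = np.random.default_rng(1).normal(loc=0.0, scale=1.0, size=500)   # data from the generative model
print(fisher_kernel_gaussian_mean(X, 0.5, -1.2))
```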

Principal Component Analysis (PCA)
• Assume the data have zero mean; each principal direction u_i is a normalized eigenvector of the sample covariance matrix S, i.e. S u_i = lambda_i u_i with u_i^T u_i = 1.

Feature Mapping
• Map each input x_n to phi(x_n) and perform PCA on the feature-space covariance matrix.

Eigen-problem in Feature Space: Dual Variables
• Suppose lambda_i > 0 (why can it not be smaller than 0? the covariance matrix is positive semi-definite); then the eigenvector lies in the span of the mapped training points, v_i = sum_n a_{i n} phi(x_n), which defines the dual variables a_i.

Eigen-problem in Feature Space (1)
• Multiplying both sides of the eigen-equation by phi(x_l)^T, we obtain an eigen-problem for the kernel matrix: K^2 a_i = lambda_i N K a_i.

Eigen-problem in Feature Space (2)
• The solutions of interest satisfy K a_i = lambda_i N a_i.
• Normalization condition: v_i^T v_i = a_i^T K a_i = lambda_i N a_i^T a_i = 1.
• Projection coefficient of a point x onto v_i: phi(x)^T v_i = sum_n a_{i n} k(x, x_n).

General Case: Non-zero Mean
• Kernel matrix: when the mapped data are not centered, replace K by the centered kernel matrix K~ = K - 1_N K - K 1_N + 1_N K 1_N, where 1_N is the N x N matrix with every element equal to 1/N.

Kernel PCA on Synthetic Data
• Contour plots of the projection coefficients in feature space.

Limitations of Kernel PCA
• If N is big, kernel PCA is computationally expensive, since K is N by N while S is only D by D.
• Not easy for low-rank
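The derivation above maps almost line for line onto code. Here is a minimal kernel PCA sketch (illustrative names and kernel choice, not lecture code): build the Gram matrix with a Gaussian kernel, center it, solve the eigen-problem for the dual variables, rescale them to satisfy the normalization condition, and compute projection coefficients for (possibly new) points.

```python
import numpy as np

def rbf_gram(X, Z, sigma=0.5):
    """Gaussian-kernel Gram matrix between the rows of X and the rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_pca(X, n_components=2, sigma=0.5):
    """Dual vectors a_i and eigenvalues of the centered kernel matrix."""
    N = len(X)
    K = rbf_gram(X, X, sigma)
    one = np.full((N, N), 1.0 / N)
    K_c = K - one @ K - K @ one + one @ K @ one      # centered kernel matrix
    evals, A = np.linalg.eigh(K_c)                   # eigenvalues in ascending order
    evals, A = evals[::-1][:n_components], A[:, ::-1][:, :n_components]
    A = A / np.sqrt(np.maximum(evals, 1e-12))        # enforce a_i^T K_c a_i = 1
    return A, evals, K

def kpca_project(X_train, K_train, A, X_new, sigma=0.5):
    """Projection coefficients sum_n a_{i n} k~(x, x_n), centered consistently with training."""
    N, M = len(X_train), len(X_new)
    Kx = rbf_gram(X_new, X_train, sigma)
    one_N = np.full((N, N), 1.0 / N)
    one_M = np.full((M, N), 1.0 / N)
    Kx_c = Kx - one_M @ K_train - Kx @ one_N + one_M @ K_train @ one_N
    return Kx_c @ A

# Toy usage on synthetic 2-D data.
X = np.random.default_rng(0).normal(size=(100, 2))
A, evals, K = kernel_pca(X, n_components=2)
Y = kpca_project(X, K, A, X)    # projection coefficients of the training points
```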
